[00:45] $1.3MM for the drive fund drive
[01:23] SketchCow: just know that the totally rad show collection will need direct access
[01:23] there are over 700 videos of that
[02:23] may be old news.. http://appleinsider.com/articles/14/01/01/google-to-shut-down-bump-and-flock-apps-delete-all-data
[03:57] new to me. http://blog.bu.mp/post/71781606704/all-good-things
[08:10] http://5secondfilms.com/
[08:10] They're shutting down what they are and redoing themselves.
[12:34] DFJustin: the IA is downloading all the youtube videos?
[12:34] or just the html pages with the data?
[12:51] so i just may have found a complete collection of 3d artist magazine
[12:51] it only starts in 2009 so it's not going up right away
[13:13] SketchCow: running a link discovery for the website http://5secondfilms.com/ now
[13:26] !a http://www.fukuleaks.org/web/
[13:27] wrong channel again
[13:28] SketchCow: this should be in pctoday magazine collection: https://archive.org/details/pctoday-magazine-v11i12
[13:29] this collection: https://archive.org/details/pctoday-magazine
[13:56] oh script gurus.. on win7 running mingw's sed.. how do i delete all " symbols from a stream? sed s/\042//g was my best guess and it is not accepted TIA!
[13:57] sed 's/"//g' not working?
[13:57] sed: Couldn't open file more: No such file or directory
[13:57] windows shell requires matching "
[13:57] damn
[13:58] and " " " don't work either :(
[13:58] sed
[13:58] oops
[13:58] \" ?
[13:59] sed: -e expression #1, char 11: Unterminated `s' command
[14:00] i'm sure i could get -f to work, but don't really want to make a file for this
[14:09] hmm.. i'm not up on unicode.. is $EF $BB $BF one of the special sequences?
[14:14] doh.. it is
[14:14] stupid mingw sort does not ignore it, and i ended up with it silently in the middle of my txt file after sorting
[15:24] Cowering: you should not have any BOM in UTF-8 files
[15:24] I know windows software likes to put it in, but it's generally not recommended
[16:34] Cowering: the escape character in the Windows shell is ^ if that helps. not sure it will, though.
[19:35] Hello everyone
[19:35] i'm trying to mirror a website with httrack
[19:35] without success
[19:36] a phpbb forum
[19:36] i need to be logged in to view most of the content
[19:36] i've followed this: http://httrack.kauler.com/help/CatchURL_tutorial
[19:36] but i can't get it to work - the user is always disconnected
[19:37] if anyone has any hints or suggestions, that would be great
[20:14] NiTo: I'm not sure what OS you are on, but what you could do is use wget (preferably saving into a WARC as well using --warc-file=), and load the cookies from your browser (while you are logged in there)
[20:14] wget just has a cookie-loading switch
[20:15] it should then grab everything as the user you were logged into in your browser
[20:15] check the wiki, i think there are notes about phpbb2
[20:19] joepie91: Schbirid: thanks, i've tried wget and it kind of works
[20:19] the user stays authenticated but the pages i get with wget are not good
[20:19] i mean, links are not redirected to the local files (or wrongly)
[20:20] and it seems it doesn't get all the css and js files somehow
[20:20] i've seen this page http://www.archiveteam.org/index.php?title=PhpBB
[20:21] links are converted only when it is done
[20:21] oh ok
[20:23] i'm gonna try again with wget then
[20:23] and if it works, i think i should update the phpbb wiki page with info from here: http://www.zoros.org/wiki/index.php?title=A_script_to_login_phpbb_forum_using_wget
[20:23] to mirror while logged in, what do you think?
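A minimal sketch of the wget approach suggested at [20:14], assuming a Netscape-format cookies.txt exported from the browser while logged in. The function name mirror_forum, the DRY_RUN switch, the example URL and the file names are hypothetical. The logout-rejecting regex is an assumption about phpBB's logout URLs (mode=logout or logout=true), not something stated in this log, so check it against the actual forum before relying on it.

```shell
# Mirror a login-protected forum with wget, reusing browser cookies.
# Usage: mirror_forum URL COOKIES_FILE WARC_NAME
# With DRY_RUN=1 (the default) the command is only printed, not executed.
mirror_forum() {
    url=$1
    cookies=$2      # cookies.txt exported from the logged-in browser session
    warc=$3         # base name for the WARC output file

    # Rebuild the positional parameters as the full wget command line.
    set -- wget \
        --load-cookies "$cookies" \
        --recursive --level=inf \
        --page-requisites \
        --adjust-extension \
        --convert-links \
        --reject-regex '(mode=logout|logout=true)' \
        --warc-file="$warc" \
        "$url"

    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"       # show what would be run
    else
        "$@"            # run the crawl
    fi
}
```

Running `DRY_RUN=0 mirror_forum http://forum.example.com/ cookies.txt forum` would start the crawl. Skipping the logout link is meant to keep the crawler from ending its own session. As noted at [20:21], wget rewrites links only after the crawl finishes, so an interrupted run still points at the live site.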
[20:30] you just need to nab the cookies from your browser and feed them to wget or whatever
[20:42] ok thanks i think i might be able to make it work
[20:42] with wget
[20:49] so looks like older wall street journal videos were hosted on bightcove
[20:49] *brightcove
[20:51] http://www.businessinsider.com/wsj-drops-brightcove-for-web-video-2009-3
[20:51] of course i can tell its old news even then
[20:52] i have video coming from october 2008
[21:24] INSTANT win for Archive Bot
[21:24] They completely deleted Penny Arcade Report from their site
[21:24] Archive.org, complete copy, thanks to archivebot run
[21:29] Anybody here sadedownload at archive.org?
[23:44] Can the WayBack Machine save Flash data on websites it grabs? I am wondering because I am going to buy some Avatar movie action figures on Craigslist and they have this whole interactive scanning thing. I went to WBM snapshot of the site but am not sure the Flash program/file saved is just the live/current one or the version when snapshot was taken
[23:45] if it's a single swf embed it will save that for sure, if the swf loads other resources then it gets more problematic
[23:45] But ya I'd like to know in case i want to use the interactive feature years down the road when the website shuts down
[23:45] you can use the network tab of the developer console in firefox/chrome/? to see where it's loading the stuff from
[23:46] so if it fully works on an old snapshot.. then years from now the same snapshot should work even when actual site is down?
[23:46] oh
[23:47] if i'm doing it right then it looks as if it is loading all of it from the IA snapshot
[23:48] err
[23:48] wait. now i'm confused.. lol
[23:48] can you check it out for me. I have no idea how to use the dev/network tools on chrome. Or read it/whatever.
[23:48] it doesn't make sense to me
[23:49] https://web.archive.org/web/20130302075316/http://www.avataritag.com/#/toys/
[23:50] looks like it is all loading through wayback except a couple tracker things
[23:51] so it's probably ok
[23:52] okay
[23:52] thanks! DFJustin
[23:52] I appreciate it :)
[23:53] Bad thing is they are all the way in Arvada. Which is like 45 minute drive i'm not up for atm.