#archiveteam 2014-01-02,Thu

Time Nickname Message
00:45 🔗 xmc $1.3MM for the drive fund drive
01:23 🔗 godane SketchCow: just know that the totally rad show collection will need direct access
01:23 🔗 godane there is over 700 videos of that
02:23 🔗 lemonkey may be old news.. http://appleinsider.com/articles/14/01/01/google-to-shut-down-bump-and-flock-apps-delete-all-data
03:57 🔗 ivan` new to me. http://blog.bu.mp/post/71781606704/all-good-things
08:10 🔗 SketchCow http://5secondfilms.com/
08:10 🔗 SketchCow They're shutting down what they are and redoing themselves.
12:34 🔗 arkiver DFJustin: the IA is downloading all the youtube videos?
12:34 🔗 arkiver or just the html pages with the data?
12:51 🔗 godane so i just may have found a complete collection of 3d artist magazine
12:51 🔗 godane it only starts in 2009 so it's not going up right away
13:13 🔗 arkiver SketchCow: running a link discovery for the website http://5secondfilms.com/ now
13:26 🔗 godane !a http://www.fukuleaks.org/web/
13:27 🔗 godane wrong channel again
13:28 🔗 godane SketchCow: this should be in pctoday magazine collection: https://archive.org/details/pctoday-magazine-v11i12
13:29 🔗 godane this collection: https://archive.org/details/pctoday-magazine
13:56 🔗 Cowering oh script gurus.. on win7 running mingw's sed.. how do i delete all " symbols from a stream? sed s/\042//g was my best guess and it is not accepted. TIA!
13:57 🔗 Schbirid sed 's/"//g' not working?
13:57 🔗 Cowering sed: Couldn't open file more: No such file or directory
13:57 🔗 Cowering windows shell requires matching "
13:57 🔗 Schbirid damn
13:58 🔗 Cowering and " " " don't work either :(
13:58 🔗 Cowering sed
13:58 🔗 Cowering oops
13:58 🔗 Schbirid \" ?
13:59 🔗 Cowering sed: -e expression #1, char 11: Unterminated `s' command
14:00 🔗 Cowering i'm sure i could get -f to work, but don't really want to make a file for this
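
The quoting trouble above comes from the Windows shell rather than from sed itself. One way to sidestep it, sketched here in Python instead of sed (a swapped-in technique, not something anyone in the channel used), is to do the deletion inside a small script so that no quote character has to pass through the shell. The names `strip_double_quotes` and `filter_stream` are made up for this illustration.

```python
import io


def strip_double_quotes(text: str) -> str:
    """Return text with every ASCII double quote (0x22, octal 042) removed."""
    return text.replace('"', "")


def filter_stream(src, dst) -> None:
    """Copy src to dst line by line, dropping double quotes along the way."""
    for line in src:
        dst.write(strip_double_quotes(line))


# Small self-check on an in-memory stream (hypothetical sample data).
_out = io.StringIO()
filter_stream(io.StringIO('"a","b"\n'), _out)
assert _out.getvalue() == "a,b\n"
```

To use it as a pipe filter, something like `filter_stream(sys.stdin, sys.stdout)` after `import sys` should do.
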
14:09 🔗 Cowering hmm.. i'm not up on unicode.. is $EF $BB $BF one of the special sequences?
14:14 🔗 Cowering doh.. it is
14:14 🔗 Cowering stupid mingw sort does not ignore it, and i ended up with it silently in the middle of my txt file after sorting
15:24 🔗 balrog Cowering: you should not have any BOM in UTF-8 files
15:24 🔗 balrog I know windows software likes to put it in, but it's generally not recommended
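
A minimal Python sketch of the BOM problem described above, assuming the input is valid UTF-8: the bytes EF BB BF sort after plain ASCII, so a line that starts with a BOM can end up in the middle of sorted output. Stripping the sequence before sorting avoids that. The function names are hypothetical, and the bytewise sort here is not meant to match what mingw `sort` does in every locale.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove every UTF-8 BOM byte sequence (EF BB BF) from data."""
    return data.replace(codecs.BOM_UTF8, b"")


def sort_lines_without_bom(data: bytes) -> bytes:
    """Strip BOMs first, then sort lines bytewise and rejoin them with \\n."""
    lines = strip_utf8_bom(data).splitlines()
    return b"".join(line + b"\n" for line in sorted(lines))


# Hypothetical sample: a BOM-prefixed first line that would otherwise sort last.
sample = codecs.BOM_UTF8 + b"pear\napple\n"
assert sort_lines_without_bom(sample) == b"apple\npear\n"
```

Note that this also removes U+FEFF characters that appear mid-file, which is the case Cowering ran into after sorting.
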
16:34 🔗 Baljem Cowering: the escape character in the Windows shell is ^ if that helps. not sure it will, though.
19:35 🔗 NiTo Hello everyone
19:35 🔗 NiTo i'm trying to mirror a website with httrack
19:35 🔗 NiTo without success
19:36 🔗 NiTo a phpbb forum
19:36 🔗 NiTo i need to be logged in to view most of the content
19:36 🔗 NiTo i've followed this: http://httrack.kauler.com/help/CatchURL_tutorial
19:36 🔗 NiTo but i can't get it to work - the user is always disconnected
19:37 🔗 NiTo if anyone has any hints or suggestions, that would be great
20:14 🔗 joepie91 NiTo: I'm not sure what OS you are on, but what you could do is use wget (preferably saving into a WARC as well using --warc-file=), and load the cookies from your browser (while you are logged in there)
20:14 🔗 joepie91 wget just has a cookie-loading switch
20:15 🔗 joepie91 it should then grab everything as the user you were logged into in your browser
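
A rough sketch of joepie91's suggestion, written as a Python helper that only assembles the wget command line. It assumes the browser cookies have been exported to a Netscape-format `cookies.txt` file, which is the format `--load-cookies` reads, and that the installed wget is new enough to support `--warc-file` (1.14 or later, as far as I know). The forum URL, file names, and the helper name `build_wget_command` are placeholders, not anything from the channel.

```python
import shlex
from typing import List


def build_wget_command(url: str, cookies_file: str, warc_name: str) -> List[str]:
    """Assemble a recursive wget invocation that reuses browser cookies."""
    return [
        "wget",
        "--recursive",
        "--level=inf",                # follow links to any depth
        "--load-cookies", cookies_file,  # cookies exported while logged in
        f"--warc-file={warc_name}",   # also write a WARC alongside the files
        url,
    ]


# Placeholder values; print the command instead of running it.
cmd = build_wget_command("http://forum.example/", "cookies.txt", "forum")
print(shlex.join(cmd))
```

Printing the command first makes it easy to inspect before handing the list to `subprocess.run`.
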
20:15 🔗 Schbirid check the wiki, i think there are notes about phpbb2
20:19 🔗 NiTo joepie91: Schbirid: thanks, i've tried wget and it kind of works
20:19 🔗 NiTo the user stays authenticated but the pages i get with wget are not good
20:19 🔗 NiTo i mean, links are not redirected to the local files (or wrongly)
20:20 🔗 NiTo and it seems it doesn't get all the css and js files somehow
20:20 🔗 NiTo i've seen this page http://www.archiveteam.org/index.php?title=PhpBB
20:21 🔗 Schbirid links are converted only when it is done
20:21 🔗 NiTo oh ok
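
As Schbirid says, wget rewrites links only once the whole download has finished, so an interrupted or still-running mirror will show unconverted links. The sketch below adds the two wget options that usually matter for offline browsing: `--convert-links` for the link rewriting and `--page-requisites` for CSS, JS, and images. Whether missing page requisites were really the cause of NiTo's problem is an assumption, and `with_offline_flags` is a hypothetical helper that expects the URL to be the last argument.

```python
from typing import List


def with_offline_flags(base_cmd: List[str]) -> List[str]:
    """Return a copy of a wget command with offline-browsing options added.

    Assumes the last element of base_cmd is the URL to fetch.
    """
    extra = [
        "--page-requisites",  # fetch CSS, JS and images needed to render pages
        "--convert-links",    # rewrite links to local files after the run ends
    ]
    return base_cmd[:-1] + extra + base_cmd[-1:]


# Placeholder command for illustration only.
print(with_offline_flags(["wget", "--recursive", "http://forum.example/"]))
```
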
20:23 🔗 NiTo i'm gonna try again with wget then
20:23 🔗 NiTo and if it works, i think i should update the phpbb wiki page with info from here: http://www.zoros.org/wiki/index.php?title=A_script_to_login_phpbb_forum_using_wget
20:23 🔗 NiTo to mirror while logged in, what do you think?
20:30 🔗 xmc you just need to nab the cookies from your browser and feed them to wget or whatever
20:42 🔗 NiTo ok thanks i think i might be able to make it work
20:42 🔗 NiTo with wget
20:49 🔗 godane so looks like older wall street journal videos was hosted on bightcove
20:49 🔗 godane *brightcove
20:51 🔗 godane http://www.businessinsider.com/wsj-drops-brightcove-for-web-video-2009-3
20:51 🔗 godane of course i can tell its old news even then
20:52 🔗 godane i have video coming from october 2008
21:24 🔗 SketchCow INSTANT win for Archive Bot
21:24 🔗 SketchCow They completely deleted Penny Arcade Report from their site
21:24 🔗 SketchCow Archive.org, complete copy, thanks to archivebot run
21:29 🔗 SketchCow Anybody here sadedownload at archive.org?
23:44 🔗 arkhive Can the WayBack Machine save Flash data on websites it grabs? I am wondering because I am going to buy some Avatar movie action figures on Craigslist and they have this whole interactive scanning thing. I went to a WBM snapshot of the site but am not sure whether the Flash program/file saved is just the live/current one or the version from when the snapshot was taken
23:45 🔗 DFJustin if it's a single swf embed it will save that for sure, if the swf loads other resources then it gets more problematic
23:45 🔗 arkhive But ya I'd like to know in case i want to use the interactive feature years down the road when the website shuts down
23:45 🔗 DFJustin you can use the network tab of the developer console in firefox/chrome/? to see where it's loading the stuff from
23:46 🔗 arkhive so if it fully works on an old snapshot.. then years from now the same snapshot should work even when actual site is down?
23:46 🔗 arkhive oh
23:47 🔗 arkhive if i'm doing it right then it looks as if it is loading all of it from the IA snapshot
23:48 🔗 arkhive err
23:48 🔗 arkhive wait. now i'm confused.. lol
23:48 🔗 arkhive can you check it out for me. I have no idea how to use the dev/network tools on chrome. Or read it/whatever.
23:48 🔗 arkhive it doesn't make sense to me
23:49 🔗 arkhive https://web.archive.org/web/20130302075316/http://www.avataritag.com/#/toys/
23:50 🔗 DFJustin looks like it is all loading through wayback except a couple tracker things
23:51 🔗 DFJustin so it's probably ok
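
The check DFJustin did by eye can be roughed out in Python: copy the request URLs out of the browser's network tab and split them into those served from an archive.org host and those fetched from somewhere else. This is only a sketch; `split_by_wayback` is a made-up name, the tracker URL in the sample is a placeholder, and a request served by archive.org is not by itself proof that the capture is complete.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def split_by_wayback(urls: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split URLs into (served via archive.org, served from elsewhere)."""
    archived: List[str] = []
    live: List[str] = []
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if host == "archive.org" or host.endswith(".archive.org"):
            archived.append(url)
        else:
            live.append(url)
    return archived, live


# Sample list: the snapshot from the log plus a placeholder tracker URL.
requests_seen = [
    "https://web.archive.org/web/20130302075316/http://www.avataritag.com/",
    "http://tracker.example/pixel.gif",
]
archived, live = split_by_wayback(requests_seen)
print("via wayback:", archived)
print("live:", live)
```

Anything that lands in the second list is still being loaded from the live web and would stop working if that host goes away.
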
23:52 🔗 arkhive okay
23:52 🔗 arkhive thanks! DFJustin
23:52 🔗 arkhive I appreciate it :)
23:53 🔗 arkhive Bad thing is they are all the way in Arvada. Which is like 45 minute drive i'm not up for atm.
