[01:13] ivan`: good news
[01:14] we can archive it just by grabbing the html and looking for mp4
[01:16] we have to search with google i think
[01:16] site:color.com/profile
[01:17] i think jason scott will have to look at this
[01:17] but this may be some help in getting the scripts written
[01:20] http://www.color.com/profile/fb-10714086302
[01:21] snl had a channel
[07:23] http://www.reddit.com/r/Games/comments/13glp8/sega_starts_quietly_removing_youtube_users_videos/
[07:45] We need to push this into archive.org then
[14:21] Grabbing a copy of ftp.lotus.com
[14:21] Since the reduction of IBM's belief in the brand means that thing could die anytime.
[14:21] I think we need to think about some FTP downloading.
[14:21] I mean, you know.
[14:22] like, doing it? I could do more FTP downloading myself. Just wondering whether I should continue using lftp mirror, wget, or what :)
[14:22] ftp is a little different from http/html
[14:22] I use wget straight on, through UNIX.
[14:22] That does a lot of work.
[14:22] with WARC?
[14:23] oh one positive for lftp: it supports foreign charsets, which nearly no other tools do
[14:23] No, not with WARC.
[14:23] ok.
[14:23] I mean, I don't for this.
[14:23] Unless we think it's totally dying.
[14:23] WARC doesn't seem strictly needed for ftp, unless you know something I don't :)
[14:24] I generally use warc for http sites no matter what
[14:25] anyway I have multiple web site archives here I'd like to upload. should I just use that bulk uploader that was linked?
[14:28] How many?
[14:28] around 100gb that I have in front of me
[14:28] more that's on other disks
[14:29] 28 large tar.gzs in this folder
[14:29] this was before I started using warc, so it's tar.gz files with index lists
[14:30] Two choices. Either I can do it, or you can use scripts I wrote for S3, or you can use FTP.
[14:31] question though, do .tar.gzs like this need any postprocessing?
[14:31] Not for this purpose.
[14:31] Remixes of this stuff could be done later.
[14:31] ok
[14:32] "Either I can do it" - how? not sure what you mean there
[14:32] Wow, ftp.lotus.com is full of K-razy Stuff
[14:32] I mean I give you a place to rsync and I do it.
[14:32] yeah that would work
[14:32] some of these were preemptive captures; others were sites that have disappeared
[14:32] Everyone loves the Jason Does Everything option!
[14:33] It always wins the vote
[14:33] RESULTS OF VOTE: FOR: 34 AGAINST: 1
[14:33] well, I don't mind making a text file explaining what is what
[14:33] I just want these in more than one place, mainly.
[14:33] some were because of the timeframe - for example, I captured familyradio.com a few days before May 21, 2011, when they predicted the world would end. lol
[14:34] I think stuff like that belongs in the archives :)
[14:34] Where I live we are around 30 years behind so I am not worrying
[14:39] uploaded: https://archive.org/details/G4.E3.10.Live.Day.1.DSR.XviD-SYS
[14:40] the rest of day 1 live coverage is with the live ea and ubisoft spotlight
[14:41] over 4 hours of video with that one
[14:43] SketchCow: i uploaded the slax.org forums today
[14:43] that was 400+ warc.gz
[15:28] (just curious)
[15:28] SketchCow: Any reason you use wget instead of lftp mirror?
[15:28] both ways work. afaik wget does not work for foreign-charset ftp servers though
[15:28] that's something to note
[15:29] Yeah
[15:29] Also, I like the parallelization of lftp's miror
[15:29] mirror*
[15:30] yes, though some servers don't like that
[15:31] there are two types of parallelization it can do
[15:31] multipart or multifile or both
[15:54] 1. Servers hate parallelization, especially old ones we're pulling massive copies from
[15:54] 2. I've used WGET for half your life.
[15:54] if a server doesn't like multifile, then it deserves punching
[15:55] Old ones.
[15:55] Creaky end-of-life places that might crash or give up when you suck the entire contents out
[15:56] I'm grabbing some amazing amount of crap out of this ftp site.
[16:02] balrog: I always just do multifile
[16:02] Too many misbehaving FTP servers for multipart
[16:15] underscor: yup
[16:22] 52222222222222222222222222222222
[16:23] says cat
[16:23] hey socks
[16:37] This was Jetta
[16:37] The Angriest Cat in the world
[16:38] When there's food available, she literally mugs me
[16:38] Like, tries to trip me to make me fall
[16:38] She's the meanest cat on the planet
[16:38] She's an outdoor cat except when we hit frost point
[16:38] So normally she just murders every living thing for a mile around the house
[16:40] wow...
[16:41] lol
[16:41] oh yeah, the archives I'm uploading include a copy of ftp.funet.fi from 2008
[16:41] i have a cute cat who loves his food
[16:41] he'll just grab on to your leg
[16:41] however he's old and losing the ability to retract his claws
[16:41] and as he doesn't go out much...... - ow.
[16:42] balrog: sweet
[17:22] After all this stuff is getting uploded, how does it get accessed? Is it all on the Wayback machine?
[17:22] *is uploaded
[17:35] https://archive.org/details/archiveteam
[17:42] anyone here with IA edit privs?
[17:42] https://archive.org/details/archiveteam-mobileme-hero should say .Mac not .me
[19:50] http://www.petapixel.com/2012/11/20/photo-sharing-app-color-shutting-down-sold-for-7m-after-raising-41m/
[19:50] the amount of buzz when this thing launched was insane
[19:53] balrog: this can be backed up thanks to videos being hosted directly there
[19:54] look for a page with video player then search mp4
[19:56] ------------------------------------------
[19:57] Here's your Delightful Statistic of the Day
[19:57] size: 319,301,465,472 KB
[19:57] Archive Team: 320 Terabytes of Data
[19:57] You're Welcome, Internet
[19:57] ------------------------------------------
[19:57] i hope save 1/320 of that
[19:57] *helped
[20:02] all_logs_from_all_webservers_2012.tar.xz
[20:42] https://twitter.com/archiveteam/status/270990234945200128 zing
[20:49] niceness
[21:05] "Alert: We hope you've enjoyed sharing your stories via real-time video. Regretfully, the app will no longer be available after 12/31/2012." - Color.com
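The idea floated at [01:14] and again at [19:54] (grab a profile page's html, then look for mp4) could be sketched roughly as below. This is only a guess at the shape of such a script, not anything the channel actually ran: the function name `find_mp4_urls`, the regex, and the example.com sample markup are all made up for illustration, and it assumes the video links sit in the page as quoted strings ending in .mp4, which the log does not confirm.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern: any quoted string that ends in .mp4, optionally
# followed by a query string. Real pages may embed video URLs differently.
MP4_PATTERN = re.compile(
    r"""["']([^"'\s<>]+?\.mp4(?:\?[^"'\s<>]*)?)["']""",
    re.IGNORECASE,
)


def find_mp4_urls(html, base_url):
    """Return absolute .mp4 URLs found in the page, in order, without repeats."""
    found = []
    for match in MP4_PATTERN.finditer(html):
        # Resolve relative links against the page they were found on.
        url = urljoin(base_url, match.group(1))
        if url not in found:
            found.append(url)
    return found


if __name__ == "__main__":
    # Made-up sample markup; a real run would first download the page html.
    sample = (
        '<video src="/media/clip1.mp4"></video>'
        "<a href='http://cdn.example.com/v/clip2.mp4?x=1'>watch</a>"
        '<video src="/media/clip1.mp4"></video>'
    )
    for url in find_mp4_urls(sample, "http://www.example.com/profile/abc"):
        print(url)
```

The resulting list could then be handed to whatever downloader is already in use, for example wget. Keeping the page fetch separate from the extraction means the pattern can be adjusted against saved html without hitting the site again.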