[00:02] ia interface is installing now- what command should I use once it's ready?
[00:04] It's the current ia one but it shouldn't be 0.6.6.
[00:04] It's like 0.7.2 or something.
[00:04] Or more.
[00:04] It's the most recent.
[00:08] okay- I'm upgrading it to 0.7.1 now
[06:37] hm, is there a channel for the swipnet archiving?
[06:40] #swiped
[06:40] Scuttle: #swiped
[06:41] lol
[06:42] exactly
[06:47] was thinking I'd set my GBit connection to work...
[06:50] hm, the meter in the bottom left corner, is that an indication of how much I have up/downloaded?
[06:55] For the warrior, yes.
[15:27] Excellent news mates! The wayback machine has working backups of youtube videos now! Anybody got any ideas for a way to just scour youtube and route videos into the waybackmachine?
[15:27] https://web.archive.org/web/20110804113440/http://www.youtube.com/watch?v=npHWX1dciOE&gl=US&hl=en&has_verified=1 Example number 1 here
[15:28] I was thinking simply converting the save url into an ip and putting it as a proxy in a spider might work, just set the spider to strictly crawl and not save
[15:28] yeah that's existed off and on for a while, afaik there's no way to make them get a specific video
[15:28] ...
[15:29] goddamn webchat
[15:29] was gonna say, supposedly it grabs every video that gets tweeted but I haven't noticed that to be the case in practice
[15:29] I feel like webchat makes more trouble than it's worth
[15:29] ah, only the ones in the 1% "spritzer" twitter feed
[15:29] that would make sense
[15:30] but that's not what sketchcow's been telling everyone
[15:30] hm
[15:30] ok
[15:32] for whatever reason installing an irc client is a huge barrier for some people, I had to walk someone through using webchat before
[15:32] it does seem to be the case that they're not good for much once they finally connect though
[15:35] would it be possible to have the Tracker link to the project wiki page along with the website that is being saved and the leaderboard?
[15:36] or the warrior status page displayed by runpipeline?
[18:01] HEY WHAT
[18:02] hey folks
[18:02] Hi, juver.
[18:02] DFJustin: I found out the policy changed.
[19:00] do you have a twitter
[19:01] Emcy: who exactly?
[19:01] there is @archiveteam and @sketchcow respectively
[19:02] lol there is no sketchcow
[19:02] @archiveteam is the one that announces new projects
[19:02] probably/
[19:02] ?
[19:03] i tend to forget i have warrior installed until i read about another site shutting down, then i fire it up
[19:03] i bet most people with warrior do that
[19:06] @archiveteam-warrior i think
[19:07] Emcy: that's fine
[19:07] to be honest most projects end up with too many people, which is awesome
[19:16] SPOON
[19:16] Me and the spoon were hanging out.
[19:16] * SketchCow baller
[19:17] WikiTeam doesn't! We always have space for more
[19:18] eh i was already following archiveteam
[19:18] just dont tweet a lo
[21:59] is there any best-practice for archiving email? as in maildir/mbox/others..
[22:08] can i shut this down now
[22:08] the tracker says 0 to do + 1400 "out"
[22:18] Emcy: yeah
[22:18] if you wish :)
[22:20] ok
[22:25] SketchCow: finally got the current IA python setup- how do I grab all the cdbbsarchive images?
[22:26] ia search collection:cdbbsarchive
[22:26] That returns a list of all items in that collection.
[22:26] Do, like: ia search collection:cdbbsarchive | sort -u > hitlist.txt
[22:26] So now you have hitlist.txt, which is a nice alphabetic list.
[22:28] for each in `cat hitlist.txt`
[22:28] do
[22:28] ia download $each
[22:28] done
[22:29] deathy: tar of maildir is nice.
[22:29] mbox has issues
[22:29] is there a way to tell it to only grab jpgs or pngs? (that's all I really want from the collection)
[22:32] add | grep jpg or png on the end?
[22:32] well, before the sort
[22:32] | grep jpg | sort -u > blah
[22:44] Smiley: Wrong
[22:45] More like:
[22:45] ia list $list | grep -i \.[JjGg][PpIi][FfGg]
[23:14] thanks SketchCow! the list of pictures is downloading now (I hope), and I'll grab the actual pictures later
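
The image-only listing discussed above could be sketched roughly as follows. This is a minimal sketch, not the exact commands used in the channel. Two details of the grep as typed are worth noting: left unquoted, the shell strips the backslash before grep sees it, and the character classes match .jpg and .gif but not .png. The function names `filter_images` and `list_images` are made up for illustration, and the sketch assumes `ia list <identifier>` prints one file name per line.

```shell
#!/bin/sh
# Keep only names ending in .jpg, .jpeg or .png (case-insensitive).
# The pattern is single-quoted so the backslash and parentheses reach grep intact.
filter_images() {
    grep -iE '\.(jpe?g|png)$'
}

# For every item identifier in the file given as $1, print that item's image files.
# Assumption: `ia list <identifier>` prints one file name per line.
list_images() {
    while IFS= read -r item; do
        [ -n "$item" ] || continue      # skip blank lines in the hitlist
        ia list "$item" | filter_images
    done < "$1"
}
```

Possible usage, assuming the installed ia version supports `--itemlist` on `ia search` and `--glob` on `ia download`: `ia search 'collection:cdbbsarchive' --itemlist | sort -u > hitlist.txt`, then `list_images hitlist.txt > images.txt`. To fetch only the pictures for one item, something like `ia download "$item" --glob='*.jpg'` may work.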