[00:00] you're a lot more experienced than me though so I was hoping you'd be able to lay down some of the framework
[00:16] nightpool: i'll see what i can do sure
[00:30] *** Ravenloft has joined #archiveteam-bs
[00:33] *** godane has joined #archiveteam-bs
[01:01] *** Ravenloft has quit IRC (Ping timeout: 190 seconds)
[01:07] *** DoomTay has joined #archiveteam-bs
[01:08] *** Coderjoe has quit IRC (Read error: Operation timed out)
[01:28] *** Coderjoe has joined #archiveteam-bs
[01:50] *** godane has quit IRC (Read error: Operation timed out)
[02:36] *** Aranje has quit IRC (Ping timeout: 260 seconds)
[03:11] Question: why was I banned from #linux ?
[03:12] perhaps you were being irritating
[03:17] I didn't even say anything in the channel :/
[03:17] They banned all it cloud users
[03:17] Oops, no it
[03:18] Irccloud
[03:18] I'm wondering why archiveteam would know
[03:19] I assumed it was an archive team related channel
[03:22] #linux?
[03:49] *** Coderjoe has quit IRC (Read error: Operation timed out)
[04:12] *** Aranje has joined #archiveteam-bs
[04:35] *** Coderjoe has joined #archiveteam-bs
[04:38] yeah
[04:39] @nightpool
[04:45] Apparently WikiLeaks just got some juicy stuff
[04:45] http://www.washingtontimes.com/news/2016/jul/23/twitter-users-erupt-dncleaks-disappears-from-trend/
[04:46] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[04:52] *** Sk1d has joined #archiveteam-bs
[04:58] uh, no, efnet is much more than archiveteam
[04:58] much as we would like it to be otherwise
[05:13] EFNet is kinda the original IRC network.
[05:14] efnet is cool
[05:14] i s'pose
[05:17] pikhq_: well not quite
[05:17] it's the original rebel IRC network ;)
[05:18] In a more literal sense, it's result of most of the original IRC network kicking out eris.
[05:18] *the result
[05:31] *** DoomTay has quit IRC (Quit: Page closed)
[05:34] fuckin eris
[05:53] *** yipdw has quit IRC (Read error: Operation timed out)
[05:54] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[05:56] *** JesseW has joined #archiveteam-bs
[06:08] *** Ravenloft has joined #archiveteam-bs
[06:08] *** yipdw has joined #archiveteam-bs
[06:29] *** Ravenloft has quit IRC (Ping timeout: 633 seconds)
[06:33] *** tomwsmf has quit IRC (Ping timeout: 258 seconds)
[07:07] *** metal_cam has joined #archiveteam-bs
[07:31] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[09:16] *** schbirid has joined #archiveteam-bs
[10:14] *** GLaDOS has quit IRC (Read error: Connection reset by peer)
[10:14] *** GLaDOS has joined #archiveteam-bs
[10:21] *** terg has joined #archiveteam-bs
[10:26] This is for our not-so-important messages
[10:27] * terg notes
[10:27] #archiveteam is for project announcements and important updates and all
[10:27] Anyway, we have a good copy of rutracker in the wayback machine
[10:27] (and queries as to what our crawlers are doing)
[10:27] Planning on doing the other site too
[10:29] I wanted to create one giant index of torrents from current/dying/dead websites, no ads, no profit, nothing suspicious - so a combination of TPB, isoHunt (which I threw some firepower at when we were archiving it), ruTracker and KAT if possible
[10:30] well, everything we grab is going into the wayback machine.
[10:31] WARC files contain the data
[10:31] You can extract data from those
[10:34] yeah, I've been taking a look at it all.
above everything I'd like to see if there's a pragmatic way to save as much of the torrent indexes as possible
[10:35] torrent sites are like a hydra, as soon as they're investigated and taken down, they'll be cloned and replaced multiple times - the anti-piracy effort seems futile to me
[10:35] I see there are mirrors of KAT for example, but they're missing lots and lots of torrents and metadata and that's a large problem
[10:36] yeah
[10:39] it'd be lessened if they offered daily database dumps, but from what I've been reading it seems the operator of KAT was involved in money laundering on a pretty big scale, so I'm presuming he didn't offer such a database to protect his interests
[10:44] TPB might be a good target to archive, even though it's seemingly immortal
[10:45] tpb already has a daily dump I believe.
[10:46] demonoid is run by the FBI...
[10:47] really? I looked around but couldn't find anywhere with recent TPB dumps
[10:47] there are torrents on the site itself with scrapes of everything though
[10:49] yeah those
[12:06] *** Coderjoe has quit IRC (Read error: Operation timed out)
[12:06] *** terg has quit IRC (My Mac has gone to sleep. ZZZzzz…)
[12:16] *** Coderjoe has joined #archiveteam-bs
[12:49] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[12:49] *** BlueMaxim has joined #archiveteam-bs
[13:12] oh yaaaaay, the munich guy got his gun from "darknet"
[13:13] *** kristian_ has joined #archiveteam-bs
[13:20] *** Coderjoe has quit IRC (Read error: Operation timed out)
[13:23] *** Coderjoe has joined #archiveteam-bs
[13:31] *** Ravenloft has joined #archiveteam-bs
[13:31] *** Ravenloft has quit IRC (Client Quit)
[13:49] *** GLaDOS has quit IRC (Quit: Oh crap, I died.)
[13:49] *** GLaDOS has joined #archiveteam-bs
[13:56] *** REiN^ has quit IRC (Ping timeout: 244 seconds)
[14:14] *** BlueMaxim has quit IRC (Quit: Leaving)
[14:16] *** REiN^ has joined #archiveteam-bs
[15:05] *** DoomTay has joined #archiveteam-bs
[17:02] *** kristian_ has quit IRC (Leaving)
[17:43] oh nice, Windows 10's Resource Monitor is like what happens if you merge htop and lsof and allow filtering by process
[17:43] I like this
[17:50] *** JesseW has joined #archiveteam-bs
[18:12] *** tomwsmf has joined #archiveteam-bs
[18:30] *** Ravenloft has joined #archiveteam-bs
[18:38] *** metal_cam is now known as metalcamp
[19:05] *** Start has quit IRC (Read error: Connection reset by peer)
[19:05] *** Start has joined #archiveteam-bs
[19:11] *** Ravenloft has quit IRC (Ping timeout: 190 seconds)
[19:15] Woah, bwn, any idea why your warrior seems to have gone down to a crawl?
[19:19] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[20:09] baby happened
[20:09] !http://imgur.com/xj2vKME.jpg7
[20:09] http://imgur.com/xj2vKME.jpg
[20:40] Smiley: congrats!
[20:45] ty
[20:51] *** kristian_ has joined #archiveteam-bs
[21:08] *** Kazzy is now known as Kaz
[21:17] doomtay: not sure, I'll check it out in a bit
[21:52] *** godane has joined #archiveteam-bs
[21:53] i lost internet last night
[21:53] i finally got on now after i came back
[21:54] i was asleep until 4pm then had to go out with my brother
[22:02] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[22:09] *** Dyskette has joined #archiveteam-bs
[22:10] Heyo
[22:10] *** kristian_ has quit IRC (Leaving)
[22:13] I'm a little new to this whole thing - I've been collecting a library of zines and other writings for a long time, mostly LGBT (especially trans) culture because, well, we have so little surviving history, and it is mostly second-hand accounts of people's experiences and identities at best.
I've been wanting to look into archiving a lot of blogs and personal websites as well, as I've noticed a bunch of these starting to disappear. I'm not sure if this is an appropriate place to ask, but I'm not SUPER technical, and I'd like to be sure I'm doing things sensibly before just starting to dump chunks of internet to disk.
[22:14] I *think* I've got wget grabbing WARC archives, but I don't actually KNOW. Is there an easy way to view or verify WARC archives?
[22:15] (Apologies if I'm in the wrong place to be asking this sort of thing!)
[22:20] You can use webarchiveplayer to try and replay the WARCs
[22:20] https://github.com/ikreymer/webarchiveplayer
[22:21] Which OS are you using?
[22:21] Debian
[22:22] Yeah, this looks perfect! Thanks a bunch :)
[22:22] What are you going to do with the archives?
[22:23] I'm not quite sure yet
[22:23] Maybe you can upload them to IA with mediatype 'web'
[22:23] I have some reservations about just putting it up, because of the... sensitive nature of a lot of it
[22:23] They'll be in the wayback machine then
[22:23] The Wayback Machine doesn't care about sensitivity ;)
[22:24] Librarians should never choose sides :)
[22:24] You might also want to have a look at https://github.com/ludios/grab-site
[22:24] I know, but some of the people who were running websites that effectively out them as trans that they've taken down might, and it's an area where I'm more concerned than I would be otherwise
[22:24] I see
[22:25] So for now I'm just making sure I have all this stuff, with redundant backups
[22:25] Ok
[22:26] If you ever want them in the wayback machine and need help, let us know :)
[22:26] Like a lot of the paper stuff I collect (which I also digitise) I would have less reservation about making it available (ie throwing it at the IA) with a bit more distance, timewise
[22:28] But yeah, I fully don't expect people to necessarily agree with my approach, and to be honest, I wouldn't claim to have any sort of grand
master plan
[22:29] But at the same time, I figure doing SOMETHING is better than letting the artefacts of a culture that has been evolving and changing incredibly quickly just disappear
[22:29] * Dyskette shrugs
[22:29] Thanks for the help again.
[22:29] :)
[22:29] If you need any more help, you can always ask us
[22:29] And, you guys are awesome in general, and what you're doing is fantastic.
[22:30] :D
[22:30] * arkiver agrees
[22:30] Heh heh
[22:33] *** ndiddy has joined #archiveteam-bs
[22:37] *** Coderjoe has quit IRC (Ping timeout: 260 seconds)
[22:38] *** Coderjoe has joined #archiveteam-bs
[22:51] *** dashcloud has joined #archiveteam-bs
[23:11] *** kristian_ has joined #archiveteam-bs
[23:19] *** Coderjoe has quit IRC (Ping timeout: 260 seconds)
[23:34] *** Coderjoe has joined #archiveteam-bs
[23:37] *** BlueMaxim has joined #archiveteam-bs
[23:55] *** closure has joined #archiveteam-bs