[00:57] I'm still seeing nothing for abklex.de (doing a google search for "abklex" site:archive.org)
[00:58] It was archivebotted a week or so ago. I didn't see it anywhere in the archivebot collection items from around when I thought it would have finished, either
[00:58] (also: am I still banned from #archivebot? That would probably be a better place for these questions)
[01:05] *** kyan has quit IRC (Quit: Leaving)
[01:06] *** kyan has joined #archiveteam-bs
[01:07] kyan: you got banned?
[01:08] chfoo: yes don't remember the circumstances exactly but if I recall correctly, I tried to give someone ops following a tutorial online that suggested using mode flags, typoed, and fscked up the channel modes
[01:08] decided not to have ops in other peoples' channels any more :P
[01:10] kyan: but you were in archivebot recently though
[01:11] chfoo: a few seconds ago by accident
[01:11] kyan: so you're not banned :)
[01:11] wrong server — was trying to join #archivebot on localhost
[01:11] oh? it would prevent me from joining if i was?
[01:12] (this = why I don't accept ops anymore :P)
[01:13] kyan: yeah, that's basically what a ban is; it won't let you in. we have voice now so people with voice can use it now without needing ops.
[01:14] also archivebot suffered some unfortunate accident so things are being redone
[01:14] chfoo, I see. Thanks! I'll join then and ask about that archive since I'm worried about it :)
[01:14] Aah thats probably what ate it then
[01:14] kyan: yours was requeued, so it will show up eventually.
[01:14] I've got my own archivebot instance now so I'll probably just do it myself
[01:15] aaaaaaaaa, ah, thanks :) I guess i won't then
[01:15] ah, was that when fos died?
[01:16] no, it was human error
[01:18] fair enough
[01:20] *** Coderjoe has quit IRC (Read error: Operation timed out)
[01:23] *** brayden__ has joined #archiveteam-bs
[01:24] let's just say uploading files into localhost into the same directory is not a good idea
[01:31] *** brayden_ has quit IRC (Read error: Operation timed out)
[01:37] *** Coderjoe has joined #archiveteam-bs
[02:02] *** schbirid2 has quit IRC (Read error: Operation timed out)
[02:02] *** norbert79 has quit IRC (Read error: Operation timed out)
[02:03] *** schbirid2 has joined #archiveteam-bs
[02:04] *** primus104 has quit IRC (Leaving.)
[02:08] *** norbert79 has joined #archiveteam-bs
[02:32] *** ex-parro1 has joined #archiveteam-bs
[02:52] i'm downloading those warc.gz that are randomly in my cbsnews.com video files
[02:53] i'm going to put them in another item so i can then delete them out of my cbsnews.com-video items
[02:54] *** schbirid2 has quit IRC (Read error: Operation timed out)
[02:59] *** zenguy_pc has joined #archiveteam-bs
[03:00] SketchCow: do who ximm@archive.org is?
[03:00] *do you know who
[03:02] SketchCow: anyways he upload warc.gz files to the my cbsnews.com-video items
[03:03] and i'm pissed off about it
[03:04] the way the items should be named in cbsnews.com collection he shouldn't have added them to my item names
[03:07] *** schbirid2 has joined #archiveteam-bs
[03:24] godane, I would assume: http://monoskop.org/Aaron_Ximm
[03:28] ok
[03:32] i would guess that these warc.gz files are not even in wayback machine
[03:34] anyways i'm downloading the warc.gz files to put into one item
[03:34] in archiveteam-fire collection maybe
[03:35] also this way i can clean my items he started adding warc.gz too
[03:36] i'm on 2004-02-28 and i'm still finding warc.gz in them
[03:47] i'm on 2014-03-20 and still finding warc.gz files
[03:47] :-/
[03:48] i'm took some screen shots of thiese pages and history logs
[03:48] just in case anyone asks for proof
[03:55] *** Sellyme has quit IRC (Read error: Connection reset by peer)
[04:09] *** BlueMaxim has joined #archiveteam-bs
[04:41] *** ex-parro1 has quit IRC (Leaving.)
[04:49] *** mistym has joined #archiveteam-bs
[04:52] *** aaaaaaaaa has quit IRC (Leaving)
[05:34] *** APerti has joined #archiveteam-bs
[05:50] *** APerti has quit IRC ()
[05:53] *** Famicoman has quit IRC (Read error: Connection reset by peer)
[06:05] *** godane has quit IRC (Ping timeout: 492 seconds)
[06:15] *** godane has joined #archiveteam-bs
[06:33] SketchCow: i think aaron ximm added warc.gz files to alot of my items now
[06:34] fuck me
[06:34] https://archive.org/details/cbsnews.com-video-2003-11-30
[06:35] web archives are there too
[06:50] can whoever alerted me to the bitcasa fiasco initially, PM me? I forgot who you were... it's kind of important
[07:04] ohh there we go
[07:04] damnit, not here anymore
[07:07] k, found them
[07:10] *** midas has quit IRC (Read error: Operation timed out)
[07:11] *** midas has joined #archiveteam-bs
[07:12] *** espes__ has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[07:12] *** Mayonaise has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[07:12] *** jk[SVP] has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[07:12] *** closure has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[07:12] *** joepie91 has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[07:15] *** espes__ has joined #archiveteam-bs
[07:15] *** Mayonaise has joined #archiveteam-bs
[07:15] *** jk[SVP] has joined #archiveteam-bs
[07:15] *** closure has joined #archiveteam-bs
[07:15] *** joepie91 has joined #archiveteam-bs
[07:15] *** irc.teksavvy.ca sets mode: +oo closure joepie91
[07:38] *** primus104 has joined #archiveteam-bs
[08:01] *** mistym has quit IRC (Remote host closed the connection)
[08:20] *** xtr-107 has joined #archiveteam-bs
[08:26] *** xtr-201 has quit IRC (Read error: Operation timed out)
[08:49] *** primus104 has quit IRC (Leaving.)
[09:44] *** BlueMaxim has quit IRC (Quit: Leaving)
[10:38] i'm over 50k items
[10:39] in my godaneinbox collection
[10:53] *** Sellyme has joined #archiveteam-bs
[10:55] so i have found about 11gb of warc.gz files so far in my cbsnews.com-videos collection
[10:59] damn
[11:01] 1d19h36s | ... Percent Done: 5.2% Peers: ^ 0 kB/s to 0, v 227 kB/s from 1, of 1 (Ratio: 0.0) (0s idle)
[11:11] i'm now at 12gb
[11:12] :-/
[11:12] still from the same guy godane ?
[11:12] i think so
[11:12] its all cbsnews.com web archives too
[11:12] he created that collection by the looks of things
[11:13] no reply from him yet?
[11:13] no reply yet
[11:14] i still don't see how he could have made type of mistake
[11:14] just based on the item names it should have been different
[11:15] *** primus104 has joined #archiveteam-bs
[11:15] his collection: https://archive.org/details/cbsnews.com
[11:15] and based on the collectionname too
[11:15] https://archive.org/details/cbsnews.com-20140324-205801
[11:16] thats a example of a item name
[11:16] i shouldn't have had video or dash in the dates
[11:16] still, just assume good faith and wait for his reply
[11:16] no use worrying and speculating :)
[11:17] i will have to reupload it anyway
[11:17] why?
[11:17] its not in the web collection
[11:17] for starters
[11:18] also i want to 'clean' the warc.gz out of my video tiems
[11:18] *items
[11:18] yeah but they (archive.org) can move the warc files from your collection to the correct collection
[11:18] atleast, i assume they should be able to do that
[11:18] i think they would move the full item
[11:19] not just warc.gz but everything in the item
[11:19] thats why this has to be done
[11:20] lets wait for the guy to reply or SketchCow to reply.
[11:20] i'm just downloading the warc.gz for now
[11:21] Is there a section for OEM/bundled stuff?
[11:31] *** robink has quit IRC (ny.us.hub west.us.hub)
[11:31] *** torvik has quit IRC (ny.us.hub west.us.hub)
[11:31] *** lysobit has quit IRC (ny.us.hub west.us.hub)
[11:31] *** amerrykan has quit IRC (ny.us.hub west.us.hub)
[11:31] *** Baljem_ has quit IRC (ny.us.hub west.us.hub)
[11:31] *** cloudmons has quit IRC (ny.us.hub west.us.hub)
[12:02] *** robink has joined #archiveteam-bs
[12:02] *** cloudmons has joined #archiveteam-bs
[12:02] *** Baljem_ has joined #archiveteam-bs
[12:02] *** amerrykan has joined #archiveteam-bs
[12:02] *** lysobit has joined #archiveteam-bs
[12:02] *** torvik has joined #archiveteam-bs
[12:02] *** west.us.hub sets mode: +o Baljem_
[12:06] *** username1 has joined #archiveteam-bs
[12:10] *** schbirid2 has quit IRC (Read error: Operation timed out)
[12:55] *** SadDM has joined #archiveteam-bs
[13:13] *** SN4T14 has joined #archiveteam-bs
[13:14] *** SN4T14_ has quit IRC (Ping timeout: 246 seconds)
[13:18] so i have +18gb of warc.gz in my cbsnews.com-video colleciton
[13:45] *** BiggieJon has quit IRC (Leaving.)
[14:03] *** sankin has joined #archiveteam-bs
[14:04] So here's a question... is there any reason that I shouldn't just use wpull instead of wget on a day-to-day basis?
[14:41] any plans on archiving deviantART?
[14:42] looks like the older stuff maybe premium now
[14:52] wow, that will be huge godane
[14:56] ok
[14:57] i wish we could go after the older images first
[14:58] i found a way
[14:58] www.deviantart.com/download/170936604
[14:58] we can brute force it by ranges
[14:59] godane: I remember SketchCow talking about Deviantart before. Think he knows the people behind it etc
[14:59] ok
[15:00] if anything else we at least know we can brute force it by range now
[15:00] godane: I'd like to create a project for it :)
[15:00] ok
[15:00] but it will be very big
[15:01] SketchCow needs to approve first
[15:01] ok
[15:01] i figure that
[15:04] Downloaded: 1164905 files, 709G in 2d 18h 48m 44s (3.02 MB/s) 728G 2014.11.ftp.sunet.se-X11.tar
[15:04] what?
[15:04] 28GB bigger when compressed?
[15:05] midas: I'd think it handles the small files not as efficient as your os
[15:06] hm thats one option
[15:06] *** primus104 has quit IRC (Leaving.)
[15:06] the content textfile is already 100MB in size :p
[15:12] man DeviantArt... that thing is as much a social network as it is an image sharing site. Lots of comments, and blogs, and user inter-connectedness.
[15:12] and furries
[15:12] lots and lots of furries
[15:17] How 'bout imgur?
[15:19] ah imgur, the site thats famous for being totally crap. they are rather good at breaking stuff. we could do something there yeah
[15:19] did anyone see vice's article about archive.is
[15:20] http://motherboard.vice.com/read/dear-gamergate-please-stop-stealing-our-shit
[15:23] ugh vice
[15:25] I'm not pro gamergate
[15:25] but archive.today has been used by both sides
[15:25] well, the other side has used it to prevent not-so-nice stuff from disappearing
[15:25] as is fairly common
[15:27] i never know what gamergate was
[15:28] archive.today pulls from google cache which IA doesn't
[15:28] it's some debacle about gaming and sexes
[15:30] thats what i thought it was
[15:30] how do I explain this
[15:31] http://digiday.com/brands/wtf-gamergate/
[15:31] or look at the @-replies to anyone who speaks out against gamergate on twitter (especially if they're a woman)
[15:34] or just realise that 99% is trolling and your life is better off by not getting involved in poop flinging contests
[15:35] ximm has still not replied
[15:35] *** aaaaaaaaa has joined #archiveteam-bs
[15:35] Do not archive DeviantArt
[15:35] ok
[15:36] what's going on with DA?
[15:36] SketchCow: do you know why Ximm added warc.gz to my cbsnews.com-video collection
[15:36] I would not know that, no.
[15:36] Give me an example.
[15:37] https://archive.org/details/cbsnews.com-video-2003-12-25
[15:38] Interesting.
[15:38] His name is Aaron Ximm, by the way.
[15:39] i know
[15:39] If you mailed him, I'm sure he'll respond. This happened 203 days ago, as you know - so there's maybe some thing that was being done for a reason, or an unintended overlap of IDs
[15:39] i guessing overlap
[15:40] image of one too: http://tinypic.com/r/bjg9b9/8
[15:40] SketchCow: just know its not just one
[15:40] its lot of them in the cbsnews.com-video collection
[15:41] best i can tell i happens to items with less videos in them
[15:42] his collection for cbsnews.com: https://archive.org/details/cbsnews.com
[15:42] Are the videos in the items that have the warc's combined less then 5 GB?
[15:42] normal item name: https://archive.org/details/cbsnews.com-20140324-205801
[15:42] and an item with a lot of videos that doesn't have warc's more then 5 GB?
[15:42] ?
[15:43] not of the cbsnews.com videos have anything close to 5gb
[15:43] ah ok
[15:43] it was like 200mb to 400mb between 2004 to 2006
[15:44] He might be finding all items with a name cbsnews.com-*
[15:44] Then checks them for how big they are
[15:45] and the items that aren't the limit size get new warc's
[15:45] so the warc's end up in your items
[15:45] but that's just speculation, not sure about that
[15:46] for example liveweb items are all max 5 GB, so that might also be the case for those websites and newsites crawls of aaron
[15:47] ok
[15:47] balrog: some leaked email document ended up on PDFy about gamergate also
[15:47] got reasonably much traffic
[15:48] (frets)
[15:48] the size limit would have to be much smaller
[15:48] What? Deviantart? Why mention Deviantart? It's alright isn't it? It's not closing is it? IS IT? Answer me!!1!!
[15:48] * antomatic paces up and down
[15:49] not that we know of
[15:49] antomatic: its not closing
[15:49] phew
[15:49] we are not archiving it
[15:49] *** mistym has joined #archiveteam-bs
[15:50] SketchCow: i emailed him yesterday
[15:50] i have not got a reply yet
[15:51] *** mistym has quit IRC (Remote host closed the connection)
[15:52] He's probably afraid of contacting you back; you are the great godane archiving machine after all.
[15:53] also his score is 86k items
[15:53] i'm past +270k
[15:53] :-D
[15:54] https://i.imgur.com/A7cdb.jpg fits godane
[16:19] *** mistym has joined #archiveteam-bs
[16:34] *** aaaaaaaaa has quit IRC (Leaving)
[16:40] *** aaaaaaaaa has joined #archiveteam-bs
[17:11] *** primus104 has joined #archiveteam-bs
[17:14] *** sankin has quit IRC (Leaving.)
[17:22] *** mistym has quit IRC (Remote host closed the connection)
[17:38] *** mistym has joined #archiveteam-bs
[18:54] *** tfgbd has quit IRC (Ping timeout: 265 seconds)
[19:06] *** primus_ has quit IRC (Remote host closed the connection)
[19:13] *** primus has joined #archiveteam-bs
[19:48] haha, nice. i want to buy a book but all i get is pdfs and a torrent :) http://www.engineerguy.com/fourier/
[19:58] username1: ehhh, they need to do something about their presentation
[19:58] first mental association that page brought up was "ugh, affiliate marketing ebooks"
[19:59] :P
[19:59] :)
[19:59] watch the youtube vids, it's awesome
[20:01] it seems cool, it's just the online presentation that's a bit eh
[20:03] *** ex-parro1 has joined #archiveteam-bs
[20:12] *** eprillios has quit IRC (Ping timeout: 252 seconds)
[20:12] *** mistym has quit IRC (Remote host closed the connection)
[20:15] *** eprillios has joined #archiveteam-bs
[20:19] SketchCow: you were faster than me :)
[20:20] *** balrog_ has joined #archiveteam-bs
[20:20] *** swebb sets mode: +o balrog_
[20:21] *** Kazzy_ has joined #archiveteam-bs
[20:21] *** Arkiver2 has joined #archiveteam-bs
[20:21] zzz, what happened
[20:21] *** arkiver has quit IRC (Write error: Broken pipe)
[20:21] *** balrog has quit IRC (Read error: Connection reset by peer)
[20:21] *** RainbowCo has quit IRC (Read error: Connection reset by peer)
[20:21] *** GLaDOS has quit IRC (Write error: Connection reset by peer)
[20:21] *** Kazzy has quit IRC (Write error: Connection reset by peer)
[20:21] *** Kazzy_ is now known as Kazzy
[20:22] *** balrog_ is now known as balrog
[20:22] *** GLaDOS has joined #archiveteam-bs
[20:22] *** swebb sets mode: +o GLaDOS
[20:23] *** deathy___ has joined #archiveteam-bs
[20:24] *** RainbowCo has joined #archiveteam-bs
[20:33] *** mistym has joined #archiveteam-bs
[20:43] *** antomati_ has joined #archiveteam-bs
[20:45] *** antomatic has quit IRC (Read error: Operation timed out)
[21:06] *** mistym has quit IRC (Remote host closed the connection)
[21:09] *** antomatic has joined #archiveteam-bs
[21:09] *** antomati_ has quit IRC (Read error: Connection reset by peer)
[21:23] *** mistym has joined #archiveteam-bs
[22:22] *** GLaDOS has quit IRC (Ping timeout: 272 seconds)
[23:11] *** GLaDOS has joined #archiveteam-bs
[23:11] *** swebb sets mode: +o GLaDOS
[23:40] *** Jonimus has quit IRC (Read error: Operation timed out)