[00:07] *** j08nY has quit IRC (Quit: Leaving)
[00:09] *** DoomTay has quit IRC (Ping timeout: 268 seconds)
[00:12] *** DoomTay has joined #archiveteam-bs
[00:19] *** JesseW has joined #archiveteam-bs
[00:23] *** VADemon has quit IRC (Quit: left4dead)
[00:26] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[00:27] *** ris has quit IRC ()
[00:42] *** dashcloud has quit IRC (Read error: Operation timed out)
[00:46] *** dashcloud has joined #archiveteam-bs
[00:57] DoomTay: What is your deal?
[00:58] DoomTay: btw, your edit to http://archiveteam.org/index.php?title=Template:IRC stuffed all the pages using that template into Category:Templates. Fixing now.
[00:59] Huh. I was wondering why that one wasn't categorized into Templates
[01:00] and apparently my fix borked the site. wheee
[01:01] but it's back now
[01:09] ..though it frequently results in bouts of a 508
[01:11] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[01:20] *** dashcloud has quit IRC (Read error: Operation timed out)
[01:21] *** RichardG_ has quit IRC (Ping timeout: 258 seconds)
[01:24] *** RichardG has joined #archiveteam-bs
[01:27] *** dashcloud has joined #archiveteam-bs
[01:28] *** vitzli has joined #archiveteam-bs
[01:33] *** dashcloud has quit IRC (Read error: Operation timed out)
[01:36] *** dashcloud has joined #archiveteam-bs
[01:38] Is there anything that resuscitates deleted IMDb comments? Because Wayback Machine isn't helping
[01:42] *** aschmitz has joined #archiveteam-bs
[01:46] godane: Do you have a full copy of NTRS, or are you just going slowly at it?
[01:48] *** dashcloud has quit IRC (Read error: Operation timed out)
[01:49] i'm grabbing them slowly
[01:49] year by year
[01:52] *** dashcloud has joined #archiveteam-bs
[02:16] *** BlueMaxim has joined #archiveteam-bs
[02:18] *** nickname_ has joined #archiveteam-bs
[02:31] *** JesseW has joined #archiveteam-bs
[02:52] *** tomwsmf-a has joined #archiveteam-bs
[03:21] *** RichardG has quit IRC (Read error: Operation timed out)
[03:21] *** RichardG has joined #archiveteam-bs
[03:49] *** RichardG has quit IRC (Read error: Operation timed out)
[03:49] *** RichardG has joined #archiveteam-bs
[03:51] *** nickname_ has quit IRC (Read error: Operation timed out)
[03:53] *** DoomTay has quit IRC (Ping timeout: 270 seconds)
[03:57] *** DoomTay has joined #archiveteam-bs
[04:05] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:05] *** Start has quit IRC (Read error: Connection reset by peer)
[04:14] *** Sk1d has joined #archiveteam-bs
[04:47] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[04:57] Lord_Nigh: I had no idea your actual nick was Nightmare. I thought it was a reference to Monty Python...
[04:57] We are the lords who say Nigh!
[05:02] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[05:09] nope
[05:10] it's lord_nightmare but for historical reasons (oldest surviving irc network) efnet has a 9 char nickname limit
[05:10] on all other irc networks i'm Lord_Nightmare
[05:11] *** Sk1d has joined #archiveteam-bs
[05:13] *** tomwsmf-a has joined #archiveteam-bs
[05:16] *** DoomTay has left
[05:20] *** tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
[05:22] *** DoomTay has joined #archiveteam-bs
[05:40] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[06:02] *** hook54321 has joined #archiveteam-bs
[06:14] *** DoomTay has quit IRC (Quit: Page closed)
[06:57] *** dashcloud has quit IRC (Ping timeout: 244 seconds)
[06:58] *** dashcloud has joined #archiveteam-bs
[07:23] *** schbirid has joined #archiveteam-bs
[07:31] *** dashcloud has quit IRC (Read error: Operation timed out)
[07:34] *** dashcloud has joined #archiveteam-bs
[07:38] *** remsen has quit IRC (ZNC 1.6.2 - http://znc.in)
[07:38] *** remsen has joined #archiveteam-bs
[08:01] *** godane has quit IRC (Quit: Leaving.)
[08:03] *** godane has joined #archiveteam-bs
[08:37] *** vitzli has quit IRC (Leaving)
[09:01] *** hook54321 has quit IRC (Quit: Connection closed for inactivity)
[09:35] *** dashcloud has quit IRC (Read error: Operation timed out)
[09:38] *** dashcloud has joined #archiveteam-bs
[10:13] *** dashcloud has quit IRC (Read error: Operation timed out)
[10:16] *** dashcloud has joined #archiveteam-bs
[10:19] *** dashcloud has quit IRC (Read error: Operation timed out)
[10:23] *** dashcloud has joined #archiveteam-bs
[11:02] I think i have to kill 2 google code tasks running on my machine. They are at 3.5 million requests right now, with 8 million more to do. Currently using 6 GB of RAM each.
[11:04] Yeah sometimes you get some infinite recurring ones
[11:04] either it loops on the same url over and over again or it just gets confused on all the tags of the pages
[11:27] *** Fusl has quit IRC (Ping timeout: 260 seconds)
[11:58] Does anybody here know json and can help me?
[11:59] i have this very long and inline json data which i would like to have formatted normally
[12:09] luckcolor, do you just need it formatted once (online tool) or are you looking for a long term solution? (code)
[12:10] for the former (first result on google) https://jsonformatter.curiousconcept.com/
[12:11] ah this work
[12:11] *works
[12:11] but i just discovered that this doesn't help me a lot
[12:11] 100 urls over 700 i read about
[12:12] Anyway thanks Fletcher :P
[12:13] np
[12:22] *** jut has joined #archiveteam-bs
[12:26] *** Fusl has joined #archiveteam-bs
[12:38] THE NEXT GREAT GODANE INBOX CATTLE DRIVE HAS BEGUN
[12:38] 29,000 items going into already existing or new collections
[12:41] I've got four threads doing the moves, which I think is basically enough.
[12:42] rip IA
[12:45] 7,000 items moved already!
[12:45] I'm using a method that was agreed upon that doesn't kill IA
[12:46] Also removes a lot of error issues, where it will flat out reject "you done fucked up" instead of "well, let me try since you said URK"
[12:48] *** ItsYoda has quit IRC (Ping timeout: 260 seconds)
[12:54] SketchCow you are mass uploading or mass moving to a disk drive? :D
[12:57] Neither in this case. I am doing mass metadata changes so items go from a central godane upload pool into a few dozen potential collections on the archive.
[12:57] *** ItsYoda has joined #archiveteam-bs
[12:57] ah ok
[12:58] But 30,000 items done in the way I'm doing them (the script does one by one so if there's queue backup, it'll stop doing it), can still take quite a bit of time.
[12:59] Examples of new collections created in the last hour for this: https://archive.org/details/the-laura-ingraham-show https://archive.org/details/the-sean-hannity-show
[12:59] cool
[13:37] *** anjacks0n has joined #archiveteam-bs
[13:43] *** anjacks0n has quit IRC (anjacks0n)
[13:44] *** anjacks0n has joined #archiveteam-bs
[13:57] *** dashcloud has quit IRC (Read error: Operation timed out)
[14:00] *** dashcloud has joined #archiveteam-bs
[14:34] *** BlueMaxim has quit IRC (Quit: Leaving)
[14:55] *** DoomTay has joined #archiveteam-bs
[14:55] *** nickname_ has joined #archiveteam-bs
[14:55] *** Aranje has joined #archiveteam-bs
[15:04] *** j08nY has joined #archiveteam-bs
[15:06] godane: When you have a chance, please look at https://archive.org/details/austinchronicle - several PDFs seem to be 100% blank (I checked)
[15:08] In other news: There's "Ensign Magazine" https://archive.org/details/Ensign_Magazine which is a Mormon publication, and there's a boat magazine called The Ensign
[15:08] They must fucking HATE each other
[15:10] Yeah well I once saw that some of the stuff at https://archive.org/details/doom-cds seems to be broken too
[15:10] Lemme try downloading and plugging in the smallest one there
[15:15] So which issues in particular are 100% blank?
[15:15] I asked Godane.
[15:20] Yup, the ISO at https://archive.org/details/DoomFever1995MapleMedia is "corrupted"
[15:21] *** nickname_ has quit IRC (Read error: Operation timed out)
[15:44] *** nickname_ has joined #archiveteam-bs
[15:52] *** JesseW has joined #archiveteam-bs
[15:54] https://en.wikipedia.org/wiki/MediaWiki_talk:Spam-blacklist#archive.is
[15:54] :|
[16:09] *** mr-b has quit IRC (Ping timeout: 246 seconds)
[16:10] *** yakfish has quit IRC (Ping timeout: 246 seconds)
[16:10] *** yakfish has joined #archiveteam-bs
[16:11] *** mr-b has joined #archiveteam-bs
[16:12] *** nickname_ has quit IRC (Ping timeout: 492 seconds)
[16:12] *** nickname_ has joined #archiveteam-bs
[16:13] eh, from my point of view, banning archive.is from wikipedia makes it less well known, which means it will be longer before the pressure on it is sufficient to kill it -- which I'm happy about
[16:16] *** jut has quit IRC (Leaving)
[16:20] joepie91: btw, as of today, it appears that it may be un-blacklisted: https://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/Archive.is_RFC_4
[16:23] I'm not really sure why grab-site is consuming a full CPU core grabbing a site...?
[16:23] that seems unnecessary
[16:24] ... okay, so it only does that when it's failing to fetch a URL...
[16:24] it still uses a lot otherwise but not a full core
[16:28] *** anjacks0n has quit IRC (anjacks0n)
[16:36] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[16:37] SketchCow: i think some pdfs from them are just blank
[16:38] from Austin Chronicle
[16:39] Oh, I see it now
[16:43] so
[16:43] I might make a PR for wpull soon to try and implement cloudflare "ddos protection" bypass :|
[16:43] fucking cloudflare
[16:46] Something tells me that ddos protection is the real reason why the job on http://archive.fbi.ninja/lmao/ "finished" so quickly
[16:54] example pdf that's blank: http://www.austinchronicle.com/download/2007-10-12/chronicle.pdf
[16:58] DoomTay: yeah, it is.
[16:58] DoomTay: irritatingly it seems to ask for the 'ddos captcha' again after X requests
[16:58] so just exporting cookies and useragent from the browser will not get you past it for the entire job
[16:58] meaning this needs to be supported in the actual downloading tool to work
[16:58] because it can encounter the wall again at any point
[16:58] *** anjacks0n has joined #archiveteam-bs
[16:58] and you get handed a fresh 'clearance cookie' every time you encounter the wall
[16:59] it's also a hilariously bad captcha anyway: http://storage3.static.itmages.com/i/16/0622/h_1466613477_7341936_022a37d420.png
[16:59] but still breaks archival
[16:59] so, good job cloudflare, you fucked up legitimate bots and didn't hamper the ddos kids
[16:59] thanks for breaking the web
[16:59]
[17:00] Maybe persuade the guy behind the site to not use cloudflare, at least for a while?
[17:00] Unless it wasn't his decision to make?
[17:01] *** ris has joined #archiveteam-bs
[17:01] DoomTay: doesn't solve the bigger problem
[17:01] a ton of sites use cloudflare
[17:01] we need to be able to deal with that
[17:04] Can I just say, joepie
[17:04] My favorite part of the post-IA-DDOS fallout was watching anonymous groups throw each other under a bus
[17:07] <3
[17:17] *** tomwsmf-a has joined #archiveteam-bs
[17:29] *** VADemon has joined #archiveteam-bs
[17:40] *** JW_work1 has joined #archiveteam-bs
[17:41] *** JW_work has quit IRC (Read error: Operation timed out)
[17:44] *** anjacks0n has quit IRC (anjacks0n)
[17:50] *** JW_work1 has quit IRC (Quit: Leaving.)
[17:51] *** JW_work has joined #archiveteam-bs
[17:59] *** JW_work has quit IRC (Quit: Leaving.)
[18:00] *** tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
[18:02] *** JW_work has joined #archiveteam-bs
[18:06] *** zino has quit IRC (Quit: Leaving)
[18:12] *** Start has joined #archiveteam-bs
[18:16] *** zino has joined #archiveteam-bs
[18:16] *** JW_work has quit IRC (Quit: Leaving.)
[18:16] *** JW_work has joined #archiveteam-bs
[18:31] *** nickname_ has quit IRC (Ping timeout: 492 seconds)
[19:13] *** dashcloud has quit IRC (Read error: Operation timed out)
[19:16] *** dashcloud has joined #archiveteam-bs
[19:21] *** anjacks0n has joined #archiveteam-bs
[19:30] it really fucks with me that Ruby Range objects have #cover?, #include?, and #overlaps? methods
[19:30] the first two especially, the difference is difficult to explain
[19:31] #include? appears to require iterable objects whereas #cover? requires only a partial order
[19:31] computers.txt
[20:25] *** anjacks0n has quit IRC (anjacks0n)
[20:33] *** j08nY has quit IRC (Quit: Leaving)
[21:03] *** schbirid has quit IRC (Quit: Leaving)
[22:03] *** anjacks0n has joined #archiveteam-bs
[22:04] *** anjacks0n has quit IRC (Client Quit)
[22:12] *** RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue)
[22:23] *** anjacks0n has joined #archiveteam-bs
[22:46] *** dashcloud has quit IRC (Read error: Operation timed out)
[22:50] *** dashcloud has joined #archiveteam-bs
[22:52] *** anjacks0n has quit IRC (anjacks0n)
[22:59] *** RichardG has joined #archiveteam-bs
[23:06] *** RichardG_ has joined #archiveteam-bs
[23:11] *** RichardG_ has quit IRC (Ping timeout: 250 seconds)
[23:11] *** RichardG has quit IRC (Ping timeout: 370 seconds)
[23:12] *** RichardG has joined #archiveteam-bs
[23:24] *** hook54321 has joined #archiveteam-bs
[23:26] *** tomwsmf-a has joined #archiveteam-bs
[23:54] *** BlueMaxim has joined #archiveteam-bs