[00:00] *** dan- has joined #archiveteam-bs
[00:14] 21/63 WARCs checked, no red rain so far
[00:14] archivebot warcs that is
[00:14] it's possible I'm not looking for the right terms. at the moment I grep -i cf-hot
[00:14] er cf-host
[00:23] yipdw: i thought it was cf-ray?
[00:36] CF-RAY is supposed to be there
[00:56] *** nyany has quit IRC (Leaving)
[00:58] *** nyany has joined #archiveteam-bs
[01:00] *** joepie91 has joined #archiveteam-bs
[01:02] i'm uploading more rev3games youtube channel videos
[01:03] i'm close to getting all of that uploaded btw
[01:03] i have about 500 videos to go now
[01:16] *** fie has quit IRC (Ping timeout: 246 seconds)
[01:28] *** fie has joined #archiveteam-bs
[01:56] *** odemg has quit IRC (Remote host closed the connection)
[01:56] *** odemg has joined #archiveteam-bs
[01:57] so now i understand why jason was doing hosting stuff after a HEART ATTACK: http://ascii.textfiles.com/archives/5139
[02:19] SketchCow: jeff gave me a reply
[02:20] he talked about pdfinfo cannot process
[02:20] here is my pdfinfo doing it right: http://pastebin.com/7TxNACVr
[02:24] checking the history log and can't get the pdfinfo log
[02:24] i want to see how it's failing since mine is not
[02:25] ok here is a major diff below:
[02:26] Creator: PScript5.dll Version 5.2.2
[02:26] Producer: Acrobat Distiller 6.0 (Windows)
[02:26] where EJ720637.pdf is this: Producer: Acrobat Web Capture 7.0
[02:27] *** schbirid2 has joined #archiveteam-bs
[02:30] *** schbirid has quit IRC (Read error: Operation timed out)
[02:45] *** arkiver has joined #archiveteam-bs
[03:01] *** VADemon has quit IRC (Quit: left4dead)
[03:32] *** dxrt- has joined #archiveteam-bs
[03:32] *** dxrt- has quit IRC (Remote host closed the connection)
[03:34] *** dxrt- has joined #archiveteam-bs
[03:51] *** Stiletto has quit IRC (Read error: Operation timed out)
[04:22] *** Stiletto has joined #archiveteam-bs
[04:26] *** pizzaiolo has left
[04:51] *** dashcloud has quit IRC (Read error: Operation timed out)
[04:51] *** dashcloud has joined #archiveteam-bs
[05:01] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[05:02] *** BlueMaxim has joined #archiveteam-bs
[05:13] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[05:21] *** Sk1d has joined #archiveteam-bs
[05:21] *** Sk1d has quit IRC (Connection Closed)
[05:25] *** godane has quit IRC (Leaving.)
[06:08] *** HCross has quit IRC (Read error: Connection reset by peer)
[06:08] *** HCross has joined #archiveteam-bs
[06:10] *** GE has joined #archiveteam-bs
[07:17] *** dashcloud has quit IRC (Read error: Operation timed out)
[07:21] *** dashcloud has joined #archiveteam-bs
[07:21] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[07:35] *** Ravenlow has joined #archiveteam-bs
[07:55] *** Panasonic has joined #archiveteam-bs
[08:01] *** Ravenlow has quit IRC (Ping timeout: 492 seconds)
[08:41] *** Honno has joined #archiveteam-bs
[08:42] *** Panasonic has quit IRC (Ping timeout: 492 seconds)
[08:53] *** GE has quit IRC (Quit: zzz)
[10:33] *** vitzli has joined #archiveteam-bs
[10:41] *** odemg has quit IRC (Remote host closed the connection)
[10:58] *** HP has quit IRC (Read error: Operation timed out)
[10:59] *** odemg has joined #archiveteam-bs
[11:09] *** godane has joined #archiveteam-bs
[11:16] *** HP has joined #archiveteam-bs
[11:24] So, is anyone from CloudFlare already in contact with IA, as they promised their customers? (See http://pastebin.com/pUnKJE3J)
[11:29] "According to Cloudflare, Authy data was not discovered in any known cache. However, we are nonetheless treating this incident as if we have been impacted." - At least some people taking this seriously
[11:49] *** odemg has quit IRC (Remote host closed the connection)
[11:50] *** odemg has joined #archiveteam-bs
[11:59] *** vitzli has quit IRC (Quit: Leaving)
[12:19] It doesn't even really matter whether it is in a public cache. It could have been siphoned off anyway. The bug was so easy that any senior script kiddie could have hit and used it. Also consider _all_ data that flowed through cloudflare in the affected timespan as breached. And use better security next time.
[12:29] mkram: FullACK. That's why I wrote a script which lists every CloudFlare-hosted site from a provided file and am in the process of changing passwords and API keys...
[12:32] The bad thing is that you have to, rather than getting forced password resets from the sites that know they use cf...
[12:33] TobiX: are you sure that the sites were not infiltrated and that you changing those things is actually worth doing before the site announces complete confidence in being not compromised?
[12:33] :]
[12:38] *** Stiletto has quit IRC (Read error: Operation timed out)
[12:49] schbirid2: DAMMIT :)
[12:51] *** BlueMaxim has quit IRC (Quit: Leaving)
[13:03] *** pizzaiolo has joined #archiveteam-bs
[13:26] *** pizzaiolo has quit IRC (Ping timeout: 250 seconds)
[13:26] *** pizzaiolo has joined #archiveteam-bs
[13:27] *** VADemon has joined #archiveteam-bs
[14:07] *** odemg has quit IRC (Remote host closed the connection)
[14:13] Given that it turns out Gna code repositories can be grabbed anonymously in full by simple anonymous rsync, is there some awesomely powerful Archiveteam infrastructure that I should instruct to do so? (I don't have great bandwidth myself.)
[14:22] How much data are we talking about, jtn2 ?
[14:27] PurpleSym: I haven't figured that out yet. There's probably some way to get rsync to tell us without downloading it all, isn't there.
[14:29] I don’t think there is.
[14:30] *** yuitimoth has quit IRC (Remote host closed the connection)
[14:30] *** yuitimoth has joined #archiveteam-bs
[14:30] TobiX: their statement is a lie, btw
[14:30] "Over the last week, we've worked with these caches to discover what customers may have had sensitive information exposed and ensure that the caches are purged."
[14:31] this is literally impossible
[14:31] they've only worked with a *few* caches
[14:31] but there are thousands of private scrapers
[14:31] whose dumps they are never going to get access to
[14:31] well, I guess this will be a lesson to those not already using a decent password manager
[14:31] Kaz: while a password manager reduces impact, it doesn't solve the problem
[14:32] Kaz: there's not much you can do as a user when an infra provider is spewing private data left and right, other than asking the services you use to stop using that infra provider
[14:32] you're right, but yeah I'm mainly focused on reducing impact for now
[14:32] (which you should do, btw)
[14:32] essentially, 'everything reasonably in my control'
[14:32] rsync, gna> There are ~1500 projects. As an example, Freeciv is 2.9G (I'm guessing that's at the large end and most are tiny though).
[14:37] *** fenn has joined #archiveteam-bs
[14:38] * jtn2 -> #gnarm
[15:03] Gna> details on #gnarm / wiki but in summary I've found <200G of bulk files (including all code repos)
[15:04] *** odemg has joined #archiveteam-bs
[16:02] *** LastNinja has quit IRC (Read error: Operation timed out)
[16:12] *** DFJustin has quit IRC (Read error: Connection reset by peer)
[16:12] *** DFJustin has joined #archiveteam-bs
[16:12] *** swebb sets mode: +o DFJustin
[16:15] *** mkram has quit IRC (Read error: Operation timed out)
[16:18] *** Aranje has joined #archiveteam-bs
[16:22] *** root___ has joined #archiveteam-bs
[16:22] *** root___ is now known as mkram
[16:31] *** LastNinja has joined #archiveteam-bs
[16:52] *** LastNinja has quit IRC (Read error: Operation timed out)
[16:53] can someone confirm: wpull --directory-prefix /tmp/nodirectoryexistshere www.time.com
[16:53] leading to a crash because the directory does not get created and wpull searches for a tmp file inside
[16:55] *** pizzaiolo has quit IRC (Ping timeout: 245 seconds)
[17:00] schbirid2: confirmed here
[17:00] thanks
[17:10] *** pizzaiolo has joined #archiveteam-bs
[17:10] *** pizzaiolo has quit IRC (Remote host closed the connection)
[17:13] *** wm_ has quit IRC (Quit: gone)
[17:13] *** pizzaiolo has joined #archiveteam-bs
[17:19] *** LastNinja has joined #archiveteam-bs
[17:22] *** GE has joined #archiveteam-bs
[17:39] *** wm_ has joined #archiveteam-bs
[17:45] *** wm_ has quit IRC (Read error: Connection timed out)
[17:45] *** wm_ has joined #archiveteam-bs
[18:42] SketchCow: i think the bad content error with pdfs is cause of AcroForm being in pdfs
[18:43] pdfs with that are the ones rejected
[18:44] and the form crap is cause of site search on the last page
[18:59] *** pizzaiolo has quit IRC (Read error: Operation timed out)
[19:11] *** Ravenloft has joined #archiveteam-bs
[19:39] *** odemg has quit IRC (Remote host closed the connection)
[19:52] *** odemg has joined #archiveteam-bs
[20:36] *** odemg has quit IRC (Remote host closed the connection)
[20:40] *** odemg has joined #archiveteam-bs
[20:46] *** nrp3c has quit IRC (WeeChat 1.5)
[20:48] *** BiggieJon has joined #archiveteam-bs
[20:50] godane: Do you think it's an IA problem or a bad form problem?
[20:51] *** ItsYoda has quit IRC (Remote host closed the connection)
[20:56] *** PotcFdk has quit IRC (Ping timeout: 506 seconds)
[20:59] *** PotcFdk has joined #archiveteam-bs
[21:00] SketchCow my man, how are you doing today?
[21:02] Nothing kills me, I am obviously immortal
[21:06] *** ItsYoda has joined #archiveteam-bs
[21:09] Just noticed https://www.reddit.com/r/Archiveteam/comments/5u7d4f/flavorsme_closing_down/ , do we already have a project for it?
[21:11] arkiver, ^
[21:25] *** odemg has quit IRC (Read error: Connection reset by peer)
[21:26] *** odemg has joined #archiveteam-bs
[21:29] *** odemg has quit IRC (Remote host closed the connection)
[21:31] *** odemg has joined #archiveteam-bs
[21:35] *** odemg has quit IRC (Remote host closed the connection)
[21:37] *** odemg has joined #archiveteam-bs
[21:38] *** username1 has joined #archiveteam-bs
[21:41] *** schbirid2 has quit IRC (Read error: Operation timed out)
[21:48] *** odemg has quit IRC (Remote host closed the connection)
[21:57] *** odemg has joined #archiveteam-bs
[22:26] *** BlueMaxim has joined #archiveteam-bs
[22:27] I (65%) and PurpleSym (35%) will be done downloading the rsync grab parts of Gna!
[22:27] Any idea how to upload to IA?
[22:28] I think we're quite good at that
[22:28] *** icedice has joined #archiveteam-bs
[22:28] archive.org/upload, give it a good name and fill out as much as you can in terms of metadata (I'm assuming as this is rsync it's just going to be the files, and no WARCs etc)
[22:29] Kaz: it is a combined half million files :)
[22:29] but only 180GiB
[22:29] try to split the data up, around ~40gb per upload item
[22:30] if you can sort alphabetically by project name, for example
[22:43] *** GE has quit IRC (Remote host closed the connection)
[22:44] Also please upload them as ZIP files if they're lots of tiny files (which is what I would expect source code to be organized as)
[22:44] so i found another magazine to grab
[22:44] called AM NewYork
[22:48] *** kurt_ has joined #archiveteam-bs
[22:56] rocode: no, but we will
[23:08] *** schbirid2 has joined #archiveteam-bs
[23:11] *** username1 has quit IRC (Read error: Operation timed out)
[23:47] *** Ravenloft has quit IRC (Read error: Connection reset by peer)
[23:51] *** Honno has quit IRC (Ping timeout: 370 seconds)
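
The [22:29] and [22:30] advice to split the data into upload items of roughly 40 GB, sorted alphabetically by project name, could be sketched as below. This is only an illustration under stated assumptions: the function name `batch_projects`, the `sizes` mapping (project name to size in bytes) and the exact 40 GiB limit are hypothetical and not something anyone in the channel specified.

```python
from typing import Dict, List

GIB = 1024 ** 3  # bytes in one GiB


def batch_projects(sizes: Dict[str, int], limit: int = 40 * GIB) -> List[List[str]]:
    """Group project names, in alphabetical order, into batches of at most `limit` bytes.

    `sizes` maps a (hypothetical) project name to its size in bytes.
    A single project larger than `limit` ends up in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name in sorted(sizes, key=str.lower):
        size = sizes[name]
        # Start a new batch when adding this project would exceed the limit.
        if current and current_size + size > limit:
            batches.append(current)
            current = []
            current_size = 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each resulting batch could then be packed into one ZIP file and uploaded as one item, which keeps alphabetically adjacent projects together.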