[00:18] its done HCross2
[00:18] All setup and confirmed working, it lives in #torarchivebot channel
[00:24] *** Sue has joined #archiveteam-bs
[00:28] *** BlueMaxim has joined #archiveteam-bs
[01:11] *** Sue_ has joined #archiveteam-bs
[01:18] *** dashcloud has quit IRC (Remote host closed the connection)
[01:23] *** dashcloud has joined #archiveteam-bs
[01:25] *** pizzaiolo has quit IRC (Quit: pizzaiolo)
[01:48] *** j08nY has quit IRC (Quit: Leaving)
[02:06] *** DopefishJ is now known as DFJustin
[02:55] *** BubuAnabe has quit IRC (Ping timeout: 268 seconds)
[03:48] *** icedice has quit IRC (Read error: Operation timed out)
[03:49] *** qw3rty2 has joined #archiveteam-bs
[03:54] *** qw3rty has quit IRC (Read error: Operation timed out)
[04:30] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[04:36] *** Sk1d has joined #archiveteam-bs
[04:44] *** underscor has quit IRC (Read error: Operation timed out)
[05:07] *** Harzilein has quit IRC (Ping timeout: 260 seconds)
[05:10] *** underscor has joined #archiveteam-bs
[05:10] *** swebb sets mode: +o underscor
[05:43] *** Famicoman has quit IRC (Ping timeout: 260 seconds)
[05:51] *** Famicoman has joined #archiveteam-bs
[06:18] *** Honno has joined #archiveteam-bs
[06:30] *** acridAxid has quit IRC (Quit: marauder)
[07:40] *** acridAxid has joined #archiveteam-bs
[08:09] *** Famicoman has quit IRC (Ping timeout: 260 seconds)
[08:26] *** Famicoman has joined #archiveteam-bs
[08:56] *** Honno has quit IRC (Read error: Operation timed out)
[09:27] *** SHODAN_UI has joined #archiveteam-bs
[09:30] *** j08nY has joined #archiveteam-bs
[09:43] *** kristian_ has joined #archiveteam-bs
[10:19] *** SHODAN_UI has quit IRC (Remote host closed the connection)
[10:42] *** Famicoman has quit IRC (Ping timeout: 260 seconds)
[10:51] *** Famicoman has joined #archiveteam-bs
[11:10] *** pizzaiolo has joined #archiveteam-bs
[12:18] *** SHODAN_UI has joined #archiveteam-bs
[12:32] HCross2, more comics on the way in, just got up the entire DC chronology and now working on marvels
[12:54] *** Harzilein has joined #archiveteam-bs
[13:25] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[13:26] *** BlueMaxim has joined #archiveteam-bs
[13:26] *** pizzaiolo has quit IRC (Read error: Operation timed out)
[14:19] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[14:26] *** bmcginty has quit IRC (Ping timeout: 268 seconds)
[14:29] https://archive.org/details/gna_tickets "The item is not available due to issues with the item's content."
[14:29] Anyone know what this means?
[14:30] (I'm finally going over others' work on Gna.)
[14:36] *** Asparagir has quit IRC (Asparagir)
[14:37] Also, ISTR something go by about how items from us on archive.org should be tagged as "Archive Team" somehow. Is that retrospective -- should I tell someone about AT-related items
[14:37] >
[14:37] #?
[14:40] *** kristian_ has quit IRC (Quit: Leaving)
[14:44] *** pizzaiolo has joined #archiveteam-bs
[14:58] jtn2: it means a copyright claim or something like that
[15:02] ugh
[15:02] How does one find out what exactly it was? Will the item owner have more info?
[15:03] (That's Zeryl, but they're not here any more. Can probably dig out their email address)
[15:03] Could it have been automatically flagged from a malware scan?
[15:09] *** pie_ has joined #archiveteam-bs
[15:10] Lord_Nigh, any chance you know of a magical way to run windows steam games from linux steam?
[15:10] i can get the package with download_depot but i cant actually start it with steam nor can i just run wine game.exe
[15:16] If I suspect someone (Zeryl) caused some stuff to be ingested into the Wayback Machine, is there any way to verify this?
[15:25] nevermind i was starting the wrong exe *facepalm*
[15:25] it autmatically starts steam
[15:26] hm nevermind. it starts steam but doesnt run :/
[15:30] jtn2: Marked as spam. https://catalogd.archive.org/log/672398126
[15:35] PurpleSym: argh. By Jeff Kaplan, presumably. Any idea if I can appeal this? (I am assuming it is not in fact spam; Zeryl appeared to be acting in good faith and their other items are good.)
[15:36] (Thanks for digging that out)
[15:38] I don’t know. info@archive.org ?
[15:38] It's quite possible that some Gna tickets do have spammy content, although it appeared magically immune to spam.
[15:38] I did have to flag things a few times though.
[15:39] PurpleSym: should I say this is an Archive Team project, do you think?
[15:56] *** superkuh has quit IRC (Remote host closed the connection)
[15:57] *** superkuh has joined #archiveteam-bs
[16:30] *** pizzaiolo has quit IRC (Read error: Operation timed out)
[16:32] *** pizzaiolo has joined #archiveteam-bs
[16:35] *** icedice has joined #archiveteam-bs
[16:44] *** ivan has quit IRC (Leaving)
[16:45] *** ivan has joined #archiveteam-bs
[16:49] voidsta, git in hur
[17:15] *** Swizzle_ has joined #archiveteam-bs
[17:20] *** BubuAnabe has joined #archiveteam-bs
[17:27] *** Swizzle has quit IRC (Read error: Operation timed out)
[17:35] Five million URLs completed on the Tilt API grab. Unfortunately, the queue has been growing again since yesterday evening and is now at 6.31M URLs. I've changed the concurrency and delay settings a few hours ago and am now retrieving about 50k URLs per hour (previously 30k).
[17:37] *** ivan is now known as marvinw
[18:45] *** BartoCH has quit IRC (Remote host closed the connection)
[18:49] *** BartoCH has joined #archiveteam-bs
[18:59] GO JAA GO
[19:36] What do I do when grab-site is stuck on a url?
[19:40] Leave it. It'll sort itself
[19:41] hook54321: are you using phantomjs or YouTube-dl?
[19:43] HCross2: Whatever is on by default on grab-site
[19:43] So neither
[19:44] It fixed itself
[20:05] HCross2: So, do you want to start an Al Jazeera project or should I continue throwing the URLs into ArchiveBot?
[20:06] JAA: I'd do something by myself but all my crawl boxes are in use. Keep loading archivebot and I'll free some room
[20:06] Ok.
[20:07] Arguably, the most important parts (the news pages) have already been archived. But there's a lot more content to grab, obviously.
[20:07] I also have a huge list of social media accounts, but I'm not sure how to reliably grab those.
[20:08] I've figured out something for Instagram, but other sites I'm not so sure.
[20:16] HCross2, do you want 1.2TB of manga?
[20:17] Sure
[20:20] so i'm uploading another 2849 pdfs for the ERIC archive
[20:30] btw eric.ed.gov https sometimes fails to establish connection
[20:30] i make my upload script to download the html using -O EDxxxxxx.html
[20:31] so if can just check for any html as zero size files
[20:32] *so i can just check for any html as zero size files
[20:32] then use that to make a list to do a update-metadata from
[20:48] *** Honno has joined #archiveteam-bs
[20:56] HCross2: http://doc.aljazeera.net/ probably needs some special treatment to retrieve the videos (Brightcove player). Youtube-dl seems to have some support for Al Jazeera, but I don't think it will work here (plus it's broken in ArchiveBot).
[20:57] JAA: looks like they do geo blocking too
[20:57] I can't play some of the videos from my UK IP
[20:58] Yay
[21:01] Do you have an example which doesn't work for you? I just tested a few and those seemed to work here.
[21:08] *** Honno has quit IRC (Read error: Operation timed out)
[21:40] Has anyone else had issue with using WebRecorder Player to read warcs? I can't seem to be able find pages that should be there...
[21:40] *issues
[22:00] nvm it's a bug
[22:01] what's the best way to load multiple warc files simultaneously so they can be browsed at the same time?
[22:02] hook54321: another option is to combine them with tools like https://github.com/alard/megawarc
[22:32] *** fie has joined #archiveteam-bs
[22:34] *** SHODAN_UI has quit IRC (Remote host closed the connection)
[22:39] *** Ravenloft has quit IRC (Read error: Operation timed out)
[22:40] *** Panasonic has joined #archiveteam-bs
[22:50] so, this guy merged two things SketchCow love https://arstechnica.com/gaming/2017/07/a-programmer-turned-wikipedia-into-a-classic-text-adventure/
[22:59] *** Dash has joined #archiveteam-bs
[23:26] *** Dash has quit IRC (Quit: Page closed)
[23:52] *** pie_ has quit IRC (Read error: Operation timed out)
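
Note on the 20:30 to 20:32 messages (zero-size HTML check for the ERIC uploads): a minimal sketch of that check, not the uploader's actual script. It assumes the HTML pages were saved as EDxxxxxx.html in one directory, as described in the log. The function name find_empty_html and the default pattern are hypothetical, and the later update-metadata step is not shown.

```python
from pathlib import Path


def find_empty_html(directory, pattern="ED*.html"):
    """Return the sorted stems (e.g. 'ED123456') of zero-byte HTML files.

    Assumption: a failed HTTPS connection left behind an empty file,
    so a size of 0 marks an item whose page has to be fetched again.
    """
    return sorted(
        path.stem
        for path in Path(directory).glob(pattern)
        if path.is_file() and path.stat().st_size == 0
    )


# Usage (hypothetical directory name):
#   for item_id in find_empty_html("eric_html"):
#       print(item_id)
```

The returned list could then be fed to whatever retry or metadata-update step the uploader already uses.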