[00:35] *** chazchaz has quit IRC (Read error: Operation timed out)
[00:36] *** HCross has quit IRC (Read error: Connection reset by peer)
[00:37] *** HCross has joined #archiveteam-bs
[00:40] *** chazchaz has joined #archiveteam-bs
[00:47] *** Stiletto has quit IRC (Ping timeout: 246 seconds)
[01:06] bzc6p: FWIW: IMO unless you have really slow upload, there's no sense in using any lossy compression on scans. Given a folder of TIFFs, you can just put it into a losslessly compressed windows-style "ZIP" file, upload it to IA, and set it as a "Generic Raw Book Zip" and queue a rederive. It should work like a charm.
[01:17] *** logan has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[01:17] *** zenguy has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[01:17] *** Mayonaise has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[01:17] *** closure has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[01:17] *** Nertsy has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[01:17] *** Baljem has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[01:17] *** mr-b has quit IRC (ircd.choopa.net irc.teksavvy.ca)
[01:21] *** JesseW has joined #archiveteam-bs
[01:35] *** tephra has joined #archiveteam-bs
[01:35] *** tephra_ has quit IRC (Read error: Connection reset by peer)
[01:36] *** logan has joined #archiveteam-bs
[01:38] *** zenguy has joined #archiveteam-bs
[01:38] *** closure has joined #archiveteam-bs
[01:38] *** midas sets mode: +o closure
[01:38] *** Nertsy has joined #archiveteam-bs
[01:39] *** lbft_ has quit IRC (Read error: Operation timed out)
[01:39] *** mr-b has joined #archiveteam-bs
[01:39] *** GLaDOS has quit IRC (Ping timeout: 633 seconds)
[01:39] *** achip has quit IRC (Read error: Operation timed out)
[01:40] *** GLaDOS has joined #archiveteam-bs
[01:40] *** midas sets mode: +o GLaDOS
[01:40] *** lbft has joined #archiveteam-bs
[01:41] *** kvieta has quit IRC (Excess Flood)
[01:41] *** wyatt8750 has quit IRC (Read error: Operation timed out)
[01:41] *** Baljem has joined #archiveteam-bs
[01:45] *** Mayonaise has joined #archiveteam-bs
[01:46] *** kyan_ has joined #archiveteam-bs
[01:48] *** kvieta has joined #archiveteam-bs
[01:53] *** JesseW has quit IRC (Leaving.)
[01:55] *** kyan has quit IRC (Ping timeout: 663 seconds)
[01:56] *** beardicus has quit IRC (Read error: Operation timed out)
[01:57] *** lbft has quit IRC (Ping timeout: 626 seconds)
[01:58] *** arkiver has quit IRC (Read error: Operation timed out)
[01:58] *** lbft has joined #archiveteam-bs
[02:00] *** jk[SVP] has quit IRC (Read error: Operation timed out)
[02:01] *** jk[[SVP]] has joined #archiveteam-bs
[02:01] *** jk[[SVP]] is now known as jk[SVP]
[02:01] *** JesseW has joined #archiveteam-bs
[02:02] *** kyan has joined #archiveteam-bs
[02:02] *** arkiver has joined #archiveteam-bs
[02:02] *** wyatt8750 has joined #archiveteam-bs
[02:02] *** kvieta has quit IRC (Ping timeout: 629 seconds)
[02:02] *** Sanqui has quit IRC (Ping timeout: 629 seconds)
[02:10] *** kyan_ has quit IRC (Ping timeout: 620 seconds)
[02:10] *** phuzion has quit IRC (Read error: Operation timed out)
[02:10] *** toad1 has quit IRC (Ping timeout: 851 seconds)
[02:10] *** mutoso_ has quit IRC (Read error: Operation timed out)
[02:10] *** Sanqui has joined #archiveteam-bs
[02:10] *** mutoso has joined #archiveteam-bs
[02:21] *** logchfoo3 starts logging #archiveteam-bs at Tue Jan 19 02:21:49 2016
[02:21] *** logchfoo3 has joined #archiveteam-bs
[02:21] *** logchfoo3 has quit IRC (Connection closed)
[02:22] *** logchfoo4 starts logging #archiveteam-bs at Tue Jan 19 02:22:52 2016
[02:22] *** logchfoo4 has joined #archiveteam-bs
[02:26] *** username1 has joined #archiveteam-bs
[02:29] *** schbirid2 has quit IRC (Read error: Operation timed out)
[02:49] *** beardicus has joined #archiveteam-bs
[02:49] *** kvieta has joined #archiveteam-bs
[02:56] *** kvieta has quit IRC (Read error: Operation timed out)
[03:14] *** kyan_ has joined #archiveteam-bs
[03:15] *** kyan has quit IRC (Ping timeout: 260 seconds)
[03:16] *** kyan_ is now known as kyan
[03:18] We might want to archive the BBC iPlayer stuff.
[03:18] I've brought this up before in the context of the iPM Radio 4 podcast
[03:18] but I've found this page (needs UK IP) http://www.bbc.co.uk/iplayer/cbbc/a-z
[03:19] shows a bunch of TV shows
[03:19] most of which are only available for a few days, apparently.
[03:19] The TV shows mostly look dreadful, but that doesn't mean they shouldn't be archived
[03:19] *** beardicus has quit IRC (Read error: Operation timed out)
[03:21] I mean who green-lit a show called "Gangsta Granny", with what appears to be an entirely white cast. SRSLY?
[03:23] And it requires Adobe Flush, I mean Adobe Crash, um, well the thing that crashes and was flushed down the toilet in fucking 2007 by the iPhone
[03:24] *** kvieta has joined #archiveteam-bs
[03:24] *** beardicus has joined #archiveteam-bs
[03:24] kyan: if you set your user-agent to an iPad string, you can usually get around the Flash requirement
[03:25] dashcloud: Ooh, that sounds handy! (why doesn't it detect a lack of flash for desktop. Browser plugins are so Netscape 4)
[03:25] Thanks :)
[03:27] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[03:29] *** dashcloud has joined #archiveteam-bs
[03:29] Unfortunately that didn't work, AFAICT. Cleared bbc cookies, cleared cache, and reloaded; still getting served Flash
[03:33] Urgh it's hosted as some sort of fragmented streaming thing like YouTube, as .f4f and .f4m files. At least it looks like there's an app to convert that
[03:41] As it turns out, Firefox doesn't like having too many bookmarks, apparently. Opened my bookmarks menu and it crashed -_-
[03:44] Man BBC's web site sucks. Gives an error message, and pressing refresh keyboard shortcut makes the error message smaller. (Why can plugins capture browser keyboard shortcuts, lol) (#WTF)
[03:55] *** BlueMaxim has joined #archiveteam-bs
[04:08] Ok, got a simple-ish way working to download those videos. Unfortunately the ones I was actually visiting the BBC's Web site for say "Sorry, this episode is not currently available"
[04:27] LOL binsearch's retention time is shorter than my provider
[05:34] anyone happen to know the uncompressed size of https://archive.org/download/ia-bak-census_20150304/public-file-size-md_20150304205357.json.gz ?
[05:34] it's over 12G
[05:38] sorry, 17 (and still going)
[05:40] 20
[05:44] 21 G is the final total: 22522862598 bytes
[05:51] now added to http://archiveteam.org/index.php?title=Internet_Archive_Census
[06:01] Getting connection refused from IA
[06:01] for their homepage
[06:01] and for s3 api
[06:03] *** kyan_ has joined #archiveteam-bs
[06:04] Still happening on another IP
[06:07] *** kyan has quit IRC (Ping timeout: 260 seconds)
[06:07] *** kyan_ is now known as kyan
[06:09] who.is says archive.org is down https://who.is/whois/archive.org
[06:09] I guess it's not my imagination
[06:14] Ah, up again. Yay!
[06:18] Theoretically this can download a Jamendo album in FLAC. https://gist.github.com/niclashoyer/10426194
[06:18] I can't get it to work, but if it can be gotten to be, then it might be some good stuff to archive
[06:19] (FLACs are only downloadable by the website with a pricy commercial license)
[06:24] Back
[06:25] was that the result of the people DDOSing?
[06:32] Out of the top 5 public non-collection items on IA, numbers 1, 3, 4, and 5 are Islamic religious works. Number 2 is a movie called "About Bananas". Wut.
[06:33] * kyan loves IA
[06:35] top 5 by what count? views?
[06:36] yep
[06:36] https://archive.org/search.php?query=mediatype%3A*+-mediatype%3Acollection&sort=-downloads&page=2
[06:36] or, not the &page=2, but yeah
[06:37] https://twitter.com/kolubat/status/689335622645972992
[06:37] This ... looks wrong: https://archive.org/metadata/landthatisdesolaTest000001mbp -- check the "identifier" value.
[06:38] cause there are 2 values?
[06:39] yeah. that shouldn't happen, afaik...
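A note on the uncompressed-size question raised at 05:34: the running totals reported there (12G, 17, 20, then 22522862598 bytes) are consistent with streaming the archive through a decompressor and counting bytes, though the log does not say which tool was used. The following is a minimal Python sketch of that approach, assuming the .json.gz has already been downloaded to a local path; the function name and chunk size are illustrative choices, not anything from the log. Streaming matters here because the size recorded in the gzip trailer is only the length modulo 2^32, so it cannot be trusted for output larger than 4 GiB.

```python
import gzip


def uncompressed_size(path, chunk_size=1024 * 1024):
    """Count the bytes produced by decompressing the gzip file at `path`.

    The data is read in fixed-size chunks and discarded, so memory use
    stays flat even when the decompressed output is many gigabytes.
    """
    total = 0
    with gzip.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # empty bytes object means end of stream
                break
            total += len(chunk)
    return total
```

For example, `uncompressed_size("census.json.gz")` (a hypothetical local file name) would return the byte count without ever holding the decompressed data in memory.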
[07:05] and the census file has that item list 5 times... (?!)
[07:06] s/list/listed/
[07:06] along with one item without an id at all
[07:06] such interesting oddities
[07:09] And there's one identifier duplicated in the identifier list. It is: "e-dv212_boston_14_harvardsquare_09-05_001.ogg" (obviously! :-) )
[07:16] JesseW, on the topic of interesting identifiers: my user uploads page links to https://archive.org/details/__new_item__
[07:16] (it's not a valid item)
[07:16] hm, interesting
[07:20] *** phuzion has quit IRC (Quit: No Ping reply in 180 seconds.)
[07:24] *** phuzion has joined #archiveteam-bs
[07:27] The item without an identifier is https://archive.org/details/lecture_10195 (jake, the person who ran the census, fixed it soon after running the census)
[08:00] Hm, the main census file has only 13,075,195 normal string identifiers, with 113 duplicates
[08:03] *** JesseW has quit IRC (Read error: Operation timed out)
[08:04] proposal: add Archive Team motto http://oddstuffmagazine.com/wp-content/uploads/2015/12/All-your-files-are-exactly-where-you-left-them.jpg
[08:48] *** JesseW has joined #archiveteam-bs
[09:40] *** JesseW has quit IRC (Read error: Operation timed out)
[10:03] *** godane has quit IRC (Quit: Leaving.)
[10:05] *** SmileyG has quit IRC (Read error: Operation timed out)
[10:10] *** ersi_ is now known as ersi
[10:50] *** Smiley has joined #archiveteam-bs
[11:01] *** PotcFdk has quit IRC (Ping timeout: 506 seconds)
[11:20] *** brayden has quit IRC (Quit: Leaving)
[11:39] *** brayden has joined #archiveteam-bs
[11:45] kyan: https://github.com/get-iplayer/get_iplayer, dono if it still works
[12:19] *** Ravenloft has joined #archiveteam-bs
[12:29] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[12:32] I think this will be of interest to people here
[12:32] https://worldbuilding.stackexchange.com/questions/33559/a-treasure-chest-for-your-post-apocalyptic-children
[12:45] "Bottlecaps... lots of Bottlecaps. As many as he can find."
[13:22] *** brayden has quit IRC (Quit: Leaving)
[13:34] *** dashcloud has quit IRC (Read error: Operation timed out)
[13:37] *** dashcloud has joined #archiveteam-bs
[14:06] *** brayden has joined #archiveteam-bs
[14:06] *** Ravenloft has quit IRC (Ping timeout: 370 seconds)
[14:34] *** HCross2 has joined #archiveteam-bs
[14:45] *** Ravenloft has joined #archiveteam-bs
[16:15] *** chazchaz has quit IRC (Read error: Operation timed out)
[16:16] *** Ravenloft has quit IRC (Ping timeout: 360 seconds)
[16:20] *** chazchaz has joined #archiveteam-bs
[16:34] modarchive.org apparently has failed
[16:41] Tah dahhh
[17:21] *** zerkalo_ has quit IRC (Read error: Connection reset by peer)
[17:50] *** JesseW has joined #archiveteam-bs
[17:59] *** JesseW has quit IRC (Leaving.)
[18:13] *** JW_work has quit IRC (Read error: Operation timed out)
[18:21] *** JW_work has joined #archiveteam-bs
[20:19] https://www.freelancer.com/projects/php/Web-Scraping-entire-forum-sub/
[20:19] Who the fuck
[20:23] lol - pointing them at https://archive-it.org/ might be a good idea.
[20:27] *** JW_work has quit IRC (Quit: Leaving.)
[20:31] lol
[20:32] *** BlueMaxim has joined #archiveteam-bs
[20:46] *** godane has joined #archiveteam-bs
[21:00] *** VADemon has joined #archiveteam-bs
[21:05] Would it be unethical to claim the cash and just archivebot that?
[21:06] call it a donation ;)
[21:07] if you turn the cash into hosting fees for an archivebot pipeline, seems legit to me
[21:07] Would it be more unethical to say "This project would require at least six hours of my time, but if you put up $300, I'll work on it nonstop until it's done"?
[21:08] if you keep an eye on the logs that sounds reasonable
[21:08] :)
[21:08] Would it be even more unethical to mug the guy IRL and use his credit cards to pay for digital ocean instances to run archivebot pipelines?
[21:09] Maybe I'm getting a bit carried away.
[21:09] charge per url
[21:09] and don't ignore /calendar/
[21:10] Even more UK news is going to begin marching its way into the archive :)
[21:11] *** JesseW has joined #archiveteam-bs
[21:12] *** JW_work has joined #archiveteam-bs
[21:18] *** JW_work has quit IRC (Quit: Leaving.)
[21:34] *** antomatic has joined #archiveteam-bs
[21:34] *** LordNigh2 has joined #archiveteam-bs
[21:35] *** brayden has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** tephra has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** Start has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** antomati_ has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** w0rp has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** mismatch_ has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** afics has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** Lord_Nigh has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** Apathy has quit IRC (hub.dk irc.inet.tele.dk)
[21:35] *** dan- has quit IRC (hub.dk irc.inet.tele.dk)
[21:36] *** w0rp_ has joined #archiveteam-bs
[21:36] *** mismatchm has joined #archiveteam-bs
[21:37] *** lytv has quit IRC (Quit: Leaving)
[21:41] *** tephra_ has joined #archiveteam-bs
[21:49] *** lytv has joined #archiveteam-bs
[21:50] *** Apathy has joined #archiveteam-bs
[21:50] *** afics has joined #archiveteam-bs
[21:50] *** slyphic is now known as slyphic|a
[21:50] *** w0rp_ is now known as w0rp
[21:51] *** LordNigh2 is now known as Lord_Nigh
[22:09] Good! Make her majesty's government fear The Bot
[22:13] what's important right now? been away for a while
[22:14] Friends Reunited starting soon, Gcode and MyVIP
[22:14] *** achip has joined #archiveteam-bs
[22:15] Hey Kazzy
[22:16] hiya
[22:16] *** Start has joined #archiveteam-bs
[22:18] I should have said Hello!
[22:18] o/
[22:49] *** dan- has joined #archiveteam-bs
[23:10] *** JesseW has quit IRC (Leaving.)
[23:54] *** Rotab has quit IRC (hub.se irc.du.se)
[23:54] *** Boppen has quit IRC (hub.se irc.du.se)
[23:54] *** w0rp has quit IRC (hub.se irc.underworld.no)
[23:54] *** mutoso has quit IRC (hub.se irc.underworld.no)
[23:54] *** wednesday has quit IRC (hub.se irc.underworld.no)
[23:54] *** ersi has quit IRC (hub.se irc.underworld.no)
[23:54] *** midas has quit IRC (hub.se irc.underworld.no)
[23:54] *** Rye has quit IRC (hub.se irc.underworld.no)
[23:54] *** Fletcher has quit IRC (hub.se irc.underworld.no)
[23:54] *** espes___ has quit IRC (hub.se irc.underworld.no)
[23:54] *** useretai- has quit IRC (hub.se irc.underworld.no)
[23:54] *** will has quit IRC (hub.se irc.underworld.no)
[23:54] *** antomatic has quit IRC (hub.se efnet.port80.se)
[23:54] *** HCross2 has quit IRC (hub.se efnet.port80.se)
[23:54] *** kyan has quit IRC (hub.se efnet.port80.se)
[23:54] *** GLaDOS has quit IRC (hub.se efnet.port80.se)
[23:54] *** unstable has quit IRC (hub.se efnet.port80.se)
[23:54] *** _desu___ has quit IRC (hub.se efnet.port80.se)
[23:54] *** wp494 has quit IRC (hub.se efnet.port80.se)
[23:54] *** Muad-Dib has quit IRC (hub.se efnet.port80.se)
[23:54] *** Famicoma1 has quit IRC (hub.se efnet.port80.se)
[23:54] *** ivan` has quit IRC (hub.se efnet.port80.se)
[23:54] *** SilSte has quit IRC (hub.se efnet.port80.se)
[23:54] *** Kazzy has quit IRC (hub.se efnet.port80.se)
[23:54] *** mistym has quit IRC (hub.se efnet.port80.se)
[23:54] *** zhongfu has quit IRC (hub.se efnet.port80.se)
[23:54] *** pikhq has quit IRC (hub.se efnet.port80.se)
[23:54] *** Kenshin has quit IRC (hub.se efnet.port80.se)
[23:54] *** Rickster has quit IRC (hub.se efnet.port80.se)
[23:54] *** Ctrl-S___ has quit IRC (hub.se efnet.port80.se)
[23:54] *** zyphlar_ has quit IRC (hub.se efnet.port80.se)
[23:54] *** bauruine has quit IRC (hub.se efnet.port80.se)
[23:54] *** sigkell has quit IRC (hub.se efnet.port80.se)
[23:54] *** Fusl has quit IRC (hub.se efnet.port80.se)
[23:54] *** joepie91 has quit IRC (hub.se efnet.port80.se)
[23:54] *** SadDM has quit IRC (hub.se efnet.port80.se)
[23:54] *** JSharp___ has quit IRC (hub.se efnet.port80.se)
[23:54] *** deathy has quit IRC (hub.se efnet.port80.se)
[23:55] *** JesseW has joined #archiveteam-bs
[23:59] *** Rotab has joined #archiveteam-bs
[23:59] *** Boppen has joined #archiveteam-bs