[00:01] *** REiN^ has quit IRC (Read error: Operation timed out)
[00:56] *** kurt_ has quit IRC (Quit: Connection closed for inactivity)
[01:03] *** vitzli has joined #archiveteam-bs
[01:38] *** sigkell has joined #archiveteam-bs
[01:38] *** voltagex has joined #archiveteam-bs
[02:04] *** vitzli has quit IRC (Quit: Leaving)
[02:45] *** username1 has joined #archiveteam-bs
[02:48] *** schbirid2 has quit IRC (Read error: Operation timed out)
[02:49] *** VADemon has quit IRC (Quit: left4dead)
[02:50] Gna! (#gnarm) code repositories and (binary) downloads are secured. Ticket tracker content likely needs excessive scraping for a lack of responsiveness from the admins. Upload starts soon.
[03:46] *** DopefishJ has joined #archiveteam-bs
[03:46] *** swebb sets mode: +o DopefishJ
[03:47] *** DFJustin has quit IRC (Ping timeout: 260 seconds)
[03:59] *** ndiddy has quit IRC (Read error: Connection reset by peer)
[04:07] *** DopefishJ is now known as DFJustin
[04:51] *** icedice has quit IRC (Quit: Leaving)
[05:09] Wrote a script that, if an item has just a .djvu in it, makes a .CBZ out of it so we can read it in the bookreader.
[05:09] No idea how many of my uploads this affects.
[05:11] *** Sk1d has joined #archiveteam-bs
[05:11] *** Sk1d has quit IRC (Connection Closed)
[05:44] *** wabu has quit IRC (Ping timeout: 246 seconds)
[06:18] *** wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
[06:28] *** Stiletto has joined #archiveteam-bs
[06:37] *** Sue_ has joined #archiveteam-bs
[06:52] Doing a partial upload of www.booksie.com. They appear to be A) medium-largish and B) will IP ban you for as little as 2 concurrent. Pain in the ass.
[07:19] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[07:24] *** Honno has joined #archiveteam-bs
[08:08] *** dashcloud has quit IRC (Read error: Operation timed out)
[08:12] *** dashcloud has joined #archiveteam-bs
[08:13] *** godane has quit IRC (Quit: Leaving.)
[08:31] *** nickware has joined #archiveteam-bs
[08:52] *** nickware has quit IRC (Ping timeout: 1208 seconds)
[09:03] *** GE has joined #archiveteam-bs
[09:10] *** bwn has quit IRC (Ping timeout: 960 seconds)
[09:14] *** GE_ has joined #archiveteam-bs
[09:14] *** GE has quit IRC (Ping timeout: 255 seconds)
[09:14] *** GE_ is now known as GE
[09:17] https://www.reddit.com/r/emulation/comments/5w7duy/thegamesdb_is_down_more_than_its_up_i_have/ I don't suppose we could do anything about this?
[09:21] *** bwn has joined #archiveteam-bs
[10:00] *** schbirid2 has joined #archiveteam-bs
[10:03] *** username1 has quit IRC (Read error: Operation timed out)
[10:23] *** username1 has joined #archiveteam-bs
[10:26] *** schbirid2 has quit IRC (Read error: Operation timed out)
[10:30] *** schbirid2 has joined #archiveteam-bs
[10:32] *** username1 has quit IRC (Read error: Operation timed out)
[10:35] http://hiddenpalace.org/Main_Page
[10:39] *** wp494 has joined #archiveteam-bs
[10:55] *** Honno has quit IRC (Ping timeout: 370 seconds)
[11:06] *** godane has joined #archiveteam-bs
[11:51] *** odemg has quit IRC (Remote host closed the connection)
[12:33] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:39] *** odemg has joined #archiveteam-bs
[13:02] [13:59] .t https://www.buzzfeed.com/salvadorhernandez/sanctuary-churches-v-trump-deportation-mandate
[13:02] [13:59] Churches Are Readying Homes And Underground Railroads To Hide Immigrants From Deportation Under Trump - BuzzFeed News
[13:02] [14:00] .t http://edition.cnn.com/2017/02/23/us/california-immigrant-safe-houses/
[13:02] [14:00] An underground network is readying homes to hide immigrants - CNN.com
[13:02] [14:00] Mark Krikorian, executive director of the conservative Center for Immigration Studies, says the law is clear about what these groups are intending to do. "They're committing a felony. Harboring is a felony," Krikorian says. "Regular folks hiding people in a basement face jail time because it is ultimately a smuggling conspiracy."
[13:02] [14:01] Hoover, a self-professed good Midwestern-raised boy, says he's prepared for the federal consequences. Valiente says religious leaders opposing immigration crackdowns believe one simple thing: "We're doing what we think is right."
[13:12] *** GE has quit IRC (Quit: zzz)
[13:19] *** SketchCow has quit IRC (Read error: Connection reset by peer)
[13:25] *** SketchCow has joined #archiveteam-bs
[13:25] *** swebb sets mode: +o SketchCow
[13:30] *** IllidanS4 has joined #archiveteam-bs
[13:31] https://www.reddit.com/r/Archivists/comments/5uvfpw/youtube_archive_still_running_already_40k_videos/
[14:07] *** VADemon has joined #archiveteam-bs
[14:10] *** schbirid2 has quit IRC (Read error: Operation timed out)
[14:18] *** IllidanS4 has quit IRC (Read error: Operation timed out)
[14:20] *** schbirid2 has joined #archiveteam-bs
[14:35] *** GE has joined #archiveteam-bs
[15:18] So here's what we have from Gna: http://www.archiveteam.org/index.php?title=Gna!/projects
[15:18] Advice here was to chunk it up (a) in <40Gbyte chunks, (b) as ZIP (or similar aggregated lump) rather than lots of small files
[15:19] Is there anything else we should do to make it easier to work with this stuff once it's on IA?
[15:20] *** dashcloud has quit IRC (Read error: Operation timed out)
[15:20] I'm looking at making a schedule which puts each project's different areas in the same ZIP-or-whatever and puts as many projects in a ZIP as will fit
[15:20] (e.g., a2jmidid-aeternal, afoc-babs, etc)
[15:20] Is ZIP the best container for archive.org?
[15:21] Slightly worried it might mangle funny filenames
[15:22] jtn2: to be clear, should be using both a) and b) there, not a) or b)
[15:23] .tar.gz might be more suitable, if filenames are a concern
[15:24] *** dashcloud has joined #archiveteam-bs
[15:25] *** odemg has quit IRC (Remote host closed the connection)
[15:25] Kaz: a) and b)> ack
[15:25] We'll use .tar.something if IA is happy with that
[15:26] Zip files should use UTF-8 encoding for their file names
[15:27] So nothing should be mangled
[15:27] Zip files are preferred over .tar.whatever since the IA can index them and allow people to browse them without downloading the whole archive
[15:28] MrRadar: no trouble with funny punctuation characters or colons or anything DOS mightn't like?
[15:28] Kaz: MrRadar how big should the .zip files be?
[15:30] jtn2: The zip format doesn't contain any restrictions other than '/' being used as the directory separator (as opposed to '
[15:30] '\\')
[15:31] As for size, around 50 GB or so is a good maximum
[15:33] OK, ZIP it is
[15:33] Will chunk by uncompressed size since we can't predict the compressed size
[15:42] *** odemg has joined #archiveteam-bs
[16:01] *** odemg has quit IRC (Remote host closed the connection)
[16:06] *** odemg has joined #archiveteam-bs
[16:49] MrRadar: source for zip better than tar on IA? you can browse tar just fine at IA
[16:49] if people rsync, then using zip would be a shame because all the permissions and flags would get lost...
[16:50] You mean tar.gz vs zip?
[16:50] we've started zipping already...
[16:51] And AFAIK SVN doesn't retain linux file flags anyway.
[16:51] Do we need file flags/permissions on source code?
[16:51] Since .tar.gz files compress everything in a single deflate stream (unless you go out of your way to compress each file as a separate deflate stream like WARC tools do) there's no random-access support
[16:52] This is the server representation of the svn history (+ non-svn stuff).
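Editor's sketch, not part of the channel log: the plan discussed above (keep each project whole, put as many projects in one ZIP as will fit, and cut chunks by uncompressed size) could look roughly like the Python below. The `plan_chunks` name, the first-fit-decreasing strategy, the reading of "<40Gbyte" as 40 * 10**9 bytes, and the sizes in the usage note are all assumptions for illustration; the log does not say how the actual #gnarm script packs projects.

```python
from dataclasses import dataclass, field

# Assumed reading of the "<40Gbyte" advice in the log (decimal gigabytes).
LIMIT_BYTES = 40 * 10**9


@dataclass
class Chunk:
    """One planned ZIP: the projects it holds and their summed uncompressed size."""
    projects: list = field(default_factory=list)
    total: int = 0


def plan_chunks(project_sizes, limit=LIMIT_BYTES):
    """Group projects into chunks whose uncompressed total stays below `limit`.

    `project_sizes` maps a project name to its uncompressed size in bytes,
    already summed over all of that project's areas so a project is never
    split across ZIPs. Uses first-fit decreasing: largest projects are placed
    first, each into the first chunk that still has room.
    """
    chunks = []
    ordered = sorted(project_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        for chunk in chunks:
            if chunk.total + size < limit:
                chunk.projects.append(name)
                chunk.total += size
                break
        else:
            # No existing chunk has room. A project that alone exceeds the
            # limit still gets a chunk of its own rather than being dropped.
            chunks.append(Chunk([name], size))
    return chunks
```

Usage note: with made-up sizes, `plan_chunks({"a2jmidid-aeternal": 30, "afoc-babs": 20}, limit=40)` yields two chunks, because 30 + 20 is not below 40. Sizing by uncompressed bytes matches the reasoning in the log that the compressed size cannot be predicted ahead of time.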
[16:52] I hope there's not much significant info in file permissions, but I don't know.
[16:52] There might be in CVS. I don't know about arch/tla at all.
[16:53] I guess we can do a "ls -laR" against there being interesting info there
[16:53] anyway, gtg
[16:54] Has anyone else handled The Games DB dump? Else I am going to parse it and upload.
[16:57] Hmm... I swore I heard somewhere that zip files were browsable on the IA, but I can't find any official documentation of that
[16:57] MrRadar: schbirid2 if you guys want it some other way, the script you supply runs in the gnarm folder and sees the www, cvs, svn, download and arch folders there. Output is to ~/tmp/gnarm . If you have a better thing than .zip, tell me. If a good idea (I don't know), maybe dictionary compression with zstd, maybe on tar files of projects or so.
[16:58] Details in #gnarm
[17:29] *** IllidanS4 has joined #archiveteam-bs
[17:30] *** ndiddy has joined #archiveteam-bs
[17:31] *** IllidanS4 has quit IRC (Client Quit)
[17:40] If I am uploading xml response files from an API, what is the mediatype I should file it under? Text, or web?
[17:49] *** icedice has joined #archiveteam-bs
[18:02] i think you cannot use web as normal user, can you?
[18:02] i never could:(
[18:04] Uploaded it under data. If it turns out to be a problem I can always bug Jason about it.
[18:23] *** odemg has quit IRC (Remote host closed the connection)
[18:24] *** odemg has joined #archiveteam-bs
[18:58] *** Aranje has joined #archiveteam-bs
[19:16] *** ae_g_i_s has quit IRC (Quit: killing me softly with his `shutdown -h now`)
[19:24] *** odemg has quit IRC (Remote host closed the connection)
[19:58] *** cooldude1 has joined #archiveteam-bs
[19:58] mininova https://torrentfreak.com/torrent-legend-mininova-will-shut-down-for-good-170226/
[19:58] "The site's forum will close next week followed by the rest of the site a month later." might need panic?
[20:01] *** odemg has joined #archiveteam-bs
[20:02] cooldude1, forum is non-public.
[20:02] ?
[20:03] http://forum.mininova.org registration gated.
[20:04] @rocode they might have disabled signups since they will close soon, someone might have a log though or could ask for database dump
[20:04] login*
[20:08] does anyone have daily alexa dumps by any chance?
[20:14] I second the panic mode. I think however it will get saved. Other recent shutdowns are at least in non-public archives, a friend of a friend told me. I do have reason to trust them though.
[20:17] well i got to go, hope you guys can add it to the list of stuff to get backed up, just thought i'd share the news
[20:19] if anyone has any daily alexa top million dumps would appreciate a link (not archive.org, but daily backups), will check irc logs later, cheers
[20:19] not archival-related, but: https://twitter.com/joepie91/status/835944918971473920
[20:22] (I feel like a lot of people are missing the significance of this)
[20:22] I feel like it's probably sensationalistic
[20:23] but I know zero
[20:27] sensationalistic as in clickbait? I think the lone fact that to my knowledge there was no humanitarian violation of ethics with not allowing immigrants, and the sovereignty of a nation, as well as it not really being tied to minority discrimination issues, makes me think that both jews and hitler (who got killed), and slaves (who were slaves, so that was wrong by itself, but I am not good with
[20:28] american history details), are of much greater wrongdoing than trump and the immigrants
[20:29] For what I know it is not that bad to not allow immigrants without a visa. If the rules are too strict your country just tends to starve by lack of special skills and technological diffusion around the world.
[20:32] yes, that sounds about right
[20:37] I think we should try to steer clear of overt political pandering in this channel. Hiding illegal immigrants is not some noble goal, and comparing it to hiding jews from the Holocaust or slaves from slavers is false equivalence and inane.
[20:39] rocode: I conclude with the same.
[20:58] *** cooldude1 has quit IRC (Leaving)
[21:02] https://hypothes.is/blog/annotation-is-now-a-web-standard/ !
[21:13] hard to archive though
[21:13] or maybe it wouldn't be with this standard
[21:13] neat
[21:40] Frogging: how is it hard to archive?
[21:40] the various annotation things? I haven't seen a lot of them but the ones I've seen look javascript-y
[21:41] but if it's a proper standard that would make it easier
[22:03] Frogging: now we'd only have to archive 14 kinds of notations instead of 13!
[22:03] :p
[22:03] haha
[22:03] good ol' 927
[22:04] :)
[22:06] was just about to link that haha
[22:06] i guess it is naive to assume that something being an "official" standard would translate to actual adoption on the web
[22:06] but hey, it's possible
[22:07] I don't see it replacing comment sections anytime soon though
[22:20] *** nrp3c has joined #archiveteam-bs
[22:30] *** BlueMaxim has joined #archiveteam-bs
[22:30] *** pizzaiolo has joined #archiveteam-bs
[22:44] *** schbirid2 has quit IRC (Quit: Leaving)
[22:52] *** odemg has quit IRC (Remote host closed the connection)
[23:00] *** icedice has quit IRC (Quit: Leaving)
[23:22] *** pizzaiolo has quit IRC (Read error: Operation timed out)
[23:23] *** GE has quit IRC (Remote host closed the connection)
[23:24] *** odemg has joined #archiveteam-bs