[00:05] *** ElKDrago has quit IRC (Quit: Page closed)
[01:16] *** yawkat has quit IRC (Ping timeout: 610 seconds)
[01:38] *** yawkat has joined #archiveteam-ot
[03:04] *** manjaro-u has quit IRC (Read error: Operation timed out)
[03:11] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[04:27] *** qw3rty has joined #archiveteam-ot
[04:34] *** qw3rty2 has quit IRC (Ping timeout: 745 seconds)
[04:54] *** icedice has quit IRC (Quit: Leaving)
[05:38] *** benjins has quit IRC (Read error: Connection reset by peer)
[06:31] *** m007a83 has quit IRC (Quit: Fuck you Comcast)
[06:32] *** dhyan_nat has joined #archiveteam-ot
[07:30] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[07:38] *** schbirid has joined #archiveteam-ot
[08:11] *** deevious has joined #archiveteam-ot
[09:44] *** Dragnog2 has quit IRC (Quit: Connection closed for inactivity)
[11:11] is github going to preserve in the arctic vault only repos created before their announcement? (avoiding people to "hack" the idea, creating repos for the vault)
[11:12] otherwise, we could put some important info in some repos to be preserved
[11:15] I read the announcement as all active repos on the cutoff date being snapshotted, but not sure.
[11:16] NB, it's only a snapshot of HEAD (i.e. no history, no branches, no tags), and binaries over 100 kB aren't included either.
[11:17] yeah, but for example, we could upload a XML dump of AT wiki, which counts as non binary, right? github limit for files is 100MB, enough for things like that
[11:19] Possibly. They're not exactly precise about their definitions, probably on purpose to avoid exactly this.
[11:20] we can rename the dump to .py, in the case they exclude XML files lol
[11:21] Yeah, or split it up into files of 99999 bytes.
[11:21] Though they also don't say whether that limit is for individual files or the total of all binaries.
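The split idea from the 11:21 messages could look roughly like the sketch below. It assumes the 100 KB limit applies per file, which the announcement does not confirm. The function names `split_file` and `join_parts` and the `.partNNNN` naming are hypothetical, not anything GitHub or Archive Team specifies.

```python
from pathlib import Path

# Figure taken from the chat; assumed (not confirmed) to be a per-file limit.
CHUNK_SIZE = 99_999  # bytes


def split_file(src: Path, out_dir: Path, chunk_size: int = CHUNK_SIZE) -> list[Path]:
    """Write src as numbered part files of at most chunk_size bytes each."""
    out_dir.mkdir(parents=True, exist_ok=True)
    parts: list[Path] = []
    with src.open("rb") as f:
        index = 0
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # end of file reached
                break
            # Hypothetical naming scheme: dump.xml.part0000, dump.xml.part0001, ...
            part = out_dir / f"{src.name}.part{index:04d}"
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_parts(parts: list[Path]) -> bytes:
    """Concatenate the part files, sorted by name, back into the original bytes."""
    return b"".join(p.read_bytes() for p in sorted(parts))
```

Splitting on raw byte boundaries can cut a multi-byte UTF-8 character in half, so an individual part may not decode as valid text on its own. Whether such a part would still count as "non binary" for the snapshot is unknown.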
[11:22] "The 02/02/2020 snapshot archived in the GitHub Arctic Code Vault will sweep up every active public GitHub repository, in addition to significant dormant repos as determined by stars, dependencies, and an advisory panel. The snapshot will consist of the HEAD of the default branch of each repository, minus any binaries larger than 100KB in size. Each repository will be packaged as a single TAR file. For greater data density and integrity,
[11:28] i wonder what they mean by "active"
[11:28] Could be anything from "not archived" to "commits in the past X days", I guess.
[11:30] and also what the star-threshold is for being considered "significang"
[11:30] significant, even
[11:30] a bunch of fuzzy terms in there making the scope a bit unclear :/
[11:31] aaanyways, has anyone seen https://www.vogons.org/viewtopic.php?f=46&t=69184 ? That is apparently happening, and the deadline is the 22. this month before a lot of stuff disappears
[11:32] wonder how much of that is archived already, and how much isn't
[11:32] I was just about to link that haha
[11:33] Jason Did it last time SketchCow do you have a script or something to update this (assuming it's the right thing here) https://archive.org/details/2014.01.download.intel.com
[11:35] I assume https://downloadcenter.intel.com is too big for #archivebot ?
[11:43] *** odemg has joined #archiveteam-ot
[11:58] Fusl grabbed all of that I believe, though I can't find it on IA right now.
[12:09] ahh okay cool
[13:05] *** benjins has joined #archiveteam-ot
[13:09] https://github.com/romainPrignon/hello-future
[13:09] found searching "github vault" and sorting by date
[13:13] I like that they chose groundhog day.
[13:20] I estimate in 500 years that https://en.wikipedia.org/wiki/ will either seem like a geocities link, or a book's Dewey Decimal System call number.
[13:20] I like that the date is symmetric: 2020-02-02
[13:29] symmetric when ISO format and decimal :p
[14:00] Indeed, and even in the broken US format.
[14:02] That won't happen again until 2121 I think.
[14:17] *** Flashfire has quit IRC (Remote host closed the connection)
[14:17] *** kiska has quit IRC (Remote host closed the connection)
[14:18] *** Flashfire has joined #archiveteam-ot
[14:18] *** kiska has joined #archiveteam-ot
[14:18] *** Fusl__ sets mode: +o kiska
[14:18] *** Fusl sets mode: +o kiska
[14:18] *** Fusl_ sets mode: +o kiska
[14:51] Yeah, let me go grab that.
[14:53] We do currently upload wiki dumps to the IA, I shove them down to FOS
[14:53] like once a month I think
[14:55] They're uploaded to IA automatically? Where? Last time I checked, I couldn't find anything newer than this spring I think.
[14:56] Oh yeah, looks like they were uploaded again a few weeks ago. https://archive.org/details/archiveteam_wiki_backup
[15:03] I hold them and then upload them in a bulk
[15:03] that item needs some metadata
[15:04] Should I make it more automatic.
[15:04] *** akierig has joined #archiveteam-ot
[15:06] VoynichCr: at least it's palindromic in non-ISO, 02022020
[15:15] Igloo: how does your twitter stream archive work? exploring hashtags or accounts?
[15:15] *** Dallas has quit IRC (Quit: The Lounge - https://thelounge.chat)
[15:30] *** dashcloud has joined #archiveteam-ot
[15:55] *** VerifiedJ has joined #archiveteam-ot
[16:00] eythian: Ah yes, palindromic is what I meant.
[16:08] SketchCow: I like automated backups. Minimises the risk of losing data. But I also grab the dumps automatically, so the chance of losing it is pretty low.
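The palindrome claim from the exchange above (2020-02-02 reads the same reversed in ISO, day-first and US order, and "won't happen again until 2121") can be checked by brute force. This is only a sketch: the function names and the choice of three 8-digit formats are assumptions made for illustration.

```python
from datetime import date, timedelta

# ISO (YYYYMMDD), day-first (DDMMYYYY) and US month-first (MMDDYYYY),
# all written as 8 digits without separators.
FORMATS = ("%Y%m%d", "%d%m%Y", "%m%d%Y")


def is_palindrome_in_all(d: date) -> bool:
    """True if the date reads the same reversed in every format above."""
    for fmt in FORMATS:
        digits = d.strftime(fmt)
        if digits != digits[::-1]:
            return False
    return True


def next_all_format_palindrome(after: date) -> date:
    """First date strictly after `after` that is palindromic in all formats."""
    d = after + timedelta(days=1)
    while not is_palindrome_in_all(d):
        d += timedelta(days=1)
    return d


print(is_palindrome_in_all(date(2020, 2, 2)))        # True
print(next_all_format_palindrome(date(2020, 2, 2)))  # 2121-12-12
```

Under these assumptions the next such date is 12 December 2121, which agrees with the "until 2121" guess in the log.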
[16:09] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[16:24] Nominate a book https://www.memory-of-mankind.com/1000books/
[17:13] 3 more computer magazine issues are going up
[17:13] from 1986
[17:16] *** akierig has quit IRC (Quit: later_gator)
[17:20] *** akierig has joined #archiveteam-ot
[17:24] https://blog.archive.org/2017/10/10/books-from-1923-to-1941-now-liberated/
[17:48] "In the initial deposit, GitHub has stored 6,000 of its most significant repositories in AWA for perpetuity, capturing the evolution of technology and software. This collection includes the source code for the Linux and Android operating systems; the programming languages Python, Ruby, and Rust; web platforms Node, V8, React, and Angular; cryptocurrencies Bitcoin and Ethereum; AI tools TensorFlow and FastAI; and many more. GitHub will st
[17:49] honestly, archiving inactive public repos would be more important, they are more at risk than active projects
[17:50] so, i think they will adjust the "active" definition to archive up to X terabytes
[18:01] *** manjaro-u has joined #archiveteam-ot
[19:14] *** Dallas has joined #archiveteam-ot
[20:26] *** Zerote__ has joined #archiveteam-ot
[20:29] *** Zerote_ has quit IRC (Ping timeout: 252 seconds)
[21:15] *** dhyan_nat has joined #archiveteam-ot
[21:26] *** Hani111 has joined #archiveteam-ot
[21:37] *** Hani has quit IRC (Ping timeout: 745 seconds)
[21:37] *** Hani111 is now known as Hani
[21:42] *** manjaro-u has quit IRC (Read error: Connection reset by peer)
[21:43] *** manjaro-u has joined #archiveteam-ot
[22:12] *** akierig has quit IRC (Quit: later_gator)
[22:20] I'm not really sure what to do with the github archive thing.
there's apparently no way to tell if your repo is going to be included, no way to tell if your repo has been included and no way to tell if everything necessary to build it is going to be there
[22:21] my biggest project so far is split into two repos (for reasons), so the engine is probably going to be archived but the repo containing all modules likely isn't
[22:21] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[22:26] don't rely on others to archive what you care about
[22:29] well, I won't be building my own arctic vault anytime soon
[22:29] markedL Do you have access to the tracker?
[22:35] kpcyrd: yeah, they say "all active public repos, all files, except binaries", but i am not sure about it
[22:36] it seems too much data to preserve, and i think they will cut lot of repos
[22:38] also, i doubt they are going to preserve forks
[22:38] "active" is very vague. like, a certain level of recent activity? I've seen github stars being mentioned which wouldn't work for me because the 1st repo has 468 stars and the 2nd repo has 0 stars
[22:38] for example, torvalds/linux repo has thousands of forks, they arent going to save so many redundant info
[22:39] i assume they will ignore forks, but i am not sure
[22:41] i will push a dummy commit in every repo of mine which i want to be preserved, to increase the chances
[22:43] also, i will create my own "vault repo", adding some dumps (split), not really code but plain files, and i hope it is preserved
[22:45] though if they dont publish an index, i will never know
[22:47] *** X-Scale has quit IRC (Read error: Operation timed out)
[22:51] kpcyrd: you could add the content from your 0 stars repo into the popular one, in a temporary directory, and remove the data after the snapshot is generated. I think your 400+ stars repo will be preserved for sure
[22:54] Yay, history pollution.
[22:55] :p
[23:04] ffmpeg does not like huge video frames =(
[23:07] Define: huge
[23:10] 5600x1080
[23:10] :}
[23:11] Neat
[23:11] and if i encode with x265 the result has broken colors
[23:14] VoynichCr: the 0 star repo contains the code maaaay cause a takedown request at some point, that's why I split it in the first place :D
[23:28] *** schbirid has quit IRC (Quit: Leaving)
[23:58] *** BlueMax has joined #archiveteam-ot