[00:00] do we know how large this is in total?
[00:11] SourceForge as a whole "exceeds 50TB" according to https://sourceforge.net/p/forge/documentation/Join%20as%20a%20Mirror/ , but that page was last updated in late 2016.
[00:11] 50TB is doable
[00:11] I'd now assume a few hundred TB probably
[00:12] *** twigfoot has quit IRC (Read error: Operation timed out)
[00:13] *** twigfoot has joined #archiveteam-bs
[00:17] *** twigfoot has quit IRC (Read error: Operation timed out)
[00:17] *** twigfoot has joined #archiveteam-bs
[00:25] *** twigfoot has quit IRC (Read error: Operation timed out)
[00:40] *** justas1 has joined #archiveteam-bs
[00:46] *** justas has quit IRC (Ping timeout: 745 seconds)
[02:02] *** Verified_ has quit IRC (Ping timeout: 252 seconds)
[02:05] *** Verified_ has joined #archiveteam-bs
[02:14] *** Pixi` has quit IRC (Quit: Pixi`)
[02:15] *** Pixi has joined #archiveteam-bs
[03:11] *** twigfoot has joined #archiveteam-bs
[03:15] *** qw3rty113 has joined #archiveteam-bs
[03:17] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[03:21] *** qw3rty112 has quit IRC (Ping timeout: 600 seconds)
[03:35] *** odemgi has joined #archiveteam-bs
[03:39] *** odemgi_ has quit IRC (Read error: Operation timed out)
[03:58] *** c0mpass has joined #archiveteam-bs
[05:28] *** drcd_ has joined #archiveteam-bs
[05:28] *** Deewiant_ has joined #archiveteam-bs
[05:29] *** drcd has quit IRC (Ping timeout: 186 seconds)
[05:29] *** Deewiant has quit IRC (Ping timeout: 186 seconds)
[05:29] *** drcd_ is now known as drcd
[06:18] *** systwi has quit IRC (Read error: Connection reset by peer)
[06:19] *** arkiver has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** ShellyRol has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** Fusl__ has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** nyany_ has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** purplebot has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** nyany has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** h3ndr1k has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** Sokar has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** zerkalo has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** eientei95 has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** svchfoo3 has quit IRC (ircd.choopa.net irc.mzima.net)
[06:20] *** twigfoot has quit IRC (Read error: Operation timed out)
[06:20] *** systwi has joined #archiveteam-bs
[06:20] *** twigfoot has joined #archiveteam-bs
[06:22] *** Verified_ has quit IRC (Ping timeout: 252 seconds)
[06:25] *** arkiver has joined #archiveteam-bs
[06:25] *** ShellyRol has joined #archiveteam-bs
[06:25] *** Fusl__ has joined #archiveteam-bs
[06:25] *** nyany_ has joined #archiveteam-bs
[06:25] *** purplebot has joined #archiveteam-bs
[06:25] *** nyany has joined #archiveteam-bs
[06:25] *** h3ndr1k has joined #archiveteam-bs
[06:25] *** Sokar has joined #archiveteam-bs
[06:25] *** zerkalo has joined #archiveteam-bs
[06:25] *** eientei95 has joined #archiveteam-bs
[06:25] *** svchfoo3 has joined #archiveteam-bs
[06:25] *** irc.mzima.net sets mode: +oo Fusl__ svchfoo3
[06:25] *** Fusl sets mode: +o Fusl__
[06:25] *** Fusl sets mode: +o svchfoo3
[06:25] *** Fusl_ sets mode: +o Fusl__
[06:25] *** Fusl_ sets mode: +o svchfoo3
[06:27] *** Fusl__ sets mode: +o kiska
[06:27] *** Fusl__ sets mode: +o ivan_
[06:27] *** Fusl__ sets mode: +o kiska1
[06:27] *** Fusl__ sets mode: +o HCross
[06:27] *** Fusl__ sets mode: +o hook54321
[06:27] *** Fusl__ sets mode: +o Kaz
[06:27] *** Fusl__ sets mode: +o Fusl_
[06:27] *** Fusl__ sets mode: +o SketchCow
[06:27] *** Fusl__ sets mode: +o AlsoJAA
[06:27] *** Fusl__ sets mode: +o Fusl
[06:27] *** Fusl__ sets mode: +o Kenshin
[06:27] *** Fusl__ sets mode: +o dxrt
[06:27] *** Fusl__ sets mode: +o kiskabak
[06:27] *** Fusl__ sets mode: +o JAA
[06:27] *** Fusl__ sets mode: +o jrwr
[06:27] *** Fusl__ sets mode: +o astrid
[06:27] *** Fusl__ sets mode: +o chfoo
[06:27] *** Fusl__ sets mode: +o svchfoo1
[06:35] *** benjinsmi has joined #archiveteam-bs
[06:37] *** benjins has quit IRC (Ping timeout: 252 seconds)
[06:38] *** Verified_ has joined #archiveteam-bs
[07:10] *** bluefoo has quit IRC (Read error: Operation timed out)
[08:26] *** N4Y has quit IRC (Ping timeout: 745 seconds)
[08:27] *** N4Y has joined #archiveteam-bs
[08:34] *** bluefoo has joined #archiveteam-bs
[09:07] *** Maylay has quit IRC (Ping timeout: 252 seconds)
[09:08] *** Verified_ has quit IRC (Quit: Quit)
[09:09] *** Cameron_D has joined #archiveteam-bs
[09:10] *** Maylay has joined #archiveteam-bs
[09:29] *** icedice has joined #archiveteam-bs
[10:37] *** BlueMax has quit IRC (Quit: Leaving)
[10:39] *** benjinss has joined #archiveteam-bs
[10:41] *** benjinsmi has quit IRC (Ping timeout: 252 seconds)
[10:49] *** bluefoo has quit IRC (Quit: bluefoo)
[11:04] *** bluefoo has joined #archiveteam-bs
[12:07] *** kiskabak has quit IRC (Remote host closed the connection)
[12:07] *** kiskabak has joined #archiveteam-bs
[12:07] *** Fusl sets mode: +o kiskabak
[12:07] *** Fusl__ sets mode: +o kiskabak
[12:07] *** Fusl_ sets mode: +o kiskabak
[13:19] *** asie has joined #archiveteam-bs
[13:47] *** Damme has joined #archiveteam-bs
[13:47] *** Damme has quit IRC (Remote host closed the connection)
[13:48] *** Damme has joined #archiveteam-bs
[13:55] *** killsushi has quit IRC (Quit: Leaving)
[14:38] *** DogsRNice has joined #archiveteam-bs
[14:45] *** zhongfu has quit IRC (Ping timeout: 745 seconds)
[15:29] *** joepie91 has joined #archiveteam-bs
[15:43] *** ola_norsk has joined #archiveteam-bs
[15:45] Would anyone happen to know if it's possible to cheese a 'Yes, i'm old enough to view this content' cookie? Steam forums have this annoying age restriction on certain of its apps' discussions.. https://i.imgur.com/CJWRvXc.png
[15:46] *** bluefoo has quit IRC (Read error: Operation timed out)
[15:46] and scrapy seems to return a 404 on those items, unless the question is accepted.
[15:47] e.g whether or not there's a cookie saying 'it's ok, i'm old enough'
[15:48] i'm guessing it's the rgTopicView_General* cookie
[15:48] but, not sure how to apply it for other apps
[15:51] is your question "how do you put a cookie into scrapy"?
[15:52] yes, but more like _that specific_ cookie
[15:53] the one that allows 'viewing' age restricted 'apps' discussion
[15:54] e.g https://steamcommunity.com/app/730/discussions/ will require age-consent question to be answered with 'Yes'
[15:55] ivan_: ^
[15:56] So your question is "what is the cookie"?
[15:57] i suppose, if one consent will allow other apps to no ask the question over again
[15:57] to not*
[15:58] if however there's one damn cookie for each app, each requiring consent..then problem grows
[16:02] on further inspection, seems like one cookie consenting will do for other apps
[16:05] that, or PUBG doesn't require age-consent..
[16:16] ivan_: JAA: I guess my question is ultimately 'How might i set an age-consent cookie, without having to set it by logging in to an actual account'
[16:17] e.g CS:GO and DOOM will ask for age-consent
[16:17] just to view the 'discussions'
[16:27] i'm not even sure if it will help using a logged in cookie, i suspect it will still ask
[16:32] *** icedice has quit IRC (Leaving)
[16:38] you will have to set it once with your browser so you know the name and value
[16:38] just use a private session
[16:45] schbirid: i did try that, though i'm struggling to identify the actual cookie name
[16:46] huh, i am unable to identify it myself. how annoying
[16:46] aye
[16:46] hm, in opera it is in "Session Storage" not a cookie. no idea about that new web stuff...
[16:46] e.g https://store.steampowered.com/app/412020/Metro_Exodus/ will seemingly open in chromium incognito window
[16:47] even though it's 18yo
[16:49] schbirid: i wouldn't have come here to ask unless i also found it quite annoying :D
[17:00] ola_norsk: FYI, I can access https://steamcommunity.com/app/730/discussions/ without any age consent bullshit. No cookies or anything, JS disabled, etc.
[17:01] I can't from UK
[17:02] I had to click through to view content
[17:04] i also had to click through
[17:06] Session Storage is the new cookie
[17:06] more robust as it acts like a database
[17:06] i assume this is on people's radar here? https://bitbucket.org/blog/sunsetting-mercurial-support-in-bitbucket
[17:07] yes
[17:07] "June 1, 2020: users will not be able to use Mercurial features in Bitbucket or via its API and all Mercurial repositories will be removed."
[17:11] *** arkiver has quit IRC (Quit: Leaving)
[17:12] *** Arkiver2 has joined #archiveteam-bs
[17:12] *** Fusl sets mode: +o Arkiver2
[17:12] *** Fusl__ sets mode: +o Arkiver2
[17:12] *** Fusl_ sets mode: +o Arkiver2
[17:12] Yeah, so about that, is anyone here very familiar with how Mercurial works? I've been meaning to figure out how to best archive hg repos for a while.
[17:13] * Arkiver2 is back :)
[17:13] now going to join all my lost channels
[17:13] thanks JAA
[17:15] For SVN, we have svnrdump. For git, I've been creating bundles using git clone --mirror + git bundle create $filename --all (which preserves all branches and tags present on the remote). But I have no idea what the appropriate method for fully archiving an hg repo is, mostly since I've never used hg.
[17:17] *** RichardG has joined #archiveteam-bs
[17:18] *** Arkiver2 has quit IRC (Connection closed)
[17:18] JAA: hmm..could it be on a country-basis then? With https://steamcommunity.com/app/730/discussions/ , in chromium incognito window, i get https://i.imgur.com/bkbcu2W.png
[17:18] *** arkiver has joined #archiveteam-bs
[17:18] *** Fusl__ sets mode: +o arkiver
[17:18] *** Fusl sets mode: +o arkiver
[17:18] *** Fusl_ sets mode: +o arkiver
[17:24] it will however open fine in lynx, so i'll just set the user agent to that
[17:25] *** pombreda has joined #archiveteam-bs
[17:25] any new websites going down that I missed?
[17:26] tumblr safe?
[17:26] FYI, re BB retiring hg support, the SWH project has the code to archive it https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/bitbucket/ and may have already don this
[17:26] *done
[17:31] arkiver: some might be found in this article https://www.thedailybeast.com/deadspin-editor-quits-rails-against-bosses-ive-been-repeatedly-lied-to-and-gaslit
[17:34] pretty much anything at/under 'G/O Media' https://g-omedia.com/
[17:34] We really need to do a better job at keeping the Deathwatch page on the wiki updated...
[17:38] JAA: i'm somewhat familiar with hg
[17:38] the bitbucket announcement suggests using hg-fast-export and feed that into git-fast-import
[17:38] or something like that
[17:38] seems like anything listed here might be under scrutiny to be 'axed' https://www.greathillpartners.com/portfolio/ if it's a social media/news/ outlet
[17:38] i wonder if 'hg-fast-export' is a reliable archiving format
[17:39] there's https://www.mercurial-scm.org/wiki/BackUp which says "use clone", "copy the filesystem" or "hg bundle --all"
[17:40] i'd assume hg is a good bet
[17:41] is hg better than git?
[17:41] 'preferable/easier' to use, i mean
[17:42] it's a matter of opinion
[17:42] ok
[17:42] i have found it easier to use
[17:45] i've seen bitbucket supports it, don't know about github though
[17:45] github does not
[17:45] *** Dragnog94 has quit IRC (The Lounge - https://thelounge.chat)
[17:45] it's called github for a reason
[17:48] *** Dragnog94 has joined #archiveteam-bs
[17:49] *** asdf0101 has quit IRC (Read error: Operation timed out)
[17:49] *** markedL has quit IRC (Read error: Operation timed out)
[17:50] i guess so
[17:50] *** ola_norsk has quit IRC (leaving)
[18:04] You'd think so, but GitHub has support for SVN.
[18:05] it does?
[18:10] Yup: https://help.github.com/en/articles/support-for-subversion-clients
[18:12] Regarding hg, looks like Software Heritage also uses the clone + bundle method: https://forge.softwareheritage.org/source/swh-loader-mercurial/browse/master/swh/loader/mercurial/loader.py$149-170
[18:14] Specifically, it seems to be a plain "hg clone REPO DIR" and then "hg bundle -t none-v2 -a FILE".
[18:16] ah right, because you can't "hg bundle URL" i guess
[18:17] i wonder if software heritage already got us covered on this
[18:21] Another question is what happens with issues, PRs, etc.
[18:21] yeah, that's a different problem isn't it :/
[19:01] *** PhrackD has quit IRC (Read error: Operation timed out)
[19:18] *** tomaspark has joined #archiveteam-bs
[19:33] *** PhrackD has joined #archiveteam-bs
[20:19] *** ShellyRol has quit IRC (Read error: Operation timed out)
[20:20] *** ShellyRol has joined #archiveteam-bs
[20:34] glad to see anarcat involved in this matter, I have faith we'll manage this one :_)
[20:36] ha!
[20:36] not sure i did anything :p
[20:36] or will do anything, even :/
[20:38] well, discussing the tech issues is a big part of figuring stuff out
[20:39] ^
[20:39] hehe
[20:40] re other stuff than source, there's https://confluence.atlassian.com/bitbucketserver/exporting-957497835.html, but it seems this creates a tarball on the server so it's not useful for us
[20:41] seems to be an API for https://confluence.atlassian.com/bitbucket/export-or-import-issue-data-330797432.html
[20:41] i doubt this will be possible without an API key or a user
[20:59] * Jon goes to bed
[22:34] *** BlueMax has joined #archiveteam-bs
[23:03] *** qw3rty114 has joined #archiveteam-bs
[23:08] *** qw3rty113 has quit IRC (Ping timeout: 600 seconds)
[23:38] *** markedL has joined #archiveteam-bs
[23:38] *** asdf0101 has joined #archiveteam-bs
[23:45] *** kiska1 has quit IRC (Remote host closed the connection)
[23:45] *** d5f4a3622 has quit IRC (Read error: Connection reset by peer)
[23:45] *** kiska1 has joined #archiveteam-bs
[23:45] *** Fusl__ sets mode: +o kiska1
[23:45] *** Fusl sets mode: +o kiska1
[23:46] *** Fusl_ sets mode: +o kiska1
[23:48] *** d5f4a3622 has joined #archiveteam-bs
[23:55] *** tomaspark has quit IRC (Ping timeout: 360 seconds)
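
Note on the archiving commands quoted in the log: the git method from 17:15 (git clone --mirror, then git bundle create $filename --all) and the hg method from 18:14 (hg clone REPO DIR, then hg bundle -t none-v2 -a FILE) could be wrapped roughly as below. This is only a sketch under assumptions: the function names, the (command, cwd) step layout, and the use of the clone directory as working directory for the bundle step are illustrative choices, not anything stated in the channel, and it assumes the git and hg executables are on PATH.

```python
import subprocess
from pathlib import Path


def git_archive_commands(url, mirror_dir, bundle_file):
    """Return (command, cwd) steps for the git method quoted at 17:15.

    Hypothetical helper: the names and the step layout are assumptions.
    """
    # Resolve the bundle path up front because the second step runs
    # with the mirror directory as its working directory.
    bundle = str(Path(bundle_file).resolve())
    return [
        (["git", "clone", "--mirror", url, str(mirror_dir)], None),
        # --all asks git to include every ref of the mirror in the bundle.
        (["git", "bundle", "create", bundle, "--all"], str(mirror_dir)),
    ]


def hg_archive_commands(url, clone_dir, bundle_file):
    """Return (command, cwd) steps for the hg method quoted at 18:14."""
    bundle = str(Path(bundle_file).resolve())
    return [
        (["hg", "clone", url, str(clone_dir)], None),
        # Flags copied verbatim from the 18:14 message.
        (["hg", "bundle", "-t", "none-v2", "-a", bundle], str(clone_dir)),
    ]


def run_all(steps):
    """Run each step in order and stop at the first failing command."""
    for command, cwd in steps:
        subprocess.run(command, cwd=cwd, check=True)
```

With a placeholder repository URL, the hg variant would be invoked as run_all(hg_archive_commands("https://example.org/some/repo", "repo", "repo.hg")). Building the command lists separately from running them keeps the exact invocations easy to inspect or log before anything touches the network.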