[00:08] *** dashcloud has joined #archiveteam-ot
[00:19] Whoa, Al Lowe's source codes are already at $10k each with 5 days left still...
[00:20] For the unaware: https://arstechnica.com/gaming/2018/11/al-lowe-reveals-his-sierra-source-code-collection-then-puts-all-of-it-on-ebay/
[00:24] *** dashcloud has quit IRC (Ping timeout: 252 seconds)
[00:28] :o
[00:45] https://www.ebay.com/sch/al_lowe/m.html?item=183561117878&rt=nc&_trksid=p2047675.l2562
[00:45] some of these items are already beyond $10,000 each
[00:46] Does archive.org have the funds to get them? They seem awfully valuable to history
[00:47] no
[00:47] That sucks. Maybe someone will donate them, who knows. More likely they will end up in some basement somewhere
[00:48] JAA is buying them up isn't he
[00:49] Just for us
[00:56] Just buy them, copy, and sell them on eBay again.
[00:57] Kind of reminds me of a manga publisher
[00:57] *** dashcloud has joined #archiveteam-ot
[00:58] There's a long-running manga series that had a bunch of magazine exclusive colour pages that got grayscaled for the volume releases
[00:58] And they lost the original colour versions
[00:59] The publisher managed to get them back in meh scan quality
[01:00] Meanwhile I'm buying the original magazines, debinding, and scanning them
[01:00] I only have one of those magazines so far, since they're rare
[01:02] There was also that whole thing with the original Japanese Dragon Ball Z audio
[01:02] TV Tokyo lost it and only like two people got it on VHS
[01:03] But TV Tokyo never reached out to them since they refused to admit that they messed up
[01:03] It got released on Nyaa one or two years ago
[02:13] *** icedice has quit IRC (Quit: Leaving)
[03:15] *** hook54321 has quit IRC (Quit: Connection closed for inactivity)
[03:24] *** hook54321 has joined #archiveteam-ot
[03:34] *** vectr0n_ has joined #archiveteam-ot
[03:41] *** vectr0n has quit IRC (ZNC - https://znc.in)
[03:41] *** vectr0n_ is now known as vectr0n
[04:03] *** vectr0n has quit IRC (ZNC - https://znc.in)
[04:11] *** vectr0n has joined #archiveteam-ot
[04:18] *** odemg has quit IRC (Ping timeout: 265 seconds)
[04:30] *** odemg has joined #archiveteam-ot
[04:38] *** Stiletto has joined #archiveteam-ot
[04:57] *** jodizzle_ has quit IRC (west.us.hub irc.Prison.NET)
[04:59] *** jodizzle_ has joined #archiveteam-ot
[04:59] *** jodizzle_ has quit IRC (Read error: Operation timed out)
[05:01] *** jodizzle has joined #archiveteam-ot
[05:46] *** Jens has quit IRC (Read error: Connection reset by peer)
[06:05] *** JensRex has joined #archiveteam-ot
[06:14] *** icedice has joined #archiveteam-ot
[07:26] *** wp494 has quit IRC (Read error: Operation timed out)
[07:27] *** wp494 has joined #archiveteam-ot
[07:27] *** svchfoo3 sets mode: +o wp494
[09:33] Raccoon: lol, I wish.
[09:33] You have to ask yourself then; why even do any of this?
[09:34] If you can never realize this crowning moment
[09:37] That could be YOUR Softporn github repository
[09:39] ... Honestly though. Somebody of stature should reach out to this gentleman and introduce him to the prospect of having his collection archived before he sells any of it.
[09:41] He worked for Sierra and wrote text adventures. Of course he knows of and read and copied textfiles.com
[09:42] "Hi. I'm internet celebrity Jason Scott, and I want to feature your Sierra Online collection... Online!"
[09:43] i'd use those words exactly
[10:29] JAA: Are there any plans for snscrape to support Mastodon/fediverse hashtags?
[10:37] hook54321: No plans currently, but it could certainly be done.
[10:37] well, kinda. It probably wouldn't be as simple as centralized social networks.
[10:39] Do you have an example?
[10:40] Instances can block other instances, so grabbing a list of posts from one instance might not necessarily grab all of them. https://mastodon.social/tags/mastodon
[10:41] There are also RSS feeds, but they don't show all the posts of course. https://mastodon.social/tags/mastodon.rss
[10:42] Also, as far as I know most instances don't have any sort of full text post search built in, and I'm pretty sure the ones that do require a login, as far as I can tell at least.
[10:43] Hmm, right.
[11:04] *** ola_norsk has quit IRC (leaving)
[11:09] *** Albardin has joined #archiveteam-ot
[11:09] *** w0rmybak has joined #archiveteam-ot
[11:09] *** kiskabak has joined #archiveteam-ot
[11:15] *** BlueMax has quit IRC (Quit: Leaving)
[11:20] hook54321: I guess any Mastodon scraper would have to work on a particular instance. If you need more than that, you need to run it multiple times starting from different instances and then merge those lists. At least I can't think of another way that would work reliably. Only following all results and then scraping on those instances as well can easily explode, and it would likely not manage to find
[11:20] everything anyway; it would also not work well with the --max-results option.
[12:09] hook54321: could we not run our own instance and then use that to scrape other instances?
[12:10] I would assume it's theoretically possible
[12:57] Found something interesting
[12:58] "Pete Collins - Founder and CEO" - https://www.mozdevgroup.com/about/
[12:58] "Pete Collins, Director" - https://www.mozdev.org/about.html
[12:59] *** Mateon1 has quit IRC (Read error: Operation timed out)
[12:59] arkiver HCross: Can you please join #tumbledown
[12:59] *** Mateon1 has joined #archiveteam-ot
[13:39] run a dozen mastodon archivers, have them md5 the metadata of the post, and then a batcher to dedupe them and upload to the IA
[14:59] *** ola_norsk has joined #archiveteam-ot
[15:03] Hi. Does anyone know if this collection is connected to "MirrorTube"? https://archive.org/details/tubeup&tab=about . Reason i'm asking is that i'm slowly doing 'TNN News' (Tommy Sotomayor). And i noticed that in that collection, several items are marked "May be inappropriate", which might fit for Mr. Sotomayor at times... Or is anything from YouTube okidoki as far as 'appropriate' goes?
[15:05] NVM..i just realized i got the wrong 'TubeUP' lol
[15:08] *** ola_norsk has quit IRC (don't duckduckgo that!)
[15:55] *** hook54321 has quit IRC (Quit: Connection closed for inactivity)
[16:03] *** LFlare43 has quit IRC (Quit: The Lounge - https://thelounge.chat)
[16:05] *** LFlare43 has joined #archiveteam-ot
[16:28] *** wp494 has quit IRC (Read error: Operation timed out)
[16:28] *** wp494 has joined #archiveteam-ot
[16:28] *** svchfoo3 sets mode: +o wp494
[16:51] *** ola_norsk has joined #archiveteam-ot
[16:53] cf: FWIW, here are the links+ids that have come up as "400's" so far https://ia801506.us.archive.org/9/items/400_badrequest_yt_ids_2018-12-05_olno/BadRequest_yt_id.txt
[18:03] Fuck me. Facebook started doing achievement badges. (today?)
[18:06] T-10 SpaceX Dragon Launch to Resupply International Space Station ISS - https://youtu.be/Esh1jHT9oTA
[18:49] *** Despatche has joined #archiveteam-ot
[19:21] *** Despatche has quit IRC (Read error: Connection reset by peer)
[19:21] *** Despatche has joined #archiveteam-ot
[19:31] *** Despatche has quit IRC (Read error: Connection reset by peer)
[19:32] *** Despatche has joined #archiveteam-ot
[19:55] rofl "unprivileged users with UID > INT_MAX can successfully execute any systemctl command" https://gitlab.freedesktop.org/polkit/polkit/issues/74
[20:01] *** ola_norsk has quit IRC (leaving)
[20:31] *** Despatche has quit IRC (Read error: Operation timed out)
[20:32] *** Despatche has joined #archiveteam-ot
[20:37] *** BlueMax has joined #archiveteam-ot
[21:00] *** Stilett0 has joined #archiveteam-ot
[21:01] *** Stiletto has quit IRC (Ping timeout: 252 seconds)
[21:05] *** Bydoless has joined #archiveteam-ot
[21:15] Is this a good channel to talk about more personal archiving/mirroring projects?
[21:16] hi Bydoless. do you follow reddit.com/r/DataHoarder?
[21:16] no
[21:17] I'd highly recommend it.
[21:17] thanks.
[21:18] I'm more of a lurker/learner here, so I'd have to defer your question to the others. It's definitely a neat channel to hang out in.
[21:18] Bydoless: sure! welcome
[21:20] on that note, does anyone have a ready-to-use collection of kitten head cutout images around? :D
[21:24] https://www.reddit.com/r/cutouts/search?q=cat&restrict_sr=1
[21:24] *** kiskabak has quit IRC (Ping timeout: 265 seconds)
[21:24] *** w0rmybak has quit IRC (Ping timeout: 265 seconds)
[21:25] Main targets of mine include fc2.com, axfc.net, getuploader.com, blogs.yahoo.jp, and i guess geocities japan but AT's already working on that to some extent.
[21:26] mainly japanese blogging and file sharing sites.
[21:28] *** Stiletto has joined #archiveteam-ot
[21:30] *** Albardin has quit IRC (Ping timeout: 600 seconds)
[21:31] *** Stilett0 has quit IRC (Ping timeout: 633 seconds)
[21:38] the main goal of my archiving is to find files corresponding to custom characters made for the fighting game engine M.U.G.E.N through more primary sources than current 'warehousing' sites like MUGEN Archive and Mugen Free for All
[21:43] *** teej_ has joined #archiveteam-ot
[21:44] Hello again icedice.
[21:44] Hi teej_
[21:45] I'm also trying Matrix/Riot. I log into it once every couple months to see what's new.
[21:46] If you want to keep your username and voice (assuming that you are given voice) you should get a BNC (aka IRC bouncer, aka IRC shell)
[21:46] I've been thinking about getting one set up since forever, but never got around to it
[21:47] BNC keeps your IRC nick online 24/7
[21:47] It also records chat history
[21:49] I tried setting up a BNC a few years ago. I didn't think it was worth the hassle and started using IRCCloud instead.
[21:50] Now, for axfc.net, it's a Japanese file sharing site with around 1.3 million public files, ~350000 of which are not password protected. Problem 1 is that it dislikes bots. For one, it quickly starts throwing letter image captchas.
[21:51] Can we capture the cookie when it does that? Sometimes when cookies are set for that sort of thing, it'll ignore it for a while
[21:51] I can only join 2 networks simultaneously using the free tier for IRCCloud.
[21:51] IRCCloud works as well
[21:52] There are self-hostable alternatives to IRCCloud that are unlimited, but they are a bit of a pain in the ass to set up
[21:52] I've been on Freenode and the Mozilla network. I recently joined EFnet for this ArchiveBot channel.
[21:53] So I'm currently offline on the Mozilla network.
[21:54] Maybe https://convos.by/ has gotten easier to set up since I tried it back in like 2016
[21:54] Anyway, it seems pretty nice
[21:54] icedice: I've been waiting for a few years for Matrix to mature. Hopefully Matrix will solve many IRC-related troubles.
[21:55] *** BlueMax has quit IRC (Quit: Leaving)
[21:56] I frequently restart my laptop, so a BNC wouldn't work.
[21:57] @kiska not sure of the best way to do that
[21:59] I thought BNC would be a server solution that wouldn't require the computer to be on
[21:59] Sigh
[21:59] Matrix is a completely different protocol though
[21:59] IRC user base is shrinking year over year
[21:59] And everyone is going to Discord instead
[22:02] For more info, the listing of the files looks like this: http://www.axfc.net/u/search.pl?num=50&sort=1&sort_m=DESC and a url for one of the files without password protection looks like this: http://www.axfc.net/uploader/so/3947763
[22:02] icedice: A BNC server essentially needs to stay online to keep the user online and save the chat history.
[22:03] or this: http://www.axfc.net/u/3947763
[22:03] I don't use Discord. I think it's more popular for computer gamers.
[22:05] there are free ZNC providers
[22:06] #wl
[22:06] I've seen people use Gitter too.
[22:06] zulip is best
[22:06] streams and topics = <3
[22:07] Thanks ivan. I didn't know there were free ones.
[22:08] https://wiki.znc.in/Providers
[22:08] https://shells.red-pill.eu/
[22:08] http://curlie.org/Computers/Internet/Chat/IRC/Shell_Providers/
[22:09] https://old.reddit.com/r/irc/search?q=ZNC&restrict_sr=on&sort=relevance&t=all
[22:10] *** VerifiedJ has quit IRC (Quit: Leaving)
[22:13] *** Despatche has quit IRC (Read error: Operation timed out)
[22:14] Interesting...
[22:14] I don't know if I would use these ZNCs.
[22:14] No guarantee how much they'll log
[22:17] every time the initial download page is loaded (ex: http://www.axfc.net/u/3947763), it includes a form with hidden 'sid' and 'dqn' inputs containing random numbers, plus a text field for the captcha called 'cpt' if prompted, a password text field called 'keyword', and two checkboxes that determine whether the file is saved under the name shown on the page or renamed to the file id number.
[22:17] I opened up Riot. It still feels very buggy.
[22:18] I think over the past couple years, I've grown to like terminal UIs and CLIs more than GUIs.
[22:19] they get submitted to '/u/dl2.pl' when you click the download button; depending on whether that succeeds, you get either an error page or another page to confirm the download.
[23:03] *** BlueMax has joined #archiveteam-ot
[23:15] *** hook54321 has joined #archiveteam-ot
[23:46] *** astrid has left ][
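Note on the axfc.net download form described at [22:17] and [22:19]: a rough sketch of how a scraper might pull the form fields out of the page and assemble the data to submit. It uses only the Python standard library and makes no network requests. The field names 'sid', 'dqn', 'cpt' and 'keyword' and the '/u/dl2.pl' target come from the messages above. Everything else is an assumption made for illustration: the sample markup, the checkbox names ('opt_a', 'opt_b'), the helper names, and the choice to leave both checkboxes unticked. The real page may be laid out differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CaptchaRequired(Exception):
    """Raised when the page asks for a captcha and none was supplied."""


class FormFieldParser(HTMLParser):
    """Collect the action and <input> fields of the first <form> on a page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}  # input name -> (input type, value)
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attr.get("action")
        elif tag == "input" and self._in_form and attr.get("name"):
            kind = (attr.get("type") or "text").lower()
            self.fields[attr["name"]] = (kind, attr.get("value") or "")

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_request(page_url, html, password="", captcha=None):
    """Return (submit_url, payload) for the download form found in html."""
    parser = FormFieldParser()
    parser.feed(html)
    payload = {}
    for name, (kind, value) in parser.fields.items():
        if kind == "hidden":
            payload[name] = value      # e.g. the random 'sid' and 'dqn' values
        elif name == "keyword":
            payload[name] = password   # empty for files without a password
        elif name == "cpt":
            if captcha is None:
                raise CaptchaRequired(page_url)
            payload[name] = captcha
        # Checkboxes are left unticked, so they are simply not submitted.
    return urljoin(page_url, parser.action), payload


# Hypothetical markup; only the sid/dqn/keyword names and the action are from the log.
SAMPLE_HTML = """
<form action="/u/dl2.pl">
  <input type="hidden" name="sid" value="12345">
  <input type="hidden" name="dqn" value="67890">
  <input type="text" name="keyword">
  <input type="checkbox" name="opt_a" value="1">
  <input type="checkbox" name="opt_b" value="1">
</form>
"""

if __name__ == "__main__":
    url, data = build_request("http://www.axfc.net/u/3947763", SAMPLE_HTML)
    print(url, data)
```

Raising an exception when a 'cpt' field shows up keeps the captcha case explicit, so a crawler could back off or hand the item to a human instead of submitting a form that is bound to fail.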