[00:13] *** cerca has joined #archiveteam-bs
[00:30] *** mtntmnky has quit IRC (Remote host closed the connection)
[00:30] *** mtntmnky has joined #archiveteam-bs
[00:43] *** Atom__ has quit IRC (Ping timeout: 276 seconds)
[00:44] *** Atom__ has joined #archiveteam-bs
[01:00] SketchCow: i'm uploading the original cbz files that i made the mobile beat pdfs from
[01:09] *** ShellyRol has quit IRC (Ping timeout: 496 seconds)
[01:22] *** ShellyRol has joined #archiveteam-bs
[01:25] *** mtntmnky has quit IRC (Remote host closed the connection)
[01:26] *** mtntmnky has joined #archiveteam-bs
[01:29] *** VerifiedJ has quit IRC (Quit: Leaving)
[01:37] *** britmob has quit IRC (Read error: Connection reset by peer)
[01:40] *** britmob has joined #archiveteam-bs
[01:58] *** Jens has quit IRC (Remote host closed the connection)
[01:58] *** Jens has joined #archiveteam-bs
[02:08] *** godane has quit IRC (Ping timeout: 246 seconds)
[02:23] *** godane has joined #archiveteam-bs
[02:28] *** odemg has quit IRC (Quit: Leaving)
[02:41] nicolas17, OrIdow6 from my quick look at po.st, it didn't appear to offer shortened URLs to the general public. Rather it seemed to be more of a traffic tracker, possibly for emails or inside a company's own website. If a company that used po.st is still maintaining its website, I would imagine they'll fix their URLs. I think losing bit.ly or tinyurl.com would easily be worse.
[02:42] I don't think they were for emails
[02:42] It was primarily to track links from social media etc.
[02:45] -> a lot of public links
[02:46] OrIdow6: do you know who owns the company?
[02:46] Big loss, I should expect, is from customers using a custom domain, e.g. tylt.it (which not returns 404s on everything)
[02:46] *now
[02:48] hook54321: https://www.rhythmone.com/ - I do not have any special knowledge about them except by poking about their website
[02:55] I think it's this https://en.wikipedia.org/wiki/RadiumOne
[02:59] *** Maylay has joined #archiveteam-bs
[03:00] https://web.archive.org/web/20180813105249/https://blog.po.st/ is RadiumOne
[03:02] BUT this references po.st also https://www.rhythmone.com/privacy-policy
[03:02] cross ownership or ownership changes likely
[03:05] As I said a day ago in #urlteam, the shutdown notice said that RhythmOne were the owners at that time
[03:06] "that time" being a day ago
[03:09] here is the answer: https://www.cmo.com.au/article/621238/adtech-company-rhythmone-acquires-radiumone/
[03:11] I'll send them an email
[03:24] *** Raccoon` has joined #archiveteam-bs
[03:24] *** Raccoon has quit IRC (Ping timeout: 258 seconds)
[03:24] *** Raccoon` is now known as Raccoon
[03:29] *** AeonG has quit IRC (Read error: Operation timed out)
[03:43] *** Raccoon has quit IRC (Ping timeout: 258 seconds)
[04:40] *** odemgi has joined #archiveteam-bs
[04:44] *** odemgi_ has quit IRC (Read error: Operation timed out)
[04:58] *** qw3rty__ has joined #archiveteam-bs
[05:05] *** qw3rty_ has quit IRC (Read error: Operation timed out)
[05:14] *** kiska18 has quit IRC (Read error: Operation timed out)
[05:14] *** kiska18 has joined #archiveteam-bs
[05:15] *** svchfoo3 sets mode: +o kiska18
[05:15] *** svchfoo1 sets mode: +o kiska18
[05:32] atphoenix: do you know what the storage limit is for users using Frontier's FTP service?
[05:32] I think 25 MB FTP and 25 MB web
[05:33] Every Frontier customer starts off with 25MB for their web space and 25MB for their email, depending on their service plan.
[05:33] per https://frontier.com/helpcenter/categories/online-services/advanced-features/upload-my-web-site
[05:33] that URL lists the naming patterns to expect
[05:33] and other related domains too
[05:34] If your email address ends with… Your public files are available at…
[05:34] @frontier.com ftp.frontier.com/pub/users/yourusername
[05:34] @frontiernet.net ftp.frontiernet.net/pub/users/yourusername
[05:34] @citlink.net ftp.citlink.net/pub/users/yourusername
[05:34] @newnorth.net ftp.newnorth.net/pub/users/yourusername
[05:34] @epix.net ftp1.epix.net/pub/users/yourusername
[05:34] @gvni.com ftp://username@gvni.com/pub/users/yourusername
[05:34] I recognize some of those various domains as companies Frontier ingested
[05:36] so far all the sites I've found hosted on Frontier are simple pages that should be AB-friendly
[05:37] except for the ftp, that is
[05:37] and that caveat about the site throwing errors on read attempts
[05:37] (sometimes)
[05:42] https://transfer.notkiska.pw/FMcrL/Frontier_myplace_all_users_pages https://transfer.notkiska.pw/L0owC/Frontiernet_all_users_pages
[05:43] Userlists so far
[05:43] Not all are still alive
[05:47] frontier ftp is slow. 13 MB file, 4 minutes ETA
[05:48] What are you downloading?
[05:49] I'm poking around in ftp://ftp.frontier.com/pub/users/usnraptor/Fighters%20Anthology/Music/
[05:52] the big file is a sound mod for the game. The small file contains many MIDIs.
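The email-domain table quoted from the Frontier help page above could be turned into a small lookup for building candidate FTP URLs from a user list. This is only a sketch: the function name `public_ftp_url` is hypothetical, and it assumes the FTP username equals the part of the email address before the `@`, which the help page implies ("yourusername") but does not state outright.

```python
# Hypothetical helper: maps a Frontier-family email address to the public
# FTP URL pattern listed on the Frontier help page quoted in the log.
# Assumption: the FTP username is the local part of the email address.

FTP_HOSTS = {
    "frontier.com": "ftp.frontier.com",
    "frontiernet.net": "ftp.frontiernet.net",
    "citlink.net": "ftp.citlink.net",
    "newnorth.net": "ftp.newnorth.net",
    "epix.net": "ftp1.epix.net",
}


def public_ftp_url(email):
    """Return the expected public FTP URL for an email address, or None."""
    user, sep, domain = email.strip().partition("@")
    if not sep or not user:
        return None  # not an email address
    domain = domain.lower()
    if domain == "gvni.com":
        # The help page lists a different pattern for this domain.
        return f"ftp://{user}@gvni.com/pub/users/{user}"
    host = FTP_HOSTS.get(domain)
    if host is None:
        return None  # domain not in the quoted table
    return f"ftp://{host}/pub/users/{user}"


if __name__ == "__main__":
    for address in ("usnraptor@frontier.com", "someone@epix.net", "x@example.org"):
        print(address, "->", public_ftp_url(address))
```

Whether a given URL still resolves has to be checked separately, since not all listed users are still alive.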
[06:00] *** cerca has quit IRC (Remote host closed the connection)
[06:14] *** Raccoon has joined #archiveteam-bs
[06:23] *** ShellyRol has quit IRC (Read error: Connection reset by peer)
[06:23] *** ShellyRol has joined #archiveteam-bs
[06:32] *** Atom-- has joined #archiveteam-bs
[06:36] *** Atom__ has quit IRC (Ping timeout: 276 seconds)
[06:42] *** ShellyRol has quit IRC (Ping timeout: 610 seconds)
[06:42] *** ShellyRol has joined #archiveteam-bs
[06:46] *** nicolas17 has quit IRC (Ping timeout: 746 seconds)
[06:50] *** RichardG has quit IRC (Ping timeout: 615 seconds)
[07:51] SketchCow, tubeup means throw into youtubearchive?
[07:53] I imagine they need to actually be on IA, so tubeup.
[07:54] https://www.youtube.com/user/madbitcoins is currently on YT
[07:55] all 3 are on YT currently
[07:55] are you familiar with tubeup? https://github.com/bibanon/tubeup
[07:55] It rips them from youtube and uploads to IA
[07:56] no, not familiar with that. Only familiar with ivan's archive
[07:57] well, I mean I've heard that YT->IA is on hold because junk was getting sent in
[07:59] *** closure has quit IRC (Read error: Connection reset by peer)
[08:01] yeah, risky to rely on it IMO.
[08:03] I'll submit to Ivan's youtubearchiver under assumption these are at risk.
[08:04] "tubeup uses youtube-dl to download a Youtube video (or any other provider supported by youtube-dl), and then uploads it with all metadata to the Internet Archive."
[08:07] 1 of the 3 channels was already in ivan's archive. I don't know how to move anything to IA, so I'll leave that to someone else.
[09:11] *** amelia386 has quit IRC ()
[09:11] *** amelia386 has joined #archiveteam-bs
[10:23] *** picklefac has quit IRC ()
[10:23] *** picklefac has joined #archiveteam-bs
[10:26] *** LowLevelM has quit IRC (Read error: Operation timed out)
[10:31] *** ibachandl has quit IRC (Remote host closed the connection)
[10:31] *** Dallas has quit IRC (Read error: Connection reset by peer)
[10:32] *** ibachandl has joined #archiveteam-bs
[10:34] *** marked1 has quit IRC (Read error: Connection reset by peer)
[10:34] *** Maylay has quit IRC (Ping timeout: 276 seconds)
[10:36] *** atphoenix has quit IRC (Ping timeout: 276 seconds)
[10:37] *** RichardG has joined #archiveteam-bs
[10:37] *** atphoenix has joined #archiveteam-bs
[10:39] *** Maylay has joined #archiveteam-bs
[10:39] *** OrIdow6 has quit IRC (Ping timeout: 276 seconds)
[10:52] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[10:58] *** DFJustin has quit IRC (Ping timeout: 745 seconds)
[11:01] *** OrIdow6 has joined #archiveteam-bs
[11:31] *** ShellyRol has quit IRC (Read error: Connection reset by peer)
[11:35] *** ShellyRol has joined #archiveteam-bs
[12:04] *** OrIdow6 has quit IRC (Ping timeout: 276 seconds)
[12:21] *** ibachandl has quit IRC (Ping timeout: 610 seconds)
[13:22] *** OrIdow6 has joined #archiveteam-bs
[13:46] *** closure has joined #archiveteam-bs
[15:13] *** schbirid has joined #archiveteam-bs
[15:14] *** VerifiedJ has joined #archiveteam-bs
[15:15] *** X-Scale has quit IRC (Read error: Operation timed out)
[15:35] *** marked1 has joined #archiveteam-bs
[15:40] *** anarcat has quit IRC (se.hub irc.efnet.nl)
[15:40] *** d5f4a3622 has quit IRC (se.hub irc.efnet.nl)
[15:40] *** brayden_ has quit IRC (se.hub irc.efnet.nl)
[15:40] *** Tenebrae has quit IRC (se.hub irc.efnet.nl)
[15:40] *** PurpleSym has quit IRC (se.hub irc.efnet.nl)
[15:40] *** MrRadar2 has quit IRC (se.hub irc.efnet.nl)
[15:40] *** anarcat has joined #archiveteam-bs
[15:40] *** d5f4a3622 has joined #archiveteam-bs
[15:40] *** brayden_ has joined #archiveteam-bs
[15:40] *** Tenebrae has joined #archiveteam-bs
[15:40] *** PurpleSym has joined #archiveteam-bs
[15:40] *** MrRadar2 has joined #archiveteam-bs
[15:40] *** irc.efnet.nl sets mode: +o PurpleSym
[16:02] *** Ryz has quit IRC (Remote host closed the connection)
[16:02] *** kiska18 has quit IRC (Remote host closed the connection)
[16:03] *** kiska18 has joined #archiveteam-bs
[16:03] *** svchfoo3 sets mode: +o kiska18
[16:03] *** Ryz has joined #archiveteam-bs
[16:03] *** svchfoo1 sets mode: +o kiska18
[16:29] *** Dallas has joined #archiveteam-bs
[17:15] What is all this
[17:24] *** ibachandl has joined #archiveteam-bs
[17:42] you remember how you darked a bunch of youtube uploads
[17:42] well, noindex'd
[18:05] *** Harzilein has joined #archiveteam-bs
[18:34] *** ibachand has joined #archiveteam-bs
[18:34] *** ibachandl has quit IRC (Read error: Connection reset by peer)
[19:00] *** DLoader_ has joined #archiveteam-bs
[19:08] *** DLoader has quit IRC (Ping timeout: 745 seconds)
[19:08] *** DLoader_ is now known as DLoader
[19:24] *** DFJustin has joined #archiveteam-bs
[19:36] *** icedice has joined #archiveteam-bs
[19:37] *** Craigle has quit IRC (Quit: The Lounge - https://thelounge.chat)
[19:37] *** Craigle has joined #archiveteam-bs
[20:06] *** ibachandl has joined #archiveteam-bs
[20:12] *** kiska has quit IRC (Remote host closed the connection)
[20:13] *** kiska has joined #archiveteam-bs
[20:13] *** ibachand has quit IRC (Read error: Operation timed out)
[20:13] *** Flashfire has joined #archiveteam-bs
[20:13] *** svchfoo1 sets mode: +o kiska
[20:13] *** svchfoo3 sets mode: +o kiska
[20:16] *** britmob has quit IRC (Read error: Connection reset by peer)
[20:29] arduino is a pretty widely used framework for iot stuff, both hobbyist and production
[20:29] ^context?
[20:29] it has a central archive of packages like cpan/pypi, but given iot stuff has a tendency to break/abandon/shutdown, it might be worth grabbing these packages
[20:29] *** is- has quit IRC (Ping timeout: 496 seconds)
[20:29] what do people think about that?
[20:30] i have a line to download them all but i dont have any archival disk space
[20:31] (ie i can put in the effort, i just need to know where is appropriate to put them, im not sure IA is the right place to save artefacts of code)
[20:31] OrIdow6, this looks like a useful tool to find Frontier customer homepages. http://scraperr.com/
[20:31] left-pad but for iot is probably a bad thing
[20:32] abstract, IA has github stuff and other code stuff in it
[20:32] neat
[20:32] if something is web-scrapable, IA SPN http://web.archive.org/save can be used by anyone to archive URLs
[20:34] it's not, its an index of zips containing code, examples, help files, etc
[20:34] if you have a link to the ZIP, you can put the link into SPN
[20:35] i have a 11k links
[20:35] s/a//
[20:36] there is an SPN email submission option too. We also have tools that can take in long lists of links
[20:37] $ wget -q -O - https://downloads.arduino.cc/libraries/library_index.json | jq ".libraries[].url" | wc -l
[20:37] 11012
[20:37] each libraries entry also has metadata like author, short description, homepage, repo, etc
[20:40] https://blog.archive.org/2019/10/23/the-wayback-machines-save-page-now-is-new-and-improved/ says Have you ever wanted to archive all the web pages linked from an email message? Well, you are in luck because now you can forward that email to “savepagenow@archive.org” and after a few minutes you will get an email back filled with Wayback Machine playback URLs.
[20:42] cool, so i can spam it with 11,000 links, but will they be properly archived under a collection with metadata? im down for this i just dont know the tooling for doing so
[20:43] https://paste.debian.net/1126573/
[20:43] bad archiving is good but good quality archiving is surely best
[20:43] I do not know the limits to the email-based submission
[20:44] I have heard it takes longer to reply for long lists
[20:44] you might try lists of say 100 and see how it goes
[20:44] and work upwards from there if it works as expected
[20:45] items submitted via SPN end up in the Wayback Machine
[21:00] abstract, what were you trying to demonstrate with the paste.debian link?
[21:00] all the metadata i have
[21:00] * 11k
[21:01] https://archive.org/services/docs/api/internetarchive/index.html looks useful
[21:09] is there no html index of all the downloads?
[21:12] *** nicolas17 has joined #archiveteam-bs
[21:17] OrIdow6, seems the search scraper I listed above isn't working very well right now. I have found something else (python script) intended for the same purpose https://github.com/NikolaiT/GoogleScraper . That github links what I guess is a commercial implementation https://scrapeulous.com/ that offers 500 free searches per month
[21:23] https://scrapeulous.com/about/ says As of 2019, GoogleScraper is replaced by a modern successor named se-scraper that builds on top of puppeteer and headless Chromium browser.
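For the Arduino library index discussed earlier in the log, the suggestion to "try lists of say 100" could be sketched as follows. This is a rough illustration only: the `libraries` and `url` field names come from the `jq` expression quoted in the log, while the function names `library_urls` and `batches` and the inline sample data are hypothetical. How each batch is actually submitted to Save Page Now (email or otherwise) is left out, since the log does not settle it.

```python
# Sketch: pull download URLs out of a parsed library_index.json and split
# them into fixed-size batches. No network access; load the JSON yourself.
import json


def library_urls(index):
    """Return the "url" of every entry under "libraries", skipping entries without one."""
    return [lib["url"] for lib in index.get("libraries", []) if lib.get("url")]


def batches(items, size=100):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    # Tiny made-up sample standing in for the real index file.
    sample = json.loads(
        '{"libraries": [{"name": "a", "url": "https://example.org/a.zip"},'
        ' {"name": "b", "url": "https://example.org/b.zip"}]}'
    )
    for number, chunk in enumerate(batches(library_urls(sample), size=1), start=1):
        print(f"batch {number}:", chunk)
```

Starting small and raising `size` only once a batch comes back as expected matches the advice given in the channel.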
[21:23] https://github.com/NikolaiT/se-scraper
[21:24] *** Raccoon` has joined #archiveteam-bs
[21:25] *** OrIdow6 has quit IRC (Ping timeout: 276 seconds)
[21:26] *** Flashfire has quit IRC (Ping timeout: 276 seconds)
[21:26] *** Dallas has quit IRC (Ping timeout: 276 seconds)
[21:27] marked1, nah, they have a custom tool in the IDE for managing them
[21:28] *** Raccoon has quit IRC (Ping timeout: 610 seconds)
[21:28] *** Raccoon` is now known as Raccoon
[21:30] *** marked1 has quit IRC (Ping timeout: 276 seconds)
[21:32] *** X-Scale has joined #archiveteam-bs
[21:32] *** Atom__ has joined #archiveteam-bs
[21:35] *** pew has quit IRC (Ping timeout: 276 seconds)
[21:35] *** purplebot has quit IRC (Ping timeout: 276 seconds)
[21:35] *** foureyes_ has joined #archiveteam-bs
[21:35] *** Frogging has quit IRC (Quit: Close the World, Open the nExt)
[21:35] *** Hoolootwo has joined #archiveteam-bs
[21:36] *** Atom-- has quit IRC (Ping timeout: 276 seconds)
[21:36] *** Hooloovoo has quit IRC (Ping timeout: 276 seconds)
[21:36] *** foureyes has quit IRC (Ping timeout: 276 seconds)
[21:37] *** Frogging has joined #archiveteam-bs
[21:38] *** purplebot has joined #archiveteam-bs
[21:45] *** britmob has joined #archiveteam-bs
[21:48] *** DiscantX has joined #archiveteam-bs
[21:54] *** Dallas has joined #archiveteam-bs
[21:54] *** pew has joined #archiveteam-bs
[21:55] *** marked1 has joined #archiveteam-bs
[21:56] *** Flashfire has joined #archiveteam-bs
[22:01] *** qw3rty has joined #archiveteam-bs
[22:01] *** ibachand has joined #archiveteam-bs
[22:02] *** britmob_ has joined #archiveteam-bs
[22:02] *** Stiletto has joined #archiveteam-bs
[22:02] *** OrIdow6 has joined #archiveteam-bs
[22:03] *** benjins has joined #archiveteam-bs
[22:04] *** Fionera_ has joined #archiveteam-bs
[22:05] *** britmob has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** Atom__ has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** X-Scale has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** ibachandl has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** VerifiedJ has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** ShellyRol has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** Maylay has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** qw3rty__ has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** Fionera has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** Stilett0 has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** HP_Archiv has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** benjinsmi has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** obskyr has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** ctrl_ has quit IRC (hub.efnet.us efnet.deic.eu)
[22:05] *** kiska3 has quit IRC (hub.efnet.us efnet.deic.eu)
[22:08] *** Maylay_ has joined #archiveteam-bs
[22:08] *** asdf0101 has quit IRC (The Lounge - https://thelounge.chat)
[22:08] *** marked1 has quit IRC (Quit: The Lounge - https://thelounge.chat)
[22:10] *** asdf0101 has joined #archiveteam-bs
[22:10] *** marked1 has joined #archiveteam-bs
[22:15] *** actually_ has joined #archiveteam-bs
[22:21] *** HP_Archiv has joined #archiveteam-bs
[22:21] *** ShellyRol has joined #archiveteam-bs
[22:24] *** BlueMax has joined #archiveteam-bs
[22:29] *** schbirid has quit IRC (Read error: Operation timed out)
[22:39] *** X-Scale has joined #archiveteam-bs
[22:41] *** ctrl_ has joined #archiveteam-bs
[22:46] *** DiscantX has quit IRC (Remote host closed the connection)
[23:20] *** af10b3e5e has joined #archiveteam-bs
[23:20] *** d5f4a3622 has quit IRC (Read error: Connection reset by peer)
[23:40] *** dewdrop3 has joined #archiveteam-bs
[23:49] *** dewdrop has quit IRC (Ping timeout: 745 seconds)
[23:49] *** dewdrop3 is now known as dewdrop