[00:00] "THAT IS THE CASE!"...it's just my shorthand english that blows chunks [00:00] Ah, got it. [00:01] "da e da som e saken" (nor: "Det er det som er saken")...en: That is the case [00:02] what about reddit? [00:02] frontpage here is showing me a VPN deal for "black friday" [00:02] 20 of the 25 posts on the frontpage are currently about FCC's net neutrality announcement. [00:03] one seriously must make an internet 2.0... [00:04] "Calm down about the Net Neutrality thing... Paying additional money to access certain sites will give you a sense of pride and accomplishment." [00:04] Ahahahaha [00:04] which site would that be? :D [00:05] facebook? lol [00:05] fucking hell..Gopher and telnet protocol is coming back! [00:06] This is what it could look like: https://i.imgur.com/QTL3At5.jpg [00:06] That's a real screenshot from a Portuguese mobile provider. [00:06] This crap is happening already. [00:07] Here's the direct link to that page: https://www.meo.pt/internet/internet-movel/telemovel/pacotes-com-telemovel [00:07] if it happens more, i think that would actually be beneficial to show people how shit it is [00:08] it's redicicouls, and against every intention of what internet was meant and planned to be [00:09] * ola_norsk is drunk [00:09] pardon the typos [00:13] it's not that scary though i think. I would claim it's impossible to keep going like that in the long run. Eventually it would backlash and implode. Hell, even "dark web" is just shit that was searcable on good not that long ago. [00:14] or dank web or deep web, or whatever the fuck it's called these days..(eventhough it's technically "higher web") [00:14] *** ola_norsk has quit IRC (d.r.u.n.k) [00:22] "Deep web" is the term you're looking for. "Dark web" is a subsection of that and refers to darknets, e.g. Tor or Freenet. [00:22] actually I really prefer "dank web" [00:22] that sounds like a way better term [00:23] It does. 
[00:26] *** ola_norsk has joined #archiveteam-bs [00:27] i'd be happy to pay taxes for IA mirror here: http://www.bbc.com/specialfeatures/horizonsbusiness/clips-library/?autoplay=true&vid=p01jssrc&tab=2 [00:27] i'm freezing my feet of [00:27] make it happen! [00:27] change.org WOULD work.. [00:28] *** ola_norsk has quit IRC (Leaving) [02:14] AudioBooks this way come: http://the-eye.eu/audiobookbay.mp4 [02:15] *** j08nY has quit IRC (Remote host closed the connection) [02:39] i think internet 2.0 is going be like the piratebay van in diggnation [02:46] *** ranavalon has quit IRC (Read error: Connection reset by peer) [02:51] or maybe we take the book scanning van and add rpi+librarybox+kiwix project to it [03:12] *** Mateon1 has quit IRC (Read error: Operation timed out) [03:12] *** Mateon1 has joined #archiveteam-bs [03:17] *** Stiletto has quit IRC () [03:19] *** qw3rty113 has quit IRC (Read error: Connection reset by peer) [03:20] *** qw3rty113 has joined #archiveteam-bs [03:26] *** pizzaiolo has quit IRC (Remote host closed the connection) [03:28] *** godane has quit IRC (Quit: Leaving.) [04:04] *** wp494_ has joined #archiveteam-bs [04:05] *** godane has joined #archiveteam-bs [04:06] so i'm starting to think i need to run a list of packages from slackware into debian apt-get to get close to what i need [04:07] also found out my wifi tp-link stick doesn't light up on boot in my slax-debian [04:10] *** wp494 has quit IRC (Read error: Operation timed out) [04:31] There a good place to host a Postgres database with somewhat flexible storage? [04:32] I’m parsing the Miiverse warcs I grabbed and are putting them into a database with a web front end, so it’ll be easier to find the stuff we grabbed [04:35] I was thinking of just getting a linode and hosting it there, but I don’t know what other options are out there. Usually I host on Azure, but Postgres is in preview. 
[04:38] *** qw3rty114 has joined #archiveteam-bs [04:40] *** wp494_ is now known as wp494 [04:44] *** qw3rty113 has quit IRC (Read error: Operation timed out) [05:48] *** godane has quit IRC (Leaving.) [06:10] *** TheLovina has joined #archiveteam-bs [06:47] *** godane has joined #archiveteam-bs [06:48] so i'm now on my debian based slax system [07:35] *** robogoat has quit IRC (Read error: Operation timed out) [07:41] *** robogoat has joined #archiveteam-bs [08:01] *** wp494_ has joined #archiveteam-bs [08:03] so some good news [08:03] and bad news [08:03] looks like my telegraph.co.uk upload script didn't upload them all [08:04] good news is still have the files so i can upload them [08:04] its all the 2006 pages as daily dumps of there sitemap archives [08:07] *** wp494 has quit IRC (Ping timeout: 492 seconds) [08:07] *** schbirid has joined #archiveteam-bs [08:08] cdx and warc.gz finally uploaded: https://archive.org/details/www.telegraph.co.uk-archive-2006-04-30-pages-20160707 [08:10] *** wp494_ has quit IRC (Ping timeout: 248 seconds) [08:10] *** wp494 has joined #archiveteam-bs [08:10] most of may 2006 archives have to be uploaded [08:47] *** MrDignity has quit IRC (Remote host closed the connection) [08:47] *** MrDignity has joined #archiveteam-bs [09:57] *** pizzaiolo has joined #archiveteam-bs [10:30] *** j08nY has joined #archiveteam-bs [10:37] i'm starting to upload my collection of rush Limbaugh radio show [10:37] https://archive.org/details/rush-limbaugh-radio-show-2005-06-03 [10:59] *** BlueMaxim has quit IRC (Quit: Leaving) [11:13] *** Ravenloft has quit IRC (Read error: Connection reset by peer) [12:03] *** icedice has joined #archiveteam-bs [12:28] *** dashcloud has quit IRC (Read error: Connection reset by peer) [12:29] *** dashcloud has joined #archiveteam-bs [12:40] *** ranavalon has joined #archiveteam-bs [12:41] *** ranavalon has quit IRC (Read error: Connection reset by peer) [12:41] *** ranavalon has joined #archiveteam-bs [12:44] *** 
RichardG has quit IRC (Read error: Connection reset by peer) [12:44] *** RichardG has joined #archiveteam-bs [13:08] *** Stilett0 has joined #archiveteam-bs [13:22] *** refeed has joined #archiveteam-bs [13:22] *** refeed has quit IRC (Client Quit) [14:05] *** bithippo has quit IRC (Textual IRC Client: www.textualapp.com) [14:07] *** bithippo has joined #archiveteam-bs [14:48] aww they closed web-beta.archive.org [14:48] well made it private [14:48] I used the fuck out of it [14:50] Time to send an email then. :-) [14:57] *** icedice has quit IRC (Quit: Leaving) [15:27] *** ZexaronS has quit IRC (Ping timeout: 633 seconds) [15:34] *** schbirid2 has joined #archiveteam-bs [15:37] *** schbirid has quit IRC (Read error: Operation timed out) [15:44] Anyone have recommendations on the "best" way to archive youtube videos in cold storage locally? [15:44] youtube-dl, I guess. [15:44] Using youtube-dl now, but that doesn't store the metadata, headers, etc. [15:45] It doesn't? I thought it should. I've only rarely used it myself though. [15:45] It'll grab the highest quality audio and video renditions, mux them together, and voila, file created. [15:45] Today's project I suppose! [15:46] Yeah, but I think it also writes a JSON (or XML?) file with metadata. [15:46] :thinking: Good call, going to investigate more. Lost some videos out of my Favorites playlist today that were deleted or made private, never again! [15:48] @JAA: Maw gawd it does [15:48] --write-info-json Write video metadata to a .info.json file [15:48] Thank you!! 
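[editor's sketch] The exchange above settles on youtube-dl's `--write-info-json` flag for keeping metadata next to each downloaded video. A minimal sketch of how that sidecar file might be used follows. It assumes youtube-dl is installed and on the PATH; the function names `build_command` and `summarize_info_json` are hypothetical, and the key names read from the JSON (`id`, `title`, `uploader`, `upload_date`) are assumptions about what the file typically contains, not something stated in the log.

```python
import json
from pathlib import Path


def build_command(url):
    """Return a youtube-dl argument list that also saves metadata.

    Only the flag quoted in the log is used here; nothing is executed.
    """
    return [
        "youtube-dl",
        "--write-info-json",  # writes video metadata to a .info.json file
        url,
    ]


def summarize_info_json(path):
    """Load a .info.json sidecar file and pick out a few fields.

    The key names are assumptions; .get() returns None for any that
    are missing instead of raising an error.
    """
    info = json.loads(Path(path).read_text(encoding="utf-8"))
    wanted = ("id", "title", "uploader", "upload_date")
    return {key: info.get(key) for key in wanted}
```

To actually download, the list from `build_command(url)` could be passed to `subprocess.run(..., check=True)`. Keeping the summary step separate means the metadata can still be inspected later, even if the video itself has been deleted or made private upstream.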
[15:49] :-) [15:57] *** MrDignity has quit IRC (Remote host closed the connection) [15:59] *** MrDignity has joined #archiveteam-bs [16:01] *** pizzaiolo has quit IRC (Read error: Operation timed out) [16:03] *** pizzaiolo has joined #archiveteam-bs [16:05] bithippo: youtube-dl --title --continue --retries 4 --write-info-json --write-description --write-thumbnail --write-annotations --all-subs --ignore-errors -f bestvideo+bestaudio URL [16:05] From http://archiveteam.org/index.php?title=YouTube [16:06] *** pizzaiolo has quit IRC (Client Quit) [16:06] *** pizzaiolo has joined #archiveteam-bs [16:19] *** Chorca has quit IRC (Quit: leaving) [16:19] Doh. Thanks @Kaz [16:22] *** MrDignity has quit IRC (Read error: Connection reset by peer) [16:22] *** MrDignity has joined #archiveteam-bs [16:43] *** fie has quit IRC (Ping timeout: 248 seconds) [17:00] *** fie has joined #archiveteam-bs [18:02] Man, if AT ever has to save youtube [18:24] there are channels on YouTube that delete a very high percentage of their content (e.g. Apple or nokia) or upload television streams (of interest, e.g. news) that are inevitably taken down by the copyright holder [18:26] if you want a project, scrape youtube channels/users i.e. UU* playlists every day, learn which channels remove content [18:28] archive those and you'll have a nice collection in no time [18:39] YouTube loses some non-trivial percentage of videos every year because there are so many parties that can get a video taken down (uploader, copyright holder, random guy with fake copyright claim, privacy complainant, YouTube for ToS violation) [18:40] after some number of unresolved copyright strikes the entire channel gets nuked [18:44] Does YouTube provide a list of DMCA notices/removals in a consumable format? [18:44] bithippo: I don't think so, just the copyright holder notices on individual /watch pages [18:57] Ya, no chilling affects for Youtube [18:58] Disappoint. [19:10] wait [19:10] it does! 
[19:10] https://www.lumendatabase.org/ [19:10] it reports to these guys [19:10] everything [19:12] *** j08nY has quit IRC (Read error: Connection reset by peer) [19:12] well [19:12] not everything [19:13] Still something! Thanks for sharing! [19:15] *** ola_norsk has joined #archiveteam-bs [19:17] so someone who is not me, wrote a representative of Den Norske Dataforening yesterday. Regarding making a Norwegian IA mirror happen: https://imgur.com/1spd0ny [19:18] What does it say? [19:18] tl;dr "I like the idea, but could you tell me more about it?" [19:18] give me a few secs [19:18] Nice! thats a good sign [19:20] Running a IA mirror is no feat mind you, its huge! [19:20] aye i know [19:21] *** j08nY has joined #archiveteam-bs [19:21] but DND (The Norwegian Computer Society) is not exactly small [19:22] http://www.dataforeningen.no/in-english.128921.no.html [19:23] @jrwr Does every IA item provide a torrent to download it? [19:23] Just about [19:23] there are a ton of non public collections [19:24] https://archive.org/stats/ [19:24] they are storing a "metric fuckton" of data [19:25] I assume the non public collections can be gotten at for cold storage mirroring with someone's approval? 
[19:25] Ya [19:25] w00t [19:26] you would need to hit up SketchCow I guess for all that nonsense since he works there [19:26] jrwr: here's a (semi-bad) google translate version of the response https://pastebin.com/v4Hjp7Ys [19:26] also in 2014 they said they are hosting Total used storage: 50 PetaBytes [19:27] Crazy amount of data [19:27] my frustration is that the person seems to think waybackmachine is internetarchive [19:27] thats SUPER common [19:27] aye [19:27] I'll ask people have they ever seen Archive / Internet Archive [19:28] and they will nope out, then I fall back to the Wayback machine [19:28] :D [19:28] I would send over samples of what is stored to this guy, like all the news casts from the states ever done and indexed [19:28] my hope is that it will stirr up to something, and they look more into it [19:29] someone might do that... [19:29] Would be awesome for Brewster and Co to do a "What is the Internet Archive?" 2 min video for these sorts of thigns [19:29] old movies, films, maybe have a more active part from that part of the world archiving its history to it [19:29] I would love a CGP Gray on this [19:30] one point should be that Alexiandria is not only in a political "hotzone", but i think someone here said it's ~10 years outdated(?) [19:30] Ya [19:31] I know there is another one as well [19:31] its the old petaboxes, they are in a datacenter deep in the EU [19:31] doing /something/ that is unknown to me and the others I was speaking to [19:31] being kept current? 
[19:31] Unknown [19:31] it was asked if AT could take it over at one point [19:32] "deep in the EU" :D [19:32] lol [19:32] == I forgot where it was in the EU [19:32] this was a few months ago [19:33] yeah, i just liked that expression :D made me instantly think of swiss alps :D [19:33] https://archive.org/about/graphs.php [19:38] thing is, if it's not being kept current it's still good stuff..But then it's like "timebox", i don't know the proper term for it; When people bury a box with keep-sakes to dig up later. [19:39] in english, "time capsule" [19:39] yes [19:40] it seems to me like Alexandria is not so much mirror as well, but a time capsule, if its 10 years behind current "main data" [19:44] bithippo: Yes, an official presentation would be helpful. I think it's called a "pitch stack"..like a small facts presentation that could be included in propositions [19:44] We could always use more warriors :) [19:45] PitchDeck seems to be the word [19:49] my problem is i can't and don't want to be a any sort of representative or spokesperson. My hope is simply to contact the key person in my country that would. [19:50] "stirr it up" as he said :D https://youtu.be/SRyELKGLGag [19:54] beside NCS, which is independend community; There's Kulturdepartementet (Culture Department), Kunnskapsdepartementet (Knowledge/Education dep)..But i know Dataforerningen holds good sway in both [19:56] both of those state departments hold decisions that could make it happen [19:58] i put a post about the stuff i'm archiving before net neutrality is over: https://www.reddit.com/r/DataHoarder/comments/7etrwy/things_to_archive_before_net_neutrality_is_over/ [19:59] things are that bad? :/ [19:59] i dont think so [19:59] its just in case [20:00] better safe then sorry [20:00] good point [20:01] imagine, ISP beginning to sell "gaming packages"..where your game lags as f*ck if you don't subscribe to it :/ [20:02] "Sign up for decent connection to EA servers, for only $10 extra a month!" 
[20:05] "For only $2 extra, you could play CS:GO with acceptable latency!" [20:06] :) [20:17] Start bundling game season passes on your cable bill [20:17] _D [20:18] "It looks your playing the latest World of Warcraft addon, would you like to play it without lag? Subscribe today!" :D [20:21] *** jschwart has joined #archiveteam-bs [20:22] "For only £1, you could be accesing the best handpicked items that Internet Archive has to offer! Kind regards - ISP" [20:24] IMO though, just like DRM's and DNS blocking, it won't hold. But incentivice people to break and circumvent it. [20:25] i feel like this topic is vaguely offtopic, but also not [20:26] astrid: I doubt IA would be high priority connection.. [20:27] but yeah, i don't think it will be that bad. It has the potential for it though, but i see it as impossible to happen. [20:27] are you here to work on archiveteam projects, or to chat [20:27] let me check if my upload is done.. [20:28] what're you uploading? [20:28] Big_Cartoon video archive [20:28] ah [20:28] https://www.theverge.com/2017/11/22/16691794/net-neutrality-fcc-ajit-pai-comcast-block-bittorrent [20:29] *** Asparagir has joined #archiveteam-bs [20:29] so we need a way to switch protocols mid stream and random but still download files thur bittorrent [20:30] make it in impossible for them to filter it out without killing the hole net [20:39] it's possible to leach trough Tor is it not? [20:39] (most likely slow as h*ll though) [20:40] *** bithippo has quit IRC (My MacBook Air has gone to sleep. ZZZzzz…) [20:44] torrents are really bad to the tor network [20:45] if you want anonymous torrenting, use i2p, it is officially supported there and works quite well [20:45] yeah, but not impossible are they? 
[20:45] it is possible but you would be a massive dick as it is very stressful traffic [20:45] aye [20:46] one of my thoughts is to make it all look like https data without a domain to tell where it's coming from [20:46] i have no idea if it can be done though [20:47] i don't know how they detect it to be https, outside of traffic going to port 80 or 8080 [20:48] maybe it could be obfuscated by some packets containing "genuine" http packets (which the torrent client/tracker would ignore)? [20:51] enough so that it looks like an http connection attempt for a "sniffer detector" i mean? [20:53] the trackers and peers on the other end would receive it as garbage though [20:54] it would be borderline ddosing perhaps? :/ [20:55] *** bithippo has joined #archiveteam-bs [21:06] *** ola_norsk has quit IRC (If you don't know, then I don't know either) [21:09] *** pizzaiolo has quit IRC (Read error: Operation timed out) [21:09] *** pizzaiolo has joined #archiveteam-bs [21:11] *** pizzaiolo has quit IRC (Client Quit) [21:11] *** pizzaiolo has joined #archiveteam-bs [21:12] *** ola_norsk has joined #archiveteam-bs [21:18] *** ola_norsk has quit IRC (Leaving) [21:34] Arh shit. My 1,5 GB scratch disk is making awful clicking noises. [21:34] *** Darkstar has quit IRC (Ping timeout: 260 seconds) [21:36] *** Darkstar has joined #archiveteam-bs [21:38] *** hook54321 sets mode: +o Asparagir [21:51] You have a 1.5 GB HDD? [21:51] I would recommend not storing anything important on it. [21:52] Nothing terribly important on it. SMART has been complaining about it for years. [21:53] I'd just be slightly annoyed if my virtual machines were lost, but that's it. [21:53] I haven't gotten a reply back about my archive.org account yet, although thanksgiving is this week, so I guess some people might have multiple days off. [21:54] JensRex: You should move the virtual machines off of the 1.5 GB Hard Drive. [21:54] 1.5 gigabyte? still in active use? [21:54] Eh, TB. 
[21:55] That's a bit different lol [21:55] That makes more sense. [21:55] (Although, who ever buys hard disks that aren't a power of two in size??) [21:55] still can get a 4tb from bestbuy this week for $80 [21:56] *** BlueMaxim has joined #archiveteam-bs [21:56] thats a black friday special [21:56] ?? [21:56] link me? [21:56] I might need it if my account doesn't get unlocked lol [21:56] godane, ola_norsk: That is possible. It's fairly easy to set up OpenVPN to look similar to HTTPS traffic. Doesn't mean it's undetectable though, of course. [21:57] https://i.imgur.com/x0yTsiK.png [21:57] 4 TB disks are about 150 USD in Denmark. [21:57] That's 150 more than I have to spend on hardware. [21:57] Yeah, Europe never gets any of those sweet deals from the US... [21:57] *** ola_norsk has joined #archiveteam-bs [21:57] That's 8 GB o_O [21:57] 8 TB for $129, for example... [21:57] https://www.reddit.com/r/DataHoarder/comments/7e5yc2/black_friday_wd_easystore_8tb_for_12999_valid/ [21:58] *TB [21:58] is there a way to revert an issued (deleteion) task id? [21:58] I'll be back in a bit [21:58] *** robink has quit IRC (Ping timeout: 506 seconds) [21:58] specifically task_id=783333772 [21:58] My storage server is still using 2*2 TB disks :/ [21:58] *** RichardG has quit IRC (Read error: Connection reset by peer) [21:59] JensRex: Similar here, two 2 TB and two 1 TB disks. [22:00] 4 TB external for backups. [22:00] i accentidentally mixed to items, issued an delete all on the latest; But seems like around 200+ of that channels videos are removed since yesterday. [22:01] I'll probably get some ST8000AS0002 (Seagate Archive 8 TB) drives soonish, but I might wait a bit more until the Exos 5E8 become available, hoping that the older model's price decreases a bit. 
[22:01] anyway, no worries, just wondering if it would be possible to recall the delete all on an item [22:04] *** robink has joined #archiveteam-bs [22:05] *** ola_norsk has quit IRC (Leaving) [22:29] @JensRex: I have a 1.5TB Seagate spinning disk you can have for freesies if you want. [22:33] *** jschwart has quit IRC (Quit: Konversation terminated!) [22:34] ooh, I found 8TB seagate archive drives for £21/TB [22:34] Yeah, that sounds about right. [22:35] or Ironwolf Pro's for £27.50/TB [22:35] They have been around that price here for several months. [22:35] I haven't been paying attention then [22:36] *** ranavalon has quit IRC (Read error: Connection reset by peer) [22:37] Here != UK, maybe that's a first for the UK, not sure. [22:38] *** ranavalon has joined #archiveteam-bs [22:39] *** ranavalon has quit IRC (Remote host closed the connection) [22:39] *** ranavalon has joined #archiveteam-bs [23:14] *** ranav has joined #archiveteam-bs [23:15] *** ranav has quit IRC (Remote host closed the connection) [23:15] *** ranav has joined #archiveteam-bs [23:21] *** ranavalon has quit IRC (Read error: Operation timed out) [23:35] *** RichardG has joined #archiveteam-bs [23:42] bithippo: Thanks, but I think I have something stored away in The Box Of Things. [23:43] No worries, slowly parting out my own Box Of Things. [23:50] *** Asparagir has quit IRC (Asparagir)