[00:00] *** BlueMax has quit IRC (Quit: Leaving)
[01:04] *** ZizzyDizz has joined #archiveteam-ot
[01:04] Hello, I was wondering if anyone here has a way to archive a disqus channel?
[01:05] I just found out today they were getting deleted and there's two I really need to save in some capacity.
[01:05] And I don't have the bandwidth to do it myself on my home PC, as I have less than 600kbps.
[01:10] *** BlueMax has joined #archiveteam-ot
[01:22] *** killsushi has quit IRC (Read error: Connection reset by peer)
[01:26] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[01:38] *** nepeat has quit IRC (Read error: Connection reset by peer)
[01:39] *** nepeat has joined #archiveteam-ot
[01:46] *** nepeat has quit IRC (Quit: ZNC 1.7.4 - https://znc.in)
[01:47] I can't seem to get grab-site to respect --wpull-args=
[01:47] *** nepeat has joined #archiveteam-ot
[01:55] *** m007a83_ is now known as m007a83
[02:42] is disqus public or password protected?
[02:44] *** qw3rty115 has joined #archiveteam-ot
[02:49] *** qw3rty114 has quit IRC (Read error: Operation timed out)
[03:42] *** qw3rty116 has joined #archiveteam-ot
[03:47] *** qw3rty115 has quit IRC (Ping timeout: 612 seconds)
[04:00] *** lunik1 has quit IRC (:x)
[04:05] *** ZizzyDizz has quit IRC (Ping timeout: 260 seconds)
[04:10] *** dhyan_nat has joined #archiveteam-ot
[07:18] *** Mateon1 has joined #archiveteam-ot
[09:13] ZizzyDizz: Disqus is heavily JS-based, so you won't be able to get much with wpull/grab-site, wget, etc.
[09:28] *** BlueMax has quit IRC (Quit: Leaving)
[10:33] *** tuluu has quit IRC (Read error: Connection refused)
[10:33] *** tuluu has joined #archiveteam-ot
[11:52] *** lunik1 has joined #archiveteam-ot
[12:29] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[12:59] *** killsushi has joined #archiveteam-ot
[15:40] *** h3ndr1k_ has quit IRC (Ping timeout: 252 seconds)
[15:53] *** dhyan_nat has joined #archiveteam-ot
[17:18] *** h3ndr1k has joined #archiveteam-ot
[17:50] How can I become a good steward once we get this 500/500 fiber installed
[17:50] need a Quickstart Guide
[18:32] Where do you live?
[18:32] Can I put a rsync target on that connection?
[18:37] I haven't even set up a box yet
[18:37] will probably need to do some bouncy bouncy before renting to anyone who'll get us terminated for DMCA
[18:59] *** h3ndr1k has quit IRC (Quit: )
[19:03] *** h3ndr1k has joined #archiveteam-ot
[19:41] *** ShellyRol has quit IRC (Ping timeout: 745 seconds)
[19:52] *** ShellyRol has joined #archiveteam-ot
[20:03] *** ZizzyDizz has joined #archiveteam-ot
[20:03] No, markedL
[20:13] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[20:38] ZizzyDizz: sounds like you might need to use chromebot. What's the URL so we can experiment?
[20:38] Have fun with that. You'll need a *lot* of resources.
[20:38] Also, archival talk should happen in -bs.
[20:38] Raccoon: over quota is the bigger risk than DMCA with some of these folk
[20:38] I've been grabbing it for a few hours now.
[20:57] *** Hani111 has joined #archiveteam-ot
[21:08] *** Hani has quit IRC (Ping timeout: 745 seconds)
[21:08] *** Hani111 is now known as Hani
[21:23] markedL, yeah, but i wonder if they have a quota for 500/500 on a business fiber
[21:24] if we do this, i'm dripping every drop of that sht
[21:24] 5.15 TB/day
[21:29] I want to start an archive group called Going Postal, where we circulate 4 or 10 TB harddrives between high-bandwidth transmission lines and low-bandwidth archivists.
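(A back-of-envelope check on the 5.15 TB/day figure above: a 500 Mbps link running flat out moves 5.4 TB per day in raw bits; the quoted 5.15 TB presumably subtracts a few percent for protocol overhead. A minimal sketch, where the ~95% goodput factor is an assumption, not a measured value:)

```python
# Raw daily throughput of a 500 Mbps (500e6 bits/s) link
link_bps = 500e6
seconds_per_day = 86_400

raw_tb_per_day = link_bps * seconds_per_day / 8 / 1e12  # bits -> bytes -> TB
# -> 5.4 TB/day of raw capacity

# Assuming ~95% goodput after TCP/IP and Ethernet overhead (illustrative
# assumption only; the chat's 5.15 TB/day implies a factor close to this)
usable_tb_per_day = raw_tb_per_day * 0.954
```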
[21:31] Which provider?
[21:32] Century Link
[21:39] FWIW, "The data usage limit applies to residential HSI. It does not apply to business-class HSI." from https://www.centurylink.com/aboutus/legal/internet-service-disclosure/full-version.html
[21:40] (HSI = High Speed Internet)
[22:36] anyone know of an easy way to get current tab count in firefox?
[22:36] ideally without an addon
[22:42] *** BlueMax has joined #archiveteam-ot
[22:48] are you setting up a prometheus metric on your tab addiction
[22:53] You could parse the sessionstore file in the Firefox profile directory.
[22:53] https://superuser.com/questions/1363747/how-to-decode-decipher-mozilla-firefox-proprietary-jsonlz4-format-sessionstor
[22:54] "Proprietary" *twitch*
[22:54] I see only sessionstore-backups
[22:54] Yup
[22:54] and it's got some jsonlz4 junk that lz4cat can't read
[22:54] They changed that a while ago.
[22:55] It's not really "junk". One of the answers on SU explains it: "Unfortunately, due to a non-standard header, standard tools won't work. There's an open proposal to change that. Apparently the Mozilla header was devised before a standard lz4 frame format existed; it does wrap a standard lz4 block."
[22:56] https://github.com/badboy/jsonlz4cat works
[22:59] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/windows/getAll#Examples
[23:00] *** ShellyRol has quit IRC (Read error: Operation timed out)
[23:01] Hmm, all those tools just skip the first 8 bytes and then decompress the rest. But then why doesn't tail -c+9 | lz4cat work?
[23:03] dd can do a byte skip also
[23:04] So?
[23:04] rephrased, I'd trust dd more than tail, but maybe that's not the issue
[23:05] They're both fine.
[23:05] And yes, definitely not the issue.
[23:05] the size in the header might be different
[23:05] *** ShellyRol has joined #archiveteam-ot
[23:05] Hmm
[23:08] JAA: > "Residential Fiber Gigabit plans are also not subject to data usage limits."
[23:08] raaaah why can't this be simple
[23:08] Mozilla likes to overengineer things.
[23:09] Many things used to be simple. Adding a custom search engine was a simple modification of a .json file. Now it's essentially impossible.
[23:11] I suspect some of these profile annoyances are intentional things designed to discourage other software from touching the profile
[23:11] Chrome has a pretty crazy session format
[23:13] If I'm reading https://github.com/badboy/jsonlz4cat/blob/master/src/main.rs correctly, it does a read(8) for the magic bytes, then a read(4) for the outsize, and then throws the rest into LZ4 decompression. Meanwhile https://gist.github.com/Tblue/62ff47bef7f894e92ed5 , which reportedly also works (haven't tested it), only skips the magic bytes. ¯\_(ツ)_/¯
[23:15] I got tired of browser session management. use Session Buddy on chrome, prolly ff too
[23:15] saved my ass many a time
[23:28] *** qw3rty117 has joined #archiveteam-ot
[23:34] *** qw3rty116 has quit IRC (Ping timeout: 612 seconds)
[23:59] Apparently the problem is that lz4cat uses a different decompression routine than those tools. Specifically, it uses LZ4_decompress_safe, not LZ4_decompress_safe_partial. I think.
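(The layout the chat pieces together above — 8 magic bytes, a 4-byte little-endian decompressed size, then a bare LZ4 block rather than a framed stream — can be sketched as a small Python helper. The header split matches what jsonlz4cat reads; the `"windows"`/`"tabs"` keys assumed in the tab-count helper, and the use of the third-party `lz4` package for the final decompression step, are assumptions based on the discussion, not a tested implementation:)

```python
import json
import struct

MAGIC = b"mozLz40\0"  # 8-byte magic at the start of Firefox's .jsonlz4 files


def split_mozlz4(data: bytes):
    """Split a mozLz4 file into (declared decompressed size, raw LZ4 block)."""
    if not data.startswith(MAGIC):
        raise ValueError("not a mozLz4 file")
    # 4-byte little-endian uint32 after the magic: the decompressed size
    (out_size,) = struct.unpack("<I", data[8:12])
    return out_size, data[12:]


def count_tabs(session_json: bytes) -> int:
    """Count tabs across all windows in an already-decompressed sessionstore JSON."""
    session = json.loads(session_json)
    return sum(len(w.get("tabs", [])) for w in session.get("windows", []))


# The block itself is a bare LZ4 block, not a framed stream, which is why
# plain `tail -c+9 | lz4cat` fails. Decompressing it needs a raw-block
# routine, e.g. the third-party `lz4` package (assumed, not stdlib):
#   import lz4.block
#   size, block = split_mozlz4(open(path, "rb").read())
#   print(count_tabs(lz4.block.decompress(block, uncompressed_size=size)))
```

A quick sanity check on a hand-built header: `split_mozlz4(MAGIC + struct.pack("<I", 42) + b"xyz")` returns `(42, b"xyz")`.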