[00:00] *** IAmbience has quit IRC (Quit: Connection closed for inactivity)
[00:59] *** BlueMax has joined #archiveteam-ot
[01:24] *** qw3rty2 has joined #archiveteam-ot
[01:30] *** qw3rty has quit IRC (Ping timeout: 745 seconds)
[03:23] *** qw3rty has joined #archiveteam-ot
[03:30] *** qw3rty2 has quit IRC (Ping timeout: 745 seconds)
[03:53] *** godane has joined #archiveteam-ot
[04:40] *** qw3rty2 has joined #archiveteam-ot
[04:50] *** qw3rty has quit IRC (Ping timeout: 745 seconds)
[04:59] *** killsushi has joined #archiveteam-ot
[05:04] *** tonsofpcs has quit IRC (Read error: Operation timed out)
[05:26] *** nataraj has joined #archiveteam-ot
[06:17] *** Ryz has quit IRC (ircd.choopa.net irc.mzima.net)
[06:17] *** Fusl has quit IRC (ircd.choopa.net irc.mzima.net)
[06:17] *** markedL has quit IRC (ircd.choopa.net irc.mzima.net)
[06:17] *** SketchCow has quit IRC (ircd.choopa.net irc.mzima.net)
[06:17] *** paul2520 has quit IRC (ircd.choopa.net irc.mzima.net)
[06:17] *** nyany__ has quit IRC (ircd.choopa.net irc.mzima.net)
[06:17] *** Fionera has quit IRC (ircd.choopa.net irc.mzima.net)
[06:17] *** svchfoo3 has quit IRC (ircd.choopa.net irc.mzima.net)
[06:19] *** Ryz has joined #archiveteam-ot
[06:19] *** Fusl has joined #archiveteam-ot
[06:19] *** markedL has joined #archiveteam-ot
[06:19] *** SketchCow has joined #archiveteam-ot
[06:19] *** paul2520 has joined #archiveteam-ot
[06:19] *** nyany__ has joined #archiveteam-ot
[06:19] *** Fionera has joined #archiveteam-ot
[06:19] *** svchfoo3 has joined #archiveteam-ot
[06:19] *** irc.mzima.net sets mode: +ooo Fusl SketchCow svchfoo3
[06:19] *** Fusl__ sets mode: +o Fusl
[06:19] *** Fusl__ sets mode: +o SketchCow
[06:19] *** Fusl__ sets mode: +o svchfoo3
[06:20] *** Fusl_ sets mode: +o Fusl
[06:20] *** Fusl_ sets mode: +o SketchCow
[06:20] *** Fusl_ sets mode: +o svchfoo3
[06:23] *** tonsofpcs has joined #archiveteam-ot
[06:28] *** Fusl sets mode: +o kiskabak
[06:28] *** Fusl sets mode: +o kiska
[06:28] *** Fusl sets mode: +o kiska18
[06:28] *** Fusl sets mode: +o chfoo
[06:28] *** Fusl sets mode: +o Kenshin
[06:28] *** Fusl sets mode: +o Fusl_
[06:28] *** Fusl sets mode: +o hook54321
[06:28] *** Fusl sets mode: +o Kaz
[06:28] *** Fusl sets mode: +o HCross
[06:28] *** Fusl sets mode: +o Fusl__
[06:28] *** Fusl sets mode: +o AlsoJAA
[06:28] *** Fusl sets mode: +o arkiver
[06:28] *** Fusl sets mode: +o jrwr
[06:28] *** Fusl sets mode: +o astrid
[06:28] *** Fusl sets mode: +o dxrt_
[06:28] *** Fusl sets mode: +o svchfoo1
[06:28] *** Fusl sets mode: +o dxrt
[06:28] *** Fusl sets mode: +o ivan
[06:28] *** Fusl sets mode: +o JAA
[07:39] *** jake_test has joined #archiveteam-ot
[07:59] *** Ivy has quit IRC (Quit: Connection closed for inactivity)
[08:40] *** HP_Archiv has joined #archiveteam-ot
[08:51] *** BlueMax has quit IRC (Quit: Leaving)
[09:04] *** markedL has quit IRC (Read error: Operation timed out)
[09:05] *** asdf0101 has quit IRC (Read error: Operation timed out)
[09:19] *** asdf0101 has joined #archiveteam-ot
[09:19] *** markedL has joined #archiveteam-ot
[11:30] *** nataraj has quit IRC (Read error: Operation timed out)
[11:46] still need help with non-English YouTube if anyone wants to find the good content
[11:46] I used channelcrawler to get the popular stuff for each language but popular is usually not the best
[11:54] *** Hani111 has joined #archiveteam-ot
[12:02] *** Hani has quit IRC (Ping timeout: 745 seconds)
[12:02] *** Hani111 is now known as Hani
[12:18] *** X-Scale` has joined #archiveteam-ot
[12:19] *** X-Scale has quit IRC (Ping timeout: 252 seconds)
[12:19] *** X-Scale` is now known as X-Scale
[13:04] ivan: normally when i'm looking for non-english stuff i'm just looking for old japanese tv show
[14:23] *** Ivy has joined #archiveteam-ot
[14:23] *** Ivy has quit IRC (Client Quit)
[14:30] *** schbirid has joined #archiveteam-ot
[14:37] *** scorche` has joined #archiveteam-ot
[14:38] *** scorche has quit IRC (Read error: Operation timed out)
[14:38] *** scorche` is now known as scorche
[14:42] *** nataraj has joined #archiveteam-ot
[15:06] *** killsushi has quit IRC (Quit: Leaving)
[15:08] *** Ivy has joined #archiveteam-ot
[15:13] *** nataraj has quit IRC (Quit: Konversation terminated!)
[15:13] *** nataraj has joined #archiveteam-ot
[15:40] *** Dallas has quit IRC (Read error: Connection reset by peer)
[15:49] *** Dallas has joined #archiveteam-ot
[16:21] *** nataraj has quit IRC (Read error: Operation timed out)
[16:30] *** X-Scale` has joined #archiveteam-ot
[16:31] *** X-Scale has quit IRC (Ping timeout: 252 seconds)
[16:31] *** X-Scale` is now known as X-Scale
[16:42] *** akierig has joined #archiveteam-ot
[16:48] *** Atom__ has joined #archiveteam-ot
[16:55] *** Atom-- has quit IRC (Read error: Operation timed out)
[17:27] *** nataraj has joined #archiveteam-ot
[17:29] *** akierig has quit IRC (Quit: later_gator)
[17:57] *** Tenebrae has joined #archiveteam-ot
[18:50] *** JH8813269 has quit IRC (Quit: Ping timeout (120 seconds))
[18:51] *** apache2 has quit IRC (Ping timeout: 745 seconds)
[18:53] *** JH8813269 has joined #archiveteam-ot
[18:58] *** manjaro-u has joined #archiveteam-ot
[18:59] *** nataraj has quit IRC (Read error: Operation timed out)
[19:03] *** dxrt has quit IRC (Ping timeout: 246 seconds)
[19:09] *** Video has joined #archiveteam-ot
[19:27] *** Video has quit IRC (Quit: Page closed)
[19:46] *** IAmbience has joined #archiveteam-ot
[19:49] *** akierig has joined #archiveteam-ot
[19:49] *** systwi_ has joined #archiveteam-ot
[19:55] *** systwi has quit IRC (Read error: Operation timed out)
[20:08] *** Video has joined #archiveteam-ot
[20:25] *** Video has quit IRC (Ping timeout: 260 seconds)
[21:26] *** akierig has quit IRC (Quit: later_gator)
[21:31] looks like archive.today now requires captchas from tor
[21:31] ivan: I worked on the crawler for the annotation archive project, and have since extended it a lot. I still have issues with just how *much* content it crawls. My last run caught 63 million unique pages (~200k queued but uncrawled), 253 million link relationships (what page links to what)... and then crashed.
[21:32] Turns out node really doesn't like having massive data structures that store tens of millions of keys and gigabytes of data in memory
[21:33] 1) A single array has a very limited maximum .length, 2) A single object can only store so much before node gives up, 3) Node doesn't like giving WASM modules a lot of memory, so my Rust hashset implementation only worked for a little while.
[21:34] *** Raccoon has joined #archiveteam-ot
[21:35] What a surprise...
[21:37] Yes, you can't get rid of me so easy!
[21:37] JAA, do you have the latest version of Chrome installed?
[21:38] I've never had any version of Chrome installed.
[21:39] If someone here has Chrome installed, I would like to confirm an experiment that suggests Chrome is preventing users from downloading copyrighted content.
[21:39] A) Visit https://cdn4.vectorstock.com/i/1000x1000/00/93/of-thief-vector-23180093.jpg B) right-click the image, select Save Image As... C) It should say "Download Failed: Blocked"
[21:40] Eww
[21:40] The image loads and displays fine, but the browser prevents it from being "downloaded" (user experience)
[21:40] Who the fuck came up with that shit?
[21:40] Is that a Chrome "feature" or also in Chromium?
[21:40] Raccoon: Can't reproduce on Chromium
[21:40] The same Google who came up with the shit of blocking use of the "Save Video" context menu on google owned domain names
[21:41] I only use Google Chrome browser for Windows
[21:41] I know I've seen the "Blocked" message on download before, but I haven't paid too much attention then
[21:41] Mateon1: latest Chrome Browser update?
[21:41] Chromium: Version 77.0.3865.120 (Developer Build) (64-bit)
[21:42] Can reproduce on Version 80.0.3946.0 (Official Build) (64-bit)
[21:42] Wait, is nixos really that far behind?
[21:42] mls_: Had you seen this behavior before?
[21:42] Nope, first for me
[21:43] Could someone with more web hacker knowledge determine what meta tags or robots.txt is instructing Chrome Browser to behave this way?
[21:43] Mateon1: I'm using a launcher that updates profusely so might be
[21:44] Or maybe it's some sort of Copyright Bit set in the jpg image metadata/exif
[21:46] I have Version 78.0.3904.87 (Official Build) (64-bit)
[21:46] I wonder if maybe Developer Builds could also be exempt? heh
[21:46] https://cs.chromium.org/chromium/src/components/download/public/common/download_interrupt_reason_values.h?q=%22Download+Failed%22+Blocked&sq=package:chromium&l=46&dr=C
[21:46] "The file was blocked due to local policy."
[21:47] say the what what. i have no such policies configured, no antivirus, no restrictions enabled.
[21:47] https://cs.chromium.org/chromium/src/chrome/browser/supervised_user/supervised_user_service.cc?l=622 - There are download blacklisting features
[21:50] Yeah, I've seen 2 different types of download blacklisting (which i'd disabled -- "suspicious sites") but even then the Downloads page lets you override this per-item.
[21:50] These new blocks do not allow an override from the Downloads page
[21:51] This has to be some new form of anti-piracy crusade
[21:52] https://cs.chromium.org/chromium/src/chrome/browser/supervised_user/supervised_user_service.cc?type=cs&q=OnBlacklistDownloadDone&g=0&l=575 - "Downloads a static blacklist containing of hostname hashes of common inappropriate websites. This is only enabled for child accounts and only if the corresponding setting is enabled by the parent." - This is almost certainly wrong.
[21:53] https://cs.chromium.org/chromium/src/chrome/browser/supervised_user/supervised_user_service.cc?q=LoadBlacklist&l=519&dr=C - This mentions safe browsing mode, and seems to be related to the same feature.
[21:54] Wait, I'm confused, does safe browsing mean SFW sites or malware protection?
[22:01] do you use uBlock ? https://github.com/gorhill/uBlock/issues/2813
[22:06] not actually related, it is on the easylist block
[22:17] Safe browsing is malware etc. Or whatever Google deems to be malware. There have been various cases of sites getting blocked without actual malware involved.
[22:24] the download is listed as blocked by ublock in its own logger
[22:27] So yes, I do have uBlock, but no, it's not causing site blockage. The uBlock script seems to prevent some redirects on that site. I still cannot download that address with uBlock completely disabled.
[22:28] What sort of technology might the website be using to frustrate "downloads", and make them distinct from direct view (no referer)
[22:28] Does Chrome actually announce the type of resource viewing/downloading the user is conducting?
[22:30] I doubt that. It probably doesn't even make a second request for the download since it'd already be in cache.
[22:31] There's nothing in the headers, but I've also never heard of any header that could cause something like that.
[22:31] that's why I was wondering if it's a robots.txt sorta thang
[22:31] This should be something out-of-band, similar to safe browsing but with a different intent.
[22:33] Ok so, I CAN download this in Incognito mode, but I can't download it in normal mode, even with uBlock disabled.
[22:33] I'm stumped
[22:40] as soon as I disabled ublock from the addons page it worked, safebrowsing site shows it as green
[22:59] I found the easylist entry: ||vectorstock.com^*/tracking
[23:00] That shouldn't(?) be affecting the display (doesn't) or direct-download (does) of the link: https://cdn4.vectorstock.com/i/1000x1000/00/93/of-thief-vector-23180093.jpg
[23:02] easyprivacy/easyprivacy_specific.txt @ https://github.com/easylist/easylist/search?q=vectorstock&unscoped_q=vectorstock
[23:03] i found removing easyprivacy from uBlock doesn't change behavior, so it's not that entry. maybe something built into uBlock's built-in blockers
[23:04] *** dxrt has joined #archiveteam-ot
[23:04] *** Fusl__ sets mode: +o dxrt
[23:04] *** Fusl sets mode: +o dxrt
[23:04] *** Fusl_ sets mode: +o dxrt
[23:05] *** svchfoo1 sets mode: +o dxrt
[23:05] you're right though, completely disabling the addon does enable chrome to download the link again. so I'll /end and consult uBlock peep
[23:34] looks like there was a problem with recording this episode of Asia Insight: https://archive.org/details/KCSM_20141106_023000_Asia_Insight/start/0/end/60
[23:34] there is no audio that i can hear