#archiveteam-ot 2019-01-23,Wed


Time Nickname Message
00:14 🔗 Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
00:21 🔗 ats has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 Stiletto has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 argus has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 noirscape has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 Fusl has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 VoynichCr has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 N4Y has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 MrRadar2 has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 Tenebrae has quit IRC (se.hub irc.efnet.nl)
00:21 🔗 BnAboyZ has quit IRC (se.hub irc.efnet.nl)
00:23 🔗 ats has joined #archiveteam-ot
00:23 🔗 Stiletto has joined #archiveteam-ot
00:23 🔗 argus has joined #archiveteam-ot
00:23 🔗 noirscape has joined #archiveteam-ot
00:23 🔗 Fusl has joined #archiveteam-ot
00:23 🔗 VoynichCr has joined #archiveteam-ot
00:23 🔗 N4Y has joined #archiveteam-ot
00:23 🔗 MrRadar2 has joined #archiveteam-ot
00:23 🔗 Tenebrae has joined #archiveteam-ot
00:23 🔗 BnAboyZ has joined #archiveteam-ot
00:24 🔗 Despatche has quit IRC (Read error: Connection reset by peer)
00:24 🔗 Despatche has joined #archiveteam-ot
00:24 🔗 Despatche has quit IRC (Read error: Connection reset by peer)
00:24 🔗 Despatche has joined #archiveteam-ot
00:24 🔗 Despatche has quit IRC (Read error: Connection reset by peer)
00:24 🔗 Despatche has joined #archiveteam-ot
00:27 🔗 yano https://www.gnu.org/software/librejs/free-your-javascript.html 3.2.2.1
00:27 🔗 yano magnets of valid licenses
00:27 🔗 yano such as CC0, GPLv2.0, GPLv3.0, etc.
00:28 🔗 Despatche has quit IRC (Remote host closed the connection)
00:28 🔗 Despatche has joined #archiveteam-ot
00:30 🔗 JAA Better yet, get rid of JS entirely. :-)
00:31 🔗 JAA (If possible)
00:31 🔗 yano i don't care about the .js part, i'm just excited about the magnets of the licenses
00:32 🔗 JAA Ah yeah
00:34 🔗 yano i figured anyone who is into data hoarding or hosting data for others might be interested :)
01:36 🔗 VerfiedJ has quit IRC (Quit: Leaving)
01:43 🔗 JAA ivan_: For your YouTube archive: https://old.reddit.com/r/videos/comments/aio6jx/indian_youtuber_who_exposes_the_safety_measures/
02:42 🔗 ivan_ JAA: grabbing
02:42 🔗 ivan_ YouTube no longer serves annotations, right?
02:46 🔗 ivan_ youtube-dl still grabs something but the annotation data is blanked
02:46 🔗 ivan_ it doesn't even have the end cards either but rather some channel metadata
03:05 🔗 Mateon1 has quit IRC (Read error: Operation timed out)
03:05 🔗 Mateon1 has joined #archiveteam-ot
03:32 🔗 t3 Does IA deduplicate the WARC websites when they are integrated into the Wayback Machine?
03:44 🔗 t3 Am I likely to get IP banned when grabbing a site whose DNS points to CloudFlare, Inc. (AS13335)?
04:04 🔗 wp494 JAA: finally /r/shutdown has real moderation now
04:04 🔗 wp494 good riddance to the ronald mcdonald trump posts
04:11 🔗 Hani has quit IRC (Read error: Operation timed out)
04:49 🔗 odemg has quit IRC (Ping timeout: 265 seconds)
05:01 🔗 odemg has joined #archiveteam-ot
05:02 🔗 ivan_ t3: no deduplication
05:02 🔗 ivan_ t3: should be fine if you get through the initial captcha unless they have attack detection on?
05:35 🔗 Despatche has quit IRC (Quit: Connection reset by deer)
05:40 🔗 m007a83 has quit IRC (Read error: Connection reset by peer)
05:49 🔗 m007a83 has joined #archiveteam-ot
05:53 🔗 t3 I don't like Cloudflare. It's everywhere.
05:58 🔗 t3 I also don't like websites that load contents by scrolling using JavaScript.
06:06 🔗 Hani has joined #archiveteam-ot
06:12 🔗 nataraj has joined #archiveteam-ot
06:32 🔗 Ryz has joined #archiveteam-ot
06:32 🔗 Ryz Google to try and block uBlock Origin? S:
06:32 🔗 Ryz https://www.ghacks.net/2019/01/22/chrome-extension-manifest-v3-could-end-ublock-origin-for-chrome/
06:33 🔗 icedice has quit IRC (Quit: Leaving)
07:14 🔗 t3 Ryz: That's really bad. Google shouldn't be evil.
07:14 🔗 t3 That's also why I stopped using Google Chrome.
07:15 🔗 Ryz To others, they are already evil in a way, not in this instance I pointed out, but the other countless ones in the past~
08:33 🔗 wp494_ has joined #archiveteam-ot
08:35 🔗 wp494 has quit IRC (Ping timeout: 255 seconds)
09:14 🔗 nataraj has quit IRC (Read error: Operation timed out)
09:27 🔗 nataraj has joined #archiveteam-ot
09:37 🔗 schbirid has joined #archiveteam-ot
10:33 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
10:33 🔗 JAA Google is an altruistic company that totally isn't interested in making money through ads. I'm sure they'll sort this API issue out with Gorhill...
11:30 🔗 chimyatta has joined #archiveteam-ot
11:53 🔗 Ryz has quit IRC (Remote host closed the connection)
12:37 🔗 VADemon_ https://mashable.com/2018/05/19/google-removes-dont-be-evil-motto-from-code-of-conduct/?europe=true
13:38 🔗 nataraj has quit IRC (Read error: Operation timed out)
14:33 🔗 JAA https://twitter.com/3lbios/status/1087848040583626753
14:55 🔗 LFlare has quit IRC (Quit: Ping timeout (120 seconds))
14:59 🔗 VerfiedJ has joined #archiveteam-ot
15:03 🔗 LFlare has joined #archiveteam-ot
15:21 🔗 yano https://yanovich.net/2018/08/help-archive-the-web/
16:16 🔗 moufu the addon isn't open source though
16:24 🔗 systwi JAA: thanks for the forum info. i might just make a local copy with grab-site (i use that already), but i did want to get a copy on wayback
16:27 🔗 nataraj has joined #archiveteam-ot
17:33 🔗 wp494 has joined #archiveteam-ot
17:36 🔗 wp494_ has quit IRC (Ping timeout: 265 seconds)
18:13 🔗 t3 yano: That's interesting. Just curious, because your nick is similar to the blog name, is that your blog? I've never used that "Page Cache Archiver" browser extension before. I generally use VerfiedJ's "Save to the Wayback Machine" browser extension.
18:20 🔗 schbirid has quit IRC (Remote host closed the connection)
18:42 🔗 m007a83 has quit IRC (Read error: Operation timed out)
18:44 🔗 yano t3: yea, that's my blog
18:44 🔗 yano t3: ah, PCA allows one to save to more than just IA
18:48 🔗 picklefac has joined #archiveteam-ot
18:54 🔗 yano moufu: the source is viewable but yeah, it's not FOSS-y in that you can do whatever you want with it :-\
19:14 🔗 Hani has quit IRC (Quit: Going offline, see ya! (www.adiirc.com))
19:16 🔗 Hani has joined #archiveteam-ot
19:20 🔗 Despatche has joined #archiveteam-ot
19:46 🔗 t3 yano: Well your blog's robots.txt blocks ia_archiver.
19:47 🔗 yano it shouldn't for the whole site
19:47 🔗 * yano checks
19:48 🔗 yano oh, i thought that was part of the image blocking thing I copy/pasted that included the Google-Images
19:48 🔗 t3 yano: You have `User-agent: ia_archiver` with `Disallow: /`. That's the entire site.
19:48 🔗 yano fixed
19:48 🔗 yano it's now removed
19:48 🔗 yano it should only block the stuff mentioned at the bottom
19:49 🔗 t3 Yay! Now it can be archived.
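
[Editor's note] The robots.txt exchange above can be reproduced offline. The following is a minimal sketch using Python's standard urllib.robotparser; the two robots.txt bodies are hypothetical stand-ins (the blog's real file is not quoted in the log beyond the `User-agent: ia_archiver` / `Disallow: /` pair), and the helper name is_blocked is made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical "before" file: the rule pair t3 quoted, which covers the whole site.
BEFORE = """\
User-agent: ia_archiver
Disallow: /
"""

# Hypothetical "after" file: only an image crawler is restricted (assumed paths).
AFTER = """\
User-agent: Googlebot-Image
Disallow: /images/
"""

URL = "https://yanovich.net/pages/contact-me.html"

print("before:", is_blocked(BEFORE, "ia_archiver", URL))
print("after: ", is_blocked(AFTER, "ia_archiver", URL))
```

This only checks what the rules say; as the following messages show, a crawler may keep acting on an older copy of the file until it fetches the new one.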
19:51 🔗 t3 Actually...
19:51 🔗 t3 There still might be an issue.
19:51 🔗 t3 https://web.archive.org/save/https://yanovich.net/pages/contact-me.html
19:52 🔗 JAA IA might have to pick up the new robots.txt first. Not sure how often that's checked.
19:53 🔗 t3 The robots.txt has been updated: https://web.archive.org/web/20190123194840/https://yanovich.net/robots.txt
19:54 🔗 yano my web logs aren't showing any UA's from IA for robots.txt
19:54 🔗 yano ah, there it is
19:55 🔗 yano [23/Jan/2019:19:45:40 +0000] "GET /robots.txt HTTP/1.1" 200 447 "-" "Mozilla/5.0 (compatible; archive.org_bot; Wayback Machine Live Record; +http://archive.org/details/archive.org_bot)" "-" yanovich.net
19:55 🔗 yano [23/Jan/2019:19:45:51 +0000] "GET /robots.txt HTTP/2.0" 200 435 "https://web.archive.org/save/https://yanovich.net/2018/08/help-archive-the-web/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0" "-" yanovich.net
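
[Editor's note] Finding such robots.txt requests in a web log can be scripted. This is a rough sketch, assuming the trimmed log layout yano pasted (timestamp, request, status, size, referer, user agent); the regular expression and the function name robots_fetches are illustrative and would need adjusting for a real server's log format. The second sample entry is written as a single line.

```python
import re

# Assumed layout, taken from the pasted lines: [time] "request" status size "referer" "agent"
LOG_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def robots_fetches(lines):
    """Return the parsed fields of every log line that requested /robots.txt."""
    hits = []
    for line in lines:
        match = LOG_RE.search(line)
        if match and match.group("path") == "/robots.txt":
            hits.append(match.groupdict())
    return hits


# The two log entries from the channel, copied as sample input.
SAMPLE = [
    '[23/Jan/2019:19:45:40 +0000] "GET /robots.txt HTTP/1.1" 200 447 "-" '
    '"Mozilla/5.0 (compatible; archive.org_bot; Wayback Machine Live Record; '
    '+http://archive.org/details/archive.org_bot)" "-" yanovich.net',
    '[23/Jan/2019:19:45:51 +0000] "GET /robots.txt HTTP/2.0" 200 435 '
    '"https://web.archive.org/save/https://yanovich.net/2018/08/help-archive-the-web/" '
    '"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0" '
    '"-" yanovich.net',
]

for hit in robots_fetches(SAMPLE):
    print(hit["time"], hit["status"], hit["agent"])
```

Filtering on the path rather than the user agent catches both requests, since only the first one identifies itself as archive.org_bot.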
19:55 🔗 yano ah, bay-hoooey
19:55 🔗 yano *ba-hoooey
20:05 🔗 m007a83 has joined #archiveteam-ot
20:23 🔗 jrwr has quit IRC (Read error: Connection reset by peer)
20:24 🔗 jrwr has joined #archiveteam-ot
20:42 🔗 BlueMax has joined #archiveteam-ot
20:58 🔗 eientei95 https://xkcd.com/2102/ IA in XKCD
21:03 🔗 eientei95 Hm. Anyone else getting a 400 Bad Request error when trying to do a https://web.archive.org/save/<link> ?
21:20 🔗 ivan_ yes
21:21 🔗 ivan_ try again and it might work, or not
21:23 🔗 eientei95 ivan_: Nope, still 400
21:32 🔗 arkiver eientei95: nice
21:33 🔗 * arkiver is waiting for the day Archive Team is in XKCD
22:06 🔗 nataraj has quit IRC (Read error: Operation timed out)
22:45 🔗 yano yikes, https://web.archive.org/web/20190123224207/http://clerk.house.gov/evs/2019/roll043.xml
22:45 🔗 yano #FormattingFail
22:57 🔗 kiska1 has quit IRC (Ping timeout (120 seconds))
22:58 🔗 wmvhater has joined #archiveteam-ot
22:58 🔗 kiska1 has joined #archiveteam-ot
