#archiveteam-bs 2019-03-08,Fri

Time	Nickname	Message
00:12 ^🔗	godane	i'm recapturing a tape cause of sync issues
00:22 ^🔗	godane	ok now i think the recording is out of sync
00:22 ^🔗	godane	the other recording on this tape has no sync issues
00:26 ^🔗	godane	so the recording was the movie christine on cbs on 1987-04-18
00:27 ^🔗		SimpBrain has quit IRC (Read error: Operation timed out)
00:29 ^🔗		SimpBrain has joined #archiveteam-bs
00:33 ^🔗		godane has quit IRC (Ping timeout: 246 seconds)
00:35 ^🔗	JAA	BartoCH: ^ That should be everything I threw into ArchiveBot. Looks like I didn't do anything for the March 2018 vote (Billag & finances). Once the bot runs, we'll get pretty tables for everything. :-)
00:36 ^🔗		tammy_ has joined #archiveteam-bs
00:36 ^🔗	tammy_	I'm the one who has the InterfaceLIFT warc scrape from a year or 2 ago. Is there a tool to upload this to IA that I can ratelimit the upload? I'd prefer to do it that way rather than go through the web interface.
00:36 ^🔗		evul_ has joined #archiveteam-bs
00:38 ^🔗	JAA	Sorry, didn't see your message on Reddit since I was busy adding stuff to our wiki.
00:38 ^🔗	tammy_	no worries
00:38 ^🔗	JAA	Looks like "ia" doesn't have a rate limiting option, but I think you can also upload with curl, and that should have an option somewhere.
00:38 ^🔗	tammy_	long time no see jaa
00:38 ^🔗	JAA	Indeed :-)
00:38 ^🔗	JAA	https://archive.org/help/abouts3.txt has details on how to upload with curl.
00:39 ^🔗	tammy_	ok, I'll look into that after dinner. playing some rocket league. dataset is nice and safe though :)
00:39 ^🔗	JAA	With that large an item, make sure to provide a size hint (described somewhere in that document).
00:40 ^🔗	JAA	But if you can, I'd suggest you just use the "ia" tool instead since it's the canonical way of uploading large amounts of data to IA.
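
A minimal sketch of the curl route JAA mentions, for readers who want a rate-limited upload. It assumes the header names described in the abouts3.txt document linked above (`authorization: LOW key:secret`, `x-amz-auto-make-bucket`, `x-archive-size-hint`) and curl's `--limit-rate` option. The item identifier, file name, keys, and rate are hypothetical placeholders, and the function only builds the command; it does not run it.

```python
import os
import shlex


def build_ia_curl_upload(path, identifier, access_key, secret_key,
                         size_bytes, rate="500k"):
    """Build (not run) a curl command for a rate-limited upload to IA's S3-like API."""
    filename = os.path.basename(path)
    return [
        "curl",
        "--location",
        "--limit-rate", rate,                              # cap upload bandwidth
        "--header", "x-amz-auto-make-bucket:1",            # create the item if it does not exist
        "--header", f"x-archive-size-hint:{size_bytes}",   # size hint for large items, in bytes
        "--header", f"authorization: LOW {access_key}:{secret_key}",
        "--upload-file", path,
        f"https://s3.us.archive.org/{identifier}/{filename}",
    ]


# Hypothetical values, for illustration only.
cmd = build_ia_curl_upload("interfacelift.warc.gz", "example-item",
                           "ACCESS", "SECRET", 123456789)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell once real credentials are filled in; as noted in the channel, the "ia" tool remains the usual choice when rate limiting is not required.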
00:40 ^🔗		godane has joined #archiveteam-bs
00:41 ^🔗	godane	i'm taking a break digitizing dashcloud tapes
00:42 ^🔗	godane	i was going to use vlc to sync the earlier rip but then vlc crashed the system
00:43 ^🔗	godane	like the mouse moved but nothing responded to it
00:48 ^🔗	JAA	tammy_: I linked the tool in my PM, by the way, but here's the link again: https://archive.org/services/docs/api/internetarchive/ (Python package "internetarchive")
01:05 ^🔗		marked has quit IRC (Read error: Operation timed out)
01:06 ^🔗		marked has joined #archiveteam-bs
01:06 ^🔗		marked has quit IRC (west.us.hub irc.Prison.NET)
01:06 ^🔗		godane has quit IRC (west.us.hub irc.Prison.NET)
01:06 ^🔗		achip has quit IRC (west.us.hub irc.Prison.NET)
01:10 ^🔗		Exairnous has joined #archiveteam-bs
01:16 ^🔗		achip has joined #archiveteam-bs
01:16 ^🔗		marked has joined #archiveteam-bs
01:16 ^🔗		godane has joined #archiveteam-bs
01:23 ^🔗		BlueMax has joined #archiveteam-bs
01:32 ^🔗		Dimtree has joined #archiveteam-bs
01:40 ^🔗		ndiddy has joined #archiveteam-bs
01:41 ^🔗	Exairnous	JAA: I see what you mean about youtube on IA. Is there any way to put a link on the youtube video page to the actual video?
01:41 ^🔗	Exairnous	or do you know of another place to save youtube that handles playback better?
02:17 ^🔗		bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
02:56 ^🔗	JAA	Exairnous: No, fortunately there is no way to add links like that to the Wayback Machine. It could completely compromise the authenticity of the archived snapshot, since it would be a fake response. And no, I'm not aware of any good solution to archiving YouTube pages. Playback will always be tricky with sites like that; even if you use a full browser for the archival etc., playback might happen on a
02:56 ^🔗	JAA	different browser or platform, which can change which URLs (or in this case, which video resolution, for example) are requested, thus breaking the playback. Archiving these things is a huge nightmare really. The best solution is to extract the relevant information from it and present it in a more sane way.
03:02 ^🔗	Exairnous	JAA: What about links/embeds to the archived youtube videos on an archived website? Will they still work?
03:05 ^🔗	JAA	Exairnous: No, probably not. The WBM has some stuff for handling YT videos specially though and replacing them with their own player, but I'm not sure if we can feed into that in any way.
03:06 ^🔗	Exairnous	:(
03:07 ^🔗	Exairnous	JAA: There are several youtube links in pages on ngharmony.ca. Is there any way to have them resolve correctly after the youtube channel is taken down?
03:08 ^🔗	JAA	Exairnous: Define "resolve correctly". They'll still point to the same YouTube pages, which will be broken in the WBM.
03:10 ^🔗	Flashfire	Um I am having trouble with the save now feature
03:10 ^🔗	Flashfire	it 404s when I try and save a page
03:10 ^🔗	Exairnous	JAA: Point to a working video.
03:10 ^🔗	JAA	Exairnous: Almost certainly no.
03:11 ^🔗	Exairnous	:(
03:11 ^🔗	Flashfire	https://web.archive.org/save/https://www.youtube.com/watch?v=El41sHXck-E gave me a 404 damn it
03:11 ^🔗	JAA	YouTube's fault really. If they simply used an HTML5 <video> tag, it would all work fine.
03:11 ^🔗	Flashfire	Don't ask why I am trying to save youtube videos like that, but it's annoying that it's not working and 404ing on me
03:11 ^🔗		godane has quit IRC (Ping timeout: 255 seconds)
03:12 ^🔗	Flashfire	Can someone else check if they are having problems with the save now feature please or if its just me?
03:12 ^🔗	JAA	Flashfire: Works fine for me. Well, except the saved page is broken, but that's expected.
03:13 ^🔗	Flashfire	I mean does using the https://web.archive.org/save/ work for any page
03:13 ^🔗	Flashfire	it just 404s when I click save page
03:13 ^🔗	JAA	Yes, I simply visited the link you pasted, and it archived the page.
03:13 ^🔗	Exairnous	JAA: I just had a look at a youtube video with a browser inspector. I'm pretty sure it had a video tag with a blob: link.
03:14 ^🔗	JAA	https://web.archive.org/web/20190308031204/https://www.youtube.com/watch?v=El41sHXck-E
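
The two URL shapes pasted in this exchange (the `/save/` request and the resulting `/web/<timestamp>/` snapshot) can be sketched as follows. This is only an illustration of the patterns visible in the log; the function names are hypothetical, and the parser assumes the plain 14-digit timestamp form shown above.

```python
import re

WAYBACK = "https://web.archive.org"

# Matches snapshot URLs of the form /web/<14-digit timestamp>/<original URL>.
_SNAPSHOT_RE = re.compile(r"^https://web\.archive\.org/web/(\d{14})/(.+)$")


def save_url(original_url):
    """Return the 'save now' URL for a page, as pasted in the channel."""
    return f"{WAYBACK}/save/{original_url}"


def parse_snapshot_url(snapshot_url):
    """Split a snapshot URL into (timestamp, original URL), or None if it does not match."""
    match = _SNAPSHOT_RE.match(snapshot_url)
    if match is None:
        return None
    return match.group(1), match.group(2)


video = "https://www.youtube.com/watch?v=El41sHXck-E"
print(save_url(video))
print(parse_snapshot_url(f"{WAYBACK}/web/20190308031204/{video}"))
```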
03:15 ^🔗	Flashfire	Try visiting the save now page and saving a random link. It wont work for me
03:15 ^🔗	Flashfire	The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.
03:15 ^🔗	Flashfire	I get that error
03:15 ^🔗	JAA	Exairnous: Yeah, they use a <video> tag but then modify its contents with JS. Specifically, the <source> tag has an attribute src$="[[videoThumbnail_.url]]" instead of simply src with an actual URL.
03:15 ^🔗	JAA	What I really mean is a site that just works without JS.
03:16 ^🔗	Exairnous	of course they do :/
03:16 ^🔗		SimpBrain has quit IRC (Read error: Operation timed out)
03:17 ^🔗	JAA	Yeah, modern websites need to use at least three frameworks on both client and server, otherwise they're not modern enough.
03:18 ^🔗	JAA	Well, welcome to the hell that is archiving JS-heavy websites. :-)
03:19 ^🔗	Exairnous	JS-heavy purposely obfuscated websites :P
03:20 ^🔗	Exairnous	Cause I'm fairly sure WM can playback at least some JS?
03:20 ^🔗	JAA	Oh, JS on its own works fine. It's the XMLHttpRequests and similar stuff which break.
03:21 ^🔗	Exairnous	that sounds like it needs a server to wrok properly
03:21 ^🔗	Exairnous	*work
03:22 ^🔗		SimpBrain has joined #archiveteam-bs
03:22 ^🔗		underscor has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		Hani has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		noirscape has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		argus has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		arbin has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		ReimuHaku has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		Ganonmast has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		PurpleSym has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		kisspunch has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		Frogging has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		jodizzle has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		VoynichCr has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		MrRadar2 has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		Tenebrae has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗		BnAboyZ has quit IRC (hub.efnet.us irc.efnet.nl)
03:22 ^🔗	JAA	Yeah, but IA's URLs also come into play since the absolute URL is different.
03:22 ^🔗	JAA	WBM's URLs*
03:22 ^🔗		slyphic has quit IRC (Read error: Operation timed out)
03:22 ^🔗		slyphic has joined #archiveteam-bs
03:23 ^🔗		Jopik has joined #archiveteam-bs
03:23 ^🔗	JAA	I think someone here (PurpleSym?) had a PoC of something based on service workers for rewriting URLs on the fly.
03:23 ^🔗	JAA	WBM currently works by rewriting anything that looks like a URL statically.
03:23 ^🔗	JAA	That solution would instead hijack any requests sent by the browser, rewrite them into the equivalent WBM URLs, and then send that request instead.
03:24 ^🔗	JAA	That still doesn't help with the potential differences in URLs based on browser versions etc. though.
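
The request-rewriting idea described here can be sketched as a plain mapping function. An actual proof of concept would live in a browser service worker; the Python below only illustrates the URL mapping, and the function name, example page, and timestamp are hypothetical.

```python
from urllib.parse import urljoin

WAYBACK_PREFIX = "https://web.archive.org/web"


def rewrite_request(requested, page_url, timestamp):
    """Map a URL requested by an archived page to an equivalent Wayback Machine URL.

    requested: the URL as the page's script asked for it (may be relative)
    page_url:  the original (non-archived) URL of the page being played back
    timestamp: the 14-digit snapshot timestamp of that page
    """
    absolute = urljoin(page_url, requested)  # resolve relative requests first
    if absolute.startswith(WAYBACK_PREFIX + "/"):
        return absolute  # already points into the archive, leave untouched
    return f"{WAYBACK_PREFIX}/{timestamp}/{absolute}"


print(rewrite_request("/api/data", "https://example.com/page/index.html", "20190308031204"))
```

As the discussion notes, this kind of mapping does not help when a different browser requests different URLs than the ones that were captured.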
03:24 ^🔗		underscor has joined #archiveteam-bs
03:24 ^🔗		Hani has joined #archiveteam-bs
03:24 ^🔗		noirscape has joined #archiveteam-bs
03:24 ^🔗		argus has joined #archiveteam-bs
03:24 ^🔗		arbin has joined #archiveteam-bs
03:24 ^🔗		ReimuHaku has joined #archiveteam-bs
03:24 ^🔗		Ganonmast has joined #archiveteam-bs
03:24 ^🔗		PurpleSym has joined #archiveteam-bs
03:24 ^🔗		kisspunch has joined #archiveteam-bs
03:24 ^🔗		Frogging has joined #archiveteam-bs
03:24 ^🔗		jodizzle has joined #archiveteam-bs
03:24 ^🔗		VoynichCr has joined #archiveteam-bs
03:24 ^🔗		MrRadar2 has joined #archiveteam-bs
03:24 ^🔗		Tenebrae has joined #archiveteam-bs
03:24 ^🔗		BnAboyZ has joined #archiveteam-bs
03:24 ^🔗		irc.efnet.nl sets mode: +oo PurpleSym MrRadar2
03:24 ^🔗	JAA	And the archival would still require a full browser, which is very inefficient compared to our usual methods of archiving things.
03:25 ^🔗		tammy_ has quit IRC (Ping timeout: 261 seconds)
03:25 ^🔗		VerifiedJ has quit IRC (Ping timeout: 252 seconds)
03:25 ^🔗	Exairnous	JAA: Would something like Rhizome's Webrecorder produce a warc that could be uploaded to IA and playback correctly? Or does youtube have to be dynamic?
03:26 ^🔗	JAA	The problem partially lies in the Wayback Machine itself.
03:26 ^🔗	JAA	So no, playback would almost certainly not work correctly.
03:27 ^🔗		flipflop has quit IRC (Read error: Operation timed out)
03:32 ^🔗	Flashfire	JAA try putting https://vaguthu.mv/evaguthu/163689 through the save now page found at https://web.archive.org/save/
03:32 ^🔗	Flashfire	its not working
03:34 ^🔗		VerifiedJ has joined #archiveteam-bs
03:44 ^🔗	Flashfire	wtf is going on for me to not be able to use the save now feature
03:45 ^🔗	Flashfire	Have I been marked as a spammer from the weird urls?
04:07 ^🔗		odemgi has joined #archiveteam-bs
04:09 ^🔗		odemgi_ has quit IRC (Ping timeout: 252 seconds)
04:13 ^🔗	hook54321	Flashfire: What message do you get when trying to save it?
04:14 ^🔗	Flashfire	The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.
04:15 ^🔗	hook54321	I've had that happen a few times, not sure why. Try another URL and see what happens.
04:15 ^🔗		odemg has quit IRC (Ping timeout: 615 seconds)
04:19 ^🔗	Flashfire	it does it with other urls as well
04:22 ^🔗		odemg has joined #archiveteam-bs
04:30 ^🔗		VerifiedJ has quit IRC (Ping timeout: 252 seconds)
04:34 ^🔗		VerifiedJ has joined #archiveteam-bs
04:46 ^🔗		qw3rty111 has joined #archiveteam-bs
04:49 ^🔗		m007a83_ is now known as m007a83
04:50 ^🔗		qw3rty119 has quit IRC (Read error: Operation timed out)
04:54 ^🔗		HashbangI has quit IRC (Read error: Operation timed out)
04:54 ^🔗		nataraj_ has joined #archiveteam-bs
04:55 ^🔗		a_spook_ has joined #archiveteam-bs
04:57 ^🔗		HashbangI has joined #archiveteam-bs
04:57 ^🔗	a_spook_	Flashfire: dunno if it helps, but I have weird issues with wayback sometimes and they go away when I clear cookies? Though, I don't think it was the same error message as yours.
05:02 ^🔗	a_spook_	Flashfire: also I just did https://web.archive.org/save/https://www.youtube.com/watch?v=El41sHXck-E&disable_polymer=1 because ew modern youtube :P
05:03 ^🔗	Flashfire	Yeah see doing it that way works but using the save now button uses less of my computers resources. or at least makes the fan not scream as loud
05:03 ^🔗	Flashfire	but its not letting me do that
05:08 ^🔗		ndiddy_ has joined #archiveteam-bs
05:09 ^🔗	a_spook_	Flashfire: ah I see, I missed that page you said you were using, sorry
05:12 ^🔗	Flashfire	a_spook_ Yeah trying to use the save now page
05:13 ^🔗		ndiddy has quit IRC (Ping timeout: 492 seconds)
05:30 ^🔗	a_spook_	Flashfire: I've actually never used that before and just tried it. I'm getting the same error as you on a random page I chose to test. Guess it's not just you then! :')
05:37 ^🔗		SimpBrain has quit IRC (Read error: Connection reset by peer)
05:39 ^🔗		SimpBrain has joined #archiveteam-bs
05:42 ^🔗	Exairnous	Is it better to use an IA save now bookmarklet or archivebot for a single page?
05:44 ^🔗	Exairnous	Cause I looked at one of the youtube embeds in my site (seems to be in IA now, yay) and the iframe link resolves to a valid link not in the wayback machine
05:45 ^🔗	Exairnous	I think if I archive that link the embed may work, but I'm wondering whether to use archivebot or a bookmarklet.
05:52 ^🔗		wp494 has quit IRC (Read error: Operation timed out)
05:52 ^🔗		wp494 has joined #archiveteam-bs
06:14 ^🔗		ndiddy_ has quit IRC ()
06:58 ^🔗	Exairnous	JAA: ^^
07:25 ^🔗		SimpBrain has quit IRC (Remote host closed the connection)
07:25 ^🔗		SimpBrain has joined #archiveteam-bs
07:34 ^🔗		SimpBrain has quit IRC (Remote host closed the connection)
07:34 ^🔗		SimpBrain has joined #archiveteam-bs
07:44 ^🔗		BlueMax has quit IRC (Read error: Connection reset by peer)
07:44 ^🔗		SimpBrain has quit IRC (Read error: Connection reset by peer)
07:47 ^🔗		Pixi` has joined #archiveteam-bs
07:48 ^🔗		Pixi has quit IRC (Read error: Operation timed out)
07:51 ^🔗		SimpBrain has joined #archiveteam-bs
07:54 ^🔗		VerifiedJ has quit IRC (Ping timeout: 252 seconds)
07:58 ^🔗		SimpBrain has quit IRC (Remote host closed the connection)
08:05 ^🔗		SimpBrain has joined #archiveteam-bs
08:06 ^🔗		VerifiedJ has joined #archiveteam-bs
08:30 ^🔗		S1mpbrain has joined #archiveteam-bs
08:30 ^🔗		SimpBrain has quit IRC (Remote host closed the connection)
08:47 ^🔗		lag__ has joined #archiveteam-bs
08:55 ^🔗		S1mpbrain has quit IRC (Ping timeout: 615 seconds)
10:13 ^🔗		Mateon1 has quit IRC (Ping timeout: 740 seconds)
10:14 ^🔗		Mateon1 has joined #archiveteam-bs
10:38 ^🔗	JAA	Exairnous: Not sure. Both methods have advantages and disadvantages. But I don't know which works better for YouTube.
10:39 ^🔗	JAA	VoynichCr: Oh, awesome! I was looking for a way to do that but couldn't figure it out. :-)
10:40 ^🔗	JAA	I guess there's no way to filter out the /list pages, right?
10:44 ^🔗		Hani has quit IRC (Read error: Connection reset by peer)
10:44 ^🔗		Hani has joined #archiveteam-bs
10:51 ^🔗		Gfy has quit IRC (Ping timeout: 265 seconds)
10:54 ^🔗		a_spook_ has quit IRC (Quit: Connection closed for inactivity)
11:03 ^🔗		Gfy has joined #archiveteam-bs
12:16 ^🔗		bitBaron has joined #archiveteam-bs
13:42 ^🔗		bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
13:52 ^🔗		godane has joined #archiveteam-bs
14:15 ^🔗		bitBaron has joined #archiveteam-bs
14:53 ^🔗		wp494 has quit IRC (Ping timeout: 492 seconds)
14:55 ^🔗		wp494 has joined #archiveteam-bs
15:43 ^🔗		deevious has quit IRC (Quit: deevious)
16:02 ^🔗		Oddly has joined #archiveteam-bs
16:03 ^🔗		schbirid has joined #archiveteam-bs
16:26 ^🔗		VerifiedJ has quit IRC (Ping timeout: 252 seconds)
17:27 ^🔗		Hani111 has joined #archiveteam-bs
17:28 ^🔗		Hani has quit IRC (Read error: Connection reset by peer)
17:28 ^🔗		Hani111 is now known as Hani
17:38 ^🔗		Oddly has quit IRC (Ping timeout: 255 seconds)
17:44 ^🔗		VerifiedJ has joined #archiveteam-bs
17:48 ^🔗	JAA	My wiki bot will now keep the WBM exclusion list sorted.
18:02 ^🔗		nataraj_ has quit IRC (Read error: Operation timed out)
18:16 ^🔗		Oddly has joined #archiveteam-bs
18:55 ^🔗	VoynichCr	JAA: i dont think that filtering /list pages is possible
19:18 ^🔗		bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
19:35 ^🔗	Exairnous	JAA: Does saving with a bookmarklet interfere with what archivebot got?
19:36 ^🔗		bitBaron has joined #archiveteam-bs
19:47 ^🔗		evul_ is now known as evul
19:58 ^🔗		BlueMax has joined #archiveteam-bs
21:05 ^🔗		Albardin has quit IRC (Read error: Operation timed out)
21:07 ^🔗		Oddly has quit IRC (Ping timeout: 255 seconds)
21:08 ^🔗		kiskabak has quit IRC (Ping timeout: 265 seconds)
21:13 ^🔗		Hani has quit IRC (Read error: Operation timed out)
21:13 ^🔗		Hani has joined #archiveteam-bs
21:20 ^🔗		Hani has quit IRC (Ping timeout: 268 seconds)
21:20 ^🔗		Hani has joined #archiveteam-bs
22:39 ^🔗		wyatt8740 has joined #archiveteam-bs
23:00 ^🔗	JAA	Exairnous: Well, the Wayback Machine is one big mixture of WARCs from all over the place, including the "save now" feature and ArchiveBot. Meaning, when you view the AB snapshot, you may also see content (e.g. images, stylesheets, scripts) from "save now" and vice-versa. So yes, it could interfere in that way. But the AB snapshot itself won't be affected by it in any way.
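
A rough illustration of the mixing JAA describes, under the assumption that playback serves each embedded resource from whichever capture is closest in time to the page being viewed, regardless of which crawler produced it. The capture list and source labels are made up for the example.

```python
from datetime import datetime

_FMT = "%Y%m%d%H%M%S"  # 14-digit Wayback-style timestamp


def closest_capture(captures, wanted):
    """Pick the (timestamp, source) capture nearest in time to the wanted timestamp."""
    target = datetime.strptime(wanted, _FMT)
    return min(captures, key=lambda c: abs(datetime.strptime(c[0], _FMT) - target))


# Hypothetical captures of the same stylesheet from two different sources.
captures = [
    ("20190308010000", "ArchiveBot"),
    ("20190308031204", "save now"),
]
print(closest_capture(captures, "20190308030000"))
```

Under that assumption, a page crawled by ArchiveBot can end up displaying a resource that was captured through "save now", while the ArchiveBot WARC itself stays unchanged.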
23:01 ^🔗	jodizzle	Seems like a lot of the Venezuelan sites are down right now. Wonder if it's because of this: https://www.theguardian.com/world/2019/mar/07/venezuela-hit-by-major-power-outage
23:06 ^🔗		BlueMax has quit IRC (Quit: Leaving)
23:18 ^🔗		MR9K has quit IRC (Remote host closed the connection)
23:19 ^🔗		MR9K has joined #archiveteam-bs
23:39 ^🔗	Gfy	SketchCow: is there a chance day addnfo-2010-1020.zip got skipped in the process somewhere? (regarding https://archive.org/download/nfo_large_collection_2009_2012)
23:58 ^🔗		wp494 has quit IRC (Read error: Operation timed out)
23:59 ^🔗		wp494 has joined #archiveteam-bs
