#archiveteam 2014-10-21,Tue


Time Nickname Message
00:00 🔗 SketchCow Boop. Did I miss anything
00:02 🔗 SketchCow Regarding the discussion a few lines back, if you have to constantly start and shut your warrior, maybe you shouldn't be running a warrior.
00:10 🔗 Mayonaise does stopping and starting it frequently cause issues?
00:10 🔗 Mayonaise i leave mine running. just curious
00:13 🔗 balrog Mayonaise: yes it does, if you don't let it gracefully stop
00:13 🔗 balrog it means jobs get stuck
00:13 🔗 Mayonaise hrm evil
00:20 🔗 sharpobje hi, is there any way to find a particular archived twitch vod?
00:21 🔗 sharpobje I found a helpful searchable index, but its contents are not as broad as the contents of the github repo with a list of videos to grab
00:21 🔗 sharpobje also it just has old twitch URLs which are not functional
00:24 🔗 aaaaaaaaa did you check http://chfoo-cn.mooo.com/~archiveteam/twitchtv-index/html/ or the list here: http://archiveteam.org/index.php?title=Twitch.tv#What_we_are_saving
00:25 🔗 sharpobje the former thing gives links to archives hosted by twitch?
00:26 🔗 aaaaaaaaa it's a rough index of what was saved based on the lists in the second link, AFAIK
00:26 🔗 sharpobje I checked out the git repo, it says there is one archive saved for channel leveluplive2, but the former tool does not return any results for that channel
00:26 🔗 sharpobje well I am not sure how to use the searchable index to find content that is not still hosted by twitch
00:29 🔗 SketchCow It is not easy, we admit.
00:29 🔗 SketchCow Analysis is one of my big pushes for 2015.
00:30 🔗 SketchCow Hey, so swipnet is BASICALLY in the wayback machine.
00:30 🔗 aaaaaaaaa which list was it on at the git repo
00:30 🔗 SketchCow Next time, we have to think of ways to do it that don't result in MASSIVE amounts of tiny files.
00:31 🔗 aaaaaaaaa SketchCow: does FOS love you again?
00:31 🔗 SketchCow It took the machine weeks and weeks to deal with that.
00:31 🔗 SketchCow No, FOS is still pretty much manga torture comic land
00:32 🔗 SketchCow But things are starting to finish up that are the most intense disk operations.
00:32 🔗 sharpobje https://github.com/ArchiveTeam/twitchtv-items/blob/master/csv/highlights_top.csv in this file there is an entry for leveluplive2
00:33 🔗 sharpobje (also it is very weird for me to have a twitchtv-items repo next to all my other twitch repos)
00:33 🔗 sharpobje I guess the csv may have been the input to a process that generated the much shorter https://github.com/ArchiveTeam/twitchtv-items/blob/master/items/video_pages/05_top_videos_10000views.txt, which is the list of stuff that was actually saved?
00:35 🔗 SketchCow Like, it's still doing verizon files, also tiny, also legion.
00:36 🔗 SketchCow Ha ha OOPS
00:36 🔗 SketchCow I just opened all my screens on FOS and one of them is a move swipnet operation, still running.
00:36 🔗 SketchCow It is officially past... 60 days.
00:37 🔗 SketchCow That's the thing that just moves them to prepare them to be made into megawarcs.
00:37 🔗 balrog ouch.
00:37 🔗 SketchCow JUST moving. From one directory to another on the same drive.
00:37 🔗 balrog that should not take so long... what's the sheer number of files?
00:38 🔗 SketchCow ha ha "should"
00:38 🔗 SketchCow Justice does not reign in linux
00:38 🔗 SketchCow No, our merry band of maniacs just put millions, millions of files into this filesystem.
00:38 🔗 SketchCow It does not like.
00:38 🔗 aaaaaaaaa sharpobje: what is the channel name again, please?
00:39 🔗 sharpobje leveluplive2
00:40 🔗 SketchCow See, it has to do a "how big is this directory" after each file.
00:40 🔗 SketchCow So it's doing that at the end, and checking each pass multiple times.
00:40 🔗 SketchCow This laptop is nice but the keyboard is shit.
00:46 🔗 aaaaaaaaa sharpobje: sorry, I'm unable to find a video url.
00:46 🔗 sharpobje alright, thanks
00:47 🔗 aaaaaaaaa it looks like it was grabbed but I don't see a reference to the actual url for the video.
00:55 🔗 aaaaaaaaa yeah, I can't figure out how to translate the highlight into the real url.
01:15 🔗 SketchCow Verified and now deleting 3tb of Ancestry.com files.
01:15 🔗 balrog they're all taken care of?
01:15 🔗 SketchCow They're all uploaded.
01:15 🔗 SketchCow I keep the original directory and the generated files until I'm sure it's all set.
01:16 🔗 sharpobje aaaaaaaaa: the actual video urls contain the channel name, so if there is a list of those somewhere we could search it
01:17 🔗 aaaaaaaaa https://github.com/ArchiveTeam/twitchtv-items/tree/master/items/flv_urls
01:17 🔗 aaaaaaaaa but no liveuplive2 in any of them
01:18 🔗 aaaaaaaaa tons of liveuplive, but no liveuplive2
01:18 🔗 sharpobje alright, thanks
01:18 🔗 aaaaaaaaa sorry, leveluplive and leveluplive2
01:19 🔗 aaaaaaaaa but highlights use the video url of the original video, so if channel 1 makes a highlight of channel 2's video, the url would have channel 2's name in it
01:19 🔗 aaaaaaaaa at least that is my understanding
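
A minimal sketch of the search discussed above: scan a plain-text list of video URLs for a channel name, matching it only as a whole token so that `leveluplive` does not also match `leveluplive2`. The function name `find_channel_urls` and the sample URLs are hypothetical, and the actual layout of the files under `items/flv_urls` is an assumption here, not something confirmed in the log.

```python
import re


def find_channel_urls(lines, channel):
    """Return the lines that contain `channel` as a whole token.

    A token boundary here means the name is not directly preceded or
    followed by a letter, digit, or underscore (characters assumed to be
    legal inside a channel name).
    """
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(channel) + r"(?![A-Za-z0-9_])",
        re.IGNORECASE,
    )
    return [line.strip() for line in lines if pattern.search(line)]


# Hypothetical sample data; real lists would be read from the repo's text files.
sample = [
    "http://example.invalid/store/leveluplive/live_1.flv",
    "http://example.invalid/store/leveluplive2/live_2.flv",
]

if __name__ == "__main__":
    for url in find_channel_urls(sample, "leveluplive2"):
        print(url)
```

As aaaaaaaaa points out, a highlight may carry the name of the channel that owned the original video, so an empty result does not prove the highlight was not grabbed.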
01:27 🔗 SketchCow FOS still a nightmare
01:29 🔗 namespace FOS?
01:44 🔗 aaaaaaaaa FOS is the magical machine that turns what we download into something usable by the internet archive
01:46 🔗 godane looks like this ted talk has a video/audio sync issue: https://archive.org/details/JaneMcGonigal_2010
01:46 🔗 godane this is a ted talk issue, because i have the problem with both the original file and the 1500k file i have
02:07 🔗 SketchCow My buddy Jane!
02:14 🔗 JaneMax >___>
02:46 🔗 SketchCow I can have more than one buddy Jane
06:35 🔗 Nemo_bis http://www.infodisiac.com/Wikipedia/ScanMail/ is not much on wayback, archivebot please :) (no parent dirs)
12:35 🔗 schbirid http://www.apkmirror.com/
12:41 🔗 godane i could do a brute force of the download links: http://www.apkmirror.com/wp-content/themes/APKMirror/download.php?id=970
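
A rough sketch of what enumerating those download links could look like. It only builds the candidate URL strings and does no fetching. The ID range is an assumption (only `id=970` appears in the log), and the helper name `candidate_download_urls` is hypothetical.

```python
# URL pattern taken from the link quoted above; only the numeric id varies.
BASE_URL = "http://www.apkmirror.com/wp-content/themes/APKMirror/download.php?id={}"


def candidate_download_urls(start_id, end_id):
    """Build one candidate URL per integer id in [start_id, end_id]."""
    return [BASE_URL.format(i) for i in range(start_id, end_id + 1)]


if __name__ == "__main__":
    # Assumed range, purely for illustration.
    for url in candidate_download_urls(968, 970):
        print(url)
```

A list like this could be handed to whatever grabbing tool is in use, with rate limits chosen to be polite to the site.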
15:12 🔗 dserodio how to properly archive controversial Youtube videos that may be taken down because of political "censoring"?
15:19 🔗 arkiver <SketchCow>Verified and now deleting 3tb of Ancestry.com files.
15:20 🔗 arkiver You do know we are still running? http://tracker.archiveteam.org/ancestry/
15:35 🔗 SketchCow I do.
15:35 🔗 SketchCow That's 3tb of files that have been uploaded out of the hopper.
15:56 🔗 DFJustin dserodio: youtube_dl --title --continue --retries 4 --write-info-json --write-description --write-thumbnail --write-annotations --all-subs --ignore-errors -f 38/138+141/138+22/138+140/138+139/264+141/264+22/264+140/264+139/137+141/137+22/137+140/137+139/37/22/135+141/135+22/135+140/135+139/best
15:56 🔗 DFJustin https://github.com/ludios/youtube-dl
15:57 🔗 DFJustin (needs to be on the wiki)
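
To make the long `-f` argument above easier to read: in youtube-dl's format selection syntax, `/` separates alternatives tried in order of preference, and `+` asks for a video format and an audio format to be merged. The sketch below only splits a selector string into that structure. It is not part of youtube-dl, and the helper name `parse_format_selector` is hypothetical.

```python
def parse_format_selector(selector):
    """Split a youtube-dl style -f selector into ordered alternatives.

    Each alternative is a tuple of format codes: one element for a single
    format, two elements for a video+audio merge.
    """
    return [tuple(alternative.split("+")) for alternative in selector.split("/")]


if __name__ == "__main__":
    # Shortened excerpt of the selector quoted above.
    for rank, formats in enumerate(parse_format_selector("38/138+141/138+22/best"), 1):
        print(rank, " + ".join(formats))
```

Read this way, the command prefers the first listed format and falls back through the rest until it reaches `best`.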
19:46 🔗 bebzol hi! can any of the admins look at github repository ownlog-grab?
19:46 🔗 bebzol I'm at the second round of testing this project - and so far everything is working fine
21:05 🔗 reptile10 WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
21:05 🔗 chronomex yahoosucks
21:05 🔗 chronomex what is your quest
