[00:05] *** simon816 has joined #internetarchive
[00:06] *** JAA has joined #internetarchive
[00:07] *** bakJAA sets mode: +o JAA
[00:10] *** arkiver has quit IRC (Read error: Operation timed out)
[00:10] *** dxrt_ has quit IRC (Read error: Operation timed out)
[00:10] *** ivan has quit IRC (Read error: Operation timed out)
[00:10] *** balrog has quit IRC (Read error: Operation timed out)
[00:10] *** kiska1 has quit IRC (Read error: Operation timed out)
[00:11] *** logchfoo2 starts logging #internetarchive at Sun Mar 31 00:11:27 2019
[00:11] *** logchfoo2 has joined #internetarchive
[00:11] *** arbin has joined #internetarchive
[00:11] *** fredgido has joined #internetarchive
[00:12] *** joepie91 has joined #internetarchive
[00:14] *** fredgido_ has quit IRC (Read error: Operation timed out)
[00:16] *** qw3rty112 has quit IRC (Read error: Operation timed out)
[00:16] *** arkiver has joined #internetarchive
[00:30] *** Sunfused has joined #internetarchive
[01:08] *** qw3rty112 has joined #internetarchive
[01:09] *** kiska1 has joined #internetarchive
[02:03] *** balrog has quit IRC (Read error: Operation timed out)
[02:07] *** balrog has joined #internetarchive
[03:32] *** qw3rty113 has joined #internetarchive
[03:38] *** qw3rty112 has quit IRC (Ping timeout: 600 seconds)
[03:57] *** balrog has quit IRC (Write error: Broken pipe)
[03:57] *** kiska1 has quit IRC (Read error: Operation timed out)
[03:57] *** balrog has joined #internetarchive
[03:58] *** Sunfused has quit IRC (Read error: Operation timed out)
[03:59] *** qw3rty114 has joined #internetarchive
[03:59] *** kiska1 has joined #internetarchive
[03:59] *** Sunfused has joined #internetarchive
[04:01] *** qw3rty113 has quit IRC (Ping timeout: 600 seconds)
[04:04] *** odemg has quit IRC (Ping timeout: 615 seconds)
[04:10] *** odemg has joined #internetarchive
[04:40] *** DFJustin has quit IRC (Remote host closed the connection)
[04:57] *** DFJustin has joined #internetarchive
[06:25] *** bztoot has quit IRC (Remote host closed the connection)
[07:00] *** t2t2 has joined #internetarchive
[07:23] *** Somebody2 has quit IRC (Read error: Operation timed out)
[07:24] *** Somebody2 has joined #internetarchive
[07:25] *** flipflop has joined #internetarchive
[07:29] *** Jopik has quit IRC (Ping timeout: 360 seconds)
[07:30] *** prokuz has joined #internetarchive
[07:31] *** Smiley has quit IRC (Remote host closed the connection)
[07:31] *** prokuz has quit IRC (Remote host closed the connection)
[07:31] *** Smiley has joined #internetarchive
[07:33] *** prokuz has joined #internetarchive
[07:33] *** prokuz has quit IRC (Remote host closed the connection)
[07:34] *** Jopik has joined #internetarchive
[07:35] *** flipflop has quit IRC (Ping timeout: 360 seconds)
[07:36] *** VADemon has joined #internetarchive
[07:54] *** Somebody2 has quit IRC (Ping timeout: 360 seconds)
[07:55] *** Somebody2 has joined #internetarchive
[07:59] *** dxrt_ has joined #internetarchive
[07:59] *** dxrt sets mode: +o dxrt_
[10:24] *** Stiletto has joined #internetarchive
[19:13] *** figpucker has joined #internetarchive
[19:41] *** figpucker has quit IRC (Read error: Connection reset by peer)
[19:43] *** figpucker has joined #internetarchive
[19:58] is there a good way to search for youtube videos in IA? Searching by channel id doesn't yield many results. Is it up to the uploader to tag it?
[20:15] simon816: you can search by video ID or channel name, also tags
[20:16] tubeup script uploads that metadata
[20:17] video ID should work, but requires me to have all ids for a channel (luckily I can get that from somewhere else). when searching for channel name I got some false positives
[20:18] basically I'm wanting to download videos from channels that are not already in the archive
[20:18] so I need to find out what's already in IA
[20:19] ah, check the "Mirrortube" collection and export the item names
[20:20] there are about 100,000 videos already, not sure if all from youtube or other services too
[20:20] 252,000
[20:22] thanks. that looks like a good place to start
[20:24] tubeup uses the pattern "service-ID" for item names
[20:26] twitter videos are "twittercard-ID" though i dont know if they are in mirrortube collection
[20:29] I guess my next question is how to get all file names in the mirrortube collection?
[20:38] simon816: https://archive.org/advancedsearch.php
[20:40] excellent, thanks
[20:51] unfortunately most uploads are really short on metadata or even lack a description. argh >:/
[22:08] How you guys doing with Google Minus project at the moment
[22:10] *** figpucker has quit IRC (Quit: Leaving)
[22:12] * Kaz looks at topic
[22:35] Whoops
[22:38] *** GLolol has joined #internetarchive
[23:02] *** Jopik has quit IRC (Remote host closed the connection)
[23:02] *** Jopik has joined #internetarchive
[23:12] [total task time: 3.0 days][includes 18.4 hours for initial rsync]
[23:12] Ah, the joy of derives on huge items.
[23:13] But at least it ran through smoothly, unlike last time, when it had to be restarted by IA admins three times.
[23:43] huh, more than I expected. https://puu.sh/D8cy1/967335b015.png <- percentage of videos of my subscriptions found in mirrortube
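
Editor's note: the workflow discussed between 20:15 and 20:40 (export the item names of the Mirrortube collection via advancedsearch.php, then compare them against a channel's video IDs using the "service-ID" naming pattern) could be sketched roughly as below. This is a minimal sketch, not what anyone in the channel actually ran. The collection identifier `mirrortube`, the `youtube` service prefix, and the helper names `build_search_url`, `extract_identifiers`, and `missing_video_ids` are assumptions made for illustration; the query parameters (`q`, `fl[]`, `rows`, `page`, `output`) are the ones the advanced search form generates.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(collection, rows=1000, page=1):
    """Build an advancedsearch.php URL listing item identifiers in a collection.

    The collection identifier (e.g. "mirrortube") is an assumption; check the
    collection page on archive.org for the exact spelling.
    """
    params = [
        ("q", f"collection:{collection}"),
        ("fl[]", "identifier"),  # only return the item name
        ("rows", rows),
        ("page", page),
        ("output", "json"),
    ]
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def extract_identifiers(payload):
    """Pull item identifiers out of a decoded JSON search response."""
    return [doc["identifier"] for doc in payload["response"]["docs"]]


def missing_video_ids(video_ids, identifiers, service="youtube"):
    """Return the video IDs that have no "service-ID" item among identifiers.

    Relies on the "service-ID" item naming pattern mentioned in the log;
    items uploaded under other names will not be detected.
    """
    archived = set(identifiers)
    return [vid for vid in video_ids if f"{service}-{vid}" not in archived]


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only; no network access is made here.
    print(build_search_url("mirrortube", rows=50))
    print(missing_video_ids(["abc", "def"], ["youtube-abc", "twittercard-1"]))
```

To run this against the live service, each URL from `build_search_url` would be fetched (for example with `urllib.request.urlopen`), decoded with `json.load`, and passed to `extract_identifiers`, incrementing `page` until a page comes back empty. Matching on item names only catches uploads that follow the "service-ID" pattern, so it will likely undercount what is already archived.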