[00:11] *** odemg has quit IRC (Read error: Operation timed out)
[00:16] *** odemg has joined #archiveteam-bs
[00:29] *** odemg has quit IRC (Ping timeout: 250 seconds)
[00:42] *** odemg has joined #archiveteam-bs
[00:55] *** espes__ has joined #archiveteam-bs
[01:12] *** BlueMax has quit IRC (Leaving)
[01:22] *** ranavalon has quit IRC (Read error: Connection reset by peer)
[01:22] *** ranavalon has joined #archiveteam-bs
[02:49] *** Mayonaise has quit IRC (Read error: Connection reset by peer)
[02:49] *** Mayonaise has joined #archiveteam-bs
[02:54] *** ndiddy has joined #archiveteam-bs
[04:14] *** qw3rty112 has joined #archiveteam-bs
[04:20] *** qw3rty111 has quit IRC (Read error: Operation timed out)
[04:24] Congrats on the new Warrior (yes, I haven't been paying attention much lately)!
[04:26] *** ndiddy has quit IRC (Read error: Operation timed out)
[05:18] powerKitt: We can kinda use Tubeup to grab them, but it's not easy to grab a whole channel with it.
[05:20] Yeah, since people have already reuploaded videos from it to archive.org, tubeup would crash and die when it got to those videos.
[05:20] I've got an idea though
[05:21] Gonna try and download the "flat playlist" of his uploads using youtube-dl (basically, a list of all his video URLs)
[05:21] Someone seems to already have started doing it.
[05:21] Yeah
[05:22] https://archive.org/search.php?query=creator%3A%22The+Alex+Jones+Channel%22
[05:23] Once I get a list of all of the video URLs though, I'll be able to just make a bash script that'll call tubeup.py on each individual video
[05:23] I wonder if this would be possible to do with the warrior somehow...
[05:23] so that if one has already been uploaded, the resulting crash of tubeup won't kill the whole job.
[05:24] You know that tubeup doesn't fully support channel URLs for grabbing videos in bulk yet, right?
[05:24] Yes.
[05:24] That's why I'm doing this workaround
[05:24] instead of just giving tubeup the channel url
[05:25] What I've been doing is tubeup (videoURL) && tubeup (videoURL) ...
[05:25] the script will give tubeup one of the video urls and then once that job has finished, it will start up another one.
[05:25] Are you doing this on Windows or Linux?
[05:26] I mean, I'm on Windows
[05:26] Also, side question, is it just me, or does the latest version of tubeup for some reason not make a list of video IDs of videos it has already grabbed?
[05:26] but I wouldn't be able to run this anyway
[05:27] I don't have the disk-space to spare for swap space at the moment.
[05:27] I'll make the Bash script for someone else to run.
[05:27] tubeup grabs one video at a time. It downloads a video, uploads it, and then deletes it.
[05:29] https://archive.zhimingwang.org/blog/2014-11-05-list-youtube-playlist-with-youtube-dl.html
[05:30] yeah, but I only have 1.78 GB of space left on my laptop
[05:30] and Alex probably has videos bigger than that
[05:30] (Also I'm on Windows, so tubeup wouldn't work anyway)
[05:32] oh
[05:32] I have an 8 TB drive, but my network connection isn't always stable. :/
[05:33] I need to get an 8TB drive for archival
[05:40] https://cdn.discordapp.com/attachments/155138530920628224/419730708622213121/alexjones.json
[05:40] all his videos
[05:40] *videos' ids
[05:40] I got an 8TB WD Red for $160
[05:40] 4.69 MB UTF-8 encoding
[05:41] hook54321: damn what a deal
[05:42] *** Famicoman has quit IRC (Remote host closed the connection)
[05:46] *** Famicoman has joined #archiveteam-bs
[05:49] powerKitt: How difficult would it be to create a script that tries to find Unlisted videos that were previously not unlisted by looking at the snapshots of the channel's pages through the wayback machine?
[05:50] probably pretty difficult, I'd assume.
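The per-video workaround discussed above (get a flat list of video IDs, then hand tubeup one URL at a time so a crash on an already-uploaded video does not kill the whole job) could be sketched roughly as follows. This is only an illustration under assumptions that the chat does not confirm: that the ID list is one JSON object per line with an "id" field, and that tubeup is invoked as `tubeup <url>`. The function names are hypothetical. Note that chaining jobs with `&&` stops at the first failure, so the sketch records failures and keeps going instead.

```python
import json
import subprocess
from typing import Callable, Iterable, List, Tuple


def video_urls(json_lines: Iterable[str]) -> List[str]:
    """Turn JSON lines (assumed: one object per line with an "id" key) into watch URLs."""
    urls = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        urls.append("https://www.youtube.com/watch?v=" + entry["id"])
    return urls


def run_tubeup(url: str) -> int:
    """Run one tubeup job and return its exit code (assumed CLI form: tubeup <url>)."""
    return subprocess.run(["tubeup", url]).returncode


def upload_each(urls: Iterable[str],
                runner: Callable[[str], int] = run_tubeup) -> List[Tuple[str, int]]:
    """Run the jobs one after another; a failed job is recorded, not fatal."""
    failed = []
    for url in urls:
        try:
            code = runner(url)
        except OSError:
            code = -1  # e.g. the tubeup executable was not found
        if code != 0:
            failed.append((url, code))
    return failed
```

Something like `upload_each(video_urls(open("alexjones.json", encoding="utf-8")))` would then return the list of videos that failed, which can be retried or checked by hand later.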
[07:21] *** ndiddy has joined #archiveteam-bs
[07:26] *** ndiddy has quit IRC (Ping timeout: 250 seconds)
[07:29] so i have passed my number of items from last year
[07:29] i'm at 106k now
[07:30] let's see if i can get past 450k items this year
[07:30] since that was my item number from 2016
[07:37] *** BlueMax has joined #archiveteam-bs
[07:41] *** Stilett0 has quit IRC (Ping timeout: 250 seconds)
[08:14] so i fixed my 2nd broken tape
[08:14] this is one of the tapes i bought
[08:14] not today but from ebay
[08:19] *** Stilett0 has joined #archiveteam-bs
[10:29] *** odemg has quit IRC (Ping timeout: 633 seconds)
[10:37] *** odemg has joined #archiveteam-bs
[10:37] *** odemg has quit IRC (Connection closed)
[10:38] *** odemg has joined #archiveteam-bs
[10:47] *** BlueMax has quit IRC (Leaving)
[12:05] *** jschwart has joined #archiveteam-bs
[12:16] *** fie has joined #archiveteam-bs
[12:25] *** dashcloud has quit IRC (Read error: Operation timed out)
[12:31] *** dashcloud has joined #archiveteam-bs
[12:32] *** Mateon1 has quit IRC (Read error: Operation timed out)
[12:32] *** Mateon1 has joined #archiveteam-bs
[12:54] *** schbirid has joined #archiveteam-bs
[12:59] i have not forgotten about that strava heatmap thing btw
[12:59] just haven't had time to validate the files
[13:13] *** schbirid has quit IRC (Quit: Leaving)
[13:31] *** ndiddy has joined #archiveteam-bs
[13:51] *** Mateon1 has quit IRC (Mateon1)
[13:52] *** Mateon1 has joined #archiveteam-bs
[14:16] *** ranavalon has quit IRC (Read error: Connection reset by peer)
[14:19] *** ranavalon has joined #archiveteam-bs
[14:40] SketchCow: Could you check whether the first WARC for Charlie Rose was transferred to FOS correctly? I haven't used FTP through the command line before, so I'm not entirely sure if this worked correctly (and I can't download the file again, it seems).
The SHA-256 hash of /charlierose.com-videos/charlierose.com-videos-00000.warc.gz should be 599c8fa2311eff5cfc358407fd262642e3db4034af6e10241e5af28a392e53
[14:40] 6f.
[14:40] 599c8fa2311eff5cfc358407fd262642e3db4034af6e10241e5af28a392e536f
[15:00] *** Stiletto has joined #archiveteam-bs
[15:03] *** Stilett0 has quit IRC (Ping timeout: 360 seconds)
[15:07] *** odemg has quit IRC (Read error: Operation timed out)
[15:12] *** Stilett0 has joined #archiveteam-bs
[15:13] *** Stiletto has quit IRC (Ping timeout: 250 seconds)
[15:17] *** RichardG has quit IRC (Read error: Connection reset by peer)
[15:19] *** odemg has joined #archiveteam-bs
[15:19] *** RichardG has joined #archiveteam-bs
[15:20] https://github.com/ariya/phantomjs/issues/15344
[15:20] phantomjs 'suspended'
[15:20] > With that, all the earlier plans regarding PhantomJS 2.5 (from @Vitallium) or 2.1.x (from @pixiuPL) will be abandoned effective immediately. Consequently, the source and binary packages for the above abandoned version will be removed to avoid any confusions. PhantomJS version 2.1.1 will remain the last known stable release until further notice.
[15:21] Well yeah, PhantomJS has been dead for years. I've never been a fan of zombies.
[15:21] lol
[15:34] *** Stiletto has joined #archiveteam-bs
[15:39] *** Stilett0 has quit IRC (Ping timeout: 360 seconds)
[15:46] *** ndiddy has quit IRC (Read error: Operation timed out)
[15:46] *** ndiddy has joined #archiveteam-bs
[16:04] *** fie has quit IRC (Ping timeout: 600 seconds)
[16:11] *** fie has joined #archiveteam-bs
[16:15] *** Stilett0 has joined #archiveteam-bs
[16:16] *** dashcloud has quit IRC (Remote host closed the connection)
[16:17] *** ndiddy has quit IRC (Read error: Operation timed out)
[16:17] *** dashcloud has joined #archiveteam-bs
[16:17] *** Stiletto has quit IRC (Read error: Operation timed out)
[16:22] *** Stiletto has joined #archiveteam-bs
[16:27] *** Stilett0 has quit IRC (Read error: Operation timed out)
[16:33] *** Stilett0 has joined #archiveteam-bs
[16:37] *** Stiletto has quit IRC (Ping timeout: 492 seconds)
[16:37] *** Stiletto has joined #archiveteam-bs
[16:38] *** Stilett0 has quit IRC (Ping timeout: 264 seconds)
[16:44] *** fie has quit IRC (Ping timeout: 246 seconds)
[16:52] *** powerKitt has quit IRC (Quit: powerKitt)
[16:53] *** powerKitt has joined #archiveteam-bs
[16:57] *** fie has joined #archiveteam-bs
[17:17] *** fie has quit IRC (Read error: Operation timed out)
[17:25] *** Stilett0 has joined #archiveteam-bs
[17:26] *** fie has joined #archiveteam-bs
[17:27] *** Stiletto has quit IRC (Read error: Operation timed out)
[17:52] *** keith20 has joined #archiveteam-bs
[18:13] *** Stiletto has joined #archiveteam-bs
[18:17] *** Stilett0 has quit IRC (Read error: Operation timed out)
[18:17] *** Stilett0 has joined #archiveteam-bs
[18:19] *** Stiletto has quit IRC (Read error: Operation timed out)
[18:22] *** Stiletto has joined #archiveteam-bs
[18:25] *** Stilett0 has quit IRC (Read error: Operation timed out)
[18:28] *** powerKit1 has joined #archiveteam-bs
[18:30] *** powerKitt has quit IRC (Ping timeout: 360 seconds)
[18:30] *** powerKit1 is now known as powerKitt
[18:39] *** Stilett0 has joined #archiveteam-bs
[18:40] *** dashcloud has quit IRC (Read error: Operation timed out)
[18:41] *** dashcloud has joined #archiveteam-bs
[18:42] *** Stiletto has quit IRC (Read error: Operation timed out)
[18:50] *** Stiletto has joined #archiveteam-bs
[18:50] *** Stilett0 has quit IRC (Read error: Operation timed out)
[18:55] *** bithippo has joined #archiveteam-bs
[19:04] *** Stiletto has quit IRC (Ping timeout: 246 seconds)
[19:05] *** Stilett0 has joined #archiveteam-bs
[19:13] Anyone know if there's a best practice for archiving GitHub projects that have gone into read only/archived mode?
[19:14] (https://github.com/ariya/phantomjs/issues/15344)
[19:34] *** Polylith_ is now known as Polylith
[19:37] I would recommend using the GitHub API to grab the issues and pull requests as json files, and making a git clone --mirror of the repository
[19:39] *** keith20 has quit IRC (Read error: Operation timed out)
[19:39] https://archive.org/details/Nicksergeantsnipt.git
[19:39] here's an example archival.
[19:41] *** JSharp has joined #archiveteam-bs
[19:46] Yes, git clone --mirror + git bundle (with --all) is probably the best way to archive a git repo.
[19:46] The wiki is also a git repo, so it works for that as well.
[19:47] *** Stilett0 has quit IRC (Read error: Operation timed out)
[19:48] *** Stilett0 has joined #archiveteam-bs
[19:55] *** keith20 has joined #archiveteam-bs
[19:56] Thank you!
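The `git clone --mirror` plus `git bundle --all` suggestion above could be sketched like this. It is a minimal illustration, not the workflow anyone in the channel actually ran: the function names are hypothetical, `name` is assumed to be a plain directory name without slashes, and the `.wiki.git` URL form is the usual GitHub convention rather than something stated in the chat. The verify step runs inside the mirror because `git bundle verify` expects to be run in a repository.

```python
import subprocess
from typing import List


def wiki_url(repo_url: str) -> str:
    """Derive the wiki repository URL (assumed GitHub convention: <repo>.wiki.git)."""
    base = repo_url[:-4] if repo_url.endswith(".git") else repo_url.rstrip("/")
    return base + ".wiki.git"


def mirror_and_bundle_commands(repo_url: str, name: str) -> List[List[str]]:
    """Build the git commands: mirror clone, bundle every ref, verify the bundle."""
    mirror_dir = name + ".git"
    bundle_path = "../" + name + ".bundle"  # relative to mirror_dir, so it lands next to it
    return [
        ["git", "clone", "--mirror", repo_url, mirror_dir],
        ["git", "-C", mirror_dir, "bundle", "create", bundle_path, "--all"],
        ["git", "-C", mirror_dir, "bundle", "verify", bundle_path],
    ]


def run_commands(commands: List[List[str]]) -> None:
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run_commands(mirror_and_bundle_commands("https://github.com/ariya/phantomjs.git", "phantomjs"))` would leave a single `phantomjs.bundle` file that can be uploaded as one item; the same two functions can be pointed at `wiki_url(...)` for the wiki. Issues and pull requests are not part of the git data and would still need to be fetched separately, as suggested in the chat.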
[20:04] *** Stiletto has joined #archiveteam-bs
[20:07] *** Stilett0 has quit IRC (Read error: Operation timed out)
[20:20] *** Asparagir has joined #archiveteam-bs
[20:32] *** antomatic has joined #archiveteam-bs
[20:33] *** SmileyG has joined #archiveteam-bs
[20:34] *** Smiley has quit IRC (Read error: Connection reset by peer)
[20:34] *** Asparagir has quit IRC (Asparagir)
[20:34] *** antomati_ has quit IRC (Read error: Operation timed out)
[20:36] *** odemg has quit IRC (Read error: Operation timed out)
[20:45] *** odemg has joined #archiveteam-bs
[21:00] *** keith20 has quit IRC (byeee)
[21:12] *** ndiddy has joined #archiveteam-bs
[21:43] *** BlueMax has joined #archiveteam-bs
[22:39] *** fie has quit IRC (Ping timeout: 244 seconds)
[22:45] *** jschwart has quit IRC (Quit: Konversation terminated!)
[22:50] *** ndiddy has quit IRC (Ping timeout: 492 seconds)
[22:59] *** fie has joined #archiveteam-bs
[23:12] where can I read more about the updated warrior vm v3?
[23:12] *** Ravenloft has joined #archiveteam-bs
[23:13] SketchCow: See above, can you run sha256sum charlierose.com-videos-00000.warc.gz on FOS please? My machine's out of disk so the grab is currently paused, and I'd like to make sure everything's fine before I delete anything on my side.
[23:15] *** ndiddy has joined #archiveteam-bs
[23:15] *** Stiletto has quit IRC (Read error: Operation timed out)
[23:24] (If you don't have sha256sum for some reason, another checksum is also fine for me.)
[23:26] *** Stilett0 has joined #archiveteam-bs
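For the case where `sha256sum` is not available on the receiving side, the same check can be done with Python's standard `hashlib`. The sketch below reads the file in chunks so a large WARC does not have to fit in memory; the function names are hypothetical, and the expected value is the hash quoted earlier in the log.

```python
import hashlib

# Hash quoted in the log for charlierose.com-videos-00000.warc.gz
EXPECTED_SHA256 = "599c8fa2311eff5cfc358407fd262642e3db4034af6e10241e5af28a392e536f"


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA-256 with an expected hex digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Running `matches("charlierose.com-videos-00000.warc.gz", EXPECTED_SHA256)` on the uploaded copy would confirm whether the transfer was intact before anything is deleted locally.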