#archiveteam-bs 2018-03-04,Sun

Time Nickname Message
00:11 🔗 odemg has quit IRC (Read error: Operation timed out)
00:16 🔗 odemg has joined #archiveteam-bs
00:29 🔗 odemg has quit IRC (Ping timeout: 250 seconds)
00:42 🔗 odemg has joined #archiveteam-bs
00:55 🔗 espes__ has joined #archiveteam-bs
01:12 🔗 BlueMax has quit IRC (Leaving)
01:22 🔗 ranavalon has quit IRC (Read error: Connection reset by peer)
01:22 🔗 ranavalon has joined #archiveteam-bs
02:49 🔗 Mayonaise has quit IRC (Read error: Connection reset by peer)
02:49 🔗 Mayonaise has joined #archiveteam-bs
02:54 🔗 ndiddy has joined #archiveteam-bs
04:14 🔗 qw3rty112 has joined #archiveteam-bs
04:20 🔗 qw3rty111 has quit IRC (Read error: Operation timed out)
04:24 🔗 Somebody2 Congrats on the new Warrior (yes, I haven't been paying attention much lately)!
04:26 🔗 ndiddy has quit IRC (Read error: Operation timed out)
05:18 🔗 hook54321 powerKitt: We can kinda use Tubeup to grab them, but it's not easy to grab a whole channel with it.
05:20 🔗 powerKitt Yeah, since people have already reuploaded videos from it to archive.org, tubeup would crash and die when it got to those videos.
05:20 🔗 powerKitt I've got an idea though
05:21 🔗 powerKitt Gonna try and download the "flat playlist" of his uploads using youtube-dl (basically, a list of all his video URLs)
05:21 🔗 hook54321 Someone seems to already have started doing it.
05:21 🔗 powerKitt Yeah
05:22 🔗 hook54321 https://archive.org/search.php?query=creator%3A%22The+Alex+Jones+Channel%22
05:23 🔗 powerKitt Once I get a list of all of the video URLs though, I'll be able to just make a bash script that'll call tubeup.py on each individual video
05:23 🔗 hook54321 I wonder if this would be possible to do with the warrior somehow...
05:23 🔗 powerKitt so that if one has already been uploaded, the resulting crash of tubeup won't kill the whole job.
05:24 🔗 hook54321 You know that tubeup doesn't fully support channel URLs for grabbing videos in bulk yet, right?
05:24 🔗 powerKitt Yes.
05:24 🔗 powerKitt That's why I'm doing this workaround
05:24 🔗 powerKitt instead of just giving tubeup the channel url
05:25 🔗 hook54321 What I've been doing is tubeup (videoURL) && tubeup (videoURL) ...
05:25 🔗 powerKitt the script will give tubeup one of the video urls and then once that job has finished, it will start up another one.
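A minimal sketch of the workaround powerKitt describes, written in Python rather than Bash so the loop is easy to test. It assumes the flat playlist was saved with `youtube-dl --flat-playlist -j`, which writes one JSON object per line, and that each object carries the video ID in an `id` field. The helper names and the `tubeup <url>` invocation are illustrative assumptions, not tubeup's documented interface. Unlike chaining jobs with `&&`, which stops at the first failure, this loop records a failed job and moves on to the next one.

```python
import json
import subprocess


def video_urls(lines):
    """Turn flat-playlist JSON lines into watch URLs (assumes an "id" field)."""
    urls = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        urls.append("https://www.youtube.com/watch?v=" + entry["id"])
    return urls


def run_tubeup(url):
    """Run one tubeup job and return its exit code (assumed CLI: tubeup <url>)."""
    return subprocess.call(["tubeup", url])


def archive_all(urls, runner=run_tubeup):
    """Run the jobs one at a time; a failed job is recorded, not fatal."""
    failed = []
    for url in urls:
        try:
            code = runner(url)
        except OSError:
            code = 1  # for example, tubeup is not installed
        if code != 0:
            failed.append(url)
    return failed


def main(path):
    """Hypothetical entry point: path is the saved flat-playlist file."""
    with open(path, encoding="utf-8") as handle:
        for url in archive_all(video_urls(handle)):
            print("failed:", url)
```

Calling `main("alexjones.json")` would work through the whole list and print only the URLs whose job exited with an error, so they can be retried or checked against archive.org by hand.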
05:25 🔗 hook54321 Are you doing this on Windows or Linux?
05:26 🔗 powerKitt I mean, I'm on Windows
05:26 🔗 hook54321 Also, side question, is it just me, or does the latest version of tubeup for some reason not make a list of video IDs of videos it has already grabbed?
05:26 🔗 powerKitt but I wouldn't be able to run this anyway
05:27 🔗 powerKitt I don't have the disk-space to spare for swap space at the moment.
05:27 🔗 powerKitt I'll make the Bash script for someone else to run.
05:27 🔗 hook54321 tubeup grabs one video at a time. It downloads a video, uploads it, and then deletes it.
05:29 🔗 powerKitt https://archive.zhimingwang.org/blog/2014-11-05-list-youtube-playlist-with-youtube-dl.html
05:30 🔗 powerKitt yeah, but I only have 1.78 GB of space left on my laptop
05:30 🔗 powerKitt and Alex probably has videos bigger than that
05:30 🔗 powerKitt (Also I'm on Windows, so tubeup wouldn't work anyway)
05:32 🔗 hook54321 oh
05:32 🔗 hook54321 I have an 8 TB drive, but my network connection isn't always stable. :/
05:33 🔗 powerKitt I need to get an 8TB drive for archival
05:40 🔗 powerKitt https://cdn.discordapp.com/attachments/155138530920628224/419730708622213121/alexjones.json
05:40 🔗 powerKitt all his videos
05:40 🔗 powerKitt *videos' ids
05:40 🔗 hook54321 I got an 8TB WD Red for $160
05:40 🔗 powerKitt 4.69 MB UTF-8 encoding
05:41 🔗 powerKitt hook54321: damn what a deal
05:42 🔗 Famicoman has quit IRC (Remote host closed the connection)
05:46 🔗 Famicoman has joined #archiveteam-bs
05:49 🔗 hook54321 powerKitt: How difficult would it be to create a script that tries to find Unlisted videos that were previously not unlisted by looking at the snapshots of the channel's pages through the wayback machine?
05:50 🔗 powerKitt probably pretty difficult, I'd assume.
07:21 🔗 ndiddy has joined #archiveteam-bs
07:26 🔗 ndiddy has quit IRC (Ping timeout: 250 seconds)
07:29 🔗 godane so i have passed my number of items from last year
07:29 🔗 godane i'm at 106k now
07:30 🔗 godane lets see if i can get past 450k items this year
07:30 🔗 godane since that was my item number from 2016
07:37 🔗 BlueMax has joined #archiveteam-bs
07:41 🔗 Stilett0 has quit IRC (Ping timeout: 250 seconds)
08:14 🔗 godane so i fixed my 2nd broken tape
08:14 🔗 godane this is one of the tapes i bought
08:14 🔗 godane not today but from ebay
08:19 🔗 Stilett0 has joined #archiveteam-bs
10:29 🔗 odemg has quit IRC (Ping timeout: 633 seconds)
10:37 🔗 odemg has joined #archiveteam-bs
10:37 🔗 odemg has quit IRC (Connection closed)
10:38 🔗 odemg has joined #archiveteam-bs
10:47 🔗 BlueMax has quit IRC (Leaving)
12:05 🔗 jschwart has joined #archiveteam-bs
12:16 🔗 fie has joined #archiveteam-bs
12:25 🔗 dashcloud has quit IRC (Read error: Operation timed out)
12:31 🔗 dashcloud has joined #archiveteam-bs
12:32 🔗 Mateon1 has quit IRC (Read error: Operation timed out)
12:32 🔗 Mateon1 has joined #archiveteam-bs
12:54 🔗 schbirid has joined #archiveteam-bs
12:59 🔗 schbirid i have not forgotten about that strava heatmap thing btw
12:59 🔗 schbirid just havent had time to validate the files
13:13 🔗 schbirid has quit IRC (Quit: Leaving)
13:31 🔗 ndiddy has joined #archiveteam-bs
13:51 🔗 Mateon1 has quit IRC (Mateon1)
13:52 🔗 Mateon1 has joined #archiveteam-bs
14:16 🔗 ranavalon has quit IRC (Read error: Connection reset by peer)
14:19 🔗 ranavalon has joined #archiveteam-bs
14:40 🔗 JAA SketchCow: Could you check whether the first WARC for Charlie Rose was transferred to FOS correctly? I haven't used FTP through the command line before, so I'm not entirely sure if this worked correctly (and I can't download the file again, it seems). The SHA-256 hash of /charlierose.com-videos/charlierose.com-videos-00000.warc.gz should be 599c8fa2311eff5cfc358407fd262642e3db4034af6e10241e5af28a392e536f.
14:40 🔗 JAA 599c8fa2311eff5cfc358407fd262642e3db4034af6e10241e5af28a392e536f
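The check JAA asks for can be sketched in Python with the standard `hashlib` module, as an alternative to running `sha256sum` on the receiving side. The function names are hypothetical; the expected value is the hash quoted above. The file is read in chunks so a large WARC does not have to fit in memory.

```python
import hashlib

# Hash quoted by JAA for charlierose.com-videos-00000.warc.gz
EXPECTED = "599c8fa2311eff5cfc358407fd262642e3db4034af6e10241e5af28a392e536f"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=EXPECTED):
    """True if the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected.strip().lower()
```

If `matches("charlierose.com-videos-00000.warc.gz")` returns True on the copy that reached the server, the local copy should be safe to delete.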
15:00 🔗 Stiletto has joined #archiveteam-bs
15:03 🔗 Stilett0 has quit IRC (Ping timeout: 360 seconds)
15:07 🔗 odemg has quit IRC (Read error: Operation timed out)
15:12 🔗 Stilett0 has joined #archiveteam-bs
15:13 🔗 Stiletto has quit IRC (Ping timeout: 250 seconds)
15:17 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
15:19 🔗 odemg has joined #archiveteam-bs
15:19 🔗 RichardG has joined #archiveteam-bs
15:20 🔗 Smiley https://github.com/ariya/phantomjs/issues/15344
15:20 🔗 Smiley phantomjs 'suspended'
15:20 🔗 Smiley > With that, all the earlier plans regarding PhantomJS 2.5 (from @Vitallium) or 2.1.x (from @pixiuPL) will be abandoned effective immediately. Consequently, the source and binary packages for the above abandoned version will be removed to avoid any confusions. PhantomJS version 2.1.1 will remain the last known stable release until further notice.
15:21 🔗 JAA Well yeah, PhantomJS has been dead for years. I've never been a fan of zombies.
15:21 🔗 Smiley lol
15:34 🔗 Stiletto has joined #archiveteam-bs
15:39 🔗 Stilett0 has quit IRC (Ping timeout: 360 seconds)
15:46 🔗 ndiddy has quit IRC (Read error: Operation timed out)
15:46 🔗 ndiddy has joined #archiveteam-bs
16:04 🔗 fie has quit IRC (Ping timeout: 600 seconds)
16:11 🔗 fie has joined #archiveteam-bs
16:15 🔗 Stilett0 has joined #archiveteam-bs
16:16 🔗 dashcloud has quit IRC (Remote host closed the connection)
16:17 🔗 ndiddy has quit IRC (Read error: Operation timed out)
16:17 🔗 dashcloud has joined #archiveteam-bs
16:17 🔗 Stiletto has quit IRC (Read error: Operation timed out)
16:22 🔗 Stiletto has joined #archiveteam-bs
16:27 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
16:33 🔗 Stilett0 has joined #archiveteam-bs
16:37 🔗 Stiletto has quit IRC (Ping timeout: 492 seconds)
16:37 🔗 Stiletto has joined #archiveteam-bs
16:38 🔗 Stilett0 has quit IRC (Ping timeout: 264 seconds)
16:44 🔗 fie has quit IRC (Ping timeout: 246 seconds)
16:52 🔗 powerKitt has quit IRC (Quit: powerKitt)
16:53 🔗 powerKitt has joined #archiveteam-bs
16:57 🔗 fie has joined #archiveteam-bs
17:17 🔗 fie has quit IRC (Read error: Operation timed out)
17:25 🔗 Stilett0 has joined #archiveteam-bs
17:26 🔗 fie has joined #archiveteam-bs
17:27 🔗 Stiletto has quit IRC (Read error: Operation timed out)
17:52 🔗 keith20 has joined #archiveteam-bs
18:13 🔗 Stiletto has joined #archiveteam-bs
18:17 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
18:17 🔗 Stilett0 has joined #archiveteam-bs
18:19 🔗 Stiletto has quit IRC (Read error: Operation timed out)
18:22 🔗 Stiletto has joined #archiveteam-bs
18:25 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
18:28 🔗 powerKit1 has joined #archiveteam-bs
18:30 🔗 powerKitt has quit IRC (Ping timeout: 360 seconds)
18:30 🔗 powerKit1 is now known as powerKitt
18:39 🔗 Stilett0 has joined #archiveteam-bs
18:40 🔗 dashcloud has quit IRC (Read error: Operation timed out)
18:41 🔗 dashcloud has joined #archiveteam-bs
18:42 🔗 Stiletto has quit IRC (Read error: Operation timed out)
18:50 🔗 Stiletto has joined #archiveteam-bs
18:50 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
18:55 🔗 bithippo has joined #archiveteam-bs
19:04 🔗 Stiletto has quit IRC (Ping timeout: 246 seconds)
19:05 🔗 Stilett0 has joined #archiveteam-bs
19:13 🔗 bithippo Anyone know if there's a best practice for archiving Github projects that have gone into read only/archived mode?
19:14 🔗 bithippo (https://github.com/ariya/phantomjs/issues/15344)
19:34 🔗 Polylith_ is now known as Polylith
19:37 🔗 powerKitt I would recommend using the GitHub API to grab the issues and pull requests as json files, and making a git clone --mirror of the repository
19:39 🔗 keith20 has quit IRC (Read error: Operation timed out)
19:39 🔗 powerKitt https://archive.org/details/Nicksergeantsnipt.git
19:39 🔗 powerKitt here's an example archival.
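A rough sketch of the API half of powerKitt's suggestion, assuming the GitHub REST endpoint `GET /repos/{owner}/{repo}/issues`, which returns pull requests alongside issues when `state=all` is set. The function names and the User-Agent string are made up for illustration, and unauthenticated requests are rate-limited, so a large project would likely need a token.

```python
import json
import urllib.request

API = "https://api.github.com"


def issues_url(owner, repo, page):
    """Build the URL for one page of issues and pull requests."""
    return f"{API}/repos/{owner}/{repo}/issues?state=all&per_page=100&page={page}"


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    request = urllib.request.Request(url, headers={"User-Agent": "issue-archiver-sketch"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def all_issues(owner, repo, fetch=fetch_json):
    """Collect every page until the API returns an empty list."""
    items = []
    page = 1
    while True:
        batch = fetch(issues_url(owner, repo, page))
        if not batch:
            return items
        items.extend(batch)
        page += 1


def save_issues(owner, repo, path):
    """Write all issues and pull requests to one JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(all_issues(owner, repo), handle, indent=2)
```

For the repository bithippo linked, `save_issues("ariya", "phantomjs", "phantomjs-issues.json")` would be the call. Issue comments live under a separate endpoint and are not covered by this sketch.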
19:41 🔗 JSharp has joined #archiveteam-bs
19:46 🔗 JAA Yes, git clone --mirror + git bundle (with --all) is probably the best way to archive a git repo.
19:46 🔗 JAA The wiki is also a git repo, so it works for that as well.
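The two git steps JAA names can be sketched as a pair of commands driven from Python. The helper names and output filenames are hypothetical, and `wiki_url` assumes GitHub's convention of serving a project's wiki as `<repo>.wiki.git`.

```python
import subprocess


def wiki_url(repo_url):
    """Guess the wiki repository URL (assumes GitHub's <repo>.wiki.git layout)."""
    base = repo_url[:-4] if repo_url.endswith(".git") else repo_url
    return base + ".wiki.git"


def mirror_commands(repo_url, name):
    """Commands for a full mirror clone plus a single-file bundle of all refs."""
    mirror_dir = name + ".git"
    return [
        ["git", "clone", "--mirror", repo_url, mirror_dir],
        # --git-dir leaves the working directory alone, so the bundle file
        # is written next to the mirror directory.
        ["git", "--git-dir", mirror_dir, "bundle", "create", name + ".bundle", "--all"],
    ]


def archive_repo(repo_url, name):
    """Run the commands in order, stopping if one of them fails."""
    for command in mirror_commands(repo_url, name):
        subprocess.check_call(command)
```

Calling `archive_repo(url, "phantomjs")` and then `archive_repo(wiki_url(url), "phantomjs.wiki")` would leave two `.bundle` files, each of which can be restored later with a plain `git clone` of the bundle.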
19:47 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
19:48 🔗 Stilett0 has joined #archiveteam-bs
19:55 🔗 keith20 has joined #archiveteam-bs
19:56 🔗 bithippo Thank you!
20:04 🔗 Stiletto has joined #archiveteam-bs
20:07 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
20:20 🔗 Asparagir has joined #archiveteam-bs
20:32 🔗 antomatic has joined #archiveteam-bs
20:33 🔗 SmileyG has joined #archiveteam-bs
20:34 🔗 Smiley has quit IRC (Read error: Connection reset by peer)
20:34 🔗 Asparagir has quit IRC (Asparagir)
20:34 🔗 antomati_ has quit IRC (Read error: Operation timed out)
20:36 🔗 odemg has quit IRC (Read error: Operation timed out)
20:45 🔗 odemg has joined #archiveteam-bs
21:00 🔗 keith20 has quit IRC (byeee)
21:12 🔗 ndiddy has joined #archiveteam-bs
21:43 🔗 BlueMax has joined #archiveteam-bs
22:39 🔗 fie has quit IRC (Ping timeout: 244 seconds)
22:45 🔗 jschwart has quit IRC (Quit: Konversation terminated!)
22:50 🔗 ndiddy has quit IRC (Ping timeout: 492 seconds)
22:59 🔗 fie has joined #archiveteam-bs
23:12 🔗 dashcloud where can I read more about the updated warrior vm v3?
23:12 🔗 Ravenloft has joined #archiveteam-bs
23:13 🔗 JAA SketchCow: See above, can you run sha256sum charlierose.com-videos-00000.warc.gz on FOS please? My machine's out of disk so the grab is currently paused, and I'd like to make sure everything's fine before I delete anything on my side.
23:15 🔗 ndiddy has joined #archiveteam-bs
23:15 🔗 Stiletto has quit IRC (Read error: Operation timed out)
23:24 🔗 JAA (If you don't have sha256sum for some reason, another checksum is also fine for me.)
23:26 🔗 Stilett0 has joined #archiveteam-bs
