[00:17] *** ats has quit IRC (Read error: Operation timed out)
[00:20] *** ats has joined #archiveteam-ot
[00:37] *** qnisz has joined #archiveteam-ot
[00:42] *** qnicw has quit IRC (Read error: Operation timed out)
[01:08] *** akierig has joined #archiveteam-ot
[01:15] *** robogoat has quit IRC (Read error: Operation timed out)
[01:16] *** BlueMax has joined #archiveteam-ot
[01:16] *** yawkat has quit IRC (Ping timeout: 252 seconds)
[01:16] *** robogoat has joined #archiveteam-ot
[01:26] *** akierig_ has joined #archiveteam-ot
[01:28] *** yawkat has joined #archiveteam-ot
[01:32] *** akierig has quit IRC (Read error: Operation timed out)
[01:46] *** akierig_ has quit IRC (Remote host closed the connection)
[01:47] *** akierig has joined #archiveteam-ot
[01:49] *** akierig_ has joined #archiveteam-ot
[01:49] *** akierig has quit IRC (Read error: Connection reset by peer)
[02:07] *** akierig has joined #archiveteam-ot
[02:07] *** bluefoo has quit IRC (Read error: Connection reset by peer)
[02:13] *** akierig_ has quit IRC (Read error: Operation timed out)
[02:39] *** akierig has quit IRC (Quit: later_gator)
[04:18] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[04:27] *** qnisz has quit IRC (Ping timeout: 496 seconds)
[04:32] *** qw3rty2 has joined #archiveteam-ot
[04:37] *** qw3rty has quit IRC (Ping timeout: 745 seconds)
[04:38] *** odemg has quit IRC (Ping timeout: 745 seconds)
[04:42] *** odemg has joined #archiveteam-ot
[05:36] *** dhyan_nat has joined #archiveteam-ot
[06:14] *** markedL7 has joined #archiveteam-ot
[06:16] *** markedL has quit IRC (Read error: Operation timed out)
[06:16] *** markedL7 is now known as markedL
[06:29] *** dhyan_nat has quit IRC (Quit: Konversation terminated!)
[06:29] *** dhyan_nat has joined #archiveteam-ot
[06:35] *** m007a83 has joined #archiveteam-ot
[07:14] *** RSY00O has joined #archiveteam-ot
[07:14] [02:12] Hi.
[02:13] I am wondering if it
[02:13] excuse me
[02:13] I am wondering if it's possible to mass archive every YT video, but just everything from before the beginning of 2010. I.e. all of 2000s YouTube.
[02:14] my friend told me 2000s YT would be ~385TB but it was a rough estimate
[07:15] RSY00O: do you know how many videos that was?
[07:16] I have an extensive YouTube archiving thing going on in #youtubearchive
[07:16] https://www.archiveteam.org/index.php?title=YouTube this page says "Little is known about its database, but according to data from 2006, it was 45TB and doubling every 4 months. At this rate it would be 660 Petabytes (Oct 2014) by now."
[07:17] I'm not immediately on board but getting all 2006-2009 YouTube but if you sample the content and think it's more good than bad, maybe
[07:18] 2005-2009
[07:20] there's this which might help find the old stuff https://old.reddit.com/r/DataHoarder/comments/906884/youtube_metadata_archive_because_working_with/
[07:21] well the scope of 00s YT videos gets smaller every day. we would need to get it ASAP
[07:21] especially before 2021. hopefully this whole article 17 thing won't make the scope shrink considerably. but we'll have to see
[07:25] if I were somehow able to successfully get all the videos (pretty hard stuff) I would need to put them on like 25 individual 16 TB hard-drives to save them locally which would cost around $12,000
[07:25] and that's not including backups
[07:26] but the YTPs and Unregistered Hypercam 2 vids must be saved!
[07:26] are you any good at writing software or data modeling
[07:27] Nope.
[07:27] too bad, I am really looking for someone to help with the YouTube I've got
[07:27] I'm assuming I need a web crawling script to work with youtube-dl?
[07:28] I really respect what you guys are doing btw really great stuff, it'll take a few years before I can help that much with these painstaking efforts.
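The storage figures quoted above (a rough 385 TB estimate, about 25 drives of 16 TB each, around $12,000) can be sanity-checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, decimal terabytes are assumed, and the per-drive price is simply derived from the quoted total rather than taken from any price list.

```python
import math


def drives_needed(total_tb: float, drive_tb: float) -> int:
    """Smallest whole number of drives that can hold total_tb."""
    return math.ceil(total_tb / drive_tb)


def implied_price_per_drive(total_cost: float, drive_count: int) -> float:
    """Price per drive implied by a quoted total cost."""
    return total_cost / drive_count


# Figures taken from the chat; all of them are rough estimates.
ESTIMATE_TB = 385
DRIVE_TB = 16
QUOTED_COST_USD = 12_000

count = drives_needed(ESTIMATE_TB, DRIVE_TB)
print(count)                                            # 25
print(implied_price_per_drive(QUOTED_COST_USD, count))  # 480.0
```

So the "like 25" drives matches the 385 TB estimate with almost no headroom (25 x 16 TB = 400 TB), and as noted in the chat, backups would add to both numbers.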
[07:29] you'd have to scrape the upload playlists of a lot of channels and load them into a database and youtube-dl the pre-2010 stuff
[07:29] just found https://github.com/simon987/yt-metadata via that reddit link
[07:30] also you need hundreds of IPs to archive YouTube these days
[07:30] you get about 500-1000 videos per day per IP
[07:30] well thanks, for now with youtube-dl I can archive individual channels at the very least automatically right?
[07:31] idk how that works my computer has issues that need to be worked out, so I couldn't install it yet
[07:31] so once I watch one video from that time, just grab the entire channel's videos (up to 2010) and add them to my archive
[07:32] so I don't have to download each video and copy the metadata manually, which sucks
[07:52] youtube-dl can grab channels yes
[07:59] *** deevious has joined #archiveteam-ot
[08:03] *** HP_Archiv has joined #archiveteam-ot
[08:03] Hey, so I've got HexChat installed
[08:03] I'm trying to connect to the EFnet but I can't for some reason
[08:03] Any thoughts?
[08:04] Never mind, disregard that. Got it.
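The per-channel approach discussed above (grab a whole channel, keep only uploads from before 2010, save the metadata automatically) could look roughly like the sketch below. It only builds a youtube-dl command line and does not run it. The flags used (`--datebefore`, `--download-archive`, `--write-info-json`, `--ignore-errors`, `-o`) are standard youtube-dl options, but the channel URL, the archive file name, and the output template are placeholders chosen for illustration. The second helper turns the "500-1000 videos per day per IP" figure into a rough time estimate; the video count passed to it is hypothetical, since nobody in the chat knew the real number.

```python
import math
import shlex


def build_channel_command(channel_url, cutoff="20091231",
                          archive_file="downloaded.txt"):
    """Build a youtube-dl invocation for one channel's pre-2010 uploads."""
    return [
        "youtube-dl",
        "--datebefore", cutoff,              # only videos uploaded on or before this date
        "--download-archive", archive_file,  # record finished IDs so reruns skip them
        "--write-info-json",                 # save each video's metadata next to it
        "--ignore-errors",                   # keep going past deleted or blocked videos
        "-o", "%(uploader)s/%(upload_date)s-%(id)s.%(ext)s",
        channel_url,
    ]


def days_to_fetch(video_count, ip_count, per_ip_per_day=500):
    """Whole days needed at a fixed per-IP daily rate (low end of 500-1000)."""
    return math.ceil(video_count / (ip_count * per_ip_per_day))


# Placeholder URL, not a real channel.
cmd = build_channel_command("https://www.youtube.com/user/EXAMPLE_CHANNEL")
print(" ".join(shlex.quote(part) for part in cmd))

# Hypothetical workload: one million videos spread over 100 IPs.
print(days_to_fetch(1_000_000, 100))  # 20
```

Passing the list to `subprocess.run(cmd)` would start the download on a machine with youtube-dl installed. The date filter works from each video's own metadata, so a large channel will likely still take a while to walk through even when most of its uploads are newer than the cutoff.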
[08:24] At the rate YouTube deletes videos, that 385 TB should only be a few gigs today :p
[08:24] (snark re RSY00O)
[08:28] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[09:09] *** HP_Archiv has quit IRC (Quit: Page closed)
[09:51] *** Jens has quit IRC (Remote host closed the connection)
[09:51] *** Jens has joined #archiveteam-ot
[11:34] multilingual keyword spreadsheet that I made for youtube searching https://docs.google.com/spreadsheets/d/1fFqfhJjpZsCNuL9_uvRpwpe40onVoRE1RsflJsiOKKc/edit?usp=sharing
[11:35] I guess the non-insane way to archive this stuff would be to have scripts hit search and look for high-view videos
[13:33] *** vitzli has joined #archiveteam-ot
[14:59] *** bluefoo has joined #archiveteam-ot
[15:03] *** akierig has joined #archiveteam-ot
[15:53] *** deevious has quit IRC (Remote host closed the connection)
[15:55] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[16:01] *** RSY00O has quit IRC (Ping timeout: 260 seconds)
[16:19] *** akierig has quit IRC (Quit: later_gator)
[16:28] *** SketchCow has quit IRC (Read error: Connection reset by peer)
[16:31] *** SketchCow has joined #archiveteam-ot
[16:31] *** Fusl__ sets mode: +o SketchCow
[16:31] *** Fusl sets mode: +o SketchCow
[16:31] *** Fusl_ sets mode: +o SketchCow
[16:37] *** Hani111 has joined #archiveteam-ot
[16:47] *** Hani has quit IRC (Ping timeout: 745 seconds)
[16:47] *** Hani111 is now known as Hani
[17:49] *** icedice has joined #archiveteam-ot
[17:54] *** iceloops1 has joined #archiveteam-ot
[17:55] *** prq has joined #archiveteam-ot
[18:08] *** akierig has joined #archiveteam-ot
[18:41] *** vitzli has quit IRC (Quit: Leaving)
[18:51] *** icedice has quit IRC (Ping timeout: 252 seconds)
[19:19] *** Hani111 has joined #archiveteam-ot
[19:23] *** Hani has quit IRC (Ping timeout: 745 seconds)
[19:23] *** Hani111 is now known as Hani
[19:46] *** icedice has joined #archiveteam-ot
[20:24] *** akierig has quit IRC (Read error: Operation timed out)
[20:59] *** odemg has quit IRC (Ping timeout: 745 seconds)
[21:00] *** odemg has joined #archiveteam-ot
[21:08] *** dhyan_nat has joined #archiveteam-ot
[21:35] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[22:37] *** X-Scale` has joined #archiveteam-ot
[22:40] *** X-Scale has quit IRC (Read error: Operation timed out)
[22:40] *** X-Scale` is now known as X-Scale
[23:10] *** BlueMax has joined #archiveteam-ot