#archiveteam-ot 2019-11-15,Fri


Time Nickname Message
00:17 🔗 ats has quit IRC (Read error: Operation timed out)
00:20 🔗 ats has joined #archiveteam-ot
00:37 🔗 qnisz has joined #archiveteam-ot
00:42 🔗 qnicw has quit IRC (Read error: Operation timed out)
01:08 🔗 akierig has joined #archiveteam-ot
01:15 🔗 robogoat has quit IRC (Read error: Operation timed out)
01:16 🔗 BlueMax has joined #archiveteam-ot
01:16 🔗 yawkat has quit IRC (Ping timeout: 252 seconds)
01:16 🔗 robogoat has joined #archiveteam-ot
01:26 🔗 akierig_ has joined #archiveteam-ot
01:28 🔗 yawkat has joined #archiveteam-ot
01:32 🔗 akierig has quit IRC (Read error: Operation timed out)
01:46 🔗 akierig_ has quit IRC (Remote host closed the connection)
01:47 🔗 akierig has joined #archiveteam-ot
01:49 🔗 akierig_ has joined #archiveteam-ot
01:49 🔗 akierig has quit IRC (Read error: Connection reset by peer)
02:07 🔗 akierig has joined #archiveteam-ot
02:07 🔗 bluefoo has quit IRC (Read error: Connection reset by peer)
02:13 🔗 akierig_ has quit IRC (Read error: Operation timed out)
02:39 🔗 akierig has quit IRC (Quit: later_gator)
04:18 🔗 DogsRNice has quit IRC (Read error: Connection reset by peer)
04:27 🔗 qnisz has quit IRC (Ping timeout: 496 seconds)
04:32 🔗 qw3rty2 has joined #archiveteam-ot
04:37 🔗 qw3rty has quit IRC (Ping timeout: 745 seconds)
04:38 🔗 odemg has quit IRC (Ping timeout: 745 seconds)
04:42 🔗 odemg has joined #archiveteam-ot
05:36 🔗 dhyan_nat has joined #archiveteam-ot
06:14 🔗 markedL7 has joined #archiveteam-ot
06:16 🔗 markedL has quit IRC (Read error: Operation timed out)
06:16 🔗 markedL7 is now known as markedL
06:29 🔗 dhyan_nat has quit IRC (Quit: Konversation terminated!)
06:29 🔗 dhyan_nat has joined #archiveteam-ot
06:35 🔗 m007a83 has joined #archiveteam-ot
07:14 🔗 RSY00O has joined #archiveteam-ot
07:14 🔗 RSY00O Hi. I am wondering if it's possible to mass archive every YT video, but just everything from before the beginning of 2010, i.e. all of 2000s YouTube. My friend told me 2000s YT would be ~385TB, but it was a rough estimate.
07:15 🔗 ivan RSY00O: do you know how many videos that was?
07:16 🔗 ivan I have an extensive YouTube archiving thing going on in #youtubearchive
07:16 🔗 RSY00O https://www.archiveteam.org/index.php?title=YouTube this page says "Little is known about its database, but according to data from 2006, it was 45TB and doubling every 4 months. At this rate it would be 660 Petabytes (Oct 2014) by now."
07:17 🔗 ivan I'm not immediately on board with getting all 2006-2009 YouTube, but if you sample the content and think it's more good than bad, maybe
07:18 🔗 ivan 2005-2009
07:20 🔗 ivan there's this which might help find the old stuff https://old.reddit.com/r/DataHoarder/comments/906884/youtube_metadata_archive_because_working_with/
07:21 🔗 RSY00O well the scope of 00s YT videos gets smaller every day. we would need to get it ASAP
07:21 🔗 RSY00O especially before 2021. hopefully this whole article 17 thing won't make the scope shrink considerably. but we'll have to see
07:25 🔗 RSY00O if I were somehow able to successfully get all the videos (pretty hard stuff) I would need to put them on like 25 individual 16 TB hard drives to save them locally, which would cost around $12,000
07:25 🔗 RSY00O and that's not including backups
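The storage figures above can be checked with a little arithmetic. This is a minimal sketch that takes the ~385 TB estimate, the 16 TB drive size and the $12,000 total from the messages at face value; the function names are made up for illustration, and it ignores filesystem overhead and backups.

```python
import math


def drives_needed(archive_tb: float, drive_tb: float) -> int:
    """Smallest whole number of drives that can hold the archive."""
    return math.ceil(archive_tb / drive_tb)


def cost_per_drive(total_cost: float, drive_count: int) -> float:
    """Implied price of a single drive given a total budget."""
    return total_cost / drive_count


if __name__ == "__main__":
    archive_tb = 385      # rough estimate quoted in the channel
    drive_tb = 16         # drive size quoted in the channel
    total_cost = 12_000   # total cost quoted in the channel (USD)

    count = drives_needed(archive_tb, drive_tb)
    print("drives:", count)
    print("raw capacity (TB):", count * drive_tb)
    print("implied cost per drive (USD):", cost_per_drive(total_cost, count))
```

Under these assumptions the quoted numbers hang together: 25 drives give 400 TB of raw capacity, which works out to about $480 per drive.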
07:26 🔗 RSY00O but the YTPs and Unregistered Hypercam 2 vids must be saved!
07:26 🔗 ivan are you any good at writing software or data modeling
07:27 🔗 RSY00O Nope.
07:27 🔗 ivan too bad, I am really looking for someone to help with the YouTube I've got
07:27 🔗 RSY00O I'm assuming I need a web crawling script to work with youtube-dl?
07:28 🔗 RSY00O I really respect what you guys are doing btw really great stuff, it'll take a few years before I can help that much with these painstaking efforts.
07:29 🔗 ivan you'd have to scrape the upload playlists of a lot of channels and load them into a database and youtube-dl the pre-2010 stuff
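A rough sketch of the "load them into a database and pick the pre-2010 stuff" step, assuming the playlist scrape has already produced (video ID, channel ID, upload date) rows. The table layout, function names and sample IDs are all hypothetical; dates use the YYYYMMDD string form that youtube-dl reports as `upload_date`, so plain string comparison is enough.

```python
import sqlite3


def build_db(rows):
    """Load scraped (video_id, channel_id, upload_date) rows into SQLite."""
    conn = sqlite3.connect(":memory:")  # use a file path for a real run
    conn.execute(
        "CREATE TABLE videos ("
        " video_id TEXT PRIMARY KEY,"
        " channel_id TEXT,"
        " upload_date TEXT)"  # YYYYMMDD, sorts correctly as text
    )
    # INSERT OR IGNORE makes re-scraping the same channel harmless.
    conn.executemany("INSERT OR IGNORE INTO videos VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def pre_2010_ids(conn):
    """Return IDs of videos uploaded before 2010-01-01, oldest first."""
    cur = conn.execute(
        "SELECT video_id FROM videos"
        " WHERE upload_date < '20100101'"
        " ORDER BY upload_date"
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    # Made-up sample rows standing in for real scrape output.
    sample = [
        ("vid_a", "chan_1", "20070315"),
        ("vid_b", "chan_1", "20120101"),
        ("vid_c", "chan_2", "20091231"),
    ]
    print(pre_2010_ids(build_db(sample)))
```

The resulting ID list is what would then be handed to youtube-dl for downloading.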
07:29 🔗 ivan just found https://github.com/simon987/yt-metadata via that reddit link
07:30 🔗 ivan also you need hundreds of IPs to archive YouTube these days
07:30 🔗 ivan you get about 500-1000 videos per day per IP
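To get a feel for why hundreds of IPs matter, here is a back-of-the-envelope estimate using the 500-1000 videos per day per IP figure above. The total video count is not known in this conversation, so the 10,000,000 used below is purely a placeholder, as is the figure of 200 IPs.

```python
import math


def days_to_archive(video_count: int, ip_count: int,
                    videos_per_ip_per_day: int) -> int:
    """Whole days needed if every IP downloads at the given daily rate."""
    daily_total = ip_count * videos_per_ip_per_day
    return math.ceil(video_count / daily_total)


if __name__ == "__main__":
    video_count = 10_000_000  # placeholder, NOT a real estimate
    for ip_count in (1, 200):
        slow = days_to_archive(video_count, ip_count, 500)
        fast = days_to_archive(video_count, ip_count, 1000)
        print(f"{ip_count} IP(s): between {fast} and {slow} days")
```

With the placeholder count, a single IP would need tens of thousands of days, while 200 IPs bring it down to somewhere between 50 and 100 days.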
07:30 🔗 RSY00O well thanks, for now with youtube-dl I can archive individual channels at the very least automatically right?
07:31 🔗 RSY00O idk how that works my computer has issues that need to be worked out, so I couldn't install it yet
07:31 🔗 RSY00O so once I watch one video from that time, just grab the entire channel's videos (up to 2010) and add them to my archive
07:32 🔗 RSY00O so I don't have to download each video and copy the metadata manually, which sucks
07:52 🔗 ivan youtube-dl can grab channels yes
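For the "grab the entire channel's videos (up to 2010)" idea, a small sketch that only builds a youtube-dl command line and prints it. The options used (`--datebefore`, `--download-archive`, `--write-info-json`, `--ignore-errors`, `-o`) are existing youtube-dl flags; the channel URL, file names and the helper function itself are placeholders, and nothing here was discussed in the channel beyond "youtube-dl can grab channels".

```python
import shlex


def channel_archive_command(channel_url: str,
                            cutoff: str = "20091231",
                            archive_file: str = "downloaded.txt") -> list:
    """Build a youtube-dl argv that fetches a channel's videos up to cutoff."""
    return [
        "youtube-dl",
        # Only download videos uploaded on or before this date (YYYYMMDD).
        "--datebefore", cutoff,
        # Record finished video IDs so reruns skip them.
        "--download-archive", archive_file,
        # Save metadata next to each video instead of copying it by hand.
        "--write-info-json",
        # Keep going when a video is deleted or unavailable.
        "--ignore-errors",
        "-o", "%(uploader)s/%(upload_date)s - %(title)s [%(id)s].%(ext)s",
        channel_url,
    ]


if __name__ == "__main__":
    # Placeholder URL; substitute a real channel before running the command.
    cmd = channel_archive_command("https://www.youtube.com/user/EXAMPLE")
    print(shlex.join(cmd))  # requires Python 3.8+
    # To actually run it: subprocess.run(cmd, check=False)
```

The info JSON files cover the "copy the metadata manually" complaint, and the download archive file lets the same command be rerun safely later.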
07:59 🔗 deevious has joined #archiveteam-ot
08:03 🔗 HP_Archiv has joined #archiveteam-ot
08:03 🔗 HP_Archiv Hey, so I've got HexChat installed
08:03 🔗 HP_Archiv I'm trying to connect to the EFnet but I can't for some reason
08:03 🔗 HP_Archiv Any thoughts?
08:04 🔗 HP_Archiv Never mind, disregard that. Got it.
08:24 🔗 Raccoon At the rate YouTube deletes videos, that 385 TB should only be a few gigs today :p
08:24 🔗 Raccoon (snark re RSY00O)
08:28 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
09:09 🔗 HP_Archiv has quit IRC (Quit: Page closed)
09:51 🔗 Jens has quit IRC (Remote host closed the connection)
09:51 🔗 Jens has joined #archiveteam-ot
11:34 🔗 ivan multilingual keyword spreadsheet that I made for youtube searching https://docs.google.com/spreadsheets/d/1fFqfhJjpZsCNuL9_uvRpwpe40onVoRE1RsflJsiOKKc/edit?usp=sharing
11:35 🔗 ivan I guess the non-insane way to archive this stuff would be to have scripts hit search and look for high-view videos
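The "look for high-view videos" step might be sketched as a filter over search results. The search itself is not implemented here; the sketch assumes each result is a dict with `id`, `upload_date` (YYYYMMDD) and `view_count` keys, which mirrors the field names youtube-dl uses in its info JSON. The 100,000-view threshold and the sample data are arbitrary.

```python
def pick_high_view_old_videos(results, min_views=100_000, cutoff="20100101"):
    """Keep pre-cutoff videos with enough views, most viewed first."""
    keep = [
        r for r in results
        if r.get("upload_date")                      # skip entries with no date
        and r["upload_date"] < cutoff                # uploaded before 2010
        and (r.get("view_count") or 0) >= min_views  # treat missing as 0
    ]
    return sorted(keep, key=lambda r: r["view_count"], reverse=True)


if __name__ == "__main__":
    # Made-up results standing in for the output of a search script.
    sample = [
        {"id": "vid_a", "upload_date": "20070315", "view_count": 250_000},
        {"id": "vid_b", "upload_date": "20150101", "view_count": 9_000_000},
        {"id": "vid_c", "upload_date": "20081120", "view_count": 1_200},
        {"id": "vid_d", "upload_date": "20060704", "view_count": 3_000_000},
    ]
    for video in pick_high_view_old_videos(sample):
        print(video["id"], video["view_count"])
```

Running the keyword list from the spreadsheet through a search script and feeding the results into a filter like this would give a prioritized download queue rather than an attempt at everything.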
13:33 🔗 vitzli has joined #archiveteam-ot
14:59 🔗 bluefoo has joined #archiveteam-ot
15:03 🔗 akierig has joined #archiveteam-ot
15:53 🔗 deevious has quit IRC (Remote host closed the connection)
15:55 🔗 dhyan_nat has quit IRC (Read error: Operation timed out)
16:01 🔗 RSY00O has quit IRC (Ping timeout: 260 seconds)
16:19 🔗 akierig has quit IRC (Quit: later_gator)
16:28 🔗 SketchCow has quit IRC (Read error: Connection reset by peer)
16:31 🔗 SketchCow has joined #archiveteam-ot
16:31 🔗 Fusl__ sets mode: +o SketchCow
16:31 🔗 Fusl sets mode: +o SketchCow
16:31 🔗 Fusl_ sets mode: +o SketchCow
16:37 🔗 Hani111 has joined #archiveteam-ot
16:47 🔗 Hani has quit IRC (Ping timeout: 745 seconds)
16:47 🔗 Hani111 is now known as Hani
17:49 🔗 icedice has joined #archiveteam-ot
17:54 🔗 iceloops1 has joined #archiveteam-ot
17:55 🔗 prq has joined #archiveteam-ot
18:08 🔗 akierig has joined #archiveteam-ot
18:41 🔗 vitzli has quit IRC (Quit: Leaving)
18:51 🔗 icedice has quit IRC (Ping timeout: 252 seconds)
19:19 🔗 Hani111 has joined #archiveteam-ot
19:23 🔗 Hani has quit IRC (Ping timeout: 745 seconds)
19:23 🔗 Hani111 is now known as Hani
19:46 🔗 icedice has joined #archiveteam-ot
20:24 🔗 akierig has quit IRC (Read error: Operation timed out)
20:59 🔗 odemg has quit IRC (Ping timeout: 745 seconds)
21:00 🔗 odemg has joined #archiveteam-ot
21:08 🔗 dhyan_nat has joined #archiveteam-ot
21:35 🔗 dhyan_nat has quit IRC (Read error: Operation timed out)
22:37 🔗 X-Scale` has joined #archiveteam-ot
22:40 🔗 X-Scale has quit IRC (Read error: Operation timed out)
22:40 🔗 X-Scale` is now known as X-Scale
23:10 🔗 BlueMax has joined #archiveteam-ot
