#archiveteam 2016-02-15,Mon


Time Nickname Message
00:10 🔗 Stiletto has quit IRC (Read error: Connection reset by peer)
00:11 🔗 Stiletto has joined #archiveteam
00:15 🔗 Muad-Dib has quit IRC (Ping timeout: 260 seconds)
00:17 🔗 Muad-Dib has joined #archiveteam
00:23 🔗 arkiver So at first this videobot will only support youtube
00:23 🔗 Stiletto has quit IRC (Read error: Connection reset by peer)
00:23 🔗 Stiletto has joined #archiveteam
00:23 🔗 arkiver It will save youtube video together with all files that would be needed for playback
00:23 🔗 arkiver it will also upload the youtube video as video item to IA
00:24 🔗 arkiver Now youtube-dl does an ok job on saving youtube videos and making them play back later
00:24 🔗 Stiletto has quit IRC (Read error: Connection reset by peer)
00:24 🔗 Chorca has quit IRC (Read error: Operation timed out)
00:25 🔗 arkiver But some video sites downloaded with youtube-dl won't have all files saved that are needed for playback
00:25 🔗 arkiver Other sites will be supported later
00:25 🔗 arkiver Full account, playlist, etc. discovery for videos will be in too
00:25 🔗 Stiletto has joined #archiveteam
00:28 🔗 arkiver SketchCow: what do you think of such a project? see above
00:29 🔗 Chorca has joined #archiveteam
00:43 🔗 Stiletto has quit IRC (Read error: Connection reset by peer)
00:44 🔗 Stiletto has joined #archiveteam
01:23 🔗 Stiletto has quit IRC (Ping timeout: 246 seconds)
01:33 🔗 SketchCow Videobot of what
01:33 🔗 SketchCow Everything?
01:37 🔗 SN4T14 has joined #archiveteam
01:39 🔗 HCross I think he means an "on demand channel archiver" - so you feed it a channel and it gets everything related to it
01:52 🔗 Stiletto has joined #archiveteam
01:53 🔗 toad2 has joined #archiveteam
01:56 🔗 toad1 has quit IRC (Read error: Operation timed out)
02:03 🔗 SketchCow FOS load has gone WAY down.
02:03 🔗 SketchCow Hard drive usage is dropping notably
02:10 🔗 vitzli has joined #archiveteam
02:15 🔗 philpem has quit IRC (Ping timeout: 260 seconds)
02:22 🔗 SirCmpwn has joined #archiveteam
02:39 🔗 kisspunch has joined #archiveteam
02:45 🔗 Frogging1 YouTube Red deletes videos if the channel owner isn't around to accept the new terms of service I think
02:45 🔗 Frogging1 So it might be worth having a way to archive things that are likely to disappear
02:45 🔗 Frogging1 Certain YouTubers have died for example
02:45 🔗 Frogging1 Or just stopped using the site
02:46 🔗 Frogging1 is now known as Frogging
02:52 🔗 snape_ Weren't the new terms of service rolled out months ago, though? I remember wailing and arguments about it late last year. Hasn't the deadline come and gone?
02:55 🔗 snape_ A quick Google search suggests the deadline for accepting the TOS was 22 October 2015.
02:59 🔗 snape_ That being said... http://youtube.wikia.com/wiki/Deceased_YouTubers
03:10 🔗 trs80 they might only become inaccessible in the US though? since youtube red isn't available elsewhere
03:19 🔗 bai doesn't that mean....heh
03:26 🔗 altlabel has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 i0npulse has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 PotcFdk has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 limebyte has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 coretx has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 tobbez has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 pikhq has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 Ymgve has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 PurpleSym has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 mafrasi2 has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 Meeh has quit IRC (hub.dk irc.homelien.no)
03:26 🔗 sHATNER has quit IRC (hub.dk irc.homelien.no)
03:59 🔗 vOYtEC_ has joined #archiveteam
04:02 🔗 achip has quit IRC (hub.efnet.us irc.Prison.NET)
04:26 🔗 Stiletto has quit IRC (Remote host closed the connection)
04:26 🔗 Stiletto has joined #archiveteam
04:30 🔗 Stiletto has quit IRC (Remote host closed the connection)
04:31 🔗 Stiletto has joined #archiveteam
04:31 🔗 achip has joined #archiveteam
04:38 🔗 Chorca has quit IRC (Ping timeout: 252 seconds)
04:40 🔗 SketchCow sets mode: +b *!*kyan@184.75.223.*
04:40 🔗 kyan was kicked by SketchCow (kyan)
04:40 🔗 Chorca has joined #archiveteam
04:43 🔗 Froggypwn has joined #archiveteam
05:14 🔗 Swizzle has quit IRC (Read error: Operation timed out)
05:38 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
05:45 🔗 Sk1d has joined #archiveteam
06:09 🔗 oldcad has quit IRC (Quit: Leaving.)
06:20 🔗 db48x has joined #archiveteam
06:26 🔗 WinterFox has joined #archiveteam
06:26 🔗 sHATNER has joined #archiveteam
06:26 🔗 i0npulse has joined #archiveteam
06:26 🔗 mafrasi2_ has joined #archiveteam
06:26 🔗 altlabel has joined #archiveteam
06:26 🔗 PotcFdk has joined #archiveteam
06:26 🔗 limebyte has joined #archiveteam
06:26 🔗 coretx has joined #archiveteam
06:26 🔗 tobbez has joined #archiveteam
06:26 🔗 pikhq has joined #archiveteam
06:26 🔗 Ymgve has joined #archiveteam
06:26 🔗 PurpleSym has joined #archiveteam
06:26 🔗 Meeh has joined #archiveteam
06:26 🔗 irc.homelien.no sets mode: +o PurpleSym
07:13 🔗 Aranje has quit IRC (Quit: Three sheets to the wind)
07:15 🔗 jut has joined #archiveteam
08:47 🔗 signius has quit IRC (Ping timeout: 300 seconds)
08:49 🔗 antomatic has quit IRC (Read error: Connection reset by peer)
08:50 🔗 antomatic has joined #archiveteam
09:00 🔗 signius has joined #archiveteam
09:28 🔗 schbirid has joined #archiveteam
09:47 🔗 arkiver SketchCow: The videobot should support a lot of video, audio, etc. services
09:48 🔗 arkiver Basically if some youtube, vine, some other service account is going away for whatever reason then this videobot will grab all videos from that account
09:49 🔗 arkiver It then uploads the videos to IA as video items (like the videos from the youtubearchive collection that was darked) and as WARC items.
09:50 🔗 arkiver Single videos will also be supported. For example in case of protests or new terrorist attacks.
10:08 🔗 xekc has joined #archiveteam
10:25 🔗 lytv has quit IRC (Read error: Operation timed out)
10:26 🔗 lytv has joined #archiveteam
11:01 🔗 xekc has quit IRC (Ping timeout: 250 seconds)
11:11 🔗 VADemon has joined #archiveteam
11:16 🔗 Swizzle has joined #archiveteam
11:33 🔗 Swizzle has quit IRC (Read error: Operation timed out)
11:43 🔗 i0npulse has quit IRC (leaving)
11:47 🔗 i0npulse has joined #archiveteam
11:54 🔗 WinterFox has quit IRC (Remote host closed the connection)
12:00 🔗 megaminxw has joined #archiveteam
12:26 🔗 arkiver3 has joined #archiveteam
12:44 🔗 Rickster has quit IRC (Ping timeout: 260 seconds)
12:44 🔗 marvinw has quit IRC (Ping timeout: 260 seconds)
12:46 🔗 Kenshin has quit IRC (Read error: Connection reset by peer)
12:46 🔗 Kenshin has joined #archiveteam
12:46 🔗 Famicoman has quit IRC (Ping timeout: 260 seconds)
12:47 🔗 goekesmi has quit IRC (Ping timeout: 260 seconds)
12:47 🔗 goekesmi has joined #archiveteam
12:55 🔗 Rickster has joined #archiveteam
13:00 🔗 marvinw has joined #archiveteam
13:12 🔗 megaminxw has quit IRC (Quit: Leaving.)
13:34 🔗 VADemon has quit IRC (Read error: Operation timed out)
13:36 🔗 Famicoman has joined #archiveteam
13:47 🔗 arkiver3 has quit IRC (Ping timeout: 252 seconds)
13:51 🔗 arkiver3 has joined #archiveteam
13:52 🔗 SmileyG Nice
14:22 🔗 arkiver3 has quit IRC (Ping timeout: 252 seconds)
14:24 🔗 Zei-Pii has joined #archiveteam
14:31 🔗 plog99 has joined #archiveteam
14:34 🔗 fpoee has quit IRC (Ping timeout: 360 seconds)
14:41 🔗 vegbrasil has quit IRC (*)
14:41 🔗 vegbrasil has joined #archiveteam
14:43 🔗 scyther has joined #archiveteam
14:49 🔗 Boltsie__ has joined #archiveteam
14:50 🔗 Boltsie__ is now known as Boltsie
14:55 🔗 VADemon has joined #archiveteam
14:57 🔗 arkiver3 has joined #archiveteam
15:17 🔗 arkiver3 has quit IRC (Ping timeout: 252 seconds)
15:29 🔗 RichardG has quit IRC (Read error: Operation timed out)
15:48 🔗 GLaDOS has quit IRC (Read error: Operation timed out)
15:49 🔗 ndiddy has joined #archiveteam
15:56 🔗 RichardG has joined #archiveteam
16:12 🔗 scyther has quit IRC (Read error: Connection reset by peer)
16:14 🔗 GLaDOS has joined #archiveteam
16:23 🔗 VADemon has quit IRC (Quit: left4dead)
16:24 🔗 PotcFdk Hey, I just wanted to announce that I began rewriting my old broken YouTube channel/playlist mirror script that helps maintaining a local mirror of channels. It handles video title changes and collisions while providing a handy way of keeping an up-to-date mirror, including a directory of video-title symlinks that point at video-id files. Full explanation and example workflow in README.md - maybe somebody here is interested in such a thing, too.
16:24 🔗 PotcFdk https://github.com/PotcFdk/youtube-sync (Note: This is WIP. It works, but I wouldn't consider this stable yet.)
16:26 🔗 Fletcher arkiver ^
16:55 🔗 HCross Can whoever is running newsbuddy again. Stop please....
16:57 🔗 HCross The IRC bot is broken, but it's actually working
17:04 🔗 SketchCow FOS continues to heal
17:31 🔗 espes__ has quit IRC (Read error: Operation timed out)
17:43 🔗 scyther has joined #archiveteam
17:51 🔗 philpem has joined #archiveteam
18:02 🔗 vitzli has quit IRC (Leaving)
18:04 🔗 mafrasi2_ has quit IRC (Read error: Connection reset by peer)
18:06 🔗 Swizzle has joined #archiveteam
18:07 🔗 i0npulse has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 sHATNER has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 altlabel has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 PotcFdk has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 limebyte has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 coretx has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 tobbez has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 pikhq has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 Ymgve has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 PurpleSym has quit IRC (hub.dk irc.homelien.no)
18:07 🔗 Meeh has quit IRC (hub.dk irc.homelien.no)
18:29 🔗 Tomcat_ has joined #archiveteam
18:34 🔗 swebb My heritrix crawl of Al Jazeera America is about 170,000 pages (18GB) so-far. Should I continue even if archive.org is already archiving it in the Wayback machine?
18:35 🔗 arkiver why not
18:35 🔗 arkiver duplicate of some pages wouldn't be too bad
18:36 🔗 snape_ If you can afford the space and BW, sure. All al-Jazeera have to do is add one line to their robots.txt to make everything in the Wayback Machine unavailable, after all...
18:36 🔗 arkiver if it's unavailable it's still saved
18:38 🔗 tobbez has joined #archiveteam
18:38 🔗 i0npulse has joined #archiveteam
18:38 🔗 PurpleSym has joined #archiveteam
18:38 🔗 mafrasi2 has joined #archiveteam
18:38 🔗 sHATNER has joined #archiveteam
18:38 🔗 altlabel has joined #archiveteam
18:38 🔗 PotcFdk has joined #archiveteam
18:38 🔗 limebyte has joined #archiveteam
18:38 🔗 coretx has joined #archiveteam
18:38 🔗 pikhq has joined #archiveteam
18:38 🔗 Ymgve has joined #archiveteam
18:38 🔗 Meeh has joined #archiveteam
18:43 🔗 snape_ True, but it doesn't hurt to have a second copy, just in case. It's less than a Blu-Ray of data, after all...
18:44 🔗 swebb snape: so-far.
18:48 🔗 zino PotcFdk: Nice! I'll try that out for my personal datahoarding.
18:49 🔗 PotcFdk zino: I'm happy that it appears to be useful to other people than just me
18:51 🔗 zino Now I only need something similar for Twitch since they automatically throw away all old content that has not been featured.
18:51 🔗 HCross zino, arkiver - is twitch something the videobot could tackle as a repeat thing?
18:52 🔗 arkiver sure
18:52 🔗 arkiver Hmm, I'm going to make a version too which can be run at home for personal archives
18:53 🔗 zino That would be very nice.
18:53 🔗 arkiver with the option to create WARC, only grab video/audio file or do both
18:54 🔗 jut That would be amazing.
18:54 🔗 snape_ swebb, is that 170,000 pages, or pages/images/scripts/everything else?
18:55 🔗 swebb Oh, everything.
18:55 🔗 swebb urls
18:55 🔗 swebb 80k html pages
18:56 🔗 PotcFdk zino: Feel free to spam issues in case everything breaks horribly
18:56 🔗 wyatt8740 has joined #archiveteam
18:58 🔗 zino PotcFdk: Will do. Probably not until the weekend though. I'm rebuilding my home racks and several of my storage servers are currently residing on my living room table. :)
19:02 🔗 snape_ swebb, I have to imagine you're pretty close to done. Even with all the topic pages and everything, that'd be something above 70 pages/day over their three-year run. I wouldn't think it'd be much above a hundred, but I could easily be wrong...
19:04 🔗 metalcamp has joined #archiveteam
19:06 🔗 snape_ Google claims to know of only "about 36,000" pages, FWIW. O.o
19:15 🔗 arkiver Update your scripts for gametrailers!
19:15 🔗 arkiver Last round of items
19:15 🔗 arkiver All 10videos items have been converted to single video items
19:15 🔗 arkiver well, all 10videos items that were out
19:16 🔗 snape_ Boston-specific startup dunwello.com is closing down in the next few weeks, probably. Maybe. Not even the wacky head of the company really seems to know for sure. http://bostinno.streetwise.co/2016/02/15/dunwello-is-shutting-down-matt-lauzon-says/
19:21 🔗 Frogging arkiver: You know, youtube-dl supports a ton of video sites and downloading whole profiles on some of them (including YouTube)
19:21 🔗 Frogging https://rg3.github.io/youtube-dl/supportedsites.html
19:22 🔗 arkiver yes, though youtube-dl is not working well for all video websites when it comes to creating a WARC that can be played back somewhere in the future
19:22 🔗 arkiver It sometimes doesn't grab all files needed for a playback
19:22 🔗 arkiver However, youtube-dl is working fine for youtube when it comes to that
19:22 🔗 Frogging Soundcloud is supported too it seems
19:23 🔗 Frogging WARC?
19:24 🔗 Frogging I'd google but I'm on mobile
19:24 🔗 arkiver WebARChive file
19:24 🔗 arkiver contains all headers too besides the files
19:24 🔗 Frogging What is that kind of file used for?
19:24 🔗 arkiver well, web archives
19:25 🔗 arkiver pretty much for every project we do
19:25 🔗 arkiver and the wayback machine only works with that
19:25 🔗 arkiver but let's move this over to #archiveteam-bs
19:29 🔗 Frogging kk
19:36 🔗 RichardG has quit IRC (Read error: Operation timed out)
19:44 🔗 GLaDOS has quit IRC (Ping timeout: 260 seconds)
20:30 🔗 metalcamp has quit IRC (Ping timeout: 492 seconds)
20:30 🔗 espes__ has joined #archiveteam
20:58 🔗 Zei-Pii has quit IRC (Ping timeout: 250 seconds)
21:09 🔗 Tomcat_ has quit IRC (Remote host closed the connection)
21:25 🔗 jut has quit IRC (Read error: Connection reset by peer)
21:29 🔗 wyatt8740 has quit IRC (Read error: Operation timed out)
21:34 🔗 RichardG has joined #archiveteam
21:36 🔗 schbirid has quit IRC (Quit: Leaving)
21:46 🔗 RichardG has quit IRC (Ping timeout: 633 seconds)
21:50 🔗 megaminxw has joined #archiveteam
22:00 🔗 megaminxw has quit IRC (Quit: Leaving.)
22:06 🔗 scyther has quit IRC (Quit: Leaving)
22:16 🔗 RichardG has joined #archiveteam
22:23 🔗 RichardG has quit IRC (Ping timeout: 360 seconds)
22:31 🔗 Atom__ has quit IRC (Ping timeout: 252 seconds)
22:32 🔗 Lord_Nigh has quit IRC (Ping timeout: 252 seconds)
22:35 🔗 superkuh has quit IRC (Ping timeout: 252 seconds)
22:37 🔗 Lord_Nigh has joined #archiveteam
22:39 🔗 superkuh has joined #archiveteam
23:28 🔗 mismatch has quit IRC (Remote host closed the connection)
23:28 🔗 mismatch has joined #archiveteam
