#archiveteam 2016-04-15,Fri


Time Nickname Message
01:11 🔗 GLaDOS has quit IRC (Quit: Oh crap, I died.)
01:11 🔗 GLaDOS has joined #archiveteam
01:20 🔗 davidar has quit IRC (Quit: Connection closed for inactivity)
01:29 🔗 xXx_ndidd has joined #archiveteam
01:38 🔗 Stiletto has quit IRC (Read error: Operation timed out)
01:40 🔗 ndiddy has quit IRC (Read error: Operation timed out)
01:53 🔗 Stiletto has joined #archiveteam
01:58 🔗 davidar has joined #archiveteam
02:00 🔗 Yoshimura has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client)
02:00 🔗 Yoshimura has joined #archiveteam
03:00 🔗 JesseW has joined #archiveteam
03:11 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
03:29 🔗 SadDM has joined #archiveteam
03:29 🔗 swebb sets mode: +o SadDM
03:30 🔗 matthusb- has joined #archiveteam
03:30 🔗 yakfish has joined #archiveteam
03:32 🔗 jspiros has joined #archiveteam
03:40 🔗 JesseW has joined #archiveteam
03:49 🔗 bwn has quit IRC (Quit: Leaving)
03:50 🔗 bwn has joined #archiveteam
04:28 🔗 scyther has joined #archiveteam
04:59 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
05:04 🔗 Lord_Nigh have a question about archiving all the VOD videos on a particular twitch channel
05:05 🔗 Lord_Nigh a popular streamer named dalsarius82 died suddenly and unexpectedly (pulmonary embolism?) tuesday morning and i'm wondering how all his videos can be archived before they expire due to the twitch 30 day thing
05:06 🔗 Sk1d has joined #archiveteam
05:09 🔗 Honno has joined #archiveteam
05:10 🔗 ariscop has quit IRC (Read error: Operation timed out)
05:22 🔗 scyther has quit IRC (Quit: Leaving)
05:25 🔗 dan- Lord_Nigh: maybe point youtube-dl and see if that gets anything?
05:25 🔗 dan- point youtube-dl at his channel*
06:03 🔗 ariscop has joined #archiveteam
06:15 🔗 MMovie2 has joined #archiveteam
06:17 🔗 MMovie has quit IRC (Read error: Operation timed out)
06:25 🔗 WinterFox has joined #archiveteam
06:39 🔗 kisspunch has quit IRC (ZNC - http://znc.in)
06:43 🔗 kisspunch has joined #archiveteam
06:48 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
06:53 🔗 Medowar has joined #archiveteam
06:54 🔗 JesseW has joined #archiveteam
07:00 🔗 metalcamp has joined #archiveteam
07:11 🔗 schbirid has joined #archiveteam
07:30 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
07:31 🔗 mek_ has joined #archiveteam
07:31 🔗 vitzli has joined #archiveteam
07:42 🔗 Lord_Nigh does youtube-dl handle twitch? the documentation was ambiguous?
07:45 🔗 dan- yepyep, iirc you should just need to feed it the channel url / url to the list of vods, https://github.com/rg3/youtube-dl/blob/master/docs/supportedsites.md
07:55 🔗 casdr has left
08:07 🔗 RedType has quit IRC (Read error: Operation timed out)
08:23 🔗 atomotic has joined #archiveteam
08:29 🔗 vitzli has quit IRC (Quit: Leaving)
08:34 🔗 hook54321 has joined #archiveteam
08:37 🔗 mek_ has quit IRC (Read error: Operation timed out)
08:38 🔗 Stiletto has quit IRC (Read error: Operation timed out)
08:57 🔗 RedType has joined #archiveteam
09:19 🔗 Stiletto has joined #archiveteam
09:28 🔗 bwn has quit IRC (Read error: Operation timed out)
09:51 🔗 bwn has joined #archiveteam
09:53 🔗 Stiletto has quit IRC (Read error: Operation timed out)
10:18 🔗 VADemon has joined #archiveteam
10:20 🔗 arkiver2 has joined #archiveteam
10:34 🔗 arkiver2 has quit IRC (Ping timeout: 244 seconds)
11:15 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
11:27 🔗 quibbit has joined #archiveteam
11:27 🔗 quibbit moddb?
11:28 🔗 quibbit has quit IRC (Client Quit)
11:28 🔗 HCross http://www.moddb.com/groups/moddb/news/gamefront-is-closing-lets-save-the-mods#readarticle
11:30 🔗 HCross We might want to get GameFront going and done
11:48 🔗 atomotic has joined #archiveteam
11:50 🔗 arkiver We already have most of it
11:50 🔗 arkiver I'll make sure we also get the latest files
11:58 🔗 AndroUser has joined #archiveteam
11:58 🔗 AndroUser has quit IRC (Client Quit)
11:58 🔗 AndroUser has joined #archiveteam
12:00 🔗 AndroUser has left
12:01 🔗 arkiver2 has joined #archiveteam
12:01 🔗 swebb sets mode: +o arkiver2
12:27 🔗 VADemon has quit IRC (Quit: left4dead)
12:42 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
12:44 🔗 z00nx has quit IRC (Ping timeout: 244 seconds)
12:55 🔗 Stiletto has joined #archiveteam
13:01 🔗 arkiver2 has quit IRC (Quit: AndroIRC - Android IRC Client ( http://www.androirc.com ))
13:02 🔗 arkiver2 has joined #archiveteam
13:02 🔗 swebb sets mode: +o arkiver2
13:03 🔗 scyther has joined #archiveteam
13:06 🔗 z00nx has joined #archiveteam
13:09 🔗 davidar out of curiosity, has anyone archived http://videolectures.net/ ?
13:10 🔗 BlueMaxim has quit IRC (Quit: Leaving)
13:13 🔗 davidar hrm... "Video downloads (where available) are limited to several lecture downloads per day. We were forced to introduce download limitations due to several web-bot experiences in which automated downloaders tried to transfer terabytes of data and consequently over-saturated our servers and internet connections, thus hindering our quality of service to other users." http://videolectures.net/faq/
13:25 🔗 vitzli has joined #archiveteam
13:42 🔗 WinterFox has quit IRC (Remote host closed the connection)
14:09 🔗 arkiver3 has joined #archiveteam
14:09 🔗 swebb sets mode: +o arkiver3
14:09 🔗 dashcloud has quit IRC (Ping timeout: 250 seconds)
14:10 🔗 dashcloud has joined #archiveteam
14:13 🔗 arkiver2 has quit IRC (Ping timeout: 244 seconds)
14:30 🔗 MMovie has joined #archiveteam
14:37 🔗 MMovie2 has quit IRC (Ping timeout: 633 seconds)
15:09 🔗 atomotic has joined #archiveteam
15:10 🔗 arkiver3 has quit IRC (Ping timeout: 244 seconds)
15:15 🔗 arkiver3 has joined #archiveteam
15:15 🔗 swebb sets mode: +o arkiver3
15:15 🔗 arkiver3 has quit IRC (Client Quit)
15:18 🔗 Froggypwn has joined #archiveteam
15:31 🔗 atrocity has quit IRC (Read error: Connection reset by peer)
15:37 🔗 Froggypwn has quit IRC (Ping timeout: 961 seconds)
15:37 🔗 JesseW has joined #archiveteam
15:42 🔗 Froggypwn has joined #archiveteam
16:06 🔗 bwn_ has joined #archiveteam
16:11 🔗 bwn__ has joined #archiveteam
16:15 🔗 mek_ has joined #archiveteam
16:16 🔗 zino 8.6T GameTrailers packed and uploaded to IA. Now I need someone to verify that that looks good before I purge the uploaded folder.
16:16 🔗 zino ^-- arkiver
16:16 🔗 arkiver yeah
16:17 🔗 arkiver SketchCow: how do you normally verify an upload?
16:19 🔗 HCross do hashes on both sides?
16:19 🔗 bwn has quit IRC (Read error: Operation timed out)
16:20 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
16:22 🔗 arkiver I'm not sure if SketchCow hashes everything or checks using some other way
16:23 🔗 bwn_ has quit IRC (Read error: Operation timed out)
16:24 🔗 schbirid MrRadar: try to pressure the moddb guy into letting us archive their files please :)
16:24 🔗 MrRadar So ask him if we can archive ModDB?
16:24 🔗 schbirid yes
16:25 🔗 MrRadar Have they blocked us before?
16:25 🔗 schbirid no idea but raw access > all the haxxoring
16:25 🔗 MrRadar Yeah
16:27 🔗 aashsdhdf has joined #archiveteam
16:27 🔗 VADemon has joined #archiveteam
16:30 🔗 MrRadar Should I just ask them to get in touch with us or should I propose something more concrete?
16:31 🔗 arkiver I'd say archive using warrior project
16:31 🔗 arkiver but they're stable now, so I don't think it's needed yet
16:32 🔗 MrRadar Yeah, in the comment thread he specifically mentioned that ModDB is run as a private indepdent site to avoid having any pressure to turn significant profits or die
16:32 🔗 MrRadar *independent
16:38 🔗 Medowar has quit IRC (Quit: Connection closed for inactivity)
16:39 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
16:41 🔗 khaoohs has joined #archiveteam
16:42 🔗 schbirid "could we rsync your files?"
16:43 🔗 arkiver do you have metadata of those files?
16:43 🔗 arkiver what filenames do they have?
16:43 🔗 arkiver Grabbing them using a warrior project will also get us all metadata
16:43 🔗 MrRadar Yeah, ModDB is much more than just a file hosting site
16:43 🔗 MrRadar They also host blogs, photos, videos, etc
16:43 🔗 MrRadar From the mode developers
16:43 🔗 MrRadar *mod
16:44 🔗 MrRadar If we archived them we'd definitely want to grab all that stuff too
16:48 🔗 aashsdhdf Does anything need to happen to save Gamefront? http://www.archiveteam.org/index.php?title=GameFront
16:48 🔗 schbirid yeah but it might be easy for them to give out static files
16:48 🔗 phuzion aashsdhdf: We've been working on it in the background for months. We've archived something around 28TB of data from their site.
16:50 🔗 aashsdhdf Awesome. Sometimes projects like this have a divide and conquer strategy where people just need to download a client or start wget-ing stuff
16:50 🔗 MrRadar Yeah, we have a VM (the ArchiveTeam Warrior) people can run and it just archives stuff
16:51 🔗 MrRadar (You can also run the projects yourself if you have a Linux system)
16:55 🔗 aashsdhdf has quit IRC (Leaving)
17:03 🔗 mek_ has quit IRC (Ping timeout: 244 seconds)
17:06 🔗 espes__ has quit IRC (Read error: Operation timed out)
17:08 🔗 sivoais has quit IRC (Ping timeout: 244 seconds)
17:09 🔗 DarkMorph has joined #archiveteam
17:12 🔗 scyther has quit IRC (Quit: Leaving)
17:15 🔗 espes__ has joined #archiveteam
17:19 🔗 sivoais has joined #archiveteam
17:23 🔗 DarkMorph has quit IRC (Sayonara!)
18:25 🔗 pfallenop has joined #archiveteam
18:33 🔗 Stiletto has quit IRC (Read error: Operation timed out)
18:41 🔗 kris33 has joined #archiveteam
18:42 🔗 bzc6p has joined #archiveteam
18:42 🔗 swebb sets mode: +o bzc6p
18:42 🔗 bzc6p sets mode: +oooo achip Atluxity chfoo chfoo-
18:42 🔗 bzc6p sets mode: +oooo closure dashcloud Fletcher Fletcher_
18:42 🔗 bzc6p sets mode: +oooo GLaDOS godane HCross HCross2
18:42 🔗 bzc6p sets mode: +oooo ivan` joepie91 JW_work Kazzy
18:43 🔗 bzc6p sets mode: +oooo Kaz midas PurpleSym Sanqui
18:43 🔗 bzc6p sets mode: +oooo SimpBrain schbirid Smiley Start
18:43 🔗 bzc6p sets mode: +ooo VADemon wp494 yipdw_
18:43 🔗 bzc6p sets mode: -o bzc6p
18:43 🔗 bzc6p has left
19:01 🔗 kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
19:15 🔗 SilSte has joined #archiveteam
19:17 🔗 PMT has joined #archiveteam
19:18 🔗 VADemon has quit IRC (Ping timeout: 260 seconds)
19:20 🔗 PMT Hi all, I'm trying to run the Warrior appliance on VBox 5.0.14, and it mostly works, but if I try to select the GameFront project, it just says "beginning work on a project" and nothing ever shows up under Current Project. The last console output line is DEBUG Load pipeline /data/data/gamefront-aa..., no errors are printed. I tried nuking the data/data/projects/gamefront-... dir and/or the ~warrior/projects/gamefront dir (while the daemon said it was idle) and letting it recreate them, and nothing changed. I can select other projects just fine and have them run without error (e.g. the Yuku project), just not the GameFront project.
19:21 🔗 PMT (Sorry, it was VBox 5.0.16, not that I expect that to be apropos here; I just forgot I'd updated it.)
19:23 🔗 JW_work PMT: I think there may just not be any more work to do on that right at the moment. We've got most of it already, and the person in charge of adding new tasks may not have added any of the last bits yet.
19:27 🔗 PMT WFM, I just wanted to make sure it wasn't broken. :)
19:27 🔗 JW_work thanks for checking!
19:39 🔗 philpem has joined #archiveteam
19:40 🔗 arkiver2 has joined #archiveteam
19:40 🔗 swebb sets mode: +o arkiver2
19:57 🔗 kline has joined #archiveteam
20:01 🔗 bwn__ has quit IRC (Read error: Operation timed out)
20:03 🔗 kline what relationship does Archive Team have with archive.org, and what actually happens to archived data (where is it stored, how can it be retrieved by people?) I get that you're not archive.org, but http://archiveteam.org/index.php?title=Dev/Infrastructure makes it look like you have them as a host for (some?) data
20:05 🔗 DFJustin we upload a ton of shit to archive.org but anyone can upload a ton of shit to archive.org
20:05 🔗 kline Neat. I saw the Warrior VM mentioned and I was hoping to get one (or more :)) running at my university (we're looking to spin up a student managed server rack), so I'm just trying to see how it all works.
20:05 🔗 MrRadar The ArchiveTeam is independent of the IA though our founder and de-facto leader SketchCow (a.k.a. Jason Scott) is an employee of the IA
20:06 🔗 Atluxity it's a conspiracy, really
20:06 🔗 MrRadar While the IA does general web crawling we do focussed grabs of content that's in immediate danger of disappearing
20:06 🔗 kline yeah, I saw that link, I occasionally bump into textfiles.com when he does something that gets posted onto HN
20:07 🔗 MrRadar Since he works there he uses his privileges to incorporate our data into the IA's WayBack Machine which makes it about a million times more accessible than if we were just posting the raw .warc files
20:08 🔗 MrRadar So it's basically a win-win (the IA gets more content and we have an outlet to store our work)
20:08 🔗 kline ace
20:08 🔗 JW_work (most) of our data.
20:08 🔗 MrRadar True that
20:08 🔗 kline so, for stuff that isn't uploaded to IA, does it just end up on volunteered storage? And how do normal people get access to that?
20:08 🔗 JW_work convoluted means
20:09 🔗 Jonimus I know torrents have been done but in general yeah
20:09 🔗 DFJustin basically all of it is uploaded at least to IA
20:10 🔗 DFJustin some of it is also on torrents or other servers
20:10 🔗 mismatch has quit IRC (Ping timeout: 370 seconds)
20:10 🔗 kline ok, that sounds good. Hopefully when we have infrastructure I'll be able to spin something up.
20:11 🔗 kline thanks for explaining what I'm sure gets explained often :)
20:12 🔗 vitzli has quit IRC (Quit: Leaving)
20:12 🔗 JW_work kline: there are a number of other options to help; you could run an IPFS node, or contribute to the IA.BAK project (an effort to use git-annex to backup archive.org)
20:12 🔗 JW_work or run a #archivebot pipeline
20:13 🔗 kline ia.bak is already on the "charitable assistance" list, but storage is probably going to be more of a problem for us (we use only donated hardware, HDD, etc) than bandwidth
20:13 🔗 JW_work nods
20:15 🔗 kline likewise, the only "issue" I can see with running Warrior is that we still need to negotiate a block of IPs, right now the proposal from IS is to get a single IP and nat like hell, and if that's the case I don't particularly want to risk the rack getting banned for crawls
20:15 🔗 DFJustin an archivebot pipeline would be great
20:16 🔗 JW_work you are hopefully already thinking of acting as a bittorrent seed for various linux distros
20:16 🔗 kline it's all up for negotiation with information services. They already blackholed one of the trial servers we had for running a student BNC because they thought it was a botnet cnc server
20:17 🔗 JW_work heh
20:17 🔗 DFJustin also make sure they don't have a content filtering proxy for porn or whatever
20:17 🔗 kline they do, but we'll probably be excluded from it
20:18 🔗 kline but yes, at the start, our infra resources are going to be way higher than our demand, so for at least the beginning the intention is to make the most of it for charitable stuff (ia.bak, IRC server hosting, maybe colo'd backups/failure for other similar projects) and scale it back when times are tight
20:19 🔗 mismatch has joined #archiveteam
20:22 🔗 bwn__ has joined #archiveteam
20:24 🔗 SketchCow tah dah
20:36 🔗 JW_work yes, yes?
20:42 🔗 SketchCow we have Gamefront by now?
20:52 🔗 arkiver2 We got most of the files. The remaining files will also be done
20:53 🔗 arkiver2 And the forums need to be saved
20:54 🔗 arkiver2 I haven't had a good look at the forums yet, but I think we can do that in the project too
20:56 🔗 vegbrasil has quit IRC (*)
20:58 🔗 MrRadar Archivebot has also been scraping the forums since January
21:02 🔗 bwn_ has joined #archiveteam
21:03 🔗 SketchCow I would prioritize.
21:08 🔗 kline has quit IRC (Quit: http://chat.efnet.org )
21:09 🔗 vegbrasil has joined #archiveteam
21:15 🔗 bwn__ has quit IRC (Read error: Operation timed out)
21:21 🔗 Frogging MrRadar: yeah. hopefully that pipeline doesn't die..
21:22 🔗 arkiver SketchCow: we don't have to prioritize, we can get everything
21:23 🔗 arkiver We still have 15 days left
21:28 🔗 Frogging arkiver: the Yuku grab is hitting a lot of infinite redirects (I think) such as http://cms-pixel.crowdreport.com/urlqueue/?pubID=1003&URL=http://eqguide.yuku.com/topic/780/http//[...]http//www.bck.org&divID=wrapper
21:29 🔗 arkiver is it a list of 20x URLs which are the exact same URLs?
21:30 🔗 Frogging I don't see a list of URLs, I see like /topic/780/http//http//http//[some other URL]
21:30 🔗 arkiver can you post a log for me?
21:30 🔗 Frogging yeah I'll try
21:32 🔗 Frogging http://kitsune.fastquake.com/files/archiveteam/yuku_10threads_eqguide_78-wget.log
21:34 🔗 zino <zino> 8.6T GameTrailers packed and uploaded to IA. Now I need someone to verify that that looks good before I purge the uploaded folder.
21:34 🔗 zino ^-- SketchCow
21:34 🔗 zino No one was sure what the procedure for that was when I asked earlier.
21:35 🔗 SketchCow see if the upload redrows.
21:35 🔗 arkiver ah, I'll check
21:35 🔗 SketchCow check the history.
21:36 🔗 SketchCow if it all goes through, good.
21:36 🔗 arkiver nor redrows here
21:36 🔗 arkiver no*
21:36 🔗 zino arkiver: How can I check that?
21:36 🔗 arkiver It was uploaded using my keys, and I don't see any redrows here
21:36 🔗 SketchCow ok.
21:36 🔗 zino Ah, right, so maybe I can't check then. :)
21:37 🔗 SketchCow if something blows, it won't be the structure.
21:38 🔗 arkiver zino, here we go: https://archive.org/catalog.php?all=1&search_submitter=Arkiver@hotmail.com
21:39 🔗 zino Log in it says...
21:39 🔗 * zino makes an account
21:39 🔗 Frogging log link above, btw arkiver
21:39 🔗 arkiver Frogging: yes, I saw
21:39 🔗 arkiver strange error
21:39 🔗 Frogging oki :)
21:39 🔗 arkiver will be fixed anyway
21:51 🔗 zino I have 16G worth of .rsync-tmp dirs left in the gametralers incoming dir. I assume that is typical and can just be deleted?
21:54 🔗 zino arkiver: Any preference for what to upload next? fotolog (8.1T) or ftpgrab (8.7T).
21:55 🔗 SketchCow ftpgrab
21:56 🔗 zino OK.
21:58 🔗 metalcamp has quit IRC (Ping timeout: 250 seconds)
22:00 🔗 arkiver2 has quit IRC (Ping timeout: 244 seconds)
22:11 🔗 mek_ has joined #archiveteam
22:16 🔗 zino arkiver: Sent you some config changes to confirm before I start this.
22:16 🔗 arkiver will check that
22:16 🔗 arkiver Frogging: will fix the problem tomorrow, off to bed now
22:16 🔗 Frogging all right
22:17 🔗 Frogging good night :)
22:17 🔗 arkiver you too!
22:17 🔗 zino arkiver: Feel free to do my stuff tomorrow too. No rush. Have a good night. :)
22:17 🔗 arkiver zino: will do that then, you too!
22:31 🔗 Honno has quit IRC (Read error: Operation timed out)
22:34 🔗 RedType has quit IRC (Read error: Operation timed out)
22:35 🔗 nickname_ has joined #archiveteam
22:38 🔗 nickname_ I used wget to download everything on wshu.org (on 2016-03-31) and output it to a *.warc.gz and *.cdx, now what?
22:40 🔗 DFJustin https://archive.org/upload/
22:42 🔗 nickname_ okay
22:44 🔗 VADemon has joined #archiveteam
22:45 🔗 WinterFox has joined #archiveteam
22:47 🔗 schbirid has quit IRC (Quit: Leaving)
23:13 🔗 dashcloud has quit IRC (Read error: Operation timed out)
23:18 🔗 nickname_ has quit IRC (Read error: Operation timed out)
23:21 🔗 dashcloud has joined #archiveteam
23:34 🔗 nickname_ has joined #archiveteam
23:44 🔗 RedType has joined #archiveteam
