#archiveteam 2013-11-03,Sun


Time Nickname Message
00:02 Cowering oh.. maybe not, i mistook those for http://www.parodius.com/
00:07 arkhive S
00:07 arkhive SketchCow: Do you have plans for speaking/presenting in Colorado in 2014?
00:53 ivan` Lord_Nigh: how big is this site in GB?
00:54 ivan` also, do you need all of the uptobox.com files?
00:54 ivan` grabbing from a bunch of different file hosts could be annoying
00:55 ivan` Lord_Nigh: archivebot is grabbing the site itself http://archivebot.at.ninjawedding.org:4567/
01:16 Lord_Nigh Cowering: the nesdev.parodius.com forums were backed up and rehosted by tepples at nesdev.com when parodius shut down
01:16 Lord_Nigh ivan`: i'm not sure whether the uptobox files are needed or not but i'd err on the side of yes
01:18 SketchCow In general? No idea.
01:23 ivan` Lord_Nigh: you might want to get them all with jdownloader or similar
01:44 DragonDon sooo, "ArchiveTeam's Choice" chooses 'TinyBack' which fails retrieving anything. Should I be sticking with one of the other 3 options? (formspring, URLTeam, blip)?
01:45 ivan` http://pastebin.com/j05z2Bnx external links for the 69 pages on 64scener
01:47 DragonDon actually....nothing is working. Tried the others, "No item received" message for URLTeam and blip
11:08 antomatic Clearly everything has been archived already, then! :)
11:09 antomatic It feels like we should always make sure that there's at least ONE project available at all times in the warrior, even if it's only pre-emptive archiving or crawling or similar.
11:11 antomatic There IS stuff that needs to be done - e.g. the warhammer forums closing in December - but I think that project may be in need of some setup assistance?
11:36 ersi DragonDon: Don't worry, it'll start doing project work.
11:36 Nemo_bis antomatic: isn't urlteam running?
11:36 ersi Nemo_bis: TinyBack == URLTeam
11:36 Nemo_bis so?
11:37 ersi It fucks up occationally
12:34 DragonDon pccasionally?
12:35 DragonDon guess I just don't want something sitting that will 'eventually' do something...
12:35 Nemo_bis SketchCow: turns out the discs I have, alone, are almost 8 kg... let's see how much they cost
12:35 Nemo_bis DragonDon: orly? isn't that what daemons are for? :)
12:36 DragonDon warrior != daemon
12:37 DragonDon does that mean you are ok with asking your print daemon to print and then hoping it'll print 'eventually'?
13:04 ersi Well, we've had some problems with that specific project - which currently, is the only one active that you can use in the ArchiveTeam Warrior.
13:08 * joepie93 reminds people that Hyves needs urgent backing up and that no code exists for it yet, and that there is #angerthehyve for that
13:08 joepie93 (and it should probably be a warrior project because they have LOTS of data)
13:11 ersi Oh yes.
13:13 Foxboron joepie93: happy now?
13:13 joepie93 Foxboron: ohai, wrong channel
13:13 joepie93 #angerthehyve
13:13 joepie93 :P
13:13 joepie93 cc _46bit
13:14 Foxboron ohh
13:15 _46bit Hey guys
13:16 _46bit joepie93: Yup I saw, that's why I connected :)
13:18 DragonDon so then...nothing to save right now then huh? ok, will check in some other time.
13:24 joepie93 DragonDon: not -yet-
13:25 joepie93 through the warrior anyway
13:25 joepie93 in a few days at most, some code should materialize for hyves at least
13:25 ersi DragonDon: Feel free to hang around and/or come around some other time :)
13:28 godane looks like the buck sexton show is now 6 days a week
13:31 joepie93 ATTENTION PYTHON DEVELOPERS: developers needed to write pipeline code for archiving Hyves, a massive Dutch social network; please join #angerthehyve
13:31 joepie93 shutdown expected in under a month
13:34 Nemo_bis DragonDon: if you don't want to have something (the warrior) sitting there doing something (or maybe not), you can run the scripts directly
13:34 Nemo_bis the warrior is to avoid worrying
13:36 DragonDon Nemo_bis, I'm cool with letting things run in the background. While not overly versed in scripts (something I am learning more about this year) but if it's not too much work/great a learning curve, I'll be game.
13:58 ersi No need to be 'versed'. It's just "running programs" really
13:58 ersi But the whole point of the ArchiveTeam warrior is to be as little hassle as possible with regards to running projects :)
14:04 DragonDon I haven't looked but are the setup instructions in the wiki?
14:06 ersi Of what? Running the scripts standalone?
14:08 DragonDon running the scripts
14:09 ersi No, there's no general installation instruction. But they are usually available in the README file for each project's source code.
14:09 ersi They require to be run in a Linux environment though.
14:10 DragonDon oh, ok. Having never looked at any of the source files, where do I find them?
14:10 DragonDon I run Linux Mint 15 :) So we're good
14:10 ersi They're all available at https://github.com/ArchiveTeam/
14:10 antomatic A good place to look is github.com/ArchiveTeam
14:10 antomatic doh, stereo :)
14:10 ersi Usually, the ones called something with -grab are projects. There's a bunch of misc. source repositories there as well
14:11 ersi With the exception of tinyback (which is the urlteam project)
14:11 antomatic Have a look at puush-grab, for example. That project is still live and runs pre-emptively with new work units every hour.
14:12 DragonDon oh, ok, these seem pretty straightforward to setup and run.
14:12 antomatic You can run that (as a script) now, today. It sits idle when there's nothing to do, and springs into action each hour to get new stuff
14:12 DragonDon I'll look into it tomorrow after I go to bed. nearly 11:30pm here
14:12 ersi Sure thing. :)
14:14 DragonDon Question: I see "./get-wget-lua.sh" then I see "# Start downloading with: screen ~/.local/bin/run-pipeline --disable-web-server pipeline.py YOURNICKNAME" does that mean I need to edit the script with that info? run both commands consecutively?
14:15 antomatic No, you can just type it as a command
14:15 antomatic so usually something like...
14:15 antomatic cd whatever-grab
14:15 DragonDon but there are TWO things to type....
14:15 antomatic ./get-wget-lua.sh
14:15 antomatic run-pipeline --concurrent 9000 pipeline.py AntIsFantastic --disable-web-server
14:15 antomatic or screen run-pipeline --concurrent 9000 pipeline.py AntIsFantastic --disable-web-server for extra funk
14:15 DragonDon ah, the .sh is just to set it up right? got it
14:15 * antomatic nods
14:17 DragonDon ok, screw it....doing it now
14:22 BiggieJon concurrent 9000 ??!?
14:23 ersi Sounds like a bad idea :)
14:23 joepie93 that's what I tried to tell him during isoprey
14:23 joepie93 lol
14:23 BiggieJon need like 1TB ram to run that many threads
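
The command antomatic types can be assembled with a guard against joke values like 9000. This is only a sketch: the flags (`--concurrent`, `--disable-web-server`), `pipeline.py` and the nickname argument come from the lines above, while `MAX_CONCURRENT`, `build_run_pipeline_command` and `as_shell_line` are hypothetical names, and the cap of 6 is an assumed value rather than a documented limit of `run-pipeline`.

```python
import shlex

# Assumption: a small cap chosen for illustration only.
MAX_CONCURRENT = 6


def build_run_pipeline_command(nickname, concurrent=2, use_screen=False):
    """Return the argument list for starting a -grab project's pipeline."""
    if not nickname or any(ch.isspace() for ch in nickname):
        raise ValueError("nickname must be non-empty and contain no whitespace")
    if not 1 <= concurrent <= MAX_CONCURRENT:
        raise ValueError(f"concurrent must be between 1 and {MAX_CONCURRENT}")
    command = [
        "run-pipeline",
        "--concurrent", str(concurrent),
        "pipeline.py",
        nickname,
        "--disable-web-server",
    ]
    if use_screen:
        # Optional: run inside screen so the job survives a closed terminal.
        command.insert(0, "screen")
    return command


def as_shell_line(command):
    """Render the argument list as a single line that can be pasted into a shell."""
    return " ".join(shlex.quote(part) for part in command)
```

Calling `as_shell_line(build_run_pipeline_command("YOURNICKNAME"))` gives a line to run from inside the project directory, after `./get-wget-lua.sh` has finished.
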
14:26 DragonDon ok, script running now...but "No item received. Retrying after 30 seconds..." will it keep trying every 30 seconds till it finds something then switch to once an hour or the like?
14:27 BiggieJon new items are added to the tracker at the top of each hour, takes about 15-20 min to run thru the additions
14:29 ersi If you leave it running, it'll pick up work
14:30 DragonDon ok, will do
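
The behaviour described here (ask for work, wait 30 seconds when there is none, ask again) amounts to a plain polling loop. The sketch below is an illustration of that idea and not the pipeline's actual code: `wait_for_item` and `request_item` are hypothetical names, and `request_item` stands in for whatever call asks the tracker for a work item.

```python
import time


def wait_for_item(request_item, retry_delay=30, max_attempts=None, sleep=time.sleep):
    """Poll request_item() until it returns something other than None.

    request_item is a hypothetical callable that returns a work item, or
    None when the tracker has nothing to hand out. With max_attempts=None
    the loop keeps retrying for as long as the script is left running.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        item = request_item()
        attempts += 1
        if item is not None:
            return item
        print(f"No item received. Retrying after {retry_delay} seconds...")
        sleep(retry_delay)
    return None
```

The delay stays fixed at 30 seconds in this sketch, so new work is picked up shortly after items are added to the tracker each hour.
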
15:44 joepie93 hey look, it's a website shutdown hotline: https://docs.google.com/forms/d/1jAzdEfsAGNzzVQpDisNDuJHk_kjCUIXV2nCpKsPAL0I/viewform
16:02 Lord_Nigh joepie93: nice!
16:03 Lord_Nigh 64scener already got archived though (except for external links) so...
17:30 joepie93 hotline responses minus the e-mail addresses go here: https://docs.google.com/a/cryto.net/spreadsheet/ccc?key=0Aj7l5eFy3CKsdDFWRUxwMGVjTmhYc291ZXlCdk1zOWc#gid=0
19:50 Nemo_bis is there a way to use wget --continue on web.archive.org? «Note that -c only works with FTP servers and with HTTP servers that support the "Range" header.»
20:10 Nemo_bis https://archive.org/post/1003894/wayback-machine-doesnt-support-the-range-header-aka-wget-continue-doesnt-work
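
For context on the quoted wget note: resuming over HTTP means sending a `Range: bytes=N-` request header, and a server that honours it answers `206 Partial Content` with a `Content-Range` header, while a server that ignores it answers `200` with the whole file. A minimal sketch of both halves follows; the function names are hypothetical, and nothing here contacts web.archive.org.

```python
import os


def range_header(offset):
    """Return the request header asking for everything from byte `offset` on."""
    if offset < 0:
        raise ValueError("offset must not be negative")
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


def resume_headers(local_path):
    """Build resume headers from the size of a partially downloaded file."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    return range_header(offset)


def server_honoured_range(status_code, headers):
    """Check a response: 206 plus a bytes Content-Range means resume worked."""
    lowered = {name.lower(): value for name, value in headers.items()}
    content_range = lowered.get("content-range", "")
    return status_code == 206 and content_range.startswith("bytes ")
```

If `server_honoured_range` returns False for a response to a ranged request, the body is the full file and should replace the partial download instead of being appended to it.
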
21:32 _46bit Hey guys
21:33 _46bit Can I setup a server to just archive whatever happens to be going on, so I don't have to babysit it over time?
21:33 _46bit I've only helped with #isoprey before so don't know much about this.
21:38 DFJustin that's basically the warrior http://archiveteam.org/index.php?title=ArchiveTeam_Warrior
21:38 DFJustin not every project is hooked up to that though
21:40 _46bit Oh I see, the ArchiveTeam's Choice option. Thanks DFJustin. Is there a guide to setting that up outside the VM?
21:42 DFJustin there might be but I don't know where
21:45 ersi _46bit: Yes, for every project there's a README that documents how to get going. All of the projects' source code is on https://github.com/ArchiveTeam/ and the projects usually have "-grab" in their name
21:47 _46bit ersi: Yeah, I meant getting the ArchiveTeam's Choice working outside it - thanks though :-)
21:48 ersi Oh, ah. Not in a simple way, no.
21:48 ersi You could actually do that though. But that'd mean running every script that makes up the Warrior :)
21:48 ersi Should make a guide for that anyway though
21:50 _46bit ersi: Okay, thanks. I suppose I'll just go for one of the longer-term projects for now.
22:44 touya Blue Max was a good game.
22:45 BlueMax >___>
22:46 touya i think it was about the first games i ever played. that and decathlon. and this probably belongs into -bs.
22:46 * ersi nods and pats touya
