#archiveteam 2012-10-17,Wed

Time Nickname Message
03:13 dragondon Greetings all. Getting a curl error "curl: (22) The requested URL returned error: 500". whole error here http://pastebin.com/scMx6bkf
03:15 dragondon seeing a bunch of those but it seems that eventually the upload does happen.
03:16 underscor dragondon: one (or more) of the upload endpoints is full
03:16 underscor I'll be fixing it asap
03:16 underscor Uploads should still work eventualyl
03:16 dragondon ok, cool.
03:16 underscor as they're roundrobined between boxes in the cluster
03:16 dragondon yeah, that's what I did see from others.
03:16 underscor eventually*
03:18 nintendud bah. VirtualBox won't start up my warrior VM, or a newly downloaded warrior VM.
03:20 Cameron_D What error>
03:21 nintendud NS_ERROR_FAILURE
03:21 nintendud "VBoxManage: error: The virtual machine 'archiveteam-warrior-2_1' has terminated unexpectedly during startup with exit code 0"
03:21 flaushy can i play around with the timeouts?
03:22 nintendud Doesn't seem to be particularly obvious. I guess this error covers a wide range of possible issues.
03:22 Sue nintendud: are you trying to run without X
03:22 nintendud Sue: yup
03:22 nintendud The warrior worked before.
03:22 Sue VBoxHeadless
03:22 nintendud Oh. Crpa.
03:23 nintendud Crap*
03:23 Sue VBoxManage dies without X unless you specify headless or start with VBoxHeadless
03:23 nintendud I forgot that was the command.
03:23 Sue don't forget to start with &
03:23 Sue i had that same problem at first
03:24 nintendud yeah, I have it running in screen
03:25 underscor Why not just run the pipeline outside?
03:25 nintendud Sue: thanks for the help. herp derp on my end.
03:26 Sue underscor: fun; nintendud: np
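[Note: the headless-start advice above can be sketched in Python. This is a minimal sketch, not something posted in the channel. It assumes a POSIX host with `VBoxHeadless` on the PATH; the function names are hypothetical, and the VM name is the one quoted in the error message above. Running the child in a new session is a rough stand-in for starting it with `&` and then disowning it.]

```python
import subprocess


def headless_command(vm_name):
    """Build the command line that starts a VirtualBox VM without a GUI."""
    return ["VBoxHeadless", "--startvm", vm_name]


def start_headless(vm_name):
    """Launch the VM in the background and return the Popen handle.

    start_new_session=True puts the child in its own session (POSIX only),
    so it is not tied to the terminal that launched it.
    """
    return subprocess.Popen(
        headless_command(vm_name),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
```

[Usage would look like `start_headless("archiveteam-warrior-2_1")`; nothing is launched until that call is made.]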
05:09 dragondon Umm, "No item received. Retrying after 30 seconds..." and "Retrying CurlUpload for Item gourmetsexpress after 30 seconds..." are all I am getting now
05:10 dragondon 4 workers are getting "No item" and two are "retrying"
05:13 dragondon restarted VM, now all are 'retrying"
05:16 dragondon that was for BT Internet homepages
05:16 dragondon switched back to Webshots and downloading data now just fine
05:33 NovaKing apparently some servers full
05:33 NovaKing and they working on it
07:37 SmileyG [04:23:22] < Sue> don't forget to start with &
07:37 SmileyG if you forget, ctrl z, then bg, then disown
08:44 Nemo_bis anyone able to kill http://www.us.archive.org/log_show.php?task_id=124346847 here?
08:47 alard There's a timeout 87457739 in there.
08:47 alard (Is quite a long time, probably.)
08:53 Cameron_D That is only 1012 days
08:55 SmileyG lol
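[Note: the "1012 days" figure above can be checked with a few lines of Python. This is a small sketch that assumes the `timeout` argument is a number of seconds; the helper name `describe_seconds` is hypothetical.]

```python
from datetime import timedelta

TIMEOUT_SECONDS = 87457739  # value passed to `timeout` in the task log


def describe_seconds(seconds):
    """Format a duration in seconds as 'D days, HH:MM:SS'."""
    delta = timedelta(seconds=seconds)
    hours, remainder = divmod(delta.seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{delta.days} days, {hours:02d}:{minutes:02d}:{secs:02d}"


print(describe_seconds(TIMEOUT_SECONDS))  # prints "1012 days, 05:48:59"
```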
09:27 Nemo_bis it wasn't enough last time
09:27 Nemo_bis [ PDT: 2012-10-16 16:16:59 ] Executing: timeout 87457739 python /usr/local/petabox/sw/books/ol_search/solr_post.py 'EB1911WMF' '/var/tmp/autoclean/derive-EB1911WMF-AbbyyXML/EB1911_abbyy.xml'
09:28 Nemo_bis sorry [ PDT: 2012-09-28 04:45:24 ] Executing: timeout 87457739 python /usr/local/petabox/sw/books/ol_search/solr_post.py 'EB1911WMF' '/var/tmp/autoclean/derive-EB1911WMF-AbbyyXML/EB1911_abbyy.xml'
09:29 Nemo_bis [...] nice /usr/local/petabox/deriver/derive.php /var/tmp/autoclean/derive/EB1911WMF [...] failed with exit code: 9 [...] TASK FAILED AT UTC: 2012-10-02 18:11:37
09:30 Nemo_bis underscor?
09:32 underscor Nemo_bis: Why do you need it killed?
09:38 Nemo_bis underscor: because it will fail surely
09:39 Nemo_bis and I want to update the images split in volumes now, so that it will work
09:39 Nemo_bis *upload
09:41 underscor ah
09:41 underscor okay
09:41 underscor well, I'll kill it :P
09:41 Nemo_bis underscor: thanks
09:42 underscor Interrupting task for task_id: 124346847 1 derive.php SERVER iw600709.us.archive.org USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 5646 82.8 1.1 750384 414324 ? RN Oct16 517:23 python /usr/local/petabox/sw/books/ol_search/solr_post.py EB1911WMF /var/tmp/autoclean/derive-EB1911WMF-AbbyyXML/EB1911_abbyy.xml KILLING 5646
09:42 underscor cc Nemo_bis
09:47 Nemo_bis underscor: thanks
09:49 Cameron_D ?tf2
09:49 Cameron_D oops, wrong chnnel
09:52 Nemo_bis hmpf only 650 kB/s upload even from a USA server
10:55 godane looks like theblaze tv did some bad audio sync for episode 2012-10-15
10:57 godane what is funny is that the real news and wilkow on-demand works fine
11:10 godane i found a way to sync it
11:10 godane its off by 1.5 seconds
11:48 SmileyG ffmpeg will resync stuff...
12:36 Cameron_D Hmm, IGN is up for sale and there is some talk of them possibly closing the boards (85.8 million posts), so that might be something to keep an eye on
12:38 balrog_ :O
12:39 balrog_ what's a good setting for number of instances when you have 50mbit/25mbit internet speeds?
12:44 Cameron_D I'm assuming that is for Webshots, in which case I'm not really sure, I haven't really been able to see how much bandwidth it uses
12:46 alard It also depends on the distance to the webshots servers.
12:46 joepie91 1mbit per thread generally
12:46 balrog_ yeah, for webshots
12:46 joepie91 appro
12:47 joepie91 aprox *
12:47 joepie91 ..
12:47 joepie91 approx ***
12:47 joepie91 at least, in my experience
12:49 balrog_ alard: also how do you stop it again? STOP file in the same dir as the pipeline.py?
12:49 balrog_ because it seems to be ignoring it
12:49 alard It finishes the current jobs first.
12:49 balrog_ yeah, but it seems to keep doing jobs
12:49 alard Can you use the web interface?
12:49 balrog_ what port does that run on?
12:49 alard 8001
12:52 Cameron_D And I think the STOP file needs to be in the directory you launced it from, not the pipeline directory
12:52 balrog_ ahh.
12:53 balrog_ that ... sounds like a possible bug :P
12:56 alard That could be.
12:56 alard curl -d "" http://127.0.0.1:8001/api/stop works too.
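[Note: the curl call above sends an empty POST to the local web interface on port 8001. A rough Python equivalent using only the standard library might look like the sketch below. The URL is the one alard gives; the function names, the timeout value and the assumption that the endpoint answers with an ordinary HTTP status are illustrative and not confirmed in the channel.]

```python
import urllib.request


def build_stop_request(host="127.0.0.1", port=8001):
    """Build an empty-body POST to the stop endpoint, like `curl -d ""`."""
    url = f"http://{host}:{port}/api/stop"
    # Passing data (even empty bytes) makes urllib use POST instead of GET.
    return urllib.request.Request(url, data=b"")


def request_stop(host="127.0.0.1", port=8001, timeout=10):
    """Send the stop request and return the HTTP status code."""
    request = build_stop_request(host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

[As alard says above, current jobs are finished first, so the process is not expected to exit immediately after `request_stop()` returns.]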
12:56 _case if anyone feels like pondering a wget question re: hotlinked page requisites… http://stackoverflow.com/questions/12934528/recursive-wget-with-hotlinked-requisites
12:58 alard Have you tried --no-parent? (Just a guess.)
13:03 balrog_ yeah, and now I'm killing disk IO
13:23 dragondon is it safe to go back working on the BT project yet?
13:27 Cameron_D Is there anything left to do?
13:40 alard No, BT is done, that is: we need more usernames.
13:41 dragondon ok, stopped webshots projct, started BT :)
13:42 alard dragondon: There's nothing to do there. :)
13:42 dragondon huh? all done?
13:42 alard We've worked through our list of usernames.
13:43 alard There might be users that are not on our list, but we'll have to discover those usernames first. That isn't done by the warrior.
13:46 dragondon oh, that's what you meant. I thought you meant you needed more usernames completed.
14:00 SmileyG no work for happy workers :(
18:39 alard Webshots numbers: S[h]O[r]T has uploaded 100,000 items; we've uploaded 20,000 GB. Hurray!
18:51 [1]deathy I would give S[h]O[r]T the internet as a prize, but apparently he can download it by himself ...
18:59 joepie91 haha
21:03 balrog_ ugh my warrior vm crashed
22:18 SketchCow We need more usernames. It can't be so many.
22:25 arkhive http://news.cnet.com/8301-1023_3-57533820-93/news-corp-puts-ign-entertainment-up-for-auction/
22:25 arkhive Probably been linked here.
22:26 arkhive The new IGN network will probably shutter/close multiple sites
22:28 arkhive ah. just read above.
22:37 SketchCow https://docs.google.com/a/textfiles.com/spreadsheet/ccc?key=0ApQeH7pQrcBWdDZIUEVjR3d1UmRoU0lPSWZYX0Q1Ync#gid=0
22:37 SketchCow Watch as I do final signoff!
22:37 SketchCow Anything with the deep blue on the left is going into wayback!
22:38 joepie91 SketchCow: what is a MegaWARC?
22:39 SketchCow A MegaWARC is a concatenation of warc files, allowing us to put thousands of individual warc grabs as one file.
22:39 joepie91 I see
22:45 arkhive SketchCow: I still have MobileMe files that have not been uploaded yet. Problem with my hard drive. I cloned it to another and will finish recovering the files as soon as I can.
22:47 arkhive Just letting you know so you don't put up an incomplete copy on WayBack
22:49 arkhive I should be able to recover just about all of the files
22:50 arkhive But a few might be impossible to retrieve.
22:52 arkhive So I apologize in advance for my screw up. :)
22:59 DFJustin https://archive.org/details/archiveteam-qaudio-archive-1 etc. not going in?
23:03 SketchCow Sure it is.
23:04 SketchCow I'm sure some stuff has escaped my gaze, hence my asking people to look over my shoulder at the google doc.
23:04 DFJustin also 2-7
23:07 SketchCow Right.
23:07 SketchCow No, on it.
23:07 SketchCow They're all fine, though, they already were working.
23:07 SketchCow Now I'm just bundling them.
23:08 SketchCow http://archive.org/details/archiveteam-qaudio-archive will have it soon.
23:15 SketchCow http://archive.org/details/archiveteam-qaudio-archive now fixed.
23:55 godane thanks for putting my isos in the linux format collection
