[00:29] The Newsgrabber warrior project doesn't seem to do anything. I am using the latest VirtualBox appliance (archiveteam-warrior-v3-20171013.ova) that was provided.
[00:31] Already answered in #newsgrabber, but for completeness: Newsgrabber in the warrior is unsupported.
[02:35] *** human4565 has joined #warrior
[06:37] *** zera has joined #warrior
[06:41] hey not sure if this is the right place here, but I've been trying to get tumblr-grab running, but it seems like the aur wget-lua is failing during the build process
[06:57] well guess I'll just use the warrior dockerfile
[06:58] *** zera has quit IRC (Quit: Page closed)
[09:20] I can't get the tumblr script to run on a new ubuntu 16.04.5 install. I get bash: run-pipeline: command not found?
[09:20] I tried pip2
[09:56] Did you try pip install --upgrade seesaw?
[09:56] usually that works for me
[09:59] I did, I'm installing 18.04 to see what happens
[12:12] *** Silvan has quit IRC (Read error: Operation timed out)
[12:19] *** SilSte has joined #warrior
[12:26] *** nertzy has joined #warrior
[12:59] *** nertzy has quit IRC (Quit: This computer has gone to sleep)
[16:25] what is the proper procedure to ascertain which tasks have been aborted through warrior shutdown and reporting them?
[16:27] human4565: You can ask someone to requeue all items associated with either your username or your IP address. Won't work if you run multiple instances on the same IP obviously.
[16:27] JAA: I assume that will only requeue items which were not finished?
[16:27] But generally, don't worry about it; all items get requeued at the end anyway normally, assuming there's still time to do so.
[16:27] Correct.
[16:28] I guess that leaves one concern. Is there any priority in queuing, such that more important items might be assigned early on?
[16:28] The internal term for this is a "claim": a pipeline claims an item and later reports it as done.
If that doesn't happen, the item remains claimed under your username, and the "requeueing" actually just means releasing those claims.
[16:29] The same thing happens when an item fails, by the way.
[16:29] That item would just remain claimed and get released eventually.
[16:30] Whether items are queued by priority depends strongly on the project. If I were to run such a project, I'd make sure to requeue all unresolved claims for items of the highest priority before adding a batch of less-urgent items.
[16:30] ok, I'm working on tumblr which seems it could be a bit rushed
[16:31] Right. Not sure how the item list was prepared there exactly.
[16:31] ok
[16:56] is it possible to open a gui when a virtualbox warrior was started headless
[17:26] *** Atom__ has quit IRC (Read error: Operation timed out)
[17:27] *** Atom__ has joined #warrior
[17:34] I got this error on one warrior job but the others seem to be going ok:
[17:34] rsync error: error in socket IO (code 10) at clientserver.c(128) [sender=3.1.1]
[17:34] Is there a way to recover this or am I going to have to abandon it?
[18:30] *** vectr0n has left
[19:31] *** pukkie has joined #warrior
[20:08] *** human4565 has quit IRC (Quit: Leaving)
[20:43] *** pukkie has left
[22:22] *** tuluu has joined #warrior
[22:22] *** tuluu has quit IRC (Remote host closed the connection)
[22:23] *** tuluu has joined #warrior
[22:59] *** twoTBHetz has joined #warrior
[23:02] Hi i just setup my first warrior of #tumbrl (in a headless virtual box). I gave it 2 GB of RAM, did not change the 60GB disk size and 1 CPU. My server is still idling around at 0.12 load with 4 cores (2 hyper threads). Are there some ways how i can get it more busy?
[23:03] *** elomatreb has joined #warrior
[23:05] Also the "data transfer graph" in the web interface is flat 0, but the logs suggest it is working
[23:12] twoTBHetz: If you want to get the most out of your hardware, don't use the warrior but run the scripts directly.
[23:12] And yeah, I think someone said that the transfer graph is broken.
[23:12] i am under solaris ... i doubt they run directly
[23:15] where can i get a tarball of the scripts and read about the dependencies?
[23:16] https://github.com/archiveteam/tumblr-grab#running-without-a-warrior
[23:18] great so i just have to build pip ;)
[23:19] Might be easier to set up a Debian VM and do it there. :-)
[23:21] how is a debian vm faster than warrior?
[23:22] Warrior is limited in terms of concurrency. If you have your own VM, you can do it all manually (and also get rid of the web dashboard).
[23:24] you mean like being limited to 6 parallel things?
[23:24] Yeah
[23:24] You can run at a higher concurrency and multiple pipelines in parallel.
[23:25] I see.
[23:29] preferred pip version?
[23:29] ¯\_(ツ)_/¯
[23:29] Shouldn't matter as long as it can install packages.
[23:30] As for Python 2/3, I think the Tumblr project should work with both. I'm using 3.4 on one machine, FWIW.
[23:31] I'm using the warrior to work on the tumblr project, and one of the items it picked up is running for 7 hours and 130k URLs by now, is that to be expected or did something break?
[23:32] No, that's expected. Some of those blogs are huge.
[23:32] Anything specifically about the Tumblr project in #tumbledown please.
[23:32] Oh, ok, sorry
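
Note on the claim mechanism discussed at [16:27] to [16:30]: the exchange describes items being claimed by a pipeline, reported as done, and otherwise left claimed until someone releases the claims ("requeueing"). The sketch below is only a toy in-memory model of that idea, not the real tracker code. The class name ToyTracker, its method names, and the rule that a lower priority number means more urgent are all assumptions made for illustration.

```python
import heapq


class ToyTracker:
    """Hypothetical in-memory model of claim / done / release (not the real tracker)."""

    def __init__(self):
        self.todo = []    # heap of (priority, item); lower number = more urgent (assumption)
        self.claims = {}  # item -> (username, priority)
        self.done = set()

    def add_item(self, item, priority=0):
        heapq.heappush(self.todo, (priority, item))

    def claim(self, username):
        """Hand out the most urgent unclaimed item, or None if the queue is empty."""
        if not self.todo:
            return None
        priority, item = heapq.heappop(self.todo)
        self.claims[item] = (username, priority)
        return item

    def mark_done(self, item):
        """A pipeline reports a claimed item as finished."""
        del self.claims[item]
        self.done.add(item)

    def release_claims(self, username):
        """'Requeue': put every item still claimed by username back in the queue."""
        released = [item for item, (user, _) in self.claims.items() if user == username]
        for item in released:
            _, priority = self.claims.pop(item)
            heapq.heappush(self.todo, (priority, item))
        return released


tracker = ToyTracker()
tracker.add_item("blog-b", priority=1)
tracker.add_item("blog-a", priority=0)
first = tracker.claim("alice")    # most urgent item goes out first
second = tracker.claim("alice")
tracker.mark_done(first)          # finished normally
requeued = tracker.release_claims("alice")  # the unfinished item returns to the queue
```

In this model an aborted or failed item is never lost: it stays in claims until release_claims is called for that username, and finished items are not requeued, which matches the "Correct." at [16:27].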