#warrior 2017-01-14,Sat


Time Nickname Message
05:15 🔗 zerkalo has quit IRC (Ping timeout: 260 seconds)
05:38 🔗 zerkalo has joined #warrior
06:41 🔗 Honno has joined #warrior
06:58 🔗 zerkalo has quit IRC (Read error: Operation timed out)
06:59 🔗 zerkalo has joined #warrior
10:06 🔗 zhongfu has quit IRC (Ping timeout: 260 seconds)
10:19 🔗 zhongfu has joined #warrior
19:22 🔗 scraping_ has joined #warrior
19:23 🔗 scraping_ Hello - Are there any good examples of using Scrapy to build a warrior project?
19:23 🔗 scraping_ Or, alternatively: Any suggestions how I might contribute to add Scrapy support to Warrior?
19:24 🔗 scraping_ ArchiveTeam's an amazing project, but I think y'all would get a lot more coder support if you could do Scrapy projects because of the ecosystem
19:28 🔗 arkiver we're not using scrapy
19:28 🔗 arkiver and I'm not planning on doing anything with scrapy
19:29 🔗 arkiver we're doing fine on warrior projects
19:29 🔗 arkiver well, we're not using scrapy for warrior projects, not sure about other projects
19:30 🔗 scraping_ I've seen projects on your GH that don't look like they're seesaw though
19:31 🔗 scraping_ but they say they're "Warrior" compatible. So warrior must be pluggable? Or am I mistaken there?
20:19 🔗 arkiver is now known as arkiver2
21:12 🔗 Kaz scraping_: got a link to one of those?
21:14 🔗 SmileyG has quit IRC (Read error: Connection reset by peer)
21:19 🔗 Smiley has joined #warrior
21:36 🔗 scraping_ kaz: https://github.com/ArchiveTeam/tinyback looks like it's pure python, no lua involved. And it specifies it works on Warrior. My initial impression was that lua was needed for seesaw
21:38 🔗 Kaz that uses seesaw
21:38 🔗 Kaz lua is usually used with wget-lua, for the actual grabbing (afaik), seesaw is irrelevant to that
21:38 🔗 Kaz that project is also 3+ years old, things have changed a bit since then
21:39 🔗 scraping_ Ah, ok. Just wondering: why Seesaw? It seems to be a custom thing just for ArchiveTeam, but that sort of opts out of all the scraping ecosystems out there, doesn't it?
21:40 🔗 Kaz seesaw isn't just the scraping, it's the checking into tracker, getting jobs, keeping on top of the status of those jobs, pushing the final output to a specified location etc
21:41 🔗 Kaz Created well before I heard about AT, i'd assume mainly because making something custom purely for AT purposes was better than working through various other tools and bolting bits together
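[The job lifecycle Kaz describes above — check in with the tracker, claim work, grab it, push the output somewhere, report back — can be sketched as follows. This is a hypothetical, simplified model for illustration only, not the real seesaw-kit API, which is built around Pipeline and Task classes and shells out to external tools such as wget-lua and rsync.]

```python
def claim_item(tracker_queue):
    """Check in with the (simulated) tracker and claim one work item."""
    return tracker_queue.pop(0) if tracker_queue else None

def grab(item):
    """Stand-in for the actual fetch step (wget-lua in real projects)."""
    return f"warc-for-{item}"

def run_pipeline(tracker_queue, upload_target):
    """Claim items, grab each one, push the result, and report completion."""
    completed = []
    while (item := claim_item(tracker_queue)) is not None:
        warc = grab(item)
        upload_target.append(warc)  # push the final output to a specified location
        completed.append(item)      # report the item back to the tracker as done
    return completed

uploads = []
done = run_pipeline(["item-1", "item-2"], uploads)
```

[The point of the sketch is that the scraping step is only one stage; the tracker coordination and upload stages are what seesaw adds over a plain scraper like Scrapy.]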
22:34 🔗 Honno has quit IRC (Ping timeout: 370 seconds)
23:25 🔗 chfoo seesaw used to be a bunch of bash scripts that downloaded and uploaded. it evolved from there.
23:28 🔗 chfoo it works well for most projects but using it isn't strictly mandatory. ie, urlteam doesn't fit in so it calls its own code for managing the process
23:43 🔗 arkiver2 is now known as arkiver
