05:15 *** zerkalo has quit IRC (Ping timeout: 260 seconds)
05:38 *** zerkalo has joined #warrior
06:41 *** Honno has joined #warrior
06:58 *** zerkalo has quit IRC (Read error: Operation timed out)
06:59 *** zerkalo has joined #warrior
10:06 *** zhongfu has quit IRC (Ping timeout: 260 seconds)
10:19 *** zhongfu has joined #warrior
19:22 *** scraping_ has joined #warrior
19:23 <scraping_> Hello - Are there any good examples of using Scrapy to build a warrior project?
19:23 <scraping_> Or, alternatively: any suggestions how I might contribute to add Scrapy support to Warrior?
19:24 <scraping_> ArchiveTeam's an amazing project, but I think y'all would get a lot more coder support if you could do Scrapy projects, because of the ecosystem
19:28 <arkiver> we're not using scrapy
19:28 <arkiver> and I'm not planning on doing anything with scrapy
19:29 <arkiver> we're doing fine on warrior projects
19:29 <arkiver> well, we're not using scrapy for warrior projects, not sure about other projects
19:30 <scraping_> I've seen projects on your GH that don't look like they're seesaw though
19:31 <scraping_> but they say they're "Warrior" compatible. So warrior must be pluggable? Or am I mistaken there?
20:19 *** arkiver is now known as arkiver2
21:12 <Kaz> scraping_: got a link to one of those?
21:14 *** SmileyG has quit IRC (Read error: Connection reset by peer)
21:19 *** Smiley has joined #warrior
21:36 <scraping_> kaz: https://github.com/ArchiveTeam/tinyback looks like it's pure python, no lua involved. And it specifies it works on Warrior. My initial impression was that lua was needed for seesaw
21:38 <Kaz> that uses seesaw
21:38 <Kaz> lua is usually used with wget-lua, for the actual grabbing (afaik), seesaw is irrelevant to that
21:38 <Kaz> that project is also 3+ years old, things have changed a bit since then
21:39 <scraping_> Ah, ok. Just wondering: why Seesaw? It seems to be a custom thing just for ArchiveTeam, but that sort of opts out of all the scraping ecosystems out there, doesn't it?
21:40 <Kaz> seesaw isn't just the scraping, it's the checking into the tracker, getting jobs, keeping on top of the status of those jobs, pushing the final output to a specified location, etc
21:41 <Kaz> Created well before I heard about AT, I'd assume mainly because making something custom purely for AT purposes was better than working through various other tools and bolting bits together
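[Editor's note: the loop Kaz describes (check into the tracker, claim a job, run the grab, upload the output, report done) can be sketched roughly as below. This is a plain-Python illustration only, not the real seesaw-kit API; the tracker URL and the function names are hypothetical stand-ins.]

```python
# Minimal sketch of the tracker work loop described above.
# NOT the actual seesaw-kit API: request_item, run_job,
# upload_result and the tracker URL are hypothetical.
import time

TRACKER = "https://tracker.example.org/example-project"  # hypothetical

def request_item(tracker):
    """Check in with the tracker and claim a work item (stub)."""
    return {"id": "item-001", "url": "http://example.com/page"}

def run_job(item):
    """Do the actual grabbing (the part a tool like wget-lua handles)."""
    return b"...archived bytes for %s..." % item["url"].encode()

def upload_result(tracker, item, data):
    """Push the finished output to the location the tracker specified."""
    return True

def work_loop(max_items=1):
    done = []
    for _ in range(max_items):
        item = request_item(TRACKER)            # 1. get a job from the tracker
        data = run_job(item)                    # 2. run the grab
        if upload_result(TRACKER, item, data):  # 3. push the output
            done.append(item["id"])             # 4. record it as done
        time.sleep(0)  # real pipelines rate-limit between items
    return done

print(work_loop())  # -> ['item-001']
```

The point of the sketch is the division of labour: the pipeline framework owns steps 1, 3 and 4 (tracker coordination and upload), while step 2 is project-specific and can be any grabbing tool.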
22:34 *** Honno has quit IRC (Ping timeout: 370 seconds)
23:25 <chfoo> seesaw used to be a bunch of bash scripts that downloaded and uploaded. it evolved from there.
23:28 <chfoo> it works well for most projects, but using it isn't strictly mandatory. i.e., urlteam doesn't fit in, so it calls its own code for managing the process
23:43 *** arkiver2 is now known as arkiver