#archiveteam 2014-10-27,Mon


Time Nickname Message
14:38 🔗 bebzol arkiver - if you are reading this, could you please check my ownlog-grab repository? It's rewritten and all scripts seem to work fine
16:01 🔗 arkiver bebzol: sure!
16:03 🔗 bebzol thanks :)
16:04 🔗 bebzol I'm not on IRC all the time so please, write me an email: bebum@o2.pl :). Thanks in advance
16:04 🔗 arkiver looks fine to me
16:04 🔗 arkiver are you sure there are no URLs that need to be scraped, which are not automatically scraped by wget?
16:05 🔗 arkiver did you check the files that are downloaded when you load a page to check for important pages that may not be scraped by wget?
16:08 🔗 bebzol yeah, I confirmed it with debugging mode in Firefox - all website dependencies like CSS/images are handled by the Lua script
16:10 🔗 arkiver ok, looks great then!
16:11 🔗 arkiver so this https://github.com/ArchiveTeam/ownlog-grab/blob/master/pipeline.py#L176 will download the whole website
16:11 🔗 bebzol cool :)
16:12 🔗 bebzol yes
16:12 🔗 bebzol I must go - I'll catch you later :)
16:13 🔗 arkiver ok
16:13 🔗 bebzol those blogs are really simple in structure
16:13 🔗 arkiver bebzol
16:13 🔗 arkiver I'll make a pull request later tonight
16:37 🔗 arkiver bebzol: made a pull request
19:40 🔗 bebzol arkiver, are you here?
20:30 🔗 arkiver bebzol: yes
20:32 🔗 bebzol thanks for the changes in the code :). Actually, the --domain parameter should only be set for one particular subdomain, like xyz.ownlog.com, so wget doesn't grab the main ownlog.com site and catalogues
20:33 🔗 bebzol there are no external dependencies outside the subdomain
20:33 🔗 arkiver --domain for ownlog.com will also grab xyz.ownlog.com
20:33 🔗 arkiver please take a look at the changes in the lua script
20:33 🔗 arkiver it makes sure nothing other than *item_value*.ownlog.com is downloaded
20:34 🔗 arkiver that is the easiest way to do those tings
20:34 🔗 arkiver things*
20:34 🔗 arkiver and you can easily edit it and add other kinds of domains to it
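
The host filtering arkiver describes can be illustrated with a minimal sketch. It is written in Python rather than Lua and is not the code from the ownlog-grab repository. The function name `is_allowed` and the `base_domain` parameter are hypothetical. The sketch assumes, as stated in the conversation, that wget's domain restriction for ownlog.com also admits every subdomain, so a stricter per-item check is needed to keep a job on `<item_value>.ownlog.com` only.

```python
from urllib.parse import urlsplit


def is_allowed(url: str, item_value: str, base_domain: str = "ownlog.com") -> bool:
    """Return True only if the URL's host is exactly <item_value>.<base_domain>.

    Hypothetical helper for illustration; the real project does this
    check inside its Lua script.
    """
    host = urlsplit(url).hostname  # lowercased by urllib, None for relative URLs
    if host is None:
        return False
    # Exact comparison, so the main site and other users' blogs are rejected.
    return host == f"{item_value}.{base_domain}".lower()


if __name__ == "__main__":
    for candidate in (
        "http://xyz.ownlog.com/post/1",  # the blog being archived
        "http://ownlog.com/",            # main site, should be skipped
        "http://abc.ownlog.com/",        # someone else's blog, should be skipped
    ):
        print(candidate, is_allowed(candidate, "xyz"))
```

Comparing the full host name instead of a suffix is what keeps the main ownlog.com site and its catalogues out of the grab. Other allowed hosts could be added by extending the comparison to a set of names.
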
20:35 🔗 bebzol all right, now I see it
20:35 🔗 bebzol tomorrow I'll run tests and I think we can go to production with it
20:35 🔗 arkiver yep
20:35 🔗 arkiver good luck!
20:35 🔗 bebzol ok, thanks :)
20:36 🔗 bebzol I'll let you know tomorrow
20:39 🔗 arkiver I'll try to take a better look at the websites and make a second pull request if needed
20:51 🔗 DFJustin https://twitter.com/DigitCurator/status/526810006629654528
22:49 🔗 SketchCow https://github.com/mamedev/mame/commit/f22371f389f3abac59a40028dc488406ec27f670
22:49 🔗 SketchCow Sorry, awesome mispaste
23:51 🔗 ArloJames This must get annoying after a while (why not have a separate channel?) but
23:51 🔗 ArloJames WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
23:52 🔗 chronomex yahoosucks
23:52 🔗 ArloJames Thanks.
23:53 🔗 chronomex ArloJames: if you're going to work on the wiki, please hang around in irc for a while
23:54 🔗 ArloJames I had not planned to do any editing yet, just create an account. I will lurk moar, fear not.
23:58 🔗 chronomex ok, cool
23:58 🔗 chronomex good to meet you then
23:58 🔗 chronomex or something i guess
