#archiveteam 2011-07-04,Mon


Time Nickname Message
05:24 🔗 SketchCow Alard gets/win 4
13:40 🔗 * db48x yawns
16:46 🔗 db48x bah
16:46 🔗 db48x servers that don't send a Last-Modified header are really annoying
16:54 🔗 db48x alard: --warc-file and timestamping don't mix very well
16:55 🔗 db48x alard: any file not downloaded because the Last-Modified header indicates that it hasn't changed doesn't get added to the warc file
16:55 🔗 perfinion do any modern servers not send that?
16:56 🔗 db48x yes
16:56 🔗 db48x many
16:56 🔗 alard Yes, that's one of the problems that should be looked at.
16:56 🔗 db48x perfinion: especially for pages that are actually served via cgi (or similar) interfaces, since the web server doesn't know if it's been modified or not
16:57 🔗 perfinion oh, right fair enough
16:57 🔗 perfinion hmm
16:57 🔗 db48x alard: on the other hand, if you're reusing the warc file then any files that got changed will simply be appended to the file and the ones that weren't changed will still be in there
16:57 🔗 perfinion are you supposed to? how do you if you want to?
16:57 🔗 db48x is who supposed to what?
16:57 🔗 perfinion send last modified for cgi?
16:58 🔗 db48x ah
16:58 🔗 db48x it's not like HTTP requires it
16:58 🔗 perfinion well yeah
16:58 🔗 alard db48x: warc has a special record type for that, the 'revisit' record. But that's hard, because you need to add a reference to the previous response.
16:58 🔗 perfinion its just nice for caching
16:58 🔗 db48x perfinion: right
16:59 🔗 alard So maybe it's better to just disable timestamping, --continue, --no-clobber, et cetera.
16:59 🔗 perfinion although the main point of setting up a squid on a network is to cache images and other big things which are not cgi
16:59 🔗 db48x perfinion: so to do it right, the cgi script (or whatever) has to check the request's Last-Modified header and id and see if there is any newer data to display
17:00 🔗 perfinion so just handle it inside the cgi script?
17:00 🔗 perfinion instead of regenning the page
17:00 🔗 db48x yea
17:00 🔗 db48x but that requires the author to specifically implement it
17:00 🔗 db48x either for every "page" individually, or in such a way that it's accurate across all "pages"
17:01 🔗 db48x one is tedious and the other is boring
17:01 🔗 db48x and if you get it wrong, your users complain that the changes they make aren't showing up
17:01 🔗 db48x sorry, tedious and tricky, not tedious and boring :)
17:03 🔗 perfinion ah
17:03 🔗 perfinion yeah
17:03 🔗 perfinion its kinda not really worth it if they are small pages anyway
17:04 🔗 db48x yea
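
A minimal sketch of the check db48x describes, written as a plain Python function rather than a real CGI script. In HTTP the client sends the date back in the If-Modified-Since request header, and the server answers 304 Not Modified when nothing is newer. The function name `conditional_response` and the `last_changed` argument are hypothetical; how a script learns when its data last changed is the "tedious and tricky" part and is not shown here.

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def conditional_response(request_headers, last_changed):
    """Return (status, headers) for a page whose data last changed at
    `last_changed`, a timezone-aware datetime (hypothetical input)."""
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_changed = last_changed.replace(microsecond=0)
    headers = {
        "Last-Modified": formatdate(last_changed.timestamp(), usegmt=True)
    }

    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # unparseable date: just send the full page
        # Only compare when the parsed date carries a timezone.
        if (since_dt is not None and since_dt.tzinfo is not None
                and last_changed <= since_dt):
            return 304, headers  # client copy is current, send no body

    return 200, headers  # regenerate and send the page as usual
```

A crawler using timestamping would get the 304 and skip the download, which is exactly the case where nothing would be added to the warc file.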
19:05 🔗 Spirit_ SketchCow: dunno if you know http://www.pagetable.com/ , nice scans of c64 stuff
20:13 🔗 db48x alard: I guess --warc-file should just turn off timestamping
20:14 🔗 db48x it'd be nice if we could avoid writing to the disk files that haven't actually changed, though
20:14 🔗 alard Yeah, but then we'd need an index of all previous warc files.
20:15 🔗 alard Maybe something for a future version?
20:15 🔗 db48x it's not exactly related to warc files, but if wget could download the content, do --convert-links if requested, hash the result and then compare that to the hash of the file on disk
20:15 🔗 db48x that would save lots of space on filesystems that do COW
20:15 🔗 db48x and still let you have a complete warc of the site
20:16 🔗 db48x but I must sleep
20:18 🔗 alard Good night then. I'll have a look at the number of temporary files. I've just seen that libwarc writes every record to a temporary file, so it might be possible to just give it the handle of wget-warc's temporary file.
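
The hash-and-compare idea from 20:15 can be sketched in a few lines of Python. This is not how wget works; it only illustrates the technique, assuming the downloaded (and possibly link-converted) content is already in memory. The names `file_sha256` and `write_if_changed` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=65536):
    """Hash a file on disk in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_if_changed(path, new_content):
    """Write `new_content` (bytes) to `path` only if it differs from what
    is already there. Returns True if the file was written."""
    path = Path(path)
    if path.is_file():
        new_hash = hashlib.sha256(new_content).hexdigest()
        if file_sha256(path) == new_hash:
            return False  # identical content: leave the existing blocks alone
    path.write_bytes(new_content)
    return True
```

Skipping the write keeps the unchanged file's blocks shared on a copy-on-write filesystem, while the fetched response can still go into the warc file.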
20:59 🔗 bsmith093 are you all worldwide, cause its only 5pm where i am
21:01 🔗 chronomex your clock is 3 hours fast
21:03 🔗 bsmith093 im ny usa
21:03 🔗 bsmith093 eastern time
21:06 🔗 chronomex as I said...
21:42 🔗 underscor Wassup guys?
21:42 🔗 underscor Have I missed much in my absence?
21:43 🔗 marceloan My HD died, and the other too.
21:44 🔗 underscor Oh man, that sucks :(
21:44 🔗 chronomex I'm kind of preparing a torrent of symbian.org
21:44 🔗 marceloan I lost my part of Twaud.io archive
21:44 🔗 marceloan Heeey.
21:45 🔗 underscor Man, I love being in college
21:45 🔗 underscor http://www.speedtest.net/result/1371167859.png
21:45 🔗 marceloan I made a backup of it
21:45 🔗 underscor chronomex: I can help initial seed, if you want
21:45 🔗 underscor 4 bonded 100mbps lines
21:45 🔗 underscor :D
21:46 🔗 underscor (Not on this machine, but the other one)
21:46 🔗 underscor Also, I get a public IP, which is p sweet
21:46 🔗 marceloan 80Mbps...
21:47 🔗 underscor hehe
21:47 🔗 marceloan I'll do that test.
21:47 🔗 marceloan Let me see how many Kbps i get
21:47 🔗 chronomex underscor: cool beans, I'll let you know. it'll be ~15G I think.
21:48 🔗 underscor chronomex: ok, nifty
21:48 🔗 underscor I'm here til the 30th
21:48 🔗 chronomex ayeaye
21:48 🔗 chronomex bandwidth is easy for me to obtain, time less so :P
21:48 🔗 marceloan http://speedtest.net/
21:49 🔗 underscor yep
21:49 🔗 * underscor sends chronomex a box of time
21:49 🔗 chronomex wooot
21:51 🔗 marceloan My phone don't support Flash Player 10, it's 9 --'
21:53 🔗 marceloan Any alternative to speedtest.net?
21:53 🔗 chronomex wget? :P
21:53 🔗 underscor Yeah
21:53 🔗 underscor http://cachefly.cachefly.net/10mb.test
21:53 🔗 underscor wget -O /dev/null "http://cachefly.cachefly.net/10mb.test"
21:54 🔗 marceloan When will someone port wget to J2ME...
21:54 🔗 underscor marceloan: You're on your phone?
21:55 🔗 marceloan Yes
21:55 🔗 DFJustin cable here is not too shabby, asymmetry lol though http://www.speedtest.net/result/1368599789.png
21:56 🔗 DFJustin think they're rolling out a 50mb down plan for the same price
21:56 🔗 underscor 23 isn't bad
21:56 🔗 underscor But I use more upstream than I use downstream
21:56 🔗 underscor hahaha
21:56 🔗 DFJustin yeah rsyncing this google video stuff is gonna suck
21:56 🔗 DFJustin er groups
21:58 🔗 chronomex froups
22:03 🔗 marceloan Is 52oC good for the CPU, 43oC HD1 and 49oC HD2?
22:05 🔗 chronomex bit warm, shouldn't cause much problems
22:05 🔗 chronomex (btw if you can't type 52°C then 52C is more common and more readable than 52oC)
22:07 🔗 marceloan I'm using other computer. CPU Temperature=36C
22:08 🔗 chronomex that's pretty reasonable
22:08 🔗 chronomex about body temperature
22:08 🔗 chronomex at 70°C I would start worrying, but chips these days are generally okay up to 70-80°C
22:10 🔗 marceloan My computer only reaches 70C when I'm trying to play Tomb Raider Underworld using software renderer
22:11 🔗 marceloan Oh hell. (BSoD= STOP: c0000218 {Registry fail})
