[05:24] Alard gets/win 4
[13:40] * db48x yawns
[16:46] bah
[16:46] servers that don't send a Last-Modified header are really annoying
[16:54] alard: --warc-file and timestamping don't mix very well
[16:55] alard: any file not downloaded because the Last-Modified header indicates that it hasn't changed doesn't get added to the warc file
[16:55] do any modern servers not send that?
[16:56] yes
[16:56] many
[16:56] Yes, that's one of the problems that should be looked at.
[16:56] perfinion: especially for pages that are actually served via cgi (or similar) interfaces, since the web server doesn't know if it's been modified or not
[16:57] oh, right fair enough
[16:57] hmm
[16:57] alard: on the other hand, if you're reusing the warc file then any files that got changed will simply be appended to the file and the ones that weren't changed will still be in there
[16:57] are you supposed to? how do you if you want to?
[16:57] is who supposed to what?
[16:57] send last modified for cgi?
[16:58] ah
[16:58] it's not like HTTP requires it
[16:58] well yeah
[16:58] db48x: warc has a special record type for that, the 'revisit' record. But that's hard, because you need to add a reference to the previous response.
[16:58] its just nice for caching
[16:58] perfinion: right
[16:59] So maybe it's better to just disable anything like timestamps, --continue, --no-clobber et cetera.
[16:59] although the main point of setting up a squid on a network is to cache images and other big things which are not cgi
[16:59] perfinion: so to do it right, the cgi script (or whatever) has to check the request's Last-Modified header and see if there is any newer data to display
[17:00] so just handle it inside the cgi script?
[17:00] instead of regenning the page
[17:00] yea
[17:00] but that requires the author to specifically implement it
[17:00] either for every "page" individually, or in such a way that it's accurate across all "pages"
[17:01] one is tedious and the other is boring
[17:01] and if you get it wrong, your users complain that the changes they make aren't showing up
[17:01] sorry, tedious and tricky, not tedious and boring :)
[17:03] ah
[17:03] yeah
[17:03] its kinda not really worth it if they are small pages anyway
[17:04] yea
[19:05] SketchCow: dunno if you know http://www.pagetable.com/ , nice scans of c64 stuff
[20:13] alard: I guess --warc-file should just turn off timestamping
[20:14] it'd be nice if we could avoid writing to disk the files that haven't actually changed, though
[20:14] Yeah, but then we'd need an index of all previous warc files.
[20:15] Maybe something for a future version?
[20:15] it's not exactly related to warc files, but if wget could download the content, do --convert-links if requested, hash the result and then compare that to the hash of the file on disk, skipping the write if nothing changed
[20:15] that would save lots of space on filesystems that do COW
[20:15] and still let you have a complete warc of the site
[20:16] but I must sleep
[20:18] Good night then. I'll have a look at the number of temporary files. I've just seen that libwarc writes every record to a temporary file, so it might be possible to just give it the handle of wget-warc's temporary file.
[20:59] are you all worldwide, cause its only 5pm where i am
[21:01] your clock is 3 hours fast
[21:03] im ny usa
[21:03] eastern time
[21:06] as I said...
[21:42] Wassup guys?
[21:42] Have I missed much in my absence?
[21:43] My HD died, and the other too.
[21:44] Oh man, that sucks :(
[21:44] I'm kind of preparing a torrent of symbian.org
[21:44] I lost my part of Twaud.io archive
[21:44] Heeey.
[21:45] Man, I love being in college
[21:45] http://www.speedtest.net/result/1371167859.png
[21:45] I made a backup of it
[21:45] chronomex: I can help initial seed, if you want
[21:45] 4 bonded 100mbps lines
[21:45] :D
[21:46] (Not on this machine, but the other one)
[21:46] Also, I get a public IP, which is p sweet
[21:46] 80Mbps...
[21:47] hehe
[21:47] I'll do that test.
[21:47] Let me see how many Kbps i get
[21:47] underscor: cool beans, I'll let you know. it'll be ~15G I think.
[21:48] chronomex: ok, nifty
[21:48] I'm here til the 30th
[21:48] ayeaye
[21:48] bandwidth is easy for me to obtain, time less so :P
[21:48] http://speedtest.net/
[21:49] yep
[21:49] * underscor sends chronomex a box of time
[21:49] wooot
[21:51] My phone don't support Flash Player 10, it's 9 --'
[21:53] Any alternative to speedtest.net?
[21:53] wget? :P
[21:53] Yeah
[21:53] http://cachefly.cachefly.net/10mb.test
[21:53] wget -O /dev/null "http://cachefly.cachefly.net/10mb.test"
[21:54] When will someone port wget to J2ME...
[21:54] marceloan: You're on your phone?
[21:55] Yes
[21:55] cable here is not too shabby, asymmetry lol though http://www.speedtest.net/result/1368599789.png
[21:56] think they're rolling out a 50mb down plan for the same price
[21:56] 23 isn't bad
[21:56] But I use more upstream than I use downstream
[21:56] hahaha
[21:56] yeah rsyncing this google video stuff is gonna suck
[21:56] er groups
[21:58] froups
[22:03] Is 52oC good for the CPU, 43oC HD1 and 49oC HD2?
[22:05] bit warm, shouldn't cause much problems
[22:05] (btw if you can't type 52°C then 52C is more common and more readable than 52oC)
[22:07] I'm using other computer. CPU Temperature=36C
[22:08] that's pretty reasonable
[22:08] about body temperature
[22:08] at 70°C I would start worrying, but chips these days are generally okay up to 70-80°C
[22:10] My computer only reaches 70C when I'm trying to play Tomb Raider Underworld using software renderer
[22:11] Oh hell. (BSoD= STOP: c0000218 {Registry fail})
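
Editor's note: a minimal sketch of the "handle it inside the cgi script" idea from the 16:57-17:00 exchange. The script compares the client's If-Modified-Since request header against the modification time of its own data source and answers 304 Not Modified when nothing has changed, instead of regenerating the page. This is not from the log; the script and its DATA_FILE backing store are hypothetical, and a real site would need an accurate notion of when its data last changed (the "tedious and tricky" part mentioned at 17:01).

#!/usr/bin/env python3
# Hypothetical CGI script illustrating conditional GET handling.
# DATA_FILE stands in for whatever the generated page is built from.
import os
import sys
from email.utils import formatdate, parsedate_to_datetime

DATA_FILE = "articles.db"  # assumption: the page is generated from this file

def main():
    mtime = os.path.getmtime(DATA_FILE)
    ims = os.environ.get("HTTP_IF_MODIFIED_SINCE")
    if ims:
        try:
            if mtime <= parsedate_to_datetime(ims).timestamp():
                # The data is no newer than what the client already has.
                sys.stdout.write("Status: 304 Not Modified\r\n\r\n")
                return
        except (TypeError, ValueError):
            pass  # unparsable date: fall through and send the full page
    body = "<html><body>generated page goes here</body></html>"
    sys.stdout.write("Content-Type: text/html\r\n")
    sys.stdout.write("Last-Modified: %s\r\n\r\n" % formatdate(mtime, usegmt=True))
    sys.stdout.write(body)

if __name__ == "__main__":
    main()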
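
Editor's note: and a sketch of the idea floated at 20:15, which is not a wget feature, just an illustration of the comparison step: hash the freshly downloaded (and link-converted) content, hash what is already on disk, and skip the write when they match. On a copy-on-write filesystem the untouched file keeps its existing blocks, which is where the space saving comes from. The function name is illustrative.

#!/usr/bin/env python3
# Write a file only when its content has actually changed.
import hashlib
import os

def write_if_changed(path: str, new_content: bytes) -> bool:
    """Write new_content to path only if it differs from what is already there.

    Returns True when the file was (re)written, False when it was left alone.
    """
    new_digest = hashlib.sha256(new_content).hexdigest()
    if os.path.exists(path):
        with open(path, "rb") as f:
            old_digest = hashlib.sha256(f.read()).hexdigest()
        if old_digest == new_digest:
            return False  # identical content: leave the file untouched
    with open(path, "wb") as f:
        f.write(new_content)
    return True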