[00:23] i found a 2 hour podcast on g4tv.com
[00:23] its 684mb
[00:32] argh at using urllib2
[00:56] is wget the only application that has warc export features?
[00:59] ersi:) maybe try requests?
[04:25] i found ces2007 ms keynote
[04:25] its in 3 parts
[04:26] dam it
[04:26] part 3 is broken
[04:26] maybe all of it is broken
[04:26] fucking g4
[06:43] i'm uploading a broken video now to see how it derive
[06:46] uploaded: https://archive.org/details/g4tv.com-video14990
[06:46] lets see how IA takes
[06:52] dam it
[06:52] IA can't fix it
[06:56] am I allowed to talk about something I'm doing that might be illegal in here
[06:57] BlueMax: talk about whatever you want, it's a free internet
[06:57] lol
[06:58] I'm trying to edit a PSP game's files because I find it interesting, come across what appears to be a compression format I've never seen before
[06:58] k
[06:58] how is that illegal?
[06:58] opened the file in HxD and the first few characters are "VFS"
[06:58] dunno, law is pretty crazy
[06:59] what's `file' say?
[06:59] not sure what you mean
[06:59] are you on unix?
[07:00] Windows
[07:00] put the file somewhere I can download from and I'll try to identify it
[07:00] 10-20% likelihood of succes
[07:00] will do
[07:01] hey guys
[07:01] can any of you look at video14990
[07:01] the flv is broken
[07:01] so the full video is no derived
[07:01] *not derived
[07:02] only a 1:52 vs the 3:02 the original file has
[07:11] chronomex, http://www.mediafire.com/?cr681ih63gf249r good luck and god speed
[07:11] mediafire goddamnit
[07:11] what the shit is wrong with you
[07:12] well I don't own my own webserver and I have no idea where else I can put it
[07:12] k
[07:12] swebb: was it you who has the auto-op thing going on?
[07:13] hm, no hits BlueMax
[07:15] yeah, it's a weird one
[07:15] I thought the "VFS" might have meant something I didn't know about
[07:22] I'm seeing no common file headers within the first 2160 bytes
[07:23] for future reference, to find file headers at small offsets you split the file into a bunch of small copies, starting at offsets: for P in `seq 1 2160` ; do echo $P ; dd if=EDF2.DAT of=edf2-"$P".dat bs=1k skip="$P" count=1 iflag=skip_bytes ; done ; file -sk edf2-*.dat | less
[07:25] I see
[07:26] Well it has to be storing something - some form of text, some type of PSP format 3D models and textures
[07:27] (although they aren't "PSP format" they're "formats used by PSP")
[07:27] unfortunately I have no information on what those might be
[07:38] maybe it's compressed, chronomex - when I was looking around someone on a forum who got to the same point I did said it might be a compressed file of some sort
[08:13] yeah it's clearly an archive file
[08:13] i have uploaded over 2K videos
[08:13] of g4tv.com
[08:18] After uploading something to the FTP, how do I check it in?
[08:18] As I'm a bit lost in the historical-quality design of archive.org
[09:46] http://mamedev.emulab.it/etabeta/?p=260
[10:15] omf_: No, there's also something from one of IA's sister orgs called Net something..
[10:15] lol at BlueMax's "may I talk to something that might be illegal???"
[14:29] thanks ersi
[14:30] Did you find it? I still havn't remembered the name of it
[14:30] nope
[14:40] omf_: NetPreserve ?
[15:29] tef, do they have an application that saves warc while archiving? I want to know how many applications out there support warc
[15:30] wooooooo
[15:30] off the top of my head, heritrix, wget and a couple of toy projects do. I know some commerical companies
[15:30] also use warcs.
[15:31] SketchCow: how easy is it to visit IA in person? I am visiting sf for a few weeks next month and I am curious :D
[15:32] easy.
[15:32] noon every friday. free lunch
[15:32] what weeks?
[15:35] 26th feb->26th Mar
[15:36] free lunch? :o
[15:36] also, I'd love to visit IA in person some day, but other side of the ocean etc etc
[15:36] joepie91: I have to admit my facial expression did perk up upon reading that I can get to IA, but free food too
[15:36] (can someone please invent teleports already, thank you)
[15:36] joepie91: ditto, this is my first experience outside europe.
[15:36] tef: heh
[15:37] well
[15:37] I don't have the money for a plane ticket
[15:37] so.. yeah
[15:37] :p
[15:37] probably not going to cross the ocean any time soon
[15:40] Yeah took me a while to save
[15:41] And I'm getting use of a sofa so I don't get hammered for accomodation costs, and i'll be telecommuting too
[15:42] * ersi weeps a little after fiddling with his spaghetti code
[15:43] tef: well, with the 'income' I have, there's no saving
[15:43] :p
[15:44] joepie91: yeah, I couldn't afford it until I got a raise +year ago, paid off debts and got into +ve savings
[15:44] * joepie91 has no job, lives off donations for projects
[15:44] plus, being able to telecommute means I can stay out there for a bit
[15:44] joepie91: get invited to a conference, get them to pay for flights ?
[15:45] even then I probably wouldn't have enough money to travel to for example IA
[15:45] joepie91: Ballsy
[15:45] :P
[15:45] my monthly living expenses (and thus income) are about 250 euro
[15:48] cheap rent?
[15:50] joepie91: sounds fun though
[15:50] joepie91: i kinda resent not being able to publish my work
[15:50] sometimes, sometimes not
[15:50] and well, no rent - where I live now was previously a squat, there's a non-monetary temporary rent contract with the owner now
[15:51] for as long as it's not sold
[15:51] and well, that's what I really like about living this way - I can do whatever the damn hell I please to, with the things I make
[15:51] :p
[15:51] nice1
[15:51] nice!, even
[15:51] the result is http://cryto.net/~joepie91 and http://github.com/joepie91
[15:52] fun
[15:52] i have github.com/tef for what i've managed to hack together
[15:53] ah, I should probably add http://learn.cryto.net/ to my list of things
[15:53] tef: I think I've seen your github acc before
[15:53] :O
[15:54] can't quite recall what from
[15:54] probably warctools?
[15:54] possibly, idk
[15:55] I have a terrible memory at times :P
[15:55] likely. unless you saw the toy crawler I wrote, or the rpc-esque yet conspicuously restful api thingy
[15:55] also, tef, you may find this useful in particular: http://cryto.net/zippydoc
[15:55] (unrelated, what's wat?
[15:56] hmm, going to have a read through the crawler at some point, might have some useful pointers
[15:57] my most recent crawling/scraping stuff is at https://github.com/joepie91/crytolearn/tree/develop/updater/scrapers
[15:57] joepie91: weirdly enough I find RST not that complex or hard
[15:57] the crawler is pretty nice
[15:57] also https://github.com/joepie91/crytolearn/tree/develop/updater/shared
[15:58] tef: hmm. I basically gave up in frustration after a while.
[15:58] joepie91: my only skill right now seems to be that I have read the http spec and warc spec
[15:58] you know the warc spec?
[15:58] yes
[15:58] i've even met one of the authors :O
[15:58] :D
[15:59] coolio
[15:59] clement from the bnf
[15:59] but eh, I'm unsure whether hanzo tools already fulfill this
[15:59] but it would be sort of useful
[15:59] to have a warc modification module in py
[15:59] with a sane API
[15:59] that's well documented
[15:59] :P
[15:59] I'm struggling with getting the WARCs I write nice
[15:59] so if that doesn't exist yet, there might be a project in there for you :P
[15:59] yeah
[16:00] 'sane' being sane in the sense of what requests is like
[16:00] warctools doesn't have a nice 'requests' layer over it
[16:00] didn;t get the time
[16:00] not 'like python standard lib' because quite honestly, python standard lib is horrible
[16:00] heh
[16:00] * ersi nods
[16:00] it's kinda 'here is enough to get going, away with you'
[16:00] tef: that's what I got from my cursory looks at warctools
[16:00] both IA's python warc library and warctools are in that state
[16:00] you have to write a lot of boilerplate once, but heh, you at least don't have to write a parser
[16:00] then again, that's also what I got from urllib/urllib2
[16:00] so probably a requests-like abstraction layer is in order
[16:01] joepie91: i am happy I got a http parser inside warctools
[16:01] I'm banging my head around handling redirects with urllib2 and saving them down to a warc right now
[16:01] :P
[16:01] ersi: use requests
[16:01] tef: link, by any chance?
[16:01] to that particular bit of code
[16:01] I know, of course of requests. I did go with urllib2 though
[16:02] joepie91 it's in warctools - hanzo/httptools/messaging.py
[16:03] I see
[16:04] :p
[16:04] okay, that's too big for me to read in a cursory manner
[16:04] usually other peoples python code confuses me anyway
[16:04] recently managed to hack a bit on beautifulsoup
[16:04] It's a skill in it self to read others code
[16:04] to make it comprehend nth-of-parent
[16:05] in css selectors
[16:05] cool
[16:05] ersi: yeah, very much so
[16:05] and while I was doing that
[16:05] I also fixed direct descendant support
[16:05] so that it actually worked as it should
[16:05] debugging code and extending code is also separate skills, imo
[16:05] instead of choking on class names or IDs
[16:05] ersi: I'm good at reading other peoples code, just not python
[16:05] there seem to be so many ways to do everything
[16:05] it's nearly impossible to understand all of them
[16:05] on sight
[16:05] joepie91: heh, that's the first I've heard someone say that of python
[16:06] joepie91: never used perl then :-)
[16:06] hehe, I guess - though that's a rather rare opinion
[16:06] well, actually
[16:06] I've tried to fix a perl scraper
[16:06] once
[16:06] was quite hard, but primarily because of syntax
[16:06] shrug :)
[16:06] * joepie91 and eq are not friends
[16:06] personally I find python one of the easiest, but I've been using it for ~10 years and 4 in work
[16:06] anyway, I sort of got the yahoo cache scraper to work
[16:06] ... sometimes
[16:06] and the bing scraper too
[16:07] google cache scraper stayed dead
[16:07] tef: python as a language is really easy
[16:07] it's just that there's a gazillion libraries for every single thing you could possibly want to do and just as many ways to write the code to do so (ranging from C-like to pythonic)
[16:07] so it takes a really big amount of time to learn to recognize all code styles and patterns in them
[16:07] compared to other languages
[16:08] imo
[16:08] the way I read code is by recognizing certain patterns in the code, not by actually reading it token by token
[16:08] so the more possible patterns to do the same thing, the longer it'll take for me to recognize that thing with 100% accuracy
[16:09] joepie91: most languages end up having this once they are big enough
[16:09] (that being said, pattern-based code reading can be really useful.. I have, however scary it may be, sort of developed an internal 'PHP interpreter' in my mind, so I can just look at code and determine at any point what is doing what, and which variables are holding what values ._.)
[16:09] (in PHP at least)
[16:09] joepie91: that's called learning programming
[16:09] ?
[16:10] being able to reason about what the code does without running it
[16:10] no, it goes beyond that
[16:10] also, you may have just read a lot of awful python
[16:10] I can't just *reason* what code does, I can actually mentally *execute* it so to say
[16:10] joepie91: I would wager that if you can't interpret what the code does you don't understand it
[16:11] as if my mind is one big debugger from a cursory look
[16:11] and this seems to be not very common, judging from how much trouble other people have writing a parser
[16:11] without a spec
[16:11] tef: I can't really explain it any better, there's a difference between understanding what code does and almost literally having a VM in your mind running it
[16:12] I've written a template parser in PHP, sort of token-based, just outputs a syntax tree
[16:12] most of the very experienced PHP developers I've shared it with had trouble understanding it
[16:12] joepie91: no, to me, if you can't run it in your head, you don't understand the code
[16:12] not because of messy code, but just because they couldn't figure out what it did where without a debugging tool of some sort
[16:12] joepie91: that might be your code :-)
[16:13] said people can't point out any issues with the code
[16:13] and I can read other peoples PHP parsers just the same :P
[16:13] er
[16:13] PHP-based parsers
[16:13] without trying to be mean, if you struggle to understand python, and people strugle to understand yours, well, you might want to practice more at reading and writing
[16:13] tef: people typically have absolutely no issue understanding my code
[16:14] 'can't figure out what it does without a debugger'
[16:14] python is obviously a bit messier since I've started on that recently, but as a general rule of thumb even non-programmers can grasp how my code works
[16:14] yes, I was refering to the parser
[16:14] because that requires you to keep track of a LOT of variables at the same time
[16:14] that are constantly being changed
[16:14] and this is what I mean with a difference between understanding the code and mentally 'running' the code
[16:15] I'm with tef... you're setting a lower bar for "understanding" that shouldn't be there :D
[16:15] yeah
[16:15] understanding the code. being able to explain what it does, step by step.
[16:15] you understand php. I comprehend it, or some other such word
[16:15] being able to use the code, understand what it should be doing, or using it as a library, sure
[16:16] * Aranje wonders if perhaps english is weak at this differentiation
[16:16] Aranje: possibly
[16:16] sigh, I don't seem to be capable of sufficiently explaining the difference between your usage of the word 'understand' and my usage of the word 'mentally run'...
[16:16] tef: I am absolutely sure that I understand what you mean
[16:16] joepie91:) nono, we're telling you that understand = mentally run
[16:16] the problem is that I can't concretely point out where the difference is between how I mentally 'run' code
[16:16] and what you see as 'understanding'
[16:16] the problem is with code, it's hard to tell why it should look that way
[16:16] Aranje: and I'm trying to explain that that is not the case
[16:17] but failing to find the proper words to do so
[16:17] joepie91: we call being able to mentally interpret the code 'understanding it'
[16:17] tef: except it's still not the same as what I mean with 'mentally running'
[16:17] in the sense 'i understand your idea' -> 'i have a mental model of your idea, how it fits togethr, and I could explain it'
[16:17] let's just drop this subject until I find a better way to explain it
[16:17] joepie91: if you can't point out a difference, there miht not be any :-)
[16:17] * Aranje grins
[16:18] tef: that's an incorrect assumption to make
[16:18] and a very dangerous one
[16:18] it's possible though :P
[16:18] Aranje: I strongly dispute that in this particular case
[16:18] joepie91: welcome to duck typed logic :-)
[16:18] * Aranje laughs
[16:18] joepie91: the thing that I get is that you are trying to explain a level of comprehension above your peer group
[16:19] no, that's the point
[16:19] it's not a "level"
[16:19] it's a different *type*
[16:19] hrm
[16:19] * Aranje pines for the word false dichotomy
[16:19] I see, so you're talking about that how you internalize things is different to other people
[16:20] yes
[16:20] with regards to PHP code, at least
[16:20] here is my definition for understanding
[16:20] I haven't really looked into it for other languages or concepts
[16:20] you can write the code
[16:21] joepie91: the thing we're saying, is that if you can't interpret the code, well, you're doing it wrong :-)
[16:21] not programming, but incantation
[16:21] the problem is that it seems your definition of "interpret" is different from that of mine, but I cannot currently find the right words to express how
[16:21] blurgh, vocal language is so limited at times
[16:22] joepie91: if you mean 'run the code in your head', i.e this is x, this is y, this is the step that happens next
[16:22] we mean interpret in the sense t hat given a function, inputs you can calculate the output in the same way the computer does
[16:23] tef: am I correct if I say that your meaning of 'interpret' is "input and output must be equal to reference implementation, but mental implementation can differ"?
[16:23] joepie91: no
[16:24] joepie91: that's like saying 'I can run this code twice, do different things and get the same output'
[16:24] "run the code in your head"
[16:24] (unless your code is non deterministic, for ex, but I don't think many people have that sort of ambiguity...)
[16:24] joepie91: here is a thing: how does the php sort function work
[16:25] you can deal with it in an abstraction - something with unstable sorting, probably worst case quadratic
[16:25] or, for ex, you can know the actual algorithm underneath
[16:25] and sometimes exploit the characteristics
[16:25] like to merge two lists in python, concat them and sort them. because of how the sort works, it exploits latent order in the data structures
[16:26] er, unintentionally this has made me realize a flaw in my mindset - I try to avoid doing things for which I am not aware of the implementation... I should probably fix taht
[16:26] that *
[16:27] to me that says you just need to dive into the implementation, but it's a sidetrack
[16:27] ... what. How did I get a warc record with only the headers and no payload ò_ò
[16:50] hi i see no on talking why is that?
[16:50] - AnonNews247 quit (User quit: Page closed)
[16:50] (2 minutes later)
[16:51] I am really not quite sure how people expect there to be activity within 2 minutes on a public chatroom
[16:51] even if they are not familiar with IRC
[16:51] it's like standing on a plaza in the middle of the night and expecting there to walk someone towards you and talk to you within 2 minutes
[17:00] perhaps their concept is closer to aol chat rooms from the 90's?
[17:01] I think many were not on those chat rooms, but I mean model-wise
[17:01] idk :/
[17:10] 2 minutes could be a pretty long time
[22:34] i see that the flv file jumps from 1:42 to 268:10:42
[23:45] some one should help me get this: http://www.g4tv.com/techtvvault/index.html
[23:46] i want to grab the pages so i can make a index
[23:48] going to bed right now
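Editor's sketch for the 16:25 remark about merging two lists in Python by concatenating and sorting. CPython's built-in sort is Timsort, which detects runs that are already ordered and merges them, so sorting the concatenation of two sorted lists should need far fewer comparisons than sorting the same values shuffled. The `Counted` wrapper and `count_comparisons` helper are illustrative names invented here, not anything from the log, and the sizes and seed are arbitrary.

```python
import random


class Counted:
    """Wraps a value and counts every < comparison made during sorting."""

    comparisons = 0  # shared counter, reset before each measurement

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def count_comparisons(values):
    """Sort a copy of values; return (sorted values, comparisons used)."""
    Counted.comparisons = 0
    result = sorted(Counted(v) for v in values)
    return [c.value for c in result], Counted.comparisons


rng = random.Random(0)  # fixed seed so repeated runs use the same data
a = sorted(rng.sample(range(100_000), 1000))
b = sorted(rng.sample(range(100_000), 1000))

# Concatenate two sorted lists and sort: Timsort sees two long runs.
merged, merge_cost = count_comparisons(a + b)

# Same values with the latent order destroyed, for comparison.
shuffled = a + b
rng.shuffle(shuffled)
_, shuffle_cost = count_comparisons(shuffled)

print("concat of sorted lists:", merge_cost, "comparisons")
print("shuffled input:        ", shuffle_cost, "comparisons")
```

For a streaming merge that never builds the concatenated list, the standard library's `heapq.merge` is the usual alternative.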