Time | Nickname | Message
00:48
🔗
|
yipdw |
proust position post: pull progressing; projection pending |
01:18
🔗
|
yipdw |
and I just learned that not all x86_64 CPUs from Intel do 40-bit physical addressing |
01:18
🔗
|
yipdw |
that's so weird |
01:23
🔗
|
Wyatt|NOC |
Just a quick note before I forget; seems Tim Follin has floppies with his old source in "Einstein" format (Tatung Einstein?). Do we have the capability to read them? |
01:27
🔗
|
Wyatt|NOC |
Speaking of those (this is the first I've heard of them), this page is pretty amazing http://www.tatungeinstein.co.uk/front/bandhcomputers.htm |
01:27
🔗
|
Wyatt|NOC |
In various meanings. |
01:44
🔗
|
yipdw |
SketchCow: do you think it makes sense to archive private stories on Proust? |
01:45
🔗
|
yipdw |
we don't get much out of it |
01:45
🔗
|
yipdw |
and from what I'm seeing, they are (1) the majority of stories; (2) easily identified |
01:47
🔗
|
yipdw |
or, to put it another way, 100 * (12 / 174) = 6.89% of downloaded stories are public |
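The arithmetic above, re-run in Ruby (the language the grab scripts here use). Note that integer division would silently give 0, so one operand needs to be a float:

```ruby
# 12 of the 174 downloaded Proust stories were public.
public_stories = 12
total_stories  = 174

pct = 100.0 * public_stories / total_stories
puts pct.round(2)  # ~6.9; the 6.89% above comes from truncating 6.896...
```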
03:09
🔗
|
Wyatt|NOC |
But that is a snazzy hat, SketchCow. How could people _not_ want to be interviewed by that? |
03:19
🔗
|
chronomex |
for serious |
03:38
🔗
|
nitro2k01 |
yipdw: Which don't? It would make sense if the early P4s or some budget chips don't for example |
03:43
🔗
|
yipdw |
nitro2k01: the Xeon E5430 does 38-bit physical addressing, or at least that's what the one on my EC2 instance reports
03:43
🔗
|
yipdw |
a friend's Core i5-2500K reports 36-bit |
03:43
🔗
|
yipdw |
my E5520 reports 40-bit |
03:43
🔗
|
yipdw |
I think Intel just cripples their chips for some market segmentation dealie |
03:46
🔗
|
Wyatt|Wor |
You know, I never thought about all that. How does that affect performance, exactly? |
03:47
🔗
|
yipdw |
it doesn't really
03:47
🔗
|
yipdw |
it mostly affects how much RAM you can access |
03:48
🔗
|
yipdw |
because all those chips do 48-bit virtual addressing, and in 64-bit mode the software is juggling 64-bit pointers anyway |
03:54
🔗
|
Wyatt|Wor |
So if you put more than 8GB memory in a machine with 36-bit physical addressing, it will...? |
03:55
🔗
|
yipdw |
the memory above 8 GiB won't be addressable without bank-switching tomfoolery |
03:56
🔗
|
Wyatt|Wor |
So it'll degrade performance a bit. |
03:57
🔗
|
yipdw |
actually, I don't know if x86_64 can even address > 8 GiB in that situation |
03:58
🔗
|
Wyatt|Wor |
I think it should be able to. PAE has been available on commodity parts for years. |
03:58
🔗
|
yipdw |
IIRC, PAE is only applicable to 32-bit processors |
03:59
🔗
|
yipdw |
or more specifically 32-bit modes |
04:25
🔗
|
Wyatt|Wor |
Okay, my bad. I forgot that memory is byte-addressable. 36-bit memory address width is 64GB |
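A sketch of the relationship being worked out above, assuming the Linux `/proc/cpuinfo` "address sizes" line (the sample string below is hypothetical; on a real machine you would scan `File.read("/proc/cpuinfo")`):

```ruby
# Derive the maximum addressable physical RAM from the CPU's reported width.
# Hypothetical sample line; real boxes expose this in /proc/cpuinfo.
sample = "address sizes\t: 36 bits physical, 48 bits virtual"

if sample =~ /(\d+) bits physical, (\d+) bits virtual/
  phys_bits = $1.to_i
  virt_bits = $2.to_i
  gib = 2**phys_bits / 2**30  # bytes down to GiB
  puts "#{phys_bits}-bit physical => #{gib} GiB addressable (#{virt_bits}-bit virtual)"
end
```

So a 36-bit part tops out at 64 GiB and a 40-bit part at 1 TiB: the width caps how much RAM is reachable, not how fast it is.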
04:57
🔗
|
bsmith093 |
is torrent.textfiles.com stil availible? |
05:00
🔗
|
Coderjoe |
by your powers combined, i am... |
05:00
🔗
|
Coderjoe |
CAPTAIN ARCHIVE! |
05:00
🔗
|
Wyatt|Wor |
Wilford Brimley! |
05:04
🔗
|
bsmith093 |
i had to google for that ref, good one |
05:07
🔗
|
PatC |
Coderjoe, you the Coderjoe from tgg on freenode? |
05:27
🔗
|
Coderjoe |
... |
05:45
🔗
|
underscor |
alard: Are we gonna try and get the new anyhub prefixes? |
06:08
🔗
|
SketchCow |
yipdw: is archiving a private story hard? |
06:08
🔗
|
SketchCow |
If you can get to it from the net, it's not private. |
06:19
🔗
|
chronomex |
tumblr is the geocities of 2010: http://gutsygumshoe.tumblr.com/ |
07:05
🔗
|
yipdw |
SketchCow: I haven't looked into whether or not having an account gives you further access to said private stories |
07:05
🔗
|
yipdw |
I'll check that |
07:06
🔗
|
yipdw |
hmm, nope, no further access |
07:09
🔗
|
no2pencil |
private stories? Sounds provocative |
07:09
🔗
|
yipdw |
there is GET http://www.proust.com/ac/story/export/generate |
07:09
🔗
|
yipdw |
that relies on session data |
07:10
🔗
|
yipdw |
and for Proust, said session data is server-side |
07:15
🔗
|
yipdw |
well, maybe |
07:15
🔗
|
bsmith093 |
so what IS the status of the klol script? |
07:15
🔗
|
yipdw |
underscor: what email address did you use? |
07:15
🔗
|
yipdw |
underscor: to register with Proust |
07:15
🔗
|
yipdw |
I wonder if I can trick it to download other users' data |
07:15
🔗
|
yipdw |
"it" being the PDF exporter |
07:16
🔗
|
bsmith093 |
wouldnt that mean they had really horrible security?
07:16
🔗
|
yipdw |
it's not uncommon |
07:16
🔗
|
balrog |
hi SketchCow |
07:16
🔗
|
balrog |
might want to be careful with SoftDisk for Apple II, those are still legitimately distributed. |
07:16
🔗
|
yipdw |
also, it's the only way I see to actually get the stories that are shared only with family and friends |
07:16
🔗
|
yipdw |
I mean, I *could* also just friend everyone on Proust and hope they reciprocate |
07:16
🔗
|
yipdw |
but for now I'm just getting ones marked as public |
07:18
🔗
|
bsmith093 |
my upload from yesterday is done, its in bsmith on batcave as a 7z called ffnet_dump_and_script.7z, just fyi if someone wants to continue where i left off, 112025 stories grabbed out of 3.6 million, in the folder books
07:18
🔗
|
bsmith093 |
bye now, good luck with proust |
07:18
🔗
|
yipdw |
just as an FYI, none of us have read access to that
07:19
🔗
|
Wyatt|Wor |
bsmith093: Is the script up on github? |
07:19
🔗
|
bsmith093 |
ah well ok then ummm it should be, hold on
07:20
🔗
|
bsmith093 |
http://code.google.com/p/fanficdownloader-fork/downloads/detail?name=fanficdownloader-fork0.0.1.7z |
07:20
🔗
|
bsmith093 |
no, not github, but heres the link to a repo i set up |
07:20
🔗
|
yipdw |
7.2 megabytes? |
07:20
🔗
|
bsmith093 |
that has everything except the stories |
07:20
🔗
|
bsmith093 |
damn you're quick
07:21
🔗
|
bsmith093 |
run automate.sh link to grab all the stories in sequence
07:22
🔗
|
bsmith093 |
automate runs download.py using every line of link in order; it will take several months to complete and there will be new stories by then anyway, but this is a complete list as of several weeks ago
07:22
🔗
|
bsmith093 |
i recommend using a vps or something you dont have to leave on yourself
07:23
🔗
|
yipdw |
there's no way it has to take several months |
07:23
🔗
|
bsmith093 |
then you fix the code then, i just ran it for a week straight, and got only 112k stories
07:23
🔗
|
yipdw |
I did fix it :P |
07:24
🔗
|
bsmith093 |
you and your ruby voodoo, this is why i like bash, it Just Works (TM) |
07:24
🔗
|
yipdw |
bash actually has some serious portability problems |
07:24
🔗
|
yipdw |
we've hit them quite often here |
07:24
🔗
|
bsmith093 |
anyway storis is the raw id list and link is the id list wrapped up into url form |
07:25
🔗
|
yipdw |
for example, du-helper.sh in splinder-grab exists solely to paper over differences between GNU and BSD du |
07:25
🔗
|
yipdw |
and it's not really perfect |
07:25
🔗
|
yipdw |
to be fair, that's not bash per se, but a dependency of a bash script |
07:25
🔗
|
bsmith093 |
well ruby has some serious noob coder issues, and likes to spit back cryptic error messages to me |
07:25
🔗
|
yipdw |
but even within bash-the-language there's real problems between versions |
07:26
🔗
|
bsmith093 |
from story_grab.rb:1 |
07:26
🔗
|
bsmith093 |
ruby story_grab.rb 8 |
07:26
🔗
|
bsmith093 |
story_grab.rb:1:in `require': no such file to load -- mechanize (LoadError) |
07:26
🔗
|
bsmith093 |
for example i thought i fixed this last night?!?!
07:26
🔗
|
yipdw |
that means that a file called "mechanize" can't be loaded |
07:27
🔗
|
yipdw |
make sure you're using the right Ruby installation |
07:27
🔗
|
bsmith093 |
rvm use 1.9.3
07:27
🔗
|
bsmith093 |
using 1.9.3p0 |
07:28
🔗
|
bsmith093 |
now what? |
07:28
🔗
|
yipdw |
ensure the mechanize gem is present |
07:29
🔗
|
bsmith093 |
gem install mechanize |
07:29
🔗
|
yipdw |
gem list -i mechanize |
07:29
🔗
|
bsmith093 |
true |
07:29
🔗
|
yipdw |
then it's installed |
07:29
🔗
|
yipdw |
run it again |
07:29
🔗
|
yipdw |
the girl_friday and connection_pool gems are also used |
07:29
🔗
|
bsmith093 |
stack trace |
07:30
🔗
|
yipdw |
ok |
07:30
🔗
|
yipdw |
what is it that you want to save from fanfiction.net? |
07:30
🔗
|
yipdw |
http://archiveteam.org/index.php?title=FanFiction.Net doesn't state what |
07:31
🔗
|
bsmith093 |
the stories, minimum, the reviews and author profiles would be really nice |
07:31
🔗
|
yipdw |
ok |
07:32
🔗
|
bsmith093 |
ruby story_grab.rb 8 maybe this is a stupid question, but i am running this right, right? |
07:32
🔗
|
yipdw |
so stories, reviews, author profiles |
07:32
🔗
|
yipdw |
yes, that's correct |
07:33
🔗
|
yipdw |
although that script will not handle stories without chapters correctly; it needs to be modified for that |
07:33
🔗
|
bsmith093 |
yes, that would be great |
07:33
🔗
|
bsmith093 |
every story has at least one chapter
07:33
🔗
|
yipdw |
that script will not handle stories that don't have >= 2 chapters
07:34
🔗
|
bsmith093 |
ohhhh thats what u meant?! |
07:34
🔗
|
bsmith093 |
ok that makes more sense, check the link i gave u, they solved that problem, the google group in fanficdownloader |
07:34
🔗
|
yipdw |
I know what the problem is |
07:34
🔗
|
bsmith093 |
really, what?
07:35
🔗
|
yipdw |
see lines 24-28 |
07:35
🔗
|
yipdw |
there's an assumption that the chapter box is present |
07:35
🔗
|
yipdw |
as I mentioned, that script is just a test |
07:35
🔗
|
Jofo |
if anyone, I feel like this group would appreciate this link http://www.therestartpage.com/# |
07:35
🔗
|
yipdw |
to demonstrate that it is possible to download a multi-chapter story in less than 2.5 seconds per chapter |
07:35
🔗
|
yipdw |
for actual use it needs to be expanded |
07:36
🔗
|
bsmith093 |
check downloader.py and all the stuff it refs
07:36
🔗
|
bsmith093 |
they solved this somehow |
07:36
🔗
|
yipdw |
I know how to solve it |
07:36
🔗
|
bsmith093 |
....and? |
07:36
🔗
|
yipdw |
(1) download the first page; (2) if a chapter box is present, add chapters (2..n) to the queue |
07:37
🔗
|
yipdw |
and I'm not working for you, so I haven't solved it? |
07:37
🔗
|
bsmith093 |
ummm, ok so grep the page and see if the chapter box is there?
07:37
🔗
|
yipdw |
yes, and if it's there then initiate further downloads |
07:37
🔗
|
yipdw |
if it isn't there, you're done |
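The two-step approach yipdw describes, sketched against a hypothetical chapter-box snippet. The real pages would be fetched with mechanize, and both the `chap_select` id and the `/s/<id>/<chapter>/` URL shape are assumptions for illustration, not verified against the live site:

```ruby
# Step 1: fetch chapter 1. Step 2: if a chapter <select> is present,
# queue chapters 2..n; if it isn't, the story is single-chapter and we're done.
def chapter_urls(sid, html)
  chapters = html.scan(/<option\s+value="(\d+)"/).flatten.map(&:to_i)
  last = chapters.max || 1  # no chapter box => just chapter 1
  (1..last).map { |ch| "http://www.fanfiction.net/s/#{sid}/#{ch}/" }
end

sample = <<-HTML
  <select id="chap_select">
    <option value="1">1. Intro</option>
    <option value="2">2. Middle</option>
    <option value="3">3. End</option>
  </select>
HTML

chapter_urls(4089014, sample)                     # three URLs
chapter_urls(8, "<div>no selector here</div>")    # one URL, story is done
```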
07:39
🔗
|
yipdw |
I can expand the story finder and downloader, but I don't know when |
07:39
🔗
|
bsmith093 |
sorry for being so rude, i thought this was a bigger issue than it turned out to be
07:39
🔗
|
yipdw |
it isn't |
07:39
🔗
|
yipdw |
downloading fanfiction.net is really trivial |
07:39
🔗
|
yipdw |
well, at least the reviews, stories, and user profiles |
07:39
🔗
|
yipdw |
however I am working on other things |
07:43
🔗
|
yipdw |
hmm |
07:43
🔗
|
yipdw |
that said, if I say it's really trivial, I guess I better go do it, right |
07:44
🔗
|
bsmith093 |
so, im looking through the mechanize docs, and this looks like some if's and an agent.search thing
07:45
🔗
|
bsmith093 |
http://mechanize.rubyforge.org/GUIDE_rdoc.html way at the bottom |
07:51
🔗
|
yipdw |
yeah, that's pretty much it |
07:54
🔗
|
bsmith093 |
if agent.search("chapter"... i am horrible with syntax
08:38
🔗
|
yipdw |
bsmith093: https://gist.github.com/1577729 is a set of scripts that will grab stories, reviews, and profile for that story |
08:38
🔗
|
yipdw |
https://s3.amazonaws.com/nw-depot/example_run.tar.gz is an example of two runs of get_one_story.rb |
08:38
🔗
|
bsmith093 |
thanks, seriously. |
08:39
🔗
|
yipdw |
one on story ID 8, and one on story 4089014, which I chose because it has 701 reviews and 60 chapters
08:39
🔗
|
yipdw |
I have not yet inspected the WARCs |
08:39
🔗
|
yipdw |
but they should work |
08:39
🔗
|
yipdw |
actually, they might be slightly broken -- I'm not sure if --page-requisites is doing what I think it's doing |
08:39
🔗
|
yipdw |
time to fire up wayback |
08:39
🔗
|
yipdw |
so, yeah |
08:40
🔗
|
yipdw |
I don't think you need to grab one profile per story |
08:40
🔗
|
yipdw |
it is probably better to queue all the URLs up and just fetch once per unique URL |
08:40
🔗
|
yipdw |
but that depends on your approach |
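One way to do the "fetch once per unique URL" idea, a minimal sketch with Ruby's Set (the URLs and IDs below are made up):

```ruby
require "set"

# Queue everything per-story, then collapse duplicates before fetching,
# so a prolific author's profile is only grabbed once.
queued = [
  "http://www.fanfiction.net/u/1234/",   # author profile (hypothetical ID)
  "http://www.fanfiction.net/s/8/1/",    # a story by that author
  "http://www.fanfiction.net/u/1234/",   # same profile, queued again
]

seen = Set.new
to_fetch = queued.select { |url| seen.add?(url) }  # add? returns nil on repeats
# to_fetch holds two entries: the profile once, plus the story
```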
08:41
🔗
|
bsmith093 |
warcs can be fixed later, to be honest, i have no idea why session data is useful to anyone, even the archivers. |
08:42
🔗
|
yipdw |
request/response headers tell you the circumstances under which a resource was retrieved, which is important for determining what state that resource is in |
08:42
🔗
|
yipdw |
because Web resources can change their content depending on headers |
08:43
🔗
|
bsmith093 |
they're that dynamic?
08:43
🔗
|
yipdw |
Web resources can change based on *anything* |
08:43
🔗
|
bsmith093 |
i need to learn to type slower
08:43
🔗
|
bsmith093 |
oy :P
08:44
🔗
|
yipdw |
oh, fuck |
08:44
🔗
|
yipdw |
yeah, I didn't fetch the images or CSS |
08:44
🔗
|
yipdw |
that needs to be fixed |
08:45
🔗
|
yipdw |
oh, damnit |
08:45
🔗
|
yipdw |
the chapter selector doesn't work in the WARC |
08:45
🔗
|
yipdw |
because it suffixes the name of the story |
08:45
🔗
|
yipdw |
that's fairly annoying |
08:46
🔗
|
yipdw |
bsmith093: if you want to see what I'm "oh, fuck"ing about: https://s3.amazonaws.com/nw-depot/wayback1.png |
08:48
🔗
|
bsmith093 |
i would say thats fine the images dont change much ever |
08:48
🔗
|
yipdw |
it's not fine, it's incomplete |
08:48
🔗
|
bsmith093 |
grab once and link to them |
08:48
🔗
|
yipdw |
just needs some wget tweaks though |
08:48
🔗
|
Wyatt|Wor |
bsmith093: There's no emergency, so there's no reason not to do it right. |
08:48
🔗
|
bsmith093 |
ok, then |
08:48
🔗
|
yipdw |
also I want to find a way to get that chapter selector working |
08:49
🔗
|
yipdw |
ALL THAT SAID |
08:49
🔗
|
yipdw |
if all you want is the text, the text is there |
08:50
🔗
|
yipdw |
hmm |
08:50
🔗
|
yipdw |
I wonder how hard it'd be to set up our own Wayback Machine |
08:50
🔗
|
yipdw |
with a WARC upload UI |
08:50
🔗
|
yipdw |
that'd make checking archives pretty snazy |
08:50
🔗
|
yipdw |
snazzy, too |
08:51
🔗
|
* |
yipdw tries |
08:51
🔗
|
Wyatt|Wor |
Haha, I was just thinking a warc viewer for my phone would be neat too. |
08:51
🔗
|
yipdw |
wayback seems to already support that in some capacity, so maybe I just need to throw on some UI code |
08:51
🔗
|
yipdw |
Wyatt|Wor: I wish there was a lightweight WARC viewer out there |
08:51
🔗
|
Wyatt|Wor |
Actually, are there browser plugins for warc files or something? |
08:51
🔗
|
yipdw |
I wish :P |
08:51
🔗
|
Wyatt|Wor |
I didn't even think to look. |
08:51
🔗
|
yipdw |
if you find one, let me know |
08:52
🔗
|
Wyatt|Wor |
Ah...will do. |
08:52
🔗
|
yipdw |
wayback is the only thing I've found that will render a WARC's content in a Web browser |
08:52
🔗
|
bsmith093 |
WARNING: Installing to ~/.gem since /var/lib/gems/1.8 and /var/lib/gems/1.8/bin aren't both writable. WARNING: You don't have /home/ben/.gem/ruby/1.8/bin in your PATH, gem executables will not run. |
08:52
🔗
|
yipdw |
and it's pretty heavy |
08:52
🔗
|
bsmith093 |
thats the output of gem install mechanize |
08:52
🔗
|
bsmith093 |
it worked but i figure huge warnings are notable |
08:52
🔗
|
yipdw |
bsmith093: if you're using your system's Ruby installation, that'll happen |
08:53
🔗
|
* |
Wyatt|Wor flinches at the mention of ruby gems. |
08:53
🔗
|
bsmith093 |
happens every time i try to run make_story_urls
08:53
🔗
|
yipdw |
you either need to grant your user write permission to those directories (ick) or use a Ruby distribution that your user controls |
08:53
🔗
|
bsmith093 |
ive got rvm in my home dir |
08:53
🔗
|
yipdw |
rvm is good for setting up the latter |
08:54
🔗
|
Wyatt|Wor |
bsmith093: ...or set your $PATH. |
08:54
🔗
|
yipdw |
rvm isn't a Ruby distribution; it just manages distributions |
08:54
🔗
|
yipdw |
yeah, or that |
08:54
🔗
|
yipdw |
but mechanize's executables are not used by make_story_urls so |
08:54
🔗
|
bsmith093 |
wheres $PATH, in the configs |
08:54
🔗
|
Wyatt|Wor |
Or use your distro's package manager to install the gem. |
08:54
🔗
|
yipdw |
PATH is an environment variable, but don't worry about it |
08:54
🔗
|
bsmith093 |
require': no such file to load -- mechanize (LoadError) |
08:54
🔗
|
bsmith093 |
from make_story_urls.rb:3 |
08:54
🔗
|
bsmith093 |
before and after |
08:54
🔗
|
yipdw |
do this |
08:54
🔗
|
yipdw |
add require "rubygems" to the top of all Ruby source files |
08:55
🔗
|
yipdw |
I don't like to do that for various reasons but it will ensure Rubygems is loaded |
08:55
🔗
|
yipdw |
(the main reason is that Ruby programs should not have any dependency on a specific package manager) |
08:56
🔗
|
bsmith093 |
/usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- /home/ben/1577729/url_generators (LoadError) |
08:56
🔗
|
bsmith093 |
from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `require' |
08:56
🔗
|
bsmith093 |
ran for a sec then that |
08:56
🔗
|
yipdw |
you'll need that file from the gist, too |
08:56
🔗
|
yipdw |
oh it's not in there |
08:56
🔗
|
yipdw |
https://gist.github.com/1577729#file_url_generators.rb |
08:56
🔗
|
Wyatt|Wor |
yipdw: That's the most salient argument I think I've heard against gem from a non-Debian/Gentoo developer. |
08:57
🔗
|
yipdw |
Wyatt|Wor: heh |
08:57
🔗
|
yipdw |
Wyatt|Wor: yeah, it's largely a theoretical argument but that is a good point |
08:57
🔗
|
yipdw |
ruby programs shouldn't break just because you installed a library via Rubygems or apt-get or whatever |
08:59
🔗
|
Wyatt|Wor |
yipdw: Oh I wouldn't say it's theoretical. If our experience in buying a couple "rails hosting" brands is any indication it's more like...a tsunami of ass-pain. |
08:59
🔗
|
Wyatt|Wor |
See also: flameeyes adventures in gem packaging. |
08:59
🔗
|
yipdw |
I feel really bad for people who have to package Ruby gems |
09:00
🔗
|
yipdw |
gems move really, really freaking fast |
09:00
🔗
|
bsmith093 |
ok actual code error this time /home/ben/.gem/ruby/1.8/gems/mechanize-2.1/lib/mechanize/http/agent.rb:303:in `fetch': 404 => Net::HTTPNotFound (Mechanize::ResponseCodeError) from /home/ben/.gem/ruby/1.8/gems/mechanize-2.1/lib/mechanize.rb:319:in `get' from make_story_urls.rb:16 |
09:00
🔗
|
Wyatt|Wor |
Kind of. Sometimes. |
09:00
🔗
|
yipdw |
bsmith093: yeah, that script has no graceful error handling at all |
09:00
🔗
|
yipdw |
but uh |
09:00
🔗
|
yipdw |
are you sure you passed a valid story ID as the first argument |
09:00
🔗
|
yipdw |
to get_one_story |
09:01
🔗
|
bsmith093 |
oh, um i was running "ruby make_story_urls.rb" errr, whoops :D |
09:02
🔗
|
yipdw |
Wyatt|Wor: actually, for most languages I work with -- python, ruby, occasionally haskell and node -- I've actually begun to not use the OS' package manager |
09:02
🔗
|
yipdw |
and have been instead using easy_install, rubygems, cabal, npm |
09:02
🔗
|
yipdw |
it's way more complex and makes my package manifest incomplete, but there's so many other people who just publish libraries in those languages in their specific package managers |
09:03
🔗
|
yipdw |
which is quite a bit of inertia to overcome |
09:03
🔗
|
yipdw |
the only language I can think of that I work in and use the distribution's packages is C/C++ |
09:03
🔗
|
yipdw |
and that's not entirely true for things like Qt :P |
09:03
🔗
|
Wyatt|Wor |
Not familiar with the latter two, but python eggs have a lot of the same issues as gems, as far as I'm aware. |
09:04
🔗
|
Wyatt|Wor |
At least CPAN gets it right~ |
09:04
🔗
|
yipdw |
I think I'll become rich and famous if I find a way to encapsulate a gem/egg/whatever as a deb or whatever |
09:04
🔗
|
bsmith093 |
short version: could not find gem custom_require locally or in a repository
09:04
🔗
|
bsmith093 |
YES YOU WILL, fantastically so |
09:04
🔗
|
Wyatt|Wor |
Just formalise package metadata about the gem to the extent perl does and you can. |
09:05
🔗
|
yipdw |
Wyatt|Wor: what does CPAN do? I've just used perl -MCPAN -e 'install ...' |
09:05
🔗
|
yipdw |
is there a way to do it that doesn't involve doing that |
09:05
🔗
|
yipdw |
or, more specifically, respects the OS' package management system |
09:05
🔗
|
Wyatt|Wor |
yipdw: cpan itself is just software. It's all because they have a good packaging format that we can have things like g-cpan. |
09:05
🔗
|
yipdw |
ahh |
09:06
🔗
|
yipdw |
actually, that reminds me |
09:06
🔗
|
yipdw |
the source code that drives rubygems.org is available |
09:07
🔗
|
yipdw |
perhaps it is feasible to add a service endpoint to it that makes it behave as an apt repo |
09:08
🔗
|
Wyatt|Wor |
BTW, here's the horse's mouth on the subject: http://blog.flameeyes.eu/2008/12/14/rubygems-cpan-and-other-languages |
09:11
🔗
|
yipdw |
ahh |
09:11
🔗
|
yipdw |
yeah, I agree with all of those points |
09:11
🔗
|
yipdw |
there has been *some* success on the standardization front though |
09:11
🔗
|
yipdw |
namely, running "rake" in an increasing number of projects runs the testsuite |
09:11
🔗
|
yipdw |
regardless of test harness |
09:12
🔗
|
yipdw |
but, yes, the file format of gems is scattersht |
09:12
🔗
|
yipdw |
shot |
09:17
🔗
|
Wyatt|Wor |
Ergh, yeah, If last week's tirade about mongo_mapper is any indication. |
09:17
🔗
|
Wyatt|Wor |
Well a couple weeks, I guess. |
09:18
🔗
|
yipdw |
oh, I had no idea mongo_mapper sucked that bad |
09:19
🔗
|
Wyatt|Wor |
Oh, did you read his post about it? |
09:19
🔗
|
yipdw |
yeah |
09:20
🔗
|
yipdw |
I also realized that the gems I maintain do not include test files or a Rakefile in their gem form |
09:20
🔗
|
yipdw |
under the rationale that tests and build process are useful only to a developer |
09:20
🔗
|
yipdw |
I'll have to change that |
09:20
🔗
|
yipdw |
somehow it didn't click that someone might want to use the *.gem and repackage it in a package manager that does things like run tests |
09:21
🔗
|
Wyatt|Wor |
Hehe, yeah. The most recent post is a semi-continuation of the mongo_mapper post, too. This happens every couple months, or so, btw |
09:22
🔗
|
yipdw |
I'm surprised he's stuck with it |
09:22
🔗
|
yipdw |
(I didn't :P) |
09:22
🔗
|
yipdw |
try to get gems to play nice with the package manager that is |
09:22
🔗
|
Wyatt|Wor |
And thanks! I'm not a Ruby user, personally, but I'm always thankful when release engineering is improved. |
09:23
🔗
|
Wyatt|Wor |
Yeah, he _really_ loves him some ruby |
09:23
🔗
|
yipdw |
yeah, no problem |
09:23
🔗
|
yipdw |
thanks for pointing out flameeyes' blog |
09:24
🔗
|
yipdw |
I'll follow it, as he is the first person I've seen who is still sticking with it |
09:24
🔗
|
yipdw |
most other people I know who do Ruby use rvm + bundler to just throw all of an application's dependencies into a directory |
09:24
🔗
|
yipdw |
I mean, it works, and it isolates things |
09:24
🔗
|
yipdw |
but it is very heavy |
09:25
🔗
|
yipdw |
it makes sense on systems that don't really try to define their system configuration in terms of packages |
09:25
🔗
|
yipdw |
like Windows, OS X |
09:25
🔗
|
Wyatt|Wor |
RVM is kind of neat for developers. |
09:26
🔗
|
Wyatt|Wor |
But it's a nightmare for our setup. |
09:26
🔗
|
Wyatt|Wor |
(Speaking of dependency hell, http://blog.flameeyes.eu/files/bones-dependencies-graph.png) |
09:28
🔗
|
yipdw |
haha what |
09:28
🔗
|
yipdw |
oh bones |
09:28
🔗
|
yipdw |
ugh |
09:28
🔗
|
yipdw |
I do not like bones, jeweler, hoe |
09:29
🔗
|
yipdw |
they make the process of making a gem so ridiculously complex |
09:29
🔗
|
Wyatt|Wor |
Apparently they make the packaging difficult, too |
09:29
🔗
|
yipdw |
actually, the gem command in bundler is very minimal and seems to do it bets |
09:29
🔗
|
yipdw |
best |
09:35
🔗
|
yipdw |
oh! |
09:35
🔗
|
yipdw |
fanfiction.net pages include the canonical URL |
09:35
🔗
|
yipdw |
badass |
09:42
🔗
|
Wyatt|Wor |
They generate a lot of stuff into their pages, as I recall it. |
09:42
🔗
|
yipdw |
yeah, really helps with retrieval |
09:52
🔗
|
yipdw |
"To modrenaissancewoman: Thank you for pointing that out. I thought French kissing is the one where friends give each other on their cheeks. My mistakes." |
09:52
🔗
|
yipdw |
whoops |
09:52
🔗
|
Wyatt|Wor |
A common mistake. |
09:52
🔗
|
yipdw |
yeah, but the implications are funny |
09:53
🔗
|
Wyatt|Wor |
lol, was joking. |
09:53
🔗
|
bsmith093 |
ruby get_one_story.rb http://www.fanfiction.net/s/4/1/get_one_story.rb:11: warning: already initialized constant VERSION get_one_story.rb:11: command not found: ./make_story_urls.rb http://www.fanfiction.net/s/4/1/ get_one_story.rb:26:in `initialize': No such file or directory - /home/ben/1577729/data/h/ht/htt/http://www.fanfiction.net/s/4/1//http://www.fanfiction.net/s/4/1/_urls (Errno::ENOENT) from get_one_story.rb:26:in `open |
09:53
🔗
|
yipdw |
bsmith093: it's just the ID, not the full URL |
09:53
🔗
|
bsmith093 |
/home/ben/1577729/wget-warc -U 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.54 Safari/535.2' -o /home/ben/1577729/data/4/4/4/4/4.log -e 'robots=off' --warc-file=/home/ben/1577729/data/4/4/4/4/4 --warc-max-size=inf --warc-header='operator: Archive Team' --warc-header='ff-download-script-version: 20120108.01' -nd -nv --no-timestamping --page-requisites -i /home/ben/1577729/d |
09:53
🔗
|
bsmith093 |
get_one_story.rb:11: command not found: ./make_story_urls.rb 4 |
09:53
🔗
|
bsmith093 |
get_one_story.rb:11: warning: already initialized constant VERSION |
09:53
🔗
|
bsmith093 |
ruby get_one_story.rb 4 |
09:53
🔗
|
bsmith093 |
sh: /home/ben/1577729/wget-warc: not found |
09:53
🔗
|
bsmith093 |
that, then |
09:53
🔗
|
yipdw |
you need wget-warc |
09:54
🔗
|
yipdw |
or some wget that does WARC |
09:54
🔗
|
yipdw |
adjust the WGET_WARC constant as required |
09:54
🔗
|
bsmith093 |
oy right hold on |
09:56
🔗
|
bsmith093 |
whats gnutls and do i want wget-warc compiled with it
09:57
🔗
|
bsmith093 |
get_one_story.rb:11: command not found: ./make_story_urls.rb 4 |
09:57
🔗
|
yipdw |
man, fanfiction.net really does not want to be archived |
09:57
🔗
|
yipdw |
in addition to their robots.txt file there's a ROBOTS=NOARCHIVE meta tag in every generated output |
09:58
🔗
|
yipdw |
I feel bad doing this |
09:58
🔗
|
bsmith093 |
well it is technically against the tos not that i care |
09:58
🔗
|
Wyatt|Wor |
yipdw: Yeah, I mentioned that a while back, I think. |
09:58
🔗
|
yipdw |
hm |
09:58
🔗
|
yipdw |
yeah |
09:58
🔗
|
Wyatt|Wor |
bsmith093: Oh dear, I might get the account I don't have banned. |
09:58
🔗
|
yipdw |
I think at this point I'll just stop |
09:59
🔗
|
* |
chronomex slides in |
09:59
🔗
|
yipdw |
I mean, yes, I understand the point of archiving this, but on the other hand ignoring all of those signs is really shitty netizen behavior |
10:00
🔗
|
chronomex |
shitty netizen on one hand, but fanfic people on the other hand |
10:00
🔗
|
chronomex |
the noarchive bullshit is just ff.n trying to force the internet to depend on its continued existence |
10:01
🔗
|
Wyatt|Wor |
I think that's debatable. It's not very good netizenship to put yourself in a position where millions of users' work could just disappear, either. |
10:01
🔗
|
yipdw |
right |
10:01
🔗
|
chronomex |
I know that the existence of public logs is going to cause me to regret saying so eventually, but fuck that shit. |
10:01
🔗
|
yipdw |
a moral quandary
10:01
🔗
|
Wyatt|Wor |
History lasts longer than any one website. |
10:02
🔗
|
bsmith093 |
story_page = agent.get(UrlGenerators::STORY_URL[sid, '']) this line in make_story_urls is throwing an error |
10:02
🔗
|
Wyatt|Wor |
Which is about as weird a way as I could have found to express that. |
10:02
🔗
|
chronomex |
in my experience, fanfic people can be rabidly anti-archivism, and I have no idea why -- especially because all the fannish people I've met save webpages religiously |
10:03
🔗
|
Wyatt|Wor |
And they don't tend to keep backups. |
10:03
🔗
|
Wyatt|Wor |
Of their own stuff, at least. |
10:03
🔗
|
chronomex |
well, maybe. |
10:03
🔗
|
yipdw |
chronomex: http://ansuz.sooke.bc.ca/entry/35 is one theory |
10:03
🔗
|
bsmith093 |
the "its my story, and ill kill it if i want to" line of thought
10:03
🔗
|
yipdw |
bsmith093: there's more to it than that |
10:03
🔗
|
chronomex |
bsmith093: yes, that. exactly. |
10:04
🔗
|
bsmith093 |
/home/ben/.gem/ruby/1.8/gems/mechanize-2.1/lib/mechanize/http/agent.rb:303:in `fetch': 404 => Net::HTTPNotFound (Mechanize::ResponseCodeError) |
10:04
🔗
|
bsmith093 |
ben@ben-laptop:~/1577729$ ruby make_story_urls.rb |
10:04
🔗
|
bsmith093 |
from /home/ben/.gem/ruby/1.8/gems/mechanize-2.1/lib/mechanize.rb:319:in `get' |
10:04
🔗
|
bsmith093 |
from make_story_urls.rb:16 |
10:04
🔗
|
bsmith093 |
sorry, forgot to strip line breaks
10:04
🔗
|
yipdw |
a lot of fandoms are actually very sensitive to the legal complications surrounding their fandom |
10:05
🔗
|
yipdw |
bsmith093: make_story_urls is meant to be called from get_one_story, and it requires a story ID |
10:05
🔗
|
bsmith093 |
oy well that explains it |
10:05
🔗
|
bsmith093 |
ruby make_story_urls.rb 4 |
10:06
🔗
|
bsmith093 |
worked perfectly |
10:08
🔗
|
yipdw |
lol wtf |
10:08
🔗
|
yipdw |
http://b.fanfiction.net/static/styles/fanfiction42.css |
10:08
🔗
|
yipdw |
I do not know how the fuck that is coming back |
10:09
🔗
|
yipdw |
if I get that with curl, I get gzipped CSS (?!) |
10:09
🔗
|
yipdw |
if I get that with Chrome, I get an HTML page that has the CSS between <pre> tags |
10:09
🔗
|
Wyatt|Wor |
yipdw: gzipped CSS!? |
10:09
🔗
|
yipdw |
and I mean it's gzipped CSS, not merely sent with Content-Encoding: gzip and compressed by the server |
10:09
🔗
|
yipdw |
Wyatt|Wor: yeah, try it |
10:10
🔗
|
Wyatt|Wor |
I... |
10:10
🔗
|
Wyatt|Wor |
What. |
10:10
🔗
|
yipdw |
I am amazed that works |
10:11
🔗
|
chronomex |
no <pre> tags in opera |
10:11
🔗
|
yipdw |
oh |
10:11
🔗
|
yipdw |
that might just be the web inspector |
10:11
🔗
|
chronomex |
are you viewing-source in chrome? |
10:11
🔗
|
yipdw |
I am now |
10:11
🔗
|
yipdw |
and yeah, that appears fine |
10:12
🔗
|
yipdw |
but that is so weird |
10:12
🔗
|
bsmith093 |
quick thing i have a list of id numbers in a file, and they work individually, but the autogeneration part of the script seems to be tripping over itself
10:13
🔗
|
bsmith093 |
could you just package the id list into the repo |
10:13
🔗
|
yipdw |
well, wait |
10:13
🔗
|
yipdw |
it IS sent with Content-Encoding: gzip |
10:13
🔗
|
yipdw |
so I guess that's valid |
10:13
🔗
|
Wyatt|Wor |
Huh, interesting. |
10:13
🔗
|
yipdw |
I expected curl to inflate the stream, though |
10:13
🔗
|
yipdw |
to say nothing of wget |
10:14
🔗
|
yipdw |
are they gzipping gzipped data? |
10:14
🔗
|
chronomex |
Content-Encoding: gzip |
10:14
🔗
|
yipdw |
right |
10:14
🔗
|
yipdw |
I thought curl/wget would be able to handle that by inflating the stream |
10:14
🔗
|
chronomex |
it is single-gzipped |
10:14
🔗
|
chronomex |
(curl | gunzip) --> plaintext |
10:15
🔗
|
yipdw |
yeah, that works |
10:17
🔗
|
yipdw |
ohh |
10:17
🔗
|
yipdw |
b.fanfiction.net sends that regardless of Accept-Encoding |
10:17
🔗
|
yipdw |
that's...broken |
10:21
🔗
|
yipdw |
I guess we just need to download and gunzip that separately |
10:21
🔗
|
yipdw |
or something |
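Downloading and gunzipping separately is straightforward; a roundtrip sketch with Ruby's stdlib Zlib (sample CSS standing in for the real file):

```ruby
require "zlib"

# b.fanfiction.net hands back gzip bytes no matter what Accept-Encoding
# says, so a fetcher that doesn't auto-inflate has to do it itself.
css     = "body { color: #333; }"  # stand-in for fanfiction42.css
gzipped = Zlib.gzip(css)           # what the server effectively serves
plain   = Zlib.gunzip(gzipped)     # inflate after download
plain == css  # true
```

Zlib.gzip / Zlib.gunzip ship with Ruby (2.4+); older wget-warc runs would pipe the body through gunzip instead.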
10:21
🔗
|
yipdw |
tricksy |
10:37
🔗
|
bsmith093 |
well its 5:36am est so being in ny, im going to bed, keep the repo updated, ciao, night | morning depending on timezones
10:43
🔗
|
chronomex |
nite |
11:46
🔗
|
SketchCow |
And here I am!! |
11:46
🔗
|
SketchCow |
Packing the car up |
11:46
🔗
|
Wyatt|Wor |
Gah, I thought dotwizards.com would be some cool Japanese pixel art site. Alas, corporate coaching. |
11:47
🔗
|
Wyatt|Wor |
SketchCow: Ah, have a good Magfest? |
11:47
🔗
|
SketchCow |
I had a very good magfest. |
11:47
🔗
|
Wyatt|Wor |
Awesome. That couple with the arcade sounds like it's going to be an awesome...err, episode? |
11:49
🔗
|
SketchCow |
Just more filming stuff |
11:49
🔗
|
SketchCow |
But yeah, I like them a lot. |
11:50
🔗
|
SketchCow |
http://www.facebook.com/SavePointMD |
11:51
🔗
|
Wyatt|Wor |
How much do you think Arcade will cover in terms of pinball's role in arcades? I mean, yeah there's Tilt! (which I need to get a copy of, come to think of it), but I'm a fanatic. ;) |
12:54
🔗
|
SketchCow |
Good question, no answer. |
14:42
🔗
|
SketchCow |
can someoneca |
14:43
🔗
|
SketchCow |
hey |
14:44
🔗
|
SketchCow |
underscor gave a rough address in here a ways back. google link. can someone tell it to me?
14:45
🔗
|
underscor |
http://g.co/maps/zge32 |
14:45
🔗
|
SketchCow |
on phone, keyboard fuckery limited. |
14:45
🔗
|
SketchCow |
just give me the location, kid |
14:45
🔗
|
underscor |
it's grassy knoll ct woodbridge, va 22193 |
14:45
🔗
|
SketchCow |
om |
14:46
🔗
|
PatC |
Is there a nickserv here? |
14:46
🔗
|
underscor |
no |
14:46
🔗
|
underscor |
no services on efnet |
14:46
🔗
|
underscor |
Besides chanfix |
14:46
🔗
|
PatC |
ok |
14:47
🔗
|
underscor |
SketchCow: does this mean you'll be here in like 45 minutes? |
14:47
🔗
|
underscor |
or are you just planning ahead |
14:47
🔗
|
SketchCow |
may e |
14:47
🔗
|
underscor |
oh |
14:47
🔗
|
underscor |
damn, we have church at 11 |
14:48
🔗
|
SketchCow |
see soon! |
14:48
🔗
|
SketchCow |
when do you get back? |
14:49
🔗
|
SketchCow |
no rush. |
14:49
🔗
|
SketchCow |
no ticking off family. |
14:49
🔗
|
underscor |
1:15ish |
14:49
🔗
|
underscor |
hahah |
14:51
🔗
|
SketchCow |
see you around then. |
20:07
🔗
|
closure |
Someone posted some data to usenet in 1982 and I made a visualization of it today. http://olduse.net/blog/current_usenet_map/ fun collaboration :) |
20:14
🔗
|
closure |
I especially like the tall doubly linked list of systems at the bottom. we don't build networks like that anymore. |
20:15
🔗
|
nitro2k01 |
Token Link? |
20:15
🔗
|
nitro2k01 |
Token Ring rather |
20:15
🔗
|
nitro2k01 |
Kill me! |
20:17
🔗
|
closure |
could be token ring, more likely it was a dozen systems talking over 300 baud dialup |
20:18
🔗
|
closure |
hmm, actually, token ring seems to be 1985 or so, not 1982 |
20:18
🔗
|
nitro2k01 |
Seems like an expensive way of connecting
20:18
🔗
|
nitro2k01 |
If the middle box needs to reach out, it needs to rely on a bunch of telephone lines |
20:19
🔗
|
closure |
and it probably takes it *days* to get new traffic |
20:19
🔗
|
nitro2k01 |
Damn (whoever) for not providing more metadata |
20:20
🔗
|
closure |
yeah, I hope for a future dataset with more info |
20:20
🔗
|
nitro2k01 |
Also, why the double arrows everywhere? |
20:20
🔗
|
nitro2k01 |
Seems like they don't provide additional information
20:20
🔗
|
closure |
(of course, telehack.org has a newer, much more extensive uucp map they use in their simulation) |
20:21
🔗
|
closure |
bidirectional links, each system could call the other |
20:21
🔗
|
nitro2k01 |
Right. But this applied to ALL of the links? |
20:21
🔗
|
nitro2k01 |
Except the wormhole :p |
20:22
🔗
|
closure |
according to Mark, it did, yes |
20:22
🔗
|
nitro2k01 |
Wait, look at eagle and mhux* |
20:23
🔗
|
nitro2k01 |
Multiple links |
20:23
🔗
|
nitro2k01 |
mhuxj -> eagle *2 |
20:23
🔗
|
nitro2k01 |
mhuxj <-> mhuxm *2 |
20:23
🔗
|
closure |
yeah, I've been fixing a few that he doubled |
20:24
🔗
|
nitro2k01 |
Oh, so that's not even useful data? ._. |
20:25
🔗
|
closure |
well, look at the original post :P |
20:25
🔗
|
closure |
it was like a bunch of badly formatted lines from 1982 |
20:25
🔗
|
nitro2k01 |
But that's like text and stuff |
20:25
🔗
|
nitro2k01 |
I can't read text |
20:25
🔗
|
closure |
hahahaha |
20:26
🔗
|
nitro2k01 |
Like, if someone would send me a link to textfiles.com |
20:26
🔗
|
nitro2k01 |
I'd be lost |
20:27
🔗
|
closure |
this is why I thought a graphical map would be nice.. I personally prefer the handdrawn ascii ones below it though |
20:27
🔗
|
nitro2k01 |
In fact, if I didn't have this program that translated IRC messages to pictures of fruit, I couldn't have this conversation
20:27
🔗
|
closure |
<dingbat> <cloud> <mushroom> <teletype> |
20:30
🔗
|
nitro2k01 |
http://www.textfiles.com/conspiracy/art-04.txt |
20:30
🔗
|
nitro2k01 |
Just look at the second paragraph |
20:31
🔗
|
nitro2k01 |
How nicely sliced it is |
20:31
🔗
|
nitro2k01 |
A single diagonal stroke
20:31
🔗
|
nitro2k01 |
Same with the first too actually |
22:20
🔗
|
ndurner_c |
Hi |
22:21
🔗
|
ndurner_c |
Any emergency downloads going on right now? |
22:29
🔗
|
ndurner_c |
(will read the log tomorrow.. gn8) |