Time |
Nickname |
Message |
00:40
🔗
|
SketchCow |
Hi gang |
00:41
🔗
|
SketchCow |
I am not well, I'll be on here and there. |
00:43
🔗
|
bbot_ |
alright |
00:44
🔗
|
Paradoks |
Short-term not-well, hopefully? Regardless, hopefully things get better soon. |
00:48
🔗
|
dashcloud |
hope things get better soon |
00:49
🔗
|
bsmith093 |
ditto |
00:57
🔗
|
dashcloud |
hope things get better for you |
01:24
🔗
|
GLaDOS |
I am the wizard. |
01:29
🔗
|
Coderjoe |
yer a lizzard harry |
01:31
🔗
|
rude___ |
I am the robot zydel |
01:35
🔗
|
Coderjoe |
better than whiny brat Cindel |
02:18
🔗
|
underscor |
SketchCow: Aww, why'd you ban zetathrustra? |
02:53
🔗
|
db48x |
stupid irc |
03:22
🔗
|
yipdw |
bsmith093: that's not 80% full, that's 1% |
03:22
🔗
|
yipdw |
in any case, I made further modifications to the crawler to not hit the same page twice |
03:23
🔗
|
bsmith093 |
sorry must have missed a decimal place |
03:23
🔗
|
yipdw |
been running it for about two hours, it's pulled back 173,224 story IDs |
03:23
🔗
|
yipdw |
I inserted random 5-second sleeps |
03:23
🔗
|
yipdw |
in order to not be an asshole |
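The two crawler changes yipdw describes (never requesting the same page twice, and pausing between requests) could be sketched roughly as below. This is not yipdw's actual code: `crawl`, `fetch_links`, and the reading of "random 5-second sleeps" as "a random pause of up to 5 seconds" are all assumptions made for illustration.

```python
import random
import time
from collections import deque


def crawl(start, fetch_links, max_delay=5.0, sleep=time.sleep, rng=random.random):
    """Breadth-first crawl that requests each page at most once.

    fetch_links is a hypothetical callable: it takes a URL and returns the
    URLs linked from that page. sleep and rng are injectable so the pacing
    can be checked without actually waiting.
    """
    seen = {start}          # every URL already queued or fetched
    queue = deque([start])
    order = []              # URLs in the order they were requested
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in fetch_links(url):
            if link not in seen:   # skip pages we already know about
                seen.add(link)
                queue.append(link)
        if queue:
            # Assumed politeness policy: pause up to max_delay seconds
            # before the next request.
            sleep(rng() * max_delay)
    return order
```

The `seen` set is what prevents duplicate requests; the sleep only slows the crawl down and does not affect which pages are visited.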
03:27
🔗
|
* |
yipdw also finally figured out his dash vault, woo |
03:35
🔗
|
chronomex |
underscor: because it talked. we've been over this already. |
03:38
🔗
|
dnova |
I don't get it |
03:45
🔗
|
underscor |
chronomex: I was more insightful than some people we allow to stay in here |
03:45
🔗
|
underscor |
Plus, it was only once per day |
03:45
🔗
|
underscor |
:( |
03:45
🔗
|
chronomex |
hahaha I suppose that's true |
04:08
🔗
|
underscor |
s/I/It/ |
04:08
🔗
|
chronomex |
my statement stands |
04:09
🔗
|
Paradoks |
dnova: zetathrustra is underscor's bot. It made a smart-alec comment, and was then banned. |
04:09
🔗
|
dnova |
but we thrive on smart-alec comments |
04:09
🔗
|
Paradoks |
Yes, but not generally pre-programmed ones. |
04:10
🔗
|
chronomex |
we do not allow bots which speak in the channel |
04:10
🔗
|
dnova |
ah, welp. |
04:10
🔗
|
chronomex |
it is that simple |
04:11
🔗
|
GLaDOS |
I should stick a MegaHAL-containing bot in here, and see what the resulting dict is somewhere else. |
04:11
🔗
|
GLaDOS |
Alternatively, teach it C++ and python, with bits of ruby |
04:14
🔗
|
chronomex |
you don't need to bring a bot in here, the logs are public |
04:15
🔗
|
GLaDOS |
Pff, wheres the fun in that? |
04:15
🔗
|
GLaDOS |
Alternatively, I've somehow managed to use 5TB of Traffic in 8 days, over my 1TB limit. |
04:15
🔗
|
GLaDOS |
I now have a 2000 dollar bill |
04:20
🔗
|
underscor |
They're not preprogrammed |
04:20
🔗
|
underscor |
They're word-delineated markov chains. |
04:21
🔗
|
underscor |
Based on the occurrences in the channel |
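A word-level Markov chain of the kind underscor describes can be sketched in a few lines. This is only an illustration of the general technique, not the bot's code; `build_chain` and `babble` are hypothetical names.

```python
import random
from collections import defaultdict


def build_chain(lines):
    """Map each word to the list of words seen directly after it."""
    chain = defaultdict(list)
    for line in lines:
        words = line.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def babble(chain, start, max_words=10, rng=random):
    """Walk the chain from start, picking each next word at random."""
    words = [start]
    # Stop at max_words or when the last word has no recorded successor.
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)
```

Because successors are stored with repeats, a word that often follows another in the channel is proportionally more likely to be picked, which is why the output depends on what was said rather than on canned phrases.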
04:21
🔗
|
underscor |
GLaDOS: Uh oh |
04:22
🔗
|
yipdw |
wtf, there's Cirque du Soleil fanfiction |
04:22
🔗
|
yipdw |
how is that even possible |
04:22
🔗
|
yipdw |
there are no Cirque characters that even have names |
04:22
🔗
|
GLaDOS |
However, it spikes, so I'm going to say it's a DDoS and contact the host. |
04:41
🔗
|
SketchCow |
Back, took some rest. |
04:41
🔗
|
SketchCow |
Yeah, so fuck bots. |
04:51
🔗
|
bsmith093 |
anyone else scraping poenews, or is it just me? |
04:51
🔗
|
bsmith093 |
poe-news.com |
05:30
🔗
|
bsmith093 |
you're all over the world, you can't all be asleep and/or busy |
05:30
🔗
|
* |
GLaDOS is away: zzz |
05:51
🔗
|
Coderjoe |
scumbag LoC: maintains MARC 21 on just about everything. charges thousands of dollars for access. |
05:51
🔗
|
bsmith093 |
SketchCow: im getting 503 filename prohibited errors with my upload |
05:54
🔗
|
chronomex |
bsmith093: what filename are you using? |
05:54
🔗
|
bsmith093 |
Thief_and_the_Cobbler_Recobbled_Cut minus the underscores, for the filename |
05:55
🔗
|
chronomex |
ThiefandtheCobblerRecobbledCut , you mean? |
05:55
🔗
|
chronomex |
what is your upload command exactly? |
05:55
🔗
|
bsmith093 |
er, no the file has spaces, the identifier for the archive has underscores, could that be the problem? |
05:56
🔗
|
bsmith093 |
gftp |
05:56
🔗
|
chronomex |
what is the exact name of the file |
05:56
🔗
|
chronomex |
and where are you putting it |
05:56
🔗
|
chronomex |
--exact-- name |
05:56
🔗
|
bsmith093 |
"Thief and the Cobbler Recobbled Cut.iso" |
05:57
🔗
|
chronomex |
and what is the item name? |
05:57
🔗
|
chronomex |
I don't think IA likes spaces. |
05:57
🔗
|
bsmith093 |
the item name i gave IA is that but with underscores instead of spaces |
05:57
🔗
|
bsmith093 |
Thief_and_the_Cobbler_Recobbled_Cut is the folder ia gave me |
05:58
🔗
|
Coderjoe |
pretty sure IA does not like spaces in filenames |
05:58
🔗
|
bsmith093 |
juj ok easy to fix |
05:58
🔗
|
DFJustin |
yeah their uploader replaces them with _ |
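The fix bsmith093 needs is to rename the file before uploading. A minimal sketch follows; it mimics the space-to-underscore substitution DFJustin mentions and is not the Internet Archive's own code. `ia_safe_name` is a hypothetical helper.

```python
def ia_safe_name(filename):
    """Replace spaces with underscores, as the IA uploader is said to do."""
    return filename.replace(" ", "_")


print(ia_safe_name("Thief and the Cobbler Recobbled Cut.iso"))
# prints Thief_and_the_Cobbler_Recobbled_Cut.iso
```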
05:58
🔗
|
yipdw |
also, did I miss something, or when did you get authorization to upload that |
05:59
🔗
|
yipdw |
as far as I can tell that was released in 1995 |
05:59
🔗
|
bsmith093 |
actually i meant to ask about that, does anyone actually know if its pd or what |
05:59
🔗
|
yipdw |
uh |
05:59
🔗
|
yipdw |
it started production in 1964 |
05:59
🔗
|
yipdw |
released in 1995 |
05:59
🔗
|
yipdw |
no matter which way you slice it, no |
06:00
🔗
|
bsmith093 |
k then no upload for me, then. |
06:00
🔗
|
yipdw |
I suggest hunting down the rights to that first |
06:00
🔗
|
yipdw |
unless IA policy says otherwise |
06:00
🔗
|
yipdw |
(I'm not sure) |
06:00
🔗
|
bsmith093 |
its a boondoggle, like you wouldnt believe, this is a fanmade version that's as close as possible to the original plan |
06:01
🔗
|
bsmith093 |
is this too iffy? |
06:02
🔗
|
yipdw |
not my area of expertise; it just sounded weird |
06:02
🔗
|
yipdw |
if IA says do it, maybe it's best to let them sort it out |
06:02
🔗
|
bsmith093 |
i would say this is most definitely transformative, but im not really prepared to back that up, legally |
06:02
🔗
|
bsmith093 |
who would i talk to |
06:02
🔗
|
chronomex |
yipdw: we do not worry about ia policy. ia policy is ia does not worry until someone complains. |
06:02
🔗
|
yipdw |
ok |
06:02
🔗
|
yipdw |
upload is fine then, I guess |
06:03
🔗
|
bsmith093 |
k then thats what i thought, if they worry they can always make it dark, and just keep it backed up but offline, which is what they do anyway, right? |
06:05
🔗
|
Coderjoe |
yipdw: iirc, ia will also hold it, just dark, until the copyright expires (if that ever happens) |
06:05
🔗
|
Coderjoe |
(after a complaint at least) |
06:06
🔗
|
yipdw |
sounds fair |
06:07
🔗
|
bsmith093 |
hey another thing, how do i go about adding things already in the archive to a collection, so they are more easily findable in one page rather than as each individual search result? |
06:07
🔗
|
Coderjoe |
i wonder if the lighttpd/nginx config has something that prevents access to dark items, even if you manage to somehow know the location of the files |
06:07
🔗
|
bsmith093 |
im talking about felix the cat, in case you're wondering |
06:07
🔗
|
Coderjoe |
bsmith093: that requires IA staffer intervention, afaik |
06:08
🔗
|
Coderjoe |
(as far as I know) |
06:08
🔗
|
bsmith093 |
ah ok then so should i just redownload and upload to a felix the cat folder? |
06:08
🔗
|
Coderjoe |
no. they would create a collection and then modify the items to add them to that collection |
06:09
🔗
|
bsmith093 |
so just put a message on the forum or something? |
06:11
🔗
|
Coderjoe |
i don't really know how you get their attention |
06:11
🔗
|
DFJustin |
info@archive.org supposedly |
06:12
🔗
|
bsmith093 |
they should really have an irc channel |
06:13
🔗
|
Coderjoe |
oh I love you google |
06:13
🔗
|
Coderjoe |
"here's a breakdown of activity on group X". click on an item: "Cannot find x. There is no group named x." |
06:13
🔗
|
bsmith093 |
Coderjoe: for what exactly |
06:14
🔗
|
bsmith093 |
gotta love tat |
06:14
🔗
|
bsmith093 |
that |
06:14
🔗
|
bsmith093 |
ive run into that whenever my searches get **really** specific |
06:15
🔗
|
Coderjoe |
this is for a mailing list I was on at some point in the past |
06:15
🔗
|
Coderjoe |
which apparently no longer exists |
06:46
🔗
|
balrog |
does anyone know of any ancient UNIX stuff which isn't in TUHS? |
06:47
🔗
|
Coderjoe |
how ancient and what is TUHS? |
06:49
🔗
|
SketchCow |
I now own textfiles.xxx |
06:49
🔗
|
SketchCow |
Go team |
06:49
🔗
|
dnova |
haha |
06:49
🔗
|
bsmith093 |
good for u, another mirror or to keep out the cybersquatters? |
06:49
🔗
|
dnova |
what registrar did you use? |
06:49
🔗
|
Coderjoe |
did you buy textfiles.co when Colombia's landrush happened? |
06:49
🔗
|
SketchCow |
No, that's bullshit |
06:50
🔗
|
balrog |
Coderjoe: ancient, unix v6 |
06:50
🔗
|
dnova |
I want an .xxx or two but the price is a bit much |
06:50
🔗
|
balrog |
TUHS is The Unix Heritage Society |
06:50
🔗
|
SketchCow |
This is more of a move by me to protect against being ICE seized. |
06:50
🔗
|
bsmith093 |
why would ICE care? |
06:51
🔗
|
Coderjoe |
holy shit. $80.18 per year from my source. |
06:51
🔗
|
Coderjoe |
bsmith093: ICE usually goes after the copyright violation domain names, iirc |
06:52
🔗
|
bsmith093 |
yeah, and...? textfiles.com cant possibly be violating anything with domain names right? |
06:52
🔗
|
bsmith093 |
oh wait, yeah that makes much more sense |
06:53
🔗
|
bsmith093 |
the stuff on the site, wow im tired. :P |
06:57
🔗
|
bsmith093 |
gnight/gmorning ,all |
06:59
🔗
|
dnova |
g'night, bsmith093 |
07:41
🔗
|
SketchCow |
Once we decided that the US could "seize" domain names, and believe me, it's utterly untested in 1000 ways, it's just a matter of time. |
07:43
🔗
|
SketchCow |
Boy, I am adding a ton of french magazines. |
07:43
🔗
|
dnova |
zut alors! |
07:44
🔗
|
SketchCow |
I need to do a weblog post about all the stuff I've added, followed by a request to give money to archive.org |
07:45
🔗
|
SketchCow |
Also, my scripts have become a ton more flexible since I started this, with more error correction and clarity. |
07:48
🔗
|
SketchCow |
It's just this thing has a ton of magazines with, like, 8 issues. |
07:48
🔗
|
SketchCow |
So it takes me 2 minutes to set up, or 5 depending. |
07:48
🔗
|
SketchCow |
then just 8 issues. |
07:48
🔗
|
SketchCow |
But when it's 120... then we're cooking with gas. |
07:50
🔗
|
SketchCow |
The fun one is http://www.archive.org/details/computermagazines-french-porte-revues |
09:20
🔗
|
Coderjoe |
http://www.youtube.com/watch?v=zWu0W1kGvsQ&t=5m15s |
09:20
🔗
|
Coderjoe |
I threw up in my mouth a bit |
09:25
🔗
|
dnova |
oh well I'll be sure to watch that then |
09:25
🔗
|
SketchCow |
Yeah, right on it |
09:25
🔗
|
Coderjoe |
it's an example transfer from 9.5mm done by some other company than the one that made the video. |
09:26
🔗
|
Coderjoe |
(film to dvd transfer) |
09:26
🔗
|
SketchCow |
Know what's great? http://procatinator.com/ |
09:26
🔗
|
Coderjoe |
does that use some kind of metadata to match up cats to songs? |
09:27
🔗
|
SketchCow |
I would think all video and audio uses metadata |
09:27
🔗
|
Coderjoe |
I find it rather weird that it possibly randomly matched up "Walking on Sunshine" to a cat on a treadmill |
09:28
🔗
|
Coderjoe |
ok. this is not random matchups |
09:35
🔗
|
dnova |
yeah they're not random |
09:39
🔗
|
SketchCow |
I'm going to get in trouble for the new weblog posting. |
09:39
🔗
|
SketchCow |
But fuck it |
09:43
🔗
|
godane |
this is funny |
09:43
🔗
|
godane |
the bigger ipod versions of crankygeeks are worser i think |
09:44
🔗
|
godane |
i'm backing up diggnation next |
09:44
🔗
|
godane |
or at least the first 100 episodes |
09:45
🔗
|
godane |
looks like i was right on that something was changed with crankygeeks ipod format before it ended |
09:46
🔗
|
godane |
jan 13 2010 show doesn't have pixel blocks |
09:46
🔗
|
godane |
but april 22 2010 show does |
09:49
🔗
|
dnova |
Why do you think you'll get in trouble for that, SketchCow? |
09:49
🔗
|
dnova |
I fucking love your slamposts. I just really, really hope I am never the subject of one |
09:50
🔗
|
dnova |
"here are 5,000 cited reasons why dnova SUCKS" |
09:51
🔗
|
RedType |
that's not a slam, that's an aggressive motivational speech |
09:51
🔗
|
dnova |
haha |
09:54
🔗
|
SketchCow |
It's a call to arms presented while standing on the corpse of a fat guy |
10:05
🔗
|
SketchCow |
I'm now writing an entry with what I've been putting in the Archive these past few months. |
11:16
🔗
|
dnova |
is this ever going to stop? |
11:16
🔗
|
dnova |
15787 ./tmpfs/it/perijulka |
11:16
🔗
|
dnova |
that's megabytes |
11:19
🔗
|
marceloan |
Hey, I've uploaded all my Splinder files to the Batcave. |
11:20
🔗
|
dnova |
bodacious |
11:20
🔗
|
dnova |
did you say so on the wiki? |
11:20
🔗
|
marceloan |
What? |
11:21
🔗
|
dnova |
I'll take care of it |
11:22
🔗
|
dnova |
http://archiveteam.org/index.php?title=Splinder#Upload_status |
11:23
🔗
|
marceloan |
:) |
16:53
🔗
|
SketchCow |
New front page looks great, dnova |
16:54
🔗
|
dnova |
thanks! still have some ideas |
17:13
🔗
|
Schbirid |
http://www.jorisvanhoboken.nl/?p=308 |
19:53
🔗
|
dnova |
anyone want to watch JAWS on CED? cuzzz I just got it. and 59 others. and 2 players. and that much less space in my house. |
21:03
🔗
|
bsmith094 |
good news my poenews scrape is done |
21:04
🔗
|
bsmith094 |
SketchCow: where do you want the Poe-news.com scrape to go? |
21:09
🔗
|
underscor |
Coderjoe: Items are made dark by permissions of the actual files |
21:09
🔗
|
underscor |
You can always see where an item's files are by going to |
21:09
🔗
|
underscor |
archive.org/download/IDENTIFIER |
21:33
🔗
|
bsmith094 |
SketchCow: ive got some poenews scraped, if you want it |
21:50
🔗
|
underscor |
http://inkdroid.org:3000/ |
21:50
🔗
|
underscor |
Realtime wikipedia edits |
21:57
🔗
|
BlueMax |
underscor: wow. |
22:05
🔗
|
underscor |
http://qaa.ath.cx/TheEmperorsNewClothes.html |
22:10
🔗
|
yipdw |
awesome, now I can see Edit Wars IV: A New Hope |
22:11
🔗
|
pberry |
underscor: I know the guy that made that. He's a library tech guy |
22:22
🔗
|
underscor |
cool! |
22:23
🔗
|
bsmith094 |
yipdw: anything on the ffnet script? |
22:23
🔗
|
yipdw |
bsmith094: it needs to be made more robust to deal with network failures |
22:23
🔗
|
bsmith094 |
also i have probably some and/or all of poenews if anyone wants it |
22:23
🔗
|
yipdw |
I don't know what happened to fanfiction.net last night |
22:23
🔗
|
yipdw |
but they were returning 503s for a while |
22:24
🔗
|
yipdw |
the discovery mechanism does not gracefully cope with those |
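One common way to make a fetch step cope with temporary 503s is to retry with an increasing delay. The sketch below is not the ffnet script's actual design; `TemporaryFailure`, `fetch_with_retry`, and the doubling delay are assumptions chosen for illustration.

```python
import time


class TemporaryFailure(Exception):
    """Hypothetical error a fetch function raises on a 503 or a dropped connection."""


def fetch_with_retry(fetch, url, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on TemporaryFailure with exponential backoff.

    Delays are base_delay, 2*base_delay, 4*base_delay, and so on. The last
    failure is re-raised so the caller can record the URL and move on.
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except TemporaryFailure:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Backing off instead of retrying immediately also avoids adding load to a site that is already struggling.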
22:24
🔗
|
Coderjoe |
not pure bash. it relies on awk. though I haven't really observed any version problems with awk, but that's probably because we haven't really used awk here at AT |
22:25
🔗
|
yipdw |
uh, what |
22:25
🔗
|
Coderjoe |
that link from underscor, which was a bash json interpreter |
22:25
🔗
|
yipdw |
oh |
22:26
🔗
|
Coderjoe |
I would have been really impressed if it was pure bash |
22:26
🔗
|
yipdw |
I have a terrible way to do it |
22:26
🔗
|
yipdw |
write a bash backend for Ragel, write the JSON parser in Ragel |
22:26
🔗
|
Coderjoe |
farm it off to something else that has a json parser? |
22:27
🔗
|
yipdw |
yes, python2.6 -mjson is one way to do it |
22:28
🔗
|
Coderjoe |
yipdw: I don't know about ff.net, but I have observed other sites that go down at regular intervals to do backups and stuff. (which to me screams "you're doing it wrong") |
22:28
🔗
|
yipdw |
or whatever version you've got |
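Handing the JSON off to Python, as yipdw suggests, might look like the sketch below, which uses only the standard-library `json` module. `extract` and its dotted-path syntax are hypothetical; a shell script would pipe the document into a small wrapper around this function.

```python
import json


def extract(text, path):
    """Return the value at a dotted path, e.g. "files.0.name", from JSON text."""
    value = json.loads(text)
    for part in path.split("."):
        # Lists are indexed by number, objects by key.
        value = value[int(part)] if isinstance(value, list) else value[part]
    return value
```

For simply validating or pretty-printing JSON from a shell, the standard library also ships `python -m json.tool`.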
22:28
🔗
|
yipdw |
Coderjoe: well, either way, the network is never reliable etc |
22:28
🔗
|
Coderjoe |
except those other sites just stop accepting on 80 |
22:28
🔗
|
Coderjoe |
true |
22:28
🔗
|
yipdw |
it really felt like someone was just hammering ff |
22:28
🔗
|
yipdw |
wasn't me, I was just running two connections at a time |
22:29
🔗
|
Coderjoe |
perhaps some asshole that also wants a copy of everything? |
22:29
🔗
|
bsmith094 |
wasnt me either, i was scraping poenews |
22:29
🔗
|
yipdw |
maybe |
22:30
🔗
|
bsmith094 |
what kind of webserver can't take the load of two mirroring efforts at once, anyway? |
22:31
🔗
|
yipdw |
it's not that uncommon |
22:32
🔗
|
yipdw |
the problem is rarely the web server, though |
22:33
🔗
|
yipdw |
the application to which the server proxies is usually your bottleneck |
22:33
🔗
|
bsmith094 |
are there *any* sites left that are just html and links |
22:34
🔗
|
bsmith094 |
those are dead easy to save, this place has to write custom code for every job |
22:35
🔗
|
Coderjoe |
4chan's only database and non-static content, last I knew, was the actual posting script. the thread pages and index pages were re-written as static html when a new post came in that affected them |
22:35
🔗
|
bbot_ |
well, we're mostly interested in user-generated content |
22:35
🔗
|
Coderjoe |
(as an example) |
22:35
🔗
|
bbot_ |
and if you let users upload arbitrary HTML, then things get zany |
22:36
🔗
|
yipdw |
consider, too, that the characteristics of a mirroring operation are not the same as what a human would do |
22:36
🔗
|
yipdw |
for one, mirroring will fuck your cache |
22:36
🔗
|
Coderjoe |
like there was no tomorrow |
22:37
🔗
|
yipdw |
because mirroring is going to request everything, including rarely-hit pages; and if it takes a lot of resources to generate HTML then that can bring an application down |
22:37
🔗
|
yipdw |
or it'll generate a lot of content to be dumped into cache and depending on the cache expiration policy that may shove hot data out |
22:38
🔗
|
yipdw |
(I mean, it shouldn't, but...) |
22:49
🔗
|
Coderjoe |
even with a basic LRU policy, that will depend on the ratio of normal users to mirror users |
22:50
🔗
|
yipdw |
yeah, there's a lot of factors |
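The cache effect yipdw and Coderjoe describe can be shown with a toy LRU cache: one pass over rarely-hit pages is enough to push the hot pages out. This is a simplified model, not any real site's cache; `LRUCache` and the page names are made up for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache that only tracks which keys are present."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)          # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1                     # page must be regenerated
            self.items[key] = True
            if len(self.items) > self.capacity:
                self.items.popitem(last=False)   # evict least recently used


cache = LRUCache(3)
for page in ["front", "top", "new"]:     # pages normal visitors keep hitting
    cache.get(page)
for page in ["old1", "old2", "old3"]:    # a mirror walking rarely-hit pages
    cache.get(page)
print(list(cache.items))                 # prints ['old1', 'old2', 'old3']
```

If normal visitors request the hot pages often enough between the mirror's requests, those pages keep moving to the fresh end and survive, which is the ratio Coderjoe points to.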
23:51
🔗
|
bsmith094 |
seriously does anybody want the poe-news wget-warc dump? |
23:52
🔗
|
DFJustin |
SketchCow is probably just afk, chillax for a while |