Time |
Nickname |
Message |
00:02
🔗
|
SketchCow |
The bytesize is correct, yay |
00:03
🔗
|
Wyatt |
So to restart, use --warc-dedup it looks like? |
00:30
🔗
|
SketchCow |
Alard, when you have a chance, it'd be good to have an option in the bookmarklet to read the article. |
00:30
🔗
|
SketchCow |
You know, so it has a second purpose. |
01:06
🔗
|
Wyatt |
Oh, if I want to pull in things linked from another domain (not necessarily spider the whole other domain)....how greedy is wget -H exactly? |
01:08
🔗
|
Coderjoe |
it will follow any links to anything on the specified domain. if something on your source domain links to a page on another domain in -H and that page links to other pages on itself, it will go follow them |
01:11
🔗
|
Wyatt |
So if someone linked to yamaha.com in a forum...oh dear. That's not quite what I'm after. |
01:12
🔗
|
Coderjoe |
if yamaha.com is in the -H |
01:13
🔗
|
Coderjoe |
rt |
01:13
🔗
|
Coderjoe |
er |
01:13
🔗
|
Coderjoe |
hmm |
01:13
🔗
|
Coderjoe |
my memory of what -H was is flawed |
01:14
🔗
|
Coderjoe |
mixed it up with -D |
01:14
🔗
|
Coderjoe |
I am not really sure what -H does |
01:14
🔗
|
Wyatt |
Ah. |
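For reference on the -H question above: per the wget manual, -H (--span-hosts) enables spanning across hosts during recursive retrieval, and -D (--domains) takes a comma-separated list of domains to follow and does not turn on -H by itself. A minimal sketch of combining the two, so that recursion may leave the starting host only for explicitly listed domains. The function name, the depth of 2, and the example domains are made up for illustration:

```shell
#!/bin/sh
# Sketch only: print a wget command line that recurses across hosts (-H)
# but only into an explicit allow-list of domains (-D).
# span_hosts_cmd and the example domains are hypothetical.
span_hosts_cmd() {
    start_url=$1
    shift
    # Join the remaining arguments into the comma-separated list -D expects.
    domains=$(IFS=,; printf '%s' "$*")
    printf 'wget --recursive --level=2 --span-hosts --domains=%s %s\n' \
        "$domains" "$start_url"
}

# Print the command for review rather than running it.
span_hosts_cmd http://forum.example.org/ forum.example.org static.example.net
```

With a list like this, a forum link to a domain that is not listed (yamaha.com in Wyatt's example) should not be followed.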
01:58
🔗
|
SketchCow |
Hooray, 5 terabytes of Friendster uploaded. |
02:58
🔗
|
dashcloud |
thanks for the heads-up about the radio appearance- I visited the thisiswhyimbroke.com site and it is indeed really awesome |
02:59
🔗
|
dashcloud |
(rest of the show is great as well) |
03:46
🔗
|
SketchCow |
So dangerous! |
03:46
🔗
|
SketchCow |
Camera Lens Cups |
03:47
🔗
|
SketchCow |
http://t.co/czPoAYa5 |
04:22
🔗
|
bertrando |
Hi, I'm looking for a range of Friendster data, ids 3000-3999, but I don't see it in this list: http://www.archive.org/details/FRIENDSTER-000000000 |
04:22
🔗
|
bertrando |
Is it actually available? |
04:23
🔗
|
SketchCow |
Uploads still happening. |
04:23
🔗
|
SketchCow |
Should be available, I am sure we have it. |
04:23
🔗
|
bertrando |
Great, will that page be updated when it gets uploaded? Or is there a better place to look? |
04:23
🔗
|
SketchCow |
Keep watching. |
04:23
🔗
|
SketchCow |
I'm just uploading as much and as fast as I can, and people are providing the uploading machine with data constantly. |
04:24
🔗
|
SketchCow |
It's just a lot of data, terabytes, and it's taking a while to upload. |
04:24
🔗
|
bertrando |
Cool, thanks. |
04:31
🔗
|
db48x |
yes, we have that range |
06:07
🔗
|
chronomex |
metadata metadata metadata |
06:29
🔗
|
Wyatt |
Unfortunately, I don't think it can be summoned like Beetlejuice. |
06:31
🔗
|
chronomex |
what? |
06:34
🔗
|
Wyatt |
Metadata, unlike Hastur and Beetlejuice, cannot be called forth by speaking thrice its name. |
06:35
🔗
|
Wyatt |
It'd be much easier if it could. |
06:35
🔗
|
SketchCow |
blowjob blowjob blowjob |
06:35
🔗
|
SketchCow |
.... |
06:35
🔗
|
SketchCow |
DAMNIT |
06:36
🔗
|
SketchCow |
Actually, a few volunteers are doing some kickass work for me. |
06:36
🔗
|
Wyatt |
That means you have minions? Cool |
06:37
🔗
|
Wyatt |
Oh yeah, has anyone else encountered segfaults with alard's wget branch? |
07:32
🔗
|
alard |
1. Always download and upload when you click 'View PDF', then offer a 'read a copy yourself' button. |
07:32
🔗
|
alard |
2. Add an extra link to the box on the JSTOR site: 'View PDF' works as normal, 'Liberate/save/free PDF' does the download-upload thing. |
07:32
🔗
|
alard |
SketchCow: Like a sort of JSTOR shuffle. |
07:32
🔗
|
alard |
Three options, which do you prefer? |
07:32
🔗
|
alard |
3. Like option 2, with two links, but download-upload is always enabled, even if you click 'View PDF'. |
07:32
🔗
|
chronomex |
I prefer 3 |
07:32
🔗
|
alard |
Wyatt: Have you checked if the 'normal' wget doesn't segfault? What were the options you tried? |
07:34
🔗
|
SketchCow |
I like 3 |
07:34
🔗
|
SketchCow |
I just want it that you always have the option, the encouragement, to read it |
07:34
🔗
|
SketchCow |
You freed it, you read it |
07:35
🔗
|
SketchCow |
It's not about speed. |
07:36
🔗
|
SketchCow |
I want a thousand people leisurely sucking them dry |
07:36
🔗
|
SketchCow |
I want them forced into a dick move. |
07:36
🔗
|
SketchCow |
That's why the reading is critical. |
07:36
🔗
|
SketchCow |
You're just reading it! |
07:36
🔗
|
SketchCow |
I'm trying to decide if I can afford a second pair of new shoes. |
07:37
🔗
|
chronomex |
what could be wrong with that? |
07:37
🔗
|
chronomex |
two pairs?!? |
07:37
🔗
|
chronomex |
bourgeois |
07:37
🔗
|
Wyatt |
alard: Been fiddling with it a bit, trying to replicate it more thoroughly. If I'm not mistaken it's when trying to resume with --warc-dedup |
07:37
🔗
|
SketchCow |
You have no idea. |
07:37
🔗
|
SketchCow |
These are expensive, lovely shoes |
07:37
🔗
|
chronomex |
doubtless |
07:38
🔗
|
Wyatt |
But are they comfortable? That's important too. |
07:38
🔗
|
ersi |
SketchCow: If they'll last long - do it! |
07:38
🔗
|
ersi |
CHOOSE ONE, MAKING YOU BETTER FEELING |
07:38
🔗
|
ersi |
MAKING YOU! |
07:38
🔗
|
chronomex |
my shoes tend to wear out after about 2,000,000 steps. |
07:38
🔗
|
ersi |
BETTER FEELING! |
07:38
🔗
|
SketchCow |
Shoe one: http://www.bornshoes.com/Product.aspx?ProductID=5516 |
07:38
🔗
|
ersi |
http://www.youtube.com/watch?v=hHjSj_nKTws |
07:39
🔗
|
SketchCow |
Shoe 2: http://www.bornshoes.com/Product.aspx?ProductID=3877 |
07:40
🔗
|
chronomex |
I like the first one, it says "sizes 8-14" and someone is of course asking for a size 7 |
07:40
🔗
|
BlueMax |
why is a data archiving channel talking about shoes |
07:40
🔗
|
Wyatt |
The black ones look like they might be pretty comfortable. |
07:41
🔗
|
chronomex |
BlueMax: because I'm waiting on curl |
07:41
🔗
|
Wyatt |
Though I really do prefer to try footwear on before purchase; I have a weird-shaped arch. |
07:41
🔗
|
Wyatt |
Shoes are data too, right? |
07:42
🔗
|
SketchCow |
When someone acts up, I kick them with the fuckin' shoes |
07:42
🔗
|
chronomex |
SketchCow: I got my uploader stuff working, successfully separated in time metadata and data |
07:42
🔗
|
chronomex |
also made it so I can update the metadata, then rerun the same uploader and it'll see that only the metadata is new |
07:42
🔗
|
SketchCow |
Excellent. |
07:42
🔗
|
SketchCow |
Sorry it's so freaky to learn |
07:43
🔗
|
chronomex |
no worries |
07:43
🔗
|
SketchCow |
Luckily, you can rape one item over and over |
07:43
🔗
|
chronomex |
yeah, usually a small item |
07:43
🔗
|
chronomex |
:P |
07:43
🔗
|
chronomex |
unless you're like you and have more bandwidth than god |
07:43
🔗
|
chronomex |
:( |
07:43
🔗
|
SketchCow |
The littlest ones are the most fun to rape |
07:43
🔗
|
SketchCow |
uh i heard |
07:43
🔗
|
chronomex |
>.> |
07:43
🔗
|
* |
SketchCow takes a seat over there |
07:43
🔗
|
chronomex |
rape means steal, not deposit |
07:43
🔗
|
SketchCow |
THIS IS MUCH BETTER THAN SHOES |
07:43
🔗
|
chronomex |
hurdur |
07:45
🔗
|
chronomex |
my metadata thing is pretty nice to use, for a 100-line shellscript |
07:45
🔗
|
* |
BlueMax slowly hides the shoes |
07:45
🔗
|
chronomex |
display the document in xzgv, ask questions |
07:45
🔗
|
SketchCow |
Yeah, Friendsmash is helping upload friendster data like crazy |
07:45
🔗
|
SketchCow |
I'd love to have that here, but right now I don't do that (for bitsavers documents) |
07:45
🔗
|
chronomex |
aye |
07:46
🔗
|
chronomex |
I'll share it tomorrow or something if you want |
07:46
🔗
|
chronomex |
do you know how to display a progress bar on http file upload? |
07:50
🔗
|
alard |
curl? "If you want a progress meter for HTTP POST or PUT requests, you need to redirect the response output to a file, using shell redirect (>), -o [file] or similar." |
07:51
🔗
|
chronomex |
doesn't give me any progress indicator. |
07:51
🔗
|
alard |
oh. |
07:52
🔗
|
chronomex |
oh, um, okay. got it. |
07:52
🔗
|
chronomex |
that's more like it |
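The curl man page passage alard quotes can be sketched as a small shell function. The log does not show what chronomex actually ran, so this is only one way to follow that advice; the function name, the response file name, and the URL are hypothetical:

```shell
#!/bin/sh
# Sketch only: upload a file with curl so that a progress bar is shown.
# upload_with_progress and upload-response.txt are made-up names.
upload_with_progress() {
    file=$1
    url=$2
    # --upload-file sends the file with HTTP PUT.
    # --output moves the server's response off the terminal, which is what
    #   the quoted man page text says is needed for the meter on PUT/POST.
    # --progress-bar swaps the default statistics table for a simple bar.
    curl --progress-bar --upload-file "$file" \
        --output upload-response.txt "$url"
}

# Example call (not run here; the URL is a placeholder):
# upload_with_progress scan.pdf http://upload.example.invalid/scan.pdf
```

A plain shell redirect (> upload-response.txt) in place of --output would serve the same purpose.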
07:59
🔗
|
SketchCow |
I'd love to see THAT code. |
07:59
🔗
|
SketchCow |
Bought the shoes, got free shipping and cheaper price. |
08:07
🔗
|
BlueMax |
I wonder if you need to feel fabulous while archiving |
08:07
🔗
|
chronomex |
I think you can get along with hungover |
08:09
🔗
|
BlueMax |
OK, note to self, keep it down for SketchCow's sake |
08:09
🔗
|
chronomex |
right, he wants to talk about shoes today |
08:11
🔗
|
SketchCow |
I can't imagine my "I'm uploading friendster!!!" getting anything but boring. |
08:12
🔗
|
chronomex |
I'm uploading scans!!! |
08:12
🔗
|
chronomex |
watch them go! |
08:12
🔗
|
SketchCow |
Because I'm uploading a metric asston of friendster. |
08:12
🔗
|
chronomex |
http://www.archive.org/details/bellsystem_CD-1C605-01 |
08:40
🔗
|
alard |
Done, JSTOR now gets two buttons: "View & Save PDF" and "Just Save PDF". |
08:41
🔗
|
SketchCow |
Can I see? |
08:41
🔗
|
SketchCow |
I still owe you verbiage |
08:43
🔗
|
alard |
If you still have the bookmarklet, just go to www.jstor.org and click on it. |
08:43
🔗
|
alard |
If you've lost the bookmarklet: http://severe-samurai-6114.heroku.com/ |
08:44
🔗
|
alard |
It would also be nice to decide on the right words to use: what are you doing? Have you 'saved', 'freed', 'liberated', 'stored', 'stolen' the PDF? |
08:44
🔗
|
alard |
But that probably should depend on the tone of your texts. |
08:44
🔗
|
SketchCow |
I'm sticking with liberator |
08:44
🔗
|
SketchCow |
Liberated. |
08:45
🔗
|
SketchCow |
You read it and gave away a copy. |
08:47
🔗
|
alard |
So that's probably also what the buttons should say: 'View & Liberate' vs 'Just Liberate'? (Probably reduces confusion, since 'View & Save' could also mean that you save it for yourself.) |
08:48
🔗
|
SketchCow |
Yes. |
08:52
🔗
|
alard |
Done. |
08:53
🔗
|
SketchCow |
Thanks |
08:53
🔗
|
alard |
There are 449,287 free articles to download, by the way, so there's probably enough for everyone. |
08:53
🔗
|
SketchCow |
Agreed. |
08:56
🔗
|
SketchCow |
OK. |
08:56
🔗
|
SketchCow |
I just re-read the terms. |
08:56
🔗
|
SketchCow |
We're golden. |
08:57
🔗
|
SketchCow |
I owe you verbiage for that page, and we need to have a server that takes in the data. |
08:57
🔗
|
SketchCow |
I have some theoretical ones. |
08:58
🔗
|
SketchCow |
Let me know how it pushes it, and so on. |
08:58
🔗
|
SketchCow |
Also, I guess we need to make a liberator.archiveteam.org |
08:59
🔗
|
SketchCow |
Ok, bed, it's 5am, we'll talk tomorrow |
08:59
🔗
|
SketchCow |
Great job. |
08:59
🔗
|
SketchCow |
I think we'll announce Monday, once we get all ducks in a row |
08:59
🔗
|
alard |
Okay. Good night. |
08:59
🔗
|
alard |
(We should probably check first to see if what comes out at the other end is actually useful.) |
09:00
🔗
|
SketchCow |
Put it in a directory for me. |
09:00
🔗
|
SketchCow |
Exciting stuff |
09:00
🔗
|
alard |
It would be really useful if that server could run something like Redis to keep the list of things to do. |
09:13
🔗
|
* |
EDream Great Electronics Sale! Prices are reduced up to 50%! Laptops, PDAs, Tablet PCs and more only at X Laptops Co, Ltd. Check us out at http://XLaptops.net |
09:59
🔗
|
BlueMax |
So what's this new project if I can ask |
10:30
🔗
|
ersi |
It's to archive ALL electronics! And have a GREAT SALE! reduced prices up to 50%! |
17:54
🔗
|
sankin |
Does anyone know of a way to archive articles / newspapers from the Google News Archives? |
17:55
🔗
|
sankin |
they stopped adding new content back in May, and now they've removed the archive search page that was at http://www.google.com/archivesearch/advanced_search |
17:55
🔗
|
sankin |
how long until they kill it completely? |
18:04
🔗
|
chronomex |
hmm |
18:48
🔗
|
SketchCow |
Heads up, gang: |
18:48
🔗
|
SketchCow |
http://s.assetbar.com/index |
18:51
🔗
|
db48x2 |
http://public.numair.com/2011_fbfool.html |
19:14
🔗
|
chronomex |
they sound like they were more interested in their technology than what they were doing with it: http://www.assetbar.com/index_about_us |
19:19
🔗
|
Schbirid |
anyone know a pastebin where i can upload a 5mb textfile? or is it my browser that refuses to paste |
19:21
🔗
|
Nemo_bis |
try Chrome/Chromium |
19:22
🔗
|
Schbirid |
it froze |
19:22
🔗
|
Schbirid |
it pasted |
19:23
🔗
|
Schbirid |
The connection to pastebin.com was interrupted. |
19:23
🔗
|
Nemo_bis |
http://p.defau.lt/new.html |
19:24
🔗
|
Schbirid |
filehost that might be dying http://stashbox.org/ |
19:28
🔗
|
Schbirid |
thanks |
19:28
🔗
|
Schbirid |
oh, emijrp aint here |
19:42
🔗
|
Schbirid |
argh, wget does not like -c and --content-disposition together it seems |
19:52
🔗
|
Schbirid |
help, i am too dumb to make aria2c simply not download a file if it already exists locally |
19:55
🔗
|
Schbirid |
--auto-file-renaming=false seems like a dirty hack (and results in an error) |
19:57
🔗
|
db48x2 |
if [ ! -f "$file" ]; then $aria ...; fi |
19:59
🔗
|
Schbirid |
can't do that. i am fetching albums from jamendo and the url i pass to aria2c is not what the downloaded file is named |
20:00
🔗
|
Schbirid |
example http://www.jamendo.com/get/album/id/album/archiverestricted/redirect/29/?p2pnet=bittorrent&are=ogg3 |
20:00
🔗
|
Schbirid |
with wget i would need to use --content-disposition |
20:08
🔗
|
Schbirid |
oh actually aria2c is being smart |
20:08
🔗
|
Schbirid |
many albums at jamendo were changed, aria2c notices that and decides to download even though the _filename_ already exists |
20:08
🔗
|
Schbirid |
should have expected that |
20:12
🔗
|
Schbirid |
eh, on another run it does not notice that filename.1 already is the renamed one |
20:12
🔗
|
Schbirid |
meh |
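One way around the problem Schbirid describes (the URL does not reveal the saved filename) is to ask the server for the name first and skip the download if a file with that name is already present. This is a rough sketch under several assumptions: that the server answers HEAD requests and sends a Content-Disposition header with a plain filename= value, and that a name match is enough. It would not catch albums whose contents changed. Both function names are hypothetical:

```shell
#!/bin/sh
# Sketch only: skip a download when the server-advertised filename exists.

# Read HTTP response headers on stdin and print the filename from the last
# Content-Disposition header, if there is one.
filename_from_headers() {
    tr -d '\r' |
        sed -n 's/^[Cc]ontent-[Dd]isposition:.*filename="\{0,1\}\([^";]*\)"\{0,1\}.*$/\1/p' |
        tail -n 1
}

fetch_unless_present() {
    url=$1
    # --head asks for headers only; --location follows redirects first.
    name=$(curl --silent --head --location "$url" | filename_from_headers)
    if [ -n "$name" ] && [ -f "$name" ]; then
        echo "skip: $name already exists"
    else
        aria2c "$url"
    fi
}

# Example call (not run here; the URL is a placeholder):
# fetch_unless_present 'http://downloads.example.invalid/album/29'
```

If no filename can be determined, the sketch falls through to a normal aria2c download.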
20:52
🔗
|
chronomex |
speaking about delicious, this is an interesting read: http://simonwillison.net/notes/2006/summit/schachter.txt |
20:53
🔗
|
chronomex |
Morals: You have to develop a sense of morals when you build your system. It's |
20:53
🔗
|
chronomex |
the user's data; it's not yours. Make sure they can remove themselves and |
20:53
🔗
|
chronomex |
their account if they want to. |
20:53
🔗
|
chronomex |
hmmm. |
20:53
🔗
|
chronomex |
In del.icio.us if a user deletes something they really do purge the data from |
20:53
🔗
|
chronomex |
the system. No transaction logs etc for getting stuff back. |
21:49
🔗
|
Ymgve |
no backups? |
22:27
🔗
|
Zebranky |
chronomex: I think I would prefer that to having it retained indefinitely, though that opens concerns of malicious deletion by other people, etc. |
22:27
🔗
|
chronomex |
yeah... |