Time |
Nickname |
Message |
02:39
🔗
|
SketchCow |
I am very sad they destroyed these BYTEs to make these scans, but I am happy with the results |
02:40
🔗
|
SketchCow |
So they're awful actions resulting in nice outcomes |
02:52
🔗
|
joepie91 |
currently scraping devilskitchen blog, for anyone who cares, here is the scraper source: http://git.cryto.net/cgit/joepie91/tree/tools/scrapers/devilskitchen.py |
02:55
🔗
|
kennethre |
SketchCow: you happen to be in SF? |
03:00
🔗
|
bsmith094 |
SketchCow: are these the atariage forum scans? |
03:06
🔗
|
SketchCow |
Half are |
03:06
🔗
|
SketchCow |
Someone took it up after the guy disappeared and stopped |
03:12
🔗
|
joepie91 |
okay, I'm encountering something really strange |
03:12
🔗
|
joepie91 |
can anyone try to go to http://www.devilskitchen.me.uk/2009_01_01_archive.html and see if it loads? |
03:12
🔗
|
joepie91 |
or whether they get an 'account error'? |
03:14
🔗
|
joepie91 |
wtf |
03:14
🔗
|
joepie91 |
403s all over the place |
10:08
🔗
|
SmileyG |
Account Load Error |
10:08
🔗
|
SmileyG |
Your Google account has been disabled or suspended or deleted. lol |
11:36
🔗
|
godane |
it looks like 2009_01 to 2009_04 are all blocked |
11:40
🔗
|
godane |
i'm grabbing all the old T3 magazines i can get |
13:03
🔗
|
SketchCow |
Greetings from Train |
13:14
🔗
|
underscor |
nice |
13:58
🔗
|
dragondon |
Greetings all. Can't import the latest ova image into VirtualBox on a Debian AMD-FX6100 system. "Could not read OVF file 'archiveteam-warrior-v2-20120813.ovf' (VERR_TAR_END_OF_FILE)." |
13:58
🔗
|
dragondon |
I am attempting to download again just in case. |
13:59
🔗
|
Frigolit |
considering the "END_OF_FILE" it does sound like a corrupt download |
14:07
🔗
|
SketchCow |
I'm blasting dozens maybe hundreds of laptop service manuals in. |
14:08
🔗
|
SketchCow |
Laptop all the things |
14:09
🔗
|
SketchCow |
Past 80 added now. |
14:09
🔗
|
SketchCow |
So that's something. |
14:16
🔗
|
SketchCow |
Little things like manuals can have a huge effect. |
14:41
🔗
|
dragondon |
Second download, same error. Does someone have an MD5 that I can check against? |
14:46
🔗
|
SketchCow |
http://archive.org/details/dell-service-manual-64ptnen wheeee |
14:47
🔗
|
alard |
dragondon: e9079fbbcf5e05b3493fee8c05cd6f77 (from http://ia601200.us.archive.org/3/items/archiveteam-warrior/archiveteam-warrior_files.xml) |
14:48
🔗
|
SketchCow |
dragondon: Set your modem to 8,N,1 |
14:48
🔗
|
alard |
dragondon: What sometimes helps is to rename the ova file to archiveteam-warrior-v2.ova (remove the date) |
14:50
🔗
|
alard |
I renamed the ova file before uploading, so the ovf file inside is still called archiveteam-warrior-v2.ovf. Some versions of VirtualBox don't like that. |
20:23
🔗
|
Nemo_bis |
now deriving a 29700-page book: http://www.us.archive.org/log_show.php?task_id=124346847 |
20:23
🔗
|
Nemo_bis |
(Britannica 1911) |
20:25
🔗
|
Nemo_bis |
looks like Module AbbyyXML will take 109 h |
20:25
🔗
|
Nemo_bis |
I'm curious to see the result... |
20:29
🔗
|
DFJustin |
lol |
20:31
🔗
|
bsmith094 |
you think that's big, all the stories i've got, if i printed them, would take 225 REAMS of paper, 117215 pages |
20:34
🔗
|
Nemo_bis |
noo I don't think it's big |
20:34
🔗
|
Nemo_bis |
it's just the deriver being a silly bully |
20:34
🔗
|
Nemo_bis |
"I'll show you, I can eat ALL OF IT at once!" |
20:35
🔗
|
bsmith094 |
hey, you know what would be a serious PITA to OCR? House of Leaves |
20:35
🔗
|
bsmith094 |
typographic nightmare |
20:36
🔗
|
Nemo_bis |
oh, don't worry, I have my own |
20:36
🔗
|
bsmith094 |
which is? |
20:36
🔗
|
Nemo_bis |
http://archive.org/details/VocabolarioDellaLinguaItaliana2 |
20:36
🔗
|
Nemo_bis |
Zingarelli_images.zip 23-Dec-2011 04:03 21792613118 |
20:37
🔗
|
bsmith094 |
ok, why does a 6.4gb pdf of mostly text even exist? |
20:37
🔗
|
Nemo_bis |
looks like no OCR system is able to correctly recognize text if the entry exponent is twice as big in a font as the rest of the entry |
20:37
🔗
|
Nemo_bis |
it's just a broken pdf |
20:37
🔗
|
Nemo_bis |
and it's an image pdf |
20:37
🔗
|
Nemo_bis |
we took scans at 400 dpi IIRC |