| Time | Nickname | Message |
|---|---|---|
| 02:39 | SketchCow | I am very sad they destroyed these BYTEs to make these scans, but I am happy with the results |
| 02:40 | SketchCow | So they're awful actions resulting in nice outcomes |
| 02:52 | joepie91 | currently scraping devilskitchen blog, for anyone who cares, here is the scraper source: http://git.cryto.net/cgit/joepie91/tree/tools/scrapers/devilskitchen.py |
| 02:55 | kennethre | SketchCow: you happen to be in SF? |
| 03:00 | bsmith094 | SketchCow: are these the atariage forum scans? |
| 03:06 | SketchCow | Half are |
| 03:06 | SketchCow | Someone took it up after the guy disappeared and stopped |
| 03:12 | joepie91 | okay, I'm encountering something really strange |
| 03:12 | joepie91 | can anyone try to go to http://www.devilskitchen.me.uk/2009_01_01_archive.html and see if it loads? |
| 03:12 | joepie91 | or whether they get an 'account error'? |
| 03:14 | joepie91 | wtf |
| 03:14 | joepie91 | 403s all over the place |
| 10:08 | SmileyG | Account Load Error |
| 10:08 | SmileyG | Your Google account has been disabled or suspended or deleted. lol |
| 11:36 | godane | it looks like 2009_01 to 2009_04 are all blocked |
| 11:40 | godane | i'm grabbing all the old T3 magazines i can get |
| 13:03 | SketchCow | Greetings from Train |
| 13:14 | underscor | nice |
| 13:58 | dragondon | Greetings all. Can't import the latest ova image into VirtualBox on a Debian AMD-FX6100 system. "Could not read OVF file 'archiveteam-warrior-v2-20120813.ovf' (VERR_TAR_END_OF_FILE)." |
| 13:58 | dragondon | I am attempting to download again just in case. |
| 13:59 | Frigolit | considering the "END_OF_FILE" it does sound like a corrupt download |
| 14:07 | SketchCow | I'm blasting dozens maybe hundreds of laptop service manuals in. |
| 14:08 | SketchCow | Laptop all the things |
| 14:09 | SketchCow | Past 80 added now. |
| 14:09 | SketchCow | So that's something. |
| 14:16 | SketchCow | Little things like manuals can have a huge effect. |
| 14:41 | dragondon | Second download, same error. Does someone have an MD5 that I can check against? |
| 14:46 | SketchCow | http://archive.org/details/dell-service-manual-64ptnen wheeee |
| 14:47 | alard | dragondon: e9079fbbcf5e05b3493fee8c05cd6f77 (from http://ia601200.us.archive.org/3/items/archiveteam-warrior/archiveteam-warrior_files.xml) |
| 14:48 | SketchCow | dragondon: Set your modem to 8,N,1 |
| 14:48 | alard | dragondon: What sometimes helps is to rename the ova file to archiveteam-warrior-v2.ova (remove the date) |
| 14:50 | alard | I renamed the ova file before uploading, so the ovf file inside is still called archiveteam-warrior-v2.ovf. Some versions of VirtualBox don't like that. |
| 20:23 | Nemo_bis | now deriving a 29700 pages book: http://www.us.archive.org/log_show.php?task_id=124346847 |
| 20:23 | Nemo_bis | (Britannica 1911) |
| 20:25 | Nemo_bis | looks like Module AbbyyXML will take 109 h |
| 20:25 | Nemo_bis | I'm curious to see the result... |
| 20:29 | DFJustin | lol |
| 20:31 | bsmith094 | you think thats big, all the stories i've got, if i printed them, would take 225 REAMS of paper, 117215 pages |
| 20:34 | Nemo_bis | noo I don't think it's big |
| 20:34 | Nemo_bis | it's just the deriver being a silly bully |
| 20:34 | Nemo_bis | "I'll show you, I can eat ALL OF IT at once!" |
| 20:35 | bsmith094 | hey, you know what would be a serious PITA to OCR, House of Leaves? |
| 20:35 | bsmith094 | typographic nightmare |
| 20:36 | Nemo_bis | oh, don't worry, I have my own |
| 20:36 | bsmith094 | which is? |
| 20:36 | Nemo_bis | http://archive.org/details/VocabolarioDellaLinguaItaliana2 |
| 20:36 | Nemo_bis | Zingarelli_images.zip 23-Dec-2011 04:03 21792613118 |
| 20:37 | bsmith094 | ok, why does a 6.4gb pdf of mostly text even exist? |
| 20:37 | Nemo_bis | looks like no OCR system is able to correctly recognize text if the entry exponent is twice as big in a font as the rest of the entry |
| 20:37 | Nemo_bis | it's just a broken pdf |
| 20:37 | Nemo_bis | and it's an image pdf |
| 20:37 | Nemo_bis | we took scans at 400 dpi IIRC |
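
Editor's note on the 14:41 to 14:50 exchange: dragondon asked for an MD5 to check the warrior download against, and alard pointed out that the ovf file inside the ova keeps its undated name. A minimal sketch of both checks follows, using only the Python standard library. The function names `md5_of_file` and `ova_member_names` are hypothetical, and the sketch assumes the checksum alard quoted applies to the ova file dragondon downloaded, which the log does not state outright. It relies on an ova being a tar archive, which is consistent with the `VERR_TAR_END_OF_FILE` error.

```python
import hashlib
import tarfile

# Checksum quoted by alard from archiveteam-warrior_files.xml.
# Assumption: it refers to the ova file being downloaded.
EXPECTED_MD5 = "e9079fbbcf5e05b3493fee8c05cd6f77"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so a large image is never loaded into memory all at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ova_member_names(path):
    """List the files packed inside an ova (a tar archive).
    A truncated download will typically raise tarfile.ReadError."""
    with tarfile.open(path, "r") as archive:
        return archive.getnames()
```

For example, `md5_of_file("archiveteam-warrior-v2-20120813.ova") == EXPECTED_MD5` would indicate an intact download under the assumption above, and `ova_member_names(...)` would show whether the ovf inside is named `archiveteam-warrior-v2.ovf`, the mismatch alard says some VirtualBox versions reject.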