| Time |
Nickname |
Message |
|
04:10
|
godane |
i may have found something interesting |
|
04:11
|
godane |
so it turns out that wal-mart has a 3rd party hosting some pdfs |
|
04:11
|
godane |
the 3rd party is the vo.msecnd.net domain |
|
04:26
|
joepie92 |
godane: something microsoft |
|
04:36
|
godane |
anyways there is a msrvideo.vo.msecnd.net domain and it has pdfs of their research |
|
04:51
|
joepie92 |
SketchCow, undersco2, someone is reporting issues with warc2zip - where do I report this? |
|
05:14
|
godane |
joepie92: the pdfs are from microsoft research labs |
|
05:39
|
godane |
so the microsoft research pdfs are going to be uploaded as a zip file |
|
05:40
|
godane |
this way i can do the dump slowly, 1000 ids at a time |
|
06:15
|
godane |
i'm starting to upload the microsoft research pdfs i found: https://archive.org/details/msrvideo.vo.msecnd.net-pdf-grab-103000-to-104000 |
|
06:15
|
godane |
the range alone is over 1gb |
|
06:16
|
godane |
also know that i tried downloading from 100000 but got nothing until 103339 |
|
07:24
|
joepie92 |
Lord_Nigh: I typically imagine warriors as a fleet of express trains that spawn out of nowhere, run over a due-to-be-shutdown service, somehow magically transport all the data into their cargo hold, and then vanish into thin air |
|
07:29
|
BlueMax |
I imagine "THIS! IS! ARCHIVING!" *kick* followed immediately by a Battle of Thermopylae situation |
|
07:43
|
joepie92 |
Battle of Thermopylae? |
|
07:45
|
BlueMax |
The famous battle with King Leonidas and his 300 spartans, joepie92 |
|
07:45
|
joepie92 |
ahhh |
|
07:45
|
joepie92 |
heh |
|
09:43
|
Schbirid |
huh, i can't play the ogv from https://archive.org/details/internetarchivecelebration20131024 with mplayer or vlc. |
|
09:58
|
Lord_Nigh |
question about scanning schematics: for large 11x17 schematics which only ever had about a 1140x1720 or so image printed as the schematic itself on them, i think 400dpi is sufficient |
|
09:58
|
Lord_Nigh |
for most hand drawn or greyscale schems 800 is needed but in this case i think 400 is fine |
|
10:45
|
godane |
just know that 105000 to 120000 range of microsoft research papers is going to be very small cause there are very few files there |
|
13:03
|
BiggieJon |
/join #archiveteam |
|
19:21
|
balrog |
SketchCow: re your scanning blog post: I personally don't mind destroying something to scan it if it's very common and easy to replace |
|
19:21
|
balrog |
reminds me though, I need to upload some recent scans |
|
20:04
|
* |
phillipsj has pasted his root password into IRC chat. |
|
20:47
|
Schbirid |
wget's -D matches all subdomains for the specified domains, is that new or did i never notice |
|
20:47
|
Schbirid |
ev tumblr.com will match ALL *.tumblr.com |
|
20:47
|
Schbirid |
ev = eg |
|
21:38
|
ersi |
https://fbcdn-sphotos-f-a.akamaihd.net/hphotos-ak-prn1/1385827_651805974859860_2051484422_n.jpg |
|
21:41
|
w0rp |
ersi: I love it! |
|
21:42
|
phillipsj |
lol. My comeback always was: "No, but 'u' and 'i' are in community!" |
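
For reference, the subdomain matching Schbirid describes for wget's -D option (20:47) can be sketched in Python. This is only an illustration of suffix-style domain matching, not wget's actual code: the `host_matches` helper is hypothetical, and the rule that a match must fall on a dot boundary is an assumption.

```python
def host_matches(host, domains):
    """Return True if host equals one of the domains or is a subdomain of it.

    Hypothetical helper for illustration; it does not reproduce wget's code.
    """
    host = host.lower().rstrip(".")
    for domain in domains:
        domain = domain.lower().strip(".")
        # Exact match, or a suffix match that starts at a dot boundary.
        if host == domain or host.endswith("." + domain):
            return True
    return False


if __name__ == "__main__":
    for candidate in ("tumblr.com", "staff.tumblr.com", "nottumblr.com"):
        print(candidate, host_matches(candidate, ["tumblr.com"]))
```

Under this rule a single entry such as tumblr.com covers every *.tumblr.com host, which is the behaviour noted in the chat, while an unrelated host that merely ends in the same letters is rejected.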