Time |
Nickname |
Message |
00:17
🔗
|
kyan |
Here's a bit more of Docstoc that didn't get uploaded normally, by the way: https://archive.org/download/WARCdealer_BarrelData_kthenu_e257bd0e-5f2c-49fb-9b31-ae516964a559.2015-12-05-04-27-07-934046-_E |
00:17
🔗
|
kyan |
arkiver, ^ |
00:18
🔗
|
kyan |
(it should probably get tucked into the regular Docstoc collection?) |
00:19
🔗
|
|
wyatt8740 has quit IRC (Remote host closed the connection) |
00:20
🔗
|
kyan |
also, why is one of the items 18 GB? That seems kind of big for 100 documents... -_- |
00:20
🔗
|
kyan |
Eh, whatever, it's saved now anyway |
00:30
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
00:32
🔗
|
|
SN4T14 has joined #archiveteam-bs |
00:52
🔗
|
|
RichardG has quit IRC (Ping timeout: 499 seconds) |
01:43
🔗
|
|
antomatic has joined #archiveteam-bs |
01:43
🔗
|
|
swebb sets mode: +o antomatic |
01:45
🔗
|
|
antomati_ has quit IRC (Ping timeout: 252 seconds) |
01:50
🔗
|
|
RichardG has joined #archiveteam-bs |
02:19
🔗
|
|
JesseW has quit IRC (Leaving.) |
02:24
🔗
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
02:28
🔗
|
|
username1 has joined #archiveteam-bs |
02:31
🔗
|
|
schbirid2 has quit IRC (Read error: Operation timed out) |
02:43
🔗
|
|
VADemon has quit IRC (left4dead) |
02:44
🔗
|
|
Start_ has joined #archiveteam-bs |
02:44
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
02:48
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
02:49
🔗
|
|
RichardG has quit IRC (Ping timeout: 369 seconds) |
02:53
🔗
|
|
ndiddy has quit IRC (Remote host closed the connection) |
03:23
🔗
|
|
diacope has quit IRC (Ping timeout: 491 seconds) |
03:23
🔗
|
|
deathy___ has quit IRC (Ping timeout: 491 seconds) |
03:23
🔗
|
|
tjg has quit IRC (Read error: Connection reset by peer) |
03:23
🔗
|
|
_desu____ has joined #archiveteam-bs |
03:23
🔗
|
|
zyphlar_ has joined #archiveteam-bs |
03:24
🔗
|
|
tjg has joined #archiveteam-bs |
03:25
🔗
|
|
Boltsie__ has quit IRC (Ping timeout: 242 seconds) |
03:25
🔗
|
|
JSharp___ has quit IRC (Ping timeout: 242 seconds) |
03:25
🔗
|
|
_desu___ has quit IRC (Ping timeout: 242 seconds) |
03:25
🔗
|
|
zyphlar has quit IRC (Ping timeout: 242 seconds) |
03:25
🔗
|
|
Ctrl-S___ has quit IRC (Ping timeout: 242 seconds) |
03:25
🔗
|
|
JSharp___ has joined #archiveteam-bs |
03:25
🔗
|
|
_desu____ is now known as _desu___ |
03:25
🔗
|
|
zyphlar_ is now known as zyphlar |
03:25
🔗
|
|
Boltsie__ has joined #archiveteam-bs |
03:25
🔗
|
|
primus104 has quit IRC (Leaving.) |
03:25
🔗
|
|
Ctrl-S___ has joined #archiveteam-bs |
03:26
🔗
|
|
deathy___ has joined #archiveteam-bs |
03:41
🔗
|
|
tjg has quit IRC (Read error: Connection reset by peer) |
03:41
🔗
|
|
_desu____ has joined #archiveteam-bs |
03:41
🔗
|
|
zyphlar_ has joined #archiveteam-bs |
03:42
🔗
|
|
tjg has joined #archiveteam-bs |
03:42
🔗
|
|
diacope has joined #archiveteam-bs |
03:43
🔗
|
|
deathy___ has quit IRC (Ping timeout: 246 seconds) |
03:43
🔗
|
|
JSharp___ has quit IRC (Ping timeout: 246 seconds) |
03:43
🔗
|
|
Ctrl-S___ has quit IRC (Ping timeout: 246 seconds) |
03:43
🔗
|
|
Boltsie__ has quit IRC (Ping timeout: 246 seconds) |
03:43
🔗
|
|
zyphlar has quit IRC (Ping timeout: 246 seconds) |
03:43
🔗
|
|
_desu___ has quit IRC (Ping timeout: 246 seconds) |
03:43
🔗
|
|
_desu____ is now known as _desu___ |
03:43
🔗
|
|
zyphlar_ is now known as zyphlar |
03:43
🔗
|
|
Boltsie__ has joined #archiveteam-bs |
03:43
🔗
|
|
JSharp___ has joined #archiveteam-bs |
03:43
🔗
|
|
Ctrl-S___ has joined #archiveteam-bs |
03:45
🔗
|
|
RichardG has joined #archiveteam-bs |
03:50
🔗
|
|
JesseW has joined #archiveteam-bs |
03:54
🔗
|
|
deathy___ has joined #archiveteam-bs |
04:25
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
04:26
🔗
|
|
Start_ is now known as Start |
04:55
🔗
|
kyan |
Rrrgh. I wish IA didn't dark spam, but just noindexed it |
04:56
🔗
|
kyan |
Trying to figure out how to get to this collection of mp3s of lectures, but it was flagged as spam https://archive.org/details/Dr.JamaalBadawi |
04:57
🔗
|
kyan |
That said, maybe it is fake or something? But I'd rather be able to find out for myself than be left stuck wondering |
05:06
🔗
|
kyan |
It looks like it was an accident https://archive.org/post/1048792/help-with-failed-exit-code-1 |
05:07
🔗
|
kyan |
It's been darked 3 times as spam, and undarked twice |
05:07
🔗
|
kyan |
Why not just have a checkbox on the search pages asking if we want to include things marked as spam in search results? |
05:21
🔗
|
|
remsen has quit IRC (Read error: Operation timed out) |
05:31
🔗
|
|
remsen has joined #archiveteam-bs |
05:32
🔗
|
|
remsen2 has joined #archiveteam-bs |
05:35
🔗
|
|
R5M has joined #archiveteam-bs |
05:36
🔗
|
|
remsen2 has quit IRC (Read error: Operation timed out) |
05:37
🔗
|
|
remsen has quit IRC (Read error: Operation timed out) |
05:42
🔗
|
|
R5M has quit IRC (Read error: Operation timed out) |
05:47
🔗
|
|
remsen has joined #archiveteam-bs |
05:53
🔗
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
05:54
🔗
|
|
remsen2 has joined #archiveteam-bs |
05:56
🔗
|
|
remsen has quit IRC (Read error: Operation timed out) |
05:56
🔗
|
|
R5M has joined #archiveteam-bs |
06:04
🔗
|
|
remsen2 has quit IRC (Read error: Operation timed out) |
06:14
🔗
|
kyan |
So, I'm seeing a bunch of people talking about "Twitter Moments". https://twitter.com/moments is some random person who's never tweeted |
06:14
🔗
|
kyan |
I don't have any buttons on twitter called Moment |
06:15
🔗
|
kyan |
or displaying the electric-y logo for it. |
06:15
🔗
|
|
vitzli has joined #archiveteam-bs |
06:15
🔗
|
kyan |
Googling "twitter moments" takes me to https://twitter.com/i/moments?lang=en which is a 404 |
06:15
🔗
|
kyan |
After watching their lovely ad, I was so excited to see what all the fuss was about! |
06:16
🔗
|
kyan |
Hashtag #marketing, hashtag #fail. |
06:35
🔗
|
|
zerkalo has quit IRC (Read error: Operation timed out) |
06:46
🔗
|
|
zerkalo has joined #archiveteam-bs |
07:20
🔗
|
fie |
worked for me? |
07:26
🔗
|
kyan |
I think they've only enabled it for some accounts, but are advertising it to everyone |
07:26
🔗
|
kyan |
fie ^ |
07:26
🔗
|
* |
kyan is going to sleep now though :3 |
07:27
🔗
|
fie |
I heard about it on NPR weeks ago... I don't even use twitter |
07:27
🔗
|
kyan |
Huh, ok |
07:27
🔗
|
kyan |
I just heard about it tonight since SketchCow posted about it |
07:27
🔗
|
fie |
I may or may not have an account... who knows.... that place is just a trash bin |
07:27
🔗
|
* |
kyan gets all his news from ArchiveTeam |
07:28
🔗
|
* |
fie gets all of his news from many legged creatures that live under rocks |
07:31
🔗
|
|
vitzli has quit IRC (Quit: Leaving) |
08:38
🔗
|
|
primus104 has joined #archiveteam-bs |
08:44
🔗
|
|
JesseW has quit IRC (Leaving.) |
08:48
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
08:55
🔗
|
|
dashcloud has joined #archiveteam-bs |
09:14
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
09:46
🔗
|
|
primus104 has quit IRC (Leaving.) |
10:11
🔗
|
|
Sk1d has joined #archiveteam-bs |
11:59
🔗
|
|
primus104 has joined #archiveteam-bs |
12:38
🔗
|
|
primus104 has quit IRC (Leaving.) |
13:02
🔗
|
|
R5M has quit IRC (Read error: Operation timed out) |
13:31
🔗
|
|
VADemon has joined #archiveteam-bs |
13:34
🔗
|
|
SN4T14 has quit IRC (Remote host closed the connection) |
13:41
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
13:44
🔗
|
|
SN4T14 has joined #archiveteam-bs |
14:08
🔗
|
|
SimpBrain has quit IRC (Leaving) |
14:11
🔗
|
VADemon |
What are the user-agent strings for the archive.org Wayback Machine bot? ia_archiver and archive.org_bot are proposed by http://www.archiveteam.org/index.php?title=ArchiveBot#Disclaimers |
14:12
🔗
|
VADemon |
But I have also found a website saying that ia_archiver-web.archive.org is a bot from Alexa that additionally indexes items for web.archive.org |
14:36
🔗
|
username1 |
sounds like my seagate 8tb woes are a kernel problem https://bugzilla.kernel.org/show_bug.cgi?id=93581 |
15:00
🔗
|
|
primus104 has joined #archiveteam-bs |
15:14
🔗
|
|
SimpBrain has joined #archiveteam-bs |
15:40
🔗
|
|
SN4T14 has quit IRC (Remote host closed the connection) |
15:46
🔗
|
godane |
so i found a pattern to grab Time Magazine from their vault website |
15:57
🔗
|
godane |
so looks like the older vault Time magazines were scanned better |
15:57
🔗
|
godane |
where things in the 1990s they didn't care |
15:58
🔗
|
godane |
they scan those very badly |
16:01
🔗
|
godane |
here is an example of a bad scan: http://time.com/vault/issue/1923-09-03/page/1/ |
16:03
🔗
|
godane |
anyways the first 100 download ids of fieldsupport.lingnet.org is done: https://archive.org/details/fieldsupport.lingnet.org-download-id-1-to-100-20151206 |
16:03
🔗
|
godane |
you really only get 74 of them |
16:03
🔗
|
godane |
but you have a wget.log to see what is missing |
16:04
🔗
|
|
R5M has joined #archiveteam-bs |
17:18
🔗
|
SketchCow |
In EXTREMELY boring news, the archivebot screenshotter ignored an item if it had anything called *png* in it, that's fixed, so those little ones without any screenshots are now getting screenshots. |
17:19
🔗
|
SketchCow |
It's going to be at this for months, probably, but I can summarily ignore it |
17:19
🔗
|
SketchCow |
I will automate it a tad more and then just watch it fill. |
17:19
🔗
|
SketchCow |
The question is if it can ever beat the race condition. |
17:19
🔗
|
SketchCow |
(I don't want to do things like just do a small set of the page grabs in a given set.) |
17:20
🔗
|
SketchCow |
I mean, if we were having guests, I might do that. |
17:20
🔗
|
SketchCow |
We're not having guests |
17:23
🔗
|
|
username1 is now known as schbirid |
17:23
🔗
|
* |
schbirid slaps arkiver with nohome |
17:29
🔗
|
|
limebyte has quit IRC (ZNC - http://znc.in) |
17:29
🔗
|
|
limebyte has joined #archiveteam-bs |
17:42
🔗
|
|
Ravenloft has joined #archiveteam-bs |
17:44
🔗
|
|
no2pencil has quit IRC (Ping timeout: 252 seconds) |
17:44
🔗
|
|
no2pencil has joined #archiveteam-bs |
17:44
🔗
|
arkiver |
schbirid: I think I already sent you the telenor target? |
17:44
🔗
|
schbirid |
nope, or maybe i lost the log |
17:45
🔗
|
|
tjg has quit IRC (Read error: Connection reset by peer) |
17:45
🔗
|
|
Boltsie__ has quit IRC (Write error: Connection reset by peer) |
17:45
🔗
|
|
_desu____ has joined #archiveteam-bs |
17:45
🔗
|
|
zyphlar_ has joined #archiveteam-bs |
17:45
🔗
|
|
bauruine has quit IRC (Read error: Connection reset by peer) |
17:45
🔗
|
|
bauruine_ has joined #archiveteam-bs |
17:45
🔗
|
arkiver |
schbirid: found it, looks like you were online |
17:46
🔗
|
|
bauruine_ is now known as bauruine |
17:46
🔗
|
|
Boltsie__ has joined #archiveteam-bs |
17:46
🔗
|
arkiver |
offline* |
17:46
🔗
|
schbirid |
cheers! |
17:47
🔗
|
|
deathy___ has quit IRC (Ping timeout: 241 seconds) |
17:47
🔗
|
|
JSharp___ has quit IRC (Ping timeout: 241 seconds) |
17:47
🔗
|
|
Ctrl-S___ has quit IRC (Ping timeout: 241 seconds) |
17:47
🔗
|
|
_desu___ has quit IRC (Ping timeout: 241 seconds) |
17:47
🔗
|
|
zyphlar has quit IRC (Ping timeout: 241 seconds) |
17:47
🔗
|
|
_desu____ is now known as _desu___ |
17:47
🔗
|
|
JSharp___ has joined #archiveteam-bs |
17:47
🔗
|
|
zyphlar_ is now known as zyphlar |
17:48
🔗
|
|
tjg has joined #archiveteam-bs |
17:48
🔗
|
|
Ctrl-S___ has joined #archiveteam-bs |
17:52
🔗
|
|
PrincessK has joined #archiveteam-bs |
17:52
🔗
|
|
deathy___ has joined #archiveteam-bs |
18:00
🔗
|
|
Knoeki has quit IRC (Read error: Operation timed out) |
18:21
🔗
|
|
Ravenloft has quit IRC (Ping timeout: 252 seconds) |
18:23
🔗
|
|
JesseW has joined #archiveteam-bs |
18:57
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
18:57
🔗
|
|
swebb sets mode: +o aaaaaaaaa |
19:21
🔗
|
|
JesseW has quit IRC (Leaving.) |
19:44
🔗
|
|
SN4T14 has joined #archiveteam-bs |
19:53
🔗
|
|
R5M has quit IRC (Read error: Operation timed out) |
20:31
🔗
|
|
JesseW has joined #archiveteam-bs |
20:48
🔗
|
JesseW |
godane: thanks for the lingnet grab: https://archive.org/details/fieldsupport.lingnet.org-download-id-1-to-100-20151206 |
20:52
🔗
|
godane |
you're welcome |
20:53
🔗
|
godane |
also this is up to now: https://archive.org/details/fieldsupport.lingnet.org-download-id-301-to-400-20151206 |
21:10
🔗
|
|
JesseW has quit IRC (Leaving.) |
21:15
🔗
|
arkiver |
godane: should that also be saved into WARCs? |
21:17
🔗
|
godane |
i was not saving it in warc cause the files are pdfs and zips |
21:18
🔗
|
godane |
but if you want i could do that |
21:18
🔗
|
godane |
it's like how i did the lego pdfs |
21:20
🔗
|
arkiver |
I think it's always best to save files into WARCs |
21:20
🔗
|
arkiver |
Direct links to the files in the WARC files can always be made |
21:26
🔗
|
godane |
ok |
21:26
🔗
|
godane |
i'm doing 1 to 500 as a WARC |
21:27
🔗
|
arkiver |
thanks! |
21:28
🔗
|
godane |
i mostly do the zips for later collection building |
21:28
🔗
|
arkiver |
I'll soon start writing the warrior project for your scripts we talked about |
21:29
🔗
|
arkiver |
so you can use many IPs or lots of bandwidth to do your grabs |
21:29
🔗
|
godane |
i'm doing good with kpfa so far |
21:29
🔗
|
godane |
i'm up to may 2006 |
21:29
🔗
|
arkiver |
saw that yeah, nice |
22:13
🔗
|
|
Ravenloft has joined #archiveteam-bs |
23:03
🔗
|
|
Muad-Dib has quit IRC (Ping timeout: 252 seconds) |
23:14
🔗
|
|
R5M has joined #archiveteam-bs |
23:40
🔗
|
|
zenguy has quit IRC (Quit: see ya!) |
23:41
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
23:42
🔗
|
|
zenguy has joined #archiveteam-bs |
23:43
🔗
|
|
Ravenloft has quit IRC (Ping timeout: 252 seconds) |
23:51
🔗
|
|
zenguy_pc has joined #archiveteam-bs |