#archiveteam-bs 2015-12-06,Sun

Time Nickname Message
00:17 🔗 kyan Here's a bit more of Docstoc that didn't get uploaded normally, by the way: https://archive.org/download/WARCdealer_BarrelData_kthenu_e257bd0e-5f2c-49fb-9b31-ae516964a559.2015-12-05-04-27-07-934046-_E
00:17 🔗 kyan arkiver, ^
00:18 🔗 kyan (it should probably get tucked into the regular Docstoc collection?)
00:19 🔗 wyatt8740 has quit IRC (Remote host closed the connection)
00:20 🔗 kyan also, why is one of the items 18 GB? That seems kind of big for 100 documents... -_-
00:20 🔗 kyan Eh, whatever, it's saved now anyway
00:30 🔗 wyatt8740 has joined #archiveteam-bs
00:32 🔗 SN4T14 has joined #archiveteam-bs
00:52 🔗 RichardG has quit IRC (Ping timeout: 499 seconds)
01:43 🔗 antomatic has joined #archiveteam-bs
01:43 🔗 swebb sets mode: +o antomatic
01:45 🔗 antomati_ has quit IRC (Ping timeout: 252 seconds)
01:50 🔗 RichardG has joined #archiveteam-bs
02:19 🔗 JesseW has quit IRC (Leaving.)
02:24 🔗 wyatt8740 has quit IRC (Read error: Operation timed out)
02:28 🔗 username1 has joined #archiveteam-bs
02:31 🔗 schbirid2 has quit IRC (Read error: Operation timed out)
02:43 🔗 VADemon has quit IRC (left4dead)
02:44 🔗 Start_ has joined #archiveteam-bs
02:44 🔗 Start has quit IRC (Read error: Connection reset by peer)
02:48 🔗 wyatt8740 has joined #archiveteam-bs
02:49 🔗 RichardG has quit IRC (Ping timeout: 369 seconds)
02:53 🔗 ndiddy has quit IRC (Remote host closed the connection)
03:23 🔗 diacope has quit IRC (Ping timeout: 491 seconds)
03:23 🔗 deathy___ has quit IRC (Ping timeout: 491 seconds)
03:23 🔗 tjg has quit IRC (Read error: Connection reset by peer)
03:23 🔗 _desu____ has joined #archiveteam-bs
03:23 🔗 zyphlar_ has joined #archiveteam-bs
03:24 🔗 tjg has joined #archiveteam-bs
03:25 🔗 Boltsie__ has quit IRC (Ping timeout: 242 seconds)
03:25 🔗 JSharp___ has quit IRC (Ping timeout: 242 seconds)
03:25 🔗 _desu___ has quit IRC (Ping timeout: 242 seconds)
03:25 🔗 zyphlar has quit IRC (Ping timeout: 242 seconds)
03:25 🔗 Ctrl-S___ has quit IRC (Ping timeout: 242 seconds)
03:25 🔗 JSharp___ has joined #archiveteam-bs
03:25 🔗 _desu____ is now known as _desu___
03:25 🔗 zyphlar_ is now known as zyphlar
03:25 🔗 Boltsie__ has joined #archiveteam-bs
03:25 🔗 primus104 has quit IRC (Leaving.)
03:25 🔗 Ctrl-S___ has joined #archiveteam-bs
03:26 🔗 deathy___ has joined #archiveteam-bs
03:41 🔗 tjg has quit IRC (Read error: Connection reset by peer)
03:41 🔗 _desu____ has joined #archiveteam-bs
03:41 🔗 zyphlar_ has joined #archiveteam-bs
03:42 🔗 tjg has joined #archiveteam-bs
03:42 🔗 diacope has joined #archiveteam-bs
03:43 🔗 deathy___ has quit IRC (Ping timeout: 246 seconds)
03:43 🔗 JSharp___ has quit IRC (Ping timeout: 246 seconds)
03:43 🔗 Ctrl-S___ has quit IRC (Ping timeout: 246 seconds)
03:43 🔗 Boltsie__ has quit IRC (Ping timeout: 246 seconds)
03:43 🔗 zyphlar has quit IRC (Ping timeout: 246 seconds)
03:43 🔗 _desu___ has quit IRC (Ping timeout: 246 seconds)
03:43 🔗 _desu____ is now known as _desu___
03:43 🔗 zyphlar_ is now known as zyphlar
03:43 🔗 Boltsie__ has joined #archiveteam-bs
03:43 🔗 JSharp___ has joined #archiveteam-bs
03:43 🔗 Ctrl-S___ has joined #archiveteam-bs
03:45 🔗 RichardG has joined #archiveteam-bs
03:50 🔗 JesseW has joined #archiveteam-bs
03:54 🔗 deathy___ has joined #archiveteam-bs
04:25 🔗 aaaaaaaaa has quit IRC (Leaving)
04:26 🔗 Start_ is now known as Start
04:55 🔗 kyan Rrrgh. I wish IA didn't dark spam, but just noindexed it
04:56 🔗 kyan Trying to figure out how to get to this collection of mp3s of lectures, but it was flagged as spam https://archive.org/details/Dr.JamaalBadawi
04:57 🔗 kyan That said, maybe it is fake or something? But I'd rather be able to find out for myself than be left stuck wondering
05:06 🔗 kyan It looks like it was an accident https://archive.org/post/1048792/help-with-failed-exit-code-1
05:07 🔗 kyan It's been darked 3 times as spam, and undarked twice
05:07 🔗 kyan Why not just have a checkbox on the search pages asking if we want to include things marked as spam in search results?
05:21 🔗 remsen has quit IRC (Read error: Operation timed out)
05:31 🔗 remsen has joined #archiveteam-bs
05:32 🔗 remsen2 has joined #archiveteam-bs
05:35 🔗 R5M has joined #archiveteam-bs
05:36 🔗 remsen2 has quit IRC (Read error: Operation timed out)
05:37 🔗 remsen has quit IRC (Read error: Operation timed out)
05:42 🔗 R5M has quit IRC (Read error: Operation timed out)
05:47 🔗 remsen has joined #archiveteam-bs
05:53 🔗 Sk1d has quit IRC (Read error: Operation timed out)
05:54 🔗 remsen2 has joined #archiveteam-bs
05:56 🔗 remsen has quit IRC (Read error: Operation timed out)
05:56 🔗 R5M has joined #archiveteam-bs
06:04 🔗 remsen2 has quit IRC (Read error: Operation timed out)
06:14 🔗 kyan So, I'm seeing a bunch of people talking about "Twitter Moments". https://twitter.com/moments is some random person who's never tweeted
06:14 🔗 kyan I don't have any buttons on twitter called Moment
06:15 🔗 kyan or displaying the electric-y logo for it.
06:15 🔗 vitzli has joined #archiveteam-bs
06:15 🔗 kyan Googling "twitter moments" takes me to https://twitter.com/i/moments?lang=en which is a 404
06:15 🔗 kyan After watching their lovely ad, I was so excited to see what all the fuss was about!
06:16 🔗 kyan Hashtag #marketing, hashtag #fail.
06:35 🔗 zerkalo has quit IRC (Read error: Operation timed out)
06:46 🔗 zerkalo has joined #archiveteam-bs
07:20 🔗 fie worked for me?
07:26 🔗 kyan I think they've only enabled it for some accounts, but are advertising it to everyone
07:26 🔗 kyan fie ^
07:26 🔗 * kyan is going to sleep now though :3
07:27 🔗 fie I heard about it on NPR weeks ago... I don't even use twitter
07:27 🔗 kyan Huh, ok
07:27 🔗 kyan I just heard about it tonight since SketchCow posted about it
07:27 🔗 fie I may or may not have an account... who knows.... that place is just a trash bin
07:27 🔗 * kyan gets all his news from ArchiveTeam
07:28 🔗 * fie gets all of his news from many legged creatures that live under rocks
07:31 🔗 vitzli has quit IRC (Quit: Leaving)
08:38 🔗 primus104 has joined #archiveteam-bs
08:44 🔗 JesseW has quit IRC (Leaving.)
08:48 🔗 dashcloud has quit IRC (Read error: Operation timed out)
08:55 🔗 dashcloud has joined #archiveteam-bs
09:14 🔗 BlueMaxim has quit IRC (Quit: Leaving)
09:46 🔗 primus104 has quit IRC (Leaving.)
10:11 🔗 Sk1d has joined #archiveteam-bs
11:59 🔗 primus104 has joined #archiveteam-bs
12:38 🔗 primus104 has quit IRC (Leaving.)
13:02 🔗 R5M has quit IRC (Read error: Operation timed out)
13:31 🔗 VADemon has joined #archiveteam-bs
13:34 🔗 SN4T14 has quit IRC (Remote host closed the connection)
13:41 🔗 mistym has quit IRC (Remote host closed the connection)
13:44 🔗 SN4T14 has joined #archiveteam-bs
14:08 🔗 SimpBrain has quit IRC (Leaving)
14:11 🔗 VADemon What are the user-agent strings for archive.org waybackmachine bot? ia_archiver and archive.org_bot are proposed by http://www.archiveteam.org/index.php?title=ArchiveBot#Disclaimers
14:12 🔗 VADemon But I have also found a website saying that ia_archiver-web.archive.org is a bot from Alexa that additionally indexes items for web.archive.org
14:36 🔗 username1 sounds like my seagate 8tb woes are a kernel problem https://bugzilla.kernel.org/show_bug.cgi?id=93581
15:00 🔗 primus104 has joined #archiveteam-bs
15:14 🔗 SimpBrain has joined #archiveteam-bs
15:40 🔗 SN4T14 has quit IRC (Remote host closed the connection)
15:46 🔗 godane so i found a pattern to grab Time Magazine from their vault website
15:57 🔗 godane so looks like the older vault Time magazines were scanned better
15:57 🔗 godane whereas with things from the 1990s they didn't care
15:58 🔗 godane they scanned those very badly
16:01 🔗 godane here is an example of a bad scan: http://time.com/vault/issue/1923-09-03/page/1/
16:03 🔗 godane anyways the first 100 download ids of fieldsupport.lingnet.org is done: https://archive.org/details/fieldsupport.lingnet.org-download-id-1-to-100-20151206
16:03 🔗 godane you really only get 74 of them
16:03 🔗 godane but you have a wget.log to see what is missing
16:04 🔗 R5M has joined #archiveteam-bs
17:18 🔗 SketchCow In EXTREMELY boring news, the archivebot screenshotter ignored an item if it had anything called *png* in it, that's fixed, so those little ones without any screenshots are now getting screenshots.
17:19 🔗 SketchCow It's going to be at this for months, probably, but I can summarily ignore it
17:19 🔗 SketchCow I will automate it a tad more and then just watch it fill.
17:19 🔗 SketchCow The question is if it can ever beat the race condition.
17:19 🔗 SketchCow (I don't want to do things like just do a small set of the page grabs in a given set.)
17:20 🔗 SketchCow I mean, if we were having guests, I might do that.
17:20 🔗 SketchCow We're not having guests
17:23 🔗 username1 is now known as schbirid
17:23 🔗 * schbirid slaps arkiver with nohome
17:29 🔗 limebyte has quit IRC (ZNC - http://znc.in)
17:29 🔗 limebyte has joined #archiveteam-bs
17:42 🔗 Ravenloft has joined #archiveteam-bs
17:44 🔗 no2pencil has quit IRC (Ping timeout: 252 seconds)
17:44 🔗 no2pencil has joined #archiveteam-bs
17:44 🔗 arkiver schbirid: I think I already sent you the telenor target?
17:44 🔗 schbirid nope, or maybe i lost the log
17:45 🔗 tjg has quit IRC (Read error: Connection reset by peer)
17:45 🔗 Boltsie__ has quit IRC (Write error: Connection reset by peer)
17:45 🔗 _desu____ has joined #archiveteam-bs
17:45 🔗 zyphlar_ has joined #archiveteam-bs
17:45 🔗 bauruine has quit IRC (Read error: Connection reset by peer)
17:45 🔗 bauruine_ has joined #archiveteam-bs
17:45 🔗 arkiver schbirid: found it, looks like you were online
17:46 🔗 bauruine_ is now known as bauruine
17:46 🔗 Boltsie__ has joined #archiveteam-bs
17:46 🔗 arkiver offline*
17:46 🔗 schbirid cheers!
17:47 🔗 deathy___ has quit IRC (Ping timeout: 241 seconds)
17:47 🔗 JSharp___ has quit IRC (Ping timeout: 241 seconds)
17:47 🔗 Ctrl-S___ has quit IRC (Ping timeout: 241 seconds)
17:47 🔗 _desu___ has quit IRC (Ping timeout: 241 seconds)
17:47 🔗 zyphlar has quit IRC (Ping timeout: 241 seconds)
17:47 🔗 _desu____ is now known as _desu___
17:47 🔗 JSharp___ has joined #archiveteam-bs
17:47 🔗 zyphlar_ is now known as zyphlar
17:48 🔗 tjg has joined #archiveteam-bs
17:48 🔗 Ctrl-S___ has joined #archiveteam-bs
17:52 🔗 PrincessK has joined #archiveteam-bs
17:52 🔗 deathy___ has joined #archiveteam-bs
18:00 🔗 Knoeki has quit IRC (Read error: Operation timed out)
18:21 🔗 Ravenloft has quit IRC (Ping timeout: 252 seconds)
18:23 🔗 JesseW has joined #archiveteam-bs
18:57 🔗 aaaaaaaaa has joined #archiveteam-bs
18:57 🔗 swebb sets mode: +o aaaaaaaaa
19:21 🔗 JesseW has quit IRC (Leaving.)
19:44 🔗 SN4T14 has joined #archiveteam-bs
19:53 🔗 R5M has quit IRC (Read error: Operation timed out)
20:31 🔗 JesseW has joined #archiveteam-bs
20:48 🔗 JesseW godane: thanks for the lingnet grab: https://archive.org/details/fieldsupport.lingnet.org-download-id-1-to-100-20151206
20:52 🔗 godane you're welcome
20:53 🔗 godane also this is up to now: https://archive.org/details/fieldsupport.lingnet.org-download-id-301-to-400-20151206
21:10 🔗 JesseW has quit IRC (Leaving.)
21:15 🔗 arkiver godane: should that also be saved into WARCs?
21:17 🔗 godane i was not saving it in warc cause the files are pdfs and zips
21:18 🔗 godane but if you want i could do that
21:18 🔗 godane its like how i did the lego pdfs
21:20 🔗 arkiver I think it's always best to save files into WARCs
21:20 🔗 arkiver Direct links to the files in the WARC files can always be made
21:26 🔗 godane ok
21:26 🔗 godane i'm doing 1 to 500 as a WARC
21:27 🔗 arkiver thanks!
21:28 🔗 godane i mostly do the zips for later collection building
21:28 🔗 arkiver I'll soon start writing the warrior project for your scripts we talked about
21:29 🔗 arkiver so you can use many IPs or lots of bandwidth to do your grabs
21:29 🔗 godane i'm doing good with kpfa so far
21:29 🔗 godane i'm up to may 2006
21:29 🔗 arkiver saw that yeah, nice
22:13 🔗 Ravenloft has joined #archiveteam-bs
23:03 🔗 Muad-Dib has quit IRC (Ping timeout: 252 seconds)
23:14 🔗 R5M has joined #archiveteam-bs
23:40 🔗 zenguy has quit IRC (Quit: see ya!)
23:41 🔗 schbirid has quit IRC (Quit: Leaving)
23:42 🔗 zenguy has joined #archiveteam-bs
23:43 🔗 Ravenloft has quit IRC (Ping timeout: 252 seconds)
23:51 🔗 zenguy_pc has joined #archiveteam-bs
