#archiveteam-bs 2016-01-28,Thu

Time Nickname Message
00:22 🔗 Arkiver2 SketchCow: you are only uploading the 'second' telenor archive as megawarcs right?
00:23 🔗 Arkiver2 is now known as arkiver
00:24 🔗 arkiver I like that NewsGrabber is getting some attention
00:25 🔗 arkiver SketchCow: what's the manager thinking of how newsgrabber is going, vs. GDELT?
00:27 🔗 HCross2 Yeah. It is nice to see it going so well.
00:34 🔗 arkiver yep
00:34 🔗 joepie91 arkiver: it is?
00:34 🔗 joepie91 (getting attention)
00:34 🔗 arkiver yes
00:35 🔗 joepie91 arkiver: link?
00:36 🔗 arkiver It's getting attention from the wayback machine manager
00:36 🔗 arkiver Not like global
00:36 🔗 arkiver yet
00:36 🔗 joepie91 oh, in that sense
00:37 🔗 HCross2 Yet being the key word
00:39 🔗 arkiver Yes
00:42 🔗 joepie91 GitHub down \o/
00:45 🔗 yipdw Atlassian Army strikes again
00:45 🔗 yipdw I think service outages are more fun if you imagine them as overly simplistic battles between two major competitors
00:48 🔗 yipdw or maybe Age of Empires games where you have that one asshole who types in the car cheat code
00:53 🔗 Rye "Click to select this one %s"
00:53 🔗 Rye and that cheat, wasn't it "how do you turn this on" or something like that
00:54 🔗 Rye (how the fuck do i know this. too much fucking about in aoe2 single player games i guess)
00:58 🔗 joepie91 lol
00:59 🔗 Aad downtimes are fun only when you can observe the ensuing chaos and then optionally know about the cause
01:00 🔗 joepie91 I want my TiddlyWiki Desktop :(
01:30 🔗 vitzli has joined #archiveteam-bs
01:54 🔗 JesseW has joined #archiveteam-bs
02:16 🔗 joepie91 "The odds of Github meeting a fate similar to that of the Library of Alexandria are slim."
02:16 🔗 joepie91 welp
02:16 🔗 joepie91 so much for taking that article seriously, I guess
02:16 🔗 joepie91 (ref http://www.wired.com/2015/06/problem-putting-worlds-code-github/ )
02:17 🔗 vitzli wonderful morning
02:31 🔗 JesseW still down is it?
02:32 🔗 JesseW ah, apparently it's partially back up
02:34 🔗 pikhq "Slim"? Really? I mean, I don't think it's going away especially *soon*, but it's not as though corporate digital data storage has a great long-term track record.
02:48 🔗 VADemon has quit IRC (Quit: left4dead)
03:11 🔗 DFJustin has quit IRC (Quit: IMHOSTFU)
03:11 🔗 DFJustin has joined #archiveteam-bs
03:47 🔗 espes___ heh
03:48 🔗 espes___ ive got half of github in cloud storage
03:48 🔗 espes___ totes reliable
03:50 🔗 kyan has quit IRC (This computer has gone to sleep)
03:52 🔗 kyan has joined #archiveteam-bs
04:07 🔗 JetBalsa has quit IRC (Read error: Connection reset by peer)
04:08 🔗 atlogbot has quit IRC (Ping timeout: 360 seconds)
04:09 🔗 Zebranky_ has joined #archiveteam-bs
04:12 🔗 Zebranky has quit IRC (Read error: Operation timed out)
04:18 🔗 atlogbot has joined #archiveteam-bs
04:18 🔗 beardicus has quit IRC (Read error: Operation timed out)
04:18 🔗 kvieta has quit IRC (Read error: Operation timed out)
04:18 🔗 dxrt has quit IRC (Read error: Operation timed out)
04:20 🔗 slyphic has quit IRC (Read error: Operation timed out)
04:20 🔗 toad1 has quit IRC (Read error: Operation timed out)
04:20 🔗 kyan_ has joined #archiveteam-bs
04:22 🔗 Sanqui has quit IRC (Read error: Operation timed out)
04:24 🔗 kyan has quit IRC (Read error: Operation timed out)
04:25 🔗 kyan_ is now known as kyan
04:27 🔗 phuzion has quit IRC (Ping timeout: 633 seconds)
04:27 🔗 toad1 has joined #archiveteam-bs
04:28 🔗 kvieta has joined #archiveteam-bs
04:28 🔗 dxrt has joined #archiveteam-bs
04:31 🔗 phuzion has joined #archiveteam-bs
04:32 🔗 slyphic has joined #archiveteam-bs
04:37 🔗 Sanqui has joined #archiveteam-bs
04:42 🔗 JesseW has quit IRC (Ping timeout: 246 seconds)
04:53 🔗 chazchaz has quit IRC (Ping timeout: 260 seconds)
04:54 🔗 chazchaz has joined #archiveteam-bs
05:31 🔗 acridAxid has quit IRC (Quit: marauder)
05:32 🔗 beardicus has joined #archiveteam-bs
05:32 🔗 acridAxid has joined #archiveteam-bs
05:36 🔗 kvieta has quit IRC (Read error: Operation timed out)
05:36 🔗 phuzion has quit IRC (Read error: Operation timed out)
05:37 🔗 beardicus has quit IRC (Read error: Operation timed out)
05:39 🔗 JesseW has joined #archiveteam-bs
05:39 🔗 phuzion has joined #archiveteam-bs
05:42 🔗 SimpBrain has quit IRC (Ping timeout: 260 seconds)
05:53 🔗 kyan has quit IRC (Ping timeout: 260 seconds)
05:56 🔗 mistym has quit IRC (Ping timeout: 633 seconds)
05:56 🔗 kyan has joined #archiveteam-bs
05:56 🔗 mistym has joined #archiveteam-bs
06:00 🔗 kyan_ has joined #archiveteam-bs
06:02 🔗 robink has quit IRC (Ping timeout: 190 seconds)
06:03 🔗 kyan has quit IRC (Ping timeout: 260 seconds)
06:03 🔗 robink has joined #archiveteam-bs
06:04 🔗 yipdw this is probably the most annoying thing about the S3 upload endpoint:
06:04 🔗 yipdw uploading docstoc20151218032352.megawarc.warc.gz: [################################] 25625/25625 - 00:00:00
06:04 🔗 yipdw error uploading docstoc20151218032352.megawarc.warc.gz to archiveteam_docstoc_20151218032352, 403 Client Error: Forbidden
06:04 🔗 yipdw for some reason it will only return 4xx *after* you upload everything
06:05 🔗 beardicus has joined #archiveteam-bs
06:05 🔗 yipdw (also I don't know why my keys are still denied but that's a different problem)
06:06 🔗 kvieta has joined #archiveteam-bs
06:29 🔗 SimpBrain has joined #archiveteam-bs
06:31 🔗 SimpBrain has quit IRC (Read error: Operation timed out)
06:43 🔗 mismatchm has quit IRC (Ping timeout: 360 seconds)
06:46 🔗 SimpBrain has joined #archiveteam-bs
07:11 🔗 kyan_ has quit IRC (Ping timeout: 260 seconds)
07:35 🔗 ivan` yipdw: are you surprised that an HTTP response can't come before the HTTP request is finished? ;)
07:36 🔗 yipdw ivan`: well there's a 100-Continue response in there before the giant upload
07:36 🔗 ivan` ah I see
07:36 🔗 yipdw so I'd prefer to be told then, yes :P
07:36 🔗 yipdw or at least curl reports one if you POST to the endpoint
07:37 🔗 yipdw I'm not sure what ia upload does but it seems to use the same mechanism
07:56 🔗 HCross2 yipdw: I totally agree. With the NewsGrabber, by the time the IA has errored, we've uploaded 10GB
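[editor's note] The `Expect: 100-continue` handshake discussed above can be sketched as follows. This is a minimal illustration, not what `ia upload` or curl actually does internally: the function names, host, and path are hypothetical, and the length is an arbitrary placeholder. The client sends only the request head, reads the interim response, and transmits the body only if the status is 100. yipdw's complaint is that the endpoint apparently answers 100 first and only reports the 403 after the body has been sent.

```python
def build_request_head(method, path, host, content_length, extra_headers=None):
    """Return the header block of an HTTP/1.1 request that asks the server
    for an interim response before the body is sent."""
    headers = {
        "Host": host,
        "Content-Length": str(content_length),
        "Expect": "100-continue",
    }
    # Any additional headers (e.g. credentials) are supplied by the caller.
    headers.update(extra_headers or {})
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status(response_head):
    """Extract the numeric status code from the first line of a response."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("ascii")
    # A status line looks like: HTTP/1.1 100 Continue
    return int(status_line.split(" ", 2)[1])


def should_send_body(response_head):
    """Only a 100 interim response means 'go ahead and send the body'."""
    return parse_status(response_head) == 100


# Hypothetical host and path, for illustration only.
head = build_request_head("PUT", "/some-item/file.warc.gz", "s3.example.org", 1024)
print(head.decode("ascii"))
print(should_send_body(b"HTTP/1.1 100 Continue\r\n\r\n"))
print(should_send_body(b"HTTP/1.1 403 Forbidden\r\n\r\n"))
```

A server that validated credentials before replying to the `Expect` header would let `should_send_body` return False up front, sparing the client a multi-gigabyte upload that is going to be rejected anyway.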
07:58 🔗 godane i'm grabbing a show called Prime Time from americanarchive.org
07:59 🔗 godane its from 1982/1983 on Rocky Mountain PBS
08:16 🔗 vitzli has quit IRC (Leaving)
08:17 🔗 JesseW has quit IRC (Leaving.)
08:21 🔗 godane i had to stop uploading
08:21 🔗 godane i have 735 items waiting on derive
08:27 🔗 godane i'm at 629,870 items right now
08:30 🔗 kyan has joined #archiveteam-bs
11:35 🔗 vitzli has joined #archiveteam-bs
12:24 🔗 beardicus has quit IRC (Read error: Operation timed out)
12:26 🔗 kvieta has quit IRC (Read error: Operation timed out)
12:28 🔗 SimpBrain has quit IRC (Ping timeout: 633 seconds)
12:39 🔗 kvieta has joined #archiveteam-bs
12:39 🔗 beardicus has joined #archiveteam-bs
12:42 🔗 SimpBrain has joined #archiveteam-bs
12:53 🔗 kvieta has quit IRC (Read error: Operation timed out)
12:55 🔗 beardicus has quit IRC (Ping timeout: 961 seconds)
13:13 🔗 kvieta has joined #archiveteam-bs
13:14 🔗 beardicus has joined #archiveteam-bs
13:28 🔗 logan has quit IRC (Remote host closed the connection)
13:28 🔗 logan has joined #archiveteam-bs
13:45 🔗 Fletcher has quit IRC (Ping timeout: 252 seconds)
13:59 🔗 dashcloud has quit IRC (Ping timeout: 250 seconds)
14:06 🔗 dashcloud has joined #archiveteam-bs
14:46 🔗 Fletcher has joined #archiveteam-bs
14:50 🔗 dashcloud has quit IRC (Read error: Operation timed out)
14:54 🔗 dashcloud has joined #archiveteam-bs
15:20 🔗 VADemon has joined #archiveteam-bs
15:29 🔗 RichardG_ has joined #archiveteam-bs
15:30 🔗 RichardG has quit IRC (Ping timeout: 260 seconds)
15:41 🔗 SketchCow 1. No idea if I screwed up Telenor.
15:41 🔗 SketchCow 2. Yes, this manager is trying to keep track of the work we do. He'll have fun. We're massive.
15:43 🔗 HCross and are only getting bigger
15:45 🔗 godane i'm at 630,508 items now
15:51 🔗 RichardG_ has quit IRC (Remote host closed the connection)
15:51 🔗 RichardG has joined #archiveteam-bs
15:54 🔗 HCross has quit IRC (Read error: Connection reset by peer)
15:54 🔗 Stiletto has quit IRC (Read error: Connection reset by peer)
15:54 🔗 brayden has quit IRC (Read error: Connection reset by peer)
15:54 🔗 Stiletto has joined #archiveteam-bs
15:54 🔗 HCross has joined #archiveteam-bs
15:54 🔗 brayden has joined #archiveteam-bs
16:00 🔗 joepie91 herding cats commences
16:00 🔗 joepie91 lol
16:07 🔗 acridAxid was looking at FTP project and found a few SimTel.net mirrors that include variations
16:09 🔗 SketchCow Always good to get.
16:10 🔗 acridAxid tbh i think the largest differences are things that got squashed by C&Ds (contents of /msdos/pcmag are still there)
16:12 🔗 acridAxid in case you're curious: ftp://ftp.sunet.se/pub/simtelnet/
16:14 🔗 midas i might have uploaded that one
16:14 🔗 acridAxid the mirror on freenet.de also has some differences from the mirror currently stored on IA (from bu.edu), but not yet sure what that means (more/less, newer/older)
16:15 🔗 dashcloud has quit IRC (Read error: Operation timed out)
16:15 🔗 dashcloud has joined #archiveteam-bs
16:15 🔗 midas https://archive.org/details/ftp.sunet.se-simtelwin
16:16 🔗 acridAxid ftp.sunet.se/pub/simtelnet/msdos/pcmag/
16:16 🔗 acridAxid yup, nice
16:16 🔗 midas complete dump was a couple of TB in size, so i've split that one up into a couple of parts
17:03 🔗 vitzli I could try to do audit/check/whatever it is on telenor: download all the archives and check inside content for users/webpages inside
17:07 🔗 joepie91 looks like macros in TiddlyWiki work well: http://i.imgur.com/HsrfMrv.png
17:07 🔗 joepie91 syntax highlighting, not so much...
17:10 🔗 Fletcher has quit IRC (Ping timeout: 252 seconds)
17:14 🔗 Fletcher has joined #archiveteam-bs
17:14 🔗 JesseW has joined #archiveteam-bs
17:16 🔗 kristian_ has joined #archiveteam-bs
17:16 🔗 kristian_ hi, all
17:22 🔗 kristian_ I have a 28 page PDF that is 408 megs
17:24 🔗 dashcloud has quit IRC (Read error: Operation timed out)
17:25 🔗 JesseW wow, what a silly way to make a PDF. :-)
17:25 🔗 kristian_ well
17:26 🔗 kristian_ it's a scanned document ... I got a pdf with borders (to make it fit a certain format) and a size of 240M
17:27 🔗 kristian_ I extracted with pdfimages
17:27 🔗 kristian_ to ppm, so I guess that's what was embedded
17:27 🔗 dashcloud has joined #archiveteam-bs
17:28 🔗 xmc strange, ppm is less compressed than i would expect the pdf to be
17:28 🔗 kristian_ aha, pdfimages -list gives me "1 0 image 2456 3492 rgb 3 8 jpeg no 4 0" and so on
17:28 🔗 kristian_ "jpeg" is what's important there
17:32 🔗 kristian_ so I guess the original pdf has jpg files embedded after all
17:35 🔗 kristian_ image-000.jpg JPEG 2456x3492 2456x3492+0+0 8-bit DirectClass 14MB 0.650u 0:00.660
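[editor's note] The `pdfimages -list` row quoted above can be read field by field. The sketch below assumes the leading columns follow poppler's usual order (page, num, type, width, height, color, comp, bpc, enc, interp, object, ID); the helper names are hypothetical. It also estimates how large that one image would be as uncompressed pixel data, which is roughly what a PPM extraction stores.

```python
# Leading column names of `pdfimages -list` output (assumed poppler order).
COLUMNS = ["page", "num", "type", "width", "height", "color",
           "comp", "bpc", "enc", "interp", "object", "ID"]
INT_FIELDS = {"page", "num", "width", "height", "comp", "bpc", "object", "ID"}


def parse_list_row(row):
    """Turn one whitespace-separated data row into a dict of named fields."""
    values = row.split()
    record = dict(zip(COLUMNS, values))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    return record


def raw_size_bytes(record):
    """Uncompressed pixel data size: width * height * components * bits / 8."""
    bits = record["width"] * record["height"] * record["comp"] * record["bpc"]
    return bits // 8


row = parse_list_row("1 0 image 2456 3492 rgb 3 8 jpeg no 4 0")
print(row["enc"])           # encoding of the embedded image stream
print(raw_size_bytes(row))  # bytes of raw RGB data for this page
```

The `enc` field is the one kristian_ points to: it shows the embedded stream is JPEG even though the default extraction wrote PPM. Running `pdfimages -j` should write such streams out as JPEG files instead of decoding them to PPM.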
17:38 🔗 DFJustin what are you trying to do
17:39 🔗 kristian_ I'm trying to figure out what I should upload
17:39 🔗 kristian_ I guess I'll go with the 240 megs version, sans borders
17:39 🔗 DFJustin you scanned a document yourself?
17:39 🔗 kristian_ no
17:40 🔗 phuzion Does IA do derive tasks on PDFs?
17:40 🔗 DFJustin upload whatever you received it as
17:40 🔗 DFJustin yes phuzion
17:40 🔗 phuzion Ok yeah, then kristian_, you're gonna want to upload the original file.
17:41 🔗 kristian_ I'll do the one without the artificial borders
17:43 🔗 dashcloud has quit IRC (Read error: Operation timed out)
17:45 🔗 JesseW has quit IRC (Read error: Operation timed out)
17:48 🔗 dashcloud has joined #archiveteam-bs
17:49 🔗 kristian_ hurm, nothing happens when I click "upload file"
17:50 🔗 kristian_ sorry, had to click "add"
17:53 🔗 vitzli has quit IRC (Leaving)
18:00 🔗 VADemon has quit IRC (Read error: Connection reset by peer)
18:51 🔗 Aad has quit IRC (Read error: Connection reset by peer)
18:54 🔗 chfoo has quit IRC (Read error: Operation timed out)
19:23 🔗 chfoo has joined #archiveteam-bs
19:48 🔗 kristian_ has quit IRC (Quit: Leaving)
19:59 🔗 schbirid has joined #archiveteam-bs
20:09 🔗 dashcloud has quit IRC (Read error: Operation timed out)
20:12 🔗 dashcloud has joined #archiveteam-bs
20:19 🔗 dashcloud has quit IRC (Read error: Operation timed out)
20:22 🔗 dashcloud has joined #archiveteam-bs
20:59 🔗 schbirid has quit IRC (Quit: Leaving)
21:04 🔗 mismatchm has joined #archiveteam-bs
21:09 🔗 mismatchm has quit IRC (Ping timeout: 252 seconds)
21:09 🔗 mismatch has joined #archiveteam-bs
21:48 🔗 brayden has quit IRC (Read error: Connection reset by peer)
21:50 🔗 brayden has joined #archiveteam-bs
21:55 🔗 Apathy has quit IRC (Read error: Operation timed out)
21:55 🔗 Apathy has joined #archiveteam-bs
22:05 🔗 VADemon has joined #archiveteam-bs
22:09 🔗 slyphic is now known as slyphic|a
22:44 🔗 RedType has joined #archiveteam-bs
22:51 🔗 JetBalsa has joined #archiveteam-bs
22:58 🔗 robink has quit IRC (Ping timeout: 190 seconds)
23:00 🔗 robink has joined #archiveteam-bs
23:03 🔗 wp494_ has joined #archiveteam-bs
23:04 🔗 HCross3 has joined #archiveteam-bs
23:04 🔗 alard has quit IRC (Ping timeout: 250 seconds)
23:04 🔗 wp494 has quit IRC (Ping timeout: 250 seconds)
23:04 🔗 zerkalo has quit IRC (Ping timeout: 250 seconds)
23:05 🔗 HCross has quit IRC (Ping timeout: 250 seconds)
23:05 🔗 SketchCow has quit IRC (Ping timeout: 250 seconds)
23:05 🔗 mutoso has quit IRC (Ping timeout: 250 seconds)
23:05 🔗 HCross3 is now known as HCross
23:05 🔗 Lord_Nigh has quit IRC (Ping timeout: 250 seconds)
23:05 🔗 SketchCow has joined #archiveteam-bs
23:06 🔗 zerkalo has joined #archiveteam-bs
23:07 🔗 alard has joined #archiveteam-bs
23:07 🔗 mutoso has joined #archiveteam-bs
23:07 🔗 SketchCow Where's the Newsgrabber list?
23:08 🔗 HCross http://newsgrabber.harrycross.me/services.html or https://github.com/ArchiveTeam/NewsGrabber/tree/master/services
23:10 🔗 Lord_Nigh has joined #archiveteam-bs
23:38 🔗 Stiletto has quit IRC (Read error: Operation timed out)