[00:13] *** bwn has joined #archiveteam-bs
[00:27] *** GE has quit IRC (Remote host closed the connection)
[00:33] *** pizzaiolo has quit IRC (Ping timeout: 260 seconds)
[00:55] *** JAA has quit IRC (Quit: Page closed)
[00:59] *** matt_lock has quit IRC (Ping timeout: 268 seconds)
[01:31] *** n00b811 has quit IRC (Ping timeout: 268 seconds)
[01:46] *** icedice has quit IRC (Quit: Leaving)
[01:51] <godane> SketchCow: i'm close to getting tagesschau 20 clock evening news up to 1992-09-30
[01:51] <godane> i'm at 1992-09-29 right now
[01:52] <godane> https://archive.org/details/tagesschau-20-clock-evening-news-1992-09-29
[02:31] *** vitzli has joined #archiveteam-bs
[02:56] *** vitzli has quit IRC (Quit: Leaving)
[03:09] *** ndiddy has joined #archiveteam-bs
[03:17] *** ndiddy has quit IRC ()
[04:10] <dxrt> Somebody2: Have you run into any ratelimiting when using curl with the wayback save function?
[04:46] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:51] <Somebody2> dxrt: I've only been running it once a day, so no. :-)
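A minimal sketch of the kind of once-a-day save request discussed above, written in Python rather than curl. The `https://web.archive.org/save/` prefix, the helper names `save_url` and `request_save`, the User-Agent string, and the assumption that rate limiting would surface as an HTTP error status such as 429 are illustrative and not taken from the conversation.

```python
import urllib.error
import urllib.request

# Assumed Save Page Now prefix; the target URL is appended verbatim.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_url(target: str) -> str:
    """Build the save-request URL for a target page."""
    return SAVE_PREFIX + target


def request_save(target: str, timeout: float = 60.0) -> int:
    """Ask the Wayback Machine to capture `target` and return the HTTP status.

    An error status (for example 429, if rate limiting is applied) is
    returned to the caller instead of raising, so a scheduler can decide
    whether to back off and retry later.
    """
    req = urllib.request.Request(
        save_url(target),
        headers={"User-Agent": "daily-save-sketch/0.1"},  # hypothetical UA
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `request_save("http://example.com/")` once a day from cron would mirror the low request rate Somebody2 describes; anything more frequent would probably want a delay between requests.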
[04:53] *** Sk1d has joined #archiveteam-bs
[07:17] *** JAA has joined #archiveteam-bs
[07:23] <JAA> Shit, my WunderBlogs grab ran out of space because of an MJPEG of several GB. :-|
[08:23] <wp494> JAA: expect to run into a little bit of that, satellite GIFs can be several hundreds of MBs if not more
[08:23] <wp494> at least GOES-R isn't fully operational yet because that would definitely drive lots more data
[08:33] <Sanqui> well a MJPEG can be of an infinite size
[08:33] <Sanqui> can't it
[08:34] *** GE has joined #archiveteam-bs
[08:39] <JAA> wp494: I just don't expect those huge images to be embedded in a webpage since it also leads to a terrible user experience (webpage loading forever). I'm not following external links, only grabbing page requisites.
[08:48] <Sanqui> JAA: mjpeg is used for streams.  think webcams
[08:55] <JAA> Ooh, that makes sense
[08:57] <Sanqui> it's sort of a hack
[08:58] <Sanqui> from back when neither <video> nor Flash existed
[09:05] <JAA> "sort of"
[09:40] *** odemg has joined #archiveteam-bs
[09:40] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[09:41] *** BlueMaxim has joined #archiveteam-bs
[09:52] <JAA> What the hell? I added a --reject-regex for that MJPEG, but wpull tried downloading it anyway.
[10:04] *** GE has quit IRC (Remote host closed the connection)
[11:14] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:55] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[11:55] *** dashcloud has joined #archiveteam-bs
[11:59] *** pizzaiolo has joined #archiveteam-bs
[12:04] <JAA> Terrific. When a torrent isn't found on Mininova, the download link returns HTTP status 200. Sigh...
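Since the status code cannot be trusted here, one workaround is to inspect the response body instead. The sketch below is an assumption about how that could be done, not what JAA actually ran: it relies only on the fact that .torrent files are bencoded dictionaries (they start with `d`, end with `e`, and contain the key `info`, encoded as `4:info`). The function name `looks_like_torrent` is hypothetical, and the check is a cheap heuristic rather than a full bencode parser.

```python
def looks_like_torrent(body: bytes) -> bool:
    """Heuristically decide whether a response body is a .torrent file.

    A torrent is a bencoded dictionary, so it starts with b"d", ends with
    b"e", and must contain the "info" key (bencoded as b"4:info").
    An HTML error page served with status 200 fails these checks, as does
    an empty body.
    """
    return (
        body.startswith(b"d")
        and body.endswith(b"e")
        and b"4:info" in body
    )


# A tiny hand-written bencoded dictionary versus a "soft 404" page.
sample_torrent = b"d8:announce3:foo4:infod4:name1:aee"
soft_404 = b"<html><body>Torrent not found</body></html>"

print(looks_like_torrent(sample_torrent))  # True
print(looks_like_torrent(soft_404))        # False
```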
[12:06] *** odemg has quit IRC (Remote host closed the connection)
[12:27] *** GE has joined #archiveteam-bs
[13:06] *** odemg has joined #archiveteam-bs
[13:27] *** icedice has joined #archiveteam-bs
[14:50] *** pnJay has joined #archiveteam-bs
[16:23] *** icedice has quit IRC (Quit: Leaving)
[16:45] *** odemg has quit IRC (Quit: fucked right off!!)
[16:46] *** odemg has joined #archiveteam-bs
[17:34] *** GE has quit IRC (Remote host closed the connection)
[18:15] <Frogging> done https://github.com/chfoo/wpull/pull/360
[18:30] *** odemg has quit IRC (Remote host closed the connection)
[18:30] *** odemg has joined #archiveteam-bs
[18:31] *** tpw_rules has left Textual IRC Client: www.textualapp.com
[18:53] *** GE has joined #archiveteam-bs
[18:53] *** JAA has quit IRC (Ping timeout: 268 seconds)
[18:55] *** Aranje has quit IRC (Read error: Connection reset by peer)
[18:59] *** JAA has joined #archiveteam-bs
[19:17] *** kyounko has quit IRC (Read error: Connection reset by peer)
[19:42] *** odemg has quit IRC (Remote host closed the connection)
[19:53] *** Aranje has joined #archiveteam-bs
[20:17] *** odemg has joined #archiveteam-bs
[20:31] *** odemg has quit IRC (Remote host closed the connection)
[20:32] *** odemg has joined #archiveteam-bs
[20:58] <JAA> Whoa, just discovered that ArchiveBot downloaded a 1.7 GB video file while grabbing Mininova
[21:01] <JAA> ... and the video's filename contains "part05". I guess that might explain why the grab is 68 GB (compressed).
[21:20] <JAA> Yep, that's definitely not the only one in those archives.
[21:48] <JAA> arkiver, rocode: Better estimate for Mininova torrent data size incoming.
[21:48] <JAA> I extracted all correctly downloaded (status 200) .torrent files from the ArchiveBot grab WARCs uploaded to IA as of yesterday (I realised moments ago that the upload is still running; I analysed WARCs 00000 to 00024). There are 48294 files in these, 6 of which are empty.
[21:48] <JAA> The 48288 valid torrents contain 2530986261593 bytes = 2.30 TiB (average size 49.99 MiB). Extrapolating to the total torrent count of 72056 gives a total size of approximately 3.43 TiB.
[21:49] <JAA> Also xmc & HCross2 ^
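The estimate above can be reproduced with a few lines of arithmetic. This is a sketch of the calculation only: the byte count, the sample size, and the total torrent count come from JAA's messages, while the function name `extrapolate` and the assumption that the sampled torrents are representative of the rest are illustrative.

```python
MIB = 2**20
TIB = 2**40

# Figures quoted in the messages above.
SAMPLE_BYTES = 2530986261593   # total content size of the valid torrents
SAMPLE_COUNT = 48288           # 48294 files minus 6 empty ones
TOTAL_COUNT = 72056            # total torrent count on the site


def extrapolate(sample_bytes: int, sample_count: int, total_count: int):
    """Return (average size, estimated total size) in bytes.

    Assumes the sample average also holds for the torrents not yet analysed.
    """
    average = sample_bytes / sample_count
    return average, average * total_count


average, estimate = extrapolate(SAMPLE_BYTES, SAMPLE_COUNT, TOTAL_COUNT)

print(f"sample:   {SAMPLE_BYTES / TIB:.2f} TiB")  # 2.30 TiB
print(f"average:  {average / MIB:.2f} MiB")       # 49.99 MiB
print(f"estimate: {estimate / TIB:.2f} TiB")      # 3.43 TiB
```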
[21:52] <JAA> My grab is at 146k done, 293k left, by the way
[21:53] *** dmt` has left 
[21:53] <JAA> WunderBlogs update: 283k done, 463k left
[22:00] *** JAA has quit IRC (Quit: Page closed)
[22:00] *** JAA has joined #archiveteam-bs
[22:00] *** JAA has quit IRC (Client Quit)
[22:00] *** JAA has joined #archiveteam-bs
[22:01] <JAA> (Sorry about that)
[22:06] *** ZizzyDizz has quit IRC (Remote host closed the connection)
[22:06] *** ZizzyDizz has joined #archiveteam-bs
[22:10] *** Dark_Star has quit IRC (Ping timeout: 246 seconds)
[22:12] *** BlueMaxim has joined #archiveteam-bs
[22:15] *** Dark_Star has joined #archiveteam-bs
[22:19] *** pizzaiolo has left 
[22:24] *** GE has quit IRC (Remote host closed the connection)
[22:27] *** pizzaiolo has joined #archiveteam-bs
[22:30] *** TC01 has quit IRC (Read error: Operation timed out)
[22:31] *** TC01 has joined #archiveteam-bs
[23:22] *** JAA has quit IRC (Quit: Page closed)
[23:37] *** kristian_ has joined #archiveteam-bs
[23:57] *** ZizzyDizz has quit IRC (Remote host closed the connection)