[00:48] *** BlueMaxim has joined #archiveteam-bs
[02:13] *** bwn has joined #archiveteam-bs
[02:23] *** schbirid2 has joined #archiveteam-bs
[02:30] *** Pixi has quit IRC (Quit: Pixi)
[02:32] *** schbirid has quit IRC (Read error: Operation timed out)
[02:32] *** Pixi has joined #archiveteam-bs
[03:04] *** godane has quit IRC (Remote host closed the connection)
[03:05] *** godane has joined #archiveteam-bs
[04:02] *** Coderjo has quit IRC (Remote host closed the connection)
[04:41] *** M-WillBra is now known as WillBradl
[04:41] so is batoto just going to die
[04:42] *** godane has quit IRC (Read error: Operation timed out)
[04:45] *** wbradley has joined #archiveteam-bs
[04:45] *** qw3rty16 has joined #archiveteam-bs
[04:46] *** wbradley is now known as zeeboots
[04:47] *** WillBradl is now known as WillBra4
[04:47] *** WillBra4 is now known as zyph
[04:48] *** qw3rty15 has quit IRC (Read error: Operation timed out)
[04:48] *** zyph is now known as zyphlar
[04:49] *** zeeboots has left WeeChat 1.4
[04:51] *** godane has joined #archiveteam-bs
[05:04] so my archivebox project is maybe in alpha/stable stage
[05:06] i found out that the built-in wifi on the rpi3 would disconnect a lot if wireless power management
[05:06] was on
[05:06] so i added 'wireless-power off' to /etc/network/interfaces
[05:07] it was working for about 15 minutes when i was loading tons of pages from kiwix
[05:07] vs like 5 or 10 pages before disconnecting with power management on
[05:13] *** Mateon1 has quit IRC (Read error: Connection reset by peer)
[05:13] *** Mateon1 has joined #archiveteam-bs
[05:15] *** icedice has joined #archiveteam-bs
[05:45] *** icedice has quit IRC (Read error: Connection reset by peer)
[05:50] *** octothorp has quit IRC (Remote host closed the connection)
[05:54] *** jdude104 has quit IRC (Leaving)
[05:55] *** jdude104 has joined #archiveteam-bs
[05:56] *** jdude104 has quit IRC (Client Quit)
[05:56] *** jdude104 has joined #archiveteam-bs
[05:57] *** icedice has joined #archiveteam-bs
[06:00] *** Kimmer has quit IRC (Leaving)
[06:28] *** Ravenloft has quit IRC (Read error: Connection reset by peer)
[06:41] *** jdude has joined #archiveteam-bs
[06:45] *** jdude104 has quit IRC (Read error: Operation timed out)
[06:57] *** icedice has quit IRC (Ping timeout: 245 seconds)
[07:09] *** jdude has quit IRC (Leaving)
[07:09] *** jdude104 has joined #archiveteam-bs
[07:12] *** jdude104 has quit IRC (Client Quit)
[08:17] *** octothorp has joined #archiveteam-bs
[09:15] *** Kimmer has joined #archiveteam-bs
[09:25] *** jschwart has joined #archiveteam-bs
[09:45] *** Coderjo has joined #archiveteam-bs
[11:19] *** BlueMaxim has quit IRC (Leaving)
[11:59] jacketcha: Yes, #botato.
[12:02] *** Smiley has joined #archiveteam-bs
[12:05] *** SmileyG has quit IRC (Ping timeout: 260 seconds)
[13:52] *** REiN^ has quit IRC (Remote host closed the connection)
[14:53] SketchCow, claim the $100
[14:53] https://twitter.com/_cryptome_/status/952168812505387008
[14:53] https://splinternews.com/rogue-archivists-are-creating-a-copy-of-gawker-com-so-t-1793861301
[15:18] godane, we're ripping pbs content, see https://i.imgur.com/qGRIO9R.png ... get in here https://discord.gg/RQpHMJP (did you already write something?) still, get in there <3
[16:20] charlie rose uses a custom script just for charlierose.com
[16:21] *i uses a custom script
[16:22] *** K4k has quit IRC (Read error: Connection reset by peer)
[16:22] godane: What are you grabbing exactly? I had to ignore the actual videos in the ArchiveBot job towards the end because my machine had a forced reboot due to the Meltdown bug.
[16:22] I'm planning to resume that though. There are about 5400 videos left IIRC.
[16:23] right now i'm grabbing the 762 version of the videos
[16:23] i was downloading a month's worth of videos and then uploading them
[16:24] my panic grab of 762 version is just in case shit hits the fan
[16:24] Ok, the URLs I ignored look like this: https://pfm1hycdn01-a.akamaihd.net/788/1HY788_003_xp.f4v
[16:24] cause it should be around 2.5 to 3.0tb
[16:24] The ArchiveBot job grabbed some 6 TB and the remaining videos will be another 2-3 TB.
[16:25] those f4v files most of the time don't exist
[16:27] i'm also doing something crazy and making a mp3 collection from the charlie rose videos
[16:28] the mp3 collection will offer some hoarders with low disk space some sort of archive of it
[16:29] btw other series i have to go after later is called 'The Open Mind'
[16:38] odemg: I'm running something to pull out the gawker stuff.
[16:38] I'm sure we used archivebot for it, not anything else, right
[16:38] godane, ohh I know re crose stuff you sent me the script, just wondering about pbs
[16:38] SketchCow, sound :D
[16:40] SketchCow, you should likely tweet at them and let them know, get that money son!
[17:16] godane: They do exist, but you can only access them if you set the correct referrer, otherwise you get the not found error.
[18:08] *** mnjgno has joined #archiveteam-bs
[18:09] hello! I did this: http://bookmarklets.htmlbin.net/archiving.html Do any of you know more services? Obviously all of you use more advanced tools (warc, extensions) but for casual browsing, bookmarklets are excellent, so if any of you know about more services...? :D
[18:12] The page should be a little pretty, and should have a way to preview what's IN the bookmarklet.
[18:15] Igloo: https://twitter.com/emilybatty/status/952241942963851266
[18:15] holy
[18:16] assuming hoax, lots of people reporting it but i feel like there'd be some coverage
[18:17] Wow
[18:17] Pretty widespread
[18:22] https://twitter.com/NutzFordBucks/status/952243050675281922
[18:23] @SketchCow, I am just gathering online archive services, so if you know more, :) obviously all can be improved.
[18:24] That's fine
[18:24] But I'm telling you "drag this bookmarklet to your bar" is the new "click on this awesome desktop toy.exe"
[18:24] Document and make it easy to understand what these do
[18:34] cool! I'll keep it in mind if I ever publish for more people. Although if doing that I should remove peeep.us then. thanks anyway :)
[18:40] JAA: what's the referer needed to get f4v file
[18:40] *** Uzerus has joined #archiveteam-bs
[18:40] jacketcha: missile? where?
[18:43] BBC news dropping in with the *slowest* breaking news alert ever http://www.bbc.co.uk/news/world-us-canada-42677604
[18:46] godane: Something like https://charlierose.com/video/player/24740?autoplay=false (for the URL above) I think. I'm not sure how strictly they check.
[19:04] https://www.buzzfeed.com/mbvd/false-alarm-ballistic-missile-threat-hawaii
[19:22] Uzerus: Hawaii
[19:22] but, false alarm I guess
[19:28] godane: Apparently a referrer of https://charlierose.com/ is sufficient.
[19:43] tell me how to get this file: https://pfm1hycdn01-a.akamaihd.net/113/1HY113_007_lp.f4v
[19:43] i can't get it to download even with charlierose.com as referer
[19:45] *** Mateon1 has quit IRC (Read error: Operation timed out)
[19:46] *** Mateon1 has joined #archiveteam-bs
[19:47] godane: Hmm, yeah, neither can I. The server returns status 200 but an empty body.
[19:47] The ArchiveBot job got the same result: 2017-12-02 22:57:21,338 - wpull.processor.web - INFO - Fetched ‘https://pfm1hycdn01-a.akamaihd.net/113/1HY113_007_lp.f4v’: 200 OK. Length: 0 [video/x-flv].
[19:47] So I guess that file might be broken?
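
The referrer check and the "200 OK but empty body" case discussed above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the script anyone in the channel actually ran. The helper names `build_request`, `looks_broken`, and `fetch` are hypothetical; only the Referer value and the akamaihd.net URL pattern come from the log.

```python
import urllib.request

# Assumption: per the log, a Referer of https://charlierose.com/ is sufficient.
REFERER = "https://charlierose.com/"


def build_request(url: str, referer: str = REFERER) -> urllib.request.Request:
    """Build a GET request carrying the Referer header the CDN appears to check."""
    return urllib.request.Request(url, headers={"Referer": referer})


def looks_broken(status: int, body_length: int) -> bool:
    """Flag the case seen in the log: status 200 with a zero-length body."""
    return status == 200 and body_length == 0


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download url with the Referer set; raise if the server sends an empty 200.

    A missing or rejected file surfaces as urllib.error.HTTPError instead.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        body = resp.read()
        if looks_broken(resp.status, len(body)):
            raise ValueError(f"{url}: 200 OK but empty body")
        return body
```

Calling `fetch()` on the 1HY113_007_lp.f4v URL would, going by the log, raise the `ValueError` rather than silently write a zero-byte file, which makes broken episodes easy to list for a later retry.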
[19:48] that episode is the only lost one i can't get
[19:49] plus side is the 2 segments from that episode do exist
[20:01] Kaz: Igloo https://streamable.com/6fs0n
[20:01] what was broadcast to TV for the EAS Alert
[20:12] by the way, do any of you use peeep.us to bypass robots.txt files?
[20:14] Huh
[20:14] No, we just ignore them
[20:16] ah oki
[20:34] jrwr: holy cow that is hard to read
[21:11] *** REiN^ has joined #archiveteam-bs
[21:11] *** ranavalon has quit IRC (Quit: Leaving)
[21:52] *** Jusque has quit IRC (Quit: ZNC - http://znc.in)
[21:53] *** Jusque has joined #archiveteam-bs
[21:57] *** Jusque has quit IRC (Client Quit)
[21:58] *** Jusque has joined #archiveteam-bs
[23:38] *** odemg has quit IRC (Ping timeout: 260 seconds)
[23:42] *** mnjgno has quit IRC (Quit: Leaving)
[23:52] *** odemg has joined #archiveteam-bs