[00:14] *** BlueMax has joined #archiveteam-bs
[00:21] *** Asparag-1 has joined #archiveteam-bs
[00:22] *** Asparagir has quit IRC (Read error: Operation timed out)
[00:41] *** Asparag-1 has quit IRC (Asparag-1)
[01:44] *** RichardG_ has joined #archiveteam-bs
[01:44] *** ld1 has quit IRC (Quit: ld1)
[01:44] *** RichardG has quit IRC (Ping timeout: 250 seconds)
[01:50] *** ld1 has joined #archiveteam-bs
[01:55] *** vitzli has joined #archiveteam-bs
[02:24] riking: apparently using gateways may also require team admin approval and unlike apps it's off by default
[02:31] *** vitzli has quit IRC (Leaving)
[02:37] you guys have any tools for backing up youtube channels?
[02:37] *** Stilett0- is now known as Stiletto
[02:37] that's better
[02:42] Backup locally?
[02:50] *** RichardG_ has quit IRC (Read error: Connection reset by peer)
[02:51] *** RichardG has joined #archiveteam-bs
[02:52] yeah
[02:57] https://github.com/rg3/youtube-dl
[03:10] bithippo: Would a script that's like tubeup except it waits until a video goes down before uploading it be possible?
[03:11] :thinking:
[03:11] So you'd download the video and all metadata locally, and only upon further download attempts in the future (that failed due to the video being deleted) would it be uploaded to IA?
[03:17] bithippo: Well, not necessarily try to download it again, but like, check if the video has been taken down from youtube, and then if it has been taken down upload it.
[03:20] *** Ravenloft has quit IRC (Read error: Connection reset by peer)
[03:26] *** godane has quit IRC (Ping timeout: 250 seconds)
[03:29] @hook54321 Sorry, I should've been more specific.
[03:29] I'd have to see what sort of API call you'd make back to Youtube to detect video availability.
[03:30] https://stackoverflow.com/a/32503070
[03:31] Without the Youtube API, looks like the way to do it is attempt to GET the video thumbnail, which will 404 if the video is no longer available.
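A minimal Python sketch of the thumbnail check described at [03:31]. The thumbnail URL pattern and the helper names are assumptions for illustration and are not taken from the log. The status fetcher is passed in as a parameter so the decision logic can be exercised without network access.

```python
import urllib.error
import urllib.request

# Assumed thumbnail URL pattern; the log does not state it.
THUMBNAIL_URL = "https://img.youtube.com/vi/{video_id}/0.jpg"


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; only the code matters here.
        return err.code


def video_looks_removed(video_id, fetch_status=http_status):
    """Return True if the thumbnail request answers 404 (hypothetical helper)."""
    url = THUMBNAIL_URL.format(video_id=video_id)
    return fetch_status(url) == 404
```

A tubeup-like script could poll `video_looks_removed` for each locally saved video and upload the local copy only once it returns True. Network failures (`urllib.error.URLError`) are deliberately not caught, so an unreachable host is not mistaken for a removed video.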
[03:31] I didn't know youtube-dl could do a whole channel at once
[03:31] youtube-dl is powerful
[03:32] that powerful?
[03:33] Can grab entire channels, playlists, and supports a plethora of video sites.
[03:33] Rarely do I run into a piece of content I can't extract with it.
[03:34] Can also dump channel metadata out as JSON.
[03:34] I'd argue it's becoming a first class tool similar to wget and curl.
[03:35] (honorable mention: httpie)
[03:43] *** godane has joined #archiveteam-bs
[04:00] *** Mayonaise has quit IRC (Read error: Connection reset by peer)
[04:01] *** godane has quit IRC (Quit: Leaving.)
[04:09] *** qw3rty116 has joined #archiveteam-bs
[04:11] *** Mayonaise has joined #archiveteam-bs
[04:13] *** qw3rty115 has quit IRC (Read error: Operation timed out)
[04:35] *** Mateon1 has quit IRC (Read error: Operation timed out)
[04:35] *** Mateon1 has joined #archiveteam-bs
[05:07] *** godane has joined #archiveteam-bs
[05:57] *** godane has quit IRC (Ping timeout: 633 seconds)
[06:02] *** odemg has quit IRC (Read error: Operation timed out)
[06:06] *** fie has quit IRC (Ping timeout: 600 seconds)
[06:12] *** odemg has joined #archiveteam-bs
[06:55] *** bithippo has quit IRC (My MacBook Air has gone to sleep. ZZZzzz…)
[07:18] *** Aoede has quit IRC (Ping timeout: 250 seconds)
[07:18] *** Aoede has joined #archiveteam-bs
[07:18] *** Rai-chan has quit IRC (Ping timeout: 250 seconds)
[08:33] *** schbirid has joined #archiveteam-bs
[09:16] *** Stiletto has quit IRC (Ping timeout: 250 seconds)
[09:53] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[10:37] *** Rai-chan has joined #archiveteam-bs
[12:06] *** rsznick has joined #archiveteam-bs
[12:09] *** rsznikk has joined #archiveteam-bs
[12:12] *** rsznik has quit IRC (Read error: Operation timed out)
[12:13] *** rsznick has quit IRC (Read error: Operation timed out)
[13:49] PurpleSym: FYI, I finally got around to taking a look at the Instagram script. Doesn't work anymore, unfortunately.
[13:49] Meh, too bad :(
[13:50] I found a very easy method to do the scraping though. Just request the profile page with an __a=1 parameter, gives you a JSON. To get later pages, you can use the max_id parameter set to the last post that was retrieved already.
[13:51] E.g. https://www.instagram.com/elonmusk/?__a=1 -> https://www.instagram.com/elonmusk/?__a=1&max_id=1709068240325503498
[13:56] I'll implement this now, rather than trying to fiddle around with GraphQL. There's an annoying query_hash parameter in those requests, and I didn't see what it's supposed to be. Obfuscated JS hell...
[14:33] for mediawiki backups, I'm going to refresh the dumps I did a year ago. should I update the existing archive.org entries with the new dumps, or create new ones?
[14:35] *** jtn2 has quit IRC (Read error: Operation timed out)
[14:38] *** jtn2 has joined #archiveteam-bs
[14:40] Does anyone know offhand the model of CD drive that Jason uses to mass-rip a stack of CDs? I've tried searching his twitter and ascii.textfiles.com but I can't seem to find it.
[14:43] looks like it's a http://www.acronova.com/product/auto-blu-ray-duplicator-publisher-ripper-nimbie-usb-nb21/9/review.html
[14:48] Thanks a bunch.
[15:11] *** Igloo has quit IRC (Remote host closed the connection)
[15:15] *** odemg has quit IRC (Read error: Operation timed out)
[15:52] *** bithippo has joined #archiveteam-bs
[15:54] *** bithippo has quit IRC (Client Quit)
[15:56] *** bithippo has joined #archiveteam-bs
[16:02] *** chfoo has quit IRC (LoveChatot)
[16:02] *** svchfoo1 has quit IRC (Remote host closed the connection)
[16:11] *** chfoo has joined #archiveteam-bs
[16:11] Great... Last December, Facebook "accidentally" broke the Graph API such that you can no longer discover all posts through it. At least that's what I gather from https://github.com/minimaxir/facebook-page-post-scraper (can't read the bug report itself since it requires a login).
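The Instagram `__a=1` / `max_id` pagination described at [13:50] can be sketched as plain URL construction in Python. Only the two query parameters and the example URLs come from the log. The function name is hypothetical, and the layout of the JSON response is not given in the log, so pulling the last post ID out of each page is left to the caller.

```python
from urllib.parse import urlencode

BASE = "https://www.instagram.com"


def profile_page_url(username, last_post_id=None):
    """Build the JSON URL for one page of a profile (hypothetical helper).

    Without last_post_id this is the first page. Otherwise max_id is set
    to the last post already retrieved, as described in the log.
    """
    params = {"__a": 1}
    if last_post_id is not None:
        params["max_id"] = last_post_id
    return f"{BASE}/{username}/?{urlencode(params)}"
```

Calling `profile_page_url("elonmusk")` and then `profile_page_url("elonmusk", 1709068240325503498)` reproduces the two example URLs from [13:51].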
[16:53] seems to be a recurring theme in social media APIs
[16:53] oops we changed something and now there's no way to get complete data
[16:54] please call the complaints department at 1800-dev-null
[16:58] *** Jonimus has quit IRC (WeeChat 1.4)
[17:03] That, and also "Sorry, you'll have to go through the totally separate and unrelated company X now, which will happily sell you the data you're after."
[17:06] Interestingly, there's no such thing for Reddit yet as far as I know.
[17:07] Even though they crippled the search UI last year and are in the process of removing timestamp-based searches through the API (which was the only way to get around the 1000 threads limit for a while now)...
[17:21] The walled garden is having existential anxiety.
[17:42] *** RichardG has quit IRC (Read error: Connection reset by peer)
[17:44] *** RichardG has joined #archiveteam-bs
[17:56] *** godane has joined #archiveteam-bs
[18:19] *** BnARobin_ has quit IRC (Read error: Operation timed out)
[18:32] *** BnARobin has joined #archiveteam-bs
[18:34] *** odemg has joined #archiveteam-bs
[18:38] *** BnARobin has quit IRC (Remote host closed the connection)
[18:38] *** BnARobin has joined #archiveteam-bs
[18:44] *** BnARobin has quit IRC (Remote host closed the connection)
[18:44] *** BnARobin has joined #archiveteam-bs
[18:52] *** BnARobin has quit IRC (Remote host closed the connection)
[18:52] *** BnARobin has joined #archiveteam-bs
[18:58] *** kisspunch has quit IRC (Quit: ZNC - http://znc.in)
[19:02] *** BnARobin has quit IRC (Remote host closed the connection)
[19:03] *** BnARobin has joined #archiveteam-bs
[19:05] *** kisspunch has joined #archiveteam-bs
[19:09] *** BnARobin has quit IRC (Remote host closed the connection)
[19:09] *** BnARobin has joined #archiveteam-bs
[19:18] *** BnARobin has quit IRC (Remote host closed the connection)
[19:18] *** BnARobin has joined #archiveteam-bs
[19:24] *** BnARobin has quit IRC (Remote host closed the connection)
[19:24] *** BnARobin has joined #archiveteam-bs
[19:28] *** jschwart has joined #archiveteam-bs
[19:31] so i lost power last night for maybe 3 hours
[19:32] i had to go out in the snow storm to help my brother dig out the generator
[19:39] anyways i'm at 20k items now
[19:40] i'm only 8 days into march and i have about 1/3 of feb
[19:40] i was at 60k items in 2018-02
[19:41] i just need another 40k items to make number 2 in my grab collections
[19:42] Does archive.org have anything on 'http://files.filefront.com/godlike_132rar/;4965816;;/fileinfo.html'? Maybe I'm searching wrong.
[19:46] *** BnARobin has quit IRC (Remote host closed the connection)
[19:47] *** BnARobin has joined #archiveteam-bs
[19:52] *** BnARobin has quit IRC (Remote host closed the connection)
[19:53] *** BnARobin has joined #archiveteam-bs
[19:58] *** BnARobin has quit IRC (Remote host closed the connection)
[19:59] *** BnARobin has joined #archiveteam-bs
[20:04] *** BnARobin has quit IRC (Remote host closed the connection)
[20:04] *** BnARobin has joined #archiveteam-bs
[20:11] *** BnARobin has quit IRC (Remote host closed the connection)
[20:11] *** BnARobin has joined #archiveteam-bs
[20:28] *** BnARobin has quit IRC (Remote host closed the connection)
[20:29] *** BnARobin has joined #archiveteam-bs
[20:36] *** icedice has joined #archiveteam-bs
[20:53] *** BnARobin has quit IRC (Remote host closed the connection)
[20:53] *** BnARobin has joined #archiveteam-bs
[20:59] *** BnARobin has quit IRC (Remote host closed the connection)
[20:59] *** BnARobin has joined #archiveteam-bs
[21:05] *** BnARobin has quit IRC (Remote host closed the connection)
[21:05] *** BnARobin has joined #archiveteam-bs
[21:11] *** BnARobin has quit IRC (Remote host closed the connection)
[21:11] *** BnARobin has joined #archiveteam-bs
[21:20] *** BnARobin has quit IRC (Remote host closed the connection)
[21:20] *** BnARobin has joined #archiveteam-bs
[21:23] *** JAA sets mode: +b *!*BnARobin@*.bnaboyz.nl
[21:23] *** BnARobin was kicked by JAA (Please fix your connection.)
[22:02] *** schbirid has quit IRC (Quit: Leaving)
[22:57] *** jschwart has quit IRC (Quit: Konversation terminated!)
[23:07] *** Igloo has joined #archiveteam-bs
[23:07] *** Igloo has quit IRC (Client Quit)
[23:29] *** RichardG_ has joined #archiveteam-bs
[23:29] *** RichardG has quit IRC (Read error: Connection reset by peer)