[00:17] *** killsushi has joined #archiveteam-bs [02:18] *** BlueMax has joined #archiveteam-bs [02:47] *** DogsRNice has quit IRC (Read error: Connection reset by peer) [03:29] *** qw3rty2 has joined #archiveteam-bs [03:35] *** odemgi has joined #archiveteam-bs [03:35] *** qw3rty has quit IRC (Ping timeout: 745 seconds) [03:38] *** odemgi_ has quit IRC (Ping timeout: 252 seconds) [04:20] *** wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES) [04:25] *** wp494 has joined #archiveteam-bs [05:16] *** zhongfu has quit IRC (Ping timeout: 745 seconds) [05:45] *** VADemon has joined #archiveteam-bs [05:46] *** VADemon has quit IRC (Client Quit) [05:46] *** VADemon_ has quit IRC (Ping timeout: 255 seconds) [05:54] *** deevious has joined #archiveteam-bs [06:07] *** VADemon has joined #archiveteam-bs [06:20] *** ShellyRol has quit IRC (Read error: Connection reset by peer) [06:21] *** ShellyRol has joined #archiveteam-bs [06:30] *** ShellyRol has quit IRC (Read error: Connection reset by peer) [06:31] *** ShellyRol has joined #archiveteam-bs [06:31] *** killsushi has quit IRC (Quit: Leaving) [06:45] *** ShellyRol has quit IRC (Remote host closed the connection) [06:45] *sigh* Google killed off another one of its products [06:45] *** ShellyRol has joined #archiveteam-bs [06:47] *** ShellyRol has quit IRC (Remote host closed the connection) [06:49] *** ShellyRol has joined #archiveteam-bs [06:50] *** ShellyRol has quit IRC (Remote host closed the connection) [06:51] *** ShellyRol has joined #archiveteam-bs [07:11] *** ShellyRol has quit IRC (Read error: Operation timed out) [07:15] *** ShellyRol has joined #archiveteam-bs [07:27] *** ShellyRol has quit IRC (Read error: Operation timed out) [07:28] *** ShellyRol has joined #archiveteam-bs [07:41] *** ShellyRol has quit IRC (Ping timeout: 745 seconds) [07:51] *** ShellyRol has joined #archiveteam-bs [08:04] *** ShellyRol has quit IRC (Ping timeout: 745 seconds) [08:05] *** ShellyRol has joined #archiveteam-bs [08:05] 
*** ShellyRol has quit IRC (Read error: Connection reset by peer) [08:07] *** ShellyRol has joined #archiveteam-bs [08:08] *** ShellyRol has quit IRC (Read error: Connection reset by peer) [08:09] *** ShellyRol has joined #archiveteam-bs [08:09] *** ShellyRol has quit IRC (Remote host closed the connection) [08:11] *** ShellyRol has joined #archiveteam-bs [08:11] *** JAA sets mode: +b *!*ShellyRol@*.hsd1.wa.comcast.net [08:11] *** ShellyRol was kicked by JAA (Fix your connection please.) [09:11] *** JAA sets mode: -b *!*ShellyRol@*.hsd1.wa.comcast.net [09:51] *** zhongfu has joined #archiveteam-bs [10:52] *** BlueMax has quit IRC (Quit: Leaving) [11:36] Idk about discovery, but does look like there's web content: https://posts.google.com/bulletin/share/huCFGUte/vYjhgm [11:38] How about #bullet-in [13:24] I might have found a way to enumerate all Disqus forums. [13:27] *** bluefoo has quit IRC (Ping timeout: 258 seconds) [13:28] *** wp494 has quit IRC (Read error: Operation timed out) [13:32] *** wp494 has joined #archiveteam-bs [14:21] https://www.google.com/search?q=site%3Ahttps%3A%2F%2Fposts.google.com%2Fbulletin%2Fshare%2F [14:50] godane: Looks good. 
[14:56] *** deevious has quit IRC (Quit: deevious) [15:16] *** DogsRNice has joined #archiveteam-bs [15:49] *** bluefoo has joined #archiveteam-bs [16:55] *** schbirid has joined #archiveteam-bs [17:01] *** RichardG has quit IRC (Read error: Operation timed out) [17:06] *** ShellyRol has joined #archiveteam-bs [17:08] *** RichardG has joined #archiveteam-bs [17:19] *** RichardG has quit IRC (Ping timeout: 612 seconds) [17:24] *** ShellyRol has quit IRC (Ping timeout: 745 seconds) [17:25] *** ShellyRol has joined #archiveteam-bs [17:26] *** ShellyRol has quit IRC (Read error: Connection reset by peer) [17:30] *** ShellyRol has joined #archiveteam-bs [17:56] *** ShellyRol has quit IRC (Ping timeout: 745 seconds) [17:57] *** ShellyRol has joined #archiveteam-bs [17:59] *** ShellyRol has quit IRC (Write error: Broken pipe) [17:59] *** ShellyRol has joined #archiveteam-bs [18:00] *** ShellyRol has quit IRC (Remote host closed the connection) [18:02] *** RichardG has joined #archiveteam-bs [18:02] https://github.com/yarrick/pingfs [18:02] hrm, I wonder how much data you can float using this [18:02] *** ShellyRol has joined #archiveteam-bs [18:02] *** ShellyRol has quit IRC (Remote host closed the connection) [18:03] *** ShellyRol has joined #archiveteam-bs [18:10] https://blog.benjojo.co.uk/post/dns-filesystem-true-cloud-storage-dnsfs [18:52] *** ShellyRol has quit IRC (Read error: Operation timed out) [18:53] *** ShellyRol has joined #archiveteam-bs [18:54] *** ShellyRol has quit IRC (Write error: Broken pipe) [18:55] *** bluefoo has quit IRC (Ping timeout: 745 seconds) [18:57] *** ShellyRol has joined #archiveteam-bs [19:01] *** ShellyRol has quit IRC (Remote host closed the connection) [19:03] *** schbirid has quit IRC (Read error: Operation timed out) [19:04] *** bluefoo has joined #archiveteam-bs [19:04] *** schbirid has joined #archiveteam-bs [19:04] *** ShellyRol has joined #archiveteam-bs [19:05] *** ShellyRol has quit IRC (Read error: Connection reset by peer) 
[19:07] *** ShellyRol has joined #archiveteam-bs [19:09] *** ShellyRol has quit IRC (Remote host closed the connection) [19:13] *** ShellyRol has joined #archiveteam-bs [19:13] *** ShellyRol has quit IRC (Read error: Connection reset by peer) [19:21] *** icedice has joined #archiveteam-bs [19:21] *** icedice has quit IRC (Read error: Connection reset by peer) [19:21] *** icedice has joined #archiveteam-bs [19:29] *** icedice has quit IRC (Quit: icedice) [19:30] *** ShellyRol has joined #archiveteam-bs [19:31] *** ShellyRol has quit IRC (Read error: Connection reset by peer) [19:34] *** ShellyRol has joined #archiveteam-bs [19:34] *** JAA sets mode: +b *!*ShellyRol@*.hsd1.wa.comcast.net [19:34] *** ShellyRol was kicked by JAA (Please fix your connection. (Ban expires in 24 hours.)) [19:39] *** RichardG_ has joined #archiveteam-bs [19:39] *** RichardG has quit IRC (Ping timeout: 258 seconds) [20:01] *** systwi has joined #archiveteam-bs [20:06] *** systwi_ has quit IRC (Read error: Operation timed out) [20:40] *** schbirid has quit IRC (Remote host closed the connection) [20:48] archivebot can crawl an entire Twitter user, right? or at least, more than just the page view that the Wayback Machine will snapshot? can we get https://twitter.com/johngraycpa crawled, for political reasons (i.e., saving Tweets that he's made) [20:53] paul2520 I just grabbed that Twitter feed for you [20:53] *** HashbangI has quit IRC (Remote host closed the connection) [20:54] thank you katocala! [20:56] paul2520: ArchiveBot on its own can't, but there's a bot for that! :-) [20:56] yay :-) [20:57] can you explain the logistics, katocala / JAA? or is there a page/documentation somewhere? the URL https://transfer.notkiska.pw/s6SWt/twitter-@johngraycpa has ~4k lines, less than what Twitter says he's tweeted, and I see not all of them are Tweets. [20:57] is that list the max we can get via the bot? 
[20:58] I don't think there's any documentation on socialbot, but it's basically an IRC interface to snscrape (which is also undocumented). [20:58] *** bluefoo has quit IRC (Ping timeout: 745 seconds) [20:59] snscrape can't get retweets, which might explain (part of) the difference. [20:59] ah, got it. I see the GitHub repo now. thanks :-) [21:00] The list does seem to go back to his first tweet, so yeah, I guess ~9k of those ~15k are retweets. [21:00] that sounds plausible. [21:00] RTs generally aren't endorsements... but can be good to track [21:00] is that an issue with how Twitter works? [21:03] Yep. Profile pages only return the 3200 most recent tweets (including retweets). snscrape uses the search to get past that limit, but the search doesn't return retweets. [21:05] ah, okay [21:11] I'll play with the scraper later... but I'm curious, does it return the full HTML? or just the tweets JSON [21:12] *** HashbangI has joined #archiveteam-bs [21:14] Neither. At the core level, it parses the HTML and returns (Python) objects representing the posts. But unless you like to get your hands dirty, you'll never see those. Through the CLI, you get just the URL to each tweet by default, though that's customisable through the (largely undocumented) --format option. [21:15] Oh! I should have known -- then that list is passed to archivebot. [21:15] thanks for clarifying. [21:19] any live projects atm? :/ [21:38] *** systwi_ has joined #archiveteam-bs [21:46] *** systwi has quit IRC (Ping timeout: 612 seconds) [22:08] *** underscor has quit IRC (Quit: No Ping reply in 180 seconds.) [22:09] *** underscor has joined #archiveteam-bs [22:09] *** slyphic has quit IRC (Read error: Operation timed out) [22:10] *** slyphic has joined #archiveteam-bs [22:22] There are deadlines for sony sketch, national geographic yourshot, drawr.net, and google bulletin [22:24] does socialbot keep track of its last run on an account? 
[22:24] Nope [22:55] *** BlueMax has joined #archiveteam-bs [23:00] *** DigiDigi has quit IRC (Remote host closed the connection) [23:22] *** Raccoon has quit IRC (Ping timeout: 252 seconds) [23:23] *** DigiDigi has joined #archiveteam-bs [23:32] *** Raccoon has joined #archiveteam-bs
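
Note on the [21:03] explanation of the retweet gap: the toy model below is only a sketch of why combining a profile crawl with a search crawl still misses older retweets. It does not call Twitter or snscrape. The 3200 limit and the rough ~9k of ~15k retweet share come from the discussion above; the names `Post`, `make_account`, `profile_view`, `search_view` and `coverage`, and the even spread of retweets over time, are assumptions made purely for illustration.

```python
from dataclasses import dataclass

PROFILE_LIMIT = 3200  # figure quoted in the log for profile pages


@dataclass(frozen=True)
class Post:
    post_id: int      # larger id = more recent (assumption of this model)
    is_retweet: bool


def make_account(total=15_000):
    """Hypothetical account where 3 of every 5 posts are retweets."""
    return [Post(i, i % 5 < 3) for i in range(1, total + 1)]


def profile_view(posts):
    """Model of the profile page: newest posts only, retweets included."""
    newest_first = sorted(posts, key=lambda p: p.post_id, reverse=True)
    return newest_first[:PROFILE_LIMIT]


def search_view(posts):
    """Model of the search: no depth limit, but retweets are never returned."""
    return [p for p in posts if not p.is_retweet]


def coverage(posts):
    """Count posts reachable via either view, and posts reachable via neither."""
    seen = set(profile_view(posts)) | set(search_view(posts))
    missed = [p for p in posts if p not in seen]
    return len(seen), len(missed)


posts = make_account()
reachable, missed = coverage(posts)
print(f"reachable: {reachable}, missed: {missed}")
```

In this model every missed post is a retweet older than the profile window, which matches the explanation given in the channel; the exact counts depend entirely on the made-up retweet distribution.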