[00:10] *** Zerote has quit IRC (Ping timeout: 604 seconds)
[00:15] *** ATrescue has joined #archiveteam-bs
[00:55] *** ayanami_ has joined #archiveteam-bs
[00:55] https://vanillo.co Videosharing site Vanillo is shutting down in 4 days
[00:55] Yup
[00:55] Think we can run an ArchiveBot or download the videos?
[00:56] Oh, you've heard?
[00:56] Running already.
[00:56] Oh, nice!
[00:56] Is there any link to a tracker?
[00:56] http://dashboard.at.ninjawedding.org/
[00:56] Job 4txte5r78lzxz86g5hag0awb7
[00:57] Wow, Vanillo didn't even exist for long enough to get an official youtube-dl extractor
[00:57] lol
[00:57] Will video/image files be archived or just plain HTML?
[00:58] Videos and images are directly linked, so they get grabbed as well.
[00:58] seems like it's grabbing a lot of bitchute videos for some reason
[00:58] Yeah, outlinks from descriptions etc.
[00:59] I'd exclude those since Vanillo is almost yahoo'ing at this point
[00:59] ayanami_: Before we lose it: Unlisted video on Vanillo.co: If you enable the adblocker, it refers you to an unlisted video telling you to disable ads. (I cannot do it right now because my AddOns do not work due to the Firefox disaster).
[00:59] Should probably be fine. We still have a few days, and the site's pretty small.
[00:59] Okay
[01:00] Also, didn't an update come out for that disaster? 66.0.4?
[01:00] Also, I have a question about the BBM shutdown, if it hasn't been asked already: You think that everything archiveable can be saved w/ ArchiveBot? (News articles, documentation, official media)
[01:01] ATrescue: BTW, go to nixnet.xyz to get the addon fix
[01:01] BBM?
[01:02] Move the addon discussion to -ot please.
[01:02] Blackberry Messenger
[01:02] Ah, right.
[01:02] → #archiveteam-ot
[01:02] I talked about it some time ago
[01:03] (Not that it bothers me)
[01:04] Also, looking at the tracker, apparently Sketch is going to just be an ArchiveBot job?
[01:04] 300,000+ MB saved already
[01:04] No, #SketchedOut
[01:04] It's too big for AB.
[01:05] Oh, but I see it on the tracker, why is that?
[01:05] Is it just saving some other content?
[01:05] And AB also can't save everything due to a bug in wpull.
[01:05] *** enowaldo has quit IRC (Ping timeout: 265 seconds)
[01:06] It's just a safety net in case we don't get the other project running or something goes wrong or...
[01:06] What is Sketch?
[01:07] https://archiveteam.org/index.php?title=Sketch
[01:07] Ah ok, sounds like it'll be kinda painful to archive
[01:14] *** qw3rty114 has quit IRC (Ping timeout: 600 seconds)
[01:15] an image only Sketch is 2.2TB, @100Mbps -> 2days
[01:16] I think that would be 200 worker threads
[01:33] PurpleSym: The bot seems to swallow some edits...?
[01:35] it didn't show the one I made to Vanillo
[01:47] OK I'm curious, is archiveteam against archiving NSFW sites? Because there aren't articles on many of the major porn sites
[01:47] We've archived Eroshare. Does that answer your question?
[01:48] Yes
[01:48] :-)
[01:48] tumblr doesn't count?
[01:48] Yeah, and Tumblr.
[01:48] Eroshare was only porn though, while Tumblr also has a lot of other content obviously.
[01:48] I'm sad we missed HentaiHaven
[01:49] Eroshare was unique content and user uploaded. While there is some amateur content, most porn sites aren't going down, and the smaller ones are usually just mirrors of the bigger ones
[01:50] Ah, that makes sense. I'm guessing that's a similar reason to Tumblr, then?
[01:50] At least that is my understanding of it
[01:50] I mean if like a small porn site with completely user generated content was found to be shutting down we may grab it but it most likely wouldn't even make it to us
[01:52] *** ATrescue has quit IRC (Ping timeout: 260 seconds)
[01:55] Yeah, our focus is unique data. Commercial porn isn't exactly that.
[01:56] I'm not sure what we'd do if PornHub went down, for example.
[01:57] Save amateur section first and leave the rest to the horny fuckers at datahoarders
[01:59] Yeah, something like that.
[02:00] LMAO, yeah. I'm remembering a post from a few days ago "Datahoarders who have gfs or wives, how does your gf/wife feel about your porn collection?"
[02:00] guy had 20TB+
[02:00] I mean we would probably go after the community section, the user uploaded content, not the verified members
[02:04] That's cute. There's someone over there who had over a PB of cam recordings on Amazon Drive... Not sure if that collection made it anywhere else before the shutdown.
[02:04] Anyway, this is getting off-topicky.
[02:05] That's true. And I remember that guy, he got a decent amount of news coverage.
[02:18] *** ATrescue has joined #archiveteam-bs
[02:23] *** ATrescue has quit IRC (Quit: Page closed)
[02:25] *** ATrescue has joined #archiveteam-bs
[02:27] *** enowaldo has joined #archiveteam-bs
[02:41] *** enowaldo has quit IRC (Read error: Operation timed out)
[03:16] *** odemg has quit IRC (Ping timeout: 615 seconds)
[03:22] *** odemg has joined #archiveteam-bs
[03:44] *** ayanami_ has quit IRC (Quit: Leaving)
[03:45] *** godane has quit IRC (Read error: Operation timed out)
[04:39] *** marked1 has quit IRC (Read error: Operation timed out)
[05:26] *** kbtoo_ has joined #archiveteam-bs
[05:29] *** kbtoo has quit IRC (Ping timeout: 255 seconds)
[05:36] *** godane has joined #archiveteam-bs
[05:58] JAA: Hm, that is strange. All I did was add rcshow=!bot to the search query.
[06:16] *** kbtoo__ has joined #archiveteam-bs
[06:19] *** kbtoo_ has quit IRC (Ping timeout: 255 seconds)
[06:23] *** Zerote has joined #archiveteam-bs
[07:28] *** Zerote has quit IRC (Ping timeout: 604 seconds)
[07:35] *** Zerote has joined #archiveteam-bs
[07:50] *** HashbangI has quit IRC (Read error: Connection reset by peer)
[08:41] *** icedice has joined #archiveteam-bs
[08:41] SketchCow: i'm digitizing tapes again
[08:42] *** ATrescue has quit IRC (Ping timeout: 260 seconds)
[08:42] i think if i keep my pidgin to only a few chat rooms the swap zram doesn't get full
[08:42] at least that's the theory
[08:48] *** qw3rty114 has joined #archiveteam-bs
[09:01] *** icedice2 has joined #archiveteam-bs
[09:07] *** icedice has quit IRC (Read error: Operation timed out)
[09:08] *** icedice2 has quit IRC (Quit: Leaving)
[09:08] *** icedice has joined #archiveteam-bs
[09:12] *** ranma_ has quit IRC (Ping timeout: 255 seconds)
[09:12] *** enowaldo has joined #archiveteam-bs
[09:14] *** deevious has joined #archiveteam-bs
[09:15] *** ranma_ has joined #archiveteam-bs
[09:17] *** enowaldo has quit IRC (Ping timeout: 268 seconds)
[09:32] *** killsushi has quit IRC (Quit: Leaving)
[10:07] *** icedice has quit IRC (Quit: Leaving)
[10:11] *** BlueMax has quit IRC (Quit: Leaving)
[10:57] *** deevious has quit IRC (Ping timeout: 252 seconds)
[10:58] *** deevious has joined #archiveteam-bs
[11:40] lastest tapes : https://www.patreon.com/posts/digitize-tapes-26701843
[11:40] *latest
[11:56] *** HashbangI has joined #archiveteam-bs
[12:03] *** enowaldo has joined #archiveteam-bs
[12:16] *** marked has joined #archiveteam-bs
[12:18] *** enowaldo has quit IRC (Read error: Operation timed out)
[13:06] *** zhongfu_ has joined #archiveteam-bs
[13:11] *** zhongfu__ has joined #archiveteam-bs
[13:11] *** zhongfu has quit IRC (Ping timeout: 615 seconds)
[13:17] *** deevious has quit IRC (Ping timeout: 252 seconds)
[13:17] *** zhongfu_ has quit IRC (Ping timeout: 615 seconds)
[13:22] *** Verified_ has quit IRC (Ping timeout: 252 seconds)
[13:25] *** enowaldo has joined #archiveteam-bs
[13:25] *** deevious has joined #archiveteam-bs
[13:29] *** enowaldo has quit IRC (Ping timeout: 252 seconds)
[13:36] *** martinlig has joined #archiveteam-bs
[13:44] *** enowaldo has joined #archiveteam-bs
[13:50] godane: using pidgin for irc?
[13:54] my archive warrior has stopped processing any job, I just get a yellow webpage with no tasks visible. Is this expected?
[13:58] eythian urlteam tracker might be borked I can’t log in on my iPod will check tomorrow
[13:59] Main tracker says: Cannot GET /
[14:00] nevermind, i have it blackholed. sorry.
[14:03] Fusl: yep
[14:09] godane: my condolence
[14:10] marked: for fox sake, don't run a warrior if you're blocking certain domains
[14:10] or i'm gonna personally ban you across the entire project
[14:15] *** Zerote has quit IRC (Read error: Operation timed out)
[14:21] *** deevious has quit IRC (Ping timeout: 252 seconds)
[14:22] *** deevious has joined #archiveteam-bs
[14:28] Flashfire: well, it's not urlteam specifically. It's as though I had concurrent items set to 0.
[14:28] maybe I'll kick the running container, see if that does anything.
[14:29] that made a difference. I'd forgotten that it has --restart always, so if I tell it to shut down in the web UI it comes straight back. That's useful.
[14:32] *** enowaldo has quit IRC (Ping timeout: 265 seconds)
[14:36] SketchCow: i think you want to know about this item: https://archive.org/details/micomBASIC19841994
[14:37] there are 117 issues of Basic Magazine from japan
[14:37] from 1984-12 to 1994-12
[14:37] also this : https://archive.org/details/OhMZOhx19861989
[14:38] and this : https://archive.org/details/OhX1990-1995
[14:43] JAA: I don’t see which edits purplebot would have swallowed. It delays showing multiple edits for a single page though (by about 20 minutes), so it may seem like that.
[14:43] PurpleSym: Actually, it didn't silence anything, but it only reported them with a delay; so much delay in fact that the order of the edits reported in here changed. I noticed that before, and that delay is usually between 20 and 25 minutes. Any idea where that comes from?
[14:43] Damn, ninja'd.
[14:44] It’s ok :)
[14:44] Ah
[14:44] So it showed one edit for Vanillo.co, then delayed the further edits to that page, but showed the edit for Vanillo after the move immediately.
[14:45] Yes. The idea was to accumulate multiple edits to avoid flooding the channel.
[14:45] Yeah, right.
[14:53] Fusl: you would have to join -dev if you want that explanation
[14:55] *** icedice has joined #archiveteam-bs
[14:58] *** yano_ is now known as yano
[15:01] *** enowaldo has joined #archiveteam-bs
[15:28] *** ranma_ has quit IRC ()
[15:32] *** frainz has quit IRC (Ping timeout: 265 seconds)
[15:38] *** Zerote has joined #archiveteam-bs
[15:56] *** martinlig has quit IRC (Quit: Connection closed for inactivity)
[15:58] *** enowaldo has quit IRC (Read error: Operation timed out)
[16:08] *** enowaldo has joined #archiveteam-bs
[16:52] *** Zerote has quit IRC (Ping timeout: 600 seconds)
[17:05] *** wp494 has quit IRC (Ping timeout: 252 seconds)
[17:07] *** cfarquhar has quit IRC (Read error: Operation timed out)
[17:08] *** cfarquhar has joined #archiveteam-bs
[17:09] *** wp494 has joined #archiveteam-bs
[17:29] *** cfarquhar has quit IRC (Read error: Operation timed out)
[17:30] *** cfarquhar has joined #archiveteam-bs
[17:38] *** Pixi has quit IRC (Quit: Pixi)
[17:42] *** cfarquhar has quit IRC (Read error: Operation timed out)
[17:43] *** cfarquhar has joined #archiveteam-bs
[17:47] *** Dallas has quit IRC (Quit: The Lounge - https://thelounge.chat)
[17:48] *** Dallas has joined #archiveteam-bs
[18:09] *** VerifiedJ has joined #archiveteam-bs
[18:29] *** Joseph_ has joined #archiveteam-bs
[18:29] *** VerifiedJ has quit IRC (Read error: Connection reset by peer)
[18:30] *** enowaldo has quit IRC (Read error: Operation timed out)
[18:32] *** VerifiedJ has joined #archiveteam-bs
[18:33] *** Joseph_ has quit IRC (Read error: Connection reset by peer)
[18:53] *** Pixi has joined #archiveteam-bs
[18:53] *** enowaldo has joined #archiveteam-bs
[18:56] *** Zerote has joined #archiveteam-bs
[19:08] *** VerifiedJ has quit IRC (Ping timeout: 252 seconds)
[19:25] *** Terbium_ has quit IRC (Ping timeout: 268 seconds)
[19:30] https://github.com/mikf/gallery-dl
[19:50] *** PhrackD- has joined #archiveteam-bs
[19:51] *** PhrackD has quit IRC (Ping timeout: 600 seconds)
[19:51] *** PhrackD- is now known as PhrackD
[20:07] *** Medowar has joined #archiveteam-bs
[21:10] godane: Good eye. I'll get them somewhere.
[21:39] *** enowaldo has quit IRC (Read error: Operation timed out)
[22:04] Is anyone here interested in helping with election-related archival? Overview of my recent efforts: https://archiveteam.org/index.php?title=Elections
[22:08] *** enowaldo has joined #archiveteam-bs
[22:13] *** enowaldo has quit IRC (Ping timeout: 265 seconds)
[22:14] JAA: Yeah sure, I can probably help. What do you need?
[22:15] jodizzle: The key issue is usually finding all the campaign sites, candidates, parties, associated social media accounts, etc. In other words, creating those lists you see linked there.
[22:16] I'm going after today's referendum in Belize right now, and there are also general elections in South Africa today.
[22:16] The big one will be the European Parliament elections later this month.
[22:16] *** Medowar has quit IRC (Quit: Connection closed for inactivity)
[22:18] there's a LOT of candidates standing for election every day
[22:19] Yeah, if it's parliamentary elections, I only go after the parties normally.
[22:20] And campaigns on the party or party-like level.
[22:20] Because yeah, covering every candidate is impossible.
[22:20] Hmm alright, I'll start looking around.
[22:21] *** BartoCH has quit IRC (Ping timeout: 615 seconds)
[22:22] This reminds me that a while back I helped collect some data on the U.S. 2018 midterms. I managed to scrape some sites with well-structured data on candidates, even down to the local level, including home pages, social media accounts, etc. Didn't get around to archiving it all, though.
[22:22] I should wrap back around to that as well (though I imagine the U.S. has better coverage than a lot of the world).
[22:23] It definitely has from what I've seen in the past few months since I started doing this.
[22:23] You can usually find a list of candidates somewhere on an official government site, but that's about it.
[22:24] Which doesn't help at all with referendums.
[22:24] I base a lot of my discovery on Wikipedia, either English or the native language.
[22:25] *** BartoCH has joined #archiveteam-bs
[22:28] JAA: How granular should the elections pages be? Like would this election get its own tracking page?: https://en.wikipedia.org/wiki/2019_Tshwane_mayoral_election
[22:30] jodizzle: I've been going after national-level elections only so far, mainly because regional and local elections are even harder to research. In this case, I'd include it in a "2019 South African general election" page as a subsection I think.
[22:31] Okay, sounds reasonable
[22:31] Ok, one exception, I went after the mayoral election in Ankara since that's pretty influential on Turkish politics.
[22:32] *** Zerote has quit IRC (Read error: Connection reset by peer)
[22:42] Yeah, I mean all this stuff is pretty particular so I would imagine there are exceptions. But I think I get the basic idea.
[22:44] Oh wait, that election in Tshwane was something separate. I didn't look at it and assumed it also happened today. Hmm.
[22:45] It was also an indirect election with only one candidate, so probably there aren't too many things related to it anyway.
[22:48] *** enowaldo has joined #archiveteam-bs
[22:48] Yeah I guess that one's a little weird to place (but also probably easy to grab). But this one, for instance, would fall under general elections? https://en.wikipedia.org/wiki/2019_Western_Cape_provincial_election
[22:49] Since my understanding is that the elections in the South African general election happen on a per-province basis
[22:49] Yeah
[22:50] Might not technically be absolutely correct, but close enough.
[22:56] *** enowaldo has quit IRC (Ping timeout: 252 seconds)
[23:04] *** enowaldo has joined #archiveteam-bs
[23:20] *** enowaldo has quit IRC (Read error: Operation timed out)
[23:23] *** enowaldo has joined #archiveteam-bs
[23:24] *** Flux^^ has joined #archiveteam-bs
[23:28] *** enowaldo has quit IRC (Ping timeout: 265 seconds)
[23:31] *** cfarquhar has quit IRC (Read error: Operation timed out)
[23:32] ^ The bot is finally documented.
[23:32] VoynichCr: ^ FYI
[23:38] *** cfarquhar has joined #archiveteam-bs
[23:39] *** enowaldo has joined #archiveteam-bs
[23:39] *** wyatt8740 has quit IRC (Ping timeout: 246 seconds)
[23:41] *** BlueMax has joined #archiveteam-bs
[23:55] *** Flashfire has quit IRC (Excess Flood)
[23:55] *** Flashfire has joined #archiveteam-bs