[00:19] *** jamiew has joined #archiveteam-bs
[00:34] *** Datechnom has quit IRC (Remote host closed the connection)
[01:32] *** DogsRNice has quit IRC (Ping timeout: 276 seconds)
[01:32] *** DogsRNice has joined #archiveteam-bs
[02:30] *** HP_Archiv has quit IRC (Ping timeout: 610 seconds)
[02:40] *** SoraUta has joined #archiveteam-bs
[02:48] *** icedice has quit IRC (Quit: Leaving)
[03:05] *** cerca has quit IRC (Remote host closed the connection)
[03:43] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[03:50] *** ShellyRol has quit IRC (Read error: Connection reset by peer)
[03:51] *** ShellyRol has joined #archiveteam-bs
[03:51] *** LowLevelM has quit IRC (Quit: Ping timeout (120 seconds))
[03:51] *** LowLevelM has joined #archiveteam-bs
[04:09] *** odemgi_ has joined #archiveteam-bs
[04:12] *** cppchrisc has quit IRC (Ping timeout: 496 seconds)
[04:13] *** cppchrisc has joined #archiveteam-bs
[04:15] *** odemgi has quit IRC (Read error: Operation timed out)
[04:47] *** qw3rty2 has joined #archiveteam-bs
[04:56] *** qw3rty has quit IRC (Ping timeout: 745 seconds)
[05:28] *** nicolas17 has quit IRC (Read error: Operation timed out)
[05:42] *** me has quit IRC (Read error: Operation timed out)
[05:43] *** Pixi has quit IRC (Read error: Operation timed out)
[05:43] *** Selavi has quit IRC (Write error: Broken pipe)
[05:43] *** Selavii has joined #archiveteam-bs
[05:44] *** Selavii is now known as Selavi
[05:45] *** fuzzy802 has joined #archiveteam-bs
[05:45] *** Pixi has joined #archiveteam-bs
[05:46] *** superkuh has quit IRC (Excess Flood)
[05:46] *** nyany has quit IRC (Read error: Operation timed out)
[05:46] *** twigfoot has quit IRC (Read error: Operation timed out)
[05:46] *** tomaspark has quit IRC (Ping timeout: 360 seconds)
[05:46] *** underscor has quit IRC (Ping timeout: 360 seconds)
[05:47] *** twigfoot has joined #archiveteam-bs
[05:47] *** underscor has joined #archiveteam-bs
[05:47] *** voltagex has quit IRC (Ping timeout: 262 seconds)
[05:48] *** cf has quit IRC (Read error: Operation timed out)
[05:48] *** fuzzy8021 has quit IRC (Read error: Operation timed out)
[05:48] *** superkuh has joined #archiveteam-bs
[05:49] *** VADemon_ has joined #archiveteam-bs
[05:49] *** arkiver has quit IRC (Ping timeout: 360 seconds)
[05:49] *** swebb has quit IRC (Read error: Operation timed out)
[05:49] *** swebb has joined #archiveteam-bs
[05:50] *** fuzzy802 has quit IRC ()
[05:50] *** fuzzy8021 has joined #archiveteam-bs
[05:51] *** arkiver has joined #archiveteam-bs
[05:51] *** svchfoo3 sets mode: +o arkiver
[05:51] *** svchfoo1 sets mode: +o arkiver
[05:51] *** unlobito has quit IRC (Ping timeout: 392 seconds)
[05:51] *** ShellyRol has quit IRC (Read error: Operation timed out)
[05:52] *** chfoo has quit IRC (Ping timeout: 360 seconds)
[05:53] *** ShellyRol has joined #archiveteam-bs
[05:53] *** tomaspark has joined #archiveteam-bs
[05:54] *** Igloo has quit IRC (Read error: Connection reset by peer)
[05:55] *** unlobito has joined #archiveteam-bs
[05:55] *** nyany has joined #archiveteam-bs
[05:55] *** voltagex has joined #archiveteam-bs
[05:55] *** me has joined #archiveteam-bs
[05:56] *** Igloo has joined #archiveteam-bs
[05:56] *** chfoo has joined #archiveteam-bs
[05:56] *** svchfoo3 sets mode: +o me
[05:56] *** svchfoo1 sets mode: +o Igloo
[05:57] *** svchfoo1 sets mode: +o chfoo
[05:57] *** svchfoo3 sets mode: +o chfoo
[05:59] *** VADemon has quit IRC (Read error: Operation timed out)
[06:00] *** Datechnom has joined #archiveteam-bs
[06:01] *** jamiew has quit IRC (Textual IRC Client: www.textualapp.com)
[06:05] *** tomaspark has quit IRC (Ping timeout: 255 seconds)
[06:10] *** cf has joined #archiveteam-bs
[06:12] *** jamiew has joined #archiveteam-bs
[06:16] *** jamiew has quit IRC (Read error: Operation timed out)
[06:17] *** bluefoo has joined #archiveteam-bs
[06:17] *** jamiew has joined #archiveteam-bs
[07:19] *** i0npulse has quit IRC (Ping timeout: 276 seconds)
[07:19] *** i0npulse has joined #archiveteam-bs
[07:43] *** ShellyRol has quit IRC (Ping timeout: 610 seconds)
[07:54] *** ShellyRol has joined #archiveteam-bs
[08:27] *** schbirid has joined #archiveteam-bs
[08:33] *** LowLevelM has quit IRC (Read error: Operation timed out)
[08:36] *** wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
[08:38] *** LowLevelM has joined #archiveteam-bs
[08:44] *** wp494 has joined #archiveteam-bs
[08:50] *** killsushi has quit IRC (Leaving)
[08:56] *** BlueMaxim has joined #archiveteam-bs
[09:09] *** BlueMax has quit IRC (Ping timeout: 745 seconds)
[09:44] *** Atom-- has joined #archiveteam-bs
[09:50] *** Atom__ has quit IRC (Read error: Operation timed out)
[10:45] *** tomaspark has joined #archiveteam-bs
[11:15] *** BlueMax has joined #archiveteam-bs
[11:24] *** BlueMax has quit IRC (Ping timeout: 276 seconds)
[11:25] *** BlueMax has joined #archiveteam-bs
[11:26] *** BlueMaxim has quit IRC (Ping timeout: 745 seconds)
[11:27] *** DigiDigi has quit IRC (Remote host closed the connection)
[11:32] *** cerca has joined #archiveteam-bs
[11:55] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[11:55] *** BlueMax has joined #archiveteam-bs
[12:25] *** HP_Archiv has joined #archiveteam-bs
[12:25] *** Nick-PC has joined #archiveteam-bs
[12:27] I have an OPs site request - https://www.bricklink.com/v2/main.page BrickLink was just acquired last month by Lego. It's an invaluable resource for all sorts of Lego kits/parts.
[12:28] I belong to a Lego Discord Server and can confirm that there is a risk that Lego might revise it or take it offline at some point. Can one of the OPs submit Bricklink.com into archivebot, please?
[12:28] HP_Archiv: Do you have an estimate for how large the site is?
[12:30] I still don't know how to estimate size of sites, but here's the site map index: https://www.bricklink.com/siteMap.asp
[12:30] I would imagine it's fairly sizable
[12:31] But not too big for archivebot?
[12:31] I honestly do not know - how would I determine this?
[12:31] I could run a scrape of the site
[12:31] If you could that would be great
[12:32] Ok, let me set that up.
[12:32] Thank you LowLevelM
[12:33] From the Discord server, "I cannot think of anything else, except...
[12:33] People are saying that due to the fact that LEGO bought bricklink, bricklink is going to be shut down in max 2 years cause LEGO will try to force people to buy sets from their website or something like that
[12:33] So perhaps developing a new marketplace and releasing it then would be a solution?"
[12:33] I was looking at the old Lego Mindstorms Invention System kits from the late 90's and happened upon a Discord server full of Lego enthusiasts. Go figure, heh.
[12:35] Lmk what you find out LowLevelM
[12:35] I will
[12:47] Hm, there also appears to be BrickSet, https://brickset.com/
[12:50] https://seashells.io/v/7BSrhqb4
[12:50] Took me a while to get running because I needed to program a random user agent function
[12:55] Hm
[12:56] Can you ingest this into Archivebot or is it too big?
[12:56] I don't know, as the scrape is not finished.
[12:57] Oh, right just the bottom half of that page now. Okay. Can you run a scrape on Brickset.com as well, if it
[12:57] it's not too much trouble?*
[12:58] I can run another scraper instance
[12:59] Okay cool, thank you
[13:01] https://seashells.io/v/bzf9TneX
[13:04] From the current state of the scrape, I think they can be added to archivebot. I will let them know.
[13:06] Okay awesome, thank you LowLevelM, appreciate this a lot
[13:06] yw
[13:10] If you could let me know when they've been submitted that would be great. I like to see things being ingested in real time in the dashboard
[13:12] Ok, and if nobody will put it into archivebot, I can archive it using grab-site
[13:13] Astrid or Ivan, we've talked before. Or JAA - could one of you take a look at these sites and accept for archivebot?
[13:27] *** MrRadar has joined #archiveteam-bs
[13:30] *** MrRadar has quit IRC (Read error: Operation timed out)
[13:43] HP_Archiv: done
[13:44] Thanks eietei95, but wanted one of the Ops to do it so as to ensure it's archived properly
[13:44] Unless you're an OP? Didn't see your name in the list
[13:44] https://usercontent.irccloud-cdn.com/file/jTIzacjw/image.png
[13:45] that said - op doesn't really *mean* anything
[13:45] I have op everywhere and I don't know shit about archivebot
[13:46] Oh okay, well I was under the impression that only users in the list on the side who were green could use voice to correctly ingest sites for archiving
[13:46] Just want to make sure it's done thoroughly
[14:00] *** oxguy3 has joined #archiveteam-bs
[14:30] *** mtntmnky has quit IRC (Remote host closed the connection)
[14:30] *** mtntmnky has joined #archiveteam-bs
[14:32] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[15:17] *** SoraUta has quit IRC (Read error: Operation timed out)
[16:03] *** oxguy3 has quit IRC (My MacBook has gone to sleep. ZZZzzz…)
[17:07] *** Larsenv has quit IRC (Quit: ZNC 1.7.5 - https://znc.in)
[17:19] *** MrRadar has joined #archiveteam-bs
[17:23] *** Larsenv has joined #archiveteam-bs
[17:26] *** Larsenv has quit IRC (Client Quit)
[17:33] *** Larsenv has joined #archiveteam-bs
[17:48] *** VerifiedJ has joined #archiveteam-bs
[17:48] *** wp494 has quit IRC (Ping timeout: 745 seconds)
[17:49] *** wp494 has joined #archiveteam-bs
[18:07] *** Dallas has joined #archiveteam-bs
[18:09] HP_Archiv - I recall trying to archive that before, and eventually the website started to either heavily rate limit AB or just UA ban it; the job overall might be very big to get D:
[18:09] And then it got caught in a crash along with the other jobs
[18:11] Ryz, which site, BrickLink or BrickSet?
[18:12] BrickLink, I did an archiving attempt back in 2019 November 27 or 28
[18:12] You said 'back in 2019' as if we're already in 2020, heh
[18:12] I think we should try it again, if possible, to be honest
[18:13] It's running earlier right now, but I really feel it should be more than just AB if the changes are to be site-wide
[18:14] Okay, well if required more than AB, what do you suggest?
[18:21] *** DigiDigi has joined #archiveteam-bs
[18:22] I can say that it's definitely not AB considering it'll take a lot of time; and who knows if the LEGO Group will immediately enforce it or not
[18:23] Well what can ArchiveTeam do to capture the whole site? Warrior? Idk what else to suggest, here. Was hoping someone could help with this
[18:26] depends on whether the rate limit is IP based or UA based
[18:29] So what now? Can't one of the Ops look at this?
[18:31] Did an investigation, yep, it was UA-based; attempt 1 with me was running on default with trent-nz-alpha before it got UA-banned back in 2019 November; subsequent attempts were done earlier today without knowledge that it was UA-banned at the time
[18:32] Okay, what's the protocol for UA-based sites in attempts to archive them?
[18:34] We tend to use a different useragent which usually does the trick
[18:35] Whatever you have to do, etc.
[18:35] If you want me to re-submit I can but I don't have Ops
[18:36] *** schbirid has quit IRC (Quit: Leaving)
[18:55] Well, again it's being run right now; it's just that it's in the middle or near end of December, which is where a bunch of fires of website shutdowns have sprouted out; and slap that with less people being present here s:
[18:56] I'm all for lending a hand if submitting a different way is warranted. But my skillset for all of this is limited compared to others in here, etc
[18:56] Understandable ^^
[18:56] Any chance it would work this time even though you submitted the same way last month?
[19:01] HP_Archiv there's not much more to do right now except watch the job progress
[19:01] Hopefully it'll work since it's just a UA ban; and hopefully people from there will not pay attention since it's the holidays~ :p
[19:06] Ah good point
[19:06] Thanks for your help, Ryz, appreciate it a lot.
[19:10] Oh, didn't see your comment at first, will do markedL, let's see what happens.
[20:07] *** Dj-Wawa has quit IRC (Dj-Wawa)
[20:08] *** Dj-Wawa has joined #archiveteam-bs
[20:09] Has anyone tried random user agents?
[20:09] *** oxguy3 has joined #archiveteam-bs
[20:13] in general random user agents should be avoided until absolutely needed
[20:29] *** X-Scale` has joined #archiveteam-bs
[20:34] *** X-Scale has quit IRC (Ping timeout: 610 seconds)
[20:34] *** X-Scale` is now known as X-Scale
[20:44] *** SoraUta has joined #archiveteam-bs
[20:44] *** tuluu has quit IRC (Ping timeout: 276 seconds)
[20:49] *** oxguy3 has quit IRC (Ping timeout: 246 seconds)
[21:20] *** VerifiedJ has quit IRC (Quit: Leaving)
[21:21] *** Nick-PC has quit IRC (Ping timeout: 610 seconds)
[21:21] *** HP_Archiv has quit IRC (Ping timeout: 610 seconds)
[21:21] *** HP_Archiv has joined #archiveteam-bs
[21:21] *** Nick-PC has joined #archiveteam-bs
[21:25] *** oxguy3 has joined #archiveteam-bs
[21:28] *** benjins has quit IRC (Read error: Connection reset by peer)
[21:30] *** benjins has joined #archiveteam-bs
[21:31] *** benjins has quit IRC (Remote host closed the connection)
[21:33] *** benjins has joined #archiveteam-bs
[21:54] *** nicolas17 has joined #archiveteam-bs
[22:48] *** oxguy3 has quit IRC (My MacBook has gone to sleep. ZZZzzz…)
[22:48] *** oxguy3 has joined #archiveteam-bs
[22:49] *** oxguy3 has quit IRC (Client Quit)
[22:49] *** oxguy3 has joined #archiveteam-bs
[22:49] *** oxguy3 has quit IRC (Client Quit)
[22:50] *** oxguy3 has joined #archiveteam-bs
[22:50] *** oxguy3 has quit IRC (Client Quit)
[22:51] *** oxguy3 has joined #archiveteam-bs
[22:51] *** oxguy3 has quit IRC (Client Quit)
[22:52] *** oxguy3 has joined #archiveteam-bs
[22:52] *** oxguy3 has quit IRC (Client Quit)
[22:52] *** oxguy3 has joined #archiveteam-bs
[22:53] *** oxguy3 has quit IRC (Client Quit)
[22:53] *** oxguy3 has joined #archiveteam-bs
[22:53] *** oxguy3 has quit IRC (Client Quit)
[22:55] *** oxguy3 has joined #archiveteam-bs
[22:55] *** oxguy3 has quit IRC (Client Quit)
[23:12] *** BlueMax has joined #archiveteam-bs