[00:00] IA is currently chewing on 9.2 Gbit/s inbound and it's on fiiiiiiiiiiire
[00:35] *** Soni has joined #archiveteam-bs
[00:54] dashcloud: i got through the 2nd tape today
[00:58] i'm now capturing the best of benny hill
[01:04] *** n00b736 has joined #archiveteam-bs
[01:06] Hello - I'm working on scraping sites that contain pre-smartphone mobile games (and both the sites and files are in danger of disappearing) and was wondering if anyone here can help either scripting or otherwise crowdsourcing this endeavor?
[01:07] There are two specifically (a Chinese and Russian site) that appear to be in need of archiving, but they are vast... We are looking at 10,000+ games contained in .JAR and other filetypes.
[01:17] Is this the right channel?
[01:18] *** sarahlynn has joined #archiveteam-bs
[01:20] *** sarahlynn has quit IRC (Read error: Connection reset by peer)
[01:21] Bluemaxim
[01:21] n00b736 Might be someone to speak to
[01:21] hello gimme a minute
[01:22] Jesus... Haha... Hey Blue - it's RetroRomper... Would it be better to message you on Discord?
[01:26] yes
[01:27] *** m007a83 has quit IRC (Read error: Connection reset by peer)
[01:33] *** n00b736 has quit IRC (Quit: Page closed)
[01:36] *** m007a83 has joined #archiveteam-bs
[01:46] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[01:51] *** SimpBrain has joined #archiveteam-bs
[02:16] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[02:22] *** SimpBrain has joined #archiveteam-bs
[02:44] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[02:45] *** thejsa has quit IRC (Quit: No Ping reply in 180 seconds.)
[02:45] *** Soni has quit IRC (Quit: No Ping reply in 180 seconds.)
[02:48] *** thejsa has joined #archiveteam-bs
[02:50] *** SimpBrain has joined #archiveteam-bs
[02:51] *** SimpBrain has quit IRC (Remote host closed the connection)
[02:58] *** SimpBrain has joined #archiveteam-bs
[03:11] *** Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
[03:41] *** RichardG has quit IRC (Ping timeout: 252 seconds)
[03:43] *** wp494 has quit IRC (Ping timeout: 492 seconds)
[03:43] *** wp494 has joined #archiveteam-bs
[03:47] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[03:53] *** SimpBrain has joined #archiveteam-bs
[04:13] *** Stiletto has quit IRC (Ping timeout: 255 seconds)
[04:30] *** qw3rty115 has joined #archiveteam-bs
[04:35] *** qw3rty114 has quit IRC (Read error: Operation timed out)
[04:42] *** dhyan_nat has joined #archiveteam-bs
[04:46] *** odemgi has joined #archiveteam-bs
[04:48] *** odemgi_ has quit IRC (Ping timeout: 252 seconds)
[04:55] *** odemg has quit IRC (Ping timeout: 615 seconds)
[04:59] *** ndiddy has quit IRC ()
[05:00] *** SimpBrain has quit IRC (Remote host closed the connection)
[05:01] *** odemg has joined #archiveteam-bs
[05:05] *** legoktm has joined #archiveteam-bs
[05:07] *** SimpBrain has joined #archiveteam-bs
[05:11] *** kbtoo_ has quit IRC (Read error: Connection reset by peer)
[05:14] *** kbtoo has joined #archiveteam-bs
[05:23] *** kbtoo has quit IRC (Read error: Connection reset by peer)
[05:26] *** kbtoo has joined #archiveteam-bs
[05:36] *** kbtoo has quit IRC (Read error: Connection reset by peer)
[05:39] *** kbtoo has joined #archiveteam-bs
[06:12] *** marked has quit IRC (Ping timeout: 255 seconds)
[06:13] *** marked has joined #archiveteam-bs
[06:20] *** Exairnous has quit IRC (Read error: Operation timed out)
[06:34] *** turnkit_ has quit IRC ()
[06:43] *** MrRadar_ has joined #archiveteam-bs
[06:44] *** MrRadar has quit IRC (Read error: Operation timed out)
[06:54] *** MrRadar has joined #archiveteam-bs
[06:56] *** MrRadar_ has quit IRC (Read error: Operation timed out)
[07:25] *** SimpBrain has quit IRC (Remote host closed the connection)
[07:26] *** SimpBrain has joined #archiveteam-bs
[08:14] *** SimpBrain has quit IRC (Remote host closed the connection)
[08:15] *** SimpBrain has joined #archiveteam-bs
[08:40] *** killsushi has quit IRC (Quit: Leaving)
[08:55] *** RichardG has joined #archiveteam-bs
[09:20] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[09:20] *** SimpBrain has joined #archiveteam-bs
[09:24] *** Odd0002_ has joined #archiveteam-bs
[09:25] *** Odd0002 has quit IRC (Ping timeout: 252 seconds)
[09:25] *** Odd0002_ is now known as Odd0002
[09:29] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[09:31] *** SimpBrain has joined #archiveteam-bs
[09:37] *** BlueMaxim has quit IRC (Quit: Leaving)
[09:54] *** JH88 has joined #archiveteam-bs
[10:09] I know I've covered this before but can we implement a bw limit on rsync?
[10:20] Surely it's just a step in the project pipeline, and we can pass a variable from the UI?
[10:56] *** atrocity has joined #archiveteam-bs
[11:23] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[11:33] Item users:smap/01549/av: Step 4 of 8 Server returned 0 (HERR). Sleeping.
[11:33] Herr?!
[11:51] "Header error", I believe. Some kind of error occurred while trying to read the response headers.
[11:51] But I don't think these error codes are documented anywhere really.
[11:53] ah ok
[11:53] well returning 0 is derp
[11:53] Anyway, seems that B/W limits on the actual VM work now :)
[11:54] Makes sense if you need to store the status code in an unsigned int since there is no status code 0 in HTTP.
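The 10:09 and 10:20 messages ask for an rsync bandwidth cap passed into the pipeline as a variable. A minimal sketch of that idea follows, assuming the cap arrives through an environment variable. The variable name `RSYNC_BWLIMIT_KBPS` and both helper functions are hypothetical and are not part of seesaw or any existing project pipeline; `--bwlimit` itself is a standard rsync option, and a bare number is read in units of 1024 bytes per second.

```python
import os


def bwlimit_from_env(env=None, name="RSYNC_BWLIMIT_KBPS"):
    """Read an optional upload cap from the environment.

    The variable name is a made-up example. Returns None when the
    variable is unset, empty, non-numeric, or not positive.
    """
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    try:
        value = int(raw)
    except ValueError:
        return None
    return value if value > 0 else None


def build_rsync_command(source, target, bwlimit_kbps=None):
    """Build an rsync argument list, adding --bwlimit only when a cap is set."""
    cmd = ["rsync", "-a", "--partial"]
    if bwlimit_kbps:
        # A bare number is interpreted by rsync in units of 1024 bytes/second.
        cmd.append(f"--bwlimit={bwlimit_kbps}")
    cmd += [source, target]
    return cmd
```

With the cap unset the command is unchanged, so a pipeline step built this way would behave as before for anyone who does not opt in.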
[11:56] *** SmileyG has joined #archiveteam-bs
[12:02] *** bitBaron has joined #archiveteam-bs
[12:05] *** Smiley has quit IRC (Ping timeout: 615 seconds)
[12:27] *** VerifiedJ has quit IRC (Read error: Connection reset by peer)
[12:28] *** VerifiedJ has joined #archiveteam-bs
[12:40] *** wp494 has quit IRC (Ping timeout: 364 seconds)
[12:41] *** wp494 has joined #archiveteam-bs
[12:48] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[13:23] *** MrRadar2 sets mode: +o MrRadar
[13:34] SketchCow: is FoS still a thing?
[13:35] *** icedice has joined #archiveteam-bs
[13:48] *** MrRadar_ has joined #archiveteam-bs
[13:52] *** MrRadar has quit IRC (Read error: Operation timed out)
[13:55] *** benjinsmi has joined #archiveteam-bs
[13:57] *** benjins has quit IRC (Read error: Operation timed out)
[14:24] *** MrRadar_ is now known as MrRadar
[14:24] *** MrRadar2 sets mode: +o MrRadar
[14:33] *** Stiletto has joined #archiveteam-bs
[15:12] *** anarcat has left
[15:27] *** dhyan_nat has joined #archiveteam-bs
[15:35] *** bitBaron has joined #archiveteam-bs
[16:27] JAA: ok, was just wondeing
[16:27] wondering*
[17:06] *** schbirid has joined #archiveteam-bs
[17:38] *** Odd0002_ has joined #archiveteam-bs
[17:40] *** Odd0002 has quit IRC (Read error: Operation timed out)
[17:40] *** Odd0002_ is now known as Odd0002
[17:44] *** icedice has quit IRC (Read error: Operation timed out)
[17:48] *** omarroth has joined #archiveteam-bs
[18:16] *** Exairnous has joined #archiveteam-bs
[18:29] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[18:29] *** SimpBrain has joined #archiveteam-bs
[18:51] *** Exairnous has quit IRC (Read error: Operation timed out)
[19:00] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[19:03] *** omarroth has quit IRC (Read error: Connection reset by peer)
[19:06] *** omarroth has joined #archiveteam-bs
[19:11] *** icedice has joined #archiveteam-bs
[19:25] *** bitBaron has joined #archiveteam-bs
[19:26] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[19:26] *** S1mpbrain has joined #archiveteam-bs
[19:34] I'd prefer BW limits on rsync also. without qos, the other internet devices and downloader would be borked
[19:34] ^know from experience
[19:52] dashcloud: so i may have another tape captured that's in sync
[19:52] so that makes about 4 tapes and 3 of the tapes are 4+ hours
[20:28] *** BlueMax has joined #archiveteam-bs
[20:57] *** atrocity has quit IRC (Ping timeout: 246 seconds)
[21:17] dashcloud: i'm going to see about capturing one of the twilight zone tapes you made
[21:18] a lot of the tapes sent were stuff i normally don't capture
[21:18] but i'm doing it cause i just want to say the stuff was digitized
[21:23] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[21:38] *** signius has joined #archiveteam-bs
[21:42] *** robbierut has joined #archiveteam-bs
[21:43] *** wp494 has quit IRC (Ping timeout: 364 seconds)
[21:44] *** wp494 has joined #archiveteam-bs
[21:47] Aside from Google minus, has there been another project with say more than 20 rsync targets?
[21:47] plenty
[21:48] *** Odd0002 has quit IRC (Ping timeout: 252 seconds)
[21:50] I'm not changing tracker code, that's why I called it an external load balancer instead
[21:50] Marked, you said not changing the tracker code, but you want to add another point of failure in the system? Something before the tracker?
[21:50] how do you intend to do this without either modifying the tracker, or seesaw itself?
[21:51] *** Odd0002 has joined #archiveteam-bs
[21:52] Create a DNS record, Google minus.rsynctargets.archiveteam.org
[21:52] Match the same API as the current tracker
[21:53] Then in the Google minus pipeline script point to that DNS record
[21:53] The DNS record could point to the tracker or a load balancer
[21:54] *** astrid has joined #archiveteam-bs
[21:54] For projects that want the current tracker for rsync assignment things are the same
[21:55] For projects that have a target surge, the rsync assignment could be made from a different code base
[21:55] Seesaw and tracker code bases do not need to be modified as rsync herding needs change
[21:58] The matching method mentioned by others is rsyncing to targets in the same data center, this should be easy to do
[21:59] There was more controversy about matching by disk space or ASN, but the point is it's flexible enough to do whichever
[22:00] And the fallback is it can always go back to the tracker or duplicate what the tracker random assignment would do
[22:01] To be fair, I see what you want to do. But personally I think it's better to have a target communicate with the tracker to automate the adding/removal of targets when they are full/empty.
[22:01] This would also lessen the babysitting from people
[22:02] No need for dns etc. Just 1 uplink from each target
[22:03] It's just a matter of time until the rsync assignments become a bottleneck again. Let the tracker do job assignment and confirmations. Those require the database and are critical
[22:03] The health stats are too complex to mix with the tracker's other tasks
[22:05] Maybe someone knows for sure: does data center mapping improve total throughput of the swarm? How would we do that with the tracker assignment of targets?
[22:05] *** atomicthu has quit IRC (Read error: Operation timed out)
[22:05] Marked: doesn't there need to be a better tracker then? I don't know but it doesn't seem as hard on the hardware as a target with all the incoming data, packing to big files and uploading again. I know the tracker had issues yesterday with a connection limit, but is that artificial or is that actually the cpu/disk limit of the server also?
[22:05] *** omarroth has quit IRC (Read error: Connection reset by peer)
[22:06] I know it could be put in seesaw, but seesaw cannot be modified quickly
[22:08] I suppose I'm suggesting the two functions of job assignment and target assignment can be decoupled and should live in two code bases
[22:08] But since this is not needed all the time, only use it when there's a lot of targets
[22:09] Complexity when it's useful to get more throughput
[22:09] But turn it off when not needed
[22:09] *** atomicthu has joined #archiveteam-bs
[22:10] What would be the benefit of that compared to 1 very fast tracker? The ability to turn off is nice, but not really critical right?
[22:11] In one code base if the algorithm of assignment needs to change, doesn't the tracker need to restart too?
[22:14] Yeah but why would it need to change on a regular basis?
[22:19] I guess that part is partially due to the debate of what's a good assignment method. Sounds to me like it will be a dynamic situation at least at first.
[22:26] I assume it would require trying and seeing before the debate ended. If there were a single known good method beforehand, that could be implemented initially.
[22:28] The next closest is leaving the tracker the same and using bots to turn on and off targets. But this doesn't get the assignment preferences
[22:29] To do assignment preferences then you have to add extra health or topology data to the tracker and have the tracker utilize it or export it to seesaw or warrior for a decision there.
[22:30] Idk what that data is even exactly this moment but I don't want it in the tracker db or maybe don't want to send a full set to workers
[22:34] I'm not big in the code, but I sometimes see people saying "this target is taken off" or "this target is added". Don't know the exact mechanics but there probably is a list the tracker chooses from to send workers to. How hard would it be to let a target do that instead of a person? No need for constant health data, just 1 time to take off a target to let it empty and 1 time to put it on again.
[22:35] Of course this won't fix the connection limit of active but very busy targets. But it would cut out quite some waiting to reassign a worker to a target that's definitely not accepting incoming.
[22:35] *** signius has quit IRC (Quit: Leaving)
[23:18] *** Despatche has joined #archiveteam-bs
[23:23] *** Exairnous has joined #archiveteam-bs
[23:51] *** ndiddy has joined #archiveteam-bs
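The two proposals in this discussion (targets switching themselves on and off, and preferring a target in the worker's own data center with a random fallback) can be sketched together as below. This is only an illustration under stated assumptions: the `TargetPool` class, its method names, and the data center labels are all hypothetical, and nothing here reflects how the real tracker or seesaw stores or assigns rsync targets.

```python
import random


class TargetPool:
    """Hypothetical in-memory list of rsync targets (not the tracker's real model)."""

    def __init__(self):
        # name -> {"datacenter": str, "accepting": bool}
        self._targets = {}

    def register(self, name, datacenter):
        """Add a target; new targets start out accepting uploads."""
        self._targets[name] = {"datacenter": datacenter, "accepting": True}

    def set_accepting(self, name, accepting):
        """Called by the target itself: once when full, once when emptied."""
        self._targets[name]["accepting"] = accepting

    def pick(self, worker_datacenter=None, rng=random):
        """Prefer an open target in the worker's data center.

        Falls back to a random open target, which mirrors the random
        assignment described in the chat. Returns None if nothing is open.
        """
        open_targets = [n for n, t in self._targets.items() if t["accepting"]]
        if not open_targets:
            return None
        local = [
            n for n in open_targets
            if self._targets[n]["datacenter"] == worker_datacenter
        ]
        return rng.choice(local or open_targets)
```

A target that fills up would call `set_accepting(name, False)` once and flip it back after draining, so workers stop being sent to a target that cannot take uploads. The per-target connection limit raised at 22:35 is not addressed by this sketch.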