[00:01] *** godane has joined #archiveteam-bs
[01:17] *** DFJustin has quit IRC (Remote host closed the connection)
[01:20] *** DFJustin has joined #archiveteam-bs
[02:25] *** ReimuHaku has joined #archiveteam-bs
[02:40] *** af10b3e5e has quit IRC (Read error: Connection reset by peer)
[02:41] *** af10b3e5e has joined #archiveteam-bs
[02:49] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[03:02] *** ntntn has joined #archiveteam-bs
[03:03] *** ReimuHaku has quit IRC (Read error: Operation timed out)
[03:04] *** odemgi_ has joined #archiveteam-bs
[03:07] *** odemgi has quit IRC (Ping timeout: 252 seconds)
[03:15] *** qw3rty has joined #archiveteam-bs
[03:24] *** qw3rty2 has quit IRC (Ping timeout: 745 seconds)
[03:49] neat https://www.reddit.com/r/emulation/comments/dexcil/compact_disc_structure_preliminary_proposal_of_a/
[03:50] especially how SecuROM works
[04:16] *** af10b3e5e has quit IRC (Quit: https://i.imgur.com/xacQ09F.mp4)
[04:16] *** d5f4a3622 has joined #archiveteam-bs
[04:20] will WBM replay pagination done by POST?
[04:53] I can probably create warcs with POST content but if the WBM doesn't like that, it would be easier to just paginate it via GET
[04:57] *** SmileyG has joined #archiveteam-bs
[04:59] *** Smiley has quit IRC (Ping timeout: 258 seconds)
[05:08] *** RichardG_ has quit IRC (Quit: Keyboard not found, press F1 to continue)
[05:11] *** RichardG has joined #archiveteam-bs
[05:23] markedL, there was a page i was looking at recently which paginated with POSTs. i can almost but not quite remember what it was. can you pls tell me an example site so i can go test the WM
[05:25] it's possible that viewing an archived page and then triggering some intrapage POST will work
[05:26] i'd rather have just tested than type out the theory here though
[05:29] if we wait a bit, someone will know
[05:41] *** wabu has joined #archiveteam-bs
[05:42] I'll start a test crawl for IDs, maybe it'll be informative
[05:43] everything else at DiscApps has nice urls. they are all static resources though. pages of topics are dynamic
[05:43] i am doing some investigation now too. but that idea of yours is nice
[05:46] actually, yes that's where i would have ended up at my next step. making an index. i wasn't actually prepared to get into working on this immediately tonight. i'll make up my mind whether to do now or tomorrow
[05:53] the site is very fast, so this is probably doable minus the WBM part. and no idea how many disc IDs there are out there
[05:58] so you mean archiving to somewhere other than WM
[05:58] thanks for lending your opinion/verdict
[05:59] it accords with the impression i had got
[06:00] it can be in the WBM, but the next button wouldn't work; so for example "making up" urls like http://disc.yourwebapps.com/discussion.cgi?disc=1&pagemark=20 which is the same as the 2nd page
[06:00] so now to investigate what forums/topics/posts are there
[06:01] oh really
[06:01] so it is
[06:03] do new comments in a thread reorder the messages in the index?
[06:03] is it going to then be as simple as changing those two numbers, to access the whole site
[06:04] if the order changes it needs more caution about the order of hitting them
[06:06] yes it would then need caution indeed, but hopefully it's not like that. can make a/some test post/s somewhere. i think it allows anon posting
[06:06] i'm really not up to speed right now to help out as well as i could
[06:08] just collecting/documenting the data of how it works, the ID ranges, types of pages, is plenty helpful
[06:13] i am checking on that last question you asked. i'm saying that i wasn't really prepared to - it's gotten late - but fortunately, as you have found, the site is nicely simple
[06:15] i have humorously found this forum which i will test for staticness/dynamism of posts. http://disc.yourwebapps.com/Indices/203069.html
[06:17] (i didn't expect that someone would immediately get stuck into it. no doubt, it's nice that you have though)
[06:17] lol. late here too, can continue after rest.
[06:18] look, i have written multiple sentences to make one point, clear sign of tiredness and not focussed
[06:18] good, i'll do the proposed thing tomorrow
[06:23] *** Mateon1 has quit IRC (Remote host closed the connection)
[06:23] *** Mateon1 has joined #archiveteam-bs
[06:25] just to correct myself: can create my own forum to test in
[06:28] *** d5f4a3622 has quit IRC (Read error: Connection reset by peer)
[06:28] *** d5f4a3622 has joined #archiveteam-bs
[07:18] *** dxrt has joined #archiveteam-bs
[07:18] *** Fusl____ sets mode: +o dxrt
[07:18] *** Fusl sets mode: +o dxrt
[07:18] *** Fusl_ sets mode: +o dxrt
[07:24] *** systwi_ is now known as systwi
[08:43] *** odemgi has joined #archiveteam-bs
[08:45] *** odemgi_ has quit IRC (Ping timeout: 252 seconds)
[08:48] markedL: No, the WBM cannot play back POST requests. I think the responses will be available, but all requests with the same URL will be mixed together of course.
[09:34] Soo, I've been getting rate-limited on picosong for a couple of hours.
[09:35] Stopping my crawl for now and investigating.
[09:40] The rate limiting happens on /cdn/HEX.mp3 URLs. It doesn't tell me the rate limit though. I'll analyse my logs to figure that out.
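Editor's note on the DiscApps pagination discussed at [06:00] to [06:03]: the idea of "making up" GET URLs by changing the two numbers can be sketched as below. This is only a sketch under assumptions. The step of 20 is inferred from pagemark=20 being described as the 2nd page, and pagemark=0 for the first page is a guess; neither was confirmed in the channel. The function name page_urls is made up for illustration.

```python
from urllib.parse import urlencode

# URL taken from the [06:00] message.
BASE_URL = "http://disc.yourwebapps.com/discussion.cgi"

# Assumption: each index page advances pagemark by 20, since
# pagemark=20 was reported to be the same as the 2nd page.
PAGE_STEP = 20


def page_urls(disc_id, pages, step=PAGE_STEP):
    """Build GET URLs for the first `pages` index pages of one forum.

    `disc_id` is the forum number (the `disc` parameter); page 1 is
    assumed to correspond to pagemark=0.
    """
    urls = []
    for page in range(pages):
        query = urlencode({"disc": disc_id, "pagemark": page * step})
        urls.append(f"{BASE_URL}?{query}")
    return urls


if __name__ == "__main__":
    for url in page_urls(1, 3):
        print(url)
```

If new posts do reorder the index, as asked at [06:03], a list like this would need to be fetched quickly and in order to avoid missing or duplicating topics.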
[09:40] *** d5f4a3622 has quit IRC (Read error: Connection reset by peer)
[09:40] *** d5f4a3622 has joined #archiveteam-bs
[09:41] JAA: throw the urls into a tracker project or amqp and crawl it distributed across hundreds of instances?
[09:42] ^
[10:24] Soo, not sure what the rate limit is. The number of successful retrievals within a minute varied between 3 and 20 across the last few minutes before I stopped it.
[10:26] Before the rate limiting, I was making 400-ish requests per minute. So yeah, need to change how this works.
[10:26] Somewhere between 20 and 100 IPs needed I guess.
[10:34] *** killsushi has quit IRC (Quit: Leaving)
[10:36] *** ShellyRol has quit IRC (Read error: Connection reset by peer)
[10:40] *** qw3rty2 has joined #archiveteam-bs
[10:44] *** ShellyRol has joined #archiveteam-bs
[10:46] *** qw3rty has quit IRC (Ping timeout: 745 seconds)
[10:57] ls
[10:58] hi
[10:58] i just got back too. why've you typed ls
[11:00] that was a human error. I meant to say, I saw a comment about disc that was disconcerting. something like old messages were falling off the index
[11:00] i saw that too
[11:01] that kind of thing i just take as par for the course that is the Internet, now
[11:01] i wouldn't fret mate
[11:03] and here's why i wouldn't fret
[11:03] observe at the bottom of the faq,
[11:04] https://disc.yourwebapps.com/faq.html
[11:04] How do I delete a DiscApp? DiscApps are removed from the server after 12 weeks of inactivity (no new articles and no administration).
[11:08] this site is impressive in its non-modernity
[11:09] (it took quite some time to get used to the jarring feeling that disconcerts you. gradually from when i first read that the IMDb randomly deletes posts. i mean that's still shocking if i give it any thought)
[11:10] yes. did you see similar thoughts expressed in those few topics i linked? this is exactly my motivation (obviously) for this
[11:11] anyway right now i just have to mainly say
[11:12] and clear up, that when i said 'humorously' above, it wasn't meant as mockery, but as joviality
[11:18] I'll start on this, should have something done in an hour
[11:28] there are some few other pages on the site too, not just the forums, and it's worth getting a complete current list
[11:29] for example https://yourwebapps.com/About/, and maybe ihm
[11:29] ..., and maybe it's only the ones linked on that page
[11:31] [h is next to t on my keyboard, single apostrophe is long-press m, and backspace is next to enter - typo expl]
[11:32] Dvorak?
[11:33] it's not, i've never used that. it's workman fwiw
[11:34] i do recommend trying it, if that kind of thing interests
[11:35] ahh ok
[11:37] the guy has done good work, i'd say. have long been aware of many researched layouts, but this one i felt able to just go ahead and use
[11:37] that sums it up for me i reckon
[11:38] lol, you might now see how this nick was selected too
[11:38] lolol
[11:39] a lot of layouts have some obvious relationship between n and t
[11:40] ok
[11:40] in dvorak they're next to each other and in colemak they have the same position as workman even
[11:44] *** coderobe has quit IRC (Remote host closed the connection)
[11:57] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[11:57] *** BlueMax has joined #archiveteam-bs
[12:05] Fusl, Igloo: So how to move forward with picosong? It's not possible to just distribute the /cdn/HEX.mp3 URLs since those are IP-restricted. So basically we'd have to distribute the entire thing (those URLs are generated when you access the stream or download page). That's possible with qwarc, but I haven't done it before, so it'll require some fiddling. Or would it be possible with mips maybe?
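Editor's note on the picosong numbers at [10:24] to [10:26]: the guess of "between 20 and 100 IPs" can be roughly reproduced with the arithmetic below. This is a back-of-the-envelope sketch, not a measurement. It assumes the 3 to 20 successful retrievals per minute seen while throttled approximate what one IP can sustain, which the log does not establish. The function name ips_needed is made up for illustration.

```python
import math


def ips_needed(target_per_min, per_ip_per_min):
    """Number of IPs needed to reach `target_per_min` requests per minute
    if each IP can sustain `per_ip_per_min` successful requests per minute."""
    return math.ceil(target_per_min / per_ip_per_min)


# Figures quoted in the channel: roughly 400 requests per minute before
# the limit, and between 3 and 20 successful retrievals per minute after.
TARGET_PER_MIN = 400
BEST_CASE_PER_IP = 20
WORST_CASE_PER_IP = 3

if __name__ == "__main__":
    low = ips_needed(TARGET_PER_MIN, BEST_CASE_PER_IP)
    high = ips_needed(TARGET_PER_MIN, WORST_CASE_PER_IP)
    print(f"IPs needed: between {low} and {high}")
```

The lower bound matches the 20 quoted in the channel; the upper bound from the worst case comes out somewhat above 100, so the range given at [10:26] reads as a rounded estimate.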
[12:07] *** ntntn_ has joined #archiveteam-bs
[12:08] *** ntntn has quit IRC (Ping timeout: 260 seconds)
[12:08] *** ntntn_ is now known as ntntn
[12:10] *** schbirid has quit IRC (Remote host closed the connection)
[12:10] Actually no, mips won't work reliably either due to how connections are distributed to IP addresses. There'd have to be a guarantee that the connections used by one item always use the same IP.
[12:11] (On the plus side, I've drained nearly all my upload backlog in the last couple hours.)
[12:13] Ivy: i took a screenshot. also, pls, was anything relevant typed while i crashed out? https://i.postimg.cc/xdJVXpWM/Screenshot-20191009-124246.png
[12:14] that is 'AnySoftKeyboard'
[12:14] ntntn: Logs of this channel and a few others are at http://archive.fart.website/bin/irclogger_logs
[12:15] i should have known
[12:15] thx
[12:16] ^^^
[12:16] also yes anysoftkeyboard is nice
[12:16] Can you move this keyboard discussion to #archiveteam-ot please?
[12:17] sorry about that JAA. noted for future too
[12:19] *** coderobe has joined #archiveteam-bs
[12:28] picosong seems to fit with tracker if you just pregenerate the IDs into job files
[12:29] *** ntntn has quit IRC ()
[12:30] is it clear the rate limit is per IP or just a uniform slowdown put into every request?
[12:30] Yeah, it could of course be done with the tracker. Effectively, qwarc is kind of like the tracker, just locally. However, I'm not going to write Lua code for it, sorry.
[12:32] I'm pretty sure the rate limit is per IP. It's Cloudflare, and the error message is this: "The owner of this website (picosong.com) has banned you temporarily from accessing this website."
[12:33] I could also try emailing them and asking whether they could whitelist my IP or something.
[12:34] I'm probably not the only one to try to grab a full copy of it, and that gets expensive quickly since it's S3.
[12:38] *** bluefoo_ has quit IRC (Ping timeout: 745 seconds)
[12:40] how many IPs does mips have?
[12:43] Many. I don't know the number, but it'd be enough. But as mentioned, it won't work, at least not directly/without changes.
[12:45] Many /24s if I remember
[13:05] Yeah, something like that. One /24 would be enough already, see numbers above. Assuming they don't ban the entire range or something.
[13:06] But yeah, won't work directly.
[13:15] my script is getting 429 also, very quick, didn't even reach 100 urls tried
[13:16] I'll send them an email.
[13:17] glhf
[13:19] the /cdn/HEX.mp3 urls seem migrate-able to me. and that's the one getting 492
[13:19] ^429
[13:23] also, I'm testing them via Tor
[13:24] this work for everyone? https://picosong.com/cdn/36dca710680fae932906739e3fa37c28.mp3
[13:27] Yes, works. Weird, I'm sure I got 403s before when trying to access a generated link from another IP.
[13:28] markedL It looks like it works for me (in the US), downloaded the whole file just fine
[13:29] *** bluefoo has joined #archiveteam-bs
[13:29] *** HashbangI has quit IRC (Remote host closed the connection)
[13:34] It's possible both are true if that was on an S3 URI, but I don't know how to generate that test case reliably
[13:39] *** HashbangI has joined #archiveteam-bs
[13:53] here's an example: http://picosong.s3.amazonaws.com/WDf2/fetch%20me%20the%20microphone.mp3?Signature=8%2FvRquUfTWhcWmzqpNBfnDntLYg%3D&Expires=1570629845&AWSAccessKeyId=AKIAIVYGJY7GGRJY2Y3A
[13:56] and in between http://cdn.picosong.com/mgze/Fool%20On%20The%20Hill%20-%20Me%20%26%20The%20Beatles.mp3?Signature=TcPnu53my6mu75wzAa3WOW5Qbmw%3D&Expires=1570630204&AWSAccessKeyId=AKIAIVYGJY7GGRJY2Y3A
[14:18] *** DogsRNice has joined #archiveteam-bs
[14:21] some stuff going on with gamepedia and fandom
[14:21] https://community.fandom.com/wiki/User_blog:MisterWoodhouse/Our_first_update_on_the_new_platform
[14:22] *** DogsRNice has quit IRC (Remote host closed the connection)
[14:22] *** DogsRNice has joined #archiveteam-bs
[14:24] *** DogsRNice has quit IRC (Remote host closed the connection)
[14:24] *** DogsRNice has joined #archiveteam-bs
[15:07] *** Yurume has quit IRC (No Ping reply in 180 seconds.)
[15:09] *** Yurume has joined #archiveteam-bs
[15:32] *** Yurume has quit IRC (No Ping reply in 180 seconds.)
[15:34] *** Yurume has joined #archiveteam-bs
[15:39] *** deevious has quit IRC (Quit: deevious)
[15:51] *** Yurume has quit IRC (No Ping reply in 180 seconds.)
[15:53] *** Yurume has joined #archiveteam-bs
[16:21] *** Raccoon has quit IRC (Ping timeout: 258 seconds)
[16:39] *** Mayonaise has quit IRC (Read error: Operation timed out)
[16:40] *** Mayonaise has joined #archiveteam-bs
[16:40] *** icedice has joined #archiveteam-bs
[16:47] *** ats has quit IRC (leaving)
[17:03] *** ats has joined #archiveteam-bs
[17:14] *** VADemon has quit IRC (Quit: left4dead)
[17:18] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[17:22] *** chirlu has joined #archiveteam-bs
[17:47] picosong backlog is cleared. At least that's good.
[18:04] I figure someone will write the lua, if you say that's the best option
[18:05] The best option would be them whitelisting me so that I can just continue where I stopped since that requires zero work.
[18:06] If they don't reply within a reasonable amount of time, we can take the DPoS route.
[18:13] True
[18:14] ntntn : test warc for disc> https://transfer.sh/j8rgz/disc1-alpha.warc
[18:24] *** d5f4a3622 has quit IRC (Ping timeout: 255 seconds)
[18:39] *** d5f4a3622 has joined #archiveteam-bs
[19:01] *** bluefoo has quit IRC (Ping timeout: 496 seconds)
[19:51] *** bluefoo has joined #archiveteam-bs
[20:14] *** Meroje has quit IRC (Quit: bye!)
[20:23] *** Meroje has joined #archiveteam-bs
[20:26] *** bluefoo has quit IRC (Ping timeout: 252 seconds)
[20:32] *** bluefoo has joined #archiveteam-bs
[21:57] *** Meroje has quit IRC (Quit: bye!)
[21:58] *** Meroje has joined #archiveteam-bs
[22:02] *** Meroje has quit IRC (Client Quit)
[22:02] *** Meroje has joined #archiveteam-bs
[22:06] *** Meroje has quit IRC (Client Quit)
[22:06] *** Meroje has joined #archiveteam-bs
[22:10] *** Meroje has quit IRC (Client Quit)
[22:10] *** Meroje has joined #archiveteam-bs
[22:14] *** Meroje has quit IRC (Client Quit)
[22:14] *** Meroje has joined #archiveteam-bs
[22:28] *** ats_ has joined #archiveteam-bs
[22:29] *** ats has quit IRC (Read error: Operation timed out)
[22:32] *** ats_ has quit IRC (Read error: Operation timed out)
[22:59] *** phillipsj has joined #archiveteam-bs
[23:06] *** ScruffyB has quit IRC (Read error: Operation timed out)
[23:12] *** BlueMax has joined #archiveteam-bs
[23:27] *** ats has joined #archiveteam-bs
[23:28] *** VoltZero has joined #archiveteam-bs