[00:15] atomicthu: Can you explain more about that? What's broken about it - what's the usual procedure to access the archives, and at what point does it go wrong under the current circumstances? How long has it been broken? Do you know why it's broken, if or when this is (or was) expected to be fixed, etc.?
[00:16] the archives are accessed with a dropdown on every subforum
[00:16] you select a date and it shows you threads from that date
[00:16] except since whenever it broke, it shows no threads for any date
[00:16] there is no ETA on fixing it. something awful's awful, hacked-together version of vbulletin is infamous
[00:16] *** BlueMax has joined #archiveteam-bs
[00:17] *** Arcorann_ has joined #archiveteam-bs
[00:28] *** nicolas17 has quit IRC (Konversation terminated!)
[00:36] *** asdf0101 has quit IRC (The Lounge - https://thelounge.chat)
[00:36] *** asdf0101 has joined #archiveteam-bs
[00:59] *** nicolas17 has joined #archiveteam-bs
[01:38] *** dashcloud has joined #archiveteam-bs
[02:27] *** katocala has quit IRC (Ping timeout: 496 seconds)
[02:57] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[03:34] *** qw3rty_ has joined #archiveteam-bs
[03:41] *** qw3rty__ has quit IRC (Read error: Operation timed out)
[04:15] *** godane has quit IRC (Ping timeout: 610 seconds)
[04:15] *** godane has joined #archiveteam-bs
[04:34] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[04:48] JAA: So I think I might've found the source of that ARM cookie
[04:48] The page was /?sc_mode=edit&sc_itemid=%7BEAAAFF37-D150-4392-A90D-65D24D66DD62%7D&sc_version=7&sc_lang=en&sc_site=website
[04:49] WARC-Record-ID: for the request in 00008.warc.gz, the response record is easy enough to find
[04:51] Source appears to be from here https://developer.arm.com/tools-and-software/embedded/legacy-tools/ds-5-development-studio/resources/tutorials/linux-symmetric-multiprocessing-kernel-debug
[04:51] "Again, if you get any Source Not Found alerts, you have to set the substitution path to the sources you set up as part of Prerequisites." - "Prerequisites" has a link to that page that sets the cookie
[04:53] There are similar links around the page on different occurrences of the word "Prerequisites", going to the proper place; obviously the author copied the link directly from their editing tools instead of the public page this one time
[05:50] -----
[05:51] So it looks like the Something Awful site is being "transferred" to someone else - https://forums.somethingawful.com/announcement.php?forumid=1
[05:53] "There have been a lot of questions about offsites and donations for offsites and the future of the forums... [t]he short answer is we do not have anywhere for that sort of thing to go at the moment... there should be a solid announcement within a day or so."
[05:56] So it looks like the statement of its being "transferred" may be that "announcement"
[06:02] *** HP_Archiv has joined #archiveteam-bs
[06:05] So I'm talking like godane now
[06:06] i normally don't copy and paste dump
[06:06] if i do i give warning, it's normally 3 or 4 lines of bash code to grab something
[06:08] I was just referring to the "[Ping]: So I..." structure
[06:12] *** HP_Archiv has quit IRC (Read error: Operation timed out)
[06:17] *** HP_Archiv has joined #archiveteam-bs
[06:28] *** nicolas17 has quit IRC (Quit: Konversation terminated!)
[06:40] *** OrIdow6 has quit IRC (Ping timeout: 265 seconds)
[06:41] *** OrIdow6 has joined #archiveteam-bs
[06:50] *** larryv has quit IRC (larryv)
[07:24] *** Arcorann_ is now known as Arcorann
[08:22] *** Raccoon has joined #archiveteam-bs
[09:51] *** schbirid has joined #archiveteam-bs
[10:45] *** Raccoon has quit IRC (Ping timeout: 272 seconds)
[11:08] atomicthu: Can you access old threads directly (by modifying the thread ID in the URL), i.e. is only that date selection thing broken?
[11:08] *** systwi_ has joined #archiveteam-bs
[11:09] OrIdow6: Thanks! I guess I'll rerun it with an ignore for URLs with a sc_mode=edit parameter. Might also look into grabbing it with qwarc independently.
[11:11] *** systwi__ has joined #archiveteam-bs
[11:12] No idea. If my crawler can find a link to an old one that works, it'll grab it
[11:13] *** systwi_ has quit IRC (Read error: Operation timed out)
[11:14] *** systwi has quit IRC (Ping timeout: 622 seconds)
[11:16] *** systwi__ has quit IRC (Read error: Operation timed out)
[11:19] *** Smiley has quit IRC (Read error: Connection reset by peer)
[11:20] atomicthu: Random archived thread (according to the error message): https://forums.somethingawful.com/showthread.php?threadid=393011 (no idea what it is, probably something awful, heh)
[11:20] *** systwi has joined #archiveteam-bs
[11:23] *** systwi has quit IRC (Read error: Operation timed out)
[11:26] *** Smiley has joined #archiveteam-bs
[11:31] JAA: NP
[11:32] Yes, QWarc or something similar may be necessary here to get everything
[11:38] *** Ravenloft has quit IRC (Read error: Connection reset by peer)
[11:43] Hey JAA, if you're around, question for ya. I have some of those video archive items hashed. What's the proper format to save the hash results to - .txt file or csv or .hash file?
[11:47] I guess I'm asking what's the proper way to format this all? I've seen developers host code blocks on blogs and stuff like that with a hash table for data. But since I can't do that on IA, how to format accordingly?
[11:48] HP_Archiv: The common format is the hash in hexadecimal format followed by two spaces followed by the filename.
[11:48] That's what md5sum, sha1sum, etc. produce and can also read back in for checking with `-c`.
[11:50] *** systwi has joined #archiveteam-bs
[11:51] JAA: So I just copy/paste the sha1 hash values into notepad and save as .hex?
[11:54] SEND JAA
[11:55] *** Ravenloft has joined #archiveteam-bs
[11:56] HP_Archiv: Sure, you can do that. Just lines of '51a96fff5871d6723f469c1882e715d2  filename.txt'.
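The checksum line format described above (hex digest, two spaces, filename) can be produced and parsed with a few lines of Python. This is only a minimal sketch, not anything used in the channel: the helper names `checksum_line` and `parse_checksum_line` are made up for illustration, and it only covers the simple case of filenames without newlines or backslashes, which the coreutils tools escape specially.

```python
import hashlib
from pathlib import Path


def checksum_line(path: Path, algorithm: str = "sha1") -> str:
    """Return one line in the 'HEX  filename' layout that md5sum/sha1sum print."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large video files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    # Hex digest, exactly two spaces, then the filename.
    return f"{digest.hexdigest()}  {path}"


def parse_checksum_line(line: str) -> tuple:
    """Split a 'HEX  filename' line into (hex_digest, filename)."""
    hex_digest, _, filename = line.rstrip("\n").partition("  ")
    return hex_digest, filename
```

Lines written this way into a plain text file should be readable by `sha1sum -c` (or `md5sum -c` when `algorithm="md5"`), as long as the stored path matches where the files live when checking.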
[11:56] (Length of the hash will depend on the hash function, obviously.)
[11:58] This is what I have, filenames are long
[11:58] https://imgur.com/E0hGHXg
[12:01] Oh, I just realized they're long because I'm including the path
[12:01] I mean, as long as it's in an easily parseable format, there's no need to change it really.
[12:02] Even if you want to do bulk checking with `sha1sum -c`, it would be easy to transform that into the relevant format.
[12:02] *** Ryz has quit IRC (Quit: Ping timeout (120 seconds))
[12:03] Well it took about 24 hours to hash all 20 items. So I have the values sitting in csv's for each one.
[12:03] Is it worth re-doing?
[12:03] Nah
[12:03] *** Ryz has joined #archiveteam-bs
[12:04] Let's take this to hackint -ot as before.
[12:07] JAA: archived thread shows
[12:08] So that works, good. :-)
[12:09] Current thread IDs appear to be around 3.9 million. I wonder why.
[12:09] According to the homepage, there are 2.6M normal threads and 3.4M archived ones.
[12:09] So that'd be 6M, not 3.9M.
[12:09] Or does the archived count include the non-archived ones?
[12:30] *** Raccoon has joined #archiveteam-bs
[12:36] JAA: I get something more like 3.1m archived and 400 000 active from randomly trying thread IDs
[12:37] https://transfer.notkiska.pw/x2MqA/SAForumRandomIDs.py
[12:39] (There's probably not much value in sharing a script that took about 6 minutes to write, but oh well)
[12:40] Cheers
[12:41] So I guess 3.4M might be archived plus active.
[12:42] That's what it looks like
[12:42] Not sure where the 2.6m comes from, though
[12:42] Yeah, that number is odd.
[12:43] It's also interesting how the number of posts per thread is massively higher for "archived" than "total" (using the homepage terms).
[12:56] Yes; assuming that the "total threads" and
[12:56] I'm accidentally pressing return a lot today
[12:59] ... "total posts" are indeed measuring related things, I can imagine that it might be related to how the 6 months is counted; or it could be a result of the count for live threads being "incomplete"
[13:04] *** HP_Archiv has quit IRC (Quit: Leaving)
[13:05] If it's not just a result of the ratio legitimately changing over time, for whatever reason
[13:20] *** BartoCH has quit IRC (Quit: WeeChat 2.8)
[13:30] It wouldn't surprise me if the thread statistics are just completely broken on the SA forums. The database is apparently a mess behind the scenes; for like a year the archiving system was broken so posts were just deleted instead of archived (that's why stuff stopped being "archived" around 2016); and reports create threads in mod forums so a huge number of threads aren't visible anyways
[13:31] Nice
[13:31] Living up to the name at least.
[13:33] *** BartoCH has joined #archiveteam-bs
[13:58] *** lunik13 has quit IRC (Read error: Connection reset by peer)
[13:59] *** lunik13 has joined #archiveteam-bs
[14:15] *** Zerote_ has joined #archiveteam-bs
[14:20] *** Zerote has quit IRC (Read error: Operation timed out)
[14:35] *** katocala has joined #archiveteam-bs
[14:38] *** katocala has quit IRC (Client Quit)
[14:46] *** katocala has joined #archiveteam-bs
[14:52] *** Zerote has joined #archiveteam-bs
[14:52] *** qw3rty has joined #archiveteam-bs
[14:53] *** bleb has joined #archiveteam-bs
[14:55] *** benjinsmi has joined #archiveteam-bs
[14:56] *** Dj-Wawa_ has joined #archiveteam-bs
[14:56] *** pie_[bnc] has joined #archiveteam-bs
[14:57] *** colona_ has joined #archiveteam-bs
[14:58] *** Maylay_ has joined #archiveteam-bs
[15:00] *** chfoo_ has joined #archiveteam-bs
[15:08] *** Zerote_ has quit IRC (se.hub efnet.deic.eu)
[15:08] *** qw3rty_ has quit IRC (se.hub efnet.deic.eu)
[15:08] *** dashcloud has quit IRC (se.hub efnet.deic.eu)
[15:08] *** Maylay has quit IRC (se.hub efnet.deic.eu)
[15:08] *** benjins has quit IRC (se.hub efnet.deic.eu)
[15:08] *** Dj-Wawa has quit IRC (se.hub efnet.deic.eu)
[15:08] *** colona has quit IRC (se.hub efnet.deic.eu)
[15:08] *** cm has quit IRC (se.hub efnet.deic.eu)
[15:08] *** kiskaWee has quit IRC (se.hub efnet.deic.eu)
[15:08] *** pie_ has quit IRC (se.hub efnet.deic.eu)
[15:08] *** chfoo has quit IRC (se.hub efnet.deic.eu)
[15:18] *** Arcorann has quit IRC (Read error: Connection reset by peer)
[15:23] *** bleb is now known as cm
[15:55] *** Ravenloft has quit IRC (Read error: Connection reset by peer)
[16:33] *** chfoo_ is now known as chfoo
[17:18] *** DogsRNice has joined #archiveteam-bs
[17:50] Any Projects around for SomethingAwful forums?
[17:56] *** benjinsmi has quit IRC (Read error: Operation timed out)
[18:00] *** nicolas17 has joined #archiveteam-bs
[18:40] *** lunik13 has quit IRC (Ping timeout: 265 seconds)
[18:40] *** maxfan8 has quit IRC (Ping timeout: 265 seconds)
[18:40] *** maxfan8 has joined #archiveteam-bs
[18:41] *** Jens has quit IRC (Ping timeout: 265 seconds)
[18:41] *** DogsRNice has quit IRC (Ping timeout: 265 seconds)
[18:43] JAA: OrIdow6 I have an account with access to the archives
[18:44] *** OrIdow6 has quit IRC (Ping timeout: 265 seconds)
[18:44] North of 4 Million posts are in the archive
[18:44] *** DogsRNice has joined #archiveteam-bs
[18:46] *** Tugboat has quit IRC (Ping timeout: 265 seconds)
[18:47] *** Jens has joined #archiveteam-bs
[18:56] *** OrIdow6 has joined #archiveteam-bs
[18:58] *** Jens has quit IRC (Ping timeout: 265 seconds)
[19:00] *** Jens has joined #archiveteam-bs
[19:01] *** OrIdow6 has quit IRC (Ping timeout: 265 seconds)
[19:01] *** Maylay has joined #archiveteam-bs
[19:02] *** Maylay_ has quit IRC (Ping timeout: 265 seconds)
[19:02] jrwr: s/posts/threads/ ? According to the homepage, there are ~180M posts.
[19:02] Correct, Threads
[19:03] If you want my login for indexing, I can let you borrow it for a minute (I do use this account)
[19:03] *** OrIdow6 has joined #archiveteam-bs
[19:04] *** OrIdow6 has quit IRC (se.hub irc.underworld.no)
[19:04] *** Jens has quit IRC (se.hub irc.underworld.no)
[19:04] *** VoynichCr has quit IRC (se.hub irc.underworld.no)
[19:04] *** pew has quit IRC (se.hub irc.underworld.no)
[19:04] *** arkiver has quit IRC (se.hub irc.underworld.no)
[19:04] *** i0npulse has quit IRC (se.hub irc.underworld.no)
[19:04] Really, the only thing it gives is access to the GBS Graveyard
[19:04] https://usercontent.irccloud-cdn.com/file/tWC79nqg/image.png
[19:06] Good news is, Posts in the GBS Graveyard can be accessed without a login
[19:06] https://forums.somethingawful.com/showthread.php?threadid=3790840
[19:06] Just not the index
[19:09] also, these forums are not public, nor are the threads and posts https://usercontent.irccloud-cdn.com/file/jQIZE334/image.png
[19:11] *** OrIdow6 has joined #archiveteam-bs
[19:11] *** VoynichCr has joined #archiveteam-bs
[19:11] *** i0npulse has joined #archiveteam-bs
[19:11] *** pew has joined #archiveteam-bs
[19:11] *** arkiver has joined #archiveteam-bs
[19:11] *** Tugboat has joined #archiveteam-bs
[19:27] *** lunik13 has joined #archiveteam-bs
[19:28] *** benjins has joined #archiveteam-bs
[20:02] Ryz: It looks like you've already archived all the links that ultramage in #archiveteam suggested. Should any of them be re-run?
[20:17] *** Raccoon has quit IRC (Ping timeout: 745 seconds)
[20:23] *** SmileyG has joined #archiveteam-bs
[20:27] *** Smiley has quit IRC (Read error: Operation timed out)
[20:36] *** mtntmnky has quit IRC (Remote host closed the connection)
[20:38] so one of my scans got added to retromags
[20:38] :-D
[20:38] *** mtntmnky has joined #archiveteam-bs
[20:40] *** mtntmnky has quit IRC (Remote host closed the connection)
[20:40] *** mtntmnky has joined #archiveteam-bs
[20:41] Some of them don't seem to have substantial enough updates, especially jobs that I ran on 2020 May rather than 2020 January
[20:41] It's a bit tricky though
[20:47] *** ultramage has joined #archiveteam-bs
[20:47] alrighty
[20:48] for the mobius ingame news pages, they've been posting updates until end of march. IA's snapshot of the root page ends in january, with the initial announcement.
[20:49] I'm kinda expecting that they'll do a post on the last day... but maybe not. who knows. Supposedly the whole team also worked on ff7r and they're now doing work on that instead.
[20:50] Can you link me the stuff in regards to the WBM timeframe? I did a rerun of http://www.terra-battle.com/ - which I last ran in AB on 2020 May; and http://information.mobiusfinalfantasy.com/ne - which was last run in AB on 2020 January
[20:50] jrwr: The GBS Graveyard (what's GBS?) only goes back to 2014 though.
[20:50] Idk about indexing, I'd probably just archive the whole thread ID range.
[20:50] I already ran https://mobiusff.square-enix-games.com/ on 2020 January, and there doesn't seem to be any significant changes
[20:51] okay
[20:51] I have not checked if they posted anything new there tbh. probably not. it's just a point of interest, since that site will most likely go offline.
[20:53] Ryz: the information site is the ingame news. it's fairly simple in structure, and if you check you'll see that there are a bunch of new items posted since january. This site will also go perma offline, probably once they get around to tearing everything down.
[20:56] Archiving http://information.mobiusfinalfantasy.com/ne should be small, as that's already done~
[20:56] whee
[20:56] https://mobiusff.square-enix-games.com/ ? There are no changes there, their 'News' section is http://information.mobiusfinalfantasy.com/ne - which is separate from the main domain
[20:57] okay
[21:00] *** dashcloud has joined #archiveteam-bs
[21:48] *** mtntmnky has quit IRC (Remote host closed the connection)
[21:49] *** mtntmnky has joined #archiveteam-bs
[22:04] *** Raccoon has joined #archiveteam-bs
[23:17] *** Arcorann has joined #archiveteam-bs
[23:58] *** BlueMax has joined #archiveteam-bs
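
Earlier in this log (12:36), the archived and active thread counts were estimated by randomly trying thread IDs. The linked script is not reproduced here, so the following is only a sketch of how such an estimate could work, under the assumption that IDs are sampled uniformly from 1 up to the highest known ID and the observed proportions are scaled up. The `classify` callback is hypothetical: a real one would have to request `showthread.php?threadid=N` and decide from the response whether the thread is archived, active, or missing.

```python
import random
from collections import Counter


def estimate_thread_counts(max_id, sample_size, classify, rng=None):
    """Estimate how many thread IDs in 1..max_id fall into each status.

    classify(thread_id) is a caller-supplied (hypothetical) function that
    returns a status label such as "archived", "active", or "missing".
    """
    rng = rng or random.Random()
    # Draw distinct IDs uniformly at random from the whole ID range.
    sampled_ids = rng.sample(range(1, max_id + 1), sample_size)
    tally = Counter(classify(thread_id) for thread_id in sampled_ids)
    # Scale each observed proportion up to the size of the full ID range.
    return {status: count / sample_size * max_id for status, count in tally.items()}
```

With a maximum ID around 3.9 million, as mentioned in the log, a few thousand sampled IDs would already give a rough split between the categories, though the accuracy depends on the sample size.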