[00:00] Yes
[00:00] (For context, this is about https://www.htforum.com/forum/ shutting down on the 28th.)
[00:02] I didn't know it was possible to save a forum without going to cpanel and creating a database file, of course by the admin itself if he planned to move elsewhere. The software I tried here can't save anything properly. This forum is one I haven't checked in years and discovered it became read-mode recently: http://theabsolute.net/phpBB/ - How long until even that is unavailable for the public? This is going on everywhere.
[00:03] I mean, most online forums are dying...
[00:04] Yeah, as I said, there is an ArchiveTeam project for archiving all forums. But there's also so much other stuff dying all the time that not much work has happened on actually doing that.
[00:05] *** bsmith093 has joined #archiveteam-bs
[00:07] Felisto: Thanks, I've added http://theabsolute.net/phpBB/ to ArchiveBot, so it should all be in the Wayback Machine soon.
[00:08] The worst part about Web Archive (from Internet Archive) is that I know it has saved a few specific threads from a forum (the threads are not available anymore, their oldest backup is years ahead), but I can't tell which ones Web Archive saved and I can't spend 2 years clicking on each link and finding out 99% weren't archived.
[00:10] On which forums?
[00:10] You can do a wildcard search to see everything that's indexed.
[00:11] E.g. https://web.archive.org/web/*/http://theabsolute.net/phpBB/*
[00:12] *** BlueMax has joined #archiveteam-bs
[00:21] This is a good example in which we can't tell what has been archived: https://web.archive.org/web/20040929084827/http://www.cinemaemcena.com.br/forum/forum_topics.asp?FID=9 - inside this link most have not been saved, but this one was: https://web.archive.org/web/20040727141039/http://www.cinemaemcena.com.br/forum/forum_posts.asp?TID=44&PN=1
[00:24] Felisto: Yes, there is no easy way to tell which of the links on a page have been archived.
But you can get a list of *all* archived topics using https://web.archive.org/web/*/http://www.cinemaemcena.com.br/forum/forum_posts.asp?TID=* [00:43] thanks a lot, this was exactly what I was looking for _o/ [00:43] *** Felisto has left [01:49] *** systwi_ has joined #archiveteam-bs [01:57] *** systwi has quit IRC (Read error: Operation timed out) [02:38] *** lennier2 has joined #archiveteam-bs [02:38] *** voltagex has quit IRC (Quit: ZNC 1.7.2+deb3 - https://znc.in) [02:38] *** apache2_ has quit IRC (Read error: Connection reset by peer) [02:38] *** nepeat has quit IRC (Read error: Connection reset by peer) [02:38] *** BlueMax has quit IRC (Read error: Connection reset by peer) [02:38] *** apache2 has joined #archiveteam-bs [02:38] *** nepeat_ has joined #archiveteam-bs [02:38] *** voltagex has joined #archiveteam-bs [02:39] *** BlueMaxim has joined #archiveteam-bs [02:39] *** DLoader_ has joined #archiveteam-bs [02:42] *** Jon- has joined #archiveteam-bs [02:42] *** ivan has quit IRC (Write error: Connection reset by peer) [02:42] *** ivan has joined #archiveteam-bs [02:49] *** lennier1 has quit IRC (Ping timeout: 745 seconds) [02:49] *** DLoader has quit IRC (Ping timeout: 745 seconds) [02:49] *** DLoader_ is now known as DLoader [02:49] *** lennier2 is now known as lennier1 [02:51] *** jmtd has quit IRC (Ping timeout: 745 seconds) [03:12] *** qw3rty_ has joined #archiveteam-bs [03:20] *** qw3rty__ has quit IRC (Read error: Operation timed out) [04:02] *** trc__ has quit IRC (Read error: Connection reset by peer) [04:04] *** qw3rty has joined #archiveteam-bs [04:05] *** britmob has joined #archiveteam-bs [04:08] *** qw3rty_ has quit IRC (Read error: Operation timed out) [04:09] *** trc has joined #archiveteam-bs [04:12] *** britm0b has quit IRC (Ping timeout: 622 seconds) [04:14] *** lennier2 has joined #archiveteam-bs [04:16] *** scorche has joined #archiveteam-bs [04:16] *** lennier1 has quit IRC (Ping timeout: 260 seconds) [04:16] *** scorche` has quit IRC 
(Ping timeout: 260 seconds) [04:16] *** lennier2 is now known as lennier1 [04:16] *** ndiddy has quit IRC (Ping timeout: 260 seconds) [04:17] *** ndiddy has joined #archiveteam-bs [06:12] *** lennier2 has joined #archiveteam-bs [06:23] *** lennier2_ has joined #archiveteam-bs [06:23] *** lennier1 has quit IRC (Ping timeout: 745 seconds) [06:23] *** lennier2_ is now known as lennier1 [06:26] *** lennier2 has quit IRC (Ping timeout: 265 seconds) [06:32] does archive bot handle having a custom cookie file? for like when its behind a login wall? [06:32] that would help for things like this where it just needs an account to get past a login wall and someone has one [06:35] *** lennier1 has quit IRC (Ping timeout: 496 seconds) [06:54] mgrandi: last I knew, no [06:55] Hmm, might make it hard to archive that site then [07:01] It's a rare enough situation [07:01] Site is too large to archive via AB in 4 days anyway [07:22] Just for raw text? Maybe someone can do it with just wget-at [07:25] *** HP_Archiv has joined #archiveteam-bs [07:46] *** HP_Archiv has quit IRC (Quit: Leaving) [08:15] *** lennier1 has joined #archiveteam-bs [08:36] *** systwi has joined #archiveteam-bs [08:44] *** systwi_ has quit IRC (Ping timeout: 622 seconds) [08:45] *** jshoard has joined #archiveteam-bs [10:06] *** VerifiedJ has joined #archiveteam-bs [10:14] *** diggan has quit IRC (Read error: Connection reset by peer) [10:14] *** riking_ has quit IRC (Read error: Connection reset by peer) [10:14] *** ftl has quit IRC (Read error: Connection reset by peer) [10:14] *** DrasticAc has quit IRC (Read error: Connection reset by peer) [10:14] *** horkermon has quit IRC (Read error: Connection reset by peer) [10:14] *** pnJay has quit IRC (Read error: Connection reset by peer) [10:15] *** mgrytbak has quit IRC (Read error: Connection reset by peer) [10:15] *** pnJay has joined #archiveteam-bs [10:15] *** riking_ has joined #archiveteam-bs [10:15] *** mgrytbak has joined #archiveteam-bs [10:15] *** ftl 
has joined #archiveteam-bs [10:16] *** diggan has joined #archiveteam-bs [10:18] *** DrasticAc has joined #archiveteam-bs [10:19] *** horkermon has joined #archiveteam-bs [10:42] *** BlueMaxim has quit IRC (Quit: Leaving) [10:49] *** jesse-s has quit IRC (Read error: Connection reset by peer) [10:51] *** Ryz has quit IRC (Remote host closed the connection) [10:52] *** Ryz has joined #archiveteam-bs [10:52] *** kiska1825 has joined #archiveteam-bs [10:53] *** svchfoo1 sets mode: +o Ryz [10:59] *** tchaypo__ has joined #archiveteam-bs [11:02] *** fallenoak has quit IRC (Read error: Connection timed out) [11:05] *** mgrandi has quit IRC (Ping timeout: 1230 seconds) [11:05] *** mgrandi has joined #archiveteam-bs [11:05] *** tchaypo_ has quit IRC (Ping timeout: 1230 seconds) [11:15] *** tchaypo__ has quit IRC (Read error: Connection timed out) [11:16] *** tchaypo__ has joined #archiveteam-bs [11:16] *** jesse-s has joined #archiveteam-bs [11:31] *** HP_Archiv has joined #archiveteam-bs [11:33] *** betamax_ is now known as betamax [11:35] *** HP_Archiv has quit IRC (Client Quit) [12:09] *** trc has quit IRC (Read error: Connection reset by peer) [12:31] mgrandi: AB is slow. wget-at itself is in fact even slower because there are no concurrent requests. It might be feasible with AB by running multiple jobs across several machines or something like that. Otherwise, a DPoS project or qwarc would do. 
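The point in the 12:31 message, that a crawler issuing one request at a time is slower than one with concurrent requests, can be illustrated with a toy sketch. This is not how ArchiveBot, wget-at, or qwarc are implemented; `fake_fetch`, `crawl_sequential`, `crawl_concurrent`, and the example URLs are all hypothetical, and the "request" is only a sleep that stands in for network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.05):
    """Hypothetical stand-in for an HTTP request: sleeps to imitate latency."""
    time.sleep(delay)
    return (url, 200)


def crawl_sequential(urls):
    """One request at a time: total time is roughly len(urls) * delay."""
    return [fake_fetch(u) for u in urls]


def crawl_concurrent(urls, workers=8):
    """Several requests in flight at once; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_fetch, urls))


if __name__ == "__main__":
    # Made-up URLs; nothing is actually downloaded.
    urls = ["https://forum.example.invalid/thread/%d" % i for i in range(16)]
    for crawl in (crawl_sequential, crawl_concurrent):
        start = time.perf_counter()
        crawl(urls)
        print(crawl.__name__, round(time.perf_counter() - start, 2), "s")
```

With a deadline a few days away, the request rate a site tolerates is usually the real limit, so a worker count like the one above is something to tune against the server rather than a fixed recommendation.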
[12:32] *** Joseph_ has joined #archiveteam-bs [12:45] *** Arcorann has quit IRC (Quit: Leaving) [12:45] *** Arcorann has joined #archiveteam-bs [13:18] *** lunik1 has joined #archiveteam-bs [13:22] *** hook54321 has joined #archiveteam-bs [13:22] *** svchfoo1 sets mode: +o hook54321 [14:33] *** Joseph_ has quit IRC (Quit: Leaving) [14:33] *** Joseph_ has joined #archiveteam-bs [14:34] *** Joseph_ has quit IRC (Read error: Connection reset by peer) [15:03] *** trc has joined #archiveteam-bs [16:28] *** Arcorann has quit IRC (Read error: Connection reset by peer) [16:35] *** lennier2 has joined #archiveteam-bs [16:37] *** Mayonaise has quit IRC (Read error: Operation timed out) [16:37] *** SmileyG has joined #archiveteam-bs [16:37] *** paul2520 has quit IRC (Read error: Operation timed out) [16:38] *** Wingy has quit IRC (Read error: Operation timed out) [16:38] *** robogoat has quit IRC (Read error: Operation timed out) [16:38] *** robogoat has joined #archiveteam-bs [16:38] *** dxrt_ has quit IRC (Read error: Operation timed out) [16:39] *** jshoard_ has joined #archiveteam-bs [16:39] *** Jake has quit IRC (Read error: Operation timed out) [16:39] *** Mayonaise has joined #archiveteam-bs [16:40] *** Smiley has quit IRC (Read error: Operation timed out) [16:41] *** jshoard has quit IRC (Read error: Operation timed out) [16:42] *** lennier1 has quit IRC (Read error: Operation timed out) [16:42] *** lennier2 is now known as lennier1 [16:43] *** sembiance has quit IRC (Read error: Connection reset by peer) [16:45] *** betamax_ has joined #archiveteam-bs [16:48] *** systwi has quit IRC (Ping timeout: 622 seconds) [16:48] *** Jake has joined #archiveteam-bs [16:48] *** sembiance has joined #archiveteam-bs [16:49] *** betamax has quit IRC (Ping timeout: 622 seconds) [16:49] *** endrift has quit IRC (Ping timeout: 622 seconds) [16:49] *** systwi has joined #archiveteam-bs [16:52] *** dxrt_ has joined #archiveteam-bs [16:52] *** dxrt sets mode: +o dxrt_ [16:53] *** endrift 
has joined #archiveteam-bs [17:05] *** Wingy has joined #archiveteam-bs [17:10] *** Ryz has quit IRC (Read error: Connection reset by peer) [17:12] *** Ryz4 has joined #archiveteam-bs [17:12] *** Ryz4 has quit IRC (Excess Flood) [17:13] *** Ryz has joined #archiveteam-bs [17:13] *** svchfoo1 sets mode: +o Ryz [17:16] *** cascode has joined #archiveteam-bs [17:24] *** schbirid has joined #archiveteam-bs [17:27] *** paul2520 has joined #archiveteam-bs [18:12] *** HP_Archiv has joined #archiveteam-bs [18:13] *** HP_Archiv has quit IRC (Client Quit) [18:32] *** ave_9 has joined #archiveteam-bs [18:38] *** SLC has joined #archiveteam-bs [18:40] The 1,001 number came from me doing "ls | wc -l" without noticing the sqlite object [18:40] I was kinda suspecting that :-) [18:41] According to my deduplicator thingy, I have 533 items with files that were not uploaded previously. [18:42] Somewhere here, I'm going to screw up, but we'll have lots of new ones get in properly. [18:42] *** ave_ has quit IRC (Ping timeout: 745 seconds) [18:42] *** ave_9 is now known as ave_ [18:43] https://archive.org/details/uta_Xevious_1986_U.S._Gold_4891 is my test. it currently doesn't work. It will eventually work. [18:50] seems to be working now :) [18:50] *** cascode has quit IRC (Read error: Operation timed out) [18:58] Yeah! [19:02] *** jshoard__ has joined #archiveteam-bs [19:02] *** jshoard__ has quit IRC (Remote host closed the connection!) [19:02] *** antomati_ has joined #archiveteam-bs [19:03] *** Stilett0 has joined #archiveteam-bs [19:03] *** bleb has joined #archiveteam-bs [19:03] Obviously, it doesn't have the settings/style of the rest of the work. [19:03] *** tonsofpcs has quit IRC (Read error: Operation timed out) [19:03] *** Frogging has quit IRC (Read error: Operation timed out) [19:03] *** Frogging has joined #archiveteam-bs [19:03] *** nyany has quit IRC (Read error: Operation timed out) [19:03] But I am 1. going to upload everything absolutely new 2. 
Figure out what you changed in the ones that aren't 3. Make them all function like the first set. [19:03] *** prq has quit IRC (Read error: Operation timed out) [19:03] *** TC01 has quit IRC (Write error: Broken pipe) [19:03] *** underscor has quit IRC (Write error: Broken pipe) [19:03] *** superkuh has quit IRC (Read error: Operation timed out) [19:03] *** mtntmnky has quit IRC (Read error: Operation timed out) [19:03] *** yano_ has joined #archiveteam-bs [19:03] *** underscor has joined #archiveteam-bs [19:04] *** jshoard has joined #archiveteam-bs [19:04] *** revi has quit IRC (Ping timeout: 260 seconds) [19:04] *** Pixi has joined #archiveteam-bs [19:04] Like, you don't need to do things. And I linked people the nightmarish main archive file primarily and linked to 2.0 from 1.0 like 'go to this one' [19:04] *** jrwr has quit IRC (Ping timeout: 260 seconds) [19:05] Xevious is running in my other window, hi xevious [19:05] *** phirephly has quit IRC (Read error: Operation timed out) [19:05] *** Maylay has quit IRC (Read error: Operation timed out) [19:05] *** Stiletto has quit IRC (Ping timeout: 376 seconds) [19:05] *** phirephly has joined #archiveteam-bs [19:05] *** twigfoot has quit IRC (Read error: Operation timed out) [19:06] *** DigiDigi has quit IRC (Read error: Operation timed out) [19:06] *** sivoais has quit IRC (Read error: Operation timed out) [19:06] SketchCow: the .tap images didn't change at all, but some may have gotten index files added. 
There is also a slight chance all pictures will have a different hash because the conversion of pictures is happening every time the archive is exported and if ImageMagick-version changes it may produce different binaries (or there's been applied a better color profile etcetc) [19:06] *** cm has quit IRC (Read error: Operation timed out) [19:06] *** bleb is now known as cm [19:06] *** sivoais has joined #archiveteam-bs [19:06] *** nico_32_ has quit IRC (Read error: Operation timed out) [19:06] *** pie_[bnc] has quit IRC (Read error: Operation timed out) [19:06] *** Larsenv has quit IRC (Read error: Operation timed out) [19:06] *** atphoenix has quit IRC (Write error: Broken pipe) [19:06] *** svchfoo1 has quit IRC (Read error: Operation timed out) [19:06] *** pie_ has joined #archiveteam-bs [19:06] *** Igloo has quit IRC (Read error: Operation timed out) [19:06] *** dxrt has quit IRC (Read error: Operation timed out) [19:06] *** jshoard_ has quit IRC (Ping timeout: 376 seconds) [19:06] *** antomatic has quit IRC (Read error: Operation timed out) [19:06] *** Dj-Wawa has quit IRC (Write error: Broken pipe) [19:06] *** yano has quit IRC (Read error: Operation timed out) [19:07] *** Pixi` has quit IRC (Read error: Operation timed out) [19:08] *** Yurume has quit IRC (Read error: Operation timed out) [19:08] *** trc has quit IRC (Read error: Operation timed out) [19:08] *** twigfoot has joined #archiveteam-bs [19:09] *** Larsenv has joined #archiveteam-bs [19:10] Yeah, I get it. [19:11] Considering it went from 1,000 to 533, I expect most things didn't change. 
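A rough sketch of the kind of check discussed above: deciding which items contain files that were not uploaded before by comparing MD5 hashes. This is not the actual "deduplicator thingy"; the directory layout (one subdirectory per item), the function names, and the choice to hash only `.tap` files are assumptions for illustration. Restricting the comparison to `.tap` files reflects the remark that pictures may hash differently on every export, and skipping non-directories avoids counting a stray sqlite file as an item.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def items_with_new_files(items_dir, known_md5s, suffix=".tap"):
    """List item directories holding at least one file with an unseen hash.

    Assumed layout: items_dir contains one subdirectory per item.
    known_md5s is a set of hex digests of files already uploaded.
    """
    new_items = []
    for item in sorted(Path(items_dir).iterdir()):
        if not item.is_dir():
            continue  # e.g. a sqlite database sitting next to the items
        hashes = {md5_of(p) for p in item.rglob("*" + suffix)}
        if hashes - known_md5s:
            new_items.append(item.name)
    return new_items
```

An item with no `.tap` file at all is treated as "nothing new" here, which may or may not be the desired behaviour for a real collection.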
[19:11] *** Yurume has joined #archiveteam-bs [19:12] *** nico_32 has joined #archiveteam-bs [19:17] *** tonsofpcs has joined #archiveteam-bs [19:18] *** superkuh has joined #archiveteam-bs [19:18] *** Dj-Wawa has joined #archiveteam-bs [19:21] *** TC01 has joined #archiveteam-bs [19:22] *** nyany has joined #archiveteam-bs [19:22] *** svchfoo1 has joined #archiveteam-bs [19:22] *** DigiDigi has joined #archiveteam-bs [19:23] *** revi has joined #archiveteam-bs [19:24] *** trc has joined #archiveteam-bs [19:24] *** jrwr has joined #archiveteam-bs [19:24] *** prq has joined #archiveteam-bs [19:24] *** Maylay has joined #archiveteam-bs [19:24] *** Maylay has quit IRC (Remote host closed the connection!) [19:28] *** Igloo has joined #archiveteam-bs [19:29] I've verified all 532 remaining have a .tap file in them, meaning they're md5 different. [19:31] *** mtntmnky has joined #archiveteam-bs [19:31] that sounds about right... [19:33] Now I have to hack up something going "These are already uploaded in previous ones" [19:38] *** betamax_ is now known as betamax [19:42] Verified, I can do a lot of this. [19:42] I suspect I did some custom one-offs [19:42] *** VerifiedJ has quit IRC (Quit: Leaving) [19:53] Verified all 533 are new [20:06] *** schbirid has quit IRC (Quit: Leaving) [20:11] Anyway. In summary, SLC, it's going to all be fine. Do you use Discord? I have a discord. [20:11] All you fucks, I have a discord [20:11] https://discord.gg/UKQUvq [20:23] I have discord but usually favor IRC... I'll join in, though. 
[20:28] Ewww discord :P [20:29] ha [20:29] Wait [20:29] Checking my wallet [20:29] Oh here it is [20:29] Fuck off [20:29] On it [20:35] *** DogsRNice has joined #archiveteam-bs [20:42] *** bsmith093 has quit IRC (Ping timeout: 272 seconds) [20:57] *** bsmith093 has joined #archiveteam-bs [21:01] *** RichardG has quit IRC (Read error: Connection reset by peer) [21:06] *** RichardG has joined #archiveteam-bs [21:06] *** ave_ has quit IRC (Read error: Connection reset by peer) [21:07] *** ave_ has joined #archiveteam-bs [21:27] *** Mateon1 has quit IRC (Quit: Mateon1) [21:28] *** Mateon1 has joined #archiveteam-bs [21:40] *** DLoader_ has joined #archiveteam-bs [21:47] *** legoktm has joined #archiveteam-bs [21:47] *** duh has quit IRC (Read error: Connection reset by peer) [21:51] *** DLoader has quit IRC (Ping timeout: 745 seconds) [21:51] *** DLoader_ is now known as DLoader [22:26] *** lennier2 has joined #archiveteam-bs [22:29] *** lennier1 has quit IRC (Ping timeout: 272 seconds) [22:29] *** lennier2 is now known as lennier1 [23:01] *** Stilett0 is now known as Stiletto [23:06] *** asdf01011 has joined #archiveteam-bs [23:41] *** benjinsmi has quit IRC (Read error: Connection reset by peer) [23:44] *** benjinsmi has joined #archiveteam-bs [23:44] *** jshoard has quit IRC (Quit: Leaving)
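The wildcard search suggested near the start of this log (https://web.archive.org/web/*/...*) is a browser view. For a machine-readable list of captures, one option is the Wayback Machine's CDX search endpoint. The sketch below only builds the query URL and the replay URL; the helper names are made up for illustration, nobody in the channel said they used this, and whether a prefix match behaves as hoped for URLs containing a query string such as `?TID=` has not been checked here.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_prefix_query(url_prefix, limit=None):
    """Build a CDX query URL listing captures whose URL starts with url_prefix."""
    params = {
        "url": url_prefix,
        "matchType": "prefix",       # match everything under the prefix
        "output": "json",            # rows come back as JSON arrays
        "fl": "timestamp,original",  # only the fields needed for replay links
        "collapse": "urlkey",        # one row per distinct URL
    }
    if limit is not None:
        params["limit"] = str(limit)
    return CDX_ENDPOINT + "?" + urlencode(params)


def replay_url(timestamp, original):
    """Turn one (timestamp, original URL) row into a Wayback replay link."""
    return "https://web.archive.org/web/" + timestamp + "/" + original


if __name__ == "__main__":
    # Forum URL taken from the log; paste the printed URL into a browser or HTTP client.
    print(cdx_prefix_query("http://www.cinemaemcena.com.br/forum/forum_posts.asp?TID=", limit=50))
```

Each returned row can be passed to `replay_url` to get a link of the same shape as the archived thread linked earlier in the log.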