Time |
Nickname |
Message |
00:00
🔗
|
OrIdow6 |
Yes] |
00:00
🔗
|
JAA |
(For context, this is about https://www.htforum.com/forum/ shutting down on the 28th.) |
00:02
🔗
|
Felisto |
I didn't know it was possible to save a forum without going into cPanel and creating a database file, which of course only the admin himself would do if he planned to move elsewhere. The software I tried here can't save anything properly. This forum is one I hadn't checked in years, and I discovered it recently became read-only: http://theabsolute.net/phpBB/ - How long until even that is unavailable to the public? This is going on everywhere. |
00:03
🔗
|
Felisto |
I mean, most online forums are dying... |
00:04
🔗
|
JAA |
Yeah, as I said, there is an ArchiveTeam project for archiving all forums. But there's also so much other stuff dying all the time that not much work has happened on actually doing that. |
00:05
🔗
|
|
bsmith093 has joined #archiveteam-bs |
00:07
🔗
|
JAA |
Felisto: Thanks, I've added http://theabsolute.net/phpBB/ to ArchiveBot, so it should all be in the Wayback Machine soon. |
00:08
🔗
|
Felisto |
The worst part about the Web Archive (from the Internet Archive) is that I know it has saved a few specific threads from a forum (the threads are not available anymore; the forum's oldest backup is from years later), but I can't tell which ones the Web Archive saved, and I can't spend 2 years clicking on each link only to find out 99% weren't archived. |
00:10
🔗
|
JAA |
On which forums? |
00:10
🔗
|
JAA |
You can do a wildcard search to see everything that's indexed. |
00:11
🔗
|
JAA |
E.g. https://web.archive.org/web/*/http://theabsolute.net/phpBB/* |
00:12
🔗
|
|
BlueMax has joined #archiveteam-bs |
00:21
🔗
|
Felisto |
This is a good example of a case where we can't tell what has been archived: https://web.archive.org/web/20040929084827/http://www.cinemaemcena.com.br/forum/forum_topics.asp?FID=9 - most of the links on this page have not been saved, but this one was: https://web.archive.org/web/20040727141039/http://www.cinemaemcena.com.br/forum/forum_posts.asp?TID=44&PN=1 |
00:24
🔗
|
JAA |
Felisto: Yes, there is no easy way to tell which of the links on a page have been archived. But you can get a list of *all* archived topics using https://web.archive.org/web/*/http://www.cinemaemcena.com.br/forum/forum_posts.asp?TID=* |
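The wildcard search above can also be approximated from a script. This is only a minimal sketch, assuming the Wayback Machine's CDX server endpoint with a prefix match; the helper names (`build_cdx_query`, `parse_cdx_json`, `list_archived`) are hypothetical and were not used in the channel, and how query strings such as `?TID=` are canonicalised for prefix matching may differ from the web interface.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine CDX server.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(url_prefix, limit=1000):
    """Build a CDX query URL listing captures whose URL starts with url_prefix."""
    params = {
        "url": url_prefix,
        "matchType": "prefix",       # everything under the given prefix
        "output": "json",            # first row of the response is the header
        "fl": "original,timestamp",  # only return these fields
        "collapse": "urlkey",        # one row per distinct URL
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_json(body):
    """Turn the JSON response (header row + data rows) into a list of dicts."""
    rows = json.loads(body) if body.strip() else []
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def list_archived(url_prefix, limit=1000):
    """Fetch and parse the list of archived URLs under url_prefix (needs network)."""
    with urlopen(build_cdx_query(url_prefix, limit), timeout=60) as resp:
        return parse_cdx_json(resp.read().decode("utf-8"))


# Example (performs a network request, so it is left commented out):
# for row in list_archived("www.cinemaemcena.com.br/forum/forum_posts.asp?TID="):
#     print(row["timestamp"], row["original"])
```

Comparing the returned `original` URLs against the links on an index page is one way to see which topics have a capture without clicking through each one.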
00:43
🔗
|
Felisto |
thanks a lot, this was exactly what I was looking for _o/ |
00:43
🔗
|
|
Felisto has left |
01:49
🔗
|
|
systwi_ has joined #archiveteam-bs |
01:57
🔗
|
|
systwi has quit IRC (Read error: Operation timed out) |
02:38
🔗
|
|
lennier2 has joined #archiveteam-bs |
02:38
🔗
|
|
voltagex has quit IRC (Quit: ZNC 1.7.2+deb3 - https://znc.in) |
02:38
🔗
|
|
apache2_ has quit IRC (Read error: Connection reset by peer) |
02:38
🔗
|
|
nepeat has quit IRC (Read error: Connection reset by peer) |
02:38
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
02:38
🔗
|
|
apache2 has joined #archiveteam-bs |
02:38
🔗
|
|
nepeat_ has joined #archiveteam-bs |
02:38
🔗
|
|
voltagex has joined #archiveteam-bs |
02:39
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
02:39
🔗
|
|
DLoader_ has joined #archiveteam-bs |
02:42
🔗
|
|
Jon- has joined #archiveteam-bs |
02:42
🔗
|
|
ivan has quit IRC (Write error: Connection reset by peer) |
02:42
🔗
|
|
ivan has joined #archiveteam-bs |
02:49
🔗
|
|
lennier1 has quit IRC (Ping timeout: 745 seconds) |
02:49
🔗
|
|
DLoader has quit IRC (Ping timeout: 745 seconds) |
02:49
🔗
|
|
DLoader_ is now known as DLoader |
02:49
🔗
|
|
lennier2 is now known as lennier1 |
02:51
🔗
|
|
jmtd has quit IRC (Ping timeout: 745 seconds) |
03:12
🔗
|
|
qw3rty_ has joined #archiveteam-bs |
03:20
🔗
|
|
qw3rty__ has quit IRC (Read error: Operation timed out) |
04:02
🔗
|
|
trc__ has quit IRC (Read error: Connection reset by peer) |
04:04
🔗
|
|
qw3rty has joined #archiveteam-bs |
04:05
🔗
|
|
britmob has joined #archiveteam-bs |
04:08
🔗
|
|
qw3rty_ has quit IRC (Read error: Operation timed out) |
04:09
🔗
|
|
trc has joined #archiveteam-bs |
04:12
🔗
|
|
britm0b has quit IRC (Ping timeout: 622 seconds) |
04:14
🔗
|
|
lennier2 has joined #archiveteam-bs |
04:16
🔗
|
|
scorche has joined #archiveteam-bs |
04:16
🔗
|
|
lennier1 has quit IRC (Ping timeout: 260 seconds) |
04:16
🔗
|
|
scorche` has quit IRC (Ping timeout: 260 seconds) |
04:16
🔗
|
|
lennier2 is now known as lennier1 |
04:16
🔗
|
|
ndiddy has quit IRC (Ping timeout: 260 seconds) |
04:17
🔗
|
|
ndiddy has joined #archiveteam-bs |
06:12
🔗
|
|
lennier2 has joined #archiveteam-bs |
06:23
🔗
|
|
lennier2_ has joined #archiveteam-bs |
06:23
🔗
|
|
lennier1 has quit IRC (Ping timeout: 745 seconds) |
06:23
🔗
|
|
lennier2_ is now known as lennier1 |
06:26
🔗
|
|
lennier2 has quit IRC (Ping timeout: 265 seconds) |
06:32
🔗
|
mgrandi |
does ArchiveBot handle having a custom cookie file? like for when a site is behind a login wall? |
06:32
🔗
|
mgrandi |
that would help for things like this where it just needs an account to get past a login wall and someone has one |
06:35
🔗
|
|
lennier1 has quit IRC (Ping timeout: 496 seconds) |
06:54
🔗
|
Doran |
mgrandi: last I knew, no |
06:55
🔗
|
mgrandi |
Hmm, might make it hard to archive that site then |
07:01
🔗
|
OrIdow6 |
It's a rare enough situation |
07:01
🔗
|
OrIdow6 |
Site is too large to archive via AB in 4 days anyway |
07:22
🔗
|
mgrandi |
Just for raw text? Maybe someone can do it with just wget-at |
07:25
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
07:46
🔗
|
|
HP_Archiv has quit IRC (Quit: Leaving) |
08:15
🔗
|
|
lennier1 has joined #archiveteam-bs |
08:36
🔗
|
|
systwi has joined #archiveteam-bs |
08:44
🔗
|
|
systwi_ has quit IRC (Ping timeout: 622 seconds) |
08:45
🔗
|
|
jshoard has joined #archiveteam-bs |
10:06
🔗
|
|
VerifiedJ has joined #archiveteam-bs |
10:14
🔗
|
|
diggan has quit IRC (Read error: Connection reset by peer) |
10:14
🔗
|
|
riking_ has quit IRC (Read error: Connection reset by peer) |
10:14
🔗
|
|
ftl has quit IRC (Read error: Connection reset by peer) |
10:14
🔗
|
|
DrasticAc has quit IRC (Read error: Connection reset by peer) |
10:14
🔗
|
|
horkermon has quit IRC (Read error: Connection reset by peer) |
10:14
🔗
|
|
pnJay has quit IRC (Read error: Connection reset by peer) |
10:15
🔗
|
|
mgrytbak has quit IRC (Read error: Connection reset by peer) |
10:15
🔗
|
|
pnJay has joined #archiveteam-bs |
10:15
🔗
|
|
riking_ has joined #archiveteam-bs |
10:15
🔗
|
|
mgrytbak has joined #archiveteam-bs |
10:15
🔗
|
|
ftl has joined #archiveteam-bs |
10:16
🔗
|
|
diggan has joined #archiveteam-bs |
10:18
🔗
|
|
DrasticAc has joined #archiveteam-bs |
10:19
🔗
|
|
horkermon has joined #archiveteam-bs |
10:42
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
10:49
🔗
|
|
jesse-s has quit IRC (Read error: Connection reset by peer) |
10:51
🔗
|
|
Ryz has quit IRC (Remote host closed the connection) |
10:52
🔗
|
|
Ryz has joined #archiveteam-bs |
10:52
🔗
|
|
kiska1825 has joined #archiveteam-bs |
10:53
🔗
|
|
svchfoo1 sets mode: +o Ryz |
10:59
🔗
|
|
tchaypo__ has joined #archiveteam-bs |
11:02
🔗
|
|
fallenoak has quit IRC (Read error: Connection timed out) |
11:05
🔗
|
|
mgrandi has quit IRC (Ping timeout: 1230 seconds) |
11:05
🔗
|
|
mgrandi has joined #archiveteam-bs |
11:05
🔗
|
|
tchaypo_ has quit IRC (Ping timeout: 1230 seconds) |
11:15
🔗
|
|
tchaypo__ has quit IRC (Read error: Connection timed out) |
11:16
🔗
|
|
tchaypo__ has joined #archiveteam-bs |
11:16
🔗
|
|
jesse-s has joined #archiveteam-bs |
11:31
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
11:33
🔗
|
|
betamax_ is now known as betamax |
11:35
🔗
|
|
HP_Archiv has quit IRC (Client Quit) |
12:09
🔗
|
|
trc has quit IRC (Read error: Connection reset by peer) |
12:31
🔗
|
JAA |
mgrandi: AB is slow. wget-at itself is in fact even slower because there are no concurrent requests. It might be feasible with AB by running multiple jobs across several machines or something like that. Otherwise, a DPoS project or qwarc would do. |
12:32
🔗
|
|
Joseph_ has joined #archiveteam-bs |
12:45
🔗
|
|
Arcorann has quit IRC (Quit: Leaving) |
12:45
🔗
|
|
Arcorann has joined #archiveteam-bs |
13:18
🔗
|
|
lunik1 has joined #archiveteam-bs |
13:22
🔗
|
|
hook54321 has joined #archiveteam-bs |
13:22
🔗
|
|
svchfoo1 sets mode: +o hook54321 |
14:33
🔗
|
|
Joseph_ has quit IRC (Quit: Leaving) |
14:33
🔗
|
|
Joseph_ has joined #archiveteam-bs |
14:34
🔗
|
|
Joseph_ has quit IRC (Read error: Connection reset by peer) |
15:03
🔗
|
|
trc has joined #archiveteam-bs |
16:28
🔗
|
|
Arcorann has quit IRC (Read error: Connection reset by peer) |
16:35
🔗
|
|
lennier2 has joined #archiveteam-bs |
16:37
🔗
|
|
Mayonaise has quit IRC (Read error: Operation timed out) |
16:37
🔗
|
|
SmileyG has joined #archiveteam-bs |
16:37
🔗
|
|
paul2520 has quit IRC (Read error: Operation timed out) |
16:38
🔗
|
|
Wingy has quit IRC (Read error: Operation timed out) |
16:38
🔗
|
|
robogoat has quit IRC (Read error: Operation timed out) |
16:38
🔗
|
|
robogoat has joined #archiveteam-bs |
16:38
🔗
|
|
dxrt_ has quit IRC (Read error: Operation timed out) |
16:39
🔗
|
|
jshoard_ has joined #archiveteam-bs |
16:39
🔗
|
|
Jake has quit IRC (Read error: Operation timed out) |
16:39
🔗
|
|
Mayonaise has joined #archiveteam-bs |
16:40
🔗
|
|
Smiley has quit IRC (Read error: Operation timed out) |
16:41
🔗
|
|
jshoard has quit IRC (Read error: Operation timed out) |
16:42
🔗
|
|
lennier1 has quit IRC (Read error: Operation timed out) |
16:42
🔗
|
|
lennier2 is now known as lennier1 |
16:43
🔗
|
|
sembiance has quit IRC (Read error: Connection reset by peer) |
16:45
🔗
|
|
betamax_ has joined #archiveteam-bs |
16:48
🔗
|
|
systwi has quit IRC (Ping timeout: 622 seconds) |
16:48
🔗
|
|
Jake has joined #archiveteam-bs |
16:48
🔗
|
|
sembiance has joined #archiveteam-bs |
16:49
🔗
|
|
betamax has quit IRC (Ping timeout: 622 seconds) |
16:49
🔗
|
|
endrift has quit IRC (Ping timeout: 622 seconds) |
16:49
🔗
|
|
systwi has joined #archiveteam-bs |
16:52
🔗
|
|
dxrt_ has joined #archiveteam-bs |
16:52
🔗
|
|
dxrt sets mode: +o dxrt_ |
16:53
🔗
|
|
endrift has joined #archiveteam-bs |
17:05
🔗
|
|
Wingy has joined #archiveteam-bs |
17:10
🔗
|
|
Ryz has quit IRC (Read error: Connection reset by peer) |
17:12
🔗
|
|
Ryz4 has joined #archiveteam-bs |
17:12
🔗
|
|
Ryz4 has quit IRC (Excess Flood) |
17:13
🔗
|
|
Ryz has joined #archiveteam-bs |
17:13
🔗
|
|
svchfoo1 sets mode: +o Ryz |
17:16
🔗
|
|
cascode has joined #archiveteam-bs |
17:24
🔗
|
|
schbirid has joined #archiveteam-bs |
17:27
🔗
|
|
paul2520 has joined #archiveteam-bs |
18:12
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
18:13
🔗
|
|
HP_Archiv has quit IRC (Client Quit) |
18:32
🔗
|
|
ave_9 has joined #archiveteam-bs |
18:38
🔗
|
|
SLC has joined #archiveteam-bs |
18:40
🔗
|
SketchCow |
The 1,001 number came from me doing "ls | wc -l" without noticing the sqlite object |
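The off-by-one described above is easy to avoid by counting only the item directories. A minimal sketch, assuming each item is a subdirectory of the export and the sqlite object is a plain file next to them (the layout and the name `count_item_dirs` are assumptions, not taken from the actual export):

```python
from pathlib import Path


def count_item_dirs(export_dir):
    """Count only subdirectories, ignoring loose files such as a sqlite database."""
    return sum(1 for entry in Path(export_dir).iterdir() if entry.is_dir())


# Example (hypothetical path):
# print(count_item_dirs("/path/to/export"))
```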
18:40
🔗
|
SLC |
I was kinda suspecting that :-) |
18:41
🔗
|
SketchCow |
According to my deduplicator thingy, I have 533 items with files that were not uploaded previously. |
18:42
🔗
|
SketchCow |
Somewhere here, I'm going to screw up, but we'll have lots of new ones get in properly. |
18:42
🔗
|
|
ave_ has quit IRC (Ping timeout: 745 seconds) |
18:42
🔗
|
|
ave_9 is now known as ave_ |
18:43
🔗
|
SketchCow |
https://archive.org/details/uta_Xevious_1986_U.S._Gold_4891 is my test. It currently doesn't work. It will eventually work. |
18:50
🔗
|
SLC |
seems to be working now :) |
18:50
🔗
|
|
cascode has quit IRC (Read error: Operation timed out) |
18:58
🔗
|
SketchCow |
Yeah! |
19:02
🔗
|
|
jshoard__ has joined #archiveteam-bs |
19:02
🔗
|
|
jshoard__ has quit IRC (Remote host closed the connection!) |
19:02
🔗
|
|
antomati_ has joined #archiveteam-bs |
19:03
🔗
|
|
Stilett0 has joined #archiveteam-bs |
19:03
🔗
|
|
bleb has joined #archiveteam-bs |
19:03
🔗
|
SketchCow |
Obviously, it doesn't have the settings/style of the rest of the work. |
19:03
🔗
|
|
tonsofpcs has quit IRC (Read error: Operation timed out) |
19:03
🔗
|
|
Frogging has quit IRC (Read error: Operation timed out) |
19:03
🔗
|
|
Frogging has joined #archiveteam-bs |
19:03
🔗
|
|
nyany has quit IRC (Read error: Operation timed out) |
19:03
🔗
|
SketchCow |
But I am going to 1. upload everything that's absolutely new, 2. figure out what you changed in the ones that aren't, 3. make them all function like the first set. |
19:03
🔗
|
|
prq has quit IRC (Read error: Operation timed out) |
19:03
🔗
|
|
TC01 has quit IRC (Write error: Broken pipe) |
19:03
🔗
|
|
underscor has quit IRC (Write error: Broken pipe) |
19:03
🔗
|
|
superkuh has quit IRC (Read error: Operation timed out) |
19:03
🔗
|
|
mtntmnky has quit IRC (Read error: Operation timed out) |
19:03
🔗
|
|
yano_ has joined #archiveteam-bs |
19:03
🔗
|
|
underscor has joined #archiveteam-bs |
19:04
🔗
|
|
jshoard has joined #archiveteam-bs |
19:04
🔗
|
|
revi has quit IRC (Ping timeout: 260 seconds) |
19:04
🔗
|
|
Pixi has joined #archiveteam-bs |
19:04
🔗
|
SketchCow |
Like, you don't need to do things. And I linked people the nightmarish main archive file primarily and linked to 2.0 from 1.0 like 'go to this one' |
19:04
🔗
|
|
jrwr has quit IRC (Ping timeout: 260 seconds) |
19:05
🔗
|
SketchCow |
Xevious is running in my other window, hi xevious |
19:05
🔗
|
|
phirephly has quit IRC (Read error: Operation timed out) |
19:05
🔗
|
|
Maylay has quit IRC (Read error: Operation timed out) |
19:05
🔗
|
|
Stiletto has quit IRC (Ping timeout: 376 seconds) |
19:05
🔗
|
|
phirephly has joined #archiveteam-bs |
19:05
🔗
|
|
twigfoot has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
DigiDigi has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
sivoais has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
SLC |
SketchCow: the .tap images didn't change at all, but some may have gotten index files added. There is also a slight chance all pictures will have a different hash, because the pictures are converted every time the archive is exported, and if the ImageMagick version changes it may produce different binaries (or a better color profile has been applied, etc.) |
19:06
🔗
|
|
cm has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
bleb is now known as cm |
19:06
🔗
|
|
sivoais has joined #archiveteam-bs |
19:06
🔗
|
|
nico_32_ has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
pie_[bnc] has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
Larsenv has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
atphoenix has quit IRC (Write error: Broken pipe) |
19:06
🔗
|
|
svchfoo1 has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
pie_ has joined #archiveteam-bs |
19:06
🔗
|
|
Igloo has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
dxrt has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
jshoard_ has quit IRC (Ping timeout: 376 seconds) |
19:06
🔗
|
|
antomatic has quit IRC (Read error: Operation timed out) |
19:06
🔗
|
|
Dj-Wawa has quit IRC (Write error: Broken pipe) |
19:06
🔗
|
|
yano has quit IRC (Read error: Operation timed out) |
19:07
🔗
|
|
Pixi` has quit IRC (Read error: Operation timed out) |
19:08
🔗
|
|
Yurume has quit IRC (Read error: Operation timed out) |
19:08
🔗
|
|
trc has quit IRC (Read error: Operation timed out) |
19:08
🔗
|
|
twigfoot has joined #archiveteam-bs |
19:09
🔗
|
|
Larsenv has joined #archiveteam-bs |
19:10
🔗
|
SketchCow |
Yeah, I get it. |
19:11
🔗
|
SketchCow |
Considering it went from 1,000 to 533, I expect most things didn't change. |
19:11
🔗
|
|
Yurume has joined #archiveteam-bs |
19:12
🔗
|
|
nico_32 has joined #archiveteam-bs |
19:17
🔗
|
|
tonsofpcs has joined #archiveteam-bs |
19:18
🔗
|
|
superkuh has joined #archiveteam-bs |
19:18
🔗
|
|
Dj-Wawa has joined #archiveteam-bs |
19:21
🔗
|
|
TC01 has joined #archiveteam-bs |
19:22
🔗
|
|
nyany has joined #archiveteam-bs |
19:22
🔗
|
|
svchfoo1 has joined #archiveteam-bs |
19:22
🔗
|
|
DigiDigi has joined #archiveteam-bs |
19:23
🔗
|
|
revi has joined #archiveteam-bs |
19:24
🔗
|
|
trc has joined #archiveteam-bs |
19:24
🔗
|
|
jrwr has joined #archiveteam-bs |
19:24
🔗
|
|
prq has joined #archiveteam-bs |
19:24
🔗
|
|
Maylay has joined #archiveteam-bs |
19:24
🔗
|
|
Maylay has quit IRC (Remote host closed the connection!) |
19:28
🔗
|
|
Igloo has joined #archiveteam-bs |
19:29
🔗
|
SketchCow |
I've verified all 532 remaining have a .tap file in them, meaning they're md5 different. |
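For readers wondering what an md5-based check like this could look like, here is a minimal sketch. It is not SketchCow's actual deduplicator; it assumes one directory per item and a set of md5 hex digests for files uploaded previously, and the names `md5_of_file` and `items_with_new_files` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the md5 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def items_with_new_files(export_dir, known_md5s):
    """Map each item directory to the files in it whose md5 is not already known."""
    new_items = {}
    for item in sorted(Path(export_dir).iterdir()):
        if not item.is_dir():
            continue  # skip loose files such as a database next to the items
        fresh = [
            str(f.relative_to(item))
            for f in sorted(item.rglob("*"))
            if f.is_file() and md5_of_file(f) not in known_md5s
        ]
        if fresh:
            new_items[item.name] = fresh
    return new_items
```

An item whose files all hash to known digests drops out of the result, so what remains is the set of items containing at least one file that was not uploaded before.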
19:31
🔗
|
|
mtntmnky has joined #archiveteam-bs |
19:31
🔗
|
SLC |
that sounds about right... |
19:33
🔗
|
SketchCow |
Now I have to hack up something going "These are already uploaded in previous ones" |
19:38
🔗
|
|
betamax_ is now known as betamax |
19:42
🔗
|
SketchCow |
Verified, I can do a lot of this. |
19:42
🔗
|
SketchCow |
I suspect I did some custom one-offs |
19:42
🔗
|
|
VerifiedJ has quit IRC (Quit: Leaving) |
19:53
🔗
|
SketchCow |
Verified all 533 are new |
20:06
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
20:11
🔗
|
SketchCow |
Anyway. In summary, SLC, it's going to all be fine. Do you use Discord? I have a discord. |
20:11
🔗
|
SketchCow |
All you fucks, I have a discord |
20:11
🔗
|
SketchCow |
https://discord.gg/UKQUvq |
20:23
🔗
|
SLC |
I have discord but usually favor IRC... I'll join in, though. |
20:28
🔗
|
systwi |
Ewww discord :P |
20:29
🔗
|
SketchCow |
ha |
20:29
🔗
|
SketchCow |
Wait |
20:29
🔗
|
SketchCow |
Checking my wallet |
20:29
🔗
|
SketchCow |
Oh here it is |
20:29
🔗
|
SketchCow |
Fuck off |
20:29
🔗
|
systwi |
On it |
20:35
🔗
|
|
DogsRNice has joined #archiveteam-bs |
20:42
🔗
|
|
bsmith093 has quit IRC (Ping timeout: 272 seconds) |
20:57
🔗
|
|
bsmith093 has joined #archiveteam-bs |
21:01
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
21:06
🔗
|
|
RichardG has joined #archiveteam-bs |
21:06
🔗
|
|
ave_ has quit IRC (Read error: Connection reset by peer) |
21:07
🔗
|
|
ave_ has joined #archiveteam-bs |
21:27
🔗
|
|
Mateon1 has quit IRC (Quit: Mateon1) |
21:28
🔗
|
|
Mateon1 has joined #archiveteam-bs |
21:40
🔗
|
|
DLoader_ has joined #archiveteam-bs |
21:47
🔗
|
|
legoktm has joined #archiveteam-bs |
21:47
🔗
|
|
duh has quit IRC (Read error: Connection reset by peer) |
21:51
🔗
|
|
DLoader has quit IRC (Ping timeout: 745 seconds) |
21:51
🔗
|
|
DLoader_ is now known as DLoader |
22:26
🔗
|
|
lennier2 has joined #archiveteam-bs |
22:29
🔗
|
|
lennier1 has quit IRC (Ping timeout: 272 seconds) |
22:29
🔗
|
|
lennier2 is now known as lennier1 |
23:01
🔗
|
|
Stilett0 is now known as Stiletto |
23:06
🔗
|
|
asdf01011 has joined #archiveteam-bs |
23:41
🔗
|
|
benjinsmi has quit IRC (Read error: Connection reset by peer) |
23:44
🔗
|
|
benjinsmi has joined #archiveteam-bs |
23:44
🔗
|
|
jshoard has quit IRC (Quit: Leaving) |