Time |
Nickname |
Message |
00:34
🔗
|
|
edisdead has joined #archiveteam-bs |
00:35
🔗
|
|
Stiletto has quit IRC (Ping timeout: 265 seconds) |
00:37
🔗
|
|
Stiletto has joined #archiveteam-bs |
00:41
🔗
|
|
VerfiedJ has quit IRC (Quit: Leaving) |
00:54
🔗
|
|
xLovely has quit IRC (Read error: Connection reset by peer) |
01:05
🔗
|
|
sep332 has quit IRC (ZNC 1.6.3+deb1ubuntu0.1 - http://znc.in) |
01:39
🔗
|
JAA |
hook54321: Turns out it's even worse than just a cookie for the English version of animatorexpo.com: the translated strings are loaded lazily through JS from e.g. http://animatorexpo.com/lang/top-en.json or http://animatorexpo.com/lang/header-en.json |
02:53
🔗
|
godane |
SketchCow: i found something interesting for you to grab: https://rutracker.org/forum/viewtopic.php?t=5620076 |
02:53
🔗
|
godane |
it's called Update Special |
02:54
🔗
|
godane |
it's a russian dvd that came with a magazine i think |
03:05
🔗
|
|
wp494 has joined #archiveteam-bs |
03:21
🔗
|
|
benjinsmi has joined #archiveteam-bs |
03:25
🔗
|
|
benjins has quit IRC (Read error: Operation timed out) |
03:42
🔗
|
|
powerKitt has joined #archiveteam-bs |
03:43
🔗
|
JAA |
Only 2k threads? That's tiny... |
03:43
🔗
|
JAA |
(Continuing from #archiveteam re Byuu forums) |
03:43
🔗
|
powerKitt |
Yeah, it was more obscure than I first thought. There's still some valuable information in there imo |
03:43
🔗
|
powerKitt |
https://pastebin.com/tad3Yxp6 |
03:43
🔗
|
powerKitt |
(#archiveteam discussion for reference ^) |
03:44
🔗
|
JAA |
We grabbed it through ArchiveBot about a year ago. There was another job in November, but that had some issues. |
03:44
🔗
|
JAA |
How many posts? |
03:44
🔗
|
JAA |
(Rough estimate's good enough if it doesn't show a total number.) |
03:44
🔗
|
powerKitt |
https://board.byuu.org/viewtopic.php?p=58341 is the newest post |
03:45
🔗
|
JAA |
Oh hey, I can access that. |
03:45
🔗
|
powerKitt |
weird |
03:45
🔗
|
JAA |
Looks like they simply hid the forums from the homepage? |
03:47
🔗
|
powerKitt |
huh |
03:48
🔗
|
powerKitt |
https://board.byuu.org/viewforum.php?f=22 okay that's really weird |
03:48
🔗
|
JAA |
powerKitt: Is 34 the largest forum ID (viewforum.php?f=X)? |
03:48
🔗
|
powerKitt |
https://board.byuu.org/viewforum.php?f=1 |
03:49
🔗
|
powerKitt |
https://board.byuu.org/viewforum.php?f=5 |
03:49
🔗
|
powerKitt |
those are the two top level forum ids |
03:50
🔗
|
powerKitt |
yeah, 34 looks like it's the largest forum id |
03:50
🔗
|
powerKitt |
I can't access it though lol |
03:51
🔗
|
JAA |
Alright, I'll prepare an ArchiveBot job. :-) |
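A rough sketch of how a seed list for the forum IDs discussed above could be generated. This is only an illustration, not how the actual ArchiveBot job was set up; the helper name `forum_urls` is made up here, and the upper bound of 34 is taken from the conversation.

```python
def forum_urls(base, max_forum_id):
    """Return one viewforum.php URL per forum ID from 1 to max_forum_id inclusive."""
    return [
        f"{base}/viewforum.php?f={forum_id}"
        for forum_id in range(1, max_forum_id + 1)
    ]


# Assumption from the chat: 34 is the largest forum ID on the board.
seeds = forum_urls("https://board.byuu.org", 34)
```

Some of these IDs may return a login form or an error, so a list like this is at best a starting point for a crawl.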
03:51
🔗
|
powerKitt |
https://board.byuu.org/viewforum.php?f=30 is the top level archive forum |
03:52
🔗
|
JAA |
Hmm, can you access that? |
03:52
🔗
|
JAA |
I get a login form. |
03:52
🔗
|
powerKitt |
Yes, I can access that |
03:53
🔗
|
JAA |
I see. Then maybe archiving with your account is still a better option. |
03:53
🔗
|
powerKitt |
there are no posts to it directly though, it just contains f=1 and f=5 |
03:53
🔗
|
JAA |
Ah |
03:53
🔗
|
JAA |
Are there any members-only forums? |
03:54
🔗
|
|
hdch has joined #archiveteam-bs |
03:56
🔗
|
powerKitt |
there are some for internal Higan development that I'm not a part of. |
03:57
🔗
|
powerKitt |
https://board.byuu.org/viewtopic.php?f=2&t=2229 |
04:00
🔗
|
powerKitt |
I don't think there are any members only forums I have access to that have actual threads in them |
04:02
🔗
|
|
hook54321 has quit IRC (Quit: Connection closed for inactivity) |
04:02
🔗
|
powerKitt |
tbh, it's really kinda annoying that Byuu, who has been championing accurate preservation of the SNES, has decided to get all his sites removed from the Wayback Machine, preventing them from being preserved. |
04:03
🔗
|
JAA |
Yeah, that really sucks. |
04:04
🔗
|
powerKitt |
He's taken down all his old emulation articles from byuu.org, which were quite entertaining reading. I can't even read an archive of them because byuu.org isn't in the wayback machine anymore |
04:04
🔗
|
Flashfire |
If it went through archivebot the WARCs should still be there |
04:07
🔗
|
powerKitt |
Yeah, but having to download WARCs and extract html pages from them just to read an article is absurd |
04:08
🔗
|
pikhq |
Sigh, byuu. They are sometimes really weird about things in ways that confuse me. |
04:09
🔗
|
|
_hdch_ has joined #archiveteam-bs |
04:09
🔗
|
powerKitt |
indeed they are |
04:10
🔗
|
powerKitt |
thanks for reminding me byuu's non-binary by the way, I didn't mean to misgender them. |
04:10
🔗
|
|
hdch has quit IRC (Remote host closed the connection) |
04:10
🔗
|
pikhq |
:) |
04:11
🔗
|
JAA |
I queued an ArchiveBot job. It should grab all threads, assuming the boards don't go down until it starts (or become truly only readable with an account). |
04:13
🔗
|
powerKitt |
hopefully |
04:13
🔗
|
|
qw3rty117 has joined #archiveteam-bs |
04:15
🔗
|
powerKitt |
how do I extract captured pages from a WARC again? |
04:17
🔗
|
|
qw3rty116 has quit IRC (Read error: Operation timed out) |
04:24
🔗
|
JAA |
powerKitt: warcat has an extract command for that. |
04:25
🔗
|
powerKitt |
I just ended up using Webrecorder Player |
04:26
🔗
|
JAA |
Ah, so you wanted playback, not extraction. :-) |
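For the extraction side of the question above, here is a minimal standard-library sketch of pulling HTTP payloads out of WARC response records. It is not how warcat or Webrecorder Player work internally; the function names are invented for illustration, and the code assumes well-formed records and does not undo chunked transfer encoding or content compression inside the HTTP payload.

```python
import gzip
import io


def iter_warc_records(stream):
    """Yield (headers, block) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected line: {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # end of the WARC header section
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()
        block = stream.read(int(headers["content-length"]))
        yield headers, block


def extract_responses(stream):
    """Map each response record's target URI to its HTTP body bytes."""
    pages = {}
    for headers, block in iter_warc_records(stream):
        if headers.get("warc-type") != "response":
            continue
        # The block is a raw HTTP response: status line and headers,
        # a blank line, then the body.
        _, _, body = block.partition(b"\r\n\r\n")
        pages[headers["warc-target-uri"]] = body
    return pages
```

For a compressed file, something like `with gzip.open("job.warc.gz", "rb") as f: pages = extract_responses(f)` should work, where `job.warc.gz` is a placeholder filename. For simply reading a page, playback tools remain the more convenient route.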
04:26
🔗
|
powerKitt |
oh *nice*, the latest archivebot capture of byuu.org doesn't have the last article |
04:27
🔗
|
powerKitt |
so https://byuu.org/articles/fpgas-arent-magic/ is basically lost now |
04:27
🔗
|
powerKitt |
cool |
04:29
🔗
|
JAA |
(Psst... https://archive.fo/https://byuu.org/articles/fpgas-arent-magic/ ) |
04:29
🔗
|
powerKitt |
thanks |
04:30
🔗
|
Flashfire |
My family is going through the shed again so I should have more stuff to upload soon |
04:30
🔗
|
Flashfire |
we have at least 100 old discs |
04:31
🔗
|
powerKitt |
nice |
04:31
🔗
|
Flashfire |
Mainly old PC games that aren't sold anymore |
04:50
🔗
|
|
odemgi has joined #archiveteam-bs |
04:52
🔗
|
|
odemg has quit IRC (Ping timeout: 265 seconds) |
04:53
🔗
|
|
odemgi_ has quit IRC (Read error: Operation timed out) |
05:04
🔗
|
|
odemg has joined #archiveteam-bs |
05:26
🔗
|
|
BasDub has joined #archiveteam-bs |
05:42
🔗
|
godane |
i'm starting to upload the last year of The Morning Blaze with Doc Thompson : https://archive.org/details/the-morning-blaze-with-doc-thompson-2017-09-01 |
05:43
🔗
|
godane |
i got to update a ton of collections this next week or so |
05:44
🔗
|
godane |
German News for 1998, kpfa, all radio shows i have a collection on |
06:21
🔗
|
powerKitt |
nice one godane |
06:43
🔗
|
|
dxrt_ has joined #archiveteam-bs |
06:43
🔗
|
|
dxrt sets mode: +o dxrt_ |
07:55
🔗
|
|
edisdead has quit IRC (Read error: Connection reset by peer) |
08:01
🔗
|
|
hook54321 has joined #archiveteam-bs |
08:51
🔗
|
|
Stilett0 has joined #archiveteam-bs |
08:53
🔗
|
|
Stiletto has quit IRC (Ping timeout: 252 seconds) |
09:03
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
09:04
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 740 seconds) |
09:06
🔗
|
|
_hdch_ has quit IRC (Ping timeout: 265 seconds) |
09:07
🔗
|
|
Mateon1 has joined #archiveteam-bs |
09:33
🔗
|
|
Stiletto has joined #archiveteam-bs |
09:35
🔗
|
|
Stilett0 has quit IRC (Ping timeout: 265 seconds) |
11:01
🔗
|
|
MR9K has quit IRC (Remote host closed the connection) |
11:02
🔗
|
|
MR9K has joined #archiveteam-bs |
11:02
🔗
|
|
MR9K has quit IRC (Remote host closed the connection) |
11:12
🔗
|
|
sknebel_ is now known as sknebel |
11:45
🔗
|
|
VerfiedJ has joined #archiveteam-bs |
12:06
🔗
|
|
wp494 has quit IRC (Ping timeout: 255 seconds) |
12:06
🔗
|
|
wp494 has joined #archiveteam-bs |
13:17
🔗
|
|
MR9K has joined #archiveteam-bs |
13:33
🔗
|
|
slyphic has quit IRC (Read error: Connection reset by peer) |
13:33
🔗
|
|
slyphic has joined #archiveteam-bs |
14:17
🔗
|
|
knas-sys has joined #archiveteam-bs |
14:33
🔗
|
|
Joseph_ has joined #archiveteam-bs |
14:36
🔗
|
|
VerfiedJ has quit IRC (Ping timeout: 252 seconds) |
15:16
🔗
|
|
Joseph_ is now known as VerfiedJ |
15:24
🔗
|
|
chimyatta has joined #archiveteam-bs |
15:45
🔗
|
|
schbirid has quit IRC (Remote host closed the connection) |
16:08
🔗
|
|
schbirid has joined #archiveteam-bs |
16:17
🔗
|
|
sep332 has joined #archiveteam-bs |
17:50
🔗
|
|
branden_ is now known as branden |
18:39
🔗
|
|
hdch has joined #archiveteam-bs |
18:46
🔗
|
hook54321 |
JAA: example video URL http://animatorexpo.com/promotional03/ |
18:49
🔗
|
JAA |
hook54321: Thanks. Hrm, that thing uses Flash, and it's inserted through JS. Hooray... |
18:49
🔗
|
JAA |
The ArchiveBot job almost definitely didn't grab those. |
18:55
🔗
|
JAA |
And it doesn't look like downloading the video is straightforward either. |
19:01
🔗
|
hook54321 |
Yeah :/ |
19:14
🔗
|
moufu |
the js contains some strings that look like nicovideo ids, but the videos are apparently private |
19:14
🔗
|
moufu |
https://www.nicovideo.jp/watch/so26762988 https://www.nicovideo.jp/watch/so26762911 |
19:17
🔗
|
moufu |
ok https://ext.nicovideo.jp/api/video_play_info?video_id=so26762911&token=e2d62aba5253576880ef7d47c7ea290c39c8b2ab2f3543536a6fa1cb81490212&time=1546110948 |
19:24
🔗
|
moufu |
looks like the final link still returns 403 though |
19:25
🔗
|
JAA |
Yeah, something special has to be going on about that link. Maybe the player appends some parameters to it. |
19:25
🔗
|
JAA |
Assuming the videos work in the first place. I don't have Flash, so I can't verify that. |
19:40
🔗
|
|
exoire has joined #archiveteam-bs |
20:27
🔗
|
moufu |
looks like the site has a non-flash video player if your ua contains Android or iPad or iPhone or iPod |
20:27
🔗
|
moufu |
but the request for the video still returns 403 on the original site |
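A small sketch of the user-agent check as described above. The site's real JavaScript was not quoted in the channel, so this is only a guess at the logic (a plain substring match on four tokens); the function name and the example user-agent strings are hypothetical.

```python
# Tokens mentioned in the chat; matching by substring is an assumption.
MOBILE_TOKENS = ("Android", "iPad", "iPhone", "iPod")


def gets_non_flash_player(user_agent):
    """Return True if the UA contains any token that selects the non-Flash player."""
    return any(token in user_agent for token in MOBILE_TOKENS)


desktop_ua = "Mozilla/5.0 (X11; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0"
phone_ua = "Mozilla/5.0 (iPhone; CPU iPhone OS 12_1 like Mac OS X)"
```

Sending a matching user agent would only change which player the page serves; per the messages above, the video request itself still returned 403.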
21:07
🔗
|
|
wp494 has quit IRC (Read error: Operation timed out) |
21:07
🔗
|
|
wp494 has joined #archiveteam-bs |
21:24
🔗
|
godane |
so i got some world war ii magazines from 1986 to 1990 |
21:24
🔗
|
godane |
same for Military History Magazine |
22:14
🔗
|
|
BlueMax has joined #archiveteam-bs |
22:35
🔗
|
|
hdch has quit IRC (Ping timeout: 265 seconds) |
23:00
🔗
|
|
hdch has joined #archiveteam-bs |
23:46
🔗
|
JAA |
My Pushshift download is nearly complete. I've retrieved everything except for the 110 GB file containing 250 billion digits of π. One large file (86 GB) is still downloading. Everything else is done. I'll upload to IA when that last download completes. |
23:48
🔗
|
JAA |
Just to be clear: I ended up downloading both the Reddit and the non-Reddit data. It's about 1.25 TB in total (including the π file). |