Time |
Nickname |
Message |
00:11
🔗
|
OrIdow6 |
JAA: Nope - I've only seen stickies in the page itseldf |
00:11
🔗
|
OrIdow6 |
*itself |
00:28
🔗
|
JAA |
Oh, another fun bug of the boards: https://boards.na.leagueoflegends.com/api/PEr1qIcT/discussions?sort_type=recently_replied&num_loaded=1377400 claims to return 21 results but the HTML string is empty. |
00:29
🔗
|
JAA |
(It's supposed to return the 21 discussions that were least recently active ever.) |
00:51
🔗
|
|
godane has joined #archiveteam-bs |
02:04
🔗
|
JAA |
Oh, fantastic. After finally working around all that rubbish, I did a first load test. Guess what, under load, the API returns an empty list instead of data, still with HTTP 200 though. (╯°□°)╯︵ ┻━┻ |
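One way to guard against the "HTTP 200 but empty list" behaviour described above is to treat an empty payload as a soft failure and retry with backoff. This is only a minimal sketch, not the code actually used for the grab: `fetch_nonempty`, the `fetch` callable, and the `(status, items)` return shape are all hypothetical names and assumptions, and it only makes sense for requests where at least one item is known to be expected.

```python
import time
from typing import Callable, Optional, Tuple


def fetch_nonempty(
    fetch: Callable[[], Tuple[int, list]],
    retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[list]:
    """Call `fetch` until it yields HTTP 200 with a non-empty list.

    `fetch` is a hypothetical callable returning (status_code, items).
    An empty list with status 200 is treated as a transient failure,
    which is an assumption about the server, not a documented behaviour.
    """
    for attempt in range(retries):
        status, items = fetch()
        if status == 200 and items:
            return items
        # Exponential backoff between attempts, but not after the last one.
        if attempt < retries - 1:
            sleep(base_delay * 2 ** attempt)
    return None
```

A caller would still need to log the requests that come back as `None` so they can be retried later, since a retry loop cannot tell a genuinely empty page from an overloaded server.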
02:37
🔗
|
|
RichardG_ has quit IRC (Quit: Keyboard not found, press F1 to continue) |
02:38
🔗
|
|
RichardG has joined #archiveteam-bs |
03:24
🔗
|
|
wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES) |
03:26
🔗
|
JAA |
Running okay now. Estimated 400 GB and ~50 hours for the NA EN boards (presumably the largest ones). So assuming it goes okay now, it should finish on time. |
03:35
🔗
|
OrIdow6 |
Ok |
03:45
🔗
|
|
wp494 has joined #archiveteam-bs |
04:15
🔗
|
|
qw3rty_ has joined #archiveteam-bs |
04:23
🔗
|
|
qw3rty__ has quit IRC (Read error: Operation timed out) |
04:24
🔗
|
|
katocala has joined #archiveteam-bs |
04:24
🔗
|
|
katocala has left |
04:43
🔗
|
JAA |
Discovered more edge cases, trying to solve them now and will have to restart then. |
05:01
🔗
|
|
ephemer0l has quit IRC (Read error: Connection reset by peer) |
05:11
🔗
|
|
ephemer0l has joined #archiveteam-bs |
05:47
🔗
|
JAA |
Discussion counts on each board according to the API: https://transfer.notkiska.pw/inline/av40k/lol-boards-discussion-counts |
05:48
🔗
|
JAA |
Apparently the English EU forums are available under both eune and euw. |
06:52
🔗
|
JAA |
This is fun. ~2k req/s, only a few dozen errors per minute :-) |
06:59
🔗
|
|
DFJustin has quit IRC (Remote host closed the connection) |
06:59
🔗
|
|
DFJustin has joined #archiveteam-bs |
07:16
🔗
|
|
HP_Archiv has quit IRC (Quit: Leaving) |
10:22
🔗
|
|
ShellyRol has quit IRC (Read error: Connection reset by peer) |
10:24
🔗
|
|
ShellyRol has joined #archiveteam-bs |
11:54
🔗
|
|
PlsHelp has joined #archiveteam-bs |
11:56
🔗
|
PlsHelp |
Also, regarding xbooru.com purging photos: comments at Xbooru are exclusive content, so that's another reason to archive it. |
11:56
🔗
|
PlsHelp |
Please read http://blog.booru.org/?p=209 |
12:12
🔗
|
|
mtntmnky has quit IRC (Remote host closed the connection) |
12:13
🔗
|
|
mtntmnky has joined #archiveteam-bs |
12:26
🔗
|
PlsHelp |
FOCUS: https://closeup.booru.org/index.php?page=post&s=list&pid=26440 |
12:27
🔗
|
PlsHelp |
FOCUS: https://cheesecake.booru.org/index.php?page=post&s=list |
12:34
🔗
|
godane |
latest digitize tapes : https://www.patreon.com/posts/digitize-tapes-34874417 |
12:35
🔗
|
PlsHelp |
FOCUS: https://vulva.booru.org/index.php?page=post&s=list&tags=all - these websites which are part of the booru.org network will be PURGED soon! |
12:38
🔗
|
PlsHelp |
https://joy2.booru.org/index.php?page=post&s=list&tags=all |
12:49
🔗
|
PlsHelp |
https://celeb-fake-nude.booru.org/index.php?page=post&s=list&pid=5940 |
12:50
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
13:11
🔗
|
|
zhongfu has quit IRC (Ping timeout: 745 seconds) |
13:12
🔗
|
|
zhongfu has joined #archiveteam-bs |
13:16
🔗
|
|
PlsHelp has quit IRC (Quit: Page closed) |
15:49
🔗
|
|
MaximeleG has joined #archiveteam-bs |
16:10
🔗
|
godane |
SketchCow: so i found this looking for pdfs on twitter : https://twitter.com/rohweroutpost |
16:11
🔗
|
godane |
it's a magazine/newspaper from Rohwer Arkansas Japanese Relocation Camp |
17:12
🔗
|
JAA |
All of the LoL forums except the NA ones are done (save for errors to be handled soon). The smaller boards (eune_ro, eune_hu, eune_cs, eune_el, euw_de, jp_ja-jp, pbe_en, oce_en) are also done. The other boards are still running. |
17:13
🔗
|
JAA |
On the forums, I'm saving the thread pages and the forum pagination. I'm bruteforcing thread IDs up to {'na': 4924000, 'euw': 2106000, 'eune': 807000, 'ru': 30000, 'br': 382000, 'lan': 96000, 'las': 183000, 'oce': 89000, 'tr': 137000} |
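The brute-force enumeration mentioned above can be sketched roughly as follows. The upper bounds are copied from the message; everything else is an assumption for illustration: the `thread_ids` helper is hypothetical, IDs are assumed to start at 1, and the bounds are assumed to be inclusive. No URL is built here because the log does not give the forum thread URL format.

```python
from typing import Dict, Iterator, Tuple

# Upper bounds per region, as quoted in the message above.
MAX_THREAD_ID: Dict[str, int] = {
    'na': 4924000, 'euw': 2106000, 'eune': 807000, 'ru': 30000,
    'br': 382000, 'lan': 96000, 'las': 183000, 'oce': 89000, 'tr': 137000,
}


def thread_ids(limits: Dict[str, int], start: int = 1) -> Iterator[Tuple[str, int]]:
    """Yield every (region, thread_id) candidate up to each region's bound.

    Starting at 1 and treating the bound as inclusive are assumptions.
    """
    for region, upper in limits.items():
        for thread_id in range(start, upper + 1):
            yield region, thread_id


# Total number of candidate IDs to probe under these assumptions.
total_candidates = sum(MAX_THREAD_ID.values())
```

Because the generator is lazy, the full candidate list never has to be held in memory; most of the probed IDs would be expected to return "not found" responses.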
17:14
🔗
|
JAA |
On the boards, I'm saving the homepage pagination, discussions including pagination in flat view, and user profiles. |
17:14
🔗
|
JAA |
As mentioned before, no images, outlinks, etc. |
17:30
🔗
|
|
Smiley has joined #archiveteam-bs |
17:33
🔗
|
|
SmileyG has quit IRC (Read error: Operation timed out) |
18:41
🔗
|
|
idk has joined #archiveteam-bs |
18:42
🔗
|
Ryz |
Hey JAA, after this, I assume it would be ideal to run the non-targeted images, outlinks, etc in AB after the League of Legends forums grab is finished? |
18:45
🔗
|
|
idk has quit IRC (Ping timeout: 260 seconds) |
18:47
🔗
|
JAA |
Ryz: If anyone wants to extract that information from the WARCs, sure. I'm not planning to do so. |
19:02
🔗
|
|
arkiver_ has quit IRC (Read error: Connection reset by peer) |
19:02
🔗
|
|
arkiver_ has joined #archiveteam-bs |
19:06
🔗
|
|
DogsRNice has joined #archiveteam-bs |
19:07
🔗
|
|
MaximeleG has quit IRC (Ping timeout: 745 seconds) |
19:30
🔗
|
|
DogsRNice has quit IRC (Read error: Connection reset by peer) |
19:37
🔗
|
|
SynMonger has quit IRC (Ping timeout: 276 seconds) |
19:40
🔗
|
|
SynMonger has joined #archiveteam-bs |
19:50
🔗
|
|
Craigle has quit IRC (Ping timeout: 276 seconds) |
19:50
🔗
|
|
anarcat has quit IRC (Ping timeout: 276 seconds) |
19:50
🔗
|
|
benjins has quit IRC (Ping timeout: 276 seconds) |
19:50
🔗
|
|
fredgido has quit IRC (Ping timeout: 276 seconds) |
19:50
🔗
|
|
pew has quit IRC (Ping timeout: 276 seconds) |
19:50
🔗
|
|
klg has quit IRC (Ping timeout: 276 seconds) |
19:50
🔗
|
|
logchfoo2 has quit IRC (Ping timeout: 276 seconds) |
19:51
🔗
|
|
logchfoo3 starts logging #archiveteam-bs at Sat Mar 14 19:51:55 2020 |
19:51
🔗
|
|
logchfoo3 has joined #archiveteam-bs |
19:52
🔗
|
|
logchfoo2 has quit IRC (Ping timeout: 276 seconds) |
19:52
🔗
|
|
Dallas has quit IRC (Ping timeout: 276 seconds) |
19:52
🔗
|
|
Hooloovoo has quit IRC (Ping timeout: 276 seconds) |
19:52
🔗
|
|
arkiver_ has quit IRC (Ping timeout: 276 seconds) |
19:52
🔗
|
|
dxrt has quit IRC (Ping timeout: 276 seconds) |
19:52
🔗
|
|
Fionera has quit IRC (Ping timeout: 276 seconds) |
19:53
🔗
|
|
pew has joined #archiveteam-bs |
19:54
🔗
|
|
godane has quit IRC (Ping timeout: 276 seconds) |
19:54
🔗
|
|
coderobe has quit IRC (Ping timeout: 276 seconds) |
19:56
🔗
|
|
synm0nger has joined #archiveteam-bs |
19:58
🔗
|
|
Xibalba has quit IRC (Ping timeout: 276 seconds) |
19:59
🔗
|
|
Xibalba has joined #archiveteam-bs |
20:01
🔗
|
|
SynMonger has quit IRC (Read error: Operation timed out) |
20:03
🔗
|
|
OrIdow6 has quit IRC (Ping timeout: 276 seconds) |
20:03
🔗
|
|
VoynichCr has quit IRC (Ping timeout: 276 seconds) |
20:05
🔗
|
|
mtntmnky_ has joined #archiveteam-bs |
20:05
🔗
|
|
is-_ has joined #archiveteam-bs |
20:06
🔗
|
|
pew has quit IRC (se.hub irc.underworld.no) |
20:06
🔗
|
|
jmtd has quit IRC (se.hub irc.underworld.no) |
20:06
🔗
|
|
is- has quit IRC (se.hub irc.underworld.no) |
20:06
🔗
|
|
Frogging has quit IRC (se.hub irc.underworld.no) |
20:06
🔗
|
|
purplebot has quit IRC (se.hub irc.underworld.no) |
20:07
🔗
|
|
Jon| has joined #archiveteam-bs |
20:21
🔗
|
|
mtntmnky has quit IRC (Remote host closed the connection) |
20:23
🔗
|
|
dxrt has joined #archiveteam-bs |
20:24
🔗
|
|
svchfoo1 sets mode: +o dxrt |
20:26
🔗
|
|
klg has joined #archiveteam-bs |
20:28
🔗
|
|
VoynichCr has joined #archiveteam-bs |
20:30
🔗
|
|
Samizdat has joined #archiveteam-bs |
20:41
🔗
|
|
pew has joined #archiveteam-bs |
21:00
🔗
|
|
Samizdat has quit IRC (Read error: Connection reset by peer) |
21:32
🔗
|
|
opticnerv has joined #archiveteam-bs |
21:50
🔗
|
|
ephemer0l has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.) |
21:54
🔗
|
|
ephemer0l has joined #archiveteam-bs |
22:07
🔗
|
|
Smiley has quit IRC (Read error: Connection reset by peer) |
22:07
🔗
|
|
Smiley has joined #archiveteam-bs |
22:07
🔗
|
|
antomati_ has joined #archiveteam-bs |
22:09
🔗
|
|
antomatic has quit IRC (Read error: Operation timed out) |
22:15
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
22:28
🔗
|
|
OrIdow6 has joined #archiveteam-bs |
22:39
🔗
|
|
Craigle has quit IRC (Quit: The Lounge - https://thelounge.chat) |
22:48
🔗
|
|
opticnerv has quit IRC (Quit: Leaving) |
22:50
🔗
|
|
Craigle has joined #archiveteam-bs |
23:13
🔗
|
|
BlueMax has joined #archiveteam-bs |