Time | Nickname | Message |
00:02 | JAA | NA forums are done except for the pagination of http://forums.na.leagueoflegends.com/board/forumdisplay.php?f=2 which will take another few hours (and a couple errors). Some of the smaller boards finished as well. All running pretty good, my biggest enemy is IA S3. |
00:14 | Craigle | Were you running that through an AB server, or standalone? It seems like you got through it pretty quickly all things considered. |
00:16 | JAA | Completely different thing. I use my own archival tool called qwarc. |
00:16 | JAA | ArchiveBot can't get anywhere near 1k requests per second on a single machine. :-) |
00:27 | | OrIdow6 has quit IRC (Quit: Leaving.) |
00:30 | Craigle | Gotcha, lol. I have seen y'all talk about qwarc before. I just don't understand warc well enough to get the difference. |
00:31 | JAA | WARC = file format we use everywhere to store web page data (basically it's a container for HTTP requests and responses plus metadata), qwarc = "quick warc" = a specific tool for producing WARCs based on a set of retrieval rules |
00:34 | | Lord_Nigh has quit IRC (Quit: ZNC - http://znc.in) |
00:36 | JAA | It doesn't help that qwarc is poorly (read: not at all) documented. I wrote it in haste a while ago to retrieve a particular huge site that we initially wanted to get with a DPoS project but then didn't (Storify), then reorganised the shitty code into something reusable. It has a number of issues you need to be aware of to correctly use it and really needs a rewrite of significant parts so other |
00:36 | JAA | people can do so without basically having to read and understand the entire code. |
00:36 | | HP_Archiv has quit IRC (Quit: Leaving) |
00:38 | | Lord_Nigh has joined #archiveteam-bs |
00:48 | | MaximeleG has quit IRC (Quit: MaximeleG) |
02:06 | JAA | PurpleSym: EFnet murdered purplebot again. |
03:24 | | VADemon_ has quit IRC (Read error: Connection reset by peer) |
03:45 | | OrIdow6 has joined #archiveteam-bs |
03:49 | | OrIdow6 has quit IRC (Remote host closed the connection) |
03:51 | | ranma has quit IRC () |
03:51 | | OrIdow6 has joined #archiveteam-bs |
03:52 | | OrIdow6 has quit IRC (Client Quit) |
04:14 | | qw3rty__ has joined #archiveteam-bs |
04:22 | | qw3rty_ has quit IRC (Read error: Operation timed out) |
04:23 | JAA | 5 boards are still running: br_pt, eune_en, las_es, na_en, and tr_tr. The others are done(ish). |
04:28 | | OrIdow6 has joined #archiveteam-bs |
04:37 | | Frogging has joined #archiveteam-bs |
04:56 | Craigle | Makes sense, thanks for the explanation JAA |
07:19 | | scorche has quit IRC (Read error: Operation timed out) |
07:30 | | purplebot has joined #archiveteam-bs |
07:30 | PurpleSym | JAA: Thanks, restarted. |
07:45 | | Hoolootwo is now known as Hooloovoo |
08:43 | | is-_ is now known as is- |
09:02 | | NIC007a83 has quit IRC (Read error: Connection reset by peer) |
09:13 | | schbirid has joined #archiveteam-bs |
09:52 | | schbirid has quit IRC (Quit: Leaving) |
10:14 | | schbirid has joined #archiveteam-bs |
10:38 | | HP_Archiv has joined #archiveteam-bs |
11:47 | | BlueMax has quit IRC (Read error: Connection reset by peer) |
12:33 | | HP_Archiv has quit IRC (Quit: Leaving) |
12:39 | | dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.) |
12:41 | | dashcloud has joined #archiveteam-bs |
13:07 | JAA | Everything except the na_en boards is done now. Just need to sort out the errors later. |
13:09 | | Smiley has quit IRC (Quit: http://www.milkme.co.uk - You'll never understand.) |
13:10 | | Smiley has joined #archiveteam-bs |
17:03 | | Mateon1 has quit IRC (Ping timeout: 255 seconds) |
17:04 | | Mateon1 has joined #archiveteam-bs |
17:06 | | schbirid has quit IRC (Quit: Leaving) |
17:42 | | ShellyRol has quit IRC (Ping timeout: 745 seconds) |
17:43 | | ShellyRol has joined #archiveteam-bs |
17:44 | | schbirid has joined #archiveteam-bs |
18:07 | | kahuna has joined #archiveteam-bs |
18:08 | | kahuna has left |
18:12 | | kahuna_ has joined #archiveteam-bs |
19:05 | | kahuna_ has quit IRC (Read error: Connection reset by peer) |
19:09 | | kahuna has joined #archiveteam-bs |
19:52 | | schbirid has quit IRC (Quit: Leaving) |
20:13 | JAA | And now the na_en boards are done (except for errors) as well. |
20:36 | | RichardG_ has joined #archiveteam-bs |
20:37 | | RichardG has quit IRC (Read error: Connection reset by peer) |
20:49 | | godane has joined #archiveteam-bs |
21:12 | | Hooloovoo has quit IRC (Quit: Temporarily refracted into a free-standing prism.) |
21:44 | | jrwr has quit IRC (Read error: Operation timed out) |
21:46 | | jrwr has joined #archiveteam-bs |
22:08 | JAA | Oh right, thanks. |
22:21 | godane | SketchCow: so it looks like maybe you guys don't have the NARA docs archived |
22:21 | godane | it maybe brute forcible : https://aad.archives.gov/aad/createpdf?rid=1&dt=2472&dl=1345 |
22:42 | | keith20 has joined #archiveteam-bs |
22:44 | | keith20 has quit IRC (Client Quit) |
22:45 | | keith20 has joined #archiveteam-bs |
22:45 | | Stilett0 has joined #archiveteam-bs |
22:46 | | Stiletto has quit IRC (Ping timeout: 360 seconds) |
22:51 | | Stiletto has joined #archiveteam-bs |
22:58 | | Stilett0 has quit IRC (Ping timeout: 745 seconds) |
23:01 | | Stilett0 has joined #archiveteam-bs |
23:04 | | Stiletto has quit IRC (Ping timeout: 745 seconds) |
23:20 | | BlueMax has joined #archiveteam-bs |
23:26 | | Stiletto has joined #archiveteam-bs |
23:27 | | Stilett0 has quit IRC (Ping timeout: 260 seconds) |
23:56 | | step has quit IRC (Quit: ZNC 1.7.5 - https://znc.in) |
23:56 | | Hooloovoo has joined #archiveteam-bs |