Time |
Nickname |
Message |
00:02
🔗
|
ferminter |
should I just recrawl it? (it might take a long time) |
00:02
🔗
|
astrid |
i would make a backup of the file and then edit it |
00:02
🔗
|
ferminter |
I think maybe I could modify warcio or something to output a corrected version |
00:03
🔗
|
ferminter |
Or I could just regex/sed/awk, but that would break any times WARC-Target-URI appeared in the HTML itself (which is doubtful, but still theoretically possible) |
00:03
🔗
|
astrid |
isn't the html usually gzipped |
00:03
🔗
|
astrid |
or are the headers gzipped too |
00:03
🔗
|
ferminter |
oh wait not sure |
00:03
🔗
|
ferminter |
I'll check |
00:04
🔗
|
ferminter |
looks like the headers are gzipped |
00:04
🔗
|
astrid |
ok |
00:05
🔗
|
astrid |
basic rule we've followed around editing warcs in the past is, if you can provide a tool to perfectly reverse the edits and come out with a file bit-for-bit identical with the original input, then it's fine |
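A rough sketch of the rule astrid describes, combined with ferminter's worry about touching payload HTML. It edits only the WARC header block of a single record, keeps an undo log, and checks that reversing the edit gives back the original bytes. Everything here is an assumption made for illustration: the function names `strip_brackets` and `restore` are hypothetical, the sample record is invented, and the "issue" is assumed to be angle brackets wrapped around the `WARC-Target-URI` value. It also works on an already decompressed record. Since the headers in the real file are gzipped, reproducing the original compressed bytes is a separate problem that this sketch does not address.

```python
import hashlib

CRLF = b"\r\n"
HEADER_END = b"\r\n\r\n"          # blank line that ends the WARC header block
FIELD = b"WARC-Target-URI: "


def strip_brackets(record: bytes):
    """Drop <...> around WARC-Target-URI in the WARC header block only.

    Returns (edited_record, undo_log). The undo log holds
    (line_index, original_line) pairs, which is enough to reverse the edit.
    """
    head, sep, body = record.partition(HEADER_END)
    lines = head.split(CRLF)
    undo = []
    for index, line in enumerate(lines):
        if not line.startswith(FIELD):
            continue
        value = line[len(FIELD):]
        if value.startswith(b"<") and value.endswith(b">"):
            undo.append((index, line))
            lines[index] = FIELD + value[1:-1]
    # The body (HTTP headers and HTML) is passed through untouched.
    return CRLF.join(lines) + sep + body, undo


def restore(edited: bytes, undo):
    """Reverse strip_brackets using its undo log."""
    head, sep, body = edited.partition(HEADER_END)
    lines = head.split(CRLF)
    for index, original_line in undo:
        lines[index] = original_line
    return CRLF.join(lines) + sep + body


# Invented sample record; the payload deliberately repeats the header text.
sample = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Target-URI: <http://example.com/>\r\n"
    b"Content-Length: 64\r\n"
    b"\r\n"
    b"HTTP/1.1 200 OK\r\n\r\n"
    b"<p>WARC-Target-URI: <http://example.com/></p>"
    b"\r\n\r\n"
)

edited, undo = strip_brackets(sample)
roundtrip = restore(edited, undo)

# The bit-for-bit check: hashes of original and restored bytes must match.
same = hashlib.sha256(roundtrip).digest() == hashlib.sha256(sample).digest()
print("reversible:", same)
```

Splitting at the first blank line is what keeps a blind search-and-replace from rewriting the same string inside the HTML. The undo log would need to be stored alongside the edited file for the reversal to be possible later.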
01:07
🔗
|
|
BobJonkma has quit IRC (Read error: Operation timed out) |
01:43
🔗
|
|
pizzaiolo has quit IRC (Remote host closed the connection) |
01:47
🔗
|
|
dashcloud has quit IRC (Ping timeout: 492 seconds) |
01:51
🔗
|
|
dashcloud has joined #archiveteam |
02:01
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
02:01
🔗
|
|
RichardG has joined #archiveteam |
02:25
🔗
|
|
Morbus has quit IRC (Ping timeout: 255 seconds) |
03:17
🔗
|
|
ferminter has quit IRC (http://www.mibbit.com ajax IRC Client) |
04:01
🔗
|
|
Stilett0 has joined #archiveteam |
04:18
🔗
|
|
BlueMax has joined #archiveteam |
04:20
🔗
|
|
qw3rty115 has joined #archiveteam |
04:23
🔗
|
|
qw3rty114 has quit IRC (Read error: Operation timed out) |
06:05
🔗
|
|
ferminter has joined #archiveteam |
06:05
🔗
|
ferminter |
IDK if this will be helpful for anyone, but I wrote a quick script to fix that issue: https://gist.github.com/sudonym1/e607abe127f058e9027301e53994b910 |
06:05
🔗
|
ferminter |
probably should've used warcio instead of warctools, since the former works with python 3, but oh well |
06:06
🔗
|
ferminter |
(Wget 1.19 broken angle bracket issue) |
06:06
🔗
|
|
ferminter has quit IRC (Client Quit) |
07:39
🔗
|
|
dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.) |
07:39
🔗
|
|
dashcloud has joined #archiveteam |
07:41
🔗
|
|
schbirid has joined #archiveteam |
08:30
🔗
|
|
muramasa has joined #archiveteam |
08:42
🔗
|
|
atomotic has joined #archiveteam |
09:05
🔗
|
|
BlueMaxim has joined #archiveteam |
09:09
🔗
|
|
atomotic has quit IRC (Ping timeout: 260 seconds) |
09:10
🔗
|
|
atomotic has joined #archiveteam |
09:11
🔗
|
|
BlueMax has quit IRC (Read error: Operation timed out) |
09:17
🔗
|
|
RichardG has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
will has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
Smiley has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
BnAboyZ has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
kisspunch has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
Zebranky has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
MrRadar2 has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
BnARobin has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
jtn2 has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
Tenebrae has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
Fusl has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
hook54321 has quit IRC (se.hub irc.efnet.nl) |
09:17
🔗
|
|
Polylith has quit IRC (se.hub irc.efnet.nl) |
09:37
🔗
|
|
BlueMaxim has quit IRC (Leaving) |
11:20
🔗
|
|
pizzaiolo has joined #archiveteam |
11:22
🔗
|
|
mabynogy has joined #archiveteam |
11:32
🔗
|
|
FalconK has quit IRC (Ping timeout: 260 seconds) |
11:37
🔗
|
|
RichardG has joined #archiveteam |
11:37
🔗
|
|
will has joined #archiveteam |
11:37
🔗
|
|
Smiley has joined #archiveteam |
11:37
🔗
|
|
BnAboyZ has joined #archiveteam |
11:37
🔗
|
|
kisspunch has joined #archiveteam |
11:37
🔗
|
|
Zebranky has joined #archiveteam |
11:37
🔗
|
|
MrRadar2 has joined #archiveteam |
11:37
🔗
|
|
BnARobin has joined #archiveteam |
11:37
🔗
|
|
jtn2 has joined #archiveteam |
11:37
🔗
|
|
Tenebrae has joined #archiveteam |
11:37
🔗
|
|
Fusl has joined #archiveteam |
11:37
🔗
|
|
hook54321 has joined #archiveteam |
11:37
🔗
|
|
Polylith has joined #archiveteam |
11:37
🔗
|
|
irc.efnet.nl sets mode: +o MrRadar2 |
12:04
🔗
|
|
atomotic has quit IRC (Quit: atomotic) |
12:09
🔗
|
|
Morbus has joined #archiveteam |
12:28
🔗
|
|
FalconK has joined #archiveteam |
12:45
🔗
|
|
ranavalon has joined #archiveteam |
12:47
🔗
|
|
bitspill has quit IRC () |
12:48
🔗
|
|
bitspill has joined #archiveteam |
13:40
🔗
|
|
DrasticAc has quit IRC () |
13:40
🔗
|
|
DrasticAc has joined #archiveteam |
13:41
🔗
|
wp494 |
Oddshot.tv (basically Twitch clips before Twitch added clips natively) is shutting down this coming Monday: https://www.reddit.com/r/GlobalOffensive/comments/7wd4qh/psa_oddshot_is_shutting_down/ |
13:42
🔗
|
wp494 |
most of it might be stuff that could be found elsewhere already, while other bits may be from streams that got taken down for one reason or another |
13:43
🔗
|
wp494 |
also dxrt, joepie91, MrRadar, SketchCow, please spread the +o |
13:44
🔗
|
wp494 |
dunno what happened to the whole auto-op thing we had going a while back but I guess that doesn't work anymore so /shrug |
13:58
🔗
|
|
atomotic has joined #archiveteam |
14:31
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 252 seconds) |
14:31
🔗
|
|
Mateon1 has joined #archiveteam |
14:56
🔗
|
|
riking has quit IRC () |
14:56
🔗
|
|
riking has joined #archiveteam |
14:57
🔗
|
|
ThisAsYou has quit IRC () |
14:57
🔗
|
|
ThisAsYou has joined #archiveteam |
15:08
🔗
|
|
dogsrcool has joined #archiveteam |
15:09
🔗
|
dogsrcool |
WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD |
15:12
🔗
|
Sanqui |
dogsrcool: what is your quest? |
15:13
🔗
|
dogsrcool |
to get a wiki account |
15:28
🔗
|
Sanqui |
dogsrcool: sure but, what do you want to edit/add? |
15:52
🔗
|
|
Burak has joined #archiveteam |
15:56
🔗
|
Burak |
Oddshot.tv, the stream clip hosting service, is shutting down in three days - https://medium.com/the-oddshot-loop/end-of-an-era-aefeca0420bf |
15:56
🔗
|
Burak |
Could AT archive it? |
16:00
🔗
|
|
atrocity has joined #archiveteam |
16:29
🔗
|
|
ZexaronS has joined #archiveteam |
16:29
🔗
|
|
SketchCow sets mode: +o wp494 |
16:30
🔗
|
|
SketchCow sets mode: +oooo db48x arkiver atluxity antomatic |
16:30
🔗
|
|
SketchCow sets mode: +oooo astrid chfoo closure dashcloud |
16:30
🔗
|
|
SketchCow sets mode: +o DFJustin |
16:39
🔗
|
* |
JAA raises hand |
16:50
🔗
|
|
SketchCow sets mode: +o JAA |
16:54
🔗
|
JAA |
Thanks |
16:54
🔗
|
|
JAA changes topic to: Archive Team: We're not archive.org | https://archiveteam.org/ | Lengthy discussions: #archiveteam-bs | Offtopic: #archiveteam-ot | Yes, we know about Oddshot.t |
16:54
🔗
|
|
JAA changes topic to: Archive Team: We're not archive.org | https://archiveteam.org/ | Lengthy discussions: #archiveteam-bs | Offtopic: #archiveteam-ot | We know about Oddshot.tv |
17:25
🔗
|
|
atomotic has quit IRC (Quit: atomotic) |
17:49
🔗
|
|
MMovie2 has joined #archiveteam |
17:50
🔗
|
|
MMovie has quit IRC (Read error: Operation timed out) |
17:59
🔗
|
|
icedice has joined #archiveteam |
18:00
🔗
|
|
ZexaronS has quit IRC (Quit: Leaving) |
19:06
🔗
|
|
pizzaiolo has quit IRC (Remote host closed the connection) |
19:09
🔗
|
|
pizzaiolo has joined #archiveteam |
19:13
🔗
|
|
Ctrl has quit IRC (Read error: Operation timed out) |
19:14
🔗
|
|
w0rp has quit IRC (Read error: Operation timed out) |
19:16
🔗
|
|
w0rp has joined #archiveteam |
19:22
🔗
|
|
REiN^ has quit IRC (Read error: Operation timed out) |
19:24
🔗
|
|
icedice2 has joined #archiveteam |
19:27
🔗
|
|
icedice2 has quit IRC (Client Quit) |
19:27
🔗
|
|
icedice has quit IRC (Read error: Operation timed out) |
19:35
🔗
|
|
BobJonkma has joined #archiveteam |
19:48
🔗
|
|
don_ has quit IRC (Quit: WeeChat 1.6) |
19:52
🔗
|
|
don_ has joined #archiveteam |
20:19
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
20:29
🔗
|
|
jschwart has joined #archiveteam |
20:32
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
20:34
🔗
|
|
RichardG has joined #archiveteam |
20:34
🔗
|
|
schbirid has joined #archiveteam |
20:51
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
21:35
🔗
|
|
mabynogy has quit IRC (Quit: dpt.slasheva.com) |
21:42
🔗
|
|
REiN^ has joined #archiveteam |
21:44
🔗
|
|
ZexaronS has joined #archiveteam |
21:45
🔗
|
|
mabynogy has joined #archiveteam |
21:50
🔗
|
|
BlueMax has joined #archiveteam |
22:12
🔗
|
|
mabynogy has quit IRC (Quit: dpt.slasheva.com) |
22:22
🔗
|
|
wp494 sets mode: +ooo FalconK Sanqui swebb |
22:22
🔗
|
|
swebb sets mode: +o Jonimus |
22:22
🔗
|
|
swebb sets mode: +o altlabel |
22:22
🔗
|
|
swebb sets mode: +o balrog |
22:22
🔗
|
|
swebb sets mode: +o beardicus |
22:22
🔗
|
|
swebb sets mode: +o dcmorton |
22:22
🔗
|
|
swebb sets mode: +o edsu |
22:22
🔗
|
|
swebb sets mode: +o midas3 |
22:22
🔗
|
|
swebb sets mode: +o mistym |
22:22
🔗
|
wp494 |
oh. |
22:22
🔗
|
wp494 |
welp, ok, guess there is some auto-opping after all |
22:33
🔗
|
|
pizzaiolo has quit IRC (pizzaiolo) |
22:33
🔗
|
|
pizzaiolo has joined #archiveteam |
22:37
🔗
|
|
pizzaiolo has quit IRC (Client Quit) |
22:38
🔗
|
|
godane has quit IRC (Read error: Operation timed out) |
22:38
🔗
|
|
pizzaiolo has joined #archiveteam |
22:49
🔗
|
|
godane has joined #archiveteam |
22:49
🔗
|
|
godane has quit IRC (Client Quit) |
23:04
🔗
|
|
jschwart has quit IRC (Quit: Konversation terminated!) |
23:18
🔗
|
|
tomaspark has joined #archiveteam |
23:19
🔗
|
|
tomaspark has quit IRC (Client Quit) |
23:28
🔗
|
|
BlueMax has quit IRC (Leaving) |
23:34
🔗
|
|
REiN^ has quit IRC (Read error: Operation timed out) |
23:43
🔗
|
|
icedice has joined #archiveteam |