Time |
Nickname |
Message |
00:09
🔗
|
|
Nertsy has joined #archiveteam |
00:47
🔗
|
|
lytv has quit IRC (Read error: No route to host) |
00:48
🔗
|
|
lytv has joined #archiveteam |
01:12
🔗
|
|
achip has joined #archiveteam |
01:16
🔗
|
|
xk_id has joined #archiveteam |
01:24
🔗
|
|
lytv has quit IRC (Quit: Leaving) |
01:43
🔗
|
|
lytv has joined #archiveteam |
01:44
🔗
|
|
achip_ has joined #archiveteam |
01:44
🔗
|
|
achip has quit IRC (Read error: Connection reset by peer) |
01:54
🔗
|
|
toad has quit IRC (Leaving.) |
02:13
🔗
|
|
acridAxid has quit IRC (Quit: Quitting) |
02:16
🔗
|
|
acridAxid has joined #archiveteam |
02:25
🔗
|
|
mstevenso has joined #archiveteam |
02:35
🔗
|
|
toad1 has joined #archiveteam |
02:48
🔗
|
|
mstevenso has quit IRC (bye) |
03:06
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
03:25
🔗
|
|
primus104 has quit IRC (Leaving.) |
03:26
🔗
|
|
Ymgve has quit IRC () |
03:27
🔗
|
|
achip_ is now known as achip |
03:27
🔗
|
|
brayden has joined #archiveteam |
03:33
🔗
|
|
tev|stdby has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
will__ has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
matthusby has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
maltris has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
Jogie has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
midas has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
useretail has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
SadDM has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
DFJustin has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
pikhq has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
Marc has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:33
🔗
|
|
torvik has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:35
🔗
|
|
tev|stdby has joined #archiveteam |
03:35
🔗
|
|
will__ has joined #archiveteam |
03:35
🔗
|
|
matthusby has joined #archiveteam |
03:35
🔗
|
|
maltris has joined #archiveteam |
03:35
🔗
|
|
Jogie has joined #archiveteam |
03:35
🔗
|
|
midas has joined #archiveteam |
03:35
🔗
|
|
useretail has joined #archiveteam |
03:35
🔗
|
|
SadDM has joined #archiveteam |
03:35
🔗
|
|
DFJustin has joined #archiveteam |
03:35
🔗
|
|
pikhq has joined #archiveteam |
03:35
🔗
|
|
Marc has joined #archiveteam |
03:35
🔗
|
|
torvik has joined #archiveteam |
03:35
🔗
|
|
irc.shaw.ca sets mode: +oo SadDM DFJustin |
03:35
🔗
|
|
swebb sets mode: +o SadDM |
03:35
🔗
|
|
swebb sets mode: +o DFJustin |
03:35
🔗
|
|
xtr-107 has joined #archiveteam |
03:36
🔗
|
|
xtr-107 has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
tev|stdby has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
will__ has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
matthusby has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
maltris has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
Jogie has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
midas has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
useretail has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
SadDM has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
DFJustin has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
pikhq has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
Marc has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:36
🔗
|
|
torvik has quit IRC (ircd.shaw.ca irc.shaw.ca) |
03:37
🔗
|
|
xtr-107 has joined #archiveteam |
03:37
🔗
|
|
tev|stdby has joined #archiveteam |
03:37
🔗
|
|
will__ has joined #archiveteam |
03:37
🔗
|
|
matthusby has joined #archiveteam |
03:37
🔗
|
|
maltris has joined #archiveteam |
03:37
🔗
|
|
Jogie has joined #archiveteam |
03:37
🔗
|
|
midas has joined #archiveteam |
03:37
🔗
|
|
useretail has joined #archiveteam |
03:37
🔗
|
|
SadDM has joined #archiveteam |
03:37
🔗
|
|
DFJustin has joined #archiveteam |
03:37
🔗
|
|
pikhq has joined #archiveteam |
03:37
🔗
|
|
Marc has joined #archiveteam |
03:37
🔗
|
|
torvik has joined #archiveteam |
03:37
🔗
|
|
irc.shaw.ca sets mode: +oo SadDM DFJustin |
03:37
🔗
|
|
swebb sets mode: +o SadDM |
03:37
🔗
|
|
swebb sets mode: +o DFJustin |
04:37
🔗
|
|
xk_id has joined #archiveteam |
04:53
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
05:42
🔗
|
|
xk_id has joined #archiveteam |
05:55
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
05:57
🔗
|
|
dashcloud has quit IRC (Read error: Connection reset by peer) |
06:05
🔗
|
|
dashcloud has joined #archiveteam |
06:25
🔗
|
|
chfoo0 has joined #archiveteam |
06:27
🔗
|
|
chfoo has quit IRC (Ping timeout: 246 seconds) |
06:28
🔗
|
|
chfoo0 is now known as chfoo |
06:30
🔗
|
|
mistym has joined #archiveteam |
06:42
🔗
|
|
xk_id has joined #archiveteam |
06:48
🔗
|
|
mistym_ has joined #archiveteam |
06:54
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
06:54
🔗
|
|
mistym has quit IRC (Ping timeout: 512 seconds) |
07:47
🔗
|
|
xk_id has joined #archiveteam |
07:54
🔗
|
|
xk_id has quit IRC (Read error: Connection reset by peer) |
08:10
🔗
|
|
logchfoo_ starts logging #archiveteam at Mon Feb 16 08:10:18 2015 |
08:10
🔗
|
|
logchfoo_ has joined #archiveteam |
08:12
🔗
|
|
ryan_ has joined #archiveteam |
08:12
🔗
|
|
eprillios has quit IRC (Read error: Operation timed out) |
08:21
🔗
|
|
eprillios has joined #archiveteam |
08:26
🔗
|
|
Cameron_D has joined #archiveteam |
08:41
🔗
|
|
ryan_ has quit IRC (Remote host closed the connection) |
08:56
🔗
|
|
ryan_ has joined #archiveteam |
09:18
🔗
|
|
xk_id has joined #archiveteam |
09:31
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
09:51
🔗
|
|
acridAxid has quit IRC (Read error: Operation timed out) |
09:56
🔗
|
|
primus104 has joined #archiveteam |
10:01
🔗
|
|
acridAxid has joined #archiveteam |
10:03
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
10:07
🔗
|
|
dashcloud has joined #archiveteam |
10:13
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
10:43
🔗
|
|
ekke85 has joined #archiveteam |
10:48
🔗
|
ekke85 |
hello, not sure if this is the right place to ask, so i'll ask anyway :) i want to start archiving sites that i look after for a personal history of what sites looked like etc so i am trying to learn as much as possible about it. so warc files...is there a way i can serve sites from a warc file? i want to write my own interface in django and i have |
10:48
🔗
|
ekke85 |
most of it done, i can add a site to my interface and get har, screenshots and warc file but i am not sure what i can do with warc files.. any advice or documents that would be good to read would be nice |
10:48
🔗
|
|
xk_id has joined #archiveteam |
10:51
🔗
|
|
schbirid has joined #archiveteam |
10:57
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
11:03
🔗
|
|
wp494 has quit IRC (Ping timeout: 512 seconds) |
11:07
🔗
|
|
signius has quit IRC (Ping timeout: 335 seconds) |
11:19
🔗
|
|
signius has joined #archiveteam |
11:20
🔗
|
|
lemonkey has quit IRC (Read error: Operation timed out) |
11:21
🔗
|
|
dashcloud has quit IRC (Read error: Connection reset by peer) |
11:22
🔗
|
|
dashcloud has joined #archiveteam |
11:46
🔗
|
|
ekke85 has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client) |
11:47
🔗
|
|
ekke85 has joined #archiveteam |
11:51
🔗
|
|
xk_id has joined #archiveteam |
11:55
🔗
|
Nemo_bis |
ekke85: the information available is pretty much all in http://archiveteam.org/index.php?title=The_WARC_Ecosystem or linked from it |
11:56
🔗
|
ekke85 |
Nemo_bis: cool thanks, i am looking at the warc python libs now |
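For readers following ekke85's question about serving pages out of a WARC file, here is a minimal sketch of the lookup step, using only the Python standard library. It assumes an uncompressed WARC whose `response` records hold plain (not chunked or content-encoded) HTTP bodies; the helper names `iter_warc_records` and `index_responses` and the sample record are made up for illustration and are not taken from any of the libraries on the wiki page. A dedicated WARC library is likely the better choice for real archives.

```python
import io


def iter_warc_records(stream):
    """Yield (headers, block) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError("not a WARC record: %r" % line)
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the WARC header section
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length gives the size of the record block in bytes.
        block = stream.read(int(headers["content-length"]))
        yield headers, block


def index_responses(stream):
    """Map each captured URL to the HTTP body stored in its response record."""
    index = {}
    for headers, block in iter_warc_records(stream):
        if headers.get("warc-type") == "response":
            # The block is a raw HTTP response: status line and headers,
            # a blank line, then the body.
            _, _, body = block.partition(b"\r\n\r\n")
            index[headers["warc-target-uri"]] = body
    return index


# Hypothetical single-record WARC built in memory for demonstration.
http = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>hello</h1>"
record = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Target-URI: http://example.com/\r\n"
    b"Content-Length: " + str(len(http)).encode() + b"\r\n"
    b"\r\n" + http + b"\r\n\r\n"
)

index = index_responses(io.BytesIO(record))
print(index["http://example.com/"])
```

A Django view could then return `index[url]` for a requested URL. For `.warc.gz` files, opening the file with `gzip.open(path, "rb")` before passing it in should work, since the gzip module reads concatenated members as one stream.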
12:01
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
12:08
🔗
|
|
LordNigh2 has joined #archiveteam |
12:08
🔗
|
|
balrog sets mode: +o LordNigh2 |
12:09
🔗
|
|
Ymgve has joined #archiveteam |
12:11
🔗
|
|
Lord_Nigh has quit IRC (Ping timeout: 246 seconds) |
12:11
🔗
|
|
LordNigh2 is now known as Lord_Nigh |
12:16
🔗
|
|
ekke85 has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client) |
12:16
🔗
|
|
Void_ has quit IRC (Quit: OOOOoooooooooo................) |
12:51
🔗
|
|
xk_id has joined #archiveteam |
12:53
🔗
|
|
Void_ has joined #archiveteam |
13:00
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
13:07
🔗
|
|
lemonkey has joined #archiveteam |
13:34
🔗
|
|
xk_id has joined #archiveteam |
13:36
🔗
|
|
sankin has joined #archiveteam |
13:41
🔗
|
|
achip has quit IRC (Remote host closed the connection) |
13:52
🔗
|
|
K4k has joined #archiveteam |
13:57
🔗
|
|
K4k has quit IRC (Read error: Connection reset by peer) |
13:57
🔗
|
|
xk_id_ has joined #archiveteam |
13:58
🔗
|
|
K4k has joined #archiveteam |
13:58
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
14:11
🔗
|
|
yan has joined #archiveteam |
14:12
🔗
|
|
dashcloud has quit IRC (Read error: Connection reset by peer) |
14:22
🔗
|
|
K4k has quit IRC (Quit: WeeChat 1.0.1) |
14:49
🔗
|
|
dashcloud has joined #archiveteam |
14:53
🔗
|
ohhdemgir |
GUISE!! (sorry) I have a bunch of reddit json to upload to ia, should I upload as is or compress it first? |
15:00
🔗
|
|
xk_id_ has quit IRC (Remote host closed the connection) |
15:12
🔗
|
Nemo_bis |
compress... |
15:13
🔗
|
Nemo_bis |
CPU cycles spent on compression for such "obscure" stuff, to save spinning disks, are a good investment |
15:29
🔗
|
|
primus104 has quit IRC (Leaving.) |
16:00
🔗
|
|
xk_id has joined #archiveteam |
16:11
🔗
|
|
yan has quit IRC (Quit: leaving) |
16:17
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
16:46
🔗
|
ohhdemgir |
doing it |
16:46
🔗
|
ohhdemgir |
https://archive.org/details/reddit_json_07-2013 |
16:46
🔗
|
ohhdemgir |
got a few years to go |
16:52
🔗
|
Nemo_bis |
ohhdemgir: what does "07-2013" mean? |
16:52
🔗
|
yipdw |
it's US-centric date notation |
16:52
🔗
|
Nemo_bis |
OIC |
16:52
🔗
|
ohhdemgir |
yeah, that part wasn't me |
16:52
🔗
|
Nemo_bis |
could as well write ojfdsjflf as date format |
16:52
🔗
|
yipdw |
it should be addressed |
16:53
🔗
|
yipdw |
but meh |
16:53
🔗
|
yipdw |
that's what the Date fields are for |
16:53
🔗
|
Nemo_bis |
Nice, I'm downloading at 6 MB/s |
16:53
🔗
|
Nemo_bis |
Quite rare for an ia902[56]* server |
16:57
🔗
|
ohhdemgir |
i'm up against my drive, cpu and connection speed with these files, my drives are filling with ftp sites being mirrored while i'm extracting the original 7z files (100M can be 1G+) then renaming, recompressing and uploading... these need to be done fast -_- |
17:01
🔗
|
Nemo_bis |
aww |
17:01
🔗
|
Nemo_bis |
trying xz -9e and bzip2 -9 anyway |
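The comparison Nemo_bis describes can be approximated from Python's standard library. This is a rough sketch, not what was actually run in the channel: `lzma` with preset `9 | PRESET_EXTREME` stands in for `xz -9e`, `bz2` with `compresslevel=9` stands in for `bzip2 -9`, and the sample records and the `compressed_sizes` helper are invented placeholders rather than real reddit data.

```python
import bz2
import json
import lzma


def compressed_sizes(raw: bytes) -> dict:
    """Return the byte size of raw data and of its xz and bzip2 encodings."""
    return {
        "raw": len(raw),
        # Roughly equivalent to `xz -9e`.
        "xz -9e": len(lzma.compress(raw, preset=9 | lzma.PRESET_EXTREME)),
        # Roughly equivalent to `bzip2 -9`.
        "bzip2 -9": len(bz2.compress(raw, compresslevel=9)),
    }


# Made-up, highly repetitive JSON lines standing in for a comment dump.
sample = "\n".join(
    json.dumps({"id": i, "subreddit": "example", "body": "placeholder comment text"})
    for i in range(5000)
).encode("utf-8")

sizes = compressed_sizes(sample)
for name, size in sizes.items():
    print(f"{name:>9}: {size} bytes")
```

Which compressor wins depends on the data, so the numbers from this toy input say little about the real dumps; the point is only that both can be measured on a sample before committing CPU time to the full set.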
17:09
🔗
|
|
xk_id has joined #archiveteam |
17:11
🔗
|
ohhdemgir |
Nemo_bis, on that data? |
17:17
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
17:25
🔗
|
|
primus104 has joined #archiveteam |
17:37
🔗
|
|
primus104 has quit IRC (Leaving.) |
17:55
🔗
|
|
Nertsy has quit IRC (Ping timeout: 512 seconds) |
18:00
🔗
|
|
xk_id has joined #archiveteam |
18:21
🔗
|
|
Nertsy has joined #archiveteam |
18:26
🔗
|
|
primus104 has joined #archiveteam |
18:28
🔗
|
|
Kniffy has quit IRC (Quit: pup) |
18:30
🔗
|
|
Kniffy has joined #archiveteam |
18:31
🔗
|
|
Nertsy has quit IRC (Read error: Operation timed out) |
18:34
🔗
|
|
Nertsy has joined #archiveteam |
18:36
🔗
|
|
primus104 has quit IRC (Leaving.) |
18:37
🔗
|
|
aaaaaaaaa has joined #archiveteam |
18:40
🔗
|
|
primus104 has joined #archiveteam |
18:40
🔗
|
|
primus104 has quit IRC (Client Quit) |
18:55
🔗
|
Nemo_bis |
ohhdemgir: Comments_2013-07-18.json.tar.xz is 126 MB |
18:56
🔗
|
ohhdemgir |
ohh yea, you'll be able to get those raw .json files down a lot more than i have |
18:57
🔗
|
ohhdemgir |
i just went with default options and unpacking the original 7z's i had laying around, the full set in 7z was like 12.8GB |
18:57
🔗
|
ohhdemgir |
just prepping 2012 and 13 to go up now |
19:47
🔗
|
|
xtr-107 has quit IRC (Read error: Connection reset by peer) |
20:01
🔗
|
|
mistym_ has quit IRC (Remote host closed the connection) |
20:28
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
20:29
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
20:32
🔗
|
|
primus104 has joined #archiveteam |
20:32
🔗
|
|
dashcloud has joined #archiveteam |
20:36
🔗
|
|
xtr-201 has joined #archiveteam |
21:25
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
21:25
🔗
|
|
bsmith093 has quit IRC (Remote host closed the connection) |
21:29
🔗
|
|
bsmith093 has joined #archiveteam |
21:38
🔗
|
|
ionpulse has quit IRC (Ping timeout: 265 seconds) |
21:52
🔗
|
|
sankin has quit IRC (Leaving.) |
22:00
🔗
|
|
BlueMaxim has joined #archiveteam |
22:06
🔗
|
|
thechip_ has joined #archiveteam |
22:06
🔗
|
|
thechip has quit IRC (Read error: Connection reset by peer) |
22:07
🔗
|
|
thechip_ is now known as thechip |
22:10
🔗
|
|
thechip has quit IRC (Client Quit) |
22:10
🔗
|
|
thechip has joined #archiveteam |
22:11
🔗
|
|
wp494 has joined #archiveteam |
22:39
🔗
|
arkiver |
tomorrow I'll get a few other upcoming projects running |
22:43
🔗
|
|
xk_id has joined #archiveteam |
22:47
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
22:50
🔗
|
|
dashcloud has joined #archiveteam |
22:59
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
23:07
🔗
|
|
xk_id has joined #archiveteam |
23:29
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
23:44
🔗
|
|
Morbus has joined #archiveteam |
23:45
🔗
|
|
MorbusIff has quit IRC (Read error: Operation timed out) |
23:54
🔗
|
ohhdemgir |
Nemo_bis, Done uploading https://archive.org/details/reddit_json_07-2013 |
23:54
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |