Time |
Nickname |
Message |
00:02
🔗
|
|
philpem has quit IRC (Ping timeout: 252 seconds) |
00:18
🔗
|
|
primus104 has quit IRC (Leaving.) |
00:24
🔗
|
|
marvinw has quit IRC (Read error: Operation timed out) |
00:34
🔗
|
|
BlueMaxim has joined #archiveteam |
00:53
🔗
|
SketchCow |
DFJustin: Killing the Touhou |
00:53
🔗
|
SketchCow |
Main issue is that it blows right out to a directory and it's doing it with unicode |
00:53
🔗
|
SketchCow |
I feel like that won't survive to the archive |
00:57
🔗
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
00:57
🔗
|
|
BlueMaxim has joined #archiveteam |
01:10
🔗
|
|
aaaaaaaaa has quit IRC (Read error: Connection reset by peer) |
01:14
🔗
|
|
aaaaaaaaa has joined #archiveteam |
01:14
🔗
|
|
swebb sets mode: +o aaaaaaaaa |
01:22
🔗
|
DFJustin |
unicode works ok since the v2 |
01:24
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
01:28
🔗
|
|
dashcloud has joined #archiveteam |
01:28
🔗
|
|
MMovie2 has joined #archiveteam |
01:29
🔗
|
|
MMovie has quit IRC (Ping timeout: 306 seconds) |
01:33
🔗
|
|
mistym has joined #archiveteam |
01:56
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
02:00
🔗
|
|
dashcloud has joined #archiveteam |
02:07
🔗
|
|
Jonimus has joined #archiveteam |
02:13
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
02:26
🔗
|
|
qrstuv has joined #archiveteam |
02:37
🔗
|
|
tomwsmf-a has quit IRC (Read error: Operation timed out) |
02:38
🔗
|
|
mistym has joined #archiveteam |
03:45
🔗
|
|
marvinw has joined #archiveteam |
04:12
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
04:52
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
04:55
🔗
|
|
dashcloud has joined #archiveteam |
05:09
🔗
|
|
Emcy has quit IRC (Read error: Connection reset by peer) |
05:27
🔗
|
|
vitzli has joined #archiveteam |
05:52
🔗
|
|
microguru has joined #archiveteam |
05:52
🔗
|
microguru |
Hello. I've just found out about Archive team. I support your cause and am running a Warrior as we speak. |
06:00
🔗
|
xmc |
great! |
06:12
🔗
|
microguru |
is it OK if i only run the warrior for a few hours a day and with a limit of 1 MBp/s? |
06:15
🔗
|
|
bzc6p_ is now known as bzc6p |
06:15
🔗
|
xmc |
if you stop it in an orderly way, that should be fine |
06:15
🔗
|
xmc |
as in stop it how the documentation says you should, and waiting for it to finish before you close it down |
06:15
🔗
|
microguru |
I use the "stop warrior" button on the web interface to stop it every time |
06:16
🔗
|
xmc |
sounds good! |
06:16
🔗
|
microguru |
alright. |
06:17
🔗
|
microguru |
you guys (we?) do some pretty good work. I first found out about archive team after looking for a file on pomf.se and being told archive team made an archive |
06:18
🔗
|
xmc |
if you're running a warrior, you definitely get to say we :) |
06:19
🔗
|
xmc |
did you get the thing you wanted out of pomf? |
06:19
🔗
|
microguru |
I realised the importance of archiving after losing one too many youtube videos to the DMCA and started youtube-dl'ing everything |
06:19
🔗
|
xmc |
:| |
06:19
🔗
|
microguru |
yeah. I know |
06:20
🔗
|
microguru |
Since then I've archived at least ~500 GB ish for my personal use |
06:20
🔗
|
microguru |
mostly videos and websites (sometimes full copies via wget) |
06:21
🔗
|
* |
xmc nod |
06:21
🔗
|
xmc |
archivebot is good for that sort of thing |
06:21
🔗
|
microguru |
archivebot? |
06:21
🔗
|
xmc |
i throw stuff into archivebot if it seems useful, because i don't really trust anything |
06:21
🔗
|
xmc |
ooh |
06:21
🔗
|
xmc |
you're in for a treat |
06:21
🔗
|
xmc |
join #archivebot on this server |
06:22
🔗
|
xmc |
you can submit wget jobs, which get downloaded and then sent to web.archive.org |
06:22
🔗
|
xmc |
turnaround time for going into the archive is usually a day or so after the download completes |
06:22
🔗
|
microguru |
cool! |
06:23
🔗
|
microguru |
I use "wget -k -m -p -c --wait=10 --random-wait" for my archiving needs |
06:23
🔗
|
* |
xmc nod |
06:23
🔗
|
xmc |
archivebot lets you wander into irc, say "this website is cool, go download it" and the bot takes care of everything else |
06:23
🔗
|
microguru |
the --wait=10 --random-wait makes everything take forever, but it keeps me unbanned |
06:23
🔗
|
xmc |
you have to keep an eye on it in case it gets stuck in a corner, but that's less and less lately |
06:24
🔗
|
xmc |
ya |
06:25
🔗
|
microguru |
just tried out archivebot on http://donh.best.vwh.net for a test |
06:25
🔗
|
microguru |
I'm surprised that website is still up |
06:26
🔗
|
xmc |
what is it? |
06:26
🔗
|
microguru |
it's a personal webpage |
06:27
🔗
|
microguru |
it had a copy of a book that I really liked (http://donh.best.vwh.net/Esperanto/eaccess/eaccess.book.html), so I kept a copy |
06:27
🔗
|
* |
xmc nod |
06:28
🔗
|
microguru |
that book is why I'm interested in Esperanto. |
06:28
🔗
|
microguru |
lots of other good things there too |
06:30
🔗
|
microguru |
depending on what it is I'm preserving for personal use, I've used print-to-file, wget, and screenshots |
06:31
🔗
|
xmc |
archivebot has a phantomjs mode where it executes javascript and scrolls to the bottom of the page, one pagedown at a time |
06:31
🔗
|
xmc |
it's kind of brute force and it works pretty well |
06:32
🔗
|
microguru |
archivebot's really cool |
06:32
🔗
|
microguru |
not only does it make copies for me, but everyone else can have a copy too without having to email me for it |
06:32
🔗
|
xmc |
yuuuup |
06:32
🔗
|
xmc |
and the copies go in an obvious place |
06:33
🔗
|
microguru |
speaking of that, I wonder how many other people keep personal archives |
06:33
🔗
|
microguru |
there should be a website where people post what things they have and are willing to share, and people can request a copy |
06:33
🔗
|
xmc |
hmmm |
06:34
🔗
|
microguru |
i'd suggest that said website not personally handle the files, for bandwidth and copyright reasons |
06:35
🔗
|
microguru |
people would have to use file hosts to transfer the files |
06:35
🔗
|
xmc |
copyright is not a concern for archiveteam |
06:37
🔗
|
microguru |
exactly, because just about everything that's being archived is copyrighted |
06:37
🔗
|
xmc |
it's just a conversation that we've had too much |
06:38
🔗
|
microguru |
the Library of Congress archives stuff, so why not us? |
06:38
🔗
|
yipdw |
archivebot also has an improvement list three miles long, no doubt |
06:38
🔗
|
yipdw |
and I wish all I had to do was rub like so, and oh - |
06:38
🔗
|
yipdw |
shit that doesn't really scan the same way, does it |
06:39
🔗
|
xmc |
they have different meter |
06:39
🔗
|
microguru |
given that copyright isn't a concern, is there anything off the table for archiving? |
06:39
🔗
|
xmc |
the second one is almost iambic |
06:40
🔗
|
xmc |
microguru: we prioritize by size/benefit |
06:41
🔗
|
microguru |
so that means something that's hugely important text gets prioritized over not very important videos? |
06:41
🔗
|
xmc |
yeah |
06:41
🔗
|
microguru |
I was thinking that illegal things and encrypted files would be prohibited |
06:42
🔗
|
xmc |
more 'evil' than 'illegal' |
06:42
🔗
|
xmc |
many illegal things aren't actually wrong |
06:42
🔗
|
xmc |
many wrong things aren't illegal |
06:42
🔗
|
microguru |
although maybe even illegal things provided there's a good reason (archiving hate speech for historical analysis) |
06:43
🔗
|
microguru |
"many illegal things aren't actually wrong; many wrong things aren't illegal" isn't that the truth |
06:44
🔗
|
microguru |
kind of like that guy I read about in the news with his basement full of KKK pamphlets he kept for future historians |
06:45
🔗
|
yipdw |
at some point it might be useful to take this to #archiveteam-bs |
06:46
🔗
|
xmc |
i'm of two minds |
06:46
🔗
|
microguru |
http://america.aljazeera.com/watch/shows/america-tonight/articles/2015/7/8/Inside-a-security-experts-collection-of-hateful-artifacts.html |
06:57
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
07:00
🔗
|
microguru |
ok |
07:00
🔗
|
|
microguru has left |
07:28
🔗
|
|
primus104 has joined #archiveteam |
07:32
🔗
|
|
schbirid has joined #archiveteam |
07:40
🔗
|
|
vitzli has quit IRC (Quit: Leaving) |
07:55
🔗
|
|
bzc6p has quit IRC (Read error: Connection reset by peer) |
07:58
🔗
|
|
mistym has joined #archiveteam |
08:04
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
08:31
🔗
|
|
primus104 has quit IRC (Leaving.) |
08:33
🔗
|
|
atomotic has joined #archiveteam |
09:19
🔗
|
|
vitzli has joined #archiveteam |
09:52
🔗
|
|
xk_id has joined #archiveteam |
10:03
🔗
|
|
mistym has joined #archiveteam |
10:08
🔗
|
|
bzc6p has joined #archiveteam |
10:08
🔗
|
|
swebb sets mode: +o bzc6p |
10:11
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
10:11
🔗
|
|
zyphlar has quit IRC (Read error: Connection reset by peer) |
10:11
🔗
|
|
codl_ has quit IRC (Ping timeout: 252 seconds) |
10:11
🔗
|
|
Boltsie has quit IRC (Ping timeout: 252 seconds) |
10:11
🔗
|
|
russss__ has quit IRC (Ping timeout: 252 seconds) |
10:11
🔗
|
|
deathy has quit IRC (Ping timeout: 252 seconds) |
10:11
🔗
|
|
Ctrl-S has quit IRC (Read error: Connection reset by peer) |
10:12
🔗
|
|
zyphlar has joined #archiveteam |
10:12
🔗
|
|
codl_ has joined #archiveteam |
10:12
🔗
|
|
russss__ has joined #archiveteam |
10:12
🔗
|
|
Ctrl-S has joined #archiveteam |
10:12
🔗
|
|
Boltsie has joined #archiveteam |
10:13
🔗
|
|
deathy has joined #archiveteam |
10:25
🔗
|
nico_32 |
apinc.org, a French hoster, will close its shared hosting in 2 months |
10:26
🔗
|
Nemo_bis |
https://twitter.com/aubreymcfato/status/618466994317316096 |
10:26
🔗
|
nico_32 |
18k site in google |
10:26
🔗
|
nico_32 |
*referenced by* |
10:28
🔗
|
nico_32 |
i really need to create an account on archiveteam.org |
11:20
🔗
|
|
primus104 has joined #archiveteam |
11:21
🔗
|
Ctrl-S |
yahoosucks |
11:21
🔗
|
|
bzc6p_ has joined #archiveteam |
11:21
🔗
|
|
swebb sets mode: +o bzc6p_ |
11:26
🔗
|
|
bzc6p has quit IRC (Read error: Operation timed out) |
11:38
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
11:43
🔗
|
|
RichardG has joined #archiveteam |
11:51
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
12:04
🔗
|
|
mistym has joined #archiveteam |
12:04
🔗
|
|
ats has quit IRC (Read error: Connection reset by peer) |
12:06
🔗
|
|
oldcad has joined #archiveteam |
12:09
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
12:10
🔗
|
|
ats has joined #archiveteam |
12:11
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
12:23
🔗
|
|
xk_id has joined #archiveteam |
12:27
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
13:06
🔗
|
|
bzc6p_ is now known as bzc6p |
13:12
🔗
|
|
philpem has joined #archiveteam |
13:29
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
13:32
🔗
|
|
Medowar has quit IRC (Quit: Leaving) |
13:37
🔗
|
|
Froggypwn has quit IRC (Read error: Connection reset by peer) |
13:38
🔗
|
|
Froggypwn has joined #archiveteam |
14:16
🔗
|
|
atomotic has joined #archiveteam |
14:19
🔗
|
|
primus104 has quit IRC (Leaving.) |
14:32
🔗
|
|
mistym has joined #archiveteam |
14:51
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
15:09
🔗
|
|
Jonimus has quit IRC (Ping timeout: 370 seconds) |
15:25
🔗
|
|
Emcy has joined #archiveteam |
15:33
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
15:34
🔗
|
|
jmc_ has joined #archiveteam |
15:35
🔗
|
|
jmc has quit IRC (Ping timeout: 255 seconds) |
15:42
🔗
|
|
primus104 has joined #archiveteam |
15:53
🔗
|
|
nox has quit IRC (Read error: Connection reset by peer) |
16:11
🔗
|
|
nox has joined #archiveteam |
16:14
🔗
|
|
primus104 has quit IRC (Leaving.) |
16:23
🔗
|
|
mistym has joined #archiveteam |
16:27
🔗
|
|
SimpBrain has joined #archiveteam |
16:42
🔗
|
|
tomwsmf-a has joined #archiveteam |
16:46
🔗
|
|
philpem has quit IRC (Ping timeout: 252 seconds) |
16:55
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
16:58
🔗
|
|
dashcloud has joined #archiveteam |
17:34
🔗
|
|
aaaaaaaaa has joined #archiveteam |
17:34
🔗
|
|
swebb sets mode: +o aaaaaaaaa |
17:42
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
17:43
🔗
|
|
Start has joined #archiveteam |
17:48
🔗
|
|
habi has joined #archiveteam |
18:05
🔗
|
|
habi has quit IRC (Quit: Leaving.) |
18:17
🔗
|
|
K4k has joined #archiveteam |
18:19
🔗
|
|
primus104 has joined #archiveteam |
18:34
🔗
|
HCross |
Any big projects on atm except URLTeam? |
18:43
🔗
|
|
vitzli has quit IRC (Quit: Leaving) |
19:00
🔗
|
|
aliz has quit IRC (hub.se irc.du.se) |
19:00
🔗
|
|
Rotab has quit IRC (hub.se irc.du.se) |
19:00
🔗
|
|
Boppen has quit IRC (hub.se irc.du.se) |
19:21
🔗
|
|
aliz has joined #archiveteam |
19:21
🔗
|
|
Rotab has joined #archiveteam |
19:23
🔗
|
|
Boppen has joined #archiveteam |
19:44
🔗
|
|
atomotic has joined #archiveteam |
19:50
🔗
|
|
khaoohs_ is now known as khaoohs |
19:52
🔗
|
|
mistym has quit IRC (Ping timeout: 252 seconds) |
20:03
🔗
|
|
SimpBrain has quit IRC (Quit: Leaving) |
20:06
🔗
|
SketchCow |
The usual large downloads. |
20:06
🔗
|
SketchCow |
Wiki keeps a lot of them |
20:09
🔗
|
|
Froggypwn has quit IRC (Read error: Connection reset by peer) |
20:10
🔗
|
|
Froggypwn has joined #archiveteam |
20:12
🔗
|
|
pfallenop has joined #archiveteam |
20:21
🔗
|
|
xtr-201 has quit IRC (Read error: Connection reset by peer) |
20:29
🔗
|
|
schbirid has quit IRC (Leaving) |
20:40
🔗
|
|
mistym has joined #archiveteam |
20:46
🔗
|
|
Froggypwn has quit IRC (Read error: Connection reset by peer) |
20:48
🔗
|
|
Froggypwn has joined #archiveteam |
21:16
🔗
|
|
K4k has quit IRC (Read error: Connection reset by peer) |
21:16
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
21:18
🔗
|
|
K4k has joined #archiveteam |
21:43
🔗
|
|
mistym_ has joined #archiveteam |
21:43
🔗
|
|
dashcloud has quit IRC (Read error: Connection reset by peer) |
21:45
🔗
|
|
dashcloud has joined #archiveteam |
21:45
🔗
|
|
ripvanwin has quit IRC (Read error: Operation timed out) |
21:45
🔗
|
|
xtr-201 has joined #archiveteam |
21:49
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
22:05
🔗
|
|
chfoo has quit IRC (Remote host closed the connection) |
22:09
🔗
|
SketchCow |
REDDIT CEO OUT |
22:09
🔗
|
SketchCow |
Obviously, the new guy is interim and we should replace him with godane |
22:10
🔗
|
sb057 |
I guess the no-reddit-day thing worked |
22:11
🔗
|
|
philpem has joined #archiveteam |
22:11
🔗
|
|
chfoo has joined #archiveteam |
22:17
🔗
|
|
habi has joined #archiveteam |
22:20
🔗
|
|
tomwsmf-a has quit IRC (Ping timeout: 258 seconds) |
22:20
🔗
|
|
xk_id has joined #archiveteam |
22:20
🔗
|
|
habi has left |
22:30
🔗
|
godane |
SketchCow: i would be the guy to get reddit subs to have git repos |
22:31
🔗
|
godane |
that way EVERYTHING is saved |
22:31
🔗
|
xmc |
hah |
22:32
🔗
|
godane |
a way to make reddit portable for meshnet |
22:32
🔗
|
godane |
that and maybe web archives for urls after they're submitted |
22:32
🔗
|
godane |
there will still be a live link but also an archive link |
22:34
🔗
|
godane |
SketchCow: also dailymail.co.uk has a full archive up to 2004-01 |
22:37
🔗
|
|
K4k has quit IRC (Ping timeout: 186 seconds) |
22:38
🔗
|
goekesmi_ |
godane: this may be relevant to your interests: https://speakerdeck.com/ussjoin/the-perfectly-legitimate-project |
22:41
🔗
|
|
BlueMaxim has joined #archiveteam |
22:45
🔗
|
yipdw |
great, now we're going to get a bunch of people saving every goddamn reddit thread again |
22:45
🔗
|
yipdw |
at least they'll go "fast" |
22:49
🔗
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
22:52
🔗
|
|
superkuh_ has joined #archiveteam |
22:52
🔗
|
|
superkuh has quit IRC (Read error: Operation timed out) |
22:52
🔗
|
|
kyan has quit IRC (Ping timeout: 258 seconds) |
22:52
🔗
|
|
RKenshin has joined #archiveteam |
22:54
🔗
|
|
db48x has quit IRC (hub.efnet.us irc.Prison.NET) |
22:54
🔗
|
|
sunnymilk has quit IRC (hub.efnet.us irc.Prison.NET) |
22:54
🔗
|
|
Kenshin has quit IRC (hub.efnet.us irc.Prison.NET) |
23:10
🔗
|
|
RKenshin is now known as Kenshin |
23:18
🔗
|
|
wyatt8750 has joined #archiveteam |
23:19
🔗
|
|
wyatt8750 is now known as wyatt8740 |
23:21
🔗
|
|
sunnymilk has joined #archiveteam |
23:26
🔗
|
|
kyan has joined #archiveteam |
23:43
🔗
|
|
bzc6p_ has joined #archiveteam |
23:43
🔗
|
|
swebb sets mode: +o bzc6p_ |
23:49
🔗
|
|
bzc6p has quit IRC (Ping timeout: 600 seconds) |
23:58
🔗
|
|
mistym has joined #archiveteam |