Time |
Nickname |
Message |
00:38
🔗
|
|
Stilett0 has quit IRC () |
00:45
🔗
|
|
Stiletto has joined #archiveteam-ot |
00:58
🔗
|
|
adinbied has quit IRC (Read error: Operation timed out) |
00:58
🔗
|
|
adinbied has joined #archiveteam-ot |
01:14
🔗
|
|
terorie has joined #archiveteam-ot |
01:20
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
01:21
🔗
|
|
terorie has joined #archiveteam-ot |
01:25
🔗
|
|
wp494 has quit IRC (Ping timeout: 255 seconds) |
01:25
🔗
|
|
wp494 has joined #archiveteam-ot |
01:26
🔗
|
|
svchfoo3 sets mode: +o wp494 |
01:51
🔗
|
hook54321 |
Just got into the IIPC Slack |
02:18
🔗
|
|
terorie_ has joined #archiveteam-ot |
02:19
🔗
|
|
terorie has quit IRC (Read error: Operation timed out) |
02:44
🔗
|
Flashfire |
is there anyone that will run tubeup on random videos I find on my travels to the real obscure parts of the web |
02:44
🔗
|
Flashfire |
Ivan has youtube covered but sometimes i find videos from other sites |
02:44
🔗
|
Flashfire |
not all of them are up still |
03:20
🔗
|
|
terorie_ has quit IRC (Remote host closed the connection) |
03:24
🔗
|
JAA |
Who of you guys is e30e? :-) |
03:24
🔗
|
JAA |
ArchiveBot got a mention in https://linuxwit.ch/blog/2018/12/everything-that-lives-is-designed-to-end/ |
03:24
🔗
|
nightpool |
JAA: i am |
03:24
🔗
|
nightpool |
well, my friend |
03:24
🔗
|
nightpool |
but i'm a maintainer |
03:26
🔗
|
JAA |
Ah :-) |
03:29
🔗
|
|
Aoede has quit IRC (Ping timeout: 186 seconds) |
03:30
🔗
|
|
VoynichCr has quit IRC (Read error: Operation timed out) |
03:37
🔗
|
|
Aoede has joined #archiveteam-ot |
03:38
🔗
|
|
svchfoo3 sets mode: +o Aoede |
03:40
🔗
|
|
VoynichCr has joined #archiveteam-ot |
03:50
🔗
|
|
terorie has joined #archiveteam-ot |
03:54
🔗
|
|
terorie has quit IRC (Read error: Operation timed out) |
04:15
🔗
|
|
odemg has quit IRC (Ping timeout: 265 seconds) |
04:22
🔗
|
|
terorie has joined #archiveteam-ot |
04:27
🔗
|
|
odemg has joined #archiveteam-ot |
05:02
🔗
|
|
DarkWorld has quit IRC (Read error: Connection reset by peer) |
05:03
🔗
|
|
DarkWorld has joined #archiveteam-ot |
06:00
🔗
|
Flashfire |
Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. Why did this happen? |
06:07
🔗
|
|
DarkWorld has quit IRC (Read error: Operation timed out) |
06:40
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
06:43
🔗
|
|
schbirid has joined #archiveteam-ot |
06:48
🔗
|
|
terorie has joined #archiveteam-ot |
06:58
🔗
|
Flashfire |
So does anyone want to tubeup some random items? |
06:58
🔗
|
Flashfire |
https://watch-learn.com/video-tutorials/basic-frontend-tools-ssh-scp-sftp-and-git-ftp |
06:58
🔗
|
Flashfire |
thats an example of something I might find |
07:00
🔗
|
|
anarcat has quit IRC (Ping timeout: 265 seconds) |
07:09
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
07:10
🔗
|
|
terorie has joined #archiveteam-ot |
07:19
🔗
|
|
Jusque has quit IRC (Quit: ZNC - http://znc.in) |
07:20
🔗
|
|
Jusque has joined #archiveteam-ot |
08:04
🔗
|
|
DarkWorld has joined #archiveteam-ot |
08:07
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
08:07
🔗
|
|
terorie has joined #archiveteam-ot |
08:10
🔗
|
|
terorie has quit IRC (Read error: Operation timed out) |
08:12
🔗
|
|
terorie has joined #archiveteam-ot |
08:19
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
08:19
🔗
|
|
terorie has joined #archiveteam-ot |
08:24
🔗
|
|
terorie has quit IRC (Ping timeout: 268 seconds) |
08:39
🔗
|
|
terorie has joined #archiveteam-ot |
09:11
🔗
|
|
VerifiedJ has joined #archiveteam-ot |
09:13
🔗
|
|
Verified_ has quit IRC (Ping timeout: 252 seconds) |
09:24
🔗
|
|
DarkWorld has quit IRC (Read error: Connection reset by peer) |
09:24
🔗
|
|
DarkWorld has joined #archiveteam-ot |
09:48
🔗
|
|
BlueMax has quit IRC (Quit: Leaving) |
10:21
🔗
|
|
adinbied has quit IRC (Read error: Connection reset by peer) |
10:22
🔗
|
|
adinbied has joined #archiveteam-ot |
10:28
🔗
|
|
wp494 has quit IRC (Ping timeout: 506 seconds) |
10:28
🔗
|
|
wp494 has joined #archiveteam-ot |
10:29
🔗
|
|
svchfoo3 sets mode: +o wp494 |
12:10
🔗
|
|
caff has quit IRC (Read error: Connection reset by peer) |
12:52
🔗
|
|
anarcat has joined #archiveteam-ot |
12:57
🔗
|
|
DarkWorld has quit IRC (Ping timeout: 600 seconds) |
12:57
🔗
|
|
DarkWorld has joined #archiveteam-ot |
13:37
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
13:37
🔗
|
|
terorie has joined #archiveteam-ot |
13:41
🔗
|
|
terorie has quit IRC (Read error: Operation timed out) |
13:50
🔗
|
|
terorie has joined #archiveteam-ot |
14:20
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
14:20
🔗
|
|
terorie has joined #archiveteam-ot |
14:21
🔗
|
|
DarkWorld has quit IRC (Ping timeout: 633 seconds) |
14:38
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
14:41
🔗
|
|
terorie has joined #archiveteam-ot |
15:18
🔗
|
hook54321 |
IIPC is ... interesting. |
15:19
🔗
|
hook54321 |
"Gopher is not mentioned by the standard and I'm not aware of any existing guidance or tools for archiving gopher as WARC files. If someone does come up with a concrete proposal it'd be great to file it as an issue against https://github.com/iipc/warc-specifications for consideration for inclusion in future revisions of the standard. Even if it's not mature enough or there's not enough support to get it into the ISO standard |
15:19
🔗
|
hook54321 |
proper it'd still be good to document an approach on the warc specifications website/github so others interested in archiving gopher can follow suit." |
15:20
🔗
|
JAA |
Interesting. I always thought their policy was "implementation first, please". |
15:24
🔗
|
hook54321 |
Out of curiosity, what would happen if someone tried to just use a WARC writing proxy? |
15:26
🔗
|
hook54321 |
I'm guessing probably nothing? |
15:26
🔗
|
JAA |
Well, the proxy would have to support Gopher for that to work, in which case there'd be an implementation of a WARC-writing Gopher tool. |
15:27
🔗
|
JAA |
Since WARCs don't store raw network dumps but slightly interpreted data (request/response pairs), there is no way to implement a generic WARC-writing proxy. |
15:28
🔗
|
JAA |
You need to add support for each protocol individually. |
15:28
🔗
|
JAA |
Which was one of the reasons why PurpleSym went with that high-level abstraction of the network data in crocoite/chromebot. |
15:29
🔗
|
JAA |
The downside is that the WARCs don't contain the raw request/response data as sent over the network. The advantage is that it also supports HTTP/2, WebSocket, etc. |
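JAA's point that WARCs store request/response pairs, not raw network dumps, can be sketched for the Gopher case. This is only an illustration under loud assumptions: the WARC standard does not mention Gopher, so the use of `request`/`response` records with a `gopher://` target URI, the `application/octet-stream` content type, and the helper names `warc_record` and `gopher_exchange_to_warc` are all hypothetical. Only the general WARC/1.0 record framing (version line, named fields, blank line, block, two CRLFs) and the Gopher request form (selector followed by CRLF) are taken as given. No network access is performed; the caller supplies the response bytes.

```python
import uuid
from datetime import datetime, timezone


def warc_record(record_type: str, target_uri: str, payload: bytes) -> bytes:
    """Serialize one WARC/1.0 record: version line, headers, blank line, block."""
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        # Assumption: no Gopher-specific media type is defined for WARC blocks.
        ("Content-Type", "application/octet-stream"),
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # Every record ends with two CRLFs after the block.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


def gopher_exchange_to_warc(host: str, selector: str, response: bytes,
                            port: int = 70, item_type: str = "1") -> bytes:
    """Wrap one Gopher exchange as a request record plus a response record.

    Hypothetical layout: this pairing is not defined by any standard.
    """
    uri = f"gopher://{host}:{port}/{item_type}{selector}"
    # A Gopher request is just the selector string terminated by CRLF.
    request = selector.encode("utf-8") + b"\r\n"
    return (warc_record("request", uri, request)
            + warc_record("response", uri, response))
```

The protocol-specific part is `gopher_exchange_to_warc`: it has to know what counts as "the request" and "the response" for Gopher, which is why a generic WARC-writing proxy cannot exist and each protocol needs its own support.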
15:30
🔗
|
hook54321 |
ah |
15:37
🔗
|
hook54321 |
Apparently ARC has support for GOPHER. https://usercontent.irccloud-cdn.com/file/PjVhJaaO/image.png |
15:39
🔗
|
hook54321 |
I'm assuming that implies there was at some point something that crawled gopher sites and recorded them in an ARC file. |
15:51
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
16:19
🔗
|
|
terorie has joined #archiveteam-ot |
16:34
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
16:34
🔗
|
|
Verified_ has joined #archiveteam-ot |
16:36
🔗
|
|
VerifiedJ has quit IRC (Ping timeout: 252 seconds) |
16:40
🔗
|
|
terorie has joined #archiveteam-ot |
16:46
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
17:48
🔗
|
|
uhhh has joined #archiveteam-ot |
17:51
🔗
|
|
benjinsmi has quit IRC (Leaving) |
17:52
🔗
|
|
benjins has joined #archiveteam-ot |
18:11
🔗
|
|
uhhh has quit IRC (Ping timeout: 252 seconds) |
18:28
🔗
|
|
jesso_ has joined #archiveteam-ot |
18:36
🔗
|
|
nbneer has joined #archiveteam-ot |
18:40
🔗
|
|
nbneer has quit IRC (Ping timeout: 265 seconds) |
18:47
🔗
|
|
terorie has joined #archiveteam-ot |
18:50
🔗
|
|
uhhh has joined #archiveteam-ot |
18:50
🔗
|
|
uhhh has left |
18:51
🔗
|
|
terorie has quit IRC (Read error: Operation timed out) |
19:09
🔗
|
|
jut has quit IRC (Quit: Ram upgrade has arrived. Woot!!) |
19:21
🔗
|
|
caff has joined #archiveteam-ot |
19:25
🔗
|
|
wp494 has quit IRC (Ping timeout: 255 seconds) |
19:26
🔗
|
|
wp494 has joined #archiveteam-ot |
19:26
🔗
|
|
svchfoo1 sets mode: +o wp494 |
19:32
🔗
|
|
terorie has joined #archiveteam-ot |
20:07
🔗
|
|
jut has joined #archiveteam-ot |
20:14
🔗
|
|
keith20 has joined #archiveteam-ot |
20:16
🔗
|
|
keith20 has quit IRC (Remote host closed the connection) |
20:16
🔗
|
|
keith20 has joined #archiveteam-ot |
20:19
🔗
|
|
keith20 has quit IRC (Client Quit) |
20:33
🔗
|
|
terorie has quit IRC (Remote host closed the connection) |
20:56
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 265 seconds) |
20:56
🔗
|
|
Mateon1 has joined #archiveteam-ot |
21:26
🔗
|
|
BlueMax has joined #archiveteam-ot |
21:27
🔗
|
|
VerifiedJ has joined #archiveteam-ot |
21:29
🔗
|
|
Verified_ has quit IRC (Ping timeout: 252 seconds) |
22:33
🔗
|
|
terorie has joined #archiveteam-ot |
22:37
🔗
|
|
terorie has quit IRC (Read error: Operation timed out) |
22:39
🔗
|
|
t2t2 has quit IRC (Read error: Operation timed out) |
22:39
🔗
|
|
t2t2 has joined #archiveteam-ot |
22:45
🔗
|
|
DarkWorld has joined #archiveteam-ot |
22:58
🔗
|
|
tuluu has quit IRC (Remote host closed the connection) |
23:02
🔗
|
|
tuluu has joined #archiveteam-ot |
23:16
🔗
|
|
DarkWorld has quit IRC (Ping timeout: 600 seconds) |
23:17
🔗
|
|
DarkWorld has joined #archiveteam-ot |
23:24
🔗
|
|
Fusl has joined #archiveteam-ot |
23:24
🔗
|
|
nyphryx has joined #archiveteam-ot |
23:25
🔗
|
|
diggan has joined #archiveteam-ot |
23:25
🔗
|
teej_ |
So... |
23:25
🔗
|
nyphryx |
So. |
23:26
🔗
|
teej_ |
Do you think BOINC will be useful? |
23:26
🔗
|
nyphryx |
BOINC? Protein folding? |
23:26
🔗
|
nyphryx |
Let me Google.. |
23:26
🔗
|
Flashfire |
........... |
23:26
🔗
|
Flashfire |
BOINC as in distributed computing |
23:26
🔗
|
nyphryx |
Yes. |
23:26
🔗
|
teej_ |
It's software that allows easy distribution of computing. |
23:27
🔗
|
teej_ |
It's also simple to install and run. |
23:27
🔗
|
Flashfire |
It also crashes my computer faster than the warrior |
23:27
🔗
|
nyphryx |
The problem is storage. Concerning the Tumblr issue I did not scrollback, old CISO forgot IRC-tech related commands, |
23:28
🔗
|
nyphryx |
In any case, the storage is a problem in a sense that do you really need to have porn unencrypted on your laptop? |
23:28
🔗
|
Flashfire |
You encrypt your porn? |
23:28
🔗
|
Flashfire |
What kind of weird fetishes do you have that you insist on encrypting it? |
23:28
🔗
|
teej_ |
Storage? Don't worry about that. That's not in our control. We're uploading it to the Archive Team servers and to the Internet Archive. |
23:29
🔗
|
teej_ |
We can handle the encryption in the script. |
23:30
🔗
|
teej_ |
We can have different types of jobs too. Some that need encryption, and others that don't. |
23:30
🔗
|
nyphryx |
Tumblr NSFW storage. I'm not talking about Warrior WARC tech...and guys, take it easy on new comers to the team, some bring value. |
23:30
🔗
|
Flashfire |
The storage is handled by IA |
23:30
🔗
|
Flashfire |
its out of our hands |
23:30
🔗
|
teej_ |
That's what I said. |
23:30
🔗
|
Flashfire |
We download it and upload it to FOS aka the Fortress of Solitude |
23:31
🔗
|
nyphryx |
Alright. |
23:31
🔗
|
Flashfire |
A staging server owned by our illustrious leader |
23:31
🔗
|
Flashfire |
He then pushes the data to the archive |
23:31
🔗
|
Flashfire |
The data doesnt stay on your computer for long |
23:32
🔗
|
teej_ |
So we need a better distribution and tracking method. And then we need optimized code. |
23:32
🔗
|
Flashfire |
Thing is we are all volunteers |
23:32
🔗
|
Flashfire |
nobody gets paid for this |
23:32
🔗
|
teej_ |
I really think BOINC will help. |
23:32
🔗
|
Flashfire |
I disagree |
23:33
🔗
|
nyphryx |
What I'm pointing out is that the data on your computer might be data that belongs to somebody else, and the PartyVan might show up while the data is for 2 seconds on your computer because the judge and prosecutor? Don't care. |
23:33
🔗
|
teej_ |
Flashfire: Why do you disagree? |
23:33
🔗
|
Flashfire |
if you are worried then dont run the project |
23:33
🔗
|
Flashfire |
Its the risk we all take |
23:33
🔗
|
Flashfire |
BOINC has completely different ideologies |
23:34
🔗
|
nyphryx |
I think I'll git branch the project at some point since geocities.yahoo.com is gone, now geocities.jp is gone, and now XX% of tumblr.com will be gone. |
23:34
🔗
|
teej_ |
nyphryx: Don't worry about the encryption part yet. That can easily be taken care of. The biggest issue is optimization and deployment of the code. |
23:34
🔗
|
nyphryx |
BOINC sure, but how to crawl a topic that's being deleted from a website with blogs |
23:35
🔗
|
Flashfire |
Too much stuff that needs to be done tracker side for it to work for BOINC |
23:35
🔗
|
nyphryx |
As in, "NSFW blogs on Tumblr in a universe where Tumblr does not put the NSFW tag On" |
23:35
🔗
|
|
kpcyrd has joined #archiveteam-ot |
23:35
🔗
|
Flashfire |
Plus Boinc has massive problems with deduplication |
23:35
🔗
|
Flashfire |
The same job goes out to 5-10 different people |
23:36
🔗
|
teej_ |
Flashfire: You can change that. |
23:36
🔗
|
Flashfire |
Then we have the issue of claim releasing |
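The two tracker-side concerns Flashfire raises (the same job going out to several people, and releasing claims) can be sketched as a toy in-memory tracker. This is not Archive Team's actual tracker and not BOINC's scheduler; the class name `Tracker`, its methods, and the `claim_ttl` parameter are hypothetical, chosen only to illustrate handing each item to a single worker and returning it to the queue if the claim expires.

```python
import time


class Tracker:
    """Toy tracker: each item is claimed by one worker at a time (hypothetical)."""

    def __init__(self, items, claim_ttl=3600.0):
        self.todo = list(items)   # items not currently claimed
        self.claims = {}          # item -> (worker, deadline)
        self.done = set()         # items finished by their claiming worker
        self.claim_ttl = claim_ttl

    def release_expired(self, now):
        # Return items whose claim deadline has passed to the queue.
        for item, (_, deadline) in list(self.claims.items()):
            if deadline <= now:
                del self.claims[item]
                self.todo.append(item)

    def claim(self, worker, now=None):
        now = time.monotonic() if now is None else now
        self.release_expired(now)
        if not self.todo:
            return None
        item = self.todo.pop(0)
        self.claims[item] = (worker, now + self.claim_ttl)
        return item

    def finish(self, worker, item):
        # Only the current claim holder may mark an item done.
        owner = self.claims.get(item)
        if owner is None or owner[0] != worker:
            return False
        del self.claims[item]
        self.done.add(item)
        return True
```

Because an item leaves `todo` as soon as it is claimed, no two workers hold it at once; a worker that disappears simply loses the claim after `claim_ttl` seconds and a late `finish` from it is rejected.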
23:36
🔗
|
teej_ |
I don't think you know enough about BOINC. |
23:36
🔗
|
Flashfire |
I don't think I know enough to be commenting on this discussion at all but I still think it's a shit idea |
23:37
🔗
|
nyphryx |
(personally do not know much about BOINC but i will take a look) |
23:37
🔗
|
teej_ |
BOINC isn't actually doing anything except making it easier to distribute jobs. |
23:37
🔗
|
nyphryx |
Relative of Hadoop*? |
23:37
🔗
|
JAA |
What problem does BOINC solve that we currently have? |
23:38
🔗
|
JAA |
I.e. what would we gain by switching? |
23:38
🔗
|
teej_ |
Deployment to a larger audience. |
23:38
🔗
|
JAA |
I have a feeling that it'd be a lot of work for probably very little gain. |
23:41
🔗
|
nyphryx |
teej_, are you around the channel? |
23:41
🔗
|
teej_ |
If we add the project to a BOINC tracking site for people to join, it can allow us to get many more individuals involved. That can increase the net rate at which jobs are finished. |
23:41
🔗
|
teej_ |
nyphryx: What do you mean? |
23:42
🔗
|
teej_ |
JAA: ^ |
23:45
🔗
|
teej_ |
Many universities use BOINC as well. And universities generally have good internet connections. |
23:46
🔗
|
JAA |
Doesn't mean they're interested in downloading NSFW gifs from Tumblr though. |
23:47
🔗
|
JAA |
As far as I know, BOINC is all about distributed computing, not distributed downloads. |
23:48
🔗
|
teej_ |
Good point. |
23:59
🔗
|
teej_ |
Okay. Another thing we need to look into is a more optimized way of downloading and creating WARCs.