#archiveteam-ot 2018-12-12,Wed


Time Nickname Message
00:38 🔗 Stilett0 has quit IRC ()
00:45 🔗 Stiletto has joined #archiveteam-ot
00:58 🔗 adinbied has quit IRC (Read error: Operation timed out)
00:58 🔗 adinbied has joined #archiveteam-ot
01:14 🔗 terorie has joined #archiveteam-ot
01:20 🔗 terorie has quit IRC (Remote host closed the connection)
01:21 🔗 terorie has joined #archiveteam-ot
01:25 🔗 wp494 has quit IRC (Ping timeout: 255 seconds)
01:25 🔗 wp494 has joined #archiveteam-ot
01:26 🔗 svchfoo3 sets mode: +o wp494
01:51 🔗 hook54321 Just got into the IIPC Slack
02:18 🔗 terorie_ has joined #archiveteam-ot
02:19 🔗 terorie has quit IRC (Read error: Operation timed out)
02:44 🔗 Flashfire is there anyone that will run tubeup on random videos I find on my travels to the real obscure parts of the web
02:44 🔗 Flashfire Ivan has youtube covered but sometimes i find videos from other sites
02:44 🔗 Flashfire not all of them are up still
03:20 🔗 terorie_ has quit IRC (Remote host closed the connection)
03:24 🔗 JAA Who of you guys is e30e? :-)
03:24 🔗 JAA ArchiveBot got a mention in https://linuxwit.ch/blog/2018/12/everything-that-lives-is-designed-to-end/
03:24 🔗 nightpool JAA: i am
03:24 🔗 nightpool well, my friend
03:24 🔗 nightpool but i'm a maintainer
03:26 🔗 JAA Ah :-)
03:29 🔗 Aoede has quit IRC (Ping timeout: 186 seconds)
03:30 🔗 VoynichCr has quit IRC (Read error: Operation timed out)
03:37 🔗 Aoede has joined #archiveteam-ot
03:38 🔗 svchfoo3 sets mode: +o Aoede
03:40 🔗 VoynichCr has joined #archiveteam-ot
03:50 🔗 terorie has joined #archiveteam-ot
03:54 🔗 terorie has quit IRC (Read error: Operation timed out)
04:15 🔗 odemg has quit IRC (Ping timeout: 265 seconds)
04:22 🔗 terorie has joined #archiveteam-ot
04:27 🔗 odemg has joined #archiveteam-ot
05:02 🔗 DarkWorld has quit IRC (Read error: Connection reset by peer)
05:03 🔗 DarkWorld has joined #archiveteam-ot
06:00 🔗 Flashfire Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. Why did this happen?
06:07 🔗 DarkWorld has quit IRC (Read error: Operation timed out)
06:40 🔗 terorie has quit IRC (Remote host closed the connection)
06:43 🔗 schbirid has joined #archiveteam-ot
06:48 🔗 terorie has joined #archiveteam-ot
06:58 🔗 Flashfire So does anyone want to tubeup some random items?
06:58 🔗 Flashfire https://watch-learn.com/video-tutorials/basic-frontend-tools-ssh-scp-sftp-and-git-ftp
06:58 🔗 Flashfire thats an example of something I might find
07:00 🔗 anarcat has quit IRC (Ping timeout: 265 seconds)
07:09 🔗 terorie has quit IRC (Remote host closed the connection)
07:10 🔗 terorie has joined #archiveteam-ot
07:19 🔗 Jusque has quit IRC (Quit: ZNC - http://znc.in)
07:20 🔗 Jusque has joined #archiveteam-ot
08:04 🔗 DarkWorld has joined #archiveteam-ot
08:07 🔗 terorie has quit IRC (Remote host closed the connection)
08:07 🔗 terorie has joined #archiveteam-ot
08:10 🔗 terorie has quit IRC (Read error: Operation timed out)
08:12 🔗 terorie has joined #archiveteam-ot
08:19 🔗 terorie has quit IRC (Remote host closed the connection)
08:19 🔗 terorie has joined #archiveteam-ot
08:24 🔗 terorie has quit IRC (Ping timeout: 268 seconds)
08:39 🔗 terorie has joined #archiveteam-ot
09:11 🔗 VerifiedJ has joined #archiveteam-ot
09:13 🔗 Verified_ has quit IRC (Ping timeout: 252 seconds)
09:24 🔗 DarkWorld has quit IRC (Read error: Connection reset by peer)
09:24 🔗 DarkWorld has joined #archiveteam-ot
09:48 🔗 BlueMax has quit IRC (Quit: Leaving)
10:21 🔗 adinbied has quit IRC (Read error: Connection reset by peer)
10:22 🔗 adinbied has joined #archiveteam-ot
10:28 🔗 wp494 has quit IRC (Ping timeout: 506 seconds)
10:28 🔗 wp494 has joined #archiveteam-ot
10:29 🔗 svchfoo3 sets mode: +o wp494
12:10 🔗 caff has quit IRC (Read error: Connection reset by peer)
12:52 🔗 anarcat has joined #archiveteam-ot
12:57 🔗 DarkWorld has quit IRC (Ping timeout: 600 seconds)
12:57 🔗 DarkWorld has joined #archiveteam-ot
13:37 🔗 terorie has quit IRC (Remote host closed the connection)
13:37 🔗 terorie has joined #archiveteam-ot
13:41 🔗 terorie has quit IRC (Read error: Operation timed out)
13:50 🔗 terorie has joined #archiveteam-ot
14:20 🔗 terorie has quit IRC (Remote host closed the connection)
14:20 🔗 terorie has joined #archiveteam-ot
14:21 🔗 DarkWorld has quit IRC (Ping timeout: 633 seconds)
14:38 🔗 terorie has quit IRC (Remote host closed the connection)
14:41 🔗 terorie has joined #archiveteam-ot
15:18 🔗 hook54321 IIPC is ... interesting.
15:19 🔗 hook54321 "Gopher is not mentioned by the standard and I'm not aware of any existing guidance or tools for archiving gopher as WARC files. If someone does come up with a concrete proposal it'd be great to file it as an issue against https://github.com/iipc/warc-specifications for consideration for inclusion in future revisions of the standard. Even if it's not mature enough or there's not enough support to get it into the ISO standard
15:19 🔗 hook54321 proper it'd still be good to document an approach on the warc specifications website/github so others interested in archiving gopher can follow suit."
15:20 🔗 JAA Interesting. I always thought their policy was "implementation first, please".
15:24 🔗 hook54321 Out of curiosity, what would happen if someone tried to just use a WARC writing proxy?
15:26 🔗 hook54321 I'm guessing probably nothing?
15:26 🔗 JAA Well, the proxy would have to support Gopher for that to work, in which case there'd be an implementation of a WARC-writing Gopher tool.
15:27 🔗 JAA Since WARCs don't store raw network dumps but slightly interpreted data (request/response pairs), there is no way to implement a generic WARC-writing proxy.
15:28 🔗 JAA You need to add support for each protocol individually.
15:28 🔗 JAA Which was one of the reasons why PurpleSym went with that high-level abstraction of the network data in crocoite/chromebot.
15:29 🔗 JAA The downside is that the WARCs don't contain the raw request/response data as sent over the network. The advantage is that it also supports HTTP/2, WebSocket, etc.
15:30 🔗 hook54321 ah
15:37 🔗 hook54321 Apparently ARC has support for GOPHER. https://usercontent.irccloud-cdn.com/file/PjVhJaaO/image.png
15:39 🔗 hook54321 I'm assuming that implies there was at some point something that crawled gopher sites and recorded them in an ARC file.
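[Editor's note] JAA's point above, that WARCs store interpreted request/response pairs and so each protocol needs its own support, can be illustrated with a rough sketch. The code below is not an existing tool and not a standardized approach: the function names, the use of `application/octet-stream` as the record content type, and the idea of storing the Gopher selector line as the "request" block are all assumptions made for illustration. The WARC standard does not define how Gopher should be recorded.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(record_type: str, target_uri: str, block: bytes,
                      content_type: str = "application/octet-stream") -> bytes:
    """Serialize one WARC/1.0 record: version line, headers, blank line, block.

    The content type is a placeholder (assumption); no Gopher-specific
    media type is defined for WARC records.
    """
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(block))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers)
    # A record ends with two CRLFs after the block.
    return head.encode("utf-8") + b"\r\n" + block + b"\r\n\r\n"


def gopher_exchange_to_records(target_uri: str, selector: str,
                               response: bytes) -> list[bytes]:
    """Turn one Gopher exchange into a request record and a response record.

    This is the protocol-specific part: the writer has to know that a
    Gopher request is just the selector followed by CRLF, and that the
    rest of the connection is the response. A generic proxy that only
    sees bytes cannot make that split.
    """
    request_block = selector.encode("utf-8") + b"\r\n"
    return [
        build_warc_record("request", target_uri, request_block),
        build_warc_record("response", target_uri, response),
    ]


if __name__ == "__main__":
    # Hypothetical menu response; no network access is performed.
    menu = b"iHello\tfake\t(NULL)\t0\r\n.\r\n"
    for record in gopher_exchange_to_records(
            "gopher://example.org:70/1/docs", "/docs", menu):
        print(record.decode("utf-8"))
```

Supporting another protocol would mean writing another function like `gopher_exchange_to_records` that knows where that protocol's request ends and its response begins, which is the per-protocol work JAA describes.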
15:51 🔗 terorie has quit IRC (Remote host closed the connection)
16:19 🔗 terorie has joined #archiveteam-ot
16:34 🔗 terorie has quit IRC (Remote host closed the connection)
16:34 🔗 Verified_ has joined #archiveteam-ot
16:36 🔗 VerifiedJ has quit IRC (Ping timeout: 252 seconds)
16:40 🔗 terorie has joined #archiveteam-ot
16:46 🔗 terorie has quit IRC (Remote host closed the connection)
17:48 🔗 uhhh has joined #archiveteam-ot
17:51 🔗 benjinsmi has quit IRC (Leaving)
17:52 🔗 benjins has joined #archiveteam-ot
18:11 🔗 uhhh has quit IRC (Ping timeout: 252 seconds)
18:28 🔗 jesso_ has joined #archiveteam-ot
18:36 🔗 nbneer has joined #archiveteam-ot
18:40 🔗 nbneer has quit IRC (Ping timeout: 265 seconds)
18:47 🔗 terorie has joined #archiveteam-ot
18:50 🔗 uhhh has joined #archiveteam-ot
18:50 🔗 uhhh has left
18:51 🔗 terorie has quit IRC (Read error: Operation timed out)
19:09 🔗 jut has quit IRC (Quit: Ram upgrade has arrived. Woot!!)
19:21 🔗 caff has joined #archiveteam-ot
19:25 🔗 wp494 has quit IRC (Ping timeout: 255 seconds)
19:26 🔗 wp494 has joined #archiveteam-ot
19:26 🔗 svchfoo1 sets mode: +o wp494
19:32 🔗 terorie has joined #archiveteam-ot
20:07 🔗 jut has joined #archiveteam-ot
20:14 🔗 keith20 has joined #archiveteam-ot
20:16 🔗 keith20 has quit IRC (Remote host closed the connection)
20:16 🔗 keith20 has joined #archiveteam-ot
20:19 🔗 keith20 has quit IRC (Client Quit)
20:33 🔗 terorie has quit IRC (Remote host closed the connection)
20:56 🔗 Mateon1 has quit IRC (Ping timeout: 265 seconds)
20:56 🔗 Mateon1 has joined #archiveteam-ot
21:26 🔗 BlueMax has joined #archiveteam-ot
21:27 🔗 VerifiedJ has joined #archiveteam-ot
21:29 🔗 Verified_ has quit IRC (Ping timeout: 252 seconds)
22:33 🔗 terorie has joined #archiveteam-ot
22:37 🔗 terorie has quit IRC (Read error: Operation timed out)
22:39 🔗 t2t2 has quit IRC (Read error: Operation timed out)
22:39 🔗 t2t2 has joined #archiveteam-ot
22:45 🔗 DarkWorld has joined #archiveteam-ot
22:58 🔗 tuluu has quit IRC (Remote host closed the connection)
23:02 🔗 tuluu has joined #archiveteam-ot
23:16 🔗 DarkWorld has quit IRC (Ping timeout: 600 seconds)
23:17 🔗 DarkWorld has joined #archiveteam-ot
23:24 🔗 Fusl has joined #archiveteam-ot
23:24 🔗 nyphryx has joined #archiveteam-ot
23:25 🔗 diggan has joined #archiveteam-ot
23:25 🔗 teej_ So...
23:25 🔗 nyphryx So.
23:26 🔗 teej_ Do you think BOINC will be useful?
23:26 🔗 nyphryx BOINC? Protein folding?
23:26 🔗 nyphryx Let me Google..
23:26 🔗 Flashfire ...........
23:26 🔗 Flashfire BOINC as in distributed computing
23:26 🔗 nyphryx Yes.
23:26 🔗 teej_ It's a software that allows easy distribution of computing.
23:27 🔗 teej_ It's also simple to install and run.
23:27 🔗 Flashfire It also crashes my computer faster than the warrior
23:27 🔗 nyphryx The problem is storage. Concerning the Tumblr issue I did not scrollback, old CISO forgot IRC-tech related commands,
23:28 🔗 nyphryx In any case, the storage is a problem in a sense that do you really need to have porn unencrypted on your laptop?
23:28 🔗 Flashfire You encrypt your porn?
23:28 🔗 Flashfire What kind of weird fetishes do you have that you insist on encrypting it?
23:28 🔗 teej_ Storage? Don't worry about that. That's not in our control. We're uploading it to the Archive Team servers and to the Internet Archive.
23:29 🔗 teej_ We can handle the encryption in the script.
23:30 🔗 teej_ We can have different types of jobs too. Some that need encryption, and others that don't.
23:30 🔗 nyphryx Tumblr NSFW storage. I'm not talking about Warrior WARC tech...and guys, take it easy on new comers to the team, some bring value.
23:30 🔗 Flashfire The storage is handled by IA
23:30 🔗 Flashfire its out of our hands
23:30 🔗 teej_ That's what I said.
23:30 🔗 Flashfire We download it and upload it to FOS aka the Fortress of Solitude
23:31 🔗 nyphryx Alright.
23:31 🔗 Flashfire A staging server owned by our illustrious leader
23:31 🔗 Flashfire He then pushes the data to the archive
23:31 🔗 Flashfire The data doesnt stay on your computer for long
23:32 🔗 teej_ So we need a better distribution and tracking method. And then we need optimized code.
23:32 🔗 Flashfire Thing is we are all volunteers
23:32 🔗 Flashfire nobody gets paid for this
23:32 🔗 teej_ I really think BOINC will help.
23:32 🔗 Flashfire I disagree
23:33 🔗 nyphryx What I'm pointing out is that the data on your computer might be data that belongs to somebody else, and the PartyVan might show up while the data is for 2 seconds on your computer because the judge and prosecutor? Don't care.
23:33 🔗 teej_ Flashfire: Why do you disagree?
23:33 🔗 Flashfire if you are worried then dont run the project
23:33 🔗 Flashfire Its the risk we all take
23:33 🔗 Flashfire BOINC has completely different ideologies
23:34 🔗 nyphryx I think I'll git branch the project at some point since geocities.yahoo.com is gone, now geocities.jp is gone, and now XX% of tumblr.com will be gone.
23:34 🔗 teej_ nyphryx: Don't worry about the encryption part yet. That can easily be taken care of. The biggest issue is optimization and deployment of the code.
23:34 🔗 nyphryx BOINC sure, but how to crawl a topic that's being deleted from a website with blogs
23:35 🔗 Flashfire Too much stuff that needs to be done tracker side for it to work for BOINC
23:35 🔗 nyphryx As in, "NSFW blogs on Tumblr in a universe where Tumblr does not put the NSFW tag On"
23:35 🔗 kpcyrd has joined #archiveteam-ot
23:35 🔗 Flashfire Plus Boinc has massive problems with deduplication
23:35 🔗 Flashfire The same job goes out to 5-10 different people
23:36 🔗 teej_ Flashfire: You can change that.
23:36 🔗 Flashfire Then we have the issue of claim releasing
23:36 🔗 teej_ I don't think you know enough about BOINC.
23:36 🔗 Flashfire I dont think I know enough to be commenting on this discussion at all but I still think its a shit idea
23:37 🔗 nyphryx (personally do not know much about BOINC but i will take a look)
23:37 🔗 teej_ BOINC isn't actually doing anything except making it easier to distribute jobs.
23:37 🔗 nyphryx Relative of Hadoop*?
23:37 🔗 JAA What problem does BOINC solve that we currently have?
23:38 🔗 JAA I.e. what would we gain by switching?
23:38 🔗 teej_ Deployment to a larger audience.
23:38 🔗 JAA I have a feeling that it'd be a lot of work for probably very little gain.
23:41 🔗 nyphryx teej_, are you around the channel?
23:41 🔗 teej_ If we add the project to a BOINC tracking site for people to join, it can allow us to get many more individuals involved. That can increase the net rate at which jobs are finished.
23:41 🔗 teej_ nyphryx: What do you mean?
23:42 🔗 teej_ JAA: ^
23:45 🔗 teej_ Many universities use BOINC as well. And universities generally have good internet connections.
23:46 🔗 JAA Doesn't mean they're interested in downloading NSFW gifs from Tumblr though.
23:47 🔗 JAA As far as I know, BOINC is all about distributed computing, not distributed downloads.
23:48 🔗 teej_ Good point.
23:59 🔗 teej_ Okay. Another thing we need to look into is a more optimized way of downloading and creating warcs.
