#archiveteam 2016-05-19,Thu


Time Nickname Message
00:01 πŸ”— Start http://www.wnd.com/2016/05/california-wants-copyrights-on-everything/
00:07 πŸ”— hive-mind has quit IRC (Ping timeout: 260 seconds)
00:09 πŸ”— hive-mind has joined #archiveteam
00:20 πŸ”— BlueMaxim has joined #archiveteam
00:24 πŸ”— JW_work or for those who don't find wnd an appealing source, here's the press release they cribbed from: https://www.eff.org/deeplinks/2016/04/ab-2880
00:51 πŸ”— Fusl asking my question from yesterday <Fusl> is there an irc channel for yuku archive?
01:12 πŸ”— MrRadar Fusl: I'm not sure. I've seen discussion of it in this one. Ping: arkiver
01:12 πŸ”— phuzion has quit IRC (Remote host closed the connection)
01:16 πŸ”— fie has joined #archiveteam
01:23 πŸ”— JesseW has joined #archiveteam
01:27 πŸ”— philpem has quit IRC (Ping timeout: 260 seconds)
01:43 πŸ”— Fusl arkiver: copy-pasting what i dumped in here yesterday regarding yuku... https://scr.meo.ws/paste/2016-05-19-03-42-48-jeda5tEL.txt
01:43 πŸ”— wyatt8740 has quit IRC (Read error: Operation timed out)
01:51 πŸ”— hook54321 has quit IRC (Quit: Connection closed for inactivity)
02:28 πŸ”— phuzion has joined #archiveteam
02:37 πŸ”— MMovie1 has joined #archiveteam
02:38 πŸ”— MMovie has quit IRC (Read error: Operation timed out)
03:01 πŸ”— wyatt8740 has joined #archiveteam
03:33 πŸ”— hook54321 has joined #archiveteam
03:36 πŸ”— acridAxid has quit IRC (marauder)
03:37 πŸ”— acridAxid has joined #archiveteam
03:45 πŸ”— RichardG_ has joined #archiveteam
03:46 πŸ”— RichardG has quit IRC (Ping timeout: 258 seconds)
03:56 πŸ”— RichardG_ has quit IRC (Ping timeout: 260 seconds)
03:59 πŸ”— RichardG has joined #archiveteam
04:07 πŸ”— RichardG_ has joined #archiveteam
04:07 πŸ”— RichardG has quit IRC (Read error: Connection reset by peer)
04:08 πŸ”— RichardG_ is now known as RichardG
04:38 πŸ”— Sk1d has quit IRC (Ping timeout: 194 seconds)
04:46 πŸ”— Sk1d has joined #archiveteam
04:47 πŸ”— BartoCH has quit IRC (Ping timeout: 260 seconds)
04:54 πŸ”— BartoCH has joined #archiveteam
06:15 πŸ”— blahah has joined #archiveteam
06:35 πŸ”— JesseW has quit IRC (Ping timeout: 370 seconds)
06:38 πŸ”— vitzli has joined #archiveteam
06:41 πŸ”— tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
07:22 πŸ”— schbirid has joined #archiveteam
07:48 πŸ”— ariscop has quit IRC (Read error: Operation timed out)
08:00 πŸ”— BlueMaxim has quit IRC (Read error: Operation timed out)
08:02 πŸ”— BlueMaxim has joined #archiveteam
08:09 πŸ”— metalcamp has joined #archiveteam
08:10 πŸ”— no2pencil has quit IRC (Read error: Operation timed out)
08:11 πŸ”— no2pencil has joined #archiveteam
08:41 πŸ”— WinterFox has joined #archiveteam
08:53 πŸ”— ariscop has joined #archiveteam
09:20 πŸ”— atomotic has joined #archiveteam
09:32 πŸ”— BlueMaxim has quit IRC (Quit: Leaving)
10:01 πŸ”— hook54321 has quit IRC (Quit: Connection closed for inactivity)
10:32 πŸ”— atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
10:52 πŸ”— SilSte has quit IRC (Remote host closed the connection)
11:40 πŸ”— ndiddy has quit IRC (Read error: Operation timed out)
11:57 πŸ”— SilSte has joined #archiveteam
12:07 πŸ”— Morbus has joined #archiveteam
12:28 πŸ”— WinterFox has quit IRC (Remote host closed the connection)
13:01 πŸ”— phuzion has quit IRC (Quit: Bye)
13:02 πŸ”— phuzion has joined #archiveteam
13:11 πŸ”— phuzion has quit IRC (Quit: Bye)
13:13 πŸ”— phuzion has joined #archiveteam
13:29 πŸ”— blahah anyone here interested in archiving the BBC recipes site?
13:29 πŸ”— blahah someone has made a clone, and the code is open
13:29 πŸ”— blahah but I feel like it would be safer in the archive, and distributed https://github.com/user24/auntiesrecipes
13:56 πŸ”— MrRadar We already ran it through Archivebot
13:57 πŸ”— blahah nice
14:40 πŸ”— khaoohs has quit IRC (Read error: Connection reset by peer)
15:47 πŸ”— tomwsmf-a has joined #archiveteam
16:11 πŸ”— JesseW has joined #archiveteam
16:19 πŸ”— JesseW has quit IRC (Ping timeout: 370 seconds)
16:22 πŸ”— atomotic has joined #archiveteam
16:28 πŸ”— vitzli has quit IRC (Quit: Leaving)
16:28 πŸ”— Nemo_bis SSRN archival https://github.com/paultopia/scholaw/issues/1#issuecomment-220328277
16:30 πŸ”— JesseW has joined #archiveteam
16:38 πŸ”— Honno has joined #archiveteam
16:38 πŸ”— JesseW has quit IRC (Ping timeout: 370 seconds)
16:51 πŸ”— Honno_ has quit IRC (Read error: Operation timed out)
17:01 πŸ”— schbirid https://github.com/user24/auntiesrecipes
17:06 πŸ”— atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
17:16 πŸ”— Froggypwn has quit IRC (Quit: ~ Trillian Astra - www.trillian.im ~)
17:21 πŸ”— philpem has joined #archiveteam
17:22 πŸ”— Froggypwn has joined #archiveteam
17:41 πŸ”— tomwsmf-a has quit IRC (Read error: Operation timed out)
17:44 πŸ”— ranma do you guys rescan things every so often? (E.g. that github)
17:53 πŸ”— MrRadar The news grabber constantly monitors over 800 news sites for new articles but otherwise everything we do is pretty much one-off
18:05 πŸ”— hook54321 has joined #archiveteam
18:08 πŸ”— blahah is anyone interested in archiving academic papers?
18:08 πŸ”— blahah I know about the PDF sweep which will get stuff that is freely available
18:08 πŸ”— blahah what about stuff that is not?
18:10 πŸ”— PurpleSym Like scihub?
18:11 πŸ”— blahah yes, like that
18:12 πŸ”— blahah or generally any non-public papers
18:13 πŸ”— PurpleSym Where would you put them? IA would have to take them down pretty quickly.
18:14 πŸ”— blahah not sure - I guess a distributed archive would be best
18:15 πŸ”— blahah the problem at the moment is that the torrent speeds are terrible because of the routing from russia to EU / US
18:17 πŸ”— blahah so currently archiveteam puts everything on the internet archive?
18:17 πŸ”— PurpleSym Yes.
18:17 πŸ”— blahah I'm interested in where the line is for copyright
18:17 πŸ”— blahah basically if someone is upset, they can request a takedown?
18:18 πŸ”— PurpleSym That’s how it works right now.
18:18 πŸ”— blahah ok
18:22 πŸ”— godane has quit IRC (Quit: Leaving.)
18:25 πŸ”— PurpleSym Also, your account might be taken down entirely if too many complaints arrive.
18:30 πŸ”— blahah I see
18:31 πŸ”— JW_work ArchiveTeam has a few things stored off of the Internet Archive (e.g. Gitorious, the IA.BAK stuff, seeding of the URLteam results) and we are OK with more.
18:31 πŸ”— JW_work But IA provides a very nice host for a lot of the stuff we grab.
18:31 πŸ”— blahah are there any other giant hosts?
18:32 πŸ”— JW_work what do you mean by "other giant hosts"?
18:32 πŸ”— Frogging other hosts that are high capacity like IA, I assume
18:33 πŸ”— JW_work There aren't any with similar political purposes, AFAIK. Others with similar capacity include Google, Microsoft, Amazon and the NSA. :-)
18:34 πŸ”— JW_work others with similar aims include various national libraries
18:35 πŸ”— Frogging google, microsoft, and amazon don't host shit for us :p
18:35 πŸ”— JW_work I bet they do, we just may not know it.
18:35 πŸ”— JW_work I'd be really surprised if google doesn't have an impressive fraction of IA's collections quietly sitting on their servers somewhere.
18:35 πŸ”— JW_work It's not like they don't have the space.
18:45 πŸ”— blahah there are places like CERN that have vast capacity too - they have zenodo for scientific data
18:46 πŸ”— blahah yeah I did mean places with high capacity, and I was thinking specifically of places that welcome deposits
18:46 πŸ”— JW_work good point
18:53 πŸ”— blahah ok so the hypothetical scenario I put to you all is this...
18:53 πŸ”— blahah scihub has copied about 50million papers that were previously locked behind a paywall
18:53 πŸ”— blahah it's in the region of 50TB of data
18:54 πŸ”— blahah if scihub were to be raided or otherwise dismantled in the future, what strategies could they hypothetically use to prevent the loss of all the data
18:54 πŸ”— blahah ideas so far include to hide all the pdfs inside images using steganography, and archive them on flickr and other photo stores
18:55 πŸ”— blahah or to disguise them as scientific datasets and archive them on scientific data archives
18:55 πŸ”— PurpleSym IPFS?
18:55 πŸ”— blahah to spread them out in tiny archives to lots of free http static hosts around the world
18:55 πŸ”— JW_work both of those seem sensible to me
18:55 πŸ”— blahah IPFS is also on the table, but it requires people willing to join the swarm
18:56 πŸ”— PurpleSym That’s always the problem with distributed solutions.
19:00 πŸ”— blahah any other crazy ideas?
19:00 πŸ”— blahah or not crazy
19:01 πŸ”— Frogging this seems relevant (though perhaps not useful) http://www.archiveteam.org/index.php?title=Valhalla
19:02 πŸ”— Frogging it's the same question; "where can we put big things, other than the Archive"
19:02 πŸ”— PurpleSym Universities tend to have lots of storage as well. Might be worth asking them to – silently – host the data.
19:06 πŸ”— JW_work 50TB, at US$100 / TB is $5,000.
19:07 πŸ”— JW_work which isn't cheap, but isn't completely unreasonable either
19:09 πŸ”— JW_work what does Amazon Glacier charge?
19:11 πŸ”— JW_work The basic issue is maintaining the doublethink of "there's this data - I don't know what it is, I can't access it, I certainly don't have any reason to think it is illegal - but if someone happens to want it, sometime in the future, I will keep it for them"
19:12 πŸ”— xmc glacier for 50T is USD$350/month => USD$4,200/yr
19:13 πŸ”— midas glacier is expensive for downloading data
19:14 πŸ”— PurpleSym And 4300$ for retrieval bandwidth.
19:17 πŸ”— MrRadar Backblaze B2 is even cheaper than Amazon Glacier at $0.005/GB/month
19:21 πŸ”— midas 1PB would cost just 60k/year, if we just stuff it full :p
19:21 πŸ”— PurpleSym Dedicated box at OVH: 0.008€/GB/month. (12x4TB/Softraid)
19:22 πŸ”— Frogging is that supposed to be euros?
19:22 πŸ”— Frogging or did you mean dollars
19:23 πŸ”— PurpleSym Yes, Euro.
19:23 πŸ”— Frogging that'd be $5376.72 USD per year for 50T
19:24 πŸ”— Frogging (for comparison's sake)
19:24 πŸ”— PurpleSym And dedicated box at Hetzner: 0.003€/GB/month. (15x6TB)
19:25 πŸ”— PurpleSym (includes 100TB bandwidth)
19:26 πŸ”— PurpleSym ~$2000 USD/year for 50T.
19:26 πŸ”— Frogging that's not terrible
19:27 πŸ”— PurpleSym Note that you can’t get “just” 50T though. It’s all or nothing.
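[Editor's note] The yearly figures quoted in the pricing exchange above can be reproduced with a little arithmetic. The sketch below is only a rough check, not anyone's official pricing: the function and table names are made up for illustration, terabytes are assumed to be decimal (1 TB = 1000 GB), the Glacier per-GB price is derived from xmc's "$350/month for 50T", and the EUR to USD rate is the one implied by Frogging's "$5376.72" for the OVH quote.

```python
# Rough yearly storage cost for the per-GB monthly prices quoted in the channel.
GB_PER_TB = 1000        # assumption: decimal terabytes
USD_PER_EUR = 1.12015   # assumption: rate implied by "$5376.72" for EUR 4800/year


def yearly_cost(price_per_gb_month: float, terabytes: float) -> float:
    """Cost of keeping `terabytes` stored for 12 months, in the price's own currency."""
    return price_per_gb_month * terabytes * GB_PER_TB * 12


# (price per GB per month, currency) as quoted in the log
QUOTES = {
    "Amazon Glacier": (0.007, "USD"),   # derived: $350/month divided by 50,000 GB
    "Backblaze B2": (0.005, "USD"),
    "OVH dedicated": (0.008, "EUR"),
    "Hetzner dedicated": (0.003, "EUR"),
}


def yearly_cost_usd(provider: str, terabytes: float) -> float:
    """Yearly cost in USD, converting EUR quotes with the assumed rate."""
    price, currency = QUOTES[provider]
    cost = yearly_cost(price, terabytes)
    return cost * USD_PER_EUR if currency == "EUR" else cost


if __name__ == "__main__":
    for name in QUOTES:
        print(f"{name}: ~${yearly_cost_usd(name, 50):,.0f}/year for 50 TB")
    print(f"Backblaze B2: ~${yearly_cost_usd('Backblaze B2', 1000):,.0f}/year for 1 PB")
```

These are storage-only, pro-rated numbers. They leave out retrieval and bandwidth charges, and for the dedicated boxes they understate the real bill, since (as PurpleSym notes) the whole box has to be rented rather than just 50 TB of it.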
19:28 πŸ”— luckcolor how about you upload the encrypted archives on archive.org
19:28 πŸ”— luckcolor and then when the site closes you can release the decryption key
19:28 πŸ”— HCross #archiveteam-bs
19:29 πŸ”— luckcolor agree
19:29 πŸ”— HCross WOOP WOOP Off topic
19:29 πŸ”— Frogging ok
19:31 πŸ”— luckcolor The offtopic alarm has been triggered
19:31 πŸ”— luckcolor :P
19:33 πŸ”— Frogging blahah: join #archiveteam-bs
19:46 πŸ”— blahah sorry was putting kid to bed. I was trying to calculate S3 costs earlier - seemed silly
19:46 πŸ”— blahah JW_work: the doublethink is spot one
19:46 πŸ”— blahah *on
19:46 πŸ”— luckcolor no s3 it's probably not worth it
19:46 πŸ”— blahah there are two basic scenarios: someone knowingly hosts the data, or someone hosts it while being ignorant of the contents
19:47 πŸ”— * Frogging points at #archiveteam-bs
19:47 πŸ”— blahah luckcolor: yeah I realised that eventually
19:47 πŸ”— blahah luckcolor: I was thinking along similar lines for encrypted stuff
19:47 πŸ”— JW_work blahah: please join #archiveteam-bs and discuss it there, not here
19:47 πŸ”— blahah also works with any place that will archive data
19:47 πŸ”— blahah ok sorry
20:22 πŸ”— ariscop has quit IRC (Ping timeout: 506 seconds)
20:30 πŸ”— zgrant has joined #archiveteam
20:31 πŸ”— zgrant has quit IRC (Client Quit)
20:34 πŸ”— brayden_ has quit IRC (Read error: Operation timed out)
20:52 πŸ”— ariscop has joined #archiveteam
20:56 πŸ”— godane has joined #archiveteam
21:16 πŸ”— khaoohs has joined #archiveteam
21:38 πŸ”— Madthias has joined #archiveteam
21:40 πŸ”— schbirid has quit IRC (Quit: Leaving)
21:47 πŸ”— metalcamp has quit IRC (Ping timeout: 244 seconds)
22:03 πŸ”— zgrant has joined #archiveteam
22:09 πŸ”— incog has joined #archiveteam
22:09 πŸ”— incog anybody got a scrape of kuro5hin?
22:09 πŸ”— incog im coming up blank with the usual searches
22:10 πŸ”— tomwsmf-a has joined #archiveteam
22:12 πŸ”— incog no wayback no cache
22:13 πŸ”— incog im looking for the ogg frog zines and a specific article on xanga being a ghetto botnet due to an exploited vuln
22:13 πŸ”— incog these were the only places as far as i know they were
22:16 πŸ”— incog used to be at http://www.kuro5hin.org/story/2004/12/28/161214/43
22:16 πŸ”— Honno has quit IRC (Read error: Operation timed out)
22:16 πŸ”— zgrant has quit IRC (Quit: http://chat.efnet.org (EOF))
22:31 πŸ”— hook54321 has quit IRC (Quit: Connection closed for inactivity)
22:37 πŸ”— Stiletto has quit IRC (Ping timeout: 244 seconds)
22:40 πŸ”— incog http://k5.semantic-db.org/diary-slurp/161942--archive-diaries--html-diaries--nested-format.zip
22:40 πŸ”— incog found smth
22:41 πŸ”— JW_work yeah, I remembered there was something, but couldn't remember the details
22:47 πŸ”— atrocity has quit IRC (Ping timeout: 246 seconds)
22:56 πŸ”— incog http://archive.is/mtpf oh here it is
23:06 πŸ”— incog still no ogg frog, oh well
23:08 πŸ”— JW_work has quit IRC (Read error: Operation timed out)
23:11 πŸ”— incog http://atdt.freeshell.org/k5/
23:18 πŸ”— JW_work has joined #archiveteam
23:23 πŸ”— atrocity has joined #archiveteam
23:36 πŸ”— BlueMaxim has joined #archiveteam
23:58 πŸ”— Stiletto has joined #archiveteam
