#archiveteam-bs 2017-08-28,Mon


Time Nickname Message
00:01 🔗 hook54321 JAA: They're live again. https://dailystormer.al/
00:07 🔗 SketchCow Ha ha, white supremacists living under the pleasure of albania
00:10 🔗 jrwr I wonder if bitmitigate.com knows they are hosting them
00:11 🔗 hook54321 Based on their twitter feed, I'm assuming so. https://twitter.com/bitmitigate
00:30 🔗 BlueMaxim has joined #archiveteam-bs
00:35 🔗 namibj_ jrwr: I am willing to participate too.
00:37 🔗 namibj_ Also, I am currently trying to get the stuff/knowledge to start some material/corrosion testing for long-term no-maintenance tape storage. Just the mechanical side for now though. I expect about 15$/TB to store for 1-2 decades, in lots of about 50TB+
00:38 🔗 namibj_ Trying to get water vapor diffusion numbers for metal cans, which is not that easy.
01:28 🔗 drumstick has joined #archiveteam-bs
01:33 🔗 drumstick has quit IRC (Ping timeout: 268 seconds)
01:38 🔗 drumstick has joined #archiveteam-bs
01:44 🔗 Somebody2 plue: just usernames is worth putting up on IA just so it's less likely to get lost.
01:44 🔗 Somebody2 It's not that valuable historically without the actual data, agreed.
01:46 🔗 Somebody2 One thing we could do that would likely be a smaller pile of data, is to grab *just text* from all the blogs on that list.
01:46 🔗 Somebody2 I know the images on Tumblr are really important, but so is the text, and it's lots smaller.
01:46 🔗 Somebody2 Also, making a warrior project just to grab metadata from all those blogs would be nice.
01:46 🔗 Somebody2 i.e. how many posts, what's the oldest and newest, etc.
01:47 🔗 Somebody2 It would also be *great* if knowledge of writing warrior projects was more widely distributed, so if any of you want to practice...
02:07 🔗 namibj_ Somebody2: I do approve a lot of the effort to distribute the knowledge of writing warrior projects
02:09 🔗 namibj_ Do we have _any_ useful estimate of the combined size of the tumblr images? We are kind-of freaked out by our expectations, so we have not seriously tried to start archiving at all...
02:14 🔗 namibj_ We are currently archiving a selection of 4chan boards though, and will likely soon have 8chan ready too, but those archives are a bit of an issue due to them being compiled live, i.e. before moderation, and if you know 4chan there is a lot of stuff you do not want to host. Moderation filters most of that out, but in the spirit of archiving we prefer to not restrict ourselves to censored records of
02:14 🔗 namibj_ history. Personally I would decide against censorship in a binary decision (on current levels of censorship, i.e. I'd prefer to just get rid of all censorship than keep the present state), but that is hardly something really to consider when you speak of the huge amount stored by tumblr.
02:15 🔗 Somebody2 I don't think that's so much an issue with Tumblr, especially by now, considering the censorship Verizon has started putting in.
02:16 🔗 namibj_ Yeah. Much less there.
02:16 🔗 Somebody2 As for total size -- you can get some estimates by looking at the sizes of the many tumblr blogs we've grabbed with Archivebot.
02:16 🔗 Somebody2 then multiply by the number in the list.
02:16 🔗 namibj_ Do you have numbers?
02:17 🔗 Somebody2 Not offhand -- but the Archivebot data is all available on IA, so you can figure it out there.
02:17 🔗 namibj_ Oh ok
02:17 🔗 Somebody2 If we want to discuss this further, we should probably make a separate channel; name suggestions?
02:17 🔗 * Somebody2 is going AFK for a bit
02:18 🔗 namibj_ For storage estimates I am only really concerned that the accuracy is so that if you say 10TB, it might also be 2.5 or 40, but in that area.
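The estimate discussed above (average size of the tumblr blogs already grabbed with Archivebot, multiplied by the number of blogs in the list, good to within a factor of about 4 either way) could be sketched roughly as follows. The function name, the sample sizes, and the blog count are all hypothetical placeholders, not real figures from IA or the list.

```python
def estimate_total_bytes(sample_sizes_bytes, blog_count, uncertainty_factor=4.0):
    """Extrapolate a total size from a sample of already-archived blogs.

    Returns (low, point, high), where low and high are the point estimate
    divided and multiplied by uncertainty_factor. A factor of 4 matches
    "if you say 10TB, it might also be 2.5 or 40".
    """
    if not sample_sizes_bytes:
        raise ValueError("need at least one sampled blog size")
    mean_size = sum(sample_sizes_bytes) / len(sample_sizes_bytes)
    point = mean_size * blog_count
    return point / uncertainty_factor, point, point * uncertainty_factor


# Hypothetical numbers for illustration only.
TB = 10**12
sample = [2 * 10**9, 500 * 10**6, 12 * 10**9]  # made-up sizes of three grabbed blogs
low, point, high = estimate_total_bytes(sample, blog_count=1000)
print(f"{low / TB:.2f} TB .. {point / TB:.2f} TB .. {high / TB:.2f} TB")
```

A plain mean is sensitive to a few very large blogs, so a larger sample (or a median-based variant) would likely give a steadier figure.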
02:18 🔗 wabu has joined #archiveteam-bs
02:18 🔗 namibj_ tumbleweed?
02:19 🔗 namibj_ But tbh last name suggestion i made a few months ago was pretty terrible, according to the others.
02:23 🔗 namibj_ Considering the scale to be a number of bits needed to count the bytes stored, i.e. 32 being 4GiB, 40 being 1TiB, anything up to 38 is no issue, likely to be more work to manage than to store, 44 no issue to store in a potential cold storage, and 55 about the limit the warrior system can reasonably do for a single project. Anything above 36 is not necessarily for the internet archive though.
02:23 🔗 namibj_ At least not yet.
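The bit scale described above is just the base-2 logarithm of the byte count. A minimal sketch that converts the thresholds mentioned (32, 36, 38, 40, 44, 55) into binary units; the helper names are made up for illustration:

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def size_bits(num_bytes: int) -> int:
    """Smallest b such that 2**b >= num_bytes (num_bytes must be >= 1)."""
    if num_bytes < 1:
        raise ValueError("num_bytes must be at least 1")
    return (num_bytes - 1).bit_length()


def human(bits: int) -> str:
    """Render 2**bits bytes in binary units, e.g. 32 -> '4 GiB'."""
    idx = min(bits // 10, len(UNITS) - 1)
    return f"{1 << (bits - 10 * idx)} {UNITS[idx]}"


for bits in (32, 36, 38, 40, 44, 55):
    print(bits, "->", human(bits))
```

On this scale, 38 corresponds to 256 GiB, 44 to 16 TiB, and 55 to 32 PiB.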
02:25 🔗 Somebody2 tumbleweed seems fine to me
02:26 🔗 Somebody2 but it's already in use :-(
02:26 🔗 namibj_ Uh
02:26 🔗 namibj_ Any better idea?
02:27 🔗 Somebody2 tumbledown?
02:27 🔗 namibj_ yeah
02:27 🔗 Somebody2 feel free to join it
02:41 🔗 jrwr namibj_: what are you going on about with this storage system
02:41 🔗 jrwr We have newsbuddy eating that on a daily basis now
02:42 🔗 Somebody2 jrwr: *How* much is newsbuddy producing per day?
02:42 🔗 jrwr I handled 16 million urls yesterday
02:42 🔗 jrwr on the dedupe server
02:43 🔗 jrwr 38TB per the tracker http://tracker.archiveteam.org/newsgrabber/
02:44 🔗 jrwr mind you, we know that tumblr will be a stupid large project if we ever need to save them
02:46 🔗 Somebody2 Cool, nice to have the stats.
02:47 🔗 namibj_ has quit IRC (Ping timeout: 268 seconds)
02:47 🔗 namibj1 has joined #archiveteam-bs
02:48 🔗 namibj1 Somebody2: sorry for the delay, my bouncer has some connectivity issues.
02:53 🔗 jrwr Ya
02:53 🔗 jrwr If you join #newsgrabberbot
02:53 🔗 jrwr you can see as the URLs are added in
02:56 🔗 Fletcher- is now known as Fletcher_
03:07 🔗 Kisikilli has quit IRC (Quit: http://www.okay.uz/ (Ping timeout))
03:24 🔗 Somebody2 namibj1: no worries, it happens
03:27 🔗 Fletcher is now known as Fletcher-
03:27 🔗 Fletcher_ is now known as Fletcher
03:52 🔗 qw3rty115 has joined #archiveteam-bs
03:56 🔗 qw3rty114 has quit IRC (Read error: Operation timed out)
04:09 🔗 HarryCros has quit IRC (Read error: Connection reset by peer)
04:09 🔗 HarryCros has joined #archiveteam-bs
04:11 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
04:18 🔗 Sk1d has joined #archiveteam-bs
04:40 🔗 namibj_ has joined #archiveteam-bs
05:51 🔗 klg has joined #archiveteam-bs
06:07 🔗 schbirid has joined #archiveteam-bs
06:10 🔗 brayden has quit IRC (Read error: Connection reset by peer)
06:14 🔗 hook54321 JAA: I'm not sure where this twitter feed is getting info from, but slightly concerning. https://twitter.com/dnsstream/status/902035519776858112
06:18 🔗 hook54321 huh. .al domains don't have a central whois server.
06:23 🔗 godane did anyone look at pressreader to get more recent newspapers
06:23 🔗 godane recent as in the last 10+ years
06:32 🔗 Mateon1 has quit IRC (Ping timeout: 245 seconds)
06:32 🔗 Mateon1 has joined #archiveteam-bs
07:01 🔗 brayden has joined #archiveteam-bs
07:01 🔗 swebb sets mode: +o brayden
07:01 🔗 godane i think issuu.com is screwed up
07:26 🔗 HCross2 jrwr: sorry to burst your bubble - that 36TB is since we moved to the warrior. We do about 1TB a day
07:30 🔗 schbirid has quit IRC (Quit: Leaving)
07:37 🔗 atluxity has quit IRC (Ping timeout: 506 seconds)
07:40 🔗 godane looks like issuu.com is fixed
08:26 🔗 drumstick has quit IRC (Read error: Operation timed out)
08:39 🔗 drumstick has joined #archiveteam-bs
08:51 🔗 JerryStie has quit IRC (Read error: Operation timed out)
09:24 🔗 SimpBrain has quit IRC (Remote host closed the connection)
09:47 🔗 ruunyan has quit IRC (Read error: Operation timed out)
09:49 🔗 ruunyan has joined #archiveteam-bs
10:48 🔗 BlueMaxim has quit IRC (Quit: Leaving)
11:36 🔗 odemg godane, finally coming in but he still wants to clean them up before putting them out properly, here's what he's handed me so far and wants feedback on cleaning up the video before encoding/putting them out: http://dheval.eieidoh.net:8880/DataHoarder/tmp_requests/rednight39_Conan_VHS/
11:40 🔗 drumstick has quit IRC (Read error: Operation timed out)
11:43 🔗 REiN^ some scene stuff AJJ-The_Bible_2-CD-FLAC-2016-NBFLAC
11:44 🔗 REiN^ anybody collecting soft for BeOS?
11:46 🔗 godane odemg: i'm downloading it now
11:47 🔗 odemg <3
11:47 🔗 godane i don't know much about video capturing and encoding but i do my best
11:48 🔗 odemg he's all good with that he just wants to restore the video as much as possible
11:49 🔗 godane odemg: do you know how to get pressreader newspapers by chance?
12:18 🔗 odemg godane, not tried, no idea what those are so nope?
12:25 🔗 odemg godane, if you're not on the tracker you may be interested in this, though it's taking forever to derive: https://archive.org/details/societyglitchaugust2017
12:29 🔗 odemg it would be nice to have it sorted but it's a huge task, I've also saved all the original upload descriptions some of which are quite crazy... example: https://www.reddit.com/r/opendirectories/comments/6waf8b/floyd_mayweather_vs_conor_mcgregor/dm6kaiy/?context=3
13:22 🔗 namibj_ what was the name for the tumblr channel again? I got DDOSd, and my bouncer is weird.
13:23 🔗 PurpleSym namibj_: #tumbledown
13:34 🔗 namibj_ thx
15:13 🔗 kristian_ has joined #archiveteam-bs
15:15 🔗 JAA "Wayback Machine failed to return archive information." and "This snapshot cannot be displayed due to an internal error." :-|
15:19 🔗 HCross has joined #archiveteam-bs
15:22 🔗 HarryCros has quit IRC (Ping timeout: 268 seconds)
16:00 🔗 kristian_ has quit IRC (Quit: Leaving)
16:10 🔗 Lord_Nigh has quit IRC (Read error: Operation timed out)
16:13 🔗 Lord_Nigh has joined #archiveteam-bs
16:34 🔗 schbirid has joined #archiveteam-bs
16:43 🔗 atrocity do people actually like old commercials?
16:51 🔗 namibj_ atrocity: sometimes. more as a curiosity effect/looking at stuff in a museum, than for afternoon tv binge replacement, afaik
17:05 🔗 robink has quit IRC (Read error: Connection reset by peer)
17:09 🔗 robink has joined #archiveteam-bs
17:23 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
17:25 🔗 RichardG has joined #archiveteam-bs
19:24 🔗 Asparagir has joined #archiveteam-bs
19:47 🔗 atrocity haha, ok. i have a bunch of old tapes i have from...sources...that has old 80's and 90's commercials on them
20:11 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
20:13 🔗 hook54321 JAA: I saw someone post about that on Twitter as well.
20:16 🔗 RichardG has joined #archiveteam-bs
20:19 🔗 RichardG_ has joined #archiveteam-bs
20:20 🔗 fie has quit IRC (Ping timeout: 268 seconds)
20:23 🔗 RichardG has quit IRC (Ping timeout: 370 seconds)
20:23 🔗 hook54321 I wish the Community College near me had something like this, except less expensive. http://digitalcuration.umaine.edu/
20:24 🔗 RichardG has joined #archiveteam-bs
20:26 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
20:29 🔗 RichardG_ has quit IRC (Ping timeout: 370 seconds)
20:34 🔗 fie has joined #archiveteam-bs
20:45 🔗 godane odemg: the conan videos look great
20:46 🔗 godane i don't think filters would be needed
20:48 🔗 odemg yeah i didnt think so
20:58 🔗 kimmer has joined #archiveteam-bs
21:23 🔗 sep332_ has joined #archiveteam-bs
21:24 🔗 sep332 has quit IRC (Read error: Operation timed out)
21:32 🔗 namibj1 atrocity: do you have h/w to digitize them without compression? I.e. read the video signal into raw pixels and then just store them for later non-real-time compression? If not, we could probably find some way to digitize them well. May I ask, what country they are in?
21:33 🔗 astrid i have an ntsc-firewire capture thingy that encodes each frame as a distinct jpeg
21:33 🔗 astrid it looks a lot better than mpeg
21:33 🔗 astrid but the files are yooge
21:34 🔗 RichardG has joined #archiveteam-bs
21:44 🔗 namibj1 Good
21:44 🔗 namibj1 You encode later with ffmpeg/libx264
21:45 🔗 namibj1 astrid: I can guide you through the encoding and creation of the files for upload.
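The "capture first, encode later" workflow mentioned above could look roughly like this: keep the large capture file as-is, then run a slow, offline libx264 encode. This is only a sketch; the file names, the CRF value of 18, and the `veryslow` preset are assumptions chosen for illustration, not settings anyone in the channel specified.

```python
import subprocess


def build_x264_command(src, dst, crf=18, preset="veryslow"):
    """Build an ffmpeg command line for an offline libx264 encode.

    crf and preset are illustrative defaults (assumptions); a lower crf
    means higher quality and larger files.
    """
    return [
        "ffmpeg",
        "-i", src,            # the large intermediate capture file
        "-c:v", "libx264",    # encode video with x264
        "-preset", preset,    # slower preset: better compression, more CPU time
        "-crf", str(crf),     # constant-quality mode
        "-c:a", "copy",       # leave the audio stream untouched
        dst,
    ]


def encode(src, dst):
    """Run the encode; requires ffmpeg on PATH. Not called on import."""
    subprocess.run(build_x264_command(src, dst), check=True)


# Hypothetical file names, printed rather than executed.
print(" ".join(build_x264_command("capture.avi", "capture.mkv")))
```

Since the encode is not real-time, the slow preset only costs CPU hours, which is the point of storing the raw capture first.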
21:46 🔗 astrid excuse me?
21:46 🔗 namibj1 If you have a way to feed the vhs into your computer and store the files for about a week or so.
21:46 🔗 astrid i don't have the tapes
21:46 🔗 astrid you probably are talking to the wrong person
21:46 🔗 namibj1 Oh
21:47 🔗 namibj1 sry, I thought you responded.
21:47 🔗 astrid i responded that such hardware exists
21:48 🔗 astrid that's all :)
21:48 🔗 namibj1 Uh I know it exists
21:48 🔗 namibj1 I just don't know if he has access.
21:48 🔗 * astrid nods quietly
21:49 🔗 * namibj1 knows a bit or two about video coding
21:49 🔗 astrid sorry
21:50 🔗 namibj1 Nah, I don't have that much experience with old tech.
22:07 🔗 drumstick has joined #archiveteam-bs
22:23 🔗 Geekonoci has joined #archiveteam-bs
22:33 🔗 schbirid has quit IRC (Quit: Leaving)
23:04 🔗 atrocity i meant they were ripped tapes
23:04 🔗 atrocity from a private tracker
23:04 🔗 atrocity so whatever format they're in (probably shitty mpeg)
23:04 🔗 astrid ah nice
23:04 🔗 atrocity like old nick shows and stuff with commercials in between and stuff
23:07 🔗 namibj1 Uh
23:07 🔗 namibj1 So the stuff you can access is already digital?
23:07 🔗 namibj1 Do you have a way to upload them?
23:51 🔗 atrocity this is kind of scary: https://tech.slashdot.org/story/17/08/28/1725232/how-the-nsa-identified-satoshi-nakamoto
