#archiveteam-bs 2017-08-28,Mon

Who What When
hook54321JAA: They're live again. https://dailystormer.al/ [00:01]
SketchCowHa ha, white supremacists living under the pleasure of albania [00:07]
jrwrI wonder if bitmitigate.com knows they are hosting them [00:10]
hook54321Based on their twitter feed, I'm assuming so. https://twitter.com/bitmitigate [00:11]
.... (idle for 19mn)
***BlueMaxim has joined #archiveteam-bs [00:30]
namibj_jrwr: I am willing to participate too.
Also, I am currently trying to get the stuff/knowledge to start some material/corrosion testing for long-term no-maintenance tape storage. Just the mechanical side for now though. I expect about 15$/TB to store for 1-2 decades, in lots of about 50TB+
Trying to get water vapor diffusion numbers for metal cans, which is not that easy.
[00:35]
........... (idle for 50mn)
***drumstick has joined #archiveteam-bs [01:28]
drumstick has quit IRC (Ping timeout: 268 seconds) [01:33]
drumstick has joined #archiveteam-bs [01:38]
Somebody2plue: just usernames is worth putting up on IA just so it's less likely to get lost.
It's not that valuable historically without the actual data, agreed.
One thing we could do that would likely be a smaller pile of data, is to grab *just text* from all the blogs on that list.
I know the images on Tumblr are really important, but so is the text, and it's lots smaller.
Also, making a warrior project just to grab metadata from all those blogs would be nice.
i.e. how many posts, what's the oldest and newest, etc.
It would also be *great* if knowledge of writing warrior projects was more widely distributed, so if any of you want to practice...
[01:44]
..... (idle for 20mn)
namibj_Somebody2: I do approve a lot of the effort to distribute the knowledge of writing warrior projects
Do we have _any_ useful estimate of the combined size of the tumblr images? We are kind-of freaked out by our expectations, so we have not seriously tried to start archiving at all...
[02:07]
We are currently archiving a selection of 4chan boards though, and will likely soon have 8chan ready too, but those archives are a bit of an issue due to them being compiled live, i.e. before moderation, and if you know 4chan there is a lot of stuff you do not want to host. Moderation filters most of that out, but in the spirit of archiving we prefer to not restrict ourselves to censored records of
history. Personally I would decide against censorship in a binary decision (on current levels of censorship, i.e. I'd prefer to just get rid of all censorship than keep the present state), but that is hardly something really to consider when you speak of the huge amount stored by tumblr.
[02:14]
Somebody2I don't think that's so much an issue with Tumblr, especially by now, considering the censorship Verizon has started putting in. [02:15]
namibj_Yeah. Much less there. [02:16]
Somebody2As for total size -- you can get some estimates by looking at the sizes of the many tumblr blogs we've grabbed with Archivebot.
then multiply by the number in the list.
[02:16]
namibj_Do you have numbers? [02:16]
Somebody2Not offhand -- but the Archivebot data is all available on IA, so you can figure it out there. [02:17]
namibj_Oh ok [02:17]
Somebody2If we want to discuss this further, we should probably make a separate channel; name suggestions?
Somebody2 is going AFK for a bit
[02:17]
namibj_For storage estimates I am only really concerned that the accuracy is so that if you say 10TB, it might also be 2.5 or 40, but in that area. [02:18]
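The estimate discussed above (average size of the tumblr blogs already grabbed, multiplied by the number of blogs on the list, with a tolerance of roughly a factor of four either way) could be sketched like this. The function name, the sample sizes and the blog count are hypothetical placeholders; real figures would have to come from the Archivebot data on IA.

```python
from statistics import mean


def estimate_total_bytes(sample_sizes, blog_count, factor=4.0):
    """Return (low, point, high) size estimates in bytes.

    sample_sizes: sizes in bytes of blogs that were already grabbed
                  (hypothetical input, not real Archivebot numbers).
    blog_count:   number of blogs on the list to be archived.
    factor:       tolerance; 4.0 means "10TB might also be 2.5 or 40".
    """
    if not sample_sizes:
        raise ValueError("need at least one sample size")
    # Point estimate: average sample size times the number of blogs.
    point = mean(sample_sizes) * blog_count
    # Error band: divide and multiply by the tolerance factor.
    return point / factor, point, point * factor


# Example with made-up numbers: three sampled blogs, 1000 blogs on the list.
low, point, high = estimate_total_bytes([2e9, 5e9, 8e9], 1000)
```

A plain mean is only a rough choice here; if a few very large blogs dominate, the true total could fall outside the band.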
***wabu has joined #archiveteam-bs [02:18]
namibj_tumbleweed?
But tbh last name suggestion i made a few months ago was pretty terrible, according to the others.
Considering the scale to be a number of bits needed to count the bytes stored, i.e. 32 being 4GiB, 40 being 1TiB, anything up to 38 is no issue, likely to be more work to manage than to store, 44 no issue to store in a potential cold storage, and 55 about the limit the warrior system can reasonably do for a single project. Anything above 36 is not necessarily for the internet archive though.
At least not yet.
[02:18]
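The bit scale above is just a base-2 logarithm of the byte count: a scale value of n means about 2^n bytes. A minimal sketch of the conversion, with a hypothetical helper name, reproduces the figures quoted (32 is 4 GiB, 40 is 1 TiB):

```python
def bits_to_size(bits: int) -> str:
    """Convert a 'bits needed to count the bytes' value to a binary-unit size."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    # Every 10 bits moves up one binary unit (2**10 = 1024).
    idx = min(bits // 10, len(units) - 1)
    value = 2 ** bits / 2 ** (10 * idx)
    return f"{value:g} {units[idx]}"


# The scale points mentioned in the message above.
for n in (32, 36, 38, 40, 44, 55):
    print(n, "->", bits_to_size(n))
```

On this scale 38 is 256 GiB, 44 is 16 TiB and 55 is 32 PiB.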
Somebody2tumbleweed seems fine to me
but it's already in use :-(
[02:25]
namibj_Uh
Any better idea?
[02:26]
Somebody2tumbledown? [02:27]
namibj_yeah [02:27]
Somebody2feel free to join it [02:27]
jrwrnamibj_: what are you going on about with this storage system
We have newsbuddy eating that on a daily basis now
[02:41]
Somebody2jrwr: *How* much is newsbuddy producing per day? [02:42]
jrwrI handled 16 million urls yesterday
on the dedupe server
38TB per the tracker http://tracker.archiveteam.org/newsgrabber/
mind you, we know that tumblr will be a stupid large project if we ever need to save them
[02:42]
Somebody2Cool, nice to have the stats. [02:46]
***namibj_ has quit IRC (Ping timeout: 268 seconds)
namibj1 has joined #archiveteam-bs
[02:47]
namibj1Somebody2: sorry for the delay, my bouncer has some connectivity issues. [02:48]
jrwrYa
If you join #newsgrabberbot
you can see as the URLs are added in
[02:53]
***Fletcher- is now known as Fletcher_ [02:56]
Kisikilli has quit IRC (Quit: http://www.okay.uz/ (Ping timeout)) [03:07]
.... (idle for 17mn)
Somebody2namibj1: no worries, it happens [03:24]
***Fletcher is now known as Fletcher-
Fletcher_ is now known as Fletcher
[03:27]
...... (idle for 25mn)
qw3rty115 has joined #archiveteam-bs
qw3rty114 has quit IRC (Read error: Operation timed out)
[03:52]
HarryCros has quit IRC (Read error: Connection reset by peer)
HarryCros has joined #archiveteam-bs
Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:09]
Sk1d has joined #archiveteam-bs [04:18]
..... (idle for 22mn)
namibj_ has joined #archiveteam-bs [04:40]
............... (idle for 1h11mn)
klg has joined #archiveteam-bs [05:51]
.... (idle for 16mn)
schbirid has joined #archiveteam-bs
brayden has quit IRC (Read error: Connection reset by peer)
[06:07]
hook54321JAA: I'm not sure where this twitter feed is getting info from, but slightly concerning. https://twitter.com/dnsstream/status/902035519776858112
huh. .al domains don't have a central whois server.
[06:14]
godanedid anyone look at pressreader to get more recent newspapers
recent as in the last 10+ years
[06:23]
***Mateon1 has quit IRC (Ping timeout: 245 seconds)
Mateon1 has joined #archiveteam-bs
[06:32]
...... (idle for 29mn)
brayden has joined #archiveteam-bs
swebb sets mode: +o brayden
[07:01]
godanei think issuu.com is screwed up [07:01]
...... (idle for 25mn)
HCross2jrwr: sorry to burst your bubble - that 36TB is since we moved to the warrior. We do about 1TB a day [07:26]
***schbirid has quit IRC (Quit: Leaving) [07:30]
atluxity has quit IRC (Ping timeout: 506 seconds) [07:37]
godanelooks like issuu.com is fixed [07:40]
.......... (idle for 46mn)
***drumstick has quit IRC (Read error: Operation timed out) [08:26]
drumstick has joined #archiveteam-bs [08:39]
JerryStie has quit IRC (Read error: Operation timed out) [08:51]
....... (idle for 33mn)
SimpBrain has quit IRC (Remote host closed the connection) [09:24]
..... (idle for 23mn)
ruunyan has quit IRC (Read error: Operation timed out)
ruunyan has joined #archiveteam-bs
[09:47]
............ (idle for 59mn)
BlueMaxim has quit IRC (Quit: Leaving) [10:48]
.......... (idle for 48mn)
odemggodane, finally coming in but he still wants to clean them up before putting them out properly, here's what he's handed me so far and wants feedback on cleaning up the video before encoding/putting them out: http://dheval.eieidoh.net:8880/DataHoarder/tmp_requests/rednight39_Conan_VHS/ [11:36]
***drumstick has quit IRC (Read error: Operation timed out) [11:40]
REiN^some scene stuff AJJ-The_Bible_2-CD-FLAC-2016-NBFLAC
anybody collecting soft for BeOS?
[11:43]
godaneodemg: i'm downloading it now [11:46]
odemg<3 [11:47]
godanei don't know much about video capturing and encoding but i do my best [11:47]
odemghe's all good with that he just wants to restore the video as much as possible [11:48]
godaneodemg: do you know how to get pressreader newspapers by chance? [11:49]
...... (idle for 29mn)
odemggodane, not tried, no idea what those are so nope? [12:18]
godane, if you're not on the tracker you may be interested in this, though it's taking forever to derive: https://archive.org/details/societyglitchaugust2017
it would be nice to have it sorted but it's a huge task, I've also saved all the original upload descriptions some of which are quite crazy... example: https://www.reddit.com/r/opendirectories/comments/6waf8b/floyd_mayweather_vs_conor_mcgregor/dm6kaiy/?context=3
[12:25]
........... (idle for 53mn)
namibj_what was the name for the tumblr channel again? I got DDOSd, and my bouncer is weird. [13:22]
PurpleSymnamibj_: #tumbledown [13:23]
namibj_thx [13:34]
.................... (idle for 1h39mn)
***kristian_ has joined #archiveteam-bs [15:13]
JAA"Wayback Machine failed to return archive information." and "This snapshot cannot be displayed due to an internal error." :-| [15:15]
***HCross has joined #archiveteam-bs
HarryCros has quit IRC (Ping timeout: 268 seconds)
[15:19]
........ (idle for 38mn)
kristian_ has quit IRC (Quit: Leaving) [16:00]
Lord_Nigh has quit IRC (Read error: Operation timed out)
Lord_Nigh has joined #archiveteam-bs
[16:10]
..... (idle for 21mn)
schbirid has joined #archiveteam-bs [16:34]
atrocitydo people actually like old commercials? [16:43]
namibj_atrocity: sometimes. more as a curiosity effect/looking at stuff in a museum, than for afternoon tv binge replacement, afaik [16:51]
***robink has quit IRC (Read error: Connection reset by peer)
robink has joined #archiveteam-bs
[17:05]
RichardG has quit IRC (Read error: Connection reset by peer)
RichardG has joined #archiveteam-bs
[17:23]
........................ (idle for 1h59mn)
Asparagir has joined #archiveteam-bs [19:24]
..... (idle for 23mn)
atrocityhaha, ok. i have a bunch of old tapes i have from...sources...that has old 80's and 90's commercials on them [19:47]
..... (idle for 24mn)
***RichardG has quit IRC (Read error: Connection reset by peer) [20:11]
hook54321JAA: I saw someone post about that on Twitter as well. [20:13]
***RichardG has joined #archiveteam-bs
RichardG_ has joined #archiveteam-bs
fie has quit IRC (Ping timeout: 268 seconds)
RichardG has quit IRC (Ping timeout: 370 seconds)
[20:16]
hook54321I wish the Community College near me had something like this, except less expensive. http://digitalcuration.umaine.edu/ [20:23]
***RichardG has joined #archiveteam-bs
RichardG has quit IRC (Read error: Connection reset by peer)
RichardG_ has quit IRC (Ping timeout: 370 seconds)
[20:24]
fie has joined #archiveteam-bs [20:34]
godaneodemg: the conan videos look great
i don't think filters would be needed
[20:45]
odemgyeah i didnt think so [20:48]
***kimmer has joined #archiveteam-bs [20:58]
...... (idle for 25mn)
sep332_ has joined #archiveteam-bs
sep332 has quit IRC (Read error: Operation timed out)
[21:23]
namibj1atrocity: do you have h/w to digitize them without compression? I.e. read the video signal into raw pixels and then just store them for later non-real-time compression? If not, we could probably find some way to digitize them well. May I ask, what country they are in? [21:32]
astridi have an ntsc-firewire capture thingy that encodes each frame as a distinct jpeg
it looks a lot better than mpeg
but the files are yooge
[21:33]
***RichardG has joined #archiveteam-bs [21:34]
namibj1Good
You encode later with ffmpeg/libx264
astrid: I can guide you through the encoding and creation of the files for upload.
[21:44]
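The "capture first, encode later with ffmpeg/libx264" step could look roughly like the sketch below. It only builds the command line; the file names, the CRF and preset values, and the choice of AAC audio are assumptions for illustration, not settings anyone in the channel specified. Deinterlacing, which captured NTSC material may need, is not handled.

```python
import subprocess


def build_x264_command(src: str, dst: str, crf: int = 18, preset: str = "slow"):
    """Build an ffmpeg argument list that re-encodes src to H.264 via libx264.

    crf and preset are illustrative defaults (assumptions); a lower crf
    means higher quality and larger files.
    """
    return [
        "ffmpeg",
        "-i", src,            # large intermediate capture file
        "-c:v", "libx264",    # H.264 video encoder
        "-preset", preset,    # speed/compression trade-off
        "-crf", str(crf),     # constant-quality mode
        "-c:a", "aac",        # re-encode audio to AAC
        dst,
    ]


def encode(src: str, dst: str) -> None:
    """Run the encode; requires ffmpeg to be installed and on PATH."""
    subprocess.run(build_x264_command(src, dst), check=True)
```

Keeping the untouched capture file until the encode has been checked makes it possible to redo the compression with different settings later.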
astridexcuse me? [21:46]
namibj1If you have a way to feed the vhs into your computer and store the files for about a week or so. [21:46]
astridi don't have the tapes
you probably are talking to the wrong person
[21:46]
namibj1Oh
sry, I thought you responded.
[21:46]
astridi responded that such hardware exists
that's all :)
[21:47]
namibj1Uh I know it exists
I just don't know if he has access.
[21:48]
astridastrid nods quietly [21:48]
namibj1namibj1 knows a bit or two about video coding [21:49]
astridsorrz [21:49]
namibj1Nah, I don't have that much experience with old tech. [21:50]
.... (idle for 17mn)
***drumstick has joined #archiveteam-bs [22:07]
.... (idle for 16mn)
Geekonoci has joined #archiveteam-bs [22:23]
schbirid has quit IRC (Quit: Leaving) [22:33]
....... (idle for 31mn)
atrocityi meant they were ripped tapes
from a private tracker
so whatever format they're in (probably shitty mpeg)
[23:04]
astridah nice [23:04]
atrocitylike old nick shows and stuff with commercials in between and stuff [23:04]
namibj1Uh
So the stuff you can access is already digital?
Do you have a way to upload them?
[23:07]
......... (idle for 44mn)
atrocitythis is kind of scary: https://tech.slashdot.org/story/17/08/28/1725232/how-the-nsa-identified-satoshi-nakamoto [23:51]
