Time |
Nickname |
Message |
02:47
๐
|
godane1 |
Famicoman: ping |
06:03
๐
|
turnkit |
In America, government hacks you... http://www.whitehouse.gov/blog/2013/01/22/roll-your-sleeves-get-involved-and-get-civic-hacking |
06:04
๐
|
turnkit |
or as Andrew Auernheimer @rabite says, "If your app is deemed unamerican you go to prison!" |
06:05
๐
|
Zebranky |
Good ol' weev |
06:39
๐
|
illunatic |
fuck |
06:40
๐
|
illunatic |
media is pushing a story of john kerry saying hackers are nukes |
06:41
๐
|
illunatic |
http://activepolitic.com:82/external/1785.html |
06:41
๐
|
illunatic |
00:12 <+u4t> imagine if a hacker had been dropped on hiroshima or nagasaki |
06:41
๐
|
illunatic |
00:12 <+u4t> there would be NO SURVIVORS |
06:47
๐
|
kennethr- |
hahaha |
06:47
๐
|
kennethr- |
well, admittedly, our shit is so fucking vulnerable |
06:50
๐
|
kennethr- |
can't we all just get along? |
07:16
๐
|
illunatic |
<!-- __ _ _ _ __| |_ (_)__ _____ / _` | '_/ _| ' \| |\ V / -_) \__,_|_| \__|_||_|_| \_/\___| --> |
07:16
๐
|
illunatic |
oops |
07:17
๐
|
illunatic |
<!-- __ _ _ _ __| |_ (_)__ _____ --> |
07:17
๐
|
illunatic |
<!--/ _` | '_/ _| ' \| |\ V / -_) --> |
07:17
๐
|
illunatic |
<!--\__,_|_| \__|_||_|_| \_/\___| --> |
07:31
๐
|
turnkit |
thinking a lot about the "bubble" window-period of lots of semi-freely available information due to the CD-ROM/DVD-ROM era of distribution... we get this big explosion of INFORMATION availability and then in a couple decades it seems like that access is less available. Is all the info still there like it was 15 years ago? |
07:31
๐
|
turnkit |
What the heck is this publication? http://www.worldcat.org/title/canadian-cd-rom-newsletter/oclc/18111186 |
07:34
๐
|
turnkit |
If every CD-ROM is now a web site... where is the site, "The rock cycle in Michigan" lol -- http://www.worldcat.org/title/rock-cycle-in-michigan-cd-rom/oclc/49221323 |
07:35
๐
|
turnkit |
oh, I guess this is it... https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&ved=0CD0QFjAB&url=http%3A%2F%2Fwww.oakland.k12.mi.us%2Fscope%2Fseventh_lessons%2Fscience%2Funit5%2FSC070504TB.ppt&ei=G4cDUaHoEMSQqgHejoCQDg&usg=AFQjCNEsym5C44QZ3CC9eyKiLWcxmiXZTg&sig2=32aCcKoVhvrMHq_CCDrBCA&bvm=bv.41524429,d.aWM |
07:35
๐
|
godane1 |
just for you guys to know i download g4tv.com videos |
07:36
๐
|
turnkit |
should have linked: https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&ved=0CEEQxQEwAQ&url=https%3A%2F%2Fdocs.google.com%2Fviewer%3Fa%3Dv%26q%3Dcache%3A_OwfKzETh84J%3Awww.oakland.k12.mi.us%2Fscope%2Fseventh_lessons%2Fscience%2Funit5%2FSC070504TB.ppt%2B%26hl%3Den%26gl%3Dus%26pid%3Dbl%26srcid%3DADGEESglvcHQH5V5tU6eFA2ZTt2-Ddh9wo-lL-DrtMVCnDKHMa4ELCoNRUdpLh0Sh-EQGoNT1RTJ-LYACtXu9 |
07:36
๐
|
turnkit |
Geeze godane1-- made me think I was posting in the non -bs page. something meaningful? :) |
07:37
๐
|
illunatic |
balrog_: http://archive.org/details/AaronSwartzMemorial0 |
07:39
๐
|
illunatic |
i left the original video titles because i wanted to keep it as close to the original format as possible |
07:39
๐
|
illunatic |
all that has changed is the container from flv to mp4 |
07:40
๐
|
DFJustin |
hmm that newsletter is in the vancouver public library, not so far |
07:41
๐
|
DFJustin |
doesn't sound like it necessarily has an enclosed disc though |
07:47
๐
|
illunatic |
http://archive.org/details/AaronSwartzMemorialAtTheInternetArchive |
07:59
๐
|
illunatic |
authoritarians! |
08:26
๐
|
turnkit |
DFJustin-- these days I'm thinking I should take a cd-reader into any large local library to do archiving sessions. I doubt they will keep those discs for ten+, or twenty+ years. |
08:27
๐
|
DFJustin |
yeah go for it |
08:27
๐
|
DFJustin |
cds rot eventually too |
08:47
๐
|
turnkit |
thanks for uploading the memorial... am watching... good thoughts. |
08:49
๐
|
turnkit |
29m05s+ "if you are a programmer or technologist like many of you in the audience today, you have special powers and special responsibilities..." |
08:50
๐
|
turnkit |
"... you can do magic" |
08:51
๐
|
turnkit |
abt 29m15s+ "Aaron really could do magic. And I'm dedicated to making sure his magic doesn't end with his death. I hope you'll join me" -- missed her name, Aaron's girlfriend. |
08:52
๐
|
turnkit |
Sobering stuff. |
08:52
๐
|
illunatic |
turnkit: very inspiring |
08:58
๐
|
SketchCow |
Taren\ |
08:59
๐
|
godane1 |
SketchCow: I'm mirroring the g4 videos on g4tv.com |
08:59
๐
|
godane1 |
i have the list |
08:59
๐
|
godane1 |
getting tons of old techtv videos |
09:01
๐
|
godane1 |
in other news i have a mcdonalds employee training video from 1972 |
09:01
๐
|
turnkit |
training video is neat |
09:01
๐
|
turnkit |
b/c of the age and the format |
09:01
๐
|
turnkit |
and iconic source |
09:01
๐
|
godane1 |
and i'm getting a blockbuster employee training from 2002 |
09:02
๐
|
turnkit |
I've got some Tandy training LASERDISC but it's from like 1992 -- not nearly as cool. haha. :) |
09:02
๐
|
turnkit |
category: CorporateTraining -? |
09:03
๐
|
turnkit |
what are you using to digitize? |
09:06
๐
|
turnkit |
Whoever was involved in shooting the Internet Archive memorial did a very nice job. Multi-cam shoot with good establishing shots. Nothing overdone, simple, but live-edited well. |
09:07
๐
|
godane1 |
turnkit: i just find stuff on the web |
09:07
๐
|
turnkit |
you mean, not on youtube? |
09:07
๐
|
turnkit |
:) |
09:08
๐
|
turnkit |
have you heard the 1800-sos-apple phone tech support calls collection? |
09:08
๐
|
turnkit |
it's like only 10 calls |
09:08
๐
|
turnkit |
some apple IIe and Mac+ type era calls |
09:09
๐
|
turnkit |
I think I found them on DSLreports.com again... someone uploaded them to their forum (that's a good forum/site to archive btw) |
09:09
๐
|
turnkit |
http://www.dslreports.com/forum/r12697522-humor-Apple-Support-Calls |
09:09
๐
|
turnkit |
weird old stuff... |
09:12
๐
|
illunatic |
http://blog.greenpirate.org/aaronz-swartz-memorials-archived/ |
09:19
๐
|
DFJustin |
#1 weird old trick for fixing your IIe |
09:25
๐
|
turnkit |
just realized there is not much chat going on b/c it's friday night at 3:30 AM. someone has a life? |
09:28
๐
|
turnkit |
http://archive.org/details/AaronSwartzMemorialAtTheInternetArchive |
09:29
๐
|
turnkit |
56m -- Tim O'Reilly makes sobering statements about being formed by the things that "defeat" us. Enjoying these talks. Feels like a good church service. |
09:30
๐
|
illunatic |
i'm in the same time zone turnkit |
09:30
๐
|
illunatic |
yeah that was a great one |
09:31
๐
|
turnkit |
East Texas here. |
09:31
๐
|
illunatic |
Ausint |
09:31
๐
|
illunatic |
-t |
09:31
๐
|
illunatic |
or put it somewhere more appropriate anyway :D |
09:31
๐
|
turnkit |
you guys have a DC meet or 2600 down there? |
09:32
๐
|
illunatic |
t is playing leapfrog |
09:32
๐
|
illunatic |
hm i don't know if they have 2600 |
09:32
๐
|
illunatic |
yes of course |
09:32
๐
|
illunatic |
http://www.meetup.com/ATX-2600/ |
09:32
๐
|
illunatic |
would have been surprised if not |
09:32
๐
|
turnkit |
Dallas has an interesting DC group , but I hate driving an hour+ each way |
09:32
๐
|
turnkit |
ah |
09:33
๐
|
turnkit |
I'm near Tyler |
09:33
๐
|
turnkit |
podunk out here but I like it |
09:34
๐
|
illunatic |
:) |
09:34
๐
|
illunatic |
as long as you have internets i guess |
09:34
๐
|
illunatic |
priorities |
09:34
๐
|
turnkit |
Tyler has Suddenlink's nationwide VOIP control/termination I think. They have a good network but I'm out in the countryside with AT&T DSL |
09:34
๐
|
turnkit |
3mb/s and am trying to game the system to get 6 which is technically possible but for some reason has been a big battle. |
09:35
๐
|
illunatic |
there's a dungeons and dragons meetup too |
09:35
๐
|
illunatic |
no time for that unfortunately |
09:35
๐
|
turnkit |
lulzies. we have a "Game Board Geek" store in town |
09:35
๐
|
turnkit |
I play Settlers but not most of the games they have there. |
09:36
๐
|
illunatic |
i literally just upgraded from 3mbps |
09:36
๐
|
turnkit |
congrats. I know your pain. lol |
09:36
๐
|
illunatic |
:) |
09:36
๐
|
illunatic |
at&t was charging about $50 per month for that |
09:36
๐
|
turnkit |
yep |
09:36
๐
|
illunatic |
because we had signed a year deal or whatever |
09:37
๐
|
turnkit |
what service providers do you have in Austin? |
09:37
๐
|
illunatic |
at least we were able to get a new year deal for slightly better speeds |
09:37
๐
|
illunatic |
at&t and time warner |
09:37
๐
|
illunatic |
who i would never use |
09:37
๐
|
turnkit |
so what/who do you have now? |
09:37
๐
|
illunatic |
in this area anyway |
09:37
๐
|
illunatic |
at&t still |
09:38
๐
|
illunatic |
i dunno |
09:38
๐
|
illunatic |
i might like to move to sweden for some time |
09:38
๐
|
turnkit |
did you ever get notices on datacaps? |
09:38
๐
|
illunatic |
not that i know of |
09:38
๐
|
turnkit |
I think ATT has them, but they are not implemented in our area - thank goodness |
09:39
๐
|
turnkit |
it's like 150GB or 250GB per month, I can't remember |
09:39
๐
|
illunatic |
I've been meaning to update since this http://blog.greenpirate.org/comcast-data-caps-how-long-to-reach-your-cap/ |
09:39
๐
|
illunatic |
i think they upped it to 350GB, not sure |
09:39
๐
|
turnkit |
I'll switch to business class as soon as they implement here if I get close to it. Business doesn't have a cap |
09:39
๐
|
turnkit |
but it's more expensive |
09:40
๐
|
illunatic |
http://muninetworks.org/content/bandwidth-caps-are-unnecessary-and-counterproductive |
09:40
๐
|
turnkit |
there's a good twit.tv special on datacaps -- guys from ISP Hurricane Electric (I think that's their name) |
09:41
๐
|
illunatic |
nice |
09:41
๐
|
turnkit |
yeah - peak congestion is the real problem - not overall capping |
09:41
๐
|
turnkit |
at least in our market |
09:44
๐
|
turnkit |
I am somewhat suspicious AT&T is purposely selling lower speeds than they can in order to push purchasing of data plans over mobile. |
09:45
๐
|
turnkit |
it's crazy but Verizon is about 15 miles from me too and they have purposefully not deployed DSL. Their rep came out to our country town to tell a group interested in DSL that it would never come but 4G data would "in the future" -- guess which makes more money for Verizon? |
09:45
๐
|
turnkit |
So I wonder if ATT isn't motivated the same way |
09:47
๐
|
illunatic |
actually they are tho |
09:47
๐
|
illunatic |
and they are cutting unlimited plans for mobile |
09:48
๐
|
illunatic |
oh man i left them because of that |
09:48
๐
|
illunatic |
they give some small amount of bandwidth |
09:48
๐
|
illunatic |
and if you hit it, another $20 fee each time |
09:48
๐
|
illunatic |
i wasn't even using it and somehow hitting it from crappy apps or something |
09:48
๐
|
illunatic |
no idea why tbh |
09:56
๐
|
turnkit |
anyone know how to play an archive.org video starting right at a time code (like youtubes &t=0m0s)? |
09:57
๐
|
turnkit |
I want to post to FB Malamud's 82m40s+ Aaron Swartz portion |
09:57
๐
|
illunatic |
what kind of video player do they use? |
09:58
๐
|
illunatic |
maybe you should suggest that feature |
09:59
๐
|
turnkit |
Don't know what they are using. But would be nice too if the timeline could be expanded to fill the screen for accurate scrubbing. A ninety minute video is hard to scrub on when it's shown in 4" of screen real estate. |
09:59
๐
|
turnkit |
well right now, in full screen the scrub is only 1/3 of the screen |
10:01
๐
|
turnkit |
Malamud's talk should be re-up'd on its own |
10:02
๐
|
turnkit |
it's so good |
10:02
๐
|
turnkit |
with energy |
10:02
๐
|
turnkit |
clear passion |
10:02
๐
|
turnkit |
but I like this guy already |
10:02
๐
|
turnkit |
so I'm biased |
10:05
๐
|
turnkit |
// Lossless flv/mp4 cutters? |
10:09
๐
|
DFJustin |
turnkit: if you click on the animated gif thumbnail you get a bunch of thumbnail images with links to time codes |
10:09
๐
|
turnkit |
Thanks for the "feedback" suggestion |
10:09
๐
|
DFJustin |
tweak as needed |
10:09
๐
|
turnkit |
I see where to do that |
10:09
๐
|
turnkit |
"We use the jwplayer from longtail video." |
10:09
๐
|
turnkit |
DFJustin - I'll check it. thx |
10:11
๐
|
turnkit |
well it added code, such as ?start=5009.5 at the end of the URL |
10:12
๐
|
turnkit |
ah -- the time code shows starting at "0" but when you hit play it pops forward. |
10:12
๐
|
turnkit |
THANKS |
10:12
๐
|
turnkit |
that works |
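A minimal sketch of the trick described above: the chat shows the details page accepting a `start` query parameter measured in seconds (`?start=5009.5`, `?start=4955`). The helper names `timecode_to_seconds` and `details_url_at` are hypothetical, and the timecode syntax (`82m40s`) is only an assumption modeled on how timestamps are written in this log.

```python
import re
from urllib.parse import urlencode


def timecode_to_seconds(timecode: str) -> int:
    """Convert a timecode such as '82m40s' or '1h22m40s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", timecode)
    # An empty string matches the pattern, so also require at least one part.
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised timecode: {timecode!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def details_url_at(item_id: str, timecode: str) -> str:
    """Build a details-page URL that starts playback at the given timecode."""
    base = f"http://archive.org/details/{item_id}"
    return f"{base}?{urlencode({'start': timecode_to_seconds(timecode)})}"


if __name__ == "__main__":
    print(details_url_at("AaronSwartzMemorialAtTheInternetArchive", "82m40s"))
```

For the 82m40s mark mentioned earlier this yields `?start=4960`. As noted in the chat, the player may still display 0 until play is pressed.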
10:19
๐
|
omf__ |
I think the best hackers will never work for the gov for a number of reasons |
10:20
๐
|
omf__ |
resistance to authority figures |
10:20
๐
|
ersi |
Mainly because the gov is a bunch of assholes |
10:20
๐
|
omf__ |
not being squeaky clean enough to work for them |
10:20
๐
|
omf__ |
etc ... |
10:20
๐
|
ersi |
decitivive assholes |
10:20
๐
|
omf__ |
this bs to try to get us to help them will not work |
10:20
๐
|
turnkit |
If you don't feel you have time to watch the whole memorial watch the last part first... it might motivate you :) http://archive.org/details/AaronSwartzMemorialAtTheInternetArchive?start=4955 |
10:21
๐
|
turnkit |
haha -- yeah -- kids with life long authority issues / authorities with life long exploitation and power trip issues -- the world is kinda f'd up |
10:21
๐
|
turnkit |
good luck getting them to enjoy each other in a functional way |
10:22
๐
|
omf__ |
Also and I think the biggest reason is you have to play by their rules |
10:23
๐
|
omf__ |
and that is a fucking non-starter for me |
10:23
๐
|
omf__ |
When I do a security audit at a company they want me to bash everything in |
10:23
๐
|
omf__ |
The gov freaks out if you find something too unexpected because there were *rules* |
10:23
๐
|
turnkit |
haha -- yeah makes sense |
10:24
๐
|
godane1 |
uploaded: http://archive.org/details/McDonalds.Employee.Training.VHS.1972 |
10:24
๐
|
turnkit |
"control" is why they hire audits. when they discover less of it, they aren't feeling warm fuzzies |
10:24
๐
|
omf__ |
thanks godane, I really want to watch that video |
10:25
๐
|
omf__ |
Most of the hackers who I know are on the level are fucking kernel programmers for Linux and FreeBSD |
10:25
๐
|
turnkit |
hmm... McDonalds video -vs.- Malamud's talk @ Aaron Swartz memorial? |
10:25
๐
|
turnkit |
Correct answer: both |
10:25
๐
|
omf__ |
the gov wouldn't pay them enough to come over |
10:25
๐
|
omf__ |
yeah I am a big fan of Malamud |
10:26
๐
|
turnkit |
he's awesome. I got his book on "the internet" from the 90's a couple years ago... I still need to finish... same issues different decade and different data |
10:27
๐
|
turnkit |
MickeyD is making me happy and gaayyyy... oh wait. |
10:27
๐
|
turnkit |
not really but if you listen to the song you'll know how I feel |
10:27
๐
|
turnkit |
:) |
10:28
๐
|
omf__ |
I need to dig up my copy of Waiting and see how their fake training compares |
10:28
๐
|
DFJustin |
man why does everyone have to rip vhs to mpeg-1, like it wasn't crap enough already |
10:29
๐
|
omf__ |
because the vhs conversion units usually spit out mpeg 1 or 2 |
10:29
๐
|
godane1 |
DFJustin: the g4tv.com videos may piss you off |
10:29
๐
|
omf__ |
my vhs reader dumps straight to dvd |
10:29
๐
|
turnkit |
I have an old Avid system... digitizes to Avid Meridien codec that nothing can use. So I'd have to transcode to something... and something thats compressed.... |
10:29
๐
|
godane1 |
a lot of full episodes 56k |
10:30
๐
|
turnkit |
Oh man, I got to the sad part in the McD video :( |
10:30
๐
|
turnkit |
I was so happy till "if I wanted frys I WOULD HAVE ASKED FOR T H E M ! !!!!@" |
10:30
๐
|
omf__ |
turnkit, you sure ffmpeg does not have support for that avid codec? They have support for the others like dnxhd |
10:31
๐
|
turnkit |
hmm... but the bandwidth is huge |
10:31
๐
|
turnkit |
I worked on a Fox SportsNet show in the early 2000's -- BluetorchTV (extreme sports) -- we MASTERED at 3:1 compression |
10:32
๐
|
turnkit |
Avid Meridien from that era could do what they called 1:1, 2:1, 3:1, 10:1 and 20:1 (and some single field codecs) |
10:32
๐
|
turnkit |
But it's a tough decision -- what compression should be used. |
10:33
๐
|
turnkit |
I'm glad that McD vid starts so happy... people are mean. |
10:33
๐
|
turnkit |
I'm hoping it will come full circle and the hero will win at the end. lol |
10:34
๐
|
omf__ |
Someone just started testing a robot that can make 350 burgers an hour |
10:34
๐
|
turnkit |
1:1 avid footage for one hour: 185.3 GB. http://www.digitalrebellion.com/webapps/video_calc.html |
10:34
๐
|
omf__ |
the fast food jobs death clock has started |
10:35
๐
|
turnkit |
about 3GB/minute |
10:35
๐
|
omf__ |
that isn't so much "avid" footage as raw |
10:35
๐
|
turnkit |
Well that's what broadcast quality footage is... "uncompressed" -- aka "raw" |
10:35
๐
|
turnkit |
at 3:1 one minute is down to 260MB/minute... a lot more reasonable |
10:36
๐
|
omf__ |
were you guys doing 4:3 ratio or 16:9? |
10:36
๐
|
turnkit |
this was 2001 I think - all 4:3 |
10:36
๐
|
turnkit |
SD - NTSC |
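The storage arithmetic in this exchange can be sketched as follows. Every input here is an assumption rather than something stated in the chat: a 720x486 NTSC frame, 30000/1001 frames per second, and 2 bytes per pixel (8-bit 4:2:2). With different assumptions (bit depth, audio, container overhead) the totals change, so the output will not necessarily match the calculator figures quoted above.

```python
def video_gb_per_hour(width: int, height: int, fps: float,
                      bytes_per_pixel: float,
                      compression_ratio: float = 1.0) -> float:
    """Estimate video-only storage in decimal gigabytes per hour."""
    bytes_per_second = width * height * bytes_per_pixel * fps / compression_ratio
    return bytes_per_second * 3600 / 1e9


if __name__ == "__main__":
    ntsc_fps = 30000 / 1001  # roughly 29.97 frames per second
    # Ratios taken from the Meridien settings named in the chat.
    for ratio in (1, 2, 3, 10, 20):
        gb_hour = video_gb_per_hour(720, 486, ntsc_fps, 2, ratio)
        print(f"{ratio}:1  {gb_hour:6.1f} GB/hour  {gb_hour / 60 * 1000:7.1f} MB/minute")
```

The point of the sketch is the shape of the trade-off: storage scales linearly with the compression ratio, so moving from 1:1 to 3:1 cuts the archive to a third.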
10:36
๐
|
turnkit |
but if I want to archive something well... it's a tough call for me |
10:37
๐
|
omf__ |
I am so glad there are other video peeps in this group. |
10:37
๐
|
turnkit |
I'll probably compromise and do 3:1 or digitize 1:1 and then transcode to MP4 or AVC at a high bitrate |
10:37
๐
|
turnkit |
deinterlacing issues can make a mess |
10:37
๐
|
turnkit |
people don't understand it and then end up wanting to deinterlace everything which is a no-no |
10:38
๐
|
omf__ |
yep |
10:39
๐
|
turnkit |
but if you don't have a way to deinterlace on playback, then the footage looks bad |
10:39
๐
|
turnkit |
so what's the answer? |
10:39
๐
|
turnkit |
I think the footage has to be archived in its native format |
10:39
๐
|
godane1 |
i have figured out the video names dates for the g4tv.com videos |
10:39
๐
|
turnkit |
and we have to work on output engines that render based on the device being used -- these days mostly in a progressive format |
10:40
๐
|
godane1 |
its YYMMDD |
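A small sketch of turning such a stamp into a real date, assuming only that the stamp is six digits in YYMMDD order as godane1 describes. The sample value `130125` and the helper name `parse_yymmdd` are made up for illustration; how the stamp is embedded in the actual file names is not shown in the chat.

```python
from datetime import date, datetime


def parse_yymmdd(stamp: str) -> date:
    """Parse a six-digit YYMMDD stamp into a date."""
    if len(stamp) != 6 or not stamp.isdigit():
        raise ValueError(f"expected six digits, got {stamp!r}")
    # %y maps 69-99 to 1969-1999 and 00-68 to 2000-2068.
    return datetime.strptime(stamp, "%y%m%d").date()


if __name__ == "__main__":
    print(parse_yymmdd("130125"))  # hypothetical stamp
```

Because the year has only two digits, older TechTV-era material and recent uploads rely on the `%y` pivot noted in the comment.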
10:40
๐
|
turnkit |
typical american dating style... so weird. :) |
10:41
๐
|
turnkit |
Man - I want a happy meal now |
10:41
๐
|
godane1 |
also for everyone at home keeping count i have 13.1Gb of g4tv.com videos |
10:41
๐
|
turnkit |
that's pretty good |
10:42
๐
|
godane1 |
bad news any file with vc# is going to end in a 404 |
10:43
๐
|
turnkit |
why is that? |
10:44
๐
|
godane1 |
dont know |
10:44
๐
|
turnkit |
bug in the scraper? |
10:44
๐
|
godane1 |
no |
10:44
๐
|
turnkit |
not on the server? |
10:44
๐
|
godane1 |
maybe |
10:47
๐
|
turnkit |
if you strip off the #vc can you manually dl? |
10:47
๐
|
turnkit |
oh, nevermind |
10:47
๐
|
turnkit |
doh |
10:48
๐
|
turnkit |
I read that wrong |
10:48
๐
|
godane1 |
its 404 on website page |
10:48
๐
|
godane1 |
http://www.g4tv.com/videos/1675//moby-interview/ |
10:49
๐
|
godane1 |
plus side is some of the vc videos i think are there but with normal names and different path |
10:54
๐
|
turnkit |
why don't all magazine publishers do this?:http://www.ebay.com/itm/National-Geographic-112-Years-32-CD-Rom-Set-Complete-Magazine-All-Issues-PC-Mac-/271101566054 |
10:54
๐
|
turnkit |
Even looks like the licensed the deal out to whatever company name is on the box... but at least they didn't sit on their assets. |
10:56
๐
|
turnkit |
that moby link just hangs for me... but I think all the g4tv video links are doing that... not sure why |
10:57
๐
|
omf__ |
I agree turnkit. I got fucking stacks of mags I keep for interesting bits but I wish they were digital. Time & Newsweek are two of the worst |
10:58
๐
|
omf__ |
well newsweek is dead in print now |
10:59
๐
|
turnkit |
is that new news? |
10:59
๐
|
turnkit |
Daily Beast ate them, no? |
11:00
๐
|
turnkit |
great quote from a 1994 CD-ROM review in the Seattle Times: "Angus' Law states that in any category, 85 percent is mediocre or less, 10 percent is just fine and 5 percent is excellent. The rash of CD-ROM titles conforms." |
11:01
๐
|
turnkit |
Some of that mediocrity is actually mineable though. |
11:01
๐
|
omf__ |
I still claim Sturgeon's Law |
11:02
๐
|
omf__ |
mainly because once you get exposed to really good content for a while a bunch of things you liked are now crap |
11:13
๐
|
turnkit |
omf__ - sorry I'm not reading well -- I skipped the highlighted line you replied to me in and just read the next as if it was the only thing you wrote |
11:13
๐
|
turnkit |
agreed. |
11:13
๐
|
turnkit |
Somebody has to digitize and license affordably. |
11:14
๐
|
turnkit |
Or just back off and let enthusiasts help the community of knowledge |
11:19
๐
|
omf__ |
I know this is obvious, but it is worth stating. The search capabilities enabled by having things digital is far more useful than most can imagine. |
11:20
๐
|
omf__ |
There are so many forever projects I had that are now possible because of data becoming free and things like internet archive and freebase |
11:21
๐
|
omf__ |
that is why JSTOR is such a cock tease. All online and searchable... just too fucking expensive for the 99% |
12:00
๐
|
Schbirid |
google play downloading dashboard graph porn http://news.ycombinator.com/item?id=5117983 |
13:23
๐
|
joepie91 |
omf__, turnkit, I'd argue for a modified version of Sturgeons Law: 90% of everything starts out as being crap, and this percentage increases over time |
13:29
๐
|
joepie91 |
http://www.youtube.com/watch?v=DvrPsBiZYaY |
13:34
๐
|
ersi |
http://i.imgur.com/oPOidhn.gif |
14:31
๐
|
joepie91 |
how would one mirror an entire site plus 1 level deep of external links? |
14:31
๐
|
joepie91 |
using wget-warc |
14:44
๐
|
alard |
joepie91: Try wget-lua. |
14:47
๐
|
joepie91 |
aside from me not speaking lua, does that do WARCs? |
14:48
๐
|
alard |
Yes, the warc thing is included in Wget 1.14, and wget-lua is based on the current wget git version. I'm making a little example script. |
14:52
๐
|
joepie91 |
alright, thanks\ |
14:53
๐
|
alard |
https://gist.github.com/2c5bbbeec96979c84768 |
14:53
๐
|
alard |
wget-lua --recursive --page-requisites, but no --span-hosts |
14:55
๐
|
alard |
(Didn't try it.) |
14:58
๐
|
ersi |
Yeah, keep a watch on the process by following the output (or output to a log and tail that) |
15:02
๐
|
alard |
The most recent Wget+Lua tar is here: http://warriorhq.archiveteam.org/downloads/wget-lua/ |
15:03
๐
|
alard |
or here: https://raw.github.com/ArchiveTeam/xanga-grab/master/get-wget-lua.sh |
15:11
๐
|
joepie91 |
alard: doesn't that just do page assets though? |
15:11
๐
|
alard |
No, line 14-16 should accept any URL that is referred to from the original host. |
15:12
๐
|
alard |
start_url_parsed is the URL that you give to Wget on the command line. |
15:12
๐
|
alard |
parent is the URL where Wget found the current URL. |
15:12
๐
|
alard |
urlpos is the URL that Wget is considering. |
15:13
๐
|
alard |
So I think it should work, but do check it. |
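The rule alard describes can be sketched in Python. This is not the Lua script from the gist, only an illustration of the decision it is said to make, using the three URLs named above (the start URL, the parent page, and the candidate link). The function name `should_follow` is hypothetical, and the sketch ignores Wget's own accept/reject options.

```python
from urllib.parse import urlsplit


def should_follow(start_url: str, parent_url: str, candidate_url: str) -> bool:
    """Decide whether to fetch a link: whole site plus one external hop."""
    start_host = urlsplit(start_url).hostname
    parent_host = urlsplit(parent_url).hostname
    candidate_host = urlsplit(candidate_url).hostname

    if candidate_host == start_host:
        return True  # still on the site being mirrored
    # External link: accept only if it was found on a page of the start host,
    # so links discovered on external pages are not followed any further.
    return parent_host == start_host


if __name__ == "__main__":
    start = "http://home.hccnet.nl/t.amerongen/"
    print(should_follow(start, start, "http://example.org/page"))
    print(should_follow(start, "http://example.org/page", "http://example.net/"))
```

Under this rule a link found on an external page is only followed when it points back to the start host, which is what keeps the crawl from spreading indefinitely.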
15:32
๐
|
joepie91 |
it seems to just be doing infinite recursion now |
15:32
๐
|
joepie91 |
downloading half the internet |
15:33
๐
|
joepie91 |
cc alard |
15:33
๐
|
alard |
Oh. |
15:34
๐
|
alard |
What are your Wget options? |
15:34
๐
|
joepie91 |
./wget-lua -t 2 --lua-script=../luascripts/externaldl.lua -e robots=off --wait 0.25 http://home.hccnet.nl/t.amerongen/ --mirror --warc-file=at-amerongen |
15:35
๐
|
joepie91 |
actually I'm not sure |
15:35
๐
|
joepie91 |
this guy may just have a lot of outbound links |
15:35
๐
|
joepie91 |
let me watch it for a bit longer |
15:36
๐
|
alard |
In the meantime, could you send an update of the seesaw-kit to pypi? |
15:36
๐
|
joepie91 |
hm, have you updated the setup.py with the new version? |
15:37
๐
|
joepie91 |
git doesn't indicate any changes to setup.py |
15:37
๐
|
joepie91 |
oh right |
15:37
๐
|
joepie91 |
that's defined elsewhere |
15:38
๐
|
joepie91 |
never mind |
15:38
๐
|
alard |
Ah yes, I had to look it up, but the version isn't in setup.py. |
15:39
๐
|
ersi |
ooh, it's tagged and all ^_^ |
15:40
๐
|
joepie91 |
there we go, updated |
15:40
๐
|
joepie91 |
pip install --upgrade seesaw-kit |
15:40
๐
|
joepie91 |
to update |
15:40
๐
|
joepie91 |
Connecting to www.imageshack.us|208.94.0.38|:80... failed: No route to host. |
15:40
๐
|
joepie91 |
derp? |
15:40
๐
|
ersi |
derp indeed |
15:41
๐
|
alard |
Thanks. The Xanga script needs the latest version. |
15:41
๐
|
joepie91 |
oh god, they're probably on cogent |
15:41
๐
|
joepie91 |
cogent has basically nulled all routing with voxility afaik |
15:41
๐
|
joepie91 |
it's a bit annoying |
15:41
๐
|
ersi |
cogent for the lose |
15:41
๐
|
joepie91 |
they were complaining about "too much abuse" |
15:41
๐
|
joepie91 |
but ironic given how much crap comes off cogent itself |
15:41
๐
|
joepie91 |
bit * |
15:42
๐
|
joepie91 |
but okay |
16:51
๐
|
alard |
I wrote a little bit of documentation for the Wget+Lua callbacks: https://github.com/alard/wget-lua/wiki/Wget-with-Lua-hooks |
17:01
๐
|
ersi |
nice |
17:24
๐
|
kennethr- |
this video is pretty epic http://www.youtube.com/watch?feature=player_embedded&v=WaPni5O2YyI |
17:33
๐
|
ersi |
Indeed |
17:50
๐
|
omf__ |
What do people use for backups now days? I used cdr, then dvdr, and now bluray but even that is still a pain |
17:50
๐
|
omf__ |
I found out in my company's backups that some disks are bad due to a shitty drive that reported burns as good |
17:51
๐
|
omf__ |
I thought about going raid hard drives on a large scale but after the thailand floods and the cluster fuck around hard drives that does not seem as feasible |
17:51
๐
|
omf__ |
unless I waste the money on "enterprise" drives |
17:52
๐
|
omf__ |
any archivers got suggestions? |
17:53
๐
|
ersi |
Most important is to have off-site copies |
17:53
๐
|
ersi |
I currently don't have any backups >_> |
17:53
๐
|
omf__ |
I know that, I am asking about formats |
17:54
๐
|
omf__ |
I had a terrible crash in 96 in which I spent a month with norton disk editor recovering everything. I keep backups but I found some backups failed |
17:54
๐
|
omf__ |
I do social media data mining |
17:54
๐
|
omf__ |
129 of the 400 backup disks have appeared to fail |
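One way to catch a drive that reports bad burns as good is to read each copy back and compare checksums against the source. The following is a minimal sketch under that assumption, not a description of omf__'s setup; the function names and the use of SHA-256 are illustrative choices.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest: dict[str, str], copy_root: Path) -> list[str]:
    """Return the relative paths that are missing or differ in the copy."""
    bad = []
    for rel, expected in manifest.items():
        target = copy_root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

Building the manifest from the source before burning and running `verify_copy` against the mounted disc afterwards would flag a failed burn immediately rather than years later.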
17:56
๐
|
illunatic |
https://www.techdirt.com/articles/20130125/07585121787/german-court-recognizes-that-internet-connection-is-now-indispensable-modern-life.shtml heh |
17:56
๐
|
omf__ |
some of that data is proving unrecoverable and since I measure collection time in years this is a shit bag problem. |
17:56
๐
|
omf__ |
illunatic, one country down, everyone else left to go |
17:57
๐
|
Schbirid |
omf__: harddisks are cheap ;) |
17:57
๐
|
Schbirid |
3tb for 100€, cant beat that with discs |
18:00
๐
|
omf__ |
current backup size 9.8 terabytes |
18:01
๐
|
omf__ |
and bluray disks are way cheaper, even the quality ones |
18:01
๐
|
omf__ |
hard drives it might be |
18:01
๐
|
omf__ |
I considered tape as well |
18:01
๐
|
Smiley |
blu ray isnt old enough to trust for archiving |
18:02
๐
|
omf__ |
They said the same about dvd and 10 years later my first burned dvds are still rock solid |
18:02
๐
|
omf__ |
shit it is more than 10 |
18:02
๐
|
omf__ |
more like 15 now |
18:02
๐
|
omf__ |
I am not afraid of media |
18:03
๐
|
omf__ |
Schbirid, any specific drives you recommend |
18:03
๐
|
Smiley |
lucky you |
18:03
๐
|
omf__ |
I tested a ton of shit before I commited |
18:03
๐
|
Schbirid |
omf__: any, but 3 redundant or so |
18:03
๐
|
omf__ |
I tested every bluray media on the market but was stupid to trust the drive |
18:03
๐
|
omf__ |
I normally get seagates with no problems. |
18:04
๐
|
omf__ |
Does anyone else test their drives? |
18:04
๐
|
Smiley |
optical drives or real drives? |
18:04
๐
|
omf__ |
I do a 35 pass dban to flex the drive out and then a multipass badblocks to test every single sector |
18:05
๐
|
omf__ |
hard disk drives |
18:05
๐
|
Schbirid |
nope, they might fail just because the day felt like an opportunity, so that is worthless imo |
18:06
๐
|
Smiley |
its a good start... |
18:06
๐
|
Smiley |
but what Schbirid said |
18:06
๐
|
Smiley |
redundancy is key. |
18:07
๐
|
omf__ |
that and have proper hard drive cooling. I have found that extends drive life quite a bit |
18:07
๐
|
db48x |
hmm |
18:07
๐
|
db48x |
that report google put out says otherwise |
18:07
๐
|
omf__ |
but this backup would only be on to receive updates and then be turned off |
18:07
๐
|
db48x |
drive lifetime is weakly anticorrelated with temperature |
18:08
๐
|
db48x |
also, most drives that die will do so quickly, so an stress-test is a very good idea |
18:08
๐
|
db48x |
make sure you know about the bad ones as soon as possible, then rma them |
18:08
๐
|
Schbirid |
interesting! |
18:08
๐
|
db48x |
buy a few more than you need to set up your raid, rma the ones that fail the test, then use their replacements as spares for the raid array |
18:08
๐
|
db48x |
yea, the temperature thing was surprising |
18:09
๐
|
db48x |
there wasn't enough information for them to determine a cause, but the speculation is that the increased vibration from the fans offsets the benefit of improved ventilation |
18:10
๐
|
Smiley |
fans..... in a DC? |
18:11
๐
|
Smiley |
guess it depends what kind of chassis |
18:12
๐
|
Smiley |
now i think about it dell have a lot of fans near the hdd bays |
18:12
๐
|
omf__ |
you mean the 2007 report? |
18:12
๐
|
db48x |
yea |
18:12
๐
|
db48x |
yes, 2007 |
18:12
๐
|
Smiley |
though google dont have cases? |
18:13
๐
|
omf__ |
That shit is already out of date. Remember it was on older drives they ran and they held back all the really good bits |
18:13
๐
|
omf__ |
like which brands, models, lifetime ages matched up |
18:13
๐
|
omf__ |
I do not even have a drive that old |
18:13
๐
|
omf__ |
plus is it repeatable |
18:14
๐
|
omf__ |
no one else is really opening the door for us to take a peek and see |
18:14
๐
|
Smiley |
samsung drives have failed lots for us.... |
18:15
๐
|
omf__ |
you would think with an industry this old there would be more information |
18:15
๐
|
db48x |
omf__: agreed |
18:15
๐
|
Smiley |
such as? |
18:15
๐
|
omf__ |
Dell, HP, Apple, google, facebook, etc.. all keep their yaps shut because it helps their bottom line |
18:16
๐
|
Smiley |
if you prove the unreliability of one company its a death sentence |
18:16
๐
|
omf__ |
like Packard Bell |
18:16
๐
|
Smiley |
dell rebrand drives |
18:17
๐
|
omf__ |
the upside to the whole hard disk industry is things get bigger in size and faster. |
18:18
๐
|
db48x |
BER stays the same though |
18:18
๐
|
omf__ |
just imagine showing a SSD drive to people in the 80s |
18:18
๐
|
db48x |
heh |
18:18
๐
|
omf__ |
yeah the error rate on the drives is not going down |
18:18
๐
|
omf__ |
there is a good recent report on that |
18:19
๐
|
db48x |
I think we just have to live with that, and move to more reliable software |
18:19
๐
|
omf__ |
which is why I agree redundancy is so freaking important |
18:19
๐
|
db48x |
ZFS is the way to go, I think |
18:19
๐
|
omf__ |
zfs is already old, there is some interesting new stuff coming out of netapp |
18:19
๐
|
db48x |
even the linux support is shaping up, thanks to LLNL |
18:20
๐
|
omf__ |
if only Oracle was not Oracle |
18:20
๐
|
db48x |
indeed |
18:22
๐
|
omf__ |
I just want to stop losing my freaking data |
18:22
๐
|
db48x |
:) |
18:22
๐
|
illunatic |
omf__: yep :D |
18:22
๐
|
omf__ |
if it wasn't for the twitter tos I would upload all the twitter data I collected to IA |
18:23
๐
|
omf__ |
which is funny since the IA doesn't really have a backup either |
18:23
๐
|
db48x |
you could upload a very large file containing random numbers, and then a very small file containing a random number... |
18:23
๐
|
db48x |
IA has mirrors |
18:24
๐
|
omf__ |
full mirrors? I heard most were partial at best |
18:24
๐
|
db48x |
I know the one in Alexandria is partial |
18:39
๐
|
joepie91 |
omf__: upload a giant file with random data, search for occurrences of every single tweet in it, and store the positions |
18:40
๐
|
joepie91 |
*technically* you're not reproducing the tweets |
18:40
๐
|
joepie91 |
just indicating where coincidental copies can be found :) |
18:40
๐
|
joepie91 |
(oh man, a judge would probably have a field day with this, haha) |
18:40
๐
|
Smiley |
random data? why? just a ascii-16 list of charas ;) |
18:41
๐
|
joepie91 |
Smiley: I never specified the source alphabet :P |
18:42
๐
|
Smiley |
joepie91: include ALL THE ALPHABETS |
18:42
๐
|
joepie91 |
on a different note |
18:42
๐
|
joepie91 |
my insertion of the khan academy datasets into a database is going okay |
18:42
๐
|
joepie91 |
now that I have a sane mysql lib to work with |
18:47
๐
|
balrog_ |
mistym: that thing arrived |
18:53
๐
|
mistym |
balrog_: Awesome! |
19:20
๐
|
SketchCow |
Whew |
21:00
๐
|
illunatic |
nice discussion going on here if anyone is interested https://kat.ph/blog/GreenPirate/ |
21:04
๐
|
illunatic |
spiderwort |
22:19
๐
|
godane1 |
SketchCow: you may have to mirror the g4tv.com videos too |
22:20
๐
|
godane1 |
my internet wifi sucks |
22:57
๐
|
omf__ |
My memory is slipping. Anyone remember the page for SketchCow's universal file format collection thing. I want to read all the material over before emailing him a question |
23:01
๐
|
DFJustin |
http://fileformats.archiveteam.org/ |
23:02
๐
|
SketchCow |
Boy, I love the idea someone has to do research before asking me something. |
23:02
๐
|
SketchCow |
That ensures a quick, speedy reply. |
23:10
๐
|
godane1 |
i have a list of the video files |
23:11
๐
|
godane1 |
https://archive.org/details/g4tv.com-video-url-list-1 |
23:12
๐
|
godane1 |
wait thats a old list |
23:20
๐
|
godane1 |
i updated that item |