#archiveteam 2013-05-26,Sun


Time Nickname Message
00:07 🔗 ivan` is anyone backing up coursera? someone I know just asked for 'introduction to sociology' which got removed like many courses
00:08 🔗 ivan` well, not removed, but archive hidden
00:15 🔗 balrog why hidden?
00:15 🔗 ivan` the course provider decides whether the archive stays up after the class ends
01:03 🔗 hneio sounds like it could be an ip issue
01:04 🔗 hneio if the course providers explicitly don't want it up
03:42 🔗 SketchCow HEY UNDERSCOR
03:42 🔗 SketchCow I just like poking him on principle.
03:43 🔗 underscor >:c
03:43 🔗 underscor Aw, fuck
03:43 🔗 underscor Everything is filled up and turbofucked
03:44 🔗 underscor I hate it when that happens
03:44 🔗 SketchCow Wow, I've had formspring uploading for a solid day +
03:51 🔗 underscor I'm tempted to pause tracker jobs while I pick up this mess
03:52 🔗 underscor Or at least temporarily shift uploads to fos
03:52 🔗 underscor Because these machines need to be manually drained/maneuvered onto other disks
03:53 🔗 SketchCow FOS can handle a little more stuff incoming
03:53 🔗 SketchCow But don't throw the whole shebang at it.
03:54 🔗 underscor K
03:54 🔗 underscor I'll add fos back in to the rotation and just set up rate limits on my boxes
04:03 🔗 SketchCow I see stuff coming in.
04:31 🔗 SketchCow Poor machine, it had so much formspring to deal with
04:34 🔗 SketchCow I came across your uploads for TOSEC on the Internet Archive while just screwing around one day, and I'm very confused about what's stored in this archive. It says images, but does that mean ROM images, or does it mean something along the line of JPEG images? Also, I've noticed that the Internet Archive's archive of TOSEC material is much, much smaller than is on the official TOSEC webpage. Are you planning on updating the collection constantly, or ar
04:34 🔗 SketchCow And, I also came across your Archiveteam upload of the website Friendster. I noticed that you were only able to grab about 20% of the accounts on this site before it was shuttered. Have you considered taking snapshots or entire grabs of other social networks before they are shuttered as a premature safety measure?
04:34 🔗 SketchCow Thanks a million for your time!
04:34 🔗 SketchCow ..
04:34 🔗 SketchCow See this, this is the opposite of a helpful letter.
04:34 🔗 SketchCow This is not how you write a letter to me.
04:34 🔗 SketchCow "Explain to me tons of stuff I could suss out for myself, while also, I think you missed a bunch of stuff but I don't feel like giving examples. Big fan!"
04:36 🔗 BlueMax These dipshits tend to annoy me too
04:36 🔗 SketchCow You never get mail like this.
04:36 🔗 SketchCow http://www.youtube.com/watch?v=L8onlB0F1_A
04:37 🔗 BlueMax I run a YouTube gaming channel and people leave comments asking questions that they could find their own answer to with 5 minutes in Google
04:40 🔗 BlueMax It tends to piss me off, especially this one guy who left 20 comments on 20 videos over one month saying "are ps2 games on vita"
05:59 🔗 Nemo_bis An "archive your flickr" tool would be hugely popular these days http://www.flickr.com/help/forum/en-us/72157633650721234/72157633651235708/
05:59 🔗 underscor We have tools for that
06:00 🔗 underscor I should polish them up sometime soon
06:05 🔗 Nemo_bis underscor: also for Windows users?
06:05 🔗 underscor eww, windows users
06:05 🔗 * Nemo_bis hides
06:06 🔗 underscor Uh, no, they're in ruby and/or bash, iirc
06:06 🔗 underscor I should port them to python
06:06 🔗 Nemo_bis well, think of linux users first :p
06:06 🔗 underscor I mean
06:06 🔗 underscor ideally
06:06 🔗 Nemo_bis :)
06:06 🔗 Nemo_bis in a few days, people will be too angry to bother about archiving
06:06 🔗 underscor someone could set up a box that is like "gimme your username, and I'll spit this massive fucking zip at you"
06:07 🔗 Nemo_bis yep
06:07 🔗 Nemo_bis like some of our past trackers where you could allow usernames
06:07 🔗 Nemo_bis *add
06:07 🔗 underscor http://pi.pe/ this is a neat thing
06:13 🔗 Nemo_bis Ah. And what happens if destination is Google Drive? Let's try.
06:13 🔗 Nemo_bis Also, everyone update http://archiveteam.org/index.php?title=Flickr please
06:15 🔗 Nemo_bis Does it *really* manage to archive Facebook streams? Those are really horrible
06:26 🔗 Nemo_bis doesn't seem to do anything
09:01 🔗 DFJustin SketchCow: re: examples of stuff that you're missing, you don't have most of the tosec iso sets like IBM http://www.pleasuredome.org.uk/details.php?id=137667e22983694b2f81e8eb141b9ecc38f9741f http://www.pleasuredome.org.uk/details.php?id=03d445619790d251655e69427282aa9fab22a243
09:01 🔗 DFJustin sega saturn http://www.pleasuredome.org.uk/details.php?id=6d4119213ca6d49c7467c5c0857bf17247b152e5
09:01 🔗 DFJustin etc. just search for tosec iso
09:01 🔗 DFJustin I assume that's what this fellow was referring to
09:04 🔗 BlueMax yeah SketchCow doesn't like Pleasuredome DFJustin
09:07 🔗 SketchCow And that's fine. No ISO yet.
09:07 🔗 SketchCow But what about the regular rom sets.
09:07 🔗 SketchCow I'm STILL making that thing presentable into a 1.0
09:08 🔗 SketchCow I wonder what rom sets I'm missing.
09:08 🔗 SketchCow I
09:09 🔗 SketchCow I've got a script I'm running right now, which can take a pile of a directory and make it into a fully formed collection
09:09 🔗 SketchCow I'm running it on the TOSEC-PIX I download
09:09 🔗 SketchCow THAT is taking forever too.
09:09 🔗 SketchCow But I'm jamming dozens of magazines, manuals and newsletters up every minute.
09:15 🔗 DFJustin I don't think there's much missing in the way of rom sets other than the last year's worth of updates
09:15 🔗 BlueMax ^
09:16 🔗 BlueMax I would check the TOSEC website for you but it's erroring out on me
09:16 🔗 DFJustin pleasuredome is not that hard to work with, if you leave that tosec-pix thing going for a while on a box with a fat pipe you'll have loads of credit in no time
09:19 🔗 BlueMax I wish I had a fat pipe to up my own credit on it.
09:20 🔗 BlueMax Main reason I love Underground Gamer is that a $25 donation makes you immune to being shitlisted / banned automatically
09:21 🔗 SketchCow There we go - full derp on the uploads
09:21 🔗 ersi Wack it up, full-derp!
09:21 🔗 SketchCow DURPMAX
09:21 🔗 ersi DerpMax!
09:22 🔗 DFJustin oh found a typo, https://archive.org/details/Front_Fareast_Magic_Drive_TOSEC_2012_04_23 "Farest" -> "Fareast"
09:25 🔗 BlueMax full derp?
09:28 🔗 SketchCow Far East
09:30 🔗 SketchCow Ha, I'm obliterating the incoming queue
09:30 🔗 SketchCow I should go sleep for a while
09:32 🔗 BlueMax gnight...also if you could seed that Picpx torrent I sent you that'd be nice :P
09:33 🔗 DFJustin oh there are some missing rom sets, will e-mail details tomorrow
09:33 🔗 SketchCow Good.
09:34 🔗 SketchCow I was going to do that search down the line, but I figured I'd ask.
09:34 🔗 SketchCow Like, just tracking down decent descriptions of systems and putting those into the entry and adding a photo, that's been eating days
09:34 🔗 SketchCow So many of them
09:34 🔗 SketchCow For updates, of course, I'll just transfer the photos.
09:34 🔗 SketchCow and the descs.
09:37 🔗 BlueMax Is there anything I can help with related to the TOSEC sets?
12:24 🔗 Marcelo Did Yahoo buy Tumblr just to close it later? :p
13:03 🔗 Marcelo The command line output in "Current project" (AT Warrior) is so... dead. Why not show detailed information?
13:05 🔗 Cameron_D http://www.reddit.com/r/technology/comments/1f1m9x/googles_schmidt_teens_mistakes_will_never_go_away/ca61ahq
13:07 🔗 Marcelo "There's a group called "Archive Team" who hates that archive.org obeys robots.txt, and will be downloading all of reddit and making it searchable so that your stupid posts here will be forever available."
13:10 🔗 omf_ Are we supposed to give a fuck what some redditor thinks? I have been archiving reddit for over a year and see no reason to stop
13:11 🔗 Cameron_D no, I found the comment amusing
13:11 🔗 Cameron_D http://www.reddit.com/r/privacy/comments/1emh4r/urgent_delete_any_old_reddit_posts_you_dont_want/
13:11 🔗 Cameron_D and hey, top comment is someone sensible
13:14 🔗 Marcelo "Disallow: /my_shiny_metal_ass" Reddit doesn't want its shiny metal ass to be crawled.
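The robots.txt rule quoted above can be checked with Python's standard library. This is a minimal sketch: only the `Disallow` line comes from the log, while the `User-agent: *` line and the `is_allowed` helper are assumptions added for illustration, and reddit's real robots.txt contains more rules than this.

```python
from urllib.robotparser import RobotFileParser

# Only the Disallow line is quoted in the log; the User-agent line is assumed.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /my_shiny_metal_ass",
]


def is_allowed(url, user_agent="*"):
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)  # parse in-memory lines, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/my_shiny_metal_ass", "/r/technology"):
        print(path, is_allowed("http://www.reddit.com" + path))
```

A crawler that honours robots.txt would skip any URL for which this check returns False.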
13:15 🔗 omf_ Wow what fucking morons. Let me tell you a short story. When I was growing up we were reminded not to break the law etc.. because of school records and criminal records etc.. Things you did have consequences over the course of the rest of your life.
13:19 🔗 omf_ Now after posting something to the internet for everyone to see they are crying foul and want it erased? Too fucking bad, you were stupid enough to embarrass yourself online live with it
13:45 🔗 omf_ I like watching uninformed people flip their shit over archiving.
13:46 🔗 omf_ And how they insult Archive Team as a whole.
14:17 🔗 ersi Redditors can kiss my ass
14:21 🔗 * Smiley wonders how offtopic this might go.
14:42 🔗 Tomcat_ "subcontractors" lol
14:42 🔗 Tomcat_ Now where did I keep this contract again?
16:16 🔗 SketchCow Morning.
16:16 🔗 SketchCow omf, how goes the warc gallery
17:40 🔗 omf_ Couple quick questions.
17:40 🔗 omf_ If a warc has a lot of images do you want it broken down into smaller collages
17:41 🔗 DFJustin re: earlier discussion, it's easy to say it's your own fault as a teenager for posting dumb shit online, but the thing is there is a lot of science showing that the "responsible" parts of the brain don't fully develop until later
17:42 🔗 DFJustin and that's one of the reasons most countries don't give youths long-term punishments for crimes
17:43 🔗 DFJustin that said I have no idea what can be done to stop it now that we have an internet
17:44 🔗 SketchCow I think it's a side effect of how fast the technology fell on society.
17:44 🔗 SketchCow I was arrested for shoplifting once, got community service. Got a mugshot, too.
17:44 🔗 SketchCow None of that's online, etc.
17:44 🔗 omf_ DFJustin, You hit the nail on the head. The internet is changing things and people need to accept that instead of ignore or deny it
17:45 🔗 SketchCow I'd probably be sad if my little sad sack face was on the net.
17:45 🔗 SketchCow I have turned the process of adding a directory of the same item (one magazine run, one set of console manuals) into one command followed by one keypress.
17:45 🔗 DFJustin basically social attitudes / behaviour of hr interviewers / etc. are going to have to change
17:45 🔗 SketchCow Can't get faster than that.
17:49 🔗 omf_ One keypress - http://stream1.gifsoup.com/view4/2239283/homer-bird-o.gif
17:52 🔗 SketchCow Pretty much.
17:52 🔗 SketchCow Basically, I made the default "make a collection for this set of items", but I needed a way to say "nah, don't do that"
18:28 🔗 zenguy_pc hi, is there a script to save reddit threads to pdf or epub ?
18:28 🔗 zenguy_pc maybe even mht
18:48 🔗 SketchCow Not that I know of.
18:48 🔗 SketchCow There should be.
18:48 🔗 SketchCow I've got three separate windows slamming newsletters, magazines and manuals into the archive.
18:48 🔗 SketchCow New one being added every second. Every. Second.
18:54 🔗 omf_ SketchCow, building whole libraries 3 items a second
18:54 🔗 omf_ fuck academia
19:00 🔗 SketchCow Obviously, it takes longer than it should to derive the resulting items, and I do have to go back and link the collections.
19:00 🔗 omf_ WiK, how many TB are you up to now?
19:00 🔗 SketchCow But the magazines are going in very fast now, three windows.
19:02 🔗 SketchCow It's very hard to tell my uploading.
19:02 🔗 SketchCow But bear in mind, these magazines and newsletters are pretty small PDFs.
19:02 🔗 SketchCow Like, anywhere from 500k to 20mb
19:02 🔗 SketchCow So in total, this TOSEC-PIX I'm incorporating is maybe 375gb
19:06 🔗 SketchCow I forget the meta manager way to say "what have I added today"
19:06 🔗 SketchCow I'm sure it's something.
19:07 🔗 SketchCow I'm keeping the three windows very busy, no downtime.
19:08 🔗 SketchCow Amiga Joker Magazine (German), Computer News 80 (TRS-80 Magazine) and One for ST Games (Atari ST Magazine)
19:28 🔗 flaushy awesome to see names that i remember reading (amiga joker)
19:29 🔗 zenguy_pc i have a vps.. with 50-70GB free .. anything i could run on it?
19:29 🔗 underscor SketchCow: publidate 20130526*
19:29 🔗 underscor publicdate*
19:29 🔗 underscor you'll have to turn on the field
19:29 🔗 zenguy_pc i don't need anything personally but i'd figure i'd try to help for a short while
19:30 🔗 SketchCow underscor: I can tack it onto the end manually
19:32 🔗 omf_ zenguy_pc, which version of Linux is on it
19:34 🔗 SketchCow underscor: Actually, it's publicdate, and the form is in 2013-05-26*
19:34 🔗 SketchCow But other than every single aspect of the help being wrong, thanks
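The query form settled on above (`publicdate` with a `2013-05-26*` value) can be wrapped in a small helper. This is a sketch under assumptions: it targets archive.org's public advanced search endpoint rather than the meta manager mentioned in the log, the `build_search_url` helper and its parameters are hypothetical, and it only builds the URL without fetching anything.

```python
from urllib.parse import urlencode

# Assumption: the public advanced search endpoint, not the meta manager.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(day, rows=50):
    """Build a search URL for items whose publicdate starts with `day`.

    `day` uses the YYYY-MM-DD form from the log; the trailing * is the
    wildcard that matches any time on that day.
    """
    params = [
        ("q", f"publicdate:{day}*"),
        ("fl[]", "identifier"),  # return only item identifiers
        ("rows", rows),
        ("output", "json"),
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("2013-05-26"))
```

Narrowing the results to one uploader would need an extra clause in the `q` value, which is left out here.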
19:34 🔗 * SketchCow gatorade dump
19:34 🔗 SketchCow Anyway. I've added 1,498 discrete texts to archive.org today.
19:35 🔗 SketchCow 149 yesterday.
19:36 🔗 SketchCow 396 day before that, 1,306 day before that, 959 day before that, 343 day before that.
19:36 🔗 SketchCow So a good busy week.
19:37 🔗 godane starting to look like in the missing ids for 10000s area only the man show clips are alive
19:47 🔗 zenguy_pc debian 7.0
19:47 🔗 zenguy_pc sorry for the late response
19:52 🔗 omf_ zenguy_pc, well the baseline perl looks new enough that it shouldn't be too complicated to get the screenshot application on there. I am taking screens of the front of posterous blogs
19:53 🔗 omf_ debian 6 is just way too out of date
19:57 🔗 zenguy_pc i just reinstalled debian since my upgrade last week failed.. i successfully upgraded it on the second try and I shut it down until i could follow some guides to secure it
19:57 🔗 zenguy_pc what will i need to run? .
20:04 🔗 joepie91 <ivan`>is anyone backing up coursera? someone I know just asked for 'introduction to sociology' which got removed like many courses
20:04 🔗 joepie91 I have an old-ish (few months) dump of their course metadqta
20:04 🔗 joepie91 metadata *
20:04 🔗 joepie91 not sure to what extent courses are hidden
20:04 🔗 joepie91 because it really only is the metadata
21:04 🔗 SketchCow DFJustin: Guy was referring to my not referencing the newest datasets
21:04 🔗 DFJustin k
22:04 🔗 WiK well, ive got a few hundred gb left on this last drive, once thats full ill be at 10tb
22:06 🔗 WiK omf_: 10tb in a day or two
22:07 🔗 omf_ nice
22:08 🔗 WiK then i think ill have to stop downloading for awhile
22:08 🔗 WiK ill be out of drive space, and ill need to finish sorting/grepping all the results for my defcon talk, ill find out mid next month if its accepted or not
22:13 🔗 godane we have some clips of action blast
22:23 🔗 godane later tonight i may be showing one of Jason Scott's talks to my brother
22:48 🔗 ivan` heh, got two opml files out of that reddit post
22:54 🔗 ivan` I don't think anyone there realizes how Reader works
22:55 🔗 zenguy_pc what reddit post?
22:56 🔗 ivan` <Cameron_D> http://www.reddit.com/r/privacy/comments/1emh4r/urgent_delete_any_old_reddit_posts_you_dont_want/
22:56 🔗 zenguy_pc reddit rss?
22:56 🔗 zenguy_pc oh i saw that earlier
22:57 🔗 zenguy_pc i don't mind if my account is archived.. i just have to be extra careful of linking it to my offline identity
22:58 🔗 zenguy_pc some people are more paranoid creating accounts weekly .. i don't have much need for that
22:58 🔗 zenguy_pc i just vote, save and comment intermittently..
23:28 🔗 wp494 it's astonishing how people have any expectation of privacy in a public forum
23:28 🔗 wp494 don't want it public? don't say it in public
23:30 🔗 dashcloud the problem is for many years there was privacy by obscurity- sure, the info was public, but you weren't going to know about it unless someone told you or you already knew
23:31 🔗 dashcloud now that it's so easy to find things and connect them, a lot of that is going away
