#archiveteam-bs 2014-03-05,Wed

Time Nickname Message
00:04 🔗 nico ivan`: the whole harddrive ?
00:04 🔗 nico i grep just on the swap
00:42 🔗 dashcloud this movie on kickstarter about the theft of a stuxnet-like cyber weapon looked interesting, so I backed it: https://www.kickstarter.com/projects/1747096622/crow-hill-the-feature-film
02:22 🔗 Nonelse http://www.retrothing.com/2009/04/10-year-old-sony-comic-book-phone-already-forgotten.html
03:45 🔗 godane hey SketchCow
03:46 🔗 godane i think i know of a way around the semi-admin problem in computerandtechvideos
03:46 🔗 godane i think the way around that is put rev3 stuff in a revision3 collection
03:47 🔗 godane and the twit stuff in to a twit collection
03:47 🔗 godane then i can have full admin of those collections
03:47 🔗 godane and subcollection
04:28 🔗 SketchCow It'll head towards that way, yes.
04:46 🔗 godane just know This week in startups is NOT a Revision 3 show
04:50 🔗 godane its most likely to go that way just so we can get the computerandtechvideos neater
04:50 🔗 godane that way there isn't 87 collections in it
04:51 🔗 godane there will just be something close to 20 or 30
04:55 🔗 godane anyways i'm going to upload Engadget Distro pdfs
08:35 🔗 SketchCow An enormous amount of items dumping into the archive from my various attached hard drives. :)
15:16 🔗 SadDM Random Blogspot blogger: "Derp de durr... I made a PDF that I want to share with my audience. Oh wow, look how easy it is to host it on Dropbox!"
15:16 🔗 SadDM SadDM (clicking link 6 months later): "Noooooooo!"
15:18 🔗 ersi More like Derpbox
15:18 🔗 ersi Or Burpbox
15:19 🔗 SadDM I mean, I get it. All of a sudden folks are able to host files... it's miraculous to them.
15:19 🔗 SadDM But man, talk about transient.
15:26 🔗 ersi Yeah
15:35 🔗 midas ah, derpbox, it's like devnull-as-a-service.com
15:41 🔗 SadDM ugh, I'm also going to put google docs in the same boat... I haven't been burned yet, but I brace myself every time I click a link.
15:42 🔗 DFJustin rapidshare etc. etc. are worse than either of those
15:44 🔗 DFJustin google doesn't actively expire files after x months
15:45 🔗 DFJustin although none of them are waybackable (annoyingly, dropbox used to be)
15:48 🔗 SadDM right. I forgot all about those (though they seem to come up less frequently in the parts of the web that I trawl through).
15:49 🔗 www2 this is the current robots.txt of dropbox http://pastebin.com/jdA0xKcc
16:11 🔗 DFJustin a lot of the MAME scene folks seem to have latched onto sendspace which deletes everything after just a couple months
16:15 🔗 DFJustin want the photos of this rare arcade game from a 3 month old post? too bad! http://www.mameworld.info/ubbthreads/showflat.php?Cat=&Number=318306&page=4&view=expanded&sb=5&o=&fpart=1&vc=1&new=#Post318306
16:16 🔗 SadDM oh... that's downright criminal :-P
16:16 🔗 SadDM just this weekend I did a grab of a forum where there were a TON of images hosted on photobucket
16:17 🔗 SadDM I actually grepped through the warc.gz to find them and download them too
16:17 🔗 SadDM then I rolled both of the warcs together
16:18 🔗 SadDM It was kind of a pain, but given that the photos were the focus of the threads they were in, I figured it was worth it.
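[Editor's sketch] The "grep through the warc.gz" step SadDM describes could look roughly like the Python below. This is not SadDM's actual command: the function names, the URL regex, and the `photobucket.com` default are assumptions made for illustration. It treats the archive as plain gzip data and scans it line by line, so it only finds URLs that appear literally in the captured pages.

```python
import gzip
import io
import re


def find_hosted_urls(stream, host="photobucket.com"):
    """Return sorted unique URLs on `host` (or its subdomains) found in a binary stream."""
    # Hypothetical pattern: scheme, optional subdomains, the host, then a path
    # that runs until whitespace, a quote, an angle bracket, or a closing paren.
    pattern = re.compile(
        rb"https?://(?:[A-Za-z0-9-]+\.)*"
        + re.escape(host.encode("ascii"))
        + rb"/[^\s\"'<>)]+"
    )
    found = set()
    for line in stream:
        for match in pattern.finditer(line):
            found.add(match.group().decode("ascii", "replace"))
    return sorted(found)


def urls_in_warc_gz(path, host="photobucket.com"):
    """Scan a .warc.gz file; the gzip module reads concatenated members transparently."""
    with gzip.open(path, "rb") as stream:
        return find_hosted_urls(stream, host)
```

The resulting list can be handed to whatever downloader produced the first WARC, so the second grab ends up in the same format before the two are combined.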
16:44 🔗 Coderjoe dropbox doesn't actively delete files after x months, either. The user did that because they ran out of space and made the poor decision that those files were no longer important
16:46 🔗 SadDM Yeah, and that almost makes it worse. Clearly it was important enough to share with the world at one point, but now the new episode of Dancing With the Stars takes priority.
20:34 🔗 SadDM Sorry about all of that crap I just pasted into #archiveteam guys. Mental note... don't keep a huge, full paste buffer :-(
20:35 🔗 Baljem I'm sure you can rejoin (and apologise!) there, if your client is quite finished pasting!
20:36 🔗 Baljem (also, uh, perhaps use a client with accidental paste detection... it's saved my bacon a couple of times)
20:36 🔗 midas SadDM: all is well, join again :p
20:36 🔗 SadDM Yeah, I thought irssi had paste protection... maybe it's not set up out of the box
20:38 🔗 yipdw that looked like it was getting pretty steamy
20:38 🔗 ivan` fwiw, disconnecting your IRC client after spamming will stop the spam
20:39 🔗 ivan` a proper IRC client not running in your terminal like hexchat will also let you paste newlines into the input box without sending them
20:40 🔗 Schbirid_ SadDM: at least it was not some more graphic fanfic ;)
20:40 🔗 Baljem SadDM: ah, I'm on irssi too - so it does have it, just needs configuring I guess!
20:42 🔗 SadDM it's weird, I accidentally tap the right button in a putty window... it immediately fills up with the story, and then irssi says: "Pasting 7 lines to #archiveteam. Press Ctrl-K if you wish to do this or Ctrl-C to cancel."
20:42 🔗 SadDM thanks irssi... what about the other 30 lines I didn't mean to paste?
20:42 🔗 yipdw so I wanted to know if there was more about Kran and Jendara
20:42 🔗 SadDM do you really?
20:47 🔗 Leo_TCK that's what putty does
20:47 🔗 Leo_TCK right click always pastes
20:47 🔗 Leo_TCK whatever is in the buffer
20:47 🔗 Leo_TCK left click highlights/copies
20:47 🔗 SadDM oh I know, I must have grabbed my mouse funny :-P
20:48 🔗 Leo_TCK where was that story from anyway?
20:50 🔗 midas here maybe: http://paizo.com/paizo/blog/v5748dyo5lfwm?Skinwalkers-Sample-Chapter
20:50 🔗 SadDM Leo_TCK: it's a sample chapter posted at http://paizo.com/paizo/blog/v5748dyo5lfwm?Skinwalkers-Sample-Chapter
20:50 🔗 midas ^5 SadDM
20:50 🔗 SadDM ha... ninja'd while I triple-checked my paste buffer
20:58 🔗 Leo_TCK hmm
21:07 🔗 arkiver Is it possible to somehow get a list of all .nl websites or .zw websites?
21:10 🔗 ivan` crawl the homepages of .nl websites and look for links to other .nl websites
21:11 🔗 ivan` if you have that, let me know and I'll make a huge seed list for you based on my URLs
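[Editor's sketch] One step of the crawl ivan` suggests, pulling hostnames under a given TLD out of a fetched homepage, might be sketched as follows. The class and function names are hypothetical, fetching is left out entirely, and only `<a href>` links are considered; a real crawl would also need queueing, deduplication across pages, and politeness limits.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def hosts_in_tld(html, base_url, tld="nl"):
    """Return sorted unique hostnames ending in .<tld> that the page links to."""
    parser = LinkCollector()
    parser.feed(html)
    hosts = set()
    for href in parser.hrefs:
        # Resolve relative links against the page URL before taking the host.
        host = urlparse(urljoin(base_url, href)).hostname
        if host and host.endswith("." + tld):
            hosts.add(host)
    return sorted(hosts)
```

Each newly seen hostname becomes the next homepage to fetch, which is how the seed list grows.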
21:12 🔗 ersi arkiver: I would contact SIDN(.nl) and say something like "I'm doing a research paper on the structure of the web in the Netherlands. Could I possibly get a list of all currently registered .NL domains? Thanks"
21:13 🔗 ersi If you'd like to not lie, you could just replace "a research paper" with "research" and bam, it's not a lie anymore.
21:14 🔗 arkiver hmm thank you ivan` and ersi
21:15 🔗 godane looks like Meet John Doe (1941 film) is going to be played on The Blaze
21:15 🔗 ersi and .zw seems to be handled by http://www.zispa.org.zw/
21:15 🔗 arkiver I'll try both ways, will send an email tomorrow to SIDN and start crawling a list of .nl websites (first need to find out what the best way is to do that... )
21:15 🔗 godane only reason is cause its in public domain
21:16 🔗 ersi arkiver: I bet there isn't a best way, but any way you start is a good start :)
21:17 🔗 arkiver ersi: haha, yes! I'll start it tomorrow and tell you how it goes
21:17 🔗 ersi I'd crawl through Common Crawl's past crawls for .NL domains and crawl those pages first. Maybe take a looksie at Alexa and stuff
21:18 🔗 ersi Then maybe some web directories, like https://en.wikipedia.org/wiki/List_of_web_directories
21:18 🔗 ersi and/or if you know any ".nl portals" where users have hosted their stuff/had homepages and stuff previously, like prior to Facebook and what not
21:48 🔗 arkiver ersi: thank you for your help ersi! Will try some things out tomorrow... :)
21:48 🔗 ersi np :)