#archiveteam 2013-07-02,Tue


Time Nickname Message
00:01 🔗 Coderjoe I feel like I'm watching a film Mr. McFeely delivered to Mister Rogers
00:01 🔗 Coderjoe well, until the automation sets in. that's a bit different
00:03 🔗 winr4r i rather like it :)
00:03 🔗 Coderjoe then it goes a bit wonka
00:17 🔗 SketchCow Won the 1959 doc oscar
00:20 🔗 winr4r oh, i'm not imagining having seen it before then
00:47 🔗 SketchCow http://archive.org/search.php?query=collection%3Aafterhoursdjs_livesets&sort=-publicdate
00:47 🔗 SketchCow SO MUCH TRANCE
00:57 🔗 xmc aw mai gawd
01:31 🔗 arrith1 oh wow, wasn't even recognizing that. i love ah.fm
01:33 🔗 arrith1 di.fm, ah.fm and pure.fm
01:33 🔗 arrith1 are all really good
03:27 🔗 SketchCow Ha ha arguing with google guy about death of google reader
03:30 🔗 BlueMax lol
03:30 🔗 BlueMax I still haven't moved off Google Reader
03:30 🔗 BlueMax I really should
03:37 🔗 xmc SketchCow: rofl
03:40 🔗 BlueMax "of course there was a reason. you just may not agree with it."
03:40 🔗 BlueMax what sort of bullshit fucking reason is that
03:44 🔗 BlueMax "hi, I'm hitler, there was a reason I killed millions of jews in the holocaust, you just may not agree with it"
03:47 🔗 winr4r "You must be a real hoot on the slave ship."
03:48 🔗 winr4r oh my god that was awesome
03:55 🔗 winr4r but i love how he picked on the literal meaning of "no reason" as if you actually meant literally no reason, as if they just tossed a fucking coin with "kill google reader" on one side and "eat a burger" on the other
03:55 🔗 winr4r rather than the obvious meaning of "no non-bullshit reason"
03:55 🔗 winr4r i *love* it when people go selective-sperging like that
03:57 🔗 omf_ It is just a deflection technique to try and discredit an argument which itself shows how weak the other position is
04:10 🔗 winr4r yes
04:11 🔗 winr4r in related news, there are still people who will get into a twitter fight with jason
04:11 🔗 winr4r aka "bunny meats blender"
04:11 🔗 winr4r not gonna work out too well for fluffy
04:13 🔗 omf_ I always thought of Jason on twitter like a giant meteor on the way to destroy your planet. You can say anything you want but it is not going to change the outcome
04:14 🔗 omf_ kinda like this http://medias.omgif.net/wp-content/uploads/2011/08/Real-bullet-bill-real-attack.gif
04:14 🔗 BlueMax I get the image of Jason in a tuxedo slamming head first into Yahoo HQ
04:18 🔗 omf_ ehonda style?
04:20 🔗 winr4r oddly, reader is still up for me
04:23 🔗 Jonimus Still loads for me, though I've moved everything to newsblur, which is FOSS so I can host myself even if the main site goes down.
04:46 🔗 arrith1 what google guy?
05:09 🔗 omf_ Seth L. in this thread https://twitter.com/textfiles/status/351866764289769472 guy just doesn't get it
05:20 🔗 arrith1 ah, ty
06:32 🔗 SketchCow I've added a new tool.
06:32 🔗 SketchCow http://www.archiveteam.org/index.php?title=User:Jscott/Sorry_That_I_am_All_Up_In_Your_Shit
06:32 🔗 SketchCow So, in the future, feel free to link someone I'm arguing with
06:35 🔗 winr4r hahaha
06:37 🔗 arrith1 oh my god haha
06:42 🔗 BlueMax very good
06:56 🔗 godane SketchCow: at some point you're going to have to set up a theblazetv-hightlights collection
06:56 🔗 godane i only say that cause i'm grabbing all of them
06:57 🔗 godane good news is my brute force xml grab is working
06:58 🔗 godane getting lots of bad data but i just have to search for any file with 'Page Not Found' and remove it to fix that problem
07:11 🔗 arrith1 godane: all of that will get the job done, though if you're not in a rush for a current project, it might be good to try to work some python out
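The cleanup godane describes (search the grab for any file containing 'Page Not Found' and remove it) could be sketched in Python, as arrith1 suggests. This is a minimal sketch and not godane's actual script: the function names, the `dry_run` flag, and the idea that the grab sits in a single directory tree are all assumptions made for illustration.

```python
from pathlib import Path

# Error text mentioned in the log; everything else here is hypothetical.
MARKER = b"Page Not Found"


def find_bad_files(root, marker=MARKER):
    """Return, sorted, every file under root whose contents include marker."""
    bad = []
    for path in Path(root).rglob("*"):
        # Read as bytes so odd encodings in scraped pages cannot raise.
        if path.is_file() and marker in path.read_bytes():
            bad.append(path)
    return sorted(bad)


def remove_bad_files(root, marker=MARKER, dry_run=True):
    """List the bad files and delete them only when dry_run is False."""
    bad = find_bad_files(root, marker)
    if not dry_run:
        for path in bad:
            path.unlink()
    return bad
```

Calling `remove_bad_files("grab_dir")` only reports what would be deleted; passing `dry_run=False` actually removes the files. Defaulting to a dry run is a deliberate choice here, since a marker string can also appear in a legitimate page.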
07:36 🔗 ivan` https://www.google.com/reader/about/
07:36 🔗 ivan` API is still up
07:41 🔗 ivan` API is down
07:42 🔗 BlueMax RIP Google Reader
07:42 🔗 arrith1 whwhwhatt
07:43 🔗 arrith1 no
07:43 🔗 arrith1 getting lots of errors
07:43 🔗 arrith1 :(
07:44 🔗 arrith1 curse you mihaip and your memory leaking totally free tool
07:55 🔗 BlueMax and that's my migration to RSSOwl done
08:18 🔗 winr4r aw
08:18 🔗 winr4r reader has finally died
08:26 🔗 * winr4r salutes ivan`
08:49 🔗 ivan` winr4r: you can also thank alard for writing most of the necessary software beforehand
09:03 🔗 ivan` I think the ArchiveTeam wiki is getting HNed, though it still responds after a while
10:03 🔗 arrith1 https://news.ycombinator.com/item?id=5976263 Google Reader is dead (google.com) 134 points by voidfiles 2 hours ago | 79 comments
10:14 🔗 GLaDOS .
12:36 🔗 * Baljem hrms and hopes SketchCow never has reason to send him that link
12:37 🔗 Baljem as generally I fall into the first category of boring people, who just happened to notice far too many instances of Jason being right so ended up getting involved ;)
12:47 🔗 Smiley Hackaday being possibly sold/moved on....
12:49 🔗 ersi :O
12:49 🔗 BlueMax O_O
12:49 🔗 ersi source?
12:55 🔗 Smiley http://hackaday.com/2013/07/01/hackaday-looking-for-a-good-home/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+hackaday%2FLgoM+%28Hack+a+Day%29
12:56 🔗 ersi ah
12:57 🔗 godane luckily i did a backup of hackaday last year
12:58 🔗 godane and its right here: https://archive.org/details/hackaday-2004-2011-20120730-mirror
13:15 🔗 godane pushing my mirror of www.boilingfrogspost.com
13:15 🔗 godane this has no mp3 in it
13:34 🔗 godane uploaded: https://archive.org/details/www.boilingfrogspost.com-20130623
14:31 🔗 SketchCow Do another backup of hackaday, please.
14:42 🔗 RichardG for somebody with access to the IA collection, could my google answers archive I made a long time ago get moved
14:43 🔗 balrog http://archive.org/details/google-answers-archive to the Archive Team collections, IIRC. I can't do it, someone with IA edit access will have to.
14:45 🔗 RichardG yeah that
14:52 🔗 SketchCow Done.
15:04 🔗 godane thanks to me mirroring techcrunch
15:05 🔗 godane i may have found the right way to mirror hackaday.com
15:05 🔗 godane wget "$website/$year/" --mirror --warc-file=$website-$year-$(date +%Y%m%d) --warc-cdx --accept-regex="(/$year/|/common/images/|\.jpg|\.png|\.jpeg|\.gif)" --reject-regex='(\?)' -E -H --domains=$website,weblogsinc.com,files.wordpress.com -o wget.log
15:09 🔗 ivan` http://greader-items.dyn.ludios.net:32047/common_crawl_index_urls.bz2 in case anyone wants a 22GB dump of commoncrawl URLs, will be down in a day
15:09 🔗 balrog what's the best way to mirror phpbb forums?
15:09 🔗 ivan` you can get the same thing from the common_crawl_index tool but it takes two days to download uncompressed from a public S3 dataset
15:20 🔗 godane fixed: wget "$website/$year/" --mirror --warc-file=$website-$year-$(date +%Y%m%d) --warc-cdx --accept-regex="(/$year/|/common/images/|\.jpg|\.png|\.jpeg|\.gif)" --reject-regex='(replytocom)' -E -H --domains=$website,weblogsinc.com,files.wordpress.com -o wget.log
15:20 🔗 SketchCow I can grab that, ivan.
15:20 🔗 SketchCow Are we sure Archive.org doesn't already have that?
15:21 🔗 godane i may just need to block replytocom and not just anything with ? in the url
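The difference between the two wget commands above is only the `--reject-regex`: the first rejects any URL containing `?`, the fixed one rejects only URLs containing `replytocom`. A rough Python sketch of that filter follows. It assumes the patterns are searched anywhere in the full URL, uses Python's `re` rather than wget's own regex engine, and the function name and example URLs are hypothetical.

```python
import re


def make_filter(year, reject_pattern):
    """Build a URL predicate from the accept/reject patterns in the log."""
    # Same alternatives as the --accept-regex above, with $year filled in.
    accept = re.compile(
        rf"(/{re.escape(year)}/|/common/images/|\.jpg|\.png|\.jpeg|\.gif)"
    )
    reject = re.compile(reject_pattern)

    def wanted(url):
        # Keep a URL only if it matches accept and does not match reject.
        return bool(accept.search(url)) and not reject.search(url)

    return wanted


first = make_filter("2004", r"(\?)")          # original command
fixed = make_filter("2004", r"(replytocom)")  # corrected command
```

With these two filters, a made-up URL such as `http://hackaday.com/2004/10/01/some-post/?page=2` is dropped by `first` but kept by `fixed`, while one ending in `?replytocom=123` is dropped by both. That is the behaviour godane is after: skip the comment-reply links without losing every URL that carries a query string.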
15:21 🔗 SketchCow http://archive.org/details/commoncrawl
15:21 🔗 SketchCow hmmm.
15:22 🔗 SketchCow 5.93M/s eta 1h 43m
15:24 🔗 SketchCow 3.92M/s eta 39m 24s
15:24 🔗 SketchCow Not bad.
15:24 🔗 SketchCow So, people will need to keep reminding others of this, but right now only underscor and I have access to the IA Archive Team collection.
15:24 🔗 SketchCow I can swap things over quickly.
15:25 🔗 SketchCow Unfortunately, there's no way to grant anyone here the ability to make another person's item enter the archive.
15:31 🔗 godane 2004 hackaday.com urls are downloaded
15:38 🔗 Jonimus balrog: there are scripts to scrape phpbb to other forums.
15:38 🔗 Jonimus I've done DB dumps of a few hosted fora that way.
15:59 🔗 RichardG thanks for the Google Answers archive move
16:33 🔗 winr4r hooray
16:33 🔗 * winr4r updates the wiki
17:05 🔗 winr4r so as the reader API is dead, shall i move the "how to get involved" stuff into a /Warroom or /Archive subpage of the google reader page on the wiki?
17:05 🔗 SketchCow Yes.
17:05 🔗 winr4r okay
17:05 🔗 winr4r shall i do that for other ended projects as i encounter them too?
18:13 🔗 zumthing while decommissioning a machine, i came across some files from the yahoo video project that I don't think ever made it to their final resting place with the internet archive. anyone got a pointer on what I might do with them?
18:13 🔗 winr4r zumthing: ping SketchCow
18:13 🔗 zumthing and whereby files, i mean 900GB of data
18:14 🔗 zumthing winr4r: i shall try that again.
18:15 🔗 winr4r probably quicker to post him the drive or something
18:15 🔗 winr4r lol, 900 gigabytes
18:23 🔗 Smiley SketchCow: did you grab the "last 11 posterous blogs" item?
18:24 🔗 Smiley I seem to recall I'm some sort of admin but I forget what to do right now.
18:44 🔗 SketchCow zumthing: I'll give you an ftp - will that work for you?
18:45 🔗 ivan` winr4r: thanks for cleaning all that up
18:45 🔗 winr4r ivan`: any time :)
18:45 🔗 zumthing SketchCow: that would, or I'm just trying to track down a disk to mail you.
18:46 🔗 zumthing SketchCow: i found the archive.org site related to the upload, perhaps I can pare down what I have to not duplicate what's already there.
18:47 🔗 winr4r ivan`: also as you're a wiki admin now, you should probably add xanga to the front page under current projects
19:03 🔗 ivan` winr4r: added, let me know if it should say anything else
19:13 🔗 SketchCow zumthing: Up to you
19:14 🔗 balrog SketchCow: did you grab the files I had, or do you want me to upload them to an ftp?
19:14 🔗 SketchCow I officially forget.
19:15 🔗 balrog if you want, just give me an ftp and I'll upload
19:15 🔗 balrog this is the pile of google video files
19:16 🔗 ivan` I nominate winr4r for wiki adminship based on his great edits and superior command of English
19:17 🔗 ivan` also I need a break and don't want to edit at the moment ;)
19:33 🔗 Smiley https://archive.org/details/BuyValiumCheapCodNoRxValiumOnlineOvernightDeliveryCodOvernight lol wut?
19:37 🔗 SketchCow We find those and delete those
19:37 🔗 SketchCow Are you just now finding out we have spam to deal with?
19:38 🔗 Smiley haha yes
19:42 🔗 SketchCow A lot.
19:42 🔗 SketchCow We deal with a lot of spam.
19:49 🔗 winr4r ivan`: thanks :)
19:50 🔗 ivan` but seriously I might be MIA for a while and would like the homepage to be less broken ("add link here...")
19:51 🔗 SketchCow IVAN: SHORTEST ADMIN REIGN EVER
19:51 🔗 ivan` haha
19:52 🔗 winr4r you're like the lady jane grey of wikis
19:53 🔗 SketchCow Ivan fired, Winr4r now an admin
19:53 🔗 ivan` thanks
20:08 🔗 winr4r okay that's Recently Ended Projects updated
20:09 🔗 ivan` cool
21:19 🔗 ivan` does anyone want a bzip2'ed version of greader feed stats that is ~25GB instead of ~130GB of .warc.gz?
21:19 🔗 ivan` also available over fast internet only briefly
21:22 🔗 winr4r ivan`: not me personally, but is it in the archive.org collection?
21:22 🔗 winr4r seems it should be!
21:22 🔗 ivan` the .warc.gz's are
21:22 🔗 ivan` I guess it could be
21:23 🔗 winr4r yes
21:34 🔗 ivan` http://techcrunch.com/2013/07/02/yahoo-acquires-qwiki-for-around-50-million/
21:36 🔗 winr4r i wonder how this one's going to end!
21:40 🔗 winr4r "Thank you for being a part of our story - one which is far from over."
21:40 🔗 winr4r i swear there's one guy who's hired to write all of these blog entries
21:41 🔗 winr4r all of the "we got acquired" ones that is
21:45 🔗 winr4r in any case, this is an obvious talent acquisition, you don't pay $50 million for a fucking slideshow generator because you think it's going to make you more money
23:58 🔗 SketchCow OK, so let's figure out how to download them.
