Time |
Nickname |
Message |
05:03
π
|
winr4r |
so what the fuck, via.me |
05:03
π
|
winr4r |
three DAYS warning before deleting everything? |
05:03
π
|
winr4r |
THREE FUCKING DAYS? |
05:04
π
|
BlueMax |
potential backup: http://www.pspminis.com/ |
05:05
π
|
BlueMax |
won't be deleted but nothing more will be posted, might be a good idea to grab it now |
05:17
π
|
godane |
BlueMax: i'm backing it up right now |
05:25
π
|
godane |
uploaded: https://archive.org/details/polytroncorporation.com-20130727 |
05:41
π
|
godane |
looks like search on archive.org is not updating |
05:41
π
|
godane |
i say that cause i'm around episode 90 of labrats and i'm still stuck at episode 55 in search |
06:23
π
|
BlueMax |
how's pspminis going, godane |
06:24
π
|
godane |
it's still going |
06:24
π
|
godane |
86+mb |
06:25
π
|
godane |
i'm not mirroring the forums right now cause i just want to focus on the main site |
14:13
π
|
SketchCow |
BACK |
14:13
π
|
SketchCow |
OK, let's catch up. |
14:19
π
|
SmileyG |
SketchCow: ok. |
14:20
π
|
SketchCow |
winr4r: Until I can see proof, I don't think it was 3 days warning for via.me. |
14:20
π
|
* |
SmileyG ponders where to start. |
14:20
π
|
SketchCow |
I've been reading |
14:20
π
|
SmileyG |
K, short version |
14:21
π
|
SmileyG |
GLaDOS/antomic were thinking about/working on generating a list of users already in wayback for xanga, and putting them to the back of the queue |
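The queue idea SmileyG describes could look roughly like the sketch below. It assumes the list of users already in the Wayback Machine has been gathered separately. The function name and the sample usernames are hypothetical and do not come from any ArchiveTeam tracker code.

```python
def reorder_queue(usernames, already_archived):
    """Move users that are already archived to the back of the queue.

    The relative order inside each group is preserved (a stable partition).
    """
    archived = set(already_archived)
    not_yet_archived = [u for u in usernames if u not in archived]
    seen_before = [u for u in usernames if u in archived]
    return not_yet_archived + seen_before


# Hypothetical sample data, for illustration only.
queue = ["alice", "bob", "carol", "dave"]
in_wayback = {"bob", "dave"}
print(reorder_queue(queue, in_wayback))  # ['alice', 'carol', 'bob', 'dave']
```

Nothing is dropped from the queue. Users already in the Wayback Machine are still fetched, just last, which seems to match the intent of "putting them to the back of the queue".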
14:21
π
|
SmileyG |
snapjoy is ready to go into its own subcollection. |
14:21
π
|
* |
SmileyG can't think of anything else |
14:24
π
|
winr4r |
SketchCow: well, we heard about it yesterday, so i went to IA, and their most recent crawl which had the notice was yesterday |
14:25
π
|
winr4r |
http://web.archive.org/web/20130721062456/http://via.me/ |
14:25
π
|
SketchCow |
I'm SURE it was at LEAST a month. |
14:25
π
|
winr4r |
the one before that, on the 21st, did not, so that means they might have given 10 days notice |
14:25
π
|
winr4r |
unless they told folks by email before that |
14:27
π
|
SketchCow |
https://twitter.com/izayoi1616/status/359514629963128836 |
14:27
π
|
winr4r |
consider me corrected |
14:27
π
|
winr4r |
i did check the usual suspects like techcrunch first |
14:28
π
|
SketchCow |
Well, not corrected, it's 10 days. |
14:29
π
|
SketchCow |
http://via.me/help#retirement |
14:30
π
|
winr4r |
which isn't a whole lot better, but i was still wrong |
14:30
π
|
winr4r |
"Links to your photos on Via.me will still function until July 30th" |
14:30
π
|
winr4r |
"This means photo hosting will go away on August 1st." |
14:30
π
|
winr4r |
? |
14:36
π
|
omf_ |
SketchCow, I am downloading all of buzzdata before they close in 2 days |
14:36
π
|
SmileyG |
winr4r: one day.... |
14:36
π
|
SmileyG |
can you bash something out? |
14:36
π
|
winr4r |
SmileyG: that sounds like an indecent proposal |
14:36
π
|
winr4r |
but yes |
14:43
π
|
SketchCow |
http://www.archiveteam.org/index.php?title=File:BDclosed-03.jpg |
14:45
π
|
winr4r |
crap, one that got away :( |
14:45
π
|
SketchCow |
Omg if working on it |
14:45
π
|
SketchCow |
He just jsaid so. |
14:45
π
|
SketchCow |
OK, apparently a little woozy typing |
14:46
π
|
winr4r |
too early? :) |
14:47
π
|
omf_ |
yeah buzzdata requires you to drive a browser to get all the js bullshit to work to access the public datasets |
14:47
π
|
omf_ |
downloading the data is easy, discovering it is the time consuming part |
14:48
π
|
winr4r |
omf_: and there's no JSON-spewing interface running behind that? |
14:50
π
|
omf_ |
the API is how I am downloading the datasets but the api does not give me a list of usernames which is the key for access |
14:50
π
|
winr4r |
oh, shit |
14:51
π
|
omf_ |
GET `https://:HIVE_NAME.buzzdata.com/api/:USERNAME` where HIVE_NAME is optional (if you leave it out, it just gets the public stuff) |
14:51
π
|
omf_ |
the API assumes you know the username but does not provide a username discovery mechanism |
14:51
π
|
winr4r |
and getting the data set is /api/:USERNAME/:some_id ? |
14:54
π
|
omf_ |
http://buzzdata.com/faq/api/api-methods#download |
14:54
π
|
winr4r |
yeah i just looked, should have done that rather than asking questions |
14:54
π
|
winr4r |
:) |
14:54
π
|
omf_ |
https://:HIVE_NAME.buzzdata.com/api/:USERNAME/:DATASET_SHORT_NAME/:DATAFILE_UUID/download_request |
14:54
π
|
omf_ |
yeah their API is alright for the features they have |
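The two URL templates omf_ pasted can be filled in with a small helper, sketched below. It only builds the URLs and makes no requests. The bare `buzzdata.com` host for the no-hive case is an assumption based on "HIVE_NAME is optional". The function names and sample values are hypothetical.

```python
from urllib.parse import quote

BASE_DOMAIN = "buzzdata.com"


def api_host(hive_name=None):
    # Assumption: with no hive, the API lives on the bare domain.
    return f"{hive_name}.{BASE_DOMAIN}" if hive_name else BASE_DOMAIN


def user_url(username, hive_name=None):
    # Template: https://:HIVE_NAME.buzzdata.com/api/:USERNAME
    return f"https://{api_host(hive_name)}/api/{quote(username, safe='')}"


def download_request_url(username, dataset_short_name, datafile_uuid, hive_name=None):
    # Template: .../api/:USERNAME/:DATASET_SHORT_NAME/:DATAFILE_UUID/download_request
    parts = [
        user_url(username, hive_name),
        quote(dataset_short_name, safe=""),
        quote(datafile_uuid, safe=""),
        "download_request",
    ]
    return "/".join(parts)


# Hypothetical values, for illustration only.
print(user_url("some-user"))
print(download_request_url("some-user", "some-dataset", "1234", hive_name="myhive"))
```

As omf_ points out, every URL starts from a username, so a helper like this is only useful once the usernames have been discovered some other way.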
14:58
π
|
winr4r |
anything i can do to help? |
15:00
π
|
winr4r |
mistym: hiiiii |
15:01
π
|
mistym |
winr4r: Morning! |
15:01
π
|
omf_ |
winr4r, if it is not done in 4-5 hours I might |
15:02
π
|
winr4r |
omf_: is your big problem finding usernames? |
15:03
π
|
omf_ |
The problem is the script takes a while since it has to load and run the js bullshit |
15:03
π
|
omf_ |
Nothing requires any kind of extra work |
15:04
π
|
winr4r |
ah, gotcha :\ |
15:04
π
|
omf_ |
They only have 2,375 users total |
15:04
π
|
omf_ |
and some have no datasets |
15:04
π
|
omf_ |
it's a very small site |
15:06
π
|
winr4r |
mm :\ |
15:15
π
|
omf_ |
I just remembered the question I had for you winr4r |
15:17
π
|
omf_ |
So we got a great talks section and an in the media section. Maybe we should add a section for technical articles about how we do stuff |
15:20
π
|
winr4r |
i can only think of one article that goes in there ("Site exploration"), though that's a case for writing more |
15:20
π
|
winr4r |
oh, yeah, wget recipes and shit, we have a page on that i think |
15:21
π
|
winr4r |
though there's likely lots on the wiki that i do not know about :) |
15:29
π
|
omf_ |
I mean more along the lines of a blog post I wrote on how I wrote a script to collect all the mailing archives for opensolaris. I know a few others wrote blog posts about tools they made as well |
15:30
π
|
omf_ |
A walk through of how the tool was built versus just how to use the tool |
15:32
π
|
winr4r |
omf_: yes, that would be excellent |
18:31
π
|
Asparagir |
YOU GUISE MY FIRST PANICGRAB UPLOAD TO IA WORKED! |
18:31
π
|
Asparagir |
http://archive.org/details/jewishgen.org-panicgrab-20130710 |
18:31
π
|
Asparagir |
Just needs to get moved to the ArchiveTeam section, not communitytexts. |
18:33
π
|
winr4r |
Asparagir: PROUD OF YOU SON |
18:33
π
|
Asparagir |
I'M A DAUGHTER, DAAAAD |
18:35
π
|
winr4r |
YOU'RE SON IF I SAY SO SON |
18:36
π
|
Asparagir |
YOU NEVER LET ME HAVE ANY FUN! *cries, runs to room, slams door* |
18:40
π
|
closure |
http://blog.theoldreader.com/post/56798895350/desperate-times-call-for-desperate-measures "You will have two weeks to export your OPML file regardless of our decision" |
18:41
π
|
antomatic |
AAAARGH! |
18:42
π
|
antomatic |
Guess those 'you have 3 days left to subscribe to FeedHQ' emails were well-timed then. |
18:50
π
|
Asparagir |
Hey omf_ -- thank you for getting the BuzzData stuff. I liked that site; too bad it's going away. |
18:52
π
|
winr4r |
closure: what the fuuuuuuuuuuuck |
18:53
π
|
winr4r |
wow, i actually switched to theoldreader as well |
18:55
π
|
antomatic |
I moved to feedhq's 30-day trial and oldreader at the same time, intending to pick one - was going to choose the free option (because I am awfully stingy) but looks like I won't get away with that now. :) |
19:00
π
|
winr4r |
'Last week difficulty level was changed to “hell” in every possible aspect we could imagine, we have been sleep deprived for 10 days and this impacts us way too much.' |
19:00
π
|
winr4r |
how about, if you're going to call yourself a google reader alternative, and actually invite people to use you, *be willing to do the fucking work to not let people down* |
19:00
π
|
winr4r |
even if it's free |
19:01
π
|
antomatic |
It has been an incredible journey.. |
19:01
π
|
antomatic |
sad though, nevertheless. |
19:03
π
|
winr4r |
actually, at this point in my life, i wouldn't mind one less thing to keep up with, so maybe that is a sign |
19:03
π
|
winr4r |
in the same way that seeing a distant city nuked is a sign that you should buy more stuff locally, but |
19:03
π
|
winr4r |
not something welcome, but heyyyyeyyy i'm your silver lining |
19:04
π
|
DFJustin |
Asparagir: nice work, poking around the website most of it seems to be behind a login wall so I guess you did a crawl with cookies? |
19:05
π
|
Asparagir |
No, I just crawled all the non-cookie areas of the site, which include hundreds of town and shtetl pages, photos, family artifacts, etc. |
19:05
π
|
DFJustin |
definitely worth grabbing anything with a "powered by ancestry.com" logo, they seem to have a habit of buying free resources and then making them only for ancestry subscribers |
19:05
π
|
Asparagir |
I didn't want to get in trouble by using my login and cookie -- some parts have a strict user agreement. |
19:05
π
|
Asparagir |
Oh, I know. Believe me, I know. |
19:06
π
|
Asparagir |
I also didn't grab any personal content, like family trees (which are also behind the login). Just the public stuff. |
19:07
π
|
Asparagir |
Thing I do for fun: build open source database systems for genealogy and historical groups, so they can publish their data *without* handing it over to for-profit groups like Ancestry. |
19:07
π
|
Asparagir |
http://www.LeafSeek.com/ |
19:07
π
|
DFJustin |
o/\o |
19:08
π
|
Asparagir |
In use: http://genealogy.org.il/AID/ |
19:08
π
|
Asparagir |
And: search.geshergalicia.org |
19:08
π
|
Asparagir |
http://search.geshergalicia.org/ |
19:09
π
|
winr4r |
Asparagir: hey, that's awesome! |
19:09
π
|
Asparagir |
Thanks! It really bothers me how much public vital records data and historical data is getting swallowed up by for-profit groups. |
19:10
π
|
winr4r |
Asparagir: there's a big problem in that field, actually, which is that 1) machine-readable data is tied up in that way 2) even more is tied up in proprietary formats |
19:10
π
|
Asparagir |
There are lots of cases in the past few years of formerly-public data becoming hidden behind paywalls. |
19:10
π
|
winr4r |
which i am sure you know about, i'm saying i've been aware of the problem for a while |
19:10
π
|
Asparagir |
Yep! |
19:10
π
|
Asparagir |
Sad genealogy panda. |
19:10
π
|
winr4r |
Asparagir: so, i'll sit here and admire you for a bit |
19:10
π
|
Asparagir |
If you like. :-) |
19:10
π
|
winr4r |
k! |
19:10
π
|
* |
winr4r sits, admires |
19:11
π
|
* |
antomatic agrees |
19:15
π
|
winr4r |
that ends our coverage of the Asparagir admiration society |
19:17
π
|
Asparagir |
Well, hold off on the kudos until I start rescuing some more shit. I was surprised to see a lot of major genealogy websites not well-represented in the Wayback Machine. |
19:17
π
|
Asparagir |
Gotta fix that. |
19:18
π
|
winr4r |
Asparagir: yes, you do |
19:18
π
|
omf_2 |
Some might be robots.txt bs |
19:19
π
|
DFJustin |
yeah wayback is going to be essential in the future, for example here is great info on one of my ancestors from a website which is now gone thanks to yahoo http://web.archive.org/web/20091027171723/http://www.geocities.com/SouthBeach/Canal/5891/john.html |
19:20
π
|
Asparagir |
Whoever was rescuing the old webtv stuff a few weeks ago, thank you for remembering to include family history search terms in the stuff you were pulling. |
19:26
π
|
ntnd |
Frustrated by only finding long deleted forum posts which supposedly held the golden answer to my, and people long dead's questions: How would somebody with a software engineering background start archiving community sites in the most ideal way? |
19:26
π
|
ntnd |
Felt like http://xkcd.com/979/ but only worse since it's often so close |
19:28
π
|
SmileyG |
ntnd: two sec |
19:28
π
|
SmileyG |
http://archiveteam.org/index.php?title=Wget#Creating_WARC_with_wget |
19:31
π
|
ntnd |
SmileyG: This only follows links on the same domain doesn't it? |
19:32
π
|
omf_2 |
that is where --span-hosts comes in |
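A WARC crawl that also follows links off the starting host might be assembled as in the sketch below. It only builds the wget argument list and runs nothing. The flags used (`--mirror`, `--page-requisites`, `--warc-file`, `--warc-cdx`, `--wait`, `--span-hosts`, `--domains`) are real wget options, and `--span-hosts` is wget's name for crossing hosts. The function name, the `example.org` hosts and the WARC file name are hypothetical.

```python
import shlex


def build_wget_warc_command(start_url, warc_name, allowed_domains=()):
    """Return a wget argument list for a mirroring crawl that writes a WARC."""
    cmd = [
        "wget",
        "--mirror",                  # recursive crawl with timestamping
        "--page-requisites",         # also fetch images, CSS, and scripts
        f"--warc-file={warc_name}",  # wget appends the .warc.gz suffix itself
        "--warc-cdx",                # write a CDX index next to the WARC
        "--wait=1",                  # pause between requests
    ]
    if allowed_domains:
        # Without --span-hosts, recursion stays on the starting host.
        cmd.append("--span-hosts")
        # Limit the spanning to these domains (include the site's own).
        cmd.append("--domains=" + ",".join(allowed_domains))
    cmd.append(start_url)
    return cmd


# Hypothetical site, for illustration only.
command = build_wget_warc_command(
    "http://example.org/",
    "example.org-20130729",
    allowed_domains=["example.org", "cdn.example.org"],
)
print(shlex.join(command))
```

Pairing `--span-hosts` with `--domains` is a deliberate choice here: spanning hosts without a domain list can pull a recursive crawl across a large part of the web.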
19:33
π
|
ntnd |
Ahh, very nice |
19:35
π
|
CowerZZZZ |
just grab ancestry.com and be done with it :) |
19:37
π
|
godane |
i thought that xkcd.com was closing cause of slashdot: http://entertainment.slashdot.org/story/13/07/28/2227246/signs-point-to-xkcds-time-ending?utm_source=rss1.0mainlinkanon&utm_medium=feed |
19:38
π
|
godane |
title saying 'signs point to xkcds time ending' |
19:38
π
|
godane |
it was just a very long comic they made |
19:39
π
|
godane |
i may do a panic crawl later just in case |
19:40
π
|
DFJustin |
that seems like the kind of thing wayback would have all of |
19:44
π
|
SmileyG |
s/they/he |
19:44
π
|
SmileyG |
godane: no chance of it closing afaik |
19:44
π
|
SmileyG |
I think randall would announce it for archiving purposes far before its time |
19:44
π
|
SmileyG |
and I think everything is crawled anyway |
20:16
π
|
* |
ivan` grabs all of ftp://ftp.supermicro.com |
20:19
π
|
joepie91 |
http://blog.theoldreader.com/post/56798895350/desperate-times-call-for-desperate-measures |
20:20
π
|
joepie91 |
oh |
20:20
π
|
joepie91 |
was already posted |
20:28
π
|
SketchCow |
When Scott joined the Internet Archive, the Loon rejoiced; she believed (and still vehemently believes) that the world at large and the library/archives world desperately need Scott to do the work he does. |
20:28
π
|
SketchCow |
Notwithstanding that belief, the Loon knows full well that Scott would never survive in an ordinary archives or library context. Scott doesn't just break The Rules, you see; Scott stomps The Rules flat and pisses gleefully on them, particularly though not exclusively online for all to see. |
20:28
π
|
SketchCow |
Given that, not even Scott, regardless of his hands-on knowledge of digital archiving, regardless of his skill at assembling technical communities for useful ends, regardless of his many and varied accomplishments, regardless of his high public profile, could stay in a library or archives job with the Rules-enforcers gunning for him, as they inevitably would. |
20:28
π
|
SketchCow |
http://gavialib.com/2013/07/silencing-librarianship-and-gender-who-can-break-the-rules/ |
20:30
π
|
xmc |
hmmmm |
20:30
π
|
SketchCow |
Translation: I'm going to jail |
20:31
π
|
xmc |
librarian jail |
20:31
π
|
xmc |
I'm probably going to radio jail, for what it's worth |
20:31
π
|
SketchCow |
We're all going to jail. |
20:31
π
|
SketchCow |
ALL |
20:32
π
|
xmc |
I'm ok with this. |
20:32
π
|
SketchCow |
I just figured out who this person is. |
20:33
π
|
SketchCow |
OH MAN |
20:33
π
|
SketchCow |
Forgot to mention |
20:33
π
|
SketchCow |
They did a WARC session at this NDSA thing I went to |
20:33
π
|
SketchCow |
I got up during the Q&A and said we'd added WARC to WGET. |
20:33
π
|
SketchCow |
Cheering. Cheering! |
20:34
π
|
xmc |
and then? |
20:35
π
|
xmc |
there *has* to be an "and then" |
20:35
π
|
antomatic |
"buxom cross-dressers threw fake gold coins at our feet as we discussed the fate of the revolution." |
20:35
π
|
SketchCow |
People were happy the end |
20:35
π
|
xmc |
I seem to remember hearing about this |
20:36
π
|
xmc |
ok |
20:36
π
|
antomatic |
also good |
20:36
π
|
xmc |
so this Loon didn't piss all over wget |
20:36
π
|
xmc |
great |
22:34
π
|
balrog |
http://thenextweb.com/insider/2013/07/29/the-old-reader-to-close-public-site-in-two-weeks-users-who-joined-before-google-reader-axing-news-can-stay/ |