| Time |
Nickname |
Message |
|
00:27
🔗
|
wp494 |
[18:55:59.859] <WiK> how goes it wp494 ? |
|
00:27
🔗
|
wp494 |
good, you? |
|
00:29
🔗
|
WiK |
im ok, just chillin |
|
01:12
🔗
|
xmc |
hm |
|
01:12
🔗
|
xmc |
I have some fortunecity data taking up space that I kind of need right now |
|
01:12
🔗
|
xmc |
well, only a gig |
|
01:12
🔗
|
xmc |
not that then |
|
01:24
🔗
|
xmc |
looks like I've got a few hundred gigs of orphaned warc files |
|
01:40
🔗
|
omf_ |
pending upload xmc ? |
|
01:47
🔗
|
xmc |
I don't know what to do with them |
|
01:47
🔗
|
xmc |
from fortunecity, splinder, mobileme, and picplz |
|
01:47
🔗
|
xmc |
obviously I've missed the boat on the megawarcs :P |
|
01:49
🔗
|
omf_ |
I just upload loose warcs as new items anyway |
|
01:50
🔗
|
omf_ |
they will get sucked in at some point |
|
01:55
🔗
|
xmc |
aye |
|
02:47
🔗
|
omf_ |
WiK, I am mentioning gitdigger in my docs talk coming up in september |
|
02:48
🔗
|
SketchCow |
http://ascii.textfiles.com/archives/3974 Inside Brewster's Magnificent Contraption |
|
02:50
🔗
|
WiK |
sweet |
|
02:53
🔗
|
omf_ |
I am watching your bsidesLV talk |
|
02:54
🔗
|
WiK |
ya, ill add the defcon/derbycon vids there as well when i get them |
|
03:25
🔗
|
dashcloud |
SketchCow: what's the link to your latest talk? |
|
03:28
🔗
|
SketchCow |
Which one? |
|
03:36
🔗
|
omf_ |
http://www.archiveteam.org/index.php?title=Talks |
|
03:38
🔗
|
S[h]O[r]T |
weee |
|
03:42
🔗
|
S[h]O[r]T |
i want to upload the video with the audio from archive.org..i might just do that |
|
03:50
🔗
|
dashcloud |
I thought you'd only done one recently. I'll check the list omf_ provided. Thanks! |
|
04:09
🔗
|
SketchCow |
I did the one at NDSA |
|
04:09
🔗
|
SketchCow |
I did a speech at DEFCON about the documentary, that's not out yet |
|
04:09
🔗
|
SketchCow |
http://archive.org/details/20130724JasonScottNDSADigitalPreservation2013ArchiveTeam |
|
04:14
🔗
|
ambience |
Looks like the draft archive team talk from DC17 isn't on there. |
|
04:15
🔗
|
ambience |
makes sense though, it was a draft |
|
04:16
🔗
|
SketchCow |
The skytalk? |
|
04:16
🔗
|
ambience |
yeah |
|
04:16
🔗
|
SketchCow |
The skytalks are not recorded, on purpose, for this reason. |
|
04:16
🔗
|
ambience |
ah, that makes sense |
|
04:16
🔗
|
SketchCow |
I gave a talk on engineering fame that was a real rough sketch |
|
04:16
🔗
|
SketchCow |
It may never see the light of recorded day. |
|
04:16
🔗
|
SketchCow |
But I could give it at skytalks |
|
04:17
🔗
|
SketchCow |
Yeah, the whole point of skytalks is to give people a chance to fly and for people to see betas |
|
04:17
🔗
|
ambience |
I really enjoyed the archive team skytalk, totally didn't realize that was the purpose of them |
|
04:17
🔗
|
SketchCow |
If you were in the room, you were lucky |
|
04:18
🔗
|
ambience |
that i was |
|
04:18
🔗
|
ambience |
also had a cool hallway convo with you afterward. it was fun. |
|
04:23
🔗
|
SketchCow |
I'd have to see a picture to remember you, but I'm sure I would. |
|
04:23
🔗
|
SketchCow |
I meet a lot of people in the course of a DEFCON. |
|
04:24
🔗
|
SketchCow |
People walking with me find it ridiculous |
|
04:25
🔗
|
SketchCow |
http://digital-archiving.blogspot.com/2013/08/a-short-detective-story-involving-5.html?utm_source=twitterfeed&utm_medium=twitter |
|
04:29
🔗
|
ambience |
SketchCow: one on the right. unsure if my hair was as long. it's a picture from around that time though. https://fbcdn-sphotos-f-a.akamaihd.net/hphotos-ak-prn2/167756_10150115961726320_712837_n.jpg |
|
04:30
🔗
|
ambience |
I linked to the defcon doc on fb yesterday and you responded to one of my friends who called it self-aggrandizing, haha. |
|
05:56
🔗
|
SketchCow |
https://archive.org/details/20130801DEFCONDocumentaryPremiereAudienceReaction |
|
08:01
🔗
|
Nemo_bis |
windows is not able to recognise files without extension, srsly? |
|
08:05
🔗
|
ersi |
yeah, hehe |
|
10:29
🔗
|
BlueMax |
Nemo_bis, it's a fair thing for Windows not to recognize extensionless files |
|
14:26
🔗
|
SketchCow |
ambience: Yes, now I recall. |
|
14:44
🔗
|
ZoeB |
So is anyone archiving the groups.yahoo.com messages? |
|
15:40
🔗
|
Baljem |
hmm, that's a good question - also whether it's possible to archive all of them (ISTR when I flagged a group I used to run as moribund, it prevented public access to the group archives - WTF) |
|
15:41
🔗
|
ZoeB |
Looking at one group I'm a member of as an example, it spans back to 2001, which would be quite a lot of information to lose... |
|
15:42
🔗
|
ZoeB |
It seems to be publicly accessible, this example, though scraping the plaintexts of all the messages would require a fairly cunning script |
|
15:45
🔗
|
ZoeB |
Even having only the still-active groups would be much better than nothing |
|
16:06
🔗
|
ZoeB |
I have to go now... just an idea to think about, anyway. |
|
17:33
🔗
|
SketchCow |
I am not sure we're archiving them. We should be. |
|
18:03
🔗
|
balrog |
on that topic: anyone here good with perl? |
|
18:03
🔗
|
balrog |
those are ANNOYING to archive, since most groups require you to be a member to see any of the good stuff |
|
18:03
🔗
|
balrog |
someone wrote a perl archiver that kinda works but yahoo broke it with a recent auth change ;( |
|
18:04
🔗
|
omf_ |
link Baljem |
|
18:04
🔗
|
omf_ |
balrog, |
|
18:04
🔗
|
omf_ |
I meant |
|
18:04
🔗
|
balrog |
http://sourceforge.net/p/grabyahoogroup/code/127/tree/trunk/ is upstream; https://github.com/balr0g/grabyahoogroup/commits/master is a couple of patches I added to make it more reliable. |
|
18:04
🔗
|
balrog |
but I don't really get perl that well |
|
18:05
🔗
|
balrog |
usage is as follows: perl5.10 /path/to/grabyahoogroup.pl --username "user" --password "pass" --group groupname --verbose --verbose --verbose --verbose --verbose |
|
18:05
🔗
|
balrog |
however yahoo broke auth recently |
|
18:06
🔗
|
balrog |
the changes I made cause it to archive a lot slower but make it so you don't get 999s at all :) |
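A minimal sketch of the trade-off balrog describes: fetch one page at a time, pause between requests, and back off when the throttling status (999) comes back. This is not the code from grabyahoogroup. The function name, the `fetch` callable, and the 5-second default delay are assumptions for illustration; the log does not say what delay was actually used.

```python
import time
from typing import Callable, Dict, Iterable, Tuple

THROTTLED = 999  # status the log associates with being throttled


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], Tuple[int, str]],  # hypothetical: returns (status, body)
    delay: float = 5.0,                       # assumed value, not from the log
    max_backoffs: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Fetch URLs sequentially, pausing between requests and backing off on 999."""
    pages: Dict[str, str] = {}
    for url in urls:
        wait = delay
        for _attempt in range(max_backoffs + 1):
            status, body = fetch(url)
            if status == THROTTLED:
                wait *= 2      # throttled: double the pause, then retry
                sleep(wait)
                continue
            if status == 200:
                pages[url] = body
            break              # any non-throttled answer ends the retries
        sleep(delay)           # fixed pause before the next URL
    return pages
```

Passing `sleep` in as a parameter keeps the pacing logic testable without real waiting.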
|
18:08
🔗
|
omf_ |
Yeah I can fix up the pile of shit you inherited balrog. Looks like Perl from the 90s |
|
18:08
🔗
|
omf_ |
I guess there is no point in making it be able to grab multiple pages at once because of Yahoo throttling |
|
18:09
🔗
|
omf_ |
To me this is a two step problem. 1. get all the groups. 2. get each group's content |
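The two steps omf_ lists could be sketched roughly as below. Both callables are hypothetical placeholders: `list_groups` stands in for whatever ends up enumerating group names, and `grab_group` for a per-group archiver. Neither exists in the Perl script under discussion.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def archive_all(
    list_groups: Callable[[], Iterable[str]],  # step 1 (hypothetical)
    grab_group: Callable[[str], List[str]],    # step 2 (hypothetical)
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Enumerate groups, then grab each one; collect failures instead of stopping."""
    archive: Dict[str, List[str]] = {}
    failed: List[str] = []
    for name in sorted(set(list_groups())):    # de-duplicate, stable order
        try:
            archive[name] = grab_group(name)
        except Exception:
            failed.append(name)                # retry these later
    return archive, failed
```

Keeping the two steps separate means the group list can be built once and the per-group grab rerun only for the names in `failed`.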
|
18:10
🔗
|
omf_ |
balrog, was that delay time the first one you tried? |
|
18:10
🔗
|
omf_ |
Also do you happen to remember how long it took to get banned |
|
18:13
🔗
|
balrog |
omf_: no it was like the second or third |
|
18:13
🔗
|
balrog |
but it was absolutely solid. |
|
18:13
🔗
|
omf_ |
sweet |
|
18:14
🔗
|
balrog |
the upstream dude has been extremely busy |
|
18:14
🔗
|
balrog |
yeah the perl looked like a pile of shit which is why I got lost looking at it ;( |
|
18:14
🔗
|
omf_ |
ewww tons of html parsing with regex for no reason |
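For contrast with regex-based scraping, here is a small sketch that pulls link targets out of a page with a real HTML parser from the Python standard library. It is only an illustration of the general technique, not a port of the Perl script, and the sample paths in the usage note are made up.

```python
from html.parser import HTMLParser
from typing import List


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, whatever its case or quoting."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> List[str]:
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

For example, `extract_links('<a class="x" href="/example/1">one</a>')` yields the single link `/example/1`, with no need for a pattern that anticipates attribute order or quote style.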
|
18:14
🔗
|
balrog |
usually I'm not all that bad with new languages |
|
18:14
🔗
|
balrog |
yep, that |
|
18:14
🔗
|
omf_ |
this is total crap |
|
18:14
🔗
|
balrog |
still, if I use an old auth cookie, it just works (in most cases) |
|
18:15
🔗
|
omf_ |
are most of the groups in need of auth or are they just public |
|
18:15
🔗
|
balrog |
can you salvage any part of this? |
|
18:15
🔗
|
balrog |
if you care about files, database, photos, or attachments, then yes in need of auth |
|
18:15
🔗
|
omf_ |
It tells me the structure of what to look for and follow which is very useful |
|
18:15
🔗
|
balrog |
if you only care about messages, then it's 50/50 |
|
18:15
🔗
|
omf_ |
do you need authentication to verify a group exists? |
|
18:15
🔗
|
balrog |
no |
|
18:16
🔗
|
balrog |
well maybe for "private" groups |
|
18:16
🔗
|
balrog |
http://launch.groups.yahoo.com/group/yamahadx/ is a typical group which has messages and attachments "public-available" |
|
18:17
🔗
|
balrog |
the perl script will detect which sections you have access to and only download those |
|
18:17
🔗
|
balrog |
so if you use it without auth on that group it should grab messages and attachments |
|
18:17
🔗
|
omf_ |
yes examples are good. I am looking at the groups list to see how easy that will be to build up |
|
18:17
🔗
|
balrog |
if you use it with auth it will grab the entire thing |
|
18:17
🔗
|
balrog |
you mean to download all groups? |
|
18:17
🔗
|
omf_ |
yes |
|
18:17
🔗
|
balrog |
right now I'd just make it work for single groups |
|
18:18
🔗
|
balrog |
and again I care a lot about files/photos/database since there's tons of good stuff usually |
|
18:18
🔗
|
balrog |
the "gross hack" is to get around yahoo sometimes returning an empty message |
|
18:18
🔗
|
balrog |
it's a horrible GOTO :p |
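The empty-message workaround can be expressed as a bounded retry loop rather than a GOTO. This is a sketch only: `fetch` is a hypothetical callable returning the message body as text, and the retry count and pause are assumed values. It treats an empty body as the retry signal and makes no assumption about which HTTP status accompanies it.

```python
import time
from typing import Callable, Optional


def fetch_nonempty(
    fetch: Callable[[], str],   # hypothetical: returns the message body
    max_tries: int = 5,         # assumed limit
    pause: float = 1.0,         # assumed pause between tries, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Retry until the body is non-empty; give up with None after max_tries."""
    for attempt in range(max_tries):
        body = fetch()
        if body.strip():
            return body
        if attempt < max_tries - 1:
            sleep(pause)        # brief pause before asking again
    return None
```

Returning `None` after a fixed number of tries lets the caller log the message as missing instead of looping forever.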
|
18:19
🔗
|
omf_ |
is it still a response 200 even if the post is empty? |
|
18:21
🔗
|
omf_ |
The other reason this is so hard to read is it is 8 separate libraries worth of code in one file |
|
18:23
🔗
|
omf_ |
GrabYahoo, GrabYahoo::Client, GrabYahoo::Logger, GrabYahoo::Messages, GrabYahoo::Files, GrabYahoo::Attachments, GrabYahoo::Members, GrabYahoo::Photos |
|
18:23
🔗
|
omf_ |
are all the library namespaces defined in that file. Standard practice is to only define 1 per file |
|
18:25
🔗
|
omf_ |
How did you get the auth token from your account to try this program? |
|
18:28
🔗
|
balrog |
by using it before yahoo changed auth |
|
18:29
🔗
|
omf_ |
poop |
|
18:42
🔗
|
balrog |
omf_: let me know how this goes. I wouldn't mind helping, but this perl is just unreadable ;( |
|
18:53
🔗
|
DFJustin |
no, this perl is unreadable :) https://www.cs.cmu.edu/~dst/DeCSS/Gallery/qrpff.pl |
|
18:56
🔗
|
mistym |
No, *this* perl is unreadable http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html |
|
18:57
🔗
|
DFJustin |
lol |
|
19:13
🔗
|
DFJustin |
http://www.neogaf.com/forum/showthread.php?t=652647 |
|
19:29
🔗
|
SketchCow |
A bright moment in life |
|
19:29
🔗
|
SketchCow |
Meanwhile, the archivists have their meetings |
|
19:30
🔗
|
mistym |
We need to meet to come up with the perfect solution |
|
19:30
🔗
|
mistym |
Guys, if we did something that turned out to not be perfect then that would be bad |
|
19:30
🔗
|
mistym |
So let's just wait for perfection |
|
19:31
🔗
|
mistym |
*allows literally billions of digital records to be destroyed* |