#archiveteam 2013-03-10,Sun

Time Nickname Message
00:00 🔗 Smiley hmmmm, I don't know exactly what the warc's do, I can't answer that
00:00 🔗 RubyDo ok
00:00 🔗 soultcer Well the warcs contain http headers, the download parameters and a logfile
00:01 🔗 RubyDo sorry, what is "warcs"?
00:02 🔗 soultcer It's a file format that archive.org uses to store archived websites. We use it as well for our projects
00:02 🔗 RubyDo ok thanks
00:05 🔗 RubyDo Do you always find enough time and resources to rescue 100% of the website before it is gone?
00:05 🔗 chronomex not often
00:05 🔗 chronomex maybe 40-60% of the time
00:05 🔗 chronomex (as much as one can measure a count of widely-varying sized objects)
00:06 🔗 RubyDo ok
00:06 🔗 RubyDo Do you find objection from site owners when you start archiving?
00:06 🔗 chronomex depends.
00:07 🔗 chronomex some people really take ownership of a site
00:07 🔗 chronomex others say "nope, done with that, whatever you want" and don't do anything
00:07 🔗 chronomex many site owners object to the higher-than-normal load we offer them
00:08 🔗 chronomex for example the posterous project we're running right now is offering significant load to their systems
00:09 🔗 chronomex posterous the website has been running a bit slower than usual (responding in ~150% of normal time), to cite a current event
00:10 🔗 RubyDo and how do you react for the cases when they object? do you continue your mission or stop archiving?
00:10 🔗 chronomex we're really having to lay it on them, though, it's a tight time window and a whole bunch of content that costs cpu/database time for them to render
00:10 🔗 chronomex fuck no do we stop, this is archiveteam
00:11 🔗 RubyDo ok :)
00:11 🔗 chronomex if it can be had and we think it has a fair chance of being valuable to someone now or in the future, we'll do anything we can to overcome blocks
00:12 🔗 chronomex we go to extra lengths to get the actual content, which in some cases (complex dynamic pages, flash) may be much more than any simple web spider will ever get
00:13 🔗 ersi WARC (Web ARCHive) is an ISO Standard File Format by the way.
00:13 🔗 RubyDo oh ok! Thanks
00:14 🔗 RubyDo How many are the archive team members?
00:14 🔗 ersi Hard to say, there's no real membership
00:15 🔗 ersi but everyone who helps out, which can be quite many to just a few
00:15 🔗 chronomex 15-50 depending on how you count activity
00:15 🔗 chronomex there's 105 people in this channel right now
00:16 🔗 chronomex at his defcon 19 talk, jason said onstage "fuck you, you are ALL in archiveteam!"
00:16 🔗 RubyDo :)
00:16 🔗 chronomex one question that I've never seen asked is "is it archive team, archiveteam, Archive Team, Archiveteam, or ArchiveTeam?"
00:16 🔗 chronomex I tend to go for archiveteam or Archiveteam
00:17 🔗 RubyDo ok
00:18 🔗 soultcer RubyDo: Just curious, for what class is your project?
00:18 🔗 RubyDo Course: Management of Electronic Documents
00:19 🔗 RubyDo Master program of Information management in university of Ottawa in Canada
00:19 🔗 chronomex library sciences/informatics?
00:19 🔗 chronomex ah
00:20 🔗 RubyDo yes
00:22 🔗 ersi We're like library visitors, saving a burning library
00:22 🔗 ersi we're not as quiet though
00:23 🔗 RubyDo For your archival plan, you divide the website into many torrents and you use the warrior to download the website,
00:24 🔗 RubyDo but how do you know the size of a website and the number of parts in order to download all of it?
00:24 🔗 ersi We don't
00:24 🔗 soultcer We usually save sites with user-generated content, so we divide the work as "1 item per user on that site"
00:25 🔗 ersi and since it's usually like that, after a while we start seeing averages on users
00:27 🔗 RubyDo what do you mean by "1 item per user"?
00:30 🔗 RubyDo I mean how do you guarantee that 2 members are not downloading the same files?
00:31 🔗 soultcer There is a tracker that assigns each warrior a list of users
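The "1 item per user" split and the tracker described above can be illustrated with a minimal sketch. This is not the actual Archive Team tracker: the class name `Tracker`, its methods, and the in-memory queue are all assumptions made for illustration. It only shows the idea that each site user is handed to exactly one warrior, so two warriors never download the same item.

```python
from collections import deque


class Tracker:
    """Hypothetical in-memory tracker: one work item per site user."""

    def __init__(self, usernames):
        self.todo = deque(usernames)  # items nobody has claimed yet
        self.claimed = {}             # username -> warrior that holds it
        self.done = set()             # usernames already archived

    def request_items(self, warrior, count=1):
        """Hand out up to `count` unclaimed users to one warrior."""
        batch = []
        while self.todo and len(batch) < count:
            user = self.todo.popleft()
            self.claimed[user] = warrior
            batch.append(user)
        return batch

    def mark_done(self, warrior, user):
        """Record that a warrior finished an item it was assigned."""
        if self.claimed.get(user) != warrior:
            raise ValueError(f"{user!r} is not assigned to {warrior!r}")
        del self.claimed[user]
        self.done.add(user)
```

Because an item leaves the `todo` queue the moment it is claimed, a second warrior asking for work can only receive users nobody else holds.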
00:31 🔗 RubyDo ok
00:31 🔗 RubyDo Thank you very much everybody!
00:32 🔗 Smiley np
00:32 🔗 Smiley good luck.
00:32 🔗 RubyDo thanks :)
00:36 🔗 chronomex somehow I was expecting more questions
00:36 🔗 Smiley yah
00:36 🔗 Smiley ah well, always nice to help
00:37 🔗 Smiley chronomex: it's possible they've done their research and just needed a few things clearing up
00:37 🔗 Smiley most of those questions aren't clear from the warrior, even if you're running it, to be fair
00:37 🔗 chronomex true
10:32 🔗 omf_ I added some more info to our clown hosting page
19:29 🔗 SketchCow The first Posterous batch is in. But this is a difficult one.
19:29 🔗 SketchCow We need to talk about all the parallel projects, I'm worried we're going to lose one.
19:33 🔗 balrog_ yeah we have Punchfork (done), Posterous (getting the most attention but going too slow), Yahoo Message Boards (important), opensolaris (needs to be finished)
19:33 🔗 balrog_ chronomex: any chance you can look at the other osol repos?
19:33 🔗 soultcer Don't forget storylane
19:34 🔗 balrog_ oh yes, that's another
19:34 🔗 balrog_ *sigh*
19:34 🔗 chronomex balrog_: what other osol repos? sorry I'm losing it
19:34 🔗 chronomex s/losing it/forgot/
19:34 🔗 balrog_ chronomex: see http://www.archiveteam.org/index.php?title=Closedsolaris
19:34 🔗 soultcer Actually, storylane tracker says it's almost done
19:34 🔗 balrog_ there's src. and repo.
19:34 🔗 chronomex thx
19:34 🔗 chronomex aha
19:34 🔗 balrog_ you did src., but there are more (and some svn repos) on repo.
19:34 🔗 balrog_ if you need help archiving svn repos, let me know
19:34 🔗 balrog_ for those you'd use svnsync
19:34 🔗 chronomex right
19:35 🔗 balrog_ there's also hub and static
19:35 🔗 chronomex jfc
19:35 🔗 balrog_ static probably can be just done with wget/warc
19:35 🔗 balrog_ ideally, be logged in when doing so
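As a rough sketch of the "wget/warc" approach mentioned above, the helper below assembles a wget command line that mirrors a site and writes a WARC file, optionally sending the cookies of a logged-in session. The function name, the placeholder URL, the output name, and the cookies file are assumptions for illustration, not the commands actually used in the channel. The wget options used (`--mirror`, `--page-requisites`, `--warc-file`, `--warc-cdx`, `--load-cookies`) are standard in wget 1.14 and later.

```python
def build_wget_warc_command(url, warc_name, cookies_file=None):
    """Return an argument list for a wget crawl that writes a WARC file."""
    cmd = [
        "wget",
        "--mirror",                   # recursive crawl with timestamping
        "--page-requisites",          # also fetch images, CSS and scripts
        f"--warc-file={warc_name}",   # wget adds the .warc.gz suffix itself
        "--warc-cdx",                 # write a CDX index next to the WARC
    ]
    if cookies_file is not None:
        # Send cookies exported from a logged-in browser session.
        cmd.append(f"--load-cookies={cookies_file}")
    cmd.append(url)
    return cmd


# Hypothetical usage; the URL and file names are placeholders.
command = build_wget_warc_command(
    "http://static.example.org/", "static-site", cookies_file="cookies.txt"
)
```

The list can be passed to `subprocess.run(command, check=True)` on a machine that has wget installed.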
19:36 🔗 chronomex is there a list of things on repo. ?
19:36 🔗 balrog_ repo requires login
19:36 🔗 balrog_ use bugmenot or so
19:36 🔗 chronomex ok
19:36 🔗 balrog_ http://www.bugmenot.com/view/opensolaris.org
19:36 🔗 balrog_ https://repo.opensolaris.org/info/projects.action
19:37 🔗 balrog_ probably need to wget, then compile list of repos, and compare
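The "compile list of repos, and compare" step can be sketched as a set difference. This assumes two plain lists of repository names: one scraped from the project listing and one of repositories already archived. Both inputs and the function name are hypothetical.

```python
def missing_repos(listed, archived):
    """Return repos that appear in the listing but are not archived yet."""
    return sorted(set(listed) - set(archived))


# Hypothetical example data, not real repository names.
listed = ["repo-a", "repo-b", "repo-c"]
archived = ["repo-b"]
todo = missing_repos(listed, archived)
```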
19:37 🔗 chronomex ah, cool.
19:37 🔗 balrog_ also not all are anonymous
19:37 🔗 balrog_ those will be lost, oh well
19:37 🔗 chronomex :\
20:20 🔗 godane looks like i'm uploading ces press conf
20:52 🔗 godane so i have access to thebox.bz again
20:59 🔗 balrog_ they let you back?
20:59 🔗 godane i login using a proxy
20:59 🔗 godane but i can log back in with my own ip
21:14 🔗 godane so wget is banned
21:15 🔗 godane from thebox.bz
21:24 🔗 godane i just fixed it
21:25 🔗 godane i had the wrong forum id number
21:35 🔗 godane CES '09: LG Electronics Press Conference: http://archive.org/details/g4tv.com-video35862
21:36 🔗 godane Dead Rising: Chop Til You Drop Japanese Music Video: http://archive.org/details/g4tv.com-video35837
22:26 🔗 omf_ SketchCo1, balrog_ You guys also forgot the 30+ sites in the gamespy, ugo, ign, 1up deal
22:26 🔗 balrog_ ugh
22:26 🔗 SketchCo1 and Poland
22:34 🔗 arkhive I have a question to the Team. Since there has never been an emulator for the LaserActive to play Mega LD games could a KickStarter be created to get funding for whatever equipment needed to make one?
22:34 🔗 arkhive I'm not knowledgeable enough to make one but i'd toss in a hundred bucks to the KickStarter.
22:35 🔗 arkhive And I'd even donate my CLD-A100 and S10 Sega Pac.
22:36 🔗 arkhive I just think it'd be cool to have an emu and also have the very few Mega LD games/software that were released/leaked to be dumped for use.
22:36 🔗 arkhive Let me know your opinions and such. Thanks :)
22:41 🔗 DFJustin afaik the only thing standing in the way of laseractive emulation is dumps of the discs
22:41 🔗 DFJustin aaron giles of the mame team set up a method of dumping laserdiscs properly including all the off-screen info but so far nobody else has done so to my knowledge
22:42 🔗 * ersi read "meme team"
22:43 🔗 chronomex heh, meme team
22:43 🔗 chronomex archiveteam subcommittee on cat macros
22:45 🔗 DFJustin the bios runs in mess already http://imageshack.us/a/img543/8372/laseract.png
22:45 🔗 chronomex cool cool
22:50 🔗 shaqfu LaserActive emulation? Awesome
23:06 🔗 arkhive http://en.wikipedia.org/wiki/BBC_Domesday_Project
23:07 🔗 arkhive more specifically: http://en.wikipedia.org/wiki/BBC_Domesday_Project#Preservation
23:08 🔗 arkhive DFJustin: is it actively or somewhat actively being worked on?
23:09 🔗 * ersi glares in Dr Who's general direction
23:21 🔗 GLaDOS We only have 20 days left for MessageBoards
23:25 🔗 GLaDOS Bah
23:25 🔗 GLaDOS ...
23:34 🔗 DFJustin no it's not actively being worked on
23:38 🔗 omf_ I think the current page has all the active projects listed now