Time |
Nickname |
Message |
00:00
🔗
|
Smiley |
hmmmm, I don't know exactly what the warc's do, I can't answer that |
00:00
🔗
|
RubyDo |
ok |
00:00
🔗
|
soultcer |
Well the warcs contain http headers, the download parameters and a logfile |
00:01
🔗
|
RubyDo |
sorry, what is "warcs"? |
00:02
🔗
|
soultcer |
It's a file format that archive.org uses to store archived websites. We use it as well for our projects |
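For readers who have not seen the format, here is a minimal sketch of what one WARC record looks like on disk, assuming the WARC/1.0 layout (a version line, named header fields, a blank line, the captured bytes, then two CRLFs). The function name, the example URL, and the payload are hypothetical and chosen only for illustration; real projects use wget's WARC support or a dedicated library rather than hand-built records.

```python
import uuid
from datetime import datetime, timezone


def build_warc_response_record(target_uri: str, http_payload: bytes) -> bytes:
    """Build one WARC/1.0 'response' record around raw HTTP response bytes."""
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", "application/http; msgtype=response"),
        ("Content-Length", str(len(http_payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers)
    # A blank line ends the WARC header; two CRLFs after the block end the record.
    return head.encode("utf-8") + b"\r\n" + http_payload + b"\r\n\r\n"


# Hypothetical payload: the HTTP headers and body exactly as a server sent them.
example_payload = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html>hello</html>"
)
record = build_warc_response_record("http://example.com/", example_payload)
print(record.decode("utf-8"))
```

Because the HTTP headers are stored verbatim inside the record, the archive keeps what the server actually said, not just the page body.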
00:02
🔗
|
RubyDo |
ok thanks |
00:05
🔗
|
RubyDo |
Do you always find enough time and resources to rescue 100% of the website before it is gone? |
00:05
🔗
|
chronomex |
not often |
00:05
🔗
|
chronomex |
maybe 40-60% of the time |
00:05
🔗
|
chronomex |
(as much as one can measure a count of widely-varying sized objects) |
00:06
🔗
|
RubyDo |
ok |
00:06
🔗
|
RubyDo |
Do you find objection from site owners when you start archiving? |
00:06
🔗
|
chronomex |
depends. |
00:07
🔗
|
chronomex |
some people really take ownership of a site |
00:07
🔗
|
chronomex |
others say "nope, done with that, whatever you want" and don't do anything |
00:07
🔗
|
chronomex |
many site owners object to the higher-than-normal load we offer them |
00:08
🔗
|
chronomex |
for example the posterous project we're running right now is offering significant load to their systems |
00:09
🔗
|
chronomex |
posterous the website has been running a bit slower than usual (responding in ~150% of normal time), to cite a current event |
00:10
🔗
|
RubyDo |
and how do you react for the cases when they object? do you continue your mission or stop archiving? |
00:10
🔗
|
chronomex |
we're really having to lay it on them, though, it's a tight time window and a whole bunch of content that costs cpu/database time for them to render |
00:10
🔗
|
chronomex |
fuck no do we stop, this is archiveteam |
00:11
🔗
|
RubyDo |
ok :) |
00:11
🔗
|
chronomex |
if it can be had and we think it has a fair chance of being valuable to someone now or in the future, we'll do anything we can to overcome blocks |
00:12
🔗
|
chronomex |
we go to extra lengths to get the actual content, which in some cases (complex dynamic pages, flash) may be much more than any simple web spider will ever get |
00:13
🔗
|
ersi |
WARC (Web ARCHive) is an ISO Standard File Format by the way. |
00:13
🔗
|
RubyDo |
oh ok! Thanks |
00:14
🔗
|
RubyDo |
How many are the archive team members? |
00:14
🔗
|
ersi |
Hard to say, there's no real membership |
00:15
🔗
|
ersi |
but everyone who helps out, which can range from quite many to just a few |
00:15
🔗
|
chronomex |
15-50 depending on how you count activity |
00:15
🔗
|
chronomex |
there's 105 people in this channel right now |
00:16
🔗
|
chronomex |
at his defcon 19 talk, jason said onstage "fuck you, you are ALL in archiveteam!" |
00:16
🔗
|
RubyDo |
:) |
00:16
🔗
|
chronomex |
one question that I've never seen asked is "is it archive team, archiveteam, Archive Team, Archiveteam, or ArchiveTeam?" |
00:16
🔗
|
chronomex |
I tend to go for archiveteam or Archiveteam |
00:17
🔗
|
RubyDo |
ok |
00:18
🔗
|
soultcer |
RubyDo: Just curious, for what class is your project? |
00:18
🔗
|
RubyDo |
Course: Management of Electronic Documents |
00:19
🔗
|
RubyDo |
Master program of Information management in university of Ottawa in Canada |
00:19
🔗
|
chronomex |
library sciences/informatics? |
00:19
🔗
|
chronomex |
ah |
00:20
🔗
|
RubyDo |
yes |
00:22
🔗
|
ersi |
We're like library visitors, saving a burning library |
00:22
🔗
|
ersi |
we're not as quiet though |
00:23
🔗
|
RubyDo |
For your archival plan, you divide the website into many torrents and you use the warrior to download the website, |
00:24
🔗
|
RubyDo |
but how do you know the size of a website and the number of parts in order to download all of it? |
00:24
🔗
|
ersi |
We don't |
00:24
🔗
|
soultcer |
We usually save sites with user-generated content, so we divide the work as "1 item per user on that site" |
00:25
🔗
|
ersi |
and since it's usually like that, after a while we start seeing averages on users |
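The "averages on users" idea can be sketched as a rough projection: once some users have been downloaded, the mean size per finished user times the total user count gives a ballpark for the whole site. The function name and every number below are hypothetical, and since per-user sizes vary widely this is only an estimate, not a measurement.

```python
def estimate_total_bytes(finished_sizes, total_users):
    """Project a site's total size from the users downloaded so far."""
    if not finished_sizes:
        raise ValueError("need at least one finished user to estimate")
    average = sum(finished_sizes) / len(finished_sizes)
    return average * total_users


# Hypothetical figures: three finished users, 100 users on the site.
sizes_so_far = [10_000_000, 20_000_000, 30_000_000]
estimate = estimate_total_bytes(sizes_so_far, total_users=100)
print(f"roughly {estimate / 1e9:.1f} GB expected")
```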
00:27
🔗
|
RubyDo |
what do you mean by "1 item per user"? |
00:30
🔗
|
RubyDo |
I mean how do you guarantee that 2 members are not downloading the same files? |
00:31
🔗
|
soultcer |
There is a tracker that assigns each warrior a list of users |
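A minimal sketch of that idea, assuming the tracker is just a central queue of usernames: each username is handed to exactly one warrior, so two warriors never receive the same item. The class, method names, and usernames are hypothetical; this is not the actual Archive Team tracker, only an illustration of the assignment principle.

```python
from collections import deque


class Tracker:
    """Hands out each item (here: one site username) to at most one worker."""

    def __init__(self, usernames):
        self.todo = deque(usernames)  # items nobody has claimed yet
        self.claimed = {}             # username -> worker that claimed it
        self.done = set()             # usernames reported as finished

    def request_items(self, worker, count=2):
        """Pop up to `count` unclaimed usernames and assign them to `worker`."""
        batch = []
        while self.todo and len(batch) < count:
            user = self.todo.popleft()
            self.claimed[user] = worker
            batch.append(user)
        return batch

    def mark_done(self, worker, user):
        """Record completion, but only from the worker that holds the claim."""
        if self.claimed.get(user) == worker:
            del self.claimed[user]
            self.done.add(user)


tracker = Tracker(["alice", "bob", "carol", "dave", "erin"])
first = tracker.request_items("warrior-1")
second = tracker.request_items("warrior-2")
print(first, second)  # ['alice', 'bob'] ['carol', 'dave']
```

A real tracker would also need to hand items back out when a warrior disappears without reporting, which this sketch leaves out.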
00:31
🔗
|
RubyDo |
ok |
00:31
🔗
|
RubyDo |
Thank you very much everybody! |
00:32
🔗
|
Smiley |
np |
00:32
🔗
|
Smiley |
good luck. |
00:32
🔗
|
RubyDo |
thanks :) |
00:36
🔗
|
chronomex |
somehow I was expecting more questions |
00:36
🔗
|
Smiley |
yah |
00:36
🔗
|
Smiley |
ah well, always nice to help |
00:37
🔗
|
Smiley |
chronomex: it's possible they've done their research and just needed a few things clearing up |
00:37
🔗
|
Smiley |
most of those questions aren't clear from the warrior, even if you're running it, to be fair |
00:37
🔗
|
chronomex |
true |
10:32
🔗
|
omf_ |
I added some more info to our clown hosting page |
19:29
🔗
|
SketchCow |
The first Posterous batch is in. But this is a difficult one. |
19:29
🔗
|
SketchCow |
We need to talk about all the parallel projects, I'm worried we're going to lose one. |
19:33
🔗
|
balrog_ |
yeah we have Punchfork (done), Posterous (getting the most attention but going too slow), Yahoo Message Boards (important), opensolaris (needs to be finished) |
19:33
🔗
|
balrog_ |
chronomex: any chance you can look at the other osol repos? |
19:33
🔗
|
soultcer |
Don't forget storylane |
19:34
🔗
|
balrog_ |
oh yes, that's another |
19:34
🔗
|
balrog_ |
*sigh* |
19:34
🔗
|
chronomex |
balrog_: what other osol repos? sorry I'm losing it |
19:34
🔗
|
chronomex |
s/losing it/forgot/ |
19:34
🔗
|
balrog_ |
chronomex: see http://www.archiveteam.org/index.php?title=Closedsolaris |
19:34
🔗
|
soultcer |
Actually, storylane tracker says it's almost done |
19:34
🔗
|
balrog_ |
there's src. and repo. |
19:34
🔗
|
chronomex |
thx |
19:34
🔗
|
chronomex |
aha |
19:34
🔗
|
balrog_ |
you did src., but there are more (and some svn repos) on repo. |
19:34
🔗
|
balrog_ |
if you need help archiving svn repos, let me know |
19:34
🔗
|
balrog_ |
for those you'd use svnsync |
19:34
🔗
|
chronomex |
right |
19:35
🔗
|
balrog_ |
there's also hub and static |
19:35
🔗
|
chronomex |
jfc |
19:35
🔗
|
balrog_ |
static probably can be just done with wget/warc |
19:35
🔗
|
balrog_ |
ideally, be logged in when doing so |
19:36
🔗
|
chronomex |
is there a list of things on repo. ? |
19:36
🔗
|
balrog_ |
repo requires login |
19:36
🔗
|
balrog_ |
use bugmenot or so |
19:36
🔗
|
chronomex |
ok |
19:36
🔗
|
balrog_ |
http://www.bugmenot.com/view/opensolaris.org |
19:36
🔗
|
balrog_ |
https://repo.opensolaris.org/info/projects.action |
19:37
🔗
|
balrog_ |
probably need to wget, then compile list of repos, and compare |
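The "compile list of repos, and compare" step amounts to a set difference between what the project listing names and what has already been grabbed. A minimal sketch, where the function name and the repository names are hypothetical placeholders and not real entries from repo.opensolaris.org:

```python
def missing_repos(listed, archived):
    """Return repositories that appear in the listing but not in the archive."""
    return sorted(set(listed) - set(archived))


# Hypothetical names standing in for the scraped listing and the local archive.
listed = ["project-a", "project-b", "project-c", "project-d"]
archived = ["project-b", "project-d"]
print(missing_repos(listed, archived))  # ['project-a', 'project-c']
```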
19:37
🔗
|
chronomex |
ah, cool. |
19:37
🔗
|
balrog_ |
also not all are anonymous |
19:37
🔗
|
balrog_ |
those will be lost, oh well |
19:37
🔗
|
chronomex |
:\ |
20:20
🔗
|
godane |
looks like i'm uploading ces press conf |
20:52
🔗
|
godane |
so i have access to thebox.bz again |
20:59
🔗
|
balrog_ |
they let you back? |
20:59
🔗
|
godane |
i login using a proxy |
20:59
🔗
|
godane |
but i can log back in with my own ip |
21:14
🔗
|
godane |
so wget is banned |
21:15
🔗
|
godane |
from thebox.bz |
21:24
🔗
|
godane |
i just fixed it |
21:25
🔗
|
godane |
i had the wrong forum id number |
21:35
🔗
|
godane |
CES '09: LG Electronics Press Conference: http://archive.org/details/g4tv.com-video35862 |
21:36
🔗
|
godane |
Dead Rising: Chop Til You Drop Japanese Music Video: http://archive.org/details/g4tv.com-video35837 |
22:26
🔗
|
omf_ |
SketchCo1, balrog_ You guys also forgot the 30+ sites in the gamespy, ugo, ign, 1up deal |
22:26
🔗
|
balrog_ |
ugh |
22:26
🔗
|
SketchCo1 |
and Poland |
22:34
🔗
|
arkhive |
I have a question to the Team. Since there has never been an emulator for the LaserActive to play Mega LD games could a KickStarter be created to get funding for whatever equipment needed to make one? |
22:34
🔗
|
arkhive |
I'm not knowledgeable enough to make one but i'd toss in a hundred bucks to the KickStarter. |
22:35
🔗
|
arkhive |
And I'd even donate my CLD-A100 and S10 Sega Pac. |
22:36
🔗
|
arkhive |
I just think it'd be cool to have an emu and also have the very few Mega LD games/software that were released/leaked to be dumped for use. |
22:36
🔗
|
arkhive |
Let me know your opinions and such. Thanks :) |
22:41
🔗
|
DFJustin |
afaik the only thing standing in the way of laseractive emulation is dumps of the discs |
22:41
🔗
|
DFJustin |
aaron giles of the mame team set up a method of dumping laserdiscs properly including all the off-screen info but so far nobody else has done so to my knowledge |
22:42
🔗
|
* |
ersi read "meme team" |
22:43
🔗
|
chronomex |
heh, meme team |
22:43
🔗
|
chronomex |
archiveteam subcommittee on cat macros |
22:45
🔗
|
DFJustin |
the bios runs in mess already http://imageshack.us/a/img543/8372/laseract.png |
22:45
🔗
|
chronomex |
cool cool |
22:50
🔗
|
shaqfu |
LaserActive emulation? Awesome |
23:06
🔗
|
arkhive |
http://en.wikipedia.org/wiki/BBC_Domesday_Project |
23:07
🔗
|
arkhive |
more specifically: http://en.wikipedia.org/wiki/BBC_Domesday_Project#Preservation |
23:08
🔗
|
arkhive |
DFJustin: is it actively or somewhat actively being worked on? |
23:09
🔗
|
* |
ersi glares in Dr Who's general direction |
23:21
🔗
|
GLaDOS |
We only have 20 days left for MessageBoards |
23:25
🔗
|
GLaDOS |
Bah |
23:25
🔗
|
GLaDOS |
... |
23:34
🔗
|
DFJustin |
no it's not actively being worked on |
23:38
🔗
|
omf_ |
I think the current page has all the active projects listed now |