| Time |
Nickname |
Message |
|
00:36
|
dnova |
can I begin rsyncing splinder while still downloading? |
|
00:37
|
chronomex |
yes |
|
00:37
|
dnova |
ok I need to get on that asap. gotta catch sketchcow for a slot? |
|
00:37
|
chronomex |
indeed |
|
00:37
|
dnova |
thanks |
|
01:38
|
SketchCow |
BACK |
|
01:38
|
Coderjoe |
dnova: use the upload script |
|
01:39
|
dnova |
# (ask SketchCow for a module name) |
|
01:39
|
dnova |
lol |
|
01:40
|
Coderjoe |
i know. I meant when you do get a module name, use the upload script |
|
01:40
|
dnova |
SketchCow: I want to start uploading my splinders |
|
01:40
|
dnova |
Coderjoe: mos def |
|
02:16
|
underscor |
http://i.imgur.com/1fcec.png |
|
02:20
|
chronomex |
hah |
|
02:26
|
BlueMax |
*facepalm* |
|
02:28
|
Coderjoe |
never put dicks in your ears |
|
02:58
|
RedType |
Coderjoe: you think cleaning out earwax is hard? |
|
03:08
|
SketchCow |
Hello, everyone. |
|
03:08
|
SketchCow |
There are two reporters, Eva Talmadge and Matt/Matthias Schwartz, trying to do a story on Archive Team. |
|
03:08
|
SketchCow |
Please do not talk to them. |
|
03:08
|
SketchCow |
Let's put that in the lines. |
|
03:08
|
SketchCow |
-------------------------------------- |
|
03:08
|
SketchCow |
Hello, everyone. |
|
03:08
|
SketchCow |
There are two reporters, Eva Talmadge and Matt/Matthias Schwartz, trying to do a story on Archive Team. |
|
03:08
|
SketchCow |
Let's put that in the lines. |
|
03:08
|
SketchCow |
Please do not talk to them. |
|
03:08
|
SketchCow |
-------------------------------------- |
|
03:25
|
kennethre |
SketchCow: Channel Topic, perhaps? |
|
03:28
|
SketchCow |
I expect some people will ignore. |
|
03:28
|
SketchCow |
But I did want to say it. |
|
03:31
|
db48x |
out of curiosity, what's your reasoning there? |
|
03:51
|
SketchCow |
http://www.mattathiasschwartz.com/ |
|
03:51
|
SketchCow |
Go read the other articles |
|
03:51
|
SketchCow |
tell me how we'll fare. |
|
03:58
|
godane |
it looks like there is no way to simply turn a wikipedia dump into a wikipedia website |
|
03:59
|
godane |
are there any tools you guys use to read wiki dumps like a full index website? |
|
04:12
|
chronomex |
godane: what's the goal? |
|
04:34
|
dashcloud |
someone at some point in this channel asked for a copy of Coming Soon (online magazine) (www.csoon.com) - I tried using wget-warc to make a copy of it |
|
04:38
|
Paradoks |
dashcloud: Any idea what it'd require to verify that you did things correctly? I'd love to help, but know very little about wget-warc. |
|
04:41
|
dashcloud |
here's the command I used to grab it: http://pastebin.com/Yzzw28ep, and the site's still up, minus 10-20 pages |
|
04:43
|
dashcloud |
I'm short on time right now, but I'm happy to send over my copy tomorrow |
|
04:44
|
Paradoks |
Cool. I'll take a stab at it if no one else more qualified steps forward. |
|
04:44
|
dashcloud |
the only other thing I think you need is to make sure all the directories mentioned in the command exist |
|
04:45
|
dashcloud |
(i.e. don't rely on wget to create them) |
|
04:45
|
dashcloud |
good night folks! |
|
05:19
|
godane |
chronomex: was trying to host a local lan version of wikipedia |
|
05:20
|
chronomex |
ah, hm. |
|
05:20
|
chronomex |
wow this matt guy is really artsy-fartsy with his writing |
|
05:55
|
NotGLaDOS |
Can't we just throw them off instead? |
|
05:59
|
SketchCow |
Today has been catch-up day. |
|
05:59
|
NotGLaDOS |
Also, have you got that rsync slot set up for me? |
|
06:01
|
SketchCow |
I can do that. |
|
06:06
|
Zebranky |
While you're around, I'd like to throw out an "archive.org was the only source for an extremely helpful page" testimonial, as if you needed more |
|
06:06
|
Zebranky |
So much good for the Internet. |
|
06:07
|
NotGLaDOS |
note: this is archiveteam.org. We only have access to archive.org, we don't run it. |
|
06:08
|
chronomex |
and by access we mean we have no more access than anyone else with an account |
|
06:08
|
chronomex |
for the most part |
|
06:08
|
NotGLaDOS |
What he said. |
|
06:14
|
Zebranky |
I know. That was directed at SketchCow. |
|
06:15
|
Zebranky |
Since this is a convenient way to throw quick thoughts at him |
|
06:23
|
NotGLaDOS |
He's the same as what we said: only has member access. |
|
06:39
|
Zebranky |
Fair enough. My understanding was that he worked a bit closer with them. |
|
08:47
|
kin37ik |
so, google buzz is going down soonish i hear |
|
08:50
|
yipdw |
that's the buzz |
|
10:08
|
BlueMax |
lol |
|
10:37
|
emijrp |
sharing info about the libraries damaged by hurricane irene through facebook pages (see last 3 paragraphs), looks like a long term solution hell yeah http://www.librarian.net/stax/3652/helping-libraries-damaged-by-hurricane-irene/ |
|
10:38
|
emijrp |
webcite allows uploading link batches http://www.webcitation.org/comb |
|
10:38
|
emijrp |
it is very useful to archive tons |
|
10:39
|
emijrp |
Archive-It tool from Internet Archive is not free, so, in this case, IA sucks |
|
10:40
|
BlueMax |
:/ |
|
10:43
|
emijrp |
metadata for 15000+ knols complete |
|
10:44
|
emijrp |
the channel is #klol |
|
11:15
|
emijrp |
trying to archive all AT wiki using the webcite comb www.webcitation.org/comb |
|
11:17
|
emijrp |
clicked the submit button, but the process is slow... waiting |
|
11:19
|
db48x |
you put 15000 urls into www.webcitation.org/comb? |
|
11:19
|
emijrp |
no |
|
11:19
|
emijrp |
15000 metadata from knols downloaded, really 20000 now |
|
11:20
|
emijrp |
the webcite submit is this http://www.archiveteam.org/index.php?title=Template:Navigation_box |
|
11:20
|
emijrp |
less than 100 links |
|
11:20
|
emijrp |
it scrapes all the links, you checkbox the desired links and archive |
|
11:20
|
db48x |
ooh |
|
11:21
|
db48x |
that makes more sense |
|
11:22
|
emijrp |
by the way, uploading knol links batches to webcite is a choice |
|
11:23
|
emijrp |
just downloading all knols, tar gzip and upload to IA is a shit |
|
11:23
|
emijrp |
most of AT projects are not viewable |
|
11:23
|
emijrp |
just huge packs |
|
11:31
|
db48x |
yea, given their size that's been the easiest way to go |
|
11:43
|
underscor |
<NotGLaDOS> He's the same as what we said: only has member access. |
|
11:43
|
underscor |
Actually, he's a full admin, iirc |
|
12:23
|
emijrp |
~25k metadata chunk for your tests http://www.sendspace.com/file/o8fthv |
|
12:23
|
emijrp |
tab delimited |
|
13:54
|
NotGLaDOS |
underscor: interesting. |
|
15:06
|
SketchCow |
Brp |
|
17:47
|
emijrp |
how many sites have closed this year? |
|
17:48
|
Schbirid |
milliona |
|
17:48
|
Schbirid |
s |
|
17:51
|
tef |
emijrp: check out the wiki for the deathwatch pages |
|
17:51
|
emijrp |
i mean, i feel that this year has been very bad |
|
17:52
|
tef |
it can only get worse |
|
18:18
|
emijrp |
SketchCow: why doesn't IA set up anything like this to allow people to transcribe books? https://es.wikisource.org/w/index.php?title=P%C3%A1gina:Plat%C3%B3n_-_La_Rep%C3%BAblica_%281805%29,_Tomo_1.djvu/322&action=edit&redlink=1 IA OCR is the worst ever |
|
18:18
|
Schbirid |
that would be really cool |
|
18:20
|
ersi |
emijrp: that text made little sense |
|
18:21
|
emijrp |
text on the left is OCR autofill, later a person rewrites the phrases that need it |
|
18:23
|
emijrp |
a corrected page is this https://es.wikisource.org/wiki/P%C3%A1gina:Plat%C3%B3n_-_La_Rep%C3%BAblica_%281805%29,_Tomo_1.djvu/89 |
|
18:40
|
emijrp |
https://twitter.com/#!/brewster_kahle |
|
18:42
|
ersi |
emijrp: point being? A specific tweet? click the "X hours ago" to direct link |
|
18:42
|
emijrp |
no, just recommending that twitter account |
|
18:42
|
ersi |
Okay |
|
18:43
|
* |
Schbirid sends ersi into the fresh air |
|
18:43
|
ersi |
Weeeee! |
|
19:11
|
emijrp |
The digital materials, we can make copies of. And we've... we have two copies within the United States, and we have a partial copy in Alexandria, Egypt, which is, I guess, fitting, as we have a large-scale swap agreement with them to archive their materials, and they archive ours. And also in Amsterdam, we have a partial copy. If there are five or six copies of these materials worldwide, I think I'd feel safe. |
|
19:11
|
emijrp |
http://www.democracynow.org/2011/8/24/pioneering_internet_archivists_brewster_kahle_and |
|
19:12
|
emijrp |
So, ... |
|
22:42
|
Wyatt |
So at what point should I just give up and kill a wget? |
|
23:30
|
DoubleJ |
Wyatt: Check the files directory. If the most recent directory was recently modified it's still going. |
|
23:31
|
DoubleJ |
And if there are a crapload of domains it's trying to download all of Splinder/MobileMe/whatever. |
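
DoubleJ's check (look at the files directory and see whether the newest directory was modified recently) could be scripted roughly as below. This is only a sketch: the function names `newest_subdir` and `looks_alive` and the 10-minute idle threshold are assumptions made for illustration, not part of any Archive Team script, and a directory's modification time only changes when entries are added or removed directly inside it, so a long download of one large file can look idle.

```python
import time
from pathlib import Path
from typing import Optional, Tuple


def newest_subdir(root: str) -> Optional[Tuple[Path, float]]:
    """Return (path, seconds since last modification) for the most recently
    modified immediate subdirectory of root, or None if there are none."""
    subdirs = [p for p in Path(root).iterdir() if p.is_dir()]
    if not subdirs:
        return None
    newest = max(subdirs, key=lambda p: p.stat().st_mtime)
    return newest, time.time() - newest.stat().st_mtime


def looks_alive(root: str, max_idle_seconds: float = 600.0) -> bool:
    """Guess whether a download is still running: True if the newest
    subdirectory changed within max_idle_seconds (an arbitrary threshold)."""
    result = newest_subdir(root)
    return result is not None and result[1] <= max_idle_seconds
```

Calling `looks_alive("files")` from the download directory returns `False` when nothing under `files` has changed for ten minutes, which is a hint (not proof) that the wget has stalled. Counting the entries in the same directory is a quick way to spot the second case DoubleJ mentions, where wget is wandering across far more domains than expected.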