| Time |
Nickname |
Message |
|
00:10
🔗
|
|
wyatt874- is now known as wyatt8740 |
|
00:27
🔗
|
|
fie__ has joined #archiveteam |
|
00:31
🔗
|
|
fie_ has quit IRC (Ping timeout: 268 seconds) |
|
01:04
🔗
|
|
JesseW has joined #archiveteam |
|
01:07
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
01:21
🔗
|
|
w0rp has quit IRC (Ping timeout: 268 seconds) |
|
01:21
🔗
|
|
w0rp has joined #archiveteam |
|
02:21
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
02:30
🔗
|
|
dashcloud has joined #archiveteam |
|
03:05
🔗
|
|
SilSte has quit IRC (Read error: Operation timed out) |
|
03:05
🔗
|
|
SilSte has joined #archiveteam |
|
03:05
🔗
|
|
BlueMaxim has joined #archiveteam |
|
03:15
🔗
|
|
odie5533_ has quit IRC (Read error: Operation timed out) |
|
03:18
🔗
|
|
db48x has joined #archiveteam |
|
03:35
🔗
|
|
odie5533 has joined #archiveteam |
|
03:35
🔗
|
|
odie5533 has quit IRC (Connection closed) |
|
04:24
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
|
06:07
🔗
|
|
Ungstein has joined #archiveteam |
|
06:16
🔗
|
|
khaoohs_ has joined #archiveteam |
|
06:22
🔗
|
|
khaoohs__ has quit IRC (Read error: Operation timed out) |
|
06:25
🔗
|
|
Dark_Star has quit IRC (Ping timeout: 600 seconds) |
|
06:42
🔗
|
|
JesseW has quit IRC (Read error: Operation timed out) |
|
06:44
🔗
|
ploopkazo |
has anyone looked into archiving the Arch wiki? |
|
06:45
🔗
|
ploopkazo |
doesn't look like it's in Knowledge/Wikis in the archiveteam wiki template |
|
06:55
🔗
|
|
Elegance has quit IRC (Quit: :(){ :|:& };:) |
|
06:56
🔗
|
|
primus104 has joined #archiveteam |
|
06:57
🔗
|
|
Elegance has joined #archiveteam |
|
07:38
🔗
|
|
Rotab has joined #archiveteam |
|
07:42
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
07:56
🔗
|
|
schbirid has joined #archiveteam |
|
08:04
🔗
|
|
atomotic has joined #archiveteam |
|
08:19
🔗
|
|
signius_ has quit IRC (Ping timeout: 252 seconds) |
|
08:37
🔗
|
|
signius has joined #archiveteam |
|
09:12
🔗
|
|
primus104 has joined #archiveteam |
|
10:33
🔗
|
Deewiant |
ploopkazo: https://github.com/lahwaacz/arch-wiki-docs and http://kmkeen.com/arch-wiki-lite/ are available as Arch packages, at least |
|
10:35
🔗
|
ploopkazo |
Deewiant: no xml/zim dumps though? |
|
11:00
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
|
11:06
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
11:07
🔗
|
Deewiant |
ploopkazo: No idea, and no idea what could be missing from those |
|
11:10
🔗
|
|
Ungstein has quit IRC (Quit: Leaving.) |
|
11:15
🔗
|
|
Ungstein has joined #archiveteam |
|
11:21
🔗
|
|
ete has joined #archiveteam |
|
11:30
🔗
|
|
xk_id has joined #archiveteam |
|
11:33
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
11:47
🔗
|
|
Froggypwn has quit IRC (Quit: ~ Trillian Astra - www.trillian.im ~) |
|
11:52
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
11:57
🔗
|
|
xk_id has joined #archiveteam |
|
12:06
🔗
|
|
atomotic has joined #archiveteam |
|
12:27
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
12:29
🔗
|
|
xk_id has joined #archiveteam |
|
12:49
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
12:51
🔗
|
|
xk_id has joined #archiveteam |
|
13:01
🔗
|
|
zenguy_pc has quit IRC (Read error: Operation timed out) |
|
13:02
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
13:04
🔗
|
|
xk_id has joined #archiveteam |
|
13:07
🔗
|
|
mahavira has joined #archiveteam |
|
13:08
🔗
|
mahavira |
http://www.archiveteam.org/index.php?title=Blogger Why doesn't this article mention ./sitemap.xml? |
|
13:09
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
13:09
🔗
|
mahavira |
whatever.blogspot.com/sitemap.xml will more often than not provide direct links to every post. |
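Editor's sketch of the sitemap idea mahavira describes: pull every `<loc>` entry out of a sitemap document with the Python standard library. The function name `extract_locs` and the sample URLs are hypothetical, and the assumption is that the file follows the public sitemaps.org 0.9 schema; a sitemap index (which points at further sitemap files) is reported separately so the caller can fetch those next.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema (assumed to be what the blog serves).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_text):
    """Return (kind, urls) for a sitemap document.

    kind is "index" when the root element is <sitemapindex> (the URLs then
    point at further sitemap files), otherwise "urlset" (the URLs are pages).
    """
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    urls = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return kind, urls


# Hypothetical sample in the shape a blog sitemap usually has.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://whatever.blogspot.com/2015/01/first-post.html</loc></url>
  <url><loc>http://whatever.blogspot.com/2015/02/second-post.html</loc></url>
</urlset>"""

kind, urls = extract_locs(sample)
for url in urls:
    print(url)
```

Parsing the file locally means only one request per sitemap is needed before the list of post URLs is known.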
|
13:11
🔗
|
|
xk_id has joined #archiveteam |
|
13:11
🔗
|
|
Froggypwn has joined #archiveteam |
|
13:13
🔗
|
|
brayden has quit IRC (Quit: Leaving) |
|
13:23
🔗
|
|
primus104 has joined #archiveteam |
|
13:27
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
13:28
🔗
|
|
xk_id has joined #archiveteam |
|
13:28
🔗
|
|
zenguy_pc has joined #archiveteam |
|
13:30
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
13:33
🔗
|
|
xk_id has joined #archiveteam |
|
13:34
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
13:43
🔗
|
|
xk_id has joined #archiveteam |
|
13:47
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
13:56
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
|
14:01
🔗
|
|
Sk1d has quit IRC (Quit: ZNC - http://znc.in) |
|
14:02
🔗
|
|
PurpleSym has joined #archiveteam |
|
14:07
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
|
14:41
🔗
|
|
Start has joined #archiveteam |
|
15:28
🔗
|
yipdw |
mahavira: in most cases like that it's usually just because nobody wrote it down |
|
15:28
🔗
|
yipdw |
FWIW, wpull uses sitemap.xml if it is available |
|
15:28
🔗
|
yipdw |
so depending on the machinery it is not necessary to explicitly mention sitemap availability |
|
15:29
🔗
|
yipdw |
that said, feel free to add that information |
|
15:30
🔗
|
mahavira |
I'm trying to lynx -dump urls from ./sitemap.xml but google blocks me after a couple requests. I'm about to start firefox and use extensions to do the job. Any ideas? |
|
15:31
🔗
|
yipdw |
slowing down can help |
|
15:31
🔗
|
mahavira |
wget --wait=15 --random-wait didn't work. What do you think would work? |
|
15:31
🔗
|
yipdw |
user-agent tricks, Accept header stuff |
|
15:31
🔗
|
mahavira |
Been there, done that |
|
15:32
🔗
|
yipdw |
there's a body of successes at http://archive.fart.website/archivebot/viewer/?q=blogspot |
|
15:32
🔗
|
yipdw |
and there's a bunch of stuff that a server can glom onto, so if you started out too aggressive then further tweaks can become irrelevant |
|
15:32
🔗
|
yipdw |
try another machine, etc. |
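Editor's sketch of the "slowing down" advice: retry with a capped, jittered exponential delay instead of a fixed wait. Everything here is an assumption for illustration, including the helper names `backoff_delay` and `fetch_slowly`, the 15-second base, the status codes treated as "blocked", and the User-Agent string. Nothing in the log says this is enough to avoid a block.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delay(attempt, base=15.0, cap=900.0, rng=random):
    """Seconds to wait before retry number `attempt` (0-based).

    The ceiling doubles on every attempt up to `cap`; the actual delay is
    drawn uniformly from the upper half of that ceiling so requests do not
    land on a fixed rhythm.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(ceiling / 2, ceiling)


def fetch_slowly(url, max_attempts=5, user_agent="example-fetcher/0.1"):
    """Fetch `url`, backing off when the server answers 403, 429 or 503.

    Returns the response body as bytes, or None if every attempt was refused.
    The User-Agent value is a placeholder, not a recommendation.
    """
    for attempt in range(max_attempts):
        request = urllib.request.Request(url, headers={"User-Agent": user_agent})
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            # Anything other than a "you are blocked / slow down" status is re-raised.
            if error.code not in (403, 429, 503):
                raise
        if attempt < max_attempts - 1:
            time.sleep(backoff_delay(attempt))
    return None
```

As yipdw notes, a server that has already flagged a client may keep refusing it regardless of later tweaks, so a sketch like this is most useful from the first request onward.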
|
15:33
🔗
|
mahavira |
Another machine is not an option, I guess I'll just have to wait for a few hours before trying to do anything again. |
|
15:34
🔗
|
|
xk_id has joined #archiveteam |
|
15:34
🔗
|
mahavira |
I'm trying to download all blogs from X location (blogspot profiles are tagged by location), so that I can build a linguistic corpus out of them. |
|
15:34
🔗
|
|
JesseW has joined #archiveteam |
|
15:42
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
15:51
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
15:53
🔗
|
|
xk_id has joined #archiveteam |
|
15:59
🔗
|
|
antomati_ has joined #archiveteam |
|
16:01
🔗
|
|
Dark_Star has joined #archiveteam |
|
16:02
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
|
16:03
🔗
|
|
xk_id has quit IRC (Ping timeout: 600 seconds) |
|
16:05
🔗
|
|
edsu_ is now known as edsu |
|
16:06
🔗
|
|
antomatic has quit IRC (Ping timeout: 492 seconds) |
|
16:07
🔗
|
|
JesseW has quit IRC (Read error: Operation timed out) |
|
16:26
🔗
|
|
primus104 has joined #archiveteam |
|
16:40
🔗
|
|
Start has joined #archiveteam |
|
16:44
🔗
|
|
atomotic has joined #archiveteam |
|
16:49
🔗
|
|
mahavira has quit IRC (Ping timeout: 483 seconds) |
|
16:50
🔗
|
|
xk_id has joined #archiveteam |
|
16:54
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
|
16:55
🔗
|
|
philpem has joined #archiveteam |
|
17:00
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
17:33
🔗
|
|
primus104 has joined #archiveteam |
|
17:37
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
|
17:45
🔗
|
|
scyther has joined #archiveteam |
|
17:59
🔗
|
|
superkuh has quit IRC (Remote host closed the connection) |
|
18:05
🔗
|
|
Spiders has joined #archiveteam |
|
18:16
🔗
|
|
aaaaaaaaa has joined #archiveteam |
|
18:18
🔗
|
Spiders |
looks like archive.moe is having some problems. |
|
18:19
🔗
|
|
db48x has quit IRC (Read error: Connection reset by peer) |
|
18:23
🔗
|
|
Spiders has quit IRC (Quit: Page closed) |
|
18:33
🔗
|
|
Fusl has quit IRC (Ping timeout: 600 seconds) |
|
18:40
🔗
|
|
Fusl has joined #archiveteam |
|
19:07
🔗
|
|
phuzion has quit IRC (Remote host closed the connection) |
|
19:10
🔗
|
|
phuzion has joined #archiveteam |
|
19:53
🔗
|
|
Start has joined #archiveteam |
|
19:58
🔗
|
|
Start has quit IRC (Client Quit) |
|
20:03
🔗
|
|
phuzion has quit IRC (Remote host closed the connection) |
|
20:05
🔗
|
|
phuzion has joined #archiveteam |
|
20:05
🔗
|
|
schbirid has quit IRC (Remote host closed the connection) |
|
20:30
🔗
|
|
scyther has quit IRC (Quit: Leaving) |
|
20:47
🔗
|
|
xk_id has quit IRC (Remote host closed the connection) |
|
20:48
🔗
|
|
xk_id has joined #archiveteam |
|
20:50
🔗
|
|
xk_id_ has joined #archiveteam |
|
20:52
🔗
|
|
xk_id has quit IRC (Ping timeout: 252 seconds) |
|
20:54
🔗
|
|
PurpleSym has quit IRC (Remote host closed the connection) |
|
21:04
🔗
|
|
chfoo has quit IRC (Quit: chfoo) |
|
21:09
🔗
|
|
Start has joined #archiveteam |
|
21:12
🔗
|
|
chfoo has joined #archiveteam |
|
21:15
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
|
21:19
🔗
|
|
HarryCros is now known as HCross |
|
21:22
🔗
|
|
chfoo has quit IRC (Quit: chfoo) |
|
21:23
🔗
|
|
xk_id_ has quit IRC (Remote host closed the connection) |
|
21:26
🔗
|
|
chfoo has joined #archiveteam |
|
21:53
🔗
|
arkiver |
SketchCow: how's FOS on space? |
|
21:55
🔗
|
arkiver |
blingee is restarted, the grab of rutracker and thingiverse is running and comcast is almost finished |
|
21:56
🔗
|
arkiver |
thingiverse has a lot of tiny files due to the fast grab we did |
|
22:00
🔗
|
|
BlueMaxim has joined #archiveteam |
|
22:09
🔗
|
|
Boltsie has quit IRC (Remote host closed the connection) |
|
22:09
🔗
|
|
JSharp has quit IRC (Remote host closed the connection) |
|
22:09
🔗
|
|
russss__ has quit IRC (Write error: Broken pipe) |
|
22:09
🔗
|
|
zyphlar has quit IRC (Write error: Broken pipe) |
|
22:10
🔗
|
|
nox has quit IRC (Read error: Operation timed out) |
|
22:11
🔗
|
|
nox has joined #archiveteam |
|
22:22
🔗
|
|
_desu_ has quit IRC (Remote host closed the connection) |
|
22:22
🔗
|
|
filippo__ has quit IRC (Remote host closed the connection) |
|
22:22
🔗
|
|
Ctrl-S has quit IRC (Read error: Connection reset by peer) |
|
22:22
🔗
|
|
antonizoo has quit IRC (Remote host closed the connection) |
|
22:23
🔗
|
|
russss__ has joined #archiveteam |
|
22:26
🔗
|
|
JSharp has joined #archiveteam |
|
22:32
🔗
|
|
Start has joined #archiveteam |
|
22:41
🔗
|
|
zyphlar has joined #archiveteam |
|
23:06
🔗
|
|
xk_id has joined #archiveteam |
|
23:20
🔗
|
SketchCow |
FOS has 4tb free |
|
23:22
🔗
|
|
Boltsie has joined #archiveteam |
|
23:47
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
23:48
🔗
|
|
Fusl has quit IRC (Ping timeout: 255 seconds) |
|
23:55
🔗
|
|
filippo__ has joined #archiveteam |
|
23:55
🔗
|
|
mksplg has quit IRC (Ping timeout: 506 seconds) |
|
23:58
🔗
|
|
_desu_ has joined #archiveteam |