| Time |
Nickname |
Message |
|
00:11
🔗
|
|
Ymgve has quit IRC () |
|
00:30
🔗
|
|
LordNigh2 has joined #archiveteam |
|
00:38
🔗
|
|
Lord_Nigh has quit IRC (Ping timeout: 600 seconds) |
|
00:38
🔗
|
|
LordNigh2 is now known as Lord_Nigh |
|
01:22
🔗
|
|
mutoso has joined #archiveteam |
|
01:39
🔗
|
|
cf has joined #archiveteam |
|
01:41
🔗
|
|
ete_ has joined #archiveteam |
|
01:48
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
01:48
🔗
|
|
arkhive has joined #archiveteam |
|
01:49
🔗
|
|
the_fox has quit IRC (Ping timeout: 335 seconds) |
|
01:49
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
01:50
🔗
|
|
the_fox has joined #archiveteam |
|
02:13
🔗
|
|
Aranje has quit IRC (Read error: Operation timed out) |
|
02:16
🔗
|
|
philpem has quit IRC (Ping timeout: 272 seconds) |
|
02:24
🔗
|
|
Aranje has joined #archiveteam |
|
02:37
🔗
|
|
REiN^ has quit IRC () |
|
02:37
🔗
|
|
REiN^ has joined #archiveteam |
|
02:56
🔗
|
|
signius_ has quit IRC (Ping timeout: 258 seconds) |
|
02:57
🔗
|
|
ete_ has quit IRC (Remote host closed the connection) |
|
03:09
🔗
|
|
mistym has joined #archiveteam |
|
03:09
🔗
|
|
signius_ has joined #archiveteam |
|
03:28
🔗
|
|
rejon has joined #archiveteam |
|
04:17
🔗
|
|
ex-parro1 has quit IRC (Leaving.) |
|
04:28
🔗
|
|
ruukasu has quit IRC (Quit: WeeChat 1.0.1) |
|
04:28
🔗
|
|
ruukasu has joined #archiveteam |
|
04:29
🔗
|
|
ruukasu has quit IRC (Client Quit) |
|
04:29
🔗
|
|
ruukasu has joined #archiveteam |
|
04:50
🔗
|
|
BlueMaxim has joined #archiveteam |
|
04:55
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
|
05:03
🔗
|
|
todrobbin has joined #archiveteam |
|
05:09
🔗
|
|
ruukasu has quit IRC (Quit: WeeChat 1.0.1) |
|
05:10
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
05:10
🔗
|
|
mistym has joined #archiveteam |
|
05:17
🔗
|
|
ruukasu has joined #archiveteam |
|
05:33
🔗
|
|
Start is now known as StartAway |
|
05:34
🔗
|
|
antomati_ has joined #archiveteam |
|
05:36
🔗
|
|
antomatic has quit IRC (Ping timeout: 633 seconds) |
|
05:45
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
05:48
🔗
|
|
todrobbin has quit IRC (todrobbin) |
|
05:50
🔗
|
|
todrobbin has joined #archiveteam |
|
05:56
🔗
|
|
todrobbin has quit IRC (Quit: todrobbin) |
|
05:59
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
06:06
🔗
|
|
dashcloud has joined #archiveteam |
|
06:11
🔗
|
|
BiggieJo1 has joined #archiveteam |
|
06:15
🔗
|
|
BiggieJon has quit IRC (Read error: Operation timed out) |
|
07:24
🔗
|
|
ZorbaTHut has quit IRC (Read error: Connection reset by peer) |
|
07:25
🔗
|
|
ZorbaTHut has joined #archiveteam |
|
07:37
🔗
|
midas |
SketchCow: do you have a collection ready for Viddy? |
|
07:50
🔗
|
|
primus104 has joined #archiveteam |
|
07:57
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
08:04
🔗
|
|
dashcloud has joined #archiveteam |
|
08:05
🔗
|
|
ex-parrot has quit IRC (Read error: Operation timed out) |
|
08:06
🔗
|
|
ex-parrot has joined #archiveteam |
|
08:07
🔗
|
|
APerti has quit IRC (Read error: Operation timed out) |
|
08:13
🔗
|
|
APerti has joined #archiveteam |
|
08:18
🔗
|
SketchCow |
Yes. I need to know the user account on IA to grant admin |
|
08:19
🔗
|
|
mistym has joined #archiveteam |
|
08:29
🔗
|
SketchCow |
Done. archiveteam_viddy is now your victim. |
|
08:30
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
|
08:30
🔗
|
SketchCow |
It has all the proper logo and writing and so on. |
|
08:31
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
08:34
🔗
|
midas |
thanks SketchCow ! |
|
08:54
🔗
|
|
amerrykan has quit IRC (Quit: Quitting) |
|
09:26
🔗
|
|
APerti has quit IRC (Ping timeout: 480 seconds) |
|
09:27
🔗
|
|
amerrykan has joined #archiveteam |
|
09:29
🔗
|
|
primus104 has joined #archiveteam |
|
09:36
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
10:41
🔗
|
|
antomati_ is now known as antomatic |
|
10:48
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
10:55
🔗
|
|
dashcloud has joined #archiveteam |
|
11:26
🔗
|
|
Ymgve has joined #archiveteam |
|
11:38
🔗
|
|
ruukasu has quit IRC (Ping timeout: 265 seconds) |
|
12:21
🔗
|
|
schbirid has joined #archiveteam |
|
12:21
🔗
|
|
Emcy_ has quit IRC (Read error: Connection reset by peer) |
|
12:55
🔗
|
|
cf has quit IRC (cf) |
|
13:11
🔗
|
|
Morbus has quit IRC (Quit: http://www.disobey.com/) |
|
13:14
🔗
|
|
Morbus has joined #archiveteam |
|
13:16
🔗
|
|
ruukasu has joined #archiveteam |
|
13:33
🔗
|
|
useretail has quit IRC (ircd.shaw.ca irc.shaw.ca) |
|
13:33
🔗
|
|
rduser has quit IRC (ircd.shaw.ca irc.shaw.ca) |
|
13:33
🔗
|
|
Jogie has quit IRC (ircd.shaw.ca irc.shaw.ca) |
|
13:33
🔗
|
|
w0rp has quit IRC (ircd.shaw.ca irc.shaw.ca) |
|
13:33
🔗
|
|
SadDM has quit IRC (ircd.shaw.ca irc.shaw.ca) |
|
13:33
🔗
|
|
Sellyme has quit IRC (ircd.shaw.ca irc.shaw.ca) |
|
13:33
🔗
|
|
w0rp_ has joined #archiveteam |
|
13:34
🔗
|
|
sankin has joined #archiveteam |
|
13:34
🔗
|
|
Sellyme has joined #archiveteam |
|
13:34
🔗
|
|
SadDM has joined #archiveteam |
|
13:35
🔗
|
|
rduser has joined #archiveteam |
|
13:42
🔗
|
|
primus104 has joined #archiveteam |
|
13:48
🔗
|
|
w0rp_ is now known as w0rp |
|
13:49
🔗
|
|
sankin has quit IRC (Leaving.) |
|
13:49
🔗
|
|
useretail has joined #archiveteam |
|
14:00
🔗
|
|
sankin has joined #archiveteam |
|
14:02
🔗
|
|
ruukasu has quit IRC (Quit: WeeChat 1.0.1) |
|
14:07
🔗
|
|
ruukasu has joined #archiveteam |
|
14:22
🔗
|
|
ruukasuu has joined #archiveteam |
|
14:22
🔗
|
|
ruukasu has quit IRC (Ping timeout: 265 seconds) |
|
14:23
🔗
|
|
ruukasuu has quit IRC (Client Quit) |
|
14:37
🔗
|
|
REiN^ has quit IRC () |
|
14:38
🔗
|
|
REiN^ has joined #archiveteam |
|
14:57
🔗
|
|
BiggieJo1 is now known as BiggieJon |
|
15:19
🔗
|
|
StartAway is now known as Start |
|
15:24
🔗
|
|
BiggieJon has left |
|
15:26
🔗
|
|
cf has joined #archiveteam |
|
15:34
🔗
|
|
mistym has joined #archiveteam |
|
15:34
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
15:39
🔗
|
|
BiggieJon has joined #archiveteam |
|
15:43
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
15:44
🔗
|
|
Start has quit IRC (Remote host closed the connection) |
|
15:55
🔗
|
|
mistym has joined #archiveteam |
|
15:56
🔗
|
|
aaaaaaaaa has joined #archiveteam |
|
16:00
🔗
|
schbirid |
privat.t-online.de has a lot of personal homepages, no idea how to discover them all though |
|
16:00
🔗
|
midas |
google site:privat.t-online.de ? |
|
16:02
🔗
|
schbirid |
yeah but google does not let one paginate anymore after ~25 or something |
|
16:03
🔗
|
arkiver |
Google will give you the number of found links, like 1 million, but will only allow you to view 1000 |
|
16:16
🔗
|
|
thechip has quit IRC (Read error: Connection reset by peer) |
|
16:18
🔗
|
|
Emcy has joined #archiveteam |
|
16:23
🔗
|
|
chipper_ has joined #archiveteam |
|
16:24
🔗
|
|
chipper_ has left |
|
16:34
🔗
|
SadDM |
SketchCow: can you move https://archive.org/details/DarkHorseComicsMessageBoards-FinalGrab into the archive team collection when you have a moment? |
|
16:37
🔗
|
SketchCow |
Done |
|
16:50
🔗
|
|
ruukasu has joined #archiveteam |
|
16:55
🔗
|
SketchCow |
Tripod.com is going down |
|
16:56
🔗
|
arkiver |
tripod.com |
|
16:56
🔗
|
SketchCow |
Maybe |
|
16:58
🔗
|
xmc |
! ? |
|
16:59
🔗
|
|
Start_ has joined #archiveteam |
|
16:59
🔗
|
arkiver |
Sites aren't hard to save, problem is the discovery of the sites that exist. http://196thovi.tripod.com/ |
|
16:59
🔗
|
xmc |
somewhere between 25-jun-2014 and today they got rid of <http://team-blog.tripod.com/>, but still link it from the front page |
|
17:00
🔗
|
xmc |
http://web.archive.org/web/20140625035208/http://team-blog.tripod.com/ |
|
17:00
🔗
|
arkiver |
we have various sources (wayback, google, etc.) |
|
17:00
🔗
|
arkiver |
but those will most likely not get everything. The wayback just doesn't have all websites and google only shows the first 1000 results |
|
17:01
🔗
|
DFJustin |
there's also searching wayback for the old url, http://members.tripod.com/* |
|
17:01
🔗
|
chfoo |
http://urlsearch.commoncrawl.org/?q=tripod.com |
|
17:01
🔗
|
arkiver |
yeah, I mentioned that |
|
17:02
🔗
|
arkiver |
I mean the wayback |
|
17:02
🔗
|
arkiver |
not the commoncrawl yet |
|
17:02
🔗
|
arkiver |
SketchCow: if you are not able to get a full list of websites some way (they might have some hidden index on their site?), would you like to contact them about this? |
|
17:02
🔗
|
chfoo |
we can do a discovery scraping google/bing with a dictionary if that's needed |
|
17:03
🔗
|
arkiver |
a dictionary on google? |
|
17:03
🔗
|
chfoo |
a word list i mean |
|
17:04
🔗
|
arkiver |
Like: site:*.tripod.com *aaa* |
|
17:04
🔗
|
arkiver |
site:*.tripod.com *the* etc.? |
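
The word-list idea discussed above could be sketched roughly as follows. This is only an illustration, not a script anyone in the channel ran: `build_queries` and `extract_sites` are hypothetical helper names, the query shape is taken from arkiver's `site:*.tripod.com` example, and actually submitting the queries to a search engine is left out.

```python
from urllib.parse import urlparse


def build_queries(words, domain="tripod.com"):
    """Return one site-restricted search query per word in the word list."""
    return [f"site:*.{domain} {word}" for word in words]


def extract_sites(result_urls, domain="tripod.com"):
    """Collect the distinct hosts under the domain seen in search result URLs."""
    suffix = "." + domain
    sites = set()
    for url in result_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.endswith(suffix):
            sites.add(host)
    return sorted(sites)


if __name__ == "__main__":
    print(build_queries(["aaa", "the"]))
    print(extract_sites(["http://196thovi.tripod.com/", "http://example.com/"]))
```

Because each query is capped at roughly 1000 visible results, the point of a long word list is that every word surfaces a different slice of sites, and the union of the extracted hosts grows with each query.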
|
17:05
🔗
|
DFJustin |
hmm it's not accepting my tripod password |
|
17:06
🔗
|
Start_ |
should we start a project for http://ep1c.com? |
|
17:06
🔗
|
Start_ |
it's also owned by viddy and shutting down on the same date (dec. 15) |
|
17:07
🔗
|
arkiver |
Start_: yep, I saw your posts about it (sorry for not responding) |
|
17:07
🔗
|
DFJustin |
reset works though |
|
17:09
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
17:10
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
17:15
🔗
|
|
dashcloud has joined #archiveteam |
|
17:22
🔗
|
|
ruukasu has quit IRC (Ping timeout: 265 seconds) |
|
17:23
🔗
|
schbirid |
not tripod :(( |
|
17:24
🔗
|
schbirid |
http://members.tripod.com/robots.txt has sitemaps |
|
17:24
🔗
|
|
Start_ is now known as Start |
|
17:25
🔗
|
schbirid |
whoever does this, please grab angelfire in the same go. same sitemap structure |
|
17:25
🔗
|
schbirid |
also please educate me how you do it, because i got stuck with angelfire and got no help |
|
17:26
🔗
|
arkiver |
thanks for those sitemaps |
|
17:26
🔗
|
arkiver |
I'll create a discovery project which will find all the sites using those sitemaps |
|
17:26
🔗
|
schbirid |
all URLs are inside the sitemaps |
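
A minimal sketch of reading those sitemaps, assuming they follow the standard sitemaps.org XML schema (not verified here against Tripod's actual files). `parse_sitemap` is a hypothetical helper; fetching the files listed in robots.txt is left to the caller.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol (assumed to apply here).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap(xml_text):
    """Return (kind, locs) where kind is 'index' or 'urlset'.

    An 'index' lists further sitemap files; a 'urlset' lists page URLs.
    """
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return kind, locs


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://members.tripod.com/a1modularhomes/index.html</loc></url>"
        "</urlset>"
    )
    print(parse_sitemap(sample))
```

A crawler would call this recursively: every `loc` from an index is fetched and parsed again, and every `loc` from a urlset goes onto the list of pages to grab.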
|
17:27
🔗
|
arkiver |
yes |
|
17:28
🔗
|
schbirid |
just not the media embedded in those sites, that was my problem |
|
17:29
🔗
|
arkiver |
schbirid: do you have an example for me? |
|
17:29
🔗
|
arkiver |
and were you using wget lua? |
|
17:29
🔗
|
schbirid |
http://members.tripod.com/a1modularhomes/sitemap.xml random |
|
17:29
🔗
|
schbirid |
nope |
|
17:29
🔗
|
schbirid |
i gave up because i would have made a mess |
|
17:29
🔗
|
|
lbft has quit IRC (Read error: Operation timed out) |
|
17:30
🔗
|
arkiver |
do you mean by "the media embedded in those sites" external pictures and videos? |
|
17:30
🔗
|
schbirid |
the sitemaps only have html pages |
|
17:30
🔗
|
schbirid |
so any images etc need to be found |
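
Since the sitemaps list only HTML pages, the embedded files have to be pulled out of each page. The sketch below shows one way to do that with Python's standard library. It is not the wget lua approach mentioned in the channel, and the set of tags it looks at is an assumption about what old homepage markup typically contains.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MediaCollector(HTMLParser):
    """Collect absolute URLs of resources embedded in one HTML page."""

    # Tag -> attribute that points at an embedded resource (assumed list).
    ATTRS = {
        "img": "src",
        "embed": "src",
        "source": "src",
        "script": "src",
        "link": "href",
        "body": "background",
    }

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                absolute = urljoin(self.base_url, value)
                if absolute not in self.urls:
                    self.urls.append(absolute)


def find_media(html_text, base_url):
    """Return embedded resource URLs in document order, without duplicates."""
    collector = MediaCollector(base_url)
    collector.feed(html_text)
    return collector.urls
```

Ordinary `<a href>` links are deliberately ignored, because the sitemaps already cover the pages themselves; only the files a page needs in order to render are collected.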
|
17:30
🔗
|
arkiver |
I see what you mean now, sorry |
|
17:30
🔗
|
schbirid |
:) |
|
17:30
🔗
|
arkiver |
Yeah, I'll get those done by wget lua |
|
17:31
🔗
|
aaaaaaaaa |
maybe there should be a tripod channel |
|
17:31
🔗
|
arkiver |
Maybe it's not going to be a discovery project btw |
|
17:31
🔗
|
arkiver |
but we'll see |
|
17:32
🔗
|
garyrh |
#wobbly ? |
|
17:33
🔗
|
arkiver |
SketchCow: do we have the shutdown date? |
|
17:33
🔗
|
|
lbft has joined #archiveteam |
|
17:34
🔗
|
SketchCow |
No, and there's a chance this tip may have just come from someone finding what you did - the site seems really on the rack, blog no longer works, etc. |
|
17:34
🔗
|
|
mistym has joined #archiveteam |
|
17:38
🔗
|
aaaaaaaaa |
#byepod is what I was thinking |
|
17:43
🔗
|
|
philpem has joined #archiveteam |
|
17:50
🔗
|
|
Start has quit IRC (Ping timeout: 272 seconds) |
|
18:05
🔗
|
|
Jogie has joined #archiveteam |
|
18:09
🔗
|
|
APerti has joined #archiveteam |
|
18:10
🔗
|
|
rejon has quit IRC (Ping timeout: 480 seconds) |
|
18:35
🔗
|
|
cf_ has joined #archiveteam |
|
18:36
🔗
|
|
cf has quit IRC (Ping timeout: 246 seconds) |
|
18:36
🔗
|
|
cf_ is now known as cf |
|
18:39
🔗
|
|
primus104 has joined #archiveteam |
|
18:43
🔗
|
|
thechip has joined #archiveteam |
|
18:56
🔗
|
|
thechip has quit IRC (Read error: Operation timed out) |
|
19:03
🔗
|
arkiver |
SketchCow: ok if I wait till there is more information on the shutdown before I get the scripts ready? |
|
19:04
🔗
|
|
Sk2d has joined #archiveteam |
|
19:04
🔗
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
19:04
🔗
|
|
Sk2d is now known as Sk1d |
|
19:11
🔗
|
|
Sk1d has quit IRC (Ping timeout: 265 seconds) |
|
19:11
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
19:14
🔗
|
|
Sk1d has joined #archiveteam |
|
19:21
🔗
|
|
dashcloud has joined #archiveteam |
|
19:23
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
19:30
🔗
|
|
dashcloud has quit IRC (Remote host closed the connection) |
|
19:31
🔗
|
|
dashcloud has joined #archiveteam |
|
19:32
🔗
|
arkiver |
midas: http://dat.serveert.me.uk/p/ftp |
|
19:32
🔗
|
arkiver |
is currently down :/ |
|
19:34
🔗
|
SketchCow |
Yes, please do. |
|
19:35
🔗
|
arkiver |
ok |
|
19:42
🔗
|
|
Start has joined #archiveteam |
|
19:43
🔗
|
|
ruukasu has joined #archiveteam |
|
19:46
🔗
|
|
bauruine has quit IRC (Ping timeout: 265 seconds) |
|
19:48
🔗
|
|
philpem has quit IRC (Ping timeout: 272 seconds) |
|
19:51
🔗
|
|
bauruine has joined #archiveteam |
|
20:02
🔗
|
Start |
https://roon.io |
|
20:02
🔗
|
Start |
http://blog.ghost.org/roon/ |
|
20:02
🔗
|
Start |
"The Roon.io hosted platform will be closing its doors on December 31st, 2014." |
|
20:03
🔗
|
|
Kniffy has quit IRC (Quit: pup) |
|
20:03
🔗
|
|
thechip has joined #archiveteam |
|
20:05
🔗
|
|
Kniffy has joined #archiveteam |
|
20:08
🔗
|
|
ruukasu has quit IRC (Ping timeout: 265 seconds) |
|
20:16
🔗
|
Start |
here's a google crawl for roon: http://paste.archivingyoursh.it/goxowihalo.avrasm |
|
20:19
🔗
|
|
SN4T14 has quit IRC (Ping timeout: 369 seconds) |
|
20:19
🔗
|
Start |
looks like roon can be sequentially scraped through its api: https://roon.io/developer/blogs |
|
20:28
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
|
20:32
🔗
|
|
Start has joined #archiveteam |
|
20:34
🔗
|
|
primus104 has joined #archiveteam |
|
20:37
🔗
|
arkiver |
SketchCow: a fast and small project is starting very soon: ziplist |
|
20:37
🔗
|
arkiver |
#zipyourlips |
|
20:37
🔗
|
|
ex-parro1 has joined #archiveteam |
|
20:37
🔗
|
arkiver |
That one is going to FOS, currently 30.000 warc's |
|
20:42
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
20:43
🔗
|
Start |
cf: since you've been doing API scrapes for a couple recent projects, mind doing one for roon? |
|
20:44
🔗
|
Start |
http://archiveteam.org/index.php?title=Roon |
|
20:44
🔗
|
cf |
Start: I’ll have a go at it. Not sure when I’ll get around to it, but within a week or so |
|
20:45
🔗
|
arkiver |
Start: are those api's just incrementally numbered? |
|
20:45
🔗
|
Start |
yes |
|
20:45
🔗
|
|
dashcloud has joined #archiveteam |
|
20:45
🔗
|
arkiver |
then I'll do them in the scripts, we also save the api urls that way |
|
20:45
🔗
|
Start |
ok |
|
20:45
🔗
|
cf |
Yea, just about to say |
|
20:46
🔗
|
Start |
we need an irc channel name for roon |
|
20:46
🔗
|
Start |
#rooin |
|
20:48
🔗
|
Start |
or maybe #rooined |
|
20:49
🔗
|
Start |
i like rooined better |
|
21:01
🔗
|
|
T31M has quit IRC (Quit: Leaving) |
|
21:09
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
|
21:09
🔗
|
|
aaaaaaaaa has joined #archiveteam |
|
21:25
🔗
|
|
cf has quit IRC (Ping timeout: 265 seconds) |
|
21:26
🔗
|
|
Start_ has joined #archiveteam |
|
21:27
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
|
21:28
🔗
|
midas |
arkiver: ill fix it in a minute |
|
21:36
🔗
|
midas |
fixed |
|
21:36
🔗
|
midas |
forgot it rebooted this box |
|
21:36
🔗
|
|
bauruine has quit IRC (Ping timeout: 265 seconds) |
|
21:36
🔗
|
|
K4k has joined #archiveteam |
|
21:38
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
|
21:42
🔗
|
|
bauruine has joined #archiveteam |
|
21:54
🔗
|
arkiver |
thanks midas |
|
21:56
🔗
|
|
SN4T14 has joined #archiveteam |
|
21:57
🔗
|
|
sankin has quit IRC (Leaving.) |
|
22:00
🔗
|
|
hive-mind has quit IRC (Ping timeout: 272 seconds) |
|
22:07
🔗
|
|
ruukasu has joined #archiveteam |
|
22:09
🔗
|
|
cbb has joined #archiveteam |
|
22:11
🔗
|
|
thechip has quit IRC (Quit: Leaving...) |
|
22:12
🔗
|
|
hive-mind has joined #archiveteam |
|
22:14
🔗
|
|
Start_ is now known as Start |
|
22:17
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
22:20
🔗
|
|
dashcloud has joined #archiveteam |
|
22:22
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
|
22:24
🔗
|
arkiver |
SketchCow: ziplist should be incoming on FOS now, 30.000 warc's |
|
22:24
🔗
|
|
Start has joined #archiveteam |
|
22:25
🔗
|
|
K4k has quit IRC (Ping timeout: 378 seconds) |
|
22:25
🔗
|
|
schbirid has quit IRC (Leaving) |
|
22:27
🔗
|
|
cf has joined #archiveteam |
|
22:40
🔗
|
|
REiN^ has quit IRC (Read error: Connection reset by peer) |
|
22:41
🔗
|
|
REiN^ has joined #archiveteam |
|
22:46
🔗
|
Start |
arkiver: the highest valid roon blog i could find was: https://roon.io/api/v1/blogs/122233 |
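
With incrementally numbered blogs and a known upper bound, the list of API URLs to grab can be generated directly. A rough sketch: the URL pattern and the ID 122233 come from the message above, while starting at ID 1 is an assumption, and `roon_blog_api_urls` is a hypothetical name.

```python
def roon_blog_api_urls(first_id=1, last_id=122233):
    """Yield one blog API URL per ID in the inclusive range."""
    for blog_id in range(first_id, last_id + 1):
        yield f"https://roon.io/api/v1/blogs/{blog_id}"


if __name__ == "__main__":
    # Print only the last two URLs as a quick sanity check.
    for url in roon_blog_api_urls(122232, 122233):
        print(url)
```

Some IDs in the range may well be missing or deleted, so whatever consumes this list should treat a failed lookup as a normal outcome rather than an error.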
|
22:47
🔗
|
|
REiN^ has quit IRC (Read error: Connection reset by peer) |
|
22:47
🔗
|
arkiver |
Start: thanks, but first ep1c |
|
22:47
🔗
|
arkiver |
Tomorrow is ep1c day |
|
22:47
🔗
|
Start |
ok |
|
22:48
🔗
|
Start |
i'm guessing that ep1c's grab scripts will be very similar to viddy's? |
|
22:48
🔗
|
|
REiN^ has joined #archiveteam |
|
22:49
🔗
|
|
signius_ has quit IRC (Read error: Operation timed out) |
|
22:50
🔗
|
arkiver |
probably, but I'll see that tomorrow |
|
23:02
🔗
|
|
signius_ has joined #archiveteam |
|
23:07
🔗
|
|
ex-parro1 has quit IRC (Remote host closed the connection) |
|
23:07
🔗
|
dashcloud |
so tripod really is going down? |
|
23:08
🔗
|
|
Start has quit IRC (Ping timeout: 378 seconds) |
|
23:08
🔗
|
garyrh |
maaaybe |
|
23:08
🔗
|
dashcloud |
might as well grab angelfire while we're at it- you'd then have the three big players from the 90s |
|
23:10
🔗
|
xmc |
does lycos still host homepages? |
|
23:11
🔗
|
|
REiN^ has quit IRC (Read error: Operation timed out) |
|
23:15
🔗
|
dashcloud |
yeah- there's classic 90s Angelfire: http://www.angelfire.com/sd/ScrewAOL/ and the modern Angelfire: http://www.angelfire.lycos.com/ |
|
23:17
🔗
|
|
ex-parro1 has joined #archiveteam |
|
23:26
🔗
|
dashcloud |
so modern angelfire is probably not too hard to archive because there's sitemaps listing the pages: http://www.angelfire.com/sitemap-index-00.xml.gz |
|
23:26
🔗
|
dashcloud |
I started that project but when wget got killed because of memkiller, I stopped |
|
23:50
🔗
|
|
cf has quit IRC (Quit: cf) |
|
23:57
🔗
|
godane |
so some of the KBS News Today i got are incomplete |
|
23:58
🔗
|
godane |
doing a 2nd rtmpdump gets me a bigger file |
|
23:58
🔗
|
godane |
i must have been doing too many at once |