Time |
Nickname |
Message |
00:01
π
|
JAA |
If anyone has a bit of time, I'd appreciate help with verifying that my archives of AMO are complete. Come to #outofammo if interested. |
00:04
π
|
|
nertzy has joined #archiveteam |
00:05
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
00:08
π
|
|
Sk1d has joined #archiveteam |
00:13
π
|
|
m007a83 has joined #archiveteam |
00:21
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
00:23
π
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
00:23
π
|
|
Sk1d has joined #archiveteam |
00:45
π
|
|
VerifiedJ has quit IRC (Quit: Leaving) |
00:51
π
|
|
Mateon1 has quit IRC (Ping timeout: 265 seconds) |
00:52
π
|
|
Mateon1 has joined #archiveteam |
00:58
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
01:01
π
|
|
Sk1d has joined #archiveteam |
01:05
π
|
|
ats has quit IRC (Ping timeout: 252 seconds) |
01:15
π
|
Flashfire |
Arkiver FTP needs serious work |
01:15
π
|
arkiver |
yes |
01:15
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
01:15
π
|
Flashfire |
YES |
01:16
π
|
Flashfire |
let's move this to #effteepee |
01:19
π
|
|
Sk1d has joined #archiveteam |
01:31
π
|
|
twoTBHetz has quit IRC (Ping timeout: 260 seconds) |
01:32
π
|
|
Kitaru_ has quit IRC (Quit: This computer has gone to sleep) |
01:35
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
01:39
π
|
|
Sk1d has joined #archiveteam |
01:56
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
01:59
π
|
|
Sk1d has joined #archiveteam |
02:05
π
|
|
Stilett0 has joined #archiveteam |
02:10
π
|
|
Stiletto has quit IRC (Ping timeout: 633 seconds) |
02:15
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
02:20
π
|
|
Sk1d has joined #archiveteam |
02:30
π
|
|
dtm has quit IRC (Read error: Operation timed out) |
02:32
π
|
|
pizzaiolo has quit IRC (west.us.hub irc.Prison.NET) |
02:32
π
|
|
Ryz has quit IRC (west.us.hub irc.Prison.NET) |
02:32
π
|
|
achip has quit IRC (west.us.hub irc.Prison.NET) |
02:33
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
02:37
π
|
|
Sk1d has joined #archiveteam |
02:48
π
|
|
pizzaiolo has joined #archiveteam |
02:50
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
02:54
π
|
|
Sk1d has joined #archiveteam |
02:59
π
|
|
Kitaru_ has joined #archiveteam |
03:05
π
|
|
dtm has joined #archiveteam |
03:05
π
|
|
achip has joined #archiveteam |
03:07
π
|
|
Ryz has joined #archiveteam |
03:32
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
03:36
π
|
|
Sk1d has joined #archiveteam |
03:40
π
|
|
bakJAA_ has joined #archiveteam |
03:40
π
|
|
swebb sets mode: +o bakJAA_ |
03:40
π
|
|
JAA sets mode: +o bakJAA_ |
03:40
π
|
|
bakJAA has quit IRC (Read error: Connection reset by peer) |
03:41
π
|
|
kyounko has quit IRC (Ping timeout: 492 seconds) |
03:41
π
|
|
Mikal_i2p has quit IRC (Ping timeout: 492 seconds) |
03:43
π
|
|
mgrytbak_ has quit IRC (Ping timeout: 492 seconds) |
03:44
π
|
|
Mikal_i2p has joined #archiveteam |
03:47
π
|
|
mgrytbak_ has joined #archiveteam |
03:50
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
03:54
π
|
|
Sk1d has joined #archiveteam |
04:07
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
04:12
π
|
|
Sk1d has joined #archiveteam |
04:26
π
|
|
matthusb_ has joined #archiveteam |
04:28
π
|
|
Martle__ has quit IRC (Ping timeout: 252 seconds) |
04:28
π
|
|
matthusb_ has quit IRC (Remote host closed the connection) |
04:28
π
|
|
matthusby has quit IRC (Read error: Operation timed out) |
04:28
π
|
|
matthusby has joined #archiveteam |
04:43
π
|
|
qw3rty114 has joined #archiveteam |
04:50
π
|
|
qw3rty113 has quit IRC (Read error: Operation timed out) |
04:50
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
04:53
π
|
|
Kitaru_ has quit IRC (Quit: This computer has gone to sleep) |
04:54
π
|
|
Sk1d has joined #archiveteam |
04:57
π
|
|
odemg has quit IRC (Read error: Operation timed out) |
05:06
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
05:10
π
|
|
Sk1d has joined #archiveteam |
05:11
π
|
|
odemg has joined #archiveteam |
05:23
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
05:27
π
|
|
Sk1d has joined #archiveteam |
05:39
π
|
|
Kitaru_ has joined #archiveteam |
05:46
π
|
Lord_Nigh |
is there a project channel for the free music archive thing? |
05:47
π
|
Flashfire |
Dunno |
05:49
π
|
|
pino_p has joined #archiveteam |
05:51
π
|
pino_p |
How many of us have already heard about Free Music Archive going dark? https://www.theverge.com/2018/11/7/18073346/free-music-archive-closing-wfmu-creative-commons-cheyenne-hohman |
05:54
π
|
pino_p |
(checks log) Lord_Nigh, see #musicateam |
05:55
π
|
pino_p |
and https://www.archiveteam.org/index.php?title=Free_Music_Archive |
06:01
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
06:04
π
|
|
nertzy has joined #archiveteam |
06:05
π
|
|
Sk1d has joined #archiveteam |
06:16
π
|
|
pino_p has quit IRC (Quit: Leaving) |
06:17
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
06:22
π
|
|
Sk1d has joined #archiveteam |
06:24
π
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
06:36
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
06:41
π
|
|
Sk1d has joined #archiveteam |
06:53
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
06:58
π
|
|
Sk1d has joined #archiveteam |
07:06
π
|
hiroi |
Is there a way I can download the Geocities (US) data from a few years ago? The torrent is having no speed, not too surprisingly |
07:09
π
|
Nemo_bis |
hiroi: there were seeders as recently as a few months ago |
07:09
π
|
Nemo_bis |
Asking here may be a good way for someone to reseed. |
07:10
π
|
Nemo_bis |
Meanwhile, I remembered to upload the first of the two Twitter "integrity" datasets yesterday https://archive.org/details/twitter-integrity-ira |
07:13
π
|
hiroi |
Okay I have a torrenting client sitting there waiting for someone to feed her. If anyone would be so kind... |
07:43
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
07:48
π
|
|
Sk1d has joined #archiveteam |
08:00
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
08:03
π
|
hiroi |
I decided to grab from IA's Geocities Valhalla instead (this is the same dataset right?); no need for seeder now thanks to all though |
08:04
π
|
|
Sk1d has joined #archiveteam |
08:12
π
|
|
adinbied has quit IRC (Remote host closed the connection) |
08:12
π
|
|
adinbied has joined #archiveteam |
08:15
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
08:18
π
|
|
adinbied_ has joined #archiveteam |
08:19
π
|
|
adinbied has quit IRC (Ping timeout: 252 seconds) |
08:20
π
|
|
Sk1d has joined #archiveteam |
08:32
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
08:36
π
|
|
Sk1d has joined #archiveteam |
08:37
π
|
|
twiggy36 has joined #archiveteam |
08:37
π
|
twiggy36 |
JOIN |
08:38
π
|
Flashfire |
join what? |
08:39
π
|
|
twiggy36 has quit IRC (Client Quit) |
08:41
π
|
|
Kitaru_ has quit IRC (Quit: This computer has gone to sleep) |
08:51
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
08:55
π
|
Nemo_bis |
Flashfire: just /JOIN mistyped :) |
08:56
π
|
|
Sk1d has joined #archiveteam |
09:12
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
09:15
π
|
|
Sk1d has joined #archiveteam |
09:18
π
|
|
BlueMax has quit IRC (Quit: Leaving) |
09:28
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
09:32
π
|
|
Sk1d has joined #archiveteam |
09:39
π
|
|
threeTBHe has joined #archiveteam |
09:40
π
|
|
ats has joined #archiveteam |
09:45
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
09:45
π
|
|
threeTBHe has quit IRC (Ping timeout: 260 seconds) |
09:48
π
|
|
Sk1d has joined #archiveteam |
10:02
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
10:07
π
|
|
Sk1d has joined #archiveteam |
10:08
π
|
|
godane has quit IRC (Ping timeout: 265 seconds) |
10:18
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
10:24
π
|
|
Sk1d has joined #archiveteam |
10:54
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
10:59
π
|
|
Sk1d has joined #archiveteam |
11:11
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
11:14
π
|
|
Sk1d has joined #archiveteam |
11:27
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
11:31
π
|
|
Sk1d has joined #archiveteam |
11:44
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
11:49
π
|
|
Sk1d has joined #archiveteam |
12:07
π
|
|
nertzy has joined #archiveteam |
12:08
π
|
|
Ryz has quit IRC (Quit: ChatZilla 0.9.92-rdmsoft [XULRunner 35.0.1/20150122214805]) |
12:22
π
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
12:59
π
|
|
godane has joined #archiveteam |
13:08
π
|
|
adinbied_ is now known as adinbied |
13:26
π
|
|
Ctrl has quit IRC (Ping timeout: 268 seconds) |
13:38
π
|
|
LFlare has joined #archiveteam |
13:44
π
|
|
Mateon1 has quit IRC (Remote host closed the connection) |
13:44
π
|
|
Mateon1 has joined #archiveteam |
13:46
π
|
|
LFlare has left The Lounge - https://thelounge.chat |
13:46
π
|
|
LFlare has joined #archiveteam |
13:48
π
|
JAA |
So which channel is it for FMA? #fmaction or #musicateam? There are more people in the former, and the latter is mentioned on the wiki. |
13:50
π
|
|
matthusby has quit IRC (Remote host closed the connection) |
13:51
π
|
|
matthusby has joined #archiveteam |
14:31
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
14:34
π
|
|
Sk1d has joined #archiveteam |
14:48
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
14:49
π
|
|
VerifiedJ has joined #archiveteam |
14:50
π
|
|
Sk1d has joined #archiveteam |
15:06
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
15:11
π
|
|
Sk1d has joined #archiveteam |
15:16
π
|
|
SketchCo1 is now known as SketchCow |
15:16
π
|
SketchCow |
Free Music Archive all set. |
15:16
π
|
|
LFlare has quit IRC (Read error: Operation timed out) |
15:17
π
|
SketchCow |
- We are doing an Archivebot grab |
15:17
π
|
SketchCow |
- Free Music Archive mailed us a hard drive |
15:17
π
|
SketchCow |
- Archive-It grabbed a copy |
15:17
π
|
anarcat |
nice |
15:17
π
|
JAA |
Sweet |
15:21
π
|
SketchCow |
Yeah, so direct people away from that one, it's handled. |
15:22
π
|
SketchCow |
Not surprisingly, my run through all our projects shows some getting good action and others getting none. |
15:23
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
15:26
π
|
|
LFlare has joined #archiveteam |
15:28
π
|
|
Sk1d has joined #archiveteam |
15:42
π
|
|
thesame has joined #archiveteam |
15:43
π
|
|
Martle has joined #archiveteam |
15:43
π
|
thesame |
Hello archiveteam. Can anyone help me with rescuing a project from gitorious? It's giving 404 |
16:15
π
|
|
pizzaiolo has quit IRC (Quit: pizzaiolo) |
16:19
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
16:23
π
|
|
Sk1d has joined #archiveteam |
16:24
π
|
|
Somebody2 has quit IRC (Read error: Operation timed out) |
16:25
π
|
|
DFJustin has quit IRC (Read error: Connection reset by peer) |
16:26
π
|
|
DFJustin has joined #archiveteam |
16:26
π
|
|
swebb sets mode: +o DFJustin |
16:27
π
|
|
_Verified has joined #archiveteam |
16:29
π
|
|
_Verified has quit IRC (Client Quit) |
16:30
π
|
|
VerifiedJ has quit IRC (Quit: Leaving) |
16:30
π
|
astrid |
hi |
16:30
π
|
astrid |
gitorious admin here |
16:31
π
|
astrid |
there was a hardware failure over the weekend. it's held together with shoestring and scotch tape, i suspect some daemon failed to start up. i'll get on it in a few days. sorry! |
16:34
π
|
|
VerifiedJ has joined #archiveteam |
16:36
π
|
astrid |
data's safe though: nothing was lost from that system, and i have already sent a copy to another organization. |
16:36
π
|
astrid |
someday i will get my shit together enough to put it on IA |
16:38
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
16:41
π
|
|
Sk1d has joined #archiveteam |
16:57
π
|
|
schbirid has joined #archiveteam |
16:59
π
|
thesame |
thank you astrid, hope it happens sooner than later |
16:59
π
|
thesame |
i'm afraid gitorious is the only place where my very old project remained for now |
17:07
π
|
|
LFlare has quit IRC (Ping timeout: 252 seconds) |
17:12
π
|
|
wp494 has quit IRC (Ping timeout: 260 seconds) |
17:12
π
|
|
wp494 has joined #archiveteam |
17:14
π
|
|
Somebody2 has joined #archiveteam |
17:16
π
|
* |
Kaz wonders if we have my infra that isn't 'held together with shoestring and scotch tape' |
17:16
π
|
Kaz |
archivebot seems to work fine I guess |
17:16
π
|
Kaz |
s/my/any |
17:29
π
|
|
LFlare has joined #archiveteam |
17:57
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
18:02
π
|
|
Sk1d has joined #archiveteam |
18:06
π
|
|
Ryz has joined #archiveteam |
18:07
π
|
|
Martle_ has joined #archiveteam |
18:08
π
|
|
Martle_ has quit IRC (Client Quit) |
18:09
π
|
|
Martle has quit IRC (Ping timeout: 252 seconds) |
18:53
π
|
|
thesame has quit IRC (Remote host closed the connection) |
19:00
π
|
Igloo |
Kaz: the warrior does for the time being |
19:00
π
|
Igloo |
What do you need? |
19:03
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
19:06
π
|
|
Sk1d has joined #archiveteam |
19:22
π
|
|
SimpBrain has quit IRC (Read error: Operation timed out) |
19:38
π
|
|
m007a83_ has joined #archiveteam |
19:41
π
|
|
stratum has joined #archiveteam |
19:41
π
|
|
m007a83 has quit IRC (Read error: Operation timed out) |
19:43
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
19:46
π
|
|
Sk1d has joined #archiveteam |
19:48
π
|
|
m007a83_ is now known as m007a83 |
19:59
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
20:03
π
|
|
Sk1d has joined #archiveteam |
20:16
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
20:20
π
|
|
Sk1d has joined #archiveteam |
20:26
π
|
|
icedice has joined #archiveteam |
20:32
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
20:34
π
|
|
Sk1d has joined #archiveteam |
20:42
π
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
20:48
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
20:49
π
|
|
twoTBHetz has joined #archiveteam |
20:52
π
|
twoTBHetz |
Is there any plan to backup transfer.sh ? |
20:52
π
|
|
Sk1d has joined #archiveteam |
20:53
π
|
schbirid |
i would consider that pointless and not in the spirit of the site, it is designed with "Files stored for 14 days" |
20:53
π
|
twoTBHetz |
I see. |
20:53
π
|
|
BlueMax has joined #archiveteam |
20:54
π
|
twoTBHetz |
I am currently looking into hostinger's free hosting which will be closed in <two weeks. I am currently collecting all subdomains i can find. |
20:56
π
|
twoTBHetz |
But since i do not have any archiving stuff (warc or whatever) set up i can only give you a list of entry points |
20:57
π
|
Igloo |
Put them into the archivebot |
20:57
π
|
Igloo |
Into blocks of subdomains |
21:00
π
|
twoTBHetz |
Igloo how do i do that? |
21:03
π
|
twoTBHetz |
why grouped by subdomains? |
21:09
π
|
twoTBHetz |
I only know about https://web.archive.org/save/$individual_link |
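The per-link save address mentioned above can be generated for a whole list of entry points. The following is only a minimal sketch: it assumes the `https://web.archive.org/save/` prefix quoted in the message, and the helper names `save_url` and `save_urls` are made up for illustration. It builds the addresses and nothing more; actually requesting them, and any rate limits that apply, are outside its scope.

```python
from urllib.parse import quote

# Prefix taken from the message above; everything after it is the target link.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_url(target: str) -> str:
    """Return the save address for one target link (hypothetical helper)."""
    # Leave normal URL punctuation alone and escape only characters such as spaces.
    return SAVE_PREFIX + quote(target, safe=":/?&=%#@+,;~")


def save_urls(targets):
    """Build one save address per non-empty input line."""
    return [save_url(t.strip()) for t in targets if t.strip()]


if __name__ == "__main__":
    # Example links taken from elsewhere in this log.
    for url in save_urls(["http://aea.zz.vc/", "http://profin.by/blog/"]):
        print(url)
```

For thousands of subdomains this one-link-at-a-time route is probably slower than handing the list to archivebot, as suggested in the channel.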
21:11
π
|
|
Kitaru has joined #archiveteam |
21:13
π
|
betamax |
twoTBHetz: wait, is hostinger shutting down their free hosting |
21:14
π
|
betamax |
I can't find any info on this |
21:14
π
|
twoTBHetz |
just visit any site: www.cinemahd.esy.es |
21:15
π
|
twoTBHetz |
that link did not work because it's a redirect but see this one http://aea.zz.vc/ |
21:17
π
|
twoTBHetz |
I am currently running Sublist3r but it always takes forever and i only got so many IPs |
21:18
π
|
twoTBHetz |
betamax, do you see? |
21:19
π
|
twoTBHetz |
so far i got roughly 7705 subdomain names from 16mb.com ahol.es azz.vc esy.es hol.es zz.vc but i still need to check whether they are in use |
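A first pass at the "are they in use" check could be a plain DNS lookup per collected name. This is a rough sketch under stated assumptions: the function names are hypothetical, and a name that resolves is not proof of a live site, since a free host may answer for every subdomain through wildcard DNS. An HTTP request for the page itself would be the stricter test.

```python
import socket


def resolves(hostname, resolver=socket.gethostbyname):
    """Return True if the hostname currently resolves to an address."""
    try:
        resolver(hostname)
        return True
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError; UnicodeError covers
        # names that cannot be encoded as valid DNS labels.
        return False


def split_by_resolution(hostnames, resolver=socket.gethostbyname):
    """Split names into (resolving, non_resolving), preserving input order."""
    live, dead = [], []
    for name in hostnames:
        (live if resolves(name, resolver) else dead).append(name)
    return live, dead
```

The `resolver` parameter exists only so the lookup can be swapped out, for example for a stub when trying the function without network access.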
21:21
π
|
betamax |
ah, yes: |
21:21
π
|
betamax |
This website is hosted on a free hosting platform provided by Hostinger. The platform is deprecated, and it will be turned off in two weeks. If you are a website owner, please log into your control panel here. |
21:22
π
|
twoTBHetz |
I am currently collecting subdomains from 96.lt . I have not yet scanned pe.hu and .890m.com, and I still need to do a little more digging for more top-levels |
21:22
π
|
|
BlueMaxim has joined #archiveteam |
21:24
π
|
betamax |
great work. Any idea when that notice first appeared? It's a pain not knowing the actual shutdown date |
21:25
π
|
|
BlueMax has quit IRC (Ping timeout: 260 seconds) |
21:25
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
21:26
π
|
twoTBHetz |
no somebody pasted it yesterday in here |
21:26
π
|
|
Kitaru has quit IRC (Quit: This computer has gone to sleep) |
21:27
π
|
twoTBHetz |
his name was something along the lines of Arctic |
21:27
π
|
twoTBHetz |
if i recall correctly |
21:27
π
|
betamax |
Ah, missed that |
21:27
π
|
betamax |
fyi: hostinger is *very closely* tied to 000webhost |
21:28
π
|
betamax |
now, 000webhost are unlikely to stop being free - it's in their name |
21:28
π
|
betamax |
but I'm suspicious |
21:28
π
|
twoTBHetz |
yeah hostinger's free hosting site does not advertise itself but 000webhost instead. i do not think that one will go down |
21:29
π
|
twoTBHetz |
are there faster/better tools than Sublist3r? The online services it uses are starting to rate limit me hard and i have already burned through two IPs |
21:30
π
|
|
Sk1d has joined #archiveteam |
21:31
π
|
SketchCow |
--------------------------- |
21:31
π
|
SketchCow |
IMPORTANT NOTE |
21:31
π
|
SketchCow |
If you upload WARC files into the general open collections on archive.org |
21:31
π
|
SketchCow |
...they're gonna end up in the WARC Zone (https://archive.org/details/warczone) |
21:31
π
|
SketchCow |
Unless you've arranged them to go to an archive team collection |
21:32
π
|
SketchCow |
If you're smashing endless WARCs into the open collection, then somewhere you |
21:32
π
|
SketchCow |
didn't do a thing you probably should have done. Contact me or others. |
21:32
π
|
SketchCow |
--------------------------- |
21:33
π
|
twoTBHetz |
but there are also sites like http://profin.by/blog/ which feature the banner but are not easy to discover i think (but i know little) |
21:33
π
|
betamax |
SketchCow: but I'm guessing any non-ArchiveTeam grabs in WARC format won't be added to Wayback? |
21:33
π
|
SketchCow |
They will not |
21:40
π
|
anarcat |
SketchCow: i'd love to do the right thing |
21:40
π
|
anarcat |
SketchCow: i'm not sure what i'm missing |
21:40
π
|
anarcat |
SketchCow: i uploaded those so far https://archive.org/details/@anarcat |
21:47
π
|
|
Kitaru has joined #archiveteam |
21:57
π
|
|
w0rmybak has quit IRC (Quit: Ping timeout (120 seconds)) |
21:57
π
|
|
kiskabak has quit IRC (Quit: Ping timeout (120 seconds)) |
21:57
π
|
|
Flashback has quit IRC (Quit: Ping timeout (120 seconds)) |
21:58
π
|
|
w0rmybak has joined #archiveteam |
22:00
π
|
|
kiskabak has joined #archiveteam |
22:00
π
|
|
w0rmybak has quit IRC (Client Quit) |
22:00
π
|
|
w0rmybak has joined #archiveteam |
22:03
π
|
|
Flashback has joined #archiveteam |
22:10
π
|
|
schbirid has quit IRC (Remote host closed the connection) |
22:11
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
22:11
π
|
|
bakJAA_ is now known as bakJAA |
22:13
π
|
twoTBHetz |
betamax how would i best submit my to-be-archived URLs to the wayback machine? |
22:15
π
|
|
Sk1d has joined #archiveteam |
22:22
π
|
betamax |
it's a bit tricky |
22:22
π
|
betamax |
what I would do is a two-step process |
22:23
π
|
betamax |
1.) get a list of all the sites you discover, and archive their home pages only using archivebot '!ao' |
22:24
π
|
betamax |
2.) download all the sites yourself, ideally using something that can generate WARCs like wget, then extract all the urls from that |
22:24
π
|
twoTBHetz |
so only one page per domain? |
22:24
π
|
betamax |
all those urls can then be put into archivebot |
22:25
π
|
twoTBHetz |
My question is: only one entrypoint per domain or spider the complete thing |
22:25
π
|
twoTBHetz |
how would i go about doing warc files in wget |
22:28
π
|
betamax |
well, first just get the home pages into archivebot, so something is saved |
22:28
π
|
betamax |
then spider / download the complete thing yourself, so there is a copy, even if not in wayback |
22:29
π
|
betamax |
then try and get it into wayback by putting a list of all the urls found into archivebot |
22:29
π
|
betamax |
see https://www.archiveteam.org/index.php/Wget_with_WARC_output for warc with wget |
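The "extract all the urls" part of step 2 can be sketched with nothing but the standard library. This assumes a gzip-compressed WARC whose records carry `WARC-Type`, `WARC-Target-URI` and `Content-Length` headers; the function name `warc_target_uris` is hypothetical, and a dedicated WARC library would be the sturdier choice for real grabs. It walks the file record by record and skips each body by its declared length, so text inside a captured page cannot be mistaken for a header.

```python
import gzip


def warc_target_uris(path):
    """Return distinct target URIs of response records, in first-seen order."""
    seen = {}
    with gzip.open(path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                break  # end of file
            if not line.startswith(b"WARC/"):
                continue  # blank separator lines between records
            # Read the record's header block up to the empty line.
            headers = {}
            for raw in iter(fh.readline, b""):
                if raw in (b"\r\n", b"\n"):
                    break
                name, _, value = raw.decode("utf-8", "replace").partition(":")
                headers[name.strip().lower()] = value.strip()
            # Skip the record body using its declared length.
            fh.read(int(headers.get("content-length", "0")))
            # Some writers wrap the URI in angle brackets; strip them if present.
            uri = headers.get("warc-target-uri", "").strip("<>")
            if headers.get("warc-type") == "response" and uri:
                seen.setdefault(uri, None)
    return list(seen)
```

Writing the returned list to a text file, one address per line, gives something that can be handed over in the channel.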
22:30
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
22:33
π
|
|
Sk1d has joined #archiveteam |
22:35
π
|
twoTBHetz |
mhh how does warc work with the spider option? |
22:39
π
|
twoTBHetz |
I am currently doing something like that wget "-r" "-l4" "--spider" "--tries=0" "-o" file "--no-verbose" "-D" "startreknews.esy.es" "startreknews.esy.es" to attempt to get all the links on the url in question |
22:43
π
|
betamax |
ah, bad choice of wording from me above: I'm not sure "spider" is the argument you want |
22:43
π
|
betamax |
according to the manual it means pages won't be downloaded |
22:43
π
|
betamax |
I don't know if that will affect the warc |
22:44
π
|
betamax |
and warc is just: --warc-file=fileName |
22:44
π
|
betamax |
which creates fileName.warc.gz |
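Putting the two messages together, the recursive command from earlier in the log with `--warc-file` added and `--spider` dropped (so that pages are actually downloaded) might look like the sketch below. It only assembles the argument list. The function name, the `http://` scheme and the log file naming are assumptions, and `--tries=0` is copied from the original command, where it means unlimited retries.

```python
import shlex


def wget_warc_command(host, warc_name, depth=4):
    """Build a wget argument list for a recursive grab with WARC output."""
    return [
        "wget",
        "-r", f"-l{depth}",           # recurse, at most `depth` levels deep
        "--tries=0",                  # as in the logged command: retry without limit
        "--no-verbose",
        "-o", f"{warc_name}.log",     # wget's own log file (assumed naming)
        "-D", host,                   # stay on this domain
        f"--warc-file={warc_name}",   # wget writes warc_name.warc.gz
        f"http://{host}/",            # assumed scheme
    ]


if __name__ == "__main__":
    cmd = wget_warc_command("startreknews.esy.es", "startreknews.esy.es")
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` would start the grab; printing it first makes it easy to check the flags against the wiki page linked above.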
22:45
π
|
twoTBHetz |
if somebody has a tool chain that works for him i would like to give him my domains |
22:45
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
22:46
π
|
twoTBHetz |
is the warc option spider-like too, in the sense that it fetches a complete website (and the sites on that domain it links to) and not just the link i point it to? |
22:48
π
|
betamax |
twoTBHetz: afraid I have to go (to bed) now |
22:48
π
|
twoTBHetz |
i see |
22:48
π
|
betamax |
but fyi, apparently hostinger emailed people on 25th september saying they had 2 months |
22:50
π
|
twoTBHetz |
so two weeks is roughly right |
22:51
π
|
|
Sk1d has joined #archiveteam |
22:55
π
|
|
Mateon1 has quit IRC (Read error: Operation timed out) |
22:56
π
|
|
dashcloud has joined #archiveteam |
22:56
π
|
|
Mateon1 has joined #archiveteam |
23:12
π
|
|
matthusb_ has joined #archiveteam |
23:14
π
|
|
matthusby has quit IRC (Read error: Operation timed out) |
23:18
π
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
23:19
π
|
|
adinbied has quit IRC (Left Channel.) |
23:20
π
|
|
matthusb_ has quit IRC (Read error: Operation timed out) |
23:39
π
|
|
adinbied has joined #archiveteam |