| Time |
Nickname |
Message |
|
00:01
π
|
JAA |
If anyone has a bit of time, I'd appreciate help with verifying that my archives of AMO are complete. Come to #outofammo if interested. |
|
00:04
π
|
|
nertzy has joined #archiveteam |
|
00:05
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
00:08
π
|
|
Sk1d has joined #archiveteam |
|
00:13
π
|
|
m007a83 has joined #archiveteam |
|
00:21
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
00:23
π
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
|
00:23
π
|
|
Sk1d has joined #archiveteam |
|
00:45
π
|
|
VerifiedJ has quit IRC (Quit: Leaving) |
|
00:51
π
|
|
Mateon1 has quit IRC (Ping timeout: 265 seconds) |
|
00:52
π
|
|
Mateon1 has joined #archiveteam |
|
00:58
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
01:01
π
|
|
Sk1d has joined #archiveteam |
|
01:05
π
|
|
ats has quit IRC (Ping timeout: 252 seconds) |
|
01:15
π
|
Flashfire |
Arkiver FTP needs serious work |
|
01:15
π
|
arkiver |
yes |
|
01:15
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
01:15
π
|
Flashfire |
YES |
|
01:16
π
|
Flashfire |
let's move this to #effteepee |
|
01:19
π
|
|
Sk1d has joined #archiveteam |
|
01:31
π
|
|
twoTBHetz has quit IRC (Ping timeout: 260 seconds) |
|
01:32
π
|
|
Kitaru_ has quit IRC (Quit: This computer has gone to sleep) |
|
01:35
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
01:39
π
|
|
Sk1d has joined #archiveteam |
|
01:56
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
01:59
π
|
|
Sk1d has joined #archiveteam |
|
02:05
π
|
|
Stilett0 has joined #archiveteam |
|
02:10
π
|
|
Stiletto has quit IRC (Ping timeout: 633 seconds) |
|
02:15
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
02:20
π
|
|
Sk1d has joined #archiveteam |
|
02:30
π
|
|
dtm has quit IRC (Read error: Operation timed out) |
|
02:32
π
|
|
pizzaiolo has quit IRC (west.us.hub irc.Prison.NET) |
|
02:32
π
|
|
Ryz has quit IRC (west.us.hub irc.Prison.NET) |
|
02:32
π
|
|
achip has quit IRC (west.us.hub irc.Prison.NET) |
|
02:33
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
02:37
π
|
|
Sk1d has joined #archiveteam |
|
02:48
π
|
|
pizzaiolo has joined #archiveteam |
|
02:50
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
02:54
π
|
|
Sk1d has joined #archiveteam |
|
02:59
π
|
|
Kitaru_ has joined #archiveteam |
|
03:05
π
|
|
dtm has joined #archiveteam |
|
03:05
π
|
|
achip has joined #archiveteam |
|
03:07
π
|
|
Ryz has joined #archiveteam |
|
03:32
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
03:36
π
|
|
Sk1d has joined #archiveteam |
|
03:40
π
|
|
bakJAA_ has joined #archiveteam |
|
03:40
π
|
|
swebb sets mode: +o bakJAA_ |
|
03:40
π
|
|
JAA sets mode: +o bakJAA_ |
|
03:40
π
|
|
bakJAA has quit IRC (Read error: Connection reset by peer) |
|
03:41
π
|
|
kyounko has quit IRC (Ping timeout: 492 seconds) |
|
03:41
π
|
|
Mikal_i2p has quit IRC (Ping timeout: 492 seconds) |
|
03:43
π
|
|
mgrytbak_ has quit IRC (Ping timeout: 492 seconds) |
|
03:44
π
|
|
Mikal_i2p has joined #archiveteam |
|
03:47
π
|
|
mgrytbak_ has joined #archiveteam |
|
03:50
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
03:54
π
|
|
Sk1d has joined #archiveteam |
|
04:07
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
04:12
π
|
|
Sk1d has joined #archiveteam |
|
04:26
π
|
|
matthusb_ has joined #archiveteam |
|
04:28
π
|
|
Martle__ has quit IRC (Ping timeout: 252 seconds) |
|
04:28
π
|
|
matthusb_ has quit IRC (Remote host closed the connection) |
|
04:28
π
|
|
matthusby has quit IRC (Read error: Operation timed out) |
|
04:28
π
|
|
matthusby has joined #archiveteam |
|
04:43
π
|
|
qw3rty114 has joined #archiveteam |
|
04:50
π
|
|
qw3rty113 has quit IRC (Read error: Operation timed out) |
|
04:50
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
04:53
π
|
|
Kitaru_ has quit IRC (Quit: This computer has gone to sleep) |
|
04:54
π
|
|
Sk1d has joined #archiveteam |
|
04:57
π
|
|
odemg has quit IRC (Read error: Operation timed out) |
|
05:06
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
05:10
π
|
|
Sk1d has joined #archiveteam |
|
05:11
π
|
|
odemg has joined #archiveteam |
|
05:23
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
05:27
π
|
|
Sk1d has joined #archiveteam |
|
05:39
π
|
|
Kitaru_ has joined #archiveteam |
|
05:46
π
|
Lord_Nigh |
is there a project channel for the free music archive thing? |
|
05:47
π
|
Flashfire |
Dunno |
|
05:49
π
|
|
pino_p has joined #archiveteam |
|
05:51
π
|
pino_p |
How many of us have already heard about Free Music Archive going dark? https://www.theverge.com/2018/11/7/18073346/free-music-archive-closing-wfmu-creative-commons-cheyenne-hohman |
|
05:54
π
|
pino_p |
(checks log) Lord_Nigh, see #musicateam |
|
05:55
π
|
pino_p |
and https://www.archiveteam.org/index.php?title=Free_Music_Archive |
|
06:01
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
06:04
π
|
|
nertzy has joined #archiveteam |
|
06:05
π
|
|
Sk1d has joined #archiveteam |
|
06:16
π
|
|
pino_p has quit IRC (Quit: Leaving) |
|
06:17
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
06:22
π
|
|
Sk1d has joined #archiveteam |
|
06:24
π
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
|
06:36
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
06:41
π
|
|
Sk1d has joined #archiveteam |
|
06:53
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
06:58
π
|
|
Sk1d has joined #archiveteam |
|
07:06
π
|
hiroi |
Is there a way I can download the Geocities (US) data from a few years ago? The torrent is having no speed, not too surprisingly |
|
07:09
π
|
Nemo_bis |
hiroi: there were seeders as recently as a few months ago |
|
07:09
π
|
Nemo_bis |
Asking here may be a good way for someone to reseed. |
|
07:10
π
|
Nemo_bis |
Meanwhile, I remembered to upload the first of the two Twitter "integrity" datasets yesterday https://archive.org/details/twitter-integrity-ira |
|
07:13
π
|
hiroi |
Okay I have a torrenting client sitting there waiting for someone to feed her. If anyone would be so kind... |
|
07:43
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
07:48
π
|
|
Sk1d has joined #archiveteam |
|
08:00
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
08:03
π
|
hiroi |
I decided to grab from IA's Geocities Valhalla instead (this is the same dataset, right?); no need for a seeder now, thanks to all though |
|
08:04
π
|
|
Sk1d has joined #archiveteam |
|
08:12
π
|
|
adinbied has quit IRC (Remote host closed the connection) |
|
08:12
π
|
|
adinbied has joined #archiveteam |
|
08:15
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
08:18
π
|
|
adinbied_ has joined #archiveteam |
|
08:19
π
|
|
adinbied has quit IRC (Ping timeout: 252 seconds) |
|
08:20
π
|
|
Sk1d has joined #archiveteam |
|
08:32
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
08:36
π
|
|
Sk1d has joined #archiveteam |
|
08:37
π
|
|
twiggy36 has joined #archiveteam |
|
08:37
π
|
twiggy36 |
JOIN |
|
08:38
π
|
Flashfire |
join what? |
|
08:39
π
|
|
twiggy36 has quit IRC (Client Quit) |
|
08:41
π
|
|
Kitaru_ has quit IRC (Quit: This computer has gone to sleep) |
|
08:51
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
08:55
π
|
Nemo_bis |
Flashfire: just /JOIN mistyped :) |
|
08:56
π
|
|
Sk1d has joined #archiveteam |
|
09:12
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
09:15
π
|
|
Sk1d has joined #archiveteam |
|
09:18
π
|
|
BlueMax has quit IRC (Quit: Leaving) |
|
09:28
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
09:32
π
|
|
Sk1d has joined #archiveteam |
|
09:39
π
|
|
threeTBHe has joined #archiveteam |
|
09:40
π
|
|
ats has joined #archiveteam |
|
09:45
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
09:45
π
|
|
threeTBHe has quit IRC (Ping timeout: 260 seconds) |
|
09:48
π
|
|
Sk1d has joined #archiveteam |
|
10:02
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
10:07
π
|
|
Sk1d has joined #archiveteam |
|
10:08
π
|
|
godane has quit IRC (Ping timeout: 265 seconds) |
|
10:18
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
10:24
π
|
|
Sk1d has joined #archiveteam |
|
10:54
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
10:59
π
|
|
Sk1d has joined #archiveteam |
|
11:11
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
11:14
π
|
|
Sk1d has joined #archiveteam |
|
11:27
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
11:31
π
|
|
Sk1d has joined #archiveteam |
|
11:44
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
11:49
π
|
|
Sk1d has joined #archiveteam |
|
12:07
π
|
|
nertzy has joined #archiveteam |
|
12:08
π
|
|
Ryz has quit IRC (Quit: ChatZilla 0.9.92-rdmsoft [XULRunner 35.0.1/20150122214805]) |
|
12:22
π
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
|
12:59
π
|
|
godane has joined #archiveteam |
|
13:08
π
|
|
adinbied_ is now known as adinbied |
|
13:26
π
|
|
Ctrl has quit IRC (Ping timeout: 268 seconds) |
|
13:38
π
|
|
LFlare has joined #archiveteam |
|
13:44
π
|
|
Mateon1 has quit IRC (Remote host closed the connection) |
|
13:44
π
|
|
Mateon1 has joined #archiveteam |
|
13:46
π
|
|
LFlare has left The Lounge - https://thelounge.chat |
|
13:46
π
|
|
LFlare has joined #archiveteam |
|
13:48
π
|
JAA |
So which channel is it for FMA? #fmaction or #musicateam? There are more people in the former, and the latter is mentioned on the wiki. |
|
13:50
π
|
|
matthusby has quit IRC (Remote host closed the connection) |
|
13:51
π
|
|
matthusby has joined #archiveteam |
|
14:31
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
14:34
π
|
|
Sk1d has joined #archiveteam |
|
14:48
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
14:49
π
|
|
VerifiedJ has joined #archiveteam |
|
14:50
π
|
|
Sk1d has joined #archiveteam |
|
15:06
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
15:11
π
|
|
Sk1d has joined #archiveteam |
|
15:16
π
|
|
SketchCo1 is now known as SketchCow |
|
15:16
π
|
SketchCow |
Free Music Archive all set. |
|
15:16
π
|
|
LFlare has quit IRC (Read error: Operation timed out) |
|
15:17
π
|
SketchCow |
- We are doing an Archivebot grab |
|
15:17
π
|
SketchCow |
- Free Music Archive mailed us a hard drive |
|
15:17
π
|
SketchCow |
- Archive-It grabbed a copy |
|
15:17
π
|
anarcat |
nice |
|
15:17
π
|
JAA |
Sweet |
|
15:21
π
|
SketchCow |
Yeah, so direct people away from that one, it's handled. |
|
15:22
π
|
SketchCow |
Not surprisingly, my run through all our projects shows some getting good action and others getting none. |
|
15:23
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
15:26
π
|
|
LFlare has joined #archiveteam |
|
15:28
π
|
|
Sk1d has joined #archiveteam |
|
15:42
π
|
|
thesame has joined #archiveteam |
|
15:43
π
|
|
Martle has joined #archiveteam |
|
15:43
π
|
thesame |
Hello archiveteam. Can anyone help me with rescuing a project from gitorious? It's giving 404 |
|
16:15
π
|
|
pizzaiolo has quit IRC (Quit: pizzaiolo) |
|
16:19
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
16:23
π
|
|
Sk1d has joined #archiveteam |
|
16:24
π
|
|
Somebody2 has quit IRC (Read error: Operation timed out) |
|
16:25
π
|
|
DFJustin has quit IRC (Read error: Connection reset by peer) |
|
16:26
π
|
|
DFJustin has joined #archiveteam |
|
16:26
π
|
|
swebb sets mode: +o DFJustin |
|
16:27
π
|
|
_Verified has joined #archiveteam |
|
16:29
π
|
|
_Verified has quit IRC (Client Quit) |
|
16:30
π
|
|
VerifiedJ has quit IRC (Quit: Leaving) |
|
16:30
π
|
astrid |
hi |
|
16:30
π
|
astrid |
gitorious admin here |
|
16:31
π
|
astrid |
there was a hardware failure over the weekend. it's held together with shoestring and scotch tape, i suspect some daemon failed to start up. i'll get on it in a few days. sorry! |
|
16:34
π
|
|
VerifiedJ has joined #archiveteam |
|
16:36
π
|
astrid |
data's safe though: nothing was lost from that system, and i have already sent a copy to another organization. |
|
16:36
π
|
astrid |
someday i will get my shit together enough to put it on IA |
|
16:38
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
16:41
π
|
|
Sk1d has joined #archiveteam |
|
16:57
π
|
|
schbirid has joined #archiveteam |
|
16:59
π
|
thesame |
thank you astrid, hope it happens sooner than later |
|
16:59
π
|
thesame |
i'm afraid gitorious is the only place where my very old project remained for now |
|
17:07
π
|
|
LFlare has quit IRC (Ping timeout: 252 seconds) |
|
17:12
π
|
|
wp494 has quit IRC (Ping timeout: 260 seconds) |
|
17:12
π
|
|
wp494 has joined #archiveteam |
|
17:14
π
|
|
Somebody2 has joined #archiveteam |
|
17:16
π
|
* |
Kaz wonders if we have my infra that isn't 'held together with shoestring and scotch tape' |
|
17:16
π
|
Kaz |
archivebot seems to work fine I guess |
|
17:16
π
|
Kaz |
s/my/any |
|
17:29
π
|
|
LFlare has joined #archiveteam |
|
17:57
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
18:02
π
|
|
Sk1d has joined #archiveteam |
|
18:06
π
|
|
Ryz has joined #archiveteam |
|
18:07
π
|
|
Martle_ has joined #archiveteam |
|
18:08
π
|
|
Martle_ has quit IRC (Client Quit) |
|
18:09
π
|
|
Martle has quit IRC (Ping timeout: 252 seconds) |
|
18:53
π
|
|
thesame has quit IRC (Remote host closed the connection) |
|
19:00
π
|
Igloo |
Kaz: the warrior does for the time being |
|
19:00
π
|
Igloo |
What do you need? |
|
19:03
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
19:06
π
|
|
Sk1d has joined #archiveteam |
|
19:22
π
|
|
SimpBrain has quit IRC (Read error: Operation timed out) |
|
19:38
π
|
|
m007a83_ has joined #archiveteam |
|
19:41
π
|
|
stratum has joined #archiveteam |
|
19:41
π
|
|
m007a83 has quit IRC (Read error: Operation timed out) |
|
19:43
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
19:46
π
|
|
Sk1d has joined #archiveteam |
|
19:48
π
|
|
m007a83_ is now known as m007a83 |
|
19:59
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
20:03
π
|
|
Sk1d has joined #archiveteam |
|
20:16
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
20:20
π
|
|
Sk1d has joined #archiveteam |
|
20:26
π
|
|
icedice has joined #archiveteam |
|
20:32
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
20:34
π
|
|
Sk1d has joined #archiveteam |
|
20:42
π
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
20:48
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
20:49
π
|
|
twoTBHetz has joined #archiveteam |
|
20:52
π
|
twoTBHetz |
Is there any plan to backup transfer.sh ? |
|
20:52
π
|
|
Sk1d has joined #archiveteam |
|
20:53
π
|
schbirid |
i would consider that pointless and not in the spirit of the site, it is designed with "Files stored for 14 days" |
|
20:53
π
|
twoTBHetz |
I see. |
|
20:53
π
|
|
BlueMax has joined #archiveteam |
|
20:54
π
|
twoTBHetz |
I am currently looking into hostinger's free hosting which will be closed in <two weeks. I am currently collecting all subdomains i can find. |
|
20:56
π
|
twoTBHetz |
But since i do not have any archiving stuff (warc or whatever) set up, i can only give you a list of entrypoints |
|
20:57
π
|
Igloo |
Put them into the archivebot |
|
20:57
π
|
Igloo |
Into blocks of subdomains |
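One possible reading of "blocks of subdomains" is to group the discovered hostnames by their parent domain and then split each group into batches. The following is a minimal sketch of that reading; the function names `group_by_parent` and `blocks`, the two-label parent rule, and the batch size are illustrative assumptions, not anything ArchiveBot itself prescribes.

```python
from collections import defaultdict
from itertools import islice


def group_by_parent(hosts, labels=2):
    """Group hostnames by their last `labels` DNS labels (e.g. 'esy.es').

    Assumption: every free-hosting parent domain in the list has exactly
    two labels, as with esy.es, zz.vc or 16mb.com.
    """
    groups = defaultdict(list)
    for host in hosts:
        host = host.strip().lower().rstrip(".")
        if not host:
            continue  # skip blank lines from the input list
        parent = ".".join(host.split(".")[-labels:])
        groups[parent].append(host)
    return dict(groups)


def blocks(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


if __name__ == "__main__":
    found = ["startreknews.esy.es", "aea.zz.vc", "www.cinemahd.esy.es"]
    for parent, hosts in group_by_parent(found).items():
        # Hypothetical batch size; pick whatever the job runner tolerates.
        for number, block in enumerate(blocks(hosts, 500), start=1):
            print(parent, number, block)
```

Each batch can then be written to its own text file of URLs, which keeps any single job to a manageable size.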
|
21:00
π
|
twoTBHetz |
Igloo how do i do that? |
|
21:03
π
|
twoTBHetz |
why grouped by subdomains? |
|
21:09
π
|
twoTBHetz |
I only know about https://web.archive.org/save/$individual_link |
|
21:11
π
|
|
Kitaru has joined #archiveteam |
|
21:13
π
|
betamax |
twoTBHetz: wait, is hostinger shutting down their free hosting |
|
21:14
π
|
betamax |
I can't find any info on this |
|
21:14
π
|
twoTBHetz |
just visit any site: www.cinemahd.esy.es |
|
21:15
π
|
twoTBHetz |
that link did not work because it's a redirection, but see this one http://aea.zz.vc/ |
|
21:17
π
|
twoTBHetz |
I am currently running Sublist3r but it always takes forever and i only got so many IPs |
|
21:18
π
|
twoTBHetz |
betamax, do you see? |
|
21:19
π
|
twoTBHetz |
so far i got roughly 7705 subdomain names from 16mb.com ahol.es azz.vc esy.es hol.es zz.vc but i still need to check whether they are in use |
|
21:21
π
|
betamax |
ah, yes: |
|
21:21
π
|
betamax |
This website is hosted on a free hosting platform provided by Hostinger. The platform is deprecated, and it will be turned off in two weeks. If you are a website owner, please log into your control panel here. |
|
21:22
π
|
twoTBHetz |
I am currently collecting subdomains from 96.lt . I have not scanned pe.hu and .890m.com yet and still need to do a little more digging for more top-levels |
|
21:22
π
|
|
BlueMaxim has joined #archiveteam |
|
21:24
π
|
betamax |
great work. Any idea when that notice first appeared? It's a pain not knowing the actual shutdown date |
|
21:25
π
|
|
BlueMax has quit IRC (Ping timeout: 260 seconds) |
|
21:25
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
21:26
π
|
twoTBHetz |
no somebody pasted it yesterday in here |
|
21:26
π
|
|
Kitaru has quit IRC (Quit: This computer has gone to sleep) |
|
21:27
π
|
twoTBHetz |
his name was something along the lines of Arctic |
|
21:27
π
|
twoTBHetz |
if i recall correctly |
|
21:27
π
|
betamax |
Ah, missed that |
|
21:27
π
|
betamax |
fyi: hostinger is *very closely* tied to 000webhost |
|
21:28
π
|
betamax |
now, 000webhost are unlikely to stop being free - it's in their name |
|
21:28
π
|
betamax |
but I'm suspicious |
|
21:28
π
|
twoTBHetz |
yeah hostinger's free hosting side does not advertise itself but 000webhost instead. i do not think that one will go down |
|
21:29
π
|
twoTBHetz |
are there faster/better tools than Sublist3r? The online services it uses are starting to rate limit me hard and i have already burned through two IPs |
|
21:30
π
|
|
Sk1d has joined #archiveteam |
|
21:31
π
|
SketchCow |
--------------------------- |
|
21:31
π
|
SketchCow |
IMPORTANT NOTE |
|
21:31
π
|
SketchCow |
If you upload WARC files into the general open collections on archive.org |
|
21:31
π
|
SketchCow |
...they're gonna end up in the WARC Zone (https://archive.org/details/warczone) |
|
21:31
π
|
SketchCow |
Unless you've arranged them to go to an archive team collection |
|
21:32
π
|
SketchCow |
If you're smashing endless WARCs into the open collection, then somewhere you |
|
21:32
π
|
SketchCow |
didn't do a thing you probably should have done. Contact me or others. |
|
21:32
π
|
SketchCow |
--------------------------- |
|
21:33
π
|
twoTBHetz |
but there are also sites like http://profin.by/blog/ which feature the banner but are not easy to discover i think (but i know little) |
|
21:33
π
|
betamax |
SketchCow: but I'm guessing any non-ArchiveTeam grabs in WARC format won't be added to Wayback? |
|
21:33
π
|
SketchCow |
They will not |
|
21:40
π
|
anarcat |
SketchCow: i'd love to do the right thing |
|
21:40
π
|
anarcat |
SketchCow: i'm not sure what i'm missing |
|
21:40
π
|
anarcat |
SketchCow: i uploaded those so far https://archive.org/details/@anarcat |
|
21:47
π
|
|
Kitaru has joined #archiveteam |
|
21:57
π
|
|
w0rmybak has quit IRC (Quit: Ping timeout (120 seconds)) |
|
21:57
π
|
|
kiskabak has quit IRC (Quit: Ping timeout (120 seconds)) |
|
21:57
π
|
|
Flashback has quit IRC (Quit: Ping timeout (120 seconds)) |
|
21:58
π
|
|
w0rmybak has joined #archiveteam |
|
22:00
π
|
|
kiskabak has joined #archiveteam |
|
22:00
π
|
|
w0rmybak has quit IRC (Client Quit) |
|
22:00
π
|
|
w0rmybak has joined #archiveteam |
|
22:03
π
|
|
Flashback has joined #archiveteam |
|
22:10
π
|
|
schbirid has quit IRC (Remote host closed the connection) |
|
22:11
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
22:11
π
|
|
bakJAA_ is now known as bakJAA |
|
22:13
π
|
twoTBHetz |
betamax how would i best submit my to-be-archived URLs to the wayback machine? |
|
22:15
π
|
|
Sk1d has joined #archiveteam |
|
22:22
π
|
betamax |
it's a bit tricky |
|
22:22
π
|
betamax |
what I would do is a two-step process |
|
22:23
π
|
betamax |
1.) get a list of all the sites you discover, and archive their home pages only using archivebot '!ao' |
|
22:24
π
|
betamax |
2.) download all the sites yourself, ideally using something that can generate WARCs like wget, then extract all the urls from that |
|
22:24
π
|
twoTBHetz |
so only one page per domain? |
|
22:24
π
|
betamax |
all those urls can then be put into archivebot |
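The "extract all the urls" part of step 2 can be sketched by reading the `WARC-Target-URI` header of each response record in a gzip-compressed WARC. This is a minimal standard-library sketch, not the tooling anyone in the channel used: the function name `warc_target_uris` is hypothetical, it assumes each record header carries a correct `Content-Length`, and it ignores folded (continuation) header lines.

```python
import gzip


def warc_target_uris(path):
    """Return unique WARC-Target-URI values of response records, in order."""
    uris = []
    seen = set()
    with gzip.open(path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                break  # end of file
            if not line.startswith(b"WARC/"):
                continue  # blank separator lines between records
            # Read the record's header block up to the first empty line.
            headers = {}
            for raw in iter(fh.readline, b""):
                raw = raw.strip()
                if not raw:
                    break
                name, _, value = raw.partition(b":")
                headers[name.strip().lower()] = value.strip()
            # Skip the payload so its contents are never parsed as headers.
            fh.read(int(headers.get(b"content-length", b"0")))
            if headers.get(b"warc-type") != b"response":
                continue
            uri = headers.get(b"warc-target-uri", b"").decode("utf-8", "replace")
            uri = uri.strip("<>")  # some writers wrap the URI in angle brackets
            if uri and uri not in seen:
                seen.add(uri)
                uris.append(uri)
    return uris


if __name__ == "__main__":
    import sys

    # Usage: python extract_urls.py fileName.warc.gz > urls.txt
    for uri in warc_target_uris(sys.argv[1]):
        print(uri)
```

The payload is read into memory before being discarded, which is acceptable for small sites; a dedicated WARC library would be the safer choice for large grabs.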
|
22:25
π
|
twoTBHetz |
My question is: only one entrypoint per domain or spider the complete thing |
|
22:25
π
|
twoTBHetz |
how would i go about doing warc files in wget |
|
22:28
π
|
betamax |
well, first just get the home pages into archivebot, so something is saved |
|
22:28
π
|
betamax |
then spider / download the complete thing yourself, so there is a copy, even if not in wayback |
|
22:29
π
|
betamax |
then try and get it into wayback by putting a list of all the urls found into archivebot |
|
22:29
π
|
betamax |
see https://www.archiveteam.org/index.php/Wget_with_WARC_output for warc with wget |
|
22:30
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
22:33
π
|
|
Sk1d has joined #archiveteam |
|
22:35
π
|
twoTBHetz |
mhh how does warc work with the spider option? |
|
22:39
π
|
twoTBHetz |
I am currently doing something like this: wget "-r" "-l4" "--spider" "--tries=0" "-o" file "--no-verbose" "-D" "startreknews.esy.es" "startreknews.esy.es" to attempt to get all the links on the url in question |
|
22:43
π
|
betamax |
ah, bad choice of wording from me above: I'm not sure "spider" is the argument you want |
|
22:43
π
|
betamax |
according to the manual it means pages won't be downloaded |
|
22:43
π
|
betamax |
I don't know if that will affect the warc |
|
22:44
π
|
betamax |
and warc is just: --warc-file=fileName |
|
22:44
π
|
betamax |
which creates fileName.warc.gz |
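Putting the recursive options from the earlier command together with `--warc-file` might look like the sketch below, which only builds and prints the argument list. The helper name `wget_warc_command`, the depth, the retry count and the one-second wait are illustrative assumptions; `--spider` is left out because, as noted above, it means pages are not downloaded.

```python
import shlex


def wget_warc_command(host, warc_name, depth=4):
    """Build a wget argument list for a recursive crawl that writes a WARC."""
    return [
        "wget",
        "--recursive",
        f"--level={depth}",
        "--page-requisites",          # also fetch images, CSS and scripts
        f"--domains={host}",          # stay on the one site
        "--tries=3",                  # assumption: bounded retries instead of --tries=0
        "--wait=1",                   # assumption: be gentle with the host
        "--no-verbose",
        f"--output-file={warc_name}.log",
        f"--warc-file={warc_name}",   # wget appends .warc.gz
        "--warc-cdx",                 # also write a CDX index next to the WARC
        f"http://{host}/",
    ]


if __name__ == "__main__":
    cmd = wget_warc_command("startreknews.esy.es", "startreknews.esy.es")
    # Print a shell-safe command line; run it with subprocess.run(cmd) if desired.
    print(shlex.join(cmd))
```

Building the list rather than a single string keeps each hostname as one argument, which matters when the same helper is looped over thousands of discovered subdomains.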
|
22:45
π
|
twoTBHetz |
if somebody has a tool chain that works for him i would like to give him my domains |
|
22:45
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
|
22:46
π
|
twoTBHetz |
is the warc spider-like too, in the sense that it fetches a complete website (and the sites on that domain it links to) and not just the link i point it to? |
|
22:48
π
|
betamax |
twoTBHetz: afraid I have to go (to bed) now |
|
22:48
π
|
twoTBHetz |
i see |
|
22:48
π
|
betamax |
but fyi, apparently hostinger emailed people on 25th september saying they had 2 months |
|
22:50
π
|
twoTBHetz |
so two weeks is roughly right |
|
22:51
π
|
|
Sk1d has joined #archiveteam |
|
22:55
π
|
|
Mateon1 has quit IRC (Read error: Operation timed out) |
|
22:56
π
|
|
dashcloud has joined #archiveteam |
|
22:56
π
|
|
Mateon1 has joined #archiveteam |
|
23:12
π
|
|
matthusb_ has joined #archiveteam |
|
23:14
π
|
|
matthusby has quit IRC (Read error: Operation timed out) |
|
23:18
π
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
23:19
π
|
|
adinbied has quit IRC (Left Channel.) |
|
23:20
π
|
|
matthusb_ has quit IRC (Read error: Operation timed out) |
|
23:39
π
|
|
adinbied has joined #archiveteam |