| Time |
Nickname |
Message |
|
00:16
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
|
00:18
|
|
signius has quit IRC (Ping timeout: 265 seconds) |
|
00:30
|
|
signius has joined #archiveteam |
|
00:32
|
|
primus104 has quit IRC (Leaving.) |
|
00:40
|
|
mistym has quit IRC (Remote host closed the connection) |
|
00:54
|
|
mistym has joined #archiveteam |
|
01:08
|
|
schbirid2 has joined #archiveteam |
|
01:10
|
|
schbirid has quit IRC (Read error: Operation timed out) |
|
01:11
|
|
boozehoun has quit IRC (Ping timeout: 258 seconds) |
|
01:19
|
|
boozehoun has joined #archiveteam |
|
01:42
|
|
Ymgve has quit IRC () |
|
01:54
|
|
cadbury_ has joined #archiveteam |
|
01:58
|
|
brayden has joined #archiveteam |
|
02:03
|
|
aNthraXx has joined #archiveteam |
|
02:47
|
|
rejon has joined #archiveteam |
|
02:48
|
|
lytv has quit IRC (Ping timeout: 252 seconds) |
|
02:52
|
|
lytv has joined #archiveteam |
|
03:58
|
|
mistym has quit IRC (Remote host closed the connection) |
|
04:29
|
|
mistym has joined #archiveteam |
|
05:03
|
|
Ctrl-S has quit IRC ( HydraIRC -> http://www.hydrairc.com <- In tests, 0x09 out of 0x0A l33t h4x0rz prefer it :)) |
|
05:10
|
|
Ctrl-S has joined #archiveteam |
|
05:48
|
|
marvinw has quit IRC (Read error: Operation timed out) |
|
06:13
|
|
marvinw has joined #archiveteam |
|
06:15
|
|
Control-S has joined #archiveteam |
|
06:19
|
|
Ctrl-S has quit IRC (Read error: Operation timed out) |
|
06:19
|
|
Control-S is now known as Ctrl-S |
|
06:20
|
|
primus104 has joined #archiveteam |
|
06:51
|
|
mistym has quit IRC (Remote host closed the connection) |
|
07:11
|
|
dinomite has quit IRC (Remote host closed the connection) |
|
07:16
|
|
dinomite has joined #archiveteam |
|
07:19
|
|
atomotic has joined #archiveteam |
|
07:51
|
|
mistym has joined #archiveteam |
|
08:02
|
schbirid2 |
wp494: kniffy: that grooveshark.io thing is a scam... jesus, have some sense... |
|
08:02
|
schbirid2 |
it's disgusting how many "reputable" websites jump on it |
|
08:05
|
|
mistym has quit IRC (Read error: Operation timed out) |
|
08:07
|
|
dinomite has quit IRC (Read error: Operation timed out) |
|
08:16
|
|
dinomite has joined #archiveteam |
|
08:22
|
|
MMovie has joined #archiveteam |
|
08:24
|
|
MMovie1 has quit IRC (Ping timeout: 306 seconds) |
|
08:29
|
|
dinomite has quit IRC (Remote host closed the connection) |
|
08:29
|
|
dinomite has joined #archiveteam |
|
08:36
|
BlueMaxim |
clever idea though. jump on a dead site's name and use it for advertising |
|
09:06
|
Ctrl-S |
or gathering user info |
|
09:06
|
Ctrl-S |
how many would use the same PW? |
|
09:07
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
|
09:39
|
Nemo_bis |
For archivebot: http://datasets.wikimedia.org/ (only root currently available in wayback machine) |
|
09:41
|
|
mistym has joined #archiveteam |
|
09:55
|
|
mistym has quit IRC (Read error: Operation timed out) |
|
10:42
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
10:46
|
|
[Beta] has joined #archiveteam |
|
10:48
|
|
john1 has quit IRC (Ping timeout: 252 seconds) |
|
10:49
|
[Beta] |
was anyone able to grab P.T. before it vanished off playstation store? saw the bot grabbed the konami page for it… |
|
10:50
|
|
primus104 has quit IRC (Leaving.) |
|
11:03
|
|
john1 has joined #archiveteam |
|
11:28
|
midas |
pt? |
|
11:29
|
|
mistym has joined #archiveteam |
|
11:32
|
|
Ymgve has joined #archiveteam |
|
11:39
|
|
mistym has quit IRC (Read error: Operation timed out) |
|
12:08
|
Rotab |
that silent hill demo |
|
12:08
|
Rotab |
Playable Teaser |
|
12:32
|
|
atomotic has joined #archiveteam |
|
12:34
|
|
quEt has joined #archiveteam |
|
12:34
|
|
quEt has quit IRC (Client Quit) |
|
12:47
|
|
sankin has joined #archiveteam |
|
13:16
|
|
primus104 has joined #archiveteam |
|
13:51
|
|
garyrh has quit IRC (http://bnc4free.com/) |
|
13:58
|
|
Start has quit IRC (Disconnected.) |
|
14:19
|
|
mistym has joined #archiveteam |
|
14:33
|
|
mistym has quit IRC (Read error: Operation timed out) |
|
14:35
|
|
Start has joined #archiveteam |
|
14:38
|
|
mistym has joined #archiveteam |
|
14:38
|
|
caber has quit IRC (Read error: Operation timed out) |
|
14:41
|
|
mistym has quit IRC (Remote host closed the connection) |
|
14:41
|
|
caber has joined #archiveteam |
|
14:57
|
|
cirdan_ has joined #archiveteam |
|
14:59
|
cirdan_ |
hey all. have a question about trying to archive a drupal site. I'm using httrack and it goes ok, but by the end i have thousands of files like index.html, index398.html, games894.html. there should only be a few of them. I'm using a rewrite because it uses page= for page numbers |
|
14:59
|
|
Start has quit IRC (Disconnected.) |
|
15:00
|
|
goekesmi has quit IRC (Remote host closed the connection) |
|
15:00
|
cirdan_ |
any ideas to stop this? It seems to happen at the end of the scrape |
|
15:01
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
|
15:02
|
|
goekesmi has joined #archiveteam |
|
15:03
|
|
Start has joined #archiveteam |
|
15:06
|
|
mistym has joined #archiveteam |
|
15:13
|
achip |
cirdan_: to me it sounds like a pagination that continually has a "next" link that is <this page number>+<page offset> even if there isn't more content, do you have an example? |
|
15:16
|
|
Start has quit IRC (Disconnected.) |
|
15:20
|
cirdan_ |
I'm trying to snapshot http://macintoshgarden.org |
|
15:21
|
cirdan_ |
I'm trying something different now so I don't have anything downloaded atm |
|
15:21
|
cirdan_ |
it's a drupal 6 site |
|
15:25
|
cirdan_ |
I was thinking maybe it takes so long and the site is set to not cache, that it was re-getting all the indices at the end again |
|
15:26
|
cirdan_ |
the odd thing also if you give it an invalid page link it'll take you to page 1 |
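[Editor's sketch] The behaviour described above (a "next" link that never ends, plus invalid page numbers silently serving page 1) is a common crawler trap. One way to cut it off might be to hash each page body and stop as soon as a body repeats. This is only a minimal sketch: `crawl_pages` and the `fetch` callable are hypothetical names, not part of httrack or wpull, and a real Drupal page may carry dynamic markup that would need stripping before hashing.

```python
import hashlib
from typing import Callable, Iterator, Set, Tuple


def crawl_pages(
    fetch: Callable[[int], bytes], max_pages: int = 10_000
) -> Iterator[Tuple[int, bytes]]:
    """Yield (page_number, body) until a body repeats one already seen.

    `fetch` is a hypothetical callable that returns the raw body for a
    given value of the page= parameter.
    """
    seen: Set[str] = set()
    for page in range(max_pages):
        body = fetch(page)
        digest = hashlib.sha256(body).hexdigest()
        if digest in seen:
            # Same content as an earlier page: the site has most likely
            # wrapped back to page 1, so stop following "next".
            break
        seen.add(digest)
        yield page, body
```

With a stand-in `fetch` that serves three distinct pages and falls back to the first page for anything else, the generator yields pages 0, 1 and 2 and then stops.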
|
15:36
|
DFJustin |
drupal has a lot of dumb things that mess up crawling, you'd probably be better running it through archivebot which has some anti-drupal measures |
|
15:40
|
|
nertzy has joined #archiveteam |
|
15:42
|
balrog |
or using wpull directly |
|
15:44
|
|
Start has joined #archiveteam |
|
15:46
|
cirdan_ |
:p |
|
15:49
|
|
Start has quit IRC (Read error: Connection reset by peer) |
|
15:49
|
|
nertzy has quit IRC (This computer has gone to sleep) |
|
15:50
|
cirdan_ |
yeah i'll try wpull |
|
15:50
|
cirdan_ |
any special settings needed? |
|
15:51
|
DFJustin |
well I don't know how much of the smarts are in wpull as opposed to higher layers |
|
15:52
|
|
mistym has quit IRC (Remote host closed the connection) |
|
16:08
|
|
primus104 has quit IRC (Leaving.) |
|
16:11
|
|
mistym has joined #archiveteam |
|
16:13
|
|
c_b has joined #archiveteam |
|
16:21
|
|
Start has joined #archiveteam |
|
16:24
|
|
garyrh has joined #archiveteam |
|
16:39
|
cirdan_ |
hmm enabling compression seemed to not work. was telling me server misbehaved |
|
16:39
|
cirdan_ |
removing it worked |
|
16:39
|
cirdan_ |
odd because the server has compression on |
|
16:43
|
|
Start has quit IRC (Disconnected.) |
|
16:47
|
|
signius has quit IRC (Ping timeout: 240 seconds) |
|
16:48
|
cirdan_ |
why does --exclude-directories not work? i have --exclude-directories "/sites/macintoshgarden.org/files/games/" and it wants to download from it |
|
16:49
|
cirdan_ |
i have 5 I don't want, and 5 --exclude-directories |
|
16:50
|
balrog |
you mean --reject-regex ? |
|
16:51
|
cirdan_ |
no |
|
16:52
|
cirdan_ |
i mean --exclude-domains: don't download paths in LIST |
|
16:53
|
cirdan_ |
err --exclude-directories |
|
16:58
|
cirdan_ |
if you can only have one, the command should fail with multiple on the command line |
|
16:59
|
cirdan_ |
it also doesn't say how multiple entries should be delimited… space, comma, colon? |
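[Editor's sketch] One way to sidestep the delimiter question might be balrog's --reject-regex suggestion: fold all the unwanted directories into a single pattern and pass it once. The helper below is a hypothetical illustration, not part of wpull. It assumes the regex is matched against the full URL, as it is in GNU Wget; whether wpull does the same should be checked against its documentation.

```python
import re
from typing import Iterable


def build_reject_regex(directories: Iterable[str]) -> str:
    """Return one regex matching any URL whose path starts with a listed directory.

    Hypothetical helper; assumes the crawler tests the pattern against
    the complete URL (scheme and host included).
    """
    prefixes = [re.escape("/" + d.strip("/") + "/") for d in directories]
    if not prefixes:
        raise ValueError("need at least one directory to reject")
    return r"^https?://[^/]+(?:" + "|".join(prefixes) + ")"


# The directory below is the one quoted in the log; more can be appended.
pattern = build_reject_regex(["/sites/macintoshgarden.org/files/games/"])
print(pattern)
```

The resulting string could then be supplied as the single value of --reject-regex, so the five unwanted directories end up in one option instead of five.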
|
17:00
|
|
signius has joined #archiveteam |
|
17:06
|
|
aaaaaaaaa has joined #archiveteam |
|
17:07
|
|
mistym has quit IRC (Remote host closed the connection) |
|
17:12
|
|
mistym has joined #archiveteam |
|
17:15
|
|
SimpBrain has joined #archiveteam |
|
17:23
|
|
primus104 has joined #archiveteam |
|
17:25
|
|
c_b has quit IRC (Quit: c_b) |
|
17:44
|
|
primus104 has quit IRC (Leaving.) |
|
17:44
|
|
xmc has quit IRC (Remote host closed the connection) |
|
17:45
|
|
xmc has joined #archiveteam |
|
17:48
|
|
nertzy has joined #archiveteam |
|
18:16
|
|
nertzy has quit IRC (This computer has gone to sleep) |
|
18:32
|
|
primus104 has joined #archiveteam |
|
18:37
|
|
RichardG_ is now known as RichardG |
|
18:47
|
|
habi has joined #archiveteam |
|
18:49
|
|
habi has left |
|
18:58
|
|
godane has quit IRC (Ping timeout: 265 seconds) |
|
19:18
|
|
godane has joined #archiveteam |
|
19:22
|
|
cirdan_ has quit IRC (Ping timeout: 240 seconds) |
|
19:26
|
|
Ymgve has quit IRC (Ping timeout: 506 seconds) |
|
19:30
|
|
Ymgve has joined #archiveteam |
|
19:51
|
|
habi has joined #archiveteam |
|
19:54
|
|
SN4T14_ has joined #archiveteam |
|
19:56
|
|
habi has left |
|
20:02
|
|
SN4T14 has quit IRC (Ping timeout: 512 seconds) |
|
20:02
|
|
mistym has quit IRC (Remote host closed the connection) |
|
20:02
|
|
mistym has joined #archiveteam |
|
20:08
|
Deewiant |
https://www.reddit.com/r/DataHoarder/comments/3532q9/longterm_retention/ if anybody knows (somebody who knows) how to get data off of those, consider contacting the guy before they're lost |
|
20:12
|
aaaaaaaaa |
seems somewhat funny to have someone on datahoarder with 72TB talking about destroying something like that. |
|
20:13
|
Sanqui |
somebody should post IABAK on that subreddit |
|
20:17
|
balrog |
those look like 9 track tapes |
|
20:17
|
balrog |
there are people who have equipment |
|
20:17
|
balrog |
(cctech mailing list, etc) |
|
20:27
|
pikhq |
Shit, that quantity of tapes you could probably at least find someone willing to take 'em and do the searching on their own. |
|
20:31
|
|
mistym has quit IRC (Remote host closed the connection) |
|
20:44
|
|
mistym has joined #archiveteam |
|
20:45
|
|
BlueMaxim has joined #archiveteam |
|
20:51
|
|
sankin has quit IRC (Leaving.) |
|
21:08
|
goekesmi |
bexitexit |
|
21:24
|
xmc |
balrog DFJustin ersi Lord_Nigh underscor yipdw: spread the @s |
|
21:24
|
ersi |
No! |
|
21:24
|
xmc |
D: |
|
21:24
|
|
underscor sets mode: +o xmc |
|
21:27
|
|
xmc sets mode: +oooo chfoo SketchCow joepie91 closure |
|
21:28
|
|
SimpBrain has quit IRC (Quit: Leaving) |
|
21:33
|
|
mistym has quit IRC (Remote host closed the connection) |
|
21:36
|
SketchCow |
FOS is sort of healed from Halo |
|
21:36
|
SketchCow |
I'd like it to be 100% free of Halo before we ramp it up again |
|
21:36
|
SketchCow |
But it's going well in that direction |
|
21:38
|
SketchCow |
It was previously at, like, 85 Halo 40gb units |
|
21:38
|
SketchCow |
Now at 3 |
|
21:38
|
SketchCow |
5tb free on that partition |
|
21:38
|
SketchCow |
So that bodes well |
|
21:47
|
|
mistym has joined #archiveteam |
|
21:54
|
|
Ymgve has quit IRC () |
|
22:06
|
arkiver |
SketchCow: Kenshin is holding around 1T of google baraza items |
|
22:06
|
arkiver |
when the project is fully done, do you have room on FOS for them? |
|
22:13
|
SketchCow |
I do |
|
22:16
|
|
DFJustin has quit IRC (IMHOSTFU) |
|
22:16
|
|
rolf has joined #archiveteam |
|
22:17
|
|
DFJustin has joined #archiveteam |
|
22:17
|
|
Start has joined #archiveteam |
|
22:18
|
arkiver |
ok, they'll be synced over when the project is done |
|
22:21
|
|
toad2 has joined #archiveteam |
|
22:27
|
|
toad1 has quit IRC (Read error: Operation timed out) |
|
22:45
|
|
rolf has quit IRC (Linkinus - http://linkinus.com) |
|
22:58
|
Lord_Nigh |
SketchCow: halo? as in what exactly? the game? |
|
23:08
|
SketchCow |
Shhhh |
|
23:08
|
SketchCow |
It's handled. |
|
23:08
|
SketchCow |
It's a project that's been going on. It'll come back. I had it going and it flooded our buffer. |