Time |
Nickname |
Message |
00:16
π
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
00:18
π
|
|
signius has quit IRC (Ping timeout: 265 seconds) |
00:30
π
|
|
signius has joined #archiveteam |
00:32
π
|
|
primus104 has quit IRC (Leaving.) |
00:40
π
|
|
mistym has quit IRC (Remote host closed the connection) |
00:54
π
|
|
mistym has joined #archiveteam |
01:08
π
|
|
schbirid2 has joined #archiveteam |
01:10
π
|
|
schbirid has quit IRC (Read error: Operation timed out) |
01:11
π
|
|
boozehoun has quit IRC (Ping timeout: 258 seconds) |
01:19
π
|
|
boozehoun has joined #archiveteam |
01:42
π
|
|
Ymgve has quit IRC () |
01:54
π
|
|
cadbury_ has joined #archiveteam |
01:58
π
|
|
brayden has joined #archiveteam |
02:03
π
|
|
aNthraXx has joined #archiveteam |
02:47
π
|
|
rejon has joined #archiveteam |
02:48
π
|
|
lytv has quit IRC (Ping timeout: 252 seconds) |
02:52
π
|
|
lytv has joined #archiveteam |
03:58
π
|
|
mistym has quit IRC (Remote host closed the connection) |
04:29
π
|
|
mistym has joined #archiveteam |
05:03
π
|
|
Ctrl-S has quit IRC ( HydraIRC -> http://www.hydrairc.com <- In tests, 0x09 out of 0x0A l33t h4x0rz prefer it :)) |
05:10
π
|
|
Ctrl-S has joined #archiveteam |
05:48
π
|
|
marvinw has quit IRC (Read error: Operation timed out) |
06:13
π
|
|
marvinw has joined #archiveteam |
06:15
π
|
|
Control-S has joined #archiveteam |
06:19
π
|
|
Ctrl-S has quit IRC (Read error: Operation timed out) |
06:19
π
|
|
Control-S is now known as Ctrl-S |
06:20
π
|
|
primus104 has joined #archiveteam |
06:51
π
|
|
mistym has quit IRC (Remote host closed the connection) |
07:11
π
|
|
dinomite has quit IRC (Remote host closed the connection) |
07:16
π
|
|
dinomite has joined #archiveteam |
07:19
π
|
|
atomotic has joined #archiveteam |
07:51
π
|
|
mistym has joined #archiveteam |
08:02
π
|
schbirid2 |
wp494: kniffy: that grooveshark.io thing is a scam... jesus, have some sense... |
08:02
π
|
schbirid2 |
it's disgusting how many "reputable" websites jump on it |
08:05
π
|
|
mistym has quit IRC (Read error: Operation timed out) |
08:07
π
|
|
dinomite has quit IRC (Read error: Operation timed out) |
08:16
π
|
|
dinomite has joined #archiveteam |
08:22
π
|
|
MMovie has joined #archiveteam |
08:24
π
|
|
MMovie1 has quit IRC (Ping timeout: 306 seconds) |
08:29
π
|
|
dinomite has quit IRC (Remote host closed the connection) |
08:29
π
|
|
dinomite has joined #archiveteam |
08:36
π
|
BlueMaxim |
clever idea though. jump on a dead site's name and use it for advertising |
09:06
π
|
Ctrl-S |
or gathering user info |
09:06
π
|
Ctrl-S |
how many would use the same PW? |
09:07
π
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
09:39
π
|
Nemo_bis |
For archivebot: http://datasets.wikimedia.org/ (only root currently available in wayback machine) |
09:41
π
|
|
mistym has joined #archiveteam |
09:55
π
|
|
mistym has quit IRC (Read error: Operation timed out) |
10:42
π
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
10:46
π
|
|
[Beta] has joined #archiveteam |
10:48
π
|
|
john1 has quit IRC (Ping timeout: 252 seconds) |
10:49
π
|
[Beta] |
was anyone able to grab P.T. before it vanished off playstation store? saw the bot grabbed the konami page for it… |
10:50
π
|
|
primus104 has quit IRC (Leaving.) |
11:03
π
|
|
john1 has joined #archiveteam |
11:28
π
|
midas |
pt? |
11:29
π
|
|
mistym has joined #archiveteam |
11:32
π
|
|
Ymgve has joined #archiveteam |
11:39
π
|
|
mistym has quit IRC (Read error: Operation timed out) |
12:08
π
|
Rotab |
that silent hill demo |
12:08
π
|
Rotab |
Playable Teaser |
12:32
π
|
|
atomotic has joined #archiveteam |
12:34
π
|
|
quEt has joined #archiveteam |
12:34
π
|
|
quEt has quit IRC (Client Quit) |
12:47
π
|
|
sankin has joined #archiveteam |
13:16
π
|
|
primus104 has joined #archiveteam |
13:51
π
|
|
garyrh has quit IRC (http://bnc4free.com/) |
13:58
π
|
|
Start has quit IRC (Disconnected.) |
14:19
π
|
|
mistym has joined #archiveteam |
14:33
π
|
|
mistym has quit IRC (Read error: Operation timed out) |
14:35
π
|
|
Start has joined #archiveteam |
14:38
π
|
|
mistym has joined #archiveteam |
14:38
π
|
|
caber has quit IRC (Read error: Operation timed out) |
14:41
π
|
|
mistym has quit IRC (Remote host closed the connection) |
14:41
π
|
|
caber has joined #archiveteam |
14:57
π
|
|
cirdan_ has joined #archiveteam |
14:59
π
|
cirdan_ |
hey all. have a question about trying to archive a drupal site. I'm using httrack and it goes ok, but by the end i have thousands of files like index.html index398.html games894-html. there should only be a few of them, I'm using a rewrite because it uses page= for page numbers |
14:59
π
|
|
Start has quit IRC (Disconnected.) |
15:00
π
|
|
goekesmi has quit IRC (Remote host closed the connection) |
15:00
π
|
cirdan_ |
any ideas to stop this? It seems to happen at the end of the scrape |
15:01
π
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
15:02
π
|
|
goekesmi has joined #archiveteam |
15:03
π
|
|
Start has joined #archiveteam |
15:06
π
|
|
mistym has joined #archiveteam |
15:13
π
|
achip |
cirdan_: to me it sounds like a pagination that continually has a "next" link that is <this page number>+<page offset> even if there isn't more content, do you have an example? |
15:16
π
|
|
Start has quit IRC (Disconnected.) |
15:20
π
|
cirdan_ |
I'm trying to snapshot http://macintoshgarden.org |
15:21
π
|
cirdan_ |
I'm trying something different now so I don't have anything downloaded atm |
15:21
π
|
cirdan_ |
it's a drupal 6 site |
15:25
π
|
cirdan_ |
I was thinking maybe it takes so long and the site is set to not cache, that it was re-getting all the indices at the end again |
15:26
π
|
cirdan_ |
the odd thing also if you give it an invalid page link it'll take you to page 1 |
15:36
π
|
DFJustin |
drupal has a lot of dumb things that mess up crawling, you'd probably be better running it through archivebot which has some anti-drupal measures |
15:40
π
|
|
nertzy has joined #archiveteam |
15:42
π
|
balrog |
or using wpull directly |
15:44
π
|
|
Start has joined #archiveteam |
15:46
π
|
cirdan_ |
:p |
15:49
π
|
|
Start has quit IRC (Read error: Connection reset by peer) |
15:49
π
|
|
nertzy has quit IRC (This computer has gone to sleep) |
15:50
π
|
cirdan_ |
yeah i'll try wpull |
15:50
π
|
cirdan_ |
any special settings needed? |
15:51
π
|
DFJustin |
well I don't know how much of the smarts are in wpull as opposed to higher layers |
15:52
π
|
|
mistym has quit IRC (Remote host closed the connection) |
16:08
π
|
|
primus104 has quit IRC (Leaving.) |
16:11
π
|
|
mistym has joined #archiveteam |
16:13
π
|
|
c_b has joined #archiveteam |
16:21
π
|
|
Start has joined #archiveteam |
16:24
π
|
|
garyrh has joined #archiveteam |
16:39
π
|
cirdan_ |
hmm enabling compression seemed to not work. was telling me server misbehaved |
16:39
π
|
cirdan_ |
removing it worked |
16:39
π
|
cirdan_ |
odd because the server has compression on |
16:43
π
|
|
Start has quit IRC (Disconnected.) |
16:47
π
|
|
signius has quit IRC (Ping timeout: 240 seconds) |
16:48
π
|
cirdan_ |
why does --exclude-directories not work? i have --exclude-directories "/sites/macintoshgarden.org/files/games/" and it wants to download from it |
16:49
π
|
cirdan_ |
i have 5 I don't want, and 5 --exclude-directories |
16:50
π
|
balrog |
you mean --reject-regex ? |
16:51
π
|
cirdan_ |
no |
16:52
π
|
cirdan_ |
i mean --exclude-domains: don’t download paths in LIST |
16:53
π
|
cirdan_ |
err --exclude-directories |
16:58
π
|
cirdan_ |
if you can only have one, the command should fail with multiple on the command line |
16:59
π
|
cirdan_ |
it also doesn't say how multiple entries should be delimited… space, comma, colon? |
17:00
π
|
|
signius has joined #archiveteam |
17:06
π
|
|
aaaaaaaaa has joined #archiveteam |
17:07
π
|
|
mistym has quit IRC (Remote host closed the connection) |
17:12
π
|
|
mistym has joined #archiveteam |
17:15
π
|
|
SimpBrain has joined #archiveteam |
17:23
π
|
|
primus104 has joined #archiveteam |
17:25
π
|
|
c_b has quit IRC (Quit: c_b) |
17:44
π
|
|
primus104 has quit IRC (Leaving.) |
17:44
π
|
|
xmc has quit IRC (Remote host closed the connection) |
17:45
π
|
|
xmc has joined #archiveteam |
17:48
π
|
|
nertzy has joined #archiveteam |
18:16
π
|
|
nertzy has quit IRC (This computer has gone to sleep) |
18:32
π
|
|
primus104 has joined #archiveteam |
18:37
π
|
|
RichardG_ is now known as RichardG |
18:47
π
|
|
habi has joined #archiveteam |
18:49
π
|
|
habi has left |
18:58
π
|
|
godane has quit IRC (Ping timeout: 265 seconds) |
19:18
π
|
|
godane has joined #archiveteam |
19:22
π
|
|
cirdan_ has quit IRC (Ping timeout: 240 seconds) |
19:26
π
|
|
Ymgve has quit IRC (Ping timeout: 506 seconds) |
19:30
π
|
|
Ymgve has joined #archiveteam |
19:51
π
|
|
habi has joined #archiveteam |
19:54
π
|
|
SN4T14_ has joined #archiveteam |
19:56
π
|
|
habi has left |
20:02
π
|
|
SN4T14 has quit IRC (Ping timeout: 512 seconds) |
20:02
π
|
|
mistym has quit IRC (Remote host closed the connection) |
20:02
π
|
|
mistym has joined #archiveteam |
20:08
π
|
Deewiant |
https://www.reddit.com/r/DataHoarder/comments/3532q9/longterm_retention/ if anybody knows (somebody who knows) how to get data off of those, consider contacting the guy before they're lost |
20:12
π
|
aaaaaaaaa |
seems somewhat funny to have someone on datahoarder with 72TB talking about destroying something like that. |
20:13
π
|
Sanqui |
somebody should post IABAK on that subreddit |
20:17
π
|
balrog |
those look like 9 track tapes |
20:17
π
|
balrog |
there are people who have equipment |
20:17
π
|
balrog |
(cctech mailing list, etc) |
20:27
π
|
pikhq |
Shit, that quantity of tapes you could probably at least find someone willing to take 'em and do the searching on their own. |
20:31
π
|
|
mistym has quit IRC (Remote host closed the connection) |
20:44
π
|
|
mistym has joined #archiveteam |
20:45
π
|
|
BlueMaxim has joined #archiveteam |
20:51
π
|
|
sankin has quit IRC (Leaving.) |
21:08
π
|
goekesmi |
bexitexit |
21:24
π
|
xmc |
balrog DFJustin ersi Lord_Nigh underscor yipdw: spread the @s |
21:24
π
|
ersi |
No! |
21:24
π
|
xmc |
D: |
21:24
π
|
|
underscor sets mode: +o xmc |
21:27
π
|
|
xmc sets mode: +oooo chfoo SketchCow joepie91 closure |
21:28
π
|
|
SimpBrain has quit IRC (Quit: Leaving) |
21:33
π
|
|
mistym has quit IRC (Remote host closed the connection) |
21:36
π
|
SketchCow |
FOS is sort of healed from Halo |
21:36
π
|
SketchCow |
I'd like it to be 100% free of Halo before we ramp it up again |
21:36
π
|
SketchCow |
But it's going well in that direction |
21:38
π
|
SketchCow |
It was previously at, like, 85 Halo 40gb units |
21:38
π
|
SketchCow |
Now at 3 |
21:38
π
|
SketchCow |
5tb free on that partition |
21:38
π
|
SketchCow |
So that bodes well |
21:47
π
|
|
mistym has joined #archiveteam |
21:54
π
|
|
Ymgve has quit IRC () |
22:06
π
|
arkiver |
SketchCow: Kenshin is holding around 1T of google baraza items |
22:06
π
|
arkiver |
when the project is fully done, do you have room on FOS for them? |
22:13
π
|
SketchCow |
I do |
22:16
π
|
|
DFJustin has quit IRC (IMHOSTFU) |
22:16
π
|
|
rolf has joined #archiveteam |
22:17
π
|
|
DFJustin has joined #archiveteam |
22:17
π
|
|
Start has joined #archiveteam |
22:18
π
|
arkiver |
ok, they'll be synced over when the project is done |
22:21
π
|
|
toad2 has joined #archiveteam |
22:27
π
|
|
toad1 has quit IRC (Read error: Operation timed out) |
22:45
π
|
|
rolf has quit IRC (Linkinus - http://linkinus.com) |
22:58
π
|
Lord_Nigh |
SketchCow: halo? as in what exactly? the game? |
23:08
π
|
SketchCow |
Shhhh |
23:08
π
|
SketchCow |
It's handled. |
23:08
π
|
SketchCow |
It's a project that's been going on. It'll come back. I had it going and it flooded our buffer. |