Time |
Nickname |
Message |
00:09
🔗
|
chfoo |
arkiver: ok, added. |
01:00
🔗
|
kyan |
DFJustin, also there are groups, with group forums and group comments |
01:03
🔗
|
|
Ravenloft has joined #archiveteam |
01:15
🔗
|
|
xtr-201 has joined #archiveteam |
01:36
🔗
|
|
GLaDOS has quit IRC (Ping timeout: 512 seconds) |
01:56
🔗
|
|
Kazzy has quit IRC (Read error: Operation timed out) |
01:56
🔗
|
|
Kazzy has joined #archiveteam |
02:07
🔗
|
|
signius has quit IRC (Read error: Operation timed out) |
02:16
🔗
|
|
Ymgve has quit IRC () |
02:18
🔗
|
|
SimpBrain has quit IRC (Read error: Connection reset by peer) |
02:21
🔗
|
|
signius has joined #archiveteam |
02:22
🔗
|
bmcginty |
Question. Does ArchiveTeam ever use backend APIs to grab content from failing sites, or is it restricted to typical webpages? |
02:27
🔗
|
chfoo |
for hyves, all the ajax calls were manually made to save all the photos which is why every archived page has the word "deadbeef" in it |
02:28
🔗
|
chfoo |
for viddler, reverse engineering was done and they got angry. |
02:30
🔗
|
bmcginty |
chfoo: ah. thanks. |
02:42
🔗
|
SketchCow |
http://teamarchive1.fnf.archive.org/DELETE-SCREENBIN/alanwood.net-inf-20140114-090506-5jxkt.warc.gz.png |
02:42
🔗
|
SketchCow |
Delicious |
02:43
🔗
|
SketchCow |
It's nice when it just works, although it does not often just work. |
02:52
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
02:53
🔗
|
|
mistym has joined #archiveteam |
03:15
🔗
|
|
khaoohs has joined #archiveteam |
03:30
🔗
|
|
GLaDOS has joined #archiveteam |
03:35
🔗
|
|
JMC has joined #archiveteam |
03:36
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
03:40
🔗
|
|
dashcloud has joined #archiveteam |
03:42
🔗
|
|
GLaDOS has quit IRC (Ping timeout: 260 seconds) |
03:45
🔗
|
|
GLaDOS has joined #archiveteam |
04:08
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
04:26
🔗
|
|
wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES) |
04:48
🔗
|
|
Mayonaise has quit IRC (Ping timeout: 512 seconds) |
05:13
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
05:14
🔗
|
|
mistym has joined #archiveteam |
05:27
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
05:28
🔗
|
|
wp494 has joined #archiveteam |
05:54
🔗
|
|
Mayonaise has joined #archiveteam |
05:55
🔗
|
|
primus104 has joined #archiveteam |
05:56
🔗
|
|
mistym has joined #archiveteam |
06:08
🔗
|
|
Mayonaise has quit IRC (Ping timeout: 512 seconds) |
06:17
🔗
|
|
Mayonaise has joined #archiveteam |
07:06
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
07:17
🔗
|
|
[Beta]_ has joined #archiveteam |
07:18
🔗
|
|
[Beta] has quit IRC (Ping timeout: 240 seconds) |
07:19
🔗
|
|
robink has quit IRC (Quit: No Ping reply in 180 seconds.) |
07:20
🔗
|
|
robink has joined #archiveteam |
07:39
🔗
|
|
[Beta]_ is now known as [Beta] |
07:51
🔗
|
|
schbirid has joined #archiveteam |
07:53
🔗
|
|
SimpBrain has joined #archiveteam |
08:03
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
08:06
🔗
|
|
dashcloud has joined #archiveteam |
09:10
🔗
|
|
primus104 has quit IRC (Leaving.) |
09:23
🔗
|
BlueMaxim |
hey, I just learnt that the Neverwinter Vault (an IGN hosted site that hosted game addons for Neverwinter Nights 1 & 2) went down a few months ago and never came back, but some people not only got a full copy of the website but actually put up a mirror of it at ftp://neverwintervault.org/rolovault/ - it might be worth feeding it into the Archive somehow. don't have the hardware to do it myself so I'd just put it here to |
09:23
🔗
|
BlueMaxim |
see if someone would like to do it |
09:47
🔗
|
|
Start has quit IRC (Ping timeout: 740 seconds) |
10:06
🔗
|
|
Start has joined #archiveteam |
10:11
🔗
|
|
sky__ has joined #archiveteam |
10:11
🔗
|
|
sky__ has left |
10:44
🔗
|
|
habi has joined #archiveteam |
10:49
🔗
|
|
svchfoo2 has quit IRC (Remote host closed the connection) |
10:52
🔗
|
|
svchfoo2 has joined #archiveteam |
10:53
🔗
|
|
svchfoo1 sets mode: +o svchfoo2 |
10:56
🔗
|
|
habi has left |
11:04
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
11:42
🔗
|
|
Ymgve has joined #archiveteam |
11:58
🔗
|
|
Wolfie has joined #archiveteam |
12:42
🔗
|
|
primus104 has joined #archiveteam |
12:43
🔗
|
|
sankin has joined #archiveteam |
12:54
🔗
|
johtso |
is anyone able to upload to IA with the s3-like api at the moment? |
12:56
🔗
|
johtso |
I'm worried it's my api keys that have been blocked, unless everyone is getting 503 |
13:03
🔗
|
|
brayden has quit IRC (Quit: Leaving) |
13:11
🔗
|
johtso |
Don't seem to be able to upload through the web interface either. |
13:30
🔗
|
Nemo_bis |
johtso: what sort of requests are you making? |
13:31
🔗
|
johtso |
Nemo_bis, just normal upload requests, using the ia tool, and using the upload tool on the website |
13:31
🔗
|
johtso |
unless there's some kind of site wide outage I assume my account got blocked when I was making a bunch of parallel uploads.. |
13:31
🔗
|
|
brayden has joined #archiveteam |
13:31
🔗
|
Nemo_bis |
...which is not what I call normal upload requests |
13:32
🔗
|
Nemo_bis |
Yes, it's possible to get uploads refused if you do many at once. Even with a single upload thread doing many small uploads. |
13:32
🔗
|
johtso |
right, but now I'm just making single requests |
13:33
🔗
|
Nemo_bis |
In my experience it happens when I have hundreds of green rows in https://archive.org/catalog.php?history=1&justme=1 |
13:33
🔗
|
johtso |
Nemo_bis, that's exactly my situation! |
13:34
🔗
|
johtso |
that's a hard disk issue |
13:34
🔗
|
johtso |
unfortunate that it stops you from doing any more uploads until it's sorted.. |
13:35
🔗
|
|
primus104 has quit IRC (Leaving.) |
13:35
🔗
|
johtso |
wow, what's happened to my typing today.. |
14:14
🔗
|
|
Melissa_ has joined #archiveteam |
14:14
🔗
|
Melissa_ |
Hi guys |
14:14
🔗
|
Melissa_ |
Can I request a website for archiving? |
14:17
🔗
|
johtso |
Melissa_, sure! what's the site? |
14:17
🔗
|
Melissa_ |
dbtropes.org |
14:17
🔗
|
Melissa_ |
It's a Linked Data wrapper for TV Tropes |
14:18
🔗
|
Melissa_ |
Which allows one to perform way better searches |
14:18
🔗
|
Melissa_ |
I contacted the creator of it, asking a question, but he has not responded and ever since access to the website has been revoked. Just try accessing it now |
14:19
🔗
|
Melissa_ |
I'm starting to think he didn't want it to be public |
14:19
🔗
|
Melissa_ |
I do have another URL which can still be accessed, and maybe it's just a matter of time before he revokes access to that URL as well |
14:19
🔗
|
Melissa_ |
I'm not sure of this, though. Maybe this issue is just temporary |
14:20
🔗
|
Melissa_ |
You can read more about dbtropes here: https://web.archive.org/web/20140722081335/http://skipforward.opendfki.de/wiki/DBTropes |
14:21
🔗
|
johtso |
Melissa_, if you give me the URL I can try pointing archivebot at it. |
14:21
🔗
|
Melissa_ |
Yeah. There's just one problem: dbtropes isn't just the pages, but also a search engine, which is a back-end thing |
14:21
🔗
|
Melissa_ |
Anything's better than nothing, though |
14:23
🔗
|
|
toad1 has quit IRC (Read error: Operation timed out) |
14:24
🔗
|
Melissa_ |
Actually, nevermind |
14:24
🔗
|
Melissa_ |
Just found a zip of the whole database |
14:30
🔗
|
johtso |
nice :) |
14:30
🔗
|
johtso |
Melissa_, want me to archivebot the database dump? |
14:39
🔗
|
|
mistym has joined #archiveteam |
14:40
🔗
|
godane |
SketchCow: i'm uploading CD3WD dvds to your ftp |
14:41
🔗
|
godane |
i figure it would be easier to upload to an ftp since it's very big and i can resume if my wifi acts up |
14:41
🔗
|
SketchCow |
Great |
14:47
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
14:53
🔗
|
|
wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES) |
14:54
🔗
|
balrog |
Melissa_: it works for me |
14:54
🔗
|
balrog |
and they provide a database snapshot |
14:54
🔗
|
balrog |
it's weird that you can't access it... |
14:55
🔗
|
Melissa_ |
balrog yeah, looks like I've been worrying about nothing. The issue was client-side |
14:57
🔗
|
Melissa_ |
Very strange but the issue is gone now that I've removed cookies, site prefs, etc. |
14:57
🔗
|
|
Wolfie has quit IRC (Quit: Leaving.) |
14:58
🔗
|
johtso |
how can you tell if archivebot has properly scraped something? where can you see the actual total data downloaded? |
14:58
🔗
|
johtso |
would it be this bit at the end of the job? "sent 325 bytes received 30 bytes 236.67 bytes/sec" |
14:58
🔗
|
johtso |
and if so, how would you "try again"? |
15:12
🔗
|
|
Wolfie has joined #archiveteam |
15:28
🔗
|
|
Melissa_ has quit IRC (Quit: Leaving) |
15:36
🔗
|
Start |
rapidshare dies tomorrow, we need more people doing rapidshare discovery |
15:36
🔗
|
Start |
#rapidscare |
16:08
🔗
|
|
mistym has joined #archiveteam |
16:08
🔗
|
|
mistym_ has joined #archiveteam |
16:12
🔗
|
|
mistym has quit IRC (Client Quit) |
16:13
🔗
|
|
Emcy has quit IRC (Read error: Connection reset by peer) |
16:24
🔗
|
|
mistym has joined #archiveteam |
16:26
🔗
|
|
mistym has quit IRC (Client Quit) |
16:30
🔗
|
|
mistym has joined #archiveteam |
16:32
🔗
|
|
mistym_ has quit IRC (Leaving...) |
16:35
🔗
|
schbirid |
Start: ask again when the tracker is actually serving jobs :P |
16:35
🔗
|
Start |
it was when i asked |
16:36
🔗
|
schbirid |
:) |
16:40
🔗
|
|
patricko- is now known as patrickod |
16:43
🔗
|
|
john has quit IRC (Remote host closed the connection) |
16:44
🔗
|
|
philpem has joined #archiveteam |
16:48
🔗
|
|
philpem_ has joined #archiveteam |
16:48
🔗
|
|
philpem_ has quit IRC (Remote host closed the connection) |
16:49
🔗
|
yipdw |
johtso: search here -> http://archive.fart.website/archivebot/viewer/ |
16:49
🔗
|
yipdw |
job results are typically uploaded in 5 GB chunks; that index updates periodically |
16:53
🔗
|
|
aaaaaaaaa has joined #archiveteam |
17:07
🔗
|
|
patrickod is now known as patricko- |
17:09
🔗
|
|
primus104 has joined #archiveteam |
17:14
🔗
|
Wolfie |
Is there a good way to pull sites out of the wayback machine? |
17:15
🔗
|
yipdw |
Wolfie: if the site was injected via ArchiveBot then the WARCs can be located via http://archive.fart.website/archivebot/viewer/ |
17:15
🔗
|
yipdw |
if it came from the Archive-It group then you may be able to ask them |
17:15
🔗
|
yipdw |
for other cases no public access presently exists |
17:17
🔗
|
Wolfie |
Well goddamnit. |
17:22
🔗
|
johtso |
yipdw, do you know what that output I pasted from the logs refers to? |
17:23
🔗
|
kyan |
I always assumed that was the json file |
17:23
🔗
|
kyan |
I think those are around that big |
17:24
🔗
|
johtso |
ah, that's not so useful :) |
17:25
🔗
|
johtso |
is there any way to bypass the delay before being able to re-archive? |
17:25
🔗
|
kyan |
!expire ident |
17:25
🔗
|
johtso |
ah, nice |
17:25
🔗
|
johtso |
does !abort do anything if the job is already done? |
17:25
🔗
|
kyan |
not as far as I know |
17:26
🔗
|
kyan |
but yeah I think the json file is the only thing that gets rsynced at the end of a job now, all the warcs I think get sent in the background while the job is running (or something like that?) |
17:26
🔗
|
kyan |
I'm no expert |
17:27
🔗
|
johtso |
any way to nuke a completed archive? |
17:27
🔗
|
SketchCow |
https://archive.org/details/archiveteam_madden/v2 |
17:28
🔗
|
kyan |
maybe by nuking the IA data centers. Please don't do that... |
17:31
🔗
|
yipdw |
johtso: it's the JSON file |
17:31
🔗
|
johtso |
makes sense |
17:31
🔗
|
yipdw |
RsyncUpload exists as a catch-all for materials not delegated to the uploader |
17:32
🔗
|
yipdw |
further discussion of archivebot behavior in #archivebot |
17:32
🔗
|
* |
johtso enters missile launch codes |
17:32
🔗
|
johtso |
okay, I'll keep the discussion to #archivebot .. just that it gets kind of swamped by the bot output |
17:46
🔗
|
SketchCow |
johtso, you talk wayyyy too much |
17:46
🔗
|
SketchCow |
But I think you struck something there. |
17:46
🔗
|
SketchCow |
Hence: #archivebot-bs |
17:46
🔗
|
johtso |
SketchCow, hahaha |
17:51
🔗
|
|
caber has quit IRC (Quit: Doei Doei!!!) |
17:57
🔗
|
|
caber has joined #archiveteam |
18:08
🔗
|
Stiletto |
this was posted on one of my forums, not sure if it's a repack of something already on the Archive or instead something that should be preserved in the Magazines collection: http://www.pixsoriginadventures.co.uk/PCZone/ |
18:08
🔗
|
Stiletto |
there's some overlap with https://archive.org/details/pczonemagazine?sort=-date but it has some issues that aren't on the archive... |
18:16
🔗
|
SketchCow |
Does someone want to run comparisons or should I do it? |
18:16
🔗
|
SketchCow |
I suppose I could do it quick (download all, compare) |
19:14
🔗
|
schbirid |
gwern subpoenaed http://www.reddit.com/r/DarkNetMarkets/comments/30tudk/psa_5_reddit_accounts_subpoenaed_by_ice/ |
19:14
🔗
|
schbirid |
is http://www.gwern.net/ well archived? |
19:21
🔗
|
|
Emcy has joined #archiveteam |
19:29
🔗
|
|
mistym has quit IRC (Leaving) |
19:43
🔗
|
|
mistym has joined #archiveteam |
19:45
🔗
|
midas |
not sure, just to be sure archivebotted |
19:48
🔗
|
|
SN4T14_ has joined #archiveteam |
19:49
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
19:53
🔗
|
|
mistym has joined #archiveteam |
19:56
🔗
|
|
SN4T14__ has quit IRC (Ping timeout: 512 seconds) |
20:03
🔗
|
|
Emcy_ has joined #archiveteam |
20:07
🔗
|
|
Emcy has quit IRC (Ping timeout: 512 seconds) |
20:26
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
20:31
🔗
|
* |
schbirid archivebots midas' certain harddisk |
20:31
🔗
|
midas |
LALALALA |
20:39
🔗
|
|
mistym has joined #archiveteam |
20:50
🔗
|
|
Emcy_ has quit IRC (Read error: Connection reset by peer) |
20:52
🔗
|
|
signius has quit IRC (Read error: Operation timed out) |
20:58
🔗
|
|
sankin has quit IRC (Leaving.) |
21:01
🔗
|
|
mistym has quit IRC (Quit: Leaving) |
21:04
🔗
|
|
Ravenloft has quit IRC (Ping timeout: 240 seconds) |
21:06
🔗
|
|
signius has joined #archiveteam |
21:18
🔗
|
|
SimpBrain has quit IRC (Read error: Connection reset by peer) |
21:19
🔗
|
|
Emcy has joined #archiveteam |
21:39
🔗
|
|
wp494 has joined #archiveteam |
21:51
🔗
|
|
Stiletto has quit IRC (Ping timeout: 186 seconds) |
21:51
🔗
|
|
BlueMaxim has joined #archiveteam |
21:51
🔗
|
|
appledash has joined #archiveteam |
21:52
🔗
|
appledash |
Hey, is there an easy way / a guide for getting the warrior image running under ESXi? |
21:54
🔗
|
|
Emcy_ has joined #archiveteam |
21:55
🔗
|
|
Emcy has quit IRC (Ping timeout: 512 seconds) |
21:58
🔗
|
appledash |
The issue I mainly foresee is the fact that I need to assign a static IP |
21:59
🔗
|
|
khaoohs_ has joined #archiveteam |
21:59
🔗
|
|
khaoohs has quit IRC (Read error: Connection reset by peer) |
22:11
🔗
|
|
mistym has joined #archiveteam |
22:15
🔗
|
|
khaoohs_ has quit IRC (Read error: Connection reset by peer) |
22:15
🔗
|
|
khaoohs has joined #archiveteam |
22:16
🔗
|
dashcloud |
here you go: http://edvoncken.net/2014/08/archiveteam-warrior-on-esxi/ |
22:16
🔗
|
dashcloud |
I haven't tried it, but if it works, please add it to the wiki |
22:29
🔗
|
|
philpem has quit IRC (Ping timeout: 260 seconds) |
22:46
🔗
|
|
wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES) |
22:50
🔗
|
|
NovaKing_ has quit IRC (Read error: Operation timed out) |
22:51
🔗
|
|
NovaKing_ has joined #archiveteam |
22:53
🔗
|
|
wp494 has joined #archiveteam |
23:01
🔗
|
|
NovaKing_ has quit IRC (Read error: Operation timed out) |
23:02
🔗
|
|
NovaKing_ has joined #archiveteam |
23:11
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
23:22
🔗
|
|
aschmitz has quit IRC (Read error: Connection reset by peer) |
23:25
🔗
|
|
dashcloud has joined #archiveteam |
23:27
🔗
|
|
Wolfie has left |
23:44
🔗
|
|
Stiletto has joined #archiveteam |
23:52
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
23:52
🔗
|
|
toad has joined #archiveteam |