| Time |
Nickname |
Message |
|
00:09
🔗
|
chfoo |
arkiver: ok, added. |
|
01:00
🔗
|
kyan |
DFJustin, also there are groups, with group forums and group comments |
|
01:03
🔗
|
|
Ravenloft has joined #archiveteam |
|
01:15
🔗
|
|
xtr-201 has joined #archiveteam |
|
01:36
🔗
|
|
GLaDOS has quit IRC (Ping timeout: 512 seconds) |
|
01:56
🔗
|
|
Kazzy has quit IRC (Read error: Operation timed out) |
|
01:56
🔗
|
|
Kazzy has joined #archiveteam |
|
02:07
🔗
|
|
signius has quit IRC (Read error: Operation timed out) |
|
02:16
🔗
|
|
Ymgve has quit IRC () |
|
02:18
🔗
|
|
SimpBrain has quit IRC (Read error: Connection reset by peer) |
|
02:21
🔗
|
|
signius has joined #archiveteam |
|
02:22
🔗
|
bmcginty |
Question. Does ArchiveTeam ever use backend APIs to grab content from failing sites, or is it restricted to typical webpages? |
|
02:27
🔗
|
chfoo |
for hyves, all the ajax calls were manually made to save all the photos which is why every archived page has the word "deadbeef" in it |
|
02:28
🔗
|
chfoo |
for viddler, reverse engineering was done and they got angry. |
|
02:30
🔗
|
bmcginty |
chfoo: ah. thanks. |
|
02:42
🔗
|
SketchCow |
http://teamarchive1.fnf.archive.org/DELETE-SCREENBIN/alanwood.net-inf-20140114-090506-5jxkt.warc.gz.png |
|
02:42
🔗
|
SketchCow |
Delicious |
|
02:43
🔗
|
SketchCow |
It's nice when it just works, although it does not often just work. |
|
02:52
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
02:53
🔗
|
|
mistym has joined #archiveteam |
|
03:15
🔗
|
|
khaoohs has joined #archiveteam |
|
03:30
🔗
|
|
GLaDOS has joined #archiveteam |
|
03:35
🔗
|
|
JMC has joined #archiveteam |
|
03:36
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
03:40
🔗
|
|
dashcloud has joined #archiveteam |
|
03:42
🔗
|
|
GLaDOS has quit IRC (Ping timeout: 260 seconds) |
|
03:45
🔗
|
|
GLaDOS has joined #archiveteam |
|
04:08
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
|
04:26
🔗
|
|
wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES) |
|
04:48
🔗
|
|
Mayonaise has quit IRC (Ping timeout: 512 seconds) |
|
05:13
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
05:14
🔗
|
|
mistym has joined #archiveteam |
|
05:27
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
05:28
🔗
|
|
wp494 has joined #archiveteam |
|
05:54
🔗
|
|
Mayonaise has joined #archiveteam |
|
05:55
🔗
|
|
primus104 has joined #archiveteam |
|
05:56
🔗
|
|
mistym has joined #archiveteam |
|
06:08
🔗
|
|
Mayonaise has quit IRC (Ping timeout: 512 seconds) |
|
06:17
🔗
|
|
Mayonaise has joined #archiveteam |
|
07:06
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
|
07:17
🔗
|
|
[Beta]_ has joined #archiveteam |
|
07:18
🔗
|
|
[Beta] has quit IRC (Ping timeout: 240 seconds) |
|
07:19
🔗
|
|
robink has quit IRC (Quit: No Ping reply in 180 seconds.) |
|
07:20
🔗
|
|
robink has joined #archiveteam |
|
07:39
🔗
|
|
[Beta]_ is now known as [Beta] |
|
07:51
🔗
|
|
schbirid has joined #archiveteam |
|
07:53
🔗
|
|
SimpBrain has joined #archiveteam |
|
08:03
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
08:06
🔗
|
|
dashcloud has joined #archiveteam |
|
09:10
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
09:23
🔗
|
BlueMaxim |
hey, I just learnt that the Neverwinter Vault (an IGN hosted site that hosted game addons for Neverwinter Nights 1 & 2) went down a few months ago and never came back, but some people not only got a full copy of the website but actually put up a mirror of it at ftp://neverwintervault.org/rolovault/ - it might be worth feeding it into the Archive somehow. don't have the hardware to do it myself so I'd just put it here to |
|
09:23
🔗
|
BlueMaxim |
see if someone would like to do it |
|
09:47
🔗
|
|
Start has quit IRC (Ping timeout: 740 seconds) |
|
10:06
🔗
|
|
Start has joined #archiveteam |
|
10:11
🔗
|
|
sky__ has joined #archiveteam |
|
10:11
🔗
|
|
sky__ has left |
|
10:44
🔗
|
|
habi has joined #archiveteam |
|
10:49
🔗
|
|
svchfoo2 has quit IRC (Remote host closed the connection) |
|
10:52
🔗
|
|
svchfoo2 has joined #archiveteam |
|
10:53
🔗
|
|
svchfoo1 sets mode: +o svchfoo2 |
|
10:56
🔗
|
|
habi has left |
|
11:04
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
11:42
🔗
|
|
Ymgve has joined #archiveteam |
|
11:58
🔗
|
|
Wolfie has joined #archiveteam |
|
12:42
🔗
|
|
primus104 has joined #archiveteam |
|
12:43
🔗
|
|
sankin has joined #archiveteam |
|
12:54
🔗
|
johtso |
is anyone able to upload to IA with the s3-like api at the moment? |
|
12:56
🔗
|
johtso |
I'm worried it's my API keys that have been blocked, unless everyone is getting 503 |
|
13:03
🔗
|
|
brayden has quit IRC (Quit: Leaving) |
|
13:11
🔗
|
johtso |
Don't seem to be able to upload through the web interface either. |
|
13:30
🔗
|
Nemo_bis |
johtso: what sort of requests are you making? |
|
13:31
🔗
|
johtso |
Nemo_bis, just normal upload requests, using the ia tool, and using the upload tool on the website |
|
13:31
🔗
|
johtso |
unless there's some kind of site-wide outage I assume my account got blocked when I was making a bunch of parallel uploads.. |
|
13:31
🔗
|
|
brayden has joined #archiveteam |
|
13:31
🔗
|
Nemo_bis |
...which is not what I call normal upload requests |
|
13:32
🔗
|
Nemo_bis |
Yes, it's possible to get uploads refused if you do many at once. Even with a single upload thread doing many small uploads. |
|
13:32
🔗
|
johtso |
right, I'm but now I'm just making single requests |
|
13:33
🔗
|
Nemo_bis |
In my experience it happens when I have hundreds of green rows in https://archive.org/catalog.php?history=1&justme=1 |
|
13:33
🔗
|
johtso |
Nemo_bis, that's exactly my situation! |
|
13:34
🔗
|
johtso |
that's a hard disk issue |
|
13:34
🔗
|
johtso |
unfortunate that it's stops you from doing any more uploads until it's sorted.. |
|
13:35
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
13:35
🔗
|
johtso |
wow, what's happened to my typing today.. |
|
14:14
🔗
|
|
Melissa_ has joined #archiveteam |
|
14:14
🔗
|
Melissa_ |
Hi guys |
|
14:14
🔗
|
Melissa_ |
Can I request a website for archiving? |
|
14:17
🔗
|
johtso |
Melissa_, sure! what's the site? |
|
14:17
🔗
|
Melissa_ |
dbtropes.org |
|
14:17
🔗
|
Melissa_ |
It's a Linked Data wrapper for TV Tropes |
|
14:18
🔗
|
Melissa_ |
Which allows one to perform way better searches |
|
14:18
🔗
|
Melissa_ |
I contacted the creator of it, asking a question, but he has not responded and ever since access to the website has been revoked. Just try accessing it now |
|
14:19
🔗
|
Melissa_ |
I'm starting to think he didn't want it to be public |
|
14:19
🔗
|
Melissa_ |
I do have another URL which can still be accessed, and maybe it's just a matter of time before he revokes access to that URL as well |
|
14:19
🔗
|
Melissa_ |
I'm not sure of this, though. Maybe this issue is just temporary |
|
14:20
🔗
|
Melissa_ |
You can read more about dbtropes here: https://web.archive.org/web/20140722081335/http://skipforward.opendfki.de/wiki/DBTropes |
|
14:21
🔗
|
johtso |
Melissa_, if you give me the URL I can try pointing archivebot at it. |
|
14:21
🔗
|
Melissa_ |
Yeah. There's just one problem: dbtropes isn't just the pages, but also a search engine, which is a back-end thing |
|
14:21
🔗
|
Melissa_ |
Anything's better than nothing, though |
|
14:23
🔗
|
|
toad1 has quit IRC (Read error: Operation timed out) |
|
14:24
🔗
|
Melissa_ |
Actually, nevermind |
|
14:24
🔗
|
Melissa_ |
Just found a zip of the whole database |
|
14:30
🔗
|
johtso |
nice :) |
|
14:30
🔗
|
johtso |
Melissa_, want me to archivebot the database dump? |
|
14:39
🔗
|
|
mistym has joined #archiveteam |
|
14:40
🔗
|
godane |
SketchCow: i'm uploading CD3WD dvds to your ftp |
|
14:41
🔗
|
godane |
i figure it would be easier to upload to a ftp since its very big and i can resume if my wifi acts up |
|
14:41
🔗
|
SketchCow |
Great |
|
14:47
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
14:53
🔗
|
|
wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES) |
|
14:54
🔗
|
balrog |
Melissa_: it works for me |
|
14:54
🔗
|
balrog |
and they provide a database snapshot |
|
14:54
🔗
|
balrog |
it's weird that you can't access it... |
|
14:55
🔗
|
Melissa_ |
balrog yeah, looks like I've been worrying about nothing. The issue was client-sided |
|
14:57
🔗
|
Melissa_ |
Very strange but the issue is gone now that I've removed cookies, site prefs, etc. |
|
14:57
🔗
|
|
Wolfie has quit IRC (Quit: Leaving.) |
|
14:58
🔗
|
johtso |
how can you tell if archivebot has properly scraped something? where can you see the actual total data downloaded? |
|
14:58
🔗
|
johtso |
would it be this bit at the end of the job? "sent 325 bytes received 30 bytes 236.67 bytes/sec" |
|
14:58
🔗
|
johtso |
and if so, how would you "try again"? |
|
15:12
🔗
|
|
Wolfie has joined #archiveteam |
|
15:28
🔗
|
|
Melissa_ has quit IRC (Quit: Leaving) |
|
15:36
🔗
|
Start |
rapidshare dies tomorrow, we need more people doing rapidshare discovery |
|
15:36
🔗
|
Start |
#rapidscare |
|
16:08
🔗
|
|
mistym has joined #archiveteam |
|
16:08
🔗
|
|
mistym_ has joined #archiveteam |
|
16:12
🔗
|
|
mistym has quit IRC (Client Quit) |
|
16:13
🔗
|
|
Emcy has quit IRC (Read error: Connection reset by peer) |
|
16:24
🔗
|
|
mistym has joined #archiveteam |
|
16:26
🔗
|
|
mistym has quit IRC (Client Quit) |
|
16:30
🔗
|
|
mistym has joined #archiveteam |
|
16:32
🔗
|
|
mistym_ has quit IRC (Leaving...) |
|
16:35
🔗
|
schbirid |
Start: ask again when the tracker is actually serving jobs :P |
|
16:35
🔗
|
Start |
it was when i asked |
|
16:36
🔗
|
schbirid |
:) |
|
16:40
🔗
|
|
patricko- is now known as patrickod |
|
16:43
🔗
|
|
john has quit IRC (Remote host closed the connection) |
|
16:44
🔗
|
|
philpem has joined #archiveteam |
|
16:48
🔗
|
|
philpem_ has joined #archiveteam |
|
16:48
🔗
|
|
philpem_ has quit IRC (Remote host closed the connection) |
|
16:49
🔗
|
yipdw |
johtso: search here -> http://archive.fart.website/archivebot/viewer/ |
|
16:49
🔗
|
yipdw |
job results are typically uploaded in 5 GB chunks; that index updates periodically |
|
16:53
🔗
|
|
aaaaaaaaa has joined #archiveteam |
|
17:07
🔗
|
|
patrickod is now known as patricko- |
|
17:09
🔗
|
|
primus104 has joined #archiveteam |
|
17:14
🔗
|
Wolfie |
Is there a good way to pull sites out of the wayback machine? |
|
17:15
🔗
|
yipdw |
Wolfie: if the site was injected via ArchiveBot then the WARCs can be located via http://archive.fart.website/archivebot/viewer/ |
|
17:15
🔗
|
yipdw |
if it came from the Archive-It group then you may be able to ask them |
|
17:15
🔗
|
yipdw |
for other cases no public access presently exists |
|
17:17
🔗
|
Wolfie |
Well goddamnit. |
|
17:22
🔗
|
johtso |
yipdw, do you know what that output I pasted from the logs refers to? |
|
17:23
🔗
|
kyan |
I always assumed that was the json file |
|
17:23
🔗
|
kyan |
I think those are around that big |
|
17:24
🔗
|
johtso |
ah, that's not so useful :) |
|
17:25
🔗
|
johtso |
is there any way to bypass the delay before being able to re-archive? |
|
17:25
🔗
|
kyan |
!expire ident |
|
17:25
🔗
|
johtso |
ah, nice |
|
17:25
🔗
|
johtso |
does !abort do anything if the job is already done? |
|
17:25
🔗
|
kyan |
not as far as I know |
|
17:26
🔗
|
kyan |
but yeah I think the json file is the only thing that gets rsynced at the end of a job now, all the warcs I think get sent in the background while the job is running (or something like that?) |
|
17:26
🔗
|
kyan |
I'm no expert |
|
17:27
🔗
|
johtso |
any way to nuke a completed archive? |
|
17:27
🔗
|
SketchCow |
https://archive.org/details/archiveteam_madden/v2 |
|
17:28
🔗
|
kyan |
maybe by nuking the IA data centers. Please don't do that... |
|
17:31
🔗
|
yipdw |
johtso: it's the JSON file |
|
17:31
🔗
|
johtso |
makes sense |
|
17:31
🔗
|
yipdw |
RsyncUpload exists as a catch-all for materials not delegated to the uploader |
|
17:32
🔗
|
yipdw |
further discussion of archivebot behavior in #archivebot |
|
17:32
🔗
|
* |
johtso enters missile launch codes |
|
17:32
🔗
|
johtso |
Okay, I'll keep the discussion to #archivebot .. just that it gets kind of swamped by the bot output |
|
17:46
🔗
|
SketchCow |
johtso, you talk wayyyy too much |
|
17:46
🔗
|
SketchCow |
But I think you struck something there. |
|
17:46
🔗
|
SketchCow |
Hence: #archivebot-bs |
|
17:46
🔗
|
johtso |
SketchCow, hahaha |
|
17:51
🔗
|
|
caber has quit IRC (Quit: Doei Doei!!!) |
|
17:57
🔗
|
|
caber has joined #archiveteam |
|
18:08
🔗
|
Stiletto |
this was posted on one of my forums, not sure if it's a repack of something already on the Archive or instead something that should be preserved in the Magazines collection: http://www.pixsoriginadventures.co.uk/PCZone/ |
|
18:08
🔗
|
Stiletto |
there's some overlap with https://archive.org/details/pczonemagazine?sort=-date but it has some issues that aren't on the archive... |
|
18:16
🔗
|
SketchCow |
Does someone want to run comparisons or should I do it. |
|
18:16
🔗
|
SketchCow |
I suppose I could do it quick (download all, compare) |
|
19:14
🔗
|
schbirid |
gwern subpoenaed http://www.reddit.com/r/DarkNetMarkets/comments/30tudk/psa_5_reddit_accounts_subpoenaed_by_ice/ |
|
19:14
🔗
|
schbirid |
is http://www.gwern.net/ well archived? |
|
19:21
🔗
|
|
Emcy has joined #archiveteam |
|
19:29
🔗
|
|
mistym has quit IRC (Leaving) |
|
19:43
🔗
|
|
mistym has joined #archiveteam |
|
19:45
🔗
|
midas |
not sure, just to be sure archivebotted |
|
19:48
🔗
|
|
SN4T14_ has joined #archiveteam |
|
19:49
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
19:53
🔗
|
|
mistym has joined #archiveteam |
|
19:56
🔗
|
|
SN4T14__ has quit IRC (Ping timeout: 512 seconds) |
|
20:03
🔗
|
|
Emcy_ has joined #archiveteam |
|
20:07
🔗
|
|
Emcy has quit IRC (Ping timeout: 512 seconds) |
|
20:26
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
20:31
🔗
|
* |
schbirid archivebots midas' certain harddisk |
|
20:31
🔗
|
midas |
LALALALA |
|
20:39
🔗
|
|
mistym has joined #archiveteam |
|
20:50
🔗
|
|
Emcy_ has quit IRC (Read error: Connection reset by peer) |
|
20:52
🔗
|
|
signius has quit IRC (Read error: Operation timed out) |
|
20:58
🔗
|
|
sankin has quit IRC (Leaving.) |
|
21:01
🔗
|
|
mistym has quit IRC (Quit: Leaving) |
|
21:04
🔗
|
|
Ravenloft has quit IRC (Ping timeout: 240 seconds) |
|
21:06
🔗
|
|
signius has joined #archiveteam |
|
21:18
🔗
|
|
SimpBrain has quit IRC (Read error: Connection reset by peer) |
|
21:19
🔗
|
|
Emcy has joined #archiveteam |
|
21:39
🔗
|
|
wp494 has joined #archiveteam |
|
21:51
🔗
|
|
Stiletto has quit IRC (Ping timeout: 186 seconds) |
|
21:51
🔗
|
|
BlueMaxim has joined #archiveteam |
|
21:51
🔗
|
|
appledash has joined #archiveteam |
|
21:52
🔗
|
appledash |
Hey, is there an easy way / a guide for getting the warrior image running under ESXi? |
|
21:54
🔗
|
|
Emcy_ has joined #archiveteam |
|
21:55
🔗
|
|
Emcy has quit IRC (Ping timeout: 512 seconds) |
|
21:58
🔗
|
appledash |
The issue I mainly foresee is the fact that I need to assign a static IP |
|
21:59
🔗
|
|
khaoohs_ has joined #archiveteam |
|
21:59
🔗
|
|
khaoohs has quit IRC (Read error: Connection reset by peer) |
|
22:11
🔗
|
|
mistym has joined #archiveteam |
|
22:15
🔗
|
|
khaoohs_ has quit IRC (Read error: Connection reset by peer) |
|
22:15
🔗
|
|
khaoohs has joined #archiveteam |
|
22:16
🔗
|
dashcloud |
here you go: http://edvoncken.net/2014/08/archiveteam-warrior-on-esxi/ |
|
22:16
🔗
|
dashcloud |
I haven't tried it, but if it works, please add it to the wiki |
|
22:29
🔗
|
|
philpem has quit IRC (Ping timeout: 260 seconds) |
|
22:46
🔗
|
|
wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES) |
|
22:50
🔗
|
|
NovaKing_ has quit IRC (Read error: Operation timed out) |
|
22:51
🔗
|
|
NovaKing_ has joined #archiveteam |
|
22:53
🔗
|
|
wp494 has joined #archiveteam |
|
23:01
🔗
|
|
NovaKing_ has quit IRC (Read error: Operation timed out) |
|
23:02
🔗
|
|
NovaKing_ has joined #archiveteam |
|
23:11
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
23:22
🔗
|
|
aschmitz has quit IRC (Read error: Connection reset by peer) |
|
23:25
🔗
|
|
dashcloud has joined #archiveteam |
|
23:27
🔗
|
|
Wolfie has left |
|
23:44
🔗
|
|
Stiletto has joined #archiveteam |
|
23:52
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
23:52
🔗
|
|
toad has joined #archiveteam |
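
Editor's note on the 12:54 to 13:34 exchange about uploads being refused with 503 after many parallel requests: one common way to cope with that kind of throttling is to retry with an exponentially growing delay. The sketch below is only an illustration under stated assumptions. The function name `upload_with_backoff`, its parameters, and the idea of passing in a callable that returns an HTTP status code are all hypothetical; it is not the `ia` tool's behavior and it does not call any Internet Archive API.

```python
import time
from typing import Callable, Optional


def upload_with_backoff(
    attempt_upload: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Call attempt_upload until it returns a status other than 503.

    attempt_upload is a hypothetical callable that performs one upload
    and returns the HTTP status code. Between attempts the helper waits
    base_delay * 2**n seconds, capped at max_delay. The last status seen
    is returned, so a caller can tell whether the retries ran out.
    """
    status = None
    for attempt in range(max_attempts):
        status = attempt_upload()
        if status != 503:
            # Anything other than "service unavailable" ends the loop.
            return status
        if attempt < max_attempts - 1:
            # Back off before the next try; no wait after the final one.
            sleep(min(base_delay * 2 ** attempt, max_delay))
    return status
```

A caller would wrap a single upload in a function that returns its status code and pass that in. As Nemo_bis notes at 13:32, uploads can be refused even with a single upload thread, so backing off may only help once the backlog of queued tasks has cleared.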