Time |
Nickname |
Message |
03:10
🔗
|
chfoo |
the tracker disk usage is at 94% |
06:05
🔗
|
yipdw |
#justouttv, the justin.tv "grab the videos that have some views" project, is now online and driving up bandwidth bills |
06:10
🔗
|
exmic |
yay |
06:20
🔗
|
yipdw |
aaaand |
06:20
🔗
|
yipdw |
500 GB in 4 hours |
06:20
🔗
|
yipdw |
goddamn |
06:22
🔗
|
yipdw |
actually, that's around 34.7 MB/s |
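The 34.7 MB/s figure follows from dividing 500 GB by four hours. A quick check, assuming decimal units (1 GB = 1000 MB), which the log does not state; the function name is made up for illustration:

```python
def average_rate_mb_per_s(total_gb: float, hours: float) -> float:
    """Average transfer rate in MB/s, assuming decimal units (1 GB = 1000 MB)."""
    total_mb = total_gb * 1000
    seconds = hours * 3600
    return total_mb / seconds


rate = average_rate_mb_per_s(500, 4)
print(f"{rate:.1f} MB/s")  # 500000 MB / 14400 s
```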
06:22
🔗
|
aggrosk |
Yay for the cloud. 10 bucks for 2 TB of transfer over at DO. |
06:23
🔗
|
aggrosk |
Really only less than half of that (given the whole push pull thing that seesaw does) but still :D |
06:26
🔗
|
Nemo_bis |
Folks here say they're crawling 300 .fi URLs per second http://helsinginyliopisto.etapahtuma.fi/Default.aspx?tabid=304&id=9538 |
06:27
🔗
|
Nemo_bis |
(with heritrix) |
07:22
🔗
|
yipdw |
whoever has access to @at_warrior, you're wanted in #justouttv |
07:23
🔗
|
trs80 |
mostly to tweet about justouttv |
08:16
🔗
|
exmic |
holy shitfuck, this will distort the graph scale for the next year http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/project_bytes.html |
08:16
🔗
|
exmic |
perfect |
08:20
🔗
|
voltagex |
holy crap, is that going to bankrupt you? |
08:20
🔗
|
exmic |
one does not simply bankrupt archiveteam |
08:24
🔗
|
voltagex |
I'm finally running warrior now |
09:45
🔗
|
midas |
wow wow wow! |
09:45
🔗
|
midas |
Dear customers, |
09:45
🔗
|
midas |
After receiving many complaints regarding the stability of our OneCloud infrastructure, and exploring many options for a better, stronger system, we have finally come to the decision to start off fresh. |
09:45
🔗
|
midas |
This unfortunately means that we are terminating the OneCloud system in its entirety, and that all VPS plans will be terminated on June 16th 2014. |
09:45
🔗
|
midas |
There will be no further warnings after this message, so please make sure to complete any migration or backup process to preserve your data and services. |
09:48
🔗
|
Kenshin |
which provider is that |
09:49
🔗
|
midas |
Oneprovider |
12:09
🔗
|
DBArchive |
I have a question regarding this search: https://ia801604.us.archive.org/11/items/dailybooth-freeze-frame-index/ |
12:10
🔗
|
DBArchive |
each time i search i get a 'Sorry, your browser is not smart enough. (It does not support HTTP Range requests.)' message. i've tried this with mozilla, chrome and ie, on a few computers, and on a computer with dmz; all gave the same error. |
12:21
🔗
|
DBArch |
?? |
12:23
🔗
|
DBArch |
Sorry, your browser is not smart enough. (It does not support HTTP Range requests.)... |
12:23
🔗
|
DBArch |
whhat do? |
12:24
🔗
|
* |
DBArch :S |
12:42
🔗
|
Nemo_bis |
Change browser? Seems pretty obvious. |
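For context on the error message: an HTTP Range request asks the server for only part of a file, and a server that honours it normally answers with status 206 and a `Content-Range` header. The sketch below only illustrates that exchange; it is not how the search page itself performs its check. The helper names and the `example.invalid` URL are placeholders.

```python
import urllib.request


def build_range_request(url: str, first: int, last: int) -> urllib.request.Request:
    """Build a GET request asking only for bytes first..last (inclusive)."""
    return urllib.request.Request(url, headers={"Range": f"bytes={first}-{last}"})


def honoured_range(status: int, headers: dict) -> bool:
    """True if a response looks like a partial-content reply to a Range request.

    Assumes `headers` is a plain dict with canonically cased header names.
    """
    return status == 206 and "Content-Range" in headers


# Placeholder URL; nothing is sent until the request is actually opened.
req = build_range_request("http://example.invalid/index.bin", 0, 99)
print(req.get_header("Range"))
```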
14:18
🔗
|
SadDM |
Slamming in 1300+ more Diplomacy zines. Stuff from the 60s to the mid-2000s. (https://archive.org/details/tetracuspid_27-1978-04-03) |
14:18
🔗
|
SadDM |
It boggles my mind that people used to play board games by mail. |
14:19
🔗
|
SadDM |
SketchCow: If you want to make a collection for them and make me an admin, I can start filing them. |
14:20
🔗
|
SadDM |
Also, for my next batch, I can upload them to there directly. |
15:13
🔗
|
SketchCow |
So.... many zines |
15:13
🔗
|
SketchCow |
Where are you getting these from? |
15:13
🔗
|
SketchCow |
what should the name be, by the way. |
15:21
🔗
|
SadDM |
The last few rounds (totalling 5000 or so) have so far come from a single site (http://www.whiningkentpigs.com/DW/zines.htm), but there are a couple of other sites with the same type of zine that will be in my crosshairs shortly. |
15:36
🔗
|
SadDM |
As for a name, I'd go with "Diplomacy Zines" |
15:39
🔗
|
SketchCow |
diplomacyzines created |
15:46
🔗
|
SketchCow |
SadDM, I'll swap a bunch over to you. |
15:50
🔗
|
SadDM |
OK cool... thanks. |
15:52
🔗
|
SketchCow |
Are all 4000 items in opensource (except that mailing list) good to put into this collection? |
15:54
🔗
|
SadDM |
oh... no. Just the stuff in https://archive.org/search.php?query=uploader%3A%22aeakett%40gmail.com%22%20AND%20subject%3A%22dipzine%22 |
15:57
🔗
|
SketchCow |
3,680 of them it is. |
15:57
🔗
|
SadDM |
sounds about right |
16:01
🔗
|
SketchCow |
https://archive.org/details/diplomacyzines is coming along and will populate. |
16:13
🔗
|
balrog |
SketchCow: sounds like we might be having a space issue with justin... |
16:32
🔗
|
SketchCow |
What wwherehflksdfjldfjsdf |
16:33
🔗
|
balrog |
SketchCow: seems you dropped out of -bs... |
17:34
🔗
|
SadDM |
SketchCow: I tried to make an edit to the new collection's description and got the following message: "You are not allowed to submit items into collection(s): magazine_rack" |
17:34
🔗
|
SadDM |
I found that a little odd since I was able to add an image to that collection item. |
17:34
🔗
|
SadDM |
If it's a "no go", it's not a big deal... just thought I'd add a picture to the page. |
17:40
🔗
|
SketchCow |
hmmm |
18:21
🔗
|
Asparagir |
Welp, I just discovered my job for the next X weeks/months: "Ancestry.com Announces Retirement of Several Websites" |
18:21
🔗
|
Asparagir |
"Ancestry.com announced this morning at 10:00 MT that it is retiring several of its websites. The websites are" |
18:21
🔗
|
Asparagir |
MyFamily.com MyCanvas.com Genealogy.com Mundia.com |
18:22
🔗
|
Asparagir |
This is LOTS of data from small companies they've acquired over the years. Mostly it's the message board data that needs saving; the underlying databases are already posted on other sites. |
18:22
🔗
|
joepie91 |
wait what |
18:22
🔗
|
Asparagir |
"Users will be told the retirement timeline and how to export their data." |
18:22
🔗
|
joepie91 |
ancestry.com is going down? |
18:22
🔗
|
Asparagir |
So, no timeline yet, and yes they have an export function. |
18:23
🔗
|
joepie91 |
:| |
18:23
🔗
|
Asparagir |
No, not ancestry -- sites they've acquired over the years. |
18:23
🔗
|
antomatic |
A great archive |
18:23
🔗
|
antomatic |
Unlimited storage space and SiteSafeSM technology keep all of your family memories safe and secure. No matter what. |
18:23
🔗
|
joepie91 |
right |
18:23
🔗
|
antomatic |
Ha. From MyFamily.com |
18:23
🔗
|
joepie91 |
lol |
18:23
🔗
|
joepie91 |
reminds me of... *searches* |
18:24
🔗
|
joepie91 |
(had to dig through my jason scott stalkings) |
18:24
🔗
|
joepie91 |
http://bit-chest.com/ |
18:24
🔗
|
joepie91 |
lots of handwavium |
18:24
🔗
|
SketchCow |
THE GREATEST ELEMENT OF ALL |
18:24
🔗
|
SketchCow |
Stealing handwavium for speech |
18:25
🔗
|
joepie91 |
:D |
18:25
🔗
|
joepie91 |
found it on tvtropes some time ago |
18:25
🔗
|
joepie91 |
the magic ingredient that makes everything right without explanation |
18:25
🔗
|
joepie91 |
seems to fit perfectly for these "safe permanent storage" clowns |
18:25
🔗
|
balrog |
wow, a family history site destroying data? |
18:25
🔗
|
balrog |
really? |
18:25
🔗
|
balrog |
talk about irony |
18:26
🔗
|
Asparagir |
It's really the message boards that need the ArchiveTeam love, so I'll start there. They go back at least 15 years in some cases. For example, 67,000+ posts just for the SMITH family: http://genforum.genealogy.com/smith/ |
18:26
🔗
|
exmic |
these are my people |
18:26
🔗
|
exmic |
SMITHs |
18:27
🔗
|
joepie91 |
Asparagir: very 1995 messageboards, that should be easy to archive |
18:27
🔗
|
Asparagir |
Yeah, I just hope the shutdown timeline isn't too compressed. |
18:28
🔗
|
Asparagir |
Otherwise, I will have to finally teach myself how to write seesaw scripts and build my own Warrior project. :-) |
18:28
🔗
|
exmic |
looks like something we could hit with archivebot |
18:29
🔗
|
exmic |
easy peasy |
18:29
🔗
|
joepie91 |
exmic: yup, it's big, but it's simply structured |
18:29
🔗
|
joepie91 |
archivebot should do fine on this one |
18:30
🔗
|
Asparagir |
Does ArchiveBot have enough space at the moment to take on a project like that? And remember, these are four separate sites, some with separate message board or forum sub-domains. |
18:31
🔗
|
exmic |
well, genforum at least should be fine |
18:31
🔗
|
exmic |
I'd guess ten, fifteen gigs before compression |
18:41
🔗
|
SketchCow |
This is a misuse of archivebot |
18:42
🔗
|
SketchCow |
It does strike me that archivebot is very, very successful. Maybe we need to make it so its work can be handed over to a larger pool of volunteer machines? |
18:43
🔗
|
SketchCow |
Machines much less likely to flit in and out like warriors. |
18:43
🔗
|
SketchCow |
Ultrawarriors, if you will. |
18:43
🔗
|
Asparagir |
So, a warrior project, then? Or should I grab it myself and do a standard upload to IA to the archiveteam_antecedents collection? |
18:43
🔗
|
Asparagir |
SPARTANS |
18:44
🔗
|
SketchCow |
I'm going to say "It's a misuse of archivebot but should be used as a sign we should upgrade archivebot's abilities and flexibility" |
18:44
🔗
|
SketchCow |
So go ahead |
18:44
🔗
|
SadDM |
*Ultimatewarriors |
18:44
🔗
|
SketchCow |
We've done a few other whoppers before. |
18:44
🔗
|
SadDM |
but those whoppers can really choke things up |
18:44
🔗
|
SketchCow |
http://www.kayfabenews.com/wp-content/uploads/2014/04/warrior.jpg |
18:45
🔗
|
SketchCow |
And as discussed, we should use our awareness of this to rethink some of the bot's abilities. |
18:45
🔗
|
SketchCow |
For example, being able to say "and this is a big one" so it goes into a different torpedo tube |
18:45
🔗
|
joepie91 |
SketchCow: once RAM requirements are solved (that is, can run on <512MB boxes without swap), I can plug a few boxes into the archivebot architecture |
18:45
🔗
|
joepie91 |
insofar that helps |
18:46
🔗
|
joepie91 |
and yes, that would be a useful distinction |
18:46
🔗
|
joepie91 |
though I'd opt for saying "this is a small one" rather than saying "this is a big one", so that if somebody forgets to specify it won't accidentally block everything |
18:47
🔗
|
SadDM |
or maybe I mis-judge the size of a site and send a huge job to the smart-car lane |
18:47
🔗
|
SadDM |
either way, there will be issues |
18:48
🔗
|
SketchCow |
Well, the whole point of archivebot upon inception was for small sites. |
18:49
🔗
|
SadDM |
oh yeah, but the definition of "small" has been sliding quite a bit lately |
18:49
🔗
|
SketchCow |
So maybe having it notice we've gone past a certain limit, be it 1gb of material, or x amount of URLs, and go "uh, this needs to go to the ultimatewarriors" |
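The limit-based handoff suggested here could look roughly like the sketch below. It is not ArchiveBot code: the class, the function, and the pool names are invented for illustration, the 1 GB byte limit comes from the message, and the URL limit is a placeholder because the log only says "x amount of URLs".

```python
from dataclasses import dataclass

# Hypothetical limits: the log mentions "1gb of material" and leaves the URL count as "x".
MAX_BYTES = 1_000_000_000
MAX_URLS = 100_000  # placeholder for the unspecified "x amount of URLs"


@dataclass
class JobProgress:
    bytes_downloaded: int
    urls_fetched: int


def choose_pool(progress: JobProgress) -> str:
    """Return the name of the pool that should continue the job."""
    if progress.bytes_downloaded > MAX_BYTES or progress.urls_fetched > MAX_URLS:
        return "ultimatewarriors"
    return "archivebot"


print(choose_pool(JobProgress(bytes_downloaded=2_500_000_000, urls_fetched=40_000)))
```

Checking the limits while the job runs, rather than asking the submitter to guess the size up front, sidesteps the misjudgement problem raised a few lines earlier.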
18:49
🔗
|
SketchCow |
My concern is mostly we lose timeliness |
18:50
🔗
|
SketchCow |
If it takes hours to get to a classic fuckup or craigslist ad, it'll be gone |
18:55
🔗
|
yipdw |
if I had written archivebot using mongodb it'd clearly be webscale |
18:55
🔗
|
joepie91 |
SketchCow: this is moving into -bs territory, but... I'm not sure of the current archivebot architecture, but is it capable yet of freezing/pausing jobs, sending them over to another box wholesale, and resuming them there? |
18:55
🔗
|
yipdw |
no |
18:55
🔗
|
SketchCow |
This isn't -bs territory |
18:55
🔗
|
SketchCow |
The channel is the bot working and talking about the bot |
18:55
🔗
|
joepie91 |
lengthy discussion :) |
18:56
🔗
|
joepie91 |
anyway |
18:56
🔗
|
SketchCow |
and giving yipdw kudos for the fucking thing |
18:56
🔗
|
yipdw |
it will at some point, but there are bigger issues to deal with first, like the reporting process fucking itself up periodically |
18:56
🔗
|
SketchCow |
I asked for a swiss army knife and he made the iron giant |
18:56
🔗
|
joepie91 |
yipdw: is archivebot still using wget, or does it use wpull now? |
18:56
🔗
|
yipdw |
wpull |
18:56
🔗
|
joepie91 |
hmmm |
18:56
🔗
|
joepie91 |
on all nodes? |
18:56
🔗
|
yipdw |
yes |
18:56
🔗
|
yipdw |
there are however memory issues remaining, and I think those have to do with the reporter threads |
18:57
🔗
|
joepie91 |
I might be able to dick around with it in the near future then, and see if I can duct-tape together a resume function |
18:57
🔗
|
yipdw |
in MOST cases you will not see a memory blowup |
18:57
🔗
|
joepie91 |
mm |
18:57
🔗
|
SketchCow |
Another line of thought for Asparagir's question |
18:57
🔗
|
SketchCow |
The point of the bot is to make basic things easy and not constantly have to ramp people up on the "right" ways and missing mistakes. |
18:57
🔗
|
joepie91 |
yipdw: if you had to describe the current architecture of archivebot in a single line, including technologies/architectures used, what would it be? (so I can get a vague idea of what to expect) |
18:57
🔗
|
SketchCow |
But Asparagir has been in this place forever, she gets what it needs. |
18:57
🔗
|
SketchCow |
So her doing it the old-fashioned way seems quite legit |
18:58
🔗
|
SketchCow |
joepie91: 45 cats, a blender and underscor's mom |
18:58
🔗
|
yipdw |
joepie91: Python, Ruby on the backend, CoffeeScript/Ember.js on the frontend, CouchDB and Redis as datastores |
18:58
🔗
|
SketchCow |
Oh sure, use the layman's terms |
18:59
🔗
|
yipdw |
underscor's mom will be present in release 6 |
18:59
🔗
|
yipdw |
joepie91: also the fetch pipeline is seesaw, though a much more complicated seesaw pipeline than any other I am aware of |
19:00
🔗
|
Kenshin |
what about a beefy box |
19:00
🔗
|
yipdw |
ivan` runs one |
19:00
🔗
|
Kenshin |
for archivebot |
19:00
🔗
|
yipdw |
it's the reason why we're doing 30 concurrent jobs vs. 5 |
19:00
🔗
|
yipdw |
:P |
19:00
🔗
|
joepie91 |
yipdw: main communication protocol(s) between components? |
19:00
🔗
|
yipdw |
redis pubsub |
19:00
🔗
|
yipdw |
:P |
19:00
🔗
|
joepie91 |
oh dear |
19:00
🔗
|
joepie91 |
right |
19:00
🔗
|
yipdw |
it works |
19:01
🔗
|
joepie91 |
I know Python, I can learn Ruby, I know CoffeeScript, I can learn Ember, I know CouchDB a little bit, I know Redis a little bit, I have nfi how its pubsub works |
19:01
🔗
|
joepie91 |
not too bad a score |
19:01
🔗
|
yipdw |
it's pretty easy |
19:01
🔗
|
yipdw |
I was going to use e.g. ZeroMQ and then went "I do not need that" |
19:01
🔗
|
joepie91 |
Kenshin: beefy boxes are always useful :P |
19:01
🔗
|
joepie91 |
yipdw: see, if it were zeromq, I would've known how it worked :D |
19:01
🔗
|
yipdw |
also, ArchiveBot is very much a product of "get shit online" |
19:02
🔗
|
yipdw |
the fact that it has done what it is doing surprises the hell out of me |
19:02
🔗
|
Kenshin |
it is the fastest way to get a small site archived though |
19:02
🔗
|
yipdw |
I mean, its processes are running in a tmux |
19:02
🔗
|
* |
nico hide his screen process |
19:03
🔗
|
yipdw |
in any case, there is plenty to do |
19:03
🔗
|
nico |
archivebot is also running with a lot of different versions of the pipeline/wpull |
19:03
🔗
|
SketchCow |
http://i.imgur.com/SfNlIEA.png |
19:03
🔗
|
Kenshin |
but are we even maxing out archivebot's resources |
19:03
🔗
|
nico |
Kenshin: no |
19:03
🔗
|
joepie91 |
yipdw: hehe |
19:04
🔗
|
nico |
my drone is sleeping |
19:04
🔗
|
SketchCow |
I think yipdw knows the current maxing or not maxing. |
19:04
🔗
|
Kenshin |
but if we threw the sites mentioned on it |
19:04
🔗
|
yipdw |
it's doing 32 jobs right now |
19:04
🔗
|
yipdw |
I think we can do 35 |
19:04
🔗
|
Kenshin |
it'll probably flip? |
19:04
🔗
|
SketchCow |
yipdw: Maybe work with exmic to add graphs? |
19:04
🔗
|
yipdw |
they're already being done |
19:04
🔗
|
yipdw |
SketchCow: yeah |
19:04
🔗
|
joepie91 |
yipdw: how receptive are you to (well-tested) architecture changes for archivebot? in case I have a bored weekend |
19:04
🔗
|
nico |
(doing a grab of http://tcrf.net/ and http://geekbeat.tv/) |
19:04
🔗
|
yipdw |
joepie91: I'm fine with them if they fix things |
19:04
🔗
|
joepie91 |
such as not needing tmux? :P |
19:04
🔗
|
yipdw |
I don't need tmux |
19:05
🔗
|
nico |
joepie91: put it under supervisord :) |
19:05
🔗
|
yipdw |
I just haven't written e.g. start scripts or gotten it in daemontools |
19:05
🔗
|
yipdw |
I can point you to current deficiencies in ArchiveBot |
19:06
🔗
|
yipdw |
for example, there is currently no way to know what the max load is |
19:06
🔗
|
yipdw |
there is also no way right now to signal when a job starts |
19:06
🔗
|
yipdw |
(but we do know when it finishes) |
19:06
🔗
|
yipdw |
stuff like that |
19:06
🔗
|
joepie91 |
yipdw: are there issue tickets for this? |
19:06
🔗
|
yipdw |
I do not think that fixing those requires significant architecture changes |
19:06
🔗
|
yipdw |
joepie91: yeah |
19:06
🔗
|
SketchCow |
I can say that archivebot is pulling roughly 200gb a day. |
19:06
🔗
|
joepie91 |
because then I can just look at those |
19:08
🔗
|
joepie91 |
yipdw: correct repo link? |
19:08
🔗
|
yipdw |
https://github.com/ArchiveTeam/ArchiveBot |
19:08
🔗
|
joepie91 |
thanks, bookmarked |
19:08
🔗
|
joepie91 |
now brb, just got a bugticket for pythonwhois, work to do :D |
19:09
🔗
|
yipdw |
the most confusing part of it is probably the cogs program |
19:09
🔗
|
Nemo_bis |
do we have a page for the genealogy site(s)? need to link |
19:09
🔗
|
Kenshin |
yipdw: anyway if you reach a point you need resources for archivebot, just ping me |
19:09
🔗
|
yipdw |
Kenshin: np |
19:09
🔗
|
yipdw |
thanks |
19:17
🔗
|
Asparagir |
Nemo_bis: No, not yet -- I got the news through one of my e-mail mailing lists. But i'm sure something will be up soon. |
19:18
🔗
|
Asparagir |
Nemo_bis: Wait, it looks like one of the four sites, genealogy.com, now has a page with info about the shutdown: http://www.ancestry.com/cs/faq/genealogy-faq |
19:18
🔗
|
SketchCow |
How did today become Verify All Archiveteam Architecture Day |
19:18
🔗
|
SketchCow |
Probably the fact that justin tv is a fucking nightmare planet of websuck crashing into the IA building |
19:18
🔗
|
Asparagir |
Props to Ancestry.com for not deleting the message boards, just putting them into read-only mode. |
19:19
🔗
|
SketchCow |
Kaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaan |
19:19
🔗
|
Kenshin |
SketchCow: i guess it's been quiet for a while? |
19:19
🔗
|
Kenshin |
and suddenly we have justin.tv |
19:19
🔗
|
Kenshin |
so suddenly everyone is interested in how we're able to cope with projects? |
19:19
🔗
|
SketchCow |
Yeah |
19:19
🔗
|
SketchCow |
We'll likely be overengineered as fuck after this. |
19:19
🔗
|
SketchCow |
Which is good |
19:20
🔗
|
SketchCow |
For when facebook goes down |
19:20
🔗
|
Asparagir |
It never rains, but it pours. |
19:20
🔗
|
nico |
21:20 @SketchCow> For when facebook goes down |
19:21
🔗
|
nico |
a nightmare |
19:23
🔗
|
balrog |
[14:31:37] <@jrra> rip crazynation.org |
19:23
🔗
|
midas |
strange, we did some small projects not that long ago |
19:24
🔗
|
Asparagir |
So, on another subject...random thing that I discovered the other day: since when does the Washington Post block the IA bot in their robots.txt? And therefore is not available AT ALL in the Wayback Machine? |
19:24
🔗
|
midas |
ah fuck, SketchCow, which project was the one we had months to do? i lost the channel |
19:24
🔗
|
Asparagir |
What assholes. |
19:25
🔗
|
midas |
or someone who remembers, i think we had to september or something |
19:25
🔗
|
nico |
Asparagir: it still exists in IA |
19:25
🔗
|
Asparagir |
Yeah, but not visible to the public. |
19:26
🔗
|
* |
joepie91 back |
19:26
🔗
|
joepie91 |
[21:18] <@SketchCow> How did today become Verify All Archiveteam Architecture Day |
19:26
🔗
|
joepie91 |
has to happen every once in a while |
19:27
🔗
|
Nemo_bis |
Asparagir: stub at http://archiveteam.org/index.php?title=Ancestry.com |
19:27
🔗
|
joepie91 |
balrog: crazynation is suspended; might just be a host issue |
19:27
🔗
|
midas |
ah, #totheyard |
19:27
🔗
|
balrog |
seems it's been down for some time |
19:28
🔗
|
BiggieJon |
midas: helium ? mlkshk ? verizon customer pages ? |
19:28
🔗
|
Nemo_bis |
http://archiveteam.org/index.php?title=Current_Projects#Upcoming_projects |
19:28
🔗
|
Nemo_bis |
wikis, how wonderful ;) |
19:29
🔗
|
joepie91 |
"We're pleased to announce that GenForum message boards, Family Tree Maker homepages, and the most popular articles will continue to be available in a read-only format on the Genealogy.com site." |
19:29
🔗
|
joepie91 |
well that's something, I suppose |
19:30
🔗
|
joepie91 |
minimum bar of shutdowning almost reached |
19:38
🔗
|
Asparagir |
Nemo_bis: Thanks. Minor edits made; will update as needed. |
19:39
🔗
|
schbirid |
https://github.com/FlatRockSoft/Hovertank3D |
19:42
🔗
|
balrog |
Asparagir: http://familytreemaker.genealogy.com/users/ |
19:42
🔗
|
balrog |
wow, those pages look straight out of 1995 |
19:42
🔗
|
Asparagir |
They are! |
19:43
🔗
|
Asparagir |
joepie91: True. But am I going to trust 39,000,000+ web pages of family history to those guys' whims? https://www.google.com/search?client=safari&rls=en&q=site:familytreemaker.genealogy.com&ie=UTF-8&oe=UTF-8 Hahahahaha, no. |
19:43
🔗
|
Asparagir |
Okay, 38.2 million. But still! |
19:43
🔗
|
Asparagir |
(And SketchCow chastises me in another channel that these numbers are very very very sketchy.) |
19:55
🔗
|
Nemo_bis |
Put everything on wikidata.org! |
19:58
🔗
|
Nemo_bis |
then you can make https://toolserver.org/~magnus/ts2/geneawiki/?q=Q508848 (http://ultimategerardm.blogspot.it/2013/08/what-to-do-when-wikipedia-does-not_16.html ) |
20:01
🔗
|
exmic |
re: archivebot discussion, how about a mechanism to cut jobs at 1000 fetches and send the queue back to the tracker/hub for re-issuance |
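One way to picture this proposal: a worker processes a bounded slice of a job, then hands the remaining queue back so it can be re-issued to any machine. This is a sketch of the idea only, not ArchiveBot's or the tracker's actual design; `run_slice` and the `fetch` callback are hypothetical names.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


def run_slice(
    queue: Deque[str],
    fetch: Callable[[str], Iterable[str]],
    max_fetches: int = 1000,
) -> List[str]:
    """Fetch up to max_fetches URLs, then return whatever is still queued.

    `fetch` downloads one URL and returns the links it discovered.
    Deduplication here only covers this slice; remembering URLs across
    slices would be the hub's job in this sketch.
    """
    seen = set(queue)
    fetched = 0
    while queue and fetched < max_fetches:
        url = queue.popleft()
        fetched += 1
        for link in fetch(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return list(queue)


# Toy fetcher: every page "links" to one new page.
leftover = run_slice(deque(["a"]), lambda url: [url + "a"], max_fetches=3)
print(leftover)
```

An empty return value means the job finished within the slice; anything else is the queue to send back for re-issuance.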
20:06
🔗
|
Famicoman |
Any plans to put blip.tv back in the warrior? |
20:18
🔗
|
SketchCow |
------------------------------------- |
20:18
🔗
|
SketchCow |
IMPORTANT NEWS |
20:18
🔗
|
SketchCow |
archive.org now shows .iso files in file listing links in items |
20:18
🔗
|
SketchCow |
okay not so important |
20:18
🔗
|
SketchCow |
------------------------------------- |
20:18
🔗
|
SketchCow |
But Brewster was cranky it was the old ways. |
20:19
🔗
|
exmic |
I don't see a change, can you explain in more detail? |
20:27
🔗
|
joepie91 |
SketchCow: does it now also link to the ISO browsing feature? |
20:27
🔗
|
* |
joepie91 has been waiting for that to be a UI thing for a whil |
20:27
🔗
|
joepie91 |
while * |
20:30
🔗
|
SketchCow |
No! |
20:31
🔗
|
SketchCow |
Good point! |
20:34
🔗
|
Nemo_bis |
SketchCow: I like it :) it's like a Kahle stamp of approval on all the ripping |
20:38
🔗
|
SketchCow |
Kahle's happy we've brought so much to bear |
20:38
🔗
|
SketchCow |
We've caught up for a decade of neglect |
20:38
🔗
|
SketchCow |
handily |
20:44
🔗
|
underscor |
RELEASE 6 |
20:45
🔗
|
yipdw |
I'll add a changelog to archivebot and just arbitrarily start at RELEASE 6 |
20:46
🔗
|
exmic |
sounds good, ship it |
20:46
🔗
|
underscor |
:D |
20:57
🔗
|
joepie91 |
SketchCow: I stalked you in PM, btw |
20:58
🔗
|
joepie91 |
oh, hai underscor, it's been a while |
23:50
🔗
|
Asparagir |
Does anyone remember who did the ValleyWag crawl in ArchiveBot a few months ago? And why? Just curious... |
23:54
🔗
|
Asparagir |
Nevermind, got it. |
23:54
🔗
|
Asparagir |
No one did a full crawl; it was just a few key articles grabbed. |
23:55
🔗
|
DFJustin |
looks like ivan` did |
23:55
🔗
|
DFJustin |
#archivebot.EFnet.20131207.log:[19:44:10] <ivan`> !a http://valleywag.gawker.com/ |
23:55
🔗
|
Asparagir |
No, the ones I saw were articles about Brendan Eich and related news. No full crawls yet. |
23:56
🔗
|
Asparagir |
And were initiated by yipdw . |
23:56
🔗
|
Asparagir |
Point being, I am living in San Francisco (well, almost) during a latter-day Gilded Age and I want the stories about this place preserved, robots.txt or no. |
23:57
🔗
|
Asparagir |
I think this might be a good test project for me to break out the wpull + phantomjs, instead of wget. |
23:58
🔗
|
DFJustin |
http://archivebot.at.ninjawedding.org:4567/#/histories/http://valleywag.gawker.com/ |
23:59
🔗
|
Asparagir |
Six months ago, half these crazy startups didn't even exist yet. :-P |