#archiveteam 2016-02-26,Fri

Who What When
dashcloudstorage primarily- flickr is probably the largest photo site in the world (facebook is probably bigger, but it's not a photo site first) [00:02]
SketchCow------------------------- INTERNET ARCHIVE READ-ONLY FOR A WHILE [00:06]
***MMovie has quit IRC (Read error: Operation timed out) [00:07]
dashcloudhope it's going to be okay! [00:07]
***MMovie has joined #archiveteam [00:08]
xmcit's planned maintenance [00:10]
***qwebirc89 has joined #archiveteam [00:20]
qwebirc89Hi [00:21]
***bsmith095 has joined #archiveteam [00:21]
snapeSalutations. [00:22]
Fletcherg'day qwebirc89 [00:26]
HCross2Hello! [00:27]
qwebirc89So I am new here. My real name is "Mark Graham"
Hi Harry
[00:27]
HCross2Hello [00:27]
xmcyou should pick a different nickname than qwebirc89 though
we get a lot of people with default names in here
[00:27]
arkiverHi Mark! [00:28]
qwebirc89yes... that will be my 2nd task (first task was getting connected) [00:28]
xmctype /nick newname
anyway
[00:28]
***qwebirc89 is now known as MarkGraha [00:28]
HCross2EFnet has a nick length limit which you just hit [00:28]
xmcoh you only get 9 letters [00:28]
***MarkGraha is now known as MGraham [00:29]
arkiverSketchCow announced you would come here [00:29]
HCross2yes [00:29]
xmcwho is MGraham? some kind of celebrity? [00:29]
MGrahamDoes anyone use a Mac client (e.g. Textual?) [00:29]
arkiverNettalk here [00:29]
MGrahamThere is a banjo player named Mark Graham but that is not me [00:30]
arkiverYou're the director of the Wayback Machine if I understood that correctly [00:31]
MGrahamI work at the Internet Archive in SF [00:31]
xmcah, spiffy
hello, friend
[00:31]
HCross2and the target of many terabytes of news [00:31]
xmcwe give you piles and piles of crap from the net [00:31]
HCross2amongst other things [00:31]
MGrahamYes... I am working with a fantastic team on all things Wayback
I LOVE all the crap you give us!
[00:31]
xmc:D [00:32]
arkiver:D [00:32]
DFJustinALL of it? ;P [00:32]
MGrahamAnd we are working to keep it safe and make it easier for people to access it [00:32]
arkiverSo we have the newgrabber project, FTP project, warrior projects and upcoming videobot
Yeah
[00:32]
HCross2we have a lot [00:32]
MGrahamI am here (in this channel) to learn from you [00:32]
DFJustinand of course archivebot [00:32]
HCross2a lot more coming your way
arkiver: flickr too
[00:33]
MGrahamAnd, to offer any help I may be able to offer and/or that may be desired [00:33]
arkiverYes, SketchCow gave us the green light to get all free pictures from flickr
Nice!
[00:33]
MGrahamKeep it all coming! [00:34]
arkiverSo we've had some project which currently can't run in the wayback machine
one of the largest in size if Blip
is*
[00:34]
MGrahamOne of the areas I am most interested in is "news". [00:34]
***tomwsmf-a has quit IRC (Read error: Operation timed out) [00:34]
arkiverOk [00:34]
***Fletcher sets mode: +o MGraham [00:34]
HCross2yea, the source of many hours of code/server fire [00:35]
arkiverCurrently newsgrabber is down due to a small rewrite to use multiple servers [00:35]
MGrahamThere are 3 "news" projects related to the IA. GDELT, "top_news" and your (Harry) newsgrabber [00:35]
HCross2well, its both mine and arkiver's - he writes the code, I run the grabber [00:35]
arkiverMGraham: would you like a little intro on how newsgrabber does its grabs? [00:36]
MGrahamYes! [00:36]
arkiverSo 'newsbuddy' can be found here https://github.com/ArchiveTeam/NewsGrabber [00:36]
MGrahamOf course I have read your wiki page
and that page as well
[00:36]
arkiverPage with supported services as well? https://github.com/ArchiveTeam/NewsGrabber/tree/master/services [00:37]
HCross2and more are being added all the tinme
time
quite often, a load of sites from one country are tipped in due to things going on in that country at the time (like the Taiwan earthquake)
[00:37]
***Boppen has quit IRC (Ping timeout: 200 seconds) [00:39]
MGrahamOne question I have is does it make sense to have 3 news crawling projects that don't "talk" to each other... or might it be better to pool our efforts/resources/time into 1 [00:41]
HCrossI think that sounds like a good idea, ensuring we dont have any overlap
Do the crawls that the IA do have such fine control over how often they crawl, like we do?
[00:42]
arkiveryeah, problem is these 3 crawls operate very differently [00:42]
HCrosswe aim to get in, as soon as the article hits the site
to track if there are any changes
to the story (so say the newsagency makes a mistake)
[00:43]
arkiverGDELT uses lists, top_news does crawls up to 5 (?) links deep and newsbuddy does scrapes for new articles on webpages
We can provide lists of URLs crawled by newsbuddy
Before starting a grab newsbuddies lists can be deduplicated from GDELT lists
I'm not sure though how we can get top_news into that picture
[00:43]
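The deduplication step arkiver describes (dropping URLs from the newsbuddy list that already appear in a GDELT list before a grab starts) could look roughly like the sketch below. The function names, the normalization rules, and the example URLs are all assumptions made for illustration; nothing here is taken from the NewsGrabber code.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase scheme and host and drop the fragment (an assumed rule set)."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def dedupe_against(candidates, already_listed):
    """Return candidates not present in already_listed, keeping first-seen order."""
    seen = {normalize(u) for u in already_listed}
    fresh = []
    for url in candidates:
        key = normalize(url)
        if key not in seen:
            seen.add(key)  # also removes repeats inside the candidate list
            fresh.append(url)
    return fresh


# Hypothetical example lists; real lists would be read from files.
gdelt_urls = ["http://example.com/news/a", "http://EXAMPLE.com/news/b#top"]
newsbuddy_urls = [
    "http://example.com/news/b",
    "http://example.com/news/c",
    "http://example.com/news/c",
]

to_grab = dedupe_against(newsbuddy_urls, gdelt_urls)
print(to_grab)
```

A set lookup keeps the comparison cheap even for long lists. How aggressively URLs should be normalized before comparing is a judgment call that depends on the sites involved.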
DFJustinwho should I talk to about bugs in wayback's handling of robots.txt files [00:45]
SketchCowI had a brilliant idea about that [00:48]
HCrossGo on [00:48]
SketchCowOn a case by case basis, an optional clickthrough screen is added.
Explains this is a historical snapshot and unrelated to the site.
You have to agree to this, then you can see the old crap
Make it opt-in
Why am I not running everything
[00:49]
DFJustinI'm not even talking about change of ownership stuff just straight up it should be allowed according to the robots.txt but it locks you out anyway [00:50]
arkiverMGraham: We did a partial grab of Google Code. I'm in contact with the people behind Google Code. Our user agent will be whitelisted, so we can continue the grab of Google Code
Wayback Machine will have a full copy of Google Code
(except the source code pages, as those can be found in the repo that's up for download)
[00:52]
xmc:D [00:53]
***MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
Boppen has joined #archiveteam
[01:01]
DFJustinI'm heading out so here are some examples
should be allowed by the Allow: directive https://web.archive.org/web/*/https://bugzilla.mozilla.org/show_bug.cgi?id=920433
the robots.txt is just an internal server error page http://web.archive.org/web/*/http://www.nintendometal.com/*
[01:06]
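As a point of comparison for the first example, here is a minimal sketch of how an Allow: rule can open up one path on an otherwise disallowed site, using Python's standard urllib.robotparser. The robots.txt content and the host name are made up for illustration; this is not the actual bugzilla.mozilla.org file, and it says nothing about how the Wayback Machine itself evaluates robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, not copied from any real site.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Allow: /show_bug.cgi
Disallow: /
"""

parser = RobotFileParser()
parser.parse(HYPOTHETICAL_ROBOTS_TXT.splitlines())

# Python's parser uses the first matching rule in file order,
# which is why Allow is listed before the blanket Disallow.
bug_allowed = parser.can_fetch("*", "https://bugs.example.org/show_bug.cgi?id=1")
list_allowed = parser.can_fetch("*", "https://bugs.example.org/buglist.cgi")

print(bug_allowed, list_allowed)
```

Other parsers pick the most specific matching rule instead of the first one, so the same file can be read differently by different crawlers. That is one way an Allow: line can end up ignored.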
***DexRemoun has joined #archiveteam [01:14]
DexRemounhallo
DexRemoun slaps Cameron_D around a bit with a large fishbot
[01:16]
***zerkalo has joined #archiveteam [01:17]
DexRemounwas get [01:18]
Fletcherhello DexRemoun [01:18]
DexRemoun(in German) what's up [01:18]
***MMovie has quit IRC (Read error: Operation timed out) [01:18]
DexRemoun(in German) what kind of chat..?
(in German) what kind of chat is this
[01:19]
***MMovie has joined #archiveteam [01:19]
arkiverPlease talk in English [01:19]
***zerkalo has quit IRC (Remote host closed the connection)
JetBalsa has joined #archiveteam
zerkalo has joined #archiveteam
[01:20]
DexRemounstart [01:22]
***DexRemoun has left
zerkalo has quit IRC (Client Quit)
zerkalo has joined #archiveteam
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
JesseW has joined #archiveteam
[01:25]
tomwsmf-a has joined #archiveteam
lytv has quit IRC (Max SendQ exceeded)
Atom-- has joined #archiveteam
Atom__ has quit IRC (Ping timeout: 252 seconds)
lytv has joined #archiveteam
[01:40]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[01:56]
dashcloud has quit IRC (Read error: Operation timed out) [02:09]
MGrahamSorry... I had to step away from my keyboard but am back... but only long enough to say I am heading home (Half Moon Bay). I will follow-up with you via email Harry. Thanks! [02:11]
***MMovie has quit IRC (Read error: Operation timed out)
dashcloud has joined #archiveteam
MMovie has joined #archiveteam
_vOYtEC has joined #archiveteam
vOYtEC_ has quit IRC (Read error: Connection reset by peer)
MGraham has quit IRC (Ping timeout: 258 seconds)
[02:12]
wp494 has quit IRC (Read error: Connection reset by peer) [02:25]
bsmith093 has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client) [02:35]
Boppen has quit IRC (hub.se irc.du.se)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
megaminxw has joined #archiveteam
philpem has quit IRC (Ping timeout: 260 seconds)
[02:40]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[02:57]
dashcloudSketchCow: any thoughts on how to handle multi-episode DOS shareware games? There's some that had the registered version released as freeware, so I'd like to get them up on IA if they aren't already. I'm thinking one item per episode, with links in the description pointing to the other episodes. [03:11]
..... (idle for 20mn)
***MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
dashcloud has quit IRC (Read error: Operation timed out)
dashcloud has joined #archiveteam
[03:31]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[03:48]
SketchCowyes.
exactly.
[03:53]
dashcloudokay- thanks. [04:00]
***mismatch_ has quit IRC (Ping timeout: 633 seconds)
Boppen has joined #archiveteam
[04:12]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
Boppen has quit IRC (hub.se irc.du.se)
[04:18]
tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
megaminxw has quit IRC (Quit: Leaving.)
[04:32]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[04:50]
JetBalsa has quit IRC (Read error: Connection reset by peer) [05:02]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[05:07]
.... (idle for 16mn)
Sk1d has quit IRC (Ping timeout: 250 seconds)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
ndizzle has joined #archiveteam
Sk1d has joined #archiveteam
[05:24]
xXx_ndidd has quit IRC (Read error: Operation timed out) [05:41]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
ndizzle has quit IRC (Read error: Operation timed out)
[05:50]
WinterFox has joined #archiveteam [06:01]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[06:06]
wp494 has joined #archiveteam [06:17]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[06:24]
metalcamp has joined #archiveteam [06:35]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[06:42]
.... (idle for 17mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[07:01]
SketchCowSmall note: Apparently I didn't re-run the "Send Archivebot output" watchman script when the machine last rebooted.
So Archivebot's being a little..... cranky
I'm now finally running it. The backlog is 1.8tb
[07:15]
JesseWwhat's a terabyte or 0.8 between friends, really? [07:18]
***JesseW has quit IRC (Quit: Leaving.)
JesseW has joined #archiveteam
JesseW has quit IRC (Client Quit)
[07:22]
xmc has quit IRC (Ping timeout: 260 seconds)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[07:31]
.... (idle for 18mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[07:51]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
Chorca has quit IRC (Read error: Operation timed out)
Chorca has joined #archiveteam
[08:06]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[08:24]
schbirid has joined #archiveteam [08:37]
..... (idle for 20mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[08:57]
.... (idle for 18mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[09:17]
Mayonaise has quit IRC (Read error: Operation timed out) [09:28]
arkiverWe haven't talked with Mark Graham about other projects
Some that need some special javascript playback from the wayback machine
or the FTP project
[09:41]
***atomotic has joined #archiveteam [09:53]
bwn has quit IRC (Read error: Operation timed out) [09:59]
.... (idle for 19mn)
MMovie has quit IRC (Read error: Operation timed out)
Protab is now known as Rotab
MMovie has joined #archiveteam
[10:18]
Mayonaise has joined #archiveteam [10:33]
megaminxw has joined #archiveteam [10:39]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[10:52]
bwn has joined #archiveteam [11:05]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[11:10]
.... (idle for 18mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
zerkalo has quit IRC (Quit: Lost terminal)
[11:30]
Sk1dtracker seems to be down [11:37]
***MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[11:42]
_vOYtEC has quit IRC (Read error: Connection reset by peer)
vOYtEC has joined #archiveteam
achip has quit IRC (Ping timeout: 258 seconds)
zerkalo has joined #archiveteam
Rotab has quit IRC (Ping timeout: 260 seconds)
db48x has quit IRC (Ping timeout: 258 seconds)
achip has joined #archiveteam
vOYtEC has quit IRC (Read error: Connection reset by peer)
zerkalo_ has joined #archiveteam
lbft_ has joined #archiveteam
zerkalo has quit IRC (Read error: Operation timed out)
lbft has quit IRC (Read error: Operation timed out)
achip has quit IRC (Ping timeout: 258 seconds)
achip has joined #archiveteam
vOYtEC has joined #archiveteam
atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…)
mr-b has quit IRC (Read error: Operation timed out)
Jonimus has quit IRC (Read error: Operation timed out)
morbus_ has joined #archiveteam
mr-b has joined #archiveteam
aMunster has quit IRC (Read error: Operation timed out)
rossdylan has quit IRC (Write error: Broken pipe)
beardicus has quit IRC (Read error: Operation timed out)
vegbrasil has quit IRC (Read error: Operation timed out)
closure has quit IRC (Read error: Operation timed out)
atomotic has joined #archiveteam
megaminxw has quit IRC (Read error: Operation timed out)
Morbus has quit IRC (Read error: Operation timed out)
zerkalo_ has quit IRC (Remote host closed the connection)
megaminxw has joined #archiveteam
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[11:54]
jspiros has quit IRC (Read error: Operation timed out) [12:28]
jspiros has joined #archiveteam [12:39]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[12:46]
toad2 has joined #archiveteam
toad1 has quit IRC (Read error: Operation timed out)
[12:58]
vegbrasil has joined #archiveteam
beardicus has joined #archiveteam
vtyl has joined #archiveteam
Laverne has quit IRC (Read error: Operation timed out)
closure has joined #archiveteam
aMunster has joined #archiveteam
dserodio has quit IRC (Quit: ZNC - http://znc.in)
Laverne has joined #archiveteam
[13:06]
Jonimus has joined #archiveteam
lytv has quit IRC (Read error: Operation timed out)
mhazinsk has joined #archiveteam
dserodio has joined #archiveteam
[13:15]
nwf has joined #archiveteam
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[13:32]
Zei-Pii has joined #archiveteam [13:43]
MMovie has quit IRC (Read error: Operation timed out)
jut has joined #archiveteam
MMovie has joined #archiveteam
[13:54]
bzc6p has joined #archiveteam [14:00]
bzc6pyipdw chfoo: The tracker has been down since today 05:53 UTC [14:03]
jutAny idea when it will be back? [14:03]
bzc6pWhen one of the aforementioned tracker operators has a chance to look at it. [14:04]
***WinterFox has quit IRC (Remote host closed the connection)
db48x` has joined #archiveteam
atomotic has joined #archiveteam
[14:06]
bzc6p has left
philpem has joined #archiveteam
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
Malik has joined #archiveteam
[14:21]
Malikhello
What about the site http://modsonline.com
[14:28]
***Sk2d has joined #archiveteam [14:30]
MalikMalik slaps arkiver around a bit with a large fishbot [14:31]
***Malik has quit IRC (Client Quit) [14:31]
Sk1d has quit IRC (hub.se irc.du.se)
tomwsmf-a has joined #archiveteam
[14:36]
xmc has joined #archiveteam [14:42]
xmcsorry for the outage, tracker should be up [14:45]
***MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
Sk2d is now known as Sk1d
[14:47]
plk has joined #archiveteam
Boppen has joined #archiveteam
plk has quit IRC (Client Quit)
LastNinja has joined #archiveteam
[14:58]
xmcactually, doubling the ram on it, brb [15:04]
***philpem has quit IRC (Ping timeout: 260 seconds)
zerkalo has joined #archiveteam
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[15:10]
bsmith095anyone want a stupidly rare old nick tape, that i can't seem to rip for the life of me
its a cassette, so i feel extra stupid
i'm getting feedback, huge bursts of static on the line, inconsistent volume, basically everything that can screw up a recording, all in this one stupid tape!
[15:26]
***megaminxw has quit IRC (Quit: Leaving.) [15:31]
snapedecaying remains of an alleged former tape, it sounds like. [15:31]
***jut has quit IRC (Read error: Operation timed out) [15:34]
bsmith095snape: seriously i will mail this thing to you
i've been googling, it doesnt seem to exist anywhere, i just want it saved.
[15:34]
snapeUnfortunately, I know next to nothing about video transfer. :/ [15:37]
bsmith095snape: audio cassette [15:37]
***MMovie has quit IRC (Read error: Operation timed out)
jut has joined #archiveteam
MMovie has joined #archiveteam
VADemon has joined #archiveteam
[15:39]
bsmith095SketchCow: do you still accept donations by mail [15:44]
***arkiver2 has joined #archiveteam [15:51]
.... (idle for 18mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
arkiver2 has quit IRC (Ping timeout: 252 seconds)
[16:09]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[16:22]
HCrosshttp://www.friendsreunited.co.uk finally gone :( [16:23]
LastNinja:( I thought they would have at least pulled the plug at midnight
HCross - did you manage to grab any of the site?
[16:27]
HCrossLastNinja, most of .co.uk is down
<arkiver> We actually got a lot more than expected
<arkiver> almost all .co.uk groups
<arkiver> and some discussions
<arkiver> the first page of every discussion thread is also already saved in the group grabs
[16:28]
LastNinjawow, that is great :) [16:29]
HCross"We are opening a new free service called Liife.com" - estimated lifetime, 15 years [16:29]
LastNinjaI have been working on it too, being quite selfish and getting the groups that were relevant to myself and family [16:29]
arkiverIf their site was a bit faster we would have saved everything [16:29]
xmcHCross: you mean liifetime [16:29]
HCrossye [16:30]
LastNinjayeah, I found it to be really slow too [16:30]
HCrossLastNinja, we were hitting them hard
by "them" I mean their single windows server
[16:31]
LastNinjahow long had you guys been at it for? I mentioned it to SketchCow when i saw the announcement, but have been travelling around so didn't get the chance to come in here and exchange notes =) [16:32]
HCrosspretty much since the annoucementr
announcement
[16:33]
jutWas it really a single Windows server? [16:34]
LastNinjagreat :) I went manually first to get full size images from assetstorage.co.uk for the groups i was interested in, which didnt take long. I had problems getting the fullsize images though [16:34]
HCrossjut, it was, and it was in the top of scotland and slow [16:34]
LastNinjaproblems with ripping the size and getting the fullsize
cant type today! problems ripping full size images i should say
[16:34]
MrRadarjut: I think they mentioned at one point the site was running on Classic ASP code nobody really understood anymore which is why they were shutting the site down [16:35]
HCrossLastNinja, you should have come in and told us, then we could have added them [16:35]
LastNinjai have got the ones from my groups - it was that bloody 'fullscreen' feature they had - not a direct link so my automation missed it, hence the manual approach
well, now I'm here, I shall hang around and see what use I can be in the future =)
[16:37]
MrRadarYou can always run a Warrior VM instance
http://archiveteam.org/index.php?title=Warrior
[16:39]
***JesseW has joined #archiveteam [16:41]
.... (idle for 16mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
bsmith093 has joined #archiveteam
[16:57]
bsmith093SketchCow:is there any way to change the identifier of an item? I accidentally mixed up id and title https://archive.org/details/Side1_20160226 [17:05]
snapeYou can do that by editing the metadata via the website, I believe. [17:06]
arkiverNo
Not the identifier as far as I know
[17:06]
SketchCowTell me what it was and what it should be and I can do it.
snape is wrong but oh so cute.
[17:09]
bsmith093https://archive.org/details/Side1_20160226 that item, switch the id to the title. thanks
SketchCow: https://archive.org/details/Side1_20160226 that item, switch the id to the title. thanks
[17:09]
snapesnape is usually wrong before noon, alas; not a morning person [17:10]
LastNinjaLastNinja slides snape a bucket of coffee [17:11]
bsmith093bsmith093 slaps snape repeatedly across the face "wake up already" [17:12]
***JesseW has quit IRC (Ping timeout: 252 seconds) [17:13]
SketchCowCCP, the makers of the game EVE Online, announced yesterday afternoon that
they were taking their game wiki offline as of Monday, February 29th.
Announcement URL:
http://community.eveonline.com/news/news-channels/eve-online-news/evelopedia-shutdown-2016-02-29-09-00/
Wiki Location: https://wiki.eveonline.com/en/wiki/Main_Page
This is annoying for a number of reasons, not least of which is the very
short notice being given. There are a lot of articles that contain lore and
[17:13]
bsmith093bsmith093 gives SketchCow an internet high five [17:13]
SketchCowhistorical information about the game on the wiki that are not written up
any place else. They have given players an option to download a .sql file
that has some of the data, but given its small size, it doesn't seem like
there is much data there. Also, not sure what most people are going to be
able to do with a .sql file.
[17:13]
MrRadarThe Eve wiki is already in ArchiveBot [17:13]
SketchCowI went to the Internet Archive and checked the Wayback Machine to see if it
was archiving the Wiki. It has been... sort of. It has saved the
organizational hierarchy, but when you drill down to actual articles, those
Emergency Wiki grab. Who can do it.
bsmith093: So you want the ID to be NickSongsFromthe90s?
[17:13]
bsmith093SketchCow:yes, please [17:14]
arkiverwiki is loading very slow for me [17:14]
HCrossisnt it in ArchiveBot [17:14]
arkiveryes it is [17:14]
SketchCowbsmith093: https://archive.org/details/NickSongsFromthe90s [17:15]
arkiverwith external links though [17:15]
bsmith093SketchCow:thanks [17:15]
phuzionIs anyone grabbing the eve wiki yet? If not, I'll get a grab started [17:15]
MrRadarIt's in archivebot, but a standalone grab would probably be a good idea too [17:16]
SketchCowPush it to the front. Full priority. [17:16]
LastNinjai'm having a go at grabbing the eve wiki right now [17:16]
phuzionSketchCow: worth it to hit it with wikiteam tools too? [17:16]
LastNinjabut I dont trust myself to make a good job of it :) [17:17]
SketchCowxmc: The Software Heritage Institute of France would like a copy of Gitorious. I would suggest you give them one.
Yes, I want a Wikiteam grab here.
We want nothing left to chance.
[17:17]
xmcalready in my email :) [17:17]
MrRadarLastNinja: you should probably use the mediawiki ignoreset if you're using grab-site, or otherwise use the same regexes it uses to reject useless URLs: https://github.com/ArchiveTeam/ArchiveBot/blob/master/db/ignore_patterns/mediawiki.json
Mediawikis generate an exponentially huge number of largely useless derived pages without aggressive trimming
[17:18]
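The idea behind an ignore set, as MrRadar describes it, is to test every discovered URL against a list of regexes and drop the ones that match. The sketch below shows that filtering step only. The three patterns are hypothetical examples of derived MediaWiki pages and are not copied from ArchiveBot's mediawiki.json; the wiki host is made up as well.

```python
import re

# Hypothetical patterns for illustration; the real ignore set lives in the
# ArchiveBot repository linked above and is not reproduced here.
HYPOTHETICAL_IGNORE_PATTERNS = [
    r"[?&]action=(edit|history|raw)\b",
    r"[?&]oldid=\d+",
    r"/Special:(RecentChanges|WhatLinksHere)\b",
]

COMPILED = [re.compile(pattern) for pattern in HYPOTHETICAL_IGNORE_PATTERNS]


def should_skip(url: str) -> bool:
    """True if any ignore pattern matches somewhere in the URL."""
    return any(rx.search(url) for rx in COMPILED)


discovered = [
    "https://wiki.example.org/wiki/Main_Page",
    "https://wiki.example.org/index.php?title=Main_Page&action=edit",
    "https://wiki.example.org/index.php?title=Main_Page&oldid=123",
    "https://wiki.example.org/wiki/Special:RecentChanges",
]

kept = [url for url in discovered if not should_skip(url)]
print(kept)
```

Each article otherwise fans out into edit, history, and old-revision URLs, so a filter like this has to run before the links are queued, not after they are fetched.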
phuzionLastNinja: What URL are you using for the API? [17:19]
SketchCowxmc: Great [17:19]
LastNinjaMrRadar thanks :-) I shall check that once i get back from dinner. Thanks for the pointer [17:24]
phuzionSketchCow: Bad news, looks like they hide their API from the world, which is stupid. [17:25]
arkiveryeah, we can't do a grab with the wiki external URLs grab
phuzion: so wikiteam also can't do a grab right?
[17:25]
phuzionarkiver: Unless I can figure out a way to hit the API
Hitting the API gets a 500 Internal Server Error
Soooooooooo, I dunno what to do short of archivebotting the thing, since wikiteam tools require API access.
[17:26]
SketchCowDo what it takes however it takes. [17:30]
***bauruine has quit IRC (Ping timeout: 260 seconds)
noahc_ has joined #archiveteam
[17:36]
arkiverhttp://warctozip.archive.org/ is down [17:37]
SketchCowAnd has been for over a year.
I'm going to get it back up
[17:38]
arkiverThanks! [17:38]
noahc_Thanks so much @SketchCow [17:38]
@SketchCow, Do you have a rough ETA? If not, that's cool too. [17:50]
SketchCowIt's sitting in my inbox and I'm dumb as shit
So no idea.
[17:50]
***MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[17:50]
noahc_Oh, okay. Thanks. [17:52]
***zerkalo has quit IRC (Quit: leaving)
zerkalo has joined #archiveteam
[17:56]
HCrossSketchCow, copy of https://www.youtube.com/user/AlJazeeraAmerica/videos coming your way soon [18:00]
SketchCowThanks [18:02]
How much of Friends Reunited did we get? [18:08]
arkiverWe got most of the groups of the .co.uk domain
we did not get single users
[18:09]
***noahc_ has quit IRC (Ping timeout: 255 seconds) [18:10]
arkiverthough most (all maybe) of the pictures and comments are made in groups by users and we did get all of that [18:11]
***bauruine has joined #archiveteam [18:19]
SketchCowI won't make hay of it [18:22]
***MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[18:22]
arkiverIf the website didn't run on an old slow server we could have saved more [18:25]
***MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[18:35]
StartIt would be nice if we could get an archive of BetaArchive
They have 24.32 TB of software
Unfortunately, mirroring them is a pain in the ass
You have to be registered and have enough forum activity to gain FTP access
And once you do, there's a 50GB/day limit on how much you can download
Not to mention BetaArchive accounts are IP locked
[18:38]
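To put the figures Start quotes into perspective, a back-of-the-envelope calculation for a single account follows. It assumes 1 TB = 1000 GB and that the full 50GB could be used every day without interruption; both are simplifying assumptions, not something stated in the channel.

```python
import math

ARCHIVE_SIZE_GB = 24_320  # 24.32 TB, assuming 1 TB = 1000 GB
DAILY_CAP_GB = 50         # per-account download limit quoted above

# Round up: a partial final day still counts as a day.
days_needed = math.ceil(ARCHIVE_SIZE_GB / DAILY_CAP_GB)
years_needed = days_needed / 365

print(f"{days_needed} days, about {years_needed:.1f} years")
```

At that rate a full mirror through one account would take well over a year.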
SketchCowReach out to them about mirroring at Internet Archive
Front door
[18:42]
Also, a lot of their "abandonware" is stuff we're still hosting at the Archive [18:50]
***MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[18:53]
.... (idle for 16mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[19:10]
.... (idle for 18mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[19:29]
.... (idle for 17mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[19:47]
bwn has quit IRC (Ping timeout: 246 seconds) [20:01]
..... (idle for 20mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[20:21]
bwn has joined #archiveteam [20:32]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
philpem has joined #archiveteam
jut has quit IRC (jut)
[20:40]
MMovie has quit IRC (Read error: Operation timed out)
WinterFox has joined #archiveteam
MMovie has joined #archiveteam
[20:58]
.... (idle for 17mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
schbirid has quit IRC (Quit: Leaving)
[21:17]
Riviera has joined #archiveteam [21:28]
godane has quit IRC (Quit: Leaving.)
godane has joined #archiveteam
[21:35]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
icedice has joined #archiveteam
[21:46]
icedicehttp://blog.8tracks.com/2016/02/12/a-change-in-our-international-streaming/ [21:49]
***megaminxw has joined #archiveteam [21:49]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
LastNinja has quit IRC (Quit: byeeee)
[22:00]
JetBalsa has joined #archiveteam [22:11]
ndiddy has joined #archiveteam [22:16]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
LastNinja has joined #archiveteam
Boppen has quit IRC (hub.se irc.du.se)
[22:29]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[22:45]
Elegance has joined #archiveteam
Elegance has quit IRC (Client Quit)
[22:59]
Elegance has joined #archiveteam
Elegance has quit IRC (Client Quit)
[23:06]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[23:16]
.... (idle for 18mn)
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
metalcamp has quit IRC (Ping timeout: 252 seconds)
[23:35]
MMovie has quit IRC (Read error: Operation timed out)
MMovie has joined #archiveteam
[23:53]
