[00:02] storage primarily- flickr is probably the largest photo site in the world (facebook is probably bigger, but it's not a photo site first) [00:06] ------------------------- INTERNET ARCHIVE READ-ONLY FOR A WHILE [00:07] *** MMovie has quit IRC (Read error: Operation timed out) [00:07] hope it's going to be okay! [00:08] *** MMovie has joined #archiveteam [00:10] it's planned maintenance [00:20] *** qwebirc89 has joined #archiveteam [00:21] Hi [00:21] *** bsmith095 has joined #archiveteam [00:22] Salutations. [00:26] g'day qwebirc89 [00:27] Hello! [00:27] So I am new here. My real name is "Mark Graham" [00:27] Hi Harry [00:27] Hello [00:27] you should pick a different nickname than qwebirc89 though [00:28] we get a lot of people with default names in here [00:28] Hi Mark! [00:28] yes... that will be my 2nd task (first task was getting connected) [00:28] type /nick newname [00:28] anyway [00:28] *** qwebirc89 is now known as MarkGraha [00:28] EFnet has a nick length limit which you just hit [00:28] oh you only get 9 letters [00:29] *** MarkGraha is now known as MGraham [00:29] SketchCow announced you would come here [00:29] yes [00:29] who is MGraham? some kind of celebrity? [00:29] Does anyone use a Mac client (e.g. Textual?) [00:29] Nettalk here [00:30] There is a banjo player named Mark Graham but that is not me [00:31] You're the director of the Wayback Machine if I understood that correctly [00:31] I work at the Internet Archive in SF [00:31] ah, spiffy [00:31] hello, friend [00:31] and the target of many terabytes of news [00:31] we give you piles and piles of crap from the net [00:31] amongst other things [00:31] Yes... I am working with a fantastic team on all things Wayback [00:32] I LOVE all the crap you give us! [00:32] :D [00:32] :D [00:32] ALL of it? 
;P [00:32] And we are working to keep it safe and make it easier for people to access it [00:32] So we have the newsgrabber project, FTP project, warrior projects and upcoming videobot [00:32] Yeah [00:32] we have a lot [00:32] I am here (in this channel) to learn from you [00:32] and of course archivebot [00:33] a lot more coming your way [00:33] arkiver: flickr too [00:33] And, to offer any help I may be able to offer and/or that may be desired [00:33] Yes, SketchCow gave us the green light to get all free pictures from flickr [00:33] Nice! [00:34] Keep it all coming! [00:34] So we've had some projects which currently can't run in the wayback machine [00:34] one of the largest in size is Blip [00:34] One of the areas I am most interested in is "news". [00:34] *** tomwsmf-a has quit IRC (Read error: Operation timed out) [00:34] Ok [00:34] *** Fletcher sets mode: +o MGraham [00:35] yea, the source of many hours of code/server fire [00:35] Currently newsgrabber is down due to a small rewrite to use multiple servers [00:35] There are 3 "news" projects related to the IA. GDELT, "top_news" and your (Harry) newsgrabber [00:35] well, it's both mine and arkiver's - he writes the code, I run the grabber [00:36] MGraham: would you like a little intro on how newsgrabber does its grabs? [00:36] Yes! [00:36] So 'newsbuddy' can be found here https://github.com/ArchiveTeam/NewsGrabber [00:36] Of course I have read your wiki page [00:36] and that page as well [00:37] Page with supported services as well? https://github.com/ArchiveTeam/NewsGrabber/tree/master/services [00:37] and more are being added all the time [00:38] quite often, a load of sites from one country are tipped in due to things going on in that country at the time (like the Taiwan earthquake) [00:39] *** Boppen has quit IRC (Ping timeout: 200 seconds) [00:41] One question I have is does it make sense to have 3 news crawling projects that don't "talk" to each other... 
or might it be better to pool our efforts/resources/time into 1 [00:42] I think that sounds like a good idea, ensuring we don't have any overlap [00:42] Do the crawls that the IA does have such fine control over how often they crawl, like we do? [00:42] yeah, problem is these 3 crawls operate very differently [00:43] we aim to get in, as soon as the article hits the site [00:43] to track if there are any changes [00:43] to the story (so say the news agency makes a mistake) [00:43] GDELT uses lists, top_news does crawls up to 5 (?) links deep and newsbuddy does scrapes for new articles on webpages [00:44] We can provide lists of URLs crawled by newsbuddy [00:45] Before starting a grab newsbuddy's lists can be deduplicated from GDELT lists [00:45] I'm not sure though how we can get top_news into that picture [00:45] who should I talk to about bugs in wayback's handling of robots.txt files [00:48] I had a brilliant idea about that [00:48] Go on [00:49] On a case by case basis, an optional clickthrough screen is added. [00:50] Explains this is a historical snapshot and unrelated to the site. [00:50] You have to agree to this, then you can see the old crap [00:50] Make it opt-in [00:50] Why am I not running everything [00:50] I'm not even talking about change of ownership stuff, just straight up it should be allowed according to the robots.txt but it locks you out anyway [00:52] MGraham: We did a partial grab of Google Code. I'm in contact with the people behind Google Code. 
Our user agent will be whitelisted, so we can continue the grab of Google Code [00:52] Wayback Machine will have a full copy of Google Code [00:53] (except the source code pages, as those can be found in the repo that's up for download) [00:53] :D [01:01] *** MMovie has quit IRC (Read error: Operation timed out) [01:02] *** MMovie has joined #archiveteam [01:03] *** Boppen has joined #archiveteam [01:06] I'm heading out so here are some examples [01:06] should be allowed by the Allow: directive https://web.archive.org/web/*/https://bugzilla.mozilla.org/show_bug.cgi?id=920433 [01:06] the robots.txt is just an internal server error page http://web.archive.org/web/*/http://www.nintendometal.com/* [01:14] *** DexRemoun has joined #archiveteam [01:16] hello [01:17] * DexRemoun slaps Cameron_D around a bit with a large fishbot [01:17] *** zerkalo has joined #archiveteam [01:18] what's up [01:18] hello DexRemoun [01:18] what's up [01:18] *** MMovie has quit IRC (Read error: Operation timed out) [01:19] what kind of chat is this..? 
[01:19] what kind of chat is this [01:19] *** MMovie has joined #archiveteam [01:19] Please talk in English [01:20] *** zerkalo has quit IRC (Remote host closed the connection) [01:21] *** JetBalsa has joined #archiveteam [01:21] *** zerkalo has joined #archiveteam [01:22] start [01:25] *** DexRemoun has left [01:26] *** zerkalo has quit IRC (Client Quit) [01:26] *** zerkalo has joined #archiveteam [01:29] *** MMovie has quit IRC (Read error: Operation timed out) [01:30] *** MMovie has joined #archiveteam [01:34] *** JesseW has joined #archiveteam [01:40] *** tomwsmf-a has joined #archiveteam [01:44] *** lytv has quit IRC (Max SendQ exceeded) [01:44] *** Atom-- has joined #archiveteam [01:47] *** Atom__ has quit IRC (Ping timeout: 252 seconds) [01:49] *** lytv has joined #archiveteam [01:56] *** MMovie has quit IRC (Read error: Operation timed out) [01:57] *** MMovie has joined #archiveteam [02:09] *** dashcloud has quit IRC (Read error: Operation timed out) [02:11] Sorry... I had to step away from my keyboard but am back... but only long enough to say I am heading home (Half Moon Bay). I will follow-up with you via email Harry. Thanks! 
[02:12] *** MMovie has quit IRC (Read error: Operation timed out) [02:12] *** dashcloud has joined #archiveteam [02:13] *** MMovie has joined #archiveteam [02:14] *** _vOYtEC has joined #archiveteam [02:14] *** vOYtEC_ has quit IRC (Read error: Connection reset by peer) [02:16] *** MGraham has quit IRC (Ping timeout: 258 seconds) [02:25] *** wp494 has quit IRC (Read error: Connection reset by peer) [02:35] *** bsmith093 has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client) [02:40] *** Boppen has quit IRC (hub.se irc.du.se) [02:42] *** MMovie has quit IRC (Read error: Operation timed out) [02:44] *** MMovie has joined #archiveteam [02:46] *** megaminxw has joined #archiveteam [02:50] *** philpem has quit IRC (Ping timeout: 260 seconds) [02:57] *** MMovie has quit IRC (Read error: Operation timed out) [02:58] *** MMovie has joined #archiveteam [03:11] SketchCow: any thoughts on how to handle multi-episode DOS shareware games? There's some that had the registered version released as freeware, so I'd like to get them up on IA if they aren't already. I'm thinking one item per episode, with links in the description pointing to the other episodes. [03:31] *** MMovie has quit IRC (Read error: Operation timed out) [03:33] *** MMovie has joined #archiveteam [03:35] *** dashcloud has quit IRC (Read error: Operation timed out) [03:39] *** dashcloud has joined #archiveteam [03:48] *** MMovie has quit IRC (Read error: Operation timed out) [03:50] *** MMovie has joined #archiveteam [03:53] yes. [03:53] exactly. [04:00] okay- thanks. [04:12] *** mismatch_ has quit IRC (Ping timeout: 633 seconds) [04:13] *** Boppen has joined #archiveteam [04:18] *** MMovie has quit IRC (Read error: Operation timed out) [04:20] *** MMovie has joined #archiveteam [04:22] *** Boppen has quit IRC (hub.se irc.du.se) [04:32] *** tomwsmf-a has quit IRC (Ping timeout: 258 seconds) [04:36] *** megaminxw has quit IRC (Quit: Leaving.) 
[04:50] *** MMovie has quit IRC (Read error: Operation timed out) [04:50] *** MMovie has joined #archiveteam [05:02] *** JetBalsa has quit IRC (Read error: Connection reset by peer) [05:07] *** MMovie has quit IRC (Read error: Operation timed out) [05:08] *** MMovie has joined #archiveteam [05:24] *** Sk1d has quit IRC (Ping timeout: 250 seconds) [05:25] *** MMovie has quit IRC (Read error: Operation timed out) [05:26] *** MMovie has joined #archiveteam [05:29] *** ndizzle has joined #archiveteam [05:31] *** Sk1d has joined #archiveteam [05:41] *** xXx_ndidd has quit IRC (Read error: Operation timed out) [05:50] *** MMovie has quit IRC (Read error: Operation timed out) [05:51] *** MMovie has joined #archiveteam [05:53] *** ndizzle has quit IRC (Read error: Operation timed out) [06:01] *** WinterFox has joined #archiveteam [06:06] *** MMovie has quit IRC (Read error: Operation timed out) [06:07] *** MMovie has joined #archiveteam [06:17] *** wp494 has joined #archiveteam [06:24] *** MMovie has quit IRC (Read error: Operation timed out) [06:26] *** MMovie has joined #archiveteam [06:35] *** metalcamp has joined #archiveteam [06:42] *** MMovie has quit IRC (Read error: Operation timed out) [06:44] *** MMovie has joined #archiveteam [07:01] *** MMovie has quit IRC (Read error: Operation timed out) [07:03] *** MMovie has joined #archiveteam [07:15] Small note: Apparently I didn't re-run the "Send Archivebot output" watchman script when the machine last rebooted. [07:15] So Archivebot's being a little..... cranky [07:17] I'm now finally running it. The backlog is 1.8tb [07:18] what's a terabyte or 0.8 between friends, really? [07:22] *** JesseW has quit IRC (Quit: Leaving.) 
[07:23] *** JesseW has joined #archiveteam [07:23] *** JesseW has quit IRC (Client Quit) [07:31] *** xmc has quit IRC (Ping timeout: 260 seconds) [07:31] *** MMovie has quit IRC (Read error: Operation timed out) [07:33] *** MMovie has joined #archiveteam [07:51] *** MMovie has quit IRC (Read error: Operation timed out) [07:52] *** MMovie has joined #archiveteam [08:06] *** MMovie has quit IRC (Read error: Operation timed out) [08:08] *** MMovie has joined #archiveteam [08:09] *** Chorca has quit IRC (Read error: Operation timed out) [08:12] *** Chorca has joined #archiveteam [08:24] *** MMovie has quit IRC (Read error: Operation timed out) [08:25] *** MMovie has joined #archiveteam [08:37] *** schbirid has joined #archiveteam [08:57] *** MMovie has quit IRC (Read error: Operation timed out) [08:59] *** MMovie has joined #archiveteam [09:17] *** MMovie has quit IRC (Read error: Operation timed out) [09:19] *** MMovie has joined #archiveteam [09:28] *** Mayonaise has quit IRC (Read error: Operation timed out) [09:41] We haven't talked with Mark Graham about other projects [09:41] Some that need some special javascript playback from the wayback machine [09:41] or the FTP project [09:53] *** atomotic has joined #archiveteam [09:59] *** bwn has quit IRC (Read error: Operation timed out) [10:18] *** MMovie has quit IRC (Read error: Operation timed out) [10:20] *** Protab is now known as Rotab [10:20] *** MMovie has joined #archiveteam [10:33] *** Mayonaise has joined #archiveteam [10:39] *** megaminxw has joined #archiveteam [10:52] *** MMovie has quit IRC (Read error: Operation timed out) [10:53] *** MMovie has joined #archiveteam [11:05] *** bwn has joined #archiveteam [11:10] *** MMovie has quit IRC (Read error: Operation timed out) [11:12] *** MMovie has joined #archiveteam [11:30] *** MMovie has quit IRC (Read error: Operation timed out) [11:31] *** MMovie has joined #archiveteam [11:34] *** zerkalo has quit IRC (Quit: Lost terminal) [11:37] tracker seems to be down 
[11:42] *** MMovie has quit IRC (Read error: Operation timed out) [11:43] *** MMovie has joined #archiveteam [11:54] *** _vOYtEC has quit IRC (Read error: Connection reset by peer) [11:54] *** vOYtEC has joined #archiveteam [11:54] *** achip has quit IRC (Ping timeout: 258 seconds) [11:54] *** zerkalo has joined #archiveteam [11:55] *** Rotab has quit IRC (Ping timeout: 260 seconds) [11:56] *** db48x has quit IRC (Ping timeout: 258 seconds) [11:56] *** achip has joined #archiveteam [11:59] *** vOYtEC has quit IRC (Read error: Connection reset by peer) [11:59] *** zerkalo_ has joined #archiveteam [12:00] *** lbft_ has joined #archiveteam [12:00] *** zerkalo has quit IRC (Read error: Operation timed out) [12:00] *** lbft has quit IRC (Read error: Operation timed out) [12:01] *** achip has quit IRC (Ping timeout: 258 seconds) [12:04] *** achip has joined #archiveteam [12:04] *** vOYtEC has joined #archiveteam [12:06] *** atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…) [12:06] *** mr-b has quit IRC (Read error: Operation timed out) [12:06] *** Jonimus has quit IRC (Read error: Operation timed out) [12:07] *** morbus_ has joined #archiveteam [12:07] *** mr-b has joined #archiveteam [12:07] *** aMunster has quit IRC (Read error: Operation timed out) [12:07] *** rossdylan has quit IRC (Write error: Broken pipe) [12:07] *** beardicus has quit IRC (Read error: Operation timed out) [12:08] *** vegbrasil has quit IRC (Read error: Operation timed out) [12:09] *** closure has quit IRC (Read error: Operation timed out) [12:10] *** atomotic has joined #archiveteam [12:10] *** megaminxw has quit IRC (Read error: Operation timed out) [12:13] *** Morbus has quit IRC (Read error: Operation timed out) [12:13] *** zerkalo_ has quit IRC (Remote host closed the connection) [12:16] *** megaminxw has joined #archiveteam [12:19] *** MMovie has quit IRC (Read error: Operation timed out) [12:20] *** MMovie has joined #archiveteam [12:28] *** jspiros has quit IRC (Read error: 
Operation timed out) [12:39] *** jspiros has joined #archiveteam [12:46] *** MMovie has quit IRC (Read error: Operation timed out) [12:48] *** MMovie has joined #archiveteam [12:50] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) [12:58] *** toad2 has joined #archiveteam [12:59] *** toad1 has quit IRC (Read error: Operation timed out) [13:06] *** vegbrasil has joined #archiveteam [13:06] *** beardicus has joined #archiveteam [13:07] *** vtyl has joined #archiveteam [13:08] *** Laverne has quit IRC (Read error: Operation timed out) [13:08] *** closure has joined #archiveteam [13:08] *** aMunster has joined #archiveteam [13:09] *** dserodio has quit IRC (Quit: ZNC - http://znc.in) [13:09] *** Laverne has joined #archiveteam [13:15] *** Jonimus has joined #archiveteam [13:17] *** lytv has quit IRC (Read error: Operation timed out) [13:18] *** mhazinsk has joined #archiveteam [13:22] *** dserodio has joined #archiveteam [13:32] *** nwf has joined #archiveteam [13:36] *** MMovie has quit IRC (Read error: Operation timed out) [13:37] *** MMovie has joined #archiveteam [13:43] *** Zei-Pii has joined #archiveteam [13:54] *** MMovie has quit IRC (Read error: Operation timed out) [13:55] *** jut has joined #archiveteam [13:55] *** MMovie has joined #archiveteam [14:00] *** bzc6p has joined #archiveteam [14:03] yipdw chfoo: The tracker has been down since today 05:53 UTC [14:03] Any idea when it will be back? [14:04] When one of the aforementioned tracker operators have a chance to look at it. 
[14:06] *** WinterFox has quit IRC (Remote host closed the connection) [14:09] *** db48x` has joined #archiveteam [14:10] *** atomotic has joined #archiveteam [14:21] *** bzc6p has left [14:24] *** philpem has joined #archiveteam [14:25] *** MMovie has quit IRC (Read error: Operation timed out) [14:27] *** MMovie has joined #archiveteam [14:27] *** Malik has joined #archiveteam [14:28] hello [14:29] What about the site http://modsonline.com [14:30] *** Sk2d has joined #archiveteam [14:31] * Malik slaps arkiver around a bit with a large fishbot [14:31] *** Malik has quit IRC (Client Quit) [14:36] *** Sk1d has quit IRC (hub.se irc.du.se) [14:37] *** tomwsmf-a has joined #archiveteam [14:42] *** xmc has joined #archiveteam [14:45] sorry for the outage, tracker should be up [14:47] *** MMovie has quit IRC (Read error: Operation timed out) [14:49] *** MMovie has joined #archiveteam [14:52] *** Sk2d is now known as Sk1d [14:58] *** plk has joined #archiveteam [14:59] *** Boppen has joined #archiveteam [14:59] *** plk has quit IRC (Client Quit) [15:01] *** LastNinja has joined #archiveteam [15:04] actually, doubling the ram on it, brb [15:10] *** philpem has quit IRC (Ping timeout: 260 seconds) [15:10] *** zerkalo has joined #archiveteam [15:13] *** MMovie has quit IRC (Read error: Operation timed out) [15:14] *** MMovie has joined #archiveteam [15:26] anyone want a stupidly rare old nick tape, that i can't seem to rip for the life of me [15:27] it's a cassette, so i feel extra stupid [15:29] i'm getting feedback, huge bursts of static on the line, inconsistent volume, basically everything that can screw up a recording, all in this one stupid tape! [15:31] *** megaminxw has quit IRC (Quit: Leaving.) [15:31] decaying remains of an alleged former tape, it sounds like. [15:34] *** jut has quit IRC (Read error: Operation timed out) [15:34] snape: seriously i will mail this thing to you [15:35] i've been googling, it doesn't seem to exist anywhere, i just want it saved. 
[15:37] Unfortunately, I know next to nothing about video transfer. :/ [15:37] snape: audio cassette [15:39] *** MMovie has quit IRC (Read error: Operation timed out) [15:39] *** jut has joined #archiveteam [15:40] *** MMovie has joined #archiveteam [15:40] *** VADemon has joined #archiveteam [15:44] SketchCow: do you still accept donations by mail [15:51] *** arkiver2 has joined #archiveteam [16:09] *** MMovie has quit IRC (Read error: Operation timed out) [16:10] *** MMovie has joined #archiveteam [16:13] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [16:22] *** MMovie has quit IRC (Read error: Operation timed out) [16:23] *** MMovie has joined #archiveteam [16:23] http://www.friendsreunited.co.uk finally gone :( [16:27] :( I thought they would have at least pulled the plug at midnight [16:28] HCross - did you manage to grab any of the site? [16:28] LastNinja, most of .co.uk is down [16:28] <arkiver> We actually got a lot more than expected [16:28] <arkiver> almost all .co.uk groups [16:28] <arkiver> and some discussions [16:28] <arkiver> the first page of every discussion thread is also already saved in the group grabs [16:29] wow, that is great :) [16:29] "We are opening a new free service called Liife.com" - estimated lifetime, 15 years [16:29] I have been working on it too, being quite selfish and getting the groups that were relevant to myself and family [16:29] If their site was a bit faster we would have saved everything [16:29] HCross: you mean liifetime [16:30] ye [16:30] yeah, I found it to be really slow too [16:31] LastNinja, we were hitting them hard [16:31] by "them" I mean their single windows server [16:32] how long had you guys been at it for? I mentioned it to SketchCow when i saw the announcement, but have been travelling around so didn't get the chance to come in here and exchange notes =) [16:33] pretty much since the announcement [16:34] Was it really a single Windows server? 
[16:34] great :) I went manually first to get full size images from assetstorage.co.uk for the groups i was interested in, which didn't take long. I had problems getting the fullsize images though [16:34] jut, it was, and it was in the top of scotland and slow [16:34] problems with ripping the size and getting the fullsize [16:35] can't type today! problems ripping full size images i should say [16:35] jut: I think they mentioned at one point the site was running on Classic ASP code nobody really understood anymore which is why they were shutting the site down [16:35] LastNinja, you should have come in and told us, then we could have added them [16:37] i have got the ones from my groups - it was that bloody 'fullscreen' feature they had - not a direct link so my automation missed it, hence the manual approach [16:39] well, now I'm here, I shall hang around and see what use I can be in the future =) [16:39] You can always run a Warrior VM instance [16:40] http://archiveteam.org/index.php?title=Warrior [16:41] *** JesseW has joined #archiveteam [16:57] *** MMovie has quit IRC (Read error: Operation timed out) [16:59] *** MMovie has joined #archiveteam [17:01] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) [17:03] *** bsmith093 has joined #archiveteam [17:05] SketchCow:is there any way to change the identifier of an item? I accidentally mixed up id and title https://archive.org/details/Side1_20160226 [17:06] You can do that by editing the metadata via the website, I believe. [17:06] No [17:07] Not the identifier as far as I know [17:09] Tell me what it was and what it should be and I can do it. [17:09] snape is wrong but oh so cute. [17:09] https://archive.org/details/Side1_20160226 that item, switch the id to the title. thanks [17:10] SketchCow: https://archive.org/details/Side1_20160226 that item, switch the id to the title. 
thanks [17:10] * snape is usually wrong before noon, alas; not a morning person [17:11] * LastNinja slides snape a bucket of coffee [17:12] * bsmith093 slaps snape repeatedly across the face "wake up already" [17:13] *** JesseW has quit IRC (Ping timeout: 252 seconds) [17:13] CCP, the makers of the game EVE Online, announced yesterday afternoon that [17:13] they were taking their game wiki offline as of Monday, February 29th. [17:13] Announcement URL: [17:13] http://community.eveonline.com/news/news-channels/eve-online-news/evelopedia [17:13] -shutdown-2016-02-29-09-00/ [17:13] Wiki Location: https://wiki.eveonline.com/en/wiki/Main_Page [17:13] This is annoying for a number of reasons, not least of which is the very [17:13] short notice being given. There are a lot of articles that contain lore and [17:13] * bsmith093 gives SketchCow an internet high five [17:13] historical information about the game on the wiki that are not written up [17:13] any place else. They have given players an option to download a .sql file [17:13] that has some of the data, but given its small size, it doesn't seem like [17:13] there is much data there. Also, not sure what most people are going to be [17:13] able to do with a .sql file. [17:13] The Eve wiki is already in ArchiveBot [17:13] I went to the Internet Archive and checked the Wayback Machine to see if it [17:13] was archiving the Wiki. It has been... sort of. It has saved the [17:13] organizational hierarchy, but when you drill down to actual articles, those [17:13] Emergency Wiki grab. Who can do it. [17:14] bsmith093: So you want the ID to be NickSongsFromthe90s? [17:14] SketchCow:yes, please [17:14] wiki is loading very slow for me [17:14] isn't it in ArchiveBot [17:14] yes it is [17:15] bsmith093: https://archive.org/details/NickSongsFromthe90s [17:15] with external links though [17:15] SketchCow:thanks [17:15] Is anyone grabbing the eve wiki yet? 
If not, I'll get a grab started [17:16] It's in archivebot, but a standalone grab would probably be a good idea too [17:16] Push it to the front. Full priority. [17:16] i'm having a go at grabbing the eve wiki right now [17:16] SketchCow: worth it to hit it with wikiteam tools too? [17:17] but I dont trust myself to make a good job of it :) [17:17] xmc: The Software Heritage Institute of France would like a copy of Gitorious. I would suggest you give them one. [17:17] Yes, I want a Wikiteam grab here. [17:17] We want nothing left to chance. [17:17] already in my email :) [17:18] LastNinja: you should probably use the mediawiki ignoreset if you're using grab-site, or otherwise use the same regexes it uses to reject useless URLs: https://github.com/ArchiveTeam/ArchiveBot/blob/master/db/ignore_patterns/mediawiki.json [17:19] Mediawikis generate an exponentially huge number of largely useless derived pages without aggressive trimming [17:19] LastNinja: What URL are you using for the API? [17:19] xmc: Great [17:24] MrRadar thanks :-) I shall check that once i get back from dinner. Thanks for the pointer [17:25] SketchCow: Bad news, looks like they hide their API from the world, which is stupid. [17:25] yeah, we can't do a grab with the wiki external URLs grab [17:26] phuzion: so wikiteam also can't do a grab right? [17:26] arkiver: Unless I can figure out a way to hit the API [17:27] Hitting the API gets a 500 Internal Server Error [17:28] Soooooooooo, I dunno what to do short of archivebotting the thing, since wikiteam tools require API access. [17:30] Do what it takes however it takes. [17:36] *** bauruine has quit IRC (Ping timeout: 260 seconds) [17:37] *** noahc_ has joined #archiveteam [17:37] http://warctozip.archive.org/ is down [17:38] And has been for over a year. [17:38] I'm going to get it back up [17:38] Thanks! [17:38] Thanks so much @SketchCow [17:50] @SketchCow, Do you have a rough ETA? If not, that's cool too. 
[17:50] It's sitting in my inbox and I'm dumb as shit [17:50] So no idea. [17:50] *** MMovie has quit IRC (Read error: Operation timed out) [17:51] *** MMovie has joined #archiveteam [17:52] Oh, okay. Thanks. [17:56] *** zerkalo has quit IRC (Quit: leaving) [17:57] *** zerkalo has joined #archiveteam [18:00] SketchCow, copy of https://www.youtube.com/user/AlJazeeraAmerica/videos coming your way soon [18:02] Thanks [18:08] How much of Friends Reunited did we get? [18:09] We got most of the groups of the .co.uk domain [18:09] we did not get single users [18:10] *** noahc_ has quit IRC (Ping timeout: 255 seconds) [18:11] though most (all maybe) of the pictures and comments are made in groups by users and we did get all of that [18:19] *** bauruine has joined #archiveteam [18:22] I won't make hay of it [18:22] *** MMovie has quit IRC (Read error: Operation timed out) [18:23] *** MMovie has joined #archiveteam [18:25] If the website didn't run on an old slow server we could have saved more [18:35] *** MMovie has quit IRC (Read error: Operation timed out) [18:37] *** MMovie has joined #archiveteam [18:38] It would be nice if we could get an archive of BetaArchive [18:38] They have 24.32 TB of software [18:39] Unfortunately, mirroring them is a pain in the ass [18:40] You have to be registered and have enough forum activity to gain FTP access [18:40] And once you do, there's a 50GB/day limit on how much you can download [18:41] Not to mention BetaArchive accounts are IP locked [18:42] Reach out to them about mirroring at Internet Archive [18:42] Front door [18:50] Also, a lot of their "abandonware" is still we're hosting at the Archive [18:53] *** MMovie has quit IRC (Read error: Operation timed out) [18:54] *** MMovie has joined #archiveteam [19:10] *** MMovie has quit IRC (Read error: Operation timed out) [19:11] *** MMovie has joined #archiveteam [19:29] *** MMovie has quit IRC (Read error: Operation timed out) [19:30] *** MMovie has joined #archiveteam [19:47] *** 
MMovie has quit IRC (Read error: Operation timed out) [19:48] *** MMovie has joined #archiveteam [20:01] *** bwn has quit IRC (Ping timeout: 246 seconds) [20:21] *** MMovie has quit IRC (Read error: Operation timed out) [20:23] *** MMovie has joined #archiveteam [20:32] *** bwn has joined #archiveteam [20:40] *** MMovie has quit IRC (Read error: Operation timed out) [20:42] *** MMovie has joined #archiveteam [20:45] *** philpem has joined #archiveteam [20:47] *** jut has quit IRC (jut) [20:58] *** MMovie has quit IRC (Read error: Operation timed out) [21:00] *** WinterFox has joined #archiveteam [21:00] *** MMovie has joined #archiveteam [21:17] *** MMovie has quit IRC (Read error: Operation timed out) [21:18] *** MMovie has joined #archiveteam [21:20] *** schbirid has quit IRC (Quit: Leaving) [21:28] *** Riviera has joined #archiveteam [21:35] *** godane has quit IRC (Quit: Leaving.) [21:35] *** godane has joined #archiveteam [21:46] *** MMovie has quit IRC (Read error: Operation timed out) [21:46] *** MMovie has joined #archiveteam [21:49] *** icedice has joined #archiveteam [21:49] http://blog.8tracks.com/2016/02/12/a-change-in-our-international-streaming/ [21:49] *** megaminxw has joined #archiveteam [22:00] *** MMovie has quit IRC (Read error: Operation timed out) [22:02] *** MMovie has joined #archiveteam [22:02] *** LastNinja has quit IRC (Quit: byeeee) [22:11] *** JetBalsa has joined #archiveteam [22:16] *** ndiddy has joined #archiveteam [22:29] *** MMovie has quit IRC (Read error: Operation timed out) [22:30] *** MMovie has joined #archiveteam [22:34] *** LastNinja has joined #archiveteam [22:35] *** Boppen has quit IRC (hub.se irc.du.se) [22:45] *** MMovie has quit IRC (Read error: Operation timed out) [22:46] *** MMovie has joined #archiveteam [22:59] *** Elegance has joined #archiveteam [23:01] *** Elegance has quit IRC (Client Quit) [23:06] *** Elegance has joined #archiveteam [23:09] *** Elegance has quit IRC (Client Quit) [23:16] *** MMovie has quit 
IRC (Read error: Operation timed out) [23:17] *** MMovie has joined #archiveteam [23:35] *** MMovie has quit IRC (Read error: Operation timed out) [23:36] *** MMovie has joined #archiveteam [23:39] *** metalcamp has quit IRC (Ping timeout: 252 seconds) [23:53] *** MMovie has quit IRC (Read error: Operation timed out) [23:55] *** MMovie has joined #archiveteam