[00:48] What.
[03:35] heya SketchCow i wanna chat about habitat when you get time
[03:35] we're probably gonna want to bring you in on it
[05:58] VonGuard: You have the most adorable view of how IRC works
[06:10] SketchCow: when you have time, I'd love to hear your SAA14 talk
[06:10] the twitters have made me interested
[06:13] lol
[06:13] SketchCow, i just figure irc is low priority
[06:13] which is where i wanna be with this request
[06:14] so, do you know about habitat and our work on it?
[07:04] No.
[07:43] 43 minutes later, still no idea what habitat is or the work that has been done on it tho :)
[07:46] more like 4 hours
[07:54] I once had a habitat
[07:54] but then the gerbils died
[11:08] Is someone here able to scrape google, yahoo and/or bing results?
[11:08] I updated these wiki pages: http://archiveteam.org/index.php?title=Swipnet
[11:09] http://archiveteam.org/index.php?title=Verizon_Personal_Web_Space
[11:09] Created scripts and got list of sites from wayback machine, scripts will be checked by chfoo when there is more time
[11:09] if someone is able to scrape results from google, yahoo and/or bing, please let me know
[11:15] Arkiver2: the only experience i had is with https://gist.github.com/nemobis/7718061
[11:17] Nemo_bis: would you be able to do that same thing on google with the keyword: site:home.swipnet.se ?
[11:18] is it possible then to get all the 100.000's of results?
[11:21] no
[11:21] dunno
[11:36] Arkiver2: https://archive.org/details/home.swipnet.se-w-20001-to-26000-20140726, https://archive.org/editxml/home.swipnet.se-w-26000-to-30000-20140726-00000.warc, https://archive.org/details/home.swipnet.se-w-30001-to-39999-20140726
[11:36] tephra: thank you!
[11:37] I'll exclude those items from the list later
[11:37] but
[11:37] Arkiver2: first uploads of the ~w-[0-9] users, more to come soon. Will upload more when i get back later and start with google also
[11:37] turns out it is not 5 digits
[11:37] there are 6 digits
[11:37] so it's likely going to be a warrior project
[11:37] since there are 6 digits
[11:38] there is ~w- as well as -w-
[11:40] hmm crap! :P Well I have 26000-99999
[11:40] you got all 26000-99999 already? awesome!
[11:40] please keep me informed and when they are uploaded I'll delete them from the list of items
[11:42] yes
[11:44] good day:) silly questions maybe but does archive.org actually save any POST answers / does any CDN cache any?
[11:46] tephira: the name should have been https://archive.org/details/home.swipnet.se-w-26000-to-30000-20140726
[11:46] not
[11:46] https://archive.org/details/home.swipnet.se-w-26000-to-30000-20140726-00000.warc
[11:47] *tephra
[11:47] ^
[12:01] http://hulkfile.eu/
[12:01] http://torrentfreak.com/hulkfile-shuts-down-following-expendables-3-lawsuit-140813/
[12:01] "The company informs TorrentFreak that it has disabled access to all visitors from the United States and that it intends to shut down globally during the coming days."
[12:03] https://www.google.nl/search?q=site%3Ahulkfile.eu&oq=site%3Ahulkfile.eu&aqs=chrome..69i57j69i58.4535j0j7&sourceid=chrome&es_sm=93&ie=UTF-8
[12:03] well, that sucked
[12:03] I can still access it here, how do I make a warc?
[12:04] you mean the site itself?
[12:04] actually, half the assets don't load
[12:04] and there isn't any kind of message
[12:04] so there is probably already an old version of the frontpage
[12:04] archivebot can do it, just need a europe pipeline
[12:10] midas: http://www.chillingeffects.org/notice.cgi?sID=1301208 is still available
[12:10] contains a hulkfile link
[12:10] what about trying to discover as many hulkfile links as possible and warc those?
[12:14] tried some filesearchers, none search hulkfile also
[12:17] anything listed in chillingeffects documents will already be gone
[13:09] https://i.imgur.com/TuKkqnV.jpg
[13:15] +1 factual
[13:28] how do I get voice on #archivebot ?
[13:35] I want to help with the archiving of Ferguson news
[13:58] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[13:58] ... I feel like a bot should have responded by now. Or I missed the joke in that test. <_<
[13:58] yahoosucks
[13:59] thanks.
[13:59] am bot do not understand wordy words
[15:52] godane: yes i know but could not figure out how to change after, sorry
[15:52] ok
[17:26] ok sketchcow, well, we should chat on the phone or in some chat window somewhere about Habitat
[17:26] We're bringing it back
[17:26] and it's a freaking nutty big project
[17:26] that requires us to also bring back Qlink, so any help finding Qlink people would be very much appreciated
[17:37] habitat the video game?
[17:37] neat
[18:14] yeah the c64 mmo
[18:28] what is verizon taking down? personal web space, what kinda files?
[19:49] ohhdemgir: personal web space
[19:50] members.bellatlantic.net
[19:51] aka mysite.verizon.net
[19:51] they're both the same
[19:52] but the name "bell atlantic" should tell you just how old it is ;)
[19:52] small sites of users
[19:52] I'm working on making it a warrior
[19:52] nice
[19:52] 10mb max per site
[19:52] so not big at all
[19:52] should be a blip compared to twitch
[19:52] if you would like to help
[19:52] http://archiveteam.org/index.php?title=Swipnet
[19:52] take a look at progress
[19:53] we still need lists from yahoo
[19:53] oh?
[19:53] for the keyword site:home.swipnet.se
[19:53] The fifth list will be taken from Bing Search results with the keyword: site:home.swipnet.se. (User:Mithrandir is downloading a list of links retrieved from Bing.)
[19:53] The fourth list will be taken from Yahoo Search results with the keyword: site:home.swipnet.se.
[19:53] The third list will be taken from Google Search results with the keyword: site:home.swipnet.se. (Someone was planning this already?)
[19:54] google is being worked on as far as I know, bing is being taken care of
[19:54] just yahoo
[19:54] I thought it was good to make a progress list so people could actually see what they can do for it
[19:55] other than running a warrior and see where the project stands right now
[21:57] google script is running, had to resort to using selenium 'cause google kept blocking me. Hopefully this will work better
[22:20] tephra: thank you! I'm going now, please keep me informed