[00:38] Anyone home?
[00:38] Hello
[00:38] hey
[00:38] I have a weird question
[00:38] how hard would it be to back up a private forum?
[00:39] I'm not the one to ask, but I would assume pretty easy.
[00:39] I'm talking about What.cd for the record
[00:39] Ton of great discussion on there
[00:39] and it would be pretty great to have that backed up
[00:40] Do you have any invites left for What.CD?
[00:40] I have a few
[00:40] but I'm not really comfortable giving them out
[00:41] But the staff have mentioned how the funds to run the site are getting thinner every day
[00:41] No worries
[00:41] So it might be wise to put it on deathwatch
[00:41] Though it seems to be getting the necessary funds to keep running
[00:42] but who knows when the site will go belly up
[00:42] What is probably my favorite site on the net and I'll probably be depressed for a few days when it does die
[00:42] Recently, so many sites have been taken down
[00:42] I was devastated when underground gamer got taken down
[00:42] same
[00:42] I was depressed
[00:43] Are there an ]y
[00:43] any archives of ug?
[00:43] idk
[00:43] the staff mentioned scorching the servers
[00:43] not sure if they have personal archives
[00:43] wow
[00:44] Not literally
[00:44] I know
[00:44] but just think about all the time and effort
[00:44] and energy
[00:44] lol
[00:45] yeah
[00:45] I think archiving the What.cd forums should be a big undertaking
[00:45] I'm not sure what the forums run on
[00:46] gazelle like the rest
[00:46] but the source for gazelle is out there, so it shouldn't be too hard to figure out how to reverse engineer it
[00:47] what.cd is an awesome tracker indeed
[00:48] how do you get invites
[00:48] either know someone who has them
[00:48] or take the interview
[00:48] I did the latter
[00:48] ahh okay
[00:49] on an unrelated note
[00:49] I'm running Warrior and it keeps giving me "Tracker rate limiting is active." "errors" on the Blip project
[00:49] Me too
[00:53] Blip tracking is down (I don't know why or how to fix it)
[00:53] ok
[00:54] What.cd is gazelle, the source is neutral leech on the site
[00:54] right
[00:54] poke around the source, if nothing else ask the DT (Development Team) for help
[01:02] source is on github
[01:02] dev team may not help because of privacy concerns
[01:02] yeah
[01:02] source: https://github.com/whatcd/gazelle
[01:04] I have no idea what I'm looking at
[01:05] I really want to help, but I have no idea what to do
[01:20] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[01:24] Stormaged: "yahoosucks"
[01:29] Thank you ivan
[01:49] ivan`, are you awake?
[02:02] joepie91_: barely
[02:03] ivan`: right.. seems like claims page is still broken due to unicode bug
[02:03] so I can't clear claims
[02:03] I don't have shell or global admin on the tracker
[02:03] ah
[02:03] right, then I'll ask someone else when I wake up
[02:03] thanks anyway :)
[02:13] can someone do a proper capture of https://www.facebook.com/isoHunt ?
[02:13] iramari's WarcProxy doesn't support https
[02:13] wow people still use my WarcProxy?
[02:15] heh, i've never had to use it before, but it's the only thing that comes close
[02:16] sadly HTTPS is its one weakness
[02:17] chfoo: would you use it from Firefox or chrome?
[02:17] id just manually click on show all posts
[02:17] in Firefox?
[02:17] yeah
[02:18] you can try WarcMiddleware, or wget
[02:18] the page is basically just javascript
[02:19] flashfreeze I think uses a headless js client
[02:19] not sure if it would work though I'm afraid
[02:23] i do have the "save page as" version though..
[02:23] does the warc-proxy viewer display https properly?
[02:25] i'm not sure.
[02:25] I would like to thank you all for your efforts, and I'm here to make a proactive offer of an archive project. I run the webservers over at PCGamingWiki.com, and I would like to see it archived before anything were to ever happen to it
[02:26] and yes, I just spun up a Warrior on our project box
[02:30] JRWR: wikiteam would graciously accept your offer. can you stick around until someone from wikiteam answers?
[02:31] Yes
[02:31] https://scontent-b-lga.xx.fbcdn.net/hphotos-ash4/p480x480/408010_10151222815119812_736256458_n.jpg
[02:32] JRWR: there's also the #wikiteam channel if you haven't joined it yet.
[02:34] awesome
[03:12] JRWR: weren't you involved with the Valve ARGs?
[03:13] Yes I was!
[03:13] Wow thats some old history for me
[03:13] me too, but
[03:13] * tmesis went by CheeseGamer then
[03:13] I kinda remember you :)
[03:13] I run PCGamingWiki.com now
[03:14] oh, that's awesome.
[03:14] hello.
[03:14] nice nice, long time no see. i'm actually still opped in the #valvearg channel for some reason haha
[03:14] lol
[03:14] that was a long long time ago, and it was hella fun
[03:14] I wish there were more games that did this
[03:14] 0x10c arg was fun
[03:15] yeah, I assume valve'll have another one for their next game
[03:15] at least if it's a puzzle game
[03:15] I will be right on top of that and PCGamingWiki will be happy to sponsor the wikis related to it
[03:16] and me and my low-life +o crew can't wait to deal with the chaos that ensues after several thousand people join a channel and want to be heard
[03:16] oh and if you guys ever need help with the backend work of the Archive Team's Wiki, just let me know, I know all the tricks to have them survive under high loads
[03:16] oh god
[03:17] HL3 ARG will be the server crasher of EFNET
[03:17] (FYI freenode next time)
[03:17] yeah, we should shift to freenode
[03:17] gamesurge flat out broke during the e3 announcement of P2
[03:17] lol
[03:18] freenode is able to handle the big stuff
[03:23] I think I shall invest some time into the AT Project, it's a great little project
[06:21] working on IH archive?
[07:52] aww "scheduled maintenance"
[08:04] FYI
[08:04] other tracker admins are probably needed in #isoprey
[08:04] that project is one of the highest loads that software has seen in a while
[08:30] yes
[08:36] jynx: yipdw: Isoprey just shut down their fucking servers...
[08:41] https://i.imgur.com/LqxUuVg.png
[08:45] argh, fucktards
[08:45] Everything is worthy of archivation >:(
[08:45] A german community called qype is also being shut down... how much work is needed to back it up?
[08:46] Seems gone already? It redirects to yelp.co.uk.
[08:46] we can always find their torrents in DMCA notices :P https://encrypted.google.com/search?q=isohunt+site%3Achillingeffects.org
[08:47] Yeah, but the point is partly to get it from the "source".
[08:49] in germany it's still alive, there is a little notice that it will close on 30.
[08:53] ersi: did you try qype.de?
[08:55] Yes.
[08:58] for me it's up and running...
[09:01] http://www.dopplr.com/
[09:02] Cameron_D: Is there public stuff available? Remember it as a travel planning system
[09:02] There seems to be a little bit of stuff, but not much
[09:07] okay... looks like there is not that much interest in qype ;-)
[09:08] Is there anything left of qype to grab, SilSte? From what I can see it has already been absorbed into yelp
[09:09] (reads back) oh, I see
[09:10] ne brr
[09:10] Bwaah
[09:10] :3
[09:10] :3
[09:15] Please stop that. Do that in #archiveteam-bs if you still want to do that
[09:17] Bwaah
[09:21] antomatic: I can still find a lot
[09:22] antomatic: And there is only the option to migrate data... doesn't look like all will be taken
[09:22] antomatic: and however the comments etc. are not transferred...
[09:25] antomatic: what do you see?
[09:26] Ah, I was looking at qype.co.uk - now redirects to yelp.co.uk
[09:27] for me it looks like http://s14.directupload.net/images/131021/bentv744.png
[09:27] it says "Achtung Qyper! Hast du Bewertungen geschrieben oder Fotos hochgeladen? Klick hier um sicherzustellen, dass deine Inhalte in dein neues Profil bei Yelp übertragen werden"
[09:27] brr ;3
[09:28] "Attention Qyper! Did you write reviews or upload photos? then click here to make sure that your profile is transferred to Yelp"
[09:28] For me this sounds like a lot of stuff will be deleted
[09:33] I'm really unfamiliar with setting up a tracker / stuff like that... if someone could do a quick review of whether it's doable or not that would be nice ;-)
[09:33] taking a shower now
[09:37] brry, jynx: Stop that in here.
[09:54] Bwaah
[10:05] hi there, the warrior appliance isnt available for ESX is it?
[10:06] midas__: we don't have an appliance for directly loading, however you may be able to get it to load in ESXi.
[10:06] Smiley: yeah figured that out, just was checking that i wasnt reinventing the wheel
[10:06] :p
[10:07] ill rebuild it for ESX(I) and if you want i can put a download link up
[10:07] you can do
[10:07] however if you're running esx then you might as well just run the scripts...
[10:08] scripts? :p you are telling me you can install the warrior from source?
[10:08] :)
[10:09] http://archiveteam.org/index.php?title=User:Djsmiley2k#Build_your_own_EC2_ami.2Finstance
[10:09] and back to building it from source :-p
[10:29] what's that free (lowest tier) instance type mentioned in the page?
[10:31] just the "free" one on EC2
[10:31] got a star by it
[10:31] https://aws.amazon.com/free/
[10:32] micro instances.
[10:33] 750*12 hours
[10:33] Nemo_bis: you still pay for the data.
[10:38] Smiley: you mean beyond this? "The AWS free tier includes 5 GB of Amazon S3 standard storage, which offers the highest Amazon S3 durability." https://aws.amazon.com/free/faqs/
[10:38] no no
[10:38] the data moving in/out
[10:38] that costs you
[10:40] ah, over the "15 GB of bandwidth out aggregated across all AWS services" ?
[10:50] hi. is it possible to run a warrior without a vm?
[10:54] You can probably convert the VM to a standard install
[10:54] hiker2, preferably without needing to extract everything from the vm.
[10:55] How would you run the warrior then?
[10:56] hiker2, i would like to just run the "app" on my linux servers.
[10:56] oh. I don't know how. I've never actually used the warrior. Perhaps someone else could answer
[10:58] midas__: vmware converter can import the VM into ESXi
[10:59] midas__: that's how I did it. although you need to remove something from the config file first because it causes an error
[10:59] trs80: oh ill check that out
[11:04] midas: you'll also want to realign the disks if you're running on 4kb blocksize
[11:15] undersco2: ping mw when you are alive
[11:40] trs80: made it myself a bit more practical
[11:41] uploaded to san, cloned the vmdk's to the right version and added them to new vm
[11:41] saves time and no worries about misalignment
[11:42] i tried to convert the vmdk to qcow2 to use it with kvm / libvirt. but i get the error "qemu-img: 'image' uses a vmdk feature which is not supported by this qemu version: VMDK version 3"
[11:45] midas: yeah, once I had one I just cloned from it
[11:45] I think the other changes I made to the VM were more RAM and disk
[11:45] the vm should be aligned actually, it's debian 6
[11:49] it is a tad slow on the downloads
[11:56] trs80: is there a way to speed it up a bit? :P have 1Gbit and need to burn off some data
[11:57] bauruine: yes, see http://www.archiveteam.org/index.php?title=User:Djsmiley2k#Build_your_own_EC2_ami.2Finstance
[11:57] midas__: what project you running?
[11:57] it depends on the site you're pulling, but none will max 1Gbit
[11:57] Smiley: blip right now
[11:58] it's just saying tracker rate limiting is active :p
[11:58] Smiley, thanks!
[11:59] midas__: then we are at the point of breaking the site entirely.
[11:59] got it
[11:59] the tracker monitors the site as much as possible
[11:59] as we've taken sites out before.
[12:45] Smiley, is there any project that needs some resources?
[12:45] midas__, you can always run a tor relay to burn some bandwidth :-)
[12:46] Feel free to use the #warrior channel for discussions specific to the ArchiveTeam Warrior. (Not bashing or shaming anyone. Just a friendly informational announcement)
[12:47] ersi: thnx :p
[12:47] bauruine: already running ;)
[13:16] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[13:17] Pichu0202: "yahoosucks"
[13:17] Thanks
[14:00] :( Also got a couple of TB to spare for the upcoming months :)
[14:08] We do have a blip.tv project sort of running, it's currently on-hold/paused - but that can use some bandwidth/storage. Project channel is #blooper.tv
[14:15] aight
[14:17] Feel free to hang around though, things happen here and there at varying speed :)
[14:23] Initiating Self Destruct
[14:23] This is it. We are shutting down isoHunt services a little early. I'm told there was this Internet archival team that wants to make historical copy of our .torrent files, I'm honoured that people thinks our site is worthy of historical preservation, but the truth is about 95% of those .torrent files can be found off Google regardless and mostly have been indexed from other BitTorrent sites in the first place. So I might as
[14:23] well do a proper send-off to you dear isoHunt users, before final shutdown sequence on Tuesday. It's been an adventure in the last 10.5 years working on isoHunt, a privilege working with some of the smartest guys I've worked with, and my life won't be the same without it. For what I'm working on next, please look up my blog on Google and follow me there. Because as the Terminator would say with a German accent,
[14:23] -Gary Fung
[14:23] I'll be backkk.
[14:23] joepie91_: ^
[14:23] Yes, it's been said. Thanks for posting the whole message. :-P
[14:24] balrog: yup :(
[14:24] posting the whole message means it gets logged here :P
[14:50] any way to tell the chap that it wasn't really the .torrents we were most interested in, but the metadata around them? probably too late now...
[14:53] TorrentFreak may come back for a second story, so there could be a chance.
[18:20] M1das: :D
[18:20] how about backing up www.blackberry.com and related sites?
[18:21] as they might go bankrupt in the next months
[18:21] M1das: i need to get an android phone soonish then :P
[18:21] * M1das waves at Stary2001 :p
[18:21] most likely, yes :p
[18:22] Blackberry has one phone i really would like to own
[18:22] M1das: which is? i have a Curve :/
[18:22] http://us.blackberry.com/smartphones/blackberry-p9981.html
[18:23] it is a 2000 dollar phone tho :p
[18:23] hahahaha. M1das
[18:57] My ISP is silly and has bandwidth caps. would love to get as close to 300GB/mo as possible each month :)
[19:00] * phillipsj does not know how much of that he is going to be using for Bitcoin related stuff yet.
[19:03] hehe
[19:19] phillipsj: be adventurous run a Tor exit node :)
[19:22] what kind of 3rd world country has bw caps?
[19:30] Please take "off-topic" (ie not related to archiving) talk to #archiveteam-bs
[19:30] Just a friendly reminder.
[21:00] Where does the Warrior upload its results?
[21:00] *its results to?
[21:02] There is a tracker server
[21:03] e.g. http://tracker.archiveteam.org/isoprey/
[21:03] Is the data uploaded to the tracker server?
[21:03] Each project usually has one or more rsync targets where finished items are uploaded to
[21:03] The tracker informs the clients of which one, or ones, to use for each item
[21:04] Who hosts the rsync targets?
[21:04] the target servers crunch and collate the uploads, then upload in beefy chunks to archive.org, etc.
[21:04] hiker1: anyone and everyone, although there are usually some used more than others
[21:05] in isohunt's case I think joe used his own servers for speed
[21:07] FOS is a common rsync target, maintained by the mighty Jason 'textfiles.com' Scott.
[21:07] okay, so everyone uploads their results to joe's server using rsync, then his server merges data, chunks it, then uploads to archive.org?
[21:07] depends on the individual project
[21:07] I think that's the plan, hiker1.
[21:07] Using rsync, are the files uploaded in the warc.gz format? Or some other format?
[21:08] .warc.gz is popular, especially for big web scrapes. In other cases (e.g. blip.tv) sometimes just the raw files are uploaded as-is. (e.g. mp4, mov, ogg, etc)
[21:09] How does the rsync target merge and chunk the warc.gz files?
[21:10] That part is outside my knowledge - but in general they get collated into 'megawarcs' of about 50gb in size
[21:10] think "cat"
[21:11] Look at this for example: http://archive.org/details/archiveteam_posterous
[21:11] plenty of 50gb .tar files to download there. :)
[21:11] * antomatic thinks of cats.
[21:11] Smiley: Can warc.gz be merged with cat?
[21:11] hiker1: no
[21:11] .gz
[21:12] Can .warc be merged with cat?
[21:14] hiker1: we have scripts for all this on the archiveteam github
[21:14] https://github.com/ArchiveTeam/megawarc
[21:15] https://github.com/chfoo/warcat
[21:15] http://www.archiveteam.org/index.php?title=The_WARC_Ecosystem
[21:18] DFJustin: thanks
[21:18] .warc can be merged with cat I believe but if you naively gzip the result you lose the special effort .warc.gz files have to match the gzip chunk boundaries with the warc data inside to make it nice and easy to index
[21:19] the wayback machine serves content straight out of .warc.gz files in the archive.org storage system
[21:19] how quickly do uploaded .warc.gz files make it into the wayback machine?
[21:20] I don't know the exact schedule but it seems to be within a few weeks
[21:20] question: does archiveteam have anything to do with archive.org?
[21:20] archive.org generously gives us hosting and our mascot jason scott is an archive.org employee but otherwise we do whatever we want
[21:21] ah
[21:21] figured i'd lurk here for a while, maybe help if something comes up
[21:21] they generously give everyone hosting though
[21:21] got my box set up for isohunt last night
[21:22] zifnab: too late now
[21:22] appreciated, zifnab. hopefully there'll be another project along shortly. :)
[21:22] why is it too late now?
[21:22] oh i know :)
[21:22] hiker1: site went down
[21:22] :O
[21:23] he's dead jim
[21:23] [We really should try to ALWAYS have at least one project running at any given time, IMHO]
[21:23] i got a whole 136 items in last night :(
[21:24] I said before, but Warhammer Online is shutting down. It might be worth grabbing a copy of the website + gigantic forum, and maybe even a patched copy of the client
[21:24] my guess is they'll gut the forum when it goes
[21:24] we did the city of heroes forums before so it's up our alley
[21:25] if you have the client you can throw it into an archive.org item yourself
[21:25] lol we've done all of gamespy and ign.
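The point made at [21:18] about .warc.gz and gzip boundaries can be illustrated with a small Python sketch. It makes no claim about how megawarc or warcat actually work: the records are placeholder byte strings rather than valid WARC records, and the helper names (concat_members, read_member_at) are made up for this example. It shows that gzip members written one per record can be joined by plain concatenation and still be read individually by byte offset, whereas gzipping the joined data in one go leaves only a single member.

```python
import gzip
import zlib


def concat_members(records):
    """Compress each record as its own gzip member and join them.

    Returns the joined bytes plus the byte offset where each member starts.
    """
    offsets = []
    out = bytearray()
    for record in records:
        offsets.append(len(out))
        out += gzip.compress(record)  # one complete gzip member per record
    return bytes(out), offsets


def read_member_at(blob, offset):
    """Decompress only the single gzip member that begins at `offset`."""
    # wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    # The decompressor stops at the end of that member; anything after it
    # is left untouched in decompressor.unused_data.
    decompressor = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    return decompressor.decompress(blob[offset:])


# Placeholder payloads standing in for WARC records (hypothetical data).
records = [b"record one\r\n\r\n", b"record two\r\n\r\n", b"record three\r\n\r\n"]

# "cat"-style merge: the concatenation is still a valid gzip stream.
merged, offsets = concat_members(records)
whole = gzip.decompress(merged)

# Random access: jump straight to the second record without reading the first.
second = read_member_at(merged, offsets[1])

# Naive alternative: gzip everything as one member. Reading "the member at 0"
# now returns all records at once, so per-record offsets are lost.
naive = gzip.compress(b"".join(records))
everything = read_member_at(naive, 0)
```

An index over a file like `merged` only needs to store one offset per record, which is the property the [21:18] message describes as making .warc.gz files easy to index.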
[21:25] 600,000+ posts on the WAR forum
[21:25] However we need someone to take lead and figure it out
[21:26] hiker1: not an issue
[21:26] Smiley: Is it the kind of project that would be done distributed, or just by one person?
[21:26] Warhammer closes on December 18th apparently
[21:26] shit, missed a month already
[21:26] antomatic: yeah, it was previously announced
[21:26] hiker1: distributed
[21:27] here have a forum https://archive.org/details/archiveteam-city-of-heroes-main
[21:27] someone needs to take the warrior code, and figure out how to split up the jobs
[21:27] then load them into a tracker, and blam.
[21:27] Smiley: using a warrior, or just a "you do these I'll do those" approach?
[21:27] hiker1: it's an ideal warrior grab for the forums.
[21:27] I think a patched up to date copy of the client would also be useful for future private server development. I'd back it up myself, but I have very limited bandwidth
[21:28] For the main site, one person can prob easily do it.
[21:28] * Smiley points at github
[21:28] one for archivebot, maybe?
[21:28] What about github?
[21:28] hiker1: all our stuff should be there.
[21:28] ah, right
[21:28] antomatic: need to be able to tell it to ignore the forums, kind of difficult.
[21:28] aah.
[21:29] Smiley: forums are on a different domain, so shouldn't be difficult I think
[21:30] #warslammered then
[21:30] SketchCow: investigation into warhammer online shutdown started - #warslammered
[21:31] Anyone interested, come to channel. Keep this for general ontopic discussion, and off topic to #archiveteam-bs
[21:34] ooh, can you create funny channel names for bebo and wretch as well
[21:36] #beboob
[21:36] #wretchwench
[21:37] lol
[21:37] yeah they'll do
[21:37] xD
[21:40] there's also #yahooblah for yahoo blogs
[21:41] I see we made Torrentfreak
[21:42] Hi,
[21:42] * Smiley needs some more coders. Anyone awake please stop by #warslammered
[21:43] Not sure how to index huge chunks of forums
[21:44] what forum software is it
[21:44] ea... :/
[21:45] nothing on their legal page, so I presume in house.
[21:46] Hi SketchCow, could you create torrents also for these items? https://archive.org/metamgr.php?&w_identifier=Wikimediacommons*&mode=more (notice uppercase W; you "only" made lowercase, these 3 are a mistake I made)
[21:59] SketchCow: once you're done, make sure to catch up with joepie91_; he's done some awesome work; unsure if you've seen ( I'm reading stuff now https://github.com/joepie91/isohunt-grab )
[22:01] ok, now trying lrztar -lD -w 10
[22:11] Nemo_bis: #warslammered
[22:11] go make beautiful
[22:11] * Smiley goes to bed
[22:37] ars technica wrote an article about isohunt's preemptive shutdown due to us:
[22:37] http://arstechnica.com/tech-policy/2013/10/isohunt-shuts-down-a-day-early-to-avoid-becoming-part-of-online-archive/
[22:42] has anyone archived the shutdown message?
[22:49] hahaha
[22:56] http://archive.org/details/TheAdventuresOfSupermanTv1
[22:56] i keep getting emailed by IA saying it's been reviewed and i would like to read it, but it's not there on the page
[22:58] probably spam that gets deleted before you get there
[22:59] oh I see there are reviews but the derive process is hung and that needs to finish before they can go live https://catalogd.archive.org/history/TheAdventuresOfSupermanTv1