[01:06] Ctrl-S: vpsdime is using OpenVZ [01:06] OpenVZ does not have working CPU limits [01:06] only on paper [01:06] I'd cc schbirid but he's gone [01:06] :p [01:27] *** mistym has joined #archiveteam-bs [01:47] *** primus104 has quit IRC (Leaving.) [02:07] *** mst_ has joined #archiveteam-bs [03:16] *** Jonimus has joined #archiveteam-bs [03:44] *** xmc has quit IRC (Ping timeout: 512 seconds) [03:51] *** xmc has joined #archiveteam-bs [03:51] *** swebb sets mode: +o xmc [03:57] *** mistym has quit IRC (Remote host closed the connection) [04:02] *** mst_ has quit IRC (Quit: bye) [04:23] *** mistym has joined #archiveteam-bs [05:16] *** mistym has quit IRC (Remote host closed the connection) [05:50] *** mistym has joined #archiveteam-bs [06:14] is there a place where people are discussing solutions to the metadata problem? [06:16] i'm working with a project that is gathering lots of data about video games, and I'm trying to standardize the way tools and frontends get that metadata [06:16] currently its a mix of "shit hardcoded into a program", "shit scraped from websites of wildly varying quality" and "shit the user told us about this set of data" [06:18] i'm wondering if there are other projects that are working in this space that I should talk to/work with, or if i'm venturing into the unknown [06:35] *** mistym has quit IRC (Remote host closed the connection) [07:23] joepie91_: Do you WANT to breakdown of how that interaction of yours could have not been a clusterfuck? 
[07:24] acridAxid: Mobygames [07:59] *** primus104 has joined #archiveteam-bs [08:26] *** underscor has quit IRC (Ping timeout: 370 seconds) [08:26] *** underscor has joined #archiveteam-bs [08:26] *** swebb sets mode: +o underscor [10:46] *** BlueMaxim has quit IRC (Read error: Connection reset by peer) [11:32] *** dashcloud has quit IRC (Read error: Connection reset by peer) [11:35] *** dashcloud has joined #archiveteam-bs [12:08] I have a newbie query, when the archive completes for a php forum (vbulletin), it has php files (forumdisplay.php?f=129, showthread.php?t=405494, etc) [12:08] Now I want to host these files as a static HTML archive on an Nginx server, but I think the server interprets these files as PHP and instead of showing html content it downloads the file in the client browser (http://gearbox.techapj.com/oldforums.gearboxsoftware.com/forumdisplay.php?f=128) [12:08] can anyone please tell me how i can host these files as static html via nginx? [12:17] *** primus104 has quit IRC (Leaving.) [12:43] *** primus104 has joined #archiveteam-bs [13:18] *** sankin has joined #archiveteam-bs [13:50] *** primus104 has quit IRC (Leaving.) [13:54] *** dashcloud has quit IRC (Read error: Connection reset by peer) [14:02] *** dashcloud has joined #archiveteam-bs [14:27] *** Start has quit IRC (Disconnected.)
[14:49] *** mistym has joined #archiveteam-bs [14:56] *** mistym has quit IRC (Remote host closed the connection) [14:59] so i got some good news [15:00] i will be grabbing the PRI's The World: Global Hits podcast [15:00] looks like i can grab alot of the older dates from that podcast [15:00] i also found a podcast called Walt Disney World Today [15:01] 1450 podcasts of that so far so i'm grabbing it [15:03] *** Start has joined #archiveteam-bs [15:08] http://zennistrad.tumblr.com/post/113151511988/gamergate-is-killing-video-games -- having effects on funding of archival too :/ [15:08] #blablagate [15:09] https://storify.com/8BitBecca/video-game-archiving-post-gamergate [15:12] *** mistym has joined #archiveteam-bs [15:18] So, walking this one carefully [15:18] Games archiving is doing juuuuuuust fine [15:19] But I will agree that places that are seeking any excuse, any at all, to not deal with game shit, have now found a legion of cheap excuses. [15:20] the point about archiving games culture stuff is a good one, though [15:21] who really makes noise when recorded streams get deleted? I mean, AT is holding on to TBs of twitch, right? [15:21] So, interesting situation. [15:21] I'm well on my way to becoming a big name in the game archiving field. [15:22] Archive Team is well on its way to being the name, outside of archive.org itself, in web archiving. [15:22] And since I, and Archive Team, are both insane completionists with "fuck you, it's going into the vault" [15:22] Radical inclusionism [15:23] So one way to battle forces of despair is to just keep rocking [15:23] IMHO, gamergate happenings should be preserved so we can look back and see how things went wrong and what sort of idiots are prevalent among internet culture. 
This sort of reaction from academia does indeed feel like excuses [15:25] Rock rock rock, around the clock [15:27] SketchCow: you're completely right, though, all we can do is keep pushing forward [15:27] oh no, no more academic game archiving, what will we do without that juggernaut of success [15:28] * ersi chuckles [15:41] DFJustin: now, when academics are finally considering it, is not the time for them to be like "well people suck, this isn't worth it" [15:56] I have no hate for the Academic archivists. It's just they're going to end up depending on us. [15:57] *** Start_ has joined #archiveteam-bs [15:57] *** Start has quit IRC (Read error: Connection reset by peer) [16:00] well, we can't ignore the fact that *someone* has to pay for long term storage maintenance. [16:00] it's not like you can stash hard drives in a cold, dry basement and expect even half of them to work 20 years later. [16:02] *** robink has quit IRC (Ping timeout: 492 seconds) [16:07] That funding isn't coming from academics, sir [16:07] oh no, it's not [16:07] that's not the point [16:09] *** Start_ has quit IRC (Ping timeout: 370 seconds) [16:10] *** robink has joined #archiveteam-bs [16:14] *** Start has joined #archiveteam-bs [16:16] *** robink has quit IRC (Remote host closed the connection) [16:16] *** Start has quit IRC (Read error: Connection reset by peer) [16:17] *** Start has joined #archiveteam-bs [16:22] *** mistym has quit IRC (Remote host closed the connection) [16:31] *** primus104 has joined #archiveteam-bs [16:38] balrog: it's being preserved [16:38] admittedly everything I've thrown in has been with the goal of embarassing assholes but hey [16:39] yipdw: I mentioned in #archivebot -- I really wish we had a way of scraping something from archive.today into an archive.org compliant warc [16:39] we do; scrape archive.today [16:39] is that automatic or is there a special way to do that? [16:39] !ao http://archive.today/... 
[16:40] but will that archive it under archive.today or the original URL? [16:40] the former, which is the correct action [16:40] to do otherwise is to misrepresent the source [16:40] *** abartov has joined #archiveteam-bs [16:41] if a more direct map is desired, an archive.today => URL index can be generated from those WARCs [16:41] each archive.today capture records the source URL [16:41] however I do not think it is a good idea to do that step directly; it is too complex [16:42] often you have the original URL and not the archive.today URL [16:42] yes [16:42] if you need that direct mapping then build a URL -> archive.today WARC index [16:43] maybe that's what you had in mind anyway; I just wanted to state that I think doing that transformation directly in the WARC is too hard to be worth it [16:46] *** Start has quit IRC (Disconnected.) [16:57] *** Start has joined #archiveteam-bs [17:00] *** Start has quit IRC (Read error: Connection reset by peer) [17:04] *** Start has joined #archiveteam-bs [17:08] *** dashcloud has quit IRC (Read error: Operation timed out) [17:11] *** dashcloud has joined #archiveteam-bs [17:17] *** chfoo has quit IRC (Ping timeout: 512 seconds) [17:20] *** chfoo has joined #archiveteam-bs [17:30] *** sankin1 has joined #archiveteam-bs [17:31] *** sankin has quit IRC (Read error: Operation timed out) [17:35] there should be a meta-archive [17:35] all i can see in that is an archive of METAR readings [17:35] which checks the wayback machine, archive.today, archive data, etc. 
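A minimal sketch of the "URL -> archive.today WARC index" idea discussed above, kept separate from the WARCs themselves. It assumes the (capture URL, source URL) pairs have already been pulled out of the archived captures; how a capture records its source URL is not spelled out in the discussion, so that extraction step is left out. The function name and the example URLs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_source_index(captures: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group archive.today capture URLs by the original URL they captured.

    `captures` yields (capture_url, source_url) pairs. Producing those pairs
    from the WARCs is assumed to happen elsewhere.
    """
    index: Dict[str, List[str]] = defaultdict(list)
    for capture_url, source_url in captures:
        # Keep first-seen order and skip exact duplicate captures.
        if capture_url not in index[source_url]:
            index[source_url].append(capture_url)
    return dict(index)


# Hypothetical pairs, for illustration only.
EXAMPLE_PAIRS = [
    ("http://archive.today/aaaaa", "http://example.com/post/1"),
    ("http://archive.today/bbbbb", "http://example.com/post/1"),
    ("http://archive.today/ccccc", "http://example.com/post/2"),
]
EXAMPLE_INDEX = build_source_index(EXAMPLE_PAIRS)
```

With an index like this, someone who only has the original URL can look up every capture of it, while the WARC still records the capture under its archive.today address.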
[17:35] :P [17:37] *** sankin has joined #archiveteam-bs [17:39] *** sankin1 has quit IRC (Read error: Operation timed out) [17:45] *** Start_ has joined #archiveteam-bs [17:47] *** Start has quit IRC (Read error: Connection reset by peer) [17:47] *** Start_ is now known as Start [17:49] *** sankin has quit IRC (Ping timeout: 600 seconds) [17:49] *** dashcloud has quit IRC (Read error: Operation timed out) [17:49] *** sankin has joined #archiveteam-bs [17:50] *** dashcloud has joined #archiveteam-bs [18:03] *** sankin has quit IRC (Read error: Operation timed out) [18:13] techapj: how have you archived the site in question? [18:15] joepie91_: using wpull [18:16] techapj: is it saved as a WARC? [18:16] no, there are a bunch of php files from vb forum [18:16] :( [18:16] techapj: please save it as a WARC next time [18:17] it retains important metadata [18:17] techapj: anyway, you can /somehow/ configure nginx to treat .php files as text/html [18:17] ie. configuring mimetype [18:17] I'm not sure of the specifics because I don't use nginx, but that's what you're looking for [18:17] *** Jonimus has quit IRC (Ping timeout: 370 seconds) [18:17] may need to use some kind of wildcard setup where anything *containing* .php is treated as text/html, if the ?query=string is part of your filename [18:18] because it will treat .php plus the entire query string as the "extension" and not match on .php alone [18:18] joepie91_: i configured the mime type and i can see the page in browser now [18:19] but when i try to view `forumdisplay.php?f=128` it shows content for `forumdisplay.php` instead [18:19] ignoring the ?f=128 querystring [18:19] right, didn't think about that [18:19] makes sense [18:20] ?f=128 isn't part of the file path to nginx [18:20] but the query string [18:20] so it loads forumdisplay.php with f=128 as query string but obv that doesn't work [18:21] yup :( [18:22] any idea how to handle this? i am stuck here..
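The problem described above is that the web server splits `forumdisplay.php?f=128` into a path and a query string, while the mirrored file on disk is literally named `forumdisplay.php?f=128`. The sketch below shows one way to rejoin the two before looking up the file. It is written in Python rather than as nginx configuration, the function name is hypothetical, and it assumes a Unix-like filesystem where `?` is allowed in file names.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote, urlsplit


def resolve_mirrored_file(root: Path, request_target: str) -> Optional[Path]:
    """Map a request like '/forumdisplay.php?f=128' to the mirrored file
    whose name contains the literal query string.

    Returns None when no such file exists, or when the request would
    resolve to something outside the mirror directory.
    """
    root = root.resolve()
    parts = urlsplit(request_target)
    relative = unquote(parts.path).lstrip("/")
    # Rejoin path and query so the name matches what the crawler wrote to disk.
    name = f"{relative}?{parts.query}" if parts.query else relative
    candidate = (root / name).resolve()
    # Refuse path traversal: the file must live under the mirror root.
    if root in candidate.parents and candidate.is_file():
        return candidate
    return None
```

A server built around this lookup would send the returned file with `Content-Type: text/html` regardless of the `.php` in its name, and answer 404 when the function returns `None` instead of falling back to the bare `forumdisplay.php`.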
[18:23] techapj: I have no idea tbh, if you have a WARC it's a lot easier [18:24] there are daemons for serving content directly from WARCs [18:25] joepie91_: just curious, will WARC preserve links like http://oldforums.gearboxsoftware.com/showthread.php?t=405494 [18:26] that is crucial to preserve google crawls [18:26] techapj: warc records the http requests/responses, including headers [18:27] *** sankin has joined #archiveteam-bs [18:37] *** dashcloud has quit IRC (Read error: Operation timed out) [18:37] *** Start has quit IRC (Disconnected.) [18:41] *** dashcloud has joined #archiveteam-bs [18:42] *** Start has joined #archiveteam-bs [18:43] *** Rotab has quit IRC (Read error: Connection reset by peer) [18:43] *** Start has quit IRC (Read error: Connection reset by peer) [18:43] *** Start has joined #archiveteam-bs [18:50] *** Rotab has joined #archiveteam-bs [18:56] *** atlogbot has joined #archiveteam-bs [19:20] *** Start has quit IRC (Disconnected.) [19:29] *** lag2 has joined #archiveteam-bs [19:36] *** Start has joined #archiveteam-bs [19:37] *** dashcloud has quit IRC (Read error: Operation timed out) [19:40] *** dashcloud has joined #archiveteam-bs [19:49] https://i.imgur.com/yBO49Wo.gif [19:49] *** techapj has quit IRC (Quit: Page closed) [20:03] *** Start has quit IRC (Disconnected.) [20:05] *** dashcloud has quit IRC (Read error: Operation timed out) [20:09] *** mistym has joined #archiveteam-bs [20:10] *** dashcloud has joined #archiveteam-bs [20:12] *** Start has joined #archiveteam-bs [20:14] *** Start has quit IRC (Client Quit) [20:19] <@SketchCow> joepie91_: Do you WANT to breakdown of how that interaction of yours could have not been a clusterfuck?
[20:21] SketchCow, where in my logs should i look for context [20:21] It was twitter [20:21] oh [20:21] I put Communications Museum in my GDC talk by the way [20:22] cool [20:22] Just a couple photos [20:23] :) [20:24] oh, you must be referring to joepie91_ mentioning that someone called him a sealion [20:26] oh boy [20:28] which, btw, i can totally understand happening [20:35] *** cbb has joined #archiveteam-bs [20:44] *** mistym has quit IRC (Remote host closed the connection) [20:58] *** sankin has quit IRC (Leaving.) [20:59] *** mistym has joined #archiveteam-bs [21:05] *** mistym has quit IRC (Remote host closed the connection) [21:06] *** mistym has joined #archiveteam-bs [21:16] *** dashcloud has quit IRC (Read error: Operation timed out) [21:20] *** dashcloud has joined #archiveteam-bs [21:41] *** cbb has quit IRC (Quit: cbb) [21:44] *** Jonimus has joined #archiveteam-bs [21:47] *** Nertsy has joined #archiveteam-bs [21:55] *** schbirid has joined #archiveteam-bs [22:00] dammit wget, learn to mirror with --header="Accept-Encoding: gzip" already [22:00] good night >:( [22:00] *** schbirid has quit IRC (Client Quit) [22:12] *** mistym has quit IRC (Remote host closed the connection) [22:12] *** cbb has joined #archiveteam-bs [22:17] *** Start has joined #archiveteam-bs [22:21] *** mistym has joined #archiveteam-bs [22:21] *** Sk1d has joined #archiveteam-bs [22:51] *** lag2 has quit IRC (Ping timeout: 512 seconds) [23:02] https://twitter.com/joepie91/status/575068905510715392 [23:02] we are so fucked [23:03] SketchCow: let's just call that discussion an "interesting" one and forget about it :) [23:05] I can't blame them too much for it- Apple's press conferences are a big deal, and if you pretend to cover tech in any way, you really do need to cover them [23:16] I won't forget about it if you do it again. 
[23:16] IN OTHER NEWS: http://news.stanford.edu/news/2015/march/thedemo-engelbart-live-030915.html is amazing and I will miss it [23:17] http://mikelrouse.com/new/works/the-demo/ [23:19] oh shit, that looks awesome [23:19] I hope it tours. [23:39] joepie91_: Nice screenshot :D [23:42] https://github.com/blog/1964-open-source-license-usage-on-github-com [23:45] *** dashcloud has quit IRC (Read error: Operation timed out) [23:46] http://www.reddit.com/r/KotakuInAction/comments/2y30o2/gdcrant_this_years_gdc_wasdifferent/ really is the greatest thing ever [23:52] *** dashcloud has joined #archiveteam-bs