#archiveteam-bs 2015-03-09,Mon

Time Nickname Message
01:06 🔗 joepie91_ Ctrl-S: vpsdime is using OpenVZ
01:06 🔗 joepie91_ OpenVZ does not have working CPU limits
01:06 🔗 joepie91_ only on paper
01:06 🔗 joepie91_ I'd cc schbirid but he's gone
01:06 🔗 joepie91_ :p
01:27 🔗 mistym has joined #archiveteam-bs
01:47 🔗 primus104 has quit IRC (Leaving.)
02:07 🔗 mst_ has joined #archiveteam-bs
03:16 🔗 Jonimus has joined #archiveteam-bs
03:44 🔗 xmc has quit IRC (Ping timeout: 512 seconds)
03:51 🔗 xmc has joined #archiveteam-bs
03:51 🔗 swebb sets mode: +o xmc
03:57 🔗 mistym has quit IRC (Remote host closed the connection)
04:02 🔗 mst_ has quit IRC (Quit: bye)
04:23 🔗 mistym has joined #archiveteam-bs
05:16 🔗 mistym has quit IRC (Remote host closed the connection)
05:50 🔗 mistym has joined #archiveteam-bs
06:14 🔗 acridAxid is there a place where people are discussing solutions to the metadata problem?
06:16 🔗 acridAxid i'm working with a project that is gathering lots of data about video games, and I'm trying to standardize the way tools and frontends get that metadata
06:16 🔗 acridAxid currently it's a mix of "shit hardcoded into a program", "shit scraped from websites of wildly varying quality" and "shit the user told us about this set of data"
06:18 🔗 acridAxid i'm wondering if there are other projects that are working in this space that I should talk to/work with, or if i'm venturing into the unknown
06:35 🔗 mistym has quit IRC (Remote host closed the connection)
07:23 🔗 SketchCow joepie91_: Do you WANT a breakdown of how that interaction of yours could have not been a clusterfuck?
07:24 🔗 SketchCow acridAxid: Mobygames
07:59 🔗 primus104 has joined #archiveteam-bs
08:26 🔗 underscor has quit IRC (Ping timeout: 370 seconds)
08:26 🔗 underscor has joined #archiveteam-bs
08:26 🔗 swebb sets mode: +o underscor
10:46 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
11:32 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
11:35 🔗 dashcloud has joined #archiveteam-bs
12:08 🔗 techapj I have a newbie query: when the archive completes for a php forum (vbulletin), it has php files (forumdisplay.php?f=129, showthread.php?t=405494, etc)
12:08 🔗 techapj Now I want to host these files as a static HTML archive on an Nginx server, but I think the server interprets these files as PHP and instead of showing html content it downloads the file to the client browser (http://gearbox.techapj.com/oldforums.gearboxsoftware.com/forumdisplay.php?f=128)
12:08 🔗 techapj can anyone please tell me how I can serve these files as static html via nginx?
12:17 🔗 primus104 has quit IRC (Leaving.)
12:43 🔗 primus104 has joined #archiveteam-bs
13:18 🔗 sankin has joined #archiveteam-bs
13:50 🔗 primus104 has quit IRC (Leaving.)
13:54 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
14:02 🔗 dashcloud has joined #archiveteam-bs
14:27 🔗 Start has quit IRC (Disconnected.)
14:49 🔗 mistym has joined #archiveteam-bs
14:56 🔗 mistym has quit IRC (Remote host closed the connection)
14:59 🔗 godane so i got some good news
15:00 🔗 godane i will be grabbing the PRI's The World: Global Hits podcast
15:00 🔗 godane looks like i can grab a lot of the older dates from that podcast
15:00 🔗 godane i also found a podcast called Walt Disney World Today
15:01 🔗 godane 1450 podcasts of that so far so i'm grabbing it
15:03 🔗 Start has joined #archiveteam-bs
15:08 🔗 balrog http://zennistrad.tumblr.com/post/113151511988/gamergate-is-killing-video-games -- having effects on funding of archival too :/
15:08 🔗 ersi #blablagate
15:09 🔗 balrog https://storify.com/8BitBecca/video-game-archiving-post-gamergate
15:12 🔗 mistym has joined #archiveteam-bs
15:18 🔗 SketchCow So, walking this one carefully
15:18 🔗 SketchCow Games archiving is doing juuuuuuust fine
15:19 🔗 SketchCow But I will agree that places that are seeking any excuse, any at all, to not deal with game shit, have now found a legion of cheap excuses.
15:20 🔗 balrog the point about archiving games culture stuff is a good one, though
15:21 🔗 balrog who really makes noise when recorded streams get deleted? I mean, AT is holding on to TBs of twitch, right?
15:21 🔗 SketchCow So, interesting situation.
15:21 🔗 SketchCow I'm well on my way to becoming a big name in the game archiving field.
15:22 🔗 SketchCow Archive Team is well on its way to being the name, outside of archive.org itself, in web archiving.
15:22 🔗 SketchCow And since I, and Archive Team, are both insane completionists with "fuck you, it's going into the vault"
15:22 🔗 SketchCow Radical inclusionism
15:23 🔗 SketchCow So one way to battle forces of despair is to just keep rocking
15:23 🔗 balrog IMHO, gamergate happenings should be preserved so we can look back and see how things went wrong and what sort of idiots are prevalent among internet culture. This sort of reaction from academia does indeed feel like excuses
15:25 🔗 ersi Rock rock rock, around the clock
15:27 🔗 balrog SketchCow: you're completely right, though, all we can do is keep pushing forward
15:27 🔗 DFJustin oh no, no more academic game archiving, what will we do without that juggernaut of success
15:28 🔗 * ersi chuckles
15:41 🔗 balrog DFJustin: now, when academics are finally considering it, is not the time for them to be like "well people suck, this isn't worth it"
15:56 🔗 SketchCow I have no hate for the Academic archivists. It's just they're going to end up depending on us.
15:57 🔗 Start_ has joined #archiveteam-bs
15:57 🔗 Start has quit IRC (Read error: Connection reset by peer)
16:00 🔗 balrog well, we can't ignore the fact that *someone* has to pay for long term storage maintenance.
16:00 🔗 balrog it's not like you can stash hard drives in a cold, dry basement and expect even half of them to work 20 years later.
16:02 🔗 robink has quit IRC (Ping timeout: 492 seconds)
16:07 🔗 SketchCow That funding isn't coming from academics, sir
16:07 🔗 balrog oh no, it's not
16:07 🔗 balrog that's not the point
16:09 🔗 Start_ has quit IRC (Ping timeout: 370 seconds)
16:10 🔗 robink has joined #archiveteam-bs
16:14 🔗 Start has joined #archiveteam-bs
16:16 🔗 robink has quit IRC (Remote host closed the connection)
16:16 🔗 Start has quit IRC (Read error: Connection reset by peer)
16:17 🔗 Start has joined #archiveteam-bs
16:22 🔗 mistym has quit IRC (Remote host closed the connection)
16:31 🔗 primus104 has joined #archiveteam-bs
16:38 🔗 yipdw balrog: it's being preserved
16:38 🔗 yipdw admittedly everything I've thrown in has been with the goal of embarrassing assholes but hey
16:39 🔗 balrog yipdw: I mentioned in #archivebot -- I really wish we had a way of scraping something from archive.today into an archive.org compliant warc
16:39 🔗 yipdw we do; scrape archive.today
16:39 🔗 balrog is that automatic or is there a special way to do that?
16:39 🔗 yipdw !ao http://archive.today/...
16:40 🔗 balrog but will that archive it under archive.today or the original URL?
16:40 🔗 yipdw the former, which is the correct action
16:40 🔗 yipdw to do otherwise is to misrepresent the source
16:40 🔗 abartov has joined #archiveteam-bs
16:41 🔗 yipdw if a more direct map is desired, an archive.today => URL index can be generated from those WARCs
16:41 🔗 yipdw each archive.today capture records the source URL
16:41 🔗 yipdw however I do not think it is a good idea to do that step directly; it is too complex
16:42 🔗 balrog often you have the original URL and not the archive.today URL
16:42 🔗 yipdw yes
16:42 🔗 yipdw if you need that direct mapping then build a URL -> archive.today WARC index
16:43 🔗 yipdw maybe that's what you had in mind anyway; I just wanted to state that I think doing that transformation directly in the WARC is too hard to be worth it
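A rough sketch of the URL -> archive.today index yipdw describes, in Python. It assumes the (capture URL, source URL) pairs have already been pulled out of the WARCs by some other step; the log does not say how that extraction would work. `build_source_index` and the example URLs are made-up names for illustration only.

```python
from collections import defaultdict


def build_source_index(captures):
    """Map each original URL to the archive.today capture URLs that recorded it.

    `captures` is an iterable of (capture_url, source_url) pairs. Producing
    those pairs from the WARCs is assumed to happen elsewhere.
    """
    index = defaultdict(list)
    for capture_url, source_url in captures:
        # One source URL may have been captured several times; keep each once.
        if capture_url not in index[source_url]:
            index[source_url].append(capture_url)
    return dict(index)


# Hypothetical example data, not real captures.
captures = [
    ("http://archive.today/aaaaa", "http://example.com/post/1"),
    ("http://archive.today/bbbbb", "http://example.com/post/1"),
    ("http://archive.today/ccccc", "http://example.com/post/2"),
]
index = build_source_index(captures)
```

Keeping the index outside the WARCs leaves the captures recorded under archive.today, as yipdw argues they should be, while still letting someone start from the original URL.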
16:46 🔗 Start has quit IRC (Disconnected.)
16:57 🔗 Start has joined #archiveteam-bs
17:00 🔗 Start has quit IRC (Read error: Connection reset by peer)
17:04 🔗 Start has joined #archiveteam-bs
17:08 🔗 dashcloud has quit IRC (Read error: Operation timed out)
17:11 🔗 dashcloud has joined #archiveteam-bs
17:17 🔗 chfoo has quit IRC (Ping timeout: 512 seconds)
17:20 🔗 chfoo has joined #archiveteam-bs
17:30 🔗 sankin1 has joined #archiveteam-bs
17:31 🔗 sankin has quit IRC (Read error: Operation timed out)
17:35 🔗 Sanqui there should be a meta-archive
17:35 🔗 xmc all i can see in that is an archive of METAR readings
17:35 🔗 Sanqui which checks the wayback machine, archive.today, archive data, etc.
17:35 🔗 xmc :P
17:37 🔗 sankin has joined #archiveteam-bs
17:39 🔗 sankin1 has quit IRC (Read error: Operation timed out)
17:45 🔗 Start_ has joined #archiveteam-bs
17:47 🔗 Start has quit IRC (Read error: Connection reset by peer)
17:47 🔗 Start_ is now known as Start
17:49 🔗 sankin has quit IRC (Ping timeout: 600 seconds)
17:49 🔗 dashcloud has quit IRC (Read error: Operation timed out)
17:49 🔗 sankin has joined #archiveteam-bs
17:50 🔗 dashcloud has joined #archiveteam-bs
18:03 🔗 sankin has quit IRC (Read error: Operation timed out)
18:13 🔗 joepie91_ techapj: how have you archived the site in question?
18:15 🔗 techapj joepie91_: using wpull
18:16 🔗 joepie91_ techapj: is it saved as a WARC?
18:16 🔗 techapj no, there are a bunch of php files from vb forum
18:16 🔗 joepie91_ :(
18:16 🔗 joepie91_ techapj: please save it as a WARC next time
18:17 🔗 joepie91_ it retains important metadata
18:17 🔗 joepie91_ techapj: anyway, you can /somehow/ configure nginx to treat .php files as text/html
18:17 🔗 joepie91_ ie. configuring mimetype
18:17 🔗 joepie91_ I'm not sure of the specifics because I don't use nginx, but that's what you're looking for
18:17 🔗 Jonimus has quit IRC (Ping timeout: 370 seconds)
18:17 🔗 joepie91_ may need to use some kind of wildcard setup where anything *containing* .php is treated as text/html, if the ?query=string is part of your filename
18:18 🔗 joepie91_ because it will treat .php plus the entire query string as the "extension" and not match on .php alone
18:18 🔗 techapj joepie91_: i configured the mime type and i can see the page in browser now
18:19 🔗 techapj but when i try to view `forumdisplay.php?f=128` it shows content for `forumdisplay.php` instead
18:19 🔗 techapj ignoring the ?f=128 querystring
18:19 🔗 joepie91_ right, didn't think about that
18:19 🔗 joepie91_ makes sense
18:20 🔗 joepie91_ ?f=128 isn't part of the file path to nginx
18:20 🔗 joepie91_ but the query string
18:20 🔗 joepie91_ so it loads forumdisplay.php with f=128 as query string but obv that doesn't work
18:21 🔗 techapj yup :(
18:22 🔗 techapj any idea how to handle this? i am stuck here..
18:23 🔗 joepie91_ techapj: I have no idea tbh, if you have a WARC it's a lot easier
18:24 🔗 joepie91_ there are daemons for serving content directly from WARCs
18:25 🔗 techapj joepie91_: just curious, will WARC preserve links like http://oldforums.gearboxsoftware.com/showthread.php?t=405494
18:26 🔗 techapj that is crucial to preserve google crawls
18:26 🔗 balrog techapj: warc records the http requests/responses, including headers
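For the query-string problem from 18:19-18:20, one possible workaround is to fold the query string back into the filename before looking it up on disk. This is a minimal Python sketch, not an nginx recipe. It assumes the mirror stored each page under a literal name such as `forumdisplay.php?f=128`; `resolve_archived_file` and the `/srv/mirror` root are hypothetical.

```python
import os
from urllib.parse import urlsplit, unquote


def resolve_archived_file(root, request_target):
    """Return the on-disk path for a request, treating '?query' as part of the name.

    Returns None if the resulting path would escape `root`.
    """
    parts = urlsplit(request_target)
    name = unquote(parts.path).lstrip("/")
    if parts.query:
        # Assumption: the file was saved as "<path>?<query>", query kept verbatim.
        name = name + "?" + parts.query
    root = os.path.normpath(root)
    candidate = os.path.normpath(os.path.join(root, name))
    # Refuse anything that climbs out of the mirror directory.
    if not candidate.startswith(root + os.sep):
        return None
    return candidate


path = resolve_archived_file("/srv/mirror", "/forumdisplay.php?f=128")
```

A small handler could call this for each request and send the file's bytes back with `Content-Type: text/html`, which also covers the mimetype issue discussed at 18:17.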
18:27 🔗 sankin has joined #archiveteam-bs
18:37 🔗 dashcloud has quit IRC (Read error: Operation timed out)
18:37 🔗 Start has quit IRC (Disconnected.)
18:41 🔗 dashcloud has joined #archiveteam-bs
18:42 🔗 Start has joined #archiveteam-bs
18:43 🔗 Rotab has quit IRC (Read error: Connection reset by peer)
18:43 🔗 Start has quit IRC (Read error: Connection reset by peer)
18:43 🔗 Start has joined #archiveteam-bs
18:50 🔗 Rotab has joined #archiveteam-bs
18:56 🔗 atlogbot has joined #archiveteam-bs
19:20 🔗 Start has quit IRC (Disconnected.)
19:29 🔗 lag2 has joined #archiveteam-bs
19:36 🔗 Start has joined #archiveteam-bs
19:37 🔗 dashcloud has quit IRC (Read error: Operation timed out)
19:40 🔗 dashcloud has joined #archiveteam-bs
19:49 🔗 joepie91_ https://i.imgur.com/yBO49Wo.gif
19:49 🔗 techapj has quit IRC (Quit: Page closed)
20:03 🔗 Start has quit IRC (Disconnected.)
20:05 🔗 dashcloud has quit IRC (Read error: Operation timed out)
20:09 🔗 mistym has joined #archiveteam-bs
20:10 🔗 dashcloud has joined #archiveteam-bs
20:12 🔗 Start has joined #archiveteam-bs
20:14 🔗 Start has quit IRC (Client Quit)
20:19 🔗 SketchCow <@SketchCow> joepie91_: Do you WANT a breakdown of how that interaction of yours could have not been a clusterfuck?
20:21 🔗 xmc SketchCow, where in my logs should i look for context
20:21 🔗 SketchCow It was twitter
20:21 🔗 xmc oh
20:21 🔗 SketchCow I put Communications Museum in my GDC talk by the way
20:22 🔗 xmc cool
20:22 🔗 SketchCow Just a couple photos
20:23 🔗 xmc :)
20:24 🔗 xmc oh, you must be referring to joepie91_ mentioning that someone called him a sealion
20:26 🔗 Sanqui oh boy
20:28 🔗 xmc which, btw, i can totally understand happening
20:35 🔗 cbb has joined #archiveteam-bs
20:44 🔗 mistym has quit IRC (Remote host closed the connection)
20:58 🔗 sankin has quit IRC (Leaving.)
20:59 🔗 mistym has joined #archiveteam-bs
21:05 🔗 mistym has quit IRC (Remote host closed the connection)
21:06 🔗 mistym has joined #archiveteam-bs
21:16 🔗 dashcloud has quit IRC (Read error: Operation timed out)
21:20 🔗 dashcloud has joined #archiveteam-bs
21:41 🔗 cbb has quit IRC (Quit: cbb)
21:44 🔗 Jonimus has joined #archiveteam-bs
21:47 🔗 Nertsy has joined #archiveteam-bs
21:55 🔗 schbirid has joined #archiveteam-bs
22:00 🔗 schbirid dammit wget, learn to mirror with --header="Accept-Encoding: gzip" already
22:00 🔗 schbirid good night >:(
22:00 🔗 schbirid has quit IRC (Client Quit)
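schbirid does not say what exactly went wrong. Assuming the problem is that pages fetched with `Accept-Encoding: gzip` end up stored still compressed, a post-processing pass along these lines could unpack them afterwards. This is a Python sketch; `maybe_gunzip` is a hypothetical helper, not part of wget.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(data):
    """Return decompressed bytes if `data` looks like gzip, else `data` unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


page = maybe_gunzip(gzip.compress(b"<html>hello</html>"))
```

This only fixes the stored files after the fact; it does not help the crawler follow links inside compressed pages while the mirror is running.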
22:12 🔗 mistym has quit IRC (Remote host closed the connection)
22:12 🔗 cbb has joined #archiveteam-bs
22:17 🔗 Start has joined #archiveteam-bs
22:21 🔗 mistym has joined #archiveteam-bs
22:21 🔗 Sk1d has joined #archiveteam-bs
22:51 🔗 lag2 has quit IRC (Ping timeout: 512 seconds)
23:02 🔗 joepie91_ https://twitter.com/joepie91/status/575068905510715392
23:02 🔗 joepie91_ we are so fucked
23:03 🔗 joepie91_ SketchCow: let's just call that discussion an "interesting" one and forget about it :)
23:05 🔗 dashcloud I can't blame them too much for it- Apple's press conferences are a big deal, and if you pretend to cover tech in any way, you really do need to cover them
23:16 🔗 SketchCow I won't forget about it if you do it again.
23:16 🔗 SketchCow IN OTHER NEWS: http://news.stanford.edu/news/2015/march/thedemo-engelbart-live-030915.html is amazing and I will miss it
23:17 🔗 SketchCow http://mikelrouse.com/new/works/the-demo/
23:19 🔗 garyrh oh shit, that looks awesome
23:19 🔗 SketchCow I hope it tours.
23:39 🔗 ersi joepie91_: Nice screenshot :D
23:42 🔗 SketchCow https://github.com/blog/1964-open-source-license-usage-on-github-com
23:45 🔗 dashcloud has quit IRC (Read error: Operation timed out)
23:46 🔗 SketchCow http://www.reddit.com/r/KotakuInAction/comments/2y30o2/gdcrant_this_years_gdc_wasdifferent/ really is the greatest thing ever
23:52 🔗 dashcloud has joined #archiveteam-bs
