[00:04] i'm looking at valleywag right now [00:10] *** Stiletto has quit IRC (Read error: Operation timed out) [00:10] *** Stiletto has joined #archiveteam-bs [00:10] JesseW: i think valleywag is just a portal to stories on other gawker's websites [00:11] and looks like they didn't put links up that often on it either [00:14] i started to upload my collection of The Brooklyn Paper : https://archive.org/details/The_Brooklyn_Paper-volume-25-issue-51 [00:17] *** Eloquence has joined #archiveteam-bs [00:20] Hm, interesting. [00:20] Hi Erik. [00:21] valleywag was dead well before this suit started (it was a separate Gawker site, but was given the axe some time ago) [00:28] JesseW, yeah, mw edits are a bit tricky since they're often dependent on other people's work, and as you said on slack wikis tend to be freely licensed already [00:29] I want to see if I can get some of the lesser known sites that nonetheless have huge amounts of interesting reviews and similar data, like zazzle or opentable [00:29] I'm also working on https://github.com/eloquence/lib.reviews so my focus has been mostly on reviews for that reason [00:30] That makes sense. [00:37] Eloquence: with regards to that, you should connect up with the https://critiquebrainz.org/ people (related to MusicBrainz) [00:38] and the moderation issues (with regards to hosting reviews of arbitrary things), sound deeply deeply terrifying (at least to me) [00:38] the amount of *legal threats* alone will be significant [00:39] let alone extra-legal threats and gamesmanship [00:43] oh, critiquebrainz definitely looks very interesting - thanks for the tip. [00:44] yeah, it's a hard set of problems. my thinking is that we need some notion of teams/groups that can have their own rules of engagement, certificates, etc. [00:44] certainly. 
[00:44] er, "certainly" in regards to you're welcome for the tip about critiquebrainz [00:44] yeah :) [00:46] The Austin Chronicle pdfs for 2006 are uploaded [00:46] the problem is that people angry about the reviews aren't going to bother going through your process -- they're just going to try to threaten/attack whatever weak point they can find to get rid of the reviews and punish those they feel harmed by [00:47] and if you are providing hosting for the reviews -- you are one of those weak points [00:47] JesseW, *nod* what worked for Wikipedia is to have volunteer first-responders who help differentiate between "cranky person who can be ignored" vs. "person who needs help" vs. "actual credible legal threat" [00:48] another site you might be interested in is 800notes.com -- they review phone numbers, and are one of the best known/oldest sites out there [00:48] iiinteresting, I probably came across that one before when dealing with spammers [00:49] Eloquence: nods; but I strongly expect a site focused on reviews is going to generate *even more challenging* support needs than Wikipedia does (which is saying a LOT) [00:49] both of the legal threat variety and the "fraudulent paid review" variety, yeah .. [00:50] but hey, if it wasn't a hard problem someone would have done it already :) [00:51] also, there's a lot less of a pool of well-intentioned volunteers to burn through with the review site, I think. Wikipedia has a lot of goodwill and other jobs for interested volunteers to work on aside from the grind of OTRS -- it seems like lib.reviews would have quite a bit less [00:52] mhh, not sure. there's the reviews, then the data about things -- product metadata and the like --, various sync/maintenance/import jobs that could be rewarding, policy wonkery, coming up with cool new badges/certificates, scavenger hunt type activities ... 
[00:52] well, the point is lots of people *have* done "make a site hosting reviews" already -- they just generally (although not always) don't license them under FOSS licenses, and don't try to review everything (although Amazon comes close), and don't rely on donations for funding. [00:53] Epinions did try for everything, I think, and eventually went under. [00:53] Yelp and Amazon are definitely closest when it comes to "everything" sites [00:54] well, the data about things will hopefully be leveraged from other sources (e.g. wikidata, musicbrainz, etc), and maintenance jobs are difficult to use for onboarding (as is policy wonkery). [00:54] agreed [00:54] +osm has lots of point of interest data [00:54] for restaurants and the like [00:55] I hope that teams/groups can come up with interesting onboarding ideas I can't - kind of like subreddits. [00:56] also, with regards to licensing the reviews under FOSS licenses (and collecting existing reviews into one-big-site) -- one thing that is often quite important to the meaning/value of a review is precisely control over its distribution (i.e. who gets to know it and who doesn't). I suspect that may be something of an issue as well. [00:56] what do you mean by "who gets to know it"? [00:57] Take stock tips as an archetypal example -- a tip is valuable precisely depending on how many other people know it. [00:59] that's interesting. again I think that groups could be a good way to keep a certain amount of "in the know-ness" -- if you follow a certain group, you're in the know. on reddit being part of this or that subreddit is as important as being part of reddit as a whole. [00:59] and even with relatively widely distributed reviews, on say, Amazon -- part of the reason Amazon is willing to pay the (non-negligible) direct and indirect costs of hosting the reviews is because in order to see them, you have to go to Amazon's page, where they can track you, and sell you on other things. 
[01:00] If the reviews were licensed under FOSS licenses, that would make them less able to pay back the cost of hosting them [01:01] with regards to groups (hosted on a central server) -- that adds a whole new set of support tasks, and decisions: do you allow the GNAA (to use an old example) to make a group (or hundreds)? Either yes or no makes trouble for you. [01:01] you in the sense of the site host. [01:02] in my experience the donation model scales pretty well proportional to the real world usefulness of a thing -- unless we immediately import huge datasets like OSM or whatever, which could balloon costs. [01:02] re: GNAA and the like, yeah, I've thought a bit about that one -- will probably work with the code of conduct community to come up with some meta-guidelines. [01:02] racism is right out, but pro-GMO vs. anti-GMO would probably be okay [01:02] I think your experience is a bit skewed by having been involved with Wikimedia. :-) [01:04] it is! but there are quite a few other donation driven success stories now, too, especially in the content-creation community (podcasts, video channels, etc.) [01:04] to me, being unable to pay for the hosting because traffic is growing too fast is the kind of problem I want to have :) [01:05] oh, I wasn't worried about *hosting* costs -- I agree with you that those will likely stay in parallel with donations pretty well [01:05] I was worried about *support* costs [01:05] Which I think scale much worse [01:06] You go from negligible to weekly death threats *way* too abruptly (as I understand it). [01:06] I'm pretty optimistic that we can build a meta-community if we can build a community at all. it's building a community at all that (I find) is the really hard part. 
getting enough people to care that you get momentum [01:07] I agree that's a hard part, certainly [01:07] another community I want to study a bit more is BeWelcome.org, which to me seems to have a lot of parallels [01:08] hm, hadn't heard of that before [01:08] nonprofit alternative to a commercial service (Couchsurfing), entirely volunteer-driven, etc. [01:08] hm, neat [01:08] yet when I used it, it became clear pretty quickly that folks really care about stuff like safety issues, which of course are one of the number #1 concerns for a community like that [01:09] oh, another source of reviews to think about -- ebay [01:09] *** SN4T14 has quit IRC (Quit: Leaving) [01:10] * JesseW is reading https://www.bewelcome.org/safety now [01:10] I always thought about them as transactional first and foremost -- but you're right, browsing around, I find quite a few book reviews & such [01:11] ebay certainly seems a good freeyourstuff.cc candidate [01:11] * Eloquence starts a spreadsheet .. [01:11] either that or github issues [01:12] yeah, I want to sample some data from a few .. and then I'll log issues for the top candidate sites [01:12] especially interested in ones that value quality of reviews > quantity [01:16] *** SN4T14 has joined #archiveteam-bs [01:16] *** Honno has quit IRC (Read error: Operation timed out) [01:17] nods [01:18] While it would be very difficult to automatically identify them, there are a lot of reviews distributed merely as arbitrary web pages or blogs [01:18] particularly book reviews [01:18] another suggestion: ripoffreport.com [01:21] *** SN4T14 has quit IRC (Quit: Leaving) [01:23] huh, never heard of that one - iinteresting ... [01:24] happy to point you at more :-) [01:24] as I think of them [01:25] oh, here's another one: https://pubpeer.com -- reviewing scientific papers [01:29] *** SN4T14 has joined #archiveteam-bs [01:29] oh man [01:29] ripoffreport is *full* of poorly written nonsense [01:29] heh [01:29] yes, yes it is. 
[01:30] surprised they (pubpeer) don't use any CC-* license already since they appear to be 501(c)(3) .. hmmm [01:32] anyhow, what are we discussing again? :) [01:34] JesseW and I were kicking around the difficulties (and opportunities) in building an open review site / candidates for importing data for freeyourstuff.cc [01:35] joepie91: i think you meant the internet is *full* of poorly written nonsense.. [01:35] nods, yep [01:35] bwn: ripoffreport is worse than average [01:35] I would expect it would be, given its intended scope [01:36] point :) [01:42] *** VADemon has quit IRC (Quit: left4dead) [01:55] Eloquence: if you haven't read http://www.bevolunteer.org/wp-content/uploads/2015/11/BV-Annual-Report-2014-15.pdf -- it's quite detailed and informative [01:55] as are, I presume, the previous ones [01:57] that's pretty impressive reporting for such a small community [01:58] yeah, it is :-) [01:59] AO3 (Archive Of Our Own)'s parent organization is also worth looking at as an organizational model, I think. [02:01] that's a good tip - I've heard good things about them before, definitely taking a closer look [02:03] they are also a good study in organizational conflict/drama and its resolution, too :-) [02:03] although you may have plenty of data on that from wikimedia [02:05] yeah -- the international dimension at wikimedia (chapters etc.) 
has made that stuff very interesting [02:05] nods [02:23] *** Stiletto has quit IRC (Read error: Operation timed out) [02:23] *** Stiletto has joined #archiveteam-bs [02:30] reading about otw/ao3 [02:32] fanlore is pretty awesome [02:34] it certainly is [02:35] * JesseW is reminded to check the Fanlore entry on WikiApiary [02:43] *** Stiletto has quit IRC (Read error: Operation timed out) [02:43] *** xXx_ndidd has joined #archiveteam-bs [02:43] *** Stiletto has joined #archiveteam-bs [02:47] *** ndizzle has quit IRC (Ping timeout: 244 seconds) [02:48] if you're looking for other possible sources of content to pull into freeyourstuff.cc, have you considered the user reviews from goodreads, themoviedb.org, and the tvdb.com? [02:55] dashcloud: goodreads is already supported, afaik [02:55] Eloquence: the other two seem worth considering, though [02:55] if it matters, it's not listed on the wiki though [03:01] dashcloud: what wiki? [03:28] *** mutoso has joined #archiveteam-bs [03:29] speaking of good reads [03:29] nice @ this being on IA https://www.goodreads.com/review/show/340987215 [03:40] dashcloud, I didn't know about themoviedb and thetvdb - these definitely seem like great candidates to support, thanks! [03:40] and of course if any of y'all want to take a crack at a plugin, happy to assist [03:55] *** Eloquence has quit IRC (Read error: Operation timed out) [04:11] these fanlore/otw/ao3 projects are great, thank you jessew [04:12] reminds me of old friends [04:12] *** Sk1d has joined #archiveteam-bs [04:13] bwn: delighted to have pointed you in their direction! 
[04:34] *** tomwsmf-a has quit IRC (Read error: Operation timed out) [04:34] *** tomwsmf-a has joined #archiveteam-bs [04:54] *** jut has joined #archiveteam-bs [05:15] *** dan- has quit IRC (Ping timeout: 260 seconds) [05:57] *** ralphdnak has joined #archiveteam-bs [05:57] *** Honno has joined #archiveteam-bs [06:48] *** dan- has joined #archiveteam-bs [06:57] *** schbirid has joined #archiveteam-bs [07:08] *** JesseW has quit IRC (Ping timeout: 370 seconds) [07:10] *** tomwsmf-a has quit IRC (Read error: Operation timed out) [07:22] *** Stiletto has quit IRC (Read error: Operation timed out) [07:22] *** Stiletto has joined #archiveteam-bs [08:04] *** xXx_ndidd has quit IRC (Ping timeout: 244 seconds) [08:09] *** Stiletto has quit IRC (Read error: Operation timed out) [08:10] *** Stiletto has joined #archiveteam-bs [08:12] *** RichardG has quit IRC (Read error: Operation timed out) [08:21] *** dashcloud has quit IRC (Ping timeout: 260 seconds) [08:25] *** dashcloud has joined #archiveteam-bs [08:47] *** Stiletto has quit IRC (Read error: Operation timed out) [08:47] *** Stiletto has joined #archiveteam-bs [09:07] *** Stiletto has quit IRC (Read error: Operation timed out) [09:07] *** Stiletto has joined #archiveteam-bs [09:45] *** kristian_ has joined #archiveteam-bs [10:10] *** Stiletto has quit IRC (Read error: Operation timed out) [10:10] *** Stiletto has joined #archiveteam-bs [10:13] *** dashcloud has quit IRC (Read error: Operation timed out) [10:15] *** dashcloud has joined #archiveteam-bs [10:41] guys i have a couple of warcs to send to archive.org [10:41] what's the best way to do so? [10:41] somebody here was saying that simply uploading them is not enough [10:42] *** Lord_Nigh has quit IRC (Read error: Operation timed out) [10:43] *** Lord_Nigh has joined #archiveteam-bs [10:43] if you want them imported into the wayback machine the media type on the item needs to be changed iirc [10:43] iirc? 
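A rough sketch of the upload step being discussed here, for readers following along. It only builds the URL and headers for a single PUT request and sends nothing. The endpoint and header names reflect archive.org's S3-like upload interface as I understand it; the item identifier, file name, and keys are hypothetical placeholders, and the `web` / `opensource` values are the ones reported to work in the [10:43] message that follows. Check the linked gist and IA's own documentation before relying on any of it.

```python
from urllib.parse import quote

# Assumed endpoint of archive.org's S3-like upload API.
IA_S3_ENDPOINT = "https://s3.us.archive.org"


def build_warc_upload(identifier, filename, access_key, secret_key,
                      mediatype="web", collection="opensource"):
    """Return (url, headers) for PUTting one WARC file into an item.

    Only the WARC itself is uploaded; per the log, the other files
    are expected to be produced by IA's derive step.
    """
    url = f"{IA_S3_ENDPOINT}/{quote(identifier)}/{quote(filename)}"
    headers = {
        # IA's S3-like API uses "LOW access:secret" authorization.
        "authorization": f"LOW {access_key}:{secret_key}",
        # Ask IA to create the item if it does not exist yet.
        "x-amz-auto-make-bucket": "1",
        # Item metadata: the media type suggested in the log.
        "x-archive-meta-mediatype": mediatype,
        "x-archive-meta-collection": collection,
    }
    return url, headers


# Hypothetical identifier, file name, and credentials.
url, headers = build_warc_upload(
    "example-warc-item", "example.warc.gz", "ACCESS", "SECRET"
)
print(url)
for name, value in headers.items():
    print(f"{name}: {value}")
```

The returned pair can be handed to any HTTP client that supports PUT with a file body. Whether the item then shows up in the Wayback Machine is a separate matter that, per the log, may still need someone at IA to act on it.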
[10:43] in any case the first step is uploading the warcs, then I think probably just requesting someone to change them over [10:43] check this https://gist.github.com/Asparagirl/6206247 - mediatype web, collection opensource does the trick for me. [10:43] if I remember correctly [10:45] so warc.gz warc.cdx warc.meta right? [10:45] the files i mean [10:46] I think you just need to upload the WARC file itself, everything else will be created through IA derive. [10:47] k thanks [10:58] so i'm getting this error: InternalErrorWe encountered an internal error. Please try again. [10:58] but only when trying to make g4tv.com-video-1166 item [11:02] i even blocked all metadata [11:05] now even the uploader page doesn't want g4tv.com-vdeo1166 as an item [11:06] what's funny is the item is not used or even dark [11:06] https://archive.org/details/g4tv.com-video1166 [11:06] there is no history at all for that item so what's the probelm [11:06] *problem [11:11] i'm adding a -1 to it so it will upload [11:30] *** Stiletto has quit IRC (Read error: Operation timed out) [11:30] *** Stiletto has joined #archiveteam-bs [11:59] *** kristian_ has quit IRC (Leaving) [12:22] *** Stiletto has quit IRC (Read error: Operation timed out) [12:22] *** Stiletto has joined #archiveteam-bs [12:45] *** Stiletto has quit IRC (Read error: Operation timed out) [12:45] *** Stiletto has joined #archiveteam-bs [13:02] *** RichardG has joined #archiveteam-bs [13:22] Microsoft to acquire LinkedIn [13:32] *** atrocity has joined #archiveteam-bs [13:36] *** RichardG has quit IRC (Read error: Operation timed out) [13:53] *** RichardG has joined #archiveteam-bs [14:14] *** Stiletto has quit IRC (Remote host closed the connection) [14:14] *** Stiletto has joined #archiveteam-bs [14:17] *** RichardG_ has joined #archiveteam-bs [14:17] *** RichardG has quit IRC (Read error: Connection reset by peer) [14:32] *** dashcloud has quit IRC (Read error: Operation timed out) [14:37] *** dashcloud has joined 
#archiveteam-bs [14:43] *** BlueMaxim has quit IRC (Ping timeout: 250 seconds) [14:44] *** BlueMaxim has joined #archiveteam-bs [14:51] *** Stiletto has quit IRC (Read error: Operation timed out) [15:08] *** Stiletto has joined #archiveteam-bs [15:10] *** signius has joined #archiveteam-bs [15:10] *** BlueMaxim has quit IRC (Read error: Operation timed out) [15:11] *** BlueMaxim has joined #archiveteam-bs [15:14] *** signius_ has joined #archiveteam-bs [15:17] *** Stiletto has quit IRC (Read error: Connection reset by peer) [15:18] *** signius has quit IRC (Read error: Operation timed out) [15:27] *** RichardG_ is now known as RichardG [15:30] HCross2: well, their marketing tactics seem to match up anyway [15:30] so seems like a good fit [15:30] ("pls upgrade to Windows 10" vs "pls join LinkedIn") [15:38] does LinkedIn do much for you other than clutter your inbox? [15:52] so all flvhd g4tv.com files are uploaded official now [15:52] also 1xxx to 3xxx are checked to have been uploaded [15:59] *** JesseW has joined #archiveteam-bs [16:08] *** Stiletto has joined #archiveteam-bs [16:14] *** jut has quit IRC (Leaving) [16:25] *** JesseW has quit IRC (Ping timeout: 370 seconds) [16:54] *** whydomain has joined #archiveteam-bs [16:56] *** marvinw has quit IRC (Ping timeout: 246 seconds) [17:05] *** fie_ has joined #archiveteam-bs [17:06] *** fie has quit IRC (Read error: Operation timed out) [17:08] Are there any arguments that should NOT be used with wget when creating WARCs (e.g: will -k alter/corrupt the WARC?) 
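The question above goes unanswered in the log. As a partial, hedged sketch: to my understanding, wget itself warns about or refuses a handful of options when `--warc-file` is given (no-clobber, timestamping, spider mode, and continue), and `--mirror` implies timestamping. The helper below just scans an argument list for those; the function name and notes are my own, single-letter flags bundled together (such as `-kN`) are not detected, and the list should be checked against the manual for your wget version. It makes no claim either way about `-k`.

```python
# Options that, as far as I know, wget disables or rejects together with
# --warc-file. The notes are paraphrases, not wget's exact messages.
WARC_CONFLICTS = {
    "-nc": "no-clobber is disabled with WARC output",
    "--no-clobber": "no-clobber is disabled with WARC output",
    "-N": "timestamping is disabled with WARC output",
    "--timestamping": "timestamping is disabled with WARC output",
    "-m": "implies -N, which is disabled with WARC output",
    "--mirror": "implies -N, which is disabled with WARC output",
    "--spider": "spider mode does not work with WARC output",
    "-c": "continue is disabled with WARC output",
    "--continue": "continue is disabled with WARC output",
}


def warc_conflicts(args):
    """Return (flag, note) pairs for flags that clash with --warc-file.

    Returns an empty list when no WARC output is requested at all.
    """
    writes_warc = any(
        a == "--warc-file" or a.startswith("--warc-file=") for a in args
    )
    if not writes_warc:
        return []
    return [(a, WARC_CONFLICTS[a]) for a in args if a in WARC_CONFLICTS]


# Hypothetical command line, for illustration only.
example = ["wget", "--warc-file=site", "-k", "-N", "-r", "http://example.com/"]
for flag, note in warc_conflicts(example):
    print(f"{flag}: {note}")
```

Note that `-k` is deliberately absent from the table: it is not among the options I know wget to complain about, but the sketch does not settle whether it is advisable.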
[17:10] finally wrote my thing about using JWT for sessions: https://news.ycombinator.com/item?id=11895440 [17:17] *** dashcloud has quit IRC (Ping timeout: 244 seconds) [17:21] *** dashcloud has joined #archiveteam-bs [17:27] *** RichardG has quit IRC (Ping timeout: 258 seconds) [17:36] *** whydomain has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client) [17:51] *** antomati_ has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** luckcolor has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** Lord_Nigh has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** xioustic has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** GLaDOS has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** bwn has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** godane has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** Coderjoe has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** Mayonaise has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** tfgbd_znc has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** phuzion has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** beardicus has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** botpie91 has quit IRC (ny.us.hub ircd.choopa.net) [17:51] *** Lord_Nigh has joined #archiveteam-bs [17:51] *** xioustic has joined #archiveteam-bs [17:51] *** GLaDOS has joined #archiveteam-bs [17:51] *** bwn has joined #archiveteam-bs [17:51] *** godane has joined #archiveteam-bs [17:51] *** Coderjoe has joined #archiveteam-bs [17:51] *** antomati_ has joined #archiveteam-bs [17:51] *** Mayonaise has joined #archiveteam-bs [17:51] *** tfgbd_znc has joined #archiveteam-bs [17:51] *** phuzion has joined #archiveteam-bs [17:51] *** beardicus has joined #archiveteam-bs [17:51] *** botpie91 has joined #archiveteam-bs [17:51] *** luckcolor has joined #archiveteam-bs [17:51] *** ircd.choopa.net sets mode: +o antomati_ [17:51] *** swebb sets mode: +o antomati_ [17:52] *** Mayonaise has quit IRC (Read error: Operation timed out) [17:53] *** Mayonaise has joined 
#archiveteam-bs [17:55] *** marvinw has joined #archiveteam-bs [17:56] *** Aranje has quit IRC (Quit: Three sheets to the wind) [18:00] *** Aranje has joined #archiveteam-bs [18:02] *** RichardG has joined #archiveteam-bs [18:05] *** schbirid has quit IRC (Ping timeout: 258 seconds) [18:07] *** fie_ has quit IRC (Read error: Connection reset by peer) [18:07] *** ralphdnak has quit IRC (Read error: Connection reset by peer) [18:07] *** fie_ has joined #archiveteam-bs [18:07] *** ralphdnak has joined #archiveteam-bs [18:42] *** tomwsmf-a has joined #archiveteam-bs [18:51] *** ndiddy has joined #archiveteam-bs [18:56] *** DFJustin has quit IRC (Ping timeout: 260 seconds) [18:57] *** DFJustin has joined #archiveteam-bs [18:57] *** swebb sets mode: +o DFJustin [19:25] *** Eloquence has joined #archiveteam-bs [19:41] *** schbirid has joined #archiveteam-bs [20:10] *** tomwsmf-a has quit IRC (Read error: Operation timed out) [20:27] *** dashcloud has quit IRC (Read error: Operation timed out) [20:30] *** dashcloud has joined #archiveteam-bs [21:02] *** Honno has quit IRC (Read error: Operation timed out) [21:02] *** Stilett0 has joined #archiveteam-bs [21:04] *** Stiletto has quit IRC (Ping timeout: 244 seconds) [21:08] *** tomwsmf-a has joined #archiveteam-bs [21:10] *** xXx_ndidd has joined #archiveteam-bs [21:10] *** ndizzle has joined #archiveteam-bs [21:14] *** ndiddy has quit IRC (Ping timeout: 244 seconds) [21:14] *** xXx_ndidd has quit IRC (Ping timeout: 244 seconds) [21:39] JesseW: this wiki article: http://archiveteam.org/index.php?title=Software (it's where I saw the mention for freeyourstuff first) [22:06] *** Eloquence has quit IRC (Ping timeout: 250 seconds) [22:09] dashcloud: that page mentions freeyourstuff.cc supports goodreads (although it spells it GoodReads, which may have misled you) [22:10] i'm starting to upload more kpfa mp3s [22:30] *** Eloquence has joined #archiveteam-bs [22:40] *** mr-b has quit IRC (Read error: Operation timed out) [22:50] 
*** mr-b has joined #archiveteam-bs [23:22] ah