#archiveteam 2013-01-12,Sat

Time Nickname Message
04:57 🔗 omf_ I just tried the tracker site and got a blank page. Is it down?
05:27 🔗 S[h]O[r]T godane if you give me the g4tv url list ill download it all
08:17 🔗 godane1 S[h]O[r]T: https://archive.org/details/g4tv.com-video-url-list-1
08:17 🔗 godane1 i uploaded the list
10:55 🔗 IR5611 www.jizzday.com
11:04 🔗 GLaDOS I wouldn't set jizzday up on an autoban list..
12:39 🔗 SketchCow hi.
12:39 🔗 SketchCow we are bringing back the JSTOR downloader
12:39 🔗 SketchCow for aaron.
12:40 🔗 SketchCow I believe alard and underscor have the code?
13:09 🔗 alard SketchCow: I must have it somewhere, yes.
13:19 🔗 SketchCow let's do it.
13:23 🔗 Cameron_D yes, when I first heard about it I wondered if we'd ever completed the JSTOR stuff
13:23 🔗 Cameron_D so lets do that
13:44 🔗 SketchCow get the code, prep the bookmarklet. please, someone check if jstor changed tos to address what we are doing.
13:45 🔗 SketchCow ill provide a non archive.org box access for this.
13:45 🔗 SketchCow And after I nap a little, I will write verbiage for the page.
13:47 🔗 GLaDOS (d) undertake any activity such as computer programs that automatically download or export Content, commonly known as web robots, spiders, crawlers, wanderers or accelerators that may interfere with, disrupt or otherwise burden the JSTOR server(s) or any third-party server(s) being used or accessed in connection with JSTOR
13:48 🔗 GLaDOS The only part in the prohibited activities clause which could conflict.
13:48 🔗 kennethre odd timing http://www.webpronews.com/jstor-opens-up-its-archive-kinda-sorta-2013-01
13:48 🔗 Cameron_D Won't be automated, IIRC it was something that had to be manually triggered for each file
13:49 🔗 GLaDOS AFAIK, we're ok
13:49 🔗 Cameron_D At least, bookmarklet implies that
13:53 🔗 SketchCow ok.
13:53 🔗 SketchCow did it change? is that old or new info?
13:53 🔗 GLaDOS Just fetched it
13:54 🔗 GLaDOS Wait
13:54 🔗 GLaDOS "The Content can be read online (but not printed or downloaded) as further described in Section 2.1 below."
13:55 🔗 godane i'm capturing thefeed from g4tv.com and its taking a very long time
13:56 🔗 godane this capture is without the images in it
13:56 🔗 godane its readly 733mb
13:56 🔗 godane *alreadly
13:57 🔗 GLaDOS (f) download or print, or attempt to download or print an entire issue of a journal (unless such entire issue has been purchased through the Publisher Sales Service) or substantial portions of the entire run of a journal, except for the specific case in which the complete contents of a journal issue or a substantial portion of Textual Content (e.g. a series of scholarly essays) is relevant to the particular research
13:57 🔗 GLaDOS (c) incorporate Content into an unrestricted database or website, except that authors or other Content creators may incorporate their Content into such sites with prior permission from the publisher and other applicable rights holders
13:57 🔗 GLaDOS Any of these new?
14:00 🔗 SketchCow check wayback
14:02 🔗 alard https://twitter.com/JSTOR/status/174155323668574208
14:03 🔗 GLaDOS Newest version in wayback is may 31
14:03 🔗 GLaDOS http://web.archive.org/web/20120531065004/http://about.jstor.org/participate-jstor/individuals/early-journal-content
14:03 🔗 GLaDOS Wait, mind mixed order of messages up
14:04 🔗 GLaDOS Blocked by robots.txt
14:04 🔗 Cameron_D http://www.jstor.org/robots.txt it won't be in wayback?
14:10 🔗 SketchCow I don't want to move too rashly on this. I've done that in the past, not always for good.
14:11 🔗 SketchCow a part of me wants to make it so it violates the agreement, so thousands of people commit the felony.
14:11 🔗 SketchCow ok, rest
14:11 🔗 Cameron_D Yeah, and looking at point (c) we may not be able to, although there are no past versions of the ToC to compare to
14:59 🔗 balrog_ I'm wondering if something exists that just stores any PDFs you're viewing in browser together with a little bit of metadata
15:40 🔗 riordan Is this where OpAaronSW is going down?
15:47 🔗 balrog_ to some extent
17:27 🔗 SketchCow I've put a slight waiting period on it to understand the best thing to do.
17:27 🔗 SketchCow But I want his stuff in away from keyboard on archive.org, so we are definitely doing that.
17:45 🔗 riordan SketchCow: totally - thank you man
18:18 🔗 godane uploaded: http://archive.org/details/www.aaronsw.com-20130112-mirror
21:46 🔗 SketchCow Hi.
21:46 🔗 SketchCow OK, so.
21:49 🔗 SketchCow #1. He deleted some sites, before hanging himself.
21:49 🔗 SketchCow #2. Making a collection now.
21:50 🔗 SketchCow #3. Soooooo angry still, but running out of people to blame
21:52 🔗 SketchCow I've cooked up a plan, working it out with alard.
21:53 🔗 SketchCow Here's the plan.
21:53 🔗 SketchCow Bookmarklet, like the JSTOR downloader. You run it, and it downloads one document.
21:53 🔗 SketchCow You write something about aaron when you do it.
21:53 🔗 SketchCow And so it gets uploaded, with your memorial.
21:53 🔗 SketchCow Then everyone commits a felony
21:53 🔗 SketchCow And says their peace.
21:58 🔗 chronomex nice
22:02 🔗 dashcloud SketchCow: if the goal is to download everything, can't we just have something that would take a group of people months to complete (i.e, low profile enough to avoid detection until the end?)
22:06 🔗 alard One document seems like a nice idea. So people can also leave their name and a message?
22:06 🔗 alard (They'll still have to install the bookmarklet, even if there's only one document.)
22:28 🔗 balrog_ alard: I'd like to see something I mentioned above to be done
22:28 🔗 balrog_ basically like RECAP but for more than just PACER
22:41 🔗 SketchCow yes
22:42 🔗 balrog_ it bothers me greatly when PDFs (and content in general) that I browsed when doing research even recently goes dark
22:42 🔗 SketchCow dashcloud: goal is not torape jstor todeath
22:42 🔗 SketchCow sorry, ipad
22:43 🔗 balrog_ often a lot of the older stuff is on very sketchy sites to begin with :/
22:43 🔗 balrog_ look at chip datasheets for example...
22:48 🔗 philpem yeah, the EAB archive is one such site
22:48 🔗 philpem bloody huge collection of databooks and so on, sitting behind someone's cable modem.
22:48 🔗 philpem if I had the details of the guy who ran it, I'd offer to send hima a
22:49 🔗 philpem *him a Peli hardcase and a bunch of hard drives in exchange for a copy.
22:49 🔗 SketchCow n 20
22:51 🔗 SketchCow OK, so, this is what I would like.
22:51 🔗 SketchCow 1. JSTOR bookmarklet. You add it, click it, and it downloads the article, asking you for a message about aaron.
22:52 🔗 SketchCow 2. If someone has a virtual instance alard can use, I'd like you to coordinate with him. He has a lot done.
22:52 🔗 SketchCow 3. When the bookmarklet is used again, banner thanking people, and then a link to the Wikipedia article on Aaron.
22:52 🔗 SketchCow Make sense?
22:58 🔗 fault I've got some server capacity that can be used
23:01 🔗 fault Send me a message if you need somewhere to dump it, I can set up nginx/cgi, whatever stack you need
23:05 🔗 alard Actually, it's almost bedtime for me. I have little time tomorrow. So if there's anyone who wants to take over, please do.
23:05 🔗 alard I've done the following so far:
23:06 🔗 alard There's a bookmarklet that does a form POST with the PDF and the message to a script somewhere. What's needed is a server-side thing that receives the POST data, stores it and adds it to the memorial page.
23:10 🔗 chronomex this seems like a fit for tracker.archiveteam.org
23:13 🔗 SketchCow Who can take over?
23:53 🔗 Nemo_bis Some warrior instances getting killed for not enough memory.
23:54 🔗 Nemo_bis Ah, looks like I lost a user of which I downloaded some 10-15 GiB.
