[01:08] uh-oh
[01:08] yahoo takes another victim: http://ptch.com/
[01:11] it is shutting down already
[01:11] wtf yahoo
[01:12] bought 03-dec, closing 02-jan
[01:13] classic move
[02:20] damn, ptch does that infinite scroll shit
[02:21] guess it's time to break out the phantomjs
[02:39] Erm... I'm very new here -- how would I check if this is the JSON for all of ptches or just some of them?
[02:39] http://ptch.com/v1/user/2/ptchs
[02:54] Mooring: only way I know of is to compare what's in the JSON vs. what shows up if you fully scroll a ptch
[03:09] 'k. Thanks. (Also, arg. Only the first bit.)
[03:10] uploaded: https://archive.org/details/cdrom-twilight-027
[03:11] internet archive has a 3-to-1 match on donations right now
[03:13] "your life is a story. share it." ... this sounds familiar. like another closed site had this.
[03:13] so, yahoo bought it with the intention of killing it?
[03:13] yay.
[03:20] !ao http://modernwomandigest.com/five-reasons-glad-paul-walker-dead/
[03:21] !a http://modernwomandigest.com/
[03:22] oops
[03:22] i put into the right channel
[03:40] Well, maybe we want to archive that.
[03:42] it was put into the archivebot channel
[03:42] so it's being archived
[03:47] lol @ the giant editorial additions at the head and tail
[03:47] hahah
[03:47] "Your STUPID as fuck!"
[03:49] ew. godaddy
[05:12] http://www.fuzzymemories.tv/
[05:46] suggestions for ptch channel: #ptchin #ptchitout #ptchhistory
[06:32] http://screenshotsdatabase.com/ download will be done today!
[08:13] ptch ..highest API id I found seems to be http://ptch.com/v1/user/10257/ptchs
[08:13] (if they're sequential... user ids... )
[08:58] deathy: there's more than that
[08:58] deathy: see e.g. http://ptch.com/profile/34176
[09:02] right, so there might be some possibly big gaps (got quite a few empties when randomly trying)
[09:04] also https://github.com/damianw/pyptch ..maybe useful
[10:47] Do we have a ptch channel yet?
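Editor's note on the user-id discussion above (08:13 to 09:02): if the ids really are sequential with gaps, the candidate URLs can simply be generated for a range and each one checked for content later. A minimal sketch, assuming the two URL patterns pasted in the log are representative. `candidate_urls` and the template names are hypothetical, nothing here is a documented ptch API, and the true upper bound on ids is unknown.

```python
from typing import Iterator

# Assumption: both patterns are copied from links pasted in the log.
API_TEMPLATE = "http://ptch.com/v1/user/{user_id}/ptchs"
PROFILE_TEMPLATE = "http://ptch.com/profile/{user_id}"


def candidate_urls(first_id: int, last_id: int) -> Iterator[str]:
    """Yield the API URL and the profile URL for every id in [first_id, last_id]."""
    if first_id < 1 or last_id < first_id:
        raise ValueError("expected 1 <= first_id <= last_id")
    for user_id in range(first_id, last_id + 1):
        yield API_TEMPLATE.format(user_id=user_id)
        yield PROFILE_TEMPLATE.format(user_id=user_id)


if __name__ == "__main__":
    # Small range for illustration only; 34176 is merely the highest id
    # mentioned in the log, not a known maximum.
    for url in candidate_urls(1, 3):
        print(url)
```

The sketch only builds a URL list; it does no fetching, so empty ids and the scroll-loaded content mentioned in the log still have to be handled by whatever downloads the list.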
[10:47] Or is it going to be #btch like destiny requires
[11:25] warhammeronline is downloading loops over here... :P
[11:25] after a while they also stop again I've noticed
[11:25] so I'll just let it run without excluding anything, since I have till the 18th of december.
[13:53] http://www.newyorker.com/online/blogs/books/2013/12/the-art-of-google-book-scan.html
[14:20] Paul Souellis is my girlfriend's chorus
[14:20] Cousin
[14:20] We spoke on the same stage in Belfast
[14:54] Chorus sounds much more entertaining
[15:04] so
[15:05] to download Ptch we only have to do
[15:06] download this:
[15:06] http://ptch.com/profile/1
[15:06] http://ptch.com/profile/2
[15:06] etc.
[15:06] that's it?
[15:08] On initial view
[15:08] We'll spend a few more days ensuring that's right
[15:08] then go for it
[15:09] there are also /group
[15:10] http://ptch.com/group/louisville/
[15:12] i'm uploading 2011 issues of edn magazines
[15:13] bad news is i don't have march issues of that year
[15:13] very weird when i can find everything else but not those
[15:13] ok, we'll see
[15:13] :)
[15:13] looking forward to the ptch download
[16:04] SketchCow: I'm uploading more wilkow episodes from theblaze network
[16:04] Brilliant
[16:05] i figure wilkow i can start doing since glennbecktv collection is more up to date
[16:06] the fun part is wilkow is going to be in the glennbecktv collection cause videos are 3 hours long as of june 2013
[16:08] also know i'm grabbing liberty treehouse now
[16:08] i got october before it was removed
[16:09] i'm starting to grab that now cause it's been promoted to the weekends
[16:12] the 'promoted to weekends' line came from this: https://www.youtube.com/watch?v=tiY228BmsCE
[16:47] FYI: I just limited the maxPathDept of the www.warhammeronline.com crawl to 10 segments, since it was making an infinite loop
[16:47] current status of webmonkey:
[16:47] --
[16:47] 22.344 downloaded + 8.093 queued = 30.437 total
[16:47] --
[16:47] 4.3 GiB crawled (4.3 GiB novel, 0 B dupByHash, 0 B notModified)
[16:48] and screenshotsdatabase.com:
[16:48] --
[16:48] 20.019 downloaded + 16.335 queued = 36.354 total
[16:48] 480 MiB crawled (480 MiB novel, 0 B dupByHash, 0 B notModified)
[16:48] --
[16:50] and as last www.warhammeronline.com:
[16:50] --
[16:50] 6.0 GiB crawled (6.0 GiB novel, 0 B dupByHash, 0 B notModified)
[16:50] 8.131 downloaded + 50.273 queued = 58.404 total
[16:50] --
[18:37] anyone care about SAS? they have a big FTP site http://ftp.sas.com/techsup/download/
[18:43] i'm mirroring it
[18:48] ivan`: how big is this ftp site?
[18:48] just want to know cause if its too big for me then i will send it to archivebot
[18:48] I was worried it was too big for archivebot
[18:50] let me ls -lR it
[18:51] yeah I think that's more of a sketchcow job
[18:51] i sent in to archivebot
[18:51] i have mirror websites with archivebot that are 30gb
[18:51] I have a few big projects running now, but when I'm finished with some of those, I can maybe take that website?
[18:56] SketchCow: ftp://ftp.sas.com/ 250GB
[18:57] ivan`: how do you know that?
[18:59] lftp ftp.sas.com
[18:59] ls -lR > sas-lslR
[18:59] cat sas-lslR | awk '{ print $5 }' | awk '{total = total + $1}END{print total}'
[19:01] wget I guess?
[20:10] Arkiver2: that's not right
[20:10] re: ptch
[20:10] you don't actually get anything by downloading just the profile; there's additional things that are loaded in on scroll
[20:12] it's also worth it to click each ptch
[20:12] since there's likes and comments on them that aren't displayed in the profile
[20:12] * yipdw is working on a way to do this with phantomjs, it's kind of a mess though
[20:12] yipdw: ptch related...wrote some things I found in the API in the #btch channel ..
[20:12] oh, that's the channel?
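Editor's note on the size estimate at 18:59: the awk pipeline adds up field 5 of every line in the saved `ls -lR` output. The same idea can be sketched in Python, assuming the server returns a standard Unix long listing with the size in the fifth column. `total_bytes` is a hypothetical helper, and the sample listing is invented for illustration. Unlike the awk one-liner, it counts only regular-file entries and skips directory lines.

```python
from typing import Iterable


def total_bytes(listing_lines: Iterable[str]) -> int:
    """Sum the size column of regular-file entries in `ls -lR` style output."""
    total = 0
    for line in listing_lines:
        fields = line.split()
        # A long-format entry has at least 9 fields:
        # perms, links, owner, group, size, month, day, time-or-year, name.
        # Headers such as "./dir:" and "total 8" are shorter and get skipped.
        if len(fields) < 9:
            continue
        # Count regular files only; the awk version also adds directory sizes.
        if not line.startswith("-"):
            continue
        if fields[4].isdigit():
            total += int(fields[4])
    return total


if __name__ == "__main__":
    # Invented sample listing, not taken from ftp.sas.com.
    sample = [
        "./techsup:",
        "total 8",
        "drwxr-xr-x 2 ftp ftp 4096 Dec 01 2013 download",
        "-rw-r--r-- 1 ftp ftp 1500 Dec 01 2013 readme.txt",
        "-rw-r--r-- 1 ftp ftp 2500 Nov 30 2013 notes with spaces.txt",
    ]
    print(total_bytes(sample), "bytes")
```

For the file saved in the log, something like `total_bytes(open("sas-lslR", errors="replace"))` would apply, provided the listing really is in this format; servers that omit the group column would need a different field index.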