[00:58] *** kitties has joined #archiveteam
[01:28] *** Freddo has joined #archiveteam
[01:28] *** RichardG has quit IRC (Ping timeout: 245 seconds)
[01:28] *** Freddo has quit IRC (Client Quit)
[01:52] If I have UC Berkeley Lectures sitting around.
[01:52] Where do I put them?
[01:53] I downloaded some and then never uploaded them.
[02:01] *** Burak has joined #archiveteam
[02:01] *** Svekla has quit IRC (Read error: Connection reset by peer)
[02:34] *** ItsYoda has quit IRC (Ping timeout: 260 seconds)
[03:38] *** ItsYoda has joined #archiveteam
[04:34] *** conradev has quit IRC (Quit: ...)
[04:37] *** conradev has joined #archiveteam
[04:43] *** qw3rty115 has joined #archiveteam
[04:46] *** ItsYoda has quit IRC (Ping timeout: 260 seconds)
[04:46] *** qw3rty114 has quit IRC (Read error: Operation timed out)
[05:13] *** ranma has quit IRC (Ping timeout: 260 seconds)
[05:15] *** ItsYoda has joined #archiveteam
[05:33] *** vitzli has joined #archiveteam
[06:02] *** kitties has quit IRC (Quit: Connection closed for inactivity)
[06:07] *** indrora has joined #archiveteam
[06:08] Wikispaces declared Jul 31 as the day all non-private wikis are going down forever. Turns out their API allows for a sitemap.xml which is as complete as I can surmise, which makes it good for scraping.
[06:11] There is one problem: For huge sites, it returns not one single sitemap, but a "multipart" sitemap
[06:16] (why not use --mirror? Because AJAX)
[06:40] *** vitzli has quit IRC (Quit: Leaving)
[08:51] *** bRick5772 has joined #archiveteam
[09:03] *** schbirid has joined #archiveteam
[10:31] *** paparus has joined #archiveteam
[10:32] *** paparus has quit IRC (Client Quit)
[10:32] *** Stiletto has joined #archiveteam
[10:32] *** muramasa has quit IRC (Read error: Operation timed out)
[10:33] *** Stilett0 has quit IRC (Read error: Operation timed out)
[10:34] *** muramasa has joined #archiveteam
[10:34] *** BlueMax has quit IRC (Leaving)
[14:05] *** RichardG has joined #archiveteam
[14:59] *** ranavalon has joined #archiveteam
[15:01] *** ranavalon has quit IRC (Client Quit)
[15:08] *** Pixi has quit IRC (Quit: Pixi)
[15:08] *** Pixi has joined #archiveteam
[16:07] Wikispaces should be the thing we go after
[16:07] It's awful
[16:28] *** Burak has quit IRC (Read error: Connection reset by peer)
[16:28] *** Burak has joined #archiveteam
[16:29] *** atrocity has quit IRC (Read error: Connection reset by peer)
[16:59] *** RichardG has quit IRC (Read error: Operation timed out)
[17:03] *** RichardG has joined #archiveteam
[17:06] *** djbeadle has joined #archiveteam
[17:10] *** RichardG has quit IRC (Read error: Connection reset by peer)
[17:11] *** RichardG has joined #archiveteam
[18:30] *** Mateon1 has quit IRC (Read error: Operation timed out)
[18:31] *** Mateon1 has joined #archiveteam
[18:48] *** RichardG has quit IRC (Read error: Connection reset by peer)
[18:49] *** RichardG has joined #archiveteam
[18:51] *** ats has quit IRC (Read error: Operation timed out)
[18:51] *** ats has joined #archiveteam
[19:17] *** obelisk has joined #archiveteam
[19:29] *** schbirid has quit IRC (Leaving)
[19:29] *** schbirid has joined #archiveteam
[20:06] *** BlueMax has joined #archiveteam
[20:29] *** ___ has joined #archiveteam
[20:30] *** ___ has quit IRC (Client Quit)
[20:32] *** octothorp has quit IRC (Read error: Connection reset by peer)
[20:33] *** octothorp has joined #archiveteam
[20:39] *** sekolyn has joined #archiveteam
[20:39] *** octothorp has quit IRC (Read error: Connection reset by peer)
[20:44] *** |Ripley| has quit IRC (Quit: ZNC 1.6.3 - http://znc.in)
[20:55] *** lexiconda has joined #archiveteam
[20:55] *** lexiconda is now known as lexicon
[21:01] *** K4k has quit IRC (Read error: Connection reset by peer)
[21:11] *** obelisk has quit IRC (Remote host closed the connection)
[21:20] okay, so I have some terrible bash logic that checks if the sitemap is multipart and does some terrible egrep/sed pipelining to get the "complete" sitemap
[21:21] ` grep -q 'sitemap.complete `
[21:32] *** |Ripley| has joined #archiveteam
[21:57] *** bRick5772 has quit IRC (Quit: Leaving.)
[22:19] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[22:26] *** djbeadle has quit IRC (djbeadle)
[22:48] *** dogsrcool has quit IRC (Quit: Ping timeout (120 seconds))
[22:49] *** dogsrcool has joined #archiveteam
[22:59] *** robink has quit IRC (Read error: Connection reset by peer)
[23:01] *** robink has joined #archiveteam
[23:08] Okay, now to make wget ignore the terrible in-browser JS hackery
[23:31] *** jschwart has quit IRC (Konversation terminated!)
[23:40] *** godane has quit IRC (Read error: Operation timed out)
[23:40] Fantastic, figured that one out.
[23:43] *** godane has joined #archiveteam
[23:45] indrora: Let's move this to #archiveteam-bs please. This channel is mostly intended for announcements.
[23:46] (As in, "ohshitohshitohshit this site is going down!!"-type messages.)
[23:47] *** godane has quit IRC (Client Quit)
[23:48] *** godane has joined #archiveteam
[23:49] *** robink has quit IRC (Read error: Connection reset by peer)
[23:51] *** robink has joined #archiveteam
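
Note on the "multipart" sitemap mentioned at 06:11 and 21:20: the bash/egrep/sed pipeline itself is not shown in the log, so the following is only a rough sketch of the same idea in Python, not the script that was actually used. It assumes the multipart response follows the standard sitemaps.org layout, where a `<sitemapindex>` lists child sitemaps and each child is a `<urlset>` of page URLs; whether Wikispaces served exactly that is not confirmed here. The names `collect_urls`, `fetch`, `INDEX` and `PARTS`, and the `example.invalid` URLs, are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def child_locs(root: ET.Element, parent_name: str) -> List[str]:
    """Collect <loc> text from each direct child of root named parent_name."""
    found = []
    for parent in root:
        if local_name(parent.tag) != parent_name:
            continue
        for child in parent:
            if local_name(child.tag) == "loc" and child.text:
                found.append(child.text.strip())
    return found


def collect_urls(xml_text: str, fetch: Callable[[str], str]) -> List[str]:
    """Return every page URL, following a sitemap index if there is one.

    fetch is a caller-supplied (hypothetical) function that takes a sitemap
    URL and returns its XML as text; it is injected so the logic can be
    tried without touching the network.
    """
    root = ET.fromstring(xml_text)
    if local_name(root.tag) == "sitemapindex":
        # Multipart case: each <sitemap><loc> points at another sitemap.
        urls: List[str] = []
        for part_url in child_locs(root, "sitemap"):
            urls.extend(collect_urls(fetch(part_url), fetch))
        return urls
    # Single-part case: a plain <urlset> of <url><loc> entries.
    return child_locs(root, "url")


# Made-up sample data standing in for real responses.
NS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
INDEX = (
    f"<sitemapindex {NS}>"
    "<sitemap><loc>https://example.invalid/sitemap-1.xml</loc></sitemap>"
    "<sitemap><loc>https://example.invalid/sitemap-2.xml</loc></sitemap>"
    "</sitemapindex>"
)
PARTS = {
    "https://example.invalid/sitemap-1.xml":
        f"<urlset {NS}><url><loc>https://example.invalid/home</loc></url></urlset>",
    "https://example.invalid/sitemap-2.xml":
        f"<urlset {NS}><url><loc>https://example.invalid/page/a</loc></url>"
        f"<url><loc>https://example.invalid/page/b</loc></url></urlset>",
}

if __name__ == "__main__":
    for url in collect_urls(INDEX, PARTS.__getitem__):
        print(url)
```

The flat list this produces could be written to a file and handed to a downloader as a URL list, which sidesteps link-following on pages that build their navigation with AJAX. Parsing the XML rather than grepping it is a deliberate swap from the approach described in the log, chosen so that line wrapping or attribute order in the sitemap does not matter.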