[00:01] *** Ravenloft has quit IRC (Ping timeout: 378 seconds)
[00:15] so i set up an old computer for my brother for his business/office room
[00:15] fun fact: i can only get internet from wireless
[00:16] the wired line to the wireless router was not working
[00:16] *** cbb2 has joined #archiveteam
[00:19] *** cbb has quit IRC (Ping timeout: 633 seconds)
[00:33] *** cbb2 has quit IRC (Quit: Nettalk6 - www.ntalk.de)
[00:36] *** wp494 has quit IRC (Read error: Operation timed out)
[00:40] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[00:40] *** ruukasu has joined #archiveteam
[00:49] *** wp494 has joined #archiveteam
[00:49] *** signius has quit IRC (Ping timeout: 480 seconds)
[00:59] *** signius has joined #archiveteam
[01:04] the archivebot run of my site oasisjournals.com seems to be pulling in a lot of things not from my domain
[01:04] is that expected?
[01:04] job 9tchq93k6q83xjzok0rul5a8a
[01:32] Yes, that is the expected behavior.
[01:44] *** vertice32 has quit IRC (Remote host closed the connection)
[01:44] *** primus104 has quit IRC (Leaving.)
[02:01] *** philpem has quit IRC (Ping timeout: 272 seconds)
[02:05] *** ATZ0 has joined #archiveteam
[02:27] *** schbirid2 has joined #archiveteam
[02:30] *** schbirid has quit IRC (Read error: Operation timed out)
[02:32] *** JonimusP is now known as Jonimus
[03:01] *** mistym has quit IRC (Remote host closed the connection)
[03:19] *** Jonimus is now known as JonimusP
[03:19] *** JonimusP is now known as Jonimus
[03:21] *** Jonimus is now known as JonimusP
[03:23] *** JonimusP is now known as Jonimus
[03:31] *** Aranje has quit IRC (Read error: Connection reset by peer)
[03:31] *** Aranje has joined #archiveteam
[03:48] *** kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
[03:48] *** Ymgve has quit IRC ()
[04:48] *** aaaaaaaaa has quit IRC (Leaving)
[05:11] *** Lord_Nigh has quit IRC (Read error: Operation timed out)
[05:13] *** Lord_Nigh has joined #archiveteam
[05:14] *** balrog sets mode: +o Lord_Nigh
[05:58] *** mistym has joined #archiveteam
[07:58] *** mistym has quit IRC (Remote host closed the connection)
[09:21] *** primus104 has joined #archiveteam
[09:59] *** kris33 has joined #archiveteam
[10:04] *** primus104 has quit IRC (Ping timeout: 1757 seconds)
[11:21] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[11:46] http://rt.com/usa/207443-kim-dotcom-megaupload-fugitives/ mega.co.nz is probably going to explode before long.
[11:54] *** kris33 has quit IRC (Read error: Connection reset by peer)
[11:56] *** kris33 has joined #archiveteam
[12:13] *** kris33 has quit IRC (Ping timeout: 512 seconds)
[12:13] *** kris33 has joined #archiveteam
[12:22] *** kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
[13:04] *** Ymgve has joined #archiveteam
[13:19] *** GLaDOS has quit IRC (Ping timeout: 272 seconds)
[13:19] *** GLaDOS has joined #archiveteam
[13:24] *** ruukasu has joined #archiveteam
[13:31] *** w0rp has quit IRC (Remote host closed the connection)
[13:33] *** w0rp has joined #archiveteam
[14:05] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[14:05] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[14:06] *** ruukasu has joined #archiveteam
[14:07] *** dashcloud has joined #archiveteam
[14:10] *** sankin has joined #archiveteam
[14:14] *** philpem has joined #archiveteam
[14:35] *** K4k has joined #archiveteam
[14:39] *** Boppen has quit IRC (Read error: Connection reset by peer)
[14:40] *** Boppen has joined #archiveteam
[14:52] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[15:33] *** xk_id has quit IRC (Ping timeout: 852 seconds)
[15:42] *** mistym has joined #archiveteam
[15:48] *** mistym has quit IRC (Remote host closed the connection)
[16:05] *** mistym has joined #archiveteam
[16:26] sounds like it's just about whether the US can keep assets they already seized? not sure how that would kill mega
[16:40] *** K4k has quit IRC (Read error: Operation timed out)
[16:41] *** MMovie has joined #archiveteam
[16:43] *** MMovie1 has quit IRC (Read error: Operation timed out)
[16:48] howdy everyone -- I mentioned this a little while back, but we're sunsetting meetup.com/everywhere on Dec. 1st. Anything I can do to assist in archiving?
[16:49] http://www.forbes.com/sites/parmyolson/2014/11/20/the-largest-cyber-attack-in-history-has-been-hitting-hong-kong-sites/
[16:53] *** parsons_ is now known as parsons
[16:57] w0rp: if dotcom is smart the mega stuff is not under his name or purview, he's just an employee and it should be unseizable
[16:57] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[16:58] also i see http://dndtools.eu/ got cease-and-desisted to death
[17:03] https://dl.dropboxusercontent.com/s/8uwuvhbg8cc7y5m/dnd.zip?dl=1&token_hash=AAGEpJ6AE0ROuCSuoWThKPfpCHQ_Wuvfg_t8cNCtfKAOdg was (and is still up) a copy of their backend sqlite db
[17:03] *** bebzol has quit IRC (Ping timeout: 480 seconds)
[17:07] so i guess between that and the django application source (which is on github) a good portion of the site could be rebuilt?
[17:10] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[17:11] *** mistym has quit IRC (Remote host closed the connection)
[17:12] *** dashcloud has joined #archiveteam
[17:20] *** signius has quit IRC (Ping timeout: 480 seconds)
[17:23] *** GLaDOS has quit IRC (Ping timeout: 272 seconds)
[17:23] *** GLaDOS has joined #archiveteam
[17:26] *** ruukasu has joined #archiveteam
[17:28] *** godane has quit IRC (Read error: Operation timed out)
[17:29] *** signius has joined #archiveteam
[17:37] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[17:52] *** ruukasu has joined #archiveteam
[17:53] *** philpem has quit IRC (Ping timeout: 272 seconds)
[18:12] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[18:13] *** dashcloud has joined #archiveteam
[18:15] Lord_Nigh: Boo on WotC!
[18:16] *** mistym has joined #archiveteam
[18:25] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[18:25] *** dashcloud has joined #archiveteam
[18:34] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[18:40] *** philpem has joined #archiveteam
[18:43] huh
[18:43] [17:48] howdy everyone -- I mentioned this a little while back, but we're sunsetting meetup.com/everywhere on Dec. 1st. Anything I can do to assist in archiving?
[18:43] did anybody respond to that?
[18:43] no
[18:46] oh, you're still here without underscore :P
[18:46] haven't been paying attention - what exactly is going on with meetup?
[18:46] oh, there's a footer, sec
[18:48] yipdw: ivan`: archivebot'able?
[18:48] chfoo: perhaps
[18:48] meetup.com
[18:48] uh
[18:48] chfoo perhaps *
[18:48] I don't think so
[18:48] yipdw: specifically /everywhere
[18:48] oh
[18:48] see above
[18:48] yeah, not the whole site
[18:49] /everywhere is relatively small
[18:49] oh that's much more doable
[18:49] yeah, you can grab /everywhere
[18:49] provided that the rest of the site is going to stay around
[18:49] *** dashcloud has joined #archiveteam
[18:49] yeah, rest of the site will stay around
[18:50] I don't even think we store any photos for everywhere groups/events
[18:50] one thing to be aware of is that archivebot won't grab the individual meetups because they're not suffixed with /everywhere/, but so long as those URLs stay stable it's ok
[18:50] also meetup.com/Evernote/ is consistently crashing chromium on my system, lol
[18:51] oh dear
[18:51] well, meetup.com/Evernote/ or meetup.com/Coursera will cease to exist
[18:52] parsons: am I correct in that you can't tell from the URL alone whether something is Meetup Everywhere or regular Meetup?
[18:52] actually they're all crashing chromium for me, so this is probably more of a "my system is fucked" problem
[18:52] that is correct
[18:52] hrm, that's tricky
[18:52] i wouldn't rule out some problem on our end
[18:52] parsons: BTW, any plans to add export functionality for non-everywhere groups? I have one that's been idle for a while, but I've kept it for now because I haven't gotten around to scraping everything
[18:52] they are all linked to from /everywhere
[18:52] *** Zebranky_ is now known as Zebranky
[18:53] so, here's one thing we can do
[18:53] parsons: the /everywhere groups - are they just the ones listed on /everywhere or is there a "view more" link somewhere?
[18:53] 1: !a http://www.meetup.com/everywhere/ --no-offsite-links
[18:53] 2: parse out references to meetups
[18:53] 3: fetch (2)
[18:54] if /everywhere/ is really that small (1) should not take much time
[18:54] wouldn't a quick curl | grep > urls.txt and then !a < do the job?
[18:54] yes provided that meetups are a single page
[18:54] I don't know if that's true
[18:54] oh right, you don't have !a it could be added but I really didn't want to because it'd end up making a single pipeline worker do them all
[18:54] because I suspect subpages will live under the main URL of a group
[18:54] a smarter solution is needed for !a
[18:55] heh
[18:55] fair enough
[18:55] yeah, that's a bit tricky then
[18:55] I'll start with grabbing /everywhere/
[18:55] yeah, local groups are like http://www.meetup.com/Coursera/Toronto-CA/
[18:55] we'll know when it's there via chfoo's archivebot viewer thing and then we can proceed from there
[18:56] ok, /everywhere/ grab is underway
[18:56] awesome
[18:56] also there is no way it actually did what I wanted it to do
[18:57] lol
[18:57] "software that actually /works/? impossible!"
[18:57] oh wait
[18:57] I see
[18:57] this is not so bad
[18:57] /everywhere/ links to e.g.
http://www.meetup.com/occupytogether/
[18:57] but under that there exists a good hierarchy
[18:58] let's try one of the smaller ones, /ChessCom.
[18:58] /
[18:59] Zebranky_: no plans for export functionality that I know of
[19:00] Oh well, I'll figure something out. Thanks anyway!
[19:01] ooh, we do have an api
[19:01] so, it may not be that hard
[19:01] http://www.meetup.com/meetup_api/
[19:03] I think I got this meetup everywhere thing
[19:03] one moment
[19:03] That probably covers most things, true. I'll note that it doesn't cover the organizer "money" view, which I did write something to scrape... 2.5 years ago, when this was a thing (wow, I should *really* close it)
[19:03] *** cbb has joined #archiveteam
[19:04] hehe
[19:05] *** ruukasu has quit IRC (Ping timeout: 265 seconds)
[19:07] wtf, is there some code on meetup everywhere that disables web inspectors
[19:07] oh no it's just my system being fucky again
[19:07] ah ok, was worried
[19:08] we're putting more and more methods into the api to support our native apps, so I'm sure it's just a matter of time
[19:09] ok, so
[19:09] I can just !a all of these
[19:09] https://gist.github.com/anonymous/a73b819fb9a3593c52b6
[19:09] does this look complete?
[19:09] I will double check
[19:09] nblr has generously donated an archivebot node so we'll have capacity for that too
[19:10] I think I'll run these without offsite link fetch; this will still fetch page requisites (e.g. meetupstatic images)
[19:12] hmm, here's what I dug up about a month ago: "There are 6,623 communities, 102,538 local groups, and 217,979 events"
[19:13] so, I will find a more complete list
[19:13] *** dashcloud has quit IRC (Read error: Operation timed out)
[19:13] *** cbb has quit IRC (Read error: Operation timed out)
[19:14] *** cbb has joined #archiveteam
[19:15] parsons: for meetup everywhere?
[19:15] oh
[19:15] I just pulled what I could get off the front page
[19:17] yeah, I thought we'd have a complete index, but it doesn't look like it
[19:17] I can get a master list though
[19:18] ok
[19:19] if it's in the 10^3 range or higher it might get trickier
[19:19] unless most of these are super-small
[19:19] (maybe the index only shows the big ones?)
[19:19] *** dashcloud has joined #archiveteam
[19:21] yeah, it probably shows the biggest ones
[19:21] *** godane has joined #archiveteam
[19:22] here's the full list: https://gist.github.com/adrianparsons/74318cae806fc3ada182
[19:22] most are probably really small
[19:22] yeah
[19:22] this is probably one of those weird cases where this is actually archivebottable and big
[19:22] ok
[19:23] *** Kenshin has quit IRC (Read error: Operation timed out)
[19:23] I can get on that in a bit, or feel free to join #archivebot and add the !a lines yourself
[19:23] *** Kenshin has joined #archiveteam
[19:24] cool! I'll see what I can do (not familiar with the syntax)
[19:24] also, heading out for a bit.
I'll be back in this room, but if anyone needs to get in touch I'm adrian@adrianparsons.com
[19:24] *** cbb has quit IRC (Read error: Operation timed out)
[19:24] parsons: we have some docs at http://archivebot.readthedocs.org/en/latest/
[19:25] excellent, thanks
[19:25] *** dashcloud has quit IRC (Read error: Operation timed out)
[19:25] *** cbb has joined #archiveteam
[19:27] *** dashcloud has joined #archiveteam
[19:36] *** cbb has quit IRC (Read error: Operation timed out)
[19:37] *** cbb has joined #archiveteam
[19:44] *** ATZ0 has quit IRC ()
[19:47] *** cbb has quit IRC (Read error: Operation timed out)
[19:48] *** cbb has joined #archiveteam
[19:48] *** xk_id has joined #archiveteam
[19:58] *** cbb has quit IRC (Read error: Operation timed out)
[19:59] *** cbb has joined #archiveteam
[20:07] *** brayden has quit IRC (Read error: Connection reset by peer)
[20:09] *** cbb has quit IRC (Quit: Nettalk6 - www.ntalk.de)
[20:12] *** spara0 has joined #archiveteam
[20:13] *** brayden has joined #archiveteam
[20:26] *** mistym has quit IRC (Remote host closed the connection)
[20:36] is bitcasa forums/blog/site/etc. taken care of yet?
[20:56] the dndtools sqlite thing came from the top post at http://web.archive.org/web/20140226202746/http://dndtools.eu/
[21:16] *** dashcloud has quit IRC (Read error: Operation timed out)
[21:22] *** dashcloud has joined #archiveteam
[21:30] chfoo: yipdw: ivan`: xmc: is bitcasa forums/blog/site/etc. taken care of yet
[21:30] * joepie91 is just going to add it to archivebot again if no response...
[21:30] no idea re: bitcasa
[21:31] i guess it has? http://archive.fart.website/archivebot/viewer/?q=bitcasa
[21:38] *** mistym has joined #archiveteam
[21:43] *** mistym_ has joined #archiveteam
[21:44] *** mistym has quit IRC (Read error: Operation timed out)
[21:48] ... viewer? wut
[21:48] anyway
[21:48] needs a more recent crawl of the forums
[21:52] *** [2]the_fo has joined #archiveteam
[21:53] *** sankin has quit IRC (Leaving.)
[21:57] *** the_fox_ has joined #archiveteam
[22:03] *** [2]the_fo has quit IRC (Read error: Operation timed out)
[22:03] *** ruukasu has joined #archiveteam
[22:04] *** [1]the_fo has joined #archiveteam
[22:06] *** [2]the_fo has joined #archiveteam
[22:11] *** the_fox_ has quit IRC (Read error: Connection reset by peer)
[22:11] *** [1]the_fo has quit IRC (Read error: Connection reset by peer)
[22:11] *** [2]the_fo has quit IRC (Read error: Connection reset by peer)
[22:26] *** the_fox_ has joined #archiveteam
[22:53] *** mistym__ has joined #archiveteam
[22:54] *** mistym_ has quit IRC (Read error: Operation timed out)
[23:14] *** dashcloud has quit IRC (Read error: Operation timed out)
[23:26] *** dashcloud has joined #archiveteam
[23:36] *** the_fox_ has quit IRC (Ping timeout: 492 seconds)
[23:38] *** sivoais_ has quit IRC (Ping timeout: 252 seconds)
[23:39] *** aaaaaaaaa has joined #archiveteam
[23:41] *** the_fox_ has joined #archiveteam
[23:44] *** nertzy has quit IRC (Read error: Operation timed out)
[23:47] *** dashcloud has quit IRC (Read error: Operation timed out)
[23:53] *** dashcloud has joined #archiveteam
[23:54] *** the_fox_ has quit IRC (Read error: Operation timed out)
[23:58] *** the_fox_ has joined #archiveteam
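
Note on the 18:53-18:54 plan (grab /everywhere/, parse out the group links, then queue each one): a rough Python sketch of step 2 might look like the following. It is not what anyone in the channel actually ran. The `!a URL --no-offsite-links` form is copied from the 18:53 message; the function names, the skip list, and the "exactly one path segment means a group" rule are assumptions made for illustration, and as noted at 19:17 the front page does not link every community, so the full gist list would still be needed.

```python
import re
from urllib.parse import urljoin, urlparse

# Assumption: first path segments that are site pages rather than groups.
# This set is a guess for illustration, not taken from meetup.com itself.
NON_GROUP_SEGMENTS = {"everywhere", "meetup_api"}


def extract_group_urls(html, base="http://www.meetup.com/everywhere/"):
    """Return sorted, de-duplicated top-level www.meetup.com URLs linked from html."""
    urls = set()
    for href in re.findall(r'href="([^"]+)"', html):
        parts = urlparse(urljoin(base, href))
        if parts.netloc != "www.meetup.com":
            continue  # skip offsite links
        segments = [s for s in parts.path.split("/") if s]
        # Keep only /Name/ style links; deeper paths are local groups or subpages.
        if len(segments) != 1 or segments[0] in NON_GROUP_SEGMENTS:
            continue
        urls.add("http://www.meetup.com/%s/" % segments[0])
    return sorted(urls)


def archivebot_lines(urls):
    """Format one ArchiveBot command per URL, without offsite link fetching."""
    return ["!a %s --no-offsite-links" % url for url in urls]


if __name__ == "__main__":
    sample = '<a href="/ChessCom/">x</a> <a href="/Coursera/Toronto-CA/">y</a>'
    for line in archivebot_lines(extract_group_urls(sample)):
        print(line)
```

Local groups such as /Coursera/Toronto-CA/ are skipped on purpose here, on the assumption from 18:54 that subpages live under a group's main URL and would be picked up by the recursive grab of that URL.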