#archiveteam 2015-02-27,Fri


Time Nickname Message
00:27 🔗 SmileyG has quit IRC (Remote host closed the connection)
00:28 🔗 Marc has quit IRC (Ping timeout: 240 seconds)
00:30 🔗 BlueMaxim has quit IRC (Ping timeout: 265 seconds)
00:31 🔗 Start has quit IRC (Read error: Connection reset by peer)
00:31 🔗 BlueMaxim has joined #archiveteam
00:31 🔗 Marc has joined #archiveteam
00:33 🔗 Start has joined #archiveteam
00:34 🔗 wp494 has quit IRC (Read error: Operation timed out)
00:34 🔗 arkiver Discovery of blogger is going to start tomorrow
00:35 🔗 arkiver because of the captchas popping up if we go too fast, we'll need a lot of ips
00:35 🔗 arkiver if you know people who can help us out, please let them know
00:37 🔗 Smiley has joined #archiveteam
00:39 🔗 SketchCow Ask Kenshin
00:41 🔗 arkiver SketchCow: just for confirmation, are we really going to download the full blogger?
00:41 🔗 arkiver that will be very very big
00:49 🔗 wp494 has joined #archiveteam
00:51 🔗 ohhdemgir do it
00:51 🔗 ohhdemgir arkiver, how big guesstimate
00:52 🔗 GLaDOS has joined #archiveteam
00:52 🔗 swebb sets mode: +o GLaDOS
01:11 🔗 aaaaaaaaa has quit IRC (Read error: Operation timed out)
01:13 🔗 Specular has joined #archiveteam
01:14 🔗 aaaaaaaaa has joined #archiveteam
01:15 🔗 dashcloud has quit IRC (Read error: Operation timed out)
01:21 🔗 dashcloud has joined #archiveteam
01:21 🔗 Spring has quit IRC (Read error: Operation timed out)
01:31 🔗 signius has quit IRC (Read error: Operation timed out)
01:32 🔗 primus104 has quit IRC (Leaving.)
01:35 🔗 Spring has joined #archiveteam
01:40 🔗 Specular has quit IRC (Ping timeout: 370 seconds)
01:46 🔗 signius has joined #archiveteam
02:06 🔗 Specular has joined #archiveteam
02:16 🔗 Spring has quit IRC (Read error: Operation timed out)
02:19 🔗 SketchCow We're going to try and download a lot of it
02:19 🔗 SketchCow With an eye towards blogs that match "sex", "eros", "nude"
02:26 🔗 Ymgve has quit IRC ()
02:39 🔗 RedType_ has quit IRC (Remote host closed the connection)
02:58 🔗 mistym has quit IRC (Remote host closed the connection)
03:29 🔗 mistym has joined #archiveteam
03:38 🔗 dashcloud has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 NovaKing has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 maltris has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 ionpulse has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 pikhq has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 altlabel has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 Jogie has quit IRC (hub.dk irc.homelien.no)
03:42 🔗 maltris_ has joined #archiveteam
03:45 🔗 xmc paging ohhdemgir
03:45 🔗 xmc ^
03:45 🔗 xmc it's what he was born for!
03:45 🔗 NovaKing_ has joined #archiveteam
03:47 🔗 S[h]O[r]T im ready to gear up a shit ton of machines to help :)
03:54 🔗 dashcloud has joined #archiveteam
04:03 🔗 Spring has joined #archiveteam
04:04 🔗 SN4T14 has joined #archiveteam
04:11 🔗 Specular has quit IRC (Read error: Operation timed out)
04:51 🔗 Spring has quit IRC (Ping timeout: 362 seconds)
04:58 🔗 Kenshin arkiver: i'll step in once you stabilize the code :)
05:00 🔗 mistym has quit IRC (Remote host closed the connection)
05:23 🔗 aaaaaaaaa has quit IRC (Leaving)
05:38 🔗 nwf has quit IRC (WeeChat 1.0.1)
05:39 🔗 nwf has joined #archiveteam
05:45 🔗 mistym has joined #archiveteam
05:46 🔗 Start i'm wondering when we should start projects for angelfire and tripod
05:47 🔗 Start that way we can have all three major 90s web hosts backed up
05:48 🔗 Start also, did the geocities archive include sites from geocities japan?
05:49 🔗 xmc Start: lycos, but that's on tripod now
05:54 🔗 nwf has quit IRC (Read error: Operation timed out)
05:54 🔗 nwf has joined #archiveteam
05:57 🔗 mistym has quit IRC (Remote host closed the connection)
06:03 🔗 RedType has joined #archiveteam
06:12 🔗 RedType has quit IRC (Quit: Lost terminal)
06:19 🔗 RedType has joined #archiveteam
06:23 🔗 mistym has joined #archiveteam
06:33 🔗 RedType has quit IRC (Client Quit)
07:04 🔗 Muad-Dib has quit IRC (Ping timeout: 260 seconds)
07:08 🔗 Muad-Dib has joined #archiveteam
07:13 🔗 pikhq has joined #archiveteam
07:13 🔗 altlabel has joined #archiveteam
07:13 🔗 ionpulse has joined #archiveteam
07:21 🔗 Jogie has joined #archiveteam
07:37 🔗 Emcy_ has quit IRC (Read error: Connection reset by peer)
08:12 🔗 mistym has quit IRC (Remote host closed the connection)
08:18 🔗 primus104 has joined #archiveteam
08:25 🔗 acridAxid has quit IRC (Quit: Quitting)
08:28 🔗 khaoohs_ has joined #archiveteam
08:29 🔗 dashcloud has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 wp494 has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 rejon has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Famicoma1 has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 balrog has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 lrkj has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 slash` has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Baljem has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 ats has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 espes__ has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Mayonaise has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 marvinw has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 khaoohs has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Froggypwn has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 oli has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Cameron_D has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 gibigiana has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 okeuday has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 ohhdemgir has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 ryan___ has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 thefinn93 has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 eprillios has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Rickster has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 kanzure has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 fenn has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 xmc has quit IRC (west.us.hub irc.eversible.com)
08:30 🔗 acridAxid has joined #archiveteam
08:30 🔗 wp494_ has joined #archiveteam
08:30 🔗 wp494_ has quit IRC (Excess Flood)
08:30 🔗 wp494_ has joined #archiveteam
08:31 🔗 Rickster` has joined #archiveteam
08:32 🔗 kanzure_ has joined #archiveteam
08:32 🔗 gibigian1 has joined #archiveteam
08:32 🔗 lrkj_ has joined #archiveteam
08:36 🔗 acridAxid has quit IRC (Read error: Operation timed out)
08:39 🔗 ryan__ has joined #archiveteam
08:41 🔗 acridAxid has joined #archiveteam
08:44 🔗 Rickster` is now known as Rickster
08:44 🔗 rejon has joined #archiveteam
08:44 🔗 balrog has joined #archiveteam
08:44 🔗 slash` has joined #archiveteam
08:44 🔗 thefinn93 has joined #archiveteam
08:44 🔗 Baljem has joined #archiveteam
08:44 🔗 ats has joined #archiveteam
08:44 🔗 espes__ has joined #archiveteam
08:44 🔗 Mayonaise has joined #archiveteam
08:44 🔗 marvinw has joined #archiveteam
08:44 🔗 oli has joined #archiveteam
08:44 🔗 Cameron_D has joined #archiveteam
08:44 🔗 okeuday has joined #archiveteam
08:44 🔗 xmc has joined #archiveteam
08:44 🔗 irc.eversible.com sets mode: +oo balrog xmc
08:44 🔗 swebb sets mode: +o balrog
08:44 🔗 swebb sets mode: +o xmc
08:44 🔗 balrog sets mode: +o Lord_Nigh
08:48 🔗 fenn has joined #archiveteam
08:51 🔗 Emcy_ has joined #archiveteam
08:51 🔗 fenn has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 rejon has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 balrog has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 slash` has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 Baljem has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 ats has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 espes__ has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 Mayonaise has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 marvinw has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 oli has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 Cameron_D has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 okeuday has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 thefinn93 has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 xmc has quit IRC (west.us.hub irc.eversible.com)
08:53 🔗 oli_ has joined #archiveteam
08:54 🔗 espes___ has joined #archiveteam
08:55 🔗 schbirid has joined #archiveteam
08:55 🔗 marvinw_ has joined #archiveteam
08:55 🔗 Baljem_ has joined #archiveteam
09:06 🔗 oli_ is now known as oli
09:06 🔗 eprillios has joined #archiveteam
09:10 🔗 dashcloud has joined #archiveteam
09:10 🔗 rejon has joined #archiveteam
09:10 🔗 balrog has joined #archiveteam
09:10 🔗 thefinn93 has joined #archiveteam
09:10 🔗 Mayonaise has joined #archiveteam
09:10 🔗 Cameron_D has joined #archiveteam
09:10 🔗 xmc has joined #archiveteam
09:10 🔗 irc.eversible.com sets mode: +oo balrog xmc
09:10 🔗 swebb sets mode: +o balrog
09:10 🔗 swebb sets mode: +o xmc
09:18 🔗 ats has joined #archiveteam
09:19 🔗 fenn has joined #archiveteam
09:21 🔗 primus104 has quit IRC (Leaving.)
09:38 🔗 antomatic That was quick:
09:38 🔗 antomatic http://www.engadget.com/2015/02/27/google-reverses-blogger-porn-ban/
09:46 🔗 eprillios has quit IRC (Ping timeout: 506 seconds)
09:52 🔗 eprillios has joined #archiveteam
09:59 🔗 Famicoman has joined #archiveteam
10:09 🔗 espes___ I like to imagine it's because of SketchCow shouting at them yesterday
10:12 🔗 arkiver ohhdemgir: full blogger would be many 10's of TB's I think
10:13 🔗 arkiver Kenshin S[h]O[r]T: thanks! I'll keep you informed
10:13 🔗 arkiver so do we still want blogger or do we close the project now? http://www.engadget.com/2015/02/27/google-reverses-blogger-porn-ban/
10:14 🔗 Kenshin if they're going to keep things and only target commercial porn, i feel there's no rush to archive it
10:16 🔗 espes___ full blogger
10:16 🔗 espes___ >300TB
10:18 🔗 Atluxity well... why not have blogger on our project-list anyway? If we get to it then why not have it as a large project warriors can work with when there is nothing else?
10:18 🔗 swebb has quit IRC (Read error: Operation timed out)
10:19 🔗 Atluxity would it be hard to de-duplicate if they announce a shutdown later?
10:19 🔗 Atluxity just do an incremental grab then?
10:19 🔗 espes___ 'cause storage is expensive
10:22 🔗 swebb has joined #archiveteam
10:23 🔗 Ymgve has joined #archiveteam
10:25 🔗 arkiver deduplicating is hard with 300 TB
10:26 🔗 arkiver and the storage is a problem, but if they announce to go away a week before shutdown we'll not have enough time to save everything
10:26 🔗 arkiver so a slow constant project would be good
10:34 🔗 fenn why is deduplicating hard?
10:35 🔗 ersi You need to crunch a lot and keep a lot of data in memory
10:35 🔗 ersi tldr "resource intensive and complex task"
10:35 🔗 fenn is it not just a matter of comparing hashes?
10:35 🔗 fenn either perceptual hash or md5
10:37 🔗 fenn file size is a decent first pass too
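
A minimal sketch of the approach fenn describes above: group by file size first, then compare hashes only within groups that share a size. The function name `find_duplicates` and the in-memory `blobs` mapping are assumptions for illustration; at the scale discussed here the data would not fit in memory, and that is the difficulty ersi points at.

```python
import hashlib
from collections import defaultdict


def find_duplicates(blobs):
    """Return sorted groups of names whose content is byte-identical.

    blobs is a hypothetical mapping of name -> bytes, used for illustration.
    """
    # First pass: bucket by size, which is cheap and needs no hashing.
    by_size = defaultdict(list)
    for name, data in blobs.items():
        by_size[len(data)].append(name)

    groups = []
    for names in by_size.values():
        if len(names) < 2:
            continue  # a unique size cannot have a duplicate
        # Second pass: hash only the candidates that share a size.
        by_hash = defaultdict(list)
        for name in names:
            digest = hashlib.md5(blobs[name]).hexdigest()
            by_hash[digest].append(name)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

This only catches exact duplicates; a perceptual hash for near-identical images would need a separate library and is not shown.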
11:32 🔗 Atluxity depending on storage solution, some have it built in
11:33 🔗 Atluxity but without knowing the storage in detail it is hard to plan for
11:34 🔗 Atluxity implementing in pipeline software would certainly be a challenge
11:34 🔗 Atluxity but maybe a constant big project could be better than nothing at all?
12:00 🔗 slash` has joined #archiveteam
12:25 🔗 primus104 has joined #archiveteam
12:32 🔗 antomatic I agree with Atluxity, it'd be really good to have a big, ongoing, unhurried project that can serve as a backstop for hungry warriors with nothing else to do
12:35 🔗 antomatic Google/Blogger have demonstrated that their whims are arbitrary and changeable
12:35 🔗 antomatic so a pre-emptive grab, over time, seems like a worthwhile investment.
12:38 🔗 antomatic As regards deduplicating, perhaps the project could keep track of the time/date of each blog's grab, so that future grabs (if done) can use the blogger features to grab 'everything since' that date, etc
12:40 🔗 antomatic it should be a solvable problem
12:48 🔗 Atluxity ah, I did not know of such a feature
12:50 🔗 antomatic something like /search?updated-min=yyyy-mm-ddThh:mm:ssZ&max-results=499
12:51 🔗 arkiver That's a good idea
12:51 🔗 arkiver that's not so hard to implement
12:52 🔗 arkiver if SketchCow thinks it's good to do and we are good on space, I think we should do that
12:52 🔗 arkiver update every month
12:57 🔗 antomatic Mm, that works - e.g. "everything since Jan 1st 2014" is expressed like: http://buzz.blogger.com/search?updated-min=2014-01-01T00:00:00Z&max-results=499
12:58 🔗 antomatic that's blogger.com but the same works on blogspot.com
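
A small sketch of how the incremental-grab URL antomatic shows above could be built from the timestamp of a blog's previous grab. The helper name `updated_since_url` is hypothetical; the `updated-min` and `max-results` parameters and the value 499 are taken from the messages above, and whether 499 is a hard limit is not confirmed here.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def updated_since_url(blog_base, since, max_results=499):
    """Build a /search URL listing posts updated since the given moment.

    since must be a timezone-aware datetime; it is converted to UTC.
    """
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Keep ':' unescaped so the URL matches the form quoted in the channel.
    query = urlencode({"updated-min": stamp, "max-results": max_results}, safe=":")
    return f"{blog_base.rstrip('/')}/search?{query}"


last_grab = datetime(2014, 1, 1, tzinfo=timezone.utc)
print(updated_since_url("http://buzz.blogger.com", last_grab))
```

Storing one such timestamp per blog would be enough to drive the monthly update arkiver suggests.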
13:04 🔗 BlueMaxim has quit IRC (Quit: Leaving)
13:11 🔗 wp494_ has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
13:11 🔗 wp494 has joined #archiveteam
13:21 🔗 signius has quit IRC (Read error: Operation timed out)
13:24 🔗 ohhdemgir has joined #archiveteam
13:34 🔗 signius has joined #archiveteam
13:46 🔗 sankin has joined #archiveteam
14:50 🔗 sankin has quit IRC (Leaving.)
14:58 🔗 sankin has joined #archiveteam
15:01 🔗 ohhdemgir so whos writing the discovery for blogger?
15:02 🔗 primus104 has quit IRC (Leaving.)
15:04 🔗 Kazzy ohhdemgir: arkiver is/was writing disco scripts
15:22 🔗 khaoohs_ has quit IRC (Ping timeout: 306 seconds)
15:28 🔗 arkiver discovery scripts are ready
15:28 🔗 arkiver not started yet, but ready
15:32 🔗 mistym has joined #archiveteam
15:50 🔗 mistym has quit IRC (Remote host closed the connection)
15:53 🔗 mhazinsk has quit IRC (Ping timeout: 186 seconds)
15:56 🔗 mhazinsk has joined #archiveteam
16:11 🔗 mistym has joined #archiveteam
16:15 🔗 SketchCow I take partial credit for the reversal, because they used some of my language
16:16 🔗 SketchCow My attitude on Blogger grabs is:
16:16 🔗 SketchCow - Grab the oldest blogs (most likely to die in some arbitrary cull)
16:16 🔗 SketchCow - Grab the biggest blogs (less likely, but good to have)
16:17 🔗 SketchCow - Grab blogs with words like "erotic", "sex-positive" or "adult-oriented" in the search results, as those are now shown to be second class citizens
16:17 🔗 SketchCow That can go on for a while
16:17 🔗 SketchCow Also, getting discovery going and making frameworks is also important. Having record of what was out there can be really helpful for researchers.
16:19 🔗 arkiver Ok, we'll run the discovery and filter for those sex oriented words and I'll run a discovery to check which blogs request a verification of age
16:20 🔗 Start we could also discover blogs by abusing the next blog button on the navbar
16:20 🔗 arkiver So the idea to grab everything and update regularly is off?
16:22 🔗 arkiver Biggest blogs is biggest as in a lot of visitors or biggest as in a lot of posts? We can check the number of posts easy, but number of visitors would be a bit harder. That would have to be user submitted
16:26 🔗 ohhdemgir arkiver, ready.. so lets start?
16:31 🔗 khaoohs has joined #archiveteam
16:31 🔗 oli_ has joined #archiveteam
16:33 🔗 oli has quit IRC (hub.se efnet.port80.se)
16:33 🔗 Rickster has quit IRC (hub.se efnet.port80.se)
16:33 🔗 Muad-Dib has quit IRC (hub.se efnet.port80.se)
16:33 🔗 GLaDOS has quit IRC (hub.se efnet.port80.se)
16:33 🔗 WubTheCap has quit IRC (hub.se efnet.port80.se)
16:33 🔗 Sue_ has quit IRC (hub.se efnet.port80.se)
16:33 🔗 nox has quit IRC (hub.se efnet.port80.se)
16:33 🔗 danneh_ has quit IRC (hub.se efnet.port80.se)
16:33 🔗 LittUp has quit IRC (hub.se efnet.port80.se)
16:33 🔗 deathy has quit IRC (hub.se efnet.port80.se)
16:33 🔗 russss has quit IRC (hub.se efnet.port80.se)
16:33 🔗 lhobas has quit IRC (hub.se efnet.port80.se)
16:36 🔗 aaaaaaaaa has joined #archiveteam
16:37 🔗 Sue__ has joined #archiveteam
16:42 🔗 SketchCow Sorry, biggest in terms of popularity.
16:42 🔗 SketchCow These are just arbitrary things, just to do instead of a full deep scan.
16:48 🔗 Atluxity I like the general idea of having a big project to work on in more idle time for our warriors
16:48 🔗 ersi Who doesn't :)
16:49 🔗 Atluxity it is not motivating for a user running a warrior to see it being idle
16:49 🔗 oli_ is now known as oli
16:49 🔗 ersi That has been obvious for quite some time, yeah :)
16:49 🔗 WubTheCap has joined #archiveteam
17:08 🔗 Spring has joined #archiveteam
17:12 🔗 mistym has quit IRC (Remote host closed the connection)
17:21 🔗 Kenshin one note btw, there are blogs that use their own domains
17:21 🔗 Kenshin but still use blogger's platform
17:27 🔗 Start_ has joined #archiveteam
17:27 🔗 Start has quit IRC (Read error: Connection reset by peer)
17:27 🔗 Start_ is now known as Start
17:36 🔗 Start once we've got more of the upcoming projects out of the way, i'd like to start projects for some of the isp web hosts we've found at #webroasting
17:36 🔗 Start not sure which order they'd be done in
17:37 🔗 Start maybe oldest/biggest/most decayed first
17:38 🔗 primus104 has joined #archiveteam
17:46 🔗 primus104 has quit IRC (Leaving.)
17:59 🔗 xmc sets mode: +o swebb
18:06 🔗 mistym has joined #archiveteam
18:07 🔗 mistym_ has joined #archiveteam
18:16 🔗 mistym has quit IRC (Ping timeout: 600 seconds)
18:27 🔗 primus104 has joined #archiveteam
18:34 🔗 slash` has quit IRC (Ping timeout: 512 seconds)
18:36 🔗 balrog has quit IRC (Ping timeout: 512 seconds)
18:37 🔗 balrog has joined #archiveteam
18:37 🔗 swebb sets mode: +o balrog
18:37 🔗 thefinn93 has quit IRC (Ping timeout: 606 seconds)
18:43 🔗 Nickname has joined #archiveteam
18:43 🔗 Nickname http://science.slashdot.org/story/15/02/25/2313241/argonne-national-laboratory-shuts-down-online-ask-a-scientist-program
18:44 🔗 Nickname NEWTON is (soon to be was) an online repository of science questions submitted by school children from around the world.
18:44 🔗 yipdw got it
18:44 🔗 yipdw http://archive.fart.website/archivebot/viewer/?q=newton
18:44 🔗 thefinn93 has joined #archiveteam
18:48 🔗 Nickname @yipdw: Sorry for asking then. I was unable to find it in the wiki.
18:48 🔗 Nickname Did you really get everything?
18:48 🔗 yipdw no apologies needed, just wanted to point it out
18:48 🔗 yipdw I haven't done a thorough check but glancing over the logs it doesn't look dreadfully bad
18:49 🔗 yipdw if you'd like to verify, you can use https://github.com/ikreymer/webarchiveplayer
18:50 🔗 yipdw download https://archive.org/download/archiveteam_archivebot_go_20150227000003/www.newton.dep.anl.gov-inf-20150226-022927-cm6eh-00000.warc.gz and load it into the player
18:50 🔗 yipdw actually i'll do that now
18:53 🔗 Nickname so the archivebot http://archive.fart.website/archivebot/viewer/?q=newton , shows files that are already saved on archive.org?
18:55 🔗 yipdw Nickname: yeah, it's an index of the archivebot collection in IA, which in turn gets ingested into the Wayback Machine eventually
18:55 🔗 yipdw Wayback works but there are details that I've found that other tools currently render better
18:55 🔗 yipdw infinite-scroll for example
18:56 🔗 yipdw so pywb/webarchiveplayer/etc
18:58 🔗 Nickname What should I even look for? (Isn't there a Tool to check for broken links or detect suspicious external/internal Links?)(Oh infinite-scroll how I hate thee...)
19:00 🔗 Smiley wget can check for broken links.
19:01 🔗 yipdw Nickname: I suggest as a first step downloading the WARC and looking through it manually, comparing it against Newton
19:01 🔗 yipdw once it looks like the copy is reasonably faithful it's good up to the point that you trust our spiders
19:02 🔗 yipdw the crawler is https://github.com/chfoo/wpull
19:02 🔗 yipdw I guess you could run a link check on the loaded WARC, I don't know of any tools to do that
19:02 🔗 yipdw it's also ambiguous -- a 404 in the WARC could very well be a 404 on the captured site
19:04 🔗 Nickname hmm. The Internet Archive's sites are offline for scheduled maintenance and upgrades. Oh well. I'll check back later.
19:07 🔗 Nickname Does the WARC capture 404-at-crawl Events?
19:09 🔗 Nickname (Capturing the received Error-Page could also be useful, for checking Purposes.)
19:11 🔗 RedType has joined #archiveteam
19:23 🔗 slash` has joined #archiveteam
19:25 🔗 RedType_ has joined #archiveteam
19:26 🔗 RedType_ has quit IRC (Client Quit)
19:28 🔗 RedType_ has joined #archiveteam
19:34 🔗 SketchCow Internet Archive's having a little sadness
19:34 🔗 SketchCow Leonard Nimoy's gone, what's the point
19:39 🔗 Smiley nod
19:39 🔗 Smiley turn off the lights on your way out
19:40 🔗 SketchCow So
19:40 🔗 SketchCow Someone passed me information and wants it some way confidential
19:40 🔗 Smiley ?LOL
19:40 🔗 SketchCow So I'm going to paraphrase it
19:41 🔗 Smiley ...k....
19:41 🔗 * Smiley sounds the sirens
19:41 🔗 SketchCow Last.fm is going to switch codebases in the first two weeks of April
19:41 🔗 SketchCow Opinion of these folks is... it's not going too well
19:42 🔗 RedType has quit IRC (Quit: Lost terminal)
19:42 🔗 SketchCow Core music data likely to survive, but some user generated material may die
19:42 🔗 SketchCow From latter:
19:42 🔗 SketchCow There are 1m+ forum posts spanning nearly 11 years across global
19:42 🔗 SketchCow forums (which you can see at http://www.last.fm/forum) and group
19:42 🔗 SketchCow forums. The good news is all the forums are accessible by incrementing
19:42 🔗 SketchCow the ID at the end of http://www.last.fm/forum/<id>. They are spread
19:42 🔗 SketchCow across a mostly-continuous ID namespace.
19:42 🔗 SketchCow I'm also slightly concerned about user journals (e.g.
19:42 🔗 SketchCow http://www.last.fm/user/Russ/journal), but that's more difficult as
19:42 🔗 SketchCow there's no easy way of enumerating users, short of crawling similar
19:42 🔗 SketchCow user/friends lists.
19:42 🔗 SketchCow I'd also suggest archiving blog.last.fm during this switchover as
19:42 🔗 SketchCow hilarity is likely to ensue.
19:42 🔗 SketchCow ...
19:42 🔗 SketchCow That's all.
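
A minimal sketch of the ID enumeration described in the pasted note above: since forums sit at `http://www.last.fm/forum/<id>` in a mostly continuous ID space, a work list can be generated by counting. The function name `forum_urls` and the ID bounds are assumptions; the real upper bound is not given in the log, and gaps in the ID space would show up as missing pages at fetch time.

```python
def forum_urls(start_id, end_id, base="http://www.last.fm/forum"):
    """Yield one candidate forum URL per ID in [start_id, end_id].

    The bounds are placeholders; nothing is fetched here.
    """
    for forum_id in range(start_id, end_id + 1):
        yield f"{base}/{forum_id}"


# Example: the first three candidate URLs of a hypothetical range.
for url in forum_urls(1, 3):
    print(url)
```

User journals cannot be listed this way, as the note says, because user names are not sequential.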
19:43 🔗 SketchCow So I think a project is worth it
19:43 🔗 SketchCow I suggest #lastchance.fm
19:46 🔗 garyrh_ their blog is in archivebot
19:46 🔗 garyrh_ last post was jan. 2014
19:47 🔗 xmc last.fm's name is self-parodying
19:48 🔗 garyrh_ didntlast.fm
19:49 🔗 xmc wontlast
19:50 🔗 garyrh_ camelast.fm, etc. etc
19:52 🔗 Stilett0 has joined #archiveteam
19:52 🔗 Stilett0 has left
20:13 🔗 kyan has quit IRC (Quit: Leaving)
20:17 🔗 BlueMaxim has joined #archiveteam
20:17 🔗 sep332 Google has updated their updated policy. doesn't look (quite) as bad anymore
20:17 🔗 sep332 https://productforums.google.com/forum/m/#!category-topic/blogger/jAep2mLabQY
20:18 🔗 sep332 i'm guessing we're grabbing anyway, huh?
20:21 🔗 xmc no rules no masters #yolo
20:29 🔗 sep332 oh well ok then
20:30 🔗 lag has joined #archiveteam
20:38 🔗 Nickname has quit IRC (Quit: Page closed)
20:38 🔗 SketchCow We went over how we're doing blogger.
20:38 🔗 SketchCow Grab old blogs, grab sexy and erotic blogs
20:39 🔗 SketchCow Don't go nuts, but save from them because they're obviously not so hot
20:41 🔗 sep332 thanks. i saw the scrollback from 10 hours ago but missed the update 4 hours ago
20:47 🔗 lag2 has joined #archiveteam
20:51 🔗 lag has quit IRC (Ping timeout: 258 seconds)
20:59 🔗 RedType_ has quit IRC (Quit: leaving)
20:59 🔗 RedType has joined #archiveteam
21:10 🔗 mistym_ has quit IRC (Remote host closed the connection)
21:56 🔗 mistym has joined #archiveteam
22:01 🔗 sankin has quit IRC (Leaving.)
22:09 🔗 tephra_ codehaus (http://www.codehaus.org/) is shutting down
22:16 🔗 lag2 has quit IRC (Quit: Leaving)
22:28 🔗 cbb2 has joined #archiveteam
23:17 🔗 cbb2 has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 primus104 has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 schbirid has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 Zebranky_ has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 Fusl has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 cbb2 has joined #archiveteam
23:17 🔗 primus104 has joined #archiveteam
23:17 🔗 schbirid has joined #archiveteam
23:17 🔗 Zebranky_ has joined #archiveteam
23:17 🔗 Fusl has joined #archiveteam
23:24 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
23:25 🔗 Ymgve has quit IRC ()
23:27 🔗 db48x` has joined #archiveteam
23:32 🔗 Spring has quit IRC (Quit: Leaving)
23:32 🔗 Ymgve has joined #archiveteam
23:37 🔗 BlueMaxim has joined #archiveteam
23:45 🔗 Jonimus has joined #archiveteam
23:50 🔗 T31m_ has joined #archiveteam
23:50 🔗 nico_32_ has joined #archiveteam
23:50 🔗 nico_32 has quit IRC (Read error: Connection reset by peer)
23:50 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
23:51 🔗 Daloader_ has joined #archiveteam
23:53 🔗 BlueMaxim has joined #archiveteam
23:57 🔗 T31m_ has quit IRC (Read error: Operation timed out)
23:58 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
23:58 🔗 T31M has quit IRC (Read error: Operation timed out)
23:59 🔗 BlueMaxim has joined #archiveteam
23:59 🔗 khaoohs has quit IRC (Read error: Connection reset by peer)
23:59 🔗 khaoohs has joined #archiveteam
