[00:03] *** Stiletto has joined #archiveteam
[00:04] *** Start has quit IRC (Quit: Disconnected.)
[00:06] *** Start has joined #archiveteam
[00:23] *** JesseW has joined #archiveteam
[00:29] *** bsmith093 has quit IRC (Read error: Operation timed out)
[00:32] *** bsmith093 has joined #archiveteam
[00:53] *** bsmith093 has quit IRC (Quit: Leaving.)
[00:58] *** bsmith093 has joined #archiveteam
[01:02] *** bsmith093 has quit IRC (Client Quit)
[01:36] *** Stiletto has quit IRC (Read error: Operation timed out)
[01:38] *** Stiletto has joined #archiveteam
[01:49] *** bsmith093 has joined #archiveteam
[03:24] *** bsmith093 has quit IRC (Ping timeout: 260 seconds)
[03:27] *** bsmith093 has joined #archiveteam
[04:36] *** BartoCH has quit IRC (Ping timeout: 260 seconds)
[04:44] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[04:52] *** Sk1d has joined #archiveteam
[05:00] *** BartoCH has joined #archiveteam
[05:04] *** GLaDOS has quit IRC (Ping timeout: 260 seconds)
[05:40] *** dashcloud has quit IRC (Read error: Operation timed out)
[05:44] *** dashcloud has joined #archiveteam
[06:13] I thought there would be less http://jobs.code4lib.org/jobs/digital-preservation/
[06:38] *** bsmith093 has quit IRC (Remote host closed the connection)
[06:40] *** bsmith093 has joined #archiveteam
[06:57] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[06:58] *** GLaDOS has joined #archiveteam
[07:48] *** Honno has joined #archiveteam
[08:47] *** WinterFox has joined #archiveteam
[09:16] *** schbirid has joined #archiveteam
[09:20] *** dashcloud has quit IRC (Read error: Operation timed out)
[09:23] *** dashcloud has joined #archiveteam
[09:31] *** metalcamp has joined #archiveteam
[09:51] *** bwn has quit IRC (Ping timeout: 244 seconds)
[09:58] *** bwn has joined #archiveteam
[10:16] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[10:18] *** BlueMaxim has joined #archiveteam
[10:37] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[10:39] *** BlueMaxim has joined #archiveteam
[11:20] *** dashcloud has quit IRC (Read error: Operation timed out)
[11:23] *** dashcloud has joined #archiveteam
[12:29] *** lysobit has quit IRC (Ping timeout: 370 seconds)
[12:50] *** BartoCH has quit IRC (Ping timeout: 260 seconds)
[13:12] Well, the term is meaningless as a tag.
[13:12] Example: Being an Archivist, they throw in "digital preservation"
[13:12] Without really, what that means
[13:12] "You will use a computer"
[13:13] "Someone will ask you to scan something"
[13:13] "There will be a pile of things, watch it carefully, it's digital"
[13:13] Happy to say the hiphop mixtapes have been up for a week and not one drive-by
[13:14] Also, even though I've now been introduced to hours and hours of songs about purple drank, I've not once been lured into mixing some up
[13:15] In fact, purple drank appears to be entirely a method of cultural and ethnic genocide
[13:15] Don't tell anyone
[13:29] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[13:44] *** BlueMaxim has quit IRC (Quit: Leaving)
[13:45] *** metalcamp has joined #archiveteam
[14:11] *** WinterFox has quit IRC (Remote host closed the connection)
[15:15] http://forums.projectspark.com/yaf_postst214854.aspx
[15:16] >Starting 5/13/16, “Project Spark” will no longer be available for download on the Xbox Marketplace or Windows Store. For existing users of “Project Spark,” online services will be unavailable after 8/12/16. Without services, players will no longer be able to download user-generated content or upload their own creations.
[15:29] *** mhazinsk has joined #archiveteam
[15:43] *** JesseW has joined #archiveteam
[16:03] *** dan- has quit IRC (Ping timeout: 260 seconds)
[16:10] *** dan- has joined #archiveteam
[16:35] *** bsmith093 has quit IRC (Remote host closed the connection)
[16:40] *** bsmith093 has joined #archiveteam
[17:02] *** mhazinsk has quit IRC (Ping timeout: 633 seconds)
[17:02] *** luckcolor has quit IRC (Read error: Connection reset by peer)
[17:03] *** atomotic has joined #archiveteam
[17:04] *** bsmith093 has quit IRC (Quit: Leaving.)
[17:06] *** luckcolor has joined #archiveteam
[17:06] *** mhazinsk has joined #archiveteam
[17:11] *** bwn has quit IRC (Ping timeout: 244 seconds)
[17:16] *** bwn has joined #archiveteam
[17:30] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[17:34] Nice to see someone do it right for once: https://medium.com/@craigmod/archiving-our-online-communities-e5868eab4d9a#.lzk599i7h
[17:35] (They will be engraving the entire contents of their site on nickel plates, like the Long Now Foundation's Rosetta Disc project, in addition to having the IA crawl the entire site)
[17:43] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[17:43] *** bsmith093 has joined #archiveteam
[17:53] *** acridAxid has quit IRC (marauder)
[17:54] *** acridAxid has joined #archiveteam
[17:58] Nice. Not everybody has a two-letter domain name though. Will the content be pulled down by IA if the new owners put up a robots.txt that says the whole site should not be crawled?
[18:08] MrRadar, was that link generated with a "share this link" tool? (looks like a tracking link)
[18:26] *** JesseW has joined #archiveteam
[18:43] http://openpsych.net/forum/showthread.php?tid=279
[18:43] [ODP] The OKCupid dataset: A very large public dataset of dating site users
[18:52] Looks to be too small to include all match.com users (Match.com TOS essentially says sign up for one, you sign up for them all)
[19:11] *** zino has joined #archiveteam
[19:25] *** bsmith093 has quit IRC (Quit: Leaving.)
[19:26] *** bsmith093 has joined #archiveteam
[19:28] I have the first version of the new IA census ready, as a torrent. If someone could upload it to IA for me, that would be appreciated.
[19:28] magnet:?xt=urn:btih:d5f9909f56f14867ca2e7a925cb1dadbb2a3da49&dn=ia%5Fcensus%5F201604%5Fpublic&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969&tr=udp%3A%2F%2Ftracker.leechers-paradise.org%3A6969
[19:36] Please use the identifier ia_census_201604
[19:45] JesseW, what file size is it?
[19:52] *** schbirid has quit IRC (Quit: Leaving)
[19:55] JesseW, thrown it at my Feral Hosting slot
[19:56] *** bsmith093 has quit IRC (Read error: Operation timed out)
[19:57] HCross: It's 23G
[19:57] I see you are grabbing it
[19:58] yeah. I'll get it pushed in
[20:00] Thanks!
[20:00] Also, please dump a copy of the sha1 file (which is only 5.7G) on any random places you can think of that will host a 5.7G file.
[20:02] i.e. Github, Bitbucket, Gmail, etc.
[20:03] ^ file is available in the torrent
[20:03] I've put the sha1 at highest priority
[20:04] It's a list of the sha1's of all the files in all the fully public-and-downloadable items on IA.
[20:05] I'm still discussing with IA about releasing in bulk the metadata for the non-downloadable files -- and we both agreed not to distribute in bulk the metadata for the (many) items on IA marked with "noindex".
[20:08] *** zino has quit IRC (Read error: Connection reset by peer)
[20:12] *** bsmith093 has joined #archiveteam
[20:33] http://www.youtube.com/channel/UCZM0q3tj_RstUAUM4ox3R3g/live
[20:38] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[22:20] *** JesseW has quit IRC (Read error: Operation timed out)
[22:59] *** BlueMaxim has joined #archiveteam
[23:03] *** JesseW has joined #archiveteam
[23:14] *** Honno has quit IRC (Read error: Operation timed out)
[23:56] So we need to ignore some URLs on arto to complete the grab
[23:56] We need to split items on myvip
[23:57] other projects are experienceproject, orkut
[23:57] Did I miss anything there?
[23:57] *** Coderjoe_ has quit IRC (Read error: Operation timed out)