[00:22] *** howdoicom has quit IRC (Quit: Page closed)
[00:50] *** Petri152 has joined #archiveteam-bs
[00:59] *** JesseW has joined #archiveteam-bs
[01:21] *** tomwsmf has quit IRC (Read error: Operation timed out)
[01:26] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[01:50] *** JesseW has joined #archiveteam-bs
[02:01] *** Honno has joined #archiveteam-bs
[02:40] HCross: ping me in a couple weeks about newsgrab, I may have a host if you can run in a container, but am waaaay too busy right now. :(
[02:51] *** tuankiet6 has joined #archiveteam-bs
[02:52] *** tuankiet6 is now known as tuankiet
[03:04] *** RichardG has quit IRC (Read error: Connection reset by peer)
[03:04] *** RichardG has joined #archiveteam-bs
[03:12] *** RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue)
[03:15] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[03:39] *** RichardG has joined #archiveteam-bs
[03:50] *** mutoso has quit IRC (Ping timeout: 260 seconds)
[03:58] *** mutoso has joined #archiveteam-bs
[04:16] *** JesseW has joined #archiveteam-bs
[04:22] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[04:30] *** Sk1d has joined #archiveteam-bs
[04:42] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[05:02] *** JesseW has joined #archiveteam-bs
[05:31] *** RichardG has quit IRC (Read error: Operation timed out)
[05:31] *** RichardG has joined #archiveteam-bs
[05:54] *** dan- has quit IRC (Ping timeout: 633 seconds)
[05:57] *** dan- has joined #archiveteam-bs
[06:25] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[06:32] *** aschmitz has quit IRC (Read error: Operation timed out)
[06:39] *** aschmitz has joined #archiveteam-bs
[07:11] *** RichardG has quit IRC (Ping timeout: 255 seconds)
[07:25] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[07:26] *** BlueMaxim has joined #archiveteam-bs
[08:02] *** kristian_ has joined #archiveteam-bs
[08:33] *** fie has joined #archiveteam-bs
[08:38] *** RichardG has joined #archiveteam-bs
[08:47] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[08:48] *** BlueMaxim has joined #archiveteam-bs
[09:32] *** whydomain has joined #archiveteam-bs
[09:37] Not really on topic, but this popped up in the vintage mac forums earlier.
[09:37] From the ad: "The lease is up at the end of the month, and all of my Macintosh stuff has to go! ... I must liquidate this collection (which is probably one of the biggest in the country), or it goes to the recycler!"
[09:37] Ad: http://denver.craigslist.org/sys/5732303316.html
[09:38] Sadly I'm 4000+ miles away. Anyone want to save some hardware for cheap?
[09:41] Ignore it
[09:55] *** whydomain has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client)
[10:23] *** kristian_ has quit IRC (Leaving)
[10:59] *** wp494 has quit IRC (Read error: Connection reset by peer)
[11:41] *** fie_ has joined #archiveteam-bs
[11:43] *** fie has quit IRC (Read error: Operation timed out)
[12:07] wow, he had a lot of crap
[12:39] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:43] *** vitzli has joined #archiveteam-bs
[13:05] *** BartoCH has quit IRC (Ping timeout: 260 seconds)
[13:11] *** BartoCH has joined #archiveteam-bs
[14:18] *** wp494 has joined #archiveteam-bs
[14:30] *** tuankiet has quit IRC (Ping timeout: 246 seconds)
[14:38] *** tuankiet6 has joined #archiveteam-bs
[15:51] *** vitzli has quit IRC (Quit: Leaving)
[16:17] *** tomwsmf has joined #archiveteam-bs
[17:28] *** kristian_ has joined #archiveteam-bs
[17:51] *** schbirid has joined #archiveteam-bs
[18:02] lol
[18:02] [ 989.029486] sd 8:0:0:0: [sdg] Very big device. Trying to use READ CAPACITY(16).
[18:04] *** gfscott has joined #archiveteam-bs
[18:06] i'm going after gawker.com sitemap for 2016-06 and 2016-07
[18:07] i will byte because i am actually interested, how is cc0 not a license?
[18:07] byte :D
[18:17] uploaded: https://archive.org/details/gawker.com-sitemap-2016-01-20160627
[18:31] *** GE has joined #archiveteam-bs
[18:31] This is the offtopic channel so other channels like #archivebot and #archiveteam can stay clutter-free for important discussion.
[18:32] *** kristian_ has quit IRC (Leaving)
[18:50] *** riordan has joined #archiveteam-bs
[19:09] *** riordan has quit IRC (riordan)
[19:10] *** riordan_ has joined #archiveteam-bs
[19:12] so gawker.com sitemaps urls are uploaded
[19:17] *** riordan_ has quit IRC (Read error: Operation timed out)
[19:21] godane: is gawker.com 100% in the wayback machine now?
[19:21] or well, less than 100% of course
[19:21] but you get what I mean
[19:22] https://archive.org/details/@chris85?and[]=gawker.com&sort=-publicdate
[19:23] that should be very close to 100%
[19:28] *** riordan has joined #archiveteam-bs
[19:36] schbirid: it even says "license" in the legal doc: https://creativecommons.org/publicdomain/zero/1.0/legalcode
[19:36] jepp
[19:36] it just prefers a public domain dedication
[19:36] wow nice https://gitlab.peach-bun.com/snippets/28
[19:42] actually that's probably not a fair question, because that gcc package includes all the frontends too
[19:52] what i have a kotaku.com so far: https://archive.org/details/@chris85?and[]=kotaku.com&sort=-publicdate&and[]=subject%3A%22archiveteam%22
[19:56] *** riordan has quit IRC (riordan)
[19:57] *** riordan has joined #archiveteam-bs
[19:58] *** riordan_ has joined #archiveteam-bs
[19:58] *** riordan has quit IRC (Read error: Operation timed out)
[20:11] *** riordan_ has quit IRC (Ping timeout: 633 seconds)
[20:25] *** schbirid has quit IRC (Quit: Leaving)
[20:30] godane: you might want to go after this one if you haven't already: http://thecuck.gawker.com/sitemap.xml
[20:31] it's...controversial to say the least
[20:32] *** schbirid has joined #archiveteam-bs
[20:37] *** arrith has joined #archiveteam-bs
[20:53] hey folks, new to this scene. I'm thinking about getting a mirror of http://exoteric.roach.org/ that has a bunch of fractal wallpapers, and the site owner has http basic auth on the images and download quota. so I'm thinking about emailing the guy and requesting a rip. is this reasonable?
[20:57] yes
[20:58] well yipdw
[20:58] site owners can usually be reasonable about that
[20:58] http://exoteric.roach.org/terms.php
[20:58] yes
[20:58] even more reason to ask
[20:58] right
[20:59] I don't see him mentioning quotas, and the site is probably incredibly old and never updated if he thinks he needed to do that for a few image files, lol
[20:59] yeah, his last news was from 2 years ago, and he hasn't added images in ages
[21:01] here at the bottom it says "This site employs software to limit the maximum amount of data any one person can download from the site in a given period of time" http://exoteric.roach.org/faq.html
[21:04] even more reason to ask
[21:04] archiveteam has access to a lot of IP space and a lot of bandwidth but you've heard that thing about with great power comes great responsibility
[21:05] yeah, I would rather it be a win-win
[21:06] indeed
[21:16] Somebody should try to rip World Of Spectrum if that maybe possible
[21:16] http://www.worldofspectrum.org/archivenote.html Just something to ntoe
[21:16] note*
[21:23] hey Selavi
[21:23] look what I found
[21:23] http://exoteric.roach.org/bg/dl/torrents.html
[21:23] click them :p
[21:23] boner
[21:25] there's a couple torrents on TPB, but naturally they're deader than dead
[21:36] godane: here's a complete list of every gawker subdomain, scraped from google: https://gist.githubusercontent.com/PressStartandSelect/ca04f19ddd856bd53c36b892265d81d7/raw/f7ea7f08dc165176291ceca9e7363ff83219ac1b/gawker%2520subdomains
[22:17] *** gfscott has quit IRC (gfscott)
[22:26] *** Stiletto has quit IRC (Ping timeout: 246 seconds)
[22:29] Start: i'm going thur that now
[22:31] looks like antiviral.gakwer.com has only 533 urls
[22:38] *** BlueMaxim has joined #archiveteam-bs
[22:40] btw june and july of kotaku.com and io9.gizmodo.com sitemap urls are being uploaded
[22:49] https://archive.org/details/antiviral.gawker.com-sitemap-2014-20160818
[23:05] there was 2 http://gawker.com urls in the deframer.gawker.com grabbing
[23:05] i'm not uploading those pages cause there both saved in wayback machine
[23:06] mostly cause its one random url in 2005 sitemap and 2006 sitemap for defamer.gawker.com
[23:07] the real meat of defamer.gawker.com starts in 2013 sitemap
[23:07] here are the urls below:
[23:07] PSA: this search engine lets you get results in JSON: https://searx.me/
[23:07] https://web.archive.org/web/*/http://gawker.com/157998/meg-ryans-baby-to-be-afflicted-with-identity-issues
[23:07] we might want this for discovery projects
[23:07] cc arkiver
[23:07] https://web.archive.org/web/*/http://gawker.com/033069/lindsay-lohan-now-in-doll-form
[23:09] *** Honno has quit IRC (Read error: Operation timed out)
[23:26] blackbag.gawker.com is saved: https://archive.org/search.php?query=creator%3A%22blackbag.gawker.com%22
[23:40] *** GE has quit IRC (Quit: zzz)
[23:43] *** Stiletto has joined #archiveteam-bs