#archiveteam-bs 2013-02-24,Sun

Time Nickname Message
00:16 🔗 SmileyG SketchCow, payment in bitcoin, I'm interested in your view.
02:15 🔗 joepie91 It's the opposite of a traditional automated teller that dispenses currency. Instead, these Bitcoin ATMs will accept dollar bills -- using the same validation mechanism as vending machines -- and instantly convert the amount to Bitcoins and deposit the result in your account.
02:15 🔗 joepie91 http://news.cnet.com/8301-13578_3-57570925-38/need-bitcoins-this-atm-takes-dollars-and-funds-your-account/
02:15 🔗 joepie91 cc SmileyG
07:59 🔗 SmileyG joepie91: hmmm
07:59 🔗 SmileyG so they are buying bitcoin from the market, interesting
08:50 🔗 omf_ archiveteam is the only non-scummy scraping group
08:51 🔗 omf_ when you find a forum or site about scrapers it usually has SEO and other nonsense as the uses
08:51 🔗 omf_ they put the value on making money, not on solving the problem
09:13 🔗 ersi glad to hear
09:13 🔗 ersi in a way, sad in another
09:15 🔗 omf_ screen scraping has a weird stigma to it
09:15 🔗 omf_ Like it is bad
09:18 🔗 SmileyG it's bad for an information collection purpose
09:18 🔗 SmileyG however that's not what we are doing.
09:20 🔗 omf_ I think the deep problem is very simple
09:20 🔗 omf_ People need to understand that when you put something on the internet it is for EVERYONE
09:20 🔗 SmileyG nod
09:20 🔗 omf_ That is the core problem
09:20 🔗 SmileyG 5th dimension.
09:21 🔗 omf_ We see these programmers writing about how big websites just need APIs
09:21 🔗 omf_ reddit has a great api
09:21 🔗 SmileyG indeed, heard a stupid story about how someone at my wife's work posted photos of her baby info (scan date, etc) on instagram
09:21 🔗 omf_ take any url
09:21 🔗 SmileyG people started talking about it, and she went bonkers about how people were talking about her at work
09:21 🔗 SmileyG :/
09:22 🔗 omf_ add /.json to the end and bam
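[Editor's note] The trick omf_ describes (take a reddit URL and add /.json to the end) can be sketched roughly as below. This is a minimal illustration only: the helper names `reddit_json_url` and `fetch_listing` and the User-Agent string are made up for this note, and reddit's behaviour today may differ from what the chat describes in 2013.

```python
import json
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def reddit_json_url(url: str) -> str:
    """Append '/.json' to the path of a reddit URL, keeping any query string."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") + "/.json"
    return urlunsplit(parts._replace(path=path))


def fetch_listing(url: str) -> dict:
    """Fetch the JSON form of a page (hypothetical helper, needs network access)."""
    request = urllib.request.Request(
        reddit_json_url(url),
        # Assumption: a descriptive User-Agent is sent; the value is a placeholder.
        headers={"User-Agent": "example-archiver/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


print(reddit_json_url("https://www.reddit.com/r/Archiveteam/"))
```

Only the URL rewriting runs here; `fetch_listing` is defined but not called, since it depends on the live site.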
09:22 🔗 SmileyG xbox.com is one which people screen scrape
09:22 🔗 SmileyG they DO have an API, but it's limited by number of requests and they only allow some people access each year.
09:22 🔗 omf_ well that is fucking MS for you
09:22 🔗 SmileyG ;)
09:22 🔗 omf_ destroyer of everything
09:23 🔗 SmileyG i thought that was IGN ;)
09:23 🔗 omf_ shit IGN was a couple of websites
09:23 🔗 SmileyG Cannot write to 'planetcivilization.gamespy.com/index.html?start_from=11' (Connection timed out).
09:23 🔗 SmileyG FINISHED --2013-02-24 09:17:48--
09:23 🔗 omf_ MS killed encarta
09:23 🔗 SmileyG Total wall clock time: 6m 55s
09:23 🔗 SmileyG Downloaded: 110 files, 4.2M in 58s (74.6 KB/s)
09:23 🔗 omf_ MSN, hotmail and hundreds of communities
09:24 🔗 GLaDOS You're making Yahoo sound good..
09:24 🔗 SmileyG :D
09:24 🔗 omf_ scary isn't it
09:24 🔗 * SmileyG sets MS speed to 1082mph
09:24 🔗 omf_ what about all the MSDN data that now only lives on stacks of cds in the back rooms of offices
09:25 🔗 omf_ they killed xda, how long till those forums are gone
09:26 🔗 omf_ we come down harder because Yahoo is an internet company and they should "know" better
09:26 🔗 GLaDOS Maybe we should get an insider into Microsoft and Yahoo, and get archives that way..
09:28 🔗 omf_ Yahoo just killed all remote jobs and I am not moving
09:28 🔗 omf_ for MS it would have to be system admin at redmond
09:28 🔗 omf_ well easier would be to find someone who has all the MSDN cds
09:29 🔗 omf_ my old job had hundreds upon hundreds of them
09:30 🔗 omf_ a research university with data partnerships would be easiest
09:45 🔗 S[h]O[r]T if you're looking for old M$ or msdn stuff i know someone. it's a lot of stuff though
09:46 🔗 omf_ I know
09:46 🔗 omf_ it is important stuff
09:46 🔗 omf_ take directx for example
09:46 🔗 omf_ they documented the shit out of that
09:47 🔗 omf_ now as a side project I would be interested in mapping feature growth in it vs opengl
09:47 🔗 omf_ and factor in how game consoles slow api development
09:47 🔗 omf_ and the result would be some valuable information for graphics programmers now
09:48 🔗 omf_ graphics programming as a whole still seems like all black magic to most programmers
09:49 🔗 omf_ When I get the right job I am going to do all kinds of science stuff to help build up our computer knowledge
09:49 🔗 omf_ the discipline is so young and we have so much data. It is a unique opportunity
09:59 🔗 DFJustin http://www.thenational.ae/news/world/africa/how-timbuktu-s-heritage-was-saved-in-rice-sacks-and-canoes
10:01 🔗 omf_ they need to scan that shit in
10:03 🔗 omf_ they talk about saving the books, and re-cataloging but what about digital
10:03 🔗 omf_ you want to save information. Take a high res picture of each page
10:03 🔗 DFJustin well they had a building doing that but it's a giant target
10:03 🔗 omf_ I didn't see that in the article
10:04 🔗 DFJustin the newly built south african one that got ransacked but luckily only had a small percentage of the collection
10:06 🔗 DFJustin there's a little bit here but most of it is behind a registration wall http://www.tombouctoumanuscripts.org/db/
10:09 🔗 DFJustin I think everybody involved knows it needs digitizing but it's much easier said than done when dealing with thousands of fragile documents (no speed-scanning) and an african country with no money and difficult logistics
10:09 🔗 omf_ yeah I see the whole thing as a logistics problem not a people problem
10:09 🔗 omf_ we are here, they are way over there
15:17 🔗 chazchaz For the videos on archive.org, is there a way to tell which encode is the one that was originally uploaded and which are re-encodes?
15:19 🔗 chazchaz For example, I'm looking at http://archive.org/details/Quicksand_clear and I don't want to download the 1 G MPEG2 if the 592 M Cinepak is the original.
15:26 🔗 chazchaz Never mind. I found the answer in the files.xml
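[Editor's note] chazchaz does not say what in files.xml gave the answer. A plausible reading, sketched below, is that each `<file>` entry in an item's files.xml carries a `source` attribute set to `original` for uploads and `derivative` for re-encodes. That layout is an assumption here, and the sample XML and file names are invented for illustration; they are not the real listing for Quicksand_clear.

```python
import xml.etree.ElementTree as ET

# Invented sample in the assumed files.xml layout; not a real item listing.
SAMPLE_FILES_XML = """
<files>
  <file name="example.mpeg" source="original">
    <format>MPEG2</format>
  </file>
  <file name="example.avi" source="derivative">
    <format>Cinepak</format>
    <original>example.mpeg</original>
  </file>
</files>
"""


def original_files(files_xml: str) -> list:
    """Return the names of files marked source="original" in a files.xml string."""
    root = ET.fromstring(files_xml)
    return [
        entry.get("name")
        for entry in root.iter("file")
        if entry.get("source") == "original"
    ]


print(original_files(SAMPLE_FILES_XML))
```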
15:51 🔗 godane looks like i have 710.4gb of g4tv.com video so far
15:51 🔗 godane that's about the amount of video i still have to upload
15:52 🔗 godane i moved the older stuff that's uploaded to another drive
16:42 🔗 dashcloud hi folks, back from vacation- anything new and exciting happen over the past week?
16:44 🔗 godane dashcloud: ign is going down
16:44 🔗 godane g4tv is going down
16:44 🔗 dashcloud the whole site?
16:44 🔗 dashcloud that's a pretty damn big site
16:45 🔗 godane i got maybe over 800gb of videos from g4tv.com so far
16:46 🔗 godane good news is i may have almost all non hd videos from there
16:46 🔗 db48x22 that's a lot of video
16:46 🔗 godane i have uploaded over 9000+ so far
16:47 🔗 godane i'm trying to grab the bigger hd videos
16:47 🔗 godane like pressconfs
16:47 🔗 godane podcast stuff
16:48 🔗 godane also guys i have the first 4 episodes of conan late night show from 1993
16:49 🔗 godane there are also 3 tapes of conan from 1997
16:49 🔗 godane on myspleen
17:34 🔗 godane so i found something interesting
17:36 🔗 godane flvhd videos for e3 2010 don't exist but for some reason microsoft had one
17:36 🔗 omf_ 2010 wasn't the year e3 skipped?
17:37 🔗 godane no
17:37 🔗 godane i have e3 2010
17:37 🔗 godane just the hd videos don't exist
17:37 🔗 omf_ does e3 self host the videos?
17:37 🔗 godane but when i tried grabbing the flvhd for microsoft it was the same video non-hd flv
17:38 🔗 godane md5sum is the same
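[Editor's note] The check godane describes (two downloads turning out to have the same md5sum) can be reproduced without the command-line tool. The sketch below uses Python's `hashlib`; the function names `file_md5` and `same_content` are made up for this note. Matching MD5 digests are strong evidence that two downloads are byte-identical, though not a formal proof.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)
```

A call such as `same_content("video.flv", "video_hd.flv")` (hypothetical file names) would return True for the case described in the chat.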
18:52 🔗 omf_ A sha256sum of 2tb worth of files took 15 hours to complete
18:53 🔗 omf_ finally done
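[Editor's note] As a rough back-of-the-envelope check on omf_'s numbers: 2 TB in 15 hours works out to an average of about 37 MB/s, assuming "2tb" means 2 × 10^12 bytes (the chat does not say whether decimal or binary units were meant). The helper name below is made up for this note.

```python
def throughput_mb_per_s(total_bytes, hours):
    """Average throughput in megabytes (10**6 bytes) per second."""
    return total_bytes / (hours * 3600) / 1e6


# Assumption: "2tb" is taken as 2 * 10**12 bytes.
rate = throughput_mb_per_s(2 * 10**12, 15)
print(f"{rate:.1f} MB/s")
```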
20:41 🔗 omf_ http://dilbert.com/strips/comic/2013-02-24/ - sometimes the code wins
22:20 🔗 db48x22 omf_: heh
23:49 🔗 dashcloud http://www.epiclol.com/cdn/pictures/2012/06/monopoly-com-edition-circa-2000-weve-come-a-long-way-baby_1338542678_epiclolcom.jpg