A Wikipedia for data sets?

Bret Taylor has an interesting post about the fact that even though there is a ton of data on the web, no one really has access to it in a way that lets you manipulate it programmatically.  That is, you have all the world's map data at your fingertips, as long as you don't mind using Google Maps or Virtual Earth or whatever to access it.  What you can't do is pull it all into a database of your own and build your own map application on top of it.

I've run into this at work, where I'm doing very top secret things (though the data we process isn't itself something we really want to keep secret).

I've also run into this in pursuit of a pet project, a new basketball statistics site (look beyond scoring totals!!!).  It's really hard to get a stream of NBA statistics on a daily basis.  Sure, there are lots of pages with NBA stats, but if you want them, you've got to write a complicated crawler that parses the stats out of each site's HTML.  You can't just plug into a database and get the raw numbers.
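To give a feel for what that parsing step involves, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the sample markup, the column names, and the `TableParser` and `parse_stats` names are hypothetical, and no real site is structured exactly like this. A real crawler would also have to fetch pages, cope with markup that changes without notice, and respect each site's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a real stats page.
SAMPLE_PAGE = """
<table id="boxscore">
  <tr><th>Player</th><th>PTS</th><th>REB</th><th>AST</th></tr>
  <tr><td>Player A</td><td>24</td><td>11</td><td>3</td></tr>
  <tr><td>Player B</td><td>9</td><td>4</td><td>12</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # row currently being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_stats(html):
    """Turn the first row into column names and the rest into records."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [
        {key: (int(value) if value.isdigit() else value)
         for key, value in zip(header, row)}
        for row in body
    ]


stats = parse_stats(SAMPLE_PAGE)
for record in stats:
    print(record)
```

Even this toy version needs forty-odd lines just to recover three numbers per player, and it would break the moment the page layout changed. That is exactly the work a shared, raw data set would make unnecessary.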

Bret suggests a Wikipedia for data sets (not information, DATA).  An interesting idea.  Check out CKAN, which seems to be taking some baby steps in this direction.  I don't like that CKAN refers to itself as a knowledge repository, because knowledge is not data.  To illustrate what I mean, think about whether an almanac disseminates knowledge about the weather or data about the weather.  If it were the former, reading an almanac might be an important step towards becoming a meteorologist.  But it isn't.


About Me

My name's Patrick Minton. I'm an MBA student, technology professional, basketball coach, amateur economist, or part-time poker shark, depending on my mood. This blog is basically my way of shaking my fists at the heavens.

About this Entry

This page contains a single entry by Patrick Minton published on April 10, 2008 2:45 PM.
