Thursday, 20th May 2010
New Database: The National Data Catalog (Alpha)
The wonderful and amazing Sunlight Foundation and Sunlight Labs (do they ever slow down?) have released an alpha version of what one day in the not so distant future will likely become a key reference tool.
The National Data Catalog (alpha), at http://nationaldatacatalog.com/, is a database for finding and then accessing info about government data sets and APIs. Entries are browsable, and the browse section of the site already has info about more than 1800 data sets. Browsable entries can be narrowed by:
+ Source Type
+ Release Year
What's truly amazing is that the 1800 data sets come from only a few sources. Remember, we're in alpha mode. The three sources named on the home page are Data.gov (the U.S. government), the District of Columbia, and the State of Utah; although not mentioned on the home page, the City of San Francisco is also represented. So, if all goes smoothly this could be one massive database. Their goal is to have info for data sets from the federal government as well as state and local governments. Coordinating this is a massive job, and while data holders and those who know where to find the data will contribute, it will also probably take researchers time to identify and track down what is not sent to the Sunlight Foundation.
Registration is optional but does allow you to mark and save favorites, participate in discussions, and help create and maintain documentation.
We wonder how closely Sunlight is working with the library community. Government librarians are the obvious fit, but most other types of info pros could also help. Two that quickly come to mind are legal librarians and perhaps academic or special librarians who focus on specific topics (let's say, education in Alaska).
Here are a few words from the Director of Sunlight Labs, Clay Johnson.
“Up until now, there hasn’t been a definitive source that you could go to in order to find all of the data that government regularly places on the Internet,” said Clay Johnson, director of the Sunlight Labs. “The National Data Catalog will serve as a sort of Dewey decimal system for government data online. Want to know if the government releases data on how much electricity is generated in the United States per year? You can find the data set here.”
Yes, even a mention of the DDC. It would be nice to start moving away from this analogy since we don't see anything really DDC-like here (this could change overnight). Metadata? Yes. But what about subject metadata that could help answer the question, what is the site about? For example, show databases about nursing or farming.
Yes, yes, yes, we are being sticklers here. We fully admit it! The phrase Dewey Decimal System is a synonym for organized information. No, we don't have time to discuss government resources and SuDoc classification right now. (-:
Overall and in all seriousness, this database shows a tremendous number of possibilities for the future. At this point one of the most exciting things is that the Sunlight Foundation and Sunlight Labs are developing it. They continuously develop and provide (all for free) resources that are both useful and award-winning. Sunlight Labs director, Clay Johnson, was named a 2010 member of the Federal 100 (top people in government info tech).
The only thing we would like to see added at this point: if a data set is available on the web in a searchable or browsable database, the catalog entry should include a link to that web database.
Good luck moving forward! Since this is an open project let us know if/when we might be able to help.
See Also: More Sunlight
The Sunlight Foundation sponsored Apps for America and Apps for America 2.
Developers built tools using government data and APIs. If you've never seen or used the winners, you should. It's very likely that a few of the resources will become favorites.
More? Here are all of the apps that were entered in the Apps for America 2 contest.