
Where can I find public data available for download?

juliesteele
Posted Oct 02 2009 09:32 AM

In Toby Segaran's work on Freebase, he's looked at hundreds of data sets that were interesting on their own, but even more interesting as augmentation and context for other data sets. These come from nonprofits, governments, companies, and grassroots efforts. Here's a pretty big list (but a very small sample) of what's out there:

  • The Center for Responsive Politics publishes contributions by individuals to political candidates in the United States.

  • Many countries have data from their census available online, including the United States.

  • The Geonames database has the longitude, latitude, containment, and class of named places all over the world.

  • The Securities and Exchange Commission has downloadable financial data for all companies listed on U.S. stock exchanges.

  • Agencies like the Environmental Protection Agency have downloadable information about environmental pollution in certain places and the facilities that produce the most pollution.

  • A surprisingly useful resource is the Trademark database, which can be used to find which companies own rights to brand names, what the brand names are used to sell, and, often amusingly, all the art associated with different brands.

  • Many social networks allow downloads of subsets of information, including relationships and other fields such as location.

  • Nutritional information (calories, grams of fat, etc.) about almost every consumable product is available from the U.S. Department of Agriculture.

  • The National Center for Biotechnology Information (NCBI) publishes many databases related to genetic and medical informatics, including GenBank, PubMed, Gene, and dbSNP.

  • Many city or state health departments publish data about restaurant inspections, which is a good source of free data about which restaurants are in a city and also how clean they are.

  • Agencies such as Medicare and the Food and Drug Administration have huge downloadable data sets of drug availability, costs, and usage.

  • Online message boards often contain mentions of companies, products, and places, along with text that can be mined for sentiment and relationships.


You'll notice that although these sources come from very different places, they often describe the same kinds of things: companies, places, products, and people.
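That overlap is what makes one data set useful as context for another. As a minimal sketch, assuming two small in-memory tables whose rows are invented purely for illustration (the company names, field names, and the `normalize` and `join_on_name` helpers are all hypothetical), joining on a cleaned-up company name might look like this:

```python
def normalize(name):
    """Lowercase a name and drop punctuation and common corporate suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    words = [w for w in cleaned.split() if w not in {"inc", "corp", "co", "llc"}]
    return " ".join(words)


def join_on_name(left, left_key, right, right_key):
    """Pair rows from two sources whose normalized names match."""
    index = {}
    for row in right:
        index.setdefault(normalize(row[right_key]), []).append(row)
    joined = []
    for row in left:
        for match in index.get(normalize(row[left_key]), []):
            joined.append({**row, **match})
    return joined


# Hypothetical rows, not taken from any real filing or trademark record.
filings = [
    {"company": "Acme Corp.", "revenue_musd": 120},
    {"company": "Globex, Inc.", "revenue_musd": 75},
]
trademarks = [
    {"owner": "ACME CORP", "brand": "RocketSkates"},
    {"owner": "Initech LLC", "brand": "TPS"},
]

combined = join_on_name(filings, "company", trademarks, "owner")
print(combined)
```

Real sources rarely agree on spelling, so the name cleanup here is only a starting point; stable identifiers, where a source provides them, are a safer join key.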


1 Reply

simonstl
Posted Oct 02 2009 10:25 AM

There are also lots of state and municipal data sources available. I was delighted when I found the Cornell University Geospatial Information Repository (CUGIR), which has lots of information about New York State. The amount of data available varies a lot by county, but Tompkins County, where I live and where Cornell is based, has a lot.

CUGIR is part of the National Spatial Data Clearinghouse program, and there's also a New York State GIS Clearinghouse.

I've used this data to create election result maps, as well as to calculate the populations of areas that aren't themselves municipalities.

I used MapServer (free) to create the election maps, and Manifold ($245) to calculate the census information. There's a lot more potential in this data, though.
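For anyone curious what the population calculation involves, here is a rough sketch in plain Python rather than Manifold, and not necessarily how Manifold does it. It assumes you already have census block centroids and populations in a flat (projected) coordinate system, and it counts a block toward the area when its centroid falls inside the area's boundary. The block IDs, coordinates, and populations below are made up for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def population_within(area_polygon, census_blocks):
    """Sum the population of blocks whose centroid is inside the area."""
    return sum(
        block["population"]
        for block in census_blocks
        if point_in_polygon(*block["centroid"], area_polygon)
    )


# Hypothetical blocks and a hypothetical rectangular area of interest.
blocks = [
    {"id": "A", "centroid": (1.0, 1.0), "population": 120},
    {"id": "B", "centroid": (2.5, 1.5), "population": 80},
    {"id": "C", "centroid": (5.0, 5.0), "population": 200},
]
area = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]

total = population_within(area, blocks)
print(total)
```

Using centroids is an approximation: a block that straddles the boundary is counted entirely in or entirely out, so smaller census units give better estimates.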
