A data science cheat sheet

Posted by macslocum, Jun 02 2010 05:25 AM

Over on Radar, Mike Loukides takes a deep dive into the growing field of data science. It's a fantastic analysis that examines data science from all angles -- the companies, the technologies and the unique skills.

Mike's post is quite extensive, so I figured a cheat sheet could help folks get their arms around this broad and quickly transforming topic. As is always the case with this sort of thing, the abridged version is no substitute for the original. I highly recommend you invest time in the full post.

That said, the biggest takeaways (in my opinion) are:

-- Data science is linked to Web 2.0 through "collective intelligence." As users add more data to an application, the application itself grows more useful.

-- The future belongs to companies that can turn data into data products.

-- Google, Amazon, Facebook, and LinkedIn represent the first wave of data-science innovation, but many other businesses are catching on.

-- If you're an analytical sort who can pluck signals and stories from massive data streams, you're in an enviable position. Myopic specialists need not apply.

As an aside, you can see how analysis and storytelling come together in data scientist Hilary Mason's recent Web 2.0 Expo keynote.

-- Agility and adaptation -- two must-haves in the startup world -- are important skills in the data science domain. Much like their startup cousins, data scientists use a variety of tools and technologies to quickly explore questions and iterate toward solutions.

-- Data analysis is often comparative rather than precise. Mike notes: "If you're asking whether sales to Northern Europe are increasing faster than sales to Southern Europe, you aren't concerned about the difference between 5.92 percent annual growth and 5.93 percent."
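To make that last point concrete, here is a minimal Python sketch of a comparative check. The sales figures, the function names (`annual_growth`, `faster_region`), and the 0.1-percentage-point tolerance are all made up for illustration; none of them come from Mike's post.

```python
def annual_growth(previous: float, current: float) -> float:
    """Return year-over-year growth as a percentage."""
    return (current - previous) / previous * 100.0


def faster_region(growth_a: float, growth_b: float, tolerance: float = 0.1) -> str:
    """Compare two growth rates (in percent).

    Gaps smaller than `tolerance` percentage points are treated as a tie,
    since the question is "which is growing faster?", not "by exactly how much?".
    """
    if abs(growth_a - growth_b) < tolerance:
        return "tie"
    return "a" if growth_a > growth_b else "b"


# Hypothetical sales figures, invented purely for this example.
north = annual_growth(previous=1000.0, current=1080.0)
south = annual_growth(previous=1000.0, current=1050.0)

print(faster_region(north, south))  # a clear gap: region "a" is faster
print(faster_region(5.92, 5.93))    # a 0.01-point gap counts as a tie
```

The tolerance is a judgment call that depends on how noisy the underlying data is; the sketch only shows that the comparison, not the last decimal place, is what answers the question.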

Much more is outlined in Mike's piece.

1 Reply

Posted by ramarske, Dec 14 2010 04:52 AM

I find the following comment interesting -- "Data analysis is often comparative rather than precise." This raises a question: how should a publisher of "official statistics" balance its mandate to provide precision with the desire to tell compelling stories through visualizations that, by their nature, are less precise?