Querying CouchDB can be confusing to someone familiar with querying a relational database. This excerpt from Anderson, Lehnardt, & Slater's CouchDB: The Definitive Guide introduces you to using MapReduce to query a CouchDB database.
Traditional relational databases allow you to run any queries you like as long as your data is structured correctly. In contrast, CouchDB uses predefined map and reduce functions in a style known as MapReduce. These functions provide great flexibility because they can adapt to variations in document structure, and indexes for each document can be computed independently and in parallel. The combination of a map and a reduce function is called a view in CouchDB terminology.
Note
For experienced relational database programmers, MapReduce can take some getting used to. Rather than declaring which rows from which tables to include in a result set and depending on the database to determine the most efficient way to run the query, reduce queries are based on simple range requests against the indexes generated by your map functions.
Map functions are called once with each document as the argument. The function can choose to skip the document altogether or emit one or more view rows as key/value pairs. Map functions may not depend on any information outside of the document. This independence is what allows CouchDB views to be generated incrementally and in parallel.
CouchDB views are stored as rows that are kept sorted by key. This makes retrieving data from a range of keys efficient even when there are thousands or millions of rows. When writing CouchDB map functions, your primary goal is to build an index that stores related data under nearby keys.
Before we can run an example MapReduce view, we’ll need some data to
run it on. We’ll create documents carrying the price of various
supermarket items as found at different stores. Let’s create documents for
apples, oranges, and bananas. (Allow CouchDB to generate the
_id and _rev fields.) Use Futon to
create documents that have a final JSON structure that looks like
this:
{
"_id" : "bc2a41170621c326ec68382f846d5764",
"_rev" : "2612672603",
"item" : "apple",
"prices" : {
"Fresh Mart" : 1.59,
"Price Max" : 5.99,
"Apples Express" : 0.79
}
}This document should look like Figure 3.6 when entered into Futon.
OK, now that that’s done, let’s create the document for oranges:
{
"_id" : "bc2a41170621c326ec68382f846d5764",
"_rev" : "2612672603",
"item" : "orange",
"prices" : {
"Fresh Mart" : 1.99,
"Price Max" : 3.19,
"Citrus Circus" : 1.09
}
}And finally, the document for bananas:
{
"_id" : "bc2a41170621c326ec68382f846d5764",
"_rev" : "2612672603",
"item" : "banana",
"prices" : {
"Fresh Mart" : 1.99,
"Price Max" : 0.79,
"Banana Montana" : 4.22
}
}
Imagine we’re catering a big luncheon, but the client is very price-sensitive. To find the lowest prices, we’re going to create our first view, which shows each fruit sorted by price. Click “hello-world” to return to the hello-world overview, and then from the “select view” menu choose “Temporary view…” to create a new view. The result should look something like Figure 3.7.
Edit the map function, on the left, so that it looks like the following:
function(doc) {
var store, price, value;
if (doc.item && doc.prices) {
for (store in doc.prices) {
price = doc.prices[store];
value = [doc.item, store];
emit(price, value);
}
}
}This is a Javascript function that CouchDB runs for each of our documents as it computes the view. We’ll leave the reduce function blank for the time being.
Click “Run” and you should see result rows like in Figure 3.8, with the various items sorted by price. This map
function could be even more useful if it grouped the items by type so that
all the prices for bananas were next to each other in the result set.
CouchDB’s key sorting system allows any valid JSON object as a key. In
this case, we’ll emit an array of [item, price] so that
CouchDB groups by item type and price.
Let’s modify the view function so that it looks like this:
function(doc) {
var store, price, key;
if (doc.item && doc.prices) {
for (store in doc.prices) {
price = doc.prices[store];
key = [doc.item, price];
emit(key, store);
}
}
}Here, we first check that the document has the fields we want to use. CouchDB recovers gracefully from a few isolated map function failures, but when a map function fails regularly (due to a missing required field or other Javascript exception), CouchDB shuts off its indexing to prevent any further resource usage. For this reason, it’s important to check for the existence of any fields before you use them. In this case, our map function will skip the first “hello world” document we created without emitting any rows or encountering any errors. The result of this query should look like Figure 3.9.
Once we know we’ve got a document with an item type and some prices, we iterate over the item’s prices and emit key/values pairs. The key is an array of the item and the price, and forms the basis for CouchDB’s sorted index. In this case, the value is the name of the store where the item can be found for the listed price.
View rows are sorted by their keys—in this example, first by item, then by price. This method of complex sorting is at the heart of creating useful indexes with CouchDB.
Note
MapReduce can be challenging, especially if you’ve spent years working with relational databases. The important things to keep in mind are that map functions give you an opportunity to sort your data using any key you choose, and that CouchDB’s design is focused on providing fast, efficient access to data within a range of keys.
Three of CouchDB's creators show you how to use this document-oriented database as a standalone application framework or with high-volume, distributed applications. With its simple model for storing, processing, and accessing data, CouchDB is ideal for web applications that handle huge amounts of loosely structured data. You'll learn how to work with CouchDB through its RESTful web interface, and become familiar with key features such as simple document CRUD (create, read, update, delete), advanced MapReduce, deployment tuning, and more.




Help









