How to Work With MongoDB's Shell Interface

Posted by chco, May 02 2011 10:20 AM

This excerpt covers some operational aspects of running a cluster. Once you have a cluster up and running, how do you know what’s going on? Below we have some information from the O'Reilly publication Scaling MongoDB which can help.

As with a single instance of MongoDB, most administration on a cluster can be done through the mongo shell.

Getting a Summary

db.printShardingStatus() is your executive summary. It gathers all the important information about your cluster and presents it nicely for you.


> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "ubuntu:27017" }
{ "_id" : "shard0001", "host" : "ubuntu:27018" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
test.foo chunks:
shard0001 15
shard0000 16
{ "_id" : { $minKey : 1 } } -->> { "_id" : 0 } on : shard1 { "t" : 2, "i" : 0 }
{ "_id" : 0 } -->> { "_id" : 15074 } on : shard1 { "t" : 3, "i" : 0 }
{ "_id" : 15074 } -->> { "_id" : 30282 } on : shard1 { "t" : 4, "i" : 0 }
{ "_id" : 30282 } -->> { "_id" : 44946 } on : shard1 { "t" : 5, "i" : 0 }
{ "_id" : 44946 } -->> { "_id" : 59467 } on : shard1 { "t" : 7, "i" : 0 }
{ "_id" : 59467 } -->> { "_id" : 73838 } on : shard1 { "t" : 8, "i" : 0 }
... some lines omitted ...
{ "_id" : 412949 } -->> { "_id" : 426349 } on : shard1 { "t" : 6, "i" : 4 }
{ "_id" : 426349 } -->> { "_id" : 457636 } on : shard1 { "t" : 7, "i" : 2 }
{ "_id" : 457636 } -->> { "_id" : 471683 } on : shard1 { "t" : 7, "i" : 4 }
{ "_id" : 471683 } -->> { "_id" : 486547 } on : shard1 { "t" : 7, "i" : 6 }
{ "_id" : 486547 } -->> { "_id" : { $maxKey : 1 } } on : shard1 { "t" : 7, "i" : 7 }


db.printShardingStatus() prints a list of all of your shards and databases. Each sharded collection has an entry (there’s only one sharded collection here, test.foo). It shows you how chunks are distributed (15 chunks on shard0001 and 16 chunks on shard0000). Then it gives detailed information about each chunk: its range—e.g., { "_id" : 115882 } -->> { "_id" : 130403 } corresponding to _ids in [115882, 130403)—and what shard it’s on. It also gives the major and minor version of the chunk, which you don’t have to worry about.
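The half-open range notation can be made concrete with a small sketch. The helpers below are hypothetical (they are not part of the shell) and assume a numeric shard key on _id. For illustration only, -Infinity and Infinity stand in for $minKey and $maxKey.

```javascript
// Hypothetical helper: does the chunk's range [min, max) contain this _id?
// The lower bound is inclusive and the upper bound is exclusive.
function chunkContains(chunk, id) {
  return id >= chunk.min._id && id < chunk.max._id;
}

// Hypothetical helper: return the first chunk whose range holds the _id,
// or null if no chunk in the list covers it.
function findChunk(chunks, id) {
  for (var i = 0; i < chunks.length; i++) {
    if (chunkContains(chunks[i], id)) {
      return chunks[i];
    }
  }
  return null;
}

// Sample ranges copied from the first lines of the output above.
var sampleChunks = [
  { min: { _id: -Infinity }, max: { _id: 0 } },
  { min: { _id: 0 }, max: { _id: 15074 } },
  { min: { _id: 15074 }, max: { _id: 30282 } }
];
```

With these ranges, an _id of 15074 belongs to the third chunk rather than the second, because each chunk's upper bound is excluded.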

Each database created has a primary shard that is its “home base.” In this case, the test database was randomly assigned shard0000 as its home. This doesn’t really mean anything: shard0001 ended up with almost as many chunks as shard0000. This field should never matter to you, so you can ignore it. If you remove a shard and some database has its “home” there, that database’s home will automatically be moved to a shard that’s still in the cluster.

db.printShardingStatus() can get really long when you have a big collection, as it lists every chunk on every shard. If you have a large cluster, you can dive in and get more precise information, but this is a good, simple overview when you’re starting out.

The config Collections

mongos forwards your requests to the appropriate shard, except when you query the config database. Accessing the config database patches you through to the config servers, and it is where you can find all the cluster’s configuration information. If you do have a collection with hundreds or thousands of chunks, it’s worth learning about the contents of the config database so you can query for specific info instead of getting a summary of your entire setup.

Let’s take a look at the config database. Assuming you have a cluster set up, you should see these collections:

> use config
switched to db config
> show collections
changelog
chunks
collections
databases
lockpings
locks
mongos
settings
shards
system.indexes
version


Many of the collections are just accounting for what’s in the cluster:

config.mongos

A list of all mongos processes, past and present

> db.mongos.find()
    { "_id" : "ubuntu:10000", "ping" : ISODate("2011-01-08T10:11:23"), "up" : 0 }
    { "_id" : "ubuntu:10000", "ping" : ISODate("2011-01-08T10:11:23"), "up" : 20 }
    { "_id" : "ubuntu:10000", "ping" : ISODate("2011-01-08T10:11:23"), "up" : 1 }


_id is the hostname and port of the mongos. ping is the time of the most recent ping recorded for it. up is how long, in seconds, the mongos had been running as of that ping; it is not a flag for whether the mongos is currently up. If you bring up a mongos, even if it’s just for a few seconds, it will be added to this list and will never disappear. This doesn’t really matter, since you’re not going to be bringing up millions of mongos servers, but it’s something to be aware of so you don’t get confused when you look at the list.
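Because old entries never disappear, it can help to separate the mongos processes that pinged recently from the ones that are long gone. Here is a minimal sketch, assuming entries shaped like the config.mongos documents above. The staleMongos name and the threshold are illustrative and not part of MongoDB.

```javascript
// Hypothetical helper: return the _ids of mongos entries whose last ping
// is more than maxAgeMs milliseconds older than `now`.
function staleMongos(entries, now, maxAgeMs) {
  return entries
    .filter(function (entry) {
      return now.getTime() - entry.ping.getTime() > maxAgeMs;
    })
    .map(function (entry) {
      return entry._id;
    });
}
```

In the shell, this might be called as staleMongos(db.getSiblingDB("config").mongos.find().toArray(), new Date(), 60 * 1000) to list every mongos that has not pinged within the last minute.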


config.shards

All the shards in the cluster


config.databases

All the databases, sharded and non-sharded


config.collections

All the sharded collections


config.chunks

All the chunks in the cluster
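As an example of querying for specific info instead of reading the full summary, the chunk distribution for one collection can be tallied from config.chunks. This sketch assumes each chunk document has a shard field and, for the suggested query, an ns field, as in MongoDB releases of this era. chunksPerShard is a hypothetical helper name.

```javascript
// Hypothetical helper: count how many chunk documents live on each shard.
// Returns an object mapping shard name to chunk count.
function chunksPerShard(chunkDocs) {
  var counts = {};
  chunkDocs.forEach(function (chunk) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  });
  return counts;
}
```

In the shell, passing db.getSiblingDB("config").chunks.find({ "ns" : "test.foo" }).toArray() to this function should give the same per-shard totals that db.printShardingStatus() prints, without the chunk-by-chunk listing.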


config.settings contains (theoretically) tweakable settings that depend on the database version. Currently, config.settings allows you to change the chunk size (but don’t!) and turn off the balancer, which you usually shouldn’t need to do. You can change these settings by running an update. For example, to turn off the balancer:

> db.settings.update({"_id" : "balancer"}, {"$set" : {"stopped" : true }}, true)


If it’s in the middle of a balancing round, it won’t turn off until the current balancing has finished.
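The same update with stopped set to false turns the balancer back on. Here is a minimal sketch that wraps both directions. setBalancerStopped is a hypothetical name, and the settings collection is passed in as a parameter so the function does not depend on which database the shell is currently using.

```javascript
// Hypothetical helper: set the balancer's "stopped" flag in config.settings.
// `settings` is a collection handle such as db.getSiblingDB("config").settings.
function setBalancerStopped(settings, stopped) {
  // The third argument (true) requests an upsert, so the balancer document
  // is created if it does not exist yet.
  settings.update({ "_id" : "balancer" }, { "$set" : { "stopped" : stopped } }, true);
}
```

For example, setBalancerStopped(db.getSiblingDB("config").settings, false) would re-enable balancing after maintenance.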

The only other collection that might be of interest is the config.changelog collection. It is a very detailed log of every split and migrate that happens. You can use it to retrace the steps that got your cluster to whatever its current configuration is. Usually it is more detail than you need, though.
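To retrace what happened to a single collection, the changelog entries can be filtered and put in time order. This sketch assumes each entry has ns, time, and what fields, where what names the event (a split or a step of a migrate). The timeline helper is hypothetical.

```javascript
// Hypothetical helper: list the events for one namespace, oldest first,
// as "<ISO timestamp> <event name>" strings.
function timeline(entries, ns) {
  return entries
    .filter(function (entry) {
      return entry.ns === ns;
    })
    .sort(function (a, b) {
      return a.time - b.time; // Date objects subtract to milliseconds
    })
    .map(function (entry) {
      return entry.time.toISOString() + " " + entry.what;
    });
}
```

In the shell, the input could come from db.getSiblingDB("config").changelog.find().toArray(). On a busy cluster, it is probably better to filter by ns in the query itself.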

“I Want to Do X, Who Do I Connect To?”

If you want to do any sort of normal reads, writes, or administration, the answer is always “a mongos.” It can be any mongos (remember that they’re stateless), but it’s always a mongos—not a shard, not a config server.

You might connect to a config server or a shard if you’re trying to do something unusual. This might be looking at a shard’s data directly or manually editing a messed up configuration. For example, you’ll have to connect directly to a shard to change a replica set configuration.

Remember that config servers and shards are just normal mongods; anything you know how to do on a mongod you can do on a config server or shard. However, in the normal course of operation, you should almost never have to connect to them. All normal operations should go through mongos.

Scaling MongoDB

Learn more about this topic from Scaling MongoDB.

Create a MongoDB cluster that will grow to meet the needs of your application. With this short and concise ebook, you'll get guidelines for setting up and using clusters to store a large volume of data, and learn how to access the data efficiently. In the process, you'll understand how to make your application work with a distributed database system.
