Download a stable release from an Apache Download Mirror and unpack it on your local filesystem. For example:
% tar xzf hbase-x.y.z.tar.gz
As with Hadoop, you first need to tell HBase where Java is located on your system. If you have the JAVA_HOME environment variable set to point to a suitable Java installation, then that will be used, and you don’t have to configure anything further. Otherwise, you can set the Java installation that HBase uses by editing HBase’s conf/hbase-env.sh and setting the JAVA_HOME variable to point to a Java 1.6.0 installation.
Note:
HBase, just like Hadoop, requires Java 6.
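As a rough illustration of the JAVA_HOME check described above, the following sketch reports whether the variable points at a directory containing an executable bin/java. The function name check_java_home is hypothetical and is not part of HBase; the commented export line shows the kind of entry you would add to conf/hbase-env.sh, with a placeholder path.

```shell
# Hypothetical helper (not part of HBase): report whether JAVA_HOME
# points at a directory that contains an executable bin/java.
check_java_home() {
  if [ -n "${JAVA_HOME:-}" ] && [ -x "${JAVA_HOME}/bin/java" ]; then
    echo "using JAVA_HOME=${JAVA_HOME}"
  else
    echo "JAVA_HOME is unset or has no bin/java; set it in conf/hbase-env.sh" >&2
    return 1
  fi
}

# If the check fails, add a line like this to conf/hbase-env.sh
# (the path is a placeholder, not a real default):
#   export JAVA_HOME=/path/to/java-1.6.0

check_java_home || true
```

The helper only checks that a java binary exists; it does not verify the Java version.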
For convenience, add the HBase binary directory to your command-line path. For example:
% export HBASE_HOME=/home/hbase/hbase-x.y.z
% export PATH=$PATH:$HBASE_HOME/bin
To get the list of HBase options, type:
% hbase
Usage: hbase <command>
where <command> is one of:
  shell            run the HBase shell
  master           run an HBase HMaster node
  regionserver     run an HBase HRegionServer node
  zookeeper        run a Zookeeper server
  rest             run an HBase REST server
  thrift           run an HBase Thrift server
  avro             run an HBase Avro server
  migrate          upgrade an hbase.rootdir
  hbck             run the hbase 'fsck' tool
 or
  CLASSNAME        run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
Test Drive
To start a temporary instance of HBase that uses the /tmp directory on the local filesystem for persistence, type:
% start-hbase.sh
This will launch a standalone HBase instance that persists to the local filesystem; by default, HBase will write to /tmp/hbase-${USERID}.
To administer your HBase instance, launch the HBase shell by typing:
% hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version: 0.89.0-SNAPSHOT, ra4ea1a9a7b074a2e5b7b24f761302d4ea28ed1b2, Sun Jul 18 15:01:50 PDT 2010
hbase(main):001:0>
This will bring up a JRuby IRB interpreter that has had some HBase-specific commands added to it. Type help and then RETURN to see the list of shell commands grouped into categories. Type help COMMAND_GROUP for help by category or help COMMAND for help on a specific command and example usage. Commands use Ruby formatting to specify lists and dictionaries. See the end of the main help screen for a quick tutorial.
Now let us create a simple table, add some data, and then clean up.
To create a table, you must name your table and define its schema. A table’s schema comprises table attributes and the list of table column families. Column families themselves have attributes that you in turn set at schema definition time. Examples of column family attributes include whether the family content should be compressed on the filesystem and how many versions of a cell to keep. Schemas can be edited later by taking the table offline with the shell disable command, making the necessary alterations using alter, then putting the table back online with enable.
To create a table named test with a single column family named data, using defaults for table and column family attributes, enter:
hbase(main):007:0> create 'test', 'data'
0 row(s) in 1.3066 seconds
Tip:
If the previous command does not complete successfully, and the shell displays an error and a stack trace, your install was not successful. Check the master logs under the HBase logs directory—the default location for the logs directory is ${HBASE_HOME}/logs—for a clue as to where things went awry.
See the help output for examples of adding table and column family attributes when specifying a schema.
To prove the new table was created successfully, run the list command. This will output all tables in user space:
hbase(main):019:0> list
test
1 row(s) in 0.1485 seconds
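The disable, alter, enable cycle for editing a schema, described earlier, might look like the sketch below. It assumes the test table created above, that the HBase shell accepts commands on standard input, and an illustrative VERSIONS value of 5; the function name schema_edit_commands is hypothetical. If hbase is not on the path, the script only prints the commands it would have run.

```shell
# Hypothetical helper: emit the HBase shell commands for a schema edit.
# The VERSIONS value is illustrative, not a recommendation.
schema_edit_commands() {
  cat <<'EOF'
disable 'test'
alter 'test', {NAME => 'data', VERSIONS => 5}
enable 'test'
describe 'test'
exit
EOF
}

# Feed the commands to the HBase shell when it is available;
# otherwise just print them for inspection.
if command -v hbase >/dev/null 2>&1; then
  schema_edit_commands | hbase shell
else
  schema_edit_commands
fi
```

The describe command at the end prints the table's schema so you can confirm that the change took effect.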
To insert data into three different rows and columns in the data column family, and then list the table content, do the following:
hbase(main):021:0> put 'test', 'row1', 'data:1', 'value1'
0 row(s) in 0.0454 seconds
hbase(main):022:0> put 'test', 'row2', 'data:2', 'value2'
0 row(s) in 0.0035 seconds
hbase(main):023:0> put 'test', 'row3', 'data:3', 'value3'
0 row(s) in 0.0090 seconds
hbase(main):024:0> scan 'test'
ROW                          COLUMN+CELL
 row1                        column=data:1, timestamp=1240148026198, value=value1
 row2                        column=data:2, timestamp=1240148040035, value=value2
 row3                        column=data:3, timestamp=1240148047497, value=value3
3 row(s) in 0.0825 seconds
Notice how we added three new columns without changing the schema.
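To see the same point on a single row, the sketch below adds a fourth column to row1 and reads that row back with get. The column name data:4 and its value are illustrative, the function name add_and_read_commands is hypothetical, and the script assumes the HBase shell accepts commands on standard input. If hbase is not on the path, it only prints the commands.

```shell
# Hypothetical helper: emit HBase shell commands that add one more
# column to an existing row and then fetch that row.
add_and_read_commands() {
  cat <<'EOF'
put 'test', 'row1', 'data:4', 'value4'
get 'test', 'row1'
exit
EOF
}

# Run against a live instance when the hbase launcher is available;
# otherwise print the commands.
if command -v hbase >/dev/null 2>&1; then
  add_and_read_commands | hbase shell
else
  add_and_read_commands
fi
```

Unlike scan, which walks the whole table, get fetches the cells of one row.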
To remove the table, you must first disable it before dropping it:
hbase(main):025:0> disable 'test'
09/04/19 06:40:13 INFO client.HBaseAdmin: Disabled test
0 row(s) in 6.0426 seconds
hbase(main):026:0> drop 'test'
09/04/19 06:40:17 INFO client.HBaseAdmin: Deleted test
0 row(s) in 0.0210 seconds
hbase(main):027:0> list
0 row(s) in 2.0645 seconds
Shut down your HBase instance by running:
% stop-hbase.sh
To learn how to set up a distributed HBase and point it at a running HDFS, see the Getting Started section of the HBase documentation.