O'Reilly Answers is a community site for sharing knowledge, asking questions, and providing answers that brings together our customers, authors, editors, conference speakers, and Foo (Friends of O'Reilly).
This excerpt covers some operational aspects of running a cluster. Once you have a cluster up and running, how do you know what’s going on? Below we have some information from the O'Reilly publi...
Managing multiple ESX Servers is easy with VMware. This excerpt from Troy & Helmke's VMware Cookbook will guide you through the process of creating a cluster with the vCenter client.
If you ...
Big Data is when your data is so large you seriously have to consider how you're going to organize, store, and manage it in order to gain some benefit from it. Here are a few links to get you star...
There are some practical techniques that are worth knowing about
when you are developing and running Pig programs. This section covers
some of them.
Parallelism
When running in Hadoop mode...
Is the cluster set up correctly? The best way to answer this
question is empirically: run some jobs and confirm that you get the
expected results. Benchmarks make good tests, as you also...
You can run a MapReduce job with a single line of code:
JobClient.runJob(conf). It’s very short, but it
conceals a great deal of processing behind the scenes. This section
uncov...
To take advantage of the parallel processing that Hadoop provides,
we need to express our query as a MapReduce job. After some local,
small-scale testing, we will be able to run it on a ...