O'Reilly Answers is a community site for sharing knowledge, asking questions, and providing answers that brings together our customers, authors, editors, conference speakers, and Foo (Friends of O'Reilly).
There are some practical techniques that are worth knowing about
when you are developing and running Pig programs. This section covers
some of them.

Parallelism

When running in Hadoop mode...
Is the cluster set up correctly? The best way to answer this
question is empirically: run some jobs and confirm that you get the
expected results. Benchmarks make good tests, as you also...
You can run a MapReduce job with a single line of code:
JobClient.runJob(conf). It’s very short, but it
conceals a great deal of processing behind the scenes. This section
uncov...
To take advantage of the parallel processing that Hadoop provides,
we need to express our query as a MapReduce job. After some local,
small-scale testing, we will be able to run it on a ...