Skip to content

Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White

Notifications You must be signed in to change notification settings

linjk/hadoop-book

This branch is 156 commits behind tomwhite/hadoop-book:master.

Folders and files

NameName
Last commit message
Last commit date

Latest commit

0087add · Jan 12, 2014
Jul 23, 2010
Jan 12, 2014
Jan 10, 2014
Jan 9, 2014
Feb 8, 2012
Feb 2, 2012
Jan 9, 2014
Jan 10, 2014
Jan 10, 2014
Jan 11, 2012
Jan 19, 2012
Jan 29, 2012
Jan 12, 2014
Feb 6, 2012
Jan 13, 2012
Aug 22, 2011
Feb 2, 2012
Jan 11, 2012
Jan 25, 2012
Jan 10, 2014
Jan 29, 2012
Jan 10, 2014
Jan 29, 2012
Jan 12, 2014
Jan 13, 2012

Repository files navigation

Example code for "Hadoop: The Definitive Guide, Third Edition" by Tom White.
Copyright (C) 2011 Tom White, 978-1-449-31152-0

http://www.hadoopbook.com/
http://oreilly.com/catalog/9781449311520/

The code is hosted at http://github.com/tomwhite/hadoop-book/. You can find code
for the first edition at http://github.com/tomwhite/hadoop-book/tree/1e, and
for the second edition at http://github.com/tomwhite/hadoop-book/tree/2e.

This version of the code has been tested with:
 * Hadoop 1.2.1/0.22.0/0.23.x/2.2.0
 * Avro 1.5.4
 * Pig 0.9.1
 * Hive 0.8.0
 * HBase 0.90.4/0.94.15
 * ZooKeeper 3.4.2
 * Sqoop 1.4.0-incubating
 * MRUnit 0.8.0-incubating

Before running the examples you need to install Hadoop, Pig, Hive, HBase,
ZooKeeper, and Sqoop (as appropriate) as explained in the book.

You also need to install Maven.

Then you can build the code with:

% mvn package -DskipTests

By default Hadoop 1.2.1 is used. This can be changed by specifying the
hadoop.version property, e.g.

% mvn package -DskipTests -Dhadoop.version=1.2.0

There are profiles for different Hadoop major versions and distributions,
specified in hadoop-meta/pom.xml, and they are specified using the hadoop.distro
property. For example, to use the default version of Hadoop 2:

% mvn package -DskipTests -Dhadoop.distro=apache-2

Again, you can specify hadoop.version to use a particular Hadoop 2 version:

% mvn package -DskipTests -Dhadoop.distro=apache-2 -Dhadoop.version=2.1.1-beta

You should then be able to run the examples from the book.

Chapter names for "Hadoop: The Definitive Guide", Third Edition

ch01 - Meet Hadoop
ch02 - MapReduce
ch03 - The Hadoop Distributed Filesystem
ch04 - Hadoop I/O
ch05 - Developing a MapReduce Application
ch06 - How MapReduce Works
ch07 - MapReduce Types and Formats
ch08 - MapReduce Features
ch09 - Setting Up a Hadoop Cluster
ch10 - Administering Hadoop
ch11 - Pig
ch12 - Hive
ch13 - HBase
ch14 - ZooKeeper
ch15 - Sqoop
ch16 - Case Studies

app1 - Installing Apache Hadoop
app2 - Cloudera's Distribution for Hadoop
app3 - Preparing the NCDC Weather Data

About

Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Makefile 90.0%
  • Java 9.0%
  • Shell 0.4%
  • Perl 0.3%
  • Python 0.2%
  • PigLatin 0.1%