The instructor provides a short, incomplete tutorial on setting up a Hadoop cluster.

Complete the following:

  1. Set up a Hadoop cluster capable of running MapReduce workloads, if you have not already done so.
  2. Add the ZooKeeper service to the cluster (see References).
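
For step 2, one way to confirm that a ZooKeeper server is up is to send it a four-letter command over its client port and read the reply. The following is a minimal sketch, not a required method. It assumes a server reachable at localhost on the default client port 2181 (both are placeholders for your own setup) and that the `srvr` command is allowed by the server's four-letter-word whitelist. The helper names `four_letter_word` and `parse_mode` are made up for illustration.

```python
import socket
from typing import Optional


def four_letter_word(host: str, port: int, cmd: str = "srvr",
                     timeout: float = 3.0) -> str:
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output: str) -> Optional[str]:
    """Extract the value of the 'Mode:' line (e.g. leader, follower, standalone)."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


if __name__ == "__main__":
    # Assumption: replace host and port with those of your own ZooKeeper node.
    reply = four_letter_word("localhost", 2181)
    print("ZooKeeper mode:", parse_mode(reply))
```

Running this against each node of an ensemble should show one node in leader mode and the others in follower mode. A single-node setup reports standalone.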

Prepare slides, discussing:

  • Motivation. (e.g., Why do we need ZooKeeper?)
  • Problem. (e.g., What problem does ZooKeeper solve?)
  • Design. (e.g., How does ZooKeeper help? Are there any trade-offs? If so, what are they?)
  • Experiment. (e.g., completing the setup and running experiments to observe in what ways ZooKeeper helps)

Be prepared to present selected slides.

References:

  1. “HDFS High Availability”, https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html, retrieved November 2025.
  2. “Apache ZooKeeper Project”, https://zookeeper.apache.org/, retrieved November 2025.