Discussion: MapReduce Implementation and Experiment
Investigate the implementation strategy of a MapReduce system.
- Give a sample program using the MapReduce programming pattern in your favorite programming language for one of the following problems:
  - Count word frequencies in text files on a single computer system
  - Compute and output TF-IDF for text files on a single computer system
- Design a MapReduce system and document the design in pseudocode. Consider the following components:
  - User program
  - Master
  - Map worker
  - Reduce worker
- Set up Apache Hadoop on a cluster of virtual machines, then write and run a program for one of the problems above.
  - To run multiple virtual machines, it is advisable to set up Linux guests without a GUI
- Prepare a slide deck and record a video of your experiments.
- Include the resources used (LLMs such as ChatGPT, websites, papers, GitHub repositories, blogs, etc.)
Be prepared to present selected slides and your video.
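
As a starting point for the word-frequency problem listed above, a minimal single-machine sketch of the MapReduce pattern in Python might look like the following. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`), the tokenizing regular expression, and the in-memory dictionary of documents are illustrative assumptions and not part of the assignment; a real submission would likely read the text files from disk.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Map: emit one (word, 1) pair for every word occurrence in a document."""
    # Assumed tokenization: lowercase runs of letters, digits, and apostrophes.
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all intermediate values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum the partial counts collected for one word."""
    return word, sum(counts)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over {doc_id: text}."""
    intermediate = []
    for doc_id, text in documents.items():
        intermediate.extend(map_phase(doc_id, text))
    grouped = shuffle(intermediate)
    return dict(reduce_phase(word, counts) for word, counts in sorted(grouped.items()))


if __name__ == "__main__":
    # Hypothetical sample input standing in for the contents of text files.
    docs = {
        "a.txt": "the quick brown fox",
        "b.txt": "the lazy dog and the fox",
    }
    for word, count in word_count(docs).items():
        print(word, count)
```

The three phases are kept as separate functions so that the same map and reduce logic could later be moved to Map workers and Reduce workers, with the shuffle step replaced by whatever partitioning the designed system uses.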