This tutorial contains step-by-step instructions for installing Hadoop 2.x on Mac OS X El Capitan. These instructions should also work on other Mac OS X versions such as Yosemite and Sierra. The tutorial runs Hadoop in pseudo-distributed mode, which lets a single machine run the different components of the system in separate Java processes. We will also configure YARN as the resource manager for running jobs on Hadoop.
Hadoop Component Versions
- Java 7 or higher. Java 8 is recommended.
- Hadoop 2.7.3 or higher.
Hadoop Installation on Mac OS X Sierra & El Capitan
Step 1: Install Java
Hadoop 2.7.3 requires Java 7 or higher. Run the following command in a terminal to verify the Java version installed on the system.
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
If Java is not installed, you can get it from here.
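The version check in this step is most likely the standard `java -version` call. A minimal sketch, where `java_status` is just a name chosen for this example:

```shell
# java -version writes its report to stderr, so 2>&1 makes it visible
# when the output is piped or captured.
if java -version 2>&1; then
  java_status="ok"
else
  java_status="missing"
  echo "No working java found; install a JDK (version 7 or higher) first."
fi
```

The first line of the report carries the version string; anything starting with 1.7 or 1.8 satisfies the requirement above.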
Step 2: Configure SSH
When Hadoop runs in distributed mode, it uses passwordless SSH for communication between the master and slave nodes. To enable the SSH daemon on a Mac, go to System Preferences => Sharing and check Remote Login. Then execute the following commands in a terminal to enable passwordless SSH login:
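The original commands are not shown here, so the following is a sketch of the usual approach: generate an RSA key pair with an empty passphrase and authorize it for logins to the same machine. It assumes the default key location `~/.ssh/id_rsa`; `SSH_DIR` is a name chosen for this example.

```shell
SSH_DIR="$HOME/.ssh"
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"

# Generate a key pair with an empty passphrase, unless one already exists.
if [ ! -f "$SSH_DIR/id_rsa" ] && command -v ssh-keygen >/dev/null 2>&1; then
  ssh-keygen -t rsa -P '' -f "$SSH_DIR/id_rsa"
fi

# Authorize the public key for logins to this machine (skipped if already listed).
if [ -f "$SSH_DIR/id_rsa.pub" ]; then
  if ! grep -qxF "$(cat "$SSH_DIR/id_rsa.pub")" "$SSH_DIR/authorized_keys" 2>/dev/null; then
    cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
  fi
  chmod 600 "$SSH_DIR/authorized_keys"
fi
```

Afterwards, `ssh localhost` should log in without asking for a password.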
Step 3: Install Hadoop
Download the Hadoop 2.7.3 binary archive from this link (about 200 MB) and extract its contents to a folder of your choice.
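Extraction can also be done from the terminal. This sketch assumes the download is the `hadoop-2.7.3.tar.gz` archive that Apache publishes for this release, saved in `~/Downloads`, and that it should be unpacked into the home folder; `ARCHIVE` and `TARGET_DIR` are names chosen for this example.

```shell
# Assumed locations; adjust both to match your system.
ARCHIVE="${ARCHIVE:-$HOME/Downloads/hadoop-2.7.3.tar.gz}"
TARGET_DIR="${TARGET_DIR:-$HOME}"

if [ -f "$ARCHIVE" ]; then
  # -x extract, -z gunzip, -f archive file, -C destination folder
  tar -xzf "$ARCHIVE" -C "$TARGET_DIR"
  echo "Hadoop home folder: $TARGET_DIR/hadoop-2.7.3"
else
  echo "Archive not found at $ARCHIVE"
fi
```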
Step 4: Configure Hadoop
First we need to configure the location of our Java installation in etc/hadoop/hadoop-env.sh. To find the location of the Java installation, run the following command in a terminal:
Copy the output of the command and use it to set the JAVA_HOME variable in etc/hadoop/hadoop-env.sh.
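On Mac OS X the Java location is normally reported by the `/usr/libexec/java_home` utility. The sketch below assumes that is the command meant here and prints the line to place in hadoop-env.sh; `JAVA_HOME_PATH` and `HADOOP_ENV_LINE` are names chosen for this example.

```shell
# /usr/libexec/java_home is a Mac OS X utility that prints the home folder
# of the default JDK. It does not exist on other systems.
if [ -x /usr/libexec/java_home ]; then
  JAVA_HOME_PATH="$(/usr/libexec/java_home 2>/dev/null || true)"
else
  JAVA_HOME_PATH=""
fi

# The line to put in etc/hadoop/hadoop-env.sh, replacing the existing JAVA_HOME line.
HADOOP_ENV_LINE="export JAVA_HOME=$JAVA_HOME_PATH"
echo "$HADOOP_ENV_LINE"
```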
Next, modify the following Hadoop configuration files to set up Hadoop and YARN. These files are located in etc/hadoop.
etc/hadoop/core-site.xml
etc/hadoop/hdfs-site.xml
etc/hadoop/mapred-site.xml
etc/hadoop/yarn-site.xml
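The contents of these four files are not reproduced here. As a sketch, the script below writes a typical minimal pseudo-distributed configuration: the values follow the standard Hadoop single-node setup, plus the 98.5 disk utilization threshold. It assumes it is run from the Hadoop home folder and it overwrites the four files; `CONF_DIR` is a name chosen for this example.

```shell
CONF_DIR="${CONF_DIR:-etc/hadoop}"
mkdir -p "$CONF_DIR"

# core-site.xml: address of the default file system (HDFS on localhost).
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# hdfs-site.xml: a single machine can hold only one replica of each block.
cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

# mapred-site.xml: run MapReduce jobs on YARN.
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

# yarn-site.xml: shuffle service for MapReduce and a raised disk utilization threshold.
cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
    <value>98.5</value>
  </property>
</configuration>
EOF
```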
Note the disk utilization threshold set in yarn-site.xml. It tells YARN to keep operating as long as disk utilization stays below 98.5%. This was required on my system, where disk utilization was 95% and the default threshold is 90%. If disk utilization exceeds the configured threshold, YARN reports the node as unhealthy with the error 'local-dirs are bad'.
Step 5: Initialize Hadoop Cluster
From a terminal window, switch to the Hadoop home folder (the folder that contains sub folders such as bin and etc). Run the following command to initialize the metadata for the Hadoop cluster. This formats the HDFS file system and configures it on the local system. By default, files are created in the /tmp/hadoop-<username> folder.
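Formatting is normally done with the `namenode -format` subcommand of `bin/hdfs`. A minimal sketch, assuming `HADOOP_HOME` (a variable chosen for this example) points at the Hadoop home folder and defaults to the current folder:

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"

if [ -x "$HADOOP_HOME/bin/hdfs" ]; then
  # Creates fresh name node metadata; any existing HDFS metadata is erased.
  "$HADOOP_HOME/bin/hdfs" namenode -format
else
  echo "bin/hdfs not found under $HADOOP_HOME; switch to the Hadoop home folder first."
fi
```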
The default location of the name node metadata can be changed by setting the dfs.namenode.name.dir property in the hdfs-site.xml file (the older name dfs.name.dir is deprecated in Hadoop 2.x). Similarly, the storage location of HDFS data blocks can be changed with the dfs.datanode.data.dir property (formerly dfs.data.dir).
The following commands should be executed from the hadoop home folder.
Step 6: Start Hadoop Cluster
Run the following command from a terminal (after switching to the Hadoop home folder) to start the Hadoop cluster. This starts the name node and the data node on the local system.
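The HDFS daemons are started by the `start-dfs.sh` script in the sbin folder. A sketch, again assuming `HADOOP_HOME` (a name chosen for this example) defaults to the current folder:

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"

if [ -x "$HADOOP_HOME/sbin/start-dfs.sh" ]; then
  # Starts the name node, data node and secondary name node over SSH to localhost.
  "$HADOOP_HOME/sbin/start-dfs.sh"
else
  echo "sbin/start-dfs.sh not found under $HADOOP_HOME"
fi
```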
To verify that the namenode and datanode daemons are running, execute the following command on the terminal. This displays running Java processes on the system.
29219 Jps
19126 NameNode
19303 SecondaryNameNode
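The listing above is in the format printed by `jps`, the process status tool that ships with the JDK. The sketch below wraps it in a small check that reports which HDFS daemons appear in the listing; `check_daemons` is a helper written for this example, not part of Hadoop.

```shell
# Report, for each expected HDFS daemon, whether it appears in a jps listing.
check_daemons() {
  listing="$1"
  for daemon in NameNode DataNode SecondaryNameNode; do
    case "$listing" in
      # jps prints "<pid> <name>", so the name is always preceded by a space.
      *" $daemon"*) echo "$daemon: running" ;;
      *) echo "$daemon: NOT running" ;;
    esac
  done
}

if command -v jps >/dev/null 2>&1; then
  check_daemons "$(jps)"
else
  echo "jps not found; it is part of the JDK, not the JRE."
fi
```

If a daemon is reported as not running, the log files in the logs folder under the Hadoop home folder are the first place to look.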
Step 7: Configure HDFS Home Directories
We will now create the HDFS home directory, which has the form /user/<username>. My user id on the Mac is jj; replace it with your own user name. Run the following commands in a terminal:
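The directory can be created with `hdfs dfs -mkdir`. This sketch uses the current login name instead of a hard-coded jj; `HADOOP_HOME` and `HDFS_USER` are names chosen for this example.

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"
HDFS_USER="${USER:-$(id -un)}"

if [ -x "$HADOOP_HOME/bin/hdfs" ]; then
  # -p also creates the parent /user directory if it does not exist yet.
  "$HADOOP_HOME/bin/hdfs" dfs -mkdir -p "/user/$HDFS_USER"
else
  echo "bin/hdfs not found under $HADOOP_HOME"
fi
```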
Step 8: Run YARN Manager
Start the YARN resource manager and node manager by running the following command in a terminal:
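Both YARN daemons are started by the `start-yarn.sh` script. A sketch, assuming `HADOOP_HOME` (a name chosen for this example) defaults to the current folder:

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"

if [ -x "$HADOOP_HOME/sbin/start-yarn.sh" ]; then
  # Starts the ResourceManager and a NodeManager on the local machine.
  "$HADOOP_HOME/sbin/start-yarn.sh"
else
  echo "sbin/start-yarn.sh not found under $HADOOP_HOME"
fi
```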
Run the jps command again to verify that all the processes are running:
29283 Jps
19413 ResourceManager
19126 NameNode
19303 SecondaryNameNode
19497 NodeManager
Step 9: Verify Hadoop Installation
Access the URL http://localhost:50070/dfshealth.html to view hadoop name node configuration. You can also navigate the hdfs file system using the menu Utilities => Browse the file system.
Access the URL http://localhost:8088/cluster to view the hadoop cluster details through YARN resource manager.
Step 10: Run Sample MapReduce Job
The Hadoop installation includes a number of sample MapReduce jobs. We will run one of them to verify that the installation works.
We will first copy a file from the local system to the HDFS home folder. We will use core-site.xml in etc/hadoop as our input:
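One way to do the copy is `hdfs dfs -put`. In the sketch below the relative destination `.` resolves to the HDFS home folder /user/<username>; `HADOOP_HOME` is a name chosen for this example.

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"

if [ -x "$HADOOP_HOME/bin/hdfs" ]; then
  # Upload the local file into the HDFS home folder.
  "$HADOOP_HOME/bin/hdfs" dfs -put "$HADOOP_HOME/etc/hadoop/core-site.xml" .
  # List the home folder to confirm the upload.
  "$HADOOP_HOME/bin/hdfs" dfs -ls .
else
  echo "bin/hdfs not found under $HADOOP_HOME"
fi
```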
Verify that the file is in the HDFS home folder by browsing to it in the name node web console.
Let us run a MapReduce program on this HDFS file to count the occurrences of the word 'configuration' in the file. A MapReduce program that does this kind of counting is available in the Hadoop samples.
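The exact command is not shown here. The examples jar shipped with Hadoop 2.7.3 includes a `grep` job that counts matches of a regular expression, and this sketch assumes that is the job intended, since it counts a single word. `HADOOP_HOME` and `EXAMPLES_JAR` are names chosen for this example.

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"
EXAMPLES_JAR="$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar"

if [ -x "$HADOOP_HOME/bin/hadoop" ] && [ -f "$EXAMPLES_JAR" ]; then
  # Arguments: job name, HDFS input path, HDFS output folder, regular expression.
  # The output folder must not exist yet, or the job will refuse to start.
  "$HADOOP_HOME/bin/hadoop" jar "$EXAMPLES_JAR" grep core-site.xml output 'configuration'
else
  echo "Hadoop or the examples jar was not found under $HADOOP_HOME"
fi
```

While the job runs, its progress can be followed in the YARN resource manager console at http://localhost:8088/cluster.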
This runs the MapReduce job on the HDFS file uploaded earlier and writes the results to the output folder inside the HDFS home folder. The result file is named part-r-00000. You can download it from the name node web console, or run the following command to copy it to the local folder.
Print the contents of the file. This contains the number of occurrences of the word 'configuration' in core-site.xml.
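The copy and print steps can be sketched as follows, assuming the job wrote its result to the `output` folder in the HDFS home folder; `HADOOP_HOME` is a name chosen for this example.

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"

if [ -x "$HADOOP_HOME/bin/hdfs" ]; then
  # Copy the result file from HDFS into the current local folder, then print it.
  "$HADOOP_HOME/bin/hdfs" dfs -get output/part-r-00000 .
  cat part-r-00000
else
  echo "bin/hdfs not found under $HADOOP_HOME"
fi
```

To skip the local copy, `hdfs dfs -cat output/part-r-00000` prints the file directly from HDFS.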
Finally, delete the uploaded file and the output folder from HDFS:
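A sketch of the cleanup, assuming the input file and the output folder both sit in the HDFS home folder; `HADOOP_HOME` is a name chosen for this example.

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"

if [ -x "$HADOOP_HOME/bin/hdfs" ]; then
  # Remove the uploaded input file.
  "$HADOOP_HOME/bin/hdfs" dfs -rm core-site.xml
  # Remove the output folder and everything in it.
  "$HADOOP_HOME/bin/hdfs" dfs -rm -r output
else
  echo "bin/hdfs not found under $HADOOP_HOME"
fi
```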
Step 11: Stop Hadoop/YARN Cluster
Run the following commands to stop the Hadoop and YARN daemons. This stops the name node, data node, node manager and resource manager.
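The stop scripts sit next to the start scripts in the sbin folder. This sketch stops YARN first and HDFS second, the reverse of the start order; `HADOOP_HOME` is a name chosen for this example.

```shell
HADOOP_HOME="${HADOOP_HOME:-$PWD}"

for script in stop-yarn.sh stop-dfs.sh; do
  if [ -x "$HADOOP_HOME/sbin/$script" ]; then
    "$HADOOP_HOME/sbin/$script"
  else
    echo "sbin/$script not found under $HADOOP_HOME"
  fi
done
```

Running jps afterwards should list only the Jps process itself.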