Installing Hadoop and YARN
This topic describes how to use package managers to download and install Hadoop and YARN services from the EEP repository.
About the Hadoop and YARN Packages
Beginning with core 6.2.0 and EEP 7.0.0, Hadoop and YARN services are no longer included in the data-fabric repository for core packages. They are provided as ecosystem components in the EEP repository. For example:
Old location:
https://package.ezmeral.hpe.com/releases/v<version>/redhat/
New location:
https://package.ezmeral.hpe.com/releases/MEP/MEP-<version>/redhat/
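The new-location pattern can be expressed as a small helper. This is only a sketch: the function name eep_repo_url is hypothetical, and the version you pass in must be one that actually exists in the repository.

```shell
#!/bin/sh
# Hypothetical helper: build the RPM repository URL for a given EEP version.
# The URL pattern is copied from the "New location" example; the function
# name and the argument handling are illustrative assumptions.
eep_repo_url() {
    version="$1"
    if [ -z "$version" ]; then
        echo "usage: eep_repo_url <eep_version>" >&2
        return 1
    fi
    printf 'https://package.ezmeral.hpe.com/releases/MEP/MEP-%s/redhat/\n' "$version"
}

# Example: print the URL for EEP 7.0.0
eep_repo_url 7.0.0
```

The helper only formats a string. It does not check that the URL is reachable or that the version has been published.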
Package | Description |
---|---|
mapr-hadoop-util | This package is new for Release 6.2.0. This package contains the essential libraries to run hadoop fs and hadoop mfs shell commands, plus the minimal required Hadoop libraries for core to be able to function. On a data-fabric core node, mapr-hadoop-util is the minimal package you need to install for Hadoop shell commands, and for data-fabric operations, such as maprlogin, to work. mapr-core and mapr-hadoop-client automatically pull in mapr-hadoop-util, so you don't need to install it explicitly. |
mapr-hadoop-client | This package is new for Release 6.2.0. This package contains the Hadoop job clients (MR and YARN). Clients can submit jobs to a server running mapr-hadoop-core. mapr-hadoop-client is sufficient to run all hadoop mfs and hadoop fs commands, and submit MapReduce jobs to whichever server is running mapr-hadoop-core. |
mapr-hadoop-core | This package contains all the required libraries to run MapReduce jobs locally. Installing mapr-hadoop-core installs mapr-hadoop-client and mapr-hadoop-util as dependencies. |
mapr-nodemanager | Installs the NodeManager service. This package installs mapr-hadoop-core as a dependency. |
mapr-resourcemanager | Installs the ResourceManager service. This package installs mapr-hadoop-core as a dependency. |
mapr-historyserver | Installs the HistoryServer service. This package installs mapr-hadoop-core as a dependency. |
mapr-timelineserver | Installs the TimelineServer service. |
mapr-httpfs | Installs the HttpFS service. Beginning with EEP 9.0.0, HttpFS is part of Hadoop. |
Note that the mapr-mapreduce2 package has been removed and is no longer available. mapr-hadoop-core obsoletes the mapr-mapreduce2 package. All the contents of mapr-mapreduce2 are now part of mapr-hadoop-core.
For package dependency information, see Package Dependencies.
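The dependency chain described in the package table can be summarized as a lookup. The following is a minimal sketch, not an official tool: the function name hadoop_deps is hypothetical, and it encodes only the dependencies that the table states explicitly. Packages for which the table lists no dependency print nothing.

```shell
#!/bin/sh
# Hypothetical helper: print the Hadoop packages that a given package pulls in,
# based only on what the package table states. It does not query a package
# manager, so the real dependency set on a node may be larger.
hadoop_deps() {
    case "$1" in
        mapr-hadoop-client)
            # Pulls in mapr-hadoop-util automatically.
            echo "mapr-hadoop-util" ;;
        mapr-hadoop-core)
            echo "mapr-hadoop-client mapr-hadoop-util" ;;
        mapr-nodemanager|mapr-resourcemanager|mapr-historyserver)
            # Each installs mapr-hadoop-core, which brings in the other two.
            echo "mapr-hadoop-core mapr-hadoop-client mapr-hadoop-util" ;;
        mapr-hadoop-util|mapr-timelineserver|mapr-httpfs)
            # The table lists no Hadoop package dependency for these.
            : ;;
        *)
            echo "unknown package: $1" >&2
            return 1 ;;
    esac
}

# Example: show what installing the NodeManager brings in
hadoop_deps mapr-nodemanager
```

For the authoritative list, rely on the Package Dependencies page or on the package manager itself.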
Where to Install the Packages
On these nodes | Install these packages |
---|---|
All nodes where you need access to the file system |
|
Designated nodes where Hadoop or YARN services are needed (install only the packages you need) |
|
Nodes where Hadoop or YARN services are installed | mapr-hadoop-core |
Client nodes and nodes where applications will be launched | mapr-hadoop-client |
Installing Hadoop and YARN Packages
The following steps use the operating-system package managers to download and install Hadoop and YARN packages from the EEP repository:
- Change to the root user or use sudo:
  - On RHEL, CentOS, or Oracle Linux, use the yum command to install the services that you want to run on the node.
    Syntax
        yum install <package_name> <package_name> <package_name>
    Example
        yum install mapr-hadoop-util mapr-nodemanager mapr-httpfs
  - On SLES, use the zypper command to install the services that you want to run on the node. (SLES support might be limited; for more information, see Operating System Support Matrix.)
    Syntax
        zypper install <package_name> <package_name> <package_name>
    Example
        zypper install mapr-hadoop-util mapr-nodemanager mapr-httpfs
  - On Ubuntu, use the apt-get commands to update the Ubuntu package cache and install the services that you want to run on the node.
    - Update the Ubuntu package cache:
        apt-get update
    - Install the services:
      Syntax
          apt-get install <package_name> <package_name> <package_name>
      Example
          apt-get install mapr-hadoop-util mapr-nodemanager mapr-httpfs
- On each node, run configure.sh with the -R option. Include the -TL option if the timeline server is installed on the cluster. For example:
      configure.sh -R -HS <hostname> -TL <hostname>
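The per-platform install commands above can be selected with a small dispatcher. The sketch below only prints the command and never runs it. The function name install_cmd is hypothetical, and the OS identifiers (rhel, centos, ol, sles, ubuntu) are assumed to be the ID values reported in /etc/os-release on those platforms.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the install command for an OS.
# First argument: OS identifier, assumed to match ID in /etc/os-release.
# Remaining arguments: the packages to install on this node.
install_cmd() {
    if [ "$#" -lt 2 ]; then
        echo "usage: install_cmd <os_id> <package_name>..." >&2
        return 1
    fi
    os_id="$1"
    shift
    case "$os_id" in
        rhel|centos|ol)
            echo "yum install $*" ;;
        sles)
            echo "zypper install $*" ;;
        ubuntu)
            # Ubuntu needs a package cache refresh before the install.
            echo "apt-get update && apt-get install $*" ;;
        *)
            echo "unsupported OS ID: $os_id" >&2
            return 1 ;;
    esac
}

# Example: the package set used in the examples above, on Ubuntu
install_cmd ubuntu mapr-hadoop-util mapr-nodemanager mapr-httpfs
```

Printing the command first lets you review it before running it as root or with sudo. After the packages are installed, configure.sh still has to be run on each node as described in the last step.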