Installer Prerequisites and Guidelines

The node on which you run the installer and the nodes you plan to include in your cluster must meet certain user, connectivity, and security requirements.

Installer Requirements

The node that runs the Installer must meet the following requirements:
Installer Node
Beginning with Installer 1.6, the node that runs the Installer does not need to be one of the nodes you plan to install the cluster on. Ensure that the default umask for the root user is set to 0022 on all nodes in the cluster. You can change the umask setting in the /etc/profile file, or in the .cshrc or .login file. The root user must have a 0022 umask because the cluster admin user requires access to all files and directories under the /opt/mapr directory, even those initially created by root services.
Note also that the Installer is not FIPS compliant, and running it on a FIPS-enabled node is not supported.
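As a quick check of the umask requirement, the following is a minimal sketch that you could run as root on each node. The function name umask_is_0022 is a hypothetical helper, not part of the Installer.

```shell
# umask_is_0022 VALUE: succeed if VALUE denotes the 0022 mask.
# Shells may print the mask with or without leading zeros.
umask_is_0022() {
    case "$1" in
        0022|022|22) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the mask of the current shell session.
current=$(umask)
if umask_is_0022 "$current"; then
    echo "umask is $current: OK"
else
    echo "umask is $current: expected 0022"
fi
```

This reports only the mask of the shell in which it runs. A persistent change still belongs in /etc/profile or in the .cshrc or .login file, as described above.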
Package Dependencies
Depending on the operating system, the Installer requires the following packages. If these packages are not found, the Installer attempts to download them from Internet repositories:
Ubuntu Nodes
  • ca-certificates
  • curl*
  • debianutils
  • dnsutils
  • iputils-arping
  • libnss3
  • libssl1.0.0
  • libsysfs2
  • netcat
  • nfs-common
  • ntp
  • ntpdate
  • openssl
  • python-dev
  • python-pycurl
  • sdparm
  • sudo
  • syslinux
  • sysstat
  • uuid-runtime
  • wget
Red Hat / CentOS Nodes
  • curl*
  • device-mapper
  • iputils
  • libsysfs
  • lvm2
  • nc
  • nfs-utils
  • ntp
  • nss
  • openssl
  • python-devel
  • sdparm
  • sudo
  • syslinux
  • sysstat
  • wget
  • which
  • yum-utils
  • compat-openssl10 (required only when running MapR 6.1.x on RHEL version 8 and above)
SLES Nodes
  • ca-certificates
  • curl*
  • device-mapper
  • iputils
  • libopenssl1_0_0
  • sysfsutils
  • lvm2
  • mozilla-nss
  • nfs-client
  • ntp
  • sdparm
  • sudo
  • syslinux
  • sysstat
  • util-linux
  • wget
  • libfreebl3

*The curl version must be greater than 7.51.0.
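
The curl version requirement can be checked before you run the Installer. The sketch below assumes GNU sort (for the -V version ordering) and assumes that the first line of the curl --version output has the version number as its second field. The function name version_gt is a hypothetical helper.

```shell
# version_gt A B: succeed if dotted version A is strictly greater than B.
# Relies on "sort -V" (GNU coreutils), which orders version strings.
version_gt() {
    [ "$1" = "$2" ] && return 1
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Compare the installed curl, if any, against the documented threshold.
if command -v curl >/dev/null 2>&1; then
    installed=$(curl --version | awk 'NR == 1 { print $2 }')
    if version_gt "$installed" 7.51.0; then
        echo "curl $installed is newer than 7.51.0"
    else
        echo "curl $installed is too old; a version greater than 7.51.0 is required"
    fi
fi
```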

Repository Connectivity
The Installer requires connectivity to valid repositories for the:
  • Linux operating system
  • Core
  • Ecosystem Pack (EEP)
The Installer can connect to an Internet repository or to a preinstalled local repository, as described in Using a Local, Shared Repository With the Installer. If the Installer dependencies and packages are present, but there is no connectivity to an OS repository, the Installer fails with the following message:
ERROR: Unable to install dependencies (installer). Ensure that a core OS repo is enabled and retry mapr-setup.sh
Java
Installer 1.14 and later require Java JDK 11 or an equivalent Java distribution. Before using Installer 1.14 on Ubuntu 16.04 nodes, you must manually install the JDK. If you are using Installer 1.14 on RHEL/CentOS or SLES, the Installer installs OpenJDK 11 for you.

For more information about the supported Java JDK versions, see the Java Support Matrix and Java.

SSH Access
The Installer must have SSH access to all nodes that you want to include in the cluster.
Port Availability
Port 9443, or the non-default port that you configure using mapr-setup.sh, must be accessible on the Installer node from all nodes that you want to include in the cluster.
Files Extracted into /tmp Require Execute Privileges
Do not mount /tmp with the noexec option. The HPE Ezmeral Data Fabric extracts certain files into /tmp and runs them from there, so some processes can fail if noexec is set for /tmp. In addition, if you use the java.io.tmpdir variable to change the location of the temporary directory used by Java processes, the newly specified temporary directory must not be mounted with the noexec option either.

Perform the following steps to change the location of the temporary directory used by Java processes using the java.io.tmpdir variable:

  1. Create a custom tmp directory for mapr and set its permissions to match those of /tmp.
    # mkdir /opt/mapr/tmp
    # chmod 1777 /opt/mapr/tmp
  2. Set the custom tmp directory as java.io.tmpdir.
    1. For Java version 8 and earlier, append the following line to the /opt/mapr/conf/env_override.sh file:
      export JAVA_OPTIONS="-Djava.io.tmpdir=/opt/mapr/tmp"
    2. For Java version 9 and later, run the following command:
      export JDK_JAVA_OPTIONS="-Djava.io.tmpdir=/opt/mapr/tmp"
  3. Restart the mapr-warden service on the node.
    NOTE
    You cannot suppress the Picked up _JAVA_OPTIONS: <…> message, because it is emitted by the Java implementation itself.
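To confirm that /tmp or a custom temporary directory is not mounted with noexec, something like the following sketch may help. It assumes that findmnt from util-linux is available. The function names has_noexec and check_tmpdir are hypothetical helpers.

```shell
# has_noexec OPTS: succeed if the comma-separated mount options include noexec.
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_tmpdir DIR: report whether the filesystem that holds DIR is mounted noexec.
# Returns 0 if execution is allowed, 1 if noexec is set, 2 if the lookup failed.
check_tmpdir() {
    opts=$(findmnt -n -o OPTIONS --target "$1") || return 2
    if has_noexec "$opts"; then
        echo "$1 is mounted with noexec"
        return 1
    fi
    echo "$1 allows execution"
}
```

For example, check_tmpdir /tmp and check_tmpdir /opt/mapr/tmp cover the default location and the custom directory created in the steps above.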
Supported Web Browsers
Once the Installer is installed and configured, you can use the following web browsers to access the Installer web interface:
  • Safari
  • Firefox
  • Chrome

Cluster Admin User Requirements

The installation process requires a valid cluster admin user to be present on all nodes in the cluster. The Installer can create a user (the mapr user) for you or use a user that you have created. If you create the cluster admin user yourself, make sure the following conditions are met:

  • The user must have a home directory and a password.
  • The user must be present on all nodes in the cluster.
  • The numeric user and group IDs (MAPR_UID and MAPR_GID) must be configured for the user, and these values must match on all nodes.
  • The mapr user and root user must be configured to use bash. Other shells are not supported.

If the user is not a valid user, installation errors can result. For information about creating the user, see Managing Users and Groups.
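
One way to confirm that the numeric IDs match on all nodes is to collect them and compare. The sketch below reads lines of the form "host uid gid" from standard input. The function name ids_consistent and the way the lines are gathered are assumptions for illustration.

```shell
# ids_consistent: read "host uid gid" lines on stdin and succeed only if
# every line carries the same uid and gid as the first line.
ids_consistent() {
    awk 'NR == 1 { uid = $2; gid = $3 }
         $2 != uid || $3 != gid { bad = 1 }
         END { exit bad + 0 }'
}

# Example with literal data: the second node has a different uid.
if printf 'node1 5000 5000\nnode2 5001 5000\n' | ids_consistent; then
    echo "IDs match"
else
    echo "IDs differ between nodes"
fi
```

The input lines could be produced by running id -u and id -g for the cluster admin user on each node over SSH. The node names here are placeholders.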

If you choose to have the Installer create the user, the Installer runs the following command to add a local user to serve as the cluster admin user:

useradd -m -u $MAPR_UID -g $MAPR_GID -G $(stat -c '%G' /etc/shadow) $MAPR_USER
In this command:
  • MAPR_USER defaults to mapr.
  • MAPR_UID defaults to 5000.
  • MAPR_GID defaults to 5000.
  • The home directory is typically /home/mapr.

The Installer also adds the following to the MAPR_USER .bashrc file:

[[ -f /opt/mapr/conf/env.sh ]] && . /opt/mapr/conf/env.sh

Node Requirements

Nodes that you want to include in the cluster must meet the following criteria:

Minimum Cluster Size
The latest Installer requires a minimum of five data nodes. However, more nodes are recommended. The Installer can install clusters with fewer nodes, but you should review the special considerations for smaller clusters in Minimum Cluster Size.
Fully Qualified Domain Names (FQDNs)
The nodes are expressed as fully qualified domain names (FQDNs), as described in Connectivity. Do not specify hostnames as aliases or IP addresses.
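A rough syntactic check can catch short names and IPv4 addresses before they are entered into the Installer. This is only a sketch: the function name is_fqdn is hypothetical, and the check cannot tell a real FQDN from an alias that happens to contain dots.

```shell
# is_fqdn NAME: succeed if NAME looks like a fully qualified domain name,
# that is, it contains a dot and is not made up solely of digits and dots.
is_fqdn() {
    case "$1" in
        *[!0-9.]*) ;;        # has a character other than digits and dots
        *) return 1 ;;       # empty, or looks like an IPv4 address
    esac
    case "$1" in
        ?*.?*) return 0 ;;   # at least one dot with text on both sides
        *) return 1 ;;
    esac
}

# Check the name that this node reports for itself.
name=$(hostname -f 2>/dev/null || hostname)
if is_fqdn "$name"; then
    echo "$name looks like an FQDN"
else
    echo "$name does not look like an FQDN"
fi
```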
OS and Security Updates
Nodes are configured to accept operating system and security updates. They must also be patched with the latest security fixes. See your operating-system vendor documentation for details.
Disk Space Requirements
Nodes meet the requirements listed in Preparing Each Node. The Installer verifies the requirements prior to installation.
OS-partition, disk, and swap-space requirements are the same whether you install the cluster manually or by using the Installer. See Minimum Disk Space.
For data disks, Installer versions 1.12.0.0 and later require a minimum disk size that is equal to the physical memory on the node. If a data disk does not meet the minimum disk size requirement, a verification error is generated.
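The data-disk rule can be approximated ahead of time. The sketch below assumes that "physical memory" corresponds to the MemTotal value in /proc/meminfo, which may differ from the figure the Installer uses. The function names are hypothetical helpers.

```shell
# disk_meets_minimum DISK_BYTES MEM_KB: succeed if the disk size in bytes
# is at least the memory size, which is given in kB.
disk_meets_minimum() {
    [ "$1" -ge $(( $2 * 1024 )) ]
}

# mem_total_kb: print MemTotal from /proc/meminfo in kB (Linux only).
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}
```

For a candidate disk, the size in bytes can be read with lsblk -b -d -n -o SIZE followed by the device path, stripped of whitespace, and passed as the first argument together with the output of mem_total_kb as the second.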
Access to the Installer Node
Nodes have HTTPS access to the Installer node over port 9443.
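To test this access from a cluster node, a curl probe such as the following sketch may be enough. The function name can_reach_installer is hypothetical, and the -k option assumes that the Installer may present a certificate that the node does not trust.

```shell
# can_reach_installer HOST [PORT]: succeed if an HTTPS request to HOST:PORT
# receives any response. PORT defaults to 9443.
can_reach_installer() {
    # -k skips certificate verification (assumption: certificate may be untrusted);
    # the timeouts keep the probe from hanging on a filtered port.
    curl -k -s -o /dev/null --connect-timeout 5 --max-time 10 \
        "https://$1:${2:-9443}/"
}
```

For example, can_reach_installer installer.example.com checks the default port, where installer.example.com stands in for the FQDN of your Installer node.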
Proxy Server Requirements
If nodes in the cluster use an HTTP proxy server, the nodes must also meet the following requirements:
  • The no_proxy environment variable must be set.

    Nodes in the cluster need to be able to communicate without the use of a proxy. If the https_proxy or http_proxy environment variable is set for nodes in the cluster, you must also set the no_proxy environment variable for the cluster admin user and the root user on each node. Set the no_proxy environment variable to the IP range of the nodes or to the sub-domain that contains the nodes.

    In addition, you must follow this guideline from the Python documentation: "The no_proxy environment variable can be used to specify hosts which shouldn't be reached via proxy; if set, it should be a comma-separated list of hostname suffixes, optionally with :port appended, for example cern.ch,ncsa.uiuc.edu,some.host:8080."

    For cloud-based clusters (Amazon EC2, Google Compute Engine (GCE), and Microsoft Azure), you must include this entry in the no_proxy configuration:
    169.254.169.254
  • The global proxy for package repositories must be set.

    The Installer creates repository files. However, the proxy setting is not configured for each repository. Therefore, configure global proxy settings on each node in the cluster.
    • On CentOS/RedHat, set global proxy settings in /etc/yum.conf.
    • On Ubuntu, set global proxy settings in /etc/apt/apt.conf.
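As an illustration of the no_proxy requirement, the sketch below checks whether a given entry is present in a comma-separated no_proxy value. The function name no_proxy_has and the domain .cluster.example.com are hypothetical.

```shell
# no_proxy_has ENTRY LIST: succeed if ENTRY appears as an exact item
# in the comma-separated LIST.
no_proxy_has() {
    case ",$2," in
        *,"$1",*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example value for a cloud-based cluster (the sub-domain is a placeholder).
example=".cluster.example.com,169.254.169.254"
if no_proxy_has 169.254.169.254 "$example"; then
    echo "metadata address is excluded from the proxy"
else
    echo "add 169.254.169.254 to no_proxy"
fi
```

On a node, the same check can be run against the live setting with no_proxy_has 169.254.169.254 "$no_proxy" for both the cluster admin user and the root user.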
Enabling Package Repositories for SLES 15
Before using the Installer for a new data-fabric installation on SLES 15 SP2, run the following command on all nodes to enable the Python 2 package repository. You must also run the command on the Installer node if the Installer node is not part of the cluster and is running SLES 15 SP2 (or a later supported service pack):
SUSEConnect -p sle-module-python2/15.<version>/x86_64
If you are developing applications on the cluster, run the following command on all nodes:
SUSEConnect -p sle-module-development-tools/15.<version>/x86_64
To view the available SLES modules and learn how to enable or disable them, use the SUSEConnect -l command.

Security Requirements

Before installing or upgrading software using the Installer, make sure that you have reviewed the list of known vulnerabilities in Security Vulnerabilities. If a vulnerability applies to your release, contact your support representative for a fix, and apply the fix immediately, if applicable.

Cloud Requirements

When you run the Installer on nodes in the cloud, you must:

  • Verify that port 9443 is open.

    The Installer requires that this port is available.

  • Ensure that the Installer and service UI URLs refer to an external URL and not an internal URL.

    For example, when you open the Installer URL, replace any internal hostname or IP address with its associated external address. For Amazon EC2 and Google Compute Engine (GCE) clusters, the Installer automatically translates internal addresses to external addresses.

  • On the Configure Nodes page of the Installer web interface, make sure that you do the following:
    • Define each node using a fully-qualified domain name (FQDN) and internal, resolvable hostnames, as described in Connectivity.
    • For the remote authentication, use the same user ID and private key that you use to ssh into your cloud instances. This user must be root or a user with sudo permissions.