Infrastructure
Identifies certain software and settings that contribute to your node's infrastructure.
Network Time
To keep all cluster nodes time-synchronized, Data Fabric requires software such as a Network Time Protocol (NTP) server (or chrony for RHEL 7) to be configured and running on every node. If server clocks in the cluster drift out of sync, serious problems will occur with certain Data Fabric services. Data Fabric raises a Time Skew alarm on any out-of-sync nodes. For more information about obtaining and installing NTP, see http://www.ntp.org/.
Advanced: Install an internal time server with which the cluster nodes can sync directly, so that the time on the cluster nodes stays in sync even if internet connectivity is lost. For more details, refer to the preceding documentation link for NTP.
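As a rough sketch, the following function reports which time-sync client is available on a node and prints its status. It assumes that either chronyc (chrony) or ntpq (ntpd) may be installed. The function name check_time_sync is illustrative and is not part of Data Fabric.

```shell
# Report which time-sync client is on PATH and print its current status.
# check_time_sync is a hypothetical helper name used only for this example.
check_time_sync() {
  if command -v chronyc >/dev/null 2>&1; then
    echo "chrony"
    chronyc tracking || true   # offset from the reference clock, if the daemon answers
  elif command -v ntpq >/dev/null 2>&1; then
    echo "ntp"
    ntpq -p || true            # peer list with per-peer offsets, if the daemon answers
  else
    echo "none"
  fi
}

check_time_sync
```

A first line of "none" suggests that no time-sync client is installed on the node.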
System Locale
Ensure that your system locale is set to en_US. For more information about setting the system locale, see this website.
Syslog
Enable syslog on each node to preserve logs for killed processes or failed jobs. Because modern replacements such as syslog-ng and rsyslog may be installed instead of the classic syslogd, it can be harder to tell whether a syslog daemon is present. One of the following commands should confirm it:
syslogd -v
service syslog status
rsyslogd -v
service rsyslog status
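Instead of trying each command by hand, a small loop can look for the common daemon binaries. This is a minimal sketch: the helper name find_syslog_daemon is hypothetical, and the check only confirms that a binary is on PATH, not that the service is running. Run it as root, because /usr/sbin is often missing from a regular user's PATH.

```shell
# Print the first syslog daemon binary found on PATH, or "none".
# find_syslog_daemon is a hypothetical helper name used only for this example.
find_syslog_daemon() {
  for d in rsyslogd syslog-ng syslogd; do
    if command -v "$d" >/dev/null 2>&1; then
      echo "$d"
      return 0
    fi
  done
  echo "none"
  return 1
}

find_syslog_daemon || echo "No syslog daemon found on PATH" >&2
```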
Default umask
To prevent significant installation problems, ensure that the default umask for the root user
        is set to 0022 on all Data Fabric nodes in the cluster. You
        can change the umask setting in the /etc/profile file, or in the
          .cshrc or .login file. The root user
        must have a 0022 umask because the Data Fabric
        admin user requires access to all files and directories under the
          /opt/mapr directory, even those initially created by root services.
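To verify the setting, a check along the following lines can be run in a root shell on each node. It is a sketch only: the helper name umask_matches is hypothetical, and the value is normalized because some shells print 022 rather than 0022.

```shell
# Return success when the given umask equals 0022.
# umask_matches is a hypothetical helper name used only for this example.
umask_matches() {
  # Pad to four octal digits so "022" and "0022" compare equal.
  [ "$(printf '%04o' "0$1")" = "0022" ]
}

current="$(umask)"
if umask_matches "$current"; then
  echo "umask OK ($current)"
else
  echo "umask is $current, expected 0022" >&2
fi
```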
ulimit
        ulimit is a command that sets limits on a user's access to system-wide
        resources. Specifically, it provides control over the resources available to the shell and
        to processes started by it.
The mapr-warden script uses the ulimit command to set the
        maximum number of file descriptors (nofile) and processes
          (nproc) to 64000. Higher values are unlikely to result in an appreciable
        performance gain. Lower values, such as the default value of 1024, are likely to result in
        task failures.
Depending on your environment, you might want to set limits manually for service accounts
        used to run I/O-heavy operations rather than relying on Warden to set them automatically
        using ulimit. 
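To see what limits a given account actually receives, the values can be compared against the 64000 figure mentioned above. The sketch below assumes bash (for ulimit -u) and should be run as the service account in question. The helper name limit_at_least is hypothetical.

```shell
# Return success when a reported limit is "unlimited" or at least the required minimum.
# limit_at_least is a hypothetical helper name used only for this example.
limit_at_least() {
  # $1 = value reported by ulimit, $2 = required minimum
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ] 2>/dev/null
}

# -n is the open-file limit (nofile), -u is the process limit (nproc).
for flag in n u; do
  value="$(ulimit -"$flag" 2>/dev/null)"
  if limit_at_least "$value" 64000; then
    echo "ulimit -$flag OK ($value)"
  else
    echo "ulimit -$flag is '$value', below 64000"
  fi
done
```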
PAM
Nodes that run the Control System can take advantage of Pluggable Authentication Modules (PAM) if PAM is present on the node. Configuration files in the /etc/pam.d/ directory are typically provided for each standard Linux command. Data Fabric can use, but does not require, its own profile.
Security - SELinux
Using SELinux is supported if the cluster admin follows some specific best practices. See SELinux Support.
TCP Retries
Set net.ipv4.tcp_retries2 to 5 so that Data Fabric can detect unreachable nodes with less latency. Also set net.ipv4.tcp_syn_retries to 4 on each node.
- Edit the file /etc/sysctl.conf and add the following line:
net.ipv4.tcp_retries2=5
- Save the file and run:
sysctl -p
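After the settings are applied, the live kernel values can be read back from /proc/sys, where each dot in a sysctl key corresponds to a path separator. The following is a minimal verification sketch. The helper name sysctl_path is hypothetical.

```shell
# Convert a sysctl key to its /proc/sys path, e.g. net.ipv4.tcp_retries2.
# sysctl_path is a hypothetical helper name used only for this example.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Compare the running values with the ones named in this section.
for pair in net.ipv4.tcp_retries2=5 net.ipv4.tcp_syn_retries=4; do
  key="${pair%%=*}"
  want="${pair#*=}"
  path="$(sysctl_path "$key")"
  if [ -r "$path" ]; then
    have="$(cat "$path")"
    if [ "$have" = "$want" ]; then
      echo "$key OK ($have)"
    else
      echo "$key is $have, expected $want"
    fi
  else
    echo "$key: $path is not readable"
  fi
done
```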
NFS
Disable the stock Linux NFS server on nodes that will run the Data Fabric NFS server.
iptables/firewalld
Enabling iptables on a node can close ports that are used by Data Fabric. If you enable iptables, make sure
        that required ports
        remain open. Check your current iptables rules by using the following
        command:
$ service iptables status
In CentOS 7, firewalld replaces iptables. To check the current status of firewalld, use this command:
systemctl status firewalld
To disable firewalld, use this command:
systemctl disable firewalld
Transparent Huge Pages (THP)
For data-intensive workloads, Data Fabric recommends disabling the Transparent Huge Pages (THP) feature in the Linux kernel.
RHEL Example
$ echo never > /sys/kernel/mm/transparent_hugepage/enabled
CentOS 7 Example
$ echo never > /sys/kernel/mm/transparent_hugepage/enabled
Ubuntu Example
$ echo never > /sys/kernel/mm/transparent_hugepage/defrag
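The kernel marks the active THP mode with square brackets, for example "always madvise [never]". The sketch below reads both files and prints the active mode of each, which is one way to confirm that a change took effect. The helper name thp_active_mode is hypothetical. Values written with echo apply to the running kernel only, so the check is worth repeating after a reboot.

```shell
# Extract the bracketed (active) mode from a THP status line.
# thp_active_mode is a hypothetical helper name used only for this example.
thp_active_mode() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

for f in enabled defrag; do
  p="/sys/kernel/mm/transparent_hugepage/$f"
  if [ -r "$p" ]; then
    echo "$f: $(thp_active_mode "$(cat "$p")")"
  else
    echo "$f: $p is not available on this system"
  fi
done
```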
Automated Configuration
Some users find tools such as
          Ansible,
          Puppet, or Chef useful to configure each node in a
        cluster. Make sure, however, that any configuration tool does not reset changes made when
          Data Fabric packages are later installed. Specifically, do
        not let automated configuration tools overwrite changes to the following files:
- /etc/sudoers
- /etc/sysctl.conf
- /etc/sysctl.d/60-mapr_elasticsearch.conf
- /etc/sysctl.d/60-mapr_fluentd.conf
- /etc/security/limits.conf
- /etc/udev/rules.d/99-mapr-disk.rules