Preparing HPE Ezmeral Data Fabric to be Primary Storage for HPE Ezmeral Unified Analytics Software

This topic provides the steps that an HPE Ezmeral Data Fabric administrator (mapr) must complete before an HPE Ezmeral Unified Analytics Software administrator installs Unified Analytics and configures HPE Ezmeral Data Fabric as primary storage for Unified Analytics.

During installation, the HPE Ezmeral Unified Analytics Software administrator must provide the CSI driver (KDF-CSI) with the information needed to successfully connect to an external HPE Ezmeral Data Fabric cluster. The CSI driver requires the following information:

  • List of CLDB hosts
  • List of API servers
  • Tenant ticket
  • Username
  • Password
  • CA certificate
  • Mount prefix

The HPE Ezmeral Data Fabric cluster administrator (mapr) can obtain this information while preparing the HPE Ezmeral Data Fabric cluster to be accessed by Unified Analytics and its users.

HPE Ezmeral Data Fabric preparation includes:
  • Specifying user information for the Unified Analytics deployment
  • Specifying the mount prefix for the Unified Analytics deployment
  • Creating a new user in the HPE Ezmeral Data Fabric cluster
  • Giving the new user permissions to access the HPE Ezmeral Data Fabric cluster
  • Creating a dedicated volume for the new user
  • Creating a tenant ticket for the new user
  • Obtaining the root and signing CA certificates for the HPE Ezmeral Data Fabric cluster
  • Obtaining a list of CLDB hosts in the HPE Ezmeral Data Fabric cluster
  • Obtaining a list of API servers in the HPE Ezmeral Data Fabric cluster

The following section provides the preparation steps.

Preparing the HPE Ezmeral Data Fabric Cluster

As you complete the steps required to prepare the HPE Ezmeral Data Fabric cluster, note the following information, which is required during the installation of Unified Analytics:
  • Username and password for the HPE Ezmeral Data Fabric user
  • Mount prefix
  • Contents of the tenant ticket
  • Contents of the HPE Ezmeral Data Fabric CA certificate
  • List of CLDB hosts
  • List of API/REST servers

To prepare the HPE Ezmeral Data Fabric cluster, complete the following steps:
  1. Use SSH to log in to one of the nodes in the external HPE Ezmeral Data Fabric cluster:
    ssh <node-ip-address>
  2. Specify the user information for your Unified Analytics deployment:
    export USER=ezua
    export GROUP=ezua
    export USERID=7000
    export GROUPID=7000
    export PASSWORD=$(openssl rand -base64 12)
    
    TIP
    • If you have multiple Unified Analytics deployments, HPE recommends having a dedicated user for each deployment.
    • If the password is user-provided instead of auto-generated, properly escape special characters or sequences, such as $!, to prevent the shell from replacing them and returning unexpected results.
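    As a minimal sketch of the escaping tip, assuming a POSIX-compatible shell such as Bash: single quotes keep sequences such as $! literal, whereas double quotes allow the shell to expand them. The variable name EXAMPLE_PASSWORD and its value are made up for illustration and are not real credentials.

    ```shell
    # Hypothetical example password containing "$!" (not a real credential).
    # Single quotes prevent the shell from expanding $! or any other sequence.
    EXAMPLE_PASSWORD='Ezua$!2024'

    # Print the value to confirm that it was stored unchanged.
    printf '%s\n' "${EXAMPLE_PASSWORD}"
    ```

    If the printed value differs from what you typed, the shell has replaced part of the password and the quoting needs to be corrected.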
  3. Specify the mount prefix for your Unified Analytics deployment:
    export MOUNT_PREFIX=/ezua
    TIP
    • If you have multiple Unified Analytics deployments, HPE recommends having a dedicated mount prefix for each deployment.
    • Do not use /mapr as the mount prefix, as /mapr denotes the global namespace and some tools, including the hadoop client, are configured to reference this directory for their operations.
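    The /mapr restriction can be checked before you continue. The following is only a sketch: check_mount_prefix is a hypothetical helper written for this example and is not part of any HPE tool.

    ```shell
    # check_mount_prefix is a hypothetical helper, not part of any HPE tool.
    # It rejects /mapr (with or without one trailing slash) and accepts
    # any other value.
    check_mount_prefix() {
      case "${1%/}" in
        /mapr)
          echo "Do not use /mapr as the mount prefix" >&2
          return 1
          ;;
        *)
          return 0
          ;;
      esac
    }
    ```

    For example, running check_mount_prefix "${MOUNT_PREFIX?}" returns a zero exit status for /ezua and a nonzero exit status for /mapr.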
  4. Create a new HPE Ezmeral Data Fabric user by running the following commands on all nodes in the HPE Ezmeral Data Fabric cluster:
    sudo groupadd -g ${GROUPID?} ${GROUP?}
    sudo adduser -g ${GROUP?} -M -u ${USERID?} ${USER?}
    echo "${USER?}:${PASSWORD?}" | sudo chpasswd
    
    TIP
    • Use the same password on all nodes. For example, run the openssl command one time and reuse the generated value on every node.
    • For additional information, see User Accounts.
    • Alternatively, add this user to your LDAP directory instead of creating the user on each node.
  5. Display the password to verify it, and record it for the Unified Analytics installation:
    echo "${PASSWORD?}"
  6. Verify that you can log in as the new user:
    echo "${PASSWORD?}" | maprlogin password -user ${USER?}
  7. Log in as the mapr administrative user:
    maprlogin password -user mapr
  8. Assign the login and create volume (cv) cluster ACL permissions to the HPE Ezmeral Data Fabric user:
    maprcli acl edit -type cluster -user ${USER?}:login,cv
  9. Create a volume that this user can access under a dedicated prefix:
    maprcli volume create -name ezua-base-volume-${USER?} -path ${MOUNT_PREFIX?} \
    -createparent true -type rw -json -rootdiruser ${USER?} -rootdirgroup ${GROUP?}
  10. Create a tenant ticket for this user:
    maprlogin generateticket -type tenant -user ${USER?} -out /tmp/maprtenantticket-${USER?}
    TIP
    Unified Analytics and the CSI driver do not currently support ticket rotation. The system therefore checks that the ticket expiration date is at least 100 years after the current date. By default, tenant tickets have a LIFETIME duration (10000 years), so the ticket does not expire. For additional information, including how to set the duration, see maprlogin.
  11. Inspect the tenant ticket:
    maprlogin print -ticketfile /tmp/maprtenantticket-${USER?}
  12. Display the contents of the tenant ticket:
    cat /tmp/maprtenantticket-${USER?}
  13. Obtain the root and signing CA certificates of the HPE Ezmeral Data Fabric cluster:
    sudo cat /opt/mapr/conf/ca/chain-ca.pem
  14. Obtain the endpoints of the HPE Ezmeral Data Fabric cluster:
    maprcli node list -columns hn,ip -filter svc==cldb
    TIP
    • Filtering nodes using svc==cldb returns the nodes currently running the CLDB service. If the CLDB service is configured on a node but is not running, that node does not appear in the results. Alternatively, you can filter nodes using csvc==cldb, which returns a list of nodes configured with the CLDB service.
    • If MAPR_EXTERNAL is configured, the maprcli node list command returns an extIp column, which lists the external IP addresses of the nodes in the HPE Ezmeral Data Fabric cluster. Unified Analytics uses the external IP addresses to access the HPE Ezmeral Data Fabric cluster. When you provide Unified Analytics with the endpoints, use the external IP addresses; do not use the local hostnames.
      maprcli node list -columns hn,ip -filter svc==cldb
      hostname                     ip           extIp
      ip-10-0-0-100.ec2.internal   10.0.0.100   10.10.100.110:5660,10.10.100.120:5692
      
      In this example, you would provide the extIp (10.10.100.110), not the hostname (ip-10-0-0-100.ec2.internal). For additional information, see MAPR_EXTERNAL Environment Variable.
    1. Obtain a list of the CLDB hosts, and then build a comma-separated list with port :7222 appended to each host:
      maprcli node list -columns hn,ip -filter svc==cldb
    2. Obtain a list of API servers, and then build a comma-separated list with port :8443 appended to each host:
      maprcli node list -columns hn,ip -filter svc==apiserver
    IMPORTANT
    Verify that the Unified Analytics nodes can access the HPE Ezmeral Data Fabric nodes. For example, verify that the firewall is not blocking the connections. See Port Information.
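
Building the CLDB and API server lists means appending a port to each host and joining the results with commas. The following sketch shows one way to do that in a POSIX-compatible shell. The helper join_endpoints is hypothetical and is not part of maprcli; the addresses are examples only and must be replaced with the hosts or external IP addresses returned by your own cluster.

```shell
# join_endpoints is a hypothetical helper, not part of maprcli.
# Usage: join_endpoints <port> <host-or-ip>...
# Prints the host:port pairs joined by commas on a single line.
# Note: the function sets the shell variables port, host, and result.
join_endpoints() {
  port="$1"
  shift
  result=""
  for host in "$@"; do
    # Add a comma separator only when result already holds an entry.
    result="${result:+${result},}${host}:${port}"
  done
  printf '%s\n' "${result}"
}

# Example addresses only; substitute the values from your cluster.
join_endpoints 7222 10.10.100.110 10.10.100.120
```

Call the helper with port 7222 for the CLDB hosts and with port 8443 for the API servers, then record both lists for the Unified Analytics installation.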