Before Upgrading the Platform

This topic describes the tasks that you must complete before you upgrade the HPE Ezmeral Runtime Enterprise software. Hewlett Packard Enterprise highly recommends performing a configuration check and an upgrade pre-check, and resolving any reported issues, before upgrading HPE Ezmeral Runtime Enterprise.

Verify Upgrade Path

Verify that the version of HPE Ezmeral Runtime Enterprise that you are upgrading from is a valid starting point for an upgrade to HPE Ezmeral Runtime Enterprise 5.7.x. For information about upgrade paths, see Upgrading to HPE Ezmeral Runtime Enterprise 5.7.x.

Plan for Impact on Workloads

Upgrading to this version of HPE Ezmeral Runtime Enterprise involves multiple tasks, some of which require node reboots or pod restarts. See Upgrading to HPE Ezmeral Runtime Enterprise 5.7.x.

Upgrade Kubeflow

If your environment includes Kubeflow and you are upgrading HPE Ezmeral Runtime Enterprise, contact Hewlett Packard Enterprise support for assistance before you begin the upgrade. Several manual steps must be performed to replace the existing version of Kubeflow with the new version of Kubeflow.

Upgrade OS Versions

If your HPE Ezmeral Runtime Enterprise installation is based on an OS version that is not supported by HPE Ezmeral Runtime Enterprise 5.7.x, you must upgrade the OS to at least the minimum version that HPE Ezmeral Runtime Enterprise 5.7.x supports.

For a list of supported operating system versions, see OS Support.

To upgrade the operating system, see System Maintenance.

(Optional) Update or Configure Air Gap Settings

If you are using Kubernetes in an air-gapped environment, or you want to change your current environment to air gap your Kubernetes objects, configure the air gap settings before you upgrade Kubernetes. Changes to air gap settings are not propagated to a Kubernetes host until that host is rebooted or its Kubernetes version is upgraded.
IMPORTANT
Changing an existing HPE Ezmeral Runtime Enterprise configuration from a non-airgapped environment to an air-gapped environment forces a reinstall of Kubernetes clusters.

If you are changing an existing HPE Ezmeral Runtime Enterprise configuration from a non-airgapped environment to an air-gapped environment, contact Hewlett Packard Enterprise support for assistance before you begin the transition. Several manual steps must be performed to transition to an air-gapped environment.

For more information, see the air gap topics in the HPE Ezmeral Runtime Enterprise documentation.

Upgrade Kubernetes

If your current environment is using Kubernetes, you must upgrade Kubernetes to at least the minimum version supported by this version of HPE Ezmeral Runtime Enterprise. Ensure that the version that you upgrade to is also supported on your current version of HPE Ezmeral Runtime Enterprise.

Kubernetes requires upgrading one minor version at a time, so you might have to perform this upgrade multiple times until the clusters are running a supported version of Kubernetes.
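As an illustration of the one-version-at-a-time requirement, the following shell sketch lists the minor versions that an upgrade passes through between a current and a target Kubernetes version. The function name k8s_upgrade_path and the example version numbers are hypothetical and are not part of HPE Ezmeral Runtime Enterprise. The sketch works only with major.minor numbers, so you still need to choose a supported patch release for each hop from Support Matrixes.

```shell
#!/bin/sh
# Hypothetical helper: print each minor version to pass through when
# upgrading Kubernetes one minor release at a time.
# Usage: k8s_upgrade_path <current major.minor[.patch]> <target major.minor[.patch]>
k8s_upgrade_path() {
    cur_major=${1%%.*}
    cur_minor=${1#*.}; cur_minor=${cur_minor%%.*}
    tgt_major=${2%%.*}
    tgt_minor=${2#*.}; tgt_minor=${tgt_minor%%.*}

    # This sketch only handles upgrades within the same major version.
    if [ "$cur_major" != "$tgt_major" ] || [ "$cur_minor" -gt "$tgt_minor" ]; then
        echo "unsupported path: $1 -> $2" >&2
        return 1
    fi

    # Emit one line per intermediate hop, ending with the target minor version.
    minor=$cur_minor
    while [ "$minor" -lt "$tgt_minor" ]; do
        minor=$((minor + 1))
        echo "${cur_major}.${minor}"
    done
}
```

For example, k8s_upgrade_path 1.18 1.21 prints 1.19, 1.20, and 1.21 on separate lines, which corresponds to three separate Kubernetes upgrades.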

For information about upgrading Kubernetes, see Upgrading Kubernetes.

Optionally, you can upgrade to later versions of Kubernetes after all the tasks involved in upgrading HPE Ezmeral Runtime Enterprise, such as upgrading add-ons, are complete.

For a list of supported Kubernetes versions, see Support Matrixes.

Obtain the HPE Ezmeral Runtime Enterprise Software

Your Hewlett Packard Enterprise representative can provide information about obtaining the correct HPE Ezmeral Runtime Enterprise upgrade package for your environment. You will copy the package bundle to the controller host as part of running the upgrade pre-checks.

Run Configuration and Upgrade Pre-Checks

Hewlett Packard Enterprise highly recommends performing both a configuration check and an upgrade pre-check before upgrading HPE Ezmeral Runtime Enterprise. Ensure that you address any issues reported by these checks before performing the actual upgrade.

  1. Verify that all HPE Ezmeral Runtime Enterprise services are operating in Healthy (green) status using the Services tab of the Platform Administrator Dashboard screen. See Dashboard - Platform Administrator.
  2. Copy the upgrade package to the /srv/bluedata/bundles folder on the Controller host.
  3. Execute the command chmod 770 <bin-file-name>, where <bin-file-name> is the full name of the package that you copied in Step 2.
  4. Verify that the upgrade package appears in the Available Upgrades tab.
  5. Run the configuration check as described in Config Checks Tab.
  6. Review the output of this check, and resolve any errors.
  7. Download the hpe-cp-prechecks-<version>.bin script to each host, where <version> is the version number, such as 5.6. Use the download that matches the operating system of the host:
    RHEL
    SLES
  8. On one of the hosts, execute the command <bin_file> --upgrade, where <bin_file> is the complete name of the .bin file.
  9. Review the script output and resolve any errors.
  10. Repeat Steps 8 and 9 on each of the remaining hosts in HPE Ezmeral Runtime Enterprise.
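Steps 2 and 3 can be sketched as a small shell function. The function name stage_upgrade_bundle is hypothetical, and the optional second argument exists only so that the sketch can be tried against a scratch directory. By default it uses the /srv/bluedata/bundles folder named in Step 2.

```shell
#!/bin/sh
# Hypothetical helper for Steps 2 and 3: copy the upgrade package into the
# bundles folder on the Controller host and set its permissions to 770.
# Usage: stage_upgrade_bundle <path-to-bin-file> [bundles-dir]
stage_upgrade_bundle() {
    bin_file=$1
    bundles_dir=${2:-/srv/bluedata/bundles}

    if [ ! -f "$bin_file" ]; then
        echo "upgrade package not found: $bin_file" >&2
        return 1
    fi
    if [ ! -d "$bundles_dir" ]; then
        echo "bundles folder not found: $bundles_dir" >&2
        return 1
    fi

    # Step 2: copy the package to the bundles folder.
    cp "$bin_file" "$bundles_dir/" || return 1
    # Step 3: chmod 770 <bin-file-name>
    chmod 770 "$bundles_dir/$(basename "$bin_file")"
}
```

After the function returns successfully, continue with Step 4 and confirm that the package appears in the Available Upgrades tab.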

The upgrade pre-check script returns output that is similar to the output shown in the following table. The Error Resolution column of the table lists the most common errors encountered by each check, along with diagnosis and remediation instructions.

Checking integrity ... GOOD.
Extracting contents ... done.
[ASCII-art logo] HPE Software, Inc.
HPE Ezmeral Container Platform Enterprise-Docker debug <version> (minimal)
Executing UPGRADE (PLHA: [false|true] NODE: <A.B.C.D>
Logging to /tmp/bds-<time_stamp>.log
Pre-install checks for HPE Ezmeral Container Platform Enterprise-Docker <version>
Operating system configuration

| Option | Expected Result | Error Resolution |
| --- | --- | --- |
| Checking OS Family: | PASSED | This check fails if the OS type of the installer does not match the OS of the host. Use the correct installer. |
| Checking running kernel version: | PASSED | This check fails if the kernel does not meet the following minimum versions: 2.6.32 or later for RHEL 7; 3.10.0 or later for RHEL 8; 4.12.14 or later for SLES 15. Upgrade the kernel if needed. |
| Checking SELinux setting: | PASSED | This check generates only a warning if SELinux is disabled. Re-enable SELinux if necessary. |
| Checking IPtables/Firewalld configuration: | PASSED | This check fails if any of the following is true: iptables is configured to run at boot time but is currently stopped; iptables is currently running but is not configured to run at boot time; iptables is not running for some other reason. |
| Checking rsyslog setting: | PASSED | This check fails if either of the following is true: /etc/rsyslog.d is not included in rsyslog.conf; the imuxsock module is not loaded in rsyslog.conf. |
| Checking user and group specified: | PASSED | For non-root installs, this check verifies that the user exists and is part of the specified group. |
| Checking dnsmasq user and group specified: | PASSED | This check fails if the user and group specified in --dnsmasquser and --dnsmasq group do not exist. If needed, create the user and group. |
| Checking cgconfig kernel params: | PASSED | This check verifies that cgconfig is not disabled in the kernel boot parameters. This is for cgroup checks. |
| Checking for presence of erlang cookie: | PASSED | This check fails if the Erlang cookie generated by the Controller is not present. |

Total: 9 -- Failed: 0 -- Warning: 0 -- Forced(success): 0

| Option | Expected Result | Error Resolution |
| --- | --- | --- |
| Checking Monitoring status: | PASSED | The monitoring service must be installed and running correctly. |
| Checking HDFS status: | PASSED | HDFS must be installed and running correctly. |
| Checking MapR status: | PASSED | MapR must be installed and running correctly. |
| Checking BDMGMT status: | PASSED | BDMGMT must be installed and running correctly. |
| Checking Data Server status: | PASSED | The data server must be installed and running correctly. |

Total: 5 -- Failed: 0 -- Warning: 0 -- Forced(success): 0
***************************************************************************
Aggregate tests summary:
           Total: 14
           Failed: 0
           Warning: 0
           Forced(success) : 0
Additional information for debugging is written to /tmp/bd_prechecks.<process_id>.log
***************************************************************************
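When the pre-check is run on many hosts, the summary lines can be screened mechanically before you read the full output. The following shell sketch assumes the summary format shown above (for example, "Total: 9 -- Failed: 0 -- Warning: 0 -- Forced(success): 0") and a saved copy of the script output. The function name precheck_clean is hypothetical and is not part of the pre-check script.

```shell
#!/bin/sh
# Hypothetical helper: read saved pre-check output on stdin and succeed only
# if at least one summary line is present and no summary reports failures.
# Warnings are not treated as failures here and still need manual review.
precheck_clean() {
    awk '
        # Per-section line: Total: N -- Failed: N -- Warning: N -- ...
        /^[ \t]*Total: [0-9]+ -- Failed: [0-9]+/ {
            seen = 1
            if ($5 + 0 > 0) failed = 1
        }
        # Aggregate block line: Failed: N
        /^[ \t]*Failed: [0-9]+/ {
            seen = 1
            if ($2 + 0 > 0) failed = 1
        }
        END {
            if (seen && !failed) exit 0
            exit 1
        }
    '
}
```

A possible use is precheck_clean < saved-output.txt && echo "no failures reported", where saved-output.txt is a placeholder for wherever you captured the script output. This is a screening aid only and does not replace reviewing the output and the log file on each host.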

After you are satisfied that the pre-check has completed correctly, do the following:

  1. If you have hosts that have GPU devices, remove those hosts from the Kubernetes cluster and then remove the hosts from HPE Ezmeral Runtime Enterprise. See Remove Hosts That Have GPUs.
  2. If your environment includes HPE Ezmeral Data Fabric on Kubernetes and you want to upgrade it, upgrade HPE Ezmeral Data Fabric on Kubernetes before you upgrade the HPE Ezmeral Runtime Enterprise software. See Upgrading HPE Ezmeral Data Fabric on Kubernetes.

  3. If your environment does not include HPE Ezmeral Data Fabric on Kubernetes, proceed to upgrade the HPE Ezmeral Runtime Enterprise software as described in Upgrading the Platform Software.

Remove Hosts That Have GPUs

HPE Ezmeral Runtime Enterprise adds support for MIG-enabled GPUs. For all GPUs to be recognized by the system after the upgrade, all hosts that have GPUs must be removed from HPE Ezmeral Runtime Enterprise before the upgrade, and then added back to the configuration after the upgrade process is complete. This requirement applies to all GPUs, including GPUs that are not MIG-enabled.

HPE Ezmeral Runtime Enterprise 5.3.5 and later deploy updated versions of the NVIDIA runtime and other required NVIDIA packages, and use a different node label for hosts that have GPU devices. Both of these configuration changes are made to a host at the time that the host is added to HPE Ezmeral Runtime Enterprise. You will add the hosts back to HPE Ezmeral Runtime Enterprise as one of the post-upgrade tasks.

To remove a host from HPE Ezmeral Runtime Enterprise:

  1. Remove the host from the Kubernetes cluster.

    See Expanding or Shrinking a Kubernetes Cluster.

  2. Delete the host from HPE Ezmeral Runtime Enterprise.

    See Decommissioning/Deleting a Kubernetes Host.

Before Starting the Upgrade

Before proceeding with the upgrade to HPE Ezmeral Runtime Enterprise 5.5.0, consider the following:
  • Ensure that all Kubernetes clusters are updated to either 1.20.x or 1.21.x.
  • The HPE Ezmeral Runtime Enterprise upgrade fails on the controller if any of the following is true:
    • The installation has EPIC virtual clusters.
    • The installation has Exthosts configured.
    • The installation has more than three EPIC workers (controller, shadow, arbiter).
      NOTE
      Gateway hosts are not considered EPIC workers, so this limitation does not apply to Gateway hosts.
  • By default, all existing EPIC tenants are deleted during the upgrade, and you might lose data stored in tenant storage for these tenants. To change this behavior, run the following command on the primary controller before starting the upgrade:
    echo "bd_mgmt_config:update(bds_cleanup_tenant, false)." >>/opt/bluedata/common-install/bd_mgmt/tmp.w
    IMPORTANT
    If the bds_cleanup_tenant flag is set to false and the upgrade is attempted, you will no longer be able to access the tenants from the web UI. Contact HPE support if you are in this situation and want to delete the invisible tenants.
  • All pre-5.5.0 Kubernetes clusters are preserved during the HPE Ezmeral Runtime Enterprise upgrade process. You will be able to expand (with a manual step), shrink and delete those pre-5.5.0 clusters. For details, see Post Upgrade Tasks. As the older Kubernetes distributions are no longer used, you will not be able to upgrade them. However, it is possible to migrate the Kubernetes cluster to the HPE-Kubernetes-distribution.
  • After the successful upgrade to HPE Ezmeral Runtime Enterprise 5.5.1, all new Kubernetes hosts will be created by default with Containerd, and all new Kubernetes clusters will use the HPE-Kubernetes-distribution.
    NOTE
    Contact HPE support for more information about migrating a pre-5.5.0 cluster to the HPE-Kubernetes-distribution.
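
The Kubernetes version requirement in the first item above can be checked with a simple pattern match. In this sketch the function name k8s_version_ready is hypothetical, and the version string is assumed to be one that you obtained yourself for each cluster. A full major.minor.patch version is expected.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given Kubernetes version string
# is 1.20.x or 1.21.x, the versions required before the 5.5.0 upgrade.
# A leading "v" (as in "v1.21.5") is tolerated.
k8s_version_ready() {
    case "${1#v}" in
        1.20.*|1.21.*) return 0 ;;
        *)             return 1 ;;
    esac
}
```

For example, k8s_version_ready v1.21.5 succeeds, while k8s_version_ready 1.19.8 returns a nonzero status, which indicates that the cluster must be upgraded first.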

Kubernetes Bundles Upgrade

Starting with HPE Ezmeral Runtime Enterprise 5.5.0, HPE decouples the upgrade of the HPE Ezmeral Runtime Enterprise platform from the upgrade of Kubernetes-related components.

With this feature, you can upgrade Kubernetes-related components without performing a complete HPE Ezmeral Runtime Enterprise platform upgrade. For more details, see Upgrading Kubernetes Bundles.