Release Notes (1.5.5)

This document provides a comprehensive overview of the latest updates and security fixes in HPE Ezmeral Unified Analytics Software (version 1.5.5).

HPE Ezmeral Unified Analytics Software provides software foundations for enterprises to develop and deploy end-to-end data and advanced analytics solutions from data engineering to data science and machine learning across hybrid cloud infrastructures – delivered as a software-as-a-service model.

Version 1.5.5 Updates

Version 1.5.5 of HPE Ezmeral Unified Analytics Software includes the following updates:
  • Supports new installations and upgrades from versions 1.5.4 and 1.5.3.
  • Includes an updated upgrade bundle image. See Upgrade Bundle Images.
NOTE
HPE Ezmeral Unified Analytics Software version 1.5.5 does not support RHEL 8.10. For supported OS versions, see Operating System.

Installation

Before you install or upgrade, HPE recommends that you back up your data. If you encounter any issues during or after the installation or upgrade process, contact HPE Support. We appreciate your feedback and strive to continually enhance your product experience.

Upgrade Notes

Back-up custom configurations

Settings, including certificates and custom configurations, do not persist to the new version of HPE Ezmeral Unified Analytics Software. You must back up all custom application configurations and certificates prior to performing an upgrade. After you upgrade HPE Ezmeral Unified Analytics Software, you must reconfigure applications that had custom configurations and apply certificates.

Manually download the workload cluster upgrade bundle

You cannot automatically download the workload cluster upgrade bundle. You must use the manual procedure described in Download the Workload Cluster Upgrade Bundle.

Configure the Air Gap registry URL
If the previous version of HPE Ezmeral Unified Analytics Software was installed without the Air Gap environment enabled, or the Air Gap registry URL is not configured for your existing setup (the URL is an empty string), you must manually set the Air Gap registry URL to marketplace.us1.greenlake-hpe.com/ezua/. To do so, run the following commands on the Kubernetes master node:
kubectl patch cm ezua-cluster-config -n ezua-system --type merge -p '{"data":{"airgap.registryUrl":"marketplace.us1.greenlake-hpe.com/ezua/"}}'

kubectl patch ezac in-cluster -n ezaddon-system --type merge -p '{"spec":{"registryUrl":"marketplace.us1.greenlake-hpe.com/ezua/"}}'
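A malformed merge-patch payload is rejected by the API server, so it can help to validate the JSON locally before running the kubectl commands above. A minimal sketch, assuming python3 is available on the node:

```shell
# Validate the merge-patch payload locally before passing it to
# `kubectl patch` (sketch only; assumes python3 is on the PATH).
PATCH='{"data":{"airgap.registryUrl":"marketplace.us1.greenlake-hpe.com/ezua/"}}'
echo "$PATCH" | python3 -m json.tool > /dev/null && echo "patch JSON OK"
```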

Resolved Issues

The following issues have been resolved in this release:

Ability to configure Git-Sync container resources for Airflow DAGs in the UI
Git-Sync runs only once during the initial DAG run to reduce resource consumption. You can configure Git-Sync container resources for Airflow DAGs in values.yaml through the HPE Ezmeral Unified Analytics Software UI.
To modify git-sync resources in values.yaml, configure Airflow as described in Configuring Included Frameworks. In values.yaml, modify the git-sync resources, as shown:
# ~~~
airflow-cluster-ua:
  # ~~~
  airflowCluster:
    # ~~~
    dags:
      git:
        # ~~~
        ## `resources` defines the resource limits and requests for the git-sync container
        ##
        resources:
          limits:
            cpu: 250m
            memory: 300Mi
          requests:
            cpu: 25m
            memory: 75Mi
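When you adjust these values, the requests must not exceed the corresponding limits, or Kubernetes rejects the pod spec. A minimal local sanity check mirroring the values shown above (cpu in millicores, memory in Mi):

```shell
# Sanity check (sketch): requests must not exceed limits, or Kubernetes
# rejects the pod spec. Values mirror the YAML above.
REQ_CPU=25;  LIM_CPU=250
REQ_MEM=75;  LIM_MEM=300
if [ "$REQ_CPU" -le "$LIM_CPU" ] && [ "$REQ_MEM" -le "$LIM_MEM" ]; then
  echo "requests within limits"
fi
```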

Known Issues

This release has the following known issues:

UI visibility issue with %manage_spark in Notebook

When you execute %manage_spark in any of the Spark kernels within a Notebook and then add an endpoint through the wizard, the contents of the Create Session tab are not completely visible unless you reduce the page resolution. Also, the wizard is not horizontally scrollable, which hides the Create Session button. This issue occurs because the version of Jupyter Notebook in HPE Ezmeral Unified Analytics Software version 1.5.5 does not support overflow scrolling.

Resolution

Enlarge the window for full content visibility.

Cannot access packages installed via conda after using %createKernel magic

When you use the %createKernel magic command to create custom Jupyter kernels, packages installed via conda may not be accessible within the kernel. This occurs because kernel registration does not properly link the conda environment's PATH. This issue affects:

  • Python packages installed during kernel creation, such as numpy and pandas
  • CLI tools installed via conda, such as pigz

Resolution

Note that this resolution has the following known limitations:

  • Applies to pip-installable packages only
  • Conda-specific packages or CLI tools may not be available using this method
  • You must repeat these steps for each new notebook using a custom kernel
To resolve the issue, complete the following steps:
  1. In Python, run the %createKernel magic command without specifying any packages, or pass a minimal dummy package:
    %createKernel
    
  2. Log out of the notebook environment and then log back in to ensure that the new kernel is properly registered.
  3. Open a new notebook and select the newly created kernel (for example, myenv) from the kernel selection menu.
  4. Inside the new notebook, install your required packages using pip:
    !pip install numpy pandas scikit-learn
    If packages fail to download, verify that all HTTP proxy settings are correctly configured.
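If the pip install in step 4 fails to download packages, you can inspect the proxy settings from the notebook terminal. A portable sketch (checks the conventional lowercase proxy variables; your environment may also use uppercase variants):

```shell
# Print the common proxy variables; "<unset>" means the variable is not
# exported in this shell (check uppercase variants as well if needed).
for v in http_proxy https_proxy no_proxy; do
  printf '%s=%s\n' "$v" "$(printenv "$v" || echo '<unset>')"
done
```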
Run DAGs in a centralized namespace

By default, DAGs run in the namespaces of the users who trigger them; however, you can configure DAGs to run in a centralized namespace by changing the executor variable in values.yaml to CeleryKubernetes. When you change the executor variable to CeleryKubernetes, tasks run in one Celery worker pod in the airflow-hpe namespace instead of running in separate pods.
To change the executor variable in values.yaml:
  1. Sign in to HPE Ezmeral Unified Analytics Software as Administrator.
  2. In the left navigation bar, click Tools & Frameworks.
  3. Select the Data Engineering tab.
  4. On the Airflow tile, click the three-dots menu and then select Configure. The YAML file editor opens.
  5. In the editor, find the airflowCluster: section and executor: variable in that section.
  6. Change the executor: to CeleryKubernetes, as shown:
    airflowCluster:
      executor: CeleryKubernetes
  7. Save the change.
  8. To verify that the worker pod is running, run the following command:
    sudo kubectl get pod -n airflow-hpe
    If the pod is not running, run the following command to remove the af-cluster-worker resource from the airflow-hpe namespace:
    kubectl delete statefulset.apps/af-cluster-worker -n airflow-hpe
    After you run this command, the operator (controller) pod re-creates the af-cluster-worker statefulset in the airflow-hpe namespace. This process can take up to five minutes.
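Before saving in step 7, a quick string-level check confirms that the lowercase executor key carries the new value. A local sketch (the file path is a throwaway demo location, not the real values.yaml):

```shell
# Local sketch: write the expected snippet and confirm the lowercase
# `executor` key carries the CeleryKubernetes value.
cat > /tmp/airflow-values-demo.yaml <<'EOF'
airflowCluster:
  executor: CeleryKubernetes
EOF
grep -q 'executor: CeleryKubernetes' /tmp/airflow-values-demo.yaml \
  && echo "executor set"
```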

Notebooks stuck in Unknown status after upgrade
Notebooks that were running prior to upgrade do not start post upgrade and have an Unknown status in the UI. This is caused by a certificate issue with the admission-webhook-service that results in the following error:
Error creating: Internal error occurred:
failed calling webhook "admission-webhook-deployment.kubeflow.org": 
failed to call webhook: Post "https://admission-webhook-service.kubeflow.svc:443/apply-poddefault?timeout=10s": 
tls: failed to verify certificate: x509: certificate signed by unknown authority

Workaround

Restart the pods in the cert-manager namespace. After the pods start, restart the admission-webhook-deployment pod in the kubeflow namespace.

To implement the workaround, run the following commands:
kubectl delete po -n cert-manager --all --force --grace-period=0
kubectl delete po -n kubeflow admission-webhook-deployment-<deployment_id>

Upgrading from HPE Ezmeral Unified Analytics Software 1.5.2 to 1.5.4 fails due to Airflow database dump size exceeding Kubernetes secret limits
Upgrading from HPE Ezmeral Unified Analytics Software 1.5.2 to 1.5.4 fails because the upgrade script attempts to dump the entire Postgres database into a secret, exceeding the 1MB size limit of the secret. This blocks the upgrade and renders the environment unusable.
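The failure mode follows directly from a Kubernetes constraint: a Secret payload is capped at 1 MiB, so any database dump larger than that cannot be stored in a Secret. An illustrative sketch with a hypothetical dump size:

```shell
# Illustration only: Kubernetes caps Secret payloads at 1 MiB (1048576
# bytes); a dump larger than that cannot be written into a Secret.
SECRET_LIMIT=1048576          # 1 MiB
DUMP_SIZE=2097152             # hypothetical 2 MiB Postgres dump
if [ "$DUMP_SIZE" -gt "$SECRET_LIMIT" ]; then
  echo "dump exceeds Secret limit"
fi
```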

Workaround

IMPORTANT
You must perform this workaround before upgrading HPE Ezmeral Unified Analytics Software.
To work around this issue, manually update the Docker image in the database pod to the image used in the upgraded version before upgrading. During the upgrade, the upgrade script compares the Docker image of the current database pod to the Docker image of the database pod that will be deployed. Manually updating the Docker image resolves the issue.

Perform the following steps BEFORE upgrading HPE Ezmeral Unified Analytics Software:

  1. Get the current Airflow Postgres Database image:
    1. To get the airflow-base YAML, run the following command:
      kubectl get airflowbase af-base -n airflow-base -o yaml
    2. In the YAML, under spec, get the airgapRegistry field, and under postgres get the image and version fields.
  2. Find out which airgap container registry will be used for the platform after the upgrade. The team that performs the upgrade should be able to provide this value.
  3. Compare the values from steps 1 and 2 and check the following:
    1. Check whether the airgapRegistry prior to upgrade matches the airgap registry that will be used after the upgrade. If the values differ, you must update the airgapRegistry (step 4).
    2. Check whether the image prior to upgrade matches <airgap-registry>gcr.io/mapr-252711/postgres, where <airgap-registry> is the airgap registry that will be used after the upgrade (step 2). If the values differ, you must update the image (step 4).
    3. Check whether the current version (before upgrade) is 14.12. If it is not 14.12, you must update the version (step 4).
  4. Edit the airflow-base YAML by running the following command:
    kubectl edit airflowbase af-base -n airflow-base
  5. Update the airgapRegistry, image, and version fields with the expected values and then save the changes. Wait for database readiness.
  6. Upgrade HPE Ezmeral Unified Analytics Software.

    Example

    If an airflow-base YAML has the following fields and data:
    • airgapRegistry: ""
    • image: gcr.io/mapr-252711/postgres
    • version: 14.12
    And you consult with the upgrade team to find out what the fields will be post upgrade, for example:
    • airgapRegistry: marketplace.us1.greenlake-hpe.com/ezua/
    • image: marketplace.us1.greenlake-hpe.com/ezua/gcr.io/mapr-252711/postgres
    • version: 14.12 (the same)
    IMPORTANT
    You must update the fields in the YAML to match what the upgrade team has provided before performing the HPE Ezmeral Unified Analytics Software upgrade.
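The comparison in step 3 can be sketched as a simple string check (the values below mirror the example and are placeholders for your environment):

```shell
# Sketch of the step 3 comparison; values mirror the example above and
# are placeholders for your environment.
CURRENT_REGISTRY=""                                        # from step 1
TARGET_REGISTRY="marketplace.us1.greenlake-hpe.com/ezua/"  # from step 2
if [ "$CURRENT_REGISTRY" != "$TARGET_REGISTRY" ]; then
  echo "airgapRegistry must be updated before upgrade"
fi
```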

Katib jobs fail when launched through Kale
If you launch a Katib job through Kale from a notebook, the Katib job fails because resource limits are not provided. Pods get stuck in a pending state and the system returns a warning message stating that resource limits must be defined.

Workaround

To work around this issue:
IMPORTANT
In air-gapped environments or environments that cannot access docker.io, push the akravacyber/katib-kfp-trial:v0.7.0 image into the air gap registry that is configured for the cluster. Update line 8 in the kale-katib.patch file to add the air gap registry address as the prefix to akravacyber/katib-kfp-trial:v0.7.0.
  1. Download the following file and put it in the /mnt/user directory:
    kale-katib.patch
  2. Open a notebook terminal and run the following command:
    cd /opt/conda/lib/python3.11/site-packages
  3. From the notebook terminal, run the following command:
    git apply /mnt/user/kale-katib.patch
  4. Close all the open notebook tabs and shut down all the kernels running in notebooks.
  5. In the top menu bar, select File > Log Out.
  6. Log in again.
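Before step 3, `git apply --check` validates a patch without modifying any files; if the check fails, applying the patch would fail too. A self-contained demonstration in a throwaway repository (with the real patch, you would run the check against /mnt/user/kale-katib.patch from the site-packages directory):

```shell
# Demonstrate `git apply --check` in a throwaway repo; with the real
# patch you would run it from /opt/conda/lib/python3.11/site-packages.
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
printf 'old line\n' > demo.txt
git add demo.txt
git -c user.email=demo@example.com -c user.name=demo commit -qm init
printf 'new line\n' > demo.txt
git diff > demo.patch            # capture the change as a patch
git checkout -q -- demo.txt      # restore the original file
git apply --check demo.patch && echo "patch applies cleanly"
```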

Vulnerabilities

Sweet32 Vulnerability on etcd Endpoint (CVE-2016-2183)
During VAPT scans of HPE Ezmeral Unified Analytics Software Kubernetes clusters, security tools detect a Sweet32 vulnerability on the etcd client endpoint (TCP port 2379) of the control plane nodes. Etcd supports the medium-strength cipher TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA, which triggers a Sweet32 birthday attack warning and lowers the overall TLS cipher strength rating. This occurs when scanners enumerate TLS ciphers against the etcd endpoint.

In HPE Ezmeral Unified Analytics Software setups, etcd runs as a system-managed service (not a static pod) with default TLS settings for broad compatibility. These defaults allow weak 3DES ciphers (64-bit block size) alongside strong ciphers such as AES-GCM and ChaCha20-Poly1305 (TLS 1.2/1.3). Security scanners flag Sweet32 (CVE-2016-2183) because any weak cipher triggers the alert. The finding is due to cipher negotiation defaults; etcd remains secure and is not externally exposed.

Remedy

To mitigate the Sweet32 vulnerability, restrict etcd to use only strong TLS cipher suites and explicitly disable 3DES by updating the etcd service environment configuration.
  1. SSH to the Kubernetes control plane node and switch to the root user.
  2. Edit the etcd environment configuration file:
    vi /opt/ezkube/bootstrap/systemd/10-etcd.env  
  3. Add or update the ETCD_EXTRA_ARGS variable to enforce TLS 1.2+ and allow only strong cipher suites:
    ETCD_EXTRA_ARGS="--tls-min-version=TLS1.2 \
    --cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,\
    TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,\
    TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256"
  4. Reload the systemd configuration and restart the etcd service:
    systemctl daemon-reload  
    systemctl restart etcd  
  5. Verify that the etcd service is running successfully:
    systemctl status etcd  
  6. Re-run the TLS cipher enumeration scan to confirm that the Sweet32 vulnerability is no longer reported:
    nmap --script ssl-enum-ciphers -Pn -p 2379 <control-plane-node-ip>  
  7. Repeat the same steps on the other control plane nodes in the cluster. If there are three control plane nodes, you cannot perform the steps on all nodes at once or the quorum will be lost. If the control plane node is recreated for any reason, you must perform these steps again.
    NOTE
    • In clusters with multiple control plane nodes, etcd must be restarted one node at a time to avoid quorum loss and service disruption.
    • TCP ports 2379 (client) and 2380 (peer) are mandatory for etcd operation and must remain open for Kubernetes to function correctly.
    • If a control plane node is rebuilt or recreated, the above configuration changes must be re-applied.
    • This issue does not indicate external exposure of etcd; it is a cipher configuration hardening requirement identified by VAPT tools.
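Before restarting etcd (step 4), a string-level sanity check confirms that no 3DES suite remains in the configured cipher list. A sketch using the value from step 3:

```shell
# Sketch: the configured cipher list from step 3 must not mention 3DES.
CIPHERS="TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256"
case "$CIPHERS" in
  *3DES*) echo "weak 3DES cipher still present" ;;
  *)      echo "no 3DES cipher configured" ;;
esac
```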

Additional Resources

Thank you for choosing HPE Ezmeral Unified Analytics Software. Enjoy the new features and improvements introduced in this release.