Release Notes (1.5.0)
This document provides a comprehensive overview of the latest updates and enhancements in HPE Ezmeral Unified Analytics Software (version 1.5.0), including new features, improvements, bug fixes, and known issues.
HPE Ezmeral Unified Analytics Software provides software foundations for enterprises to develop and deploy end-to-end data and advanced analytics solutions from data engineering to data science and machine learning across hybrid cloud infrastructures – delivered as a software-as-a-service model.
New Features
- Support for External Storage Platforms
- HPE Ezmeral Unified Analytics Software now integrates with external storage platforms, eliminating the internal data fabric as primary storage. This integration leverages existing storage solutions for a seamless and scalable data management experience while reducing the amount of resources required to deploy an HPE Ezmeral Unified Analytics Software cluster. It also enhances high availability (HA) efficiency to ensure a fully operational cluster after recovery from a power outage or reboot. HPE Ezmeral Unified Analytics Software currently supports HPE Ezmeral Data Fabric as primary storage with support for additional storage solutions coming in subsequent releases. For details, see Primary Storage, Preparing HPE Ezmeral Data Fabric to be Primary Storage for HPE Ezmeral Unified Analytics Software, and Installing on User-Provided Hosts (Connected and Air-gapped Environments).
- MAPRSASL Authentication for Hive Metastore
- You can now configure a Hive data source in HPE Ezmeral Unified Analytics Software to use MAPRSASL for authentication with the Hive Metastore on HPE Ezmeral Data Fabric. This enhancement ensures secure access and integration, providing an added layer of security for data management. For additional details, see Using MAPRSASL to Authenticate to Hive Metastore on HPE Ezmeral Data Fabric.
Enhancements
- Flexibility in Tools and Frameworks Installation
- You now have the option to deploy a subset of tools and frameworks during
installation, and the flexibility to install the other tools and frameworks later. You
can exclude the following tools and frameworks from the initial installation of
HPE Ezmeral Unified Analytics Software:
- Superset
- EzPresto
- Livy
- MLDE
- Feast
- UI for Adding Volumes
- A new user interface is now available for connecting to external storage platforms, allowing you to use them as data sources for applications and frameworks in your HPE Ezmeral Unified Analytics Software cluster. The UI supports integration with HPE Ezmeral Data Fabric and GreenLake for File Storage, providing a seamless and user-friendly way to access diverse data sources. Note that with this change, the Data Fabrics option previously under Administration in the left navigation panel has been moved to the Data Volumes tab. For additional details, see Connecting to HPE Ezmeral Data Fabric and Connecting to HPE GreenLake for File Storage.
- Revoke User Access on Data Sources
- Administrators can revoke user access to data sources within the Data Engineering section of the UI. This functionality allows for easy management of user privileges, ensuring secure access to both structured and object store data. For additional details, see Revoking Member Access to Data.
- Run CTAS Queries with Hive Discovery Metastore
- The Hive Discovery Metastore now supports running CTAS (CREATE TABLE AS SELECT) queries on CSV and Parquet files stored in the HPE Ezmeral Data Fabric file system or S3 object storage, including HPE Ezmeral Data Fabric S3, MinIO S3, and AWS S3. You can also insert data into the created tables. To use this feature, set up a Hive data source connection with the parameters described in Hive Discovery Metastore Connection Parameters. Use schema discovery for CSV files, delta discovery for Delta files, and include the format in the query for Parquet files.
- Installation Configuration Review
- Before finalizing the installation of HPE Ezmeral Unified Analytics Software on your cluster, you can review and adjust the installation configuration details on the Review screen. This feature ensures accuracy and customization of the setup process.
- Seamless Deletion of Imported Tools and Frameworks
- When an ezappconfig custom resource (CR) is deleted, the associated chart is now automatically deleted from ChartMuseum. This feature simplifies the management of imported tools and frameworks by ensuring that associated configurations and resources are removed seamlessly.
Resolved Issues
This release includes numerous fixes that enhance system security, stability, and performance, including the following resolutions:
- Permission denied error when submitting the Kubeflow pipeline while using the Kubeflow notebook images
- Submitting a Kubeflow pipeline using the KFP SDK V2 Kubeflow notebook images no longer returns a permission denied error.
- The driver pod of the cloned Spark job remains in the container creating state
- When you use the Clone option to create a new Spark application with a similar configuration as an existing Spark application, the driver pod of the cloned Spark job no longer remains in the container creating state.
- Permission denied error when installing packages while using the Kubeflow notebook images
- Installing packages while using the Kubeflow notebook images (with KFP SDK V2) provided by HPE Ezmeral Unified Analytics Software no longer returns a permission denied error.
- Replace Fluent Bit with OTEL for log collection and parsing
- Log collection and parsing now use OpenTelemetry (OTEL) instead of Fluent Bit, which reduces memory consumption.
- Unable to download infrastructure and application services logs
- You can download the infrastructure and application services logs without issue.
- Unable to delete Data Fabric connection due to "Secret not found" error
- You can delete Data Fabric connections by deleting the Data Volume source.
- Uploading a term license
- Uploading a term license no longer results in an ezlicense controller pod CrashLoopBackOff error.
- Activation code change no longer results in a CrashLoopBackOff error
- The activation code change that caused a CrashLoopBackOff error when a capacity license was applied before upgrading is resolved.
Known Issues
The following sections describe known issues with workarounds where applicable:
- EzPresto installation fails due to mysql pod entering CrashLoopBackOff state
- During EzPresto deployment, the HPE Ezmeral Unified Analytics Software installation fails due to slow disk I/O, which leads to the mysql pod in EzPresto entering a CrashLoopBackOff state.
Workaround: To resolve this issue, see EzPresto installation fails due to mysql pod entering CrashLoopBackOff state.
- Installation pre-check fails if the SSH key does not have a passphrase
- If you use an SSH key file, the SSH key must have a passphrase; otherwise, the installation pre-check fails and installation cannot occur. You can set the passphrase to any value, even a dummy value.
- Running CTAS against a Hive data source fails with ORC file error
- Running a CTAS query against a Hive data source that is configured to use MAPRSASL authentication fails with the following error:
Error creating ORC file. Error getting user info for current user, presto.
This issue occurs if the HPE Ezmeral Data Fabric ticket was generated with impersonation enabled uids and impersonation was not enabled when the Hive data source connection was configured in HPE Ezmeral Unified Analytics Software. For example, the ticket was created as shown:
maprlogin generateticket -user pa -type servicewithimpersonationandticket \
    -impersonateduids 112374829 -out pa.out
Workaround: To resolve this issue, delete the Hive data source connection and create a new Hive data source connection, making sure to include the following options in addition to the other required options:
- Select the Hive HDFS Impersonation Enabled option.
- Enter the principal/username that Presto will use when connecting to HPE Ezmeral Data Fabric in the Hive Hdfs Presto Principal field. If this field is not visible, perform a search for it in the Hive Advanced Settings search field.
- CTAS query on Hive Metastore in HPE Ezmeral Data Fabric fails
- For Hive connections that authenticate to HPE Ezmeral Data Fabric via MAPRSASL, running a CTAS query against HPE Ezmeral Data Fabric returns the following error:
Database 'pa' location does not exist:<file_path>
Workaround: To resolve this issue, create and upload a configuration file that points to the HPE Ezmeral Data Fabric cluster, as described in Using MAPRSASL to Authenticate to Hive Metastore on HPE Ezmeral Data Fabric.
- The Hive connection to HPE Ezmeral Data Fabric exists after deleting files
- Deleting the cluster details and tickets from the mapr-clusters.conf and maprtickets files does not terminate the Hive connection to HPE Ezmeral Data Fabric. Users can still create new Hive connections to HPE Ezmeral Data Fabric and run queries against HPE Ezmeral Data Fabric. This issue occurs because HPE Ezmeral Unified Analytics Software caches the HPE Ezmeral Data Fabric files.
Workaround: After you delete the cluster details and tickets from the mapr-clusters.conf and maprtickets files, restart the EzPresto pods. To restart the pods, run:
kubectl rollout restart statefulset -n ezpresto ezpresto-sts-mst
kubectl rollout restart statefulset -n ezpresto ezpresto-sts-wrk
- Optional Fields display by default when connecting an Iceberg data source
- When adding Iceberg as a data source, the UI lists all possible connection fields (mandatory and optional) instead of listing the mandatory connection fields only.
- EzPresto does not release memory when a query completes
- EzPresto retains allocated memory after query completion for subsequent queries because of an open-source issue (https://github.com/prestodb/presto/issues/15637). For example, if a query uses 10GB of memory, EzPresto does not release the memory when the query completes and then uses it for the next query. If the next query requires additional memory, for instance, 12GB, EzPresto accumulates an extra 2GB and does not release it after query completion. For assistance, contact HPE support.
- Configuration changes to long-running pods are not applied in Ray
- Configuration changes or upgrades to long-running pods in Ray, such as adjusting resource capacities or expanding persistent volume (PV) storage, are not applied.
Workaround: To ensure successful configuration changes or upgrades, manually delete the relevant pods after the reconfiguration or upgrade process. For details, see https://github.com/ray-project/kuberay/issues/527.
- Worker nodes do not automatically spawn with JobSubmissionClient in the Ray cluster
- When submitting jobs to the Ray cluster using JobSubmissionClient, worker nodes do not spawn automatically.
Workaround: To ensure proper functionality when submitting Ray jobs using JobSubmissionClient, you must manually specify entry point resources as follows:
- For CPU, set entrypoint_num_cpus to 1
- For GPU, set entrypoint_num_gpus to 1
HPE is actively engaging with the community to address this open-source issue (https://github.com/ray-project/ray/issues/42436).
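The JobSubmissionClient workaround above can be sketched in Python as follows. This is a minimal illustration, not the documented procedure: the helper names entrypoint_resource_kwargs and submit_with_resources are hypothetical, and the dashboard address and script name in the usage comment are placeholders. It assumes a Ray 2.x client whose submit_job method accepts the entrypoint_num_cpus and entrypoint_num_gpus keyword arguments.

```python
from typing import Any, Dict


def entrypoint_resource_kwargs(use_gpu: bool = False) -> Dict[str, Any]:
    """Return the submit_job keyword argument named in the workaround.

    Hypothetical helper: requests 1 GPU for GPU jobs, otherwise 1 CPU.
    """
    if use_gpu:
        return {"entrypoint_num_gpus": 1}
    return {"entrypoint_num_cpus": 1}


def submit_with_resources(address: str, entrypoint: str, use_gpu: bool = False) -> str:
    """Submit a Ray job with explicit entry point resources.

    Returns the submission ID reported by the Ray job server.
    """
    # Imported lazily so the helper above can be used without Ray installed.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(address)
    return client.submit_job(
        entrypoint=entrypoint,
        **entrypoint_resource_kwargs(use_gpu),
    )


# Example usage (placeholder address and script; replace with your own values):
#   submit_with_resources("http://<ray-dashboard-host>:8265", "python my_script.py")
```

Keeping the resource selection in a separate helper makes it easy to confirm which argument is passed for CPU and GPU jobs before contacting the cluster.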
- NVIDIA GPU cannot enforce SELinux
- Due to a known NVIDIA GPU issue (https://github.com/NVIDIA/gpu-operator/issues/553), SELinux cannot be enforced for GPU deployments.
- Ray dashboard UI
- A known Ray issue prevents the Ray Dashboard UI from displaying the GPU worker group details correctly. To see updates regarding resolution and to learn more, see https://github.com/ray-project/ray/issues/14664.
- Upgrade on OpenShift cluster
- If you want to perform an in-place upgrade of HPE Ezmeral Unified Analytics Software on an OpenShift cluster, contact HPE support for assistance to ensure a smooth transition and to address any complexities that might arise during the upgrade process.
Installation
- To install HPE Ezmeral Unified Analytics Software (version 1.5.0), see Installing on User-Provided Hosts (Connected and Air-gapped Environments).
- To upgrade HPE Ezmeral Unified Analytics Software to version 1.5.0, contact HPE Support.
Additional Resources
- Documentation
- Release note archives:
Thank you for choosing HPE Ezmeral Unified Analytics Software. Enjoy the new features and improvements introduced in this release.