Spark 3.2.0.0 - 2201 (EEP 8.1.0) Release Notes

This section provides reference information, including new features, patches, and known issues for Spark 3.2.0.0.

The notes below relate specifically to the Hewlett Packard Enterprise Distribution for Apache Hadoop. For more information, you may also want to consult the open-source Spark 3.2.0 Release Notes.

These release notes contain only Hewlett Packard Enterprise specific information and are not necessarily cumulative in nature. For information about how to use the release notes, see Ecosystem Component Release Notes.

NOTE
Spark 3.2.0 runs on Java 11, Scala 2.12, Python 3.6+ and SparkR 3.5+.
Spark Version 3.2.0.0
Release Date January 2022
HPE Version Interoperability See Component Versions for Released EEPs and EEP Components and OS Support.
Source on GitHub https://github.com/mapr/spark
GitHub Release Tag 3.2.0.0-eep-810

Maven Artifacts https://repository.mapr.com/maven/
Package Names Navigate to https://package.ezmeral.hpe.com/releases/MEP/ and select your EEP and OS to view the list of package names.
IMPORTANT
  • Beginning with EEP 6.0.0, the KeyStore and TrustStore passwords can be removed from spark-defaults.conf and set in /opt/mapr/conf/ssl-client.xml instead.
  • Beginning with EEP 6.0.0, after an upgrade, the previous version's configuration files are saved in the /opt/mapr/spark directory.
  • MapR 6.1.0 with EEP 6.0.0 and later support simplified security. If you enable security on your data-fabric cluster, HPE scripts automatically configure Spark security features.
  • Beginning with Core 6.2 and EEP 7.0, Spark supports SSL for the web UI.

Hive Support

  • Starting from Spark 3.1.2, Spark supports Hive 2.3.

Delta Lake Support

Spark 3.2.0 provides Delta Lake support on HPE Ezmeral Data Fabric. See Apache Spark Feature Support.

New in This Release

Fixes

This HPE release includes the following new fixes since the latest Spark release. For details, refer to the commit log for this project in GitHub.

GitHub Commit Date (YYYY/MM/DD) Comment
7c727c3 2021/11/04 MapR [SPARK-960] Update Hadoop in Spark-3.2.x
d53fe9f 2021/11/19 MapR [SPARK-979] Backport all needed 3.1.2 EEP commits to 3.2 branch
e85b0ce 2021/11/22 MapR [SPARK-982] Update Spark version in warden files
5693d20 2021/11/22 MapR [SPARK-972] STS start fail due to java.lang.NoSuchMethodError
3b6cb09 2021/11/25 MapR [SPARK-981] Select from table with data storing as a local file fails
31ead44 2021/11/29 MapR [SPARK-950] Can't start spark job/services with enabled FIPS
85b3d44 2021/12/07 MapR [SPARK-952] Spark services can't start on cluster with enabled FIPS
9cfd68c 2021/12/07 MapR [SPARK-963] select from hbase table which was created via hive fails
82bfd4d 2021/12/09 MapR [SPARK-966] Streaming application with the latest offset read 1 message from mapr stream which was produced before application start
96a3e9d 2021/12/10 MapR [SPARK-964] MapRDBSourceConfig.CreateTableOption=true causes structured streaming application fail
697e7f9 2021/12/24 MapR [SPARK-985] Spark and Livy application fails if spark main package is not installed on each node
5e8401a 2021/12/28 MapR [SPARK-986] log4j-1.2.17.jar vulnerability:CVE-2019-17571
f951c10 2021/12/28 MapR [SPARK-975] Spark CVE fixes for Jan 2022 release
0c07103 2021/12/30 MapR [SPARK-921] Replace sudo command with maprexecute in Spark
6a38156 2021/12/30 MapR [SPARK-984] Select from temp view which was created under orc df fails
279f325 2022/01/11 MapR [SPARK-965] Spark Structured Streaming application fails when need to recovery from checkpoint
6060b6c 2022/01/14 MapR [SPARK-994] Update jackson-mapper-asl v1.9.13 to 1.9.13-atlassian-5
1636a6e 2022/01/17 MapR [SPARK-992] STS doesn't work on cluster with enabled FIPS
1661404 2022/01/18 MapR [SPARK-995] Write to parquet fails.
5b4c35f 2022/01/25 MapR [SPARK-1001] Update log4j v1 to the 1.3.1-mapr
a919353 2022/01/25 MapR [SPARK-1002] Spark WebUI not work on FIPS cluster
5fa1feb 2022/01/28 MapR [SPARK-1000] Spark's -Djava.library.path misses hadoop native libs

Known Issues and Limitations

  • The JDBC driver for Microsoft SQL Server does not support queries that use a WITH clause (common table expression, CTE) on Spark.

  • When you enable SSL in a mixed (FIPS and non-FIPS) configuration, Spark applications fail to run. To run Spark applications, set the spark.ssl.ui.enabled option to false in the spark-defaults.conf configuration file.

  • If you use Spark SQL with a Derby database and neither Hive nor Hive Metastore is installed, you will see a Java runtime exception. See Apache Spark Feature Support for a workaround.

  • Spark 3.2.0 does not support log4j 1.2 logging on HPE Ezmeral Data Fabric.

  • HPE Ezmeral Data Fabric does not support the GPU-aware scheduling feature on Spark 3.2.0. See Apache Spark Feature Support.
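
The SSL workaround for mixed FIPS and non-FIPS configurations comes down to a one-line change in spark-defaults.conf. The following is a minimal sketch of how that change could be scripted, not an HPE-provided tool. The helper name set_spark_property is hypothetical, and the sketch assumes that entries in the file take the form of a key followed by whitespace (or "=") and a value, with "#" starting a comment line.

```python
import re


def set_spark_property(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with `key` set to `value`.

    Hypothetical helper: replaces the first uncommented entry for `key`,
    drops any later duplicates, and appends the entry if it is missing.
    """
    new_line = f"{key} {value}"
    # Match the key at the start of a line, followed by whitespace, "=",
    # or end of line, so longer keys sharing the prefix are left alone.
    pattern = re.compile(rf"^\s*{re.escape(key)}(\s|=|$)")
    out = []
    replaced = False
    for line in conf_text.splitlines():
        if pattern.match(line):
            if not replaced:
                out.append(new_line)
                replaced = True
            # Later duplicates of the same key are dropped.
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # In-memory sample; a real spark-defaults.conf would be read from disk.
    sample = "# Spark defaults\nspark.master yarn\nspark.ssl.ui.enabled true\n"
    print(set_spark_property(sample, "spark.ssl.ui.enabled", "false"), end="")
```

To apply the change to a real cluster, read the contents of your spark-defaults.conf, pass them through the helper, and write the result back. The location of that file depends on your installation and is not assumed here. Editing the file by hand achieves the same result.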

Resolved Issues

  • None.