Spark 2.1.0-1703 Release Notes

The notes below relate specifically to the MapR Distribution for Apache Hadoop. You may also be interested in the open-source Spark 2.1.0 Release Notes.

Spark Version 2.1.0
Release Date April 2017
MapR Version Interoperability See EEP Components and OS Support.
Source on GitHub
GitHub Release Tag 2.1.0-mapr-1703
Maven Artifacts
Package Names See Package Names for Ecosystem Packs (EEPs)
API Changes for this Version See Spark API Changes.
NOTE: For some important Spark limitations, see "Known Issues and Limitations" later in these release notes.

New in This Release

This version of Spark supports integration with Hive. However, note the following exceptions:


This MapR release includes the following fixes made since the previous MapR Spark release. For details, refer to the commit log for this project on GitHub.

GitHub Commit Date (YYYY/MM/DD) Comment
f4bf0f5 2017/03/16 [MAPR-26060] Fixed case when mapr-streams make gaps in offsets (#97).
b6f643d 2017/03/09 Ported features from kafka 10 to kafka 9
b2d468e 2017/03/09 Merge remote-tracking branch 'origin/branch-2.1.0-mapr' into branch-2.1.0-mapr.
8aba33a 2017/03/06 Merge pull request #95 from mapr/spark-2.1.1-critical-backport.
c64db71 2017/03/06 [SPARK-18589][SQL] Fix Python UDF accessing attributes from both side of join.
417eca2 2017/03/06 [SPARK-19120] Refresh Metadata Cache After Loading Hive Tables.
0422b78 2017/03/06 [SPARK-18700][SQL] Add StripedLock for each table's relation in cache.
a45edcc 2017/03/06 [SPARK-19129][SQL] SessionCatalog: Disallow empty part col values in partition spec.
b6529d8 2017/03/06 [SPARK-19520][STREAMING] Do not encrypt data written to the WAL.
edfa296 2017/03/06 [SPARK-19750][UI][BRANCH-2.1] Fix redirect issue from http to https.
0c13e47 2017/03/06 [SPARK-19652][UI] Do auth checks for REST API access (branch-2.1).
2e3fcd3 2017/03/06 [SPARK-19220][UI] Make redirection to HTTPS apply to all URIs. (branch-2.1).
bfb75e5 2017/03/06 [SPARK-19766][SQL] Constant alias columns in INNER JOIN should not be folded by FoldablePropagation rule.
611e920 2017/03/01 [MAPR-26289][SPARK-2.1] Streaming general improvements (#93).
519f6f6 2017/02/27 Merge pull request #92 from mapr/mapr-26258.
9841429 2017/02/27 Set default HBase version to 1.1.8.
8c85366 2017/02/27 [MAPR-26258] hbasecontext.HBaseDistributedScanExample fails.
82a01e7 2017/02/13 Changes from Kafka10 package were ported to Kafka09 package.
7577dd7 2017/02/09 Merge pull request #88 from mapr/mapr-26076-spark-2.1.0.
a90ea6e 2017/02/09 [SPARK-15844][CORE] HistoryServer doesn't come up if spark.authenticate = true.
3a83ddb 2017/02/08 Merge pull request #87 from mapr/mapr-26053.
608e920 2017/02/08 [MAPR-26053] Include MapR Classes to the default value of spark.sql.hive.metastore.sharedPrefixes.
5fca03a 2017/01/23 Merge pull request #85 from mapr/mapr-24068.
33830be 2017/01/23 [MAPR-24068] YARN throws exception when label expression set.
e4263dc 2017/01/17 Merge pull request #84 from mapr/mapr-25807.
c9a53dc 2017/01/17 [MAPR-25807] Spark-Warehouse path computes incorrectly.
f7b6fcc 2017/01/16 Merge pull request #83 from mapr/thrift-maprsasl-spark-2.1.0.
ad8a592 2017/01/16 Add MapR-SASL support for Thrift Server.
a0d8c09 2017/01/12 Adding scala library.
a5f1bb2 2017/01/12 [MAPR-25713] Spark might try to load MapR Class Loader multiple times and fail.
6683ffc 2017/01/12 [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer.
ed5b22f 2017/01/12 [MAPR-25311] Bump Spark dependencies after ECO-1611 release.
effc5ba 2017/01/12 [MINOR] Fix script.
fa6f142 2017/01/12 [MAPR-24603] Could not launch beeline shell after starting Spark thrift server.
fc17f1a 2017/01/12 fixed syntax error in V09DirectKafkaWordCount example (#75).
c7de39f 2017/01/12 Spark 2.0.1 MAPR-streams Python API (#73).
e338b71 2017/01/12 [MAPR-24415] SPARK_JAVA_OPTS is deprecated (#71).
de237dc 2017/01/12 Kafka streaming producer added. (#66).
adb91d4 2017/01/12 Fixed Scala Style for SparkHiveExample.
5e4ba56 2017/01/12 [MAPR-24491] HBase classpath might contain Hive libraries.
6935a9a 2017/01/12 Minor fix for previous commit.
14902af 2017/01/12 Added script for MAPR-24374.
ae730d2 2017/01/12 Some minor changes to spark-defaults.conf.
0d2545c 2017/01/12 Changed default HBase version to 1.1.1 in compatibility.version.
dfda5f3 2017/01/12 Streaming example was refactored.
583a764 2017/01/12 [MAPR-24470] HiveFromSpark test fails in yarn-cluster mode.
f03efbe 2017/01/12 Changed Hive execution version to 1.2.0.
e2dc96b 2017/01/12 Added spark streaming integration with kafka 0.9 and mapr-streams.
b6d6609 2017/01/12 Added MapR Repo.
6e2f22e 2017/01/12 [MAPR-23559] Spark PID in /opt/mapr/pid.
4bbbfc1 2017/01/12 [MAPR-22940] Failed to connect Spark beeline (after Spark thrift server is started) on Kerberos cluster.
979a663 2017/01/12 Remove Hive jars from generated classpath for Hive.
3bc9863 2017/01/12 Fix hardcoded Hive library path.
8801758 2017/01/12 [MAPR-23203] Remove derby jars from generated Hive classpath.
4e9bd0f 2017/01/12 [MAPR-18865] Unable to submit Spark apps from Windows client.
12d25e8 2017/01/12 Skip maven clean task on the parent module.
9ef2527 2017/01/12 New: Issue with running Hive commands in Spark.
66002ec 2017/01/12 Spark should have dependency on CLDB.
ddcbed7 2017/01/12 Remove DFS shuffle settings.
b7cc4bb 2017/01/12 Fix bugs in the logic to avoid SSH for localhost.
1495ce6 2017/01/12 Copy every file in the conf directory into the distribution package.
647579b 2017/01/12 Create spark-defaults.conf for MapR.
f05c39b 2017/01/12 Avoid SSH to localhost when stopping secondary instances.
500a83c 2017/01/12 Add htrace jar to Spark classpath for hbase 0.98.
e19480e 2017/01/12 Support hbase classpath computation in util script.
8d8db1e 2017/01/12 Created ext-util.
1b5e0d3 2017/01/12 Adding external conf and scripts.
de9bdaf 2017/01/12 Enable SPARK_HIVE mode while building.
f64b395 2017/01/12 Build Spark on MapR.
de78691 2017/01/12 Spark Master failed to start in HA mode.
d9eef52 2017/01/12 The datanucleus jar in Spark need to be updated for Bug 21228.
89e1555 2017/01/12 Change dependencies to MapR and bump Hadoop version.
9e41ce0 2017/01/12 Change Spark version.

Known Issues and Limitations

  • Spark 2.1 does not support Spark Structured Streaming.
  • Full support of HPE Ezmeral Data Fabric Streams is available only on clusters with MapR 5.2 and later.
  • Spark 2.1 is able to connect to Hive Metastore 2.1, but features of Hive that were added after Hive 1.2 are not supported by Spark.
  • Spark is not able to submit jobs to YARN when the cluster is in "classic" mode, even if YARN is installed and configured.
  • MAPR-17271: On secure clusters, the MapR Control System (MCS) does not display links for Spark-Master and Spark-HistoryServer.
  • MAPR-26254: Spark Standalone is not fully supported on Kerberos-secured clusters.
  • MAPR-26039: Spark does not propagate the mapr_sec_enabled variable to the driver.
  • MAPR-25770: MapR-FS logs an ERROR when Spark tries to delete a file that has already been deleted.
  • Filter push-down is not supported with HPE Ezmeral Data Fabric Database.
  • The HPE Ezmeral Data Fabric Database Binary Connector for Apache Spark supports HPE Ezmeral Data Fabric Database binary tables except for the "bulk load" operation (SPARK-7).
  • Spark versions up to and including 2.3.0 have the following security vulnerability: CVE-2018-1334 (Apache Spark local privilege escalation vulnerability).
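
The version range in the last item can be checked mechanically. The following is a minimal sketch, not an official tool: it assumes the upper bound stated above (2.3.0, inclusive) and that a MapR version string such as 2.1.0-1703 begins with the upstream Spark version. The function and constant names are hypothetical. It compares version numbers only and cannot tell whether a given package carries a backported fix.

```python
import re

# Upper bound taken from the limitation above ("up to and including 2.3.0").
LAST_AFFECTED = (2, 3, 0)


def parse_spark_version(version):
    """Return (major, minor, patch) from strings such as '2.1.0' or '2.1.0-1703'.

    Any suffix after the three numeric components (for example the MapR
    release suffix '-1703') is ignored.
    """
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError("unrecognized Spark version: %r" % version)
    return tuple(int(part) for part in match.groups())


def is_in_cve_2018_1334_range(version):
    """True if the version falls in the range named in the limitation above."""
    # Tuples compare element by element, so (2, 10, 0) sorts after (2, 3, 0).
    return parse_spark_version(version) <= LAST_AFFECTED


if __name__ == "__main__":
    for candidate in ("2.1.0-1703", "2.3.0", "2.3.1"):
        print(candidate, is_in_cve_2018_1334_range(candidate))
```

Comparing parsed integer tuples rather than raw strings avoids mis-ordering versions such as 2.10.0 and 2.3.0.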

Resolved Issues