Spark SQL Thrift Server
Spark SQL Thrift (Spark Thrift) was developed from Apache Hive HiveServer2 and operates like the HiveServer2 Thrift server.
Spark Thrift is supported on secure clusters. You can run the Spark Thrift server and connect to Hive versions supported by Spark 2.1.0 and later with Business Intelligence (BI) tools or the Beeline command-line tool.
Starting in the EEP 4.0 release, the Spark Thrift server is available as a separate package. To install this package, see Installing Spark Standalone or Installing Spark on YARN, depending on the type of cluster manager you are installing.
In EEP 3.0, MapR introduces additional security mechanisms for Spark with the Spark Thrift server. MapR-SASL and Kerberos are supported:
- For JDBC connections into Spark Thrift server
- Between Spark and Hive metastore
Starting in the EEP 4.0 release, running configure.sh -R on a secure cluster configures MapR-SASL security for the Spark Thrift server. The script modifies or creates a SPARK_HOME/conf/hive-site.xml file as follows:
- If Hive is installed in your cluster, the script copies HIVE_HOME/conf/hive-site.xml to SPARK_HOME/conf and modifies the file.
- If Hive is not installed and you are using MapR-SASL security, the script creates a new SPARK_HOME/conf/hive-site.xml file.
- Each time the script runs, if there is a pre-existing SPARK_HOME/conf/hive-site.xml file, the script saves a copy of the file in SPARK_HOME/conf/hive-site.xml.old before modifying it.
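The hive-site.xml handling described above can be sketched as a small shell function. This is only an illustration of the documented behavior, not the actual contents of configure.sh: the function name prepare_hive_site and its arguments are hypothetical, the placeholder file it creates is an assumption, and the security properties the real script writes are not shown.

```shell
#!/bin/sh
# Illustrative sketch only: mimics the documented hive-site.xml handling.
# prepare_hive_site is a hypothetical helper, not part of configure.sh.
prepare_hive_site() {
    spark_home=$1
    hive_home=$2    # pass an empty string when Hive is not installed
    target="$spark_home/conf/hive-site.xml"

    mkdir -p "$spark_home/conf"

    # Save a copy of any pre-existing file before it is modified.
    if [ -f "$target" ]; then
        cp "$target" "$target.old"
    fi

    if [ -n "$hive_home" ] && [ -f "$hive_home/conf/hive-site.xml" ]; then
        # Hive is installed: start from Hive's own configuration.
        cp "$hive_home/conf/hive-site.xml" "$target"
    elif [ ! -f "$target" ]; then
        # Hive is not installed and no file exists yet: create a new one.
        # (The empty <configuration> element is a placeholder assumption.)
        printf '<configuration>\n</configuration>\n' > "$target"
    fi

    # The real script would now write its security settings into $target.
}
```

Calling prepare_hive_site "$SPARK_HOME" "$HIVE_HOME" twice in a row leaves the result of the first run in hive-site.xml.old, which matches the backup behavior described in the last list item.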
You can configure security manually by following the steps outlined in sub-topics listed on this page.
To launch the Spark Thrift server, perform the procedures required to configure Apache Spark to use Hive.
- Starting in the EEP 4.0 release, if you start and stop the Spark Thrift server using Warden, the connection port number is 2304. If you start and stop the server by running the /opt/mapr/spark/<spark-version>/sbin/{start,stop}-thriftserver.sh scripts, the port number remains 10000.
- Starting in the EEP 5.0.4 and EEP 6.3.0 releases, if you start and stop the Spark Thrift server by running the /opt/mapr/spark/<spark-version>/sbin/{start,stop}-thriftserver.sh scripts, the port number is also 2304.
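As a rough aid for picking the right port when connecting, the rules above can be written as a pair of shell helpers. Both function names (thrift_port, thrift_jdbc_url) are hypothetical, and the second argument is an assumption made for this sketch: you pass "yes" yourself when the cluster runs EEP 5.0.4, EEP 6.3.0, or a later release in which the scripts use port 2304. The URL uses the standard HiveServer2 jdbc:hive2:// scheme and the default database; security parameters are site-specific and are left out.

```shell
#!/bin/sh
# Hypothetical helpers that encode the port rules described above.

# thrift_port METHOD [SCRIPT_USES_2304]
#   METHOD            "warden" or "script"
#   SCRIPT_USES_2304  "yes" on releases where the sbin scripts use 2304
thrift_port() {
    method=$1
    script_uses_2304=${2:-no}
    case "$method" in
        warden)
            echo 2304
            ;;
        script)
            if [ "$script_uses_2304" = "yes" ]; then
                echo 2304
            else
                echo 10000
            fi
            ;;
        *)
            echo "unknown launch method: $method" >&2
            return 1
            ;;
    esac
}

# thrift_jdbc_url HOST METHOD [SCRIPT_USES_2304]
# Prints a JDBC URL for the default database, without security options.
thrift_jdbc_url() {
    host=$1
    port=$(thrift_port "$2" "${3:-no}") || return 1
    echo "jdbc:hive2://$host:$port/default"
}
```

The result can be passed to Beeline with its -u option, for example beeline -u "$(thrift_jdbc_url <hostname> warden)", after appending whatever authentication settings your cluster requires.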
Default Behavior
The default behavior of the Spark Thrift server is as follows:
- After installation, the Spark Thrift server is started in the local master mode.
- If the Spark master package is installed, then the Spark Thrift server is started in the standalone master mode.
- If the spark.master property is set in the spark-defaults.conf file, then the Spark Thrift server uses the master set by this property.
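The default behavior above amounts to a simple fallback chain, sketched here as a shell function. The name resolve_thrift_master is hypothetical, and two assumptions are made: that an explicit spark.master setting takes precedence over the other two cases, and that spark-defaults.conf uses whitespace-separated "key value" lines. Whether the Spark master package is installed is passed in as a plain "yes" or "no" argument rather than detected.

```shell
#!/bin/sh
# Hypothetical helper: reports which master the Spark Thrift server
# would use according to the default behavior described above.
# resolve_thrift_master DEFAULTS_FILE [MASTER_PKG_INSTALLED]
resolve_thrift_master() {
    defaults_file=$1       # path to spark-defaults.conf
    master_pkg=${2:-no}    # "yes" if the Spark master package is installed
    configured=""

    if [ -f "$defaults_file" ]; then
        # Take the last uncommented "spark.master <value>" line, if any.
        configured=$(awk '$1 == "spark.master" { v = $2 }
                          END { if (v != "") print v }' "$defaults_file")
    fi

    if [ -n "$configured" ]; then
        echo "$configured"     # explicit setting is used as-is
    elif [ "$master_pkg" = "yes" ]; then
        echo "standalone"      # Spark master package present
    else
        echo "local"           # plain installation
    fi
}
```

For example, resolve_thrift_master "$SPARK_HOME/conf/spark-defaults.conf" yes prints the configured spark.master value if one is set, and "standalone" otherwise.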
Known Limitations
- MapR-SASL support is implemented for Spark 2.1.0 and later versions of Spark. For Spark version information, see Component Versions for Released EEPs.
- The ODBC drivers do not support MapR-SASL.
- Username and password authentication through PAM is not supported in EEP 3.0.
- Spark Thrift server supports only features and commands in Hive 1.2.
- Although Spark 2.1.0 can connect to Hive 2.1 Metastore, only Hive 1.2 features and commands are supported by Spark 2.1.0.