Configuring Spark Thrift Server with Kerberos

You can configure Spark Thrift server to use Kerberos for its communications with various components on a secure Data Fabric cluster if necessary.

NOTE

Data Fabric clusters do not provide Kerberos infrastructure. The information in this section assumes a Linux-based Kerberos environment, and the specific commands for your environment may vary. Consult your Kerberos administrator for assistance.

To enable Kerberos authentication:

Create a Kerberos identity and keytab. You can use the following commands in a Linux-based Kerberos environment to set up the identity and update the keytab file.
- The hive.keytab file must be owned and readable only by the mapr user.
- FQDN@REALM is case-sensitive.
```
# kadmin
: addprinc -randkey mapr/<FQDN@REALM>
: ktadd -k /opt/mapr/conf/hive.keytab mapr/<FQDN@REALM>
```
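
The ownership requirement above can be checked with a short script. The following is a minimal sketch, not part of the product: the helper name `check_keytab_perms` is hypothetical, and it assumes GNU `stat` (standard on Linux) and treats modes 400 and 600 as "readable only by the owner".

```shell
# check_keytab_perms KEYTAB OWNER   (hypothetical helper, not a product tool)
# Succeeds only if KEYTAB is owned by OWNER and has mode 400 or 600,
# that is, no permission bits for group or others.
check_keytab_perms() {
  keytab="$1"
  owner="$2"
  # GNU stat: %U prints the owner name, %a prints the octal mode.
  actual_owner=$(stat -c '%U' "$keytab") || return 2
  mode=$(stat -c '%a' "$keytab") || return 2
  if [ "$actual_owner" != "$owner" ]; then
    echo "owner is $actual_owner, expected $owner" >&2
    return 1
  fi
  case "$mode" in
    400|600) return 0 ;;
    *) echo "mode is $mode, expected 400 or 600" >&2; return 1 ;;
  esac
}
```

For example, after running `chown mapr /opt/mapr/conf/hive.keytab` and `chmod 600 /opt/mapr/conf/hive.keytab`, the call `check_keytab_perms /opt/mapr/conf/hive.keytab mapr` should succeed. In an MIT Kerberos environment, `klist -kt /opt/mapr/conf/hive.keytab` lists the principals stored in the keytab.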

Configure the following properties in hive-site.xml on each node where HiveServer2 is installed:

Property	Value
hive.server2.authentication	KERBEROS
hive.server2.authentication.kerberos.principal	`mapr/FQDN@REALM` (where `mapr/FQDN@REALM` is the principal that you want to use for the Spark Thrift server)
hive.server2.authentication.kerberos.keytab	`/opt/mapr/conf/mapr.keytab` (where `/opt/mapr/conf/mapr.keytab` is the path to the keytab that must be used)

<property>
  <name>hive.server2.authentication</name>
  <value>KERBEROS</value>
  <description>Authentication type</description>
</property>
<property>
  <name>hive.server2.authentication.kerberos.principal</name>
  <value>mapr/FQDN@REALM</value>
  <description>Spark Thrift server principal. If _HOST is used as the FQDN portion,
  it will be replaced with the actual hostname of the running instance.
  </description>
</property>
<property>
  <name>hive.server2.authentication.kerberos.keytab</name>
  <value>/opt/mapr/conf/mapr.keytab</value>
  <description>Keytab file for Spark Thrift server principal</description>
</property>
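
Before moving on, it can help to confirm that all three properties are present in the file on each node. The sketch below is one possible check and makes assumptions: the function names are hypothetical, the path to hive-site.xml is passed in by the caller, and the match is a plain text search for the `<name>` element rather than a full XML parse.

```shell
# has_hive_property FILE NAME   (hypothetical helper)
# Succeeds if FILE contains the exact element <name>NAME</name>.
has_hive_property() {
  grep -qF "<name>$2</name>" "$1"
}

# check_kerberos_properties FILE   (hypothetical helper)
# Reports each of the three Kerberos properties that is missing from FILE.
# Returns 0 if all are present, 1 otherwise.
check_kerberos_properties() {
  file="$1"
  missing=0
  for name in \
    hive.server2.authentication \
    hive.server2.authentication.kerberos.principal \
    hive.server2.authentication.kerberos.keytab
  do
    if ! has_hive_property "$file" "$name"; then
      echo "missing property: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_kerberos_properties` with the path to your hive-site.xml prints any missing property names to standard error. The check confirms only that each property is declared, not that its value is correct.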

Reconfigure the following options in env.sh (/opt/mapr/conf/env.sh) on each node where HiveServer2 is installed:

NOTE

These configurations are listed in the portion of the file that begins with if [ "$MAPR_SECURITY_STATUS" = "true" ];. However, you should make the changes in the /opt/mapr/conf/env_override.sh file. For more information, see About env_override.sh.

Existing Configuration	Required Configuration
`MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=maprsasl_keytab"`	`MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=hybrid"`
`MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=maprsasl"`	`MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=hybrid"`
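
As an illustration, the required values from the table could be placed in /opt/mapr/conf/env_override.sh as follows. This is a sketch that assumes the override file accepts ordinary shell `export` statements; confirm the expected syntax in About env_override.sh before applying it.

```shell
# Fragment for /opt/mapr/conf/env_override.sh (assumed syntax: plain exports).
# Switches both Hive login options to the hybrid login configuration.
export MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=hybrid"
export MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=hybrid"
```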

Restart Spark Thrift server to apply these changes. The stop and start scripts are in the sbin directory under your Spark directory, /opt/mapr/spark/spark-<spark_version>/.

IMPORTANT
Start Spark Thrift server as the MapR administrative user (generally, the account named mapr). This ensures that the process identifier (PID) files are owned by this user and that impersonation support (where applicable) functions correctly.
```
./sbin/stop-thriftserver.sh
./sbin/start-thriftserver.sh 
```
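
After the restart, one way to confirm that Kerberos authentication works is to connect with a JDBC client such as Beeline, using the standard HiveServer2 URL form that carries the server principal. The helper below is a hypothetical sketch that only assembles that URL. The host, port, and `default` database are assumptions to replace with your own values.

```shell
# thrift_jdbc_url HOST PORT PRINCIPAL   (hypothetical helper)
# Prints a HiveServer2-style JDBC URL that names the server's Kerberos principal.
# The "default" database is an assumption; substitute your own if needed.
thrift_jdbc_url() {
  printf 'jdbc:hive2://%s:%s/default;principal=%s\n' "$1" "$2" "$3"
}
```

A typical check, assuming a Kerberos user who can obtain a ticket, is to run `kinit` for that user and then pass the output of `thrift_jdbc_url <host> <port> mapr/<FQDN@REALM>` to `beeline -u`. The principal in the URL must match the value of hive.server2.authentication.kerberos.principal.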

HPE Ezmeral Data Fabric – Customer-Managed 7.9.0 Documentation
Abstract	This site contains documentation for the customer-managed platform of the HPE Ezmeral Data Fabric version 7.9.0 including installation, configuration, administration, and reference content, as well as content for the associated bundled ecosystem components and drivers.
Published	April 2025
Edition	7.9.0
Topic last updated	2024-06-20
