Spark Security
This topic describes Spark security concepts in HPE Ezmeral Runtime Enterprise.
Authentication for Spark on Kubernetes
Kubernetes authentication and authorization rules apply to Spark applications of kind SparkApplication or ScheduledSparkApplication.
For example, you can create, edit, delete, and submit Spark applications according to the RBAC configuration in a tenant namespace.
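As an illustration, you can check what the tenant RBAC configuration allows you to do before submitting an application. The resource and API group names below assume the Spark Operator CRDs (sparkoperator.k8s.io), which may be named differently in your deployment:

```shell
# Check whether your user may create SparkApplication resources
# in the tenant namespace (resource/group names are assumptions):
kubectl auth can-i create sparkapplications.sparkoperator.k8s.io -n <namespace>

# Similarly, check permissions for scheduled applications:
kubectl auth can-i delete scheduledsparkapplications.sparkoperator.k8s.io -n <namespace>
```

Each command prints yes or no depending on the Role and RoleBinding objects that apply to your user in that namespace.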
User Secrets
By default, Spark application images run as the root user. However, you must start Spark applications as the user who submits the Spark application.
HPE Ezmeral Data Fabric is configured with Data Fabric SASL security. When you create Spark applications in an HPE Ezmeral Data Fabric on Kubernetes tenant or in an HPE Ezmeral Data Fabric on Bare Metal tenant, you must authenticate the Spark driver pods against HPE Ezmeral Data Fabric.
To start a Spark application as the user who submits it, and to authenticate Spark driver pods against the Data Fabric, you must create a secret. The secret contains user information such as the user ID, user name, the user's primary group ID and group name, and the user's MapR ticket.
Creating User Secrets
You can create a user secret in three different ways:
- Automatically creating secrets:
  The autoticketgenerator webhook intercepts all Create Spark Application requests. When AD/LDAP integration is enabled on the Data Fabric and Kubernetes clusters, the webhook automatically generates a ticket and a secret. The ticket has a default expiration time of 14 days.
  You cannot change or renew the expiration time of the ticket.
  The generated secrets are deleted when you delete the Spark application.
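To see what the webhook produced, you can list the secrets in the tenant namespace; the exact secret name is generated automatically, so this is only a way to inspect it:

```shell
# List secrets in the tenant namespace; the auto-generated
# user secret for the Spark application appears here:
kubectl get secrets -n <namespace>
```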
- Manually creating secrets with the ticketcreator utility:
  Data Fabric tenants contain the tenantcli pod. You can manually create your user secrets by using the ticketcreator.sh script in the tenantcli pod. The generated ticket has a default expiration time of 14 days. Because you provide this ticket to Spark applications through the secret, you cannot change or renew the expiration time of the ticket.
  Perform the following steps to use the ticketcreator.sh script from the tenantcli pod:
  1. Run the following command to enter the tenantcli pod in the tenant namespace:
     kubectl exec -it tenantcli-0 -n <namespace> -- bash
  2. Run the ticketcreator.sh utility by using the following command:
     /opt/mapr/kubernetes/ticketcreator.sh
  3. Enter the following information at the prompts:
     - The user name and password of the user for whom to create the secret.
     - The name of the user secret. The default name is randomized for security.
- Manually creating secrets without the ticketcreator utility:
  Some Spark applications have a long runtime, for example, Spark streaming applications. In such cases, you lose access to HPE Ezmeral Data Fabric services, such as the HPE Ezmeral Data Fabric filesystem, after 14 days.
  For Spark applications that must run for a long time (greater than 14 days), you can create ticket secrets with a longer expiration time by using the -duration option of the maprlogin utility. The maprlogin utility is available on the Kubernetes cluster, or in the tenantcli-0 pod and admincli-0 pod on an HPE Ezmeral Data Fabric on Kubernetes cluster. See Tickets and mapr Command Examples.
  For example, if you have a ticket saved in the /home/user/maprticket file, you can run the following command to manually create a ticket secret with a long expiration time:
  kubectl -n <namespace> create secret generic <secret-name> \
    --from-file=CONTAINER_TICKET=/home/user/maprticket \
    --from-literal=MAPR_SPARK_USER="[username]" \
    --from-literal=MAPR_SPARK_GROUP="[usergroup]" \
    --from-literal=MAPR_SPARK_UID="[uid]" \
    --from-literal=MAPR_SPARK_GID="[main_gid]"
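Before creating the secret, you first need a ticket with the longer lifetime. As a sketch (the user name, duration, and paths below are illustrative assumptions), you could generate one with maprlogin, whose -duration option takes a [days]:[hours]:[minutes] value:

```shell
# Generate a ticket valid for 30 days (duration format days:hours:minutes);
# user name is an illustrative assumption:
maprlogin password -user spark-user -duration 30:0:0

# maprlogin writes the ticket to its default location (for example,
# /tmp/maprticket_<uid>); copy it to the file you reference when
# creating the secret:
cp /tmp/maprticket_10001 /home/user/maprticket
```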
Add the secret name to the spark.mapr.user.secret field in your Spark application yaml file.
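For orientation, here is a minimal sketch of where the field can sit in a Spark application manifest. The metadata values and the placement under sparkConf are assumptions that may differ in your release; only the spark.mapr.user.secret field name comes from this topic:

```yaml
# Illustrative SparkApplication fragment; names and apiVersion are
# assumptions -- only spark.mapr.user.secret is taken from this topic.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi            # assumed application name
  namespace: my-tenant      # assumed tenant namespace
spec:
  sparkConf:
    spark.mapr.user.secret: <secret-name>   # the user secret created above
```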