Using Spark OSS Images
Describes how to use Spark Open-Source Software (OSS) images to submit Spark applications.
Spark OSS images are Apache Spark images that do not support the Data Fabric filesystem, Data Fabric Streams, or any other Data Fabric sources and sinks that require a Data Fabric client. These Spark images also do not support Data Fabric-specific security features, such as Data Fabric SASL (maprsasl).
You can use Spark OSS images with either of the following workflows:
- Spark Operator workflow using the Create Spark Application GUI. See Using the Create Spark Application GUI.
- Spark Operator workflow using Airflow. See Using Airflow.
Using the Create Spark Application GUI
To use Spark OSS images, choose one of the following options in the GUI:
Using Upload YAML in GUI
- Select the Spark OSS image from the List of Spark Images.
- Configure your Spark YAML file with the Spark OSS image; a sketch of a complete manifest appears after this list.
image: gcr.io/mapr-252711/apache-spark:<image-tag>
- To set the logged-in user’s context, add the following configuration in the sparkConf section. To learn more about user context, see Setting the User Context.
spark.hpe.webhook.security.context.autoconfigure: "true"
- Perform the instructions to create a Spark application as described in Creating Spark Applications until you reach the Application Details step.
- In the Application Details step, choose the Upload YAML option.
- Click Select File, then browse to and upload the YAML file.
- To specify the details for other boxes or options in the Application Details step and to complete creating the Spark application, see Creating Spark Applications.
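The steps above amount to a manifest like the following minimal sketch. It is illustrative only: the application name, namespace, main class, application file, and resource values are placeholders, and the apiVersion assumes the v1beta2 CRD version of the Spark Operator, so the exact fields available may differ in your environment.
apiVersion: sparkoperator.hpe.com/v1beta2  # assumes the v1beta2 CRD version
kind: SparkApplication
metadata:
  name: spark-oss-example   # placeholder application name
  namespace: example        # placeholder namespace
spec:
  type: Scala
  mode: cluster
  image: gcr.io/mapr-252711/apache-spark:<image-tag>  # the Spark OSS image
  mainClass: org.apache.spark.examples.SparkPi        # placeholder main class
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples.jar  # placeholder path
  sparkConf:
    spark.hpe.webhook.security.context.autoconfigure: "true"  # sets the logged-in user's context
  driver:
    cores: 1
    memory: 1g
  executor:
    instances: 2
    cores: 1
    memory: 1g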
Using New application in GUI
- Perform the instructions to create a Spark application as described in Creating Spark Applications until you reach the Review step.
- To open an editor where you can change the application configuration in YAML, click Edit YAML.
- Select the Spark OSS image from the List of Spark Images.
- Replace the default Spark image in the YAML with the Spark OSS image.
image: gcr.io/mapr-252711/apache-spark:<image-tag>
- To set the logged-in user’s context, add the following configuration in the sparkConf section. To learn more about user context, see Setting the User Context.
spark.hpe.webhook.security.context.autoconfigure: "true"
- To submit the application with the Spark OSS image, click Create Spark Application at the bottom right of the Review step.
To learn how to submit Spark applications by using the GUI, see Creating Spark Applications.
Using Airflow
When you submit a Spark application by using Airflow, the application is configured with the Spark image that you specify in your YAML file. You reference this YAML file in the Airflow DAG.
For example:
submit = SparkKubernetesOperator(
    task_id='submit',                          # name of the task within the DAG
    namespace="example",                       # namespace in which the application runs
    application_file="example.yaml",           # manifest that specifies the Spark OSS image
    dag=dag,                                   # DAG to which this task belongs
    api_group="sparkoperator.hpe.com",         # API group of the platform's Spark Operator
    enable_impersonation_from_ldap_user=True,  # submit as the logged-in LDAP user
)
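For context, the following is a minimal sketch of the DAG definition that the submit task above attaches to through its dag argument. The DAG ID, start date, and schedule are placeholders, and the import path for SparkKubernetesOperator depends on the Airflow provider package installed on your platform.
from datetime import datetime

from airflow import DAG

# Placeholder DAG definition; adjust the ID, start date, and schedule as needed.
dag = DAG(
    dag_id='spark_oss_example',       # placeholder DAG ID
    start_date=datetime(2024, 1, 1),  # placeholder start date
    schedule_interval=None,           # run only when triggered manually
)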
To learn how to submit Spark applications by using Airflow DAGs, see Submitting Spark Applications by Using DAGs.