Using Spark OSS Images

Describes how to use Spark Open-Source Software (OSS) images to submit Spark applications.

Spark OSS images are Apache Spark images that do not support the Data Fabric filesystem, Data Fabric Streams, or any other Data Fabric source or sink that requires a Data Fabric client. These Spark images also do not support Data Fabric-specific security features, such as Data Fabric SASL (maprsasl).

You can use Spark OSS images in either of the following two workflows:

Using the Create Spark Application GUI

To use Spark OSS images, choose one of the following options in the GUI:
Using Upload YAML in the GUI
  1. Select the Spark OSS image from the List of Spark Images.
  2. Configure your Spark YAML file with the Spark OSS image.
    image: gcr.io/mapr-252711/apache-spark:<image-tag>
  3. To set the logged-in user’s context, add the following configuration in the sparkConf section.
    spark.hpe.webhook.security.context.autoconfigure: "true"
    To learn more about user context, see Setting the User Context.
  4. Follow the instructions in Creating Spark Applications until you reach the Application Details step.
  5. In the Application Details step, choose the Upload YAML option.
  6. Click Select File, then browse to and upload the YAML file.
  7. For details about the remaining boxes and options in the Application Details step, and to finish creating the Spark application, see Creating Spark Applications.
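Steps 2 and 3 above can be combined in a single SparkApplication manifest. The following is a minimal sketch; the `v1beta2` API version, application name, namespace, main class, file paths, and resource settings are illustrative assumptions, and `<image-tag>` is a placeholder for your image tag:

```yaml
# Minimal SparkApplication sketch for a Spark OSS image.
# Names, paths, and the API version below are illustrative assumptions.
apiVersion: sparkoperator.hpe.com/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi-oss        # illustrative application name
  namespace: example        # illustrative namespace
spec:
  type: Scala
  mode: cluster
  # Spark OSS image, as selected from the List of Spark Images:
  image: gcr.io/mapr-252711/apache-spark:<image-tag>
  mainClass: org.apache.spark.examples.SparkPi          # illustrative workload
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples.jar"  # illustrative path
  sparkConf:
    # Sets the logged-in user's context:
    spark.hpe.webhook.security.context.autoconfigure: "true"
  driver:
    cores: 1
    memory: "512m"
  executor:
    instances: 1
    cores: 1
    memory: "512m"
```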
Using New application in the GUI
  1. Follow the instructions in Creating Spark Applications until you reach the Review step.
  2. To open an editor in the GUI where you can change the application configuration in YAML, click Edit YAML.
  3. Select the Spark OSS image from the List of Spark Images.
  4. Replace the default Spark image in YAML with the Spark OSS image.
    image: gcr.io/mapr-252711/apache-spark:<image-tag>
  5. To set the logged-in user’s context, add the following configuration in the sparkConf section.
    spark.hpe.webhook.security.context.autoconfigure: "true"
    To learn more about user context, see Setting the User Context.
  6. To submit the application with the Spark OSS image, click Create Spark Application at the bottom right of the Review step.

To learn how to submit Spark applications by using the GUI, see Creating Spark Applications.

Using Airflow

When you submit a Spark application by using Airflow, the application is configured with the Spark image specified in your YAML file. You reference this YAML file in the Airflow DAG.

For example:
submit = SparkKubernetesOperator(
    task_id='submit',
    namespace="example",                       # namespace where the application runs
    application_file="example.yaml",           # SparkApplication YAML that references the Spark OSS image
    dag=dag,
    api_group="sparkoperator.hpe.com",         # API group of the HPE Spark operator
    enable_impersonation_from_ldap_user=True   # run the application as the logged-in LDAP user
)
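In this workflow, example.yaml is the same kind of SparkApplication manifest used in the GUI workflows: it must reference the Spark OSS image and, to set the logged-in user's context, include the webhook setting in the sparkConf section. A minimal sketch follows; the `v1beta2` API version and metadata values are illustrative assumptions, and `<image-tag>` is a placeholder for your image tag:

```yaml
# Sketch of example.yaml referenced by the DAG above.
apiVersion: sparkoperator.hpe.com/v1beta2   # API group from the DAG; version assumed
kind: SparkApplication
metadata:
  name: example          # illustrative name
  namespace: example     # must match the namespace in the DAG
spec:
  image: gcr.io/mapr-252711/apache-spark:<image-tag>
  sparkConf:
    spark.hpe.webhook.security.context.autoconfigure: "true"
```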
To learn how to submit Spark applications by using an Airflow DAG, see Submitting Spark Applications by Using DAGs.