Using Your Own Open-Source Spark Images
Describes how to use your own open-source Spark images to submit Spark applications.
You can use your own open-source Spark images that are compatible with the Kubernetes version supported on HPE Ezmeral Unified Analytics Software. By bringing your own open-source Spark, you can build Spark with any profile of your choice; however, there is no support for the Data Fabric filesystem, Data Fabric Streams, or any other Data Fabric sources and sinks that require a Data Fabric client. Open-source Spark images also do not support Data Fabric-specific security features (data-fabric SASL (maprsasl)).
- Build Spark. See Building Spark.
- Build Spark images to run in HPE Ezmeral Unified Analytics Software. See Building Images.
- Choose one of the following:

Using the Create Spark Application GUI

- Using Upload YAML
- Configure your Spark YAML file with the built Spark image of your choice.
image: <base-repository>/<image-name>:<image-tag>
- To set the logged-in user's context, add the following configuration in the sparkConf section. To learn more about user context, see Setting the User Context.
spark.hpe.webhook.security.context.autoconfigure: "true"
- Perform the instructions to create a Spark application as described in Creating Spark Applications until you reach the Application Details step.
- In the Application Details step, choose the Upload YAML option.
- Click Select File, then browse to and upload the YAML file.
- To specify the details for other boxes or options in the Application Details step and to complete creating the Spark application, see Creating Spark Applications.
- Using New application
- Perform the instructions to create a Spark application as described in Creating Spark Applications until you reach the Review step.
- To open an editor to change the application configuration using YAML in the GUI, click Edit YAML.
- Replace the default Spark image in the YAML with your built open-source Spark image.
image: <base-repository>/<image-name>:<image-tag>
- To set the logged-in user's context, add the following configuration in the sparkConf section. To learn more about user context, see Setting the User Context.
spark.hpe.webhook.security.context.autoconfigure: "true"
- To submit the application with your own Spark image, click Create Spark Application on the bottom right of the Review step.
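Whichever GUI flow you use, the resulting YAML edits might look like the following fragment. This is a sketch, not a complete manifest: the field layout assumes the standard SparkApplication spec, and the image placeholders are the ones shown above.

```yaml
# Hypothetical fragment of a SparkApplication manifest; only the fields
# discussed above are shown.
spec:
  # Your built open-source Spark image
  image: <base-repository>/<image-name>:<image-tag>
  sparkConf:
    # Sets the logged-in user's context (see Setting the User Context)
    spark.hpe.webhook.security.context.autoconfigure: "true"
```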
Using Airflow
When you submit the Spark application by using Airflow, the application runs with the Spark image specified in your YAML file. The Airflow DAG references this YAML file through the application_file argument of the SparkKubernetesOperator.
submit = SparkKubernetesOperator(
    task_id='submit',
    namespace="example",
    # SparkApplication manifest that references your open-source Spark image
    application_file="example.yaml",
    dag=dag,
    # API group of the HPE Spark operator
    api_group="sparkoperator.hpe.com",
    # Run the application as the logged-in LDAP user
    enable_impersonation_from_ldap_user=True
)
To learn about how to submit Spark applications by using Airflow DAG, see Submitting Spark Applications by Using DAGs.
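As a fuller sketch, the operator shown above typically sits inside a DAG definition file. The import path and DAG scaffolding below are assumptions based on the standard Airflow Kubernetes provider; in HPE Ezmeral Unified Analytics, the operator variant that accepts enable_impersonation_from_ldap_user is bundled with the platform's Airflow, so the actual import may differ.

```python
from datetime import datetime

from airflow import DAG
# Assumed import path; the platform's bundled operator may live elsewhere.
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import (
    SparkKubernetesOperator,
)

with DAG(
    dag_id="submit_custom_spark_image",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule=None,                       # trigger manually
    catchup=False,
) as dag:
    submit = SparkKubernetesOperator(
        task_id="submit",
        namespace="example",
        application_file="example.yaml",  # manifest referencing your Spark image
        api_group="sparkoperator.hpe.com",
        enable_impersonation_from_ldap_user=True,
    )
```

Placing the DAG file in the Airflow DAGs directory makes the task visible in the Airflow UI, from which it can be triggered manually.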