Managing Spark Application Dependencies
This topic describes how to pass dependencies to Spark applications in HPE Ezmeral Runtime Enterprise.
You can manage custom Spark dependencies in three different ways:
- Build the dependencies into the main application JAR, for example with the maven-assembly-plugin. See maven-assembly-plugin.
- Create a PersistentVolume for the dependencies and mount it into the driver and executor pods. You can use the local schema to reference those dependencies, as shown in the sketch after this list. For example, see PySpark-with-dependencies.yaml.
- Build custom images on top of the Spark images provided by HPE Ezmeral Runtime Enterprise, and copy or install the dependencies in your custom images. You can use the local schema to reference those dependencies.
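For example, the PersistentVolume approach might look like the following minimal sketch of a SparkApplication manifest, assuming the Spark Operator v1beta2 API. The PVC name spark-deps-pvc, the mount path /mnt/deps, the namespace, and the file names are hypothetical placeholders; the image should be one of the Spark images provided by the platform.

```yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: pyspark-with-deps
  namespace: sampletenant               # hypothetical tenant namespace
spec:
  type: Python
  mode: cluster
  image: <spark-image-provided-by-platform>
  # local:// paths resolve inside the driver and executor pods,
  # where the PersistentVolume is mounted at /mnt/deps.
  mainApplicationFile: local:///mnt/deps/main.py
  deps:
    pyFiles:
      - local:///mnt/deps/helpers.zip   # dependency stored on the mounted volume
  volumes:
    - name: deps-volume
      persistentVolumeClaim:
        claimName: spark-deps-pvc       # hypothetical pre-created PVC holding the dependencies
  driver:
    volumeMounts:
      - name: deps-volume
        mountPath: /mnt/deps
  executor:
    volumeMounts:
      - name: deps-volume
        mountPath: /mnt/deps
```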
Supported Schemas for Main Application File in Spark Applications
The following schemas are supported for the main application file in Spark applications:
- local
- dtap
- s3a
- maprfs
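For illustration, the mainApplicationFile field of a SparkApplication manifest can use any of these schemas; the paths below are hypothetical placeholders.

```yaml
# Each line shows one supported schema for the main application file; use one.
mainApplicationFile: local:///mnt/deps/main.py       # file in the image or on a mounted volume
# mainApplicationFile: dtap://TenantStorage/apps/main.py   # DataTap path
# mainApplicationFile: s3a://my-bucket/apps/main.py        # S3 object storage
# mainApplicationFile: maprfs:///apps/main.py              # HPE Ezmeral Data Fabric file system
```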
Supported Schemas for Passing Dependencies
The following schemas are supported for passing dependencies to Spark applications:
- local
- s3a (AWS)
- dtap
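For illustration, a deps section might mix these schemas as in the following sketch; the bucket, DataTap, and file names are hypothetical.

```yaml
spec:
  deps:
    jars:
      - local:///mnt/deps/extra-lib.jar      # from a mounted volume or custom image
      - s3a://my-bucket/jars/extra-lib.jar   # from AWS S3
    pyFiles:
      - dtap://TenantStorage/deps/helpers.zip  # from a DataTap
```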
Unsupported Schemas for Passing Dependencies
The following schemas are not supported for passing dependencies to Spark applications:
- maprfs
- s3a (custom)

To learn more, see Spark on Kubernetes Issues (5.4.0) in Issues and Workarounds.