Using Amazon S3 to Store Logs
Amazon Web Services (AWS) offers Amazon Simple Storage Service (Amazon S3), which stores and retrieves objects through a web service interface.
Configure the Spark History Server with existing Amazon S3 storage buckets to store the event logs.
To store logs in Amazon S3 buckets:
- Set the following flags during Spark History Server installation. See Installing and Configuring Spark History Server.
      --set tenantIsUnsecure=true \
      --set eventlogstorage.kind=s3 \
      --set eventlogstorage.s3Endpoint=http://s3host:9000 \
      --set eventlogstorage.s3path=s3a://bucket/<path-to-folder> \
      --set eventlogstorage.s3AccessKey=<access-key> \
      --set eventlogstorage.s3SecretKey=<secret-key>
  Configuration options such as s3AccessKey and s3SecretKey are passed to Spark History Server using a Kubernetes secret.
  You can also securely pass the Amazon S3 credentials by setting the sparkExtraConfigs option in the values.yaml file.
      sparkExtraConfigs: |
        spark.hadoop.fs.s3a.access.key [access_key]
        spark.hadoop.fs.s3a.secret.key [secret_key]
- Set the following options in the values.yaml file in a tenant namespace.
      # Space separated Java options for Spark HS (Will be added to SPARK_HISTORY_OPTS in spark-env.sh)
      HSJavaOpts: -Dcom.sun.net.ssl.checkRevocation=false -Dcom.amazonaws.sdk.disableCertChecking=true
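The installation flags from the first step can be generated rather than typed by hand, which keeps the access and secret keys out of the literal command text. The following is a minimal sketch under stated assumptions, not part of Spark History Server: the helper name build_s3_flags and its argument order are hypothetical, and it only prints the flags listed in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Spark History Server).
# Prints the S3 event-log flags from this section, one per line.
# Arguments: 1 = S3 endpoint URL, 2 = s3a:// path, 3 = access key, 4 = secret key
build_s3_flags() {
  for kv in \
    "tenantIsUnsecure=true" \
    "eventlogstorage.kind=s3" \
    "eventlogstorage.s3Endpoint=$1" \
    "eventlogstorage.s3path=$2" \
    "eventlogstorage.s3AccessKey=$3" \
    "eventlogstorage.s3SecretKey=$4"
  do
    # Emit each key=value pair in the same "--set key=value" form used above.
    printf '%s %s\n' --set "$kv"
  done
}

# Example call; S3_ACCESS_KEY and S3_SECRET_KEY are assumed environment variables:
# build_s3_flags "http://s3host:9000" "s3a://bucket/<path-to-folder>" "$S3_ACCESS_KEY" "$S3_SECRET_KEY"
```

The printed flags can then be appended to the install command described in Installing and Configuring Spark History Server. Reading the keys from environment variables is one possible choice here. The Kubernetes secret and sparkExtraConfigs mechanisms described above remain the documented ways to pass credentials.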