Configuring MapR-SASL and SSL for Hooks Connections

This topic describes the configuration options for MapR-SASL and SSL for hook connections in Airflow.

Using Airflow, you can import data from and export data to multiple systems. Airflow provides a high-level interface called Hooks to connect to these systems by integrating with Connections.

A connection is an object that stores connection details such as the hostname, your username and password, the type of system you are connecting to, and other configuration options.

HPE Ezmeral Data Fabric 7.0.0 supports MapR-SASL authentication for Airflow.

To support MapR-SASL authentication for HPE Ezmeral Data Fabric 6.2.x, see Applying a Patch.

Airflow authenticates with MapR-SASL in the following ways:

Using the Ecosystem Component Client

To authenticate with MapR-SASL, Airflow uses the clients of the ecosystem components installed on the node. To submit tasks on a secure cluster, configure a Data Fabric user ticket. See Generating a HPE Ezmeral Data Fabric User Ticket.
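Before submitting tasks, it can help to confirm that a user ticket is present on the node. The following is a minimal sketch, not part of Airflow or the Data Fabric client. It assumes that the MAPR_TICKETFILE_LOCATION environment variable, when set, points to the ticket file, and that the ticket is otherwise at /tmp/maprticket_<uid>. The function names ticket_path and has_ticket are hypothetical helpers introduced for illustration.

```python
import os
from pathlib import Path


def ticket_path(env=None, uid=None) -> Path:
    """Resolve where the user ticket is expected to be.

    Assumption: MAPR_TICKETFILE_LOCATION overrides the location;
    otherwise the ticket is expected at /tmp/maprticket_<uid>.
    """
    env = os.environ if env is None else env
    override = env.get("MAPR_TICKETFILE_LOCATION")
    if override:
        return Path(override)
    # os.getuid() is only called when no uid is passed in (POSIX only).
    uid = os.getuid() if uid is None else uid
    return Path(f"/tmp/maprticket_{uid}")


def has_ticket(env=None, uid=None) -> bool:
    """Return True if a ticket file exists at the resolved location."""
    return ticket_path(env, uid).is_file()
```

Calling has_ticket() with no arguments checks the current user's environment. The check only confirms that the file exists; it does not verify that the ticket is still valid.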

Using the REST API or Thrift protocol

To authenticate with MapR-SASL, you can use the REST API or the Thrift protocol by setting additional configuration options on the connection that each hook uses.

WebEZFSHook (webezfs_default connection id)
To connect to the file system, set the following configuration options in the extra section of the connection configuration.
MapR-SASL: Set {"auth": "maprsasl"}.
SSL: On secure clusters, set the {"use_ssl": "true"} option. For a nondefault SSL configuration, also set {"cert":"/path_to_truststore.pem"}.
EzHiveCliHook (hive_cli_default connection id, auth authenticationMethod)
To connect to Hive, set the following configuration options in the connection configuration.
MapR-SASL: Set {"use_beeline": true, "ssl":"true"}. Also pass the auth parameter to EzHiveCliHook. For example: hive = EzHiveCliHook(auth="maprsasl").
EzHiveMetastoreHook (metastore_default connection id)
To connect to the Hive Metastore, set the following configuration options in the extra section of the connection configuration.
MapR-SASL: Set {"authMechanism":"MAPRSASL"}.
EzHiveServer2Hook (hiveserver2_default connection id)
To connect to HiveServer2, set the following configuration options in the extra section of the connection configuration.
MapR-SASL: Set {"authMechanism":"MAPRSASL"}.
SSL: On secure clusters, set the {"ssl": "true"} option. For a nondefault SSL configuration, also set {"certificate":"/path_to_truststore.pem"}.
EzLivyHook (livy_default connection id)
To connect to Livy, set the following configuration options in the extra section of the connection configuration.
MapR-SASL: Set {"auth":"maprsasl"}.
SSL: On secure clusters, set the {"use_ssl": "true"} option. For a nondefault SSL configuration, also set {"cert":"/path_to_truststore.pem"}.
EzS3Hook (aws_default connection id)
To connect to S3, set the following configuration options in the extra section of the connection configuration.
SSL: On secure clusters with a nondefault SSL configuration, set the {"cert":"/path_to_truststore.pem"} option.
To connect with the HPE Ezmeral Data Fabric Object Store and AWS, you must also add the endpoint URL to the extra section of the connection configuration. For example:
{"endpoint_url": "https://<hostname>:9000"}
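The per-hook options above can be collected in one place and serialized into the JSON string that goes in the extra section of each connection. The following is a minimal sketch using only the Python standard library. The EXTRAS mapping and the extra_json helper are hypothetical names introduced for illustration, the option values are copied from this topic, and the truststore path and <hostname> are placeholders that you must replace.

```python
import json

# Placeholder from this topic; only needed for a nondefault SSL configuration.
TRUSTSTORE = "/path_to_truststore.pem"

# Connection id -> options described in this topic for a secure cluster.
EXTRAS = {
    "webezfs_default": {"auth": "maprsasl", "use_ssl": "true", "cert": TRUSTSTORE},
    "hive_cli_default": {"use_beeline": True, "ssl": "true"},
    "metastore_default": {"authMechanism": "MAPRSASL"},
    "hiveserver2_default": {
        "authMechanism": "MAPRSASL",
        "ssl": "true",
        "certificate": TRUSTSTORE,
    },
    "livy_default": {"auth": "maprsasl", "use_ssl": "true", "cert": TRUSTSTORE},
    "aws_default": {"cert": TRUSTSTORE, "endpoint_url": "https://<hostname>:9000"},
}


def extra_json(conn_id: str) -> str:
    """Return the JSON string for the extra section of the given connection."""
    return json.dumps(EXTRAS[conn_id], sort_keys=True)


if __name__ == "__main__":
    for conn_id in EXTRAS:
        print(f"{conn_id}: {extra_json(conn_id)}")
```

Running the script prints one line per connection id, which you can paste into the extra section of the matching connection. Note that the SSL flags are the string "true", as shown in this topic, while use_beeline is a JSON boolean.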