Configuring a Remote Database for Airflow

This topic describes how to configure a remote database for Airflow on the HPE Ezmeral Data Fabric.

About this task

Airflow uses SQLAlchemy to connect to the metadata database. The metadata database stores Airflow configurations, user information, roles and policies, and statistics for each DAG state, run, and task.

You can configure databases supported by SQLAlchemy to host Airflow metadata. The most common databases are:
  1. PostgreSQL
  2. MySQL
  3. SQLite

Although SQLite is the default database for Apache Airflow, the Airflow community recommends PostgreSQL for most use cases. For more details, see Set up a Database Backend.

For a list of the supported databases, see Choosing database backend.

Procedure

  1. Configure the remote database for Airflow. See Set up a Database Backend.
  2. Update <AIRFLOW-HOME>/conf/airflow.cfg by setting the sql_alchemy_conn option to your SQLAlchemy connection string.
    • For PostgreSQL, use:

      postgresql+psycopg2://<user>:<password>@<host>/<db>

    • For MySQL, use:

      mysql+mysqldb://<airflow-user>:<airflow-password>@<host>[:<port>]/<airflow-dbname>

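      To connect to MySQL from Airflow, install the mysqlclient package in the Airflow Python virtual environment:
      1. Activate the environment:
        . <AIRFLOW-HOME>/build/env/bin/activate
      2. Install mysqlclient:
        pip install mysqlclient==2.2.0
      3. Deactivate the environment:
        deactivate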
  3. Create the database schema using the steps that apply to the installed EEP version. To identify the installed version, see Checking the EEP Version.
    • EEP 9.2.0 and later:
      1. Run the airflow db migrate command to initialize the database:
        airflow db migrate
      2. Run the airflow connections create-default-connections command to create default connections:
        airflow connections create-default-connections
    • EEP 9.1.x and earlier:
      1. Run the airflow db init command to initialize the database:
        airflow db init
  4. After the database configuration completes, create an Airflow user as described in the Command Line Interface and Environment Variables Reference. For example, the following command creates an Admin user named mapr:
    airflow users create --username mapr --firstname mapr --lastname mapr -p mapr --role Admin --email admin@example.org
  5. Restart Airflow services as described in Starting, Stopping, and Restarting Airflow Services.
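
Steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration, not part of the product: the function names (postgres_url, mysql_url, schema_commands) and all sample values are hypothetical. The sketch also URL-encodes the user name and password, which this procedure does not mention. It is added on the assumption that credentials may contain characters such as @ or / that would otherwise be read as URL delimiters.

```python
from urllib.parse import quote_plus


def postgres_url(user, password, host, db):
    """Build the PostgreSQL connection string shown in step 2."""
    # quote_plus escapes characters such as '@', ':' and '/' so they are
    # not mistaken for URL delimiters (an addition to the documented format).
    return (f"postgresql+psycopg2://{quote_plus(user)}:"
            f"{quote_plus(password)}@{host}/{db}")


def mysql_url(user, password, host, db, port=None):
    """Build the MySQL connection string shown in step 2; port is optional."""
    netloc = f"{host}:{port}" if port is not None else host
    return (f"mysql+mysqldb://{quote_plus(user)}:"
            f"{quote_plus(password)}@{netloc}/{db}")


def schema_commands(eep_version):
    """Return the step 3 commands for an EEP version string such as '9.2.0'."""
    major, minor = (int(part) for part in eep_version.split(".")[:2])
    if (major, minor) >= (9, 2):
        # EEP 9.2.0 and later
        return ["airflow db migrate",
                "airflow connections create-default-connections"]
    # EEP 9.1.x and earlier
    return ["airflow db init"]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(postgres_url("airflow", "p@ss/word", "db.example.org", "airflow"))
    print(mysql_url("airflow", "secret", "db.example.org", "airflow", port=3306))
    print(schema_commands("9.1.2"))
```

If the encoded password contains percent signs, they may need to be doubled (%%) when pasted into airflow.cfg, depending on how your Airflow version parses the file. Verify this against your installation before relying on it.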