Configuring a Remote Database for Airflow
This topic describes how to configure a remote database for Airflow on the HPE Ezmeral Data Fabric.
About this task
Airflow uses SQLAlchemy to connect to the metadata database. The metadata database stores Airflow configurations, user information, roles and policies, and statistics for each DAG state, run, and task. Commonly used database backends include:
- Postgres
- MySQL
- SQLite
Although SQLite is the default database in Apache Airflow, the Airflow community recommends Postgres for most use cases. For more details, see Set up a Database Backend.
For a list of the supported databases, see Choosing database backend.
Procedure
- Configure the remote database for Airflow. See Set up a Database Backend.
- Update <AIRFLOW-HOME>/conf/airflow.cfg with your SQLAlchemy connection string:
  - For PostgreSQL, use:
    postgresql+psycopg2://<user>:<password>@<host>/<db>
  - For MySQL, use:
    mysql+mysqldb://<airflow-user>:<airflow-password>@<host>[:<port>]/<airflow-dbname>
    To connect to MySQL from Airflow, install mysqlclient:
    - Run . <airflow_home>/build/env/bin/activate
    - Run pip install mysqlclient==2.2.0
    - Run deactivate
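The connection string formats above break if the user name or password contains characters such as "@", ":", or "/", because those characters delimit the parts of the URI. The following is a minimal sketch, not part of the product, that assembles a connection string with percent-encoded credentials. The function name build_conn_uri and all sample values are hypothetical.

```python
from urllib.parse import quote


def build_conn_uri(driver, user, password, host, db, port=None):
    """Return a URI of the form driver://user:password@host[:port]/db."""
    # Percent-encode the credentials so that '@', ':' and '/' inside them
    # are not mistaken for URI delimiters.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    location = f"{host}:{port}" if port is not None else host
    return f"{driver}://{credentials}@{location}/{db}"


# Hypothetical values, for illustration only.
pg_uri = build_conn_uri(
    "postgresql+psycopg2", "airflow_user", "p@ss:word", "db.example.org", "airflow_db"
)
mysql_uri = build_conn_uri(
    "mysql+mysqldb", "airflow_user", "p@ss:word", "db.example.org", "airflow_db", port=3306
)
print(pg_uri)
print(mysql_uri)
```

The printed string is the value to place in airflow.cfg. In upstream Apache Airflow, the option that holds it is named sql_alchemy_conn; check your installed airflow.cfg for the section it belongs to.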
- Create the database schema using the steps that apply to the currently installed EEP. To identify the EEP that is installed, see Checking the EEP Version:
  - EEP 9.2.0 and later:
    - Run the airflow db migrate command to initialize the database:
      airflow db migrate
    - Run the airflow connections create-default-connections command to create default connections:
      airflow connections create-default-connections
  - EEP 9.1.x and earlier:
    airflow db init
- After the database configuration completes, create a user by following the steps in the Command Line Interface and Environment Variables Reference. For example:
  airflow users create --username mapr --firstname mapr --lastname mapr -p mapr --role Admin --email admin@example.org
- Restart Airflow services as described in Starting, Stopping, and Restarting Airflow Services.
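The EEP-dependent schema step in the procedure above can be summarized as a small lookup. This is an illustrative sketch only: the helper name schema_commands is hypothetical, and it assumes the EEP version is available as a dotted numeric string such as "9.2.0". It returns the commands to run and does not execute them.

```python
def schema_commands(eep_version):
    """Return the schema-creation commands for an EEP version string."""
    # Compare numerically so that, for example, 9.10.0 sorts after 9.2.0.
    parts = tuple(int(p) for p in eep_version.split("."))
    # Pad short versions such as "9.2" to three components.
    parts = (parts + (0, 0, 0))[:3]
    if parts >= (9, 2, 0):
        return [
            "airflow db migrate",
            "airflow connections create-default-connections",
        ]
    return ["airflow db init"]


for version in ("9.1.2", "9.2.0"):
    print(version, schema_commands(version))
```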