Airflow
Describes how to identify and debug issues for Airflow.
Airflow UI
- Cannot access Airflow UI or cannot see DAGs.
  - Ensure that the Git repository is configured properly. See Airflow DAGs Git Repository.
  - The administrator can refer to the logs from the git-sync container in the scheduler pod in the airflow-hpe namespace.
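One way to read the git-sync logs is sketched below. It assumes the scheduler pod is named af-cluster-scheduler-0 (the name used by the scheduler log command on this page) and that git-sync is the container name inside that pod. The `git_sync_logs` function, the `KUBECTL` override variable, and the default of 200 lines are illustrative choices, not part of the product.

```shell
# Hypothetical helper; set KUBECTL=echo to print the command instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

# Print the most recent git-sync container logs from the scheduler pod.
# Assumption: the scheduler pod is af-cluster-scheduler-0 in the airflow-hpe namespace.
# Optional first argument: number of log lines to show (defaults to 200).
git_sync_logs() {
  $KUBECTL logs -n airflow-hpe af-cluster-scheduler-0 -c git-sync --tail="${1:-200}"
}
```

After sourcing the snippet, `git_sync_logs 50` shows the last 50 lines. Repeated clone or authentication errors in this output usually point back to the Git repository configuration.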
- Cannot sign in to Airflow or other issues in Airflow UI.
  Check the logs from the af-cluster-airflowui-0 pod in the airflow-hpe namespace. Run:
  kubectl logs -n airflow-hpe af-cluster-airflowui-0
  NOTE: If more than one user needs to access the UI from the same browser, the logged-in user must explicitly log out before another user can access the UI. Failure to log out explicitly results in caching and dashboard permission issues when multiple users access the same UI.
Airflow DAG
- Airflow DAG is failing.
  If an Airflow DAG is failing, you can check the logs in the following three ways:
  - To check the logs of the failed task in the Airflow UI, follow these steps:
    - Sign in to HPE Ezmeral Unified Analytics Software.
    - Click the Applications & Frameworks icon on the left navigation bar. Navigate to the Airflow tile under the Data Engineering tab and click Open.
    - Click Browse and select Task Instances.
    - Select the failed task from the list.
    - Scroll horizontally to the right until you find the Log Url button.
    - Click the Log Url button to view the logs associated with the failed task.
  - To check the logs from the pod of a task by its name in the airflow-hpe namespace, run:
    kubectl logs -n airflow-hpe <pod_name_associated_with_the_task>
  - To check the logs from the scheduler pod in the airflow-hpe namespace, run:
    kubectl logs -n airflow-hpe af-cluster-scheduler-0
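The second option needs the name of the task's pod. A minimal sketch for finding it is shown below. The function names and the `KUBECTL` override variable are hypothetical. Whether a failed task leaves behind a pod in the Failed phase depends on how Airflow is configured to run and clean up task pods, so treat the filtered listing as an assumption and fall back to the full listing if it returns nothing.

```shell
# Hypothetical helpers; set KUBECTL=echo to print the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# List all pods in the airflow-hpe namespace, oldest first.
list_airflow_pods() {
  $KUBECTL get pods -n airflow-hpe --sort-by=.metadata.creationTimestamp
}

# List only pods whose phase is Failed (assumes failed task pods are not cleaned up).
list_failed_airflow_pods() {
  $KUBECTL get pods -n airflow-hpe --field-selector=status.phase=Failed
}

# Show the logs of one pod; pass the pod name found with the listings above.
task_pod_logs() {
  $KUBECTL logs -n airflow-hpe "$1"
}
```

For example, run `list_airflow_pods`, pick the pod whose name matches the failed task, and pass that name to `task_pod_logs`.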
Airflow scheduler Pod
- The scheduler pod is not coming up.
  If the scheduler pod is not coming up, follow these steps:
  NOTE: Performing the next steps will result in the deletion of Airflow metadata. Proceed with caution.
  - Delete the PVC in the airflow-hpe namespace without waiting for the deletion.
    kubectl delete pvc -n airflow-hpe <pvc_name>
  - Delete the PostgreSQL database StatefulSet in the airflow-hpe namespace.
    kubectl delete statefulset -n airflow-hpe <postgres_db_statefulset_name>
  - Restart the scheduler pod.
    kubectl rollout restart sts -n airflow-hpe af-cluster-scheduler
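The three steps above can be chained as in the following sketch. The `reset_airflow_scheduler` function and the `KUBECTL` override variable are hypothetical. The `--wait=false` flag is one way to delete the PVC without waiting for the deletion, and it is an assumption here because the original command does not include it. The PVC and StatefulSet names are not given on this page; `kubectl get pvc -n airflow-hpe` and `kubectl get statefulset -n airflow-hpe` list the candidates.

```shell
# Hypothetical helper; set KUBECTL=echo to print the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# WARNING: this deletes Airflow metadata.
# Arguments: <pvc_name> <postgres_db_statefulset_name>
reset_airflow_scheduler() {
  if [ "$#" -ne 2 ]; then
    echo "usage: reset_airflow_scheduler <pvc_name> <postgres_db_statefulset_name>" >&2
    return 1
  fi
  # Assumption: --wait=false returns immediately instead of blocking until the PVC is gone.
  $KUBECTL delete pvc -n airflow-hpe "$1" --wait=false &&
  # Delete the PostgreSQL database StatefulSet.
  $KUBECTL delete statefulset -n airflow-hpe "$2" &&
  # Restart the scheduler StatefulSet so its pod is recreated.
  $KUBECTL rollout restart sts -n airflow-hpe af-cluster-scheduler
}
```

Each command runs only if the previous one succeeded, so a wrong PVC or StatefulSet name stops the sequence before the restart. Running it first with `KUBECTL=echo` prints the three commands for review without touching the cluster.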