Airflow 2.9.1.0 - 2407 (EEP 9.3.0) Release Notes

The following notes relate specifically to the HPE Ezmeral Data Fabric Distribution for Apache Airflow. You may also be interested in the Apache Airflow home page.
Airflow Version: 2.9.1.0
Release Date: July 2024
HPE Version Interoperability: See EEP Components and OS Support.
Source on GitHub: https://github.com/mapr/airflow
GitHub Release Tag: 2.9.1.0-eep-930
Package Names: Navigate to http://package.ezmeral.hpe.com/releases/MEP/, and select your EEP (MEP) and OS to view the list of package names.
Documentation

New in This Release

This release:
  • Updates the Airflow component to version 2.9.1.0.
  • Implements the HPE Ezmeral Drill hook and operator.
  • Adds the new default ezhive_cli_default connection for EZHiveOperator and EzHiveCliHook (see the sketch after this list).
  • Adds the airflow-env.sh file for flexible configuration of Airflow services.
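As a quick check that the new default connection is registered after installing this release, you can inspect it with the standard Apache Airflow connections CLI. This is a minimal sketch; only the connection ID ezhive_cli_default comes from this release, and it assumes a working Airflow installation on the node.

    # List connections and confirm the new default is present
    airflow connections list | grep ezhive_cli_default
    # Show the full definition of the new default connection
    airflow connections get ezhive_cli_default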

Fixes

The following fixes are added on top of Apache Airflow 2.9.1 or backported from Apache Airflow 2.9.2:
  • Fix bug that makes AirflowSecurityManagerV2 leave transactions in the idle in transaction state (#39935)
  • Fix Mark Instance state buttons stay disabled if user lacks permission (#37451). (#38732)
  • Use SKIP LOCKED instead of NOWAIT in mini scheduler (#39745)
  • Remove DAG Run Add option from FAB view (#39881)
  • Add max_consecutive_failed_dag_runs in API spec (#39830)
  • Fix example_branch_operator failing in python 3.12 (#39783)
  • Change dataset URI validation to raise warning instead of error in Airflow 2.9 (#39670)
  • Visible DAG RUN doesn’t point to the same dag run id (#38365)
  • Refactor SafeDogStatsdLogger to use get_validator to enable pattern matching (#39370)
  • Fix custom actions in security manager has_access (#39421)
  • Fix HTTP 500 Internal Server Error if DAG is triggered with bad params (#39409)
  • Fix static file caching is disabled in Airflow Webserver. (#39345)
  • Fix TaskHandlerWithCustomFormatter now adds prefix only once (#38502)
  • Do not provide deprecated execution_date in @apply_lineage (#39327)
  • Add missing conn_id to string representation of ObjectStoragePath (#39313)
  • Fix sql_alchemy_engine_args config example (#38971)
  • Add Cache-Control “no-store” to all dynamically generated content (#39550)

Known Issues and Limitations

  • The Installer can install Airflow, but cannot set up MySQL as the backend database for Airflow. The default Airflow database is SQLite. To use MySQL, configure it manually; see the configuration sketch after this list.
  • Apache PySpark has many CVEs and is removed from the default Airflow dependencies. To use the Spark JDBC operator/hook from Apache, install PySpark as follows (see the shell sketch after this list):
    1. Run <airflow_home>/build/env/bin/activate.
    2. Run pip install pyspark==3.3.3.
    3. Run deactivate.
    NOTE: This process does not affect the HPE Ezmeral Spark provider.
  • If the repair_pip_depends.sh script fails with the following error, you must run the script again:
    subprocess.CalledProcessError: Command 'krb5-config --libs gssapi' returned non-zero exit status 127.
    [end of output]
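For the MySQL limitation above, the metadata database can be pointed at an existing MySQL instance by hand. The snippet below is a minimal sketch that assumes a standard Airflow 2.9 configuration with a MySQL client driver already installed; the host, user, password, and database names are placeholder values, not settings shipped with this release.

    # In airflow.cfg, set the metadata database connection string under [database], for example:
    #
    #   [database]
    #   sql_alchemy_conn = mysql+mysqldb://airflow_user:airflow_password@mysql-host:3306/airflow
    #
    # Then create or upgrade the schema with the standard Airflow CLI:
    airflow db migrate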
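The PySpark installation steps above, collected into a single shell session. This sketch assumes that <airflow_home>/build/env is a standard Python virtual environment, so the activate script is sourced rather than executed; the <airflow_home> placeholder is left exactly as in the steps above.

    # Activate the Airflow virtual environment (virtualenv activate scripts are normally sourced)
    source <airflow_home>/build/env/bin/activate
    # Install the pinned PySpark version
    pip install pyspark==3.3.3
    # Leave the virtual environment
    deactivate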

Resolved Issues

None.