
Skip FK-referenced dag_version rows during db clean #68339

Open
ephraimbuddy wants to merge 1 commit into apache:main from astronomer:fix-db-clean-dag-version-fk-pinned-66177

@ephraimbuddy

airflow db clean on the dag_version table selected old, non-latest versions for deletion regardless of whether they were still referenced. Because task_instance.dag_version_id is ON DELETE RESTRICT, deleting a version that a task instance still references violates the foreign key constraint, so the command could not prune dag_version at all for any DAG with history.

Add a generic skip_if_referenced option to the cleanup table config that excludes rows still referenced by a given (table, fk_column) via a correlated NOT EXISTS, and apply it to dag_version for task_instance.dag_version_id. Cleanup now prunes only orphaned older versions and makes progress as task instances age out and are cleaned.
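To illustrate the failure and the fix outside Airflow, here is a minimal sketch using Python's built-in sqlite3 module. The two-column schema, the sample rows, and the use of MAX(id) as the "latest version" are simplifying assumptions for this example; the actual change builds its filter with SQLAlchemy inside db_cleanup.py and may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript(
    """
    CREATE TABLE dag_version (
        id INTEGER PRIMARY KEY,
        dag_id TEXT NOT NULL,
        created_at TEXT NOT NULL
    );
    CREATE TABLE task_instance (
        id INTEGER PRIMARY KEY,
        dag_version_id INTEGER REFERENCES dag_version(id) ON DELETE RESTRICT
    );
    -- Version 1 is pinned by a task instance, version 2 is orphaned,
    -- version 3 is the latest (simplified here to the highest id).
    INSERT INTO dag_version VALUES
        (1, 'example', '2024-01-01'),
        (2, 'example', '2024-02-01'),
        (3, 'example', '2024-03-01');
    INSERT INTO task_instance VALUES (10, 1);
    """
)

cutoff = "2024-02-15"

# Old behaviour (simplified): delete every old, non-latest version.
# The pinned row makes the whole statement fail.
try:
    conn.execute(
        "DELETE FROM dag_version "
        "WHERE created_at < ? AND id != (SELECT MAX(id) FROM dag_version)",
        (cutoff,),
    )
    naive_failed = False
except sqlite3.IntegrityError:
    naive_failed = True
    conn.rollback()

# New behaviour (simplified): a correlated NOT EXISTS skips referenced rows.
deleted = conn.execute(
    """
    DELETE FROM dag_version
    WHERE created_at < ?
      AND id != (SELECT MAX(id) FROM dag_version)
      AND NOT EXISTS (
          SELECT 1 FROM task_instance ti
          WHERE ti.dag_version_id = dag_version.id
      )
    """,
    (cutoff,),
).rowcount
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT id FROM dag_version ORDER BY id")]
```

In this sketch the first DELETE raises an integrity error because version 1 is still referenced, while the filtered DELETE removes only the orphaned version 2 and leaves versions 1 and 3 in place.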

related: #66177


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: claude opus 4.8 following the guidelines

Copilot AI left a comment


Pull request overview

This PR improves airflow db clean behavior for dag_version by preventing deletion attempts of rows that are still referenced by task_instance.dag_version_id (an ON DELETE RESTRICT FK), allowing cleanup to make progress by pruning only orphaned, older DAG versions.

Changes:

  • Add a generic skip_if_referenced (and referenced_pk_column) option to db cleanup table configuration, implemented as a correlated NOT EXISTS filter in _build_query().
  • Apply this option to dag_version to skip versions still referenced by task_instance.dag_version_id.
  • Add a unit regression test to ensure pinned dag_version rows are skipped while orphaned old versions are deleted.
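The changes listed above can be sketched as a config-driven filter. The option names skip_if_referenced and referenced_pk_column come from this PR, but the CleanupTableConfig class, the build_delete_sql helper, the "id" default, and the string-built SQL are hypothetical stand-ins for illustration; Airflow's real _build_query() works on SQLAlchemy objects and its config class may be shaped differently.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CleanupTableConfig:
    """Hypothetical stand-in for a cleanup table config entry."""

    table_name: str
    recency_column: str
    # Assumed shape: (referencing_table, fk_column), or None to disable.
    skip_if_referenced: tuple[str, str] | None = None
    # Assumed default: the referenced primary key column is "id".
    referenced_pk_column: str = "id"


def build_delete_sql(cfg: CleanupTableConfig) -> str:
    """Build a DELETE that skips rows still referenced by the configured FK."""
    sql = f"DELETE FROM {cfg.table_name} WHERE {cfg.recency_column} < :cutoff"
    if cfg.skip_if_referenced is not None:
        ref_table, fk_column = cfg.skip_if_referenced
        sql += (
            f" AND NOT EXISTS (SELECT 1 FROM {ref_table}"
            f" WHERE {ref_table}.{fk_column}"
            f" = {cfg.table_name}.{cfg.referenced_pk_column})"
        )
    return sql


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE dag_version (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL);
    CREATE TABLE task_instance (
        id INTEGER PRIMARY KEY,
        dag_version_id INTEGER REFERENCES dag_version(id) ON DELETE RESTRICT
    );
    INSERT INTO dag_version VALUES (1, '2024-01-01'), (2, '2024-02-01'), (3, '2024-12-01');
    INSERT INTO task_instance VALUES (10, 1);
    """
)

cfg = CleanupTableConfig(
    table_name="dag_version",
    recency_column="created_at",
    skip_if_referenced=("task_instance", "dag_version_id"),
)
sql = build_delete_sql(cfg)
params = {"cutoff": "2024-06-01"}

# First pass: version 2 is orphaned and goes; version 1 is pinned and stays.
first_pass = conn.execute(sql, params).rowcount

# Once the referencing task instance has been cleaned, the pin is gone.
conn.execute("DELETE FROM task_instance WHERE id = 10")
second_pass = conn.execute(sql, params).rowcount
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT id FROM dag_version ORDER BY id")]
```

Each pass deletes one row in this sketch, which mirrors the description's point that cleanup makes progress as task instances age out. Keeping the filter in the table config rather than hard-coding it for dag_version means another table with a restricting foreign key could opt in the same way.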

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| airflow-core/src/airflow/utils/db_cleanup.py | Adds skip_if_referenced filtering to the cleanup query builder and configures dag_version to use it. |
| airflow-core/tests/unit/utils/test_db_cleanup.py | Adds a regression test verifying dag_version cleanup skips FK-pinned versions but prunes orphaned old ones. |

Comment thread airflow-core/src/airflow/utils/db_cleanup.py
Comment thread airflow-core/tests/unit/utils/test_db_cleanup.py Outdated
@ephraimbuddy ephraimbuddy force-pushed the fix-db-clean-dag-version-fk-pinned-66177 branch from a9023cd to 5a53021 Compare June 10, 2026 14:19
@ephraimbuddy ephraimbuddy force-pushed the fix-db-clean-dag-version-fk-pinned-66177 branch from 5a53021 to bec56dc Compare June 10, 2026 17:04