Ensure that DB migrations handles all kinds of NaN values in historical xcoms #57866
Merged
Conversation
amoghrajesh
commented
Nov 5, 2025
ashb
reviewed
Nov 5, 2025
Contributor
Author
Alright, with head at 0587a26, I tried both Postgres and MySQL, and it works fine on both, using the same testing instructions as in the PR description with this DAG:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def push_nan_to_xcom(**kwargs):
    # This list holds a dict with a native NaN, which is not valid JSON
    value = [{"day": "2024-06-07", "ArticleCountMetric": float("nan")}]
    kwargs["ti"].xcom_push(key="bad_json", value=value)


def number_of_people(**kwargs):
    # No native NaN here: "NaN" only appears as a substring of a string value
    list_of_people_in_space = [
        {"craft": "Tiangong", "name": "Ye GuangfuNaN"},
    ]
    kwargs["ti"].xcom_push(key="people_in_space", value=list_of_people_in_space)


def long_url(**kwargs):
    # No native NaN here either: "NaN" appears inside the signed URL string
    value = {"name": "advisors-ndjson-20250107151944.ndjson.gz", "mime_type": "application/gzip", "data_type": "advisors-ndjson", "md5": "1f7b41a00548bdee85a3cd02c02efbc8", "size": 770854, "created_at": "2025-01-07T14:19:45.057179Z", "url": "https://storage.googleapis.com/prod-gain-bulk-files/advisors-ndjson-20250107151944.ndjson.gz?Expires=1736309071&GoogleAccessId=prod-gain-sa%40gain-prod-414309.iam.gserviceaccount.com&Signature=bxojFAGIn%2B5R1xlSeV91XFGA1ZBSINaNxKZOVHdaezneaFxvQ9TPiTJ%2BIfdZBJhZg8bEuXGIOg5xJ7U0Gu1%2Fe5R52JhH81SzkvshxUBZGaHrKKAVauXrxjzvgJ39QpUrOiYAs4GSq4MNYu1ZvVfOO8q%2B3sdO3X6z2QXbfbwXXVoMmZP4XNuiQRJWSNbDanlLDNEZqotYA%3D%3D", "schema": "advisors", "s3_path_latest": "bulk_files/advisors/historical/dt=2025-01-07/advisors-ndjson-20250107151944.ndjson.gz"}
    kwargs["ti"].xcom_push(key="long_url", value=value)


def array_nan(**kwargs):
    # This list contains a bare native NaN, which is not valid JSON
    value = [float("nan")]
    kwargs["ti"].xcom_push(key="array_nan", value=value)


with DAG(
    dag_id="xcom_nan_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    t0 = PythonOperator(
        task_id="t1",
        python_callable=push_nan_to_xcom,
        provide_context=True,
    )
    t1 = PythonOperator(
        task_id="t2",
        python_callable=number_of_people,
        provide_context=True,
    )
    t2 = PythonOperator(
        task_id="t3",
        python_callable=long_url,
        provide_context=True,
    )
    t3 = PythonOperator(
        task_id="t4",
        python_callable=array_nan,
        provide_context=True,
    )
    [t0, t1, t2, t3]

And for 2.10.0 -> main (3.2.0)
Contributor
Tested this; looks good to me.
vatsrahul1001
approved these changes
Nov 5, 2025
Contributor
Backport failed to create: v3-1-test. View the failure log. Run details.
You can attempt to backport this manually by running:
    cherry_picker 5168e62 v3-1-test
This should apply the commit to the v3-1-test branch and leave the commit in a conflict state. After you have resolved the conflicts, you can continue the backport process by running:
    cherry_picker --continue
amoghrajesh
added a commit
to amoghrajesh/airflow
that referenced
this pull request
Nov 5, 2025
…al xcoms (apache#57866) Ensure that DB migrations handles all kinds of NaN values in historical xcoms (cherry picked from commit 5168e62)
Contributor
Author
Manual cherry-pick: #57893
xchwan
pushed a commit
to xchwan/airflow
that referenced
this pull request
Nov 6, 2025
…al xcoms (apache#57866) Ensure that DB migrations handles all kinds of NaN values in historical xcoms
Copilot AI
pushed a commit
to jason810496/airflow
that referenced
this pull request
Dec 5, 2025
…al xcoms (apache#57866) Ensure that DB migrations handles all kinds of NaN values in historical xcoms
jedcunningham
added a commit
to astronomer/airflow
that referenced
this pull request
Mar 2, 2026
XCom values containing float('nan'), float('inf'), or float('-inf')
caused the database migration to silently corrupt data or fail
outright when upgrading. Three bugs were present across backends:
- Consecutive tokens (e.g. [NaN, NaN]) were only partially replaced,
leaving bare NaN/Infinity in the output and breaking the JSON cast.
- Infinity and -Infinity were not handled at all — only NaN was.
- Bare top-level values (a single NaN or Infinity, not inside a list
or dict) were not matched and passed through unconverted.
MySQL also had two bugs in the replacement query that caused it to produce
the wrong output (one of these was pre-existing from apache#57866).
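The failure modes listed in this commit message can be reproduced outside the database. What follows is a minimal Python sketch, not the migration's actual SQL: the function names `naive_sanitize` and `sanitize`, and the replacement strings `"nan"`, `"inf"` and `"-inf"`, are hypothetical and chosen only for illustration. The naive version consumes the delimiter around each token, so the second of two adjacent tokens is never matched; the second version scans token by token and skips string literals.

```python
import json
import re

# Assumed replacement strings (hypothetical; the real migration may differ).
_REPLACEMENTS = {"NaN": '"nan"', "Infinity": '"inf"', "-Infinity": '"-inf"'}

# Match either a complete JSON string literal (returned unchanged) or a bare
# non-finite token. Strings are matched first so "NaN" inside them is skipped.
_TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|-?Infinity|NaN')


def naive_sanitize(raw: str) -> str:
    """Illustrates the bug: the delimiters around NaN are consumed by the
    match, so an adjacent token can no longer match its own leading delimiter."""
    return re.sub(r'([\[,:]\s*)NaN(\s*[,\]}])', r'\1"nan"\2', raw)


def sanitize(raw: str) -> str:
    """Replace bare NaN / Infinity / -Infinity tokens, including consecutive
    ones and bare top-level values, while leaving string contents alone."""

    def _swap(match: re.Match) -> str:
        token = match.group(0)
        # String literals are not in the table, so they pass through as-is.
        return _REPLACEMENTS.get(token, token)

    return _TOKEN.sub(_swap, raw)


if __name__ == "__main__":
    for raw in ['[NaN, NaN]', 'NaN', '[Infinity, -Infinity]',
                '{"name": "Ye GuangfuNaN"}']:
        print(raw, "->", naive_sanitize(raw), "|", sanitize(raw))
        json.loads(sanitize(raw))  # strict-JSON parseable after sanitizing
```

Matching whole string literals before looking for bare tokens is one way to keep values such as "Ye GuangfuNaN" intact; a database-side fix would have to achieve the same effect with the regular-expression features each backend offers.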
vatsrahul1001
added a commit
that referenced
this pull request
Mar 3, 2026
Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com>
jedcunningham
added a commit
to astronomer/airflow
that referenced
this pull request
Mar 3, 2026
Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com>
(cherry picked from commit 7a301e6)
jedcunningham
added a commit
that referenced
this pull request
Mar 3, 2026
…2760)
(cherry picked from commit 7a301e6)
Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com>
vatsrahul1001
added a commit
that referenced
this pull request
Mar 4, 2026
…2760)
(cherry picked from commit 7a301e6)
Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com>
dominikhei
pushed a commit
to dominikhei/airflow
that referenced
this pull request
Mar 11, 2026
Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com>
When upgrading from 2.x to 3.x, the migration failed if the DB held XCom values containing NaN, either as a native NaN or inside a string. #57614 attempted to fix this, but it looks like that change broke the native NaN case.
This PR attempts the fix again, properly, so that all of these cases are handled.
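To make the two cases concrete, here is a small Python sketch using only the standard `json` module. It is an illustration of the data shapes involved, not code from this PR; the helper name `strict_loads` is hypothetical. Python's encoder emits a bare `NaN` token by default, which strict JSON parsers (such as a database JSON cast) reject, whereas "NaN" inside a string is ordinary, valid JSON.

```python
import json

# Native NaN: Python's default encoder writes a bare, non-standard token.
native = json.dumps([float("nan")])

# "NaN" within a string: plain JSON, nothing special about it.
in_string = json.dumps({"name": "Ye GuangfuNaN"})


def strict_loads(raw: str):
    """Parse JSON but reject NaN / Infinity / -Infinity, approximating what a
    strict JSON parser would do (hypothetical helper for illustration)."""

    def _reject(name: str):
        raise ValueError(f"non-standard JSON constant: {name}")

    return json.loads(raw, parse_constant=_reject)


if __name__ == "__main__":
    print(native)                   # bare token inside a list
    print(strict_loads(in_string))  # parses without complaint
    try:
        strict_loads(native)
    except ValueError as exc:
        print("rejected:", exc)
```

A fix therefore has to rewrite the bare token while leaving the string case untouched, which is why both shapes appear in the test DAG.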
How was this tested?
DB entries: