pyspark.sql.streaming.DataStreamReader.changes

DataStreamReader.changes(tableName)

Returns the row-level changes (Change Data Capture) from the specified table as a streaming DataFrame. Currently this API is only supported for Data Source V2 tables whose catalog implements TableCatalog.loadChangelog().

Use option() to specify the starting version/timestamp and processing options.

New in version 4.2.0.

Parameters

tableName : str
    Name of the table.

Returns

DataFrame
    A streaming DataFrame of the table's row-level changes.

Notes

This API is evolving.

Examples

>>> spark.readStream.option("startingVersion", "10").changes(
...     "my_table"
... )
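
The same call can be wrapped in a small helper when several tables are read the same way. The following is only a sketch: `read_table_changes` is a hypothetical name, not part of PySpark, and it assumes the table's catalog supports changelogs as described above. It uses only `readStream`, `option()` and `changes()` with the `startingVersion` option from the example.

```python
def read_table_changes(spark, table_name, starting_version):
    """Return a streaming DataFrame of row-level changes for table_name.

    Hypothetical helper: `spark` is expected to be a SparkSession (or any
    object exposing a compatible `readStream` attribute).
    """
    # The option value is passed as a string, as in the example above.
    reader = spark.readStream.option("startingVersion", str(starting_version))
    # changes() takes the table name and returns the streaming DataFrame.
    return reader.changes(table_name)
```

With a real session this would be called as `read_table_changes(spark, "my_table", 10)`. Because the result is a streaming DataFrame, no data is read until a streaming query is started on it, for example through `writeStream`.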