Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
0.7.2 - 2026-03-24
0.7.0 - 2025-12-10
Changed
- Support for pyspark>=4
- Support for python>=3.12
- Build backend to uv_build
- Support for Databricks Runtime 17.3 LTS
0.6.0 - 2025-08-22
Added
- Support for running delta_utils.geocoding.lookup_country on serverless
Changed
- delta_utils.geocoding.lookup_country now uses DataFrame.mapInArrow instead of DataFrame.rdd.mapPartitions, which incidentally improved performance
0.5.0 - 2024-12-18
Added
- Warning message when initializing a DeltaChanges class when the table is not a delta table
- New geocoding module to resolve country and address for a geo position
- New SSM module to fetch data from AWS parameter store
- Function drop_all_parameters_null_columns
- Function location_for_hive_table
- Lineage class to get downstream_tables
- Upgrade to Python 3.10 and above
Fixed
- DeltaTable.isDeltaTable does not seem to work with Unity Catalog, so delta_utils.core.last_written_timestamp_for_delta_path now tries to get the last timestamp regardless of whether the dataset exists or is a delta table. It returns None and prints an error message if it cannot get the last timestamp
0.4.0 - 2022-11-25
Changed
- Fileregistry is deprecated: with Unity Catalog, S3 works entirely differently and using boto3 is not possible
- DeltaChanges works with Unity Catalog table names
0.3.0 - 2022-05-04
Added
- GitHub Action changelog.yml to check if the CHANGELOG.md file is being changed in the pull request
- Nested names option for flatten function
0.2.1 - 2022-04-21
Added
- delta_utils.clean.flatten to flatten a dataframe
- delta_utils.clean.fix_invalid_column_names to remove invalid characters in column names
0.2.0 - 2022-04-20
Fixed
- Force readthedocs to use mkdocs>=1.3.0
0.1.1 - 2022-03-31
Added
- delta_utils.fileregistry.S3FullScan.remove_file_paths to delete rows in the File Registry
Changed
- delta_utils.fileregistry.S3FullScan.clear is renamed to clear_dates
0.1.0 - 2022-03-30
Added
- delta_utils.core.spark_current_timestamp function to return the Spark server timestamp (resolves race conditions)
- delta_utils.fileregistry.S3FullScan class to scan an S3 bucket by prefix and suffix, which keeps you from loading already processed files
Changed
- delta_utils.core.read_change_feed checks whether delta.enableChangeDataFeed is set to true, otherwise it raises a ReadChangeFeedDisabled exception
- delta_utils.utils.DeltaChanges and delta_utils.utils.NonDeltaLastWrittenTimestamp also raise this exception
0.0.1 - 2022-03-13
Added
- First working code
- Tests
- Documentation