
feat: extensible transforms pipeline for zarr build #80

Draft
turban wants to merge 2 commits into main from feat/transforms-pipeline

@turban commented May 8, 2026

Summary

  • Replaces the hardcoded _UNIT_CONVERSIONS dict and pre_process list with a single transforms pipeline in the dataset YAML
  • Each entry is a dotted-path callable (string or {function, params} dict), resolved at runtime the same way ingestion.function works
  • Adds src/climate_api/transforms/ with two built-in transforms: convert_units and deaccumulate_era5
  • Updates era5_land.yaml to use transforms: for both temperature and precipitation datasets
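
For reviewers unfamiliar with the pattern, the dotted-path resolution described in the summary could look roughly like this. It is a minimal sketch, not the code in this PR: `resolve_callable`, `normalize_entry`, and `TransformEntry` are hypothetical names, and it assumes the last dot separates the module from the attribute and that `params` is an optional mapping of keyword arguments.

```python
from importlib import import_module
from typing import Any, Callable, Dict, Tuple, Union

# Hypothetical type for one `transforms:` entry: a dotted path string, or a
# {function, params} mapping as described in the summary.
TransformEntry = Union[str, Dict[str, Any]]


def resolve_callable(dotted_path: str) -> Callable[..., Any]:
    """Import 'package.module.attr' and return the attribute (assumed helper)."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected a dotted path, got {dotted_path!r}")
    target = getattr(import_module(module_path), attr_name)
    if not callable(target):
        raise TypeError(f"{dotted_path!r} is not callable")
    return target


def normalize_entry(
    entry: TransformEntry,
) -> Tuple[Callable[..., Any], Dict[str, Any]]:
    """Turn either entry form into a (callable, params) pair."""
    if isinstance(entry, str):
        return resolve_callable(entry), {}
    return resolve_callable(entry["function"]), dict(entry.get("params") or {})


# Standard-library callables stand in for real transforms here.
func, params = normalize_entry({"function": "math.pow", "params": {}})
print(func(2.0, 3.0, **params))
```

Normalizing both entry forms to one `(callable, params)` shape keeps the code that runs the pipeline free of type checks.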

Closes #79

Usage

transforms:
  - climate_api.transforms.deaccumulate_era5
  - climate_api.transforms.convert_units

External transforms from dhis2eo or any other package can be referenced by dotted path without changes to core code.
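
As an illustration of what an external transform and the pipeline loop might look like, here is a sketch under stated assumptions. The PR text does not specify the transform signature, so this assumes each transform takes a dataset plus keyword parameters and returns a new dataset. `scale_variable` and `run_pipeline` are hypothetical names, and a plain dict stands in for the real dataset type.

```python
from typing import Any, Callable, Dict, List, Tuple

# Stand-in for the project's dataset type; an assumption for this sketch only.
Dataset = Dict[str, List[float]]
Step = Tuple[Callable[..., Dataset], Dict[str, Any]]


def scale_variable(ds: Dataset, *, variable: str, factor: float) -> Dataset:
    """Hypothetical external transform: multiply one variable by a factor."""
    out = dict(ds)  # shallow copy so the input mapping is left untouched
    out[variable] = [value * factor for value in ds[variable]]
    return out


def run_pipeline(ds: Dataset, steps: List[Step]) -> Dataset:
    """Apply each (callable, params) step in order, feeding results forward."""
    for func, params in steps:
        ds = func(ds, **params)
    return ds


# Example: convert a precipitation-like variable from metres to millimetres.
source = {"tp": [0.5, 2.0]}
steps: List[Step] = [(scale_variable, {"variable": "tp", "factor": 1000.0})]
result = run_pipeline(source, steps)
print(result)
```

Because steps run in list order, the order of entries under `transforms:` matters. In the usage example above, deaccumulation is listed before unit conversion.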

Test plan

  • uv run pytest tests/test_transforms.py — 12 new tests covering unit conversion, deaccumulation, pipeline execution, and edge cases
  • uv run pytest — full suite passes (153 + 12 tests)

@turban marked this pull request as draft May 8, 2026 12:37

Development

Successfully merging this pull request may close these issues.

feat: replace convert_units and pre_process with a unified transforms pipeline
