Conversation
There was a problem hiding this comment.
Pull request overview
Exposes DataFusion’s map-related scalar functions to the Python API (wrapping new Rust bindings) and adds Python-level helpers plus unit tests.
Changes:
- Add Rust
pyo3bindings formake_mapand map accessors (map_keys,map_values,map_extract,map_entries). - Add Python wrappers (
map,make_map,map_keys,map_values,map_extract,map_entries,element_at) and export them via__all__. - Add unit tests covering the new map construction and accessor functions.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
crates/core/src/functions.rs |
Adds make_map binding and registers map accessor functions in the internal Rust module. |
python/datafusion/functions.py |
Adds Python-facing wrappers for map functions and updates exported symbol list. |
python/tests/test_functions.py |
Adds unit tests validating map creation and map accessor behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ntries, element_at) Closes apache#1448 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
element_at is already a Python-only alias for map_extract, so the Rust binding is unnecessary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
make_map now takes a dict for the common case and also supports separate keys/values lists for column expressions. Non-Expr keys and values are automatically converted to literals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
map() now supports three calling conventions matching upstream:
- map({"a": 1, "b": 2}) — from a Python dictionary
- map([keys], [values]) — two lists that get zipped
- map(k1, v1, k2, v2, ...) — variadic key-value pairs
Non-Expr keys and values are automatically converted to literals.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add examples for all three map() calling conventions - Use clearer descriptions instead of jargon (no "zipped" or "variadic") - Break map_keys/map_values/map_extract/map_entries examples into two steps: create the map column first, then call the function Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove map() function that shadowed Python builtin; make_map() is now the sole entry point for creating map expressions - Fix map_extract/element_at docstrings: missing keys return [None], not an empty list (matches actual upstream behavior) - Add length validation for the two-list calling convention - Update all tests and docstring examples accordingly Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3b55137 to
dbe4d2e
Compare
Reduce boilerplate by combining make_map construction tests and map accessor function tests into two @pytest.mark.parametrize groups. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@nuno-faria I think this one is good to go if you have time. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
nuno-faria
left a comment
There was a problem hiding this comment.
Thanks @timsaucer, I leave some additional suggestions below.
Co-authored-by: Nuno Faria <nunofpfaria@gmail.com>
Co-authored-by: Nuno Faria <nunofpfaria@gmail.com>
Co-authored-by: Nuno Faria <nunofpfaria@gmail.com>
Co-authored-by: Nuno Faria <nunofpfaria@gmail.com>
Which issue does this PR close?
Closes #1448
Rationale for this change
These functions were not exposed but are available upstream.
What changes are included in this PR?
Expose functions.
Add python wrappers.
Add unit tests.
Are there any user-facing changes?
New addition only.
Disclaimer
Most of this code was written with an AI agent, but every line has been reviewed by me and I am willing to discuss all design decisions.