Add missing SessionContext utility methods #1475
timsaucer wants to merge 4 commits into apache:main
Expose upstream DataFusion v53 utility methods: session_start_time, enable_ident_normalization, parse_sql_expr, execute_logical_plan, refresh_catalogs, remove_optimizer_rule, and table_provider. The add_optimizer_rule and add_analyzer_rule methods are omitted as the OptimizerRule and AnalyzerRule traits are not yet exposed to Python. Closes apache#1459. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
This PR exposes additional SessionContext utility/introspection methods in the datafusion-python API to match capabilities available in upstream DataFusion v53 (Issue #1459), and adds unit tests to cover the new Python surface area.
Changes:
- Added Python SessionContext wrappers for: session_start_time, enable_ident_normalization, parse_sql_expr, execute_logical_plan, refresh_catalogs, remove_optimizer_rule, and table_provider.
- Added corresponding Rust binding methods on PySessionContext to call into DataFusion v53 APIs.
- Added unit tests validating the new Python methods.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| python/datafusion/context.py | Adds new SessionContext methods to the public Python API and wraps internal bindings (Expr, DataFrame, Table). |
| crates/core/src/context.rs | Exposes the underlying DataFusion SessionContext methods via the PyO3 PySessionContext bindings. |
| python/tests/test_context.py | Adds tests for the newly exposed SessionContext methods. |
Examples:
>>> ctx = SessionContext()
>>> start_time = ctx.session_start_time()
>>> assert "T" in start_time  # RFC 3339 contains a 'T' separator
This assert feels a little odd, what about showing a result?
>>> ctx.session_start_time()
'2026-01-01T12:34:56.123456789+00:00'
Examples:
>>> ctx = SessionContext()
>>> assert isinstance(ctx.enable_ident_normalization(), bool)
Same thing here:
- >>> assert isinstance(ctx.enable_ident_normalization(), bool)
+ >>> ctx.enable_ident_normalization()
+ True
from datafusion.expr import Expr  # noqa: PLC0415

return Expr(self.ctx.parse_sql_expr(sql, schema))
I think we could remove the import and the wrapping with Expr.
from datafusion.catalog import Table  # noqa: PLC0415

return Table(self.ctx.table_provider(name))
Same here: I think we can remove the Table wrapper and the import.
assert isinstance(st, str)
assert "T" in st  # RFC 3339 format
What about this? The conversion should fail if the string is badly formatted.
- assert isinstance(st, str)
- assert "T" in st  # RFC 3339 format
+ dt.datetime.fromisoformat(st).isoformat()
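To illustrate why parsing is a stronger check than looking for a 'T', here is a minimal standard-library sketch. The helper name `is_valid_timestamp` and both sample strings are hypothetical; they are not real output from session_start_time().

```python
import datetime as dt


def is_valid_timestamp(value: str) -> bool:
    """Return True if `value` parses with datetime.fromisoformat."""
    try:
        dt.datetime.fromisoformat(value)
    except ValueError:
        # fromisoformat raises ValueError for strings it cannot parse.
        return False
    return True


# Hypothetical sample values, chosen only for illustration.
well_formed = "2026-01-01T12:34:56.123456+00:00"
malformed = "NOT A TIMESTAMP"

# The substring check accepts the malformed value because it contains a 'T'.
substring_check_passes = "T" in malformed

# The parsing check accepts only the well-formed value.
parse_ok = is_valid_timestamp(well_formed)
parse_rejects = not is_valid_timestamp(malformed)
```

One caveat worth checking against the real output: as far as I know, Python versions before 3.11 only accept 3 or 6 fractional-second digits in fromisoformat, so a nanosecond-precision string may need a newer interpreter for this assertion to pass.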
>>> expr = ctx.parse_sql_expr("1 + 2", schema)
>>> assert "Int64(1) + Int64(2)" in str(expr)
- >>> expr = ctx.parse_sql_expr("1 + 2", schema)
- >>> assert "Int64(1) + Int64(2)" in str(expr)
+ >>> ctx.parse_sql_expr("1 + 2", schema)
+ Expr(Int64(1) + Int64(2))
schema = DFSchema.empty()
expr = ctx.parse_sql_expr("1 + 2", schema)
assert "Int64(1) + Int64(2)" in str(expr)
- assert "Int64(1) + Int64(2)" in str(expr)
+ assert str(expr) == "Expr(Int64(1) + Int64(2))"
def test_remove_optimizer_rule(ctx):
    assert ctx.remove_optimizer_rule("nonexistent_rule") is False
Testing with a rule that exists as well:
- assert ctx.remove_optimizer_rule("nonexistent_rule") is False
+ assert ctx.remove_optimizer_rule("push_down_filter")
+ assert ctx.remove_optimizer_rule("nonexistent_rule") is False
result = ctx.enable_ident_normalization()
assert isinstance(result, bool)
I think it's better to change the value and check it.
- result = ctx.enable_ident_normalization()
- assert isinstance(result, bool)
+ assert ctx.enable_ident_normalization()
+ ctx.sql("set datafusion.sql_parser.enable_ident_normalization = false")
+ assert ctx.enable_ident_normalization() is False
Unrelated, but the original method name is a bit misleading: it does not enable the flag, it only returns its current value.
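A small sketch of why the change-and-check test is stronger than the isinstance test. Everything here is a hypothetical stand-in written for illustration (StubContext, BrokenContext, set_option, and both check functions); none of it is the datafusion API. Only the option key is taken from the suggestion above.

```python
KEY = "datafusion.sql_parser.enable_ident_normalization"


class StubContext:
    """Hypothetical stand-in for SessionContext, used only in this sketch."""

    def __init__(self) -> None:
        self._options = {KEY: True}

    def set_option(self, key: str, value: bool) -> None:
        self._options[key] = value

    def enable_ident_normalization(self) -> bool:
        # Getter only: reports the current setting, does not change it.
        return self._options[KEY]


class BrokenContext(StubContext):
    """Simulates a binding bug where the getter ignores the configuration."""

    def enable_ident_normalization(self) -> bool:
        return True


def type_only_check(ctx: StubContext) -> bool:
    # Mirrors the original test: only the return type is inspected.
    return isinstance(ctx.enable_ident_normalization(), bool)


def change_and_check(ctx: StubContext) -> bool:
    # Mirrors the suggested test: flip the option and read it back.
    before = ctx.enable_ident_normalization()
    ctx.set_option(KEY, False)
    after = ctx.enable_ident_normalization()
    return before is True and after is False
```

With these stand-ins, the type-only check accepts the broken getter, while the change-and-check version rejects it and still accepts the working one.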
Which issue does this PR close?
Closes #1459
Rationale for this change
These methods exist in the upstream repository but have not been exposed to Python.
What changes are included in this PR?
Add methods to the Python API
Add unit tests
Are there any user-facing changes?
New addition only.