[Multi-GPU Polars] Introduce Ray mode for multi-GPU cudf-polars execution #21746

Merged
rapids-bot[bot] merged 38 commits into rapidsai:main from madsbk:rapidsmpf-ray
Mar 17, 2026

Conversation

@madsbk
Member

@madsbk madsbk commented Mar 11, 2026

Introduces a Ray-based execution frontend for cudf-polars backed by the RapidsMPF streaming engine, complementing the existing SPMD mode.

@madsbk madsbk self-assigned this Mar 11, 2026
@madsbk madsbk added the improvement and non-breaking labels Mar 11, 2026
@github-actions github-actions Bot added the Python and cudf-polars labels Mar 11, 2026
@GPUtester GPUtester moved this to In Progress in cuDF Python Mar 11, 2026
@madsbk madsbk force-pushed the rapidsmpf-ray branch 3 times, most recently from 2d3e98f to 69eea3c Compare March 12, 2026 12:14
@madsbk madsbk marked this pull request as ready for review March 12, 2026 17:55
@madsbk madsbk requested a review from a team as a code owner March 12, 2026 17:55
@madsbk madsbk requested review from Matt711 and mroeschke March 12, 2026 17:55
Comment thread python/cudf_polars/docs/cudf-polars-mp.md Outdated
Comment thread python/cudf_polars/docs/cudf-polars-mp.md Outdated
Comment on lines +252 to +253
self._comm = None
self._mr = None
Contributor

exit_actor tears down the actor process though, so that should delete the class and hence all these objects.

Comment thread python/cudf_polars/docs/cudf-polars-mp.md Outdated
madsbk and others added 2 commits March 16, 2026 13:09
Co-authored-by: Tom Augspurger <tom.augspurger88@gmail.com>
Co-authored-by: Lawrence Mitchell <wence@gmx.li>
@madsbk madsbk added the breaking label and removed the non-breaking label Mar 16, 2026
@madsbk madsbk requested a review from wence- March 16, 2026 13:34
@wence-
Contributor

wence- commented Mar 16, 2026

Needs #21787 first.

@madsbk madsbk added the DO NOT MERGE label Mar 16, 2026
Member

@rjzamora rjzamora left a comment

Flushing some comments - Still planning to read through this a bit more.

Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/core.py
Comment thread python/cudf_polars/cudf_polars/utils/config.py
Contributor

@Matt711 Matt711 left a comment

A few questions / suggestions

Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/frontend/ray.py Outdated
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/frontend/ray.py Outdated
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/frontend/spmd.py Outdated
@madsbk madsbk requested a review from Matt711 March 16, 2026 18:18
@madsbk madsbk removed the DO NOT MERGE label Mar 16, 2026
Member

@rjzamora rjzamora left a comment

Thanks for working on this Mads!

Comment thread python/cudf_polars/docs/cudf-polars-mp.md
assert config_options.executor.runtime == "rapidsmpf", "Runtime must be rapidsmpf"

# Lower the IR graph on the client process (for now).
ir, partition_info, stats = lower_ir_graph(ir, config_options)
Member

@rjzamora rjzamora Mar 16, 2026

Just a note that we may use the GPU during this lowering stage, so the client does technically need a GPU. (I don't think you say otherwise anywhere in this PR, but I think you may have in some discussion about/within #21769)

Member Author

Good point. Can we avoid the GPU requirement by calling lower_ir_graph on each worker instead of on the client?

Member

Yes, I think we can/should move in that direction. I don't think it will be difficult, but it's not completely trivial either. If we just move the lowering to the workers "as is", the workers will all sample the same parquet metadata/row-groups redundantly. We probably want to tweak the logic so that the workers target different information and allgather the results (so the workers sample more efficiently and make consistent decisions). I don't think it's a problem to do this after you have the initial frontends in place.
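
For illustration only, the "sample different information, then allgather" idea could look roughly like the sketch below. It is a plain-Python simulation, not cudf-polars or RapidsMPF code: `RowGroupStats`, `sample_local`, `allgather`, `choose_partition_count`, and `simulate` are all hypothetical names, the round-robin assignment is just one possible way to split the sampling work, and the `allgather` helper is a stand-in for a real collective.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowGroupStats:
    """Hypothetical per-row-group metadata that a worker might sample."""

    num_rows: int
    num_bytes: int


def sample_local(row_groups, rank, nranks):
    """Sample only the row groups assigned to this rank (round-robin split)."""
    mine = row_groups[rank::nranks]
    total_rows = sum(rg.num_rows for rg in mine)
    total_bytes = sum(rg.num_bytes for rg in mine)
    return (total_rows, total_bytes, len(mine))


def allgather(local_results):
    """Stand-in for a real collective: every rank receives every rank's result."""
    return [list(local_results) for _ in local_results]


def choose_partition_count(gathered, target_partition_bytes):
    """Deterministic decision from the gathered samples, identical on every rank."""
    total_bytes = sum(nbytes for _, nbytes, _ in gathered)
    # Ceiling division, with at least one partition.
    return max(1, -(-total_bytes // target_partition_bytes))


def simulate(row_groups, nranks, target_partition_bytes):
    """Run all ranks in one process and return each rank's decision."""
    local = [sample_local(row_groups, r, nranks) for r in range(nranks)]
    per_rank_view = allgather(local)
    return [
        choose_partition_count(view, target_partition_bytes)
        for view in per_rank_view
    ]
```

With ten row groups of 1000 bytes each, four ranks, and a 3000-byte target, `simulate` returns `[4, 4, 4, 4]`: each row group is inspected by exactly one rank, and every rank reaches the same partition count because it decides from the same gathered data.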

Member Author

@madsbk madsbk Mar 18, 2026

Added an issue: #21841

@madsbk
Member Author

madsbk commented Mar 17, 2026

/merge

@rapids-bot rapids-bot Bot merged commit 001653a into rapidsai:main Mar 17, 2026
165 of 167 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in cuDF Python Mar 17, 2026
@madsbk madsbk deleted the rapidsmpf-ray branch March 17, 2026 07:44
Labels

breaking: Breaking change
cudf-polars: Issues specific to cudf-polars
improvement: Improvement / enhancement to an existing function
Python: Affects Python cuDF API.

Projects

Archived in project

6 participants