[DPD] DPD CRD changes with version bump to v3.2 #427

Open
rgadagot wants to merge 1 commit into aws:main from rgadagot:main

Conversation

rgadagot (Contributor) commented Jun 3, 2026

What's changing and why?

This release adds Disaggregated Prefill and Decoding (DPD) support to HyperPod Inference, which separates the prefill and decode phases onto dedicated GPU pods. The change extends the existing Inference Operator with four pieces: role-specific Kubernetes Deployments (prefiller and decoder), GPU-to-GPU KV cache transfer via NIXL over EFA, conditional routing based on input length, and independent autoscaling of each role.
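To illustrate what "conditional routing based on input length" could look like, here is a minimal Python sketch. It is not taken from this PR: the threshold value, the `Route` type, and the `choose_route` function are hypothetical names, and the rule that short prompts skip the prefiller is an assumption about how such routing commonly works. Only the role names `prefiller` and `decoder` come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff; the PR does not state the real value or where it is configured.
PREFILL_THRESHOLD_TOKENS = 512


@dataclass(frozen=True)
class Route:
    """Which role handles each phase of a request (illustrative only)."""
    prefill_role: Optional[str]  # None means the decode pod also performs prefill
    decode_role: str


def choose_route(input_tokens: int,
                 threshold: int = PREFILL_THRESHOLD_TOKENS) -> Route:
    """Pick a route from the prompt length.

    Assumption: prompts at or above the threshold go through a dedicated
    prefiller, whose KV cache is then transferred to a decoder; shorter
    prompts are served by the decoder alone.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    if input_tokens >= threshold:
        return Route(prefill_role="prefiller", decode_role="decoder")
    return Route(prefill_role=None, decode_role="decoder")


if __name__ == "__main__":
    for n in (16, 512, 4096):
        print(n, choose_route(n))
```

Under these assumptions, a long prompt pays the cost of a KV cache transfer in exchange for keeping prefill work off the decode pods, while a short prompt avoids the transfer entirely.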

rgadagot requested a review from a team as a code owner on June 3, 2026, 21:44
Labels

None yet

Projects

None yet

2 participants