WIP: OCPSTRAT-3036: Rebase 1.36 #2653

Open
jubittajohn wants to merge 2959 commits into openshift:master from jubittajohn:rebase-1.36

Conversation

@jubittajohn

@jubittajohn jubittajohn commented Apr 23, 2026

Summary by CodeRabbit

  • New Features

    • Added MutatingAdmissionPolicy and MutatingAdmissionPolicyBinding resources for admission control.
    • Added PodGroup resource (v1alpha2) and Pod scheduling group support.
    • Added DeviceTaintRule and ResourcePoolStatusRequest resources for device management.
    • Introduced alpha sharded list/watch support for improved scalability across multiple API endpoints.
  • Documentation

    • Published release notes for versions 1.35.3 and 1.35.4.
    • Added OpenAPI documentation for new Kubernetes vendor extensions.
  • Chores

    • Updated Go toolchain from 1.25.7 to 1.26.2.
    • Updated CI/CD build configuration and owner aliases.

k8s-ci-robot and others added 30 commits March 21, 2026 08:16
…pool-status

KEP-5677: Add ResourcePoolStatusRequest API for DRA resource availability visibility
KEP-5491: DRA: List Types for Attributes [Alpha]
The fast-delete pod status tests currently require the intentionally failing
"fail" container to report exit code 1. In CI, some runtimes occasionally
report exit code 2 with reason=Error even though the tested invariant still
holds: the container failed and the blocked workload container never started.

The latest dims/test-k8s failure on master showed exactly that state: the pod
remained Failed, Initialized=False, the blocked container reported
started=false, and only the failing init container drifted from exit 1 to exit
2. This matches kubernetes/kubernetes issue 135713 and the related
pending-container history in PR 131605.

Accept exit code 2 in this verifier so the test continues to assert the
behavior it is meant to cover instead of a lower-layer exit-code detail.

Fixes issue 135713

Tested:
- hack/verify-gofmt.sh
- hack/verify-test-code.sh
- hack/verify-typecheck.sh ./test/e2e/node/...
- go test ./test/e2e/node -run TestNonExistent -count=1

Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
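
The relaxed check described in the commit message above can be sketched as follows. This is a minimal illustration, not the upstream verifier: `containerState`, `acceptableFailureExitCodes`, and `verifyFailedInit` are hypothetical names, and the struct is a simplified stand-in for the real container status types.

```go
package main

import "fmt"

// containerState is a hypothetical, simplified stand-in for the container
// status that the e2e verifier inspects.
type containerState struct {
	ExitCode int32
	Reason   string
	Started  bool
}

// acceptableFailureExitCodes lists the exit codes treated as "the container
// failed". Exit code 2 is tolerated alongside exit code 1.
var acceptableFailureExitCodes = map[int32]bool{1: true, 2: true}

// verifyFailedInit asserts the invariant under test: the failing container
// failed, and the blocked workload container never started.
func verifyFailedInit(failing, blocked containerState) error {
	if !acceptableFailureExitCodes[failing.ExitCode] {
		return fmt.Errorf("unexpected exit code %d (reason %q)", failing.ExitCode, failing.Reason)
	}
	if blocked.Started {
		return fmt.Errorf("blocked container started unexpectedly")
	}
	return nil
}

func main() {
	// Exit code 2 with reason=Error is accepted.
	fmt.Println(verifyFailedInit(containerState{ExitCode: 2, Reason: "Error"}, containerState{}))
	// Any other exit code is still rejected.
	fmt.Println(verifyFailedInit(containerState{ExitCode: 137, Reason: "OOMKilled"}, containerState{}))
}
```

Keeping the accepted codes in one set keeps the assertion focused on the behavior (failure plus a blocked workload container) rather than on one specific exit code.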
Replace plain bool with sync/atomic.Bool for the useStreaming field
in remoteRuntimeService and remoteImageService to eliminate a data
race when multiple goroutines concurrently read/write the field
during Unimplemented fallback.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
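
A minimal sketch of the pattern that commit describes, assuming a simplified client: `runtimeClient` and `listItems` are hypothetical stand-ins for `remoteRuntimeService` and its RPC wrappers, and the `streamingSupported` argument simulates whether the server answers Unimplemented.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runtimeClient is a hypothetical stand-in for remoteRuntimeService.
// useStreaming is an atomic.Bool so concurrent readers and writers
// do not race on it.
type runtimeClient struct {
	useStreaming atomic.Bool
}

// listItems tries the streaming path first. When the (simulated) server does
// not support it, the client records that fact and falls back to the unary path.
func (c *runtimeClient) listItems(streamingSupported bool) string {
	if c.useStreaming.Load() {
		if streamingSupported {
			return "streaming"
		}
		// Unimplemented fallback: remember that streaming is unavailable.
		c.useStreaming.Store(false)
	}
	return "unary"
}

func main() {
	c := &runtimeClient{}
	c.useStreaming.Store(true)

	// Several goroutines hit the fallback at once; with a plain bool this
	// concurrent read/write would be reported by the race detector.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.listItems(false)
		}()
	}
	wg.Wait()
	fmt.Println("useStreaming after fallback:", c.useStreaming.Load())
}
```

Running the sketch with `go run -race` is one way to confirm that the atomic field keeps the fallback path race-free.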
…logquery-lock-defualt

[FeatureGate] Promote NodeLogQuery to GA in v1.36 and lock default to `true`
…-flake

Set WithSerial on HPA tests that conflict api registration
…able-tolerance-e2e-deterministic-cpu-load

fix: [sig-autoscaling] flaky HPAConfigurableTolerance e2e should scale up but should not scale down
…-pod-status-exit-2

test/e2e/node: tolerate exit code 2 in pod status flake
gRPC defaults to the DNS resolver for bare targets passed to
NewClient. For CRI socket endpoints, GetAddressAndDialer returns a
socket path plus a custom dialer, but handing the bare path to
grpc.NewClient still lets gRPC resolve the target first.

That breaks unix socket clients with errors like "name resolver error:
produced zero addresses" before the custom dialer ever sees the raw
path. Use the passthrough resolver for socket-style addresses so the
runtime and image clients hand the original endpoint directly to the
custom dialer.

Add a regression test for unix sockets, Windows named pipes, and TCP
addresses.

Precedent:
https://github.com/etcd-io/etcd/blob/v3.3.27/clientv3/client.go#L266-L270
https://github.com/grpc/grpc-go/blob/v1.72.2/dialoptions.go#L448-L451

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
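
The target rewriting described in that commit message can be sketched with a small string helper. This is an illustration under assumptions, not the upstream code: `grpcTarget` is a hypothetical name, only absolute unix socket paths are detected here (the actual change also covers Windows named pipes), and leaving `host:port` addresses unchanged is assumed rather than confirmed by the source.

```go
package main

import (
	"fmt"
	"strings"
)

// grpcTarget returns the string that would be handed to grpc.NewClient.
// Assumption: an absolute filesystem path is a unix socket and is prefixed
// with the passthrough scheme, so gRPC skips DNS resolution and passes the
// raw path to the custom dialer. Anything else is returned unchanged.
func grpcTarget(addr string) string {
	if strings.HasPrefix(addr, "/") {
		return "passthrough:///" + addr
	}
	return addr
}

func main() {
	fmt.Println(grpcTarget("/run/containerd/containerd.sock"))
	fmt.Println(grpcTarget("127.0.0.1:10250"))
}
```

The resulting target would be used together with the custom dialer returned by `GetAddressAndDialer`; the passthrough scheme only changes how gRPC resolves the name, not how the connection is dialed.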
…nt-unix-socket-dialing

cri-client: use passthrough resolver for socket endpoints
KEP-5729: DRA: ResourceClaim Support for Workloads
…g-fixes

[InPlacePodLevelResourcesVerticalScaling] Plr ippr kubelet bug fixes
[InPlacePodLevelResourcesVerticalScaling] Ippr flaky test
[PodLevelResources] Graduate InPlacePodLevelResourcesVerticalScaling feature to beta
cri-client: use atomic.Bool for useStreaming to fix data race
Fix restartable init container startup race
gnufied and others added 25 commits May 13, 2026 23:20
UPSTREAM: <carry>: Add plugin for storage performant security policy
Add featuregate for performantsecuritypolicy for storage

UPSTREAM: <carry>: Feature gates must now declare dependencies, even if there are none.
Signed-off-by: Harshal Patil <12152047+harche@users.noreply.github.com>
Analysis of flakes from the k8s suite has shown consistent examples
of otherwise well-behaved tests failing due to timeouts caused by
temporary load on controllers during parallel testing. Increasing these
timeouts will reduce flakes.
MutableCSINodeAllocatableCount is now enabled in the default feature set,
the tests should succeed just fine.
…imary clusters

  Detect cluster's primary IP family by querying kubernetes.default service
  ClusterIP instead of using HasIPv4/HasIPv6 flags. The previous logic
  incorrectly returned ipv4 for dual-stack v6-primary clusters because
  both HasIPv4 and HasIPv6 were true.

  This matches the upstream approach in test/e2e/e2e.go and fixes DNS tests
  that were querying for A records instead of AAAA records in v6-primary
  environments.
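
A minimal sketch of the detection approach described above, assuming the ClusterIP of the `kubernetes.default` service has already been fetched as a string. `primaryIPFamily` is a hypothetical helper name, and the API call that reads the service is omitted.

```go
package main

import (
	"fmt"
	"net/netip"
)

// primaryIPFamily derives the cluster's primary IP family from the ClusterIP
// of the kubernetes.default service. Unlike a pair of HasIPv4/HasIPv6 flags,
// a single address is unambiguous on dual-stack clusters.
func primaryIPFamily(clusterIP string) (string, error) {
	addr, err := netip.ParseAddr(clusterIP)
	if err != nil {
		return "", fmt.Errorf("parse ClusterIP %q: %w", clusterIP, err)
	}
	if addr.Unmap().Is4() {
		return "ipv4", nil
	}
	return "ipv6", nil
}

func main() {
	// Example addresses are illustrative only.
	for _, ip := range []string{"172.30.0.1", "fd02::1"} {
		family, err := primaryIPFamily(ip)
		fmt.Println(ip, family, err)
	}
}
```

On a dual-stack v6-primary cluster the service's ClusterIP is an IPv6 address, so this returns `ipv6` even though both families are present.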
After openshift/origin#30786 added ibmcloud to the provider switch in
openshift-tests, the provider name is now correctly passed through to
k8s-tests-ext. However, k8s-tests-ext only registers upstream Kubernetes
providers (aws, azure, gce, kubemark, openstack, vsphere) via the
test/e2e/providers.go import. OpenShift-specific providers like ibmcloud
are not registered, causing framework.AfterReadingAllFlags to call
SetupProviderConfig which fails with "Unknown provider" and Exit(1),
crashing every test process.

This registers all OpenShift-specific cloud providers (baremetal, ovirt,
kubevirt, alibabacloud, nutanix, ibmcloud, external) as NullProviders in
k8s-tests-ext. These providers don't require special setup for upstream
kube e2e tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
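
The failure mode and the fix can be illustrated with a toy provider registry. This is a sketch only: `registry`, `registerProvider`, `setupProvider`, and `nullProvider` are hypothetical stand-ins and do not reproduce the e2e framework's actual types or signatures.

```go
package main

import "fmt"

// factory builds a provider and returns a description of it. It is a
// hypothetical stand-in for the framework's provider factory type.
type factory func() (string, error)

// registry maps provider names to factories.
var registry = map[string]factory{}

func registerProvider(name string, f factory) { registry[name] = f }

// setupProvider mimics the lookup that fails for unregistered names.
func setupProvider(name string) (string, error) {
	f, ok := registry[name]
	if !ok {
		return "", fmt.Errorf("unknown provider %q", name)
	}
	return f()
}

// nullProvider needs no special setup.
func nullProvider() (string, error) { return "null provider", nil }

// registerOpenShiftProviders registers the OpenShift-specific names listed
// in the commit message as null providers.
func registerOpenShiftProviders() {
	names := []string{"baremetal", "ovirt", "kubevirt", "alibabacloud", "nutanix", "ibmcloud", "external"}
	for _, name := range names {
		registerProvider(name, nullProvider)
	}
}

func main() {
	_, err := setupProvider("ibmcloud")
	fmt.Println("before registration:", err)

	registerOpenShiftProviders()
	p, err := setupProvider("ibmcloud")
	fmt.Println("after registration:", p, err)
}
```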
…e2e test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The WatchList test “[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] should be requested by metadatainformer when WatchListClient is enabled” works by fetching an expected (initial) state of secrets, starting an informer, and polling until context timeout for the informer to converge to that expected state. If any other secret in the namespace changes while the test is running, they never converge, and the test times out. This change limits the secrets we’re listing to just the ones relevant to the test.
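
Why scoping the list helps can be shown with a small sketch. It assumes the narrowing is done with a label selector, which the commit message does not state; the `secret` type, the `selectByLabel` helper, and the label key and value are all hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// secret is a simplified, hypothetical stand-in for a Secret object.
type secret struct {
	Name   string
	Labels map[string]string
}

// selectByLabel returns the sorted names of secrets carrying key=value.
func selectByLabel(all []secret, key, value string) []string {
	var names []string
	for _, s := range all {
		if s.Labels[key] == value {
			names = append(names, s.Name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	testLabel := map[string]string{"watchlist-test": "true"} // hypothetical label
	initial := []secret{
		{Name: "secret-1", Labels: testLabel},
		{Name: "secret-2", Labels: testLabel},
	}
	// An unrelated secret appears in the namespace while the test runs.
	later := append(append([]secret{}, initial...), secret{Name: "unrelated-token"})

	expected := selectByLabel(initial, "watchlist-test", "true")
	observed := selectByLabel(later, "watchlist-test", "true")

	// The scoped views still match, so the informer can converge.
	fmt.Println("scoped views equal:", reflect.DeepEqual(expected, observed))
	// The unscoped views differ, which is what made the test time out.
	fmt.Println("unscoped counts:", len(initial), len(later))
}
```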
Signed-off-by: jubittajohn <jujohn@redhat.com>
To be squashed with the following commit later: "UPSTREAM: <carry>: Add OpenShift tooling, images, configs and docs"

Signed-off-by: jubittajohn <jujohn@redhat.com>
…er_manager_linux_test.go

Squash into: UPSTREAM: <carry>: disable load balancing on created cgroups when managed is enabled
…s in flagz_test.go and statusz_test.go

Squash into: UPSTREAM: <carry>: apiserver: add system_client=kube-{apiserver,cm,s} to apiserver_request_total
Signed-off-by: Shaza Aldawamneh <shaza.aldawamneh@hotmail.com>
…e when claims.email is used in username expression

Signed-off-by: Shaza Aldawamneh <shaza.aldawamneh@hotmail.com>
…acheGC is enabled

Squash into UPSTREAM: <carry>: create termination events
Could squash into UPSTREAM: <carry>: emit event when readyz goes true
Squash into: UPSTREAM: <carry>: add management support to kubelet
kuberc subcommand is not yet registered in oc. Tests will be re-enabled after oc is bumped to 1.36
To be squashed with the commit UPSTREAM: <carry>: Add OpenShift tooling, images, configs and docs before 1.36 rebase bump merges

Signed-off-by: jubittajohn <jujohn@redhat.com>
… driver when not enabled

The upstream csi-hostpath-plugin.yaml manifest now includes a csi-snapshot-metadata sidecar container and volume (added in k/k#130918). Upstream PR k/k#137057 added conditional stripping of these when CapSnapshotMetadata is not enabled, but only for the upstream hostpathCSIDriver. The OpenShift-specific groupSnapshotHostpathCSIDriver was never updated, causing the driver pod to fail with "secret csi-snapshot-metadata-server-certs not found" and all csi-hostpath-groupsnapshot tests to fail in techpreview jobs.
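
The conditional stripping can be sketched on a simplified pod spec. This is not the code from k/k#137057: `podSpec`, `without`, and `stripSnapshotMetadata` are hypothetical, the spec is reduced to name lists, and the volume name used in `main` is an assumption made for illustration.

```go
package main

import "fmt"

// podSpec is a hypothetical, heavily simplified pod spec that tracks only
// container and volume names.
type podSpec struct {
	Containers []string
	Volumes    []string
}

// without returns a copy of items with every occurrence of name removed.
func without(items []string, name string) []string {
	kept := make([]string, 0, len(items))
	for _, it := range items {
		if it != name {
			kept = append(kept, it)
		}
	}
	return kept
}

// stripSnapshotMetadata drops the sidecar container and its volume when the
// snapshot-metadata capability is not enabled; otherwise the spec is returned
// unchanged.
func stripSnapshotMetadata(spec podSpec, enabled bool, container, volume string) podSpec {
	if enabled {
		return spec
	}
	return podSpec{
		Containers: without(spec.Containers, container),
		Volumes:    without(spec.Volumes, volume),
	}
}

func main() {
	spec := podSpec{
		Containers: []string{"hostpath", "csi-snapshot-metadata"},
		// The volume name below is assumed for illustration.
		Volumes: []string{"socket-dir", "csi-snapshot-metadata-server-certs"},
	}
	out := stripSnapshotMetadata(spec, false, "csi-snapshot-metadata", "csi-snapshot-metadata-server-certs")
	fmt.Println(out.Containers, out.Volumes)
}
```

Applying the same stripping step to every driver variant that loads the manifest avoids a pod that references a secret the test never creates.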

Signed-off-by: jubittajohn <jujohn@redhat.com>
Signed-off-by: jubittajohn <jujohn@redhat.com>
Signed-off-by: jubittajohn <jujohn@redhat.com>
@openshift-ci-robot

@jubittajohn: the contents of this pull request could not be automatically validated.

The following commits are valid:

The following commits could not be validated and must be approved by a top-level approver:

Comment /validate-backports to re-evaluate validity of the upstream PRs, for example when they are merged upstream.

@openshift-ci-robot

openshift-ci-robot commented May 14, 2026

@jubittajohn: This pull request references OCPSTRAT-3036 which is a valid jira issue.


@jubittajohn
Author

/retest

@openshift-ci

openshift-ci Bot commented May 14, 2026

@jubittajohn: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-techpreview | f5f42a4 | link | false | /test e2e-aws-ovn-techpreview |
| ci/prow/e2e-aws-ovn-techpreview-serial-2of2 | f5f42a4 | link | false | /test e2e-aws-ovn-techpreview-serial-2of2 |
| ci/prow/e2e-gcp | f5f42a4 | link | true | /test e2e-gcp |
| ci/prow/e2e-aws-ovn-upgrade | f5f42a4 | link | true | /test e2e-aws-ovn-upgrade |
| ci/prow/e2e-aws-ovn-techpreview-serial-1of2 | f5f42a4 | link | false | /test e2e-aws-ovn-techpreview-serial-1of2 |
| ci/prow/e2e-aws-ovn-runc | f5f42a4 | link | false | /test e2e-aws-ovn-runc |
| ci/prow/e2e-aws-ovn-serial-1of2 | f5f42a4 | link | true | /test e2e-aws-ovn-serial-1of2 |
| ci/prow/e2e-aws-ovn-crun | f5f42a4 | link | true | /test e2e-aws-ovn-crun |
| ci/prow/e2e-aws-ovn-fips | f5f42a4 | link | true | /test e2e-aws-ovn-fips |
| ci/prow/e2e-aws-ovn-cgroupsv2 | f5f42a4 | link | true | /test e2e-aws-ovn-cgroupsv2 |
| ci/prow/e2e-aws-ovn-serial-2of2 | f5f42a4 | link | true | /test e2e-aws-ovn-serial-2of2 |


Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • backports/unvalidated-commits: Indicates that not all commits come to merged upstream PRs.
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • vendor-update: Touching vendor dir or related files
