MON-4500: Migrate Prometheus targets discovering from Endpoints to EndpointSlices#460
machine424 wants to merge 2 commits into
|
Skipping CI for Draft Pull Request. |
|
@machine424: This pull request references MON-4500, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. |
|
/verified by existing tests |
|
@machine424: This PR has been marked as verified. |
|
/assign @rikatz |
|
@machine424 you need to apply some extra changes:
Please add unit tests for these changes as well. Thanks! |
|
No actionable comments were generated in the recent review. 🎉
Recent review info: configuration used: Organization UI; review profile: CHILL; plan: Pro; files selected for processing: 9.
Walkthrough: The pull request adds EndpointSlice monitoring support to the DNS operator. Changes include extending RBAC permissions across cluster-level and role-based rules to grant discovery access to EndpointSlices, implementing metrics role reconciliation logic in the controller, and updating the ServiceMonitor configuration to specify EndpointSlice service discovery.
Estimated code review effort: 2 (Simple), ~18 minutes |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. |
…rviceDiscoveryRole: EndpointSlice in ServiceMonitors
This PR migrates Prometheus service discovery from the deprecated Endpoints API to the EndpointSlices API, by:
Setting serviceDiscoveryRole: EndpointSlice on ServiceMonitors.
Granting Prometheus endpointslices permissions.
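As a rough illustration of the first item, a ServiceMonitor that opts into EndpointSlice discovery might look like the sketch below. Only the `serviceDiscoveryRole: EndpointSlice` field comes from this PR; the object name, labels, and port name are hypothetical placeholders, and the namespace is an assumption.

```yaml
# Sketch only: everything except serviceDiscoveryRole is a hypothetical placeholder.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-dns-metrics      # hypothetical name
  namespace: openshift-dns       # assumed namespace
spec:
  # Ask Prometheus to discover targets through EndpointSlices
  # instead of the deprecated Endpoints objects.
  serviceDiscoveryRole: EndpointSlice
  selector:
    matchLabels:
      app: example               # hypothetical label
  endpoints:
  - port: metrics                # hypothetical port name
    interval: 30s
```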
We're taking a conservative approach by keeping the existing endpoints permissions alongside the new endpointslices ones. This provides a safety net in case any ServiceMonitors, whether deployed from this repo or from another source, still rely on the same Role and were missed during the migration.
That said, since both resources provide essentially the same data, keeping both isn't meaningfully more permissive from a security standpoint.
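The conservative approach described above could be sketched as a Role that carries both rules side by side. The Role name and namespace match the `openshift-dns/prometheus-k8s` Role mentioned in this PR's error log; the exact resource list of the pre-existing rule is an assumption, not copied from the repository.

```yaml
# Sketch only: the resources in the first rule are assumed, not taken from the repo.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: openshift-dns
rules:
# Existing rule, kept as a safety net for ServiceMonitors
# that still use Endpoints-based discovery.
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
# New rule required for EndpointSlice-based discovery.
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - get
  - list
  - watch
```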
These changes target OpenShift 4.22+ and should not be backported to earlier releases.
Due to the scope of changes across multiple repositories, these modifications were generated with Claude assistance.
  - rbac.authorization.k8s.io
  resources:
  - clusterroles
  - roles
This allows the operator to update the Role defined in pkg/manifests/assets/dns/metrics/role.yaml on the cluster.
|
Pushed some changes, keeping an eye on the CI. |
|
@machine424: all tests passed! |
rbacv1 "k8s.io/api/rbac/v1"
)

func TestDNSggMetricsRoleChanged(t *testing.T) {
@@ -0,0 +1,26 @@
package controller
Please follow the same approach as the unit tests in https://github.com/openshift/cluster-dns-operator/blob/master/pkg/operator/controller/controller_cluster_role_test.go
  - get
  - list
  - watch
- apiGroups:
You need to add these same permissions to the operator's ClusterRole; otherwise the service account running openshift-dns-operator won't be able to grant them.
See:
ERROR Reconciler error {"controller": "dns_controller", "object": {"name":"default"}, "namespace": "", "name": "default", "reconcileID": "edd6c3b1-ee05-40eb-a566-6a83471fb643", "error": "failed to ensure dns default: failed to integrate metrics with openshift-monitoring for dns default: failed to ensure dns metrics role for default: failed to update dns metrics role openshift-dns/prometheus-k8s: roles.rbac.authorization.k8s.io \"prometheus-k8s\" is forbidden: user \"system:serviceaccount:openshift-dns-operator:dns-operator\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:openshift-dns-operator\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"discovery.k8s.io\"], Resources:[\"endpointslices\"], Verbs:[\"get\"]}", "errorCauses": [{"error": "failed to ensure dns default: failed to integrate metrics with openshift-monitoring for dns default: failed to ensure dns metrics role for default: failed to update dns metrics role openshift-dns/prometheus-k8s: roles.rbac.authorization.k8s.io \"prometheus-k8s\" is forbidden: user \"system:serviceaccount:openshift-dns-operator:dns-operator\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:openshift-dns-operator\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"discovery.k8s.io\"], Resources:[\"endpointslices\"], Verbs:[\"get\"]}"}]}
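The error above comes from Kubernetes RBAC escalation prevention: a subject can only create or update a Role if it already holds every permission that Role contains. A minimal sketch of the rule the operator's own ClusterRole would need is shown below, assuming it mirrors the get/list/watch verbs of the metrics Role; where this rule sits in the operator's manifests is not shown in this conversation.

```yaml
# Sketch only: a rule to append to the dns-operator ClusterRole's rules list.
# The verbs are assumed to mirror those granted in the prometheus-k8s Role.
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - get
  - list
  - watch
```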