profile-compaction: CollapseConfig CRD + projection overlay + user-ma…#808
profile-compaction: CollapseConfig CRD + projection overlay + user-ma…#808entlein wants to merge 5 commits into
Conversation
…naged lifecycle Signed-off-by: entlein <einentlein@gmail.com>
|
Warning Review limit reached
More reviews will be available in 59 minutes and 55 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR adds tamper-detection and optional exporter wiring for user overlays, stamps overlay identity into container profile SyncChecksum, refactors projection to classify dynamic/wildcard entries and preserve exec-args, extends related data shapes and mocks, and makes rule-binding notifier dispatch non-blocking with tests. ChangesContainer Profile Cache: Tamper Detection, Overlay Identity, and Projection
Rule Binding Cache: Non-Blocking Notification Dispatch
Miscellaneous
Sequence Diagram(s)(omitted — changes are multiple focused features; sequence diagrams included in hidden artifact where helpful) Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/objectcache/containerprofilecache/containerprofilecache.go`:
- Around line 412-414: The verification calls to
c.verifyUserApplicationProfile(userAP, sharedData.Wlid) currently ignore its
boolean result; update both call sites (the one using userAP at lines ~412 and
the similar one around ~428-430) to check the returned bool and, when it is
false and the cache is running in strict verification mode (e.g., the instance
flag controlling strict verification such as c.strictVerify or equivalent),
avoid projecting/merging the overlay (skip the merge/projectOverlay path), log
the verification failure with context (Wlid and which overlay), and return or
surface an error instead of continuing; if not in strict mode, continue but log
a warning. Ensure you reference and use c.verifyUserApplicationProfile(...) and
the strict-mode flag when implementing the conditional.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: acff329b-9e89-4538-a272-7877e0ad1e70
📒 Files selected for processing (12)
pkg/objectcache/containerprofilecache/containerprofilecache.gopkg/objectcache/containerprofilecache/projection.gopkg/objectcache/containerprofilecache/projection_apply.gopkg/objectcache/containerprofilecache/tamper_alert.gopkg/objectcache/containerprofilecache/tamper_alert_test.gopkg/objectcache/containerprofilecache/test32_projection_test.gopkg/objectcache/projection_types.gopkg/objectcache/shared_container_data.gopkg/objectcache/v1/mock.gopkg/rulebindingmanager/cache/cache.gopkg/rulebindingmanager/cache/cache_test.gopkg/rulemanager/cel/libraries/cache/function_cache.go
…verlay CodeRabbit upstream PR kubescape#808 / containerprofilecache.go:414 (Major). The verifyUserApplicationProfile and verifyUserNetworkNeighborhood methods already return a boolean reflecting verification outcome — true when the overlay is unsigned OR when verification succeeded OR in permissive mode (EnableSignatureVerification=false); false only in strict mode on actual tamper. The two call sites in projection-load were discarding that return, so tampered overlays in strict mode silently merged anyway. The R1016 alert was emitted but the protection was advisory only. Now: when verify returns false (strict mode + tamper detected) the overlay is nilled out before the merge step so the cache never projects a known-tampered profile. Permissive mode is unchanged — verify always returns true, the overlay still merges, R1016 still fires. New tests: - TestVerifyAP_StrictMode_ReturnsFalseOnTamper — sign + tamper an ApplicationProfile, construct a cache with EnableSignatureVerification=true, and assert verifyUserApplicationProfile returns false (caller drops overlay). - TestVerifyNN_StrictMode_ReturnsFalseOnTamper — symmetric pin for the NetworkNeighborhood path. The existing legacy-permissive tamper test (TestVerifyAP_TamperedProfile_PopulatesDedupMap) continues to pass unchanged — that path still returns true with the R1016 emitted.
…c1 final state Signed-off-by: entlein <einentlein@gmail.com>
|
@coderabbitai review Collective re-review request after applying your atomic-split advisory + rabbit-feedback fixes across the full SBOB upstream PR set. Each PR is now scoped to a single tier per your dependency map: Tier 0 (opens):
Tier 2 (execs):
Tier 3 (network):
Tier 4 (tamper/sign):
Umbrellas:
Please confirm atomicity per your earlier advisory and flag any remaining cross-PR coupling we missed. |
|
PR
|
| File | Issue | Belongs to |
|---|---|---|
projection_types.go |
Defines ExecsByPath map[string][]string on ProjectedContainerProfile |
Tier 2 · #812 ("ExecsByPath composite-key surface") |
projection_apply.go |
Consumes ExecsByPath, containsDynamicSegment, isNetworkIPWildcard, isNetworkDNSWildcard |
Tier 2 (#812) for exec surface; Tier 3 (#806/#324) for network wildcards |
v1/mock.go |
Updated to populate ExecsByPath + wildcard network patterns |
Tier 2 + Tier 3 |
rulebindingmanager/cache/cache.go + test |
Non-blocking fanout refactor — purely infrastructure, no tamper/sign semantics | Pre-Tier 4 (Tier 0 or standalone) |
Specific recommendations
-
ExecsByPathtype definition should either live in#812(Tier 2) with#808taking a compile dependency on it, or you should explicitly note that#808is the defining PR and#812merely consumes it. Right now the tier map implies the reverse, which is confusing. -
isNetworkIPWildcard/isNetworkDNSWildcardinprojection_apply.goduplicate classifier logic that Tier 3 (#806,#324) is supposed to own. Either import from the Tier 3 package or note these are intentionally local shims with a// TODO: consolidate with pkg/objectcache/networkmatchcomment. -
Non-blocking fanout (
rulebindingmanager/cache) is fully orthogonal to tamper/sign. If this was included because it was required to prevent deadlocks when emitting tamper alerts under load, that's a valid reason — but it should be called out explicitly in the PR description as a required infrastructure prerequisite, not just bundled silently.
Bottom line: The tamper/sign core is clean. The main atomicity leaks are the ExecsByPath surface definition crossing from Tier 4 into what the map says is Tier 2, and the fanout refactor having no declared relationship to the Tier 4 scope. If you can clarify the ExecsByPath ownership direction and add a sentence to the PR description explaining why the fanout refactor is here, the remaining coupling is defensible.
✅ Actions performed
Review triggered.
Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.
…e#807, kubescape#808) (#44) * fix(containerprofilecache): honor verify result before merging user overlay CodeRabbit upstream PR kubescape#808 / containerprofilecache.go:414 (Major). The verifyUserApplicationProfile and verifyUserNetworkNeighborhood methods already return a boolean reflecting verification outcome — true when the overlay is unsigned OR when verification succeeded OR in permissive mode (EnableSignatureVerification=false); false only in strict mode on actual tamper. The two call sites in projection-load were discarding that return, so tampered overlays in strict mode silently merged anyway. The R1016 alert was emitted but the protection was advisory only. Now: when verify returns false (strict mode + tamper detected) the overlay is nilled out before the merge step so the cache never projects a known-tampered profile. Permissive mode is unchanged — verify always returns true, the overlay still merges, R1016 still fires. New tests: - TestVerifyAP_StrictMode_ReturnsFalseOnTamper — sign + tamper an ApplicationProfile, construct a cache with EnableSignatureVerification=true, and assert verifyUserApplicationProfile returns false (caller drops overlay). - TestVerifyNN_StrictMode_ReturnsFalseOnTamper — symmetric pin for the NetworkNeighborhood path. The existing legacy-permissive tamper test (TestVerifyAP_TamperedProfile_PopulatesDedupMap) continues to pass unchanged — that path still returns true with the R1016 emitted. * docs(applicationprofile): rule-author contracts on opens-with-suffix/prefix + exec-args Two doc-only fixes for CodeRabbit cross-PR advisory: #7 (NA kubescape#807): ap.was_path_opened_with_suffix / _prefix — explicitly document the false-negative gap when the projection is in pass-through mode (cp.Opens.All=true). Wildcard Patterns are skipped from the suffix/prefix scan because their token-bearing text doesn't safely answer suffix questions. Rule authors who need wildcard-aware coverage should either declare an Opens-projection slice (so SuffixHits/PrefixHits become authoritative for the literals they care about) or use ap.was_path_opened (which runs CompareDynamic over Patterns). #8 (NA kubescape#807): wasExecutedWithArgs — document the three states of ExecsByPath: 1. Path absent from Execs.Values → exec not allowed, fall through. 2. Path in Values, ABSENT from ExecsByPath → legacy back-compat "no argv constraint", match. 3. Path in Values, PRESENT with empty arg list [] → explicit "ran with no args" constraint, NOT a wildcard. The distinction is load-bearing for profile authors: an entry of {Path: ..., Args: []} is a constraint, not a free pass. No behavioural change. Tests pass unchanged. * deps: pin stereoscope v0.1.9 + runtime-spec v1.2.1 (compat with kubescape/syft fork) Storage rc1 bumped to syft v1.42.4 (CVE-2026-33481), which transitively requires stereoscope v0.1.22 + runtime-spec v1.3.0. Those versions use the new moby/moby/client submodule API, which is incompatible with inspektor-gadget's moby/moby umbrella requirement on the node-agent side (ambiguous-import wall — see issue #45). Node-agent stays on kubescape/syft v1.32.0-ks.2 via the existing replace, but transitive resolution from storage's go.mod pulls the newer stereoscope into the build, breaking the build with: undefined: client.New undefined: client.PingOptions Adds two replace directives to force the older transitive chain that matches kubescape/syft v1.32.0-ks.2's expectations: github.com/anchore/stereoscope => v0.1.9-0.20250826202322-... github.com/opencontainers/runtime-spec => v1.2.1 This is the minimum set needed for node-agent to build cleanly against storage rc1 (with syft v1.42.4) while still using kubescape/syft on its own side. Storage's CVE fix remains in effect at the storage binary; node-agent's syft surface is unchanged. Verified locally: go build ./... ok go test ./pkg/objectcache/... ./pkg/rulemanager/... -count=1 → 30+ packages ok --------- Co-authored-by: Entlein <eineintlein@gmail.com>
matthyx
left a comment
There was a problem hiding this comment.
I found three merge-blocking build issues and left inline notes on each failing spot.
| "github.com/kubescape/node-agent/pkg/exporters" | ||
| "github.com/kubescape/node-agent/pkg/rulemanager/types" | ||
| "github.com/kubescape/node-agent/pkg/signature" | ||
| "github.com/kubescape/node-agent/pkg/signature/profiles" |
There was a problem hiding this comment.
go test ./... fails on this new file because github.com/kubescape/node-agent/pkg/signature and .../pkg/signature/profiles do not exist in the checked-in module (pkg/**/signature/** is empty here). As written, this breaks package setup for every importer of containerprofilecache, so the PR will not build until these imports point at a real package or the code is wired through an existing dependency.
| } | ||
| // v0.0.2 IPAddresses[] — list form supporting CIDRs and '*' sentinel. | ||
| // Same semantics as the deprecated singular IPAddress, just plural. | ||
| addrs = append(addrs, n.IPAddresses...) |
There was a problem hiding this comment.
The current github.com/kubescape/storage/pkg/apis/softwarecomposition/v1beta1.NetworkNeighbor type only exposes IPAddress (singular) — go doc ...NetworkNeighbor has no IPAddresses field. That makes this append, and the matching ingress append below, a compile error against the current dependency, so this needs to stay on the singular field unless the storage API is updated first.
| pcp.EgressAddresses.All = true | ||
| pcp.EgressAddresses.Values = make(map[string]struct{}) | ||
| if !specInstalled || spec.EgressAddresses.InUse { | ||
| addrs := make([]string, 0, len(n.IPAddresses)+1) |
There was a problem hiding this comment.
Same blocker in the mock path: NetworkNeighbor does not have an IPAddresses field, so len(n.IPAddresses) and the append here fail to compile (same issue repeats in the analogous ingress block below). go test ./pkg/objectcache/v1 is currently red because of this.
Three matthyx blockers (2026-05-27): (1) tamper_alert.go:28 — imports pkg/signature and pkg/signature/profiles which ship in node-agent#809 (not yet merged). Adds minimum-surface stubs (pkg/signature/stub.go, pkg/signature/profiles/stub.go) so the PR compiles standalone. With IsSigned returning false, the tamper path short-circuits and never invokes Verify — the no-op stub is behavior- safe: signed-profile detection is dormant until kubescape#809 lands and replaces these files with the real implementation. (2) projection_apply.go:265 — NetworkNeighbor.IPAddresses missing. (3) mock.go:202 — same IPAddresses issue. Both (2) and (3) resolved by replacing kubescape/storage with k8sstormcenter/storage's upstream-pr/sbob-network tip 46f37d32 (sibling of kubescape/storage#324). Removed when kubescape#324 merges and a release ships IPAddresses.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/rulebindingmanager/cache/cache.go (2)
188-189:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDon't multiplex refresh pulses onto the pod-delta notifier.
rulebindingmanager.RuleBindingNotify{}is a real zero-value event here (Action == Addedwith an empty Pod).pkg/containerwatcher/v2/container_watcher_collection.go:144-173treats every notification as a concrete pod update, so this pulse will mark rule bindings initialized and add"/"toruleManagedPods. Ifpkg/rulemanager/rule_manager.go:112-133needs a coalesced refresh signal, give it a separate notifier path or extend the payload with an explicit refresh action.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/rulebindingmanager/cache/cache.go` around lines 188 - 189, The refresh pulse is being sent as an empty rulebindingmanager.RuleBindingNotify via dispatchNonBlocking(notifiers, ...), which the container watcher treats as a real Pod Added event and corrupts ruleManagedPods; stop multiplexing refreshes onto the pod-delta notifier by adding a dedicated refresh path or explicit action field: either add a new notifier channel specifically for coalesced refreshes and call dispatchNonBlocking(refreshNotifiers, nil, "refresh pulse"), or extend rulebindingmanager.RuleBindingNotify with an explicit Action enum/value (e.g., Refresh) and emit RuleBindingNotify{Action: Refresh} from the caller and update the consumers in container_watcher_collection and rule_manager.go to ignore Refresh as a pod update and handle it as a coalesced refresh signal.
201-219:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDrop-on-full is unsafe for the multi-pod add/modify/delete paths.
This helper is fine for idempotent pulses, but the handlers pass slices of per-pod notifications. Once a subscriber channel fills, later pod events are silently lost, and
pkg/containerwatcher/v2/container_watcher_collection.go:144-173does not resync from cache — it updatesruleManagedPodsfrom each message. A single large rule-binding change can therefore leave a slow subscriber permanently missing pods. Consider coalescing those handlers to one resync event or moving delivery behind a per-subscriber queue/worker that preserves at-least-once delivery.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/rulebindingmanager/cache/cache.go` around lines 201 - 219, The current dispatchNonBlocking drops per-pod notifications which breaks callers that expect at-least-once delivery (the add/modify/delete handlers that send slices of per-pod RuleBindingNotify and consumers that update ruleManagedPods). Fix by changing delivery semantics: either (A) coalesce the handlers so they emit one consolidated resync-style RuleBindingNotify instead of many per-pod messages (update the add/modify/delete handlers to produce a single resync message), or (B) implement a per-subscriber worker/queue: create a goroutine per notifier that owns a buffered channel and serializes delivery (modify dispatchNonBlocking to enqueue into each notifier's dedicated queue/channel and have the worker perform the actual send with retries/blocking to guarantee at-least-once), and update indexOfNotifier usage accordingly; pick one approach and adjust all callers of dispatchNonBlocking and the add/modify/delete handlers to match.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/rulebindingmanager/cache/cache.go`:
- Line 140: The AddHandler and ModifyHandler implementations hold c.mutex and
call addRuleBinding/modifiedRuleBinding while those helpers use
context.Background() for Namespaces().List and Pods(...).List which prevents
cancellation; change the signatures of addRuleBinding(ctx context.Context, ...)
and modifiedRuleBinding(ctx context.Context, ...) and pass the incoming watch
context from AddHandler/ModifyHandler into those calls, then replace
context.Background() uses in both helpers with the passed ctx for both
Namespaces().List and Pods(...).List; finally update cache_test.go to call the
new function signatures accordingly.
---
Outside diff comments:
In `@pkg/rulebindingmanager/cache/cache.go`:
- Around line 188-189: The refresh pulse is being sent as an empty
rulebindingmanager.RuleBindingNotify via dispatchNonBlocking(notifiers, ...),
which the container watcher treats as a real Pod Added event and corrupts
ruleManagedPods; stop multiplexing refreshes onto the pod-delta notifier by
adding a dedicated refresh path or explicit action field: either add a new
notifier channel specifically for coalesced refreshes and call
dispatchNonBlocking(refreshNotifiers, nil, "refresh pulse"), or extend
rulebindingmanager.RuleBindingNotify with an explicit Action enum/value (e.g.,
Refresh) and emit RuleBindingNotify{Action: Refresh} from the caller and update
the consumers in container_watcher_collection and rule_manager.go to ignore
Refresh as a pod update and handle it as a coalesced refresh signal.
- Around line 201-219: The current dispatchNonBlocking drops per-pod
notifications which breaks callers that expect at-least-once delivery (the
add/modify/delete handlers that send slices of per-pod RuleBindingNotify and
consumers that update ruleManagedPods). Fix by changing delivery semantics:
either (A) coalesce the handlers so they emit one consolidated resync-style
RuleBindingNotify instead of many per-pod messages (update the add/modify/delete
handlers to produce a single resync message), or (B) implement a per-subscriber
worker/queue: create a goroutine per notifier that owns a buffered channel and
serializes delivery (modify dispatchNonBlocking to enqueue into each notifier's
dedicated queue/channel and have the worker perform the actual send with
retries/blocking to guarantee at-least-once), and update indexOfNotifier usage
accordingly; pick one approach and adjust all callers of dispatchNonBlocking and
the add/modify/delete handlers to match.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1b640f7e-c5dd-4b9f-a2b1-436aa9cc7564
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (6)
go.modpkg/objectcache/containerprofilecache/containerprofilecache.gopkg/rulebindingmanager/cache/cache.gopkg/rulebindingmanager/cache/cache_test.gopkg/signature/profiles/stub.gopkg/signature/stub.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/objectcache/containerprofilecache/containerprofilecache.go
| c.mutex.Unlock() | ||
| return | ||
| } | ||
| rbs = c.addRuleBinding(ruleBinding) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show repo file
ls -la
# Print the relevant portions of the file around the referenced lines
FILE="pkg/rulebindingmanager/cache/cache.go"
echo "=== File stats ==="
wc -l "$FILE"
echo
echo "=== Snippet around line 120-180 ==="
nl -ba "$FILE" | sed -n '110,190p'
echo
echo "=== Snippet around line 220-310 ==="
nl -ba "$FILE" | sed -n '220,310p'
echo
echo "=== Snippet around line 320-380 ==="
nl -ba "$FILE" | sed -n '320,380p'
echo
echo "=== Find all context.Background() uses in this file ==="
rg -n "context\.Background\(\)" "$FILE" || true
echo
echo "=== Find definitions/usages of addRuleBinding / modifiedRuleBinding ==="
rg -n "func \(c \*RBCache\) (addRuleBinding|modifiedRuleBinding)\b" "$FILE" || true
rg -n "\.addRuleBinding\(" "$FILE" || true
rg -n "\.modifiedRuleBinding\(" "$FILE" || true
echo
echo "=== Show where List() calls happen in addRuleBinding/modifiedRuleBinding ==="
# Print whole functions if not too large
perl -ne 'print if $.>=0' "$FILE" >/dev/null
# Use ast-grep? Fallback to sed by locating function start lines
ADD_START=$(nl -ba "$FILE" | rg -n "func \(c \*RBCache\) addRuleBinding" | head -n1 | cut -d: -f1 || true)
MOD_START=$(nl -ba "$FILE" | rg -n "func \(c \*RBCache\) modifiedRuleBinding" | head -n1 | cut -d: -f1 || true)
if [ -n "${ADD_START:-}" ]; then
echo "=== addRuleBinding from $ADD_START (print until next func) ==="
nl -ba "$FILE" | sed -n "${ADD_START},$((ADD_START+160))p"
fi
if [ -n "${MOD_START:-}" ]; then
echo "=== modifiedRuleBinding from $MOD_START (print until next func) ==="
nl -ba "$FILE" | sed -n "${MOD_START},$((MOD_START+160))p"
fiRepository: kubescape/node-agent
Length of output: 2017
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="pkg/rulebindingmanager/cache/cache.go"
echo "=== File stats ==="
wc -l "$FILE"
echo
echo "=== Snippet around line 110-190 ==="
awk 'NR>=110 && NR<=190 {printf "%d\t%s\n", NR, $0}' "$FILE"
echo
echo "=== Snippet around line 220-310 ==="
awk 'NR>=220 && NR<=310 {printf "%d\t%s\n", NR, $0}' "$FILE"
echo
echo "=== Snippet around line 320-380 ==="
awk 'NR>=320 && NR<=380 {printf "%d\t%s\n", NR, $0}' "$FILE"
echo
echo "=== Find all context.Background() uses in this file ==="
rg -n "context\.Background\(\)" "$FILE" || true
echo
echo "=== Find definitions/usages of addRuleBinding / modifiedRuleBinding ==="
rg -n "func \(c \*RBCache\) (addRuleBinding|modifiedRuleBinding)\b" "$FILE" || true
rg -n "\.addRuleBinding\(" "$FILE" || true
rg -n "\.modifiedRuleBinding\(" "$FILE" || true
echo
echo "=== Function bodies (trimmed) ==="
ADD_START=$(rg -n "func \(c \*RBCache\) addRuleBinding\b" "$FILE" | head -n1 | cut -d: -f1 || true)
MOD_START=$(rg -n "func \(c \*RBCache\) modifiedRuleBinding\b" "$FILE" | head -n1 | cut -d: -f1 || true)
if [ -n "${ADD_START:-}" ]; then
echo "--- addRuleBinding starting at $ADD_START ---"
awk -v s="$ADD_START" 'NR>=s && NR<=s+220 {printf "%d\t%s\n", NR, $0}' "$FILE"
fi
if [ -n "${MOD_START:-}" ]; then
echo "--- modifiedRuleBinding starting at $MOD_START ---"
awk -v s="$MOD_START" 'NR>=s && NR<=s+220 {printf "%d\t%s\n", NR, $0}' "$FILE"
fiRepository: kubescape/node-agent
Length of output: 24559
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "\.addRuleBinding\(" -S . || true
rg -n "\.modifiedRuleBinding\(" -S . || true
rg -n "func \(c \*RBCache\) addRuleBinding\(" pkg/rulebindingmanager/cache/cache.go
rg -n "func \(c \*RBCache\) modifiedRuleBinding\(" pkg/rulebindingmanager/cache/cache.goRepository: kubescape/node-agent
Length of output: 739
Restore watch-context cancellation for RB list calls.
pkg/rulebindingmanager/cache/cache.go keeps c.mutex held in AddHandler/ModifyHandler while it calls addRuleBinding/modifiedRuleBinding, and addRuleBinding currently uses context.Background() for Namespaces().List and Pods(...).List (lines 140, 159, 237-281, 343-345). This prevents list requests from being canceled when the watch context is canceled and can stall cache operations during/after shutdown or restart. Thread ctx through addRuleBinding(ctx, ...) / modifiedRuleBinding(ctx, ...), pass it from AddHandler/ModifyHandler, and use it for both List calls; update cache_test.go call sites for the signature change.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/rulebindingmanager/cache/cache.go` at line 140, The AddHandler and
ModifyHandler implementations hold c.mutex and call
addRuleBinding/modifiedRuleBinding while those helpers use context.Background()
for Namespaces().List and Pods(...).List which prevents cancellation; change the
signatures of addRuleBinding(ctx context.Context, ...) and
modifiedRuleBinding(ctx context.Context, ...) and pass the incoming watch
context from AddHandler/ModifyHandler into those calls, then replace
context.Background() uses in both helpers with the passed ctx for both
Namespaces().List and Pods(...).List; finally update cache_test.go to call the
new function signatures accordingly.
| helpers.Error(err)) | ||
| // Honour strict-mode: refuse to load on any verification failure, | ||
| // but do NOT touch the dedup map or emit R1016. | ||
| return !c.cfg.EnableSignatureVerification |
There was a problem hiding this comment.
c.cfg.EnableSignatureVerification still does not exist on config.Config, so this file has four hard compiler errors at the strict-mode return sites (for example here, and again at lines 108/136/146). Please either add/plumb that config field in this PR or remove the strict-mode branches so the tamper path matches currently available config.
| // signature-payload accessors. | ||
| type Signable interface{} | ||
|
|
||
| func IsSigned(_ Signable) bool { return false } |
There was a problem hiding this comment.
These stubs still leave the new tamper-detection path effectively disabled: IsSigned always returns false, VerifyObjectAllowUntrusted always returns nil, and the adapters in pkg/signature/profiles/stub.go return nil. That means the runtime path never verifies anything and the new tamper tests are not exercising the intended behavior. This needs the real signature implementation (or the feature/tests need to stay out of this PR).
There was a problem hiding this comment.
Yes, those are just to make the PR "atomic" they need to be deleted after merge. I havent yet found a way to completely seperate the PRs 808 and 809
matthyx
left a comment
There was a problem hiding this comment.
I re-checked this PR and I’m still seeing three merge blockers:
pkg/objectcache/containerprofilecache/tamper_alert.goreferencesc.cfg.EnableSignatureVerification, butpkg/config/config.gostill has no such field. LSP flags four compiler errors at the strict-mode returns.cmd/main.gocreates both the runtime alert exporter and the container profile cache, but never callscpc.SetTamperAlertExporter(exporter). As written,emitTamperAlert()is a no-op in production, so the new R1016 path never actually emits alerts end-to-end.pkg/signature/stub.go/pkg/signature/profiles/stub.gostill short-circuit the feature (IsSigned == false,VerifyObjectAllowUntrusted == nil, adapters returnnil), so the new tamper-detection behavior is effectively dormant and the added tamper tests are not validating real verification behavior.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
matthyx
left a comment
There was a problem hiding this comment.
Checked again.
The earlier import/IPAddresses blockers are addressed now: the PR has local pkg/signature stubs, and the storage replace brings NetworkNeighbor.IPAddresses.
I’m still seeing three blockers before merge:
pkg/objectcache/containerprofilecache/tamper_alert.gostill readsc.cfg.EnableSignatureVerificationat the strict-mode return sites, butpkg/config/config.gostill has no such field onconfig.Config.cmd/main.gocreates both the runtime alert exporter and the container profile cache, but never callscpc.SetTamperAlertExporter(exporter), so the new R1016 tamper path still never emits alerts end-to-end.pkg/signature/stub.goandpkg/signature/profiles/stub.gostill short-circuit the feature (IsSigned == false,VerifyObjectAllowUntrusted == nil, adapters returnnil), so the tamper-detection path remains dormant and the added tamper tests are not validating a real verification flow.
here things got confusing during the rebase, this is the sister PR to storage and need the signature PR (from matthyx)
Summary by CodeRabbit
New Features
Bug Fixes
Tests