feat(stats)!: add agent version gating to client-side stats [APMSP-3361]#2040
feat(stats)!: add agent version gating to client-side stats [APMSP-3361]#2040VianneyRuhlmann wants to merge 8 commits into
Conversation
🎉 All green!🧪 All tests passed 🎯 Code Coverage (details) 🔗 Commit SHA: c3f4fda | Docs | Datadog PR Page | Give us feedback! |
22670c0 to
b607a25
Compare
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🔒 Cargo Deny Results📦
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2040 +/- ##
==========================================
+ Coverage 72.90% 73.05% +0.15%
==========================================
Files 460 460
Lines 76396 76748 +352
==========================================
+ Hits 55696 56070 +374
+ Misses 20700 20678 -22
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b607a2507d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| .info | ||
| .version | ||
| .as_ref() | ||
| .is_some_and(|v| (v.major, v.minor, v.patch) >= MIN_STATS_AGENT_VERSION) |
There was a problem hiding this comment.
Reject prerelease versions below the minimum
When the agent reports a prerelease such as 7.65.0-rc.1, the parser preserves that suffix in metadata, but this comparison only checks (major, minor, patch), so the prerelease is treated as satisfying the >= 7.65.0 gate. Under semver precedence 7.65.0-rc.1 is lower than 7.65.0, so in environments running RC/dev agents that also advertise /v0.6/stats, client-side stats can be enabled before the minimum supported stable agent version.
Useful? React with 👍 / 👎.
|
/merge |
|
View all feedbacks in Devflow UI.
This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
oscar.ledauphin@datadoghq.com unqueued this merge request |
| where | ||
| D: Deserializer<'de>, | ||
| { | ||
| let Some(raw) = Option::<String>::deserialize(deserializer)? else { |
There was a problem hiding this comment.
Small detail but this implies agent version is always a string, because otherwise the ? here is the only place we will return an Err and thus make the whole schema un-parsable.
In a more general note, I don't like how any parse failure in a field on the schema fails the deserialization of the whole struct because it makes for some really unclear errors. I don't have a solution tho :/
There was a problem hiding this comment.
The version is always a string https://github.com/DataDog/datadog-agent/blob/8f7d7790726510ee1ecd8e6b7d57cc9c406691bf/pkg/trace/api/info.go#L25 . I don't think we want to partially parse the version here, either it's a valid semver version and we can't rely on it to check for feature support or it's not and we shouldn't try to rely on it
There was a problem hiding this comment.
One important note here is that this version can also come from the serverless rust agent, which today is also a string, but greatly complicates the usage of version checking :(
There was a problem hiding this comment.
e.g. today the rust agent doesn't return a version at all
There was a problem hiding this comment.
I guess it's ok for now to only expect agent version then. They will need a way to let the tracers know which "flavor" of agent they're talking to anyway.
There was a problem hiding this comment.
The DD_TRACE_STATS_COMPUTATION_ENABLED config was always dependent on the agent advertising support for CSS. Previously support was only checked by the presence of drop_p0s=true in /info however according to the spec we also need the a supported agent version >=7.65 and the /v0.6/stats endpoint to be advertised. I don't think it's a good idea to allow unconditional CSS to be configured
There was a problem hiding this comment.
(edit: I posted this comment before refreshing, didn't see the comment above)
I suppose the rust agents can fake it and return version 7.65, but that's even hackier than gating features by using version checks instead of feature detection. 😅
There was a problem hiding this comment.
The
DD_TRACE_STATS_COMPUTATION_ENABLEDconfig was always dependent on the agent advertising support for CSS.
Exactly. Feature detection vs version checks. The whole point of adding features to /info was to move away from version checks. Why are we going back? 😞
I don't think it's a good idea to allow unconditional CSS to be configured
I agree. I didn't mean unconditional. I meant skip the new version check (which is arguably a breaking change for users who already had CSS working with DD_TRACE_STATS_COMPUTATION_ENABLED=true), not the check for drop_p0s=true.
There was a problem hiding this comment.
I'm fine with removing the agent version and relying only on client_drop_p0s and the endpoint being available. Also I think the rust agent currently hardcodes client_drop_p0s to be disabled which means CSS won't be enabled regardless of DD_TRACE_STATS_COMPUTATION_ENABLED
There was a problem hiding this comment.
I think the rust agent currently hardcodes
client_drop_p0sto be disabled
Yes! We had to fix that bug recently because that rust agent (there are two rust agents!) was lying to the tracers, announcing CSS support. For now, we can't enable CSS in AWS Lambda at all, but some tracers started enabling CSS by default.
The workloads that enable CSS use the other rust agent (https://github.com/DataDog/serverless-components/tree/main/crates/datadog-serverless-compat), which reports support for client_drop_p0s correctly.
Happy to take this thread to Slack if this is getting too long 😅
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
| major: 7, | ||
| minor: 65, | ||
| patch: 0, | ||
| metadata: None, |
There was a problem hiding this comment.
I don't think you are testing metadata anywhere?
What does this PR do?
Parse the agent version as semver and check it when enabling client-side stats. Also check for the stats endpoint.
Motivation
According to spec client-side stats should only be enabled when agent version is higher than. 7.65 and the stats endpoint is available.
Additional Notes
Anything else we should know when reviewing?
How to test the change?
Describe here in detail how the change can be validated.