Skip to content

refactor(parametric/python): consolidate set_and_wait_rc helpers + absorb #6951#6956

Draft
bm1549 wants to merge 1 commit into
brian.marks/rc-applied-endpoint-python-pilotfrom
brian.marks/rc-applied-migration
Draft

refactor(parametric/python): consolidate set_and_wait_rc helpers + absorb #6951#6956
bm1549 wants to merge 1 commit into
brian.marks/rc-applied-endpoint-python-pilotfrom
brian.marks/rc-applied-migration

Conversation

@bm1549
Copy link
Copy Markdown
Contributor

@bm1549 bm1549 commented May 15, 2026

Motivation

set_and_wait_rc() waits only for the test-agent ACK. That ACK fires when the tracer's RC client validates the payload — not when product subscribers actually apply it. The gap is small for in-process tracers and large for out-of-process ones (PHP sidecar), and it's the root cause of the flaky-test gates we've been carrying. #6954 added set_and_wait_rc_applied() as a deterministic variant that drains the tracer after the ACK, and converted two pilot tests on Python. The two-helper design was always transitional: there is no real use case for the ACK-only variant. Every behavioural test that reads tracer state after RC arrives benefits from the deterministic guarantee, and the helper degrades to the old ACK-only path for tracers outside _RC_APPLY_ENDPOINT_LANGS. This PR finishes that migration and removes the redundant helper, then absorbs the scope of #6951 — which papered over the same race with retry loops on three Python tests.

Changes

The single remaining helper is set_and_wait_rc(test_agent, test_library, config_overrides, config_id=None). It waits for the ACK, then on tracers in _RC_APPLY_ENDPOINT_LANGS (currently {"python"}) calls flush_remote_config() to synchronously drain pending RC. Tracers outside the allowlist get the ACK-only path with no behaviour change. Stacked on #6954 — base branch is brian.marks/rc-applied-endpoint-python-pilot.

File Change
tests/parametric/test_dynamic_configuration.py Replace set_and_wait_rc (ACK-only) with the unified helper. Rename set_and_wait_rc_applied -> set_and_wait_rc. Migrate all 19 call sites to pass test_library as the second positional argument. Remove the retry loop in test_remote_sampling_rules_retention that the helper now makes redundant.
tests/parametric/test_remote_config_apply_endpoint.py Update the contract test to call the renamed helper (test_set_and_wait_rc_applies_synchronously).
utils/docker_fixtures/_test_clients/_test_client_parametric.py Update flush_remote_config docstring — most tests no longer need to call it directly.
docs/parametric/remote-config-apply-contract.md Reflect the new single-helper API and adoption story.
manifests/python.yml Remove three flaky-test gates that the deterministic drain makes unnecessary: test_remote_sampling_rules_retention, test_trace_sampling_rules_override_env, test_trace_sampling_rules_override_rate.

Design choice: rename rather than rename-and-replace

I picked Option X from the brief: take over the set_and_wait_rc name with the new signature, rather than keeping set_and_wait_rc_applied as the long-term name. Two reasons. First, once there is a single helper, the _applied suffix carries no information — set_and_wait_rc should be the deterministic one. Second, the helper already has a transparent fallback for non-Python tracers, so existing call sites (which all already had test_library in scope) keep the same call name with one extra argument. The git blame churn is one-line-per-call-site and the rename is mechanical; net readability wins. The alternative was deleting the old name and keeping _applied everywhere, which costs the same churn while leaving a worse name on the surface forever.

Supersedes #6951

#6951 added retry loops in test_trace_sampling_rules_override_env and test_trace_sampling_rules_override_rate around post-ACK assertions on _dd.rule_psr / _dd.p.dm, plus the same three manifest unflaggings this PR includes. Those retry loops were a stopgap for the race the unified helper now eliminates deterministically. The retry loops in this PR's converted tests are gone; the manifest changes are the same. #6951 should be closed after this lands. The probabilistic retry inside get_sampled_trace() stays as-is — it handles sampling-by-chance, not the RC race.

Local verification

I ran both files five times each against locally installed ddtrace from PyPI (prepare-local-build.sh python --method pip).

Suite Iterations Result per iteration
tests/parametric/test_remote_config_apply_endpoint.py 5 5 passed
tests/parametric/test_dynamic_configuration.py 5 15 passed, 17 skipped, 3 xfailed

All three previously-flagged tests (test_remote_sampling_rules_retention, test_trace_sampling_rules_override_env, test_trace_sampling_rules_override_rate) ran in the unflagged set and passed every iteration.

Workflow

  1. ⚠️ Created as draft.
  2. Working on CI to pass.
  3. Tests-only change touching tests/, manifests/, docs/, and one wrapper file in utils/docker_fixtures/. RFC owner review may be appropriate given the parametric harness contract.

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • Anything but tests/ or manifests/ is modified ? Yes — docs/parametric/remote-config-apply-contract.md and the APMLibrary wrapper docstring in utils/docker_fixtures/. The wrapper change is docstring-only. R&P approval if required for the doc-and-docstring change.
  • A docker base image is modified? No.
  • A scenario is added, removed or renamed? No.

…tic by default)

Renames set_and_wait_rc_applied -> set_and_wait_rc and deletes the prior
ACK-only helper. The remaining helper waits for the test-agent ACK and then
synchronously drains pending RC on tracers in _RC_APPLY_ENDPOINT_LANGS, so
every RC-driven test gets deterministic behavior by default. Tracers outside
the allowlist fall back to the ACK-only path transparently.

All 19 call sites in tests/parametric/test_dynamic_configuration.py and the
contract test in test_remote_config_apply_endpoint.py are migrated to pass
test_library as the second positional argument. The APMLibrary wrapper and
docs/parametric/remote-config-apply-contract.md are updated to match.

Removes three flaky-test gates from manifests/python.yml that the previous
ACK-only race motivated. The retry loops in test_trace_sampling_rules_override_env,
test_trace_sampling_rules_override_rate, and test_remote_sampling_rules_retention
go away with this change; the probabilistic retry inside get_sampled_trace
stays because it handles sampling-by-chance, not the RC race.

Supersedes #6951.
@bm1549 bm1549 added the ai-generated The pull request includes a significant amount of AI-generated code label May 15, 2026
@github-actions
Copy link
Copy Markdown
Contributor

CODEOWNERS have been resolved as:

docs/parametric/remote-config-apply-contract.md                         @DataDog/system-tests-core
manifests/python.yml                                                    @DataDog/apm-python @DataDog/asm-python
tests/parametric/test_dynamic_configuration.py                          @DataDog/system-tests-core @DataDog/apm-sdk-capabilities
tests/parametric/test_remote_config_apply_endpoint.py                   @DataDog/system-tests-core @DataDog/apm-sdk-capabilities
utils/docker_fixtures/_test_clients/_test_client_parametric.py          @DataDog/system-tests-core

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-generated The pull request includes a significant amount of AI-generated code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant