Skip to content

feat: runtime reversibility check for constrained leapfrog integrator (RATTLE)#89

Draft
MaartenMarsman wants to merge 3 commits intomainfrom
feature/reversibility-check
Draft

feat: runtime reversibility check for constrained leapfrog integrator (RATTLE)#89
MaartenMarsman wants to merge 3 commits intomainfrom
feature/reversibility-check

Conversation

@MaartenMarsman
Copy link
Copy Markdown
Collaborator

Description

Add a forward-backward reversibility check to the constrained leapfrog
integrator (RATTLE/SHAKE). Each constrained leapfrog step runs a reverse
half-step and compares the result to the original position. Steps that fail
the round-trip within a tolerance of 0.5 * eps^2 are flagged as
non-reversible. Non-reversible steps terminate the NUTS tree and reject the
proposal, matching the existing divergence handling.

Problem / Motivation

The RATTLE/SHAKE constrained integrator projects positions and momenta onto
the constraint manifold (positive-definite Cholesky factor with structural
zeros for absent edges). These projections are iterative and can accumulate
numerical error, especially when edge indicators change between iterations.
Unlike unconstrained leapfrog, there is no guarantee that the constrained
integrator is time-reversible to machine precision. Without a runtime check,
non-reversible trajectories silently corrupt the detailed-balance condition
of the Markov chain.

Diagnostic analysis on the Wenchuan dataset (p=17, 136 possible edges)
showed that non-reversible steps co-occur with divergent trajectories and
are not caused by intrinsic RATTLE/SHAKE projection failures. The
dimensionless coupling constant C = max_rev_diff / eps^2 has a typical
range of 0.001-0.14 (median 0.002) for non-divergent steps, well below
the 0.5 threshold.

Proposed Changes / New Functionality

C++ sampling layer:

  • leapfrog_constrained_checked() in leapfrog.h/.cpp: runs forward step,
    then reverse half-step, compares positions. Returns a reversible flag
    alongside the usual position/momentum.
  • nuts_step() and hmc_step() in nuts.h/.cpp and hmc.h/.cpp: accept
    reverse_check and reverse_check_tol parameters. Non-reversible steps
    set s_prime = 0, n_prime = 0 (tree terminates, proposal rejected).
  • StepResult and ChainResult extended with non_reversible field.
  • SamplerConfig stores reverse_check = true and
    reverse_check_tol = 0.5 as internal defaults (not exposed to R).
  • Phase-aware enforcement in sampler_base.h: the check observes during
    warmup and enforces (rejects) during sampling.

R output layer:

  • non_reversible counts stored per iteration in chain_result.
  • Surfaced through nuts_diagnostics as nuts_diag$non_reversible and
    nuts_diag$summary$total_non_reversible.

The reversibility check is always on. There is no user-facing parameter to
disable it.

Type of Change

  • New feature
  • Performance optimization

The constrained leapfrog now runs a forward + backward step, approximately
doubling the leapfrog cost. On the Wenchuan dataset (p=17), elapsed time
increases from ~3s to ~6s compared to main. This is the expected cost of
the round-trip check.

Documentation and Release Notes

  • Updated function documentation for any signature or behaviour changes
  • Regenerated man/*.Rd files if roxygen comments changed
  • Added or updated NEWS.md entry if the change affects users

Testing and Validation

14 tests in test-reversibility-check.R:

Unit-level (C++ test interface):

  • Well-behaved steps pass the check (non_reversible_count = 0)
  • Checked leapfrog matches unchecked positions to 1e-12
  • Impossibly tight tolerance (1e-11) triggers non-reversible detections
  • Large step sizes (eps=0.5) with near-machine-epsilon tolerance catches
    round-trip drift
  • Long trajectory (200 constrained leapfrog steps) stays reversible at
    normal tolerance
  • Rank-deficient regime (n=5, p=8) stays reversible
  • Ill-conditioned data (condition number > 1e3) stays reversible

Integration-level (bgm()):

  • non_reversible diagnostic field present in bgm() output

  • High-dimensional GGM (p=20, edge selection) completes without crash

  • Added or updated tests in tests/testthat/ when needed

  • Ran the project styler

  • Ran lintr::lint_package()

  • Ran roxygen2::roxygenise() if roxygen comments changed

  • Ran rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"))

  • Verified numerical behaviour where relevant

R CMD check: 0 errors, 0 warnings, 1 note (pre-existing CODE_OF_CONDUCT.md).

Additional Notes

  • The bgmCompare sampler (bgmCompare_sampler.cpp) still uses the old
    nuts_step/hmc_step signatures. It compiles via C++ default parameters
    but does not pass reverse_check explicitly. This is fine for now since
    bgmCompare does not use constrained integration, but should be updated
    when that code is migrated to the unified runner.
  • The ggm_test_leapfrog_constrained_checked C++ test interface exposes
    reverse_check_factor for unit testing with arbitrary tolerances. This is
    not part of the public R API.
  • Performance: the 2x cost is inherent to the forward-backward round-trip.
    A future optimization could skip the reverse check for unconstrained
    leapfrog steps (where reversibility is guaranteed by the symplectic
    integrator), but this is not implemented since unconstrained steps are
    already fast.

Phase-aware reverse check: observe during warmup, enforce during sampling.
Forward+backward leapfrog in leapfrog_constrained_checked detects
non-reversible RATTLE/SHAKE projections. Non-reversible steps terminate
the NUTS tree and reject the proposal (same as divergence handling).

Internal tolerance fixed at 0.5 * eps^2 (not exposed in R API).
Non-reversible counts stored per iteration and surfaced through
nuts_diagnostics alongside divergence counts.

R layer: reverse_check parameter in bgm(), threaded through spec,
validation, and run_sampler dispatch to C++ backends (GGM, OMRF, mixed).
C++ layer: SamplerConfig, sampler_base, nuts_sampler, hmc_sampler,
leapfrog, nuts, hmc, step_result, chain_result all extended.
Tests: 6 tests covering parameter validation, output structure, and
integration with continuous GGM edge selection.
The reversibility check is now always on (hardcoded in SamplerConfig).
Removed the parameter from bgm(), bgm_spec(), validate_sampler(),
run_sampler dispatchers, and the three C++ Rcpp::export entry points.
Updated test-validate-sampler expected return list.
Five new scenarios exercising the RATTLE reverse check:
- Large step sizes (eps=0.5) with near-machine-epsilon tolerance
- Long trajectories (200 constrained leapfrog steps)
- Rank-deficient regime (n=5, p=8)
- Ill-conditioned data (condition number > 1e3)
- High-dimensional GGM integration (p=20, edge selection)
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 97.10145% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.70%. Comparing base (c9a01b3) to head (47c3f90).

Files with missing lines Patch % Lines
src/mcmc/algorithms/hmc.cpp 70.00% 3 Missing ⚠️
src/ggm_gradient_interface.cpp 96.15% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #89      +/-   ##
==========================================
+ Coverage   88.54%   88.70%   +0.15%     
==========================================
  Files          72       72              
  Lines       12406    12518     +112     
==========================================
+ Hits        10985    11104     +119     
+ Misses       1421     1414       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant