
Add steady state detection using modified CUSUM algorithm (#71)#532

Open
chraibi wants to merge 3 commits into PedestrianDynamics:main from chraibi:steady_state

Conversation

@chraibi
Collaborator

chraibi commented Apr 9, 2026

Implement the method from Liao et al. (2016, Physica A 461) to detect steady state intervals in density/speed time series.
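For reviewers unfamiliar with CUSUM, here is a minimal, self-contained sketch of the general idea on synthetic data. This is a plain two-sided CUSUM with ad-hoc slack `k` and threshold `h`, not the paper's modified variant with its AR-calibrated threshold; the series and all constants are made up for illustration only.

```python
import numpy as np

# synthetic series: a transient ramp followed by a steady plateau,
# with a small deterministic oscillation standing in for fluctuations
t = np.arange(300)
series = np.minimum(t / 100.0, 1.0) + 0.05 * np.sin(0.7 * t)

# reference window assumed to lie inside the steady part
ref = series[150:250]
mu, sigma = ref.mean(), ref.std()
z = (series - mu) / sigma  # standardized deviations from the reference

# two-sided CUSUM: alarm whenever either statistic crosses the threshold
k, h = 0.5, 5.0  # slack and threshold, chosen ad hoc for this sketch
s_hi = s_lo = 0.0
alarm = np.zeros(t.size, dtype=bool)
for i, zi in enumerate(z):
    s_hi = max(0.0, s_hi + zi - k)   # drift above the reference level
    s_lo = max(0.0, s_lo - zi - k)   # drift below the reference level
    if s_hi > h or s_lo > h:
        alarm[i] = True
        s_hi = s_lo = 0.0            # restart after each alarm

# everything after the last alarm is treated as steady
steady_start = int(np.flatnonzero(alarm).max()) + 1 if alarm.any() else 0
```

The ramp triggers alarms early on, while the plateau (where the standardized fluctuations stay within the threshold) is left alarm-free, so the detected steady interval starts shortly after the transient ends.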

chraibi added 2 commits April 9, 2026 14:39
…Dynamics#71)

Implement the method from Liao et al. (2016, Physica A 461) to detect
steady state intervals in density/speed time series. The detection
threshold is calibrated via an autoregressive model solved with the
block-tridiagonal Thomas Algorithm.

- Add pedpy/methods/steady_state_detector.py with detect_steady_state()
  and combine_steady_states()
- Add 21 unit tests including validation against reference data
- Add user guide section with usage examples and visualisation
- Add API reference entry
- Add reference dataset (experiment AO, b=240cm)
@chraibi chraibi marked this pull request as ready for review April 9, 2026 18:45
Copilot AI review requested due to automatic review settings April 9, 2026 18:45

Copilot AI left a comment


Pull request overview

Adds steady-state interval detection for density/speed time series using the modified CUSUM approach from Liao et al. (2016), and exposes it as a public PedPy API with accompanying docs and examples.

Changes:

  • Introduces detect_steady_state() (CUSUM + AR(1)-calibrated threshold) and utilities to combine intervals across series.
  • Exports the new API from pedpy and documents it in Sphinx + user guide notebook.
  • Adds unit/integration tests and ships a demo/reference dataset.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Summary per file:
  • pedpy/methods/steady_state_detector.py: New implementation of modified CUSUM steady-state detection and interval combination helpers.
  • tests/unit_tests/methods/test_steady_state_detector.py: New unit/integration tests and attempted regression validation against legacy reference outputs.
  • pedpy/__init__.py: Re-exports the steady-state detection API (detect_steady_state, combine_steady_states, SteadyStateResult).
  • docs/source/api/methods.rst: Adds API docs entry for the new steady-state detection module.
  • notebooks/user_guide.ipynb: Adds tutorial section demonstrating steady-state detection and visualization.
  • notebooks/demo-data/bottleneck/rho_v_Voronoi_traj_AO_b240.txt_id_1.dat: Adds demo dataset used by the user guide (and suitable for tests).


Comment on lines +101 to +105
ref_acf = float(pearsonr(ref_series[:-1], ref_series[1:])[0])

if ref_std == 0:
    raise ValueError("Reference series has zero standard deviation.")


Copilot AI Apr 9, 2026


pearsonr() is called before checking ref_std == 0. For a constant (or near-constant) reference series, SciPy emits a ConstantInputWarning and returns nan for the correlation before you raise the intended error. Compute/validate ref_std first (and consider guarding against nan ACF) before calling pearsonr to avoid noisy warnings and downstream nan behavior.

Suggested change
-ref_acf = float(pearsonr(ref_series[:-1], ref_series[1:])[0])
-
-if ref_std == 0:
-    raise ValueError("Reference series has zero standard deviation.")
+if ref_std == 0:
+    raise ValueError("Reference series has zero standard deviation.")
+
+ref_acf = float(pearsonr(ref_series[:-1], ref_series[1:])[0])
+if not np.isfinite(ref_acf):
+    raise ValueError("Reference series has invalid lag-1 autocorrelation.")

Comment on lines +87 to +92
if len(frames) != len(values):
    raise ValueError("frames and values must have the same length.")

if ref_start > ref_end:
    raise ValueError(f"ref_start ({ref_start}) must be <= ref_end ({ref_end}).")


Copilot AI Apr 9, 2026


This module raises built-in ValueError for input validation, while the rest of the codebase typically raises pedpy.errors.InputError / PedPyValueError for user-facing API validation failures (e.g. pedpy/methods/profile_calculator.py:362). Consider switching these to PedPy’s custom exception types for consistent error handling by library consumers.

Comment on lines +298 to +307
# build block matrices b1 (outside [-q,q]) and b2 (inside [-q,q])
b1, b2 = _build_block_matrices(xi, ia, ib, kp1, c, d)
eye = np.eye(kp1)

# solve block-tridiagonal system via Thomas algorithm
td = _solve_thomas(b1, b2, eye, kp1, n_blocks, dnorm)

# assemble full solution vector and normalise
tms = np.vstack(td)
tms = tms / (d * np.sum(tms))

Copilot AI Apr 9, 2026


_compute_theta() performs multiple 501×501 dense linear solves inside a loop over s_max blocks. Calling detect_steady_state() repeatedly (e.g., for many series) will be expensive. Consider caching theta by (acf, alpha, gamma, s_max, grid_size, grid_half_width) (or precomputing a lookup table) so repeated runs with similar parameters don’t redo this heavy calibration step.
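A minimal sketch of that caching idea using functools.lru_cache. The _compute_theta_cached name and its placeholder body are hypothetical stand-ins; the real _compute_theta signature may differ.

```python
from functools import lru_cache

calls = {"n": 0}  # counter to demonstrate that repeated calls hit the cache

@lru_cache(maxsize=32)
def _compute_theta_cached(
    acf: float,
    alpha: float,
    gamma: float,
    s_max: int,
    grid_size: int,
    grid_half_width: float,
) -> float:
    # stand-in for the expensive block-tridiagonal calibration;
    # the real _compute_theta would be invoked here instead
    calls["n"] += 1
    return acf * alpha * gamma  # placeholder value

# repeated runs with identical parameters reuse the cached result
a = _compute_theta_cached(0.9, 0.99, 0.99, 100, 500, 3.2)
b = _compute_theta_cached(0.9, 0.99, 0.99, 100, 500, 3.2)
```

One caveat: lru_cache keys on exact float equality, so near-identical ACF values from different series would still recompute; rounding acf (e.g. round(acf, 6)) before the call trades a little exactness for a much better hit rate.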

Comment on lines +289 to +299
data_path = os.path.join(
    os.path.dirname(__file__),
    "..",
    "..",
    "..",
    "materials",
    "SteadyState",
    "rho_v_Voronoi_traj_AO_b240.txt_id_1.dat",
)
if not os.path.exists(data_path):
    pytest.skip("Reference data not available")

Copilot AI Apr 9, 2026


The reference-data fixture points to materials/SteadyState/..., but this repository doesn’t have a materials/ directory (so these tests will always skip). Update the fixture to load the dataset that’s actually included in this PR (e.g. under notebooks/demo-data/bottleneck/) or add the expected materials/SteadyState path to the repo so the validation tests run in CI.

Suggested change
-data_path = os.path.join(
-    os.path.dirname(__file__),
-    "..",
-    "..",
-    "..",
-    "materials",
-    "SteadyState",
-    "rho_v_Voronoi_traj_AO_b240.txt_id_1.dat",
-)
-if not os.path.exists(data_path):
-    pytest.skip("Reference data not available")
+base_dir = os.path.abspath(
+    os.path.join(os.path.dirname(__file__), "..", "..", "..")
+)
+file_name = "rho_v_Voronoi_traj_AO_b240.txt_id_1.dat"
+candidate_paths = [
+    os.path.join(base_dir, "materials", "SteadyState", file_name),
+    os.path.join(base_dir, "notebooks", "demo-data", "bottleneck", file_name),
+]
+data_path = next(
+    (path for path in candidate_paths if os.path.exists(path)), None
+)
+if data_path is None:
+    pytest.skip(
+        "Reference data not available. Checked: " + ", ".join(candidate_paths)
+    )

Comment on lines +106 to +107
assert len(starts) == 1
# the interval should be adjusted by reaction times

Copilot AI Apr 9, 2026


test_single_steady_interval sets up inputs and calls _find_steady_intervals(), but it never asserts anything about starts/ends (the test currently passes even if the function is broken). Add assertions for the expected start/end values (including the reaction-time adjustment) to make this test meaningful.

Suggested change
-assert len(starts) == 1
-# the interval should be adjusted by reaction times
+assert len(starts) == 1
+assert len(ends) == 1
+# the interval should be adjusted by reaction times
+np.testing.assert_array_equal(starts, np.array([0.0]))
+np.testing.assert_array_equal(ends, np.array([199.0]))

Comment on lines +159 to +179
# collect all intervals from all results
all_intervals = []
for r in results:
    for s, e in zip(r.frame_start, r.frame_end, strict=True):
        all_intervals.append((s, e))

if not all_intervals:
    return []

if len(results) == 1:
    return all_intervals

# find pairwise overlaps across all results
# start with intervals from the first result, intersect with each
# subsequent result
current = list(zip(results[0].frame_start, results[0].frame_end, strict=True))
for r in results[1:]:
    next_intervals = list(zip(r.frame_start, r.frame_end, strict=True))
    current = _intersect_interval_lists(current, next_intervals)

return current

Copilot AI Apr 9, 2026


combine_steady_states() builds all_intervals for every result even when len(results) > 1, but then discards it and recomputes intervals from results[0] for the intersection loop. This adds unnecessary work and makes the control flow harder to follow; consider only building the intervals list in the len(results) == 1 branch (or reuse all_intervals appropriately).
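For illustration, one possible shape for that simplification, sketched on plain (start, end) tuples rather than the real SteadyStateResult objects; the function names here are hypothetical, not the PR's actual helpers.

```python
from functools import reduce

def intersect_interval_lists(a, b):
    """Pairwise intersection of two lists of closed (start, end) intervals."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            s, e = max(s1, s2), min(e1, e2)
            if s <= e:
                out.append((s, e))
    return out

def combine(interval_lists):
    """Intersect the interval lists of all results."""
    if not interval_lists:
        return []
    # reduce() already covers the single-list case, so no separate
    # len == 1 branch (and no throwaway all_intervals list) is needed
    return reduce(intersect_interval_lists, interval_lists)

overlap = combine([[(0, 100), (150, 300)], [(50, 200)]])
```

Here `overlap` comes out as the frames where both input results report a steady state, namely [(50, 100), (150, 200)].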

@codecov

codecov bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 98.19277% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.09%. Comparing base (304d029) to head (fe64ff7).

Files with missing lines | Patch % | Lines
pedpy/methods/steady_state_detector.py | 98.18% | 3 Missing ⚠️


Collaborator

@schroedtert schroedtert left a comment


Took a first look, focusing on the API of the steady state detection. From my point of view, some fundamental changes are needed to integrate the feature better into the PedPy environment. These changes will not really affect the underlying parts, if I see it correctly. I will have a more detailed look at the math part later.

Collaborator

Why is this file needed? Wouldn't it make sense to just compute these values directly with PedPy in the notebook?

Comment on lines +4550 to +4581
"#### Load density/speed time series\n",
"\n",
"We use Voronoi density and speed data from a bottleneck experiment (experiment AO, b=240cm).\n",
"The data file has three columns: frame, density, and speed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"data = np.loadtxt(\n",
" \"demo-data/bottleneck/rho_v_Voronoi_traj_AO_b240.txt_id_1.dat\",\n",
" usecols=[0, 1, 2],\n",
")\n",
"# Remove rows where density is zero (no measurement)\n",
"data = data[data[:, 1] != 0]\n",
"\n",
"frames = data[:, 0]\n",
"density = data[:, 1]\n",
"speed = data[:, 2]\n",
"\n",
"print(f\"Frames: {frames[0]:.0f} to {frames[-1]:.0f} ({len(frames)} observations)\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
Collaborator

I don't really get why we should read in a file with density and speed values in the user guide. From my point of view, it would be more beneficial for users to see how they can use the results of their own density and speed computations to detect steady state.

Comment on lines +44 to +55
def detect_steady_state(
    *,
    frames: np.ndarray,
    values: np.ndarray,
    ref_start: int,
    ref_end: int,
    alpha: float = 0.99,
    gamma: float = 0.99,
    s_max: int = 100,
    grid_size: int = 500,
    grid_half_width: float = 3.2,
) -> SteadyStateResult:
Collaborator

Why not use a DataFrame here as input? This seems to be the only function where numpy is used as main input and not TrajectoryData or DataFrame. I would prefer to have it similarly. Maybe something like:

Suggested change
-def detect_steady_state(
-    *,
-    frames: np.ndarray,
-    values: np.ndarray,
-    ref_start: int,
-    ref_end: int,
-    alpha: float = 0.99,
-    gamma: float = 0.99,
-    s_max: int = 100,
-    grid_size: int = 500,
-    grid_half_width: float = 3.2,
-) -> SteadyStateResult:
+def detect_steady_state(
+    *,
+    data: pd.DataFrame,
+    column: str,
+    ref_start: int,
+    ref_end: int,
+    alpha: float = 0.99,
+    gamma: float = 0.99,
+    s_max: int = 100,
+    grid_size: int = 500,
+    grid_half_width: float = 3.2,
+) -> SteadyStateResult:

So the results from compute_classic_density, compute_voronoi_density, compute_mean_speed_per_frame, compute_voronoi_speed could be directly reused.
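To illustrate the difference (the DataFrame here only mimics the shape of PedPy's per-frame results, and the proposed detect_steady_state call is shown as a comment since that signature doesn't exist yet):

```python
import pandas as pd

# hypothetical per-frame density result, shaped like the output of the
# compute_*_density functions: a "frame" column plus a value column
density = pd.DataFrame(
    {"frame": [0, 1, 2, 3, 4], "density": [1.2, 1.3, 1.25, 1.28, 1.31]}
)

# proposed DataFrame-based call (not implemented in this PR):
# result = detect_steady_state(data=density, column="density",
#                              ref_start=2, ref_end=4)

# with the current numpy-based signature, the same data must be
# unpacked into arrays by the caller first:
frames = density["frame"].to_numpy()
values = density["density"].to_numpy()
```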

Comment on lines +68 to +76
alpha: critical probability for the step function F
(default 0.99)
gamma: confidence level for the detection threshold theta
(default 0.99)
s_max: upper boundary of the CUSUM statistics (default 100)
grid_size: number of grid points K for the numerical theta
computation (default 500)
grid_half_width: half-width of the discretisation grid
(default 3.2)
Collaborator

Are these typical, known parameters of the CUSUM algorithm, and are the defaults well chosen? Without digging too much into it, they just seem arbitrary. Is it possible to display them in a figure to better explain them?

@chraibi
Collaborator Author

chraibi commented Apr 12, 2026

Thanks Tobi for the review. I'm still in the process of translating the old Python code from this paper into PedPy.
