Some updates for the impact plot naming and style by kdlong · Pull Request #2 · bendavid/WRemnants

kdlong · 2024-11-18T09:48:17Z

Some changes made for the comparison of impacts probably need to be made more general.

Updates on QCD background

Merging from Wmass/main

Merge from cippy/main

… [0.1,0.1] for c1 and c2; Fix postfit plotting

…ng and simplify grouping in function

…ials QCD background smoothing polynomials

…zedUnfolding

…WRemnants into 240503_linearizedUnfolding

Fix security issue in index.php

Merging from WMass/main

Merging from WMass

Automatic deletion of old kerberos credentials

Support alphaS fit

Merge from WMass/main

Small fixes for plots

Updates on plots

Fixes

…ar-mode

The two-stage continuity pipeline with the deterministic width smear is now the only design. Removes ~1560 lines of superseded code across the four jpsi mass-fit modules.

Legacy path (the single-stage θ-conditioned forward-fold):

* model: drop theta_conditioning, linearize_scale, detach_flow_on_data, fixed_theta_sampling, *_sample_center buffers, theta_*_cond_scale; remove nll, log_p_signal_data/mc, event_nll; the flow now conditions only on muon_kin (n_cond = N_MUON_KIN); _build_flow_cond folded into log_p_nominal.
* trainer: drop _nll_step, _nll_components, _maybe_mc_only_batch, _epoch_metrics, _adaptive_sigma, compute_fisher_info (legacy), and _train_loop_legacy; drop --stage legacy, --linearize-scale, --mc-only, --detach-flow-on-data, --fixed-theta-sampling, the θ-sampling noise/adaptive-σ args, and the legacy θ-lr args.
* diagnostics: drop the θ-conditioned _flow_density_on_grid and the legacy fold branch; the signal grid is always the #2 direct-eval tilt and the MC template is always the per-muon physical fold.

Convolution smear-mode (the stochastic GH mass convolution):

* width is now the only smear: drop smear_mode, _continuity_logp_conv, _source_rho_std, _gh_nodes, sigma_qop_pm, _smear_per_event, _softplus_inv, the softplus θ_smear init (→ 0) and effective_theta_smear softplus branch; _continuity_response returns only the advection s_adv (no diffusion V); _continuity_logp is the single deterministic width-fold density.
* trainer/diagnostics: drop --smear-mode, --continuity-n-gh, --smear-init-a/c and all n_gh plumbing.

The per-muon qop folding scheme for the validation plots is kept (fold_sigma_qop_pm reads the signed width coeffs clipped at 0).

Verified end-to-end on synthetic data: --stage flow → fit → diagnostics (all closure/θ plots) → uncertainties (observed + empirical Fisher) all run finite.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The continuity-density Jacobian was capped by a hard `log(G'.clamp_min(0.05))`, whose log is FLAT past the floor: a gradient-free NLL basin the optimiser was escaping into at large |η| (V ∝ c/sin²θ large, the linearised one-step Euler Jacobian goes negative, the floor activates, the fit drifts to large negative θ_c with no cost). unbinnedcal22/testvalidation5 showed θ_c → −6.6 at |η|>1.4 for an injection that should give θ_c=+10.

Two Jacobian forms, controlled by --jacobian-form / model.jacobian_form:

* softlog (default): log(G'.clamp_min(floor)) + (floor−G')⁺²/(2·floor²). Exact log in the physical region, plus a C¹ quadratic BARRIER past the floor (value and slope zero at the seam; grows quadratically below it). Bounded from below by log(floor) so the density can't diverge to ±∞ from a degenerate G'. Restores the gradient (slope ≈ −(floor−G')/floor² past the floor) so the optimiser is *pulled back* toward physical G' > 0 instead of parked in the floored basin. NLL scan on the same checkpoint+injection now argmin's at θ_c = +2 in the forward bin (previously: drifted to large negative).
* exp (option): frozen-score continuous-flow approximation log G' = log(1+s_adv'(m')) − V·∂²_m log p₀(m')/2, no floor. Always finite: the closed-form N→∞ Euler limit assuming the score is constant along the trajectory. Caveats: different operator approximation than softlog's exact N-step Jacobian (extra "frozen-score" assumption, which breaks near sharp peak features); rewards unphysical sharpening UNBOUNDEDLY by −log G' = +V·∂²/2 (no natural cap), so issue #2 is not as well-defended as with softlog's barrier. NLL scan still argmin's at +2 but with a much shallower minimum and θ_c=−10 still scoring NLL = −0.74 (vs softlog's +0.04 penalty there). Use only when the V-too-large breakdown (issue #1) is the dominant concern.

Adopted in _build_model, _load_full_fit, diagnostics' build. SMEAR_GP_FLOOR unchanged (legacy 0.05); now used by softlog as the barrier seam, ignored by exp.

Backward-compatible: old checkpoints without `jacobian_form` in args default to "softlog" via getattr fallback.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
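For readers following the softlog description, here is a minimal scalar sketch of the formula as quoted in the commit message. The function names `softlog_jacobian` and `softlog_slope` are hypothetical and do not come from the repository; the actual code presumably operates on batched tensors with autograd rather than on Python floats.

```python
import math

def softlog_jacobian(gp: float, floor: float = 0.05) -> float:
    """log(max(gp, floor)) plus a quadratic barrier that switches on below the floor.

    Hypothetical scalar stand-in for the softlog form described in the commit.
    """
    if floor <= 0.0:
        raise ValueError("floor must be positive")
    below = max(floor - gp, 0.0)  # (floor - G')^+, zero in the physical region
    return math.log(max(gp, floor)) + below * below / (2.0 * floor * floor)

def softlog_slope(gp: float, floor: float = 0.05) -> float:
    """Analytic derivative with respect to gp on either side of the seam."""
    if gp >= floor:
        return 1.0 / gp                     # plain log in the physical region
    return -(floor - gp) / (floor * floor)  # barrier slope, nonzero below the floor
```

Below the floor the hard clamp alone would have zero slope, whereas `softlog_slope` returns a nonzero value that grows linearly with the distance from the seam, which is the restored gradient the commit message refers to.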

… singularities

`--jacobian-form exp` crashed with `non-finite loss [fit] epoch 1` at high LR (--fit-scale-lr 0.9 --fit-smear-lr 0.9): a single event whose source mass m' strayed into the flow's tail produced an inf/NaN in ∂²log p₀, and even when that was contained, the exp form's unbounded sharpening reward (log_Gp = −V·∂² /2 with no floor) drove log_p_theta = logp0 − log_Gp huge positive on overshoot events → downstream `.exp()` in data_nll_continuity overflowed → mixture density inf → NaN loss → trainer abort.

Two guards (both no-ops in the softlog path):

1. `_log_jacobian_exp`: `nan_to_num` on both log_J_scale and log_J_smear so the (small minority of) boundary events with inf/NaN d2 contribute 0 to the Jacobian instead of poisoning the batch. The exp form exposes d2 = ∂² log p₀ directly; the softlog autograd Gp absorbs the singularity through the chain rule.
2. `_continuity_logp`: cap `log_p_theta` from ABOVE at +50 (the physical log-density on the m-window scale is well inside [−30, +5], so the cap leaves the operating regime untouched while preventing exp() overflow downstream). No lower cap: that would kill the softlog barrier's pull-back gradient; .exp() of a very-negative log_p_theta underflows cleanly to 0 and the mixture collapses to the Bernstein background. nan_to_num catches any residual NaN.

Verified: both forms now survive 7/7 batches at lr=0.9 (the user's config) without non-finite loss. The exp form still drifts θ_c → −0.6 at this lr (its unbounded sharpening reward + lr=0.9 ≫ softlog at the same lr → −1.9 with the barrier counter-force); use --jacobian-form softlog (the default) for proper #2 defence at high LR.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
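The two guards can be illustrated with a small scalar sketch. All names here (`sanitize_log_jacobian`, `cap_log_density`, `signal_density`) are hypothetical, and the handling of a residual NaN in the log-density is an assumption: the commit message only says `nan_to_num` catches it, so this sketch maps it to −inf, meaning zero signal density.

```python
import math

LOG_P_CAP = 50.0  # upper cap quoted in the commit message

def sanitize_log_jacobian(log_j: float) -> float:
    """Guard 1: a non-finite log-Jacobian contributes 0 instead of poisoning the sum."""
    return log_j if math.isfinite(log_j) else 0.0

def cap_log_density(log_p: float, cap: float = LOG_P_CAP) -> float:
    """Guard 2: cap from above only, so very negative values can still underflow."""
    if math.isnan(log_p):
        return -math.inf  # ASSUMPTION: residual NaN is treated as zero density
    return min(log_p, cap)

def signal_density(log_p0: float, log_j: float) -> float:
    """exp(log p0 - log J) with both guards applied before exponentiating."""
    log_p_theta = log_p0 - sanitize_log_jacobian(log_j)
    return math.exp(cap_log_density(log_p_theta))
```

With the cap in place the largest value passed to `exp` is 50, far below the overflow threshold of 32-bit or 64-bit floats, while a very negative `log_p_theta` still underflows to exactly 0 as the message describes.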

…ty on (a, c)

Add a positivity reparameterisation for θ_smear, selectable via `--smear-param-form` / `model.smear_param_form`:

* `linear` (default): the O(1) θ_smear is the coefficient directly: signed, supports both broadening (V>0) and unsmear (V<0). Backward-compatible.
* `softplus`: each of (a, c) INDIVIDUALLY constrained to ≥ 0 via `physical = softplus(θ)·SMEAR_VAR_SCALE`. Use this to defend against the negative-c drift (issue #2) by construction, at the cost of losing the two-sided fit (the model can no longer represent MC that's too broad vs data).

The per-η fit mask is applied AFTER the transform in the single accessor `_smear_raw_to_effective`, so frozen parameters (smear_fit_params='a'/'c', or smearing_enabled=False) and inactive coefficients evaluate to EXACTLY zero in the per-muon σ_qop and all downstream transformations (the continuity density forward map, _continuity_logZ via _smear_per_event_linear, the fold via _qop_var_pm, effective_theta_smear for plots/Fisher).

Verified: under softplus + smear_fit_params='c', the frozen 'a' column is bit-exact 0 regardless of the raw θ_smear[:,0] value; the active 'c' column stays ≥ 0 across all η bins through training.

Fisher σ on the smear: the σ_eff conversion now applies the delta-method Jacobian: identity for `linear` (unchanged), sigmoid(θ̂) for `softplus` (matches `d softplus/dθ`). The param_space label records the actual form.

Wired through _build_model, _load_full_fit's adopt list, and the diagnostics build with safe `getattr(..., "linear")` fallbacks so old checkpoints default to the original (linear) form.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
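A scalar sketch of the transform-then-mask order and of the delta-method factor may help when reading this message. The names `smear_raw_to_effective` and `delta_method_factor` below are illustrative stand-ins, not the repository's accessors; the `scale` argument stands in for SMEAR_VAR_SCALE, whose value is not given here, and applying it in the `linear` branch as well is an assumption.

```python
import math

def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x: float) -> float:
    """Derivative of softplus, written to avoid overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def smear_raw_to_effective(theta: float, active: bool,
                           form: str = "linear", scale: float = 1.0) -> float:
    """Transform first, then mask, so a frozen coefficient is exactly 0."""
    if form == "softplus":
        value = softplus(theta) * scale  # constrained to be non-negative
    elif form == "linear":
        value = theta * scale            # signed (ASSUMPTION: same scale applies)
    else:
        raise ValueError(f"unknown form: {form}")
    return value if active else 0.0      # mask applied AFTER the transform

def delta_method_factor(theta_hat: float, form: str = "linear") -> float:
    """Factor multiplying the raw Fisher sigma to get sigma on the transformed value."""
    return sigmoid(theta_hat) if form == "softplus" else 1.0
```

Masking after the transform matters because softplus(θ) is strictly positive for every finite θ: masking the raw parameter instead would leave a frozen coefficient at softplus(0)·scale rather than zero.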

…osure mode

Two related fixes for the forward-|η| smear-collapse / bkg-absorption pathology the validation closures were showing:

1) Post-injection window cut (loader). The injection adds the per-muon qop Gaussian kick to m_ll; for a c≈5e-5 smear ~3% of |η|>1.8 pseudo-data events were pushed outside [m_lo, m_hi]. The loader was passing those through, and downstream the Bernstein-d1 basis evaluated to NEGATIVE values (linear u-extrapolation), the mixture `f₀p₀ + f₁p₁ + f_s p_s` could go negative, `log(p_mix.clamp_min(eps))` floored at ~+log(1/eps) ≈ +69 per event, AND the MLP had a strong incentive to grow f_bkg so the Bernstein extrapolation (large positive on the "other side") would absorb them at finite NLL, directly feeding the f_bkg=38% at |η|>1.8 problem. Fix: in `_batch_tensors`, zero the per-event weight for events outside [stats.m_lo, stats.m_hi] after the injection is applied. For real data the snapshot already cut to the window, so this is a no-op; for injected MC it drops ~0.06% central / ~3% forward: small statistical loss, removes a real systematic. Verified: 415/415 out-of-window pseudo-data events from a c=5e-5 injection now have weight 0 (323 of them in |η|∈[1.8,2.4]).

2) --no-background (model + trainer). Even with the window cut, the in-window tails of the broadened signal still degenerate with the Bernstein bkg at forward |η|: the model can broaden (positive c, pays −log G' per event) OR let the MLP grow f_bkg (no Jacobian cost). The optimiser prefers the bkg "free lunch" and the smear collapses to 0. For validation closures the truth is f_bkg=0 by construction (MC pseudo-data is signal-only), so there's no physical degree of freedom for the MLP to represent. Add `background_enabled: bool = True` to the model; when False, `data_nll_continuity` reduces to `−log p_signal` (MLP and Bernstein both bypassed), and `train_stage2` / `run_bootstrap_continuity` skip the MLP parameter group and freeze its grads. CLI flag `--no-background` (default OFF so real-data fits are unchanged). Diagnostics' f_data overlay reads the model flag and forces f ≡ [0, 0, 1] when disabled instead of showing the random-init MLP output. Verified: with bkg disabled, scrambling the MLP parameters to 1e6 still gives a finite per-event NLL: the data branch is genuinely MLP-independent.

The two effects compound and showed up together (cal22/testvalidation7: 38% f_bkg + θ_c→0 at |η|>1.8); they're also independent: #1 fixes the injection-induced contribution and is correct for any injection magnitude; #2 fixes the underlying degeneracy in the validation setup. Use both together for validation closures of large smearings.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
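The post-injection window cut amounts to zeroing weights rather than dropping events, which keeps batch shapes unchanged. A minimal sketch on plain Python lists follows; the function name `apply_window_cut` and the example window are hypothetical, and the real loader presumably does this on tensors inside `_batch_tensors`.

```python
from typing import List, Sequence

def apply_window_cut(masses: Sequence[float], weights: Sequence[float],
                     m_lo: float, m_hi: float) -> List[float]:
    """Return weights with out-of-window events zeroed.

    A NaN mass fails both comparisons, so it is also given weight 0.
    """
    if len(masses) != len(weights):
        raise ValueError("masses and weights must have the same length")
    return [w if m_lo <= m <= m_hi else 0.0 for m, w in zip(masses, weights)]

# Illustrative values only: the window edges here are made up for the example.
masses = [2.85, 3.05, 3.10, 3.35]
weights = [1.0, 1.0, 0.5, 1.0]
cut = apply_window_cut(masses, weights, m_lo=2.9, m_hi=3.3)
# cut == [0.0, 1.0, 0.5, 0.0]
```

For inputs that were already restricted to the window, the function returns the weights unchanged, matching the no-op behaviour on real data described above.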

bendavid and others added 30 commits August 19, 2024 15:39

Merge pull request #523 from davidwalter2/240724_fakes

fe51c5b

Updates on QCD background

Merge branch 'main' of github.com:WMass/WRemnants

2737862

latest changes in theory agnostic after review

372cd73

Implement arguments to set fake smoothing polynomials

727e624

Implement Chebyshev polynomials

aa5f935

Small typo fix

10d0b90

Dump narf

809252d

Fix output file naming of postfit plotting script and add some option

944d023

Merge branch 'main' of github.com:WMass/WRemnants

f2ef475

Merging from Wmass/main

Merge branch 'main' of github.com:cippy/WRemnants

93ed0cc

Merge from cippy/main

Switch to chebyshev polynomial for smoothing and parameter variations…

b293c7b

… [0.1,0.1] for c1 and c2; Fix postfit plotting

Change to Chebyshev polynomials also for plotting script

5148309

Try to allow scientific notation in plotting

322ab95

Enable flow when swapping bins

ebc3457

Fix plotting script; add option to filter processes in postfit plotti…

5e0b0c7

…ng and simplify grouping in function

Add new plotting scripts for validation of fake estimation

fb3e487

Update nuisance parameter grouping; small plotting script fixes

b45675b

Fix bug in mass decorrelated fits where rebinning was bugged

76036ad

Small fix on plot_decorr_params script

264293d

Merge pull request #527 from davidwalter2/240820_fakeSmoothingPolynom…

84be64a

…ials QCD background smoothing polynomials

Merge branch 'main' of github.com:WMass/WRemnants into 240503_lineari…

812d96a

…zedUnfolding

Merge branch '240503_linearizedUnfolding' of github.com:davidwalter2/…

ae6efeb

…WRemnants into 240503_linearizedUnfolding

Fix security issue in index.php

8cbf695

Merge pull request #529 from davidwalter2/240823_updIndexPHP

92055c5

Fix security issue in index.php

Support to plot postfit noi variations; support for scientific notation

4264245

Small fix in styles

5dee67c

Dump narf

05236cb

some updates for plots and QCD studies

76d528f

Merge branch 'main' of github.com:WMass/WRemnants

76125a2

Merging from WMass/main

add option to read pseudodata from fit input hdf5 file

4e2b4ac

cippy and others added 24 commits November 5, 2024 16:18

Merge branch 'main' of github.com:WMass/WRemnants

1d204ce

Merging from WMass

Merge pull request #557 from davidwalter2/main

7c68f3d

Automatic deletion of old kerberos credentials

Merge pull request #556 from davidwalter2/241101_alphaS

f2fc642

Support alphaS fit

Steps towards double ratio, fix MIT html

72da6d3

Make legend padding configurable, some style improvements

b7c7066

Working version of double ratios

b8e1093

fix for plotting scripts

023f512

Merge branch 'main' of github.com:WMass/WRemnants into fixPR

6ef749d

Merge from WMass/main

remove test change

2c04f5c

Merge branch 'genPtllPlot' of https://github.com/kdlong/WRemnants

3936719

fix

8fe3d1e

another fix

e83a580

test

812d208

Fix event number tables in log files

f32618a

fix

0078fe1

Polish paper plots

be4d3be

Merge pull request #559 from cippy/fixPR

189aa48

Small fixes for plots

Merge branch 'main' of github.com:WMass/WRemnants

fac76a0

Polish pdf summary plot and others; fix parsing

2690829

Polish mass decorr plots

e2c6193

Further refinements on plotting scripts; Implement marcos comments

eadbe9e

Merge pull request #560 from davidwalter2/main

75d64a0

Updates on plots

Merge remote-tracking branch 'upstream/main'

7dcb223

Some parameter renaming and changes for the impact plots

55262f4

bendavid pushed a commit that referenced this pull request Feb 3, 2026

Merge pull request #2 from davidwalter2/250716_gen

69e2606

Fixes


kdlong wants to merge 4122 commits into bendavid:main from kdlong:impactPlots

kdlong commented Nov 18, 2024


5 participants
