`stages.flux.daemon_flux` module improvements (documentation, speed-up option) and pipeline run duration benchmarking by thehrh · Pull Request #930 · icecube/pisa

thehrh · 2026-04-22T15:15:08Z

On top of resolving #929, this PR fixes missing documentation and cleans up the module and demonstrates performance (improved speed when using fast_interp package) and accuracy (apparently only minor count deviations) in the notebook.

Unrelated apart from the fact that performance is also front and centre: also adds a workflow for benchmarking the execution times of the example pipelines using the public data sample. Automated benchmark result plotting on GitHub pages is done with https://github.com/benchmark-action/github-action-benchmark.

…umentation

thehrh · 2026-04-22T15:37:32Z

Failing workflows are due to #927

…owed any longer), no need to set calc_mode in setup_function, cosmetics

…> 5 statements), no need to set representation in compute_function, turn default energy grid into global variable with units [no ci]

thehrh · 2026-04-23T14:53:05Z

Timings as of now:

branch (single CPU): flux daemon_flux
- setup:    Total time (s): +0.000, n calls: 1
- compute:  Total time (s): 74.671, n calls: 100, time/call (s): mean 0.747, max. 0.784, min. 0.735
- apply:    Total time (s): +0.000, n calls: 100, time/call (s): mean +0.000, max. +0.000, min. +0.000

master (single CPU): flux daemon_flux
- setup:    Total time (s): +0.000, n calls: 1
- compute:  Total time (s): 76.168, n calls: 100, time/call (s): mean 0.762, max. 2.039, min. 0.738
- apply:    Total time (s): +0.000, n calls: 100, time/call (s): mean +0.000, max. +0.000, min. +0.000

master (3 CPUs): flux daemon_flux
- setup:    Total time (s): +0.000, n calls: 1
- compute:  Total time (s): 76.544, n calls: 100, time/call (s): mean 0.765, max. 2.095, min. 0.740
- apply:    Total time (s): +0.000, n calls: 100, time/call (s): mean +0.000, max. +0.000, min. +0.000

Let's see the how much 1) the external flux calculation by daemonflux, 2) the splining, and 3) the spline evaluation contribute (calc_mode = events using the 3y DeepCore events file). Here's a fairly typical example of the durations:

daemonflux numu flux generation duration (s): 3.8e-02
splining duration (s): 1.6e-03
daemonflux antinumu flux generation duration (s): 3.8e-02
splining duration (s): 1.5e-03
daemonflux nue flux generation duration (s): 3.9e-02
splining duration (s): 1.6e-03
daemonflux antinue flux generation duration (s): 4.0e-02
splining duration (s): 1.5e-03
PISA spline evaluation time (s, 795502 events): 6.0e-01

One can tell that evaluating the splines in event-by-event mode takes roughly 75% of the total time.

If we compare the pipeline IceCube_3y_neutrinos_daemon.cfg to IceCube_3y_neutrinos.cfg, the latter has the services flux.honda_ip + flux.barr_simple instead of flux.daemon_flux and performs its flux computations on a 200 x 200 grid.

Between these two analysis configurations, the template generation time of the one using daemonflux is ~0.5 s slower on a single CPU.

If the daemonflux pipeline employs the same 200 x 200 grid instead, the above PISA spline evaluation time drops from 0.6 s-> 0.4 s (template generation time still ~0.3 s slower than for IceCube_3y_neutrinos.cfg).

thehrh · 2026-04-29T22:26:53Z

@marialiubarska @jpyanez
Given the significant contribution of the daemon_flux service to template generation times (I presume not just for this 3-year DRAGON pipeline, timings above) and to inform possible searches for alternatives or optimisation attempts, I'm wondering which reasons guided the choice of creating a 2D interpolant with RectBivariateSpline here? For example, was speed a factor at all?

marialiubarska · 2026-04-29T22:37:17Z

Hey, I'm not sure I understood if your question is on the reason we do 2d interpolation or on the choice of sepcific interpolation method.

The reason for 2d interpolation is speed, because of how daemonflux package uses two 1d interpolations, which is not ideal for our case where each neutrino has random coszen and energy, using it directly was much slower then re-interpolating in 2d for each set of updated parameters.

In terms of the specific method, I don't remember why RectBivariateSpline was chosen, I don't think I looked into spped comparison with other interpolation methods.

thehrh · 2026-04-29T22:41:08Z

The choice of RectBivariateSpline is what I'm interested in, but the background on the 2D choice is useful to know also. I presume you can confirm that daemon_flux is a significant contributor to the time it takes to generate expected distributions during any given fit?

marialiubarska · 2026-04-29T22:58:00Z

Hmm, I'm not 100% sure, but I think it's not just daemonflux, 2d interpolation does take at least comparable amount of time to the grid evaluation. It's just that it was much faster then looping over events.

There was a third method: for N events, sort them by energy and coszen into NxN grid, then use it to get flux from daemonflux, then keep the diagonal and discard other values. This approach was a bit faster then 2d interpolation, but we didn't use it because with larger numbers of events it could lead to memory issues

marialiubarska · 2026-04-29T23:00:49Z

I remember we had a discussion with @afedynitch regarding this, maybe he also has some thoughts

thehrh · 2026-04-29T23:41:03Z

Hmm, I'm not 100% sure, but I think it's not just daemonflux, 2d interpolation does take at least comparable amount of time to the grid evaluation. It's just that it was much faster then looping over events.

There was a third method: for N events, sort them by energy and coszen into NxN grid, then use it to get flux from daemonflux, then keep the diagonal and discard other values. This approach was a bit faster then 2d interpolation, but we didn't use it because with larger numbers of events it could lead to memory issues

Based on my timings above, obtaining the fluxes from the external daemonflux dependency and creating the RectBivariateSpline objects is much less important than evaluating the latter (all in event-by-event mode), which is why I'm interested in optimising the PISA side (=interpolants created from daemonflux.Flux.flux) at this point.

The other aspect is that the total template generation time on a single core is of the order of twice the daemon_flux PISA service's compute time , so an $X\,\%$ performance gain here should result in an overall $\sim X/2\,\%$ gain. I would assume this relation to be roughly independent of the underlying event sample.

Any timings from a fit including the daemon_flux service similar to https://icecube.github.io/pisa/notebooks/IceCube_3y_oscillations_example.html#profiling (which uses honda+barr instead) would be very useful to me.

…ctBivariateSpline for interpolation

thehrh · 2026-05-08T17:27:10Z

I've now included an external numba-accelerated interpolation package as an (experimental) option. Compared to the above timings—when not explicitly suppressing parallel compilation—the spline evaluation time is reduced by a factor of approx. 4. In this case, it is smaller than the total daemonflux flux generation time (given the indicated number of 795502 MC events in the DRAGON sample):

[    INFO] daemonflux numu flux generation duration (s): 4.90e-02
[    INFO] splining duration (s): 8.17e-04
[    INFO] daemonflux antinumu flux generation duration (s): 4.53e-02
[    INFO] splining duration (s): 5.43e-04
[    INFO] daemonflux nue flux generation duration (s): 3.92e-02
[    INFO] splining duration (s): 6.13e-04
[    INFO] daemonflux antinue flux generation duration (s): 4.00e-02
[    INFO] splining duration (s): 5.20e-04
[    INFO] PISA spline evaluation time (s, 795502) events): 1.50e-01

The event distributions differ at the sub-percent level; here are typical maximal and minimal absolute fractional differences between binwise count expectations:

nue_cc, max: 0.0006977199225309526, min: 0.0006444681210882939
numu_cc, max: 7.285358006379291e-05, min: 5.952816310806611e-05
nutau_cc, max: 0.0008284691842364866, min: 0.0006873706923714039
nue_nc, max: 0.006025318891604914, min: 0.005163450066271475
numu_nc, max: 0.00019059368789305282, min: 0.00015835165290412938
nutau_nc, max: 0.0005200821727574986, min: 0.00042091459621874044
nuebar_cc, max: 0.001419649454289934, min: 0.0012558469729848204
numubar_cc, max: 0.0001523649138680785, min: 0.00013414333696720485
nutaubar_cc, max: 0.0008741137723083838, min: 0.0007975125458589856
nuebar_nc, max: 0.014431783203481844, min: 0.011095828418334359
numubar_nc, max: 0.0008163497000315399, min: 0.0007224898112474274
nutaubar_nc, max: 0.003314216747600409, min: 0.003006488879502088

I've also generated 50 random combinations of the parameters of the daemon_flux service in the public DRAGON neutrino pipeline and plotted the ratios between the event distributions resulting from the two implementations here: https://user-web.icecube.wisc.edu/~tehrhardt/map_comparisons/daemon_flux_new_regular_vs_fast_interp/fast_interp_default_egrid/.

A numu CC example is:

Others seem to look very similar. I wonder whether analyses are sensitive to the choice of implementation.

thehrh · 2026-05-20T15:09:31Z

Note: fast-interpolation timings below acquired with up to 4 allowed threads

Since I'm not aware of a specific motivation behind the choice of the ENERGY_GRID global variable, following are timings and relative Map deviations for other choices. As a reference, the minimal daemon_flux compute time with fast interpolation and the default energy grid, of 500 log-spaced values between 0.1 and 10⁵ GeV, is ~0.31 s.

There does not seem to be any significant speed up once grid size drops below 100 values (compute time ~0.21 s), where the relative deviations are still very similar to those of template produced with the default grid though. At 10 (!) grid points, all relative deviations are at the percent level.

energy grid with 200 log-spaced values

nue_cc, max: 0.0007032636910766987, min: 0.0006480499738102815
numu_cc, max: 6.76457459031931e-05, min: 5.590904064865424e-05
nutau_cc, max: 0.0008341897363580841, min: 0.0006910765444860524
nue_nc, max: 0.006078318403203845, min: 0.005193687414849116
numu_nc, max: 0.00019264806947761118, min: 0.00016047070964865805
nutau_nc, max: 0.0005089321612748992, min: 0.0003961536630941301
nuebar_cc, max: 0.0014348484781289731, min: 0.001275686138113708
numubar_cc, max: 0.00015358870153954767, min: 0.00013423628626833082
nutaubar_cc, max: 0.000886009634639455, min: 0.0007861050605572776
nuebar_nc, max: 0.014420792412245613, min: 0.01108239079226981
numubar_nc, max: 0.0008108631668439374, min: 0.0007059665774915829
nutaubar_nc, max: 0.003314310641292867, min: 0.003006599110231705

minimal daemon_flux compute duration: 0.235 s (vs. regular implementation: 0.748 s, both subject to fluctuations)
plots in https://user-web.icecube.wisc.edu/~tehrhardt/map_comparisons/daemon_flux_new_regular_vs_fast_interp/fast_interp_200_egrid/

energy grid with 100 log-spaced values

nue_cc, max: 0.0007562574961501183, min: 0.0006958975848508406
numu_cc, max: 9.408956167305278e-05, min: 7.177148553148146e-05
nutau_cc, max: 0.0008684512583732959, min: 0.0006928834822301446
nue_nc, max: 0.006124932126317551, min: 0.005278143020090229
numu_nc, max: 0.0001945088630831319, min: 0.0001625651172070575
nutau_nc, max: 0.0004409632070347742, min: 0.0003121188374449897
nuebar_cc, max: 0.0015471582456897215, min: 0.0014067864193395777
numubar_cc, max: 0.00014762517320871717, min: 0.00012948326146812488
nutaubar_cc, max: 0.0009158186987179999, min: 0.0007770057030777512
nuebar_nc, max: 0.01426574325371879, min: 0.010907300148320963
numubar_nc, max: 0.0008719907396266656, min: 0.0007393728077474975
nutaubar_nc, max: 0.00310803426122814, min: 0.0028841790153176774

minimal daemon_flux compute duration: 0.213 s (vs. regular implementation: 0.743 s)
plots in https://user-web.icecube.wisc.edu/~tehrhardt/map_comparisons/daemon_flux_new_regular_vs_fast_interp/fast_interp_100_egrid/
individual contributions to compute duration (one sample)—we'd expect the daemonflux flux generation times to decrease with decreasing no. of grid points

[    INFO] daemonflux numu flux generation duration (s): 2.94e-02
[    INFO] splining duration (s): 4.76e-04
[    INFO] daemonflux antinumu flux generation duration (s): 2.42e-02
[    INFO] splining duration (s): 6.01e-04
[    INFO] daemonflux nue flux generation duration (s): 2.64e-02
[    INFO] splining duration (s): 4.76e-04
[    INFO] daemonflux antinue flux generation duration (s): 2.74e-02
[    INFO] splining duration (s): 4.75e-04
[    INFO] PISA spline evaluation time (s, 795502) events): 1.44e-01

energy grid with 50 log-spaced values

nue_cc, max: 0.0007601242442450019, min: 0.0006979777841420998
numu_cc, max: 0.0004458494877522111, min: 0.0002604620387897745
nutau_cc, max: 0.0009039585320506854, min: 0.0006548871616288167
nue_nc, max: 0.006186760045457843, min: 0.005318751930374121
numu_nc, max: 0.0003193439661157826, min: 0.00024513407589436863
nutau_nc, max: 0.0005713281179285144, min: 0.0003753845242440459
nuebar_cc, max: 0.0014010953700068023, min: 0.0012283004896959083
numubar_cc, max: 0.00029831239895296605, min: 0.0002219255914019182
nutaubar_cc, max: 0.0008997864739751873, min: 0.0007847121073513096
nuebar_nc, max: 0.014669337019489017, min: 0.011185927609781817
numubar_nc, max: 0.0010766805047367196, min: 0.0009330475312435296
nutaubar_nc, max: 0.003351553177938384, min: 0.0030404888162098884

minimal daemon_flux compute duration: 0.203 s (vs. regular implementation: 0.744 s)
plots in https://user-web.icecube.wisc.edu/~tehrhardt/map_comparisons/daemon_flux_new_regular_vs_fast_interp/fast_interp_50_egrid/

energy grid with 25 log-spaced values

nue_cc, max: 0.0010238108720788549, min: 0.0009030754986619497
numu_cc, max: 0.000792844798613296, min: 0.0005191869667389621
nutau_cc, max: 0.001216129732052831, min: 0.0010716849205862611
nue_nc, max: 0.00599184761086515, min: 0.0051628881020905895
numu_nc, max: 0.0011878009231093092, min: 0.0009010935115135575
nutau_nc, max: 0.001156238624448378, min: 0.0007599743389719115
nuebar_cc, max: 0.0012960487235788546, min: 0.0010315298807703523
numubar_cc, max: 0.0004840622397312057, min: 0.00035036526953658917
nutaubar_cc, max: 0.0012904755428669134, min: 0.0011411679076896112
nuebar_nc, max: 0.014797319370727577, min: 0.010957170143665327
numubar_nc, max: 0.0010384473847636162, min: 0.0008936761776663445
nutaubar_nc, max: 0.0030963027675952205, min: 0.002820337667783901

minimal daemon_flux compute duration: 0.224 s (vs. regular implementation: 0.790 s)

energy grid with 10 (!) log-spaced values

nue_cc, max: 0.0308539886398366, min: 0.017277063903420285
numu_cc, max: 0.008770646400080415, min: 0.006050360347252464
nutau_cc, max: 0.010577352663165505, min: 0.005677879412975053
nue_nc, max: 0.02454013389333761, min: 0.014977111866241016
numu_nc, max: 0.009896584662672882, min: 0.005194588511014799
nutau_nc, max: 0.014032345142411268, min: 0.00775040902142593
nuebar_cc, max: 0.032752431356523906, min: 0.020514113485681613
numubar_cc, max: 0.01373526114756977, min: 0.008080009263064921
nutaubar_cc, max: 0.01709734102760799, min: 0.01072478751402709
nuebar_nc, max: 0.03377852929136632, min: 0.02546872585532833
numubar_nc, max: 0.010188422884134476, min: 0.005343213240449076
nutaubar_nc, max: 0.01932538860051647, min: 0.013034873837136392

minimal daemon_flux compute duration: 0.210 s (vs. regular implementation: 0.771 s)

…organise TODOs at top of module

…x in module [no ci]

…arget + minor mods

…no ci]

…line execution times [no ci]

daemon_flux module: remove numba jit decorators, clean up and add doc…

42a6524

…umentation

thehrh added this to the 4.3 milestone Apr 22, 2026

thehrh added usability documentation convention labels Apr 22, 2026

thehrh added 2 commits April 22, 2026 21:45

daemon_flux service: define supported reps in init (no apply_mode all…

fac23a1

…owed any longer), no need to set calc_mode in setup_function, cosmetics

clean up daemonflux service further: vectorise flux computation (10 -…

691e4c7

…> 5 statements), no need to set representation in compute_function, turn default energy grid into global variable with units [no ci]

thehrh added 2 commits April 23, 2026 17:47

minor changes somehow forgotten in last commit [no ci]

d4e1a34

small fixes [no ci]

5d910a6

don't hard-code minimum daemonflux version in runtimerror either [no ci]

03d6c86

Merge remote-tracking branch 'origin/master' into daemon_flux_no_numba

97421b4

thehrh marked this pull request as draft May 7, 2026 14:07

thehrh added 2 commits May 7, 2026 16:35

give daemonflux the option of using fast_interp package instead of Re…

5859ec8

…ctBivariateSpline for interpolation

log (info) various daemon_flux service timings [no ci]

2606240

thehrh changed the title ~~stages.flux.daemon_flux module: remove numba jit decorators, clean up and add documentation~~ stages.flux.daemon_flux module: remove numba jit decorators, clean up and add documentation, speed up interpolation/evaluation May 8, 2026

thehrh added 2 commits May 20, 2026 17:40

allow user to set energy grid to be passed to daemonflux package and …

f664706

…organise TODOs at top of module

also test fast daemon_flux configuration in notebook and minor doc fi…

507d351

…x in module [no ci]

thehrh mentioned this pull request May 20, 2026

rerun daemonflux example notebook #883

Merged

thehrh marked this pull request as ready for review May 20, 2026 17:42

thehrh marked this pull request as draft May 21, 2026 21:29

thehrh added 5 commits May 22, 2026 00:58

make sure daemon_flux service respects PISA_NUM_THREADS for a given t…

41a6055

…arget + minor mods

update daemonflux demonstration notebook (single core, new timings) […

4c47333

…no ci]

add script for benchmarking pipeline execution times [no ci]

1dba5c6

add manually dispatchable workflow for benchmarking main example pipe…

e86afcd

…line execution times [no ci]

improve workflow and benchmarking script further [no ci]

c6a8331

thehrh changed the title ~~stages.flux.daemon_flux module: remove numba jit decorators, clean up and add documentation, speed up interpolation/evaluation~~ stages.flux.daemon_flux module improvement (documentation, speed-up option) and pipeline run duration benchmarking May 28, 2026

thehrh changed the title ~~stages.flux.daemon_flux module improvement (documentation, speed-up option) and pipeline run duration benchmarking~~ stages.flux.daemon_flux module improvements (documentation, speed-up option) and pipeline run duration benchmarking May 28, 2026

thehrh marked this pull request as ready for review May 28, 2026 00:31

thehrh added enhancement infrastructure external dependency performance and removed usability documentation convention labels May 28, 2026

Merge branch 'master' into daemon_flux_no_numba

d02d158

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`stages.flux.daemon_flux` module improvements (documentation, speed-up option) and pipeline run duration benchmarking#930

`stages.flux.daemon_flux` module improvements (documentation, speed-up option) and pipeline run duration benchmarking#930
thehrh wants to merge 17 commits into
masterfrom
daemon_flux_no_numba

thehrh commented Apr 22, 2026 •

edited

Loading

Uh oh!

thehrh commented Apr 22, 2026

Uh oh!

thehrh commented Apr 23, 2026 •

edited

Loading

Uh oh!

thehrh commented Apr 29, 2026

Uh oh!

marialiubarska commented Apr 29, 2026

Uh oh!

thehrh commented Apr 29, 2026 •

edited

Loading

Uh oh!

marialiubarska commented Apr 29, 2026

Uh oh!

marialiubarska commented Apr 29, 2026

Uh oh!

thehrh commented Apr 29, 2026

Uh oh!

thehrh commented May 8, 2026 •

edited

Loading

Uh oh!

thehrh commented May 20, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

thehrh commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thehrh commented Apr 22, 2026

Uh oh!

thehrh commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thehrh commented Apr 29, 2026

Uh oh!

marialiubarska commented Apr 29, 2026

Uh oh!

thehrh commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

marialiubarska commented Apr 29, 2026

Uh oh!

marialiubarska commented Apr 29, 2026

Uh oh!

thehrh commented Apr 29, 2026

Uh oh!

thehrh commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thehrh commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

energy grid with 200 log-spaced values

energy grid with 100 log-spaced values

energy grid with 50 log-spaced values

energy grid with 25 log-spaced values

energy grid with 10 (!) log-spaced values

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

thehrh commented Apr 22, 2026 •

edited

Loading

thehrh commented Apr 23, 2026 •

edited

Loading

thehrh commented Apr 29, 2026 •

edited

Loading

thehrh commented May 8, 2026 •

edited

Loading

thehrh commented May 20, 2026 •

edited

Loading