Conversation
|
Performance improvement with multiple streams: New: Old: |
|
@javak87 : I am happy to merge it. Any reason it was still marked as draft? |
Not a specific reason. You can merge it. |
|
@javak87 : Can we use this version: It's equivalent to your code but avoids one shape promotion (in the third and fourth lines of your code this happens once implicitly and once explicitly). |
Good suggestion!! |
Ok, please double-check and then we can merge. |
Since Given this, I think my proposed change performs slightly better. |
|
decreased from 4486 to 4416 per GPU. @javak87 : For me this is in the noise range. Can you reproduce this difference reliably? |
Run
|
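For reference, the gap under discussion is small in relative terms. Below is a minimal sketch of that arithmetic, plus a hypothetical `within_noise` helper (not part of this PR) that compares a difference of means against the run-to-run spread. The helper names and the threshold `k` are assumptions, and the unit of the per-GPU figure is not stated in the thread.

```python
from statistics import mean, stdev


def relative_change(old: float, new: float) -> float:
    """Signed change from old to new, as a fraction of old."""
    return (new - old) / old


def within_noise(old_runs, new_runs, k: float = 2.0) -> bool:
    """Heuristic (assumption, not a formal test): call the difference noise
    if the gap between the means is at most k times the larger sample
    standard deviation of the two sets of repeated runs."""
    diff = abs(mean(new_runs) - mean(old_runs))
    spread = max(stdev(old_runs), stdev(new_runs))
    return diff <= k * spread


# The two per-GPU figures quoted in the thread.
change = relative_change(4486, 4416)
print(f"{change:.2%}")  # -1.56%
```

A single pair of numbers cannot separate a 1.56% shift from noise. `within_noise` needs at least two repeated measurements per variant, which is what the reproducibility question above asks for.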
Let's use There is one fewer temporary. The small degradation might change with minor changes in PyTorch, and I prefer the cleaner solution. |
* optimize positional encoding
* update positional encoding impl

---------

Co-authored-by: Javad Kasravi <j.kasravi@fz-juelich.de>
Co-authored-by: Christian Lessig <christian.lessig@ecmwf.int>


Description
This PR makes a minor change to the positional encoding implementation that yields a significant performance gain.
Issue Number
Fixes #2173
Is this PR a draft? Mark it as draft.
Checklist before asking for review
* ./scripts/actions.sh lint
* ./scripts/actions.sh unit-test
* ./scripts/actions.sh integration-test
* launch-slurm.py --time 60
Performance comparison with develop branch
* ../WeatherGenerator-private/hpc/launch-slurm.py --time 60
* ../WeatherGenerator-private/hpc/launch-slurm.py --time 60 --base-config ./config/config_forecasting.yml
* ../WeatherGenerator-private/hpc/launch-slurm.py --time 60 --base-config ./config/config_jepa.yml
Performance improvements ranging from 14% to 94%, depending on the configuration, are expected.