fix(sql): Further optimize GetForecastAsTimeseries#141

Merged
devsjc merged 1 commit into main from devsjc/forecast-sql-speedup on Mar 25, 2026
Conversation

@devsjc (Contributor) commented Mar 25, 2026

Contribution Checklist

  • Have you followed the Open Climate Fix Contribution Guidelines?
  • Have you referenced the Issue this PR addresses, where applicable?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added a summary of the changes?
  • Have you written new tests for your changes, where applicable?
  • Have you successfully run make lint with your changes locally?
  • Have you successfully run make test with your changes locally?

Warning

PRs may be closed if all the above boxes are not checked.

Changes in this Pull Request

Uses the LATERAL join trick to reduce the number of joins to the predicted_generation_values table.

The previous query joined every candidate forecast to the forecast values table and only then filtered down to the latest one, so the values table was probed 38996 times:

-> Append  (cost=0.42..82.73 rows=55 width=103) (actual time=0.005..0.005 rows=1.00 loops=38996)

With the LATERAL join and LIMIT 1, the latest forecast for each location is found before joining, so the values table is probed only 331 times:

-> Append  (cost=0.42..131.49 rows=54 width=103) (actual time=0.773..0.773 rows=1.00 loops=331)
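
As a rough in-memory analogy (not the actual query or schema from this PR), the difference between the two plan shapes can be sketched in Go. The `Forecast` type, the `sampleData` helper and the toy values map are all hypothetical; the `probes` counter plays a role similar to the `loops` figure in the plan lines above.

```go
package main

import "fmt"

// Forecast is a hypothetical stand-in for one row of a forecasts table.
type Forecast struct {
	ID         int
	LocationID int
	InitTime   int // larger means more recent
}

// sampleData builds perLocation forecasts for each of nLocations locations,
// plus a toy values table keyed by forecast ID.
func sampleData(nLocations, perLocation int) ([]Forecast, map[int][]float64) {
	var forecasts []Forecast
	values := map[int][]float64{}
	id := 0
	for loc := 0; loc < nLocations; loc++ {
		for t := 0; t < perLocation; t++ {
			id++
			forecasts = append(forecasts, Forecast{ID: id, LocationID: loc, InitTime: t})
			values[id] = []float64{float64(id)}
		}
	}
	return forecasts, values
}

// latestPerLocation keeps the most recent forecast for each location.
func latestPerLocation(forecasts []Forecast) map[int]Forecast {
	latest := map[int]Forecast{}
	for _, f := range forecasts {
		if cur, ok := latest[f.LocationID]; !ok || f.InitTime > cur.InitTime {
			latest[f.LocationID] = f
		}
	}
	return latest
}

// joinThenFilter mimics the old shape: probe the values table once per
// forecast, then discard everything except the latest forecast per location.
func joinThenFilter(forecasts []Forecast, values map[int][]float64) (map[int][]float64, int) {
	probes := 0
	joined := map[int][]float64{}
	for _, f := range forecasts {
		joined[f.ID] = values[f.ID]
		probes++
	}
	out := map[int][]float64{}
	for loc, f := range latestPerLocation(forecasts) {
		out[loc] = joined[f.ID]
	}
	return out, probes
}

// lateralLimitOne mimics the new shape: pick the latest forecast per location
// first, then probe the values table once per location.
func lateralLimitOne(forecasts []Forecast, values map[int][]float64) (map[int][]float64, int) {
	probes := 0
	out := map[int][]float64{}
	for loc, f := range latestPerLocation(forecasts) {
		out[loc] = values[f.ID]
		probes++
	}
	return out, probes
}

func main() {
	forecasts, values := sampleData(3, 4)
	_, oldProbes := joinThenFilter(forecasts, values)
	_, newProbes := lateralLimitOne(forecasts, values)
	fmt.Printf("join-then-filter probes: %d\n", oldProbes)
	fmt.Printf("lateral/limit probes:    %d\n", newProbes)
}
```

Both functions return the same values per location; only the number of probes into the values table differs, growing with the number of forecasts in the first case and with the number of locations in the second.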

The old, overlapping forecasts were also being sorted with an external merge sort that spilled to disk:

-> Sort  (cost=1999.52..2000.04 rows=206 width=291) (actual time=218.055..220.497 rows=21184.00 loops=1)
      Sort Method: external merge  Disk: 5352kB

This is entirely unnecessary with the new LATERAL/LIMIT approach: Postgres can simply join each individual forecast in memory via the pre-sorted index.

-> Limit  (cost=7.12..13.46 rows=1 width=177) (actual time=0.470..0.470 rows=1.00 loops=331)
-> Merge Append  (cost=7.12..184.70 rows=28 width=177) (actual time=0.469..0.469 rows=1.00 loops=331)
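
To illustrate why a Merge Append under a Limit avoids a full sort, here is a minimal Go sketch. It assumes each child scan (for example, one per partition) already returns rows in descending order, as an index scan would; the integer slices and both function names are hypothetical stand-ins, not anything from this repository.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeAppendLimit merges inputs that are each already sorted in descending
// order, and stops as soon as limit rows have been produced. Only the current
// head of each input is ever compared, so nothing is sorted or spilled.
func mergeAppendLimit(parts [][]int, limit int) []int {
	cursors := make([]int, len(parts))
	var out []int
	for len(out) < limit {
		best := -1
		for i, p := range parts {
			if cursors[i] >= len(p) {
				continue // this input is exhausted
			}
			if best == -1 || p[cursors[i]] > parts[best][cursors[best]] {
				best = i
			}
		}
		if best == -1 {
			break // every input is exhausted
		}
		out = append(out, parts[best][cursors[best]])
		cursors[best]++
	}
	return out
}

// sortThenLimit is the contrasting shape: gather every row, sort the whole
// set, and only then apply the limit.
func sortThenLimit(parts [][]int, limit int) []int {
	var all []int
	for _, p := range parts {
		all = append(all, p...)
	}
	sort.Sort(sort.Reverse(sort.IntSlice(all)))
	if len(all) > limit {
		all = all[:limit]
	}
	return all
}

func main() {
	parts := [][]int{{9, 4, 1}, {7, 6}, {}, {8, 2}}
	fmt.Println(mergeAppendLimit(parts, 1))
	fmt.Println(sortThenLimit(parts, 1))
}
```

With a limit of 1, the merge only has to look at the first row of each input, which is consistent with the `rows=1.00` reported for the Limit and Merge Append nodes above.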

This reduces CPU usage by around 80% for this query. It is still constrained by disk I/O.

@github-actions

Benchmark Results

Benchmark results
?   	github.com/openclimatefix/data-platform/cmd	[no test files]
?   	github.com/openclimatefix/data-platform/internal/gen/ocf/dp	[no test files]
?   	github.com/openclimatefix/data-platform/internal/interceptors	[no test files]
PASS
ok  	github.com/openclimatefix/data-platform/internal/server/dummy	0.005s
{"level":"debug","time":"2026-03-25T08:36:43Z","message":"Completed migrations"}
goos: linux
goarch: amd64
pkg: github.com/openclimatefix/data-platform/internal/server/postgres
cpu: AMD EPYC 7763 64-Core Processor                
BenchmarkPostgresClient/small/GetForecastAsTimeseries-4         	      68	  16100567 ns/op
BenchmarkPostgresClient/small/GetForecastAtTimestamp-4          	     286	   4061482 ns/op
BenchmarkPostgresClient/small/GetObservationsAsTimeseries-4     	     914	   1243641 ns/op
BenchmarkPostgresClient/small/CreateForecast-4                  	      92	  12643070 ns/op
PASS
ok  	github.com/openclimatefix/data-platform/internal/server/postgres	71.823s
?   	github.com/openclimatefix/data-platform/internal/server/postgres/gen	[no test files]
Benchmark vs base branch
goos: linux
goarch: amd64
pkg: github.com/openclimatefix/data-platform/internal/server/postgres
cpu: AMD EPYC 7763 64-Core Processor                
                                                   │ bench-main.txt │ bench-devsjc-forecast-sql-speedup.txt │
                                                   │     sec/op     │    sec/op     vs base                 │
PostgresClient/small/GetForecastAsTimeseries-4         18.12m ± ∞ ¹   16.10m ± ∞ ¹        ~ (p=1.000 n=1) ²
PostgresClient/small/GetForecastAtTimestamp-4         14.761m ± ∞ ¹   4.061m ± ∞ ¹        ~ (p=1.000 n=1) ²
PostgresClient/small/GetObservationsAsTimeseries-4     1.290m ± ∞ ¹   1.244m ± ∞ ¹        ~ (p=1.000 n=1) ²
PostgresClient/small/CreateForecast-4                  13.72m ± ∞ ¹   12.64m ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                8.296m         5.663m        -31.74%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
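
For readers who want to check the summary row, the following Go sketch recomputes the per-benchmark changes and the geomean change from the rounded sec/op figures in the table above (entered in milliseconds). The type and function names are hypothetical, and because the inputs are rounded the result may differ from the reported -31.74% in the last digit. As the footnotes indicate, each side has a single sample (n=1), so these numbers are indicative only.

```go
package main

import (
	"fmt"
	"math"
)

// benchRow holds one line of the comparison table, in milliseconds per op.
type benchRow struct {
	Name   string
	BaseMs float64
	HeadMs float64
}

// rows copies the rounded values shown in the table above.
var rows = []benchRow{
	{"GetForecastAsTimeseries", 18.12, 16.10},
	{"GetForecastAtTimestamp", 14.761, 4.061},
	{"GetObservationsAsTimeseries", 1.290, 1.244},
	{"CreateForecast", 13.72, 12.64},
}

// pctChange returns the relative change from base to head, in percent.
func pctChange(base, head float64) float64 {
	return (head/base - 1) * 100
}

// geomean returns the geometric mean of xs.
func geomean(xs []float64) float64 {
	sumLog := 0.0
	for _, x := range xs {
		sumLog += math.Log(x)
	}
	return math.Exp(sumLog / float64(len(xs)))
}

// geomeanChange compares the geometric means of the base and head columns.
func geomeanChange(rs []benchRow) float64 {
	base := make([]float64, len(rs))
	head := make([]float64, len(rs))
	for i, r := range rs {
		base[i] = r.BaseMs
		head[i] = r.HeadMs
	}
	return pctChange(geomean(base), geomean(head))
}

func main() {
	for _, r := range rows {
		fmt.Printf("%-28s %+.2f%%\n", r.Name, pctChange(r.BaseMs, r.HeadMs))
	}
	fmt.Printf("%-28s %+.2f%%\n", "geomean", geomeanChange(rows))
}
```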

@devsjc merged commit c3bcfce into main on Mar 25, 2026
4 checks passed
@devsjc deleted the devsjc/forecast-sql-speedup branch on March 25, 2026 08:38