Foil hole suggestions for a given grid square by d-j-hatton · Pull Request #296 · DiamondLightSource/smartem-decisions

d-j-hatton · 2026-06-03T09:05:47Z

Adds an endpoint that suggests which holes to collect on a specified grid square, accounting for foil hole score and for distribution across the different foil hole types as determined from a latent space representation (provided externally to this package). More holes are selected per latent space cluster than in the grid square code, because it is unlikely that every cluster will appear on a given grid square (clustering is across the whole grid).

Addresses #295


vredchenko

Thanks for this - the structure closely mirrors get_suggested_square_collections, which is the right approach. A few divergences from that reference function look like they would cause the endpoint to fail on most real inputs, and there is no test coverage to catch it. Details inline; the central one is that scores here holds (FoilHole, CurrentQualityPrediction) Row tuples that are sorted and indexed as if they were FoilHole objects, whereas the square endpoint extracts p[0] and sorts on x[1].value first.

Worth adding while here:

A unit test for this endpoint (and ideally backfilling one for the square version) - it is a pure ranking function over DB rows, so it should be cheap to test and would pin down the row handling and the per-cluster cap.
Module-level named constants for the two magic numbers (see inline), e.g. near the top of api_server.py:

HOLE_SELECTION_FRACTION = 2  # consider the top 1/N of scored holes
MAX_HOLES_PER_CLUSTER = 4

The inline suggestion uses these names. The same rationale (4 here vs 2 for grid squares) could later be applied to the square endpoint for consistency, but that is out of scope for this PR.
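As a rough illustration of the unit test suggested above, the sketch below factors the ranking into a pure helper and asserts on both the ordering and the per-cluster cap. `select_holes`, its `(uuid, score)` input shape, and the test data are assumptions made for illustration, not code from this PR; the real endpoint works on DB rows.

```python
HOLE_SELECTION_FRACTION = 2  # consider the top 1/N of scored holes
MAX_HOLES_PER_CLUSTER = 4


def select_holes(scored, cluster_indices):
    """Hypothetical pure helper: `scored` is a list of (uuid, score) pairs."""
    ordered = [uuid for uuid, _ in sorted(scored, key=lambda p: p[1], reverse=True)]
    cluster_counts = dict.fromkeys(set(cluster_indices.values()), 0)
    suggested = []
    for uuid in ordered[: len(ordered) // HOLE_SELECTION_FRACTION]:
        cluster = cluster_indices.get(uuid)  # None if the hole has no cluster row
        if cluster is not None and cluster_counts[cluster] < MAX_HOLES_PER_CLUSTER:
            suggested.append(uuid)
            cluster_counts[cluster] += 1
    return suggested


def test_select_holes_caps_each_cluster():
    # 12 holes with scores decreasing by index; the first 6 form the top half.
    scored = [(f"h{i}", 1.0 - i / 100) for i in range(12)]
    # h0..h4 share cluster 0, everything else is in cluster 1.
    clusters = {f"h{i}": (0 if i < 5 else 1) for i in range(12)}
    # h4 is dropped because cluster 0 already has four holes.
    assert select_holes(scored, clusters) == ["h0", "h1", "h2", "h3", "h5"]
```

Keeping the selection logic in a helper like this would let the test run without a database; whether that refactor is worth it for the endpoint is a separate call.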

Marking as a comment rather than a blocker.

vredchenko · 2026-06-04T10:24:06Z

+async def get_suggested_hole_collections(
+    gridsquare_uuid: str, prediction_model_name: str, latent_rep_model_name: str, db: AsyncSession = DB_DEPENDENCY
+):
+    gridsquare = (await db.execute(select(GridSquare).where(GridSquare.uuid == gridsquare_uuid))).one()


.one() returns a Row, not a GridSquare, so gridsquare.grid_uuid on the next line raises AttributeError. Everywhere else in api_server.py uses .scalar_one() / .scalars() for single-entity fetches - this is the only bare .one(). It runs before the loop, so it would fail on every request regardless of hole count.

Suggested change

-    gridsquare = (await db.execute(select(GridSquare).where(GridSquare.uuid == gridsquare_uuid))).one()
+    gridsquare = (await db.execute(select(GridSquare).where(GridSquare.uuid == gridsquare_uuid))).scalar_one()

vredchenko · 2026-06-04T10:24:06Z

+    scores.sort(reverse=True)
+    cluster_counts = dict.fromkeys(set(cluster_indices.values()), 0)
+    suggested = []
+    for i in range(len(scores) // 2):
+        hole = scores[i]
+        if cluster_counts[cluster_indices[hole.uuid]] < 4:
+            suggested.append(hole)
+            cluster_counts[cluster_indices[hole.uuid]] += 1
+    return suggested


scores is a list of (FoilHole, CurrentQualityPrediction) Row tuples, which causes three problems here:

scores.sort(reverse=True) compares rows tuple-wise, hitting the FoilHole objects first. FoilHole is a SQLModel with no __lt__, so this raises TypeError as soon as there are two or more holes. It also means the prediction value is never read - the square endpoint sorts with key=lambda x: x[1].value, so as written the ranking-by-score is lost entirely, which contradicts the PR description.

hole = scores[i] is a Row, so hole.uuid is an AttributeError (row keys are entity-level, not column-level).

suggested.append(hole) appends the Row against response_model=list[FoilHole], which will not serialise.

Adopting the square endpoint's extract-then-sort idiom fixes all three. This also guards the cluster_indices lookup (otherwise a hole with no cluster-index row raises KeyError) and uses the named constants from the summary:

Suggested change

-    scores.sort(reverse=True)
-    cluster_counts = dict.fromkeys(set(cluster_indices.values()), 0)
-    suggested = []
-    for i in range(len(scores) // 2):
-        hole = scores[i]
-        if cluster_counts[cluster_indices[hole.uuid]] < 4:
-            suggested.append(hole)
-            cluster_counts[cluster_indices[hole.uuid]] += 1
-    return suggested
+    score_ordered_holes = [p[0] for p in sorted(scores, key=lambda x: x[1].value, reverse=True)]
+    cluster_counts = dict.fromkeys(set(cluster_indices.values()), 0)
+    suggested = []
+    for hole in score_ordered_holes[: len(score_ordered_holes) // HOLE_SELECTION_FRACTION]:
+        cluster = cluster_indices.get(hole.uuid)
+        if cluster is not None and cluster_counts[cluster] < MAX_HOLES_PER_CLUSTER:
+            suggested.append(hole)
+            cluster_counts[cluster] += 1
+    return suggested

Note the suggestion references HOLE_SELECTION_FRACTION and MAX_HOLES_PER_CLUSTER; add those at module level (see summary) before applying, or drop in the literals 2 and 4 if you would rather not.
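To illustrate the first problem above, here is a small stand-alone sketch. It uses a `namedtuple` as a stand-in for the SQLAlchemy Row and plain dataclasses for the two models; these are hypothetical simplifications, not the real classes. Tuples compare element by element, so a bare sort falls through to comparing the hole objects and fails, while a key function on the prediction value sorts cleanly.

```python
from collections import namedtuple
from dataclasses import dataclass


@dataclass
class Hole:  # stand-in for FoilHole: has __eq__ but no ordering methods
    uuid: str


@dataclass
class Prediction:  # stand-in for CurrentQualityPrediction
    value: float


# Stand-in for a two-entity result row.
Row = namedtuple("Row", ["hole", "prediction"])

scores = [
    Row(Hole("a"), Prediction(0.2)),
    Row(Hole("b"), Prediction(0.9)),
    Row(Hole("c"), Prediction(0.5)),
]

try:
    sorted(scores, reverse=True)  # tuple comparison reaches the Hole objects first
    bare_sort_failed = False
except TypeError:
    bare_sort_failed = True

# Extract-then-sort: rank on the prediction value, keep only the hole.
score_ordered_holes = [r[0] for r in sorted(scores, key=lambda r: r[1].value, reverse=True)]
ordered_uuids = [h.uuid for h in score_ordered_holes]
```

With this data the bare sort raises `TypeError`, and the keyed sort yields holes in descending score order.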

vredchenko · 2026-06-04T10:24:06Z

+                select(QualityPredictionModelParameter)
+                .where(QualityPredictionModelParameter.grid_uuid == grid_uuid)
+                .where(QualityPredictionModelParameter.prediction_model_name == latent_rep_model_name)
+                .where(QualityPredictionModelParameter.group == "cluster_indices")


This filters the cluster parameters by grid_uuid and latent-rep model only, reusing the same "cluster_indices" group as the grid-square endpoint. Could you confirm the externally-provided parameters here are keyed by foil-hole uuid and will not collide with the grid-square-level cluster indices stored under the same group and grid? If both levels share the group, a distinct group name or an extra discriminator may be needed.
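One way to check this empirically would be to compare the keys stored under the group against the foil-hole uuids of the square. The sketch below assumes the parameters have already been loaded into a plain dict mapping key to cluster index; `split_cluster_keys` and the sample data are hypothetical and not part of the package.

```python
def split_cluster_keys(cluster_indices, foilhole_uuids):
    """Partition stored keys into those matching a known foil hole and the rest."""
    known = set(foilhole_uuids)
    matched = {k: v for k, v in cluster_indices.items() if k in known}
    unmatched = sorted(k for k in cluster_indices if k not in known)
    return matched, unmatched


# Illustrative data only: two foil-hole keys plus one key from another level.
params = {"fh-1": 0, "fh-2": 3, "gs-7": 1}
matched, unmatched = split_cluster_keys(params, ["fh-1", "fh-2", "fh-3"])
```

A non-empty `unmatched` list on real data would suggest that entries from another level share the group, which is the collision asked about here.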

vredchenko · 2026-06-04T10:46:06Z

@d-j-hatton as you may have gathered, the above review is me unleashing Claude Code on the PR; it's known to make mistakes. I shall attend in person later, as I'm focusing on clearing the f/e feature backlog at present

vredchenko · 2026-06-04T11:13:41Z

Also, as we merge it we need to be mindful that we've changed the API source of truth - to that end we'll need a version bump resulting in a new release, which should kick off automation to update f/e client and devtools repo docs - leave it with me to verify it all works as intended

d-j-hatton · 2026-06-04T13:33:03Z

The sorting issue was a genuine bug. I also changed the .one() behaviour to .scalar_one() to align better with the rest of the code but haven't checked whether .one() works or not with the async session (I'm fairly sure that it would work normally)

add an endpoint to make suggestions about holes to collect on a specified grid square accounting for foil hole score and distribution across different foil hole types as determined from a latent space representation (provided externally to this package)

a3bf338

d-j-hatton added the enhancement label (Minor improvements to existing functionality) Jun 3, 2026

one needs to be called on the result of the await

7a98b9d

vredchenko reviewed Jun 4, 2026


logic fix for score sorting in foilhole suggestion endpoint

d7108c1

d-j-hatton wants to merge 3 commits into main from feature/foilhole-suggestions
