remove gradient in state.positions after mace forward#540

Open

thomasloux wants to merge 1 commit into TorchSim:main from thomasloux:fix/mace-positions-gradient

Conversation

@thomasloux (Collaborator) commented Apr 9, 2026

Summary

MACE adds a gradient requirement to positions by running data["positions"].requires_grad_(True):
https://github.com/ACEsuit/mace/blob/main/mace/modules/models.py#L776
Because the TorchSim MACE interface passes state.positions directly, the gradient requirement flows back into TorchSim's algorithms. In the relaxation script below, this ends up failing with a seemingly random error. I include a Claude Code diagnostic explaining the steps for those interested:

Root Cause Chain
The gradient contamination follows this path across BFGS steps:

Step 1 — The seed

  1. MACE model calls positions.requires_grad_(True) internally to compute forces via torch.autograd.grad. This is an in-place operation on state.positions (since
    data_dict["positions"] is the same tensor object).
  2. After the model returns, forces are detached, but state.positions still has requires_grad=True.
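The seed step can be reproduced in isolation with a few lines of plain PyTorch. The model below is a made-up stand-in for MACE, not its real API; the point is only the in-place side effect:

```python
import torch

# Hypothetical stand-in for the MACE forward pass: it enables gradients
# on its input in place so it can compute forces via autograd.
def model_forward(data: dict) -> torch.Tensor:
    data["positions"].requires_grad_(True)        # in-place side effect
    energy = (data["positions"] ** 2).sum()
    forces = -torch.autograd.grad(energy, data["positions"])[0]
    return forces.detach()                        # forces come back detached

positions = torch.zeros(2, 3)                     # stands in for state.positions
forces = model_forward({"positions": positions})  # dict holds the SAME tensor object

print(forces.requires_grad)     # False: the forces are clean
print(positions.requires_grad)  # True: the caller's tensor was mutated
```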

Step 1 — Hessian contamination

  1. frac_positions = torch.linalg.solve(deform_grad, state.positions) — inherits requires_grad from positions
  2. dpos = pos_new - pos_old — inherits from frac_positions
  3. Hessian update terms term1, term2 inherit from dpos
  4. state.hessian[idx] = H - term1 - term2 — this in-place IndexPut makes state.hessian part of the autograd graph (IndexPutBackward0)
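This inheritance chain is easy to demonstrate with toy shapes (the names and shapes below are illustrative, not TorchSim's actual ones):

```python
import torch

# Positions "contaminated" by the model's in-place requires_grad_(True)
positions = torch.zeros(2, 3).requires_grad_(True)
deform_grad = torch.eye(3)

# frac_positions inherits requires_grad from positions
frac_positions = torch.linalg.solve(deform_grad, positions.T).T
dpos = frac_positions - positions.detach()        # inherits from frac_positions

hessian = torch.zeros(6, 6)                       # plain buffer, no grad initially
term = dpos.reshape(-1, 1) @ dpos.reshape(1, -1)  # update term, requires grad
hessian[:, :] = hessian - term                    # in-place write drags the buffer in

print(hessian.requires_grad)          # True
print(hessian.grad_fn is not None)    # True: the buffer is now in the graph
```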

Step 2 — Cell contamination

  1. H_group = state.hessian[...] — inherits requires_grad from contaminated hessian
  2. Eigendecomposition → step_group requires grad → step_dense requires grad → dr_cell requires grad
  3. cell_positions_new = state.cell_positions + dr_cell → requires grad
  4. deform_grad_new = torch.matrix_exp(...) → requires grad
  5. state.row_vector_cell = torch.bmm(ref_cell, deform_grad_new.T) → state.cell gets requires_grad=True
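The same inheritance carries through the linear-algebra ops in this step; a toy 3x3 example (shapes and names made up):

```python
import torch

# A "contaminated" hessian, as produced by the previous step
hessian = torch.eye(3) * torch.ones(1, requires_grad=True)

evals, evecs = torch.linalg.eigh(hessian)         # both inherit requires_grad
step = evecs @ torch.diag(1.0 / evals) @ evecs.T  # stand-in for the BFGS step
deform_grad_new = torch.linalg.matrix_exp(step)   # still carries a grad_fn
cell = torch.eye(3) @ deform_grad_new.T           # "state.cell" now requires grad

print(cell.requires_grad)  # True
```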

Step 2+ — Forces contamination

  1. _symmetrize_rank1 uses state.row_vector_cell (requires grad) as lattice
  2. symmetrize_rank1 returns a tensor with grad_fn (through inv(lattice) and @ lattice)
  3. vectors[start:end] = symmetrize_rank1(...) — in-place CopyBackwards makes forces require grad
  4. Forces with grad → torch.split in _split_state → views with SplitWithSizesBackward0
  5. post_init tries in-place modification on these views → CRASH
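The final crash can be reproduced directly: PyTorch forbids in-place writes to the views produced by multi-output ops such as torch.split.

```python
import torch

# Forces carrying a grad_fn, as after the contamination above
forces = torch.randn(6, 3, requires_grad=True) * 1.0

# Splitting produces views tracked by SplitWithSizesBackward0
per_system = torch.split(forces, [3, 3])

caught = None
try:
    per_system[0] += 1.0   # in-place write on a multi-output view
except RuntimeError as err:
    caught = err           # PyTorch forbids this and raises

print(type(caught).__name__)  # RuntimeError
```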

Script to reproduce:

```sh
uv sync --extra mace
uv run --with pymatgen --with moyopy python fix_relax_gradient.py
```

```python
import torch
import torch_sim as ts
from torch_sim import CellFilter, Optimizer
from torch_sim.models.mace import MaceModel
from mace.calculators import mace_mp
from pymatgen.core.structure import Structure

# structure = Structure.from_file("Ti.cif")
# Create an hcp titanium structure
structure = Structure(
    lattice=[[2.95, 0, 0], [-1.475, 2.556, 0], [0, 0, 4.68]],
    species=["Ti"] * 2,
    coords=[[0, 0, 0], [1 / 3, 2 / 3, 1 / 2]],
)
structure_list = [structure]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dtype = torch.float32

optimizer_enum = Optimizer("bfgs")
convergence_fn = ts.generate_force_convergence_fn(
    force_tol=0.01, include_cell_forces=True
)

model = mace_mp(
    model="medium-mpa-0",
    dispersion=False,
    default_dtype=dtype,
    device=device,
    return_raw_model=True,
)
model = MaceModel(model, compute_stress=True, device=device)

autobatcher = ts.InFlightAutoBatcher(
    model, memory_scales_with="n_atoms", max_memory_scaler=5_000
)

init_kwargs: dict = {"cell_filter": CellFilter("frechet")}

states = ts.initialize_state(structure_list, dtype=model.dtype, device=model.device)
states.constraints = ts.constraints.FixSymmetry.from_state(states, symprec=0.1)

final_state = ts.optimize(
    system=states,
    model=model,
    optimizer=optimizer_enum,
    convergence_fn=convergence_fn,
    max_steps=100,
    init_kwargs=init_kwargs,
    autobatcher=autobatcher,
)
```

Checklist

Before a pull request can be merged, the following items must be checked:

  • Doc strings have been added in the Google docstring format.
  • Run ruff on your code.
  • Tests have been added for any new functionality or bug fixes.

@thomasloux (Collaborator, Author)

I'm fine with other solutions, such as passing a copy of state.positions, but removing the gradient requirement seems like the lightest solution.
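For concreteness, both options can be sketched with a stand-in forward pass (all names here are made up for illustration, not the actual TorchSim patch):

```python
import torch

def run_model(positions: torch.Tensor) -> torch.Tensor:
    # Stand-in for the MACE forward, which flips requires_grad in place
    positions.requires_grad_(True)
    energy = (positions ** 2).sum()
    return -torch.autograd.grad(energy, positions)[0].detach()

# Option A (this PR's approach): clear the flag after the forward pass
positions = torch.zeros(2, 3)
forces = run_model(positions)
positions.requires_grad_(False)       # undo the model's side effect

# Option B: pass a copy so the side effect never touches state.positions
positions2 = torch.zeros(2, 3)
forces2 = run_model(positions2.clone())

print(positions.requires_grad)   # False
print(positions2.requires_grad)  # False
```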

@CompRhys (Member) commented Apr 9, 2026

LGTM. Just for awareness: as we move to an external posture for MACE with the mace 0.3.16 release, this will likely change upstream. I hope this tooling PR goes in before then, and I don't think this issue is addressed over there.

cf. #524

@thomasloux (Collaborator, Author)

This should be fine with the new external MACE model, because the positions used by the model are the positions after wrapping; when wrapped_positions = wrapped_positions.requires_grad_(True) runs, the flag does not propagate to the original state.positions.
In that case we don't need to merge this PR.
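A small sketch of why wrapping breaks the side-effect chain (the wrapping here is a toy modulo, not the real implementation): the wrapped tensor is a new allocation, not a view, so the in-place flag only touches the copy.

```python
import torch

positions = torch.tensor([[1.3, 0.0, 0.0]])  # stands in for state.positions
cell_length = 1.0

wrapped = positions % cell_length  # new tensor, not a view of positions
wrapped.requires_grad_(True)       # in-place flag lands on the copy only

print(wrapped.requires_grad)       # True
print(positions.requires_grad)     # False: original state.positions untouched
```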

@thomasloux (Collaborator, Author)

But this problem is a good reminder to be careful about side effects when modifying input arguments.

@CompRhys (Member) commented Apr 9, 2026

Merge or close? Did you try checking out that PR to see whether it fixes the issue? If so, I would appreciate a bump on that thread to prompt ilyas towards merging and cutting a new mace-torch release.
