Skip to content

Add mp-api interface#91

Open
jan-janssen wants to merge 7 commits intomainfrom
mp-api
Open

Add mp-api interface#91
jan-janssen wants to merge 7 commits intomainfrom
mp-api

Conversation

@jan-janssen
Copy link
Copy Markdown
Member

@jan-janssen jan-janssen commented Oct 5, 2023

Summary by CodeRabbit

  • New Features

    • Materials Project integration: search for crystal structures by composition and retrieve structures by Materials Project ID; results are returned as standard atomic-structure objects.
  • Chores

    • Optional development/install updated to include the Materials Project API (mp-api v0.37.2).

@jan-janssen jan-janssen marked this pull request as draft February 14, 2024 13:27
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds Materials Project support: new materialsproject.py module providing search and by_id helpers that call mp-api, convert pymatgen structures to ASE Atoms, re-export those helpers from the build package, and pin mp-api==0.37.2 in CI and optional deps.

Changes

Cohort / File(s) Summary
Conda environment
\.ci_support/environment.yml``
Added mp-api =0.37.2 as a pinned Conda dependency.
Optional dependencies
pyproject.toml
Added optional-dependency group mp-api = ["mp-api==0.37.2"].
New Materials Project client
src/structuretoolkit/build/materialsproject.py
New module adding search(chemsys, fields=(), api_key=None, **kwargs) and by_id(material_id, final=True, conventional_unit_cell=False, api_key=None) that instantiate mp_api.MPRester (with user-agent and optional api_key), call summary.search / get_structure_by_material_id, convert pymatgen Structures to ASE Atoms, and yield/return those results.
Package exports
src/structuretoolkit/build/__init__.py
Imported search and by_id from the new module and re-exported them as materialsproject_search and materialsproject_by_id (added to __all__).

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant MPR as mp_api.MPRester
    participant PM as pymatgen.Structure
    participant ASE as pymatgen_to_ase → ASE.Atoms

    Client->>MPR: instantiate MPRester(include_user_agent=True, api_key?)
    alt search(chemsys, **kwargs)
        Client->>MPR: summary.search(chemsys=..., fields=["structure","material_id"], **kwargs)
        MPR-->>Client: iterable of dicts with "structure"
        loop each record
            Client->>PM: obtain record["structure"]
            PM->>ASE: pymatgen_to_ase(record["structure"])
            ASE-->>Client: ASE Atoms (inserted into yielded dict)
        end
    else by_id(material_id, final, conventional_unit_cell)
        Client->>MPR: get_structure_by_material_id(material_id, final=..., conventional_unit_cell=...)
        MPR-->>Client: Structure or iterable of Structures
        loop each structure
            Client->>ASE: pymatgen_to_ase(structure)
            ASE-->>Client: ASE Atoms (returned)
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇
I hopped into mp-api’s glen,
fetched structures time and again,
turned pymatgen shapes to ASE art,
and carried atoms home in my cart.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add mp-api interface' directly and clearly summarizes the main change: adding a new Materials Project API interface with wrapper functions and dependencies.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch mp-api

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@pmrv pmrv marked this pull request as ready for review April 2, 2026 21:28
@pmrv pmrv self-requested a review April 2, 2026 21:28
@pmrv
Copy link
Copy Markdown
Contributor

pmrv commented Apr 2, 2026

@jan-janssen I've updated the PR to current structure. Any strong feelings on the function names & reimports?

Copilot AI review requested due to automatic review settings April 2, 2026 21:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.ci_support/environment.yml:
- Line 23: The package imports MPRester at runtime in materialsproject.py within
the search() and by_id() functions but mp-api is missing from pyproject.toml;
update pyproject.toml to add mp-api (version 0.37.2) under
[project.optional-dependencies] as a new extra (e.g., "materials-project" or
grouped with existing Materials Project integrations) so end users installing
the extra will get mp-api and avoid ImportError when calling search() or
by_id().

In `@src/structuretoolkit/build/materialsproject.py`:
- Around line 6-8: The Generator type annotation on the search function is
incomplete: update the return annotation from Generator[dict[str, Any]] to
include the required Send and Return type parameters, e.g. Generator[dict[str,
Any], None, None], so modify the search function signature (the search symbol)
to use Generator[dict[str, Any], None, None] to satisfy
collections.abc.Generator's three-type-parameter requirement.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7dc095c8-b1b0-4a8e-9f62-be3e9b01c923

📥 Commits

Reviewing files that changed from the base of the PR and between 2af45d9 and 98d0a21.

📒 Files selected for processing (3)
  • .ci_support/environment.yml
  • src/structuretoolkit/build/__init__.py
  • src/structuretoolkit/build/materialsproject.py

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 28.00000% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.96%. Comparing base (94ec44a) to head (d175557).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/structuretoolkit/build/materialsproject.py 25.00% 18 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #91      +/-   ##
==========================================
- Coverage   82.69%   81.96%   -0.74%     
==========================================
  Files          25       26       +1     
  Lines        1832     1857      +25     
==========================================
+ Hits         1515     1522       +7     
- Misses        317      335      +18     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/structuretoolkit/build/materialsproject.py (1)

9-9: ⚠️ Potential issue | 🟠 Major

Use a fully-parameterized Generator annotation for compatibility.

At Line 9, Generator[dict[str, Any]] is version-sensitive and may fail type checking/runtime annotation evaluation on supported Python versions that expect 3 parameters.

Proposed fix
-) -> Generator[dict[str, Any]]:
+) -> Generator[dict[str, Any], None, None]:
For Python 3.9–3.12, does collections.abc.Generator require 3 type parameters (YieldType, SendType, ReturnType), or is a single parameter valid?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` at line 9, The Generator
return annotation must be fully parameterized with three type args for
compatibility; update the function return from Generator[dict[str, Any]] to
Generator[dict[str, Any], None, None] and ensure you import the correct
Generator symbol (either from typing or collections.abc as used in the file).
Locate the annotation token "Generator[dict[str, Any]]" in the function
signature and replace it with "Generator[dict[str, Any], None, None]" and add or
adjust the import for Generator if missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Around line 27-53: The docstring still claims the function returns an
MPQueryResults that implements HasStructure (and shows get_structure usage), but
the implementation actually yields dictionaries with ASE Atoms (converted
structures) and metadata; update the Returns and examples in the docstring of
the search function (create.structure.materialsproject.search) to accurately
describe the real return value (e.g., an iterable/list of dicts with keys for
material id and an ASE Atoms object), remove or replace the
MPQueryResults/HasStructure references and the get_structure example, and adjust
the Args/Usage text to show how to access structure data from the returned
dicts.
- Around line 63-65: The search call passes explicit fields but also forwards
**kwargs which may contain a fields key causing a TypeError; before calling
mpr.summary.search (the call where chemsys and fields are passed), remove any
'fields' entry from kwargs (e.g. kwargs.pop('fields', None)) or validate and
raise a clear error if callers pass their own fields, then call
mpr.summary.search(chemsys=chemsys, **kwargs, fields=[...]) so only the intended
fields argument is provided.

---

Duplicate comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Line 9: The Generator return annotation must be fully parameterized with three
type args for compatibility; update the function return from Generator[dict[str,
Any]] to Generator[dict[str, Any], None, None] and ensure you import the correct
Generator symbol (either from typing or collections.abc as used in the file).
Locate the annotation token "Generator[dict[str, Any]]" in the function
signature and replace it with "Generator[dict[str, Any], None, None]" and add or
adjust the import for Generator if missing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e1020d77-de6f-4986-993c-23ec8f1f3bfe

📥 Commits

Reviewing files that changed from the base of the PR and between 98d0a21 and bb53de1.

📒 Files selected for processing (1)
  • src/structuretoolkit/build/materialsproject.py

Comment on lines +63 to +65
results = mpr.summary.search(
chemsys=chemsys, **kwargs, fields=["structure", "material_id"]
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Prevent fields keyword collision in search() call.

At Line 63–65, callers can pass fields via **kwargs, which will conflict with the explicit fields=[...] and raise TypeError (“multiple values for keyword argument 'fields'”).

Proposed fix
-    with MPRester(**rest_kwargs) as mpr:
-        results = mpr.summary.search(
-            chemsys=chemsys, **kwargs, fields=["structure", "material_id"]
-        )
+    query_kwargs = dict(kwargs)
+    query_kwargs.pop("fields", None)
+    with MPRester(**rest_kwargs) as mpr:
+        results = mpr.summary.search(
+            chemsys=chemsys, fields=["structure", "material_id"], **query_kwargs
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` around lines 63 - 65, The
search call passes explicit fields but also forwards **kwargs which may contain
a fields key causing a TypeError; before calling mpr.summary.search (the call
where chemsys and fields are passed), remove any 'fields' entry from kwargs
(e.g. kwargs.pop('fields', None)) or validate and raise a clear error if callers
pass their own fields, then call mpr.summary.search(chemsys=chemsys, **kwargs,
fields=[...]) so only the intended fields argument is provided.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
src/structuretoolkit/build/materialsproject.py (2)

61-65: ⚠️ Potential issue | 🟠 Major

fields can still collide with forwarded **kwargs and raise TypeError.

If callers pass fields in **kwargs, this call provides fields twice.

Proposed fix
-    with MPRester(**rest_kwargs) as mpr:
-        results = mpr.summary.search(
-            chemsys=chemsys,
-            **kwargs,
-            fields=list(fields) + ["structure", "material_id"],
-        )
+    query_kwargs = dict(kwargs)
+    query_kwargs.pop("fields", None)
+    with MPRester(**rest_kwargs) as mpr:
+        results = mpr.summary.search(
+            chemsys=chemsys,
+            fields=[*list(fields), "structure", "material_id"],
+            **query_kwargs,
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` around lines 61 - 65, The
call to mpr.summary.search passes fields twice if callers include a fields key
in **kwargs; to fix, remove any fields entry from kwargs before calling
mpr.summary.search (e.g., pop kwargs['fields'] if present) and then pass the
desired fields list only once so the call in materialsproject.py that constructs
fields=list(fields)+["structure","material_id"] does not collide with forwarded
kwargs; ensure you reference the mpr.summary.search invocation and the local
kwargs variable when making this change.

22-30: ⚠️ Potential issue | 🟡 Minor

Docstring examples/return contract still mismatch generator behavior.

search() returns a generator, but examples use len(...) directly and Returns says list of dict.

Also applies to: 46-50

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` around lines 22 - 30, The
docstring for function search() is inconsistent with its implementation:
search() yields a generator but the examples call len(...) and the Returns
section claims "list of dict"; update the docstring to match behavior by either
(A) changing the examples to consume the generator (e.g., wrap search(...) with
list(...) before len) and update Returns to "iterator/generator of dict" with
example showing list(...) where needed, or (B) change search() to materialize
and return a list if the intended contract is a list; reference the search
function in materialsproject.py and adjust the docstring examples and Returns
section accordingly so examples and return contract match the actual behavior.
🧹 Nitpick comments (1)
src/structuretoolkit/build/materialsproject.py (1)

64-64: Small readability tweak: prefer iterable unpacking over list concatenation.

This aligns with Ruff’s RUF005 recommendation.

Proposed tweak
-            fields=list(fields) + ["structure", "material_id"],
+            fields=[*list(fields), "structure", "material_id"],
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` at line 64, Replace the list
concatenation that builds the fields argument (currently using list(fields) +
["structure", "material_id"]) with iterable unpacking to improve readability and
satisfy RUF005; locate where fields is used to construct the fields parameter
(the expression list(fields) + ["structure", "material_id"]) and change it to
use a single list literal with unpacking of fields followed by "structure" and
"material_id".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Around line 66-69: The loop over results currently only yields when a
"structure" key exists, causing entries without "structure" to be dropped;
change the logic in the for-results loop in materialsproject.py so you still
yield each result record `r` unconditionally, but only perform the conversion
`r["structure"] = pymatgen_to_ase(r["structure"])` inside an `if "structure" in
r` check — i.e., convert when present and always `yield r` afterward.

---

Duplicate comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Around line 61-65: The call to mpr.summary.search passes fields twice if
callers include a fields key in **kwargs; to fix, remove any fields entry from
kwargs before calling mpr.summary.search (e.g., pop kwargs['fields'] if present)
and then pass the desired fields list only once so the call in
materialsproject.py that constructs
fields=list(fields)+["structure","material_id"] does not collide with forwarded
kwargs; ensure you reference the mpr.summary.search invocation and the local
kwargs variable when making this change.
- Around line 22-30: The docstring for function search() is inconsistent with
its implementation: search() yields a generator but the examples call len(...)
and the Returns section claims "list of dict"; update the docstring to match
behavior by either (A) changing the examples to consume the generator (e.g.,
wrap search(...) with list(...) before len) and update Returns to
"iterator/generator of dict" with example showing list(...) where needed, or (B)
change search() to materialize and return a list if the intended contract is a
list; reference the search function in materialsproject.py and adjust the
docstring examples and Returns section accordingly so examples and return
contract match the actual behavior.

---

Nitpick comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Line 64: Replace the list concatenation that builds the fields argument
(currently using list(fields) + ["structure", "material_id"]) with iterable
unpacking to improve readability and satisfy RUF005; locate where fields is used
to construct the fields parameter (the expression list(fields) + ["structure",
"material_id"]) and change it to use a single list literal with unpacking of
fields followed by "structure" and "material_id".
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0b05c60f-714e-4ed7-bc93-0a7580e49002

📥 Commits

Reviewing files that changed from the base of the PR and between bb53de1 and d8ec636.

📒 Files selected for processing (2)
  • pyproject.toml
  • src/structuretoolkit/build/materialsproject.py
✅ Files skipped from review due to trivial changes (1)
  • pyproject.toml

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/structuretoolkit/build/materialsproject.py (1)

61-65: ⚠️ Potential issue | 🟠 Major

Prevent fields collision between **kwargs and explicit fields.

At Line 61–65, callers can still pass fields via **kwargs, which will raise TypeError: multiple values for keyword argument 'fields'.

Proposed fix
-    with MPRester(**rest_kwargs) as mpr:
-        results = mpr.summary.search(
-            chemsys=chemsys,
-            **kwargs,
-            fields=list(fields) + ["structure", "material_id"],
-        )
+    query_kwargs = dict(kwargs)
+    query_kwargs.pop("fields", None)
+    with MPRester(**rest_kwargs) as mpr:
+        results = mpr.summary.search(
+            chemsys=chemsys,
+            fields=[*list(fields), "structure", "material_id"],
+            **query_kwargs,
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` around lines 61 - 65, The
mpr.summary.search call passes explicit fields while also forwarding **kwargs,
causing a collision if callers include 'fields' in kwargs; modify the code
around the mpr.summary.search invocation to remove any 'fields' key from kwargs
(e.g., provided_fields = kwargs.pop('fields', None)), normalize provided_fields
to an iterable if present, merge it with the local fields set and ensure
"structure" and "material_id" are included and deduplicated, then pass the
combined list as the fields argument to mpr.summary.search so no duplicate
'fields' keyword is sent.
🧹 Nitpick comments (2)
src/structuretoolkit/build/materialsproject.py (2)

64-64: Use unpacking for cleaner field list construction.

At Line 64, prefer unpacking over list concatenation for readability (RUF005).

Suggested cleanup
-            fields=list(fields) + ["structure", "material_id"],
+            fields=[*list(fields), "structure", "material_id"],
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` at line 64, Replace the list
concatenation that builds the fields argument with unpacking for clarity: in the
code that sets fields=list(fields) + ["structure", "material_id"] (inside the
MaterialsProject-related builder in
src/structuretoolkit/build/materialsproject.py), change it to use iterable
unpacking so the fields are constructed via unpacking the existing fields
iterable and appending "structure" and "material_id" (e.g. use [*fields,
"structure", "material_id"] or [*list(fields), "structure", "material_id"]
depending on whether fields is already a list).

120-137: Remove duplicated API call in by_id branches.

At Line 120–137, mpr.get_structure_by_material_id(...) is repeated in both branches; compute once and branch only on conversion shape.

Refactor sketch
     with MPRester(**rest_kwargs) as mpr:
-        if final:
-            return pymatgen_to_ase(
-                mpr.get_structure_by_material_id(
-                    material_id=material_id,
-                    final=final,
-                    conventional_unit_cell=conventional_unit_cell,
-                )
-            )
-        else:
-            return [
-                pymatgen_to_ase(mpr_structure)
-                for mpr_structure in (
-                    mpr.get_structure_by_material_id(
-                        material_id=material_id,
-                        final=final,
-                        conventional_unit_cell=conventional_unit_cell,
-                    )
-                )
-            ]
+        mpr_structure = mpr.get_structure_by_material_id(
+            material_id=material_id,
+            final=final,
+            conventional_unit_cell=conventional_unit_cell,
+        )
+        if final:
+            return pymatgen_to_ase(mpr_structure)
+        return [pymatgen_to_ase(s) for s in mpr_structure]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` around lines 120 - 137,
Compute the structure(s) once by calling mpr.get_structure_by_material_id(...)
into a local variable (preserving material_id, final, conventional_unit_cell
arguments) and then branch on whether to return a single converted object or a
list: if final return pymatgen_to_ase(structure) else return [pymatgen_to_ase(s)
for s in structure]; update the by_id function to remove the duplicated
mpr.get_structure_by_material_id call and reuse the local variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Around line 22-30: The doc examples call len() directly on the generator
returned by search(), which will fail; update the examples in
materialsproject.py to materialize the generator before measuring length (e.g.,
wrap the return of structuretoolkit.build.materialsproject.search("Fe") and
search(['Al','Li','Al-Li'], is_stable=True) with list(...) or use sum(1 for _ in
gen)) so irons and alli are actual sequences when calling len(); reference the
search() generator return and the irons/alli example variables when making the
change.

---

Duplicate comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Around line 61-65: The mpr.summary.search call passes explicit fields while
also forwarding **kwargs, causing a collision if callers include 'fields' in
kwargs; modify the code around the mpr.summary.search invocation to remove any
'fields' key from kwargs (e.g., provided_fields = kwargs.pop('fields', None)),
normalize provided_fields to an iterable if present, merge it with the local
fields set and ensure "structure" and "material_id" are included and
deduplicated, then pass the combined list as the fields argument to
mpr.summary.search so no duplicate 'fields' keyword is sent.

---

Nitpick comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Line 64: Replace the list concatenation that builds the fields argument with
unpacking for clarity: in the code that sets fields=list(fields) + ["structure",
"material_id"] (inside the MaterialsProject-related builder in
src/structuretoolkit/build/materialsproject.py), change it to use iterable
unpacking so the fields are constructed via unpacking the existing fields
iterable and appending "structure" and "material_id" (e.g. use [*fields,
"structure", "material_id"] or [*list(fields), "structure", "material_id"]
depending on whether fields is already a list).
- Around line 120-137: Compute the structure(s) once by calling
mpr.get_structure_by_material_id(...) into a local variable (preserving
material_id, final, conventional_unit_cell arguments) and then branch on whether
to return a single converted object or a list: if final return
pymatgen_to_ase(structure) else return [pymatgen_to_ase(s) for s in structure];
update the by_id function to remove the duplicated
mpr.get_structure_by_material_id call and reuse the local variable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ba1f4e69-547a-4cf8-a470-c9a1def3a060

📥 Commits

Reviewing files that changed from the base of the PR and between d8ec636 and 588be25.

📒 Files selected for processing (2)
  • pyproject.toml
  • src/structuretoolkit/build/materialsproject.py
✅ Files skipped from review due to trivial changes (1)
  • pyproject.toml

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
src/structuretoolkit/build/materialsproject.py (2)

61-65: ⚠️ Potential issue | 🟠 Major

Prevent duplicate fields kwargs in search() call.

If a caller passes fields via **kwargs, Line 61–65 can raise TypeError: multiple values for keyword argument 'fields'.

Proposed fix
-    with MPRester(**rest_kwargs) as mpr:
-        results = mpr.summary.search(
-            chemsys=chemsys,
-            **kwargs,
-            fields=list(fields) + ["structure", "material_id"],
-        )
+    query_kwargs = dict(kwargs)
+    query_kwargs.pop("fields", None)
+    with MPRester(**rest_kwargs) as mpr:
+        results = mpr.summary.search(
+            chemsys=chemsys,
+            fields=[*list(fields), "structure", "material_id"],
+            **query_kwargs,
+        )
#!/bin/bash
# Verify risk: function accepts **kwargs and also passes explicit fields= in same call.
rg -n -C2 --type=py '^def search\(' src/structuretoolkit/build/materialsproject.py
rg -n -C4 --type=py 'mpr\.summary\.search\(' src/structuretoolkit/build/materialsproject.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` around lines 61 - 65, The
call to mpr.summary.search passes fields both via an explicit fields=... and may
also get fields in **kwargs, causing TypeError; before calling
mpr.summary.search, remove any 'fields' key from kwargs (e.g. provided =
kwargs.pop('fields', None)), then merge provided with the local fields variable
(compute final_fields = list(fields) + list(provided) +
["structure","material_id"] or deduplicate as needed) and pass
fields=final_fields to mpr.summary.search; update the call referenced as
mpr.summary.search(...) to use the cleaned kwargs and the merged final_fields.

46-50: ⚠️ Potential issue | 🟡 Minor

Fix search() return type in docstring.

Line 47 says list of dict, but the function yields a generator (Line 9, Line 69). Update the docstring contract to avoid misleading API usage.

Proposed doc fix
-    Returns:
-        list of dict: one dictionary for each search results with at least keys
+    Returns:
+        Generator[dict[str, Any], None, None]: one dictionary per search result with at least keys
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` around lines 46 - 50, The
docstring for search() incorrectly states it returns a "list of dict" while the
implementation yields results (a generator); update the Returns section of the
search() docstring to describe an iterator/generator that yields dicts (each
with 'material_id', 'structure', plus any requested `fields`), and adjust any
example or type annotation text to say "iterator of dict" or "generator yielding
dict" to match the implementation in search().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Around line 13-15: Fix typos in the public docstring in
src/structuretoolkit/build/materialsproject.py: correct "distint" to "distinct",
"Similarily" to "Similarly", and "seperated" to "separated" (the explanatory
sentence about chemsys usage and any other occurrences around the
MaterialsProject/docstring area). Update the wording so the examples read
clearly (e.g., 'chemsys=["Fe", "O"] will return iron and oxygen structures but
not iron oxide; chemsys=["Fe-O"] will not return unary structures'), ensuring
the docstring attached to the MaterialsProject-related code is spelled and
phrased correctly.

---

Duplicate comments:
In `@src/structuretoolkit/build/materialsproject.py`:
- Around line 61-65: The call to mpr.summary.search passes fields both via an
explicit fields=... and may also get fields in **kwargs, causing TypeError;
before calling mpr.summary.search, remove any 'fields' key from kwargs (e.g.
provided = kwargs.pop('fields', None)), then merge provided with the local
fields variable (compute final_fields = list(fields) + list(provided) +
["structure","material_id"] or deduplicate as needed) and pass
fields=final_fields to mpr.summary.search; update the call referenced as
mpr.summary.search(...) to use the cleaned kwargs and the merged final_fields.
- Around line 46-50: The docstring for search() incorrectly states it returns a
"list of dict" while the implementation yields results (a generator); update the
Returns section of the search() docstring to describe an iterator/generator that
yields dicts (each with 'material_id', 'structure', plus any requested
`fields`), and adjust any example or type annotation text to say "iterator of
dict" or "generator yielding dict" to match the implementation in search().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f0c0a1b1-9600-43c2-9e19-a8b7e9a98ffd

📥 Commits

Reviewing files that changed from the base of the PR and between 588be25 and d175557.

📒 Files selected for processing (1)
  • src/structuretoolkit/build/materialsproject.py

Comment on lines +13 to +15
Note that `chemsys` takes distint values for unaries, binaries and so! A query with `chemsys=["Fe", "O"]` will
return iron structures and oxygen structures, but no iron oxide structures. Similarily `chemsys=["Fe-O"]` will
not return unary structures.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clean up docstring typos for user-facing API docs.

There are spelling issues in the public docs (distint, Similarily, seperated) on Line 13–15 and Line 40.

Proposed doc fix
-    Note that `chemsys` takes distint values for unaries, binaries and so!  A query with `chemsys=["Fe", "O"]` will
+    Note that `chemsys` takes distinct values for unaries, binaries, and so on. A query with `chemsys=["Fe", "O"]` will
@@
-    return iron structures and oxygen structures, but no iron oxide structures.  Similarily `chemsys=["Fe-O"]` will
+    return iron structures and oxygen structures, but no iron oxide structures. Similarly, `chemsys=["Fe-O"]` will
@@
-            symbols seperated by dashes; if a list of strings is given return structures matching either of them
+            symbols separated by dashes; if a list of strings is given, return structures matching either of them

Also applies to: 40-40

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/structuretoolkit/build/materialsproject.py` around lines 13 - 15, Fix
typos in the public docstring in src/structuretoolkit/build/materialsproject.py:
correct "distint" to "distinct", "Similarily" to "Similarly", and "seperated" to
"separated" (the explanatory sentence about chemsys usage and any other
occurrences around the MaterialsProject/docstring area). Update the wording so
the examples read clearly (e.g., 'chemsys=["Fe", "O"] will return iron and
oxygen structures but not iron oxide; chemsys=["Fe-O"] will not return unary
structures'), ensuring the docstring attached to the MaterialsProject-related
code is spelled and phrased correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants