Implement (efficient) bilinear decoder#2127

Merged
sophie-xhonneux merged 4 commits into develop from sophiex/dev/efficient-bilin-clean-pr
Apr 10, 2026

@sophie-xhonneux sophie-xhonneux commented Mar 27, 2026

Description

Make the bilinear layer more efficient

Issue Number

Closes #1683

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a HedgeDoc in the GitHub issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@sophie-xhonneux

@shmh40 Could you please review this? I have tested training with forecasting and JEPA.

@sophie-xhonneux sophie-xhonneux mentioned this pull request Mar 27, 2026
4 tasks
@github-actions github-actions bot added the model Related to model training or definition (not generic infra) label Mar 27, 2026
@clessig clessig changed the title Create clean PR with only intended change Implement (efficient) bilinear decoder Mar 29, 2026

@clessig clessig left a comment

Minor comments.

Comment thread src/weathergen/model/engines.py Outdated


class EfficientBilinear(torch.nn.Module):
    def __init__(self, in1, in2, out, bias=False):

Could we either make the names more self-descriptive or add a docstring? I think it is dim_in_lhs and dim_in_rhs?

Comment thread src/weathergen/model/engines.py Outdated
        self.bias = nn.Parameter(torch.zeros(out)) if bias else 0.0
        self.total_in = in1 * in2

    def forward(self, x1, x2):

Same here: the args are not clear; at least use lhs and rhs or similar.
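
For illustration, the naming and docstring requested in the two review comments could look roughly like the sketch below. It is not the code merged in this PR: the class name BilinearSketch, the parameter names dim_in_lhs, dim_in_rhs and dim_out, the weight layout and the initialization are all assumptions. Only the bias pattern is taken from the snippet under review. The sketch contracts the lhs index with a single matmul and then the rhs index with an einsum, so it never builds the per-sample outer product of size dim_in_lhs * dim_in_rhs; the intermediate it builds instead has size dim_in_rhs * dim_out per sample, so whether this saves memory depends on the dimensions.

```python
import torch
from torch import nn


class BilinearSketch(nn.Module):
    """Bilinear map: out[..., o] = sum_{i,j} lhs[..., i] * W[i, j, o] * rhs[..., j] + bias[o].

    Hypothetical illustration only; names and weight layout are assumptions.

    Args:
        dim_in_lhs: feature size of the left-hand input.
        dim_in_rhs: feature size of the right-hand input.
        dim_out: feature size of the output.
        bias: if True, add a learnable bias of shape (dim_out,).
    """

    def __init__(self, dim_in_lhs: int, dim_in_rhs: int, dim_out: int, bias: bool = False):
        super().__init__()
        self.dim_in_rhs = dim_in_rhs
        self.dim_out = dim_out
        # Weight stored flat as (dim_in_lhs, dim_in_rhs * dim_out) so the
        # first contraction is a plain matmul. The scaling is an arbitrary choice.
        scale = (dim_in_lhs * dim_in_rhs) ** -0.5
        self.weight = nn.Parameter(torch.randn(dim_in_lhs, dim_in_rhs * dim_out) * scale)
        # Same pattern as the snippet under review: learnable bias or plain 0.0.
        self.bias = nn.Parameter(torch.zeros(dim_out)) if bias else 0.0

    def forward(self, lhs: torch.Tensor, rhs: torch.Tensor) -> torch.Tensor:
        # (..., dim_in_lhs) @ (dim_in_lhs, dim_in_rhs * dim_out)
        # -> (..., dim_in_rhs, dim_out) after unflattening the last axis.
        partial = (lhs @ self.weight).unflatten(-1, (self.dim_in_rhs, self.dim_out))
        # Contract the remaining rhs index: result has shape (..., dim_out).
        out = torch.einsum("...j,...jo->...o", rhs, partial)
        return out + self.bias


# Usage: batch of 2, lhs features 4, rhs features 3, output features 5.
layer = BilinearSketch(dim_in_lhs=4, dim_in_rhs=3, dim_out=5, bias=True)
lhs, rhs = torch.randn(2, 4), torch.randn(2, 3)
out = layer(lhs, rhs)
```

Reshaping the flat weight to (dim_in_lhs, dim_in_rhs, dim_out) and permuting it to (dim_out, dim_in_lhs, dim_in_rhs) gives the layout that torch.nn.functional.bilinear expects, which makes it straightforward to check such a layer against the reference implementation.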

@sophie-xhonneux sophie-xhonneux merged commit 05885cd into develop Apr 10, 2026
5 checks passed

Labels

model:pretrain model Related to model training or definition (not generic infra)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Make the bilinear layer more memory efficient

2 participants