
Add optional norm for lm head vectors #808

Open

klei22 wants to merge 4 commits into ReaLLMASIC:master from klei22:add-optional-norm-for-lm-head-vectors

Conversation

klei22 (Collaborator) commented May 6, 2026

No description provided.


Copilot AI left a comment


Pull request overview

Adds an optional normalization step for LM-head vectors, applied to lm_head.weight before computing logits, to enable experimentation with logit-space stability and geometry; it also introduces a capped hypersphere normalization variant.

Changes:

  • Add CappedHyperSphereNorm and register it in the normalization variant dictionary (a minimal sketch follows this list).
  • Introduce CLI/config plumbing for norm_variant_lm_head and associated hypersphere-style parameters (norm_lm_head_*), and route LM-head logits through a helper that can normalize the head weight.
  • Add an exploration YAML to compare LM-head norm variants (none / hypersphere / RMS / capped hypersphere).
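For readers unfamiliar with the variant, here is a minimal sketch of what a capped hypersphere norm could look like. The class name and the variant dictionary come from this PR; the max_radius parameterization, defaults, and clamping logic below are assumptions for illustration only.

    import torch
    import torch.nn as nn

    class CappedHyperSphereNorm(nn.Module):
        """Sketch: rescale vectors onto a hypersphere with a learnable,
        upper-capped radius. Parameter names here are assumptions."""

        def __init__(self, dim, max_radius=1.0, eps=1e-8):
            super().__init__()
            # dim is accepted for registry-API compatibility; the radius is
            # a single learnable scalar, clamped in forward().
            self.radius = nn.Parameter(torch.ones(1))
            self.max_radius = max_radius
            self.eps = eps

        def forward(self, x):
            # Unit-normalize along the embedding dimension, then scale by a
            # radius that cannot exceed max_radius, bounding logit magnitudes.
            unit = x / (x.norm(dim=-1, keepdim=True) + self.eps)
            return unit * self.radius.clamp(max=self.max_radius)

    # Hypothetical registration alongside the other variants:
    # norm_dictionary["capped_hypersphere"] = CappedHyperSphereNorm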

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Summary per file:

  • variations/norm_variations.py: Adds CappedHyperSphereNorm and exposes it via norm_dictionary.
  • train_args.py: Adds CLI options for LM-head norm selection/parameters and includes the new norm variant in choices; also changes the default --device from 'cuda' to 'cuda:0'.
  • model.py: Adds an LM-head weight normalization path and uses it when computing logits.
  • gpt_conf.py: Extends GPTConfig with LM-head norm configuration fields.
  • explorations/default_inf_lm_head_norm_comparison.yaml: Provides an experiment matrix to compare LM-head norm variants under default_inf-like settings.
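As a rough illustration of the train_args.py plumbing: the option name --norm_variant_lm_head and the norm_lm_head_* prefix come from this PR, but the specific choices, defaults, and parameter names below are assumptions.

    import argparse

    parser = argparse.ArgumentParser()
    model_group = parser.add_argument_group("model")

    # None (the default) means lm_head.weight is left untouched; the variant
    # names are guesses matching the comparison matrix in this PR.
    model_group.add_argument(
        "--norm_variant_lm_head",
        default=None,
        choices=["hypersphere", "rmsnorm", "capped_hypersphere"],
        help="optional norm applied to lm_head.weight before computing logits",
    )
    # Hypersphere-style parameters share the norm_lm_head_ prefix.
    model_group.add_argument("--norm_lm_head_max_radius", default=1.0, type=float)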


Comment thread: model.py

      if self.config.norm_variant_abs is not None:
          self.transformer['post_abs_norm'] = self.build_norm_from_variant(config, "norm_variant_abs", "norm_abs")
    + if self.config.norm_variant_lm_head is not None:
    +     self.transformer['lm_head_norm'] = self.build_norm_from_variant(config, "norm_variant_lm_head", "norm_lm_head")
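For context, one plausible shape for build_norm_from_variant (the helper name and the norm_dictionary registry appear in this PR; the prefix-based keyword collection below is an assumption):

    def build_norm_from_variant(self, config, variant_field, param_prefix):
        # e.g. config.norm_variant_lm_head == "capped_hypersphere"
        variant = getattr(config, variant_field)
        norm_cls = norm_dictionary[variant]  # registry from variations/norm_variations.py
        # Gather variant parameters sharing the prefix, e.g.
        # config.norm_lm_head_max_radius -> max_radius=1.0
        kwargs = {
            name[len(param_prefix) + 1:]: value
            for name, value in vars(config).items()
            if name.startswith(param_prefix + "_")
        }
        return norm_cls(config.n_embd, **kwargs)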
Comment thread: model.py (on lines +253 to +254)

    return self.transformer.lm_head_norm(lm_head_weight)
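The enclosing helper is not fully quoted above; here is a sketch consistent with the quoted return statement. The method names normalized_lm_head_weight and compute_logits are hypothetical, but the idea of normalizing lm_head.weight before the matmul is from this PR.

    import torch.nn.functional as F

    def normalized_lm_head_weight(self):
        lm_head_weight = self.lm_head.weight
        if self.config.norm_variant_lm_head is None:
            return lm_head_weight  # default path: head weights untouched
        return self.transformer.lm_head_norm(lm_head_weight)

    def compute_logits(self, x):
        # Equivalent to self.lm_head(x) when no norm variant is selected
        # (assuming a bias-free head, as is typical in GPT implementations).
        return F.linear(x, self.normalized_lm_head_weight())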

Comment thread: train_args.py

      # System args
    - training_group.add_argument('--device', default='cuda', type=str)
    + training_group.add_argument('--device', default='cuda:0', type=str)
