
fix random rotation and update rotation doc.#1884

Open
lkk12014402 wants to merge 1 commit into intel:main from lkk12014402:fix_random_rotation

@lkk12014402
Contributor

Description

Fix the asymmetry of the random matrix, and update the rotation usage documentation.

Signed-off-by: lkk12014402 <kaokao.lv@intel.com>
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses correctness and usability around rotation-based preprocessing for quantization: it fixes an inconsistency in SpinQuant’s random Hadamard R1 block rotation (matrix symmetry assumption), and refreshes the step-by-step documentation to describe rotation options (QuaRot/SpinQuant and per-linear block rotation).

Changes:

  • Fix random Hadamard online R1 block weight rotation to preserve equivalence with the activation-side hook (x @ R).
  • Replace the legacy “Hadamard Transform” doc section with a new “Rotation” section (English + Chinese) describing rotation modes and usage.
  • Add rotation usage examples and parameter tables in the step-by-step docs.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| docs/step_by_step.md | Replaces Hadamard section with Rotation section and adds usage guidance/examples. |
| docs/step_by_step_CN.md | Chinese translation of the Rotation documentation updates. |
| auto_round/algorithms/transforms/spinquant/preprocessor.py | Fixes random Hadamard online R1 block rotation by aligning weight fusion with hook convention. |

Comment thread docs/step_by_step.md
Comment on lines +851 to +855
# R1 only (fast, good baseline improvement)
ar = AutoRound(model, scheme="MXFP4", rotation_config=SpinQuantConfig(r1=True))

# R1 + R2 (better, no runtime overhead after fuse)
ar = AutoRound(model, scheme="MXFP4", rotation_config=SpinQuantConfig(r1=True, r2=True))
Comment thread docs/step_by_step.md
Comment on lines +857 to +858
# R1 + R2 + R3 + R4 (best accuracy, slight runtime overhead from hooks)
ar = AutoRound(model, scheme="MXFP4", rotation_config=SpinQuantConfig(r1=True, r2=True, r3=True, r4=True))
Comment thread docs/step_by_step.md
Comment on lines +865 to +866
| `"quarot"` | `SpinQuantConfig(r1=True, r2=True, r3=True, r4=True)` — deterministic Hadamard, no training |
| `"spinquant"` | `SpinQuantConfig(r1=True, r2=True, r3=True, r4=True, trainable_rotation=True)` — **experimental**, see note below |
Comment thread docs/step_by_step.md
Comment on lines +890 to +894
| `r1` / `r2` / `r3` / `r4` | `False` | Enable rotation at each position |
| `online_r1_rotation` | `True` | R1 via hook (`True`) or fused into weights (`False`) |
| `random_r1` / `r2` / `r3` / `r4` | `False` | Use random Hadamard (H×diag(±1)) instead of deterministic |
| `rotation_size` | `None` (auto) | Block rotation dimension; auto-detected from model dimensions |
| `trainable_rotation` | `False` | Enable SpinQuant learnable rotation (**experimental**) |
Comment thread docs/step_by_step.md
Comment on lines +910 to +912
- **Deterministic rotations** (R1–R4): Only metadata (type + seed) is stored — matrices are regenerated on load
- **Random rotations**: The random sign vector is stored as a compact int8 buffer (~hidden_size bytes)
- **Online hooks** (R3/R4): Automatically re-registered during model loading
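The storage note for random rotations can be illustrated with a small sketch. It assumes the random Hadamard form H×diag(±1) mentioned in the parameter table and a Sylvester-type H. The helper names (`hadamard`, `pack_signs`, `rebuild_rotation`) are hypothetical and are not AutoRound functions; the actual serialization format may differ.

```python
import random
from array import array


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def pack_signs(signs):
    """Store the +/-1 sign vector as signed 8-bit integers (one byte each)."""
    return array("b", signs).tobytes()


def rebuild_rotation(buf):
    """Regenerate R = (H / sqrt(n)) @ diag(signs) from the stored bytes."""
    signs = array("b")
    signs.frombytes(buf)
    n = len(signs)
    scale = n ** -0.5
    # Multiplying by diag(signs) on the right scales column j by signs[j].
    return [[scale * h * s for h, s in zip(row, signs)] for row in hadamard(n)]


hidden_size = 8  # illustrative value only
rng = random.Random(0)
signs = [rng.choice((-1, 1)) for _ in range(hidden_size)]

buf = pack_signs(signs)
R = rebuild_rotation(buf)
print(len(buf), "bytes stored for a", len(R), "x", len(R[0]), "rotation")
```

Only the sign vector is persisted in this sketch; the dense matrix is rebuilt on demand, which is why the stored size scales with hidden_size rather than hidden_size squared.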
Comment thread docs/step_by_step_CN.md
Comment on lines +818 to +822
# R1 only (fast, good baseline improvement)
ar = AutoRound(model, scheme="MXFP4", rotation_config=SpinQuantConfig(r1=True))

# R1 + R2 (better, no runtime overhead after fuse)
ar = AutoRound(model, scheme="MXFP4", rotation_config=SpinQuantConfig(r1=True, r2=True))
Comment thread docs/step_by_step_CN.md
Comment on lines +824 to +825
# R1 + R2 + R3 + R4 (best accuracy, slight runtime overhead from hooks)
ar = AutoRound(model, scheme="MXFP4", rotation_config=SpinQuantConfig(r1=True, r2=True, r3=True, r4=True))
Comment thread docs/step_by_step_CN.md
Comment on lines +832 to +833
| `"quarot"` | `SpinQuantConfig(r1=True, r2=True, r3=True, r4=True)`: deterministic Hadamard, no training |
| `"spinquant"` | `SpinQuantConfig(r1=True, r2=True, r3=True, r4=True, trainable_rotation=True)`: **experimental**, see note below |
Comment thread docs/step_by_step_CN.md
Comment on lines +857 to +861
| `r1` / `r2` / `r3` / `r4` | `False` | Enable rotation at each position |
| `online_r1_rotation` | `True` | R1 applied via hook (`True`) or fused into weights (`False`) |
| `random_r1` / `r2` / `r3` / `r4` | `False` | Use random Hadamard (H×diag(±1)) instead of deterministic |
| `rotation_size` | `None` (auto) | Block rotation dimension; auto-detected from model dimensions |
| `trainable_rotation` | `False` | Enable SpinQuant learnable rotation (**experimental**) |
Comment thread docs/step_by_step_CN.md
Comment on lines +877 to +879
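- **Deterministic rotations** (R1–R4): only metadata (type + seed) is stored; matrices are regenerated on load
- **Random rotations**: the random sign vector is stored as a compact int8 buffer (about hidden_size bytes)
- **Online hooks** (R3/R4): automatically re-registered during model loading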
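Comment thread auto_round/algorithms/transforms/spinquant/preprocessor.py
# A random Hadamard matrix (H × diag(±1)) is orthonormal but generally
# NOT symmetric, so rotate_in_channels_ (which applies R.T) would break
# equivalence with the activation-side hook (x @ R).
# Pass R.T so it computes W @ (R.T).T = W @ R.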
rotate_in_channels_(module, R_in=R.T)
Contributor

Add unit tests to verify equivalence.
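A minimal sketch of the kind of equivalence check requested here, written in plain Python so it has no dependencies. It does not call `rotate_in_channels_` or any AutoRound code; all helpers (`hadamard`, `matmul`, `transpose`, `random_hadamard`, `close`) are hypothetical stand-ins, and the linear layer is assumed to compute `y = x @ W.T`. It checks that, with an activation hook applying `x @ R`, fusing `W @ R` into the weight preserves the output, while fusing `W @ R.T` does not once R is non-symmetric.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def random_hadamard(signs):
    """R = (H / sqrt(n)) @ diag(signs); orthonormal, generally not symmetric."""
    n = len(signs)
    scale = n ** -0.5
    return [[scale * h * s for h, s in zip(row, signs)] for row in hadamard(n)]


def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


R = random_hadamard([1, -1, -1, -1])       # fixed signs keep the check deterministic
x = [[1.0, 2.0, 3.0, 4.0]]                 # one activation row
W = [[1.0, 0.0, 2.0, -1.0],                # weight of shape (out_features, in_features)
     [0.0, 3.0, -1.0, 2.0]]

y_ref = matmul(x, transpose(W))            # unrotated layer output
x_rot = matmul(x, R)                       # activation-side hook: x @ R

W_good = matmul(W, R)                      # fused weight matching the hook
W_bad = matmul(W, transpose(R))            # fused weight using R.T instead

y_good = matmul(x_rot, transpose(W_good))  # x R R^T W^T = x W^T
y_bad = matmul(x_rot, transpose(W_bad))    # x R R W^T, differs when R != R^T

assert not close(R, transpose(R))          # R is indeed non-symmetric
assert close(y_good, y_ref)
assert not close(y_bad, y_ref)
```

A real test would presumably run the same comparison through the actual module and hook with a tolerance suited to the dtype; the sketch only isolates the algebra behind the fix.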

Comment thread docs/step_by_step.md
- [Enable multiple gpus calibration in lm_head quantization](#enable-multiple-gpus-calibration-in-lm_head-quantization)
+ [Adjust Hyperparameters](#adjust-hyperparameters)
+ [Hadamard Transform-Research Feature](#hadamard-transform)
+ [Rotation](#rotation)
Contributor

Please restore the "Research Feature" label. Any feature that lacks effective kernel support or has low adoption among serving frameworks should fall into this category.

Comment thread docs/step_by_step.md
### Rotation

**Research feature with no effective kernels currently available and typically low community adoption.**
AutoRound supports rotation-based transforms to improve quantization accuracy. Rotation redistributes outliers in weights and activations before quantization, making the distribution more uniform and quantization-friendly.
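As a toy illustration of how a rotation redistributes an outlier, the sketch below applies a normalized 4×4 Hadamard matrix to a vector with one large entry. It is plain Python, independent of AutoRound, and the helper names (`hadamard`, `rotate`) are hypothetical. The rotation spreads the outlier evenly across all positions while leaving the vector norm unchanged.

```python
import math


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def rotate(x, H):
    """Compute x @ (H / sqrt(n)) for a row vector x."""
    n = len(H)
    scale = 1.0 / math.sqrt(n)
    return [scale * sum(x[i] * H[i][j] for i in range(n)) for j in range(n)]


x = [8.0, 0.0, 0.0, 0.0]            # a single large outlier
y = rotate(x, hadamard(4))          # every entry becomes 4.0

max_before = max(abs(v) for v in x)
max_after = max(abs(v) for v in y)
norm_before = math.sqrt(sum(v * v for v in x))
norm_after = math.sqrt(sum(v * v for v in y))

print("max |value|:", max_before, "->", max_after)
print("L2 norm:    ", norm_before, "->", norm_after)
```

A smaller peak magnitude at the same norm means a quantizer with a shared scale wastes less of its range on a single element, which is the intuition behind the accuracy benefit described above.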
Contributor

@wenhuach21 Jun 4, 2026

As mentioned before, this is a user guide rather than a technical notebook. Focus on presenting the data and helping users understand when to use this feature and how to use it easily.

Detailed implementation should be hidden behind expandable sections (e.g., "Details") for advanced users who want to dive deeper.
