refactor and support for multi algs fusion by n1ck-guo · Pull Request #1852 · intel/auto-round

n1ck-guo · 2026-05-26T00:29:53Z

Description

What this PR does
Introduces a composable QuantizationPipeline that separates pre-processing algorithms (e.g. AWQ) from the terminal block-quantizer (e.g. AutoRound/RTN), and lets users compose them declaratively via config lists.

Key changes:

- QuantizationPipeline (algorithms/quantization/pipeline.py): new orchestration layer composed as [preprocessors…] + block_quantizer. It replaces the implicit algorithm coupling in DataDrivenCompressor.
- BasePipelineMember / BaseWeightTransformer / BaseQuantizer (base.py): a clean class hierarchy with unified lifecycle hooks (prepare_run, quantize_block, finalize_run).
- AWQConfig + AWQQuantizer: refactored as a BaseWeightTransformer, i.e. a pure weight-smoothing preprocessor with no quantization loop of its own.
- DiffusionMixin: injected dynamically at pipeline construction time (is_diffusion=True), so algorithm code contains no `if is_diffusion` branches.
- CLI (auto_round/cli/): rewritten to expose --alg_configs for composing pipelines from the command line.
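
The dynamic mixin injection mentioned in the list above can be illustrated with a minimal, self-contained sketch. It does not reproduce the PR's code: `DemoBlockQuantizer`, `DemoDiffusionMixin`, `describe_inputs`, and `build_quantizer` are hypothetical names, and the only assumption taken from the description is that a mixin is combined with the quantizer class when `is_diffusion=True`.

```python
class DemoBlockQuantizer:
    """Hypothetical stand-in for a terminal block quantizer."""

    def describe_inputs(self) -> str:
        return "hidden_states"


class DemoDiffusionMixin:
    """Hypothetical mixin holding the diffusion-specific behaviour."""

    def describe_inputs(self) -> str:
        # Extend the base behaviour instead of branching inside it.
        return super().describe_inputs() + ", timestep"


def build_quantizer(base_cls: type, is_diffusion: bool = False):
    """Instantiate base_cls, mixing in DemoDiffusionMixin on request."""
    if is_diffusion:
        # type() builds a new class at runtime. The mixin is listed first so
        # its methods win in the method resolution order.
        base_cls = type(
            f"Diffusion{base_cls.__name__}", (DemoDiffusionMixin, base_cls), {}
        )
    return base_cls()


plain = build_quantizer(DemoBlockQuantizer)
diffusion = build_quantizer(DemoBlockQuantizer, is_diffusion=True)
```

With this pattern the base class never tests a flag; the decision is made once, where the object is constructed.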

Usage: AWQ + AutoRound fusion

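from auto_round import AutoRound
from auto_round.algorithms.quantization.awq.config import AWQConfig
from auto_round.algorithms.quantization.sign_round.config import SignRoundConfig

# Placeholder: substitute the Hugging Face model ID or local path to quantize.
model_name = "facebook/opt-125m"

# Configs run in list order: AWQ smoothing first, then SignRound tuning.
ar = AutoRound(
    [AWQConfig(), SignRoundConfig(iters=200)],
    model_name,
    scheme="W4A16",
)
model, layer_config = ar.quantize()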

Passing a list of configs activates the pipeline: AWQ smoothing runs first on each block, then AutoRound's SignSGD optimization runs on the smoothed weights. Passing a single config (old API) continues to work unchanged.
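
One way to accept both a single config and a list is to normalize the argument before building the pipeline. The sketch below is an illustration only: `DemoPreprocessorConfig`, `DemoQuantizerConfig`, and `split_configs` are hypothetical, and the rule that exactly one block quantizer must come last is an assumption drawn from the "[preprocessors…] + block_quantizer" description, not from the PR's code.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


@dataclass
class DemoPreprocessorConfig:
    name: str


@dataclass
class DemoQuantizerConfig:
    name: str


DemoConfig = Union[DemoPreprocessorConfig, DemoQuantizerConfig]


def split_configs(
    configs: Union[DemoConfig, Sequence[DemoConfig]],
) -> Tuple[List[DemoConfig], DemoQuantizerConfig]:
    """Return (preprocessor_configs, quantizer_config)."""
    # A bare config (the old single-config style) becomes a one-element list.
    if isinstance(configs, (DemoPreprocessorConfig, DemoQuantizerConfig)):
        configs = [configs]
    configs = list(configs)

    quantizers = [c for c in configs if isinstance(c, DemoQuantizerConfig)]
    # Assumption: a pipeline has exactly one terminal block quantizer.
    if len(quantizers) != 1:
        raise ValueError(f"expected exactly one quantizer config, got {len(quantizers)}")
    # Assumption: the quantizer is the last entry, after all preprocessors.
    if configs[-1] is not quantizers[0]:
        raise ValueError("the block quantizer config must come last")
    return configs[:-1], quantizers[0]


single = split_configs(DemoQuantizerConfig("sign_round"))
fused = split_configs(
    [DemoPreprocessorConfig("awq"), DemoQuantizerConfig("sign_round")]
)
```

Failing at construction time, before any calibration data is loaded, keeps a misordered list from surfacing as a late runtime error.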

Compatibility

Single-config API (AutoRound(model, ...)) is fully backward compatible.
All existing CPU tests pass; pre-existing environment failures (missing auto-round-lib, device fixtures) are unrelated to this PR.

Type of Change

New feature

Related Issues

Fixes or relates to #

Checklist Before Submitting

My code has been tested locally.
Documentation has been updated as needed.
New or updated tests are included where applicable.
The CUDA CI has passed. You can trigger it by commenting /azp run Unit-Test-CUDA-AutoRound.

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Copilot

Pull request overview

This PR introduces a composable quantization “pipeline” abstraction that separates preprocessors (e.g., AWQ smoothing) from the terminal block quantizer (e.g., RTN / SignRound), and refactors compressor + CLI wiring to support multi-algorithm composition via ordered config lists.

Changes:

Added QuantizationPipeline / BlockContext orchestration and a unified quantizer/preprocessor lifecycle (prepare_run, block_forward_hooks, pre_quantize_block, quantize_block, finalize_run).
Refactored quantizers (RTN, SignRound, SignRoundV2, AWQ) to use the new pipeline context and hook model.
Rewrote the CLI into auto_round/cli/ with algorithm handlers + config building, and added docs + tests for pipeline composition.
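
The hook lifecycle named in the first item above can be sketched as follows. This is a hedged illustration rather than the PR's implementation: the hook names come from the review text, but the classes (`DemoMember`, `DemoQuantizer`), the `run_pipeline` function, and the exact call order are assumptions, and the calibration forward pass is left as a comment.

```python
from contextlib import ExitStack, contextmanager


class DemoMember:
    """Hypothetical pipeline member that records each hook call."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def prepare_run(self):
        self.log.append(f"{self.name}.prepare_run")

    @contextmanager
    def block_forward_hooks(self, block):
        # Hooks are installed on entry and always removed on exit.
        self.log.append(f"{self.name}.hooks_on:{block}")
        try:
            yield
        finally:
            self.log.append(f"{self.name}.hooks_off:{block}")

    def pre_quantize_block(self, block):
        self.log.append(f"{self.name}.pre_quantize_block:{block}")

    def finalize_run(self):
        self.log.append(f"{self.name}.finalize_run")


class DemoQuantizer(DemoMember):
    """Hypothetical terminal member: the only one that quantizes."""

    def quantize_block(self, block):
        self.log.append(f"{self.name}.quantize_block:{block}")


def run_pipeline(preprocessors, quantizer, blocks):
    members = [*preprocessors, quantizer]
    for member in members:
        member.prepare_run()
    for block in blocks:
        with ExitStack() as stack:
            for member in members:
                stack.enter_context(member.block_forward_hooks(block))
            # A calibration forward pass over `block` would run here.
        for member in members:
            member.pre_quantize_block(block)
        quantizer.quantize_block(block)
    for member in members:
        member.finalize_run()


log = []
run_pipeline([DemoMember("awq", log)], DemoQuantizer("sign_round", log), ["block0"])
```

Using `ExitStack` means every member's hooks are removed in reverse order even if the forward pass raises.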

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `test/test_cuda/quantization/test_torch_compile.py` | Update test imports/usages for new `BaseQuantizer` naming. |
| `test/test_cuda/algorithms/test_alg_ext.py` | Update alg-ext regression expectations for SignRoundV2 wrapper behavior. |
| `test/test_cpu/utils/test_alg_ext.py` | Update import smoke for new SignRoundV2 location. |
| `test/test_cpu/core/test_pipeline_fail_fast.py` | New unit tests for registry + pipeline construction failure modes. |
| `test/test_cpu/core/test_awq_autoround_smoke.py` | New smoke test for AWQ + AutoRound fusion via config list. |
| `docs/step_by_step.md` | Document AWQ as a preprocessing algorithm + CLI/API composition examples. |
| `docs/step_by_step_CN.md` | Chinese translation updates for the AWQ section. |
| `auto_round/context/model.py` | Minor refactor/formatting + rename doc reference to `BaseQuantizer`. |
| `auto_round/compressors/zero_shot.py` | Adapt zero-shot RTN path to use `BlockContext`. |
| `auto_round/compressors/entry.py` | Centralized alias/config resolution + routing for pipeline/preprocessors and model-free path. |
| `auto_round/compressors/data_driven.py` | Refactor block loop to pipeline lifecycle + hook scheduling via `BlockContext`. |
| `auto_round/compressors/config.py` | Removed legacy `ExtraConfig` container module. |
| `auto_round/compressors/base.py` | Build and expose `QuantizationPipeline`; `quantizer` becomes a forwarding property. |
| `auto_round/compressors/__init__.py` | Remove legacy config exports; keep lazy imports for compressors. |
| `auto_round/cli/parser.py` | New argparse construction module for quantize/list/eval commands. |
| `auto_round/cli/main.py` | New CLI router + recipe defaults + algorithm-help printing. |
| `auto_round/cli/algorithms.py` | New CLI algorithm registry/handlers that build ordered config lists. |
| `auto_round/cli/__init__.py` | Export CLI entrypoints from the new CLI package. |
| `auto_round/calibration/state.py` | Update doc reference from `BaseQuantizers` to `BaseQuantizer`. |
| `auto_round/autoround.py` | Add `alg_configs` fast-path into new entry point; adjust skip args. |
| `auto_round/algorithms/transforms/rotation/config.py` | Add/expand RotationConfig docstring and minor formatting. |
| `auto_round/algorithms/quantization/sign_roundv2/quantizer.py` | Move imatrix hooks into quantizer hook context + minor cleanups. |
| `auto_round/algorithms/quantization/sign_round/quantizer.py` | Convert block quantization API to `ctx` (`BlockContext`) and use `BlockIO`. |
| `auto_round/algorithms/quantization/sign_round/config.py` | Replace sparse docstring with a clearer/structured one. |
| `auto_round/algorithms/quantization/rtn/quantizer.py` | Convert RTN to `ctx` API; move imatrix hook logic into context manager. |
| `auto_round/algorithms/quantization/rtn/config.py` | Add docstring and minor logging formatting. |
| `auto_round/algorithms/quantization/registry.py` | New alias→config registry used by entry routing. |
| `auto_round/algorithms/quantization/pipeline.py` | New pipeline abstraction (`BlockContext`, `RunContext`, policy merge, IO helpers). |
| `auto_round/algorithms/quantization/config.py` | Add top-level `QuantizationConfig` docstring. |
| `auto_round/algorithms/quantization/base.py` | Introduce `BasePipelineMember`, `BaseWeightTransformer`, `BaseQuantizer`, mixins; new hook lifecycle. |
| `auto_round/algorithms/quantization/awq/mappings.py` | Downgrade a log line from warning to info for hybrid-attention mapping build. |
| `auto_round/algorithms/quantization/awq/config.py` | Reframe AWQConfig as a preprocessor config; improve validation and repr. |
| `auto_round/algorithms/quantization/awq/__init__.py` | Export AWQConfig/AWQQuantizer from the awq subpackage. |
| `auto_round/algorithms/quantization/__init__.py` | Re-export pipeline types/quantizers; currently contains duplicate AWQ imports. |
| `auto_round/alg_ext.py` | Removed legacy alg-ext implementation module. |
| `auto_round/__main__.py` | Replace huge legacy CLI with a shim that forwards to `auto_round.cli.main`. |
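
The alias→config registry and its fail-fast behaviour, both mentioned in the table above, might look roughly like the following. This is a sketch under stated assumptions: `register`, `resolve`, `DemoAWQConfig`, `DemoRTNConfig`, and their fields are hypothetical and are not taken from `registry.py`.

```python
from dataclasses import dataclass

# Maps a lowercase alias to the config class it constructs.
_REGISTRY = {}


def register(alias):
    """Class decorator that records a config class under `alias`."""

    def decorator(cls):
        key = alias.lower()
        if key in _REGISTRY:
            raise ValueError(f"alias {alias!r} is already registered")
        _REGISTRY[key] = cls
        return cls

    return decorator


def resolve(alias, **kwargs):
    """Build the config for `alias`, failing fast on unknown names."""
    try:
        cls = _REGISTRY[alias.lower()]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(
            f"unknown algorithm {alias!r}; known aliases: {known}"
        ) from None
    return cls(**kwargs)


@register("awq")
@dataclass
class DemoAWQConfig:
    n_grid: int = 20  # hypothetical field, for illustration only


@register("rtn")
@dataclass
class DemoRTNConfig:
    bits: int = 4  # hypothetical field, for illustration only


awq_cfg = resolve("AWQ")
rtn_cfg = resolve("rtn", bits=8)
```

Listing the known aliases in the error message lets a mistyped name on the command line be corrected without reading the source.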

Comments suppressed due to low confidence (1)

auto_round/algorithms/quantization/__init__.py:41

AWQConfig and AWQQuantizer are imported twice in this module, which is redundant and can trigger linting/formatting issues. Remove the second pair of imports and keep a single import location.

from auto_round.algorithms.quantization.awq.config import AWQConfig
from auto_round.algorithms.quantization.awq.quantizer import AWQQuantizer
from auto_round.algorithms.quantization.rtn.config import RTNConfig
from auto_round.algorithms.quantization.rtn.quantizer import RTNQuantizer, OptimizedRTNQuantizer

wenhuach21 · 2026-05-26T01:49:29Z

Common tuning parameters should be transferable across algorithms. For example, AWQ and SignRound share the weight clip ratio.
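
One possible reading of this request is a helper that copies shared values into every config that declares the field. The sketch below is an assumption-laden illustration, not the approach the PR took: the `clip_ratio` field name, both demo config classes, and `share_params` are hypothetical.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass
class DemoAWQConfig:
    clip_ratio: Optional[float] = None  # hypothetical shared field


@dataclass
class DemoSignRoundConfig:
    iters: int = 200
    clip_ratio: Optional[float] = None  # hypothetical shared field


def share_params(configs, shared):
    """Fill shared values into configs that declare the field but left it unset."""
    result = []
    for cfg in configs:
        declared = {f.name for f in fields(cfg)}
        updates = {
            key: value
            for key, value in shared.items()
            # An explicit per-algorithm value takes precedence over the shared one.
            if key in declared and getattr(cfg, key) is None
        }
        # replace() returns a copy, so the caller's configs are not mutated.
        result.append(replace(cfg, **updates))
    return result


awq_cfg, sign_round_cfg = share_params(
    [DemoAWQConfig(), DemoSignRoundConfig(clip_ratio=0.9)],
    {"clip_ratio": 0.8},
)
```

Here the AWQ config inherits the shared value while the SignRound config keeps the value it set explicitly.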

chensuyue · 2026-05-29T05:50:33Z

/azp run Unit-Test-CUDA-AutoRound

azure-pipelines · 2026-05-29T05:50:42Z

Azure Pipelines successfully started running 1 pipeline(s).

chensuyue · 2026-06-04T06:18:08Z

/azp run Unit-Test-CUDA-AutoRound

azure-pipelines · 2026-06-04T06:18:17Z

Azure Pipelines successfully started running 1 pipeline(s).

refactor and support for multi algs fusion

d8fc7cc

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Copilot AI review requested due to automatic review settings May 26, 2026 00:29

Copilot started reviewing on behalf of n1ck-guo May 26, 2026 00:30

n1ck-guo and others added 2 commits May 26, 2026 08:32

Merge branch 'main' into hengguo/refactor_algs

18e8b15

[pre-commit.ci] auto fixes from pre-commit.com hooks

4de34b8

for more information, see https://pre-commit.ci

Copilot AI reviewed May 26, 2026

Comment thread auto_round/compressors/entry.py Outdated

Comment thread auto_round/compressors/data_driven.py

Comment thread auto_round/compressors/__init__.py

n1ck-guo and others added 7 commits May 26, 2026 10:06

fix bugs

bcf3633

fix and hanle shared config

92a9723

Signed-off-by: n1ck-guo <heng.guo@intel.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

bc6569d

for more information, see https://pre-commit.ci

merge main

a9fc7df

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Merge branch 'hengguo/refactor_algs' of https://github.com/intel/auto…

f0e8483

…-round into hengguo/refactor_algs

relocate awq

5a71dbf

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Merge remote-tracking branch 'origin/main' into hengguo/refactor_algs

dae7221

lvliang-intel reviewed May 28, 2026

Comment thread AGENTS.md Outdated

lvliang-intel reviewed May 28, 2026

Comment thread auto_round/algorithms/base.py Outdated

lvliang-intel reviewed May 28, 2026

Comment thread auto_round/algorithms/quantization/registry.py Outdated

lvliang-intel reviewed May 28, 2026

Comment thread auto_round/algorithms/quantization/config.py

n1ck-guo added 5 commits May 28, 2026 15:14

refactor scheme and entry

533f12b

Signed-off-by: n1ck-guo <heng.guo@intel.com>

modify by comments

6a6f97a

Signed-off-by: n1ck-guo <heng.guo@intel.com>

merge main

d3a391c

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Merge remote-tracking branch 'origin/main' into hengguo/refactor_algs

ecefed9

fix ut

cab89be

Signed-off-by: n1ck-guo <heng.guo@intel.com>

n1ck-guo requested a review from WeiweiZhang1 May 29, 2026 01:25

n1ck-guo added 3 commits May 29, 2026 10:22

fix

8b2af76

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Merge remote-tracking branch 'origin/main' into hengguo/refactor_algs

6f3abc7

add llmc api

0ea5774

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Merge remote-tracking branch 'origin/main' into hengguo/refactor_algs

9c7853f

n1ck-guo added 5 commits June 3, 2026 15:54

fix gguf error

4c5c470

Signed-off-by: n1ck-guo <heng.guo@intel.com>

fix ut

8e12461

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Merge remote-tracking branch 'origin/main' into hengguo/refactor_algs

c470601

fix

8e4fd0e

Signed-off-by: n1ck-guo <heng.guo@intel.com>

Merge branch 'main' into hengguo/refactor_algs

1fdb43a

n1ck-guo added 2 commits June 4, 2026 14:32

Merge remote-tracking branch 'origin/main' into hengguo/refactor_algs

40a67fa

Merge branch 'hengguo/refactor_algs' of https://github.com/intel/auto…

b4ae543

…-round into hengguo/refactor_algs

refactor and support for multi algs fusion #1852
n1ck-guo wants to merge 26 commits into main from hengguo/refactor_algs

5 participants
