refactor and support for multi algs fusion #1852
Open
n1ck-guo wants to merge 26 commits into
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Contributor
Pull request overview
This PR introduces a composable quantization “pipeline” abstraction that separates preprocessors (e.g., AWQ smoothing) from the terminal block quantizer (e.g., RTN / SignRound), and refactors compressor + CLI wiring to support multi-algorithm composition via ordered config lists.
Changes:
- Added `QuantizationPipeline`/`BlockContext` orchestration and a unified quantizer/preprocessor lifecycle (`prepare_run`, `block_forward_hooks`, `pre_quantize_block`, `quantize_block`, `finalize_run`).
- Refactored quantizers (RTN, SignRound, SignRoundV2, AWQ) to use the new pipeline context and hook model.
- Rewrote the CLI into `auto_round/cli/` with algorithm handlers + config building, and added docs + tests for pipeline composition.
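To make the lifecycle above concrete, here is a minimal, self-contained sketch of how such hooks might be ordered around a block loop. It is not the PR's implementation: the classes `ToyBlockContext`, `ToyMember`, `ToyScalePreprocessor`, `ToyRoundQuantizer` and the function `run_pipeline` are hypothetical, only the hook names `prepare_run`, `pre_quantize_block`, `quantize_block`, and `finalize_run` come from the summary, and `block_forward_hooks` is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlockContext:
    """Hypothetical per-block state shared by all pipeline members."""
    name: str
    weights: list
    log: list = field(default_factory=list)


class ToyMember:
    """Hypothetical base class: run-level hooks default to no-ops."""

    def prepare_run(self):
        pass

    def finalize_run(self):
        pass


class ToyScalePreprocessor(ToyMember):
    """Stand-in for a smoothing step; this is NOT the AWQ algorithm."""

    def __init__(self, scale):
        self.scale = scale

    def pre_quantize_block(self, ctx):
        ctx.weights = [w * self.scale for w in ctx.weights]
        ctx.log.append("pre_quantize_block")


class ToyRoundQuantizer(ToyMember):
    """Stand-in terminal quantizer: rounds each weight to an integer."""

    def quantize_block(self, ctx):
        ctx.weights = [float(round(w)) for w in ctx.weights]
        ctx.log.append("quantize_block")


def run_pipeline(preprocessors, quantizer, blocks):
    """Run every preprocessor on a block, then the terminal quantizer."""
    members = [*preprocessors, quantizer]
    for member in members:
        member.prepare_run()
    for ctx in blocks:
        for pre in preprocessors:
            pre.pre_quantize_block(ctx)
        quantizer.quantize_block(ctx)
    for member in members:
        member.finalize_run()
    return blocks


blocks = [ToyBlockContext("block0", [0.2, 0.6])]
run_pipeline([ToyScalePreprocessor(2.0)], ToyRoundQuantizer(), blocks)
print(blocks[0].weights, blocks[0].log)
```

The point of the sketch is only the ordering: preprocessors see each block before the terminal quantizer does, and run-level hooks bracket the whole loop.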
Reviewed changes
Copilot reviewed 37 out of 37 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/test_cuda/quantization/test_torch_compile.py | Update test imports/usages for new BaseQuantizer naming. |
| test/test_cuda/algorithms/test_alg_ext.py | Update alg-ext regression expectations for SignRoundV2 wrapper behavior. |
| test/test_cpu/utils/test_alg_ext.py | Update import smoke for new SignRoundV2 location. |
| test/test_cpu/core/test_pipeline_fail_fast.py | New unit tests for registry + pipeline construction failure modes. |
| test/test_cpu/core/test_awq_autoround_smoke.py | New smoke test for AWQ + AutoRound fusion via config list. |
| docs/step_by_step.md | Document AWQ as a preprocessing algorithm + CLI/API composition examples. |
| docs/step_by_step_CN.md | Chinese translation updates for the AWQ section. |
| auto_round/context/model.py | Minor refactor/formatting + rename doc reference to BaseQuantizer. |
| auto_round/compressors/zero_shot.py | Adapt zero-shot RTN path to use BlockContext. |
| auto_round/compressors/entry.py | Centralized alias/config resolution + routing for pipeline/preprocessors and model-free path. |
| auto_round/compressors/data_driven.py | Refactor block loop to pipeline lifecycle + hook scheduling via BlockContext. |
| auto_round/compressors/config.py | Removed legacy ExtraConfig container module. |
| auto_round/compressors/base.py | Build and expose QuantizationPipeline; quantizer becomes a forwarding property. |
| auto_round/compressors/__init__.py | Remove legacy config exports; keep lazy imports for compressors. |
| auto_round/cli/parser.py | New argparse construction module for quantize/list/eval commands. |
| auto_round/cli/main.py | New CLI router + recipe defaults + algorithm-help printing. |
| auto_round/cli/algorithms.py | New CLI algorithm registry/handlers that build ordered config lists. |
| auto_round/cli/__init__.py | Export CLI entrypoints from the new CLI package. |
| auto_round/calibration/state.py | Update doc reference from BaseQuantizers to BaseQuantizer. |
| auto_round/autoround.py | Add alg_configs fast-path into new entry point; adjust skip args. |
| auto_round/algorithms/transforms/rotation/config.py | Add/expand RotationConfig docstring and minor formatting. |
| auto_round/algorithms/quantization/sign_roundv2/quantizer.py | Move imatrix hooks into quantizer hook context + minor cleanups. |
| auto_round/algorithms/quantization/sign_round/quantizer.py | Convert block quantization API to ctx (BlockContext) and use BlockIO. |
| auto_round/algorithms/quantization/sign_round/config.py | Replace sparse docstring with a clearer/structured one. |
| auto_round/algorithms/quantization/rtn/quantizer.py | Convert RTN to ctx API; move imatrix hook logic into context manager. |
| auto_round/algorithms/quantization/rtn/config.py | Add docstring and minor logging formatting. |
| auto_round/algorithms/quantization/registry.py | New alias→config registry used by entry routing. |
| auto_round/algorithms/quantization/pipeline.py | New pipeline abstraction (BlockContext, RunContext, policy merge, IO helpers). |
| auto_round/algorithms/quantization/config.py | Add top-level QuantizationConfig docstring. |
| auto_round/algorithms/quantization/base.py | Introduce BasePipelineMember, BaseWeightTransformer, BaseQuantizer, mixins; new hook lifecycle. |
| auto_round/algorithms/quantization/awq/mappings.py | Downgrade a log line from warning to info for hybrid-attention mapping build. |
| auto_round/algorithms/quantization/awq/config.py | Reframe AWQConfig as a preprocessor config; improve validation and repr. |
| auto_round/algorithms/quantization/awq/__init__.py | Export AWQConfig/AWQQuantizer from the awq subpackage. |
| auto_round/algorithms/quantization/__init__.py | Re-export pipeline types/quantizers; currently contains duplicate AWQ imports. |
| auto_round/alg_ext.py | Removed legacy alg-ext implementation module. |
| auto_round/__main__.py | Replace huge legacy CLI with a shim that forwards to auto_round.cli.main. |
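The table mentions an alias→config registry and fail-fast tests for it. A minimal sketch of that idea, assuming a plain dictionary keyed by lowercase alias, could look like the following. The names `register`, `resolve`, and `ToyRTNConfig` are hypothetical and are not taken from `auto_round/algorithms/quantization/registry.py`.

```python
from dataclasses import dataclass

# Hypothetical alias -> config-class table; not the PR's registry.
_REGISTRY = {}


def register(alias, config_cls):
    """Register a config class under a case-insensitive alias."""
    key = alias.lower()
    if key in _REGISTRY:
        raise ValueError(f"alias {alias!r} is already registered")
    _REGISTRY[key] = config_cls


def resolve(alias, **kwargs):
    """Build a config from an alias, failing fast on unknown names."""
    try:
        config_cls = _REGISTRY[alias.lower()]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY)) or "<none>"
        raise KeyError(f"unknown algorithm {alias!r}; known: {known}") from None
    return config_cls(**kwargs)


@dataclass
class ToyRTNConfig:
    """Illustrative config with a single made-up field."""
    bits: int = 4


register("rtn", ToyRTNConfig)
print(resolve("RTN", bits=8))
```

Raising at lookup time, rather than deep inside the block loop, is what "fail fast" means here: a misspelled alias is reported before any model is loaded.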
Comments suppressed due to low confidence (1)
auto_round/algorithms/quantization/__init__.py:41
- Duplicate imports for AWQConfig/AWQQuantizer are present twice in this module, which is redundant and can trigger linting/formatting issues. Remove the second pair of imports and keep a single import location.
from auto_round.algorithms.quantization.awq.config import AWQConfig
from auto_round.algorithms.quantization.awq.quantizer import AWQQuantizer
from auto_round.algorithms.quantization.rtn.config import RTNConfig
from auto_round.algorithms.quantization.rtn.quantizer import RTNQuantizer, OptimizedRTNQuantizer
Contributor
Common tuning parameters should be transferable across algorithms. For example, AWQ and SignRound share the weight clip ratio.
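One possible way to read this suggestion is sketched below: a value set on one config is copied to every other config in the list that declares the same field but left it unset, and conflicting explicit values are rejected. Everything here is an assumption for illustration, including the field name `weight_clip_ratio`, the other fields, the toy config classes, and the helper `share_params`; none of it is the PR's API.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class ToyAWQConfig:
    """Illustrative preprocessor config; field names are made up."""
    weight_clip_ratio: Optional[float] = None
    n_grid: int = 20


@dataclass(frozen=True)
class ToySignRoundConfig:
    """Illustrative terminal-quantizer config; field names are made up."""
    weight_clip_ratio: Optional[float] = None
    iters: int = 200


def share_params(configs, shared=("weight_clip_ratio",)):
    """Copy each shared field's explicit value to configs that left it unset."""
    resolved = {}
    for name in shared:
        values = {getattr(c, name) for c in configs if hasattr(c, name)} - {None}
        if len(values) > 1:
            raise ValueError(f"conflicting values for {name}: {sorted(values)}")
        if values:
            resolved[name] = values.pop()
    result = []
    for cfg in configs:
        own = {f.name for f in fields(cfg)}
        updates = {k: v for k, v in resolved.items() if k in own}
        result.append(replace(cfg, **updates))
    return result


awq, sign_round = share_params([ToyAWQConfig(weight_clip_ratio=0.9), ToySignRoundConfig()])
print(sign_round)
```

Rejecting conflicting values instead of silently picking one keeps the composed run reproducible; a real design might prefer an explicit precedence rule instead.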
…-round into hengguo/refactor_algs
Contributor
/azp run Unit-Test-CUDA-AutoRound
Azure Pipelines successfully started running 1 pipeline(s).
Description
What this PR does
Introduces a composable QuantizationPipeline that separates pre-processing algorithms (e.g. AWQ) from the terminal block-quantizer (e.g. AutoRound/RTN), and lets users compose them declaratively via config lists.
Key changes:
Usage: AWQ + AutoRound fusion
Passing a list of configs activates the pipeline: AWQ smoothing runs first on each block, then AutoRound's SignSGD optimization runs on the smoothed weights. Passing a single config (the existing API) continues to work unchanged.
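The list-versus-single-config behavior described above can be sketched as a small normalization step. This is a hypothetical illustration, not the PR's entry-point code: `ToyConfig`, its `is_preprocessor` flag, and `split_configs` are invented names, and the rule that the terminal quantizer must come last is an assumption inferred from the ordering in the description.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Illustrative config; `is_preprocessor` is a made-up marker."""
    name: str
    is_preprocessor: bool = False


def split_configs(config):
    """Accept one config or an ordered list; return (preprocessors, terminal)."""
    configs = list(config) if isinstance(config, (list, tuple)) else [config]
    if not configs:
        raise ValueError("at least one config is required")
    *head, terminal = configs
    # Assumed rule: the terminal block quantizer is the last entry.
    if terminal.is_preprocessor:
        raise ValueError("the last config must be a terminal quantizer")
    for cfg in head:
        if not cfg.is_preprocessor:
            raise ValueError(f"{cfg.name!r} is a terminal quantizer but is not last")
    return head, terminal


# A list composes a preprocessor with a terminal quantizer.
pre, terminal = split_configs([ToyConfig("awq", True), ToyConfig("autoround")])
print([c.name for c in pre], terminal.name)

# A single config still works and yields no preprocessors.
print(split_configs(ToyConfig("rtn")))
```

Wrapping a single config in a one-element list is what lets the older call style share the same code path as the composed one.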
Compatibility
Type of Change
New feature
Related Issues
Fixes or relates to #
Checklist Before Submitting
/azp run Unit-Test-CUDA-AutoRound.