[MoE Calibration] Simplify MoE calibration interface #1851
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.
@kylesayrs @dsikka A few clarifications:
Force-pushed from 7fefaac to ba42881
@sairampillai, regarding the DCO, you can ignore that. We can sign it via GitHub once reviewed/approved.
This looks good, but I worry that this implementation uses more abstraction than is necessary. I like the idea of "contextual" vs "permanent" changes, and we should definitely log which one is being used to the user.
Please consider simplifying to a single mapping dictionary and a single ABC class to handle the `from_original` and `restore` functions. Don't be afraid to remove/refactor existing code!
Hey @sairampillai! Are you still interested in contributing to this PR? If not, please let me know and I can assign someone to pick up where you left off!
@kylesayrs I am working on the updates, I will push an update soon for review!
Looks great so far, thanks for following up!
Introduce standardized MoE calibration interface and deprecate legacy `replace_modules_for_calibration`
Summary
Implements a simplified, decorator-based registration system for MoE model calibration using a single `MoECalibrationModule` base class, making MoE model integration easier, and deprecates the legacy `replace_modules_for_calibration` function.
Problem
MoE model calibration currently requires module replacement logic scattered across `replace_modules_for_calibration` and manual context management. This makes contributing new MoE model support difficult and error-prone. Additionally, each model requires custom replacement functions with duplicated boilerplate code.
Relevant Issues
Fixes #1829
Solution
- `MoECalibrationModule` abstract base class implementation:
  - `from_original()` classmethod and optional `restore()` method
  - `is_permanent` flag to specify whether the module replacement should be restored using `restore()`
- Decorator-Based Registration:
  - `@register_moe_calibration("ModuleName")` decorator
  - `MOE_CALIBRATION_MODULES` registry
- New Model Integration: adding MoE support requires only a registered calibration module (see the sketch below)
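A minimal sketch of what a new model integration might look like under this interface. The import path, method signatures, and module names here are illustrative assumptions based on the description above, not confirmed API:

```python
# Hypothetical example: registering calibration support for a new MoE block.
# The import path below is an assumption; see the PR diff for actual locations.
from llmcompressor.modeling.moe_context import (
    MoECalibrationModule,
    register_moe_calibration,
)


@register_moe_calibration("MyModelSparseMoeBlock")
class CalibrationMyModelSparseMoeBlock(MoECalibrationModule):
    """Calibration-time replacement for a hypothetical MyModelSparseMoeBlock."""

    # Contextual replacement: restored via restore() when calibration ends.
    is_permanent = False

    @classmethod
    def from_original(cls, original, calibrate_all_experts=True):
        # Build the calibration variant from the original module, e.g. by
        # reusing its weights and sending every token to every expert when
        # calibrate_all_experts is True.
        ...

    def restore(self):
        # Reconstruct and return the original module (only needed when
        # is_permanent is False).
        ...
```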
- Dataset Arguments: new `moe_calibrate_all_experts: bool = True` controls whether all experts see all tokens during calibration:
  - `True` (default): all experts receive all tokens for proper quantization statistics
  - `False`: normal routing behavior (only routed experts are used)
  - Exposed in `oneshot()` and `DatasetArguments` (sketched below)
  - Passed to `moe_calibration_context` by pipelines
by pipelinesAutomatic Context Management:
moe_calibration_context
integrated into pipelinesoneshot.py
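A hedged sketch of the pipeline-side usage; the exact signature of `moe_calibration_context` is an assumption based on this description:

```python
# Assumed import path; see the PR diff for the actual location.
from llmcompressor.modeling import moe_calibration_context

with moe_calibration_context(model, calibrate_all_experts=True):
    # Run calibration forward passes while MoE modules are replaced.
    for batch in calibration_dataloader:
        model(**batch)
# Non-permanent replacements are restored when the context exits.
```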
- Backward Compatibility: deprecation of `replace_modules_for_calibration` with warnings

Test Plan
Testing
Migration Guide
Before:
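A sketch of the legacy flow this PR deprecates; import paths are assumptions:

```python
from llmcompressor import oneshot  # import path may vary by version
from llmcompressor.modeling import replace_modules_for_calibration  # assumed path

# Manually (and permanently) swap MoE modules before calibration.
model = replace_modules_for_calibration(model)
oneshot(model=model, dataset="open_platypus", recipe=recipe)
```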
After:
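A sketch of the new flow, where the pipeline applies `moe_calibration_context` automatically and behavior is controlled through the new dataset argument:

```python
from llmcompressor import oneshot  # import path may vary by version

# MoE module replacement now happens inside the calibration pipeline.
oneshot(
    model=model,
    dataset="open_platypus",
    recipe=recipe,
    moe_calibrate_all_experts=True,  # new flag introduced by this PR
)
```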