
Add Float8ActInt4WeightQATQuantizer #2289


Merged: 1 commit merged into main on Jun 5, 2025

Conversation

andrewor14 (Contributor) commented Jun 2, 2025

Summary: This commit adds a QAT quantizer that performs float8 dynamic activation + int4 symmetric per-channel weight fake quantization. Note that there is no corresponding config for float8 QAT yet; it will be added in a future PR.
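For readers unfamiliar with the weight scheme: int4 symmetric per-channel fake quantization computes one scale per output channel and rounds through a quantize-dequantize step. A minimal sketch (illustrative only, not the PR's implementation; the quant range, eps, and gradient handling here are assumptions):

```python
import torch

def int4_symmetric_per_channel_fake_quant(w: torch.Tensor) -> torch.Tensor:
    # One scale per output channel (dim 0 of a Linear weight), symmetric range.
    # Sketch only: real QAT code routes gradients through a straight-through
    # estimator rather than a plain round().
    qmin, qmax = -8, 7                      # assumed 4-bit signed range
    amax = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-12)
    scale = amax / qmax
    w_q = torch.clamp(torch.round(w / scale), qmin, qmax)
    return w_q * scale                      # dequantize back to float
```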

Test Plan:
python test/quantization/test_qat.py -k test_float8_fake_quantize
python test/quantization/test_qat.py -k test_qat_fp8a4w_quantizer
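A minimal usage sketch, assuming the new quantizer follows torchao's existing two-step QAT interface (prepare before fine-tuning, convert afterwards); the import path and default constructor arguments are assumptions:

```python
import torch
from torchao.quantization.qat import Float8ActInt4WeightQATQuantizer  # path assumed

model = torch.nn.Sequential(torch.nn.Linear(512, 512))

# prepare(): swaps Linear modules for fake-quantized equivalents that apply
# float8 dynamic activation fake-quant + int4 per-channel weight fake-quant
qat_quantizer = Float8ActInt4WeightQATQuantizer()
model = qat_quantizer.prepare(model)

# ... fine-tune `model` as usual; fake quantization runs in every forward ...
```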


pytorch-bot bot commented Jun 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2289

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 656d17d with merge base 4610850:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jun 2, 2025
@andrewor14 andrewor14 marked this pull request as draft June 2, 2025 22:39
@andrewor14 andrewor14 added the topic: new feature label Jun 2, 2025
@andrewor14 andrewor14 force-pushed the fp8-int4-qat-quantizer branch 2 times, most recently from 452c147 to 620f676 on June 3, 2025 16:37
@andrewor14 andrewor14 marked this pull request as ready for review June 3, 2025 16:37
@andrewor14 andrewor14 requested review from jerryzh168 and vkuzo June 3, 2025 16:37
@andrewor14 andrewor14 force-pushed the fp8-int4-qat-quantizer branch 2 times, most recently from c0b808c to 5e08eca on June 3, 2025 18:09
@@ -17,6 +17,9 @@
from torch.ao.quantization.fx._decomposed import quantized_decomposed_lib # noqa: F401

from torchao import quantize_
from torchao.float8.config import ScalingGranularity
Contributor:

I kinda hate that we have ScalingGranularity and the Granularity of the other FP8 inference APIs

Contributor:

I think this is worth fixing before landing. @andrewor14, how about just using rowwise scaling (since I assume that's the one you want) and removing the option to configure it? That will at least keep this problem away from the BC surface of QAT in a way that we can more easily fix later.
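For context, rowwise scaling means one float8 scale per row of the activation tensor, computed dynamically from that row's amax. A rough sketch (illustrative only, not this PR's code; the float8 dtype, eps, and scale convention are assumptions, and straight-through gradient handling is omitted):

```python
import torch

def float8_rowwise_fake_quantize(x: torch.Tensor) -> torch.Tensor:
    # One scale per row (amax over the last dim), recomputed each call,
    # i.e. "dynamic" activation quantization at rowwise granularity.
    f8_dtype = torch.float8_e4m3fn               # assumed target dtype
    f8_max = torch.finfo(f8_dtype).max
    amax = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-12)
    scale = amax.float() / f8_max
    x_f8 = (x.float() / scale).clamp(-f8_max, f8_max).to(f8_dtype)
    return (x_f8.float() * scale).to(x.dtype)    # quantize-dequantize round trip
```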

Contributor Author:

Yeah sure

@vkuzo vkuzo (Contributor) left a comment:

request changes for removing ScalingGranularity from the user API

@andrewor14 andrewor14 force-pushed the fp8-int4-qat-quantizer branch from 5e08eca to 59ca3ca on June 3, 2025 19:37
andrewor14 (Contributor Author) commented Jun 3, 2025

Removed ScalingGranularity from the public API of Float8ActInt4WeightQATQuantizer; please have another look @vkuzo

@andrewor14 andrewor14 requested a review from vkuzo June 3, 2025 19:39
@andrewor14 andrewor14 changed the title Add Float8ActInt4WeightQATQuantizer Jun 3, 2025
@andrewor14 andrewor14 force-pushed the fp8-int4-qat-quantizer branch 2 times, most recently from cf45f47 to cfead5c on June 3, 2025 22:50
@andrewor14 andrewor14 changed the title Add Float8RowwiseActInt4WeightQATQuantizer Jun 3, 2025
@andrewor14 andrewor14 force-pushed the fp8-int4-qat-quantizer branch from cfead5c to 8269247 on June 4, 2025 17:58
@andrewor14 andrewor14 requested a review from vkuzo June 4, 2025 17:59
@andrewor14 andrewor14 force-pushed the fp8-int4-qat-quantizer branch from 8269247 to 2a371fb on June 4, 2025 20:53
@andrewor14 andrewor14 force-pushed the fp8-int4-qat-quantizer branch from 2a371fb to 656d17d on June 4, 2025 20:54
andrewor14 (Contributor Author) commented:
Ok, I'm merging this. The latest commit doesn't use any float8 training classes or functions. The implementation is also hidden so we can always change this if needed. Please let me know if there are any follow-up issues or concerns @vkuzo @drisspg

@andrewor14 andrewor14 merged commit 0d9631b into main Jun 5, 2025
19 checks passed