Add support for bmm and `to` for fbgemm Tensor #2337

jerryzh168 · 2025-06-08T01:02:41Z

Summary:
As the title says, this PR adds support for running quantized bmm. The quantized bmm kernels for int4 and fp8 (with dynamic activation quantization) require the weights to be transposed before they can run, so this PR adds `transpose_input` to the convert function to transpose the weights first.
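To illustrate why a one-time weight transpose can be enough, here is a minimal pure-Python sketch. It assumes (this is not confirmed by the PR text) that the kernel wants each batch's weight stored as (N, K) rather than (K, N). The helper names `bmm`, `transpose_last_two`, and `bmm_transposed_weight` are hypothetical and are not torchao or fbgemm APIs; nested lists stand in for tensors.

```python
# Illustrative only: plain Python lists stand in for tensors, and none of
# these helpers are torchao or fbgemm APIs.

def bmm(x, w):
    """Reference batched matmul: x is (B, M, K), w is (B, K, N) -> (B, M, N)."""
    return [
        [
            [sum(xr[k] * wb[k][n] for k in range(len(wb))) for n in range(len(wb[0]))]
            for xr in xb
        ]
        for xb, wb in zip(x, w)
    ]

def transpose_last_two(w):
    """Swap the last two dims of a batched weight: (B, K, N) -> (B, N, K)."""
    return [[list(col) for col in zip(*wb)] for wb in w]

def bmm_transposed_weight(x, w_t):
    """Assumed kernel contract: the weight arrives already transposed, (B, N, K)."""
    return [
        [[sum(a * b for a, b in zip(xr, wrow)) for wrow in wb_t] for xr in xb]
        for xb, wb_t in zip(x, w_t)
    ]

x = [[[1.0, 2.0], [3.0, 4.0]]]              # shape (1, 2, 2)
w = [[[5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]]   # shape (1, 2, 3)
w_t = transpose_last_two(w)                 # shape (1, 3, 2), computed once up front

# Both paths compute the same product.
assert bmm(x, w) == bmm_transposed_weight(x, w_t)
```

Doing the transpose once when the weight is converted, rather than on every call, keeps the per-call path free of extra layout work.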

Test Plan:
python test/dtypes/test_fbgemm_fp8.py -k test_bmm
python test/dtypes/test_fbgemm_int4.py -k test_bmm

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2025-06-08T01:02:44Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2337

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 06211ee with merge base 4235837:

NEW FAILURE - The following job has failed:

Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh)
test/integration/test_integration.py::TestSubclass::test_int4_weight_only_quant_subclass_grouped_5_cuda

This comment was automatically generated by Dr. CI and updates every 15 minutes.

drisspg · 2025-06-08T03:27:15Z

torchao/dtypes/fbgemm_fp8_tensor.py

+
+    # not used
+    num_tokens = torch.empty([input_tensor.size(0)], device=input_tensor.device)
+    xq, x_scale = torch.ops.fbgemm.quantize_fp8_per_row(


This unused num_tokens feels weird. Maybe file an issue on fbgemm, or update the op so it doesn't need it?

jerryzh168 · 2025-06-09

Yeah, I checked with @jiawenliu64 and this arg is indeed only used in internal use cases. He recommended using the Triton op instead, although I found the Triton op to be a bit slower; maybe it requires some tuning. I'll double check.
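For readers unfamiliar with the per-row dynamic activation quantization discussed in this thread, the following is a rough pure-Python sketch of the idea: one scale per row, chosen so the row's largest magnitude maps to the top of the target range. It is not the implementation of `torch.ops.fbgemm.quantize_fp8_per_row`. The function names, the `eps` guard, and the choice of 448.0 (the largest finite float8_e4m3fn value) as the target range are assumptions, and rounding to actual fp8 values is omitted.

```python
# Assumption: the target format is float8_e4m3fn, whose largest finite value is 448.
FP8_E4M3_MAX = 448.0

def quantize_per_row(x, qmax=FP8_E4M3_MAX, eps=1e-12):
    """Scale each row independently; return (scaled_rows, scales).

    Hypothetical helper: rounding to real fp8 bit patterns is left out, so this
    only shows how the per-row scale is derived and applied.
    """
    xq, scales = [], []
    for row in x:
        amax = max(abs(v) for v in row)   # row-wise absolute maximum
        scale = max(amax, eps) / qmax     # eps avoids division by zero on all-zero rows
        xq.append([v / scale for v in row])
        scales.append(scale)
    return xq, scales

def dequantize_per_row(xq, scales):
    """Undo the scaling: multiply each row by its own scale."""
    return [[v * s for v in row] for row, s in zip(xq, scales)]

activations = [[1.0, -2.0, 0.5], [100.0, 0.0, -448.0]]
xq, x_scale = quantize_per_row(activations)
restored = dequantize_per_row(xq, x_scale)
```

Because the scale is computed from the activations at call time, no calibration pass is needed, which is what "dynamic" refers to in the PR summary.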

facebook-github-bot added the CLA Signed label Jun 8, 2025

jerryzh168 requested a review from drisspg June 8, 2025 01:02

jerryzh168 added the topic: improvement label Jun 8, 2025

drisspg reviewed Jun 8, 2025

torchao/dtypes/fbgemm_fp8_tensor.py (resolved)

drisspg reviewed Jun 8, 2025

torchao/dtypes/fbgemm_int4_tensor.py (resolved)

drisspg reviewed Jun 8, 2025

torchao/dtypes/fbgemm_int4_tensor.py (outdated, resolved)

jerryzh168 force-pushed the add-bmm branch 3 times, most recently from 59bc6cf to a02edc9 June 9, 2025 14:24

jerryzh168 changed the title ~~Add support for bmm for fbgemm config~~ Add support for bmm and `to` for fbgemm Tensor Jun 9, 2025

jerryzh168 force-pushed the add-bmm branch from a02edc9 to 06211ee June 9, 2025 14:39

drisspg approved these changes Jun 9, 2025
