
Implemented a new test case for LUT quantization #2342


Open · wants to merge 2 commits into main

Conversation

szyszyzys
Contributor

No description provided.

@szyszyzys szyszyzys added the topic: not user facing Use this tag if you don't want this PR to show up in release notes label Jun 9, 2025

pytorch-bot bot commented Jun 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2342

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 276d953 with merge base d72a6d1:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 9, 2025


std::pair<std::vector<int8_t>, std::vector<int8_t>>
generate_simple_u_to_s_lut_and_indices(
Contributor

What does "simple_u_to_s" mean?

Contributor Author

"simple_u_to_s" refers to converting an unsigned index to a signed index. The example covers a simplified scenario where we only need to map a value to each LUT slot. I will rename the function to something more descriptive and readable.

Contributor

Yeah, let's come up with a better name, e.g., maybe "generate_random_int8_lut_and_indices" or something like that
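
For concreteness, a minimal sketch of what the renamed helper could look like, assuming the same return type as the current function; the parameter names and RNG choices are illustrative, not the PR's actual code:

#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical sketch: builds a random int8 LUT with 2^weight_nbit entries
// and a vector of unsigned positions into it (stored in int8_t, matching the
// current return type).
std::pair<std::vector<int8_t>, std::vector<int8_t>>
generate_random_int8_lut_and_indices(int weight_nbit, size_t num_indices) {
  std::mt19937 gen(std::random_device{}());
  // uniform_int_distribution is not defined for char types, so draw ints
  // and cast down to int8_t.
  std::uniform_int_distribution<int> value_dist(-128, 127);

  const size_t lut_size = size_t(1) << weight_nbit;
  std::vector<int8_t> lut(lut_size);
  for (auto& v : lut) {
    v = static_cast<int8_t>(value_dist(gen));
  }

  std::uniform_int_distribution<int> index_dist(0, static_cast<int>(lut_size) - 1);
  std::vector<int8_t> indices(num_indices);
  for (auto& idx : indices) {
    idx = static_cast<int8_t>(index_dist(gen));
  }
  return {lut, indices};
}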

* input data type (T_in) to its corresponding floating-point representation for
* each quantization group.
*
* @tparam T_in The data type of the quantized values (e.g., int8_t). This
Contributor

Where is the T_in defined in the code?

Contributor Author

I updated the implementation but forgot to update the function description. I will provide a new description.
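
For illustration, a minimal sketch (with a hypothetical name and signature) of keeping the @tparam documentation in sync with an actual template parameter:

#include <cstdint>
#include <vector>

/**
 * Builds one dequantization LUT per quantization group, mapping each
 * quantized value of type T_in to its floating-point representation.
 *
 * @tparam T_in The data type of the quantized values (e.g., int8_t).
 * @return A flattened std::vector<float> with all group LUTs concatenated.
 */
template <typename T_in>
std::vector<float> generate_dequant_luts_for_groups(
    const std::vector<float>& scales,
    const std::vector<T_in>& zeros,
    int q_min,
    int q_max);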

* @param has_zeros A flag indicating if the zero-points should be used.
* @return A flattened std::vector<float> containing all the group LUTs concatenated.
*/
std::vector<int8_t> generate_requant_lut_from_params(
Contributor

Why are we returning int8_t type, and not float32_t for LUT?

Contributor Author

In the context of test_channelwise_8bit_activation_groupwise_lowbit_weight_lut, the code requires an int8_t LUT to function correctly. I initially designed this as a second quantization step, hence the int8_t output. I agree it should be flexible enough to accommodate different data types for other scenarios; I addressed this elsewhere in the code but overlooked this function. I will make the necessary adjustments.
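
As a sketch of that flexibility, the LUT element type could be templated instead of hard-coded; the signature and parameters below are assumptions for illustration:

#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical variant: T_out = int8_t preserves the current requantization
// behavior, while T_out = float would yield a float32 LUT.
template <typename T_out>
std::vector<T_out> generate_requant_lut_from_params(
    int q_min,
    int q_max,
    const std::vector<int8_t>& zeros) {
  const size_t lut_size_per_group = static_cast<size_t>(q_max) - q_min + 1;
  const size_t num_groups = zeros.size();

  std::vector<T_out> luts(num_groups * lut_size_per_group);
  for (size_t g = 0; g < num_groups; ++g) {
    for (int q_val = q_min; q_val <= q_max; ++q_val) {
      const size_t lut_idx = g * lut_size_per_group + (q_val - q_min);
      // The LUT stores the result of (quantized_value - zero_point).
      luts[lut_idx] = static_cast<T_out>(q_val - zeros[g]);
    }
  }
  return luts;
}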

const size_t lut_size_per_group = static_cast<size_t>(q_max) - q_min + 1;
const int lut_index_offset = q_min;

const int num_groups = zeros.size();
Contributor

What is this when has_zeros=false?

Contributor Author

It could lead to errors. I should not rely on the size of zeros when has_zeros is false.
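
A minimal sketch of one possible fix, assuming scales always holds one entry per group; names are illustrative:

#include <cstddef>
#include <cstdint>
#include <vector>

std::vector<int8_t> build_luts(
    int q_min,
    int q_max,
    bool has_zeros,
    const std::vector<float>& scales,
    const std::vector<int8_t>& zeros) {
  const size_t lut_size_per_group = static_cast<size_t>(q_max) - q_min + 1;
  // Derive the group count from a vector that is always populated, and only
  // read zeros when has_zeros is true (the zero-point is 0 otherwise).
  const size_t num_groups = scales.size();

  std::vector<int8_t> luts(num_groups * lut_size_per_group);
  for (size_t g = 0; g < num_groups; ++g) {
    const int zero_point = has_zeros ? static_cast<int>(zeros[g]) : 0;
    for (int q_val = q_min; q_val <= q_max; ++q_val) {
      luts[g * lut_size_per_group + (q_val - q_min)] =
          static_cast<int8_t>(q_val - zero_point);
    }
  }
  return luts;
}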

for (int q_val = q_min; q_val <= q_max; ++q_val) {
size_t lut_idx = group_idx * lut_size_per_group + (q_val - lut_index_offset);
// The LUT stores the result of (quantized_value - zero_point)
luts[lut_idx] = static_cast<int8_t>(q_val - zero_point);
Contributor

Why is the lut output dtype int8_t?

}

// Helper to perform quantization
static std::vector<T_in> quantize_input(
Contributor

What is this for?

Can't we just generate the random LUT for testing?

Contributor Author

Yes, we can generate a random LUT for testing. My initial design aimed to extend this function to more scenarios, such as secondary quantization, so that it could work in conjunction with other components and be integrated and tested as part of a larger system.

Contributor

I'm not sure I follow how you're planning on using this. For LUT quantization, we won't be quantizing in this way.

We also shouldn't re-invent the quantization logic because we have this elsewhere in other tests, e.g., here: https://github.com/pytorch/ao/blob/main/torchao/experimental/kernels/cpu/aarch64/tests/test_utils.h#L110.

If you want to refactor that code to use non-aarch64 code, that'd be great, but I'm not sure it's related to the LUT project.

Contributor Author

Got it. Thanks!

}

// Helper to compute the reference output
static std::vector<float> compute_expected_output(
Contributor

Is this doing the matrix product of activations * dequantized_weights?

It looks like it's just dequantizing the weights?

Contributor Author

I may have overdesigned this part. My initial intention was to accommodate a more complex setup, such as integrating with other quantization methods or using different LUT designs; that's why I used GroundTruthStrategy to select the dequantization strategy. It seems unnecessary in this context.
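
For reference, a hedged sketch of a version that computes the full matrix product of activations and dequantized weights, rather than only dequantizing; the layout (row-major, groups along k) and all names are assumptions, not the PR's helper:

#include <cstddef>
#include <cstdint>
#include <vector>

std::vector<float> compute_expected_output(
    size_t m, size_t k, size_t n, size_t group_size,
    const std::vector<float>& activations,      // m x k, row-major
    const std::vector<int8_t>& weight_qvals,    // n x k, row-major
    const std::vector<float>& weight_scales,    // n x (k / group_size)
    const std::vector<int8_t>& weight_zeros) {  // n x (k / group_size)
  const size_t groups_per_row = k / group_size;
  std::vector<float> out(m * n, 0.0f);
  for (size_t i = 0; i < m; ++i) {
    for (size_t j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (size_t p = 0; p < k; ++p) {
        const size_t g = j * groups_per_row + p / group_size;
        // Dequantize the weight for this group, then accumulate the dot product.
        const float w = weight_scales[g] *
            static_cast<float>(weight_qvals[j * k + p] - weight_zeros[g]);
        acc += activations[i * k + p] * w;
      }
      out[i * n + j] = acc;
    }
  }
  return out;
}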
