The motivation behind this feature request is to extend the functionality of the VeRA method to support models that have been quantized using techniques such as bitsandbytes. Currently, users face issues with dimension mismatches and incorrect tensor shapes when applying VeRA to quantized models. By ...
In order to reduce the storage space of a histogram, the fractional values are quantized to be represented by a space under one byte (8 bits) instead of using all the four bytes (32 bits) which are generally used for representing the fraction. ...