Quantization can be done using various algorithms, including:
- K-means quantization: Clustering the weights and activations into k groups and replacing each value with its cluster's centroid, so that only a small index per value and a k-entry codebook need to be stored.
- Hierarchical quantization: Quantizing the weights and activations in stages, starting with the most important ones.
- Nearest-neighbor quantization: Mapping each weight or activation to the closest entry in a quantization table (codebook) and storing that entry in its place.
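
To make the k-means and nearest-neighbor ideas more concrete, here is a minimal sketch in plain Python on a handful of scalar weights. It is an illustration under stated assumptions rather than a reference implementation: the function names (`kmeans_codebook`, `nearest_index`, `quantize`, `dequantize`), the evenly spaced centroid initialisation, the fixed iteration count, and the sample weights are all hypothetical choices made for this example.

```python
def nearest_index(value, codebook):
    """Nearest-neighbor lookup: index of the codebook entry closest to value."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))


def kmeans_codebook(values, k, iterations=20):
    """Build a k-entry codebook with Lloyd's algorithm on scalar values."""
    lo, hi = min(values), max(values)
    # Assumption: initialise the centroids evenly across the value range.
    if k == 1:
        codebook = [(lo + hi) / 2]
    else:
        codebook = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest_index(v, codebook)].append(v)
        # Update step: move each centroid to its cluster mean;
        # an empty cluster keeps its previous centroid.
        codebook = [
            sum(c) / len(c) if c else codebook[i]
            for i, c in enumerate(clusters)
        ]
    return codebook


def quantize(values, codebook):
    """Replace each value with the index of its nearest codebook entry."""
    return [nearest_index(v, codebook) for v in values]


def dequantize(indices, codebook):
    """Recover approximate values by looking each index up in the codebook."""
    return [codebook[i] for i in indices]


# Hypothetical example weights, chosen to form three obvious groups.
weights = [-0.92, -0.88, -0.10, 0.02, 0.05, 0.81, 0.90, 0.95]
codebook = kmeans_codebook(weights, k=3)
indices = quantize(weights, codebook)      # [0, 0, 1, 1, 1, 2, 2, 2]
restored = dequantize(indices, codebook)
```

With k = 3, each weight can be stored as a 2-bit index plus a shared three-entry codebook. The price is the reconstruction error between `weights` and `restored`. The `quantize` step on its own is the nearest-neighbor scheme: it works with any codebook, not only one produced by k-means.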