Do not convert BC1 LUT to UINT32 by radarhere · Pull Request #8837 · python-pillow/Pillow
    #define LOAD32(p) (p)[0] | ((p)[1] << 8) | ((p)[2] << 16) | ((p)[3] << 24)

    static void
    bc1_color_load(bc1_color *dst, const UINT8 *src) {
        dst->c0 = LOAD16(src);
        dst->c1 = LOAD16(src + 2);
        dst->lut = LOAD32(src + 4);

    for (n = 0; n < 16; n++) {
        cw = 3 & (col.lut >> (2 * n));
        dst[n] = p[cw];
    }

With a little maths, and by changing the single loop of size 16 into two nested loops of size 4 each, this code can be changed to avoid the UINT32. Each of the four LUT bytes holds the 2-bit indices for one row of four pixels, so the bytes can be read individually instead of being combined into one 32-bit value. If you think that changing the size of the loop misrepresents the layout of the image, it does not: looking at https://learn.microsoft.com/en-us/windows/win32/direct3d10/d3d10-graphics-programming-guide-resources-block-compression#bc1, you can see that the LUT actually represents a 4x4 block.
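
The arithmetic can be sketched as follows. For pixel `n = 4 * row + col`, the shift `2 * n` equals `8 * row + 2 * col`, which selects byte `row` of the LUT and a shift of `2 * col` within that byte. This is a minimal, standalone illustration rather than the code in this PR: the helper names `indices_from_u32` and `indices_from_bytes` are hypothetical, and `uint8_t`/`uint32_t` stand in for Pillow's `UINT8`/`UINT32`.

```c
#include <stdint.h>

/* Hypothetical helper: extract the 16 two-bit indices by first
 * assembling the four little-endian LUT bytes into a 32-bit value. */
static void
indices_from_u32(const uint8_t *src, uint8_t out[16]) {
    uint32_t lut = (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
                   ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
    int n;
    for (n = 0; n < 16; n++) {
        out[n] = 3 & (lut >> (2 * n));
    }
}

/* Hypothetical helper: extract the same indices directly from the
 * four LUT bytes, one byte per row of the 4x4 block. */
static void
indices_from_bytes(const uint8_t *src, uint8_t out[16]) {
    int row, col;
    for (row = 0; row < 4; row++) {
        for (col = 0; col < 4; col++) {
            /* 2 * (4 * row + col) == 8 * row + 2 * col */
            out[4 * row + col] = 3 & (src[row] >> (2 * col));
        }
    }
}
```

Both helpers should produce identical output for any four input bytes. The byte-wise version never shifts a value left by 24 bits, so no 32-bit intermediate is needed.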