Do not convert BC1 LUT to UINT32 by radarhere · Pull Request #8837 · python-pillow/Pillow

	#define LOAD32(p) (p)[0] | ((p)[1] << 8) | ((p)[2] << 16) | ((p)[3] << 24)

	static void
	bc1_color_load(bc1_color *dst, const UINT8 *src) {
	    dst->c0 = LOAD16(src);
	    dst->c1 = LOAD16(src + 2);
	    dst->lut = LOAD32(src + 4);

With a little maths, and by replacing the single loop of 16 iterations with two nested loops of 4 iterations each, this code can be changed to avoid the UINT32. If you are concerned that changing the loop structure misrepresents the layout of the image, it does not: as https://learn.microsoft.com/en-us/windows/win32/direct3d10/d3d10-graphics-programming-guide-resources-block-compression#bc1 shows, the LUT actually represents a 4x4 block.
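To illustrate the equivalence, here is a minimal sketch, not the code from this pull request. The function names `lut_indices_u32` and `lut_indices_bytes` are hypothetical, and it assumes the LUT is four little-endian bytes, each holding four 2-bit indices for one row of the 4x4 block.

```c
#include <stdint.h>

/* Hypothetical reference version: assemble the four LUT bytes into one
   32-bit little-endian word, then read 16 two-bit indices from it.
   The casts keep every shift in unsigned 32-bit arithmetic. */
void
lut_indices_u32(const uint8_t lut[4], uint8_t out[16]) {
    uint32_t word = (uint32_t)lut[0] | ((uint32_t)lut[1] << 8) |
                    ((uint32_t)lut[2] << 16) | ((uint32_t)lut[3] << 24);
    for (int n = 0; n < 16; n++) {
        out[n] = (uint8_t)((word >> (2 * n)) & 3);
    }
}

/* Hypothetical byte-wise version: index n = row * 4 + col lives at bit
   2 * n of the word, which is bit 2 * col of byte `row`, so no 32-bit
   word is needed. */
void
lut_indices_bytes(const uint8_t lut[4], uint8_t out[16]) {
    for (int row = 0; row < 4; row++) {
        for (int col = 0; col < 4; col++) {
            out[row * 4 + col] = (uint8_t)((lut[row] >> (2 * col)) & 3);
        }
    }
}
```

Both functions fill `out` with the same 16 indices for any input. The nested version simply walks the block row by row, which matches the 4x4 layout described in the linked documentation.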