fix xpu 8bit value loading by jiqing-feng · Pull Request #3623 · huggingface/accelerate

We should enable the same logic on XPU as on CUDA to correctly load 8-bit values. The device must be set to CPU first so that the subsequent move to the accelerator triggers quantization; otherwise the float weight remains inside the Linear8bit module.
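The device-swap requirement can be illustrated with a minimal, self-contained sketch. This is not the actual accelerate or bitsandbytes code; `FakeInt8Param` and `load_8bit_value` are hypothetical names that mimic how an Int8Params-style parameter only quantizes on a CPU-to-accelerator transfer:

```python
# Toy model (no real torch/bitsandbytes) of the quantize-on-transfer behavior.
# All names here are illustrative, not the library's actual API.

class FakeInt8Param:
    """Stands in for an 8-bit parameter that quantizes on device transfer."""

    def __init__(self, data, device="cpu"):
        self.data = data          # list of floats
        self.device = device
        self.dtype = "float16"    # not yet quantized

    def to(self, device):
        # Quantization happens only on a cpu -> accelerator move.
        if self.device == "cpu" and device != "cpu":
            self.data = [round(x) for x in self.data]  # toy "int8" rounding
            self.dtype = "int8"
        self.device = device
        return self


def load_8bit_value(param, value, target_device):
    # The fix mirrors the CUDA path for XPU: place the value on cpu first,
    # then move it, so .to() sees a cpu -> accelerator transition and
    # quantizes. Assigning directly on the target device would skip this
    # and leave a float weight inside the module.
    param.device = "cpu"
    param.data = value
    return param.to(target_device)


p = FakeInt8Param([0.1, 0.9], device="xpu")   # already on the accelerator
p = load_8bit_value(p, [0.1, 0.9], "xpu")
print(p.dtype)  # int8: the cpu -> xpu move triggered quantization
```

If the CPU step is skipped, `p.dtype` stays `float16`, which is exactly the dtype mismatch the failing test reports below.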

This fixes the failing test:
pytest -rA tests/test_quantization.py::MixedInt8EmptyModelTest::test_linear_are_8bit
which previously failed with the error message:

AssertionError: assert torch.float16 == torch.int8