RuntimeError: invalid numeric default value in 0.46.0.dev

System Info

Environment: Linux, AMD64, NVIDIA V100, Python 3.11, torch-2.4.1, Driver 53x, CUDA 12.4
Regression from 0.45.3.

Reproduction

  1. Clone the latest version (0.45.5)

  2. Build and install:

cd bitsandbytes
cmake -DCOMPUTE_BACKEND=cuda -DCOMPUTE_CAPABILITY=70 -S .
make
pip install .

  3. Run:

python -m bitsandbytes

  4. Observe:

Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 148, in _get_module_details
  File "<frozen runpy>", line 112, in _get_module_details
  File "/dockerdata/bitsandbytes/bitsandbytes/__init__.py", line 9, in <module>
    from . import _ops, research, utils
  File "/dockerdata/bitsandbytes/bitsandbytes/_ops.py", line 20, in <module>
    torch.library.define(
  File "/root/envs/lib/python3.11/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/envs/lib/python3.11/site-packages/torch/library.py", line 370, in define
    lib.define(name + schema, alias_analysis="", tags=tags)
  File "/root/envs/lib/python3.11/site-packages/torch/library.py", line 118, in define
    result = self.m.define(schema, alias_analysis, tuple(tags))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError:
invalid numeric default value:
int8_scaled_mm(Tensor A, Tensor B, Tensor row_stats, Tensor col_stats, Tensor? bias=None, ScalarType dtype=torch.float16) -> Tensor
                                                                                                                ~ <--- HERE

Expected behavior

python -m bitsandbytes should complete without errors, as it does in version 0.45.3.
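
Minimal probe (sketch)

The traceback points at the schema parser rejecting the default value of the ScalarType argument, so the failure can probably be isolated from bitsandbytes entirely. The script below is a rough sketch under that assumption: it registers two shortened, hypothetical schemas in a throwaway namespace and reports which ones the installed torch accepts. The names probe_schemas and CANDIDATES, the namespace prefix, and the reduced argument lists are illustrative and not part of bitsandbytes. The second variant (ScalarType? dtype=None) is only one possible direction for a workaround, not a confirmed fix.

```python
import uuid

try:
    import torch
except ImportError:  # lets the script report cleanly on machines without PyTorch
    torch = None

# Shortened, hypothetical variants of the schema shown in the traceback.
CANDIDATES = {
    # Non-optional ScalarType with a dtype literal as default (the form that fails above).
    "dtype_default": "(Tensor A, ScalarType dtype=torch.float16) -> Tensor",
    # Optional ScalarType defaulting to None; the op body would pick the dtype itself.
    "optional_none": "(Tensor A, ScalarType? dtype=None) -> Tensor",
}


def probe_schemas():
    """Return {label: accepted?} for each candidate, or {} if torch is unavailable."""
    if torch is None:
        return {}
    # Unique namespace so repeated calls in one process do not collide.
    namespace = "schema_probe_" + uuid.uuid4().hex[:8]
    lib = torch.library.Library(namespace, "DEF")
    results = {}
    for label, signature in CANDIDATES.items():
        try:
            lib.define(label + signature)
            results[label] = True
        except RuntimeError as exc:
            results[label] = False
            lines = str(exc).strip().splitlines()
            print(f"{label}: rejected ({lines[0] if lines else 'no message'})")
    return results


if __name__ == "__main__":
    version = torch.__version__ if torch is not None else "torch not installed"
    print(version, probe_schemas())
```

If dtype_default is rejected while optional_none is accepted on torch-2.4.1, that would suggest the problem lies in the schema string rather than in the CUDA build.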