Build Questions and Comments · duckdb/duckdb-python · Discussion #59
Questions and comments after spending far more time on CI than expected.
I can open PRs for several of these, but I don't know whether there are good reasons behind the current setup.
ccache installation
Why is ccache installed in the runners? The cache file isn't persistent, so it seems like an unnecessary step.
See the [tool.cibuildwheel.*] sections of pyproject.toml.
* The Windows runners already have strawberry ccache installed anyway, so it's not needed there.
sccache w/ GH Action Cache
One neat optimization is to use sccache w/ the GH Action Cache. The GHA Cache is limited to 10GB.
I have a workflow using sccache for my primary dev builds, but 10GB isn't enough for the entire matrix. You could use it to speed up the most common smoke test builds.
Ninja on Windows
Why isn't Ninja enabled for Windows builds? It significantly speeds up the Windows builds. There are a couple of settings involved here. Happy to open a PR, but not sure if there's some reason it's off.
jemalloc on Windows
Any particular reason jemalloc is enabled on Windows debug builds?
On Windows, jemalloc is disabled for Release builds but enabled for Debug builds. The Debug build fails locally for me, so I modified my local duckdb_loader.cmake to exclude jemalloc when if(CMAKE_SYSTEM_NAME STREQUAL "Windows") is true.
Noisy "uv export"
Would suggest adding "--quiet" to the uv export step in packaging_wheels. The uv lock output is very noisy.
Local Builds
A few tips I found for getting local builds running:
- Pass -vv to the uv sync step: uv sync --no-build-isolation -vv. This shows the build steps.
- edit: Adding --reinstall helped for rebuilds. My rebuild is usually: uv sync --active --no-build-isolation -vv --reinstall.
- I found re-enabling Unity helped significantly with my build times (DISABLE_UNITY='0'), ymmv.
- edit: Setting the external/duckdb to Unity enabled but the local project to disabled in CMakeLists.txt was my best balance of cache + build times, since I'm not touching anything inside external.
- Enable Ninja for Windows builds (see above)
- When working with multiple Python builds, was easier to use separate build dirs for each Python version. Could probably work this into pyproject.toml.
- If incrementally building with cmake, make sure to link or copy the _duckdb*.so file from the build dir to the location in the venv.
- sccache was not happy with the way DuckDB detects it... you end up with "sccache sccache compiler ...". Solution was:
cmake.define.CMAKE_C_COMPILER_LAUNCHER=""
cmake.define.CMAKE_CXX_COMPILER_LAUNCHER=""
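For the incremental cmake tip above, the copy step can be scripted. This is only a minimal sketch: install_built_extension is a hypothetical helper, not something in the repo, and both the build dir and the destination inside the venv are assumptions you would need to fill in for your own layout. The default pattern matches the _duckdb*.so name mentioned above.

```python
import shutil
from pathlib import Path


def install_built_extension(build_dir, target_dir, pattern="_duckdb*.so"):
    """Copy freshly built extension module(s) from build_dir into target_dir.

    Hypothetical helper: build_dir and target_dir are whatever your local
    setup uses. Returns the list of destination paths that were written.
    """
    build_dir = Path(build_dir)
    target_dir = Path(target_dir)
    # Search recursively, since the file's place inside the build tree
    # depends on the generator and build configuration.
    matches = sorted(build_dir.rglob(pattern))
    if not matches:
        raise FileNotFoundError(f"no file matching {pattern!r} under {build_dir}")
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in matches:
        dest = target_dir / src.name
        shutil.copy2(src, dest)  # overwrites any stale copy in the venv
        copied.append(dest)
    return copied
```

Call it after each incremental build with your build dir and the directory in the venv where the extension lives. If you keep one build dir per Python version, pass the matching one so only that version's module gets copied.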
Pytest Plugins
Some suggested plugins:
- pytest-xdist: This runs tests in parallel. I've run a few times with -n auto on my workflow... cuts test time significantly.
- pytest-randomly: This randomizes the order of tests, surfacing any test order dependencies or assumptions.
- pytest-timestamper: Adds timestamps to the verbose output. Helpful when wondering how long a test took.
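A rough sketch of how the three plugins could be combined in one invocation, assuming all of them are installed in the venv. build_pytest_args is a hypothetical helper and the "tests" path is an assumption. -n comes from pytest-xdist and --randomly-seed from pytest-randomly; pytest-timestamper needs no flag of its own beyond verbose output.

```python
def build_pytest_args(test_path="tests", workers="auto", seed=None, verbose=True):
    """Assemble a pytest command line that exercises the plugins listed above.

    Hypothetical helper; test_path is an assumed location for the tests.
    """
    args = ["pytest", str(test_path)]
    # pytest-xdist: spread tests over worker processes.
    args += ["-n", str(workers)]
    if seed is not None:
        # pytest-randomly: replay a specific ordering to reproduce a failure.
        args.append(f"--randomly-seed={seed}")
    if verbose:
        # pytest-timestamper adds its timestamps to the verbose output.
        args.append("-v")
    return args
```

The returned list can be handed to subprocess.run. Passing the seed that pytest-randomly prints at the start of a failing run should replay the same test order.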