[WIP] migration to structured configuration classes by arashsm79 · Pull Request #3172 · DeepLabCut/DeepLabCut
Edit (by @deruyter92): this PR was split into several sub-PRs to facilitate a smooth transition. An overview is kept in #3193.
Summary
- Introduce new configuration classes for inference, logging, model, pose, project, runner, and training settings.
- Refactor data loading mechanisms to use the new configuration structures.
- Move the multithreading and compilation options in inference configuration to the config module.
- Add typed configuration for logging.
- Update dataset loaders to accept model configurations directly or via file paths.
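To make the summary concrete, here is a minimal sketch of what a typed, schema-validated config class looks like with Pydantic. All class and field names below are illustrative, not the actual classes introduced in this PR:

```python
from pydantic import BaseModel, Field, ValidationError

# Hypothetical nested config classes; names and fields are
# illustrative only, not DeepLabCut's actual schema.
class OptimizerConfig(BaseModel):
    type: str = "AdamW"
    lr: float = Field(1e-4, gt=0)  # must be strictly positive

class TrainingConfig(BaseModel):
    epochs: int = Field(100, ge=1)
    batch_size: int = Field(8, ge=1)
    optimizer: OptimizerConfig = OptimizerConfig()

# A valid config loads and exposes typed, autocompletable attributes.
cfg = TrainingConfig(epochs=50, optimizer={"lr": 5e-4})
print(cfg.optimizer.lr)  # 0.0005

# An invalid config fails fast with a clear error instead of
# silently producing a broken training run.
try:
    TrainingConfig(epochs=0)
except ValidationError as e:
    print("rejected field:", e.errors()[0]["loc"])
```

The same object can be serialized back to a dict or YAML-friendly form, which is what makes it easy to save the full validated configuration alongside results.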
Why Typed & Structured Configuration (OmegaConf + Pydantic)
- Strong guarantees for correctness
  - Runtime type safety ensures invalid configs fail fast with clear errors instead of silently producing incorrect training runs.
  - Schema-validated configs dramatically reduce debugging time for users and maintainers.
- Static typing improves developer velocity
  - IDE autocomplete and inline documentation make configs discoverable and self-documenting.
  - Refactors become safer: config changes are more likely to be caught at development time.
- Hierarchical, composable configuration
  - Natural representation of DeepLabCut’s nested project/model/training settings.
  - Easy composition and merging from multiple sources (base config, model presets, experiment overrides).
- Cleaner overrides and defaults.
- Structured configs make it easier to define parameter ranges for tuning and automation.
- Config schemas can be versioned and evolve safely over time while preserving backward compatibility.
- Full, validated configuration can be saved alongside results, which improves reproducibility and transparency.
- Builds on well-maintained, widely adopted libraries (OmegaConf, Pydantic).
Resources for learning more about structured configs:
- Hydra
- MIT Responsible AI's hydra-zen
- Pydantic
- OmegaConf
- Soklaski, R., Goodwin, J., Brown, O., Yee, M. and Matterer, J., 2022. Tools and practices for responsible AI engineering. arXiv preprint arXiv:2201.05647.
Future Work
- Currently, default model definitions are still stored as YAML files in the package. Moving to LazyConfig, as in Detectron2, would improve things significantly.
More things that could be done (@deruyter92):
- I think we need to make sure that every time a model is used, all changes to the project's `config.yaml` are reflected in the model's configuration under `metadata` as well.
- There might be a better way to handle things in `deeplabcut/pose_estimation_pytorch/data/base.py`.