[WIP] migration to structured configuration classes by arashsm79 · Pull Request #3172 · DeepLabCut/DeepLabCut
Edit (by @deruyter92): this PR was split into several sub-PRs to facilitate a smooth transition. An overview is kept in #3193.
Summary
- Introduce new configuration classes for inference, logging, model, pose, project, runner, and training settings.
- Refactor data loading mechanisms to use the new configuration structures.
- Move the multithreading and compilation options in inference configuration to the config module.
- Add typed configuration for logging.
- Update dataset loaders to accept model configurations directly or via file paths.
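To make the summary concrete, here is a minimal sketch of what a typed, schema-validated config class looks like with Pydantic. All class and field names below are illustrative, not the actual classes introduced in this PR:

```python
from pydantic import BaseModel, Field, ValidationError

# Hypothetical nested config classes; names and fields are
# illustrative only, not DeepLabCut's actual schema.
class OptimizerConfig(BaseModel):
    type: str = "AdamW"
    lr: float = Field(1e-4, gt=0)  # must be strictly positive

class TrainingConfig(BaseModel):
    epochs: int = Field(100, ge=1)
    batch_size: int = Field(8, ge=1)
    optimizer: OptimizerConfig = OptimizerConfig()

# A valid config loads and exposes typed, autocompletable attributes.
cfg = TrainingConfig(epochs=50, optimizer={"lr": 5e-4})
print(cfg.optimizer.lr)  # 0.0005

# An invalid config fails fast with a clear error instead of
# silently producing a broken training run.
try:
    TrainingConfig(epochs=0)
except ValidationError as e:
    print("rejected field:", e.errors()[0]["loc"])
```

The same object can be serialized back to a dict or YAML-friendly form, which is what makes it easy to save the full validated configuration alongside results.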
Why Typed & Structured Configuration (OmegaConf + Pydantic)
- Strong guarantees for correctness
  - Runtime type safety ensures invalid configs fail fast with clear errors instead of silently producing incorrect training runs.
  - Schema-validated configs dramatically reduce debugging time for users and maintainers.
- Static typing improves developer velocity
  - IDE autocomplete and inline documentation make configs discoverable and self-documenting.
  - Refactors become safer: config changes are more likely to be caught at development time.
- Hierarchical, composable configuration
  - Natural representation of DeepLabCut’s nested project/model/training settings.
  - Easy composition and merging from multiple sources (base config, model presets, experiment overrides).
- Cleaner overrides and defaults.
- Structured configs make it easier to define parameter ranges for tuning and automation.
- Config schemas can be versioned and evolve safely over time while preserving backward compatibility.
- Full, validated configuration can be saved alongside results, which improves reproducibility and transparency.
- Builds on well-maintained, widely adopted libraries (OmegaConf, Pydantic).
Resources for learning more about structured configs:
- Hydra
- MIT Responsible AI's hydra-zen
- Pydantic
- OmegaConf
- Soklaski, R., Goodwin, J., Brown, O., Yee, M. and Matterer, J., 2022. Tools and practices for responsible AI engineering. arXiv preprint arXiv:2201.05647.
Future Work
- Currently, default model definitions are still stored as YAML files in the package. Moving to LazyConfig, as in Detectron2, would improve things significantly.
More things that could be done (@deruyter92):
- I think we need to make sure that every time a model is used, all changes to the project's `config.yaml` are reflected in the model's configuration under `metadata` as well.
- There might be a better way to handle things in `deeplabcut/pose_estimation_pytorch/data/base.py`.