🪐 Project Templates
Weasel, previously spaCy projects, lets you manage and share end-to-end workflows for different use cases and domains, and orchestrate training, packaging and serving your custom pipelines. You can start off by cloning a pre-defined project template, adjust it to fit your needs, load in your data, train a pipeline, export it as a Python package, upload your outputs to remote storage and share your results with your team.
⚠️ Weasel project templates require Weasel, which is also included by default with spaCy v3.7+. You can install it with pip (pip install weasel) or with conda (conda install weasel -c conda-forge). Make sure to use a fresh virtual environment.

See the master branch for the previous version of this repo.
🗃 Categories
| Name | Description |
|---|---|
| pipelines | Templates for training NLP pipelines with different components on different corpora. |
| tutorials | Templates that work through a specific NLP use case end-to-end. |
| integrations | Templates showing integrations with third-party libraries and tools for managing your data and experiments, iterating on demos and prototypes and shipping your models into production. |
| benchmarks | Templates to reproduce our benchmarks and produce quantifiable results that are easy to compare against other systems or versions of spaCy. |
| experimental | Experimental workflows and other cutting-edge stuff to use at your own risk. |
🚀 Quickstart
Projects can be used via the weasel CLI, or through the spacy project alias. To find out more about a command, add --help. For detailed instructions, see the Weasel documentation or the spaCy projects usage guide.
- Clone the project template you want to use.
python -m weasel clone tutorials/ner_fashion_brands
- Install any project requirements.
cd ner_fashion_brands
python -m pip install -r requirements.txt
- Fetch assets (data, weights) defined in the project.yml.
- Run a command defined in the project.yml.
python -m weasel run preprocess
- Run a workflow of multiple steps in order.
- Adjust the template for your specific use case, load in your own data, adjust the settings and model and share the result with your team.
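The quickstart steps above can also be scripted. The following is a minimal sketch, not part of this repo: the helper names quickstart_steps and run_quickstart are hypothetical, the project directory is assumed to be named after the last segment of the template path, and the workflow name "all" is an assumption, since workflow names are defined in each template's project.yml. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys


def quickstart_steps(template, command="preprocess", workflow=None):
    """Return (argv, cwd) pairs mirroring the quickstart steps."""
    # Assumption: cloning creates a directory named after the last path segment.
    project_dir = template.rstrip("/").rsplit("/", 1)[-1]
    python = [sys.executable, "-m"]
    steps = [
        (python + ["weasel", "clone", template], None),                       # clone the template
        (python + ["pip", "install", "-r", "requirements.txt"], project_dir),  # install requirements
        (python + ["weasel", "assets"], project_dir),                          # fetch assets
        (python + ["weasel", "run", command], project_dir),                    # run a single command
    ]
    if workflow is not None:
        # Workflow names come from project.yml; the caller must supply a valid one.
        steps.append((python + ["weasel", "run", workflow], project_dir))
    return steps


def run_quickstart(template, dry_run=True, **kwargs):
    """Print each step and, unless dry_run is set, execute it in order."""
    for argv, cwd in quickstart_steps(template, **kwargs):
        location = f" (in {cwd})" if cwd else ""
        print(" ".join(argv) + location)
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # "all" is a hypothetical workflow name used for illustration only.
    run_quickstart("tutorials/ner_fashion_brands", workflow="all")
```

Passing dry_run=False executes the steps and stops at the first failing command, because subprocess.run is called with check=True.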
👷‍♀️ Repository maintenance
To keep the project templates and their documentation up to date, this repo contains several scripts:
| Script | Description |
|---|---|
| update_docs.py | Update all auto-generated docs in the given root. Calls into spacy project document and only replaces the auto-generated sections, not any custom content before or after. |
| update_category_docs.py | Update the auto-generated README.md in the category directories listing the available project templates. |
| update_configs.py | Update and auto-fill all config.cfg files included in the repo, similar to spacy init fill-config. Can be used to keep the configs up to date with changes in spaCy. |
| update_projects_jsonl.py | Update projects.jsonl file in the given root. Should be used at the root level of the repo. |
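To illustrate what "only replaces the auto-generated sections" can mean in practice, here is a minimal sketch of marker-based replacement. It is not the code used by update_docs.py or spacy project document: the marker strings and the function name replace_generated are hypothetical and chosen for illustration only.

```python
# Hypothetical markers delimiting the auto-generated section of a README.
START = "<!-- AUTO-GENERATED DOCS START -->"
END = "<!-- AUTO-GENERATED DOCS END -->"


def replace_generated(existing: str, generated: str) -> str:
    """Swap the text between the markers, keeping custom content around it."""
    block = f"{START}\n{generated.strip()}\n{END}"
    start = existing.find(START)
    end = existing.find(END)
    if start == -1 or end == -1 or end < start:
        # No usable marker pair yet: append the generated block at the end.
        if not existing.strip():
            return block + "\n"
        return existing.rstrip("\n") + "\n\n" + block + "\n"
    # Keep everything before START and after END untouched.
    return existing[:start] + block + existing[end + len(END):]


if __name__ == "__main__":
    readme = f"Custom intro\n\n{START}\nold docs\n{END}\n\nCustom notes\n"
    print(replace_generated(readme, "new docs"))
```

Keeping the markers in the output means the same file can be regenerated repeatedly without touching the hand-written text before or after them.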