spaCy examples
For spaCy v3, we've converted many of the v2 example scripts into end-to-end spaCy projects workflows. Each workflow includes all the steps needed to go from raw data to a packaged spaCy model.
🪐 Pipeline component demos
The simplest demos for training a single pipeline component are in the pipelines category, including:
- pipelines/ner_demo: Train a named entity recognizer
- pipelines/textcat_demo: Train a text classifier
- pipelines/parser_intent_demo: Train a dependency parser for custom semantics
🪐 Tutorials
The tutorials category includes examples that work through specific NLP use cases end-to-end:
- tutorials/textcat_goemotions: Train a text classifier to categorize emotions in Reddit posts
- tutorials/nel_emerson: Use an entity linker to disambiguate mentions of the same name
Check out the projects documentation and browse through the available projects!
🚀 Get started with a demo project
The pipelines/ner_demo project converts the spaCy v2 train_ner.py demo script into a spaCy v3 project.
- Clone the project:
    python -m spacy project clone pipelines/ner_demo
- Install requirements and download any data assets:
    cd ner_demo
    python -m pip install -r requirements.txt
    python -m spacy project assets
- Run the default workflow to convert, train and evaluate:
    python -m spacy project run all
  Sample output:
    ℹ Running workflow 'all'
    ================================== convert ==================================
    Running command: /home/user/venv/bin/python scripts/convert.py en assets/train.json corpus/train.spacy
    Running command: /home/user/venv/bin/python scripts/convert.py en assets/dev.json corpus/dev.spacy
    =============================== create-config ===============================
    Running command: /home/user/venv/bin/python -m spacy init config --lang en --pipeline ner configs/config.cfg --force
    ℹ Generated config template specific for your use case
    - Language: en
    - Pipeline: ner
    - Optimize for: efficiency
    - Hardware: CPU
    - Transformer: None
    ✔ Auto-filled config with all values
    ✔ Saved config
    configs/config.cfg
    You can now add your data and train your pipeline:
    python -m spacy train config.cfg --paths.train ./train.spacy --paths.dev ./dev.spacy
    =================================== train ===================================
    Running command: /home/user/venv/bin/python -m spacy train configs/config.cfg --output training/ --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy --training.eval_frequency 10 --training.max_steps 100 --gpu-id -1
    ℹ Using CPU
    =========================== Initializing pipeline ===========================
    [2021-03-11 19:34:59,101] [INFO] Set up nlp object from config
    [2021-03-11 19:34:59,109] [INFO] Pipeline: ['tok2vec', 'ner']
    [2021-03-11 19:34:59,113] [INFO] Created vocabulary
    [2021-03-11 19:34:59,113] [INFO] Finished initializing nlp object
    [2021-03-11 19:34:59,265] [INFO] Initialized pipeline components: ['tok2vec', 'ner']
    ✔ Initialized pipeline
    ============================= Training pipeline =============================
    ℹ Pipeline: ['tok2vec', 'ner']
    ℹ Initial learn rate: 0.001
    E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE
    ---  ------  ------------  --------  ------  ------  ------  ------
      0       0          0.00      7.90    0.00    0.00    0.00    0.00
     10      10          0.11     71.07    0.00    0.00    0.00    0.00
     20      20          0.65     22.44   50.00   50.00   50.00    0.50
     30      30          0.22      6.38   80.00   66.67  100.00    0.80
     40      40          0.00      0.00   80.00   66.67  100.00    0.80
     50      50          0.00      0.00   80.00   66.67  100.00    0.80
     60      60          0.00      0.00  100.00  100.00  100.00    1.00
     70      70          0.00      0.00  100.00  100.00  100.00    1.00
     80      80          0.00      0.00  100.00  100.00  100.00    1.00
     90      90          0.00      0.00  100.00  100.00  100.00    1.00
    100     100          0.00      0.00  100.00  100.00  100.00    1.00
    ✔ Saved pipeline to output directory
    training/model-last
- Package the model:
    python -m spacy project run package
- Visualize the model's output with Streamlit:
    python -m spacy project run visualize-model
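
If you would rather drive the steps above from a script than type them one by one, the following is a minimal sketch of one way to do it. The helper names `build_steps` and `run_steps` are hypothetical and not part of spaCy. The commands themselves are the ones listed above. The sketch assumes that `spacy project clone` creates a directory named after the last path segment of the project, which matches the `cd ner_demo` step. The `visualize-model` step is left out because it starts an interactive Streamlit app.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional, Tuple

# (argv, working directory) pairs; a cwd of None means "current directory".
Step = Tuple[List[str], Optional[Path]]


def build_steps(project: str = "pipelines/ner_demo") -> List[Step]:
    """Return the commands from the steps above, in order, as argv lists.

    Hypothetical helper for illustration; not part of spaCy.
    """
    # Assumption: the clone lands in a directory named after the last
    # path segment, matching the `cd ner_demo` step.
    project_dir = Path(project.split("/")[-1])
    py = sys.executable  # reuse the interpreter that runs this script
    return [
        ([py, "-m", "spacy", "project", "clone", project], None),
        ([py, "-m", "pip", "install", "-r", "requirements.txt"], project_dir),
        ([py, "-m", "spacy", "project", "assets"], project_dir),
        ([py, "-m", "spacy", "project", "run", "all"], project_dir),
        ([py, "-m", "spacy", "project", "run", "package"], project_dir),
    ]


def run_steps(steps: List[Step], dry_run: bool = True) -> List[str]:
    """Print each command and, unless dry_run is True, execute it."""
    shown = []
    for argv, cwd in steps:
        line = " ".join(argv[1:])  # drop the interpreter path for readability
        shown.append(line)
        print(f"[{cwd or '.'}] python {line}")
        if not dry_run:
            # check=True raises CalledProcessError at the first failing step.
            subprocess.run(argv, cwd=cwd, check=True)
    return shown
```

Calling `run_steps(build_steps())` only prints the commands. Pass `dry_run=False` to actually execute them, which requires spaCy to be installed and network access for the clone and asset downloads.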