spaCy loading model fails - nlp

I am trying to load the spaCy model de_core_news_sm, without any success. Since our company policy seems to block the python -m spacy download de_core_news_sm command, I downloaded the model manually and installed it with pip from the local tar.gz archive, which worked fine.
However, calling nlp = spacy.load("de_core_news_sm") in my code throws the following exception:
Exception has occurred: ValueError
[E149] Error deserializing model. Check that the config used to create the
component matches the model being loaded.
File "pipes.pyx", line 642, in
spacy.pipeline.pipes.Tagger.from_disk.load_model
I have no idea how to deal with this. Does anybody know what to do?

Run python -m spacy validate to check whether the model you downloaded is compatible with the version of spaCy you have installed. This kind of error happens when the versions aren't compatible (probably one is v2.1 and the other is v2.2).
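If the CLI is blocked for you as well, you can compare the two versions from Python. A minimal sketch, assuming the spacy.util helpers available in v2.x:
import spacy
from spacy.util import get_package_path, get_model_meta

# Compare the installed spaCy version with the version range the
# pip-installed model package declares in its meta.json.
print("spaCy version:", spacy.__version__)
meta = get_model_meta(get_package_path("de_core_news_sm"))
print("model version:", meta["version"])
print("built for spaCy:", meta["spacy_version"])
If the two don't match, grab the tar.gz for a model release that targets your installed spaCy version and pip install that one instead.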

Related

Spacy ValueError: [E002] Can't find factory for 'relation_extractor' for language English (en)

I want to train a "relation extractor" component as in this tutorial. I have 3 .spacy files (train.spacy, dev.spacy, test.spacy).
I run:
python3 -m spacy init fill-config config.cfg config.cfg
followed by
python3 -m spacy train --output ./model config.cfg --paths.train train.spacy --paths.dev dev.spacy
Output:
ValueError: [E002] Can't find factory for 'relation_extractor' for language English (en). This usually happens when spaCy calls `nlp.create_pipe` with a custom component name that's not registered on the current language class. If you're using a Transformer, make sure to install 'spacy-transformers'. If you're using a custom component, make sure you've added the decorator `@Language.component` (for function components) or `@Language.factory` (for class components).
Available factories: attribute_ruler, tok2vec, merge_noun_chunks, merge_entities, merge_subtokens, token_splitter, doc_cleaner, parser, beam_parser, lemmatizer, trainable_lemmatizer, entity_linker, ner, beam_ner, entity_ruler, tagger, morphologizer, senter, sentencizer, textcat, spancat, future_entity_ruler, span_ruler, textcat_multilabel, en.lemmatizer
I have tried the two config files here but the output is the same.
To enable Transformers I have installed spacy-transformers and downloaded en_core_web_trf via
python3 -m spacy download en_core_web_trf
A similar issue was mentioned on GitHub, but that solution applies to a different context. Somebody raised the same issue on GitHub with no solution, and another report was not resolved either.
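For what it's worth, the relation extractor in that tutorial is a custom component, so its factory must be importable at the moment spaCy resolves config.cfg; spacy train takes a --code argument for exactly that. A minimal sketch of a registered factory (rel_code.py and the no-op stub below are hypothetical; the tutorial's real trainable component lives in its own files):
# rel_code.py - hypothetical module passed via --code so that the
# "relation_extractor" factory is registered before config resolution.
from spacy.language import Language

@Language.factory("relation_extractor")
def create_relation_extractor(nlp, name):
    # No-op stub for illustration: returns the Doc unchanged.
    def relation_extractor(doc):
        return doc
    return relation_extractor
Then pass the module on the command line:
python3 -m spacy train config.cfg --output ./model --code rel_code.py --paths.train train.spacy --paths.dev dev.spacy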

AttributeError: module 'spacy' has no attribute 'cli' when calling spacy.cli.download('en_core_web_lg') on Databricks

I have written code for an NLP programme that downloads the pipeline 'en_core_web_lg' using spaCy. Up to now the code was running, but now it shows me an AttributeError:
module 'spacy' has no attribute 'cli'.
I have installed pypi-cli version 0.4.1 and spacy 3.2.3, but I still cannot figure out the root cause of the problem or its solution.
I am new to coding.
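A cheap thing to try: import the submodule explicitly rather than relying on import spacy to expose it as an attribute. A minimal sketch using the standard spaCy CLI API:
import spacy
import spacy.cli  # load the submodule explicitly before using it

# Avoids the AttributeError in environments where `import spacy`
# alone does not expose spacy.cli.
spacy.cli.download("en_core_web_lg")
nlp = spacy.load("en_core_web_lg")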

Error while importing 'en_core_web_sm' for spacy in Azure Databricks

I am getting an error while loading 'en_core_web_sm' in spaCy in a Databricks notebook. I have seen a lot of other questions regarding the same issue, but they are of no help.
The code is as follows
import spacy
!python -m spacy download en_core_web_sm
from spacy import displacy
nlp = spacy.load("en_core_web_sm")
# Process
text = ("This is a test document")
doc = nlp(text)
I get the error "OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory"
The details of installation are
Python - 3.8.10
spaCy version 3.3
It simply does not work. I tried running python -m spacy validate, which reported:
ℹ spaCy installation:
/databricks/python3/lib/python3.8/site-packages/spacy

NAME              SPACY     VERSION
en_core_web_sm    >=2.2.2   3.3.0   ✔
But the error still remains
Not sure if this message is relevant
/databricks/python3/lib/python3.8/site-packages/spacy/util.py:845: UserWarning: [W094] Model 'en_core_web_sm' (2.2.5) specifies an under-constrained spaCy version requirement: >=2.2.2. This can lead to compatibility problems with older versions, or as new spaCy versions are released, because the model may say it's compatible when it's not. Consider changing the "spacy_version" in your meta.json to a version range, with a lower and upper pin. For example: >=3.3.0,<3.4.0
warnings.warn(warn_msg)
Also, this message appears when installing 'en_core_web_sm':
"Defaulting to user installation because normal site-packages is not writeable"
Any help will be appreciated
Ganesh
I suspect that you have a cluster with autoscaling, and when autoscaling happened the new nodes didn't have that module installed. Another reason could be that a cluster node was terminated by the cloud provider and the cluster manager pulled in a new node.
To prevent such situations I would recommend using a cluster init script, as described in the following answer - it guarantees that the module is installed even on new nodes. The content of the script is really simple:
#!/bin/bash
pip install spacy
python -m spacy download en_core_web_sm

How do I fix a keras error for a plaidbench keras test?

I am trying to install plaidml-keras so I can use non-Nvidia GPUs with Keras in Python/Jupyter. After clearing several other hurdles I get as far as:
plaidbench keras mobilenet
but it fails with two errors:
ImportError: cannot import name 'object_list_uid' from 'keras.utils.generic_utils' (/Users/me/sprinthive/src/notebooks/nbenv/lib/python3.7/site-packages/keras/utils/generic_utils.py)
File "/Users/me/sprinthive/src/notebooks/nbenv/lib/python3.7/site-packages/plaidbench/frontend_keras.py", line 321, in __init__
raise core.ExtrasNeeded(['plaidml-keras'])
plaidbench.core.ExtrasNeeded: Missing needed packages for benchmark; to fix, pip install plaidml-keras
This is in spite of already having plaidml-keras installed:
pip freeze | grep plaid
plaidbench==0.6.4
plaidml==0.6.4
plaidml-keras==0.6.4
[I am using 0.6.4 to make it work on macOS 10.13 High Sierra]
How can I resolve the above errors?
Thanks!
I worked this out by creating a virtual environment with Anaconda. Beware that I am working on Windows, so this might not be a solution for your problem. If I had to guess, something I installed earlier caused a Python package conflict; I think it is related to the TensorFlow library, but I haven't dug into that. I would recommend trying a fresh virtual environment on your Mac and installing the plaidml package there. The error message I saw before was exactly the same.
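If a fresh environment fixes the conflict, it's worth confirming that Keras is actually routed through PlaidML before re-running the benchmark. A small sketch based on the documented plaidml-keras usage:
# Install the PlaidML backend before the first `import keras`,
# then confirm which backend Keras picked up.
import plaidml.keras
plaidml.keras.install_backend()

import keras
print(keras.backend.backend())  # should name the plaidml backend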

Spacy es_core_news_sm model not loading

I'm trying to use spaCy for POS tagging in Spanish. I have checked the official documentation and also read various posts on Stack Overflow, but neither has worked for me.
I have Python 3.7 and spaCy 2.2.4 installed, and I'm running my code from a Jupyter notebook.
So, as the documentation suggests, I tried:
From my terminal:
python -m spacy download en_core_web_sm
This gave the result:
Download and installation successful
Then in my jupyter notebook:
import spacy
nlp = spacy.load("es_core_news_sm")
And I got the following error:
ValueError: [E173] As of v2.2, the Lemmatizer is initialized with an instance of Lookups containing the lemmatization tables. See the docs for details: https://spacy.io/api/lemmatizer#init
Additionally, I tried:
import spacy
nlp = spacy.load("es_core_news_sm")
And this gave me a different error:
OSError: Can't find model 'es_core_news_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory
Could you please help me to solve this error?
You downloaded the English model. In order to use the Spanish model, you have to download it as well: python -m spacy download es_core_news_sm
After downloading the right model, you can import it as follows:
import spacy
import es_core_news_sm
nlp = es_core_news_sm.load()
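Since the goal is POS tagging, a quick sanity check once the Spanish model is installed (a minimal sketch; the sample sentence is made up):
import spacy

# Tag a Spanish sentence and print each token with its part of speech.
nlp = spacy.load("es_core_news_sm")
doc = nlp("Esto es una prueba de etiquetado.")
for token in doc:
    print(token.text, token.pos_)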
