Error - 'English' object has no attribute 'add_label' - python-3.x

I trained spaCy's blank model with 5 entities, and now I have new data with 2 new entities. When training on top of the old model with 5 entities,
nlp.add_label(LABEL)
gives error:
AttributeError: 'English' object has no attribute 'add_label'
Spacy version : 2.1.4
Python : 3.6

You have to use add_label on the pipeline component (e.g. ner or textcat) - not on the nlp object. It's probably useful to read through the documentation in a bit more detail here.
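For example, a minimal sketch with a blank English pipeline and a made-up label name ("GADGET" is hypothetical), covering both the v2 and v3 APIs:

```python
import spacy

nlp = spacy.blank("en")

# add_label lives on the NER pipeline component, not on the nlp object
major = int(spacy.__version__.split(".")[0])
if major >= 3:
    ner = nlp.add_pipe("ner")      # spaCy v3: add_pipe takes the component name
else:
    ner = nlp.create_pipe("ner")   # spaCy v2: create the component, then add it
    nlp.add_pipe(ner)

ner.add_label("GADGET")            # hypothetical new entity label
```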

Related

AttributeError: 'Field' object has no attribute 'vocab' preventing me to run the code

I found this code and I want to see what object is being printed in the last line. I'm new to the field of NLP, so please help me fix this code, because it gives the error AttributeError: 'Field' object has no attribute 'vocab'. By the way, I have found out that torchtext has changed, so the error is probably related to those changes; the code probably worked before.
import spacy
from torchtext.legacy.data import Field

spacy_eng = spacy.load("en")

def tokenize_eng(text):
    return [tok.text for tok in spacy_eng.tokenizer(text)]

english = Field(
    tokenize=tokenize_eng, lower=True, init_token="<sos>", eos_token="<eos>"
)
print([english.vocab.stoi["<sos>"]])
You have to build the vocabulary for the english Field before you try to access it. You will need a dataset to build the vocabulary, which will be the dataset you are looking to build a model for. You can use english.build_vocab(...). Here are the docs for build_vocab.
Also, if you would like to learn how to migrate what you are doing to the new version of torchtext, here is a good resource.

In spacy custom trained model : Config Validation error ner -> incorrect_spans_key extra fields not permitted

I run into this problem whenever I try to load a custom-trained spaCy NER model inside a Docker container.
Note:
I am using the latest spaCy version 3.0 and trained the NER model using spaCy's CLI commands, first converting the training data into .spacy format.
The error is thrown as follows (you can check the error in the hyperlinked image):
config validation error
My trained model's file structure looks like this:
custom ner model structure
But when I run the model without Docker, it works perfectly. What have I done wrong in this process? Please help me resolve the error.
Thank you in advance.

Spacy returns "AttributeError: 'spacy.tokens.doc.Doc' object has no attribute 'spans'" in simple .spans assignment. Why?

I'm just trying to mark subparts of a document as spans as per Spacy's documentation
import spacy
nlp = spacy.load('en_core_web_sm')
sentence = "The car with the white wheels was being confiscated by the police when the owner returns from robbing a bank"
doc = nlp(sentence)
doc.spans['remove_parts'] = [doc[2:6], doc[9:12]]
doc.spans['remove_parts']
This looks pretty straightforward, but spaCy returns the following error (and attributes it to the second line, i.e. the assignment):
AttributeError: 'spacy.tokens.doc.Doc' object has no attribute 'spans'
I can't see what's going on at all. Is this a spaCy bug? Has the spans property been removed even though it is still in the documentation? If not, what am I missing?
PS: I'm using Colab for this, and spacy.info shows:
spaCy version 2.2.4
Location /usr/local/lib/python3.7/dist-packages/spacy
Platform Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
Python version 3.7.10
Models en
This code:
from spacy.lang.en import English

nlp = English()
text = "The car with the white wheels was being confiscated by the police when the owner returns from robbing a bank"
doc = nlp(text)
doc.spans['remove_parts'] = [doc[2:6], doc[9:12]]
doc.spans['remove_parts']
should work correctly from spaCy v3.0 onwards. If it doesn't - can you verify that you are in fact running the code from the correct virtual environment within colab (and not a different environment using spaCy v2)? We have previously seen issues where Colab would still be accessing older installations of spaCy on the system, instead of sourcing the code from the correct venv. To double check, you can try running the code in a Python console directly instead of through Colab.

Extracting Person Names from a text data in German Language using spacy or nltk?

I am using a spaCy model for the German language to extract named entities such as location names, person names, and company names, but I am not getting proper results. Is there some concept I am unable to figure out precisely?
def city_finder(text_data):
    nlp = spacy.load('en_core_web_sm')
    doc = nlp(text_data)
    for ents in doc.ents:
        if ents.label_ == 'GPE':
            return ents.text
This is the code I used to find city names in the text data, but its accuracy is not very high. When I run this code, the result is something other than the city name. Is there something I am missing as part of natural language processing, or in some other area?
You have to load a German language model; currently you are loading an English language model (note the prefix "en"). There are two German models available:
de_core_news_sm
de_core_news_md
Source: https://spacy.io/models/de
You can install them via the following command:
python -m spacy download de_core_news_sm
However, only four different entity types are currently supported (and in my experience, without retraining the entity extraction doesn't work very well for German):
Supported NER: LOC, MISC, ORG, PER
You would have to adapt your code the following way:
def city_finder(text_data):
    nlp = spacy.load('de_core_news_sm')  # load German language model
    doc = nlp(text_data)
    for ents in doc.ents:
        if ents.label_ == 'LOC':  # GPE is not supported
            return ents.text
There are also standard libraries for extracting language-specific POS. For extracting nouns, for example, you can check the Pattern library from CLiPS (see https://github.com/clips/pattern), which implements POS tagging for languages like German and Spanish.
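Since the question title asks about person names: with the German models the relevant label is PER, not PERSON. A minimal sketch below uses a blank German pipeline with hand-labelled entities so it runs without downloading a model; with a real model you would replace spacy.blank("de") with spacy.load('de_core_news_sm') and let the model set doc.ents itself:

```python
import spacy
from spacy.tokens import Span

def person_finder(doc):
    # collect every PER entity (German models use PER, not PERSON)
    return [ent.text for ent in doc.ents if ent.label_ == "PER"]

nlp = spacy.blank("de")
doc = nlp("Angela Merkel traf Olaf Scholz in Berlin.")
# hand-label the entities just for this sketch
doc.ents = [Span(doc, 0, 2, label="PER"), Span(doc, 3, 5, label="PER")]

print(person_finder(doc))  # -> ['Angela Merkel', 'Olaf Scholz']
```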

AttributeError: 'module' object has no attribute 'LogicParser' with nltk.LogicParser()

I'm playing with examples from Natural Language Processing with Python and this line:
lp = nltk.LogicParser()
produces
AttributeError: 'module' object has no attribute 'LogicParser'
error message. I imported several nltk modules and I can't figure out what is missing. Any clues?
It sounds like you've spotted the problem, but just in case: You are reading the first edition of the NLTK book, but evidently you have installed NLTK 3, which has many changes. Look at the current version of chapter 10 for the correct usage.
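For example, in NLTK 3 the logic parser is reached through nltk.sem, as the updated chapter 10 does (a minimal sketch; the formula is made up):

```python
from nltk.sem import Expression

# NLTK 3 replaced nltk.LogicParser() with Expression.fromstring
read_expr = Expression.fromstring
expr = read_expr('walk(angus)')
print(expr)  # -> walk(angus)
```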
