Unable to load 'en' from spacy in jupyter notebook

Unable to load 'en' from spacy in jupyter notebook - python-3.x

I run the following lines of code in a jupyter notebook:
import spacy
nlp = spacy.load('en')
And get following error:
Warning: no model found for 'en_default'
Only loading the 'en' tokenizer.
I am using python 3.5.3, spacy 1.9.0, and jupyter notebook 5.0.0.
I downloaded spacy using conda install spacy and python3 spacy install en.
I am able to import spacy and load 'en' from my terminal but not from a jupyter notebook.

Based on the answer in your comments, it seems fairly clear that the two Python interpreters for Jupyter and your system Python are not the same, and therefore likely do not have shared libraries between them.
I would recommend re-running the installation or just specifically installation the en tool in the correct Spacy. Replace the path with the full path to the file, if the above is not the full path.
//anaconda/envs/capstone/bin/python -m spacy download
That should be enough. Let me know if there are any issues.

You can also download en language model in the jupyter notebook:
import sys
!{sys.executable} -m spacy download en

Related

ModuleNotFoundError: No module named 'seg'

Tried to follow this line of code from this link https://spacy.io/universe/project/spacy-sentence-segmenter to create a sentence segmenter. Encountered the following error：ModuleNotFoundError: No module named 'seg'.
Spacy already installed. Didn't find any information about which module should be used for this 'seg'. Anyone could help? thanks.
from seg.newline.segmenter import NewLineSegmenter
import spacy
nlseg = NewLineSegmenter()
nlp = spacy.load('en')
nlp.add_pipe(nlseg.set_sent_starts, name='sentence_segmenter', before='parser')
doc = nlp(my_doc_text)

Sentence Segmenter is a third-party module that is different from your spaCy installation. You need to install it separately:
pip install spacyss
You can find more information in the project's Github page.

Try to install the module using pip or pip3: pip3 install segmentation

Error while importing 'en_core_web_sm' for spacy in Azure Databricks

I am getting an error while loading 'en_core_web_sm' of spacy in Databricks notebook. I have seen a lot of other questions regarding the same, but they are of no help.
The code is as follows
import spacy
!python -m spacy download en_core_web_sm
from spacy import displacy
nlp = spacy.load("en_core_web_sm")
# Process
text = ("This is a test document")
doc = nlp(text)
I get the error "OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory"
The details of installation are
Python - 3.8.10
spaCy version 3.3
It simply does not work. I tried the following
ℹ spaCy installation:
/databricks/python3/lib/python3.8/site-packages/spacy
NAME SPACY VERSION
en_core_web_sm >=2.2.2 3.3.0 ✔
But the error still remains
Not sure if this message is relevant
/databricks/python3/lib/python3.8/site-packages/spacy/util.py:845: UserWarning: [W094] Model 'en_core_web_sm' (2.2.5) specifies an under-constrained spaCy version requirement: >=2.2.2. This can lead to compatibility problems with older versions, or as new spaCy versions are released, because the model may say it's compatible when it's not. Consider changing the "spacy_version" in your meta.json to a version range, with a lower and upper pin. For example: >=3.3.0,<3.4.0
warnings.warn(warn_msg)
Also the message when installing 'en_core_web_sm"
"Defaulting to user installation because normal site-packages is not writeable"
Any help will be appreciated
Ganesh

I suspect that you have cluster with autoscaling, and when autoscaling happened, new nodes didn't have the that module installed. Another reason could be that cluster node was terminated by cloud provider & cluster manager pulled a new node.
To prevent such situations I would recommend to use cluster init script as it's described in the following answer - it will guarantee that the module is installed even on the new nodes. Content of the script is really simple:
#!/bin/bash
pip install spacy
python -m spacy download en_core_web_sm

Spacy es_core_news_sm model not loading

I'm trying to use Spacy for pos tagging in Spanish, for this I have checked the official documentation and also have read various post in Stackoverflow nonetheless neither has worked to me.
I have Python 3.7 and Spacy 2.2.4 installed and I'm running my code from a jupyter notebook
So as documentation suggests I tried:
From my terminal:
python -m spacy download en_core_web_sm
This gave the result:
Download and installation successful
Then in my jupyter notebook:
import spacy
nlp = spacy.load("es_core_news_sm")
And I got the following error:
ValueError: [E173] As of v2.2, the Lemmatizer is initialized with an instance of Lookups containing the lemmatization tables. See the docs for details: https://spacy.io/api/lemmatizer#init
Additionally, I tried:
import spacy
nlp = spacy.load("es_core_news_sm")
And this gave me a different error:
OSError: Can't find model 'es_core_news_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory
Could you please help me to solve this error?

You downloaded English model. In order to use Spanish model, you have to download it python -m spacy download es_core_news_sm

After downloading the right model you can try import it as follow
import spacy
import es_core_news_sm
nlp = es_core_news_sm.load()

Unable to import sklearn and statsmodels from Anaconda from windows 10 pro

I'm relatively new to python, so please excuse my ignorance on what could be a very easy fix. I am running python 3.6 through the Rodeo IDE, and it has been great, as it is similar to R-Studio (which I am very familiar with). As an aspiring data scientist, I am trying to learn how to fit regression and time series models to data, and all of the tutorials that I have found all say that I need various packages, all of which should be included in the Anaconda library. After downloading and re-downloading Python, Rodeo, and Anaconda, and trying various online fixes, I have been unable to successfully load the scikit-learn and the statsmodels modules.
#here is everything I have tried.
#using pip
! pip install 'statsmodels'
! pip install 'scikit-learn'
! pip install 'sklearn'
I don't get any errors here, and to be honest I'm kind of confused as to what this actually does, but I have seen many people online always suggest that this is a big problem when trying to import modules.
#using import
import sklearn
import statsmodels
from sklearn import datasets
import statsmodels.api as sm
all of the above give me the same error:
import statsmodels.api as sm
ImportError: No module named 'statsmodels'
ImportError: Traceback (most recent call last)
ipython-input-184-6030a6549dc0 in module()
----> 1 import statsmodels.api as sm
ImportError: No module named 'statsmodels'
I have tried to set my working directory to the Anaconda 3 file that has all of the packages and rerunning the above code with no success.
I'm thinking that the most likely problem has to do with my inexperience, and it is probably a simple fix. Is it possible that the IDE is bad or anaconda just doesn't like me?
So keeping all of the above in mind, the question is, how can I import these modules successfully so that I can access their functionality?

Option 1:
After installing packages with pip, try closing and reopening your IDE/Jupyter Notebook and try again.
This is a known bug that Jake VanderPlas outlined here
Option 2:
Don't put quotations around your pip messages.
!pip install -U statsmodels
!pip install scikit-learn
Option 3:
Also are you using Anaconda? If you are, you should already have scikit-learn. If you are trying inside Rodeo, I think you need to set your path inside Rodeo. Open Rodeo and set the Python Path to your fresh anaconda. See here

Using TensorFlow through Jupyter (Python 3)

Apologies in advance, I think the issue is quite perplexing!
I would like to use TensorFlow through Jupyter, with a Python3 kernel.
However the command import tensorflow as tf returns the error: ImportError: No module named tensorflow when either Python2 or Python3 is specified as the Jupyter kernel.
I have Python 2 and Python 3 installed on my Mac and can access both
versions through Terminal.
I installed TensorFlow for Python 3, however I can only access it via Python 2 on the Terminal.
As such, this question is really two-fold:
I want to get TensorFlow working with Python3
...which should lead to TensorFlow working with Jupyter on the Python3 terminal.

I had the same problem and solved it using the tutorial Using a virtualenv in an IPython notebook. I'll walk you through the steps I took.
I am using Anaconda, and I installed a new environment tensorflow using these instructions at tensorflow.org. After that, here is how I got tensorflow to work in a Jupyter notebook:
Open Terminal
Run source activate tensorflow. You should now see (tensorflow) at the beginning of the prompt.
Now that we are in the tensorflow environment, we want to install ipython and jupyter in this environment: Run
conda install ipython
and
conda install jupyter
Now follow the instructions in the tutorial linked above. I'll repeat them here with a bit more information added. First run
ipython kernelspec install-self --user
The result for me was Installed kernelspec python3 in /Users/charliebrummitt/Library/Jupyter/kernels/python3
Run the following:
mkdir -p ~/.ipython/kernels
Then run the following with <kernel_name> replaced by a name of your choice (I chose tfkernel) and replace the first path (i.e., ~/.local/share/jupyter/kernels/pythonX) by the path generated in step 4:
mv ~/.local/share/jupyter/kernels/pythonX ~/.ipython/kernels/<kernel_name>
Now you'll see a new kernel if you open a Jupyter notebook and select Kernel -> Change kernel from the menu. But the new kernel will have the same name as your previous kernel (for me it was called Python 3). To give your new kernel a unique name, run in Terminal
cd ~/.ipython/kernels/tfkernel/
and then run vim kernel.json to edit the file kernel.json so that you replace the value of "display_name" from the default (Python 3) to a new name (I chose to call it "tfkernel"). Save and exit vim by typing :wq while in command mode.
Open a new Jupyter notebook and type import tensorflow as tf. If you didn't get ImportError then you are ready to go!

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string