Python and SpaCy - cannot download specific version - python-3.x

I am using Python 3.7.7
I ran the following (after pip3 install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz) and got results
[XXXXX#localhost some-folder]$ python3 -m spacy download en_core_web_sm-2.2.0
2021-02-16 17:58:24.921639: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-02-16 17:58:24.921671: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine. Ignoring the CUDA*.
No compatible package found for 'en_core_web_sm-2.2.0
The only way I have got it to work is to remove the version 2.2.0 from the command. But the spaCy documentation suggests that specifying the version number should download the correct file.
So, what am I doing wrong?

Your spaCy version also matters, not only the Python version.
spaCy models-languages compatibility
If you are already using spaCy v3, you won't be able to download model versions below 3.
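To make the version point concrete, here is a minimal sketch of a compatibility check. It assumes the convention that the first two components of a model version track the spaCy major and minor version; the function names are hypothetical and are not part of spaCy.

```python
def major_minor(version):
    """Return the (major, minor) pair of a dotted version string such as '2.2.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def probably_compatible(spacy_version, model_version):
    """Rough check only.

    Assumption: a model is usable when its major.minor equals spaCy's
    major.minor. The authoritative source is the compatibility table.
    """
    return major_minor(spacy_version) == major_minor(model_version)


# A 2.2.0 model against a spaCy 3.x install fails this check,
# while the same model against spaCy 2.2.x passes it.
print(probably_compatible("3.3.0", "2.2.0"))
print(probably_compatible("2.2.4", "2.2.0"))
```

Running python3 -m spacy info shows the installed spaCy version to compare against.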

Related

Error while importing 'en_core_web_sm' for spacy in Azure Databricks

I am getting an error while loading 'en_core_web_sm' of spacy in Databricks notebook. I have seen a lot of other questions regarding the same, but they are of no help.
The code is as follows
import spacy
!python -m spacy download en_core_web_sm
from spacy import displacy
nlp = spacy.load("en_core_web_sm")
# Process
text = ("This is a test document")
doc = nlp(text)
I get the error "OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory"
The details of installation are
Python - 3.8.10
spaCy version 3.3
It simply does not work. I tried the following
ℹ spaCy installation:
/databricks/python3/lib/python3.8/site-packages/spacy
NAME SPACY VERSION
en_core_web_sm >=2.2.2 3.3.0 ✔
But the error still remains
Not sure if this message is relevant
/databricks/python3/lib/python3.8/site-packages/spacy/util.py:845: UserWarning: [W094] Model 'en_core_web_sm' (2.2.5) specifies an under-constrained spaCy version requirement: >=2.2.2. This can lead to compatibility problems with older versions, or as new spaCy versions are released, because the model may say it's compatible when it's not. Consider changing the "spacy_version" in your meta.json to a version range, with a lower and upper pin. For example: >=3.3.0,<3.4.0
warnings.warn(warn_msg)
Also, this is the message shown when installing 'en_core_web_sm':
"Defaulting to user installation because normal site-packages is not writeable"
Any help will be appreciated
Ganesh
I suspect that you have a cluster with autoscaling, and when autoscaling happened, the new nodes didn't have that module installed. Another reason could be that a cluster node was terminated by the cloud provider and the cluster manager pulled in a new node.
To prevent such situations, I would recommend using a cluster init script, as described in the following answer. It guarantees that the module is installed even on new nodes. The content of the script is really simple:
#!/bin/bash
pip install spacy
python -m spacy download en_core_web_sm
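As a complement to the init script, a notebook cell can check whether the model package is importable before loading it. This is only a sketch: the helper names are hypothetical, it runs in the current Python process only (it does not install anything on other nodes), and it assumes that a model installed with the same interpreter becomes importable in that session.

```python
import importlib
import importlib.util
import subprocess
import sys

MODEL = "en_core_web_sm"


def is_installed(package_name):
    """True if the current interpreter can find the package on its import path."""
    return importlib.util.find_spec(package_name) is not None


def download_command(model_name):
    """Build the download command for the interpreter that runs this code.

    Using sys.executable avoids installing into a different Python than
    the one the notebook is using.
    """
    return [sys.executable, "-m", "spacy", "download", model_name]


def ensure_model(model_name=MODEL):
    """Download the model if it is missing, then report whether it is importable."""
    if not is_installed(model_name):
        subprocess.run(download_command(model_name), check=True)
        importlib.invalidate_caches()  # let the import system see new packages
    return is_installed(model_name)
```

If ensure_model() still returns False after a successful download, the package probably went to a location that is not on this interpreter's import path, which would match the "Defaulting to user installation" message.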

How to solve the import cv problem in vs studio?

File "c:\Users\csany\Documents\PYTHON_BEGINEERS\read.py", line 1, in <module>
import cv2 as cv
File "C:\Users\csany\AppData\Local\Programs\Python\Python39\lib\site-packages\cv2\__init__.py", line 5, in <module>
from .cv2 import *
ImportError: DLL load failed while importing cv2: The specified module could not be found.
Every time I try to run the code I get this error. I have installed the OpenCV module from source and changed its path, yet it still won't work in vs studio for Python. Please help me out.
You can download OpenCV 3.2.0 (the latest at the time of writing) for Python 3.6 on a 32-bit or 64-bit Windows machine from this unofficial site; look for a file whose name starts with opencv_python-3.2.0-cp36-cp36m. Then run the matching command below to install it:
pip install opencv_python-3.2.0-cp36-cp36m-win32.whl (32-bit version)
pip install opencv_python-3.2.0-cp36-cp36m-win_amd64.whl (64-bit version)
I think it would be easier.
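A wheel only installs if its tags match the interpreter, so it helps to check which tags to look for. The traceback above shows Python39, for which a cp36 wheel would not match. The sketch below prints the CPython tag and a likely Windows platform tag; the function names are hypothetical, and the platform guess assumes an x86 or x86-64 machine.

```python
import struct
import sys


def cpython_tag():
    """Tag such as 'cp39' for CPython 3.9, derived from the running interpreter."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


def pointer_bits():
    """Size of a pointer in bits: 32 or 64 for the running interpreter."""
    return struct.calcsize("P") * 8


def expected_windows_platform_tag():
    """Assumption: x86 hardware, so 64-bit maps to win_amd64 and 32-bit to win32."""
    return "win_amd64" if pointer_bits() == 64 else "win32"


print(cpython_tag(), expected_windows_platform_tag())
```

Note that the bitness reported here is that of the Python interpreter, not of Windows itself; a 32-bit Python on 64-bit Windows needs the win32 wheel.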
Update on 2017-09-15:
OpenCV 3.3.0 wheel files are now available in the unofficial site and replaced OpenCV 3.2.0.
Update on 2018-02-15:
OpenCV 3.4.0 wheel files are now available in the unofficial site and replaced OpenCV 3.3.0.
Update on 2018-06-19:
OpenCV 3.4.1 wheel files are now available in the unofficial site with CPython 3.5/3.6/3.7 support, and replaced OpenCV 3.4.0.
Update on 2018-10-03:
OpenCV 3.4.3 wheel files are now available in the unofficial site with CPython 3.5/3.6/3.7 support, and replaced OpenCV 3.4.1.
Update on 2019-01-30:
OpenCV 4.0.1 wheel files are now available in the unofficial site with CPython 3.5/3.6/3.7 support.
Update on 2019-06-10:
OpenCV 3.4.6 and OpenCV 4.1.0 wheel files are now available in the unofficial site with CPython 3.5/3.6/3.7 support.
src : https://stackoverflow.com/a/43190144/13319197

Install pywin32 package in google colab or kaggle notebook environment

The pywin32 package is listed in the requirements for setting up the environment of a pix2pix implementation codebase; pywin32 exposes the Win32 API to Python. I tried to set up the environment in Google Colab, and the pywin32 setup produced the following error message.
ERROR: Could not find a version that satisfies the requirement pywin32 (from versions: none)
ERROR: No matching distribution found for pywin32
A similar issue, with the following message, occurred when I tried this in Kaggle:
ERROR: Could not find a version that satisfies the requirement pywin32
ERROR: No matching distribution found for pywin32
The same issue occurred when I tried in my local Python environment (Python 3.6.10) on my Mac.
I also attempted to install the pywin32 package from source, using the latest tag, build-300, as suggested for Python 3.5+. But no luck: the installation terminated with a dependency issue, the winreg package was not found, and the following message was shown.
ModuleNotFoundError: No module named 'winreg'
Likewise, I tried fake-winreg, but no luck at all. I checked the platform in Google Colab with print(sys.platform), and it shows linux. Please advise if there is any workaround to install the pywin32 package in Colab, and/or a resolution for any of the issues reported in the steps above. Thank you in advance.
Note:
The issue can be replicated by simply running pip install pywin32 in a native Python environment, or !pip install pywin32 in a Colab or Kaggle environment.
Unfortunately, you can't install it in Linux Python. pywin32 is a package of extension modules for accessing Windows C and COM APIs from Python on Windows:
Python extensions for Microsoft Windows Provides access to much of the Win32 API, the ability to create and use COM objects, and the Pythonwin environment.
Google Colab
Kaggle
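Since the package cannot be installed off Windows, code that has to run in both places can guard the import by platform. The following is a minimal sketch with hypothetical helper names; win32api is one of the modules that pywin32 provides.

```python
import importlib
import sys


def pywin32_supported(platform=sys.platform):
    """pywin32 targets Windows only; sys.platform is 'win32' there, even on 64-bit."""
    return platform == "win32"


def load_win32api():
    """Return the win32api module on Windows, or None elsewhere (e.g. Colab, Kaggle, macOS)."""
    if not pywin32_supported():
        return None
    return importlib.import_module("win32api")


print("pywin32 usable here:", pywin32_supported())
```

On the requirements side, a line such as pywin32; sys_platform == "win32" uses a standard environment marker so that pip skips the package on Linux and macOS. Whether the pix2pix codebase actually runs without it is a separate question that depends on where it uses the Win32 API.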

Pandas install or import in IBM SPSS Statistics Version 26

I have installed the latest version of IBM SPSS Statistics (Version 26), which comes with Python 3.4 and 2.7 preinstalled. I am trying to use the Python 3.4 version. I am able to import modules like pip, sys, os, etc. I tried importing pandas the same way, but I am unable to do so and get an error that no module was found. After going through this forum and IBM support, I made the following changes.
received the following error
1) Tried pointing to the site-packages folder via
import sys
# Assuming windows and standard python folder here.
sys.path.append(r"D:\Python34\Lib\site-packages")
2) Changed the path in the settings of SPSS
3) Tried installing pip in the folder below, as suggested in the forum, but got a message that I have already installed the updated version.
C:\Program Files\IBM\SPSS\Statistics\Subscription\Python3
4) The following versions of Python were installed
I have tried what I could. I need your expert help to fix this so that I can install and use the modules needed for SPSS. Thanks.
This is going to be painful to explain, but I'll do my best.
As far as I can tell, you're on Windows. Usually when we need a new package, we just open cmd and type pip install xxx (assuming you added Python to PATH when installing it). This works because, when you type pip install xxx in cmd, Windows recognizes pip as a command, since the Python path is in the system variables. Windows then runs pip install with the Python found at that path.
However, the SPSS Python (3.4) has a different path on the system. So when only the 3.7 or 3.8 Python is in PATH, Windows cannot install packages to your 3.4 Python, and I'm not sure if you can have more than one Python path in the system variables.
In order to fix this, you first need to figure out the path to your 3.4 Python. Then you can follow the instructions on this page to remove your 3.7 or 3.8 Python from PATH and add your 3.4 path, after which you can run pip install xxxx for whatever package you want.
I did the same thing with an ArcGIS Python distribution; hope this works for you. If the attached page does not work, just google "add python path to windows" and look for instructions that work on your PC.
Oh, and the reason that you can import pip, sys and some other packages but not pandas is that Python is 'batteries included': it comes with tons of packages preinstalled for additional functionality, but pandas is not one of them.
Fixed it. Since my Anaconda had version 3.7, I created a virtual environment and installed Python 2.7 with the Anaconda package. I pointed SPSS to the 2.7 folder and was able to import pandas.
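One way to see which Python a given environment is really using, without touching PATH, is to ask the interpreter itself. The sketch below is written without f-strings so that it also runs on Python 3.4; the helper name is hypothetical, and it assumes the interpreter in question has pip available as a module.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter that runs this code.

    Running pip as 'python -m pip' targets that exact interpreter,
    whatever the first Python on PATH happens to be.
    """
    return [sys.executable, "-m", "pip", "install", package]


# Run this inside the environment in question to see which python it is,
# then run the printed command in cmd (quote the path if it contains spaces).
print(sys.executable)
print(" ".join(pip_install_command("pandas")))
```

Whether a given pandas release still supports such an old interpreter is a separate matter that this sketch does not check.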

How to find version of libraries installed for requirements.txt that don't show in pip or conda list?

I am trying to get the requirements for my code, but the exported requirements.txt file only contains 4 of the 9 libraries I'm using. I'm wondering if Python comes with preloaded libraries when installed? The libraries I'm using are requests, json, datetime, calendar, pandas, os, openpyxl, re, and click. The only versions I was able to get were the versions for click, openpyxl, pandas, and requests. How do I get the versions for json, datetime, calendar, os, and re?
I have tried doing a pip list and conda list, but neither showed the versions for json, datetime, calendar, os, and re.
I am using Python 3.7.1.
json, datetime, calendar, os, and re are all part of the Python Standard Library and are installed by default. They are versioned with the interpreter itself, so they do not need an entry in requirements.txt.
