How to set path to Python executable in JupyterLab somewhere outside Anaconda - python-3.x

I need to be able to run JupyterLab on the Python executable bundled with SPSS in order to import some SPSS libraries (spss, spssaux, SpssClient, etc.) in a notebook in JupyterLab. Based on answers to other Anaconda/executable-related questions, I've attempted two possible solutions:
Copy the relevant libraries from the Python directory in SPSS into the Python directory in Anaconda. I tried this with both the base environment and the virtual environment I work out of, and tried launching JupyterLab from both. I copied the files from C:\Program Files\IBM\SPSS\Statistics\27\Python3\Lib\site-packages to C:\ProgramData\Anaconda3\Lib.
Redirect Anaconda to the Python executable in SPSS. I tried to do that with sys.path.insert:
import sys
sys.path.insert(0, r'C:\Program Files\IBM\SPSS\Statistics\27\Python3')
Attempt #1 results in the following error when I try to import spss:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11344/2864419669.py in <module>
----> 1 import spss
C:\Program Files\IBM\SPSS\Statistics\27\Python3\Lib\site-packages\spss\__init__.py in <module>
254 __SetErrorMessage()
255
--> 256 from .spss import *
257 from .cursors import *
258 from .pivotTable import *
C:\Program Files\IBM\SPSS\Statistics\27\Python3\Lib\site-packages\spss\spss.py in <module>
21
22 import atexit,os,locale,sys,codecs
---> 23 from . import PyInvokeSpss
24 from .errMsg import *
25
ImportError: DLL load failed while importing PyInvokeSpss: The specified module could not be found.
This happens in spite of having copied PyInvokeSpss to C:\ProgramData\Anaconda3\Lib.
Attempt #2 doesn't seem to change the path to the executable:
C:\ProgramData\Anaconda3\python.exe
I also have an idea for a third solution, which would be to package the Python that's in SPSS as an IPython kernel and then activate that in JupyterLab after it's launched, but I can't figure out a way of doing that.
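(For reference, registering an external interpreter as a Jupyter kernel usually looks something like the following, run from a regular command prompt. This is only a sketch: it assumes the SPSS-bundled python.exe is allowed to install ipykernel, which may not be the case, and the kernel name is arbitrary.)
"C:\Program Files\IBM\SPSS\Statistics\27\Python3\python.exe" -m pip install ipykernel
"C:\Program Files\IBM\SPSS\Statistics\27\Python3\python.exe" -m ipykernel install --user --name spss-python --display-name "Python (SPSS 27)"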
What's the best direction here for a solution? What am I doing wrong?

Related

GStreamer build on Windows can't import _giscanner

I'm trying to build GStreamer on windows using gst-build.
My environment:
Visual studio 2019 Professional
Python 3.9 64 bit
Meson 0.58.999
Ninja 1.10.2
The error message is:
>ninja -C build
ninja: Entering directory `build'
[1/30] Generating gir-glib with a custom command (wrapped by meson to set PATH)
FAILED: subprojects/gobject-introspection/gir/GLib-2.0.gir
[long list of python subprocess args with include paths, c files etc]
Traceback (most recent call last):
File "C:\Work\GStreamer-build\source\gst-build\build\subprojects\gobject-introspection\tools\g-ir-scanner", line 98, in <module>
from giscanner.scannermain import scanner_main
File "C:\Work\GStreamer-build\source\gst-build\build\subprojects\gobject-introspection\giscanner\scannermain.py", line 35, in <module>
from giscanner.ast import Include, Namespace
File "C:\Work\GStreamer-build\source\gst-build\build\subprojects\gobject-introspection\giscanner\ast.py", line 29, in <module>
from .sourcescanner import CTYPE_TYPEDEF, CSYMBOL_TYPE_TYPEDEF
File "C:\Work\GStreamer-build\source\gst-build\build\subprojects\gobject-introspection\giscanner\sourcescanner.py", line 35, in <module>
from giscanner._giscanner import SourceScanner as CSourceScanner
ImportError: DLL load failed while importing _giscanner: The specified module could not be found.
ninja: build stopped: subcommand failed.
If I check the build directory, the compiled module is there. So the file does build, but for some reason can't be imported.
I've seen from here that there were changes in Python 3.8 to how DLLs are searched for, but even if I manually open a Python prompt in the parent directory, add both the parent and the giscanner directories to sys.path, and use os.add_dll_directory, I still can't import the module:
>>> import os,sys
>>> a = os.add_dll_directory(os.getcwd())
>>> a
<AddedDllDirectory('C:\\Work\\GStreamer-build\\source\\gst-build\\build\\subprojects\\gobject-introspection')>
>>> b = os.add_dll_directory(os.path.join(os.getcwd(),'giscanner'))
>>> b
<AddedDllDirectory('C:\\Work\\GStreamer-build\\source\\gst-build\\build\\subprojects\\gobject-introspection\\giscanner')>
>>> sys.path.insert(0, os.getcwd())
>>> sys.path.insert(0, os.path.join(os.getcwd(),'giscanner'))
Further digging shows that pkg-config doesn't seem to know where to find gio; however, that was also built successfully as part of the build, so I'm not sure why it doesn't know.
I'm unsure of the actual problem, but the workaround is to revert Python to 3.7.9 (64-bit):
https://www.python.org/downloads/release/python-379
It's also been suggested that you can build without that component using
-Dintrospection=disabled
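For example, on an existing gst-build build directory that could look something like this (a sketch only; the option name comes from the suggestion above, but exact handling can differ between gst-build versions, and you may need to re-run meson from scratch for the change to take effect):
meson configure build -Dintrospection=disabled
ninja -C build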

While attempting to import sklearn modules in a Jupyter notebook as well as in PyCharm, I continuously get the following error

I have installed scikit-learn using pip; however, while trying to import sklearn modules in a Jupyter notebook as well as in PyCharm, I continuously get the following error. I am working in Python 3.9. I am new to the interface, so please suggest a solution for this issue.
ImportError Traceback (most recent call last)
in
----> 1 import sklearn.linear_model as lm
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__init__.py in <module>
79 # it and importing it first would fail if the OpenMP dll cannot be found.
80 from . import _distributor_init # noqa: F401
---> 81 from . import __check_build # noqa: F401
82 from .base import clone
83 from .utils._show_versions import show_versions
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/__init__.py in <module>
44 from ._check_build import check_build # noqa
45 except ImportError as e:
---> 46 raise_build_error(e)
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/__init__.py in raise_build_error(e)
29 else:
30 dir_content.append(filename + '\n')
---> 31 raise ImportError("""%s
32 ___________________________________________________________________________
33 Contents of %s:
ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/_check_build.cpython-39-darwin.so, 2): Symbol not found: ____chkstk_darwin
Referenced from: /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/../.dylibs/libomp.dylib
Expected in: /usr/lib/libSystem.B.dylib
in /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/../.dylibs/libomp.dylib
Contents of /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build:
__init__.py __pycache__ _check_build.cpython-39-darwin.so
setup.py
It seems that scikit-learn has not been built correctly.
If you have installed scikit-learn from source, please do not forget
to build the package before using it: run python setup.py install or
make in the source directory.
If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform.
Thanks
As you can see in this bug report, the new sklearn version 0.24 crashes on macOS < 10.15 systems.
Until this bug is fixed, the developers suggest installing the previous version using pip install -U scikit-learn==0.23.

Python 3.8: Error running different Python environments from base directory's Jupyter Notebook

I've recently had to do a reformat of my work computer, and am reinstalling Anaconda. I generally keep Anaconda's root (base) folder untouched, and create separate environments when I need to work with specialist Python modules instead of cluttering the (base) environment.
In the past, I was able to successfully run these different environments from the Jupyter Notebook installed in the (base) environment. I would go about doing so by installing ipykernel in the new environment (e.g. my-env), and then running the following commands:
(base) activate my-env
(my-env) conda install ipykernel
(my-env) python -m ipykernel install --name "my-env" --display-name "My Python"
This would be done successfully, and give me the following message:
Installed kernelspec my-env in C:\ProgramData\jupyter\kernels\my-env
However, when I tried testing out the link in Jupyter Notebook using a standard import matplotlib.pyplot as plt command, I got the following error message:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-a0d2faabd9e9> in <module>
----> 1 import matplotlib.pyplot as plt
C:\Anaconda3\envs\my-env\lib\site-packages\matplotlib\__init__.py in <module>
105 # cbook must import matplotlib only within function
106 # definitions, so it is safe to import from it here.
--> 107 from . import cbook, rcsetup
108 from matplotlib.cbook import MatplotlibDeprecationWarning, sanitize_sequence
109 from matplotlib.cbook import mplDeprecation # deprecated
C:\Anaconda3\envs\my-env\lib\site-packages\matplotlib\cbook\__init__.py in <module>
26 import weakref
27
---> 28 import numpy as np
29
30 import matplotlib
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\__init__.py in <module>
141 from .core import *
142 from . import compat
--> 143 from . import lib
144 # NOTE: to be revisited following future namespace cleanup.
145 # See gh-14454 and gh-15672 for discussion.
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\lib\__init__.py in <module>
23 # Private submodules
24 from .type_check import *
---> 25 from .index_tricks import *
26 from .function_base import *
27 from .nanfunctions import *
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\lib\index_tricks.py in <module>
9 from numpy.core.numerictypes import find_common_type, issubdtype
10
---> 11 import numpy.matrixlib as matrixlib
12 from .function_base import diff
13 from numpy.core.multiarray import ravel_multi_index, unravel_index
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\matrixlib\__init__.py in <module>
2
3 """
----> 4 from .defmatrix import *
5
6 __all__ = defmatrix.__all__
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\matrixlib\defmatrix.py in <module>
9 # While not in __all__, matrix_power used to be defined here, so we import
10 # it for backward compatibility.
---> 11 from numpy.linalg import matrix_power
12
13
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\linalg\__init__.py in <module>
71 """
72 # To get sub-modules
---> 73 from .linalg import *
74
75 from numpy._pytesttester import PytestTester
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\linalg\linalg.py in <module>
31 from numpy.core import overrides
32 from numpy.lib.twodim_base import triu, eye
---> 33 from numpy.linalg import lapack_lite, _umath_linalg
34
35
ImportError: DLL load failed: The specified module could not be found.
Could someone advise me on what might be the issue? If it helps, my (base) environment has a python version of 3.8.3 and a notebook version of 6.0.3, whereas my new my-env environment has modules downloaded from conda-forge. It has a python version of 3.7.8 and an ipykernel version of 5.3.4.
Thanks in advance!
UPDATE 26 Oct 2020
As requested, I have included the list of modules I have in both the (base) environment and the (my-env) environment. In general, the packages in (base) have been kept updated with respect to the anaconda module, whereas the packages in (my-env) are kept up-to-date with respect to hyperspy, which is stored in the conda-forge repository.
I have created PasteBin entries for them, as they exceed the character limit for this post.
Link to list of modules in (base)
Link to list of modules in (my-env)
I also tried importing modules other than matplotlib and numpy, and was able to import abc and time without issue, for example. This seems to be an issue with the (base) version of Jupyter Notebook not being compatible with the numpy found in the (my-env) environment.
The error message indicates the error arises from the numpy library. The fact that your Python and ipykernel versions differ between your (base) and your (my-env) is further indication that there is an incompatibility between your environments.
Can you provide the output from
conda list
from each environment?
When I tried to create a Python=3.8.3 environment, the numpy version installed was numpy-1.19.2-py38hf89b668_1.
I used the command
conda create -n foo -c conda-forge python=3.8.3 numpy
When I tried to create a Python=3.7.8 environment, the numpy version installed was numpy-1.19.2-py37h7008fea_1.
I used the command
conda create -n foo -c conda-forge python=3.7.8 numpy
In addition, why don't you consider installing ipykernel / Jupyter Notebook libraries that are consistent with the respective version of Python in each environment? This is generally the best way to ensure dependencies are correctly aligned.
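For example, something along these lines (a sketch; it reuses the my-env name from the question and assumes conda-forge as the channel, matching the rest of that environment):
(base) conda activate my-env
(my-env) conda install -c conda-forge notebook ipykernel
(my-env) jupyter notebook
Launching Jupyter Notebook from inside the environment this way avoids mixing the (base) notebook server with (my-env) packages.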
I also attempted to install ipykernel in both the python=3.8.3 and python=3.7.8 environments without specifying a version number.
Here are the versions of ipykernel that conda automatically chooses:
for python=3.8.3: ipykernel-5.3.4 | py38h1cdfbd6_1
for python=3.7.8: ipykernel-5.3.4 | py37hc6149b9_1
From what you have written, your ipykernel versions are different. I think this discrepancy most likely comes from these differences in ipykernel versions.
When you check your environments, verify that the channel source for ipykernel is the same.
One solution: consider downgrading the ipykernel in (base) to version 5.3.4.
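For example (a sketch; it assumes the 5.3.4 build is available from your configured channels):
(base) conda install ipykernel=5.3.4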

Is pandas_ml broken?

The version info and issue are given below. I want to know whether pandas_ml is broken or I am doing something wrong. Why am I not able to import pandas_ml?
Basic info:
Versions of sklearn and pandas_ml and python are given below:
Python 3.8.2
scikit-learn 0.23.0
pandas-ml 0.6.1
Issue:
import pandas_ml as pdml
returns the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-47-79d5f9d2381c> in <module>
----> 1 import pandas_ml as pdml
2 #from pandas_ml import ModelFrame
3 #mf = pdml.ModelFrame(df.to_dict())
4 #mf.head()
d:\program files\python38\lib\site-packages\pandas_ml\__init__.py in <module>
1 #!/usr/bin/env python
2
----> 3 from pandas_ml.core import ModelFrame, ModelSeries # noqa
4 from pandas_ml.tools import info # noqa
5 from pandas_ml.version import version as __version__ # noqa
d:\program files\python38\lib\site-packages\pandas_ml\core\__init__.py in <module>
1 #!/usr/bin/env python
2
----> 3 from pandas_ml.core.frame import ModelFrame # noqa
4 from pandas_ml.core.series import ModelSeries # noqa
d:\program files\python38\lib\site-packages\pandas_ml\core\frame.py in <module>
8
9 import pandas_ml.imbaccessors as imbaccessors
---> 10 import pandas_ml.skaccessors as skaccessors
11 import pandas_ml.smaccessors as smaccessors
12 import pandas_ml.snsaccessors as snsaccessors
d:\program files\python38\lib\site-packages\pandas_ml\skaccessors\__init__.py in <module>
13 from pandas_ml.skaccessors.linear_model import LinearModelMethods # noqa
14 from pandas_ml.skaccessors.manifold import ManifoldMethods # noqa
---> 15 from pandas_ml.skaccessors.metrics import MetricsMethods # noqa
16 from pandas_ml.skaccessors.model_selection import ModelSelectionMethods # noqa
17 from pandas_ml.skaccessors.neighbors import NeighborsMethods # noqa
d:\program files\python38\lib\site-packages\pandas_ml\skaccessors\metrics.py in <module>
254 _true_pred_methods = (_classification_methods + _regression_methods
255 + _cluster_methods)
--> 256 _attach_methods(MetricsMethods, _wrap_target_pred_func, _true_pred_methods)
257
258
d:\program files\python38\lib\site-packages\pandas_ml\core\accessor.py in _attach_methods(cls, wrap_func, methods)
91
92 for method in methods:
---> 93 _f = getattr(module, method)
94 if hasattr(cls, method):
95 raise ValueError("{0} already has '{1}' method".format(cls, method))
AttributeError: module 'sklearn.metrics' has no attribute 'jaccard_similarity_score'
It seems it is indeed. Here is the situation:
Although the function jaccard_similarity_score is not listed among the available functions of sklearn.metrics in the documentation, it was still there under the hood (hence available) until v0.22.2 (source code), in addition to jaccard_score. But in the source code of the latest v0.23, it has been removed, and only jaccard_score remains.
This would imply that it could still be possible to use pandas-ml by simply downgrading scikit-learn to v.0.22.2. But unfortunately this will not work either, throwing a different error:
!pip install pandas-ml
# Successfully installed enum34-1.1.10 pandas-ml-0.6.1
import sklearn
sklearn.__version__
# '0.22.2.post1'
import pandas_ml as pdml
[...]
AttributeError: module 'sklearn.preprocessing' has no attribute 'Imputer'
I guess it would be possible to find a scikit-learn version that works with it by going back far enough (the last commit in their GitHub repo was in March 2019), but I'm not sure it is worth the fuss. In any case, they do not even mention scikit-learn (let alone any specific version of it) in their requirements file, which does not seem like sound practice, and the whole project seems rather abandoned.
So after some time and effort on this, I got it working and realized that the concept of "broken" in Python is rather murky. It depends on the combination of libraries you are trying to use and their dependencies. The older releases are all available and can be used, but sometimes it is a hit-and-trial process to find the correct combination of package versions that gets everything working.
The other thing I learnt from this exercise is the importance of having significant expertise in creating and managing virtual environments when programming with Python.
In my case, I got help from some friends with the hit-and-trial part and found that pandas_ml works on Python 3.7. Given below is the pip freeze output, which can be used to set up a reliable virtual environment for machine learning and deep learning work using libraries like pandas_ml and imbalanced-learn, and which may include some other libraries that have not had a new release in the last few years.
To create a working environment with the right versions of packages, ensuring that the pandas_ml and imbalanced-learn libraries work, create an environment with the following configuration on Python 3.7 (a sketch of one way to do this follows the list).
backcall==0.1.0
colorama==0.4.3
cycler==0.10.0
decorator==4.4.2
enum34==1.1.10
imbalanced-learn==0.4.3
ipykernel==5.2.1
ipython==7.14.0
ipython-genutils==0.2.0
jedi==0.17.0
joblib==0.15.0
jupyter-client==6.1.3
jupyter-core==4.6.3
kiwisolver==1.2.0
matplotlib==3.2.1
numpy==1.15.4
pandas==0.24.2
pandas-ml==0.6.1
parso==0.7.0
pickleshare==0.7.5
prompt-toolkit==3.0.5
Pygments==2.6.1
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
pywin32==227
pyzmq==19.0.1
scikit-learn==0.20.0
scipy==1.3.3
six==1.14.0
threadpoolctl==2.0.0
tornado==6.0.4
traitlets==4.3.3
wcwidth==0.1.9
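As one way to apply the list above (a sketch; the environment name pdml-env and the requirements.txt filename are placeholders I am assuming, not part of the original setup), save it to a file and install it into a fresh Python 3.7 environment:
conda create -n pdml-env python=3.7
conda activate pdml-env
pip install -r requirements.txt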
Hope this helps someone who is looking for the right combination of library versions to set up their machine learning and deep learning environment in Python using the pandas_ml and imbalanced-learn packages.

Why does a Command Prompt import differ from a Sublime Text import?

I have installed several packages with pip (numpy/pandas/blpapi/pyarrow). I work on a Windows 64-bit machine with Python 3.6 in a Sublime Text environment.
While all packages are shown as correctly imported in the command prompt, some packages are not found by my Sublime scripts.
To try and remedy this problem, I used sys.path.insert and changed the names of my scripts, to no avail. The traceback below describes what I'm seeing:
Code in Command Prompt:
>>> import pyarrow
>>> import pandas
>>>
Code in Sublime (better_name.py):
print('Hi')
import numpy
import pandas
Output of better_name.py:
Hi
Traceback (most recent call last):
File "C:\Users\Documents\better_name.py", line 4, in <module>
import pandas
ModuleNotFoundError: No module named 'pandas'
Obtaining the paths in Command Prompt:
>>> import os
>>> import numpy
>>> path = os.path.dirname(numpy.__file__)
>>> print(path)
C:\Users\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy
>>> import pandas
>>> path = os.path.dirname(pandas.__file__)
>>> print(path)
C:\Users\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas
Trying to use sys.path.insert:
print('Hi')
import sys
import numpy
import os
sys.path.insert(1, r"C:\Users\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas")
import pandas
Output:
C:\Users\Documents>better_name.py
Hi
Traceback (most recent call last):
File "C:\Users\Documents\better_name.py", line 7, in <module>
import pandas
ModuleNotFoundError: No module named 'pandas'
I get the same results if I change the first argument of sys.path.insert to 0.
The issue seems to be that your default version of Python points to the 32-bit version, i.e. when you run python, your Windows system executes the 32-bit version.
One workaround is to specify the full path of your 64-bit version, i.e. launch your script as
C:\PATH\TO\64-BIT-VERSION\PYTHON.EXE your_script.py
from the command line.
The other option is to set your Windows environment variables to point to the 64-bit version by default. This link should help.
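A quick way to confirm which interpreter and architecture a script is actually running under is to print them from the script itself (standard-library calls only):
import sys
import platform

print(sys.executable)           # full path of the interpreter running this script
print(platform.architecture())  # e.g. ('64bit', 'WindowsPE') vs ('32bit', 'WindowsPE')
Running those lines from the command prompt and from Sublime Text should show whether the two are using different Python installations.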

Resources