Is pandas_ml broken? - python-3.x

The version info and issue are given below. I want to know whether pandas_ml is broken or whether I am doing something wrong. Why am I not able to import pandas_ml?
Basic info:
Versions of sklearn and pandas_ml and python are given below:
Python 3.8.2
scikit-learn 0.23.0
pandas-ml 0.6.1
Issue:
import pandas_ml as pdml
returns the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-47-79d5f9d2381c> in <module>
----> 1 import pandas_ml as pdml
2 #from pandas_ml import ModelFrame
3 #mf = pdml.ModelFrame(df.to_dict())
4 #mf.head()
d:\program files\python38\lib\site-packages\pandas_ml\__init__.py in <module>
1 #!/usr/bin/env python
2
----> 3 from pandas_ml.core import ModelFrame, ModelSeries # noqa
4 from pandas_ml.tools import info # noqa
5 from pandas_ml.version import version as __version__ # noqa
d:\program files\python38\lib\site-packages\pandas_ml\core\__init__.py in <module>
1 #!/usr/bin/env python
2
----> 3 from pandas_ml.core.frame import ModelFrame # noqa
4 from pandas_ml.core.series import ModelSeries # noqa
d:\program files\python38\lib\site-packages\pandas_ml\core\frame.py in <module>
8
9 import pandas_ml.imbaccessors as imbaccessors
---> 10 import pandas_ml.skaccessors as skaccessors
11 import pandas_ml.smaccessors as smaccessors
12 import pandas_ml.snsaccessors as snsaccessors
d:\program files\python38\lib\site-packages\pandas_ml\skaccessors\__init__.py in <module>
13 from pandas_ml.skaccessors.linear_model import LinearModelMethods # noqa
14 from pandas_ml.skaccessors.manifold import ManifoldMethods # noqa
---> 15 from pandas_ml.skaccessors.metrics import MetricsMethods # noqa
16 from pandas_ml.skaccessors.model_selection import ModelSelectionMethods # noqa
17 from pandas_ml.skaccessors.neighbors import NeighborsMethods # noqa
d:\program files\python38\lib\site-packages\pandas_ml\skaccessors\metrics.py in <module>
254 _true_pred_methods = (_classification_methods + _regression_methods
255 + _cluster_methods)
--> 256 _attach_methods(MetricsMethods, _wrap_target_pred_func, _true_pred_methods)
257
258
d:\program files\python38\lib\site-packages\pandas_ml\core\accessor.py in _attach_methods(cls, wrap_func, methods)
91
92 for method in methods:
---> 93 _f = getattr(module, method)
94 if hasattr(cls, method):
95 raise ValueError("{0} already has '{1}' method".format(cls, method))
AttributeError: module 'sklearn.metrics' has no attribute 'jaccard_similarity_score'

It seems it is indeed. Here is the situation:
Although jaccard_similarity_score is not listed among the available functions of sklearn.metrics in the documentation, it was still present under the hood (and hence usable) up to v0.22.2 (source code), alongside jaccard_score. In the source code of the latest v0.23, however, it has been removed, and only jaccard_score remains.
This would suggest that pandas-ml could still be used by simply downgrading scikit-learn to v0.22.2. Unfortunately, that does not work either; it throws a different error:
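If the goal is just the metric rather than the pandas-ml wrapper, the surviving sklearn.metrics.jaccard_score computes, for binary labels, the intersection-over-union of the positive class, i.e. TP / (TP + FP + FN). A dependency-free sketch of that formula (note: this is not a drop-in replacement for the removed jaccard_similarity_score, whose default behaviour was different, and the degenerate all-negative case is handled here by an arbitrary choice of 1.0):

```python
def binary_jaccard(y_true, y_pred):
    """Jaccard index of the positive class: TP / (TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = tp + fp + fn
    # When no positives exist in either vector we call it perfect agreement;
    # sklearn's jaccard_score instead returns 0 and warns in this case.
    return tp / denom if denom else 1.0

print(binary_jaccard([0, 1, 1, 0], [1, 1, 1, 0]))  # 2/3: TP=2, FP=1, FN=0
```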
!pip install pandas-ml
# Successfully installed enum34-1.1.10 pandas-ml-0.6.1
import sklearn
sklearn.__version__
# '0.22.2.post1'
import pandas_ml as pdml
[...]
AttributeError: module 'sklearn.preprocessing' has no attribute 'Imputer'
I guess it would be possible to find a scikit-learn version that works with it by going back far enough (the last commit in their GitHub repo was in March 2019), but I am not sure it is worth the fuss. In any case, they do not even mention scikit-learn (let alone a specific version of it) in their requirements file, which does not seem like sound practice, and the whole project looks rather abandoned.

So after some time and effort on this, I got it working and realized that the concept of "broken" in Python is rather murky. It depends on the combination of libraries you are trying to use and their dependencies. The older releases are all still available and can be used, but finding the combination of package versions that gets everything working can be a trial-and-error process.
The other thing I learned from this exercise is the importance of being comfortable creating and managing virtual environments when programming in Python.
In my case, I got help from some friends with the trial-and-error part and found that pandas_ml works on Python 3.7. Below is the pip freeze output, which can be used to set up a reliable virtual environment for machine learning and deep learning work with libraries like pandas_ml and imbalanced-learn, including some packages that have not had a new release in the last few years.
To create a working environment in which the pandas_ml and imbalanced-learn libraries both work, use the following configuration on Python 3.7:
backcall==0.1.0
colorama==0.4.3
cycler==0.10.0
decorator==4.4.2
enum34==1.1.10
imbalanced-learn==0.4.3
ipykernel==5.2.1
ipython==7.14.0
ipython-genutils==0.2.0
jedi==0.17.0
joblib==0.15.0
jupyter-client==6.1.3
jupyter-core==4.6.3
kiwisolver==1.2.0
matplotlib==3.2.1
numpy==1.15.4
pandas==0.24.2
pandas-ml==0.6.1
parso==0.7.0
pickleshare==0.7.5
prompt-toolkit==3.0.5
Pygments==2.6.1
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
pywin32==227
pyzmq==19.0.1
scikit-learn==0.20.0
scipy==1.3.3
six==1.14.0
threadpoolctl==2.0.0
tornado==6.0.4
traitlets==4.3.3
wcwidth==0.1.9
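To verify that an environment actually matches the pins above, a small check can compare installed versions against the wanted ones. This sketch uses importlib.metadata, which requires Python 3.8+; on the Python 3.7 environment described here you would use the importlib-metadata backport or pkg_resources instead:

```python
from importlib.metadata import version, PackageNotFoundError

def check_pins(pins):
    """Return {package: installed_version_or_None} for every mismatched pin."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches

# Example with a few of the pins from the list above; an empty dict means
# everything matches.
print(check_pins({"pandas": "0.24.2", "pandas-ml": "0.6.1"}))
```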
Hope this helps someone looking for the right combination of library versions to set up a machine and deep learning environment in Python using the pandas_ml and imbalanced-learn packages.

Related

How to set path to Python executable in JupyterLab somewhere outside Anaconda

I need to be able to run JupyterLab on the Python executable in SPSS in order to import some SPSS libraries (spss, spssaux, SpssClient, etc.) in a notebook in JupyterLab. Based on answers to other Anaconda/executable-related questions, I've attempted two possible solutions:
Paste the relevant libraries from the Python directory in SPSS into the Python directory in Anaconda. I tried this both for the base environment and for the virtual environment I work out of, and tried launching JupyterLab both from the base and from the virtual environment. I copied the files from C:\Program Files\IBM\SPSS\Statistics\27\Python3\Lib\site-packages to C:\ProgramData\Anaconda3\Lib.
Redirect Anaconda to the Python executable in SPSS. I tried to do that with sys.path.insert:
import sys
sys.path.insert(0, r'C:\Program Files\IBM\SPSS\Statistics\27\Python3')  # raw string so backslashes are kept literally
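One Windows-specific pitfall worth ruling out here: in an ordinary (non-raw) string literal, Python treats a backslash followed by digits as an octal escape, so a path component like \27 is silently corrupted. A minimal demonstration:

```python
# In a normal string literal Python reads "\27" as an octal escape,
# so a "...\27\..." directory component silently loses characters.
plain = '\27'        # ONE control character, chr(0o27), not backslash-two-seven
raw = r'\27'         # raw string: three characters, exactly as typed

print(plain == chr(0o27))    # True
print(len(plain), len(raw))  # 1 3
```

For Windows paths it is therefore safer to use raw strings (r'C:\...') or forward slashes, which Windows also accepts.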
Attempt #1 results in the following error when I try to import spss:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11344/2864419669.py in <module>
----> 1 import spss
C:\Program Files\IBM\SPSS\Statistics\27\Python3\Lib\site-packages\spss\__init__.py in <module>
254 __SetErrorMessage()
255
--> 256 from .spss import *
257 from .cursors import *
258 from .pivotTable import *
C:\Program Files\IBM\SPSS\Statistics\27\Python3\Lib\site-packages\spss\spss.py in <module>
21
22 import atexit,os,locale,sys,codecs
---> 23 from . import PyInvokeSpss
24 from .errMsg import *
25
ImportError: DLL load failed while importing PyInvokeSpss: The specified module could not be found.
This in spite of having copied PyInvokeSpss to C:\ProgramData\Anaconda3\Lib.
#2 doesn't seem to change the path to the executable:
C:\ProgramData\Anaconda3\python.exe
I also have an idea for a third solution, which would be to register the Python that ships with SPSS as an IPython kernel and then select it in JupyterLab after launch, but I can't figure out a way of doing that.
What's the best direction here for a solution? What am I doing wrong?

ModuleNotFoundError: No module named 'pycocotools._mask'

I'm trying to train a Mask R-CNN model from cocoapi (https://github.com/cocodataset/cocoapi), and this error keeps coming up.
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-8-83356bb9cf95> in <module>
19 sys.path.append(os.path.join(ROOT_DIR, "samples/coco/")) # To find local version
20
---> 21 from pycocotools.coco import coco
22
23 get_ipython().run_line_magic('matplotlib', 'inline ')
~/Desktop/coco/PythonAPI/pycocotools/coco.py in <module>
53 import copy
54 import itertools
---> 55 from . import mask as maskUtils
56 import os
57 from collections import defaultdict
~/Desktop/coco/PythonAPI/pycocotools/mask.py in <module>
1 __author__ = 'tsungyi'
2
----> 3 import pycocotools._mask as _mask
4
5 # Interface for manipulating masks stored in RLE format.
ModuleNotFoundError: No module named 'pycocotools._mask'
I tried all the methods in the GitHub 'Issues' tab, but none of them work for me. Is there another solution for this? I'm using Python 3.6 on Linux.
This answer is summarised from three GitHub issues.
1. Check whether you have installed Cython for the correct Python version; that is, install it for Python 2 or 3 to match the interpreter you use:
pip install cython
2. Check whether you have downloaded the whole repository from this GitHub project; you need everything in it, even if you only use PythonAPI:
git clone https://github.com/cocodataset/cocoapi.git
or
unzip the downloaded zip file
3. Check whether you open a terminal and run "make" in the correct folder, which is the one the Makefile is located in:
cd path/to/coco/PythonAPI
make
In most cases these three checks solve the problem.
If not, checks 4 and 5 may help.
4. Check whether you have installed gcc in the correct version.
5. Check whether you have installed python-dev for the correct Python version; that is, install python3-dev (you may try "sudo apt-get install python3-dev") if you use Python 3.
Try cloning the official repo and running the commands below:
python setup.py install
make
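After building, you can check whether the compiled extension is actually importable without triggering the full pycocotools import chain; a small diagnostic sketch (the module name pycocotools._mask is the one from the traceback):

```python
import importlib.util

def has_module(name):
    """Return True if `name` resolves to an importable module."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. pycocotools itself) is missing.
        return False

print(has_module("json"))               # stdlib module: True
print(has_module("pycocotools._mask"))  # True only once the extension is built
```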

While attempting to import a module of sklearn in Jupyter Notebook as well as in PyCharm, I continuously get the following error

I have installed scikit-learn using the pip command; however, while trying to import a module of sklearn in Jupyter Notebook as well as in PyCharm, I continuously get the following error. I am working in Python 3.9. I am new to the interface, so please suggest a solution for this issue.
ImportError Traceback (most recent call last)
in
----> 1 import sklearn.linear_model as lm
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__init__.py in
79 # it and importing it first would fail if the OpenMP dll cannot be found.
80 from . import _distributor_init # noqa: F401
---> 81 from . import __check_build # noqa: F401
82 from .base import clone
83 from .utils._show_versions import show_versions
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/__init__.py in
44 from ._check_build import check_build # noqa
45 except ImportError as e:
---> 46 raise_build_error(e)
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/__init__.py in raise_build_error(e)
29 else:
30 dir_content.append(filename + '\n')
---> 31 raise ImportError("""%s
32 ___________________________________________________________________________
33 Contents of %s:
ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/_check_build.cpython-39-darwin.so, 2): Symbol not found: ____chkstk_darwin
Referenced from: /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/../.dylibs/libomp.dylib
Expected in: /usr/lib/libSystem.B.dylib
in /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build/../.dylibs/libomp.dylib
Contents of /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/__check_build:
__init__.py __pycache__ _check_build.cpython-39-darwin.so
setup.py
It seems that scikit-learn has not been built correctly.
If you have installed scikit-learn from source, please do not forget
to build the package before using it: run python setup.py install or
make in the source directory.
If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform
thanks
As you can see in this bug report, the new sklearn version 0.24 crashes on macOS < 10.15 systems.
Until this bug is fixed, the developers suggest installing the previous version using pip install -U scikit-learn==0.23.
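If you need to decide programmatically whether a machine is affected, a rough sketch (assuming, as the bug report describes, that only macOS releases before 10.15 are affected) can compare the version reported by platform.mac_ver():

```python
import platform

def affected_macos(mac_ver=None):
    """Return True if this looks like a macOS release older than 10.15.

    mac_ver: a version string like "10.14.6"; defaults to the running system.
    Non-macOS platforms report an empty string and return False.
    """
    ver = platform.mac_ver()[0] if mac_ver is None else mac_ver
    if not ver:
        return False
    major_minor = tuple(int(p) for p in ver.split(".")[:2])
    return major_minor < (10, 15)

print(affected_macos("10.14.6"))  # True: pin scikit-learn==0.23
print(affected_macos("11.2"))     # False
```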

Python 3.8: Error running different Python environments from base directory's Jupyter Notebook

I've recently had to do a reformat of my work computer, and am reinstalling Anaconda. I generally keep Anaconda's root (base) folder untouched, and create separate environments when I need to work with specialist Python modules instead of cluttering the (base) environment.
In the past, I was able to successfully run these different environments from the Jupyter Notebook installed in the (base) environment. I would go about doing so by installing ipykernel in the new environment (e.g. my-env), and then running the following commands:
(base) activate my-env
(my-env) conda install ipykernel
(my-env) python -m ipykernel install --name "my-env" --display-name "My Python"
This would be done successfully, and give me the following message:
Installed kernelspec my-env in C:\ProgramData\jupyter\kernels\my-env
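When a registered kernel misbehaves, it is worth checking which interpreter the kernelspec actually launches. The kernelspec directory reported above contains a kernel.json whose "argv" list starts with the interpreter path; this sketch reads that field from a made-up sample file, since the real path is machine-specific:

```python
import json
import os
import tempfile

# Hypothetical kernel.json mirroring what `python -m ipykernel install` writes.
sample = {
    "argv": ["C:\\Anaconda3\\envs\\my-env\\python.exe", "-m",
             "ipykernel_launcher", "-f", "{connection_file}"],
    "display_name": "My Python",
    "language": "python",
}

def kernel_interpreter(kernel_json_path):
    """Return the interpreter path a kernelspec launches (argv[0])."""
    with open(kernel_json_path) as f:
        return json.load(f)["argv"][0]

# Demo against a temporary kernel.json with the structure above.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "kernel.json")
    with open(path, "w") as f:
        json.dump(sample, f)
    print(kernel_interpreter(path))
```

If argv[0] points at the wrong environment's python.exe, the kernel will import that environment's packages no matter where Jupyter itself was launched from.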
However, when I tried testing out the link in Jupyter Notebook using a standard import matplotlib.pyplot as plt command, I get the following error message:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-a0d2faabd9e9> in <module>
----> 1 import matplotlib.pyplot as plt
C:\Anaconda3\envs\my-env\lib\site-packages\matplotlib\__init__.py in <module>
105 # cbook must import matplotlib only within function
106 # definitions, so it is safe to import from it here.
--> 107 from . import cbook, rcsetup
108 from matplotlib.cbook import MatplotlibDeprecationWarning, sanitize_sequence
109 from matplotlib.cbook import mplDeprecation # deprecated
C:\Anaconda3\envs\my-env\lib\site-packages\matplotlib\cbook\__init__.py in <module>
26 import weakref
27
---> 28 import numpy as np
29
30 import matplotlib
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\__init__.py in <module>
141 from .core import *
142 from . import compat
--> 143 from . import lib
144 # NOTE: to be revisited following future namespace cleanup.
145 # See gh-14454 and gh-15672 for discussion.
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\lib\__init__.py in <module>
23 # Private submodules
24 from .type_check import *
---> 25 from .index_tricks import *
26 from .function_base import *
27 from .nanfunctions import *
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\lib\index_tricks.py in <module>
9 from numpy.core.numerictypes import find_common_type, issubdtype
10
---> 11 import numpy.matrixlib as matrixlib
12 from .function_base import diff
13 from numpy.core.multiarray import ravel_multi_index, unravel_index
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\matrixlib\__init__.py in <module>
2
3 """
----> 4 from .defmatrix import *
5
6 __all__ = defmatrix.__all__
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\matrixlib\defmatrix.py in <module>
9 # While not in __all__, matrix_power used to be defined here, so we import
10 # it for backward compatibility.
---> 11 from numpy.linalg import matrix_power
12
13
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\linalg\__init__.py in <module>
71 """
72 # To get sub-modules
---> 73 from .linalg import *
74
75 from numpy._pytesttester import PytestTester
C:\Anaconda3\envs\my-env\lib\site-packages\numpy\linalg\linalg.py in <module>
31 from numpy.core import overrides
32 from numpy.lib.twodim_base import triu, eye
---> 33 from numpy.linalg import lapack_lite, _umath_linalg
34
35
ImportError: DLL load failed: The specified module could not be found.
Could someone advise me on what might be the issue? If it helps, my (base) environment has a python version of 3.8.3 and a notebook version of 6.0.3, whereas my new my-env environment has modules downloaded from conda-forge. It has a python version of 3.7.8 and an ipykernel version of 5.3.4.
Thanks in advance!
UPDATE 26 Oct 2020
As requested, I have included the list of modules I have in both the (base) environment and the (my-env) environment. In general, the packages in (base) have been kept updated with respect to the anaconda module, whereas the packages in (my-env) are kept up-to-date with respect to hyperspy, which is stored in the conda-forge repository.
I have created PasteBin entries for them, as they exceed the character limit for this post.
Link to list of modules in (base)
Link to list of modules in (my-env)
I also tried importing modules other than matplotlib and numpy, and was able to import abc and time without issue, for example. This seems to be an issue with the (base) version of Jupyter Notebook not being compatible with the numpy found in the (my-env) environment.
The error message indicates that the problem arises in the numpy library. The fact that your Python and ipykernel versions differ between (base) and (my-env) further suggests an incompatibility between your environments.
Can you provide the output from
conda list
from each environment?
When I tried to create a Python=3.8.3 environment the numpy version installed is numpy-1.19.2-py38hf89b668_1
I used the command
conda create -n foo -c conda-forge python=3.8.3 numpy
When I tried to create a Python=3.7.8 environment the numpy version installed is numpy-1.19.2-py37h7008fea_1
I used the command
conda create -n foo -c conda-forge python=3.7.8 numpy
In addition, why not consider installing ipykernel / Jupyter Notebook libraries that are consistent with the respective version of Python in each environment? That is always the best way to ensure dependencies are correctly aligned.
I also attempted to install ipykernel in both the python=3.8.3 and python=3.7.8 environments without specifying a version number.
Here are the versions of ipykernel that conda automatically chooses:
for python=3.8.3: ipykernel-5.3.4 | py38h1cdfbd6_1
for python=3.7.8: ipykernel-5.3.4 | py37hc6149b9_1
From what you have written, your ipykernel versions are different. I think this discrepancy between ipykernel versions is the most likely source of the problem.
When you check your environments, verify that the channel source for ipykernel is the same.
One Solution: Consider downgrading the ipykernel in (base) to 5.3.4 version.
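Independently of version pinning, a quick diagnostic from inside the notebook confirms which interpreter (and hence which site-packages directory) the kernel is really using:

```python
import sys

# The interpreter the kernel runs; for a correctly linked "my-env" kernel this
# should point into the my-env environment, not into (base).
print(sys.executable)
print(sys.version)
```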

Seaborn ImportError: DLL load failed: The specified procedure could not be found

I am getting the ImportError: DLL load failed: The specified procedure could not be found. when importing the module seaborn.
My code is:
import seaborn as sns
sns.set()
Output:
ImportError Traceback (most recent call last)
<ipython-input-16-6f477838ac7f> in <module>
----> 1 import seaborn
D:\anaconda3\lib\site-packages\seaborn\__init__.py in <module>
1 # Capture the original matplotlib rcParams
----> 2 import matplotlib as mpl
3 _orig_rc_params = mpl.rcParams.copy()
4
5 # Import seaborn objects
D:\anaconda3\lib\site-packages\matplotlib\__init__.py in <module>
205
206
--> 207 _check_versions()
208
209
D:\anaconda3\lib\site-packages\matplotlib\__init__.py in _check_versions()
190 # Quickfix to ensure Microsoft Visual C++ redistributable
191 # DLLs are loaded before importing kiwisolver
--> 192 from . import ft2font
193
194 for modname, minver in [
ImportError: DLL load failed: The specified procedure could not be found.
I have uninstalled and reinstalled seaborn, but the problem is still not solved. What should I do?
I just ran into the same issue with both seaborn and matplotlib. Did you try importing matplotlib on its own? Searching for an answer, I found that it might be related to the current matplotlib version (3.3.1). Here's the StackOverflow post that helped me:
from matplotlib import ft2font: "ImportError: DLL load failed: The specified procedure could not be found."
I deleted/uninstalled matplotlib from my current packages (I work in a virtual env) and reinstalled a previous version (3.0.3).
I was having the same issue importing seaborn in JupyterLab lately.
I uninstalled the seaborn using pip:
pip uninstall seaborn
Then installed it again using pip:
pip install seaborn
This worked in my case.
As a side note, I also updated the whole conda environment using conda update --all, but that might not have been a factor in this case.
