In this project, I am trying to use the pycaret package, together with scikit-learn, to analyze some time series. Specifically, I have imported some modules as follows:
from pycaret.regression import (setup, compare_models, predict_model, plot_model, finalize_model, load_model)
# setting up the stage to initialize the training environment
s = setup(
    data=train,
    target=target_var,
    ignore_features=['Series'],
    numeric_features=involved_numerics,
    categorical_features=categorics,
    silent=True,
    log_experiment=True,
)
# Now, to train machine learning models, we need to compare models and find the best one
best_model = compare_models(sort='MAE')
# Making some plots
for id, name in zip(ids, names):
    plot_model(best_model, plot=id, scale=3, save=True)
.
.
.
The code runs for some, but not all, of the plot types listed in the documentation. For some specific ones (such as Recursive Feat. Selection), I get the following error message:
Traceback (most recent call last):
  File "c:/Users/username/Desktop/project/project.py", line 55, in <module>
    main()
  File "c:/Users/username/Desktop/project/project.py", line 48, in main
    ml_modelling(data, train, test)
  File "c:\Users\username\Desktop\project\utilities.py", line 1070, in ml_modelling
    plot_model(best_model, plot=id, scale=3, save=True)
  File "C:\Users\username\anaconda3\envs\py38\lib\site-packages\pycaret\regression.py", line 1601, in plot_model
    return pycaret.internal.tabular.plot_model(
  File "C:\Users\username\anaconda3\envs\py38\lib\site-packages\pycaret\internal\tabular.py", line 7712, in plot_model
    ret = locals()[plot]()
  File "C:\Users\username\anaconda3\envs\py38\lib\site-packages\pycaret\internal\tabular.py", line 6293, in residuals_interactive
    resplots.write_html(plot_filename)
  File "C:\Users\username\anaconda3\envs\py38\lib\site-packages\pycaret\internal\plots\residual_plots.py", line 673, in write_html
    f.write(html)
  File "C:\Users\username\anaconda3\envs\py38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u25c4' in position 276445: character maps to <undefined>
Here is the training data:
Train
   Series  x  y  z     ID  var1  var2  var3  var4  var5  var6
0       1  2  1  3   True    -3    -4     6     7     4     6
1       2  2  1  7  False    22     0     3     5     2     8
2       3  2  1  0   True     3    -6     3     5     4     4
3       4  2  1  4  False    27    -4     8     3    -3     2
.
.
.
I am using VSCode to run my Python tool on a Windows 10 machine, and here is the list of all packages installed in the conda environment:
name: py38
channels:
- conda-forge
- defaults
dependencies:
- bzip2=1.0.8=h8ffe710_4
- ca-certificates=2022.12.7=h5b45459_0
- et_xmlfile=1.1.0=pyhd8ed1ab_0
- libffi=3.4.2=h8ffe710_5
- libsqlite=3.40.0=hcfcfb64_0
- libzlib=1.2.13=hcfcfb64_4
- openpyxl=3.0.10=py38h91455d4_2
- openssl=3.0.7=hcfcfb64_2
- pip=22.3.1=pyhd8ed1ab_0
- python=3.8.15=h4de0772_1_cpython
- python_abi=3.8=3_cp38
- setuptools=66.1.1=pyhd8ed1ab_0
- tk=8.6.12=h8ffe710_0
- ucrt=10.0.22621.0=h57928b3_0
- vc=14.3=hb6edc58_10
- vs2015_runtime=14.34.31931=h4c5c07a_10
- wheel=0.38.4=pyhd8ed1ab_0
- xz=5.2.6=h8d14728_0
- pip:
- alembic==1.9.2
- asttokens==2.2.1
- attrs==22.2.0
- backcall==0.2.0
- blis==0.7.9
- boruta==0.3
- catalogue==1.0.2
- certifi==2022.12.7
- charset-normalizer==3.0.1
- click==8.1.3
- cloudpickle==2.2.1
- colorama==0.4.6
- colorlover==0.3.0
- comm==0.1.2
- contourpy==1.0.7
- cufflinks==0.17.3
- cycler==0.11.0
- cymem==2.0.7
- cython==0.29.14
- databricks-cli==0.17.4
- debugpy==1.6.6
- decorator==5.1.1
- docker==6.0.1
- entrypoints==0.4
- executing==1.2.0
- flask==2.2.2
- fonttools==4.38.0
- funcy==1.18
- future==0.18.3
- gensim==3.8.3
- gitdb==4.0.10
- gitpython==3.1.30
- greenlet==2.0.2
- htmlmin==0.1.12
- idna==3.4
- imagehash==4.3.1
- imbalanced-learn==0.7.0
- importlib-metadata==5.2.0
- importlib-resources==5.10.2
- ipykernel==6.20.2
- ipython==8.9.0
- ipywidgets==8.0.4
- itsdangerous==2.1.2
- jedi==0.18.2
- jinja2==3.1.2
- joblib==1.2.0
- jupyter-client==8.0.1
- jupyter-core==5.1.5
- jupyterlab-widgets==3.0.5
- kiwisolver==1.4.4
- kmodes==0.12.2
- lightgbm==3.3.5
- llvmlite==0.37.0
- mako==1.2.4
- markdown==3.4.1
- markupsafe==2.1.2
- matplotlib==3.6.3
- matplotlib-inline==0.1.6
- mlflow==2.1.1
- mlxtend==0.19.0
- multimethod==1.9.1
- murmurhash==1.0.9
- nest-asyncio==1.5.6
- networkx==3.0
- nltk==3.8.1
- numba==0.54.1
- numexpr==2.8.4
- numpy==1.20.3
- oauthlib==3.2.2
- packaging==22.0
- pandas==1.5.3
- pandas-profiling==3.6.3
- parso==0.8.3
- patsy==0.5.3
- phik==0.12.3
- pickleshare==0.7.5
- pillow==9.4.0
- plac==1.1.3
- platformdirs==2.6.2
- plotly==5.13.0
- preshed==3.0.8
- prompt-toolkit==3.0.36
- protobuf==4.21.12
- psutil==5.9.4
- pure-eval==0.2.2
- pyarrow==10.0.1
- pycaret==2.3.10
- pydantic==1.10.4
- pygments==2.14.0
- pyjwt==2.6.0
- pyldavis==3.3.1
- pynndescent==0.5.8
- pyod==1.0.7
- pyparsing==3.0.9
- python-dateutil==2.8.2
- pytz==2022.7.1
- pywavelets==1.4.1
- pywin32==305
- pyyaml==5.4.1
- pyzmq==25.0.0
- querystring-parser==1.2.4
- regex==2022.10.31
- requests==2.28.2
- scikit-learn==0.23.2
- scikit-plot==0.3.7
- scipy==1.5.4
- seaborn==0.12.2
- shap==0.41.0
- six==1.16.0
- sklearn==0.0.post1
- slicer==0.0.7
- smart-open==6.3.0
- smmap==5.0.0
- spacy==2.3.9
- sqlalchemy==1.4.46
- sqlparse==0.4.3
- srsly==1.0.6
- stack-data==0.6.2
- statsmodels==0.13.5
- tabulate==0.9.0
- tangled-up-in-unicode==0.2.0
- tenacity==8.1.0
- textblob==0.17.1
- thinc==7.4.6
- threadpoolctl==3.1.0
- tornado==6.2
- tqdm==4.64.1
- traitlets==5.8.1
- typeguard==2.13.3
- typing-extensions==4.4.0
- umap-learn==0.5.3
- urllib3==1.26.14
- visions==0.7.5
- waitress==2.1.2
- wasabi==0.10.1
- wcwidth==0.2.6
- websocket-client==1.5.0
- werkzeug==2.2.2
- widgetsnbextension==4.0.5
- wordcloud==1.8.2.2
- yellowbrick==1.2.1
- zipp==3.12.0
prefix: C:\Users\username\anaconda3\envs\py38
This is probably an issue in the library: the HTML it generates contains a Unicode character ('\u25c4', a left-pointing pointer symbol) that the default Windows encoding (cp1252) cannot represent, and the file is opened without an explicit encoding.
Here is the relevant pycaret source code:
def write_html(self, plot_filename):
    """
    Write the current plots to a file in HTML format.
    Parameters
    ----------
    plot_filename: str
        name of the file
    """
    html = self.get_html()
    with open(plot_filename, "w") as f:
        f.write(html)
As mentioned in this Stack Overflow question, it can be solved by specifying the encoding when opening the file:
with open(plot_filename, "w", encoding='utf-8') as f:
    f.write(html)
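But since you cannot change the library's code, try running the following in the console before running your script, as mentioned in this answer:
chcp 65001
set PYTHONIOENCODING=utf-8
Note that PYTHONIOENCODING only changes the encoding of stdin, stdout and stderr. If the error persists, setting PYTHONUTF8=1 (Python 3.7+) enables UTF-8 mode, in which open() defaults to UTF-8.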
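The failure can be reproduced without pycaret. Here is a minimal sketch, assuming only that the text being written contains the character from the traceback ('\u25c4'). The helper name `can_write` and the sample HTML are made up for illustration. The encoding is passed explicitly, so the behaviour is the same on any platform.

```python
import os
import tempfile

# Sample text containing the character reported in the traceback.
html = "<html><body>\u25c4 previous</body></html>"


def can_write(text, encoding):
    """Try to write text to a temporary file using the given encoding."""
    fd, path = tempfile.mkstemp(suffix=".html")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        return True
    except UnicodeEncodeError:
        # Raised when the encoding has no mapping for a character.
        return False
    finally:
        os.remove(path)


cp1252_ok = can_write(html, "cp1252")
utf8_ok = can_write(html, "utf-8")
print("cp1252:", cp1252_ok)
print("utf-8:", utf8_ok)
```

Writing with cp1252 fails because that code page has no mapping for U+25C4, while UTF-8 can encode any Unicode character.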
I am trying to create a conda environment using the following yml file but I get the following error:
install command: conda env create -f conda_env_torch_zoo.yml
Error:
Solving environment: done
Preparing transaction: done
Verifying transaction: done
Executing transaction: failed
ERROR conda.core.link:_execute(481): An error occurred while installing package 'conda-forge::antlr-python-runtime-4.9.3-pyhd8ed1ab_1'.
FileNotFoundError(2, 'No such file or directory')
Attempting to roll back.
Rolling back transaction: done
FileNotFoundError(2, 'No such file or directory')
I think the problem is caused by the conda-forge entry under channels:. How can I resolve this error?
The yml file:
name: torch_zoo
channels:
- pytorch
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- antlr-python-runtime=4.9.3=pyhd8ed1ab_1
- blas=1.0=mkl
- intel-openmp=2020.2=254
- mkl=2020.2=256
- olefile=0.46=py_0
- pip=20.2.2=py38_0
- pycparser=2.21=pyhd8ed1ab_0
- pysoundfile=0.10.3.post1=pyhd3deb0d_0
- python_abi=3.8=1_cp38
- resampy=0.2.2=py_0
- setuptools=49.6.0=py38_0
- six=1.15.0=py_0
- torchaudio=0.7.2=py38
- torchvision=0.8.2=py38_cu110
- tqdm=4.49.0=py_0
- typing_extensions=4.0.1=pyha770c72_0
- wheel=0.35.1=py_0
- pip:
- opencv-python==4.4.0.44
- yaml==0.2.5=h516909a_0
- libgcc-ng==9.1.0=hdf63c60_0
- scipy==1.5.2=py38h0b6359f_0
- openssl==1.1.1l=h7f8727e_0
- ld_impl_linux-64==2.33.1=h53a641e_7
- libogg==1.3.2=h516909a_1002
- libuv==1.40.0=h7b6447c_0
- python==3.8.5=h7579374_1
- lz4-c==1.9.2=he6710b0_1
- zlib==1.2.11=h7b6447c_3
- numpy-base==1.19.1=py38hfa32c7d_0
- sqlite==3.33.0=h62c20be_0
- libllvm10==10.0.1=he513fc3_3
- libiconv==1.16=h516909a_0
- mkl-service==2.3.0=py38he904b0f_0
- gnutls==3.6.13=h79a8f9a_0
- ffmpeg==4.3.1=h167e202_0
- libedit==3.1.20191231=h14c3975_1
- openh264==2.1.1=h8b12597_0
- llvmlite==0.36.0=py38h612dafd_4
- x264==1!152.20180806=h14c3975_0
- ninja==1.10.1=py38hfd86e86_0
- numpy==1.19.1=py38hbc911f0_0
- libtiff==4.1.0=h2733197_1
- cudatoolkit==11.0.221=h6bb024c_0
- pyyaml==5.3.1=py38h8df0ef7_1
- libvorbis==1.3.7=he1b5a44_0
- mkl_random==1.1.1=py38h0573a6f_0
- numba==0.53.1=py38ha9443f7_0
- omegaconf==2.1.1=py38h578d9bd_1
- ncurses==6.2=he6710b0_1
- gmp==6.2.0=he1b5a44_2
- libpng==1.6.37=hbc83047_0
- pytorch==1.7.1=py3.8_cuda11.0.221_cudnn8.0.5_0
- av==8.0.2=py38he20a9df_1
- tk==8.6.10=hbc83047_0
- libffi==3.3=he6710b0_2
- libgfortran-ng==7.3.0=hdf63c60_0
- jpeg==9b=h024ee3a_2
- pillow==7.2.0=py38hb39fc2d_0
- certifi==2021.10.8=py38h578d9bd_1
- bzip2==1.0.8=h516909a_3
- lame==3.100=h14c3975_1001
- xz==5.2.5=h7b6447c_0
- tbb==2020.2=hc9558a2_0
- libsndfile==1.0.29=he1b5a44_0
- nettle==3.4.1=h1bed415_1002
- ca-certificates==2021.10.26=h06a4308_2
- cffi==1.14.6=py38h400218f_0
- freetype==2.10.2=h5ab3b9f_0
- mkl_fft==1.2.0=py38h23d657b_0
- libflac==1.3.3=he1b5a44_0
- readline==8.0=h7b6447c_0
- zstd==1.4.5=h9ceee32_0
- lcms2==2.11=h396b838_0
- gettext==0.19.8.1=h5e8e0c9_1
- libstdcxx-ng==9.1.0=hdf63c60_0
This code was created by black:
def test_schema_org_script_from_list():
    assert (
        schema_org_script_from_list([1, 2])
        == '<script type="application/ld+json">1</script>\n<script type="application/ld+json">2</script>'
    )
But now flake8 complains:
tests/test_utils.py:59:9: W503 line break before binary operator
tests/test_utils.py:59:101: E501 line too long (105 > 100 characters)
How can I format the above lines to make flake8 happy?
I use this .pre-commit-config.yaml:
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  - repo: 'https://github.com/pre-commit/pre-commit-hooks'
    rev: v3.2.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
  - repo: 'https://gitlab.com/pycqa/flake8'
    rev: 3.8.4
    hooks:
      - id: flake8
  - repo: 'https://github.com/pre-commit/mirrors-isort'
    rev: v5.7.0
    hooks:
      - id: isort
tox.ini:
[flake8]
max-line-length = 100
exclude = .git,*/migrations/*,node_modules,migrate
# W504 line break after binary operator
ignore = W504
(I think it is a bit strange that flake8 reads config from a file which belongs to a different tool).
from your configuration, you've set ignore = W504
ignore isn't the option you want: it replaces the default ignore list, which re-enables a bunch of checks that are off by default, including W503.
If you remove ignore=, both W504 and W503 are in the default ignore list, so neither will be reported
as for your E501 (line too long), you can either set extend-ignore = E501 or set max-line-length appropriately
for black, this is the suggested configuration:
[flake8]
max-line-length = 88
extend-ignore = E203
note that there are cases where black cannot make a line short enough (as you're seeing) -- both from long strings and from long variable names
disclaimer: I'm the current flake8 maintainer
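If you would rather keep the line-length check than relax it, one alternative is to build the expected string from shorter pieces. This is a minimal sketch, not the only way to do it. The body of `schema_org_script_from_list` below is a hypothetical stand-in, because the real implementation isn't shown in the question.

```python
def schema_org_script_from_list(items):
    # Hypothetical stand-in for the function under test; the real
    # implementation is not shown in the question.
    template = '<script type="application/ld+json">{}</script>'
    return "\n".join(template.format(item) for item in items)


def test_schema_org_script_from_list():
    # Adjacent string literals are joined by Python at compile time, so
    # this is the same value as the single long literal.
    expected = (
        '<script type="application/ld+json">1</script>\n'
        '<script type="application/ld+json">2</script>'
    )
    assert schema_org_script_from_list([1, 2]) == expected


test_schema_org_script_from_list()
```

With the expected value split across two literals, no line exceeds the limit and no line starts with a binary operator, so neither E501 nor W503 applies to this test.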
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- statsmodels==0.8.0=np113py36_0
- jpeg==9b=vc14_0
- numpy==1.13.1=py36_0
- pandas-datareader==0.4.0=py36_0
- pyqt==5.6.0=py36_2
- libpng==1.6.27=vc14_0
- mkl==2017.0.3=0
- icu==57.1=vc14_0
- requests-file==1.4.1=py36_0
- pytz==2017.2=py36_0
- matplotlib==2.0.2=np113py36_0
- pip==9.0.1=py36_1
- requests-ftp==0.3.1=py36_0
- patsy==0.4.1=py36_0
- qt==5.6.2=vc14_6
- python==3.6.2=0
- cycler==0.10.0=py36_0
- openssl==1.0.2l=vc14_0
- sip==4.18=py36_0
- setuptools==27.2.0=py36_1
- wheel==0.29.0=py36_0
- requests==2.14.2=py36_0
- scipy==0.19.1=np113py36_0
- zlib==1.2.8=vc14_3
- tk==8.5.18=vc14_0
- vs2015_runtime==14.0.25420=0
- python-dateutil==2.6.1=py36_0
- six==1.10.0=py36_0
- pyparsing==2.1.4=py36_0
- pandas==0.20.3=py36_0
I'm a newbie trying to explore OpenCV. I need to create a virtual environment in Anaconda with the dependencies from a .yml file, but the numpy dependency seems to cause trouble while setting up the environment.
The environment is being created from the .yml file, on the Desktop (which is writable).
How do I overcome this error?
My .yml file:
name: python-cvcourse
channels:
- michael_wild
- defaults
dependencies:
- absl-py=0.4.1=py36_0
- appdirs=1.4.3=py36h28b3542_0
- asn1crypto=0.24.0=py36_0
- astor=0.7.1=py36_0
- attrs=18.2.0=py36h28b3542_0
- automat=0.7.0=py36_0
- backcall=0.1.0=py36_0
- blas=1.0=mkl
- bleach=2.1.4=py36_0
- ca-certificates=2018.03.07=0
- certifi=2018.10.15=py36_0
- cffi=1.11.5=py36h74b6da3_1
- colorama=0.3.9=py36h029ae33_0
- constantly=15.1.0=py36h28b3542_0
- cryptography=2.3.1=py36h74b6da3_0
- cudatoolkit=9.0=1
- cudnn=7.1.4=cuda9.0_0
- cycler=0.10.0=py36h009560c_0
- decorator=4.3.0=py36_0
- entrypoints=0.2.3=py36_2
- freetype=2.9.1=ha9979f8_1
- gast=0.2.0=py36_0
- grpcio=1.12.1=py36h1a1b453_0
- h5py=2.8.0=py36hf7173ca_2
- hdf5=1.8.20=hac2f561_1
- html5lib=1.0.1=py36_0
- hyperlink=18.0.0=py36_0
- icc_rt=2017.0.4=h97af966_0
- icu=58.2=ha66f8fd_1
- idna=2.7=py36_0
- incremental=17.5.0=py36_0
- intel-openmp=2019.0=118
- ipykernel=4.9.0=py36_0
- ipython=6.5.0=py36_0
- ipython_genutils=0.2.0=py36h3c5d0ee_0
- ipywidgets=7.4.1=py36_0
- jedi=0.12.1=py36_0
- jinja2=2.10=py36_0
- jpeg=9b=hb83a4c4_2
- jsonschema=2.6.0=py36h7636477_0
- jupyter=1.0.0=py36_6
- jupyter_client=5.2.3=py36_0
- jupyter_console=5.2.0=py36_1
- jupyter_core=4.4.0=py36_0
- jupyterlab=0.34.9=py36_0
- jupyterlab_launcher=0.13.1=py36_0
- keras=2.2.2=0
- keras-applications=1.0.4=py36_1
- keras-base=2.2.2=py36_0
- keras-preprocessing=1.0.2=py36_1
- kiwisolver=1.0.1=py36h6538335_0
- libopencv=3.4.2=h20b85fd_0
- libpng=1.6.34=h79bbb47_0
- libprotobuf=3.6.0=h1a1b453_0
- libsodium=1.0.16=h9d3ae62_0
- libtiff=4.0.9=h36446d0_2
- m2w64-gcc-libgfortran=5.3.0=6
- m2w64-gcc-libs=5.3.0=7
- m2w64-gcc-libs-core=5.3.0=7
- m2w64-gmp=6.1.0=2
- m2w64-libwinpthread-git=5.0.0.4634.697f757=2
- markdown=2.6.11=py36_0
- markupsafe=1.0=py36hfa6e2cd_1
- matplotlib=2.2.3=py36hd159220_0
- mistune=0.8.3=py36hfa6e2cd_1
- mkl=2019.0=118
- mkl_fft=1.0.4=py36h1e22a9b_1
- mkl_random=1.0.1=py36h77b88f5_1
- msys2-conda-epoch=20160418=1
- nbconvert=5.3.1=py36_0
- nbformat=4.4.0=py36h3a5bc1b_0
- notebook=5.6.0=py36_0
- numpy=1.15.1=py36ha559c80_0
- numpy-base=1.15.1=py36h8128ebf_0
- olefile=0.46=py36_0
- opencv=3.4.2=py36h40b0b35_0
- openssl=1.0.2p=hfa6e2cd_0
- pandoc=2.2.3.2=0
- pandocfilters=1.4.2=py36_1
- parso=0.3.1=py36_0
- pickleshare=0.7.4=py36h9de030f_0
- pillow=5.2.0=py36h08bbbbd_0
- pip=10.0.1=py36_0
- prometheus_client=0.3.1=py36h28b3542_0
- prompt_toolkit=1.0.15=py36h60b8f86_0
- protobuf=3.6.0=py36he025d50_0
- py-opencv=3.4.2=py36hc319ecb_0
- pyasn1=0.4.4=py36h28b3542_0
- pyasn1-modules=0.2.2=py36_0
- pycparser=2.18=py36_1
- pygments=2.2.0=py36hb010967_0
- pyopenssl=18.0.0=py36_0
- pyparsing=2.2.0=py36_1
- pyqt=5.9.2=py36ha878b3d_0
- python=3.6.6=hea74fb7_0
- python-dateutil=2.7.3=py36_0
- pytz=2018.5=py36_0
- pywin32=223=py36hfa6e2cd_1
- pywinpty=0.5.4=py36_0
- pyyaml=3.13=py36hfa6e2cd_0
- pyzmq=17.1.2=py36hfa6e2cd_0
- qt=5.9.6=vc14h62aca36_0
- qtconsole=4.4.1=py36_0
- scikit-learn=0.19.1=py36hae9bb9f_0
- scipy=1.1.0=py36h4f6bf74_1
- send2trash=1.5.0=py36_0
- service_identity=17.0.0=py36h28b3542_0
- setuptools=40.2.0=py36_0
- simplegeneric=0.8.1=py36_2
- sip=4.19.8
- six=1.11.0=py36_1
- sqlite=3.24.0=h7602738_0
- tensorflow=1.10.0
- termcolor=1.1.0=py36_1
- terminado=0.8.1=py36_1
- testpath=0.3.1=py36h2698cfe_0
- tk=8.6.8=hfa6e2cd_0
- tornado=5.1=py36hfa6e2cd_0
- traitlets=4.3.2=py36h096827d_0
- twisted=18.7.0=py36hfa6e2cd_1
- vc=14=h0510ff6_3
- vs2015_runtime=14.0.25123=3
- wcwidth=0.1.7=py36h3d5aa90_0
- webencodings=0.5.1=py36_1
- werkzeug=0.14.1=py36_0
- wheel=0.31.1=py36_0
- widgetsnbextension=3.4.1=py36_0
- wincertstore=0.2=py36h7fe50ca_0
- winpty=0.4.3=4
- yaml=0.1.7=hc54c509_2
- zeromq=4.2.5=he025d50_1
- zlib=1.2.11=h8395fce_2
- zope=1.0=py36_1
- zope.interface=4.5.0=py36hfa6e2cd_0
- opencv-contrib=3.3.1=py36_1
prefix: C:\Users\Marcial\Anaconda3\envs\cvcourse_windows
The error is:
(base) C:\Users\Jaysurya\Desktop>conda env create -f cvcourse_windows.yml
Collecting package metadata: done
Solving environment: failed
UnsatisfiableError: The following specifications were found to be in conflict:
- numpy==1.15.1=py36ha559c80_0
Use "conda search <package> --info" to see the dependencies for each package.
Anaconda Prompt screenshot attached.
With the following commands you can create an environment with opencv:
conda create -n py3_opencv python=3.6
source activate py3_opencv
pip install opencv-python==3.4
conda install ... (if you need other dependencies)
(You can use conda install if you like, but I personally prefer pip's opencv version)
Use
pip install opencv-contrib-python==3.4
if you need a contrib version.
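After installing, it can be worth checking that the packages are actually visible from the new environment. Here is a minimal sketch using only the standard library. The helper names `is_installed` and `report` are made up for illustration, and `cv2` is the module name that the opencv-python packages provide.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report(module_names):
    """Map each module name to whether it is available in this environment."""
    return {name: is_installed(name) for name in module_names}


status = report(["cv2", "numpy"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

Run it with the environment activated. If a module is reported as missing, the script is most likely being run by a different Python interpreter than the one the packages were installed into.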
Hope this helps.