Using Geopandas and the county-choropleth module - python-3.x

I am unable to get figure_factory to recognize the _county_choropleth module, which contains create_choropleth (on line 512, I believe).
I am just using a basic example from the plotly website:
https://plot.ly/python/county-choropleth/
Edit: I've tried to implement suggestions from a previous question by importing as:
from plotly.figure_factory._county_choropleth import create_choropleth
and then:
fig = create_choropleth(fips=fips, values=values)
py.plot(fig, filename='basic-choropleth')
py.iplot(fig, filename='choropleth of some cali counties - full usa scope')
But I receive the following error:
File "C:\ProgramData\Miniconda3\lib\site-packages\fiona\__init__.py", line 162, in open
raise IOError("no such file or directory: %r" % path)
OSError: no such file or directory: 'C:\ProgramData\Miniconda3\lib\site-packages\plotly\package_data\gz_2010_us_050_00_500k.shp'

So what I did was copy the files from C:\ProgramData\Miniconda3\pkgs\plotly-3.1.1-py36h28b3542_0\Lib\site-packages\plotly
to:
C:\ProgramData\Miniconda3\Lib\site-packages\plotly
Then I ran the code:
import plotly.plotly as py
from plotly.figure_factory._county_choropleth import create_choropleth
py.sign_in('chessybo', 'XXXXXXXXXXX')
fips = ['06021', '06023', '06027',
'06029', '06033', '06059',
'06047', '06049', '06051',
'06055', '06061']
values = range(len(fips))
#fig = ff.create_choropleth(fips=fips, values=values)
fig = create_choropleth(fips=fips, values=values)
#py.plotly(fig, filename='basic-choropleth')
py.plot(fig, filename='choropleth of some cali counties - full usa scope')
and it worked.
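The OSError above names a shapefile that fiona could not find inside plotly's package_data folder, which is why copying the package files fixed it. Below is a minimal sketch for checking which files are missing before calling create_choropleth. The helper name missing_files is hypothetical, and the directory is copied from the error message, so adjust it to your own installation:

```python
from pathlib import Path


def missing_files(directory, filenames):
    """Return the names from `filenames` that do not exist in `directory`."""
    directory = Path(directory)
    return [name for name in filenames if not (directory / name).is_file()]


if __name__ == "__main__":
    # Directory and file name taken from the error message in the question.
    package_data = Path(
        r"C:\ProgramData\Miniconda3\lib\site-packages\plotly\package_data"
    )
    print(missing_files(package_data, ["gz_2010_us_050_00_500k.shp"]))
```

An empty list means the shapefile is where fiona expects it; a non-empty list names what still has to be copied.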

What do you get after executing this code?
# import necessary libraries
import geopandas
import shapely
import shapefile
import plotly
from plotly.figure_factory._county_choropleth import create_choropleth
# Check your library versions
print(plotly.__version__, geopandas.__version__, shapely.__version__, shapefile.__version__)
# Data
fips = ['06021','06023','06027',
'06029','06033','06059',
'06047','06049','06051',
'06055','06061']
values = range(len(fips))
# Create fig
fig = create_choropleth(fips=fips, values=values)
# Plot in offline mode and save plot in your Python script folder
plotly.offline.plot(fig, filename='choropleth_usa.html')
In my case, the script returned the following:

Related

OSError: Unable to open file (file signature not found)

I am currently doing an assignment on deep learning by downloading the assignment files from github.
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset
%matplotlib inline
You are given a dataset ("data.h5") containing:
- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).
# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
I ran the setup.sh file too but the error doesn't seem to go away.
lr_utils.py file:
import numpy as np
import h5py
def load_dataset():
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # your train set labels
    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # your test set labels
    classes = np.array(test_dataset["list_classes"][:])  # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
Kindly help!
I solved the issue by downloading uncorrupted .h5 files and putting them in the folder datasets/ in the same directory.
The files you downloaded are corrupted. You can visit https://github.com/abdur75648/Deep-Learning-Specialization-Coursera to download the uncorrupted files.
You can download uncorrupted files from here:
https://www.kaggle.com/datasets/muhammeddalkran/catvnoncat
and use them to replace the corrupted files in the same directory.
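The "file signature not found" message means h5py did not find the HDF5 signature in the file. Here is a sketch of a quick check to run on a download before loading it. HDF5 files normally begin with the 8 bytes \x89HDF\r\n\x1a\n; the format also allows the signature at later offsets after a user block, which this sketch ignores. The function names are hypothetical, the path comes from lr_utils.py above, and the idea that a bad download may be an HTML page rather than the raw file is an assumption:

```python
import os

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(header: bytes) -> bool:
    """True if the given leading bytes start with the HDF5 signature."""
    return header.startswith(HDF5_SIGNATURE)


def file_looks_like_hdf5(path: str) -> bool:
    """Read the first 8 bytes of `path` and test them for the signature."""
    with open(path, "rb") as f:
        return looks_like_hdf5(f.read(len(HDF5_SIGNATURE)))


if __name__ == "__main__":
    path = "datasets/train_catvnoncat.h5"  # path used in lr_utils.py
    if os.path.exists(path):
        print(path, "looks like HDF5:", file_looks_like_hdf5(path))
    else:
        print(path, "does not exist")
```

If the check fails, opening the file in a text editor usually shows what was actually downloaded.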

How to save "IPython.core.display.SVG" as PNG file?

I am trying to save a variable of type "IPython.core.display.SVG" as a PNG file in a Jupyter Notebook environment.
First I tried:
with open('./file.png','wb+') as outfile:
    outfile.write(my_svg.data)
And I got the error:
TypeError: a bytes-like object is required, not 'str'
Next, I tried:
with open('./file.png','wb+') as outfile:
    outfile.write(my_svg.data.encode('utf-8'))
But I cannot open "file.png". The operating system gives the error:
The file “file.png” could not be opened. It may be damaged or use a file format that Preview doesn’t recognize.
I can save "my_svg" with "svg" extension as below:
with open('./file.svg','wb+') as outfile:
    outfile.write(my_svg.data.encode('utf-8'))
But when I try to convert "file.svg" into "file.png" with:
import cairosvg
cairosvg.svg2png(url="./file.svg", write_to="./file.png")
I get the error:
ValueError: unknown locale: UTF-8
This is how I get "IPython.core.display.SVG" data-type in Jupyter Notebook:
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG
smile_1 = 'C(C(N)=O)c(c)c'
smile_2 = 'o(cn)c(c)c'
m1 = Chem.MolFromSmiles(smile_1,sanitize=False)
Chem.SanitizeMol(m1, sanitizeOps=(Chem.SanitizeFlags.SANITIZE_ALL^Chem.SanitizeFlags.SANITIZE_KEKULIZE^Chem.SanitizeFlags.SANITIZE_SETAROMATICITY))
m2 = Chem.MolFromSmiles(smile_2,sanitize=False)
Chem.SanitizeMol(m2, sanitizeOps=(Chem.SanitizeFlags.SANITIZE_ALL^Chem.SanitizeFlags.SANITIZE_KEKULIZE^Chem.SanitizeFlags.SANITIZE_SETAROMATICITY))
mols = [m1, m2]
legends = ["smile_1", "smile_2"]
molsPerRow=2
subImgSize=(200, 200)
nRows = len(mols) // molsPerRow
if len(mols) % molsPerRow:
    nRows += 1
fullSize = (molsPerRow * subImgSize[0], nRows * subImgSize[1])
d2d = rdMolDraw2D.MolDraw2DSVG(fullSize[0], fullSize[1], subImgSize[0], subImgSize[1])
d2d.drawOptions().prepareMolsBeforeDrawing=False
d2d.DrawMolecules(list(mols), legends=legends)
d2d.FinishDrawing()
SVG(d2d.GetDrawingText())
Environment:
macOS 11.2.3
python 3.6
RDKit version 2020.09.1
Any help is greatly appreciated.
Instead of creating an SVG with rdkit and trying to convert it to a PNG, why not just create a PNG directly?
from rdkit.Chem import Draw
from rdkit import Chem
# create rdkit mol
smile = 'CCCC'
mol = Chem.MolFromSmiles(smile)
# create png
d2d = Draw.MolDraw2DCairo(200, 200)
d2d.DrawMolecule(mol)
d2d.FinishDrawing()
png_data = d2d.GetDrawingText()
# save png to file
with open('mol_image.png', 'wb') as png_file:
    png_file.write(png_data)
I am not sure why MolDraw2DCairo is not working for you but using the package you mention (cairosvg) you could extend your code sample quite easily:
# extra imports
import cairosvg
import tempfile
# replace molecule drawing part
d2d = rdMolDraw2D.MolDraw2DSVG(fullSize[0], fullSize[1], subImgSize[0], subImgSize[1])
d2d.drawOptions().prepareMolsBeforeDrawing=False
d2d.DrawMolecules(list(mols), legends=legends)
d2d.FinishDrawing()
svg_text = d2d.GetDrawingText()
# save to png file
with tempfile.NamedTemporaryFile(delete=True) as tmp:
    tmp.write(svg_text.encode())
    tmp.flush()
    cairosvg.svg2png(url=tmp.name, write_to="./mol_img.png")
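As for why the second attempt in the question produced an unreadable file.png: writing the SVG text into a file with a .png extension does not convert it, so the content is still SVG. Here is a small sketch that tells the two apart by their leading bytes. The function name sniff_image_format is hypothetical, and the SVG check is a rough heuristic, not a parser:

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def sniff_image_format(data: bytes) -> str:
    """Guess 'png', 'svg', or 'unknown' from the first bytes of `data`."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    # SVG is XML text, so look for an XML declaration or an <svg> tag.
    head = data[:1024].lstrip().lower()
    if head.startswith(b"<?xml") or head.startswith(b"<svg"):
        return "svg"
    return "unknown"


if __name__ == "__main__":
    svg_text = "<?xml version='1.0'?><svg xmlns='http://www.w3.org/2000/svg'/>"
    print(sniff_image_format(svg_text.encode("utf-8")))
```

Running the same check on the bytes returned by MolDraw2DCairo's GetDrawingText() should give "png", since that drawer produces PNG data directly.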

Trying to extract geographic coordinates from a .pdf file with python3

I am trying to extract geographic coordinates in UTM format from a .pdf file with python3 on the Ubuntu operating system, with the following code:
from pathlib import Path
import textract
import numpy as np
import re
import os
import pdfminer
def main(_file):
    try:
        text = textract.process(_file, method="pdfminer")
    except textract.exceptions.ShellError as ex:
        print(ex)
        return
    # write the results to <pdf name>.csv in the current directory
    with open("%s.csv" % Path(_file).stem, "w+") as out_file:
        # find the coordinates
        coords = re.compile(r"\d?\.?\d+\.+\d+\,\d{2}")
        results = re.findall(coords, text.decode())
        if results:
            out_file.write("|".join(results))
if __name__ == "__main__":
    _file = "/home/cristian33/python_proj/folder1/buscarco.pdf"
    main(_file)
When I run it, I get the following error:
The command pdf2txt.py /home/cristian33/python_proj/folder1/buscarco.pdf failed because the executable
pdf2txt.py is not installed on your system. Please make
sure the appropriate dependencies are installed before using
textract:
http://textract.readthedocs.org/en/latest/installation.html
Does anybody know why this error occurs?
Thanks
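The message says that textract tried to run an executable called pdf2txt.py and could not find it. Below is a minimal sketch for checking whether that script is visible on the PATH of the interpreter running your code. find_executable is a hypothetical helper name, and which package provides pdf2txt.py on your system is left to the textract installation page linked in the error:

```python
import shutil


def find_executable(name):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_executable("pdf2txt.py")
    if location is None:
        print("pdf2txt.py is not on PATH for this interpreter")
    else:
        print("pdf2txt.py found at", location)
```

If the script is installed inside a virtual environment, running the check from that same environment matters, because PATH differs between shells and IDE run configurations.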

module 'seaborn' has no attribute 'distplot'

I've some code like:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
data = pd.read_csv('StudentsPerformance.csv')
# print(data.isnull().sum())  # check whether there are missing values
# print(data.dtypes)  # check the datatypes of the dataset
# Analyze the values of the columns
"""print(data['gender'].value_counts())
print(data['parental level of education'].value_counts())
print(data['race/ethnicity'].value_counts())
print(data['lunch'].value_counts())
print(data['test preparation course'].value_counts())"""
# Adding column total and average to the dataset
data['total'] = data['math score'] + data['reading score'] + data['writing score']
data['average'] = data['total'] / 3
sns.distplot(data['average'])
I would like to see a distplot of the average for visualization, but when I run the program it gives me this error:
Traceback (most recent call last):
  File "C:/Users/usersample/PycharmProjects/untitled1/sample.py", line 22, in <module>
    sns.distplot(data['average'])
AttributeError: module 'seaborn' has no attribute 'distplot'
I've tried reinstalling seaborn and upgrading it to 0.9.0, but it doesn't work.
Head of my data:
female,"group B","bachelor's degree","standard","none","72","72","74"
female,"group C","some college","standard","completed","69","90","88"
female,"group B","master's degree","standard","none","90","95","93"
male,"group A","associate's degree","free/reduced","none","47","57","44"
This might be due to paths having been removed from your environment variables. Try adding your IDE's scripts folder and your Python folder to the PATH. I am using the PyCharm IDE, did the same, and it's working fine.
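Before changing environment variables, it may help to confirm which seaborn installation the interpreter actually imports and whether it exposes distplot. Below is a minimal sketch using only the standard library. describe_module is a hypothetical helper, and it is demonstrated on json so that it runs even where seaborn is not installed:

```python
import importlib


def describe_module(module_name, attribute):
    """Report where a module was loaded from and whether it has `attribute`."""
    module = importlib.import_module(module_name)
    return {
        "file": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
        "has_attribute": hasattr(module, attribute),
    }


if __name__ == "__main__":
    # In the question's environment: describe_module("seaborn", "distplot")
    print(describe_module("json", "dumps"))
```

A "file" entry pointing somewhere other than site-packages, for example at a local script named seaborn.py, would explain the missing attribute. That cause is an assumption, not something the post confirms.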

Why is my image_path undefined when using export_graphviz? - Python 3

I'm trying to run this machine learning tree algorithm code in IPython:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
iris = load_iris()
X = iris.data[:, 2:] # petal length and width
y = iris.target
tree_clf = DecisionTreeClassifier(max_depth=2)
tree_clf.fit(X, y)
from sklearn.tree import export_graphviz
export_graphviz(tree_clf, out_file=image_path("iris_tree.dot"),
                feature_names=iris.feature_names[2:],
                class_names=iris.target_names,
                rounded=True,
                filled=True
                )
But when I run it in IPython, I get an error saying that image_path is not defined.
I'm unfamiliar with export_graphviz; does anyone have any idea how to correct this?
I guess you are following the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow" by Aurélien Géron. I encountered the same problem while trying out the "Decision Trees" chapter. You can always refer to his GitHub notebooks. For your code, you may refer to the "decision tree" notebook.
Below I paste the code from the notebook. Please do go ahead and have a look at the notebook as well.
# To support both python 2 and python 3
from __future__ import division, print_function, unicode_literals
# Common imports
import numpy as np
import os
# to make this notebook's output stable across runs
np.random.seed(42)
# To plot pretty figures
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12
# Where to save the figures
PROJECT_ROOT_DIR = "."
CHAPTER_ID = "decision_trees"
def image_path(fig_id):
    return os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID, fig_id)
def save_fig(fig_id, tight_layout=True):
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(image_path(fig_id) + ".png", format='png', dpi=300)
To get rid of all the mess, simply remove image_path so that the argument becomes out_file="iris_tree.dot". After running that command, a file named iris_tree.dot will be saved in your folder. Open that file in Microsoft Word (or any text editor) and copy all of its content. Now open your browser, search for "webgraphviz", and click on the first link. Delete whatever is in the text area, paste the content you copied from iris_tree.dot, and click "Generate Graph". Scroll down and your graph is ready.
You may already have found what you were looking for, but in case you haven't, all you need to do is replace:
out_file=image_path("iris_tree.dot")
with:
out_file="iris_tree.dot"
This will create the .dot file in the current working directory, which is usually the directory your script is in.
You can also give the absolute path to where you want to save the .dot file as:
out_file="/home/cipher/iris_tree.dot"
You must replace
out_file=image_path("iris_tree.dot"),
in the code with a line like the one below:
out_file="C:/Users/VIDA/Desktop/python/iris_tree.dot",
If you are using sklearn version 0.20, you can render the graph directly instead of using webgraphviz:
import graphviz
with open("iris_tree.dot") as f:
    dot_graph = f.read()
display(graphviz.Source(dot_graph))
With sklearn 0.22 this has to change again; see the sklearn user guide.
I have sklearn version 0.20.1, and I got the example to work with the lines below.
export_graphviz(
    tree_clf,
    out_file="iris_tree.dot",
    feature_names=iris.feature_names[2:])
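If you would rather keep the book's image_path helper than hard-code a file name, note that the target folder has to exist before export_graphviz can write into it. Here is a sketch of a variant that creates the folder first. The root and chapter parameters are additions for illustration and are not part of the notebook's version:

```python
import os
import tempfile


def image_path(fig_id, root=".", chapter="decision_trees"):
    """Return root/images/chapter/fig_id, creating the folder if needed."""
    directory = os.path.join(root, "images", chapter)
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, fig_id)


if __name__ == "__main__":
    # Demonstrated in a temporary directory to avoid cluttering the project.
    with tempfile.TemporaryDirectory() as tmp:
        path = image_path("iris_tree.dot", root=tmp)
        print(path)
        print(os.path.isdir(os.path.dirname(path)))
```

With this defined before the export_graphviz call, out_file=image_path("iris_tree.dot") can stay exactly as written in the question.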

Resources