Creating PDF using pydot - python-3.5

I got the below code from Visualizing a Decision Tree - Machine Learning
import numpy as np
from sklearn.datasets import load_iris
from sklearn import tree
iris = load_iris()
test_idx = [0, 50 , 100]
train_target = np.delete(iris.target, test_idx)
train_data = np.delete(iris.data, test_idx , axis=0)
test_target = iris.target[test_idx]
test_data = iris.data[test_idx]
clf = tree.DecisionTreeClassifier()
clf.fit(train_data, train_target)
print(test_target)
print(clf.predict(test_data))
#viz_code
from sklearn.externals.six import StringIO
import pydot
dot_data = StringIO()
tree.export_graphviz(clf,
                     out_file=dot_data,
                     feature_names=iris.feature_names,
                     class_names=iris.target_names,
                     filled=True, rounded=True,
                     impurity=False)
graph = pydot.graph_from_dot_data(dot_data.getvalue())
graph.write_pdf("iris.pdf")
I tried to run it with Python 3.5, but I get an error saying that graph is a list.
Traceback (most recent call last):
File "Iris.py", line 31, in <module>
graph.write_pdf("iris.pdf")
AttributeError: 'list' object has no attribute 'write_pdf'
Press any key to continue . . .
How come graph here is a list?

I think this is a duplicate; the same question has already been answered (link).
Because pydot.graph_from_dot_data returns a list, the solution is:
graph = pydot.graph_from_dot_data(dot_data.getvalue())
graph[0].write_pdf("iris.pdf")
This solved the problem for me with Python 3.6.5 :: Anaconda, Inc.
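If the same script has to run against pydot versions that differ in what graph_from_dot_data returns, a small wrapper can hide the difference. This is only a sketch: first_graph is a hypothetical helper name, not part of pydot, and the assumption that some older releases return a single graph object rather than a list is not confirmed by this thread.

```python
def first_graph(parsed):
    """Return one graph object from whatever graph_from_dot_data returned.

    `parsed` may be a list of graphs or a single graph object.
    first_graph is a hypothetical helper, not a pydot function.
    """
    if isinstance(parsed, (list, tuple)):
        if not parsed:
            raise ValueError("no graph was parsed from the DOT data")
        return parsed[0]  # the DOT data here describes a single tree
    return parsed
```

With it, the last two lines of the question would become graph = first_graph(pydot.graph_from_dot_data(dot_data.getvalue())) followed by graph.write_pdf("iris.pdf").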

Older releases of pydot do not work in Python 3.
You can use pydotplus instead of pydot for Python 3 (see "graph.write_pdf("iris.pdf") AttributeError: 'list' object has no attribute 'write_pdf'").
However, the code shown on YouTube was written for Python 2, so the simplest way to run it unchanged is with Python 2.

Related

Keras model.fit in Azure ML fails [duplicate]

How to troubleshoot this? I've tried setting dtype=None in the image.img_to_array method.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt
from keras.preprocessing import image
image_size = (180, 180)
batch_size = 32
model = keras.models.load_model('best_model.h5')
img = keras.preprocessing.image.load_img(
"GarnetCreek_7-15-2019.jpeg", target_size=image_size
)
img_array = image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create batch axis
predictions = model.predict(img_array)
score = predictions[0]
This raises the following error:
Traceback (most recent call last):
img_array = image.img_to_array(img, dtype=None)
return image.img_to_array(img, data_format=data_format, **kwargs)
x = np.asarray(img, dtype=dtype)
return array(a, dtype, copy=False, order=order)
TypeError: __array__() takes 1 positional argument but 2 were given
Has anyone seen this before? Many thanks!
This error is sometimes caused by a bug in Pillow 8.3.0, as it is here. (You may not import PIL directly in your code, but some libraries, such as tf.keras.preprocessing.image.load_img, use PIL internally.)
So, downgrading Pillow from 8.3.0 to 8.2.0 may work.
Check PIL version:
import PIL
print(PIL.__version__)
If it is 8.3.0, then you may downgrade to 8.2.0:
!pip install pillow==8.2.0
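The version check can also be done without importing PIL, which may be useful when PIL is only pulled in indirectly. The sketch below uses only the standard library (importlib.metadata, Python 3.8+); the helper names are made up for illustration, and it flags only the exact release named in this answer.

```python
from importlib import metadata

AFFECTED_VERSION = "8.3.0"  # the release named in the answer above


def installed_pillow_version():
    """Return the installed Pillow version string, or None if Pillow is absent."""
    try:
        return metadata.version("Pillow")
    except metadata.PackageNotFoundError:
        return None


def is_affected(version):
    """True only for the exact release the answer reports as buggy."""
    return version == AFFECTED_VERSION


version = installed_pillow_version()
if version is None:
    print("Pillow is not installed")
elif is_affected(version):
    print("Pillow", version, "is installed; consider pinning pillow==8.2.0")
else:
    print("Pillow", version, "is installed; it is not the release named above")
```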

Google Colab - AttributeError: 'numpy.ndarray' object has no attribute 'seek' and 'read'

I'm trying to concatenate 3 images horizontally. I've found a way that seems good enough on the internet, but I'm getting two errors that I can't figure out how to fix.
Note: I'm using Google Colab with Python 3.
Here are the errors:
AttributeError: 'numpy.ndarray' object has no attribute 'seek'
AttributeError: 'numpy.ndarray' object has no attribute 'read'
My code:
import matplotlib.pyplot as plt
import numpy as np
import sys
import PIL
import cv2
rgb = cv2.imread("/content/teste/LC08_L1TP_222063_20180702_20180716_01_T1_1_3.tif")
tir = cv2.imread("/content/teste/LC08_L1TP_222063_20180702_20180716_01_T1_TIR_1_3.tif")
qb = cv2.imread("/content/teste/LC08_L1TP_222063_20180702_20180716_01_T1_QB_1_3.tif")
list_im = [rgb, tir, qb]
imgs = [ PIL.Image.open(i) for i in list_im ]
# pick the image which is the smallest, and resize the others to match it (can be arbitrary image shape here)
min_shape = sorted( [(np.sum(i.size), i.size ) for i in imgs])[0][1]
imgs_comb = np.hstack( (np.asarray( i.resize(min_shape) ) for i in imgs ) )
# save that beautiful picture
imgs_comb = PIL.Image.fromarray( imgs_comb)
imgs_comb.save( 'Trifecta.jpg' )
Apparently the error happens on this line: ---> 12| imgs = [ PIL.Image.open(i) for i in list_im ]
Because of that, I tried updating the Pillow library to 5.3.0, but the error still happens. What should I do?
How about not using PIL? It seems like OpenCV can do what you want:
import matplotlib.pyplot as plt
import numpy as np
import sys
import cv2
rgb = cv2.imread("/content/teste/LC08_L1TP_222063_20180702_20180716_01_T1_1_3.tif")
tir = cv2.imread("/content/teste/LC08_L1TP_222063_20180702_20180716_01_T1_TIR_1_3.tif")
qb = cv2.imread("/content/teste/LC08_L1TP_222063_20180702_20180716_01_T1_QB_1_3.tif")
list_im = [rgb, tir, qb]
# pick the image which is the smallest, and resize the others to match it (can be arbitrary image shape here)
min_width = min([img.shape[1] for img in list_im])
min_height = min([img.shape[0] for img in list_im])
list_img = [cv2.resize(img, (min_width, min_height)) for img in list_im]
imgs_comb = np.hstack( list_img )
# save that beautiful picture
cv2.imwrite('result.jpg', imgs_comb)
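The errors in the question arise because cv2.imread already returns NumPy arrays, while PIL.Image.open expects a file name or a file object. The stacking step itself only needs NumPy, as the sketch below tries to show. The arrays and their shapes are made up, and it crops to the common size instead of calling cv2.resize, so the result differs from the answer above (cropping discards pixels, resizing rescales them).

```python
import numpy as np

# Synthetic stand-ins for the arrays cv2.imread would return
# (height, width, channels). The shapes are made up for illustration.
images = [
    np.zeros((120, 200, 3), dtype=np.uint8),
    np.full((100, 180, 3), 127, dtype=np.uint8),
    np.full((110, 220, 3), 255, dtype=np.uint8),
]

# np.hstack needs every array to have the same number of rows and channels.
min_height = min(img.shape[0] for img in images)
min_width = min(img.shape[1] for img in images)

# Crop each image to the common size (cv2.resize would rescale instead).
cropped = [img[:min_height, :min_width] for img in images]

combined = np.hstack(cropped)
print(combined.shape)  # (100, 540, 3)
```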

python cv2.imread return none on 6th image

I am trying to import and read all images in a folder. However, when I have more than 5 images, cv2.imread returns None for the 6th image. I have tried using different file names, different files, etc., but I can't get it to work.
import cv2
import numpy as np
import matplotlib.pyplot as plt
from tkinter import filedialog
import os
from mpl_toolkits.mplot3d import Axes3D
global scan_dir
scan_dir = filedialog.askdirectory()
print(scan_dir)
x=os.listdir(scan_dir)
img={}
print(x)
for i in range(0,len(x)):
    print(i)
    img[i] = cv2.imread(x[i], cv2.IMREAD_GRAYSCALE)
    indices[i] = np.where(img[i]<100)
I get the following error (None is the output of print(img[i]) on the 6th iteration of the loop):
None
Traceback (most recent call last):
File "C:\CodeRepository\US-3D\GettingCloser.py", line 55, in <module>
indices[i] = np.where(img[i]<100)
TypeError: '<' not supported between instances of 'NoneType' and 'int'
I have the same problem if I try this
global scan_dir
scan_dir = filedialog.askdirectory()
print(scan_dir)
x=os.listdir(scan_dir)
img = cv2.imread(x[5], cv2.IMREAD_GRAYSCALE)
It will return that img is None. This is true for anything beyond the 5th image.
Something is probably wrong with that file. The dict is not the cause: it is indexed by the integer i, so the order of its entries does not matter. The change below avoids the exception by skipping images that failed to load, but you still need to find out why that image cannot be read.
for i in range(0, len(x)):
    print(i)
    img[i] = cv2.imread(x[i], cv2.IMREAD_GRAYSCALE)
    if img[i] is not None:  # cv2.imread returns None when it cannot read the file
        indices[i] = np.where(img[i] < 100)
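One possible cause, which the question does not confirm, is the path rather than the file. os.listdir returns bare file names, so cv2.imread(x[i]) looks for them in the current working directory, not in scan_dir, and returns None for any name it cannot find there. A minimal sketch of building full paths first, where list_full_paths is a hypothetical helper name:

```python
import os


def list_full_paths(directory):
    """Return the joined path of every regular file in `directory`, sorted by name."""
    paths = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)  # listdir gives bare names only
        if os.path.isfile(full_path):              # skip sub-directories
            paths.append(full_path)
    return paths
```

Each returned path could then be passed to cv2.imread(path, cv2.IMREAD_GRAYSCALE), still checking the result with "is not None" before using it.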

Unable to call the fit function on randomforest regressor python sklearn

I'm unable to call the fit function on the RandomForestRegressor and even the intellisense is only showing the predict and some other parameters. Below is my code, traceback call and an image showing the content of the intellisense.
import pandas
import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestRegressor
def predict():
    Fvector = 'C:/Users/Oussema/Desktop/Cred_Data/VEctors/FinalFeatureVector.csv'
    data = np.genfromtxt(Fvector, dtype=float, delimiter=',', names=True)
    AnnotArr = np.array(data['CredAnnot'])  # 1D array containing the ground truth (50000 rows)
    TempTestArr = np.array([data['GrammarV'],data['TweetSentSc'],data['URLState']])  # feature vector; the shape is (3, 50000) and the values range over [0, 1]
    FeatureVector = TempTestArr.transpose()  # transpose to get the shape (50000, 3)
    RF_model = RandomForestRegressor(n_estimators=20, max_features = 'auto', n_jobs = -1)
    RF_model.fit(FeatureVector,AnnotArr)
    print(RF_model.oob_score_)

predict()
IntelliSense content:
[1]: https://i.stack.imgur.com/XweOo.png
Traceback call
Traceback (most recent call last):
File "C:\Users\Oussema\source\repos\Regression_Models\Regression_Models\Random_forest_TCA.py", line 15, in <module>
predict()
File "C:\Users\Oussema\source\repos\Regression_Models\Regression_Models\Random_forest_TCA.py", line 14, in predict
print(RF_model.oob_score_)
AttributeError: 'RandomForestRegressor' object has no attribute 'oob_score_'
You need to set the oob_score param to True when initializing the RandomForestRegressor.
As per the documentation:
oob_score : bool, optional (default=False)
whether to use out-of-bag samples to estimate the R^2 on unseen data.
So the attribute oob_score_ is only available if you do this:
def predict():
    ....
    ....
    RF_model = RandomForestRegressor(n_estimators=20,
                                     max_features='auto',
                                     n_jobs=-1,
                                     oob_score=True)  # <= This is what you want
    ....
    ....
    print(RF_model.oob_score_)

Plotting decision tree, graphviz, pydotplus

I'm following the tutorial for decision tree on scikit documentation.
I have pydotplus 2.0.2 but it is telling me that it does not have write method - error below. I've been struggling for a while with it now, any ideas, please? Many thanks!
from sklearn import tree
from sklearn.datasets import load_iris
iris = load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
from IPython.display import Image
dot_data = tree.export_graphviz(clf, out_file=None)
import pydotplus
graph = pydotplus.graphviz.graph_from_dot_data(dot_data)
Image(graph.create_png())
and my error is
/Users/air/anaconda/bin/python /Users/air/PycharmProjects/kiwi/hemr.py
Traceback (most recent call last):
File "/Users/air/PycharmProjects/kiwi/hemr.py", line 10, in <module>
dot_data = tree.export_graphviz(clf, out_file=None)
File "/Users/air/anaconda/lib/python2.7/site-packages/sklearn/tree/export.py", line 375, in export_graphviz
out_file.write('digraph Tree {\n')
AttributeError: 'NoneType' object has no attribute 'write'
Process finished with exit code 1
----- UPDATE -----
Using the fix with out_file, it throws another error:
Traceback (most recent call last):
File "/Users/air/PycharmProjects/kiwi/hemr.py", line 13, in <module>
graph = pydotplus.graphviz.graph_from_dot_data(dot_data)
File "/Users/air/anaconda/lib/python2.7/site-packages/pydotplus/graphviz.py", line 302, in graph_from_dot_data
return parser.parse_dot_data(data)
File "/Users/air/anaconda/lib/python2.7/site-packages/pydotplus/parser.py", line 548, in parse_dot_data
if data.startswith(codecs.BOM_UTF8):
AttributeError: 'NoneType' object has no attribute 'startswith'
---- UPDATE 2 -----
Also, see my own answer below, which solves another problem.
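The problem is the out_file=None argument.
According to the documentation, scikit-learn 0.18 and later return the DOT source as a string when out_file is None, and no file is created. Older releases, like the one in your traceback, call out_file.write() on whatever you pass, and None has no write method.
Therefore, on scikit-learn 0.20 or later, where out_file defaults to None, do as follows (on older releases this call writes tree.dot and returns None, which explains the second error in the update):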
dot_data = tree.export_graphviz(clf)
graph = pydotplus.graphviz.graph_from_dot_data(dot_data)
The graph_from_dot_data() method didn't work for me even after specifying a proper path for out_file.
Instead, try the graph_from_dot_file method:
graph = pydotplus.graphviz.graph_from_dot_file("iris.dot")
I ran into the same error this morning. I use Python 3.x, and here is how I solved the problem.
from sklearn import tree
from sklearn.datasets import load_iris
from IPython.display import Image
import io
iris = load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
# Let's give dot_data some space so it will not feel nervous any more
dot_data = io.StringIO()
tree.export_graphviz(clf, out_file=dot_data)
import pydotplus
graph = pydotplus.graphviz.graph_from_dot_data(dot_data.getvalue())
# make sure you have graphviz installed and set in path
Image(graph.create_png())
If you use Python 2.x, I believe you need to change "import io" to:
import StringIO
and,
dot_data = StringIO.StringIO()
Hope it helps.
Another problem was the matplotlib backend setting. That is solved nicely in a linked answer: you just need to look up the settings file and change the backend, or call mpl.use("TkAgg") in the code, as suggested there in the comments. After that, the only remaining error was that pydotplus couldn't find my Graphviz executable, so I reinstalled Graphviz via Homebrew (brew install graphviz), which solved the issue, and I can make plots now!
What really helped me solve the problem was running the code as the same user that installed Graphviz. Running it as any other user produces this error.
I would suggest avoiding Graphviz and using the following alternative approach:
import matplotlib.pyplot as plt
from sklearn.tree import plot_tree
plt.figure(figsize=(60,30))
plot_tree(clf, filled=True)
plt.show()
