MatplotlibDeprecationWarning and aligning titles - python-3.x

I'm trying to put the finishing touches on a small program, but I'm stuck on the last two items and have gotten nowhere for hours. The two problems are:
1. I get the following warning when I run the code: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
2. I want the title of the plot to be centered over the whole figure. Even though I pass loc='center' to the title, it is centered over the last (rightmost) image only. Here is a link to the image: image
The warning mentioned in #1 above is on line 15 of my code:
import keras
keras.__version__
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import ImageGrid
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
def draw_with_mnist(theString):
    for x in range(0, len(theString)):
        c = int(theString[x],10)
        for i in range(0, 100):
            if (train_labels[i] == c):
                digit = train_images[i]
                plt.subplot(1, len(theString), x+1)
                plt.imshow(digit)
                plt.axis('off')
                plt.grid(b=None)
    plt.tight_layout(pad=0.00)
    plt.title(theString+" -- My Name", loc='center')
    plt.show()
draw_with_mnist("34085194")
Appreciate any help anyone can offer.
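A possible way to address both points, offered as a sketch rather than a definitive fix: the warning appears to come from calling plt.subplot with the same arguments more than once (every matching training image triggers the call again), and plt.title only titles the current axes, which is the last subplot. Creating all axes once with plt.subplots and using fig.suptitle avoids both. The function name draw_digits and the random stand-in data are assumptions made for illustration so that the example runs without downloading MNIST; unlike the original loop, it draws the first matching image for each digit.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def draw_digits(the_string, images, labels):
    """Draw one image per character of the_string in a single row."""
    # Create every axes exactly once, so no subplot call is ever repeated
    fig, axes = plt.subplots(1, len(the_string), squeeze=False)
    for ax, ch in zip(axes[0], the_string):
        # Index of the first image whose label matches this digit
        idx = int(np.flatnonzero(labels == int(ch))[0])
        ax.imshow(images[idx])
        ax.axis("off")
    # suptitle belongs to the figure, so it is centered over all subplots
    title = fig.suptitle(the_string + " -- My Name")
    return fig, title


# Hypothetical stand-in for MNIST: ten random 28x28 images labelled 0..9
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(10, 28, 28))
fake_labels = np.arange(10)

fig, title = draw_digits("34085194", fake_images, fake_labels)
```

With the real data, passing train_images and train_labels in place of the stand-ins and calling plt.show() afterwards (on an interactive backend) should give the same layout.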

Related

Python: how to create a smoothed version of a 2D binned "color map"?

I would like to create a version of this 2D binned "color map" with smoothed colors.
I am not even sure this would be the correct nomenclature for the plot, but, essentially, I want my figure to be color coded by the median values of a third variable for points that reside in each defined bin of my (X, Y) space.
Even though I am able to accomplish that to a certain degree (see example), I would like to find a way to create a version of the same plot with a smoothed color gradient. That would allow me to visualize the overall behavior of my distribution.
I tried ideas described here: Smoothing 2D map in python
and here: Python: binned_statistic_2d mean calculation ignoring NaNs in data
as well as links therein, but could not find a clear solution to the problem.
This is what I have so far:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.stats import binned_statistic_2d
np.random.seed(999)  # seed NumPy's generator; random.seed() would not affect np.random
x = np.random.normal (0,10,5000)
y = np.random.normal (0,10,5000)
z = np.random.uniform(0,10,5000)
fig = plt.figure(figsize=(20, 20))
plt.rcParams.update({'font.size': 10})
ax = fig.add_subplot(3,3,1)
ax.set_axisbelow(True)
plt.grid(True, lw=0.5, zorder=-1)
x_bins = np.arange(-50., 50.5, 1.)
y_bins = np.arange(-50., 50.5, 1.)
cmap = plt.get_cmap('jet_r',1000) #just a colormap
ret = binned_statistic_2d(x, y, z, statistic=np.median, bins=[x_bins, y_bins]) # Bin (X, Y) and create a map of the medians of "Colors"
plt.imshow(ret.statistic.T, origin='lower', extent=(-50, 50, -50, 50), cmap=cmap)
plt.xlim(-40,40)
plt.ylim(-40,40)
plt.xlabel("X", fontsize=15)
plt.ylabel("Y", fontsize=15)
ax.set_yticks([-40,-30,-20,-10,0,10,20,30,40])
bounds = np.arange(2.0, 20.0, 1.0)
plt.colorbar(ticks=bounds, label="Color", fraction=0.046, pad=0.04)
# save plots
plt.savefig("Whatever_name.png", bbox_inches='tight')
Which produces the following image (from random data):
Therefore, the simple question would be: how to smooth these colors?
Thanks in advance!
PS: sorry for excessive coding, but I believe a clear visualization is crucial for this particular problem.
Thanks to everyone who viewed this issue and tried to help!
I ended up solving my own problem. In the end, it was all about image smoothing with a Gaussian kernel.
This link: Gaussian filtering a image with Nan in Python gave me the insight for the solution.
I basically implemented exactly the same code but, at the end, mapped the NaN pixels known from the original 2D array onto the smoothed result. Unlike the solution in the link, my version does NOT leave NaN pixels filled with a value derived from the surrounding pixels. More precisely, the filter does fill them, but I then set them back to NaN.
Here is the final figure produced for the example I provided:
Final code, for reference, for those who might need it in the future:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.stats import binned_statistic_2d
import scipy.stats as st
import scipy.ndimage
import scipy as sp
np.random.seed(999)  # seed NumPy's generator; random.seed() would not affect np.random
x = np.random.normal (0,10,5000)
y = np.random.normal (0,10,5000)
z = np.random.uniform(0,10,5000)
fig = plt.figure(figsize=(20, 20))
plt.rcParams.update({'font.size': 10})
ax = fig.add_subplot(3,3,1)
ax.set_axisbelow(True)
plt.grid(True, lw=0.5, zorder=-1)
x_bins = np.arange(-50., 50.5, 1.)
y_bins = np.arange(-50., 50.5, 1.)
cmap = plt.get_cmap('jet_r',1000) #just a colormap
ret = binned_statistic_2d(x, y, z, statistic=np.median, bins=[x_bins, y_bins]) # Bin (X, Y) and create a map of the medians of "Colors"
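sigma = 1       # standard deviation of the Gaussian kernel, in bins
truncate = 5.0  # truncate the kernel at this many sigmas
U = ret.statistic.T.copy()
nan_mask = np.isnan(U)
# Smooth the data with the empty (NaN) bins set to 0
V = U.copy()
V[nan_mask] = 0
VV = sp.ndimage.gaussian_filter(V, sigma=sigma, truncate=truncate)
# Smooth a weight map (1 where data exist, 0 where empty) in the same way
W = np.ones_like(U)
W[nan_mask] = 0
WW = sp.ndimage.gaussian_filter(W, sigma=sigma, truncate=truncate)
# The ratio is a NaN-aware weighted average; silence 0/0 warnings locally
with np.errstate(divide='ignore', invalid='ignore'):
    Z = VV / WW
# Put the originally empty bins back so they stay blank in the plot
Z[nan_mask] = np.nan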
plt.imshow(Z, origin='lower', extent=(-50, 50, -50, 50), cmap=cmap)
plt.xlim(-40,40)
plt.ylim(-40,40)
plt.xlabel("X", fontsize=15)
plt.ylabel("Y", fontsize=15)
ax.set_yticks([-40,-30,-20,-10,0,10,20,30,40])
bounds = np.arange(2.0, 20.0, 1.0)
plt.colorbar(ticks=bounds, label="Color", fraction=0.046, pad=0.04)
# save plots
plt.savefig("Whatever_name.png", bbox_inches='tight')

Python: Pickle.load function returns the correct 3d-scatter plot, but is not interactive anymore

This is my first question here, so let me know if I should make any improvements regarding, e.g., the formulation of the question, the code, and so on.
I am creating several 3-D scatter plots in Python and want to save them for later reuse and comparison. I am using Qt5 as the graphics backend in Spyder, which perfectly displays my interactive 3-D scatter plot (I can rotate it about the axes and flip it) when I run the original code.
Now I am able to successfully save the created plot and also load it into a new script, which opens the plot in Qt5 as well. But somehow the interactivity is gone, meaning I am not able to rotate or flip the plot anymore.
I was unable to find any guidance on this issue or anyone with a similar problem. Do you have any idea? I'll put the relevant part of my sample code below:
""" First script """
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import pandas as pd
import pickle
testdf = pd.DataFrame({"X" : x, "Y" : y, "Z" : z}) #x and y are the criteria, z the values, stored as lists
# Create 3d scatter plot
fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(111, projection="3d")
ax.scatter(x, y, z, c=z, marker="o")
ax.set_xlabel("Initial Notional Cluster")
ax.set_ylabel("Laufzeit in Month Cluster")
ax.set_zlabel("Vol. Weighted Margin")
plt.show()
# Save the figure object as binary file
file = open(r"Init_Lfz_VolWeightedMargin.pkl", "wb")
pickle.dump(fig, file)
file.close()
""" Second script """
import pickle
import matplotlib.pyplot as plt
figx = pickle.load(open(r"Init_Lfz_VolWeightedMargin.pkl", "rb"))
plt.show()
Any idea why the interactivity is gone? According to the pickle documentation and other use cases, this should not happen.
Many thanks.

How to use fill_between utilizing the where parameter

So following a tutorial, I tried to create a graph using the following code:
import random
import matplotlib.pyplot as plt
time_values = [i for i in range(1,100)]
execution_time = [random.randint(0,100) for i in range(1,100)]
fig = plt.figure()
ax1 = plt.subplot()
threshold=[.8 for i in range(len(execution_time))]
ax1.plot(time_values, execution_time)
ax1.margins(x=-.49, y=0)
ax1.fill_between(time_values,execution_time, 1,where=(execution_time>1), color='r', alpha=.3)
This did not work as I got an error saying I could not compare a list and an int.
However, I then tried:
ax1.fill_between(time_values,execution_time, 1)
And that gave me a graph with all area in between the execution time and the y=1 line, filled in. Since I want the area above the y=1 line filled in, with the area below left un-shaded, I created a list called threshold, and populated it with 1 so that I could recreate the comparison. However,
ax1.fill_between(time_values,execution_time, 1,where=(execution_time>threshold))
and
ax1.fill_between(time_values,execution_time, 1)
create the exact same graph, even though the execution times values do go beyond 1.
I am confused for two reasons:
Firstly, in the tutorial I was watching, the teacher was able to successfully compare a list and an integer within the fill_between function. Why was I not able to do this?
Secondly, why is the where parameter not identifying the regions I want to fill? I.e., why is the graph shading the areas between y=1 and the execution time values?
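The problem is mainly due to the use of Python lists instead of NumPy arrays. Comparing a list with an int raises a TypeError in Python 3, and comparing two lists returns a single bool rather than an elementwise mask, so where never receives the per-point condition it needs. You can still use lists, but then you need to build the mask element by element: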
import numpy as np
import matplotlib.pyplot as plt
time_values = list(range(1,100))
execution_time = [np.random.randint(0,100) for _ in range(len(time_values))]
threshold = 50
fig, ax = plt.subplots()
ax.plot(time_values, execution_time)
ax.fill_between(time_values, execution_time, threshold,
                where= [e > threshold for e in execution_time],
                color='r', alpha=.3)
ax.set_ylim(0,None)
plt.show()
Better still is to use NumPy arrays throughout. It's not only faster, but also easier to code and understand.
import numpy as np
import matplotlib.pyplot as plt
time_values = np.arange(1,100)
execution_time = np.random.randint(0,100, size=len(time_values))
threshold = 50
fig, ax = plt.subplots()
ax.plot(time_values, execution_time)
ax.fill_between(time_values,execution_time, threshold,
                where=(execution_time > threshold), color='r', alpha=.3)
ax.set_ylim(0,None)
plt.show()
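To make the difference concrete, here is a small sketch of the three comparisons involved, using made-up sample values (not the data from the question). It only illustrates standard Python 3 and NumPy comparison semantics; the tutorial may well have used NumPy arrays or Python 2, where a list could be compared with an int, but that is a guess.

```python
import numpy as np

execution_time = [5, 0, 3]   # hypothetical sample values
threshold = [1, 1, 1]

# list > list compares lexicographically and yields ONE bool, not a mask
list_result = execution_time > threshold

# list > int is not defined in Python 3 and raises TypeError
try:
    execution_time > 1
    int_error = None
except TypeError as exc:
    int_error = type(exc).__name__

# array > scalar broadcasts and yields one bool per element,
# which is the kind of mask that `where` expects
array_result = np.array(execution_time) > 1

print(list_result, int_error, array_result)
```

A single bool carries no per-point information, which is consistent with the whole region being filled in the question.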

Generating different marker shapes in plotly/cufflinks

This post is similar to this one (Change Marker Shapes in Plotly .js), but I can't seem to get anything to work in Python. I am trying to make a multi-line graph, which I have done in both plt and plotly (code below). Being colorblind, I often can't tell what I am looking at in plotly because the markers are always circles. The label is included, but it sometimes gets cut off when the labels are too long, and then I can't figure out which line is which. The plotly/cufflinks graphs are much better in terms of interactivity, and since I do a lot of data presentations, this will be my preferred method going forward if I can figure out how to change the markers for each line.
I am using Jupyter Notebook (version: 5.4.0) and Python (version 3.6.4)
Screenshot of the dummy_data file.
dummy_data_screenshot
In matplotlib, I did the following to get the output attached (note the different shape markers):
import matplotlib.pyplot as plt
import matplotlib as mpl ##(version: 2.1.2)
import pandas as pd ##(version: 0.22.0)
import numpy as np ##(version: 1.14.0)
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs,init_notebook_mode,plot,iplot
import cufflinks as cf ##(version: 0.12.1)
init_notebook_mode(connected=True)
cf.go_offline()
%matplotlib notebook
df = pd.read_csv(r"desktop\dummy_data.csv")  # raw string so the backslash is not read as an escape
fx = df.groupby(['studyarm', 'visit'])\
    ['totdiffic_chg'].mean().unstack('studyarm').drop(['02_UNSCH','ZEOS'])
# Keep only visible marker styles (skip 'nothing', tick and caret markers)
valid_markers = ([item[0] for item in
                  mpl.markers.MarkerStyle.markers.items() if
                  item[1] != 'nothing' and not item[1].startswith('tick')
                  and not item[1].startswith('caret')])
markers = np.random.choice(valid_markers, df.shape[1], replace=False)
ax = fx.plot(kind = 'line', linestyle='-')
for i, line in enumerate(ax.get_lines()):
    line.set_marker(markers[i])
ax.legend(loc='best')
ax.set_xticklabels(df.index, rotation=45)
plt.title('Some Made Up Data')
plt.ylabel('Score', fontsize=14)
plt.autoscale(enable=True, axis='x', tight=True)
plt.tight_layout()
plt_image_dummy_data
I used the code below and it created the graph via plotly/cufflinks:
fx.iplot(kind='line', yTitle='Score', title='Some Made Up Data',
         mode=markers, filename='cufflinks/simple-line')
plotly_image_dummy_data
I have searched the web for the last few days and I can see many options to change the marker color, opacity, etc., etc., but I can't seem to figure out a way to automatically and randomly change the shape of the markers OR to manually change each individual line to a separate marker shape.
I am sure this is a simple fix, but I can't figure it out. Any help (or nudge in the right direction) would be very much appreciated.!
You can specify the shape for scatter plots using the symbol property, as below:
Scatter(x = ..., y = ..., mode = 'lines+markers',
        marker = dict(size = 10, symbol = 1, ...))
For example:
0 gives circles
1 gives squares
3 gives '+' signs
5 gives triangles, etc.
Have a look at the 'symbol' entry in Plotly's doc here: https://plot.ly/python/reference/#box-marker-symbol
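To give each line its own shape automatically, one option is to cycle through a list of symbols while building the traces. The sketch below builds plain trace dictionaries so that it runs without plotly installed; the helper name make_traces and the sample data are made up for illustration, and the symbol names are the named equivalents of the numeric codes 0 to 5.

```python
from itertools import cycle

# Named Plotly marker symbols (numeric codes 0, 1, 2, 3, 4, 5)
SYMBOLS = ["circle", "square", "diamond", "cross", "x", "triangle-up"]


def make_traces(series_by_name, x_values):
    """Build one scatter-trace dict per series, each with its own marker symbol."""
    traces = []
    for (name, y_values), symbol in zip(series_by_name.items(), cycle(SYMBOLS)):
        traces.append(dict(
            type="scatter",
            name=name,
            x=list(x_values),
            y=list(y_values),
            mode="lines+markers",
            marker=dict(size=10, symbol=symbol),
        ))
    return traces


# Hypothetical data standing in for the grouped means in fx
visits = ["V1", "V2", "V3"]
arms = {
    "arm_a": [1.0, 2.0, 1.5],
    "arm_b": [0.5, 1.0, 2.5],
    "arm_c": [2.0, 1.0, 0.0],
}
traces = make_traces(arms, visits)
print([t["marker"]["symbol"] for t in traces])
```

With plotly available, the resulting list should be usable as the data of a figure, e.g. iplot(go.Figure(data=traces)) with the imports from the question. For the real data, something like {col: fx[col] for col in fx.columns} with fx.index as the x values would take the place of the sample dictionary.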

All Matplotlib points appearing at bottom of graph, regardless of y-value

I'm following this linear regression tutorial. Here's my code:
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt
dataframe = pd.read_fwf('brain_body.txt')
x_values = dataframe[['Brain']]
y_values = dataframe[['Body']]
body_reg = linear_model.LinearRegression()
body_reg.fit(x_values, y_values)
plt.scatter(x_values, y_values)
plt.plot(x_values, body_reg.predict(x_values))
plt.show()
When I run the script, I get no errors, but the graph doesn't seem to account for the y-values. I reduced the data points to three so it's easier to see:
I tried to manually change the y-axis with plt.ylim([-1000,7000]) but no luck.
Thanks for any suggestions!
There's nothing wrong with the code; you just have a few very extreme values relative to the rest of your data. Matplotlib expands the axes to show the extreme values, which bunches all the other points together. Broadening your ylim will only increase the effect. Try a much smaller ylim and xlim instead:
plt.ylim([0, 20])
plt.xlim([0, 2])
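As a rough illustration with made-up numbers (not the values from brain_body.txt), the sketch below draws the same points twice: once zoomed in with the limits above, and once on logarithmic axes. The log-scale panel is an alternative not mentioned in the answer above; it keeps the extreme point visible while still spreading out the small values.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical data: most points are small, one is extreme
brain = np.array([0.5, 1.0, 1.4, 1.9, 5712.0])
body = np.array([3.0, 8.0, 12.0, 17.0, 6654.0])

fig, (ax_zoom, ax_log) = plt.subplots(1, 2, figsize=(8, 3))

# Option 1: zoom in on the bulk of the data and leave the outlier off-screen
ax_zoom.scatter(brain, body)
ax_zoom.set_xlim(0, 2)
ax_zoom.set_ylim(0, 20)
ax_zoom.set_title("zoomed in")

# Option 2 (assumption, not from the answer): log axes show all points
ax_log.scatter(brain, body)
ax_log.set_xscale("log")
ax_log.set_yscale("log")
ax_log.set_title("log-log")

fig.tight_layout()
```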
