Append matplotlib histogram plots to a list, without overwriting - python-3.x

I am trying to write a function that creates multiple histograms (using matplotlib.pyplot) by looping over some data. I append each plot to a list, and the list is returned.
But when I call show() on each of the plots, they are all the same plot; every plot has been overwritten.
I have tried calling plt.clf() at the end of the for loop, but it does not work for me.
for data in data_list:
    n, bins, patches = plt.hist(x=data, bins='auto', color=color,
                                alpha=0.7, rwidth=0.85)
    plt.xlabel('Values')
    plt.ylabel('Frequency')
    plt.grid(axis='y', alpha=0.55)
    plots.append(plt)

First, maybe we should question the need: why do you want to keep all the plots at hand? Why not save each one right after plotting with plt.savefig('xxxxx')? Are you planning to animate the figures?
Second, if you do want to keep everything, subplots might be a handy tool. Note that you are still drawing in a single figure, but calling plt.subplots creates multiple axes that do not interfere with each other. Try this:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0)
data_list = [np.random.randn(10) for _ in range(4)]
plots=[]
# one figure with four stacked axes, one per dataset
fig, axes = plt.subplots(4,1)
for i,data in enumerate(data_list):
    n, bins, patches = axes[i].hist(x=data, bins='auto',
                                    alpha=0.7, rwidth=0.85)
    axes[i].set_xlabel('Values')
    axes[i].set_ylabel('Frequency')
    axes[i].grid(axis='y', alpha=0.55)
plt.show()
For more information, see the matplotlib documentation on subplots and on working with multiple figures.
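If separate figures really are needed, one likely cause of the overwriting is that plt is the pyplot module itself, so plots.append(plt) stores the same object on every pass while all the histograms land on the same current axes. A minimal sketch of one way around this, assuming each dataset should get its own Figure object (make_histograms is a hypothetical helper name, not something from the question):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def make_histograms(data_list, color="tab:blue"):
    """Return one Figure per dataset (hypothetical helper)."""
    figures = []
    for data in data_list:
        # a fresh figure and axes on every pass, so nothing is overwritten
        fig, ax = plt.subplots()
        ax.hist(data, bins="auto", color=color, alpha=0.7, rwidth=0.85)
        ax.set_xlabel("Values")
        ax.set_ylabel("Frequency")
        ax.grid(axis="y", alpha=0.55)
        figures.append(fig)  # store the Figure object, not the plt module
    return figures


rng = np.random.default_rng(0)
figures = make_histograms([rng.normal(size=50) for _ in range(3)])
```

Each entry in figures can then be saved individually, for example with figures[0].savefig(...). With an interactive backend (drop the matplotlib.use line), plt.show() would display all of them.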

Related

Plotting multiple series generated using a for loop in python on the same figure

import numpy as np
import matplotlib.pyplot as plt
Data=np.array(([200,99],[200,62],[200,40],[300,94],[300,44],[300,24],[400,86],[400,35],[400,13]))
X_axis=[]
Y_axis=[]
for j in range(3):
    Counter=j
    for i in range(3):
        X=Data[i+Counter,0]
        Y=Data[i+Counter,1]
        X_axis.append(X)
        Y_axis.append(Y)
        Counter=Counter+2
    #plt.figure(1)
    plt.plot(X_axis,Y_axis, linestyle='solid', marker='o')
    plt.show()
    X_axis=[]
    Y_axis=[]
With this code I get three separate figures, one for each of the three series generated by the two for loops. I am trying to plot all three series on a single figure. The three series are:
Series-1: x-axis [200,300,400], y-axis [99,94,86]
Series-2: x-axis [200,300,400], y-axis [62,44,35]
Series-3: x-axis [200,300,400], y-axis [40,24,13]
By resetting X_axis and Y_axis after each series is plotted, I am trying to generate three separate series; otherwise the values all get appended to the same lists. I would appreciate any help.
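One plausible fix, sketched under the assumption that the rows of Data are always interleaved in groups of three as in the question: plt.show() inside the loop finishes a figure on every pass, so draw all the series on one axes first and show the figure once afterwards. The slicing and the legend labels below are illustrative choices, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

Data = np.array([[200, 99], [200, 62], [200, 40],
                 [300, 94], [300, 44], [300, 24],
                 [400, 86], [400, 35], [400, 13]])

n_series = 3
fig, ax = plt.subplots()
for j in range(n_series):
    # rows j, j+3, j+6 belong to series j
    series = Data[j::n_series]
    ax.plot(series[:, 0], series[:, 1], linestyle="solid", marker="o",
            label=f"Series-{j + 1}")
ax.legend()
# call plt.show() once here, after the loop, when running interactively
```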

Legend overwritten by plot - matplotlib

I have a plot that looks as follows:
I want to add legend labels for both the line plot and the red markers. However, the legend is not appearing because the plot is taking up its space.
Update
It turns out I cannot put several strings in plt.legend().
I made the figure bigger by using the following:
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
However, now I have only one entry in the legend, with the marker drawn on top of the line, whereas I want two: one for the marker alone and another for the line alone.
Updated code:
plt.plot(range(len(y)), y, '-bD', c='blue', markerfacecolor='red', markeredgecolor='k', markevery=rare_cases, label='%s' % target_var_name)
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
# changed this over here
plt.legend()
plt.savefig(output_folder + fig_name)
plt.close()
What you want to do (have two labels for a single object) is not completely impossible, but it's MUCH easier to plot the line and the rare values separately, e.g.
# boilerplate
import numpy as np
import matplotlib.pyplot as plt
# synthesize some data
N = 501
t = np.linspace(0, 10, N)
s = np.sin(np.pi*t)
rare = np.zeros(N, dtype=bool)
rare[:20] = True
np.random.shuffle(rare)
plt.plot(t, s, label='Curve')
plt.scatter(t[rare], s[rare], label='rare')
plt.legend()
plt.show()
Update
[...] it turns out I cannot put several strings in plt.legend()
Well, you can, as long as ① the several strings are in an iterable (a tuple or a list) and ② the number of strings (i.e., labels) equals the number of artists (i.e., thingies) in the plot.
plt.legend(('a', 'b', 'c'))
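As a small sketch of the rule above, the labels can also be paired with the artists explicitly by passing the handles to legend, which avoids depending on the order in which artists were added. The data and the label strings here are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 10, 201)
s = np.sin(np.pi * t)
rare = np.zeros(t.size, dtype=bool)
rare[::25] = True  # mark every 25th sample as "rare" (illustrative choice)

fig, ax = plt.subplots()
line, = ax.plot(t, s)  # one artist for the curve
points = ax.scatter(t[rare], s[rare], color="red", zorder=3)  # one for the markers
# two artists, two labels: the counts match, so both entries appear
legend = ax.legend([line, points], ["Curve", "rare"])
```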

Using "hue" for a Seaborn visual: how to get legend in one graph?

I created a scatter plot in seaborn using seaborn.relplot, but am having trouble putting the legend all in one graph.
When I do this simple way, everything works fine:
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
df2 = df[df.ln_amt_000s < 700]
sns.relplot(x='ln_amt_000s', y='hud_med_fm_inc', hue='outcome', size='outcome', legend='brief', ax=ax, data=df2)
The result is a scatter plot as desired, with the legend on the right hand side.
However, when I try to generate a matplotlib figure and axes objects ahead of time to specify the figure dimensions I run into problems:
a4_dims = (10, 10) # generating a matplotlib figure and axes objects ahead of time to specify figure dimensions
df2 = df[df.ln_amt_000s < 700]
fig, ax = plt.subplots(figsize = a4_dims)
sns.relplot(x='ln_amt_000s', y='hud_med_fm_inc', hue='outcome', size='outcome', legend='brief', ax=ax, data=df2)
The result is two graphs -- one that has the scatter plots as expected but missing the legend, and another one below it that is all blank except for the legend on the right hand side.
How do I fix this? My desired result is one graph where I can specify the figure dimensions and have the legend at the bottom in two rows, below the x-axis (if that is too difficult or not supported, the default legend position to the right on the same graph would work too). I know the problem lies with "ax=ax" and with the way I am specifying the dimensions through a matplotlib figure, but I'd like to know specifically why this causes a problem so I can learn from it.
Thank you for your time.
The issue is that sns.relplot is a "Figure-level interface for drawing relational plots onto a FacetGrid" (see the API page). With a simple sns.scatterplot (the default type of plot used by sns.relplot), your code works (changed to use reproducible data):
df = pd.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv", index_col=0)
fig, ax = plt.subplots(figsize = (5,5))
sns.scatterplot(x = 'Sepal.Length', y = 'Sepal.Width',
                hue = 'Species', legend = 'brief',
                ax=ax, data = df)
plt.show()
Further edits to legend
Seaborn's legends are a bit finicky. Some tweaks you may want to employ:
Remove the default seaborn title, which is actually a legend entry, by getting and slicing the handles and labels
Set a new title that is actually a title
Move the location and make use of bbox_to_anchor to move outside the plot area (note that the bbox parameters need some tweaking depending on your plot size)
Specify the number of columns
fig, ax = plt.subplots(figsize = (5,5))
sns.scatterplot(x = 'Sepal.Length', y = 'Sepal.Width',
                hue = 'Species', legend = 'brief',
                ax=ax, data = df)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles[1:], labels=labels[1:], loc=8,
          ncol=2, bbox_to_anchor=[0.5,-.3,0,0])
plt.show()
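The legend placement itself is plain matplotlib, so it can be tried without seaborn. The sketch below uses synthetic data and hypothetical group names rather than the iris columns; it sets a real legend title and puts the legend below the x-axis in two columns. The bbox_to_anchor offset is a guess that will likely need tuning for other figure sizes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax = plt.subplots(figsize=(5, 5))
for name in ("group A", "group B", "group C"):  # hypothetical categories
    x, y = rng.normal(size=(2, 30))
    ax.scatter(x, y, label=name)

# anchor the top centre of the legend just below the axes, in two columns
legend = ax.legend(title="Group", loc="upper center",
                   bbox_to_anchor=(0.5, -0.15), ncol=2)
```

When saving, fig.savefig(..., bbox_inches="tight") is one way to keep a legend that sits outside the axes from being clipped.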

Plot several boxplots in one figure

I am using python-3.x and I would like to plot several boxplots in one figure. All the data come from one numpy array of shape (100, 301).
If I use the code below, it plots them all (301 boxplots in one figure, which is too many):
fig, ax = plt.subplots()
ax.boxplot(my_data)
plt.show()
I don't want to plot all the data; I just want to plot 10, 15 or 20 (a variable number) of the columns, using a for loop or whatever method works best.
For example, I want to plot a boxplot for every 50th column, which means around 6 boxplots out of 301 in my figure. I tried to use a for loop, but no luck.
Any advice would be much appreciated.
You can just use indexing with a variable step to plot every 50th column. To keep the box plots separate and avoid overlapping, you can specify the position of each box plot using the positions parameter. my_data[:, ::step] gives you the desired data to plot. Below is an example using some random data.
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
my_data = np.random.randint(0, 20, (100, 301))
step = 50
posit = range(my_data[:, ::step].shape[1])
ax.boxplot(my_data[:, ::step], positions=posit)
plt.show()
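A possible refinement, sketched with a hypothetical helper boxplot_every: keep the evenly spaced positions but relabel the ticks with the original column indices, so it stays clear which columns were plotted. The random data mirrors the example above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def boxplot_every(ax, data, step):
    """Draw a box plot for every step-th column of data (hypothetical helper)."""
    cols = np.arange(0, data.shape[1], step)  # original column indices
    positions = np.arange(len(cols))          # evenly spaced slots on the x-axis
    result = ax.boxplot(data[:, cols], positions=positions)
    ax.set_xticks(positions)
    ax.set_xticklabels([str(c) for c in cols])  # show which column each box is
    ax.set_xlabel("Column index")
    return cols, result


rng = np.random.default_rng(0)
my_data = rng.integers(0, 20, size=(100, 301))
fig, ax = plt.subplots()
cols, result = boxplot_every(ax, my_data, step=50)
```

With 301 columns and step=50 this selects columns 0, 50, ..., 300, that is, seven boxes rather than six, because both ends of the range are included.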

matplotlib.pyplot: create a subplot of stored plots

python 3.6 on mac
matplotlib 2.1.0
using matplotlib.pyplot (as plt)
Let's say I have a few figures created with plt.figure() that I appended to a list called figures. When I type figures[0] at the command line, it produces the plot at index 0 of the list.
However, how can I arrange all the plots in figures as subplots of a single figure?
# Pseudo code:
plt.figure()
for i, fig in enumerate(figures): # figures contains the plots
    plt.subplot(2, 2, i+1)
    fig # location i+1 of the subplot is filled with the fig plot element
So as a result, I would have a 2 by 2 grid that contains each plot found in figures.
Hoping this makes sense.
A figure is a figure. You cannot have a figure inside a figure. The usual approach is to create a figure, create one or several subplots, plot something in the subplots.
If you may want to plot the same thing in different axes or figures, it can make sense to wrap the plotting in a function that takes the axes as an argument.
You could then use this function to plot to an axes of a new figure or to plot to an axes of a figure with many subplots.
import numpy as np
import matplotlib.pyplot as plt
def myplot(ax, data_x, data_y, color="C0"):
    # label the line so that ax.legend() has an entry to show
    ax.plot(data_x, data_y, color=color, label=color)
    ax.legend()
x = np.linspace(0,10)
y = np.cumsum(np.random.randn(len(x),4), axis=0)
# create 4 figures
for i in range(4):
    fig, ax = plt.subplots()
    myplot(ax, x, y[:,i], color="C{}".format(i))
# create another figure with each plot as subplot
fig, ax = plt.subplots(2,2)
for i in range(4):
    myplot(ax.flatten()[i], x, y[:,i], color="C{}".format(i))
plt.show()
