matplotlib, add legend for each line? [duplicate] - python-3.x

TL;DR -> How can one create a legend for a line graph in Matplotlib's PyPlot without creating any extra variables?
Please consider the graphing script below:
if __name__ == '__main__':
    PyPlot.plot(total_lengths, sort_times_bubble, 'b-',
                total_lengths, sort_times_ins, 'r-',
                total_lengths, sort_times_merge_r, 'g+',
                total_lengths, sort_times_merge_i, 'p-', )
    PyPlot.title("Combined Statistics")
    PyPlot.xlabel("Length of list (number)")
    PyPlot.ylabel("Time taken (seconds)")
    PyPlot.show()
As you can see, this is a very basic use of matplotlib's PyPlot. This ideally generates a graph like the one below:
Nothing special, I know. However, it is unclear which data is plotted where (I'm plotting the running time of some sorting algorithms against list length, and I'd like to make sure people know which line is which). So I need a legend. However, take a look at the following example (from the official site):
ax = subplot(1,1,1)
p1, = ax.plot([1,2,3], label="line 1")
p2, = ax.plot([3,2,1], label="line 2")
p3, = ax.plot([2,3,1], label="line 3")
handles, labels = ax.get_legend_handles_labels()
# reverse the order
ax.legend(handles[::-1], labels[::-1])
# or sort them by labels
import operator
hl = sorted(zip(handles, labels),
key=operator.itemgetter(1))
handles2, labels2 = zip(*hl)
ax.legend(handles2, labels2)
You will see that I need to create an extra variable ax. How can I add a legend to my graph without having to create this extra variable and retaining the simplicity of my current script?

Add a label= to each of your plot() calls, and then call legend(loc='upper left').
Consider this sample (tested with Python 3.8.0):
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 20, 1000)
y1 = np.sin(x)
y2 = np.cos(x)
plt.plot(x, y1, "-b", label="sine")
plt.plot(x, y2, "-r", label="cosine")
plt.legend(loc="upper left")
plt.ylim(-1.5, 2.0)
plt.show()
Slightly modified from this tutorial: http://jakevdp.github.io/mpl_tutorial/tutorial_pages/tut1.html
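To connect this back to the question's script, here is a minimal sketch of the same idea with one plot() call per series. The timing values are made-up placeholders (the question does not show its data), and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window
import matplotlib.pyplot as PyPlot

# Hypothetical timings standing in for the question's measured data
total_lengths = [10, 100, 1000]
sort_times_bubble = [0.001, 0.09, 9.1]
sort_times_ins = [0.001, 0.05, 4.8]
sort_times_merge_r = [0.002, 0.01, 0.12]
sort_times_merge_i = [0.002, 0.01, 0.10]

# One plot() call per series, each carrying its own label
PyPlot.plot(total_lengths, sort_times_bubble, 'b-', label="Bubble sort")
PyPlot.plot(total_lengths, sort_times_ins, 'r-', label="Insertion sort")
PyPlot.plot(total_lengths, sort_times_merge_r, 'g+', label="Merge sort (recursive)")
PyPlot.plot(total_lengths, sort_times_merge_i, 'p-', label="Merge sort (iterative)")

PyPlot.title("Combined Statistics")
PyPlot.xlabel("Length of list (number)")
PyPlot.ylabel("Time taken (seconds)")
PyPlot.legend(loc="upper left")  # picks up the labels; no ax variable needed
PyPlot.savefig("combined_statistics.png")
```

No Axes variable is created by hand here; the legend is built from the label= values alone.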

You can access the Axes instance (ax) with plt.gca(). In this case, you can use
plt.gca().legend()
You can do this either by using the label= keyword in each of your plt.plot() calls or by assigning your labels as a tuple or list within legend, as in this working example:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-0.75,1,100)
y0 = np.exp(2 + 3*x - 7*x**3)
y1 = 7-4*np.sin(4*x)
plt.plot(x,y0,x,y1)
plt.gca().legend(('y0','y1'))
plt.show()
However, if you need to access the Axes instance more than once, I do recommend saving it to the variable ax with
ax = plt.gca()
and then calling ax instead of plt.gca().

Here's an example to help you out ...
fig = plt.figure(figsize=(10,5))
ax = fig.add_subplot(111)
ax.set_title('ADR vs Rating (CS:GO)')
ax.scatter(x=data[:,0],y=data[:,1],label='Data')
plt.plot(data[:,0], m*data[:,0] + b,color='red',label='Our Fitting Line')
ax.set_xlabel('ADR')
ax.set_ylabel('Rating')
ax.legend(loc='best')
plt.show()

You can add a custom legend (see the documentation):
first = [1, 2, 4, 5, 4]
second = [3, 4, 2, 2, 3]
plt.plot(first, 'g--', second, 'r--')
plt.legend(['First List', 'Second List'], loc='upper left')
plt.show()

A simple plot for sine and cosine curves with a legend.
Used matplotlib.pyplot
import math
import matplotlib.pyplot as plt
x=[]
for i in range(-314,314):
    x.append(i/100)
ysin=[math.sin(i) for i in x]
ycos=[math.cos(i) for i in x]
plt.plot(x,ysin,label='sin(x)') #specify label for the corresponding curve
plt.plot(x,ycos,label='cos(x)')
plt.xticks([-3.14,-1.57,0,1.57,3.14],[r'-$\pi$',r'-$\pi$/2','0',r'$\pi$/2',r'$\pi$']) #raw strings keep \pi from being read as an escape sequence
plt.legend()
plt.show()

Add a label to each plot call for the series it is graphing, e.g. label="series 1".
Then simply add PyPlot.legend() to the bottom of your script (before PyPlot.show()) and the legend will display these labels.

Related

Matplotlib - maintain plot size of uneven subplots

I've been creating uneven subplots in matplotlib based on this question. The gridspec solution (third answer) worked a little better for me as it gives a bit more flexibility for the exact sizes of the subplots.
When I add a plot of a 2D array with imshow() the affected subplot is resized to the shape of the array. Is there any way to avoid that and keep the subplot-sizes (or rather aspect-ratio) fixed?
Here's the example code and the resulting image with the subplot-sizes I'm happy with:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
# generate data
x = np.arange(0, 10, 0.2)
y = np.sin(x)
# plot
fig = plt.figure(figsize=(12, 9))
gs = gridspec.GridSpec(20, 20)
ax1 = fig.add_subplot(gs[0:5,0:11])
ax1.plot(x, y)
ax2 = fig.add_subplot(gs[6:11,0:11])
ax2.plot(y, x)
ax3 = fig.add_subplot(gs[12:20,0:11])
ax3.plot(y, x)
ax4 = fig.add_subplot(gs[0:9,13:20])
ax4.plot(x, y)
ax5 = fig.add_subplot(gs[11:20,13:20])
ax5.plot(y, x)
plt.show()
This is what happens if I additionally plot data from a 2D array with the following lines (insert before plt.show):
data_2d = np.arange(0, 10, 0.1).reshape(10, 10)  # a name cannot start with a digit
im = ax3.imshow(data_2d, cmap='rainbow')
How can I restore the original size of the subplot from ax3 (lower left corner)?
Including the line ax3.set_aspect('auto') seems to have solved the issue.
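A minimal sketch of that fix, assuming only the lower-left subplot from the question is needed to show the effect. Passing aspect='auto' straight to imshow() is an equivalent spelling of calling ax3.set_aspect('auto') afterwards; the before/after variables are only there to compare the subplot position.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import gridspec

fig = plt.figure(figsize=(12, 9))
gs = gridspec.GridSpec(20, 20)
ax3 = fig.add_subplot(gs[12:20, 0:11])

# Position of the subplot (left, bottom, width, height) before adding the image
before = ax3.get_position().bounds

data_2d = np.arange(0, 10, 0.1).reshape(10, 10)
# aspect='auto' lets the image stretch to fill the axes instead of
# shrinking the axes to match the array's 1:1 pixel aspect
im = ax3.imshow(data_2d, cmap='rainbow', aspect='auto')

fig.canvas.draw()
after = ax3.get_position().bounds
```

With aspect='auto' the pixels are no longer square, which is the price for keeping the gridspec layout unchanged.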

Is there a way to plot 2x Standard Deviation in Seaborn?

For Seaborn lineplot, it seems pretty easy to plot the Standard Deviation by specifying ci='sd'. Is there a way to plot 2 times the standard deviation?
For example, I have a graph like this:
sns.lineplot(data=df, ax=x, x='day_of_week', y='y_variable', color='lightgrey', ci='sd')
Is there a way to make it so the "CI" plotted is 2 times the standard deviation?
I didn't find a solution within seaborn, but a workaround is to use matplotlib.pyplot.fill_between, as was done, e.g., in this answer and in the thread suggested in the comments.
Here is my implementation:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme()
flights = sns.load_dataset("flights")
fig, axs = plt.subplots(1, 2, figsize=(12, 6), sharey=True)
sns.lineplot(data=flights, x="year", y = "passengers", ci="sd", ax=axs[0])
axs[0].set_title("seaborn")
nstd = 1.
means = flights.groupby("year")["passengers"].mean()
stds = flights.groupby("year")["passengers"].std()
axs[1].plot(means.index, means.values)
for nstd in range(1, 4):
    axs[1].fill_between(means.index, (means - nstd*stds).values, (means + nstd*stds).values, alpha=0.3, label="nstd={}".format(nstd))
axs[1].legend(loc="upper left")
axs[1].set_title("homemade")  # title the right-hand panel, not axs[0] again
plt.savefig("./tmp/flights.png")
plt.close(fig)
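The resulting figure (not reproduced here) shows the seaborn band on the left and the homemade one-, two- and three-sigma bands on the right.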

making multiple plot at the same time in python3

I have a list and a NumPy array like these two examples:
Neg = [37.972200755611425, 32.14963079785344]
Pos = array([[15.24373185, 13.66099865, 11.86959384, 9.72792045, 7.12928302, 6.04439412],[14.5235007 , 13. , 11.1792871 , 9.14974712, 6.4429435 , 5.04439412]])
Both Neg and Pos have 2 elements (in this example), so I would like to make 2 separate plots (PDF files), one for each element.
Every plot would contain 2 lines:
1- one comes from Pos and is a line plot made of all the elements in the sub-list.
2- one comes from Neg and is a horizontal line on the y-axis.
I am trying to do that in a for loop for all elements at once. To do so, I wrote the following code in Python, but it does not return what I would like to get. Do you know how to fix it?
for i in range(len(Neg)):
    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    ax.plot(concentration, Pos[i], label='gg')
    plt.axhline(y=Neg[i], color='b', linestyle='-')
    ax.legend()
    ax.set_xlabel("log2 concentration")
    ax.set_ylabel("log2 raw counts")
    ax.set_ylim(0, 40)
    plt.savefig(f'{i}.pdf')
Not quite sure exactly what you want but this code creates two subplots of the data in the way I think you're describing it:
import numpy as np
from matplotlib import pyplot as plt
Neg = [37.972200755611425, 32.14963079785344]
Pos = np.array([[15.24373185, 13.66099865, 11.86959384, 9.72792045, 7.12928302, 6.04439412],[14.5235007 , 13. , 11.1792871 , 9.14974712, 6.4429435 , 5.04439412]])
fig = plt.figure()
for i in range(len(Neg)):
    ax = fig.add_subplot(2,1,i+1)
    ax.plot(Pos[i], label='gg')
    plt.axhline(y=Neg[i], color='b', linestyle='-')
    ax.legend()
    ax.set_xlabel("log2 concentration")
    ax.set_ylabel("log2 raw counts")
    ax.set_ylim(0, 40)
    plt.subplots_adjust(hspace=1.0)
    extent = ax.get_window_extent().transformed(fig.dpi_scale_trans.inverted())
    fig.savefig(f'{i}.pdf', bbox_inches=extent.expanded(1.2, 1.9))
Edited the code to save each subplot individually to file by grabbing a specific part of the plot for saving, as used in this question: Save a subplot in matplotlib.
Also included some additional spacing between each subplot by calling subplots_adjust(), so that each subplot can be saved to individual files without any detail from the other subplots being included. This might not be the best way of doing what you want, but I think it will do what you want now.
Alternatively, if you're not set on using subplots, you could always just use a plot per element:
fig = plt.figure()
for i in range(len(Neg)):
    plt.plot(Pos[i], label='gg')
    plt.axhline(y=Neg[i], color='b', linestyle='-')
    plt.legend()
    plt.xlabel("log2 concentration")
    plt.ylabel("log2 raw counts")
    plt.ylim(0, 40)
    fig = plt.gcf()
    fig.savefig(f'{i}.pdf')
    plt.show()
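A variant of the per-element approach that does not depend on plt.show() to start a fresh figure: create the figure inside the loop, save it, then close it. This is only a sketch; the output folder name "plots" and the legend labels are assumptions, and the x values are simply the array indices because the question's concentration variable is not shown.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

Neg = [37.972200755611425, 32.14963079785344]
Pos = np.array([[15.24373185, 13.66099865, 11.86959384, 9.72792045, 7.12928302, 6.04439412],
                [14.5235007, 13.0, 11.1792871, 9.14974712, 6.4429435, 5.04439412]])

out_dir = Path("plots")  # hypothetical output folder
out_dir.mkdir(exist_ok=True)

saved = []
for i, (pos_row, neg_value) in enumerate(zip(Pos, Neg)):
    fig, ax = plt.subplots()  # a new figure for every element
    ax.plot(pos_row, label="Pos")  # line plot of one row of Pos
    ax.axhline(y=neg_value, color="b", linestyle="-", label="Neg")  # horizontal line
    ax.legend()
    ax.set_xlabel("log2 concentration")
    ax.set_ylabel("log2 raw counts")
    ax.set_ylim(0, 40)
    path = out_dir / f"{i}.pdf"
    fig.savefig(path)
    plt.close(fig)  # release the figure so the next iteration starts clean
    saved.append(path)
```

Closing each figure after saving keeps the lines of one element from leaking into the next file, whichever backend is in use.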

Matplotlib how to plot 1 colorbar for four 2d histogram

Before I start, I want to say that I've tried to follow this and this post on the same problem; however, they use imshow heatmaps rather than 2D histograms like I do.
Here is my code(the actual data has been replaced by randomly generated data but the gist is the same):
import matplotlib.pyplot as plt
import numpy as np
def subplots_hist_2d(x_data, y_data, x_labels, y_labels, titles):
    fig, a = plt.subplots(2, 2)
    a = a.ravel()
    for idx, ax in enumerate(a):
        image = ax.hist2d(x_data[idx], y_data[idx], bins=50, range=[[-2, 2],[-2, 2]])
        ax.set_title(titles[idx], fontsize=12)
        ax.set_xlabel(x_labels[idx])
        ax.set_ylabel(y_labels[idx])
        ax.set_aspect("equal")
    cb = fig.colorbar(image[idx])
    cb.set_label("Intensity", rotation=270)
    # pad = how big overall pic is
    # w_pad = how separate they're left to right
    # h_pad = how separate they're top to bottom
    plt.tight_layout(pad=-1, w_pad=-10, h_pad=0.5)
x1, y1 = np.random.uniform(-2, 2, 10000), np.random.uniform(-2, 2, 10000)
x2, y2 = np.random.uniform(-2, 2, 10000), np.random.uniform(-2, 2, 10000)
x3, y3 = np.random.uniform(-2, 2, 10000), np.random.uniform(-2, 2, 10000)
x4, y4 = np.random.uniform(-2, 2, 10000), np.random.uniform(-2, 2, 10000)
x_data = [x1, x2, x3, x4]
y_data = [y1, y2, y3, y4]
x_labels = ["x1", "x2", "x3", "x4"]
y_labels = ["y1", "y2", "y3", "y4"]
titles = ["1", "2", "3", "4"]
subplots_hist_2d(x_data, y_data, x_labels, y_labels, titles)
And this is what it's generating:
So now my problem is that I could not for the life of me make the colorbar apply for all 4 of the histograms. Also for some reason the bottom right histogram seems to behave weirdly compared with the others. In the links that I've posted their methods don't seem to use a = a.ravel() and I'm only using it here because it's the only way that allows me to plot my 4 histograms as subplots. Help?
EDIT:
Thomas Kuhn your new method actually solved all of my problem until I put my labels down and tried to use plt.tight_layout() to sort out the overlaps. It seems that if I put down the specific parameters in plt.tight_layout(pad=i, w_pad=0, h_pad=0) then the colorbar starts to misbehave. I'll now explain my problem.
I have made some changes to your new method so that it suits what I want, like this
def test_hist_2d(x_data, y_data, x_labels, y_labels, titles):
    nrows, ncols = 2, 2
    fig, axes = plt.subplots(nrows, ncols, sharex=True, sharey=True)
    ##produce the actual data and compute the histograms
    mappables=[]
    for (i, j), ax in np.ndenumerate(axes):
        H, xedges, yedges = np.histogram2d(x_data[i][j], y_data[i][j], bins=50, range=[[-2, 2],[-2, 2]])
        ax.set_title(titles[i][j], fontsize=12)
        ax.set_xlabel(x_labels[i][j])
        ax.set_ylabel(y_labels[i][j])
        ax.set_aspect("equal")
        mappables.append(H)
    ##the min and max values of all histograms
    vmin = np.min(mappables)
    vmax = np.max(mappables)
    ##second loop for visualisation
    for ax, H in zip(axes.ravel(), mappables):
        im = ax.imshow(H,vmin=vmin, vmax=vmax, extent=[-2,2,-2,2])
    ##colorbar using solution from linked question
    fig.colorbar(im,ax=axes.ravel())
    plt.show()
    # plt.tight_layout
    # plt.tight_layout(pad=i, w_pad=0, h_pad=0)
Now if I try to generate my data, in this case:
phi, cos_theta = get_angles(runs)
detector_x1, detector_y1, smeared_x1, smeared_y1 = detection_vectorised(1.5, cos_theta, phi)
detector_x2, detector_y2, smeared_x2, smeared_y2 = detection_vectorised(1, cos_theta, phi)
detector_x3, detector_y3, smeared_x3, smeared_y3 = detection_vectorised(0.5, cos_theta, phi)
detector_x4, detector_y4, smeared_x4, smeared_y4 = detection_vectorised(0, cos_theta, phi)
Here detector_x, detector_y, smeared_x and smeared_y are all lists of data points.
So now I put them into 2x2 lists so that they can be unpacked suitably by my plotting function, as such:
data_x = [[detector_x1, detector_x2], [detector_x3, detector_x4]]
data_y = [[detector_y1, detector_y2], [detector_y3, detector_y4]]
x_labels = [["x positions(m)", "x positions(m)"], ["x positions(m)", "x positions(m)"]]
y_labels = [["y positions(m)", "y positions(m)"], ["y positions(m)", "y positions(m)"]]
titles = [["0.5m from detector", "1.0m from detector"], ["1.5m from detector", "2.0m from detector"]]
I now run my code with
test_hist_2d(data_x, data_y, x_labels, y_labels, titles)
with just plt.show() turned on, it gives this:
which is great because data and visual wise, it is exactly what I want i.e. the colormap corresponds to all 4 histograms. However, since the labels are overlapping with the titles, I thought I would just run the same thing but this time with plt.tight_layout(pad=a, w_pad=b, h_pad=c) hoping that I would be able to adjust the overlapping labels problem. However this time it doesn't matter how I change the numbers a, b and c, I always get my colorbar lying on the second column of graphs, like this:
Now changing a only makes the overall subplots bigger or smaller, and the best I could do was to adjust it with plt.tight_layout(pad=-10, w_pad=-15, h_pad=0), which looks like this
So it seems that whatever your new method is doing, it made the whole plot lose its adjustability. Your solution, as wonderful as it is at solving one problem, created another in return. So what would be the best thing to do here?
Edit 2:
Using fig, axes = plt.subplots(nrows, ncols, sharex=True, sharey=True, constrained_layout=True) along with plt.show() gives
As you can see, there's still a vertical gap between the columns of subplots, which not even plt.subplots_adjust() can get rid of.
Edit:
As has been noted in the comments, the biggest problem here is actually to make the colorbar meaningful for many histograms, as ax.hist2d will always scale the histogram data it receives from numpy. It may therefore be best to first calculate the 2d histogram data using numpy and then use imshow to visualise it. This way, the solutions of the linked question can also be applied. To make the problem with the normalisation more visible, I put some effort into producing some qualitatively different 2d histograms using scipy.stats.multivariate_normal, which shows how the height of the histogram can change quite dramatically even though the number of samples is the same in each figure.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec as gs
from scipy.stats import multivariate_normal
##opening figure and axes
nrows=3
ncols=3
fig, axes = plt.subplots(nrows,ncols)
##generate some random data for the distributions
means = np.random.rand(nrows,ncols,2)
sigmas = np.random.rand(nrows,ncols,2)
thetas = np.random.rand(nrows,ncols)*np.pi*2
##produce the actual data and compute the histograms
mappables=[]
for mean,sigma,theta in zip( means.reshape(-1,2), sigmas.reshape(-1,2), thetas.reshape(-1)):
    ##the data (only cosmetics):
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array(((c,-s), (s, c)))
    ##rotate the diagonal covariance; the result is symmetric positive definite
    cov = rot@np.diag(sigma)@rot.T
    rv = multivariate_normal(mean,cov)
    data = rv.rvs(size = 10000)
    ##the 2d histogram from numpy
    H,xedges,yedges = np.histogram2d(data[:,0], data[:,1], bins=50, range=[[-2, 2],[-2, 2]])
    mappables.append(H)
##the min and max values of all histograms
vmin = np.min(mappables)
vmax = np.max(mappables)
##second loop for visualisation
for ax,H in zip(axes.ravel(),mappables):
    im = ax.imshow(H,vmin=vmin, vmax=vmax, extent=[-2,2,-2,2])
##colorbar using solution from linked question
fig.colorbar(im,ax=axes.ravel())
plt.show()
This code produces a figure like this:
Old Answer:
One way to solve your problem is to generate the space for your colorbar explicitly. You can use a GridSpec instance to define how wide your colorbar should be. Below is your subplots_hist_2d() function with a few modifications. Note that your use of tight_layout() shifted the colorbar into a funny place, hence the replacement. If you want the plots closer to each other, I'd rather recommend playing with the aspect ratio of the figure.
def subplots_hist_2d(x_data, y_data, x_labels, y_labels, titles):
    ## fig, a = plt.subplots(2, 2)
    fig = plt.figure()
    g = gs.GridSpec(nrows=2, ncols=3, width_ratios=[1,1,0.05])
    a = [fig.add_subplot(g[n,m]) for n in range(2) for m in range(2)]
    cax = fig.add_subplot(g[:,2])
    ## a = a.ravel()
    for idx, ax in enumerate(a):
        image = ax.hist2d(x_data[idx], y_data[idx], bins=50, range=[[-2, 2],[-2, 2]])
        ax.set_title(titles[idx], fontsize=12)
        ax.set_xlabel(x_labels[idx])
        ax.set_ylabel(y_labels[idx])
        ax.set_aspect("equal")
    ## cb = fig.colorbar(image[-1],ax=a)
    cb = fig.colorbar(image[-1], cax=cax)
    cb.set_label("Intensity", rotation=270)
    # pad = how big overall pic is
    # w_pad = how separate they're left to right
    # h_pad = how separate they're top to bottom
    ## plt.tight_layout(pad=-1, w_pad=-10, h_pad=0.5)
    fig.tight_layout()
Using this modified function, the colorbar gets its own narrow column to the right of the four histograms (the output image is not reproduced here).
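The shared-scale idea from the edited answer can also be sketched with an explicit Normalize object handed to every panel, so that one colorbar is valid for all of them. This is an illustration under assumptions, not the answerer's code: the random samples are made up, pcolormesh is swapped in for imshow so the bin edges are used directly, and constrained_layout is used to make room for the colorbar.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

# Hypothetical data: four point clouds with different spreads
rng = np.random.default_rng(0)
samples = [rng.normal(0.0, s, size=(10000, 2)) for s in (0.3, 0.5, 0.8, 1.2)]

# Compute all histograms first so the global min and max are known
hists = []
for data in samples:
    H, xedges, yedges = np.histogram2d(data[:, 0], data[:, 1],
                                       bins=50, range=[[-2, 2], [-2, 2]])
    hists.append(H)

# One normalisation shared by every panel
norm = Normalize(vmin=min(H.min() for H in hists),
                 vmax=max(H.max() for H in hists))

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, constrained_layout=True)
meshes = []
for ax, H in zip(axes.ravel(), hists):
    # histogram2d puts x along the first axis, pcolormesh expects rows = y
    mesh = ax.pcolormesh(xedges, yedges, H.T, norm=norm)
    ax.set_aspect("equal")
    meshes.append(mesh)

# Any of the meshes can feed the colorbar because they share the same norm
cb = fig.colorbar(meshes[-1], ax=axes.ravel().tolist())
cb.set_label("Counts")
fig.savefig("shared_colorbar.png")
```

Because every panel uses the same norm, a colour means the same count in all four histograms, which is what makes a single colorbar meaningful.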

How to change the location of the symbols/text within a legend box?

I have a subplot with a single legend entry. I am placing the legend at the bottom of the figure and using mode='expand'; however, the single legend entry is placed to the very left of the legend box. To my understanding, changing kwargs such as bbox_to_anchor changes the legend box parameters but not the parameters of the symbols/text within. Below is an example to reproduce my issue.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-10, 10, 21)
y = np.exp(x)
z = x **2
fig, axes = plt.subplots(nrows=1, ncols=2)
axes[0].plot(x, y, color='r', label='exponential')
axes[1].plot(x, z, color='b')
# handles, labels = axes[0].get_legend_handles_labels()
plt.subplots_adjust(bottom=0.125)
fig.legend(mode='expand', loc='lower center')
plt.show()
plt.close(fig)
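This code produces a figure (not reproduced here) in which the single legend entry sits at the far left of the legend box. How can I change the position of the symbol and text such that they are centered in the legend box?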
PS: I am aware that exponential is a bad label for this subplot since it only describes the first subfigure. But, this is just for examples-sake so that I can apply it to my actual use-case.
The legend entries are placed using an HPacker object. This does not allow its contents to be centered. Rather, with mode='expand' the entries are "justified" (similar to the "justify" option in common word processing software).
A workaround would be to create three (or any odd number of) legend entries, such that the desired entry is in the middle. This would be accomplished via the ncol argument and the use of "dummy" entries (which might be transparent and have no associated label).
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-10, 10, 21)
y = np.exp(x)
z = x **2
fig, axes = plt.subplots(nrows=1, ncols=2)
fig.subplots_adjust(bottom=0.125)
l1, = axes[0].plot(x, y, color='r', label='exponential')
axes[1].plot(x, z, color='b')
dummy = plt.Line2D([],[], alpha=0)
fig.legend(handles=[dummy, l1, dummy],
mode='expand', loc='lower center', ncol=3)
plt.show()
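If the legend box does not actually have to span the full figure width, another option (a different technique from the dummy-entry workaround, shown here only as a sketch) is to drop mode='expand'. The box then shrinks to fit its single entry, and loc='lower center' centers the whole box under the subplots.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-10, 10, 21)
y = np.exp(x)
z = x ** 2

fig, axes = plt.subplots(nrows=1, ncols=2)
fig.subplots_adjust(bottom=0.125)
axes[0].plot(x, y, color='r', label='exponential')
axes[1].plot(x, z, color='b')

# Without mode='expand' the legend box is only as wide as its content,
# and 'lower center' places that box in the middle of the figure
legend = fig.legend(loc='lower center')

fig.canvas.draw()  # lay out the figure so the legend position is final
fig.savefig("centered_legend.png")
```

The trade-off is that the legend frame no longer stretches across the figure, so this only helps when the full-width box is not a requirement.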
