I am trying to find a way to move the little multiplier below the x-axis to the top. I have a plot with two x-axes, and the multiplier of the top axis is placed below the bottom x-axis, which I find confusing.
Here is a small example:
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
fig = plt.figure(num=None,figsize=(15, 2.5), dpi=300)
gs = mpl.gridspec.GridSpec(1,1)
ax2 = plt.subplot(gs[0,0])
ax1 = ax2.twiny()
ax1.grid(False)
ax1.set_xlim(0,10000000)
ax2.set_xlim(0,1000000)
ax1.set_ylim([0,100])
ax2.set_ylim([0,100])
plt.show()
Now, if you change ax2.set_xlim(0,1000000) to ax2.set_xlim(0,100000000), then both multipliers are placed below the bottom x-axis. Maybe it is also possible to prevent the multiplier from overlapping with the x-axis tick labels?
My problem with researching this is that I have no idea what this 'multiplier' is actually called.
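For what it's worth, matplotlib calls this label the axis "offset text", and it can be retrieved with `get_offset_text()`. The sketch below reads the string after a draw, hides the automatic label, and writes the same string above the top axis by hand. The manual placement (the `1.25` offset in axes coordinates) is an assumption that will need tuning for other figure sizes, and this is only one possible workaround, not an official recipe.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax2 = plt.subplots(figsize=(15, 2.5))
ax1 = ax2.twiny()
ax1.set_xlim(0, 10000000)
ax2.set_xlim(0, 1000000)

# Draw once so the formatter computes the offset string.
fig.canvas.draw()
offset = ax1.xaxis.get_offset_text()
offset_str = offset.get_text()

# Workaround (an assumption): hide the automatic label and place the
# same string above the top axis; 1.25 is a hand-tuned position.
offset.set_visible(False)
manual = ax1.text(1.0, 1.25, offset_str, transform=ax1.transAxes,
                  ha="right", va="bottom")
```

Because the string is copied once, it would have to be refreshed if the axis limits change afterwards.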
I created a scatter plot in seaborn using seaborn.relplot, but am having trouble putting the legend all in one graph.
When I do it the simple way, everything works fine:
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
df2 = df[df.ln_amt_000s < 700]
sns.relplot(x='ln_amt_000s', y='hud_med_fm_inc', hue='outcome', size='outcome', legend='brief', ax=ax, data=df2)
The result is a scatter plot as desired, with the legend on the right hand side.
However, when I try to generate a matplotlib figure and axes objects ahead of time to specify the figure dimensions I run into problems:
a4_dims = (10, 10) # generating a matplotlib figure and axes objects ahead of time to specify figure dimensions
df2 = df[df.ln_amt_000s < 700]
fig, ax = plt.subplots(figsize = a4_dims)
sns.relplot(x='ln_amt_000s', y='hud_med_fm_inc', hue='outcome', size='outcome', legend='brief', ax=ax, data=df2)
The result is two graphs -- one that has the scatter plots as expected but missing the legend, and another one below it that is all blank except for the legend on the right hand side.
How do I fix this? My desired result is one graph where I can specify the figure dimensions and have the legend at the bottom in two rows, below the x-axis (if that is too difficult, or not supported, then the default legend position to the right on the same graph would work too). I know the problem lies with "ax=ax", and in the way I am specifying the dimensions as a matplotlib figure, but I'd like to know specifically why this causes a problem so I can learn from this.
Thank you for your time.
The issue is that sns.relplot is a "Figure-level interface for drawing relational plots onto a FacetGrid" (see the API page). With a simple sns.scatterplot (the default type of plot used by sns.relplot), your code works (changed to use reproducible data):
df = pd.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv", index_col=0)
fig, ax = plt.subplots(figsize = (5,5))
sns.scatterplot(x = 'Sepal.Length', y = 'Sepal.Width',
hue = 'Species', legend = 'brief',
ax=ax, data = df)
plt.show()
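If you would rather keep sns.relplot, a minimal sketch of sizing its figure is shown here. Since relplot creates its own figure, the size is controlled through its `height` and `aspect` parameters rather than through `ax=`. The data frame and its column names are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Made-up stand-in data; the column names are hypothetical.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 1, 4, 3, 6, 5],
    "group": ["a", "a", "b", "b", "c", "c"],
})

# relplot builds its own figure; size it with height (inches per facet)
# and aspect (width / height) instead of passing ax=.
g = sns.relplot(data=df, x="x", y="y", hue="group", height=5, aspect=1)

fig = g.ax.figure
print(fig.get_size_inches())
```

The printed width may come out larger than `height * aspect`, because seaborn widens the figure to make room for the legend it places outside the axes.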
Further edits to legend
Seaborn's legends are a bit finicky. Some tweaks you may want to employ:
Remove the default seaborn title, which is actually a legend entry, by getting and slicing the handles and labels
Set a new title that is actually a title
Move the location and make use of bbox_to_anchor to move outside the plot area (note that the bbox parameters need some tweaking depending on your plot size)
Specify the number of columns
fig, ax = plt.subplots(figsize = (5,5))
sns.scatterplot(x = 'Sepal.Length', y = 'Sepal.Width',
hue = 'Species', legend = 'brief',
ax=ax, data = df)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles[1:], labels=labels[1:], loc=8,
ncol=2, bbox_to_anchor=[0.5,-.3,0,0])
plt.show()
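The same placement idea can be tried with plain matplotlib, without seaborn's extra legend entry. This is a sketch with made-up points standing in for the iris data; the `-0.15` anchor and the `bottom=0.3` margin are guesses that depend on the figure size.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(5, 5))

# Made-up sample points, two per species, just to get three legend entries.
for name, xs, ys in [("setosa", [5.0, 4.8], [3.4, 3.0]),
                     ("versicolor", [6.0, 5.7], [2.7, 2.9]),
                     ("virginica", [6.5, 7.1], [3.0, 3.1])]:
    ax.scatter(xs, ys, label=name)

# Anchor the legend's upper center just below the axes; with three
# entries and ncol=2 it wraps onto two rows.
legend = ax.legend(title="Species", loc="upper center",
                   bbox_to_anchor=(0.5, -0.15), ncol=2)

# Leave room at the bottom of the figure for the legend.
fig.subplots_adjust(bottom=0.3)
```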
So I have two lists, one containing a bunch of years and the other containing some integers; each list has 17 values.
When I make a simple bar chart
plt.bar(keys,values)
plt.show()
the x-axis of the graph only shows some of the years in the keys list, e.g. the graph only has 2000,2002,2005,2007,2010,2012,2015. It has missed 2001,2003,2006,2008,2009 etc.
Is it because there is a maximum number of keys allowed in a bar chart, so it randomly took some keys?
If not, how do I fix this?
There is a maximum amount of ticklabels on a matplotlib axes. This limit however lies well above 1000 and you would first run into severe lags when creating the figure.
The usual automatic ticking by matplotlib equips the axes with only as many labels as needed. I.e. if you plot 50 points on a plot, you would not want to have 50 labels as well. Further, if you plot a point at 0.853164, you would not want such an odd number displayed as a ticklabel on the axes.
Now, I cannot think of any reason matplotlib would produce the labels you report, 2000,2002,2005,2007,2010,2012,2015, because the automatic locator for the ticks chooses equidistant points on the axes. For any help with this specific problem, an MCVE would be needed.
But in general there are two concepts from which you may choose.
Numerical axes
When plotting numbers, matplotlib will by default choose a linear axes and tick it automatically as described above.
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(2000, 2017)
y = np.random.randint(5,21, size=len(x))
plt.bar(x,y)
plt.show()
In this case an equidistant ticking of 2.5 units is chosen to have 7 nicely spaced ticks on the axes. If instead you would want to have every bar ticked, you could use a custom ticker.
E.g. a MultipleLocator with the interval set to 1,
import matplotlib.ticker as mticker
plt.gca().xaxis.set_major_locator(mticker.MultipleLocator(1))
plt.gca().tick_params(axis="x", rotation=90)
Or, a FixedLocator with the locations set to the x values of the bars,
import matplotlib.ticker as mticker
plt.gca().xaxis.set_major_locator(mticker.FixedLocator(x))
plt.gca().tick_params(axis="x", rotation=90)
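To see what the locators actually do, one can print the tick positions before and after swapping the locator. This sketch uses placeholder bar heights and runs on a headless backend; the exact automatic step can differ with figure size and matplotlib version, so no particular value is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

x = np.arange(2000, 2017)
y = np.arange(1, len(x) + 1)  # placeholder bar heights

fig, ax = plt.subplots()
ax.bar(x, y)

# Positions chosen by the default automatic locator.
auto_ticks = ax.get_xticks()
auto_steps = np.diff(auto_ticks)

# Replace it with a locator that puts one tick on every bar.
ax.xaxis.set_major_locator(mticker.FixedLocator(x))
ax.tick_params(axis="x", rotation=90)
fixed_ticks = ax.get_xticks()

print("automatic:", auto_ticks)
print("fixed:    ", fixed_ticks)
```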
Categorical axes
You may also decide that your x-axis shall be categorical. This means that every unique value gets its own tick and those ticks are equally spaced, independent of their value. This is most easily accomplished by converting the numbers to strings.
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(2000, 2017)
y = np.random.randint(5,21, size=len(x))
cats = list(map(str, x))
plt.bar(cats,y)
plt.gca().tick_params(axis="x", rotation=90)
plt.show()
The result is visually the same as above, but this time, the number 2000 is not at location 2000, but at its index 0, 2001 is at 1 and so on.
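The difference in positions can be checked directly by looking at where the bar centers end up. This is a small sketch with placeholder heights; `centers` is a helper defined only for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

years = np.arange(2000, 2017)
heights = np.arange(1, len(years) + 1)  # placeholder bar heights

fig, (ax_num, ax_cat) = plt.subplots(1, 2)
num_bars = ax_num.bar(years, heights)                    # numerical axis
cat_bars = ax_cat.bar([str(v) for v in years], heights)  # categorical axis

def centers(bars):
    """Return the x position of the center of each bar."""
    return [bar.get_x() + bar.get_width() / 2 for bar in bars]

num_centers = centers(num_bars)  # bars sit at the year values
cat_centers = centers(cat_bars)  # bars sit at their list index
print(num_centers[:3])
print(cat_centers[:3])
```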
You can show all the ticks in this way:
plt.xticks(np.arange(min(keys), max(keys)+1, 1.0), rotation=45)
Example:
keys = [2000, 2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016]
values = range(1,18)
import matplotlib.pyplot as plt
import numpy as np
plt.bar(keys,values)
plt.xticks(np.arange(min(keys), max(keys)+1, 1.0), rotation=45)
plt.show()
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
d = ['d1','d2','d3','d4','d5','d6']
value = [111111, 222222, 333333, 444444, 555555, 666666]
y_cumsum = np.cumsum(value)
sns.barplot(x=d, y=value)  # keyword arguments; newer seaborn rejects positional x, y
sns.pointplot(x=d, y=y_cumsum)
plt.show()
I'm trying to make a Pareto diagram with barplot and pointplot, but I can't print percentages on the right-side y-ticks. Also, if I manipulate the yticks, they overlap.
plt.yticks([1,2,3,4,5])
overlaps like in the image.
Edit: I mean that I want quarter percentages (0, 25%, 50%, 75%, 100%) on the right-hand side of the graphic as well.
From what I understood, you want to show the percentages on the right hand side of your figure. To do that, we can create a second y axis using twinx(). All we need to do then is to set the limits of this second axis appropriately, and set some custom labels:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
d = ['d1','d2','d3','d4','d5','d6']
value = [111111, 222222, 333333, 444444, 555555, 666666]
fig, ax = plt.subplots()
ax2 = ax.twinx() # create a second y axis
y_cumsum = np.cumsum(value)
sns.barplot(x=d, y=value, ax=ax)  # keyword arguments; newer seaborn rejects positional x, y
sns.pointplot(x=d, y=y_cumsum, ax=ax)
y_max = y_cumsum.max() # maximum of the array
# find the percentages of the max y values.
# This will be where the "0%, 25%" labels will be placed
ticks = [0, 0.25*y_max, 0.5*y_max, 0.75*y_max, y_max]
ax2.set_ylim(ax.get_ylim()) # set second y axis to have the same limits as the first y axis
ax2.set_yticks(ticks)
ax2.set_yticklabels(["0%", "25%","50%","75%","100%"]) # set the labels
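ax2.grid(False) # turn off the grid of the second axis (the string "off" is not a valid switch in current matplotlib)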
plt.show()
This produces the following figure:
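As a variation on the answer above, the percent labels can also be generated by a formatter instead of being typed by hand. This is a sketch that assumes a matplotlib version which ships `matplotlib.ticker.PercentFormatter`, and it uses plain matplotlib bars and lines in place of the seaborn calls.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

d = ['d1', 'd2', 'd3', 'd4', 'd5', 'd6']
value = [111111, 222222, 333333, 444444, 555555, 666666]
y_cumsum = np.cumsum(value)
y_max = y_cumsum[-1]  # the cumulative total corresponds to 100%

fig, ax = plt.subplots()
ax.bar(d, value)
ax.plot(d, y_cumsum, marker="o", color="black")

ax2 = ax.twinx()               # second y axis on the right
ax2.set_ylim(ax.get_ylim())    # same data range as the left axis
ax2.set_yticks(np.linspace(0, y_max, 5))
# Express each tick as a percentage of y_max, without decimals.
ax2.yaxis.set_major_formatter(mticker.PercentFormatter(xmax=y_max, decimals=0))

fig.canvas.draw()
pct_labels = [t.get_text() for t in ax2.get_yticklabels()]
print(pct_labels)
```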
I would like to add a figure description to the bottom of my group of subplots. Is there a built-in way to do this, or do I have to keep messing with text() to get it placed correctly?
This will put the label centered and 15 pixels above the bottom of the figure.
import matplotlib.pyplot as plt
import matplotlib.transforms as mtrans
fig, ax = plt.subplots(2, 2)
trans = mtrans.blended_transform_factory(fig.transFigure,
mtrans.IdentityTransform())
txt = fig.text(.5, 15, "total label", ha='center')
txt.set_transform(trans)
See http://matplotlib.org/users/transforms_tutorial.html for more on how to work with transforms.
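To check what the blended transform does with the coordinates, one can push a point through it by hand. In this sketch the x value is read as a fraction of the figure width and the y value is passed through unchanged as pixels; the variable names `x_px` and `y_px` are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.transforms as mtrans

fig, axes = plt.subplots(2, 2)
trans = mtrans.blended_transform_factory(fig.transFigure,
                                         mtrans.IdentityTransform())

# x = 0.5 is a fraction of the figure width, y = 15 is already in pixels.
x_px, y_px = trans.transform((0.5, 15))
print(x_px, y_px)

txt = fig.text(0.5, 15, "total label", ha="center")
txt.set_transform(trans)
```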
I have the results of a (H,ranges) = numpy.histogram2d() computation and I'm trying to plot it.
Given H I can easily put it into plt.imshow(H) to get the corresponding image. (see http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.imshow )
My problem is that the axes of the produced image show the "cell counts" of H and are completely unrelated to the values of ranges.
I know I can use the keyword extent (as pointed out in: Change values on matplotlib imshow() graph axis). But this solution does not work for me: my values in ranges do not grow linearly (they actually grow exponentially).
My question is: how can I put the values of ranges into plt.imshow()? Or at least, can I manually set the label values of the object that plt.imshow returns?
Editing the extent is not a good solution.
You can just change the tick labels to something more appropriate for your data.
For example, here we'll set every 5th pixel to an exponential function:
import numpy as np
import matplotlib.pyplot as plt
im = np.random.rand(21,21)
fig,(ax1,ax2) = plt.subplots(1,2)
ax1.imshow(im)
ax2.imshow(im)
# Where we want the ticks, in pixel locations
ticks = np.linspace(0,20,5)
# What those pixel locations correspond to in data coordinates.
# Also set the float format here
ticklabels = ["{:6.2f}".format(i) for i in np.exp(ticks/5)]
ax2.set_xticks(ticks)
ax2.set_xticklabels(ticklabels)
ax2.set_yticks(ticks)
ax2.set_yticklabels(ticklabels)
plt.show()
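If the labels should follow the ticks when zooming or panning, a formatter function can compute them on the fly instead of fixing them once. This is a sketch under the same assumed mapping as above (pixel index p corresponds to exp(p/5)); `pixel_to_value` is a hypothetical helper name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

im = np.random.rand(21, 21)

def pixel_to_value(pixel, pos=None):
    """Label a pixel index with the (assumed) exponential bin value."""
    return "{:.2f}".format(np.exp(pixel / 5))

fig, ax = plt.subplots()
ax.imshow(im)
for axis in (ax.xaxis, ax.yaxis):
    # One tick every 5 pixels, labelled through the mapping above.
    axis.set_major_locator(mticker.MultipleLocator(5))
    axis.set_major_formatter(mticker.FuncFormatter(pixel_to_value))
```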
Expanding a bit on @thomas's answer
import numpy as np
import matplotlib.pyplot as plt
im = np.random.rand(20, 20)
# 21 exponentially spaced cell edges: one more than the number of cells per side
edges = np.exp(np.linspace(0, 10, 21))
fig, ax = plt.subplots()
ax.pcolor(edges, edges, im, cmap='viridis')
ax.set_yscale('log')
ax.set_xscale('log')
ax.set_xlim([1, np.exp(10)])
ax.set_ylim([1, np.exp(10)])
plt.show()
By letting mpl take care of the non-linear mapping you can now accurately over-plot other artists. There is a performance hit for this (as pcolor is more expensive to draw than AxesImage), but getting accurate ticks is worth it.
imshow is for displaying images, so it does not support x and y bins.
You could either use pcolor instead,
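# x and y are the 1-D sample arrays being binned
H, xedges, yedges = np.histogram2d(x, y)
# histogram2d puts x along the first axis of H, so transpose for plotting
plt.pcolor(xedges, yedges, H.T)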
or use plt.hist2d which directly plots your histogram.
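A minimal sketch of the hist2d route with exponentially spaced bins follows. The log-normal samples and the bin range are made up for illustration; the comparison with np.histogram2d at the end only checks that both give the same counts.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data with a wide dynamic range.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=2.0, sigma=1.0, size=1000)
y = rng.lognormal(mean=2.0, sigma=1.0, size=1000)

# 21 exponentially spaced edges give 20 bins per axis.
edges = np.logspace(-1, 3, 21)

fig, ax = plt.subplots()
counts, xedges, yedges, image = ax.hist2d(x, y, bins=[edges, edges])
ax.set_xscale("log")
ax.set_yscale("log")

# Same binning done directly with numpy, for comparison.
H, _, _ = np.histogram2d(x, y, bins=[edges, edges])
```

Because the bin edges are passed in data coordinates, the axis ticks show the real values instead of cell indices.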