Take control of Seaborn marginal histograms? - python-3.x

Question 1:
How do I remove excess space in the plot, when plotting marginals? Answered below in first post.
Question 2:
How do I get finer control over the marginal histogram plots, e.g. to plot both a histogram and a KDE and choose the KDE parameters for the marginals? Answered below in the second post, with JointGrid.
#!/usr/bin/env python3
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
sns.set_palette("viridis")
sns.set(style="white", color_codes=True)
x = np.random.normal(0, 1, 1000)
y = np.random.normal(5, 1, 1000)
df = pd.DataFrame({"x":x, "y":y})
g = sns.jointplot(df["x"],df["y"], bw=0.15, shade=True, xlim=(-3,3), ylim=(2,8),cmap="coolwarm", kind="kde", stat_func=None)
# plt.tight_layout() # This will override seaborn parameters. Remember to exclude.
plt.show()

jointplot has a space parameter that determines the space between the mainplot and the marginplots.
Running this code:
g = sns.jointplot(df["x"],df["y"], bw=0.15, shade=True, xlim=(-3,3),
ylim=(2,8),cmap="coolwarm", kind="kde",
stat_func=None, space = 0)
plt.show()
results in this plot for me:
Please note that running with plt.tight_layout() will overrule the space argument for jointplot.
Edit:
To further specify the parameters of the marginal plot you can use marginal_kws. You must pass a dictionary that specifies the parameters of the kind of marginal plot you use.
In your example you use the kde plot as marginal plots. So I will continue to use that as an example:
Here I show how to change the kernel used to make the marginal plots.
g = sns.jointplot(df["x"],df["y"], bw=0.15, shade=True, xlim=(-3,3),
ylim=(2,8),cmap="coolwarm", kind="kde",
stat_func=None, space = 0, marginal_kws={'kernel': 'epa'})
plt.show()
resulting in this graph:
You can pass any parameter accepted by the kde plot as a key in the dictionary, with the desired setting for that parameter as the value for that key.

Okay, so I'm going to go ahead and post an extra answer myself. It's not entirely apparent to me which parameters the extra marginal_kws can control. Instead, it might be more intuitive to build the plot layer-by-layer (especially coming from ggplot) using JointGrid:
g = sns.JointGrid(x="x", y="y", data=df) # Initiate multi-plot
g.plot_joint(sns.kdeplot) # Plot the center x/y plot as sns.kdeplot
g.plot_marginals(sns.distplot, kde=True) # Plot the edges as sns.distplot (histogram), where kde can be set to True

Related

matplotlib histogram bins not reflecting data [duplicate]

I can't figure out how to rotate the text on the X axis. It's a timestamp, so as the number of samples increases, the labels get closer and closer until they overlap. I'd like to rotate the text 90 degrees so that the labels don't overlap as the samples get closer together.
Below is what I have, it works fine with the exception that I can't figure out how to rotate the X axis text.
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import datetime
font = {'family' : 'normal',
'weight' : 'bold',
'size' : 8}
matplotlib.rc('font', **font)
values = open('stats.csv', 'r').readlines()
time = [datetime.datetime.fromtimestamp(float(i.split(',')[0].strip())) for i in values[1:]]
delay = [float(i.split(',')[1].strip()) for i in values[1:]]
plt.plot(time, delay)
plt.grid(True)
plt.savefig('test.png')
This works for me:
plt.xticks(rotation=90)
Many "correct" answers here but I'll add one more since I think some details are left out of several. The OP asked for 90 degree rotation but I'll change to 45 degrees because when you use an angle that isn't zero or 90, you should change the horizontal alignment as well; otherwise your labels will be off-center and a bit misleading (and I'm guessing many people who come here want to rotate axes to something other than 90).
Easiest / Least Code
Option 1
plt.xticks(rotation=45, ha='right')
As mentioned previously, that may not be desirable if you'd rather take the Object Oriented approach.
Option 2
Another fast way (it's intended for date objects but seems to work on any label; doubt this is recommended though):
fig.autofmt_xdate(rotation=45)
fig you would usually get from:
fig = plt.gcf()
fig = plt.figure()
fig, ax = plt.subplots()
fig = ax.figure
Object-Oriented / Dealing directly with ax
Option 3a
If you have the list of labels:
labels = ['One', 'Two', 'Three']
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(labels, rotation=45, ha='right')
In later versions of Matplotlib (3.5+), you can just use set_xticks alone:
ax.set_xticks([1, 2, 3], labels, rotation=45, ha='right')
Option 3b
If you want to get the list of labels from the current plot:
# Unfortunately you need to draw your figure first to assign the labels,
# otherwise get_xticklabels() will return empty strings.
plt.draw()
ax.set_xticks(ax.get_xticks())
ax.set_xticklabels(ax.get_xticklabels(), rotation=45, ha='right')
As above, in later versions of Matplotlib (3.5+), you can just use set_xticks alone:
ax.set_xticks(ax.get_xticks(), ax.get_xticklabels(), rotation=45, ha='right')
Option 4
Similar to above, but loop through manually instead.
for label in ax.get_xticklabels():
    label.set_rotation(45)
    label.set_ha('right')
Option 5
We still use pyplot (as plt) here but it's object-oriented because we're changing the property of a specific ax object.
plt.setp(ax.get_xticklabels(), rotation=45, ha='right')
Option 6
This option is simple, but AFAIK you can't set label horizontal align this way so another option might be better if your angle is not 90.
ax.tick_params(axis='x', labelrotation=45)
Edit:
There's discussion of this exact "bug" but a fix hasn't been released (as of 3.4.0):
https://github.com/matplotlib/matplotlib/issues/13774
Easy way
As described here, there is an existing method on the matplotlib Figure class that automatically rotates date labels appropriately for your figure.
You can call it after you plot your data (i.e. after ax.plot(dates, ydata)):
fig.autofmt_xdate()
If you need to format the labels further, check out the above link.
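For reference, here is a self-contained sketch of the autofmt_xdate approach. The timestamps and delay values are invented for illustration; in the question they would come from stats.csv.

```python
import datetime
import matplotlib
matplotlib.use("Agg")  # headless backend, as in the question
import matplotlib.pyplot as plt

# Made-up samples: one reading every 10 minutes
start = datetime.datetime(2024, 1, 1)
times = [start + datetime.timedelta(minutes=10 * i) for i in range(50)]
delay = [0.1 * (i % 7) for i in range(50)]

fig, ax = plt.subplots()
ax.plot(times, delay)
# Rotates the x tick labels, right-aligns them and leaves room at the bottom
fig.autofmt_xdate(rotation=45, ha="right")
```

Calling it after plotting matters, because the method works on the tick labels that exist at that point.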
Non-datetime objects
As per languitar's comment, the method I suggested for non-datetime xticks would not update correctly when zooming, etc. If it's not a datetime object used as your x-axis data, you should follow Tommy's answer:
for tick in ax.get_xticklabels():
    tick.set_rotation(45)
Try pyplot.setp. I think you could do something like this:
x = range(len(time))
plt.xticks(x, time)
locs, labels = plt.xticks()
plt.setp(labels, rotation=90)
plt.plot(x, delay)
Apart from
plt.xticks(rotation=90)
this is also possible:
plt.xticks(rotation='vertical')
I came up with a similar example. Again, the rotation keyword is.. well, it's key.
from pylab import *
fig = figure()
ax = fig.add_subplot(111)
ax.bar( [0,1,2], [1,3,5] )
ax.set_xticks( [ 0.5, 1.5, 2.5 ] )
ax.set_xticklabels( ['tom','dick','harry'], rotation=45 ) ;
If you want to apply rotation on the axes object, the easiest way is using tick_params. For example.
ax.tick_params(axis='x', labelrotation=90)
Matplotlib documentation reference here.
This is useful when you have an array of axes as returned by plt.subplots. It is more convenient than using set_xticks, because in that case you also need to set the tick labels, and more convenient than the approaches that iterate over the ticks.
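A short sketch of that use case, assuming a 2x2 grid with made-up bar data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

names = ["alpha", "beta", "gamma", "delta"]  # illustrative category labels

fig, axes = plt.subplots(2, 2, figsize=(6, 5))
for i, ax in enumerate(axes.flat):
    ax.bar(names, [i + 1, i + 2, i + 3, i + 4])
    # One call per Axes; no need to touch tick positions or label text
    ax.tick_params(axis="x", labelrotation=90)
fig.tight_layout()
```

Because tick_params changes the tick style rather than individual Text objects, the rotation should also apply to ticks created later, for example after zooming.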
If using plt:
plt.xticks(rotation=90)
In case of using pandas or seaborn to plot, assuming ax as axes for the plot:
ax.set_xticklabels(ax.get_xticklabels(), rotation=90)
Another way of doing the above:
for tick in ax.get_xticklabels():
    tick.set_rotation(45)
My answer is inspired by cjohnson318's answer, but I didn't want to supply a hardcoded list of labels; I wanted to rotate the existing labels:
for tick in ax.get_xticklabels():
    tick.set_rotation(45)
The simplest solution is to use:
plt.xticks(rotation=XX)
but also
# Tweak spacing to prevent clipping of tick-labels
plt.subplots_adjust(bottom=X.XX)
e.g. for dates I used rotation=45 and bottom=0.20, but you should experiment to find values that suit your data
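Put together with the concrete values mentioned above, a minimal sketch might look like this. The long string labels are placeholders, chosen only so that clipping would be visible without the adjustment.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

labels = ["2024-01-%02d 12:00" % day for day in range(1, 11)]  # placeholder labels
values = list(range(10))

fig, ax = plt.subplots()
ax.plot(labels, values)
plt.xticks(rotation=45, ha="right")
# Reserve 20% of the figure height below the axes for the rotated labels
plt.subplots_adjust(bottom=0.20)
```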
import pylab as pl
pl.xticks(rotation = 90)
To rotate the x-axis labels by 90 degrees:
for tick in ax.get_xticklabels():
    tick.set_rotation(90)
It will depend on what you are plotting.
import matplotlib.pyplot as plt
x=['long_text_for_a_label_a',
'long_text_for_a_label_b',
'long_text_for_a_label_c']
y=[1,2,3]
myplot = plt.plot(x,y)  # plt.plot returns a list of Line2D objects
for item in myplot[0].axes.get_xticklabels():
    item.set_rotation(90)
For pandas and seaborn that give you an Axes object:
import pandas as pd
import seaborn as sns
df = pd.DataFrame({"value": y}, index=x)
# pandas
myplot = df.plot.bar()
# seaborn
myplotsns = sns.barplot(x=df.index, y="value", data=df)
# you can call get_xticklabels() without .axes because the object is already
# an Axes instance
for item in myplot.get_xticklabels():
    item.set_rotation(90)
If you rotate labels you may need to change the font size too; with seaborn you can set font_scale (for example sns.set(font_scale=1.0)) to do that.

Is there a maximum amount of ticklabels in a matplotlib axes?

So I have two lists one containing a bunch of years and the other one containing some integers, each list has 17 values.
when I make a simple bar chart
plt.bar(keys,values)
plt.show()
in the X axis of the graph it only contains some of the years in the keys list eg: the graph only has 2000,2002,2005,2007,2010,2012,2015. It has missed 2001,2003,2006,2008,2009 etc.
Is it because there is a maximum amount of keys allowed in the bar chart so it randomly took some keys?
If not how do i fix this?
There is a maximum amount of ticklabels on a matplotlib axes. This limit however lies well above 1000 and you would first run into severe lags when creating the figure.
The usual automatic ticking by matplotlib is that the axes are equipped with just as many labels as needed. I.e. if you plot 50 points on a plot, you would not want to have 50 labels as well. Further if you plot a point at 0.853164 you would not want to have such odd number being displayed as ticklabel on the axes.
Now, I cannot think of any reason matplotlib would produce the labels you report about, 2000,2002,2005,2007,2010,2012,2015, because the automatic locator for the ticks chooses equidistant points on the axes. For any help with this specific problem, a MCVE would be needed.
But in general there are two concepts from which you may choose.
Numerical axes
When plotting numbers, matplotlib will by default choose a linear axes and tick it automatically as described above.
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(2000, 2017)
y = np.random.randint(5,21, size=len(x))
plt.bar(x,y)
plt.show()
In this case an equidistant ticking of 2.5 units is chosen to have 7 nicely spaced ticks on the axes. If instead you would want to have every bar ticked, you could use a custom ticker.
E.g. a MultipleLocator with the interval set to 1,
import matplotlib.ticker as mticker
plt.gca().xaxis.set_major_locator(mticker.MultipleLocator(1))
plt.gca().tick_params(axis="x", rotation=90)
Or, a FixedLocator with the locations set to the x values of the bars,
import matplotlib.ticker as mticker
plt.gca().xaxis.set_major_locator(mticker.FixedLocator(x))
plt.gca().tick_params(axis="x", rotation=90)
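The same locator approach can also be written against an explicit Axes object, which makes it easy to check that every year really gets a tick. This is only a sketch; the bar heights are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

x = np.arange(2000, 2017)   # 17 years
y = np.arange(5, 22)        # placeholder bar heights, one per year

fig, ax = plt.subplots()
ax.bar(x, y)
ax.xaxis.set_major_locator(mticker.MultipleLocator(1))  # one tick per integer
ax.tick_params(axis="x", labelrotation=90)

ticks = ax.get_xticks()     # tick positions produced by the locator
```

The locator may also return a tick or two just outside the bars, which is harmless because those fall in the axis margins.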
Categorical axes
You may also decide that your x-axis should be categorical. This means that every unique value gets its own tick and those ticks are equally spaced, independent of their value. This is most easily accomplished by converting the numbers to strings.
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(2000, 2017)
y = np.random.randint(5,21, size=len(x))
cats = list(map(str, x))
plt.bar(cats,y)
plt.gca().tick_params(axis="x", rotation=90)
plt.show()
The result is visually the same as above, but this time, the number 2000 is not at location 2000, but at its index 0, 2001 is at 1 and so on.
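To make the index-based positioning concrete, the following sketch reads the bar centres and tick positions back from a categorical bar chart. The bar heights are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(2000, 2017)
cats = [str(v) for v in x]            # strings make the axis categorical

fig, ax = plt.subplots()
bars = ax.bar(cats, np.ones(len(cats)))

# Centre of each bar in data coordinates: the category index, not the year
positions = [bar.get_x() + bar.get_width() / 2 for bar in bars]
tick_positions = ax.get_xticks()
```

This is why mixing categorical and numeric data on one axis can give surprising results: a later ax.plot(2005, 1) would land far to the right of all the bars.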
You can show all the ticks in this way:
plt.xticks(np.arange(min(keys), max(keys)+1, 1.0), rotation=45)
Example:
import numpy as np
import matplotlib.pyplot as plt
keys = [2000, 2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016]
values = range(1,18)
plt.bar(keys,values)
plt.xticks(np.arange(min(keys), max(keys)+1, 1.0), rotation=45)
plt.show()

Matplotlib - Axes collision warning when setting aspect ratio

I am using matplotlib to plot a hexbin. As a simple example-
import matplotlib.pyplot as plt
import numpy as np
x = np.random.rand(100)
y = np.random.rand(100)
plt.hexbin(x, y, gridsize = 15, cmap='inferno')
plt.gca().invert_yaxis() # To make top left corner as origin
plt.axes().set_aspect('equal', 'datalim')
plt.show()
I get the following warning-
"MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance."
I think it is due to the line-
plt.axes().set_aspect('equal', 'datalim')
How can I use different arguments in this case? The version of matplotlib is 2.1.1.
It doesn't seem like you want to create a new axes anyways. So don't use plt.axes() here. Instead get the current axes in the usual way (plt.gca()) and use any of its methods.
plt.gca().set_aspect('equal', 'datalim')
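One way to sidestep the question of which axes is current is to create the Axes explicitly and call everything on it. A sketch of the hexbin example in that style:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100)
y = rng.random(100)

fig, ax = plt.subplots()             # exactly one Axes is created here
ax.hexbin(x, y, gridsize=15, cmap="inferno")
ax.invert_yaxis()                    # origin in the top-left corner
ax.set_aspect("equal", "datalim")    # same Axes, so nothing new is added
```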

Python matplotlib graphing [duplicate]

I need help with setting the limits of y-axis on matplotlib. Here is the code that I tried, unsuccessfully.
import matplotlib.pyplot as plt
plt.figure(1, figsize = (8.5,11))
plt.suptitle('plot title')
ax = []
aPlot = plt.subplot(321, axisbg = 'w', title = "Year 1")
ax.append(aPlot)
plt.plot(paramValues,plotDataPrice[0], color = '#340B8C',
marker = 'o', ms = 5, mfc = '#EB1717')
plt.xticks(paramValues)
plt.ylabel('Average Price')
plt.xlabel('Mark-up')
plt.grid(True)
plt.ylim((25,250))
With the data I have for this plot, I get y-axis limits of 20 and 200. However, I want the limits 20 and 250.
Get current axis via plt.gca(), and then set its limits:
ax = plt.gca()
ax.set_xlim([xmin, xmax])
ax.set_ylim([ymin, ymax])
One thing you can do is to set your axis range by yourself by using matplotlib.pyplot.axis.
matplotlib.pyplot.axis
from matplotlib import pyplot as plt
plt.axis([0, 10, 0, 20])
0,10 is for x axis range.
0,20 is for y axis range.
or you can also use matplotlib.pyplot.xlim or matplotlib.pyplot.ylim
matplotlib.pyplot.ylim
plt.ylim(-2, 2)
plt.xlim(0,10)
Another workaround is to get the plot's axes and reassign changing only the y-values:
x1,x2,y1,y2 = plt.axis()
plt.axis((x1,x2,25,250))
You can instantiate an object from matplotlib.pyplot.axes and call the set_ylim() on it. It would be something like this:
import matplotlib.pyplot as plt
axes = plt.axes()
axes.set_ylim([0, 1])
Just for fine-tuning: if you want to set only one of the boundaries of the axis and leave the other boundary unchanged, you can use one or more of the following statements:
plt.xlim(right=xmax) #xmax is your value
plt.xlim(left=xmin) #xmin is your value
plt.ylim(top=ymax) #ymax is your value
plt.ylim(bottom=ymin) #ymin is your value
Take a look at the documentation for xlim and for ylim
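The same one-sided update is available on an Axes object. A small sketch with made-up data, showing that the lower limit is left alone:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [20, 120, 200])   # illustrative data

bottom_before, top_before = ax.get_ylim()
ax.set_ylim(top=250)                 # only the upper limit is changed
bottom_after, top_after = ax.get_ylim()
```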
This worked at least in matplotlib version 2.2.2:
plt.axis([None, None, 0, 100])
Probably this is a nice way to set up for example xmin and ymax only, etc.
To add to @Hima's answer, if you want to modify the current x or y limits you could use the following.
import numpy as np  # you probably already do this, so no extra overhead
import matplotlib.pyplot as plt
fig, axes = plt.subplots()
axes.plot(data[:,0], data[:,1])  # data is your own (N, 2) array
xlim = axes.get_xlim()
# example of how to zoom out by 10%
factor = 0.1
new_xlim = (xlim[0] + xlim[1])/2 + np.array((-0.5, 0.5)) * (xlim[1] - xlim[0]) * (1 + factor)
axes.set_xlim(new_xlim)
I find this particularly useful when I want to zoom out or zoom in just a little from the default plot settings.
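The idea can be wrapped in a small helper. The function name zoom_x and the starting limits are hypothetical, introduced only for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def zoom_x(ax, factor):
    """Widen (factor > 0) or narrow (factor < 0) the x-range around its centre."""
    lo, hi = ax.get_xlim()
    center = (lo + hi) / 2
    half_width = (hi - lo) / 2 * (1 + factor)
    ax.set_xlim(center - half_width, center + half_width)


fig, ax = plt.subplots()
ax.set_xlim(0, 10)
zoom_x(ax, 0.1)           # 10% wider: centre 5, half-width 5.5
new_lo, new_hi = ax.get_xlim()
```

A negative factor zooms in by the same logic, and an analogous helper for the y-axis would use get_ylim and set_ylim.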
This should work. Your code works for me, like for Tamás and Manoj Govindan. It looks like you could try to update Matplotlib. If you can't update Matplotlib (for instance if you have insufficient administrative rights), maybe using a different backend with matplotlib.use() could help.

Matplotlib: personalize imshow axis

I have the results of a (H,ranges) = numpy.histogram2d() computation and I'm trying to plot it.
Given H I can easily put it into plt.imshow(H) to get the corresponding image. (see http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.imshow )
My problem is that the axes of the produced image show the "cell count" of H and are completely unrelated to the values of ranges.
I know I can use the keyword extent (as pointed out in: Change values on matplotlib imshow() graph axis). But this solution does not work for me: my values in ranges do not grow linearly (they actually grow exponentially).
My question is: how can I pass the values of ranges to plt.imshow()? Or, at least, can I manually set the tick label values of the object returned by plt.imshow?
Editing the extent is not a good solution.
You can just change the tick labels to something more appropriate for your data.
For example, here we'll set every 5th pixel to an exponential function:
import numpy as np
import matplotlib.pyplot as plt
im = np.random.rand(21,21)
fig,(ax1,ax2) = plt.subplots(1,2)
ax1.imshow(im)
ax2.imshow(im)
# Where we want the ticks, in pixel locations
ticks = np.linspace(0,20,5)
# What those pixel locations correspond to in data coordinates.
# Also set the float format here
ticklabels = ["{:6.2f}".format(i) for i in np.exp(ticks/5)]
ax2.set_xticks(ticks)
ax2.set_xticklabels(ticklabels)
ax2.set_yticks(ticks)
ax2.set_yticklabels(ticklabels)
plt.show()
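Applied to an actual histogram2d result, the tick labels can be taken from the bin edges instead of a formula. This is a sketch under the assumption of log-spaced bins from 1 to 100 and made-up samples. With imshow, pixel i covers the range i - 0.5 to i + 0.5, so bin edge k sits at position k - 0.5.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.exp(rng.uniform(0, np.log(100), 5000))  # made-up samples in [1, 100)
y = np.exp(rng.uniform(0, np.log(100), 5000))

edges = np.logspace(0, 2, 21)                  # 20 exponentially growing bins
H, xedges, yedges = np.histogram2d(x, y, bins=[edges, edges])

fig, ax = plt.subplots()
ax.imshow(H.T, origin="lower")                 # transpose so x runs horizontally

step = 5                                       # label every 5th bin edge
tick_pos = np.arange(0, len(edges), step) - 0.5
tick_labels = ["{:.3g}".format(v) for v in edges[::step]]

ax.set_xticks(tick_pos)
ax.set_xticklabels(tick_labels)
ax.set_yticks(tick_pos)
ax.set_yticklabels(tick_labels)
```

The labels are only decoration here: the underlying coordinates are still pixel indices, so anything plotted on top has to be converted to pixel positions first.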
Expanding a bit on @thomas's answer:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mi
im = np.random.rand(20, 20)
ticks = np.exp(np.linspace(0, 10, 20))
fig, ax = plt.subplots()
ax.pcolor(ticks, ticks, im, cmap='viridis')
ax.set_yscale('log')
ax.set_xscale('log')
ax.set_xlim([1, np.exp(10)])
ax.set_ylim([1, np.exp(10)])
By letting mpl take care of the non-linear mapping you can now accurately over-plot other artists. There is a performance hit for this (as pcolor is more expensive to draw than AxesImage), but getting accurate ticks is worth it.
imshow is for displaying images, so it does not support x and y bins.
You could either use pcolor instead,
H, xedges, yedges = np.histogram2d(x, y)  # x, y are your sample arrays
plt.pcolor(xedges, yedges, H.T)  # transpose: histogram2d puts x along the first axis
or use plt.hist2d which directly plots your histogram.
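A sketch of the hist2d route with exponentially growing bins, which seems to be the situation in the question. The samples and bin edges are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.exp(rng.uniform(0, np.log(100), 5000))  # made-up samples in [1, 100)
y = np.exp(rng.uniform(0, np.log(100), 5000))

edges = np.logspace(0, 2, 21)                  # non-uniform bin edges

fig, ax = plt.subplots()
# hist2d bins the data and draws it with the real edge coordinates
counts, xedges, yedges, mesh = ax.hist2d(x, y, bins=[edges, edges], cmap="viridis")
ax.set_xscale("log")                           # bins then appear equally wide
ax.set_yscale("log")
fig.colorbar(mesh, ax=ax)
```

Because the axes carry the real data coordinates, ticks and any over-plotted artists line up without manual label editing.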
