Matplotlib figure annotations outside of window - python-3.x

I am making a program that embeds a matplotlib pie/donut chart in a tkinter window to illustrate some data. I have added "annotations" (labels) to each wedge of the pie chart. As a result, the window that opens when I execute the code fits the chart itself, but the labels are cut off at the edges of the window. Specifically, it looks like this...
Note the top two arrows don't actually have text attached to the corresponding labels so the situation is actually worse than my screenshot depicts.
Even if I remove the code that generates the tkinter GUI and just generate a regular figure window, the labels are initially cut off. If I use the built-in zoom-out functionality, however, I can zoom out to make the labels fit.
I have tried to adjust the figsize here...
fig, ax = plt.subplots(figsize=(6, 4), subplot_kw=dict(aspect="equal"))
yet it makes no difference. Hopefully there is a solution, thanks...
Here is my full code if anyone needs...
import numpy as np
import matplotlib.pyplot as plt
player1_cards = {'Mustard', 'Plum', 'Revolver', 'Rope', 'Ballroom', 'Library'}
player2_cards = {'Scarlet', 'White', 'Candlestick'}
player3_cards = {'Green', 'Library', 'Kitchen', 'Conservatory'}
middle_cards = {'Peacock'}
unknown_cards = {'Lead Pipe', 'Wrench', 'Knife', 'Hall', 'Lounge', 'Dining Room', 'Study'}
player1_string = ', '.join(player1_cards)
player1_string = player1_string.replace(', ', '\n')
player2_string = ', '.join(player2_cards)
player2_string = player2_string.replace(', ', '\n')
player3_string = ', '.join(player3_cards)
player3_string = player3_string.replace(', ', '\n')
fig, ax = plt.subplots(figsize=(6, 4), subplot_kw=dict(aspect="equal"))
recipe = [player1_string, player2_string, player3_string, '', '']
data = [len(player1_cards), len(player2_cards), len(player3_cards), 1, 7]
cols = ['#339E5A', '#26823E', '#0C5D2E', '#98D6AE', '#5EC488']
wedges, texts = ax.pie(data, wedgeprops=dict(width=0.5), startangle=90, colors = cols)
for w in wedges:
    w.set_linewidth(4)
    w.set_edgecolor('white')
bbox_props = dict(boxstyle="square,pad=0.3", fc="w", ec="white", lw=0.72)
kw = dict(xycoords='data', textcoords='data', arrowprops=dict(arrowstyle="-"), bbox=bbox_props, zorder=0, va="center")
for i, p in enumerate(wedges):
    ang = (p.theta2 - p.theta1)/2. + p.theta1
    y = np.sin(np.deg2rad(ang))
    x = np.cos(np.deg2rad(ang))
    horizontalalignment = {-1: "right", 1: "left"}[int(np.sign(x))]
    connectionstyle = "angle,angleA=0,angleB={}".format(ang)
    kw["arrowprops"].update({"connectionstyle": connectionstyle})
    ax.annotate(recipe[i], xy=(x, y), xytext=(x + np.sign(x)*.5, y*1.5),
                horizontalalignment=horizontalalignment, **kw, family = "Quicksand")
ax.set_title("Matplotlib bakery: A donut")
plt.show()

You would want to play around with the subplot parameters to make space for the text outside the axes.
fig.subplots_adjust(bottom=..., top=..., left=..., right=...)
E.g. in this case
fig.subplots_adjust(bottom=0.2, top=0.9)
seems to give a nice representation
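To make the suggestion concrete, here is a minimal sketch of the same idea on a stripped-down donut. The wedge sizes, the "label N" texts and the 1.4 offset are placeholders of mine, not values from the question, and the Agg backend is only selected so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(figsize=(6, 4), subplot_kw=dict(aspect="equal"))
wedges, texts = ax.pie([6, 3, 4, 1, 7], wedgeprops=dict(width=0.5), startangle=90)

# Put a placeholder label just outside each wedge, similar to the question.
for i, w in enumerate(wedges):
    ang = np.deg2rad((w.theta1 + w.theta2) / 2)
    x, y = np.cos(ang), np.sin(ang)
    ax.annotate(f"label {i}", xy=(x, y), xytext=(1.4 * x, 1.4 * y),
                ha="left" if x >= 0 else "right", va="center",
                arrowprops=dict(arrowstyle="-"))

# Shrink the axes inside the figure so the annotations have room.
fig.subplots_adjust(bottom=0.2, top=0.9)
fig.canvas.draw()
```

The exact numbers will probably need tuning for the real labels; left and right can be adjusted in the same call if the side labels are still clipped.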

Related

Use constant colors in matplotlib axes3d

I am making a 3d scatterplot with Matplotlib, with the following code:
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
for i in range(len(model.ut[0,:]-1)):
    x_dp = model.ut[0,i]
    y_dp = model.ut[1,i]
    z_dp = model.ut[2,i]
    ax.scatter(x_dp, y_dp, z_dp, marker='^')
for i in range(len(model.cluster_centers[:,0]-1)):
    x_c = model.cluster_centers[i,0]
    y_c = model.cluster_centers[i,1]
    z_c = model.cluster_centers[i,2]
    ax.scatter(x_c, y_c, z_c, marker='o')
ax.set_xlabel('Dimension 0')
ax.set_ylabel('Dimension 1')
ax.set_zlabel('Dimension 2')
ax.set_title('3d')
pyplot.show()
Here model.ut and model.cluster_centers are matrices with the data that I want to visualize.
Currently, the color of each datapoint is different:
Instead, I would like each point with the same marker to be the same color (like it has been done here). How can I do this?

How to add percentage label on top of bar chart from a data frame with different sum total data groups

I am new to coding with Python and am trying to build a bar chart with percentages on top of the bars. I have a sample data frame, Quiz2. The code I wrote shows only 1600% on the first bar. Could anyone help me do this correctly?
#Approach 2
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
sns.set()
%matplotlib inline
Quiz2 = pd.DataFrame({'Kaha': ['16', '5'], 'Shiny': ['16', '10']})
data=Quiz2 .rename(index={0: "Male", 1: "Female"})
data=data.astype(float)
Q1p = data[['Kaha','Shiny']].plot(kind='bar', figsize=(5, 5), legend=True, fontsize=12)
Q1p.set_xlabel("Gender", fontsize=12)
Q1p.set_ylabel("Number of people", fontsize=12)
#Q1p.set_xticklabels(x_labels)
for p in Q1p.patches:
    width = p.get_width()
    height = p.get_height()
    x, y = p.get_xy()
    Q1p.annotate(f'{height:.0%}', (x + width/2, y + height*1.02), ha='center')
    plt.show()
I want the percentages for Kaha (sum total 21) to appear as 76.2% for Male and 23.8% for Female, and those for Shiny (sum total 26) as 61.5% for Male and 38.5% for Female. Any help is appreciated.
In approach 2, only one value is displayed because plt.show() is inside the for loop; it should be outdented so that it runs after the loop has finished. You are getting 1600% because the line beginning with Q1p.annotate(f'{height:.0%}' formats the raw bar height (16) as a percentage. Instead of height, this should be something like height/total, so that the value being formatted is a fraction.
Here is a solution, but not sure if I am computing the percentages correctly:
Quiz2 = pd.DataFrame({'Kaha': ['16', '5'], 'Shiny': ['16', '10']})
data=Quiz2 .rename(index={0: "Male", 1: "Female"})
data=data.astype(float)
total = len(data)*10
Q1p = data[['Kaha','Shiny']].plot(kind='bar', figsize=(5, 5), legend=True, fontsize=12)
Q1p.set_xlabel("Gender", fontsize=12)
Q1p.set_ylabel("Number of people", fontsize=12)
#Q1p.set_xticklabels(x_labels)
for p in Q1p.patches:
    width = p.get_width()
    height = p.get_height()
    x, y = p.get_xy()
    Q1p.annotate(f'{height/total:.0%}', (x + width/2, y + height*1.02), ha='center')
plt.show()
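Since the question asks for each bar as a share of its own column total (21 for Kaha, 26 for Shiny), a possible variant is to divide the frame by its column sums and label the bars from that. This is only a sketch: it assumes pandas draws one bar container per column in column order, and the Agg backend is chosen just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window needed
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame({"Kaha": [16, 5], "Shiny": [16, 10]},
                    index=["Male", "Female"], dtype=float)

# Share of each column total: Kaha sums to 21, Shiny to 26.
pct = data / data.sum()

ax = data.plot(kind="bar", figsize=(5, 5), legend=True, fontsize=12)
ax.set_xlabel("Gender", fontsize=12)
ax.set_ylabel("Number of people", fontsize=12)

labels = []
# One container per column; pair each bar with its column share.
for container, column in zip(ax.containers, data.columns):
    for bar, share in zip(container, pct[column]):
        label = f"{share:.1%}"
        labels.append(label)
        ax.annotate(label,
                    (bar.get_x() + bar.get_width() / 2, bar.get_height()),
                    ha="center", va="bottom")
```

With the sample data the labels come out as 76.2% and 23.8% for Kaha and 61.5% and 38.5% for Shiny, which matches the values requested in the question.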

Why is the saved video from FuncAnimation a superposition of plots?

Regards, I would like to ask about Python's FuncAnimation.
In the full code, I was trying to animate bar plots (for integral illustration). The animated output from
ani = FuncAnimation(fig, update, frames=Iter, init_func = init, blit=True);
plt.show(ani);
looks fine.
But the output video from
ani.save("example_new.mp4", fps = 5)
gives a slightly different result from the animation shown in Python. The video looks like a 'superposition' of the animation: unlike in the animation, each frame of the video keeps showing the previous plots together with the current one.
Here is the full code :
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
fig, ax = plt.subplots()
Num = 20
p = plt.bar([0], [0], 1, color = 'b')
Iter = tuple(range(2, Num+1))
xx = list(np.linspace(0, 2, 200)); yy = list(map(lambda x : x**2,xx));
def init():
    ax.set_xlim(0, 2)
    ax.set_ylim(0, 4)
    return (p)
def update(frame):
    w = 2/frame;
    X = list(np.linspace(0, 2-w, frame+1));
    Y = list(map(lambda x: x**2, X));
    X = list(map(lambda x: x + w/2,X));
    C = (0, 0, frame/Num);
    L = plt.plot(xx , yy, 'y', animated=True)[0]
    p = plt.bar(X, Y, w, color = C, animated=True)
    P = list(p[:]); P.append(L)
    return P
ani = FuncAnimation(fig, update, frames=Iter, init_func = init, interval = 0.25, blit=True)
ani.save("examplenew.mp4", fps = 5)
plt.show(ani)
Any constructive inputs on this would be appreciated. Thanks. Regards, Arief.
When saving the animation, no blitting is used. You can turn off blitting, i.e. blit=False and see the animation the same way as it is saved.
What is happening is that in each iteration a new plot is added without the last one being removed. You basically have two options:
Clear the axes in between, ax.clear() (then remember to set the axes limits again)
update the data for the bars and the plot. Examples to do this:
For plot: Matplotlib Live Update Graph
For bar: Dynamically updating a bar plot in matplotlib
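As an illustration of the first option, here is a rough sketch that clears the axes at the start of every frame and redraws the curve and the rectangles. It is not the asker's exact code: the rectangle positions are computed as n bars of width 2/n (my assumption about the intent), the interval is in milliseconds, and the save call is commented out because it needs a video writer such as ffmpeg.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation

fig, ax = plt.subplots()
num = 20
xx = np.linspace(0, 2, 200)

def update(n):
    """Redraw the curve and n rectangles on a freshly cleared axes."""
    ax.clear()               # remove everything drawn for the previous frame
    ax.set_xlim(0, 2)        # clear() resets the limits, so set them again
    ax.set_ylim(0, 4)
    w = 2 / n
    left = np.arange(n) * w  # left edge of each rectangle
    line, = ax.plot(xx, xx**2, "y")
    # heights are taken at the left edges, bars are centred on left + w/2
    bars = ax.bar(left + w / 2, left**2, w, color=(0, 0, n / num))
    return [line, *bars]

ani = FuncAnimation(fig, update, frames=range(2, num + 1),
                    interval=250, blit=False)
# ani.save("example_new.mp4", fps=5)  # requires ffmpeg or another writer
```

With blit=False the on-screen animation and the saved file go through the same drawing path, so they should look alike.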

How do I create a legend for a heatmap in Bokeh 12.4.1

The recent version of Bokeh allows the programmer to put the legend outside of the chart area. This can be accomplished as described here:
p = figure(toolbar_location="above")
r0 = p.circle(x, y)
legend = Legend(items=[
    ("sin(x)", [r0]),
], location=(0, -30))
p.add_layout(legend, 'right')
show(p)
Note: A legend object is attached to a plot via add_layout. The legend object itself consists of tuples and strings together with glyph lists.
The question is what to do when you are just drawing one "data" series as is the case with the code below, adapted from here:
from bokeh.io import show
from bokeh.models import ColumnDataSource, HoverTool, LinearColorMapper
from bokeh.plotting import figure
col = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
row = ['A', 'B', 'C' , 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
'N', 'O', 'P']
# this is the colormap from the original NYTimes plot
colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
"#ddb7b1", "#cc7878", "#933b41", "#550b1d"]
mapper = LinearColorMapper(palette=colors)
source = ColumnDataSource(data = dict (
row = test['plate_row'],
col = test['plate_col'],
values = test['Melt Temp']
))
TOOLS = "hover,save,pan,box_zoom,wheel_zoom"
p = figure(title="Plate Heatmap", x_range = (0.0,25.0), y_range =
list(reversed(row)), x_axis_location="above", tools=TOOLS)
r1 = p.rect(x="col", y="row", width=1, height=1,
source=source,
fill_color={'field': 'values', 'transform': mapper},
line_color=None)
legend = Legend(items=[
("test" , [r1]),
], location=(0, -30))
p.add_layout(legend, 'left')
show(p) # show the plot
The issue here is that there is only one glyph. What I actually need is an explanation of what value range is included for different colors. Clearly, this is possible, because the plots defined here show that it's possible.
Update:
Now that I am writing about the problem, I am starting to think that perhaps I can just plot multiple series, one for each color, and only plot those coordinates that fall within a certain range. That seems rather clunky though, so any ideas are appreciated!
I figured out a way through using CategoricalColorMapper and then not creating an explicit legend object.
There may be a way to create the legend object explicitly with the same layout, I will have a look later.
import numpy as np
from bokeh.io import show
from bokeh.models import Legend
from bokeh.models import ColumnDataSource, HoverTool,CategoricalColorMapper
from bokeh.plotting import figure
from bokeh.palettes import Blues8
# values to assign colours on
values = np.arange(100,107)
# values that will appear in the legend!!!
legend_values = ['100-101','101-102','102-103','103-104','104-105','105-106',
'106-107']
source = ColumnDataSource(data = dict (
row = np.arange(100,107),
col = np.arange(100,107),
values = np.arange(100,107),
legend_values = legend_values
))
mapper = CategoricalColorMapper(factors=list(values),palette=Blues8)
TOOLS = "hover,save,pan,box_zoom,wheel_zoom"
p = figure(title="Plate Heatmap", x_range = (100,107), y_range =
[90,107], x_axis_location="above", tools=TOOLS)
r1 = p.rect(x="col", y="row", width=1, height=1,
source=source,
fill_color={'field': 'values', 'transform': mapper},
line_color=None,legend='legend_values')
p.legend.location = "bottom_right"
show(p) # show the plot
After researching this a bit more, I found 2 ways of creating legends that show what each color means on the heatmap:
1.) Painting several glyph series:
First, I divide the number range into bins like so:
min_value = test['Melt Temp'].min()
max_value = test['Melt Temp'].max()
increment = round((max_value - min_value)/9)
num_bins = [(lower, lower+increment) for lower in
range(int(floor(min_value)), int(round(max_value)),
int(round(increment)))]
Then, I create sub tables from the main tables like so:
source_dict = {}
for range_tuple in num_bins:
    range_data = test[(test['Melt Temp'] > int(range_tuple[0])) &
                      (test['Melt Temp'] <= int(range_tuple[1]))]
    source = ColumnDataSource(data = dict (
        row = range_data['x'],
        col = range_data['y'],
        values = range_data['Value']))
    source_dict[range_tuple] = source
Then I zip up the colors with a column data source sub-table:
colors = RdYlBu9
glyph_list = []
for color, range_tuple in zip(colors, num_bins):
    r1 = p.rect(x="col", y="row", width=1, height=1,
                source=source_dict[range_tuple],
                fill_color=color,
                line_color=None)
    glyph_list.append(r1)
Lastly, I create an explicit legend object which requires string-glyph-tuples. The legend object then gets attached to the plot:
legend_list = [("{0}<={1}".format(bin[0], bin[1]), [glyph]) for bin,
glyph in zip(num_bins, glyph_list)]
legend = Legend(items=legend_list, location=(0, -50))
p.add_layout(legend, 'left')
show(p)
Downsides to this approach:
It somehow seems a bit clunky.
Another potential downside I discovered while trying to select objects: If you click on one datapoint of a certain color, all datapoints of that color get selected. Depending on what you want to do this may be a plus or a minus.
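For what it's worth, the binning step of approach 1 can also be sketched with NumPy alone, which avoids the rounding of the increment. The values array and the choice of three bins below are made-up stand-ins for test['Melt Temp'] and the nine colors; note that, unlike the strict "greater than" filter above, this variant puts the minimum value into the first bin.

```python
import numpy as np

# Hypothetical stand-in for test['Melt Temp']
values = np.array([61.0, 64.5, 70.2, 75.8, 79.9, 88.0])
n_bins = 3

# n_bins equal-width bins between the minimum and the maximum
edges = np.linspace(values.min(), values.max(), n_bins + 1)

# Index of the bin each value falls into; bins are closed on the right
bin_index = np.digitize(values, edges[1:-1], right=True)

# One legend label per bin, in the same order as the colors would be zipped
labels = [f"{lo:.1f} to {hi:.1f}" for lo, hi in zip(edges[:-1], edges[1:])]
```

Each bin index can then be used to select the rows that go into one ColumnDataSource, and the matching label becomes the legend text for that glyph.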
2.) Colorbar:
The second approach makes use of @Okonomiyaki's comment above and is a lot simpler. The basic gist is that you use a color mapper to determine the colors of your glyphs. You also create a ColorBar, as Okonomiyaki pointed out:
mapper = LogColorMapper(palette="Viridis256", low=min_value,
high=max_value)
source = ColumnDataSource(data = dict (
row = test['x'], col = test['y'], values = test['value']))
p = figure(title="Plate Heatmap", x_range = (0.0,25.0), y_range =
list(reversed(row)),
x_axis_location="above", plot_width=650, plot_height=400)
r1 = p.rect(x="col", y="row", width=1, height=1,
source=source,
fill_color={'field': 'values', 'transform': mapper},
line_color=None)
color_bar = ColorBar(color_mapper=mapper, ticker=LogTicker(),
                     label_standoff=12, border_line_color=None,
                     location=(0,0))
p.add_layout(color_bar, 'left')
layout = p
show(layout)
I like the elegance of this approach. The only downside to this approach is that you don't get a clean range of numbers that define a given color.
If other people come up with even more elegant approaches, please share!

matplotlib: get the subplot layout?

I have a function that creates a grid of similar 2D histograms. So that I can select whether to put this new plot on a pre-existing figure, I do the following:
def make_hist2d(x, y, current_fig=False, layout=(1,1,1),*args):
    if current_fig:
        fig = _plt.gcf()
        ax = fig.add_subplot(*layout) # layout=(nrows, ncols, nplot)
    else:
        fig, ax = _plt.subplots()
    H, x, y = np.histogram2d(...)
    # manipulate the histogram, e.g. column normalize.
    XX, YY = _np.meshgrid(xedges, yedges)
    Image = ax.pcolormesh(XX, YY, Hplot.T, norm=norm, **pcmesh_kwargs)
    ax.autoscale(tight=True)
    grid_kargs = {'orientation': 'vertical'}
    cax, kw = _mpl.colorbar.make_axes_gridspec(ax, **grid_kargs)
    cbar = fig.colorbar(Image, cax=cax)
    cbar.set_label(cbar_title)
    return fig, ax, cbar
def hist2d_grid(data_dict, key_pairs, layout, *args):
    # ``*args`` are things like xlog, ylog, xlabel, etc.
    # that are common to all subplots in the figure.
    fig, ax = _plt.subplots()
    nplots = range(len(key_pairs) + 1) # key_pairs = ((k1a, k1b), (k2a, k2b), ..., (kna, knb))
    ax_list = []
    for pair, i in zip(key_pairs, nplots):
        fig, ax, cbar = make_hist2d(data[k1a], data[k1b])
        ax_list.append(ax)
    return fig, ax_list
Then I call something like:
hgrid = hist2d_grid(...)
However, if I want to add a new figure to the grid, I don't know of a good way to get the subplot layout. For example, is there something like:
layout = fig.get_layout()
That would give me something like (nrows, ncols, n_subplots)?
I could do this with something like:
n_plot = len(ax_list) / 2 # Each subplot generates a plot and a color bar.
n_rows = np.floor(np.sqrt(n_plot))
n_cols = np.ceil(np.sqrt(n_plot))
But I have to deal with special cases like a (2,4) subplot array for which I would get n_rows = 2 and n_cols = 3, which means that I would be passing (2,3,8) to ax.add_subplot(), which clearly doesn't work because 8 > 3*2.
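For the special case just described, one possible way to avoid a grid that is too small is to round the column count up first and derive the row count from it. The helper name grid_shape is hypothetical, and note that this only guarantees enough cells; it does not recover the layout that an existing figure already uses.

```python
import math

def grid_shape(n_plots):
    """Return (n_rows, n_cols) with n_rows * n_cols >= n_plots (n_plots >= 1)."""
    n_cols = math.ceil(math.sqrt(n_plots))
    n_rows = math.ceil(n_plots / n_cols)
    return n_rows, n_cols
```

For 8 plots this gives (3, 3), so a subplot index of 8 fits, whereas the floor/ceil pair above gives (2, 3).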
Since the ax returned by fig, ax = plt.subplots(4,2) is a numpy array of axes, ax.shape will give you the layout information you want, e.g.
nrows, ncols = ax.shape
n_subplots = nrows*ncols
You can also get the locations of the various axes by looping over the children of the figure object,
[[f.colNum, f.rowNum] for f in fig.get_children()[1:]]
and probably get the size from the final element fig.get_children()[-1]
You could also use gridspec to be more explicit about the location of subplots if needed. With gridspec you setup the gridspec object and pass to subplot,
import matplotlib.gridspec as gridspec
gs = gridspec.GridSpec(2, 2)
ax = plt.subplot(gs[0, 0])
To get the layout you can then use,
gs.get_geometry()
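If only the figure (or a single axes) is at hand rather than the array returned by plt.subplots, the grid can also be read back from an axes' subplot spec. This is a sketch under the assumption that every axes in the figure was created as a subplot; axes added by other means, such as some colorbar axes, may not carry a subplot spec. As far as I know, the colNum and rowNum attributes used above are not available in recent Matplotlib releases, so this may be the more portable route.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 4)

# Ask any one axes for the grid it was placed on.
gs = axes[0, 0].get_subplotspec().get_gridspec()
nrows, ncols = gs.get_geometry()

# (row, column) of every subplot axes currently in the figure.
positions = [(a.get_subplotspec().rowspan.start,
              a.get_subplotspec().colspan.start)
             for a in fig.axes]
```

Here nrows and ncols come back as 2 and 4, and positions lists the cells in the order the axes were added to the figure.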
