Equal spacing between pie charts of different sizes in matplotlib - python-3.x

I am having difficulty setting equal spacing between pie charts of different sizes. The five pies are correctly arranged in one row, but the distances between the outlines of neighboring pies aren't equal. I tried many variations of the following code, none of which made much difference to the output (see image):
#code:
import matplotlib.pyplot as plt
import pandas as pd
labels = 'Verkehr', 'Maschinen und Motoren', 'Feuerungen', 'Industrie / Gewerbe', 'Land- und Forstwirtschaft'
sizesax1 = [108295, 10107, 7220, 11551, 7220]
sizesax2 = [77882, 6676, 6676, 13351, 6676]
sizesax3 = [55652, 4417, 6184, 15900, 6184]
sizesax4 = [36327, 2642, 4632, 16512, 5944]
sizesax5 = [18781, 1409, 3287, 1878, 4695]
fig, (ax1, ax2, ax3, ax4, ax5) = plt.subplots(1, 5, figsize =(20,4))
ax1.pie(sizesax1, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=1*4)
ax2.pie(sizesax2, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=.77*4)
ax3.pie(sizesax3, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=.61*4)
ax4.pie(sizesax4, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=.46*4)
ax5.pie(sizesax5, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=.33*4)
Some additions I tried:
fig.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=1, hspace=None)
or
fig.tight_layout()
#giving me this error message:
/srv/conda/envs/notebook/lib/python3.7/site-packages/ipykernel_launcher.py:17: UserWarning:
Tight layout not applied. The bottom and top margins cannot be made large enough to
accommodate all axes decorations.
and some others.
Big thank you already for reading this! I am a complete beginner in python and just managed to come as far as you see in this image:

It is not clear what is required. I'll assume it is the following image:
Fundamentally, the problem is that the pie needs a square aspect ratio, which is not provided by a row of subplots.
The simplest solution is to create only one plot and draw multiple pies in it with different centres. Something like:
import matplotlib.pyplot as plt
sizes = [ [108295, 10107, 7220, 11551, 7220],
[77882, 6676, 6676, 13351, 6676],
[55652, 4417, 6184, 15900, 6184],
[36327, 2642, 4632, 16512, 5944],
[18781, 1409, 3287, 1878, 4695]]
colors = ('red', 'darkblue', 'orange', 'yellow', 'green')
R = 4
radius = [R*i for i in [1.0, 0.77, 0.61, 0.46, 0.33] ]
wid = sum(radius)*2
hei = R*2
fig, ax = plt.subplots(figsize =(wid,hei))
fig.subplots_adjust(left = 0, right = 1, bottom = 0, top = 1)
y = R
x = 0
for i in range(5):
    x += radius[i]
    ax.pie(sizes[i], startangle = 0, colors = colors,
           radius = radius[i], center = (x,y) )
    x += radius[i]
ax.set(xlim =(0,x), ylim=(0,R*2))
plt.savefig("aaa.png")
Note that my figure size is not the (20,4) of the question; that aspect ratio does not fit the layout I assumed as the intended result.
If the pies do need to be in separate axes, the idea is:
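If a visible gap between the outlines is wanted rather than touching pies, the running x position in the loop can be advanced by an extra fixed amount. The following is only a minimal sketch of that arithmetic; the helper name pie_centres and the gap value 0.5 are assumptions for illustration and are not part of the answer above.

```python
def pie_centres(radii, gap=0.0):
    """Return x-centres so that neighbouring circles are `gap` apart, edge to edge."""
    centres = []
    x = 0.0  # left edge of the current pie (starts at the left border)
    for r in radii:
        x += r            # move from the left edge of this pie to its centre
        centres.append(x)
        x += r + gap      # move to the left edge of the next pie
    return centres


R = 4
radii = [R * f for f in [1.0, 0.77, 0.61, 0.46, 0.33]]
centres = pie_centres(radii, gap=0.5)

# Total width needed for the x-limits (and for the figure width)
total_width = centres[-1] + radii[-1]

# Edge-to-edge distance between each pair of neighbouring pies
gaps = [
    (centres[i + 1] - radii[i + 1]) - (centres[i] + radii[i])
    for i in range(len(radii) - 1)
]
```

Each value in centres could then be used as the x part of center=(x, R) in ax.pie, with total_width replacing the final x in xlim and in the figure width.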
Use gridspec to create a single row with 5 columns and provide the ratios so that they correspond to the required radius.
Plot the larger pie in the left slot.
In all remaining slots, use a subgrid, dividing into a column of three (sub-)slots.
Set the height ratios so that the middle one ends up with an aspect ratio of a square.
Plot the pies in the middle slots.
Here we go:
import matplotlib.pyplot as plt
sizes = [ [108295, 10107, 7220, 11551, 7220],
[77882, 6676, 6676, 13351, 6676],
[55652, 4417, 6184, 15900, 6184],
[36327, 2642, 4632, 16512, 5944],
[18781, 1409, 3287, 1878, 4695]]
colors = ('red', 'darkblue', 'orange', 'yellow', 'green')
R = 4
radius = [R*i for i in [1.0, 0.77, 0.61, 0.46, 0.33] ]
wid = sum(radius)*2
hei = R*2
ratios = [i/radius[0] for i in radius] # for gridspec
fig = plt.figure(figsize =(wid,hei))
gs = fig.add_gridspec(1, 5,
                      width_ratios = ratios,
                      wspace=0, left = 0, right = 1, bottom = 0, top = 1)
ax = fig.add_subplot(gs[0,0])
ax.pie(sizes[0], startangle = 0, colors = colors, radius = 1 )
ax.set(xlim=(-1,1) ,ylim=(-1,1))
for i in range(1,5):
    mid = ratios[i]/sum(ratios)*wid
    inrat = [(hei-mid)/2, mid, (hei-mid)/2]
    ings = gs[0,i].subgridspec(3, 1, hspace=0,
                               height_ratios = inrat)
    ax = fig.add_subplot(ings[1,0])
    ax.pie(sizes[i], startangle = 0, colors = colors, radius = 1 )
    ax.set(xlim=(-1,1), ylim=(-1,1))
plt.savefig("aaa.png")
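The height ratios used for the sub-grids follow from one observation: a column that is w inches wide needs a middle slot that is also w inches tall, and the leftover height is split evenly above and below. Here is a small sketch of that calculation on its own; the function name square_slot_ratios is made up for illustration.

```python
def square_slot_ratios(column_width, figure_height):
    """Height ratios [top, middle, bottom] that make the middle slot square."""
    if column_width > figure_height:
        raise ValueError("column is wider than the figure is tall")
    pad = (figure_height - column_width) / 2
    return [pad, column_width, pad]


R = 4
radii = [R * f for f in [1.0, 0.77, 0.61, 0.46, 0.33]]
figure_height = 2 * R

# One list of ratios per smaller pie; each column is as wide as its pie's diameter
slot_ratios = [square_slot_ratios(2 * r, figure_height) for r in radii[1:]]
```

The largest pie is left out because its column already fills the full height, which is why the code above plots it directly in the left slot.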

Related

Plotly Custom Legend

I have a plotly plot which looks like this:
The Code I am using is below:
import plotly.graph_objects as go
from plotly.subplots import make_subplots
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(go.Scatter( x = pf['Timestamp'], y = pf['Price_A'], name ='<b>A</b>',
                          mode = 'lines+markers',
                          marker_color = 'rgba(255, 0, 0, 0.8)',
                          line = dict(width = 3 ), yaxis = "y1"),
              secondary_y=False,)
fig.add_trace(go.Scatter( x = df['Timestamp'], y = df['Price_B'], name='<b>B</b>',
                          mode = 'lines+markers',
                          marker_color = 'rgba(0, 196, 128, 0.8)',
                          line = dict(width = 3 ), yaxis = "y1") ,
              secondary_y=False,)
for i in pf2['Timestamp']:
    fig.add_vline(x=i, line_width=3, line_dash="dash", line_color="purple",
                  name='Event')
fig.update_layout( title="<b>Change over Time</b>",
                   font=dict( family="Courier New, monospace", size=16, color="RebeccaPurple"),
                   legend=dict(
                       yanchor="top",
                       y=0.99,
                       xanchor="left",
                       x=0.01
                   ))
How can I add the entry in the legend for the event that is denoted by the vertical lines?
When you use add_vline, you are adding a layout shape rather than a trace, so it will not have a corresponding legend entry.
Instead, you'll need to use go.Scatter to plot the vertical lines, passing the minimum and maximum values in your data (plus or minus some padding) to the y parameter. Then you can set this same y-range for your plot. This gives the appearance of vertical lines while still showing the full range of your data.
Update: you can use a legend group so that the vertical lines appear as a single entry in the legend.
For example:
import plotly.express as px
import plotly.graph_objects as go
fig = go.Figure()
df = px.data.stocks()
for col in ['GOOG','AMZN']:
    fig.add_trace(go.Scatter(
        x=df['date'],
        y=df[col]
    ))
vlines = ["2018-07-01","2019-04-01","2019-07-01"]
min_y,max_y = df[['GOOG','AMZN']].min().min(), df[['GOOG','AMZN']].max().max()
padding = 0.05*(max_y-min_y)
for i,x in enumerate(vlines):
    fig.add_trace(go.Scatter(
        x=[x]*2,
        y=[min_y-padding, max_y+padding],
        mode='lines',
        line=dict(color='purple', dash="dash"),
        name="vertical lines",
        legendgroup="vertical lines",
        showlegend=True if i == 0 else False
    ))
fig.update_yaxes(range=[min_y-padding, max_y+padding])
fig.show()
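The same idea can be wrapped in two small helpers so it is easy to reuse with other data. This is only a sketch: the names padded_range and vline_traces are hypothetical, and the traces are built as plain dicts (which fig.add_trace accepts when they carry a 'type' key) so the logic can be checked without a figure.

```python
def padded_range(values, fraction=0.05):
    """Return [low, high] covering the values with some padding on both sides."""
    lo, hi = min(values), max(values)
    pad = fraction * (hi - lo)
    return [lo - pad, hi + pad]


def vline_traces(x_positions, y_range, name="Event"):
    """Build one scatter-trace dict per vertical line, sharing one legend entry."""
    traces = []
    for i, x in enumerate(x_positions):
        traces.append(dict(
            type="scatter",
            x=[x, x],                 # same x twice gives a vertical segment
            y=list(y_range),
            mode="lines",
            line=dict(color="purple", dash="dash"),
            name=name,
            legendgroup=name,         # clicking the entry toggles all lines
            showlegend=(i == 0),      # only the first line appears in the legend
        ))
    return traces


y_range = padded_range([1.0, 2.0, 3.0])
traces = vline_traces(["2018-07-01", "2019-04-01", "2019-07-01"], y_range)
```

Each dict could then be passed to fig.add_trace, followed by fig.update_yaxes(range=y_range) as in the example above.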

Creating GlyphRenderers for modifying the legend

I want to create a bokeh application that can filter points based on some attribute. Here is a very simple code example for my use case that filters points on the plot using checkboxes.
from bokeh.plotting import ColumnDataSource, figure, curdoc
import bokeh.models as bmo
from bokeh.layouts import row
import numpy as np
def update_filter(selected_colors):
    keep_indices = []
    for i, color in enumerate(cds.data['color']):
        if color2idx[color] in selected_colors:
            keep_indices.append(i)
    view.filters[0] = bmo.IndexFilter(keep_indices)
cds = ColumnDataSource(data=dict(
    x=np.random.rand(10),
    y=np.random.rand(10),
    color=['red', 'green', 'blue', 'red', 'green',
           'blue', 'red', 'green', 'blue', 'red'])
)
view = bmo.CDSView(source=cds, filters=[bmo.IndexFilter(np.arange(10))])
checkboxes = bmo.CheckboxGroup(labels=['red', 'green', 'blue'], active=[0, 1, 2])
color2idx = {'red': 0, 'green': 1, 'blue': 2}
checkboxes.on_change('active', lambda attr, old_val, new_val: update_filter(new_val))
fig = figure(plot_width=400, plot_height=400, title='Visualize')
fig.circle(x='x', y='y', fill_color='color', size=10, source=cds, view=view, legend_field='color')
curdoc().add_root(row(checkboxes, fig))
curdoc().title = 'Plot'
It works well, however, when I filter points out by de-selecting one of the checkboxes, the legend becomes erroneous.
Below is a screenshot when all the colors are selected:
And this is a screenshot when one of the colors is de-selected:
As can be seen, the legend glyph for "green" turned red when the checkbox for "green" was de-selected.
I found that legends do not work properly with CDSView and it is still an unsolved issue: https://github.com/bokeh/bokeh/issues/8010
So, I wrote the function below that would modify the legend so that it is not erroneous.
def update_legend():
    # Find the indices in the CDS that are visible
    filters = view.filters
    visible_indices = set(list(range(len(cds.data['x']))))
    for filter in filters:
        visible_indices = visible_indices & set(filter.indices)
    # Get a list of visible colors
    visible_colors = set([cds.data['color'][i] for i in visible_indices])
    # Create a dummy figure to obtain renderers
    dummy_figure = figure(plot_width=0, plot_height=0, title='')
    legend_items = []
    # Does not work
    for color in visible_colors:
        renderer = dummy_figure.circle(x=[0], y=[0], fill_color=color, size=10)
        legend_items.append(bmo.LegendItem(label=color, renderers=[renderer]))
    fig.legend[0].items = legend_items
And added another event callback for the checkbox group:
checkboxes.on_change('active', lambda attr, old_val, new_val: update_legend())
When I did the above, the labels in the legend were corrected but now the glyphs are not rendered in the legend. Below is a screenshot of the same:
What am I doing wrong? How should I create a GlyphRenderer for the legend such that the issue gets resolved?
This works for Bokeh v2.1.1. In addition to your original code you can also click on a legend item to show/hide the circles.
from bokeh.plotting import ColumnDataSource, figure, curdoc
from bokeh.models import CheckboxGroup, Row, CDSView, IndexFilter
import numpy as np
colors = ['red', 'green', 'blue']
cds = ColumnDataSource(dict(x=np.random.rand(10),
                            y=np.random.rand(10),
                            color=['red', 'green', 'blue', 'red', 'green', 'blue', 'red', 'green', 'blue', 'red']))
def update_filter(selected_colors):
    for i in range(len(colors)):
        renderers[i].visible = True if i in selected_colors else False
checkboxes = CheckboxGroup(labels=colors, active=[0, 1, 2], width = 50)
checkboxes.on_change('active', lambda attr, old_val, new_val: update_filter(new_val))
fig = figure(plot_width=400, plot_height=400, title='Visualize')
views = [CDSView(source=cds, filters=[IndexFilter([i for i, x in enumerate(cds.data['color']) if x == color])]) for color in colors]
renderers = [fig.circle(x='x', y='y', fill_color='color', size=10, source=cds, view=views[i], legend_label=color) for i,color in enumerate(colors)]
fig.legend.click_policy = 'hide'
curdoc().add_root(Row(checkboxes, fig))
curdoc().title = 'Plot'
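The key step in this answer is splitting the row indices of the data source by color, so that each color gets its own IndexFilter, its own renderer, and therefore its own legend entry. The grouping itself can be sketched in plain Python; the helper name indices_by_value is made up for illustration.

```python
from collections import defaultdict


def indices_by_value(values):
    """Map each distinct value to the list of row indices where it occurs."""
    groups = defaultdict(list)
    for i, value in enumerate(values):
        groups[value].append(i)
    return dict(groups)


color_column = ['red', 'green', 'blue', 'red', 'green',
                'blue', 'red', 'green', 'blue', 'red']
groups = indices_by_value(color_column)
```

Each list in groups corresponds to the indices passed to one IndexFilter in the views list above.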

How do I apply a line between two points in geopanda e.g. between 2 cities

I am trying to plot a green line between the 2 cities on my map in geopandas.
The result should show the 2 cities as red points with their names, plus a green line between the two cities.
I hope you can help me!
Thanks in advance!
I tried a few times, but I can't figure out how to plot the line.
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
###################### The Map of Germany is plotted here
plt.style.use('seaborn')
plz_shape_df = gpd.read_file(r'C:\Users\XXXXX\geopanda\plz-gebiete.shp', dtype={'plz': str})
plz_shape_df.head()
plt.rcParams['figure.figsize'] = [16, 11]
fig, ax = plt.subplots()
plz_shape_df.plot(ax=ax, color='orange', alpha=0.8)
ax.set(
title='Germany',
aspect=1.3,
facecolor='lightblue');
################ The dict: new_dict3 with the 2 cities gets plotted
new_dict3 = {
'Stuttgart': (9.181332, 48.777128),
'Munich': (11.576124, 48.137154),
}
for c in new_dict3.keys():
    ax.text(
        x=new_dict3[c][0],
        y=float(new_dict3[c][1]) + 0.1,
        s=c,
        fontsize = 12,
        ha = "center",
    )
    ax.plot(
        new_dict3[c][0],
        new_dict3[c][1],
        marker = "o",
        c = "red",
        alpha = 1.0  # alpha must lie between 0 and 1
    )
############### Now I want to plot a green line between the 2 cities of the new_dict3
ax.plot(
x= new_dict3[c][0],
y= float(new_dict3[c][1]) + 0.1,
linestyle = "--",
c = "green",
marker="",
)
#this doesn't work
I found the answer myself. Here is my result:
Stuttgart = [9.181332, 48.777128]
Munich = [11.576124, 48.137154]
x_values = [ Stuttgart[0], Munich[0]]
y_values = [ Stuttgart[1], Munich[1]]
plt.plot(x_values, y_values, linewidth = 5, linestyle = "--", color = "green")
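The point of this solution is that plot expects one list of x values and one list of y values, not one (x, y) pair per city. A small sketch of pulling those lists out of the dictionary from the question follows; the helper name line_coords is hypothetical.

```python
def line_coords(cities, route):
    """Split (lon, lat) pairs into x and y lists, in the order given by route."""
    xs = [cities[name][0] for name in route]
    ys = [cities[name][1] for name in route]
    return xs, ys


new_dict3 = {
    'Stuttgart': (9.181332, 48.777128),
    'Munich': (11.576124, 48.137154),
}
x_values, y_values = line_coords(new_dict3, ['Stuttgart', 'Munich'])
```

The two lists could then be passed to ax.plot(x_values, y_values, linestyle="--", color="green") on the axes that already hold the map, and the same helper would work for a route through more than two cities.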

how to link vbar with circle plots using bokeh?

I have three plots based on the same dataset. How can I link all three plots so that when I select a certain species in the vbar plot, the two scatter plots also change to show only the points of that species?
Any help is appreciated.
from bokeh.sampledata.iris import flowers
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource, CategoricalColorMapper, LabelSet
from bokeh.layouts import column, row
#color mapper to color data by species
mapper = CategoricalColorMapper(factors = ['setosa','versicolor', 'virginica'],\
palette = ['green', 'blue', 'red'])
output_file("plots.html")
#group by species and plot barplot for count
species = flowers.groupby('species')
source = ColumnDataSource(species)
p = figure(plot_width = 800, plot_height = 400, title = 'Count by Species', \
x_range = source.data['species'], y_range = (0,60),tools = 'box_select')
p.vbar(x = 'species', top = 'petal_length_count', width = 0.8, source = source,\
nonselection_fill_color = 'gray', nonselection_fill_alpha = 0.2,\
color = {'field': 'species', 'transform': mapper})
labels = LabelSet(x='species', y='petal_length_count', text='petal_length_count',
x_offset=5, y_offset=5, source=source)
p.add_layout(labels)
#scatter plot for sepal length and width
source1 = ColumnDataSource(flowers)
p1 = figure(plot_width = 800, plot_height = 400, tools = 'box_select', title = 'scatter plot for sepal')
p1.circle(x = 'sepal_length', y ='sepal_width', source = source1, \
nonselection_fill_color = 'gray', nonselection_fill_alpha = 0.2, \
color = {'field': 'species', 'transform': mapper})
#scatter plot for petal length and width
p2 = figure(plot_width = 800, plot_height = 400, tools = 'box_select', title = 'scatter plot for petal')
p2.circle(x = 'petal_length', y ='petal_width', source = source1, \
nonselection_fill_color = 'gray', nonselection_fill_alpha = 0.2, \
color = {'field': 'species', 'transform': mapper})
#show all three plots
show(column(p, row(p1, p2)))
I don't think there is built-in functionality for this at the moment. But you can explicitly link two ColumnDataSources with a CustomJS callback:
from bokeh.models import CustomJS
source = ColumnDataSource(species)
source1 = ColumnDataSource(flowers)
source.js_on_change('selected', CustomJS(args=dict(s1=source1), code="""
const indices = cb_obj.selected['1d'].indices;
const species = new Set(indices.map(i => cb_obj.data.species[i]));
s1.selected['1d'].indices = s1.data.species.reduce((acc, s, i) => {if (species.has(s)) acc.push(i); return acc}, []);
s1.select.emit();
"""))
Note that this callback only synchronizes selection from the bar plot to the scatter plots. To make selections on the scatter plots influence the bar plot, you'll have to write some additional code.
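For readers less familiar with JavaScript, the mapping performed inside the callback can be written out in plain Python. This is only an illustration of the logic, not code that Bokeh runs; the function name rows_for_selected_bars and the short sample lists are assumptions.

```python
def rows_for_selected_bars(selected_bar_indices, bar_species, row_species):
    """Return the scatter-row indices whose species matches a selected bar."""
    # Species names of the bars that were selected
    wanted = {bar_species[i] for i in selected_bar_indices}
    # Every row of the full data set that belongs to one of those species
    return [i for i, s in enumerate(row_species) if s in wanted]


bar_species = ['setosa', 'versicolor', 'virginica']
row_species = ['setosa', 'setosa', 'versicolor', 'virginica', 'versicolor']
selected_rows = rows_for_selected_bars([1], bar_species, row_species)
```

Here selecting the second bar (versicolor) yields the rows at positions 2 and 4, which is what the callback writes into the selection of the second data source.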

Why can't I plot multiple scatter subplots in one figure for a data set from a DataFrame of Pandas (Python) in the way I plot subplots of histograms? [duplicate]

This question already has an answer here:
Matplotlib: create multiple subplot in one figure
(1 answer)
Closed 5 years ago.
I have recently been learning matplotlib, and some problems have come up. I read in a non-standard data file named students.data with the following command.
student_dataset = pd.read_csv("students.data", index_col=0)
Here is what students.data looks like.
Then I plot a figure with four subplots of histograms in it with the following commands.
fig = plt.figure(0) #Use it to create subplots.
fig.subplots_adjust(hspace=0.5, wspace=0.5) #Adjust height-spacing to
#de-overlap titles and ticks
ax1 = fig.add_subplot(2, 2, 1)
my_series1 = student_dataset["G1"]
my_series1.plot.hist(alpha=0.5, color = "blue", histtype = "bar", bins = 30)
ax2 = fig.add_subplot(2, 2, 2)
my_series2 = student_dataset["G2"]
my_series2.plot.hist(alpha=1, color = "green", histtype = "step", bins = 20)
ax3 = fig.add_subplot(2, 2, 3)
my_series3 = student_dataset["G3"]
my_series3.plot.hist(alpha=0.5, color = "red", histtype = "stepfilled")
ax4 = fig.add_subplot(2, 2, 4)
my_series1.plot.hist(alpha=0.5, color = "blue")
my_series2.plot.hist(alpha=0.5, color = "green")
my_series3.plot.hist(alpha=0.5, color = "red")
And the result is exactly what I want. However, when I try the same for scatter subplots, they end up in separate figures, and I cannot figure out why. Here are the commands.
fig = plt.figure(2)
ax1 = fig.add_subplot(2, 2, 1)
student_dataset.plot.scatter(x = "freetime", y = "G1")
ax2 = fig.add_subplot(2, 2, 2)
student_dataset.plot.scatter(x = "freetime", y = "G2")
ax3 = fig.add_subplot(2, 2, 3)
student_dataset.plot.scatter(x = "freetime", y = "G3")
After searching for a day, I found a solution that almost fits my target. But still, why is my original method not working?
Here are the new commands and the result.
fig, axes = plt.subplots(2, 2, figsize=(6, 6), sharex=False, sharey=False)
x = student_dataset["freetime"].values
for i in range(3):
    axes[i//2, i%2].scatter(x, student_dataset.iloc[:, i + 25].values)
fig.tight_layout()
Sorry that I cannot put more images in this post to describe my question. Hope you can understand my point.
Thanks in advance.
You may choose to use option 2 of the linked question:
fig = plt.figure(2)
ax1 = fig.add_subplot(2, 2, 1)
student_dataset.plot.scatter(x = "freetime", y = "G1", ax=ax1)
ax2 = fig.add_subplot(2, 2, 2)
student_dataset.plot.scatter(x = "freetime", y = "G2", ax=ax2)
ax3 = fig.add_subplot(2, 2, 3)
student_dataset.plot.scatter(x = "freetime", y = "G3", ax=ax3)
If you don't specify ax, pandas will produce a new figure.
At the moment I don't have any good explanation for why plot.hist does not require the ax keyword; it probably has to do with it directly calling the plt.hist function instead of preprocessing the data first.
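The ax keyword also combines well with plt.subplots and a loop. Below is a minimal, self-contained sketch; the small DataFrame is made-up stand-in data (the real students.data is not available here), and the Agg backend is chosen only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for this sketch
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for students.data (values are illustrative only)
student_dataset = pd.DataFrame({
    "freetime": [1, 2, 3, 4, 5],
    "G1": [10, 12, 9, 15, 11],
    "G2": [11, 13, 8, 14, 12],
    "G3": [12, 12, 9, 16, 10],
})

plt.close("all")  # start from a clean state so the figure count is meaningful
fig, axes = plt.subplots(2, 2, figsize=(6, 6))

# Passing ax=... makes pandas draw into the existing axes
for ax, column in zip(axes.flat, ["G1", "G2", "G3"]):
    student_dataset.plot.scatter(x="freetime", y=column, ax=ax)

axes.flat[3].set_visible(False)  # hide the unused fourth slot
fig.tight_layout()

n_figures = len(plt.get_fignums())
```

With ax passed explicitly, n_figures stays at one, whereas the commands from the question end up with several figures.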
