I want to create a bokeh application that can filter points based on some attribute. Here is a very simple code example for my use case that filters points on the plot using checkboxes.
from bokeh.plotting import ColumnDataSource, figure, curdoc
import bokeh.models as bmo
from bokeh.layouts import row
import numpy as np
def update_filter(selected_colors):
    keep_indices = []
    for i, color in enumerate(cds.data['color']):
        if color2idx[color] in selected_colors:
            keep_indices.append(i)
    view.filters[0] = bmo.IndexFilter(keep_indices)
cds = ColumnDataSource(data=dict(
x=np.random.rand(10),
y=np.random.rand(10),
color=['red', 'green', 'blue', 'red', 'green',
'blue', 'red', 'green', 'blue', 'red'])
)
view = bmo.CDSView(source=cds, filters=[bmo.IndexFilter(np.arange(10))])
checkboxes = bmo.CheckboxGroup(labels=['red', 'green', 'blue'], active=[0, 1, 2])
color2idx = {'red': 0, 'green': 1, 'blue': 2}
checkboxes.on_change('active', lambda attr, old_val, new_val: update_filter(new_val))
fig = figure(plot_width=400, plot_height=400, title='Visualize')
fig.circle(x='x', y='y', fill_color='color', size=10, source=cds, view=view, legend_field='color')
curdoc().add_root(row(checkboxes, fig))
curdoc().title = 'Plot'
It works well, however, when I filter points out by de-selecting one of the checkboxes, the legend becomes erroneous.
Below is a screenshot when all the colors are selected:
And this is a screenshot when one of the colors is de-selected:
As can be seen, the legend entry for "green" turned red when the checkbox for "green" was de-selected.
I found that legends do not work properly with CDSView and it is still an unsolved issue: https://github.com/bokeh/bokeh/issues/8010
So, I wrote the function below that would modify the legend so that it is not erroneous.
def update_legend():
    # Find the indices in the CDS that are visible
    filters = view.filters
    visible_indices = set(list(range(len(cds.data['x']))))
    for filter in filters:
        visible_indices = visible_indices & set(filter.indices)
    # Get a list of visible colors
    visible_colors = set([cds.data['color'][i] for i in visible_indices])
    # Create a dummy figure to obtain renderers
    dummy_figure = figure(plot_width=0, plot_height=0, title='')
    legend_items = []
    # Does not work
    for color in visible_colors:
        renderer = dummy_figure.circle(x=[0], y=[0], fill_color=color, size=10)
        legend_items.append(bmo.LegendItem(label=color, renderers=[renderer]))
    fig.legend[0].items = legend_items
And added another event callback for the checkbox group:
checkboxes.on_change('active', lambda attr, old_val, new_val: update_legend())
When I did the above, the labels in the legend were corrected but now the glyphs are not rendered in the legend. Below is a screenshot of the same:
What am I doing wrong? How should I create a GlyphRenderer for the legend such that the issue gets resolved?
This works for Bokeh v2.1.1. In addition to what your original code does, you can also click on a legend item to show/hide the circles.
from bokeh.plotting import ColumnDataSource, figure, curdoc
from bokeh.models import CheckboxGroup, Row, CDSView, IndexFilter
import numpy as np
colors = ['red', 'green', 'blue']
cds = ColumnDataSource(dict(x=np.random.rand(10),
y=np.random.rand(10),
color=['red', 'green', 'blue', 'red', 'green', 'blue', 'red', 'green', 'blue', 'red']))
def update_filter(selected_colors):
    for i in range(len(colors)):
        renderers[i].visible = i in selected_colors
checkboxes = CheckboxGroup(labels=colors, active=[0, 1, 2], width = 50)
checkboxes.on_change('active', lambda attr, old_val, new_val: update_filter(new_val))
fig = figure(plot_width=400, plot_height=400, title='Visualize')
views = [CDSView(source=cds, filters=[IndexFilter([i for i, x in enumerate(cds.data['color']) if x == color])]) for color in colors]
renderers = [fig.circle(x='x', y='y', fill_color='color', size=10, source=cds, view=views[i], legend_label=color) for i,color in enumerate(colors)]
fig.legend.click_policy = 'hide'
curdoc().add_root(Row(checkboxes, fig))
curdoc().title = 'Plot'
Result:
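The index lists passed to each IndexFilter above can also be computed in a single pass over the color column. The following is only a minimal sketch in plain Python; the helper name indices_by_value is hypothetical and not part of Bokeh.

```python
from collections import defaultdict


def indices_by_value(values):
    """Map each distinct value to the list of row indices where it occurs."""
    groups = defaultdict(list)
    for i, value in enumerate(values):
        groups[value].append(i)
    return dict(groups)


# Same color column as in the answer above
color_column = ['red', 'green', 'blue', 'red', 'green',
                'blue', 'red', 'green', 'blue', 'red']
groups = indices_by_value(color_column)
```

Each groups[color] list could then be handed to IndexFilter instead of re-scanning the column once per color.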
I am having difficulty setting equal spacing between pie charts of different sizes. The five pies are correctly arranged in one row, but the distances between the contours of neighboring pies aren't equal. I tried many variations of the following code, none of which made a big difference in the output (see image):
#code:
import matplotlib.pyplot as plt
import pandas as pd
labels = 'Verkehr', 'Maschinen und Motoren', 'Feuerungen', 'Industrie / Gewerbe', 'Land- und Forstwirtschaft'
sizesax1 = [108295, 10107, 7220, 11551, 7220]
sizesax2 = [77882, 6676, 6676, 13351, 6676]
sizesax3 = [55652, 4417, 6184, 15900, 6184]
sizesax4 = [36327, 2642, 4632, 16512, 5944]
sizesax5 = [18781, 1409, 3287, 1878, 4695]
fig, (ax1, ax2, ax3, ax4, ax5) = plt.subplots(1, 5, figsize =(20,4))
ax1.pie(sizesax1, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=1*4)
ax2.pie(sizesax2, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=.77*4)
ax3.pie(sizesax3, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=.61*4)
ax4.pie(sizesax4, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=.46*4)
ax5.pie(sizesax5, startangle=0, colors = ('red', 'darkblue', 'orange', 'yellow', 'green'), radius=.33*4)
Some additions I tried:
fig.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=1, hspace=None)
or
fig.tight_layout()
#giving me this warning:
/srv/conda/envs/notebook/lib/python3.7/site-packages/ipykernel_launcher.py:17: UserWarning:
Tight layout not applied. The bottom and top margins cannot be made large enough to
accommodate all axes decorations.
and some others.
Many thanks for reading this! I am a complete beginner in Python and have only managed to get as far as what you see in this image:
It is not clear what is required. I'll assume it is the following image:
Fundamentally, the problem is that the pie needs a square aspect ratio, which is not provided by a row of subplots.
The simplest solution is to create a single plot and draw multiple pies in it with different centres. Something like:
import matplotlib.pyplot as plt
sizes = [ [108295, 10107, 7220, 11551, 7220],
[77882, 6676, 6676, 13351, 6676],
[55652, 4417, 6184, 15900, 6184],
[36327, 2642, 4632, 16512, 5944],
[18781, 1409, 3287, 1878, 4695]]
colors = ('red', 'darkblue', 'orange', 'yellow', 'green')
R = 4
radius = [R*i for i in [1.0, 0.77, 0.61, 0.46, 0.33] ]
wid = sum(radius)*2
hei = R*2
fig, ax = plt.subplots(figsize =(wid,hei))
fig.subplots_adjust(left = 0, right = 1, bottom = 0, top = 1)
y = R
x = 0
for i in range(5):
    x += radius[i]
    ax.pie(sizes[i], startangle = 0, colors = colors,
           radius = radius[i], center = (x,y) )
    x += radius[i]
ax.set(xlim =(0,x), ylim=(0,R*2))
plt.savefig("aaa.png")
Notice that my figure aspect ratio is not the (20,4) of the question, which does not hold for the way I interpreted the intended result.
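The centres in the loop above are a running sum of radii, so neighbouring pies touch. If a fixed gap between contours is wanted instead, the same running sum can carry an extra offset. This is a rough sketch only: pie_centres and its gap parameter are hypothetical names, and the gap value of 0.5 is an arbitrary illustration.

```python
def pie_centres(radii, gap=0.0):
    """Return x-centres so neighbouring circles are separated by `gap`."""
    centres = []
    x = 0.0
    for r in radii:
        x += r              # move from the left edge to the centre
        centres.append(x)
        x += r + gap        # move to the left edge of the next circle
    return centres


R = 4
radii = [R * f for f in (1.0, 0.77, 0.61, 0.46, 0.33)]
centres = pie_centres(radii, gap=0.5)

# Distance between the contours of neighbouring pies
gaps = [(centres[i + 1] - radii[i + 1]) - (centres[i] + radii[i])
        for i in range(len(radii) - 1)]
```

With a non-zero gap, the figure width and the x-limits would need to grow by gap times the number of gaps for the last pie to stay inside the axes.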
But you may need to have these in different axes. If so, the idea is:
Use gridspec to create a single row with 5 columns and provide the ratios so that they correspond to the required radius.
Plot the larger pie in the left slot.
In all remaining slots, use a subgrid, dividing into a column of three (sub-)slots.
Set the height ratios so that the middle one ends up with an aspect ratio of a square.
Plot the pies in the middle slots.
Here we go:
import matplotlib.pyplot as plt
sizes = [ [108295, 10107, 7220, 11551, 7220],
[77882, 6676, 6676, 13351, 6676],
[55652, 4417, 6184, 15900, 6184],
[36327, 2642, 4632, 16512, 5944],
[18781, 1409, 3287, 1878, 4695]]
colors = ('red', 'darkblue', 'orange', 'yellow', 'green')
R = 4
radius = [R*i for i in [1.0, 0.77, 0.61, 0.46, 0.33] ]
wid = sum(radius)*2
hei = R*2
ratios = [i/radius[0] for i in radius] # for gridspec
fig = plt.figure(figsize =(wid,hei))
gs = fig.add_gridspec(1, 5,
width_ratios = ratios,
wspace=0, left = 0, right = 1, bottom = 0, top = 1)
ax = fig.add_subplot(gs[0,0])
ax.pie(sizes[0], startangle = 0, colors = colors, radius = 1 )
ax.set(xlim=(-1,1) ,ylim=(-1,1))
for i in range(1,5):
    mid = ratios[i]/sum(ratios)*wid
    inrat = [(hei-mid)/2, mid, (hei-mid)/2]
    ings = gs[0,i].subgridspec(3, 1, hspace=0,
                               height_ratios = inrat)
    ax = fig.add_subplot(ings[1,0])
    ax.pie(sizes[i], startangle = 0, colors = colors, radius = 1 )
    ax.set(xlim=(-1,1), ylim=(-1,1))
plt.savefig("aaa.png")
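To see why the middle sub-slot comes out square, the height ratios can be checked without plotting anything. The sketch below repeats the arithmetic from the loop above in plain Python; height_ratios is a hypothetical helper name introduced only for this check.

```python
R = 4
radius = [R * f for f in (1.0, 0.77, 0.61, 0.46, 0.33)]
wid, hei = sum(radius) * 2, R * 2
ratios = [r / radius[0] for r in radius]


def height_ratios(i):
    """Heights (in figure inches) of the three stacked sub-slots of column i."""
    mid = ratios[i] / sum(ratios) * wid   # equals the width of column i
    pad = (hei - mid) / 2                 # equal padding above and below
    return [pad, mid, pad]


middle_heights = [height_ratios(i)[1] for i in range(len(radius))]
```

Each middle height equals the diameter of its pie, which is also the column width, so the middle sub-slot is square.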
I am using the following code to generate a legend in a separate file. Unfortunately, there is a black line in the middle of the generated legend, in the middle of 'Permission Modification'. How would I remove this black line?
import matplotlib.pyplot as plt
plt.rcParams['font.family'] = 'serif'
plt.rcParams['font.size'] = 10
plt.rcParams['font.serif'] = ['Times New Roman'] + plt.rcParams['font.serif']
"#0C99BA"
"#2f4b7c"
"darkviolet"
"a05195"
"crimson"
"f95d6a"
"ff7c43"
"ffa600"
colors = ["#88D18A", "#0C99BA", "#2f4b7c", "darkviolet", "#a05195", "crimson", "#f95d6a", "#ff7c43", "#ffa600"]
f = lambda m,c: plt.plot([],[],marker=m, color=c, ls="none")[0]
handles = [f("s", colors[i]) for i in range(9)]
#handles = ["red", "red", "wine"]
#labels = ["red", "red", "wine"]
labels = ['Delayed\nExecution', 'File\nOpening', 'Firewall\nModification', 'Permission\nModification', 'Persistence', 'Proxied\nExecution', 'Reconnaissance', 'Registry\nModification', 'Task\nStopping']
legend = plt.legend(handles, labels, loc=3, framealpha=1, frameon=False,ncol=9,handletextpad=0.1,columnspacing=0)
def export_legend(legend, filename="legend.pdf"):
    fig = legend.figure
    fig.canvas.draw()
    bbox = legend.get_window_extent().transformed(fig.dpi_scale_trans.inverted())
    fig.savefig(filename, dpi="figure", bbox_inches=bbox)
export_legend(legend)
plt.show()
This is the infuriating and inexplicable black line:
I have three plots based on the same dataset. How can I link all three plots so that when I select a certain species in the vbar plot, the two scatter plots also change to show only the points for that species?
Any help is appreciated.
from bokeh.sampledata.iris import flowers
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource, CategoricalColorMapper, LabelSet
from bokeh.layouts import column, row
#color mapper to color data by species
mapper = CategoricalColorMapper(factors = ['setosa','versicolor', 'virginica'],\
palette = ['green', 'blue', 'red'])
output_file("plots.html")
#group by species and plot barplot for count
species = flowers.groupby('species')
source = ColumnDataSource(species)
p = figure(plot_width = 800, plot_height = 400, title = 'Count by Species', \
x_range = source.data['species'], y_range = (0,60),tools = 'box_select')
p.vbar(x = 'species', top = 'petal_length_count', width = 0.8, source = source,\
nonselection_fill_color = 'gray', nonselection_fill_alpha = 0.2,\
color = {'field': 'species', 'transform': mapper})
labels = LabelSet(x='species', y='petal_length_count', text='petal_length_count',
x_offset=5, y_offset=5, source=source)
p.add_layout(labels)
#scatter plot for sepal length and width
source1 = ColumnDataSource(flowers)
p1 = figure(plot_width = 800, plot_height = 400, tools = 'box_select', title = 'scatter plot for sepal')
p1.circle(x = 'sepal_length', y ='sepal_width', source = source1, \
nonselection_fill_color = 'gray', nonselection_fill_alpha = 0.2, \
color = {'field': 'species', 'transform': mapper})
#scatter plot for petal length and width
p2 = figure(plot_width = 800, plot_height = 400, tools = 'box_select', title = 'scatter plot for petal')
p2.circle(x = 'petal_length', y ='petal_width', source = source1, \
nonselection_fill_color = 'gray', nonselection_fill_alpha = 0.2, \
color = {'field': 'species', 'transform': mapper})
#show all three plots
show(column(p, row(p1, p2)))
I don't think there is any built-in functionality for this at the moment. But you can explicitly link two ColumnDataSources with a CustomJS callback:
from bokeh.models import CustomJS
source = ColumnDataSource(species)
source1 = ColumnDataSource(flowers)
source.js_on_change('selected', CustomJS(args=dict(s1=source1), code="""
const indices = cb_obj.selected['1d'].indices;
const species = new Set(indices.map(i => cb_obj.data.species[i]));
s1.selected['1d'].indices = s1.data.species.reduce((acc, s, i) => {if (species.has(s)) acc.push(i); return acc}, []);
s1.select.emit();
"""))
Note that this callback only synchronizes selection from the bar plot to the scatter plots. To make selections on the scatter plots influence the bar plot, you'll have to write some additional code.
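For readers less familiar with the JavaScript inside the callback, the index mapping it performs can be sketched in plain Python. This is only an illustration of the logic, not code that Bokeh runs; linked_indices and the sample lists are hypothetical.

```python
def linked_indices(selected_bars, bar_species, row_species):
    """Return the scatter-row indices whose species matches a selected bar."""
    chosen = {bar_species[i] for i in selected_bars}
    return [i for i, s in enumerate(row_species) if s in chosen]


# Hypothetical sample data: one entry per bar, one entry per scatter point
bar_species = ['setosa', 'versicolor', 'virginica']
row_species = ['setosa', 'setosa', 'versicolor', 'virginica', 'versicolor']

selected_rows = linked_indices([1], bar_species, row_species)
```

The reverse direction (scatter selection to bar selection) would need the opposite mapping, from selected rows to the set of bars with a matching species.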
Update the question:
How can I select a certain species in the bar plot so that the non-selected bars change color?
How can I show text on top of each bar?
from bokeh.sampledata.iris import flowers
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource, CategoricalColorMapper
from bokeh.layouts import column, row
#color mapper to color data by species
mapper = CategoricalColorMapper(factors = ['setosa','versicolor', 'virginica'],\
palette = ['green', 'blue', 'red'])
output_file("plots.html")
#group by species and plot barplot for count
species = flowers.groupby('species')
source = ColumnDataSource(species)
p = figure(plot_width = 800, plot_height = 400, title = 'Count by Species', \
x_range = source.data['species'], tools = 'box_select')
p.vbar(x = 'species', top = 'petal_length_count', width = 0.8, source = source,\
nonselection_fill_color = 'gray', nonselection_fill_alpha = 0.2,\
color = {'field': 'species', 'transform': mapper})
show(p)
First: please try to ask unrelated questions in separate SO posts.
Hit testing and selection were not implemented for vbar and hbar until recently. Using the recent 0.12.11 release, your code behaves as you want:
Regarding labels for each bar, you want to use the LabelSet annotation, as demonstrated in the User's Guide. Something like:
labels = LabelSet(x='species', y='petal_length_count', text='some_column',
x_offset=5, y_offset=5, source=source)
p.add_layout(labels)
The linking question is too vague. I would suggest opening a new SO question with more information and description of what exactly you are trying to accomplish.
Recent versions of Bokeh allow the programmer to put the legend outside of the chart area. This can be accomplished as described here:
p = figure(toolbar_location="above")
r0 = p.circle(x, y)
legend = Legend(items=[
    ("sin(x)" , [r0]),
], location=(0, -30))
p.add_layout(legend, 'right')
show(p)
Note: A legend object is attached to a plot via add_layout. The legend's items are tuples that pair a label string with a list of glyph renderers.
The question is what to do when you are just drawing one "data" series as is the case with the code below, adapted from here:
from bokeh.io import show
from bokeh.models import ColumnDataSource, HoverTool, Legend, LinearColorMapper
from bokeh.plotting import figure
col = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
row = ['A', 'B', 'C' , 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
'N', 'O', 'P']
# this is the colormap from the original NYTimes plot
colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
"#ddb7b1", "#cc7878", "#933b41", "#550b1d"]
mapper = LinearColorMapper(palette=colors)
source = ColumnDataSource(data = dict (
row = test['plate_row'],
col = test['plate_col'],
values = test['Melt Temp']
))
TOOLS = "hover,save,pan,box_zoom,wheel_zoom"
p = figure(title="Plate Heatmap", x_range = (0.0,25.0), y_range =
list(reversed(row)), x_axis_location="above", tools=TOOLS)
r1 = p.rect(x="col", y="row", width=1, height=1,
source=source,
fill_color={'field': 'values', 'transform': mapper},
line_color=None)
legend = Legend(items=[
("test" , [r1]),
], location=(0, -30))
p.add_layout(legend, 'left')
show(p) # show the plot
The issue here is that there is only one glyph. What I actually need is an explanation of what value range is included for different colors. Clearly, this is possible, because the plots defined here show that it's possible.
Update:
Now that I am writing about the problem, I am starting to think that perhaps I can just plot multiple series, one for each color, and only plot those coordinates that fall within a certain range. That seems rather clunky, though. So any ideas are appreciated!
I figured out a way through using CategoricalColorMapper and then not creating an explicit legend object.
There may be a way to create the legend object explicitly with the same layout, I will have a look later.
import numpy as np
from bokeh.io import show
from bokeh.models import Legend
from bokeh.models import ColumnDataSource, HoverTool,CategoricalColorMapper
from bokeh.plotting import figure
from bokeh.palettes import Blues8
# values to assign colours on
values = np.arange(100,107)
# values that will appear in the legend!!!
legend_values = ['100-101','101-102','102-103','103-104','104-105','105-106',
                 '106-107']
source = ColumnDataSource(data = dict (
row = np.arange(100,107),
col = np.arange(100,107),
values = np.arange(100,107),
legend_values = legend_values
))
mapper = CategoricalColorMapper(factors=list(values),palette=Blues8)
TOOLS = "hover,save,pan,box_zoom,wheel_zoom"
p = figure(title="Plate Heatmap", x_range = (100,107), y_range =
[90,107], x_axis_location="above", tools=TOOLS)
r1 = p.rect(x="col", y="row", width=1, height=1,
source=source,
fill_color={'field': 'values', 'transform': mapper},
line_color=None,legend='legend_values')
p.legend.location = "bottom_right"
show(p) # show the plot
See the image here 1
After researching this a bit more, I found two ways of creating a legend that shows what each color means on the heatmap:
1.) Painting several glyph series:
First, I divide the number range into bins like so:
from math import floor
min_value = test['Melt Temp'].min()
max_value = test['Melt Temp'].max()
increment = round((max_value - min_value)/9)
num_bins = [(lower, lower+increment) for lower in
            range(int(floor(min_value)), int(round(max_value)),
                  int(round(increment)))]
Then, I create sub tables from the main tables like so:
source_dict = {}
for range_tuple in num_bins:
    range_data = test[(test['Melt Temp'] > int(range_tuple[0])) &
                      (test['Melt Temp'] <= int(range_tuple[1]))]
    source = ColumnDataSource(data = dict (
        row = range_data['x'],
        col = range_data['y'],
        values = range_data['Value']))
    source_dict[range_tuple] = source
Then I zip up the colors with a column data source sub-table:
from bokeh.palettes import RdYlBu9
colors = RdYlBu9
glyph_list = []
for color, range_tuple in zip(colors, num_bins):
    r1 = p.rect(x="col", y="row", width=1, height=1,
                source=source_dict[range_tuple],
                fill_color=color,
                line_color=None)
    glyph_list.append(r1)
Lastly, I create an explicit legend object which requires string-glyph-tuples. The legend object then gets attached to the plot:
legend_list = [("{0}<={1}".format(bin[0], bin[1]), [glyph]) for bin,
glyph in zip(num_bins, glyph_list)]
legend = Legend(items=legend_list, location=(0, -50))
p.add_layout(legend, 'left')
show(p)
Downsides to this approach:
It somehow seems a bit clunky.
Another potential downside I discovered while trying to select objects: If you click on one datapoint of a certain color, all datapoints of that color get selected. Depending on what you want to do this may be a plus or a minus.
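The binning step of approach 1 can be tried on its own, without a DataFrame or a plot. The sketch below is a variation, not the original code: it rounds the step up with ceil rather than round, so the last bin always reaches the maximum, and the sample limits 41.3 and 86.9 are made-up values. make_bins is a hypothetical helper name.

```python
from math import ceil, floor


def make_bins(min_value, max_value, n_bins):
    """Split [floor(min_value), ...) into n_bins integer-width (lower, upper) bins."""
    lo = floor(min_value)
    # Round the step up so that n_bins steps are guaranteed to cover max_value
    step = max(1, ceil((max_value - lo) / n_bins))
    return [(lower, lower + step) for lower in range(lo, lo + step * n_bins, step)]


bins = make_bins(41.3, 86.9, 9)   # made-up limits, nine bins for a nine-color palette
labels = ["{0}<={1}".format(lower, upper) for lower, upper in bins]
```

Because exactly n_bins tuples are produced, zipping them with a nine-color palette never silently drops a bin or a color.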
2.) Colorbar:
The second approach makes use of @Okonomiyaki's comment above and is a lot simpler. The basic gist is that you use a color mapper to determine the colors of your glyphs. You also create a ColorBar, as Okonomiyaki pointed out:
from bokeh.models import ColorBar, LogColorMapper, LogTicker
mapper = LogColorMapper(palette="Viridis256", low=min_value,
                        high=max_value)
source = ColumnDataSource(data = dict (
row = test['x'], col = test['y'], values = test['value']))
p = figure(title="Plate Heatmap", x_range = (0.0,25.0), y_range =
list(reversed(row)),
x_axis_location="above", plot_width=650, plot_height=400)
r1 = p.rect(x="col", y="row", width=1, height=1,
source=source,
fill_color={'field': 'values', 'transform': mapper},
line_color=None)
color_bar = ColorBar(color_mapper=mapper, ticker=LogTicker(),
                     label_standoff=12, border_line_color=None,
                     location=(0,0))
p.add_layout(color_bar, 'left')
layout = p
show(layout)
I like the elegance of this approach. The only downside to this approach is that you don't get a clean range of numbers that define a given color.
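If explicit ranges are still wanted alongside the color bar, they can be estimated by hand. The sketch below assumes that a logarithmic mapper divides the log-scaled interval between low and high into equal-width steps, one per palette color; that is my understanding of the behavior, not something the post states. log_color_edges and the sample limits are hypothetical.

```python
def log_color_edges(low, high, n_colors):
    """Boundaries between colors when [low, high] is split evenly on a log scale."""
    ratio = high / low
    return [low * ratio ** (k / n_colors) for k in range(n_colors + 1)]


# Made-up limits: three colors between 1 and 1000
edges = log_color_edges(1.0, 1000.0, 3)
ranges = list(zip(edges[:-1], edges[1:]))   # (lower, upper) per color
```

With a 256-color palette the ranges become very narrow, so a smaller palette is probably needed before such a table is readable.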
If other people come up with even more elegant approaches, please share!