Plot points playing games when I try to size by a value count in bokeh - python-3.x

I'm trying to get the plot points in a scatter graph to size according to the frequency of values in a column of data. The data is coming from a questionnaire.
My questions are: What am I doing wrong, and what can I do to fix it?
I can push out a simple plot with x and y values coming from 2 columns of data. The X axis represents a level (1-100), and the Y axis represents a choice users can make for each level (1-4). For this plot I want to track how many people choose 1-4 on each level - so I need to capture that 1-4 has been selected, then indicate how many times.
Simple plot works fine, though those points have multiple occurrences.
Here's the code for that:
# Set up the graph
WT_Number = data.wt # This is the X axis
CFG_Number = data.cfg # This is the Y axis
wt_cfg_plot = figure(plot_width=1000, plot_height=400,
                     title="Control Form Groups chosen by WT unit")
# Set up the plot points, including the Hover Tool
cr = wt_cfg_plot.scatter(WT_Number, CFG_Number, size=7,
                         line_color=None, alpha=0.7, hover_fill_color="firebrick",
                         hover_line_color=None, hover_alpha=1)
Problem: I then added a value count and set it as the size, to get the plot points to adjust according to the value frequency. But now it pumps out this chart and throws an error:
Plot points are reacting to the code, but now they're doing their own thing.
I added a variable for the value counts (cfg_freq), and used that as the size:
cfg_freq = data['cfg'].value_counts()*4
cr = wt_cfg_plot.scatter(WT_Number, CFG_Number, size=cfg_freq, fill_color="blue",
                         line_color=None, alpha=0.7, hover_fill_color="firebrick",
                         hover_line_color=None, hover_alpha=1)
Here's the last part of the error being thrown:
File "/Applications/anaconda/lib/python3.5/site-packages/bokeh/core/", line 722, in setattr
(name,, text, nice_join(matches)))
AttributeError: unexpected attribute 'size' to Chart, possible attributes are above, background_fill_alpha, background_fill_color, below, border_fill_alpha, border_fill_color, disabled, extra_x_ranges, extra_y_ranges, h_symmetry, height, hidpi, left, legend, lod_factor, lod_interval, lod_threshold, lod_timeout, logo, min_border, min_border_bottom, min_border_left, min_border_right, min_border_top, name, outline_line_alpha, outline_line_cap, outline_line_color, outline_line_dash, outline_line_dash_offset, outline_line_join, outline_line_width, plot_height, plot_width, renderers, responsive, right, tags, title, title_standoff, title_text_align, title_text_alpha, title_text_baseline, title_text_color, title_text_font, title_text_font_size, title_text_font_style, tool_events, toolbar_location, tools, v_symmetry, webgl, width, x_mapper_type, x_range, xgrid, xlabel, xscale, y_mapper_type, y_range, ygrid, ylabel or yscale
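Whatever the cause of the traceback, one thing worth checking is the shape of the size data. value_counts() returns one count per distinct cfg value (at most four numbers here), indexed by that value, so it does not line up row by row with WT_Number and CFG_Number. A minimal sketch of counting each (level, choice) pair instead, using only the standard library; the function name pair_sizes and the sample data are made up for illustration, and the scale factor of 4 is copied from the question:

```python
from collections import Counter

def pair_sizes(wt, cfg, scale=4):
    """Count how often each (wt, cfg) pair occurs and turn the counts into marker sizes.

    Hypothetical helper, not part of bokeh or pandas.
    """
    counts = Counter(zip(wt, cfg))          # {(level, choice): occurrences}
    pairs = sorted(counts)                  # one entry per distinct point
    xs = [level for level, _ in pairs]
    ys = [choice for _, choice in pairs]
    sizes = [counts[pair] * scale for pair in pairs]
    return xs, ys, sizes

# Made-up sample: level 1 chose option 3 twice, level 2 chose option 1 once and option 4 twice
xs, ys, sizes = pair_sizes([1, 1, 2, 2, 2], [3, 3, 1, 4, 4])
print(xs, ys, sizes)
```

The three lists have the same length, so they could then be passed as x, y and size to a single scatter call, with one marker per distinct point instead of many markers stacked on top of each other.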


Putting a value from an input data file into 'Set Label'

I'm plotting an animated surface in gnuplot and want to read in an average or sum of the mapped z values and include this in a label to be printed in the plot, so that I get a running total updated as the GIF progresses. It's probably straightforward, but I'm a "gnu"bie, so to speak, and find this system pretty confusing!
I've tried putting the running sum and average numbers in additional columns ...
splot 'output3.dat' index i:i using 1:2:(column(3), TD1 = strcol(4), TD2 = strcol(5)) with pm3d
but this doesn't plot, and the string variables TD1, TD2 don't seem to exist outside the splot command.
The command you show would indeed set variables TD1 and TD2 globally if you change the order of clauses in the serial evaluation expression (the comma-separated sub-expressions):
splot 'output3.dat' index i:i using 1:2:(TD1 = strcol(4), TD2 = strcol(5), column(3)) with pm3d
However, if the idea is to create a label using set label that will appear as part of the resulting graph, this won't work. The set label command would have to be executed before the splot command, so TD1 and TD2 will not have the correct values yet.
There is an alternative that might serve you better. Instead of trying to put this dynamically evaluated information in a label, put it in the plot title. Unlike a label, the plot title is evaluated after the corresponding plot is generated, so any variables set or updated by that plot will be current. [caveat: this is true for current gnuplot (version 5.4) but was not always true. If you have an older gnuplot version the title is evaluated before the plot rather than after].
Since current gnuplot also allows you to place the individual plot titles somewhere other than in the key proper, you have the same freedom that you would with a label to position the text anywhere on the output page. For example, if you want to sum the values in column 3 of the data file and print the total as part of a title above the resulting plot:
SUM = 0
splot 'foo.dat' using 1:2:(SUM = SUM+column(3), column(3)) with linespoints title 'foo.dat', \
keyentry title = sprintf("Points sum to %g", SUM) at screen 0.5 0.9
I used a separate keyentry clause because this lets you omit the sample line segment that would otherwise be generated, but it would also be possible to make this the title of the plot itself if you want that sample line.

Ticks on color bar are overlapping because the values are very close to each other

I'm trying to display the exact values on one axis of the color bar and a basic scale on the other. However, some of the exact values are so close together that their labels overlap on the color bar. Is there a way to make the overlapping labels appear as a list, or just off to the side of the other values' labels? I've already tried rotating the labels, setting vmin/vmax in the colorbar method, and setting the ylims of the second axis. I'm at a loss as to what to try next. It feels like this is something matplotlib would allow, but I can't find which method or kwargs allow this manipulation. Many of the commented-out lines are attempts I've made with help from many posts on StackOverflow. Thank you!!
Previous code deleted for clarity
UPDATE: Paul H here is a workable example with the same issue I'm trying to fix
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
# Make random data with same issue
x, y = np.linspace(-3, 1.5, 20), np.linspace(0, 0.5, 20)
# two different ranges used to simulate the same issue in my data
fake_phase = np.append(np.random.random_sample(15), np.arange(0.0, .005, 0.001))
fake_labels = np.array(['V439Oph', 'ALVir', 'YZVir', 'XXVir', 'V716Oph', 'BFSer', 'BLHer',
                        'RXLib', 'CEHer', 'V465Oph', 'V1180Sgr', 'CSCas', 'DQAnd', 'IXCas',
                        'UYEri', 'TWCap', 'AUPeg', 'MZCyg', 'SWTau', 'TXDel'], dtype=object)
# Plot data
fig, ax = plt.subplots(1,1,figsize=(15,10))
plt.plot(x, y, marker='.', ms=17, mew=2, linestyle='none')
# Make the same colorbar
norm = cm.colors.Normalize(vmin=0.0, vmax=1.0, clip=False)
cbar = fig.colorbar(cm.ScalarMappable(norm=norm, cmap='rainbow'), ax=ax, extend='both',
                    orientation='vertical', pad=0.005, use_gridspec=True)
cbar.set_ticklabels(fake_labels)'major', labelsize='large', width=1.5, length=6)
cbar.set_label(label='Phase', size='xx-large', labelpad=40)'auto')
ax2 =
pos =
pos.x0 += 0.1
The output of this code: Output of workable example
My issue is that the secondary axis on the colorbar (left axis) has values that are so close together their labels overlap. I'm hoping to find a way to space the labels so they are readable. I thought I had found a way to accomplish this using axis.set_ticklabels(). The **kwargs section of the set_ticklabels() documentation refers to text properties, and in the text properties documentation the property 'y' is described as setting the y-position of the text. However, when I add this keyword to set_ticklabels() I get an error that the keyword is not recognized. I've tried passing 'y' both as a keyword and as an attribute, but I get either a keyword error or a "does not have that attribute" error.
I'm calling the property wrong but I've never gotten this detailed in editing these parameters. I honestly don't know if this is the best way to solve this, but it's the closest I've gotten so far. I was hoping to use it to offset the labels so they were stacked vertically on top of each other in the same order but far enough apart that the label is readable.
Thanks for any input!
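For what it's worth, the "stacked in the same order but far enough apart" idea can be worked out independently of matplotlib. Below is a rough sketch, not a matplotlib feature: spread_positions is a made-up helper that takes the true tick values and a minimum gap (both in data units, chosen by you) and pushes crowded labels upward until neighbours are at least that far apart.

```python
def spread_positions(positions, min_gap):
    """Return label positions, in the original order, with neighbours at least min_gap apart.

    Hypothetical helper: labels are only ever pushed upward, so the top ones
    may end up beyond the largest original value.
    """
    # Visit the ticks from lowest to highest without losing the original order
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    spread = list(positions)
    previous = None
    for i in order:
        if previous is not None and spread[i] - previous < min_gap:
            spread[i] = previous + min_gap   # too close: move just above the last label
        previous = spread[i]
    return spread

# Three crowded values near 0 and one isolated value
print(spread_positions([0.0, 0.001, 0.002, 0.5], min_gap=0.05))
```

The returned values would be where the text is drawn, while the tick marks stay at the true values; drawing each label with something like ax.annotate and a short connecting line is one possible way to use them, though that part is untested here.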

How to remove specific part of legend (seaborn, scatterplot)

I am using a seaborn scatterplot and just started using different point sizes.
sns.scatterplot(x='X [um]', y='Y [um]', hue='label', size='size', data=data)
All works perfectly but I'd like to remove the 'size' from the legend seen in picture:
The upper part with CH1, etc. shall remain the same but I'd want the lower part where the sizes are listed to vanish.
I use get_legend_handles_labels() to get the legend's handles and labels, then pass only a slice of them to plt.legend(). Slicing to the first 13 entries ensures that the final image contains only the first 13 labels of your legend, which leaves out the size entries.
g = sns.scatterplot(x='X [um]', y='Y [um]', hue='label', size='size', data=data)
h,l = g.get_legend_handles_labels()
plt.legend(h[0:13],l[0:13],bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
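If the number of hue entries changes, the hard-coded 13 has to change with it. A small sketch of finding the cut point from the labels instead, assuming the legend labels contain the size column's name ('size' here) as the heading of the lower block; keep_before is a made-up helper and the handles in the demo are placeholder strings:

```python
def keep_before(handles, labels, section_title):
    """Keep only the legend entries that come before the given section heading.

    Hypothetical helper: if the heading is not found, everything is kept.
    """
    cut = labels.index(section_title) if section_title in labels else len(labels)
    return handles[:cut], labels[:cut]

# Placeholder data standing in for what get_legend_handles_labels() returns
labels = ['label', 'CH1', 'CH2', 'size', '10', '20']
handles = ['h0', 'h1', 'h2', 'h3', 'h4', 'h5']
h, l = keep_before(handles, labels, 'size')
print(l)
```

With the real plot, h and l from g.get_legend_handles_labels() would go through the same function before being passed to plt.legend().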

How to use df.plot to set different colors in one plot for one line?

I need to plot a line plot that has different colors. I created a special df column 'color' that contains the appropriate color for each point.
I already found the solution here:
python/matplotlib - multicolor line
I took the approach from the question above. At first it worked when I plotted against the index, but now I need to plot against another column and I can't handle the colors properly. The line is always drawn in a single color.
I use this code to set the colors, but it draws the whole line in one color, the last one in the 'color' column. It also creates a legend that I don't know how to remove from the plot.
for color2, start, end in gen_repeating(df2['color']):
    print(start, end)
    if start > 0: # make sure lines connect
        start -= 1
    idx = df2.index[start:end+1]
    x2 = idx
    y2 = df2.loc[idx, 'age_gps_data'].tolist()
    df2.plot(x='river_km', y='age_gps_data', color=color2, ax=ax[1])
I would appreciate any help.
How can I set these colors so that a single line is drawn in different colors, and how do I keep the legend off the plot?
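One thing that stands out in the loop above is that idx, x2 and y2 are computed but never used: each iteration calls df2.plot on the whole frame, so every pass redraws the full line and the last color ends up on top. A minimal sketch of the run-splitting step on its own, using itertools.groupby; color_runs is a made-up stand-in for gen_repeating (whose code is not shown here), and the sample colors are invented:

```python
from itertools import groupby

def color_runs(colors):
    """Split a sequence of per-point colors into (color, start, end) runs.

    start and end are positional indices, end inclusive. Hypothetical helper.
    """
    runs = []
    start = 0
    for color, group in groupby(colors):
        length = len(list(group))
        runs.append((color, start, start + length - 1))
        start += length
    return runs

print(color_runs(['r', 'r', 'b', 'b', 'b', 'r']))
```

Each run would then be drawn from only its own slice of the data, for example df2.iloc[max(start - 1, 0):end + 1].plot(x='river_km', y='age_gps_data', color=color, ax=ax[1], legend=False). Starting one row early keeps the segments connected, and legend=False should stop pandas from adding a legend entry per segment.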

Setting multiple axvspan labels as one element in legend

I am trying to set up a series of vertical axis spans to symbolize different switching positions at different times. For example, in the figure below, switching position 1 (green) happens quite a few times, alternating between other positions.
I plot these spans running a for loop in a list of tuples, each containing the initial and final indexes of each interval to plot the axvspan.
def plotShades(timestamp, intervals, colour):
    for i in range(len(intervals)):
        md.plt.axvspan(timestamp[intervals[i][0]], timestamp[intervals[i][1]], alpha=0.5, color=colour, label="interval")
This function is then called from another one, which plots the shades for each different switching position:
def plotAllOutcomes(timestamp, switches):
    # switches is a list of 7 arrays indicating when the switcher is at each one of the 7 positions.
    # If the array has a 1 value, the switcher is there. 0 otherwise.
    colors = ['#d73027', '#fc8d59', '#fee08b', '#ffffbf', '#d9ef8b', '#91cf60', '#1a9850']
    intervals = []
    for i in range(len(switches)):
        intervals.append(getIntervals(switches[i], timestamp))
        plotShades(timestamp, intervals[i], colors[i])
Doing so with the code snippets I've put here (not the best code, I know - I'm fairly new in Python!) the legend ends up having one item for each interval, and that's pretty awful. This is how it looks:
I'd like to get a legend with only 7 items, each for a single color in my plot of axvspans. How can I proceed to do so? I've searched quite extensively but haven't managed to find this situation being asked before. Thank you in advance for any help!!
A small trick you can apply using the fact that labels starting with "_" are ignored:
plt.axvspan( ... , label = "_"*i + "interval")
Thereby a label is only created for the case where i==0.
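To make the trick concrete, here is a small sketch of the labels it produces for one switching position. span_labels is a made-up helper and "position 1" is just an example name; a single leading underscore is used, which is enough for the legend to skip the entry:

```python
def span_labels(name, count):
    """Labels for `count` spans of one colour: only the first is visible in the legend.

    Hypothetical helper; every label after the first gets a leading underscore.
    """
    return [name if i == 0 else "_" + name for i in range(count)]

labels = span_labels("position 1", 3)
print(labels)
```

Inside plotShades the i-th span would get label=labels[i], and giving each of the 7 switching positions its own name should leave exactly 7 legend entries.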
