Displaying multiple values in Altair/Streamlit tooltips on a bar chart - altair

My DataFrame looks similar to this:
name            reached points
Jose Laderman   13
William Kane    13
I am currently displaying the number of students per reached-points value for an assignment on an Altair bar chart within Streamlit, like this:
brush = alt.selection(type='interval', encodings=['x'])
interactive_test = alt.Chart(df_display_all).mark_bar(opacity=1, width=5).encode(
    x=alt.X('reached points', scale=alt.Scale(domain=[0, maxPoints])),
    y=alt.Y('count()', type='quantitative', axis=alt.Axis(tickMinStep=1), title='student count'),
).properties(width=1200)
upper = interactive_test.encode(
    alt.X('reached points', sort=alt.EncodingSortField(op='count', order='ascending'), scale=alt.Scale(domain=brush, domainMin=-0.5))
)
lower = interactive_test.properties(
    height=60
).add_selection(brush)
concat_distribution_interactive = alt.vconcat(upper, lower)
This produces the following output, and everything looks fine.
I want the tooltip to list the students who reached the number of points I'm hovering over. When I add something like:
tooltip='name'
the way my bar chart displays its values changes to this:
When I add something like:
tooltip='reached points'
The data seems to be displayed normally but without a tooltip that gives me the necessary information. Is it possible to display tooltip data that isn't used in my x or y axis but still part of the DataFrame I'm putting into the chart?

Related

Plotly does not color-code my chart according to the variable indicated

My problem is that when I concatenate two CSV files and create a "Version" column with string values, Plotly does NOT color-code each of the values, in this case 1 and 2. However, when this variable is numeric, it does generate a continuous color scale (I need a discrete one).
Image shows concatenation and data type
Two legends are observed, but only one color is observed
This is an example of how it doesn't work. However, if I change the variable to continuous, it does work.
I am using this block to generate one graph per group, which should be my final result.
grouped = df_final.groupby('Name')
plots = []
for name, group in grouped:
    # create a new figure for the group
    fig = px.scatter(group.reset_index(), x="Time", y="Observed", opacity=1, width=800, height=600,
                     color="Version")
    fig.show()
This block generates the chart with its legend, but does not show the colors in the chart.
I am just starting out with Python and Plotly, so any help would be appreciated.
I'm trying to understand why plotly doesn't sort my string variables

How to add text on interactive Scatter on Altair?

I am trying to adapt the Selection Detail Example from the Altair docs (https://altair-viz.github.io/gallery/select_detail.html#selection-detail-example).
I won't detail my DataFrame structure, which is identical to the one in the example (variable names included).
The original code works well:
# Data is prepared, now make a chart
selector = alt.selection_single(empty='all', fields=['id'])
base = alt.Chart(data).properties(
    width=250,
    height=250
).add_selection(selector)
points = base.mark_point(filled=True, size=200, opacity=0.9).encode(
    x=alt.X('mean(y)', title='Durée de perception', scale=alt.Scale(domain=(11, 23))),
    y=alt.Y('mean(x)', title='Taux de marge (%PM)'),
    color=alt.condition(selector, 'id:O', alt.value('lightgray')),
    tooltip=['mean(y)', 'mean(x)']
)
timeseries = base.mark_bar(opacity=1).encode(
    x=alt.X('time', title='Items'),
    y=alt.Y('value', scale=alt.Scale(domain=(-1, 1)), stack=None),
    color=alt.Color('id:O', scale=alt.Scale(domain=domain, range=range_))
    #, legend=None)
).transform_filter(
    selector
)
points | timeseries
No problem at this stage, although it would be useful to hide all the bars on the right chart when no selection is made on the left chart (I don't know if that's possible).
After that I try to add text to the scatter plot by adding this at the end of the code:
text = points.mark_text(dy=-5).encode(
    x=alt.X('mean(y)', title='Durée de perception', scale=alt.Scale(domain=(11, 23))),
    y=alt.Y('mean(x)', title='NBV (%CA)'),
    text='id:O'
)
(points + text) | timeseries
which leads to the following error message:
Javascript Error: Duplicate signal name: "selector094_tuple"
This usually means there's a typo in your chart specification. See the javascript console for the full traceback.
If you have any idea how to do this, I would be grateful.
Thanks
The issue is that you cannot add the same selection to two different layers, which you do implicitly by deriving text from points. Try this instead:
text = alt.Chart(data).mark_text(dy=-5).encode(
    x=alt.X('mean(y)', title='Durée de perception', scale=alt.Scale(domain=(11, 23))),
    y=alt.Y('mean(x)', title='NBV (%CA)'),
    text='id:O'
)
(points + text) | timeseries

Bumbling around plotting two sets of seasonal data on the same chart

I have a series of monthly inventory data since 2017.
I have a series of inventory_forecasts since Dec2018
I am trying to plot the inventory data on a monthly-seasonal basis, and then overlay the inventory_forecasts of Jan2019 through Dec2019.
The dataframe looks like:
The first way I tried to make the chart does show all the data I want, but I'm unable to control the color of the inventory_zj line. Its color seems to be dominated by the color=year(date):N of the alt.Chart I configured. It is ignoring the color='green' I pass to the mark_line()
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory").mark_line().encode(
    x='month',
    y="inventory",
    color="year(date):N"
)
# this ignores my 'green' color instruction, and marks it the same light blue 2019 color
joe = base.mark_line(color='green').encode(
    alt.Y('inventory_zj', scale=alt.Scale(zero=False))
)
base + joe
I tried to use a layering system, but it's not working at all -- I cannot get it to display the "joe" layer
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory").encode(
    x='month(date)'
)
doe = base.mark_line().encode(
    alt.Y('inventory', scale=alt.Scale(zero=False)),
    color="year(date):N"
)
joe = base.mark_line(color="green").encode(
    alt.Y('inventory_zj', scale=alt.Scale(zero=False)),
)
# looks identical to the first example
alt.layer(
    doe, joe
).resolve_scale(
    y='shared'
).configure_axisLeft(labelColor='black').configure_axisRight(labelColor='green', titleColor='green')
# independent shows a second y-axis (which is different from the left y-axis) but no line
alt.layer(
    doe, joe
).resolve_scale(
    y='independent'
).configure_axisLeft(labelColor='black').configure_axisRight(labelColor='green', titleColor='green')
I feel like I must be trying to assemble this chart in a fundamentally wrong way. I should be able to share the same left y-axis, have the historic data colored by its year, and have a unique color for the 2019-forecasted data. But I seem to be making a mess of it.
As mentioned in the Customizing Visualizations docs, there are multiple ways to specify things like line color, with a well-defined hierarchy: encodings override mark properties, which override top-level configurations.
In your chart, you write base.mark_line(color='green'), where base contains a color encoding which overrides the mark property. If you don't derive the layer from base (so that it does not have a color encoding), then the line will be green as you hoped. Something like this:
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory")
inventory = base.mark_line().encode(
    x='month',
    y="inventory",
    color="year(date):N"
)
joe = base.mark_line(color='green').encode(
    x='month',
    y=alt.Y('inventory_zj', scale=alt.Scale(zero=False))
)
inventory + joe

Setting multiple axvspan labels as one element in legend

I am trying to set up a series of vertical axis spans to symbolize different switching positions at different times. For example, in the figure below, switching position 1 (green) happens quite a few times, alternating between other positions.
I plot these spans running a for loop in a list of tuples, each containing the initial and final indexes of each interval to plot the axvspan.
def plotShades(timestamp, intervals, colour):
    for i in range(len(intervals)):
        md.plt.axvspan(timestamp[intervals[i][0]], timestamp[intervals[i][1]], alpha=0.5, color=colour, label="interval")
This function is then called from another one, which plots the shades for each switching position:
def plotAllOutcomes(timestamp, switches):
    # switches is a list of 7 arrays indicating when the switcher is at each one of the 7 positions. If the array has a 1 value, the switcher is there. 0 otherwise.
    colors = ['#d73027', '#fc8d59', '#fee08b', '#ffffbf', '#d9ef8b', '#91cf60', '#1a9850']
    intervals = []
    for i in range(len(switches)):
        intervals.append(getIntervals(switches[i], timestamp))
        plotShades(timestamp, intervals[i], colors[i])
    md.plt.legend()
Doing so with the code snippets I've put here (not the best code, I know - I'm fairly new in Python!) the legend ends up having one item for each interval, and that's pretty awful. This is how it looks:
I'd like to get a legend with only 7 items, each for a single color in my plot of axvspans. How can I proceed to do so? I've searched quite extensively but haven't managed to find this situation being asked before. Thank you in advance for any help!!
A small trick you can apply using the fact that labels starting with "_" are ignored:
plt.axvspan( ... , label = "_"*i + "interval")
Thereby a label is only created for the case where i==0.
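Applied to a function shaped like plotShades, the trick might look like the sketch below. The name argument, the sample intervals, and the "position N" labels are assumptions added for illustration; the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_shades(ax, timestamp, intervals, colour, name):
    """Shade each (start, stop) index pair; only the first span gets a legend entry."""
    for i, (start, stop) in enumerate(intervals):
        # For i > 0 the label starts with "_", so the legend skips it.
        ax.axvspan(timestamp[start], timestamp[stop], alpha=0.5,
                   color=colour, label="_" * i + name)


timestamp = list(range(10))
fig, ax = plt.subplots()
plot_shades(ax, timestamp, [(0, 1), (4, 5), (8, 9)], "#d73027", "position 1")
plot_shades(ax, timestamp, [(2, 3), (6, 7)], "#1a9850", "position 2")
ax.legend()

# The legend sees one entry per colour, not one per span.
handles, labels = ax.get_legend_handles_labels()
print(labels)
```

Five spans are drawn, but the legend lists only the two position names.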

Plot points playing games when I try to size by a value count in bokeh

I'm trying to get the plot points in a scatter graph to size according to the frequency of values in a column of data. The data is coming from a questionnaire.
My questions are: What am I doing wrong, and what can I do to fix it?
I can push out a simple plot with x and y values coming from 2 columns of data. The X axis represents a level (1-100), and the Y axis represents a choice users can make for each level (1-4). For this plot I want to track how many people choose 1-4 on each level - so I need to capture that 1-4 has been selected, then indicate how many times.
Simple plot works fine, though those points have multiple occurrences.
Here's the code for that:
# Set up the graph
WT_Number = data.wt # This is the X axis
CFG_Number = data.cfg # This is the Y axis
wt_cfg_plot = figure(plot_width=1000, plot_height=400,
title="Control Form Groups chosen by WT unit")
# Set up the plot points, including the Hover Tool
cr = wt_cfg_plot.scatter(WT_Number, CFG_Number, size=7,
fill_color="blue",
line_color=None, alpha=0.7, hover_fill_color="firebrick",
hover_line_color=None, hover_alpha=1)
Problem: I then added a value count and set it as the size, to get the plot points to adjust according to the value frequency. But now it pumps out this chart and throws an error:
Plot points are reacting to the code, but now they're doing their own thing.
I added a variable for the value counts (cfg_freq), and used that as the size:
cfg_freq = data['cfg'].value_counts()*4
cr = wt_cfg_plot.scatter(WT_Number, CFG_Number, size=cfg_freq, fill_color="blue",
line_color=None, alpha=0.7, hover_fill_color="firebrick",
hover_line_color=None, hover_alpha=1)
Here's the last part of the error being thrown:
File "/Applications/anaconda/lib/python3.5/site-packages/bokeh/core/properties.py", line 722, in setattr
(name, self.__class__.__name__, text, nice_join(matches)))
AttributeError: unexpected attribute 'size' to Chart, possible attributes are above, background_fill_alpha, background_fill_color, below, border_fill_alpha, border_fill_color, disabled, extra_x_ranges, extra_y_ranges, h_symmetry, height, hidpi, left, legend, lod_factor, lod_interval, lod_threshold, lod_timeout, logo, min_border, min_border_bottom, min_border_left, min_border_right, min_border_top, name, outline_line_alpha, outline_line_cap, outline_line_color, outline_line_dash, outline_line_dash_offset, outline_line_join, outline_line_width, plot_height, plot_width, renderers, responsive, right, tags, title, title_standoff, title_text_align, title_text_alpha, title_text_baseline, title_text_color, title_text_font, title_text_font_size, title_text_font_style, tool_events, toolbar_location, tools, v_symmetry, webgl, width, x_mapper_type, x_range, xgrid, xlabel, xscale, y_mapper_type, y_range, ygrid, ylabel or yscale
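One thing worth checking, independent of the AttributeError (whose message mentions Chart rather than the figure used in the snippet): data['cfg'].value_counts() returns one entry per distinct cfg value, indexed by that value, so it does not line up row by row with WT_Number and CFG_Number. A per-point size needs one number per row. The sketch below uses made-up data and a hypothetical size column to show one way of computing that with pandas; it is an assumption about what the plot is meant to show, not a confirmed fix.

```python
import pandas as pd

# Made-up questionnaire data: wt is the level, cfg the choice made at that level.
data = pd.DataFrame({
    "wt":  [1, 1, 1, 2, 2, 3],
    "cfg": [1, 1, 2, 3, 3, 3],
})

# One entry per distinct cfg value, not one per row.
per_value = data["cfg"].value_counts()

# One entry per row: how often this (wt, cfg) pair occurs.
pair_count = data.groupby(["wt", "cfg"])["cfg"].transform("count")

# Hypothetical column holding a marker size for each row.
data["size"] = pair_count * 4
print(len(per_value), len(data["size"]))
```

The size column has the same length as the x and y columns, so it could be supplied alongside them, for example through a data source whose columns are referenced by name.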
