Altair: geo_shape doesn't work with selection

I am working on a Choropleth map that will display the similarities between different states. So when you select a state from a dropdown, the map will show the similarity it has to other states.
For this, I am using 2 datasets:
DatasetA: a long-form dataframe, with 3 columns: State 1, State 2, and the similarity between them.
DatasetB: a GeoDataFrame that contains the geometry of each state.
Plotting this without the selection works:
alt.Chart(gdf).mark_geoshape(
).encode(
    color='Similarity:O',
    tooltip=['Similarity:Q']
).properties(
    projection={'type': 'albersUsa'},
    width=700,
    height=400
).transform_lookup(
    lookup='State',
    from_=alt.LookupData(source, 'State', source.columns.values)
)
But once I add the selection, it only works when I select Wyoming (the last state in DatasetA). When I select any other state, the plot disappears.
input_dropdown = alt.binding_select(options=source.State.unique())
selection = alt.selection_single(
    fields=['Similarity_to'],
    bind=input_dropdown,
    init={'Similarity_to': 'New York'}
)
alt.Chart(gdf).mark_geoshape(
).encode(
    color='Similarity:Q',
    tooltip=['Similarity:Q']
).properties(
    projection={'type': 'albersUsa'},
    width=700,
    height=400
).transform_lookup(
    lookup='State',
    from_=alt.LookupData(source, 'State', source.columns.values)
).transform_filter(
    selection
).add_selection(
    selection
)
Here's a clip that demonstrates it: https://www.loom.com/share/292e8b1a80344cf5a998a54f453ece2c

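I got this to work by using a transform_fold. The problem is that transform_lookup keeps only one match per key, so when the lookup dataset has several rows for the same state, all but one of them are ignored. The fix is to supply the data in wide form (one row per state, one column per comparison state) and then use transform_fold to convert it back into long form inside the chart. In the code below, source is the wide-form dataframe.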
input_dropdown = alt.binding_select(options=source.State.unique())
selection = alt.selection_single(
    fields=['Similarity_to'],
    bind=input_dropdown,
    init={'Similarity_to': 'New York'}
)
alt.Chart(gdf).mark_geoshape(
    stroke='black'
).encode(
    color='Similarity:Q',
    tooltip=['Similarity:Q']
).properties(
    projection={'type': 'albersUsa'},
    width=700,
    height=400
).transform_lookup(
    lookup='State',
    from_=alt.LookupData(source, 'State', source.columns.values)
).transform_fold(
    source.drop('State', axis=1).columns.values,  # Preserve the State column, fold the rest
    ['Similarity_to', 'Similarity']
).transform_filter(
    selection
).add_selection(
    selection
)
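I had actually tried this before asking the question, but I applied the transformations in the wrong order. The order matters: the lookup has to come first so that the wide-form columns exist, then the fold, and only then the filter, because the filter relies on the Similarity_to field that the fold creates.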
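To see why the wide-form detour helps, here is a minimal pure-Python sketch of the behaviour described above. It assumes a lookup that keeps only one row per key (here the last one, which is consistent with only Wyoming working in the question). The sample states and similarity values are made up, and lookup_one_match, to_wide, and fold are hypothetical helpers written for this illustration, not Altair functions. The fold here is also simplified: it drops the folded columns from the output rows.

```python
# Illustrative long-form data (values are invented for this sketch).
long_rows = [
    {"State": "Ohio", "Similarity_to": "New York", "Similarity": 0.8},
    {"State": "Ohio", "Similarity_to": "Wyoming", "Similarity": 0.3},
    {"State": "Texas", "Similarity_to": "New York", "Similarity": 0.5},
    {"State": "Texas", "Similarity_to": "Wyoming", "Similarity": 0.6},
]


def lookup_one_match(rows, key):
    """Index rows by key, keeping a single row per key (assumed: the last one)."""
    index = {}
    for row in rows:
        index[row[key]] = row  # later rows overwrite earlier ones
    return index


def to_wide(rows):
    """Pivot to one row per State with one column per comparison state."""
    wide = {}
    for row in rows:
        wide.setdefault(row["State"], {"State": row["State"]})
        wide[row["State"]][row["Similarity_to"]] = row["Similarity"]
    return list(wide.values())


def fold(rows, columns, as_=("Similarity_to", "Similarity")):
    """Turn the listed columns back into key/value rows (simplified fold)."""
    key_name, value_name = as_
    folded = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k not in columns}
        for col in columns:
            folded.append({**rest, key_name: col, value_name: row[col]})
    return folded


# Looking up long-form data directly loses all but one row per state.
direct = lookup_one_match(long_rows, "State")
print(direct["Ohio"]["Similarity_to"])

# Wide-form data has one row per state, so nothing is lost in the lookup;
# folding afterwards restores one row per (State, Similarity_to) pair,
# and only then can the rows be filtered on Similarity_to.
wide_index = lookup_one_match(to_wide(long_rows), "State")
restored = fold(list(wide_index.values()), ["New York", "Wyoming"])
selected = [r for r in restored if r["Similarity_to"] == "New York"]
print(selected)
```

Filtering before the fold would fail in this sketch for the same reason it fails in the chart: the Similarity_to key does not exist on the wide-form rows until the fold has run.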
You can find the full code here: https://deepnote.com/project/altairstackoverflow--zl7Wx2tQQ22U3D1cLcodA/%2Fnotebooks%2Fstates.ipynb#00011-ed9a9249-2c34-412c-a091-d87d0ddb457d

Related

Is it possible to display one faceted chart in Altair at a time and toggle between different charts?

I'm working on a project for a class where I've been creating faceted scatterplots in Altair where each chart is a different type of food. I was wondering if it would be possible to still use these faceted charts but only display one at a time and give the user the ability to toggle between graphs instead of having each graph displayed one after the other?
Here is a rough diagram of what I'm trying to do and here's the code I have now (and what the output looks like at the moment):
import altair as alt
from vega_datasets import data
hover = alt.selection_single(on='mouseover', nearest=True, empty='none')
base = alt.Chart("Food Nutrition Info Compiled.csv").encode(
x='Energ_Kcal:N',
y='Water_(g):Q',
color=alt.condition(hover, 'Type:N', alt.value('lightgray'))
).properties(
width=180,
height=180,
)
points = base.mark_point().add_selection(
hover
)
text = base.mark_text(dy=-5).encode(
text = 'Shrt_Desc:N',
opacity = alt.condition(hover, alt.value(1), alt.value(0))
)
alt.layer(points, text).facet(
'Type:N',
)
Thanks and I hope this makes sense!
Displaying one faceted chart at a time sounds the same as showing a single subset of the data at any one point. This is currently possible by creating a selection that is bound to e.g. a dropdown and then using that with transform_filter:
import altair as alt
from vega_datasets import data
source = data.cars()
dropdown_options = source['Origin'].unique().tolist()
dropdown = alt.binding_select(
options=dropdown_options,
name='Origin '
)
selection = alt.selection_multi(
fields=['Origin'],
value=[{'Origin': dropdown_options[0]}],
bind=dropdown
)
alt.Chart(source).mark_circle().encode(
x=alt.X('Weight_in_lbs:Q', title=''),
y='Horsepower',
color='Origin',
).add_selection(
selection
).transform_filter(
selection
)

Is there a way to display the value of a mark next to the mark in Altair

I was playing around with the following example from the Altair Gallery:
https://altair-viz.github.io/gallery/airports_count.html
As of right now, the only way to display the actual count appears to be via the tooltip, as the example shows. However, I am trying to code a static visualization for which it would be very helpful if the exact value was displayed right next to the mark itself, without the user having to hover or interact in any way. Is there a way to achieve this?
You can do this by manually calculating offsets for text labels, though this is admittedly difficult when the points become crowded:
import altair as alt
from vega_datasets import data
airports = data.airports.url
states = alt.topo_feature(data.us_10m.url, feature='states')
# US states background
background = alt.Chart(states).mark_geoshape(
fill='lightgray',
stroke='white'
).properties(
width=500,
height=300
).project('albersUsa')
# airport positions on background
base = alt.Chart(airports).transform_aggregate(
latitude='mean(latitude)',
longitude='mean(longitude)',
count='count()',
groupby=['state']
).encode(
longitude='longitude:Q',
latitude='latitude:Q',
)
points = base.mark_circle().encode(
size=alt.Size('count:Q', title='Number of Airports'),
color=alt.value('steelblue'),
tooltip=['state:N','count:Q']
).properties(
title='Number of airports in US'
)
text = base.mark_text(
dx=15, dy=10
).encode(
text='count:Q'
)
background + points + text
Long-term, a better solution will be to use vega-label, which will be able to do this automatically once it's part of the Vega-Lite package. For Altair, this feature is tracked in this bug: https://github.com/altair-viz/altair/issues/1731

Can a condition be based on both a selection and a predicate?

Is it possible to combine a selection and a predicate in a condition? I would like to color points on a scatterplot only if the group is selected and above a certain value.
import altair as alt
from vega_datasets import data
source = data.cars()
selection = alt.selection_multi(fields=['Origin'])
color = alt.condition(
selection & (alt.datum.Miles_per_Gallon > 18),
alt.Color('Origin:N'),
alt.value('lightgray')
)
alt.Chart(source).mark_circle().encode(
x='Horsepower',
y='Miles_per_Gallon',
color=color,
tooltip=['Name', 'Origin', 'Horsepower', 'Miles_per_Gallon']
).add_selection(
selection
)
Trying to compound the selection and the predicate raises:
Javascript Error: Cannot find a selection named "(datum.Miles_per_Gallon > 18)".
The code works with either just the selection or just the predicate, but not both. The only solution I can think of is layering a scatterplot on top with all the data points below the threshold colored gray. Appreciate any help, thanks!
It looks like the & operator does not work properly between a selection and an expression (tracked by this issue in the Altair repository). You can work around this by using the underlying schema object instead:
color = alt.condition(
alt.LogicalAndPredicate(**{'and': [selection, '(datum.Miles_per_Gallon > 18)']}),
alt.Color('Origin:N'),
alt.value('lightgray')
)
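For context, here is a rough sketch of the two condition shapes that seem to be involved, written as plain Python dictionaries. It assumes the Vega-Lite 4 condition syntax that Altair 4 targets; selector001 is a placeholder selection name and color_encoding is a hypothetical helper, neither taken from the question. As far as I can tell, the & operator produces a selection composition in which the expression string is treated as a selection name, which would explain the error message, while the workaround produces a predicate composition under test.

```python
import json

SELECTION_NAME = "selector001"  # placeholder name, assumed for illustration
EXPR = "(datum.Miles_per_Gallon > 18)"


def color_encoding(condition_test):
    """Build a color encoding dict with the given condition test part."""
    return {
        "condition": {**condition_test, "field": "Origin", "type": "nominal"},
        "value": "lightgray",
    }


# Assumed shape of the failing spec: both operands are read as selection names.
failing = color_encoding({"selection": {"and": [SELECTION_NAME, EXPR]}})

# Assumed shape of the workaround: a logical "and" over two predicates,
# one selection predicate and one expression string.
working = color_encoding(
    {"test": {"and": [{"selection": SELECTION_NAME}, EXPR]}}
)

print(json.dumps(failing, indent=2))
print(json.dumps(working, indent=2))
```

Comparing the output of chart.to_dict() against these two shapes is one way to check which form a given Altair version actually emits.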
The resulting chart looks like this when the selection is empty:

Plotly - clicking on legend items - how to make an initial setting?

When someone clicks on a legend item, it becomes grey, and the data disappears, for instance, here. Is it possible to have a legend item start out grey (hidden) when the .html output is opened, and appear only after it is clicked? Thank you
You can do that using the visible property on the trace, just set visible='legendonly'.
visible
Type: enumerated , one of ( True | False | "legendonly" )
Default: True
Determines whether or not this trace is visible. If "legendonly", the
trace is not drawn, but can appear as a legend item (provided that the
legend itself is visible).
A common use case is when one has a lot of traces and wants to show only a few of them initially, e.g. with plotly.express:
import plotly.express as px
df = px.data.gapminder().query("continent == 'Europe'")
# Nb. This creates one trace per country (color='country'), with each trace `name`
# inheriting the value of its respective country.
fig = px.line(df, x='year', y='gdpPercap', color='country', symbol="country")
# Arbitrary selection
sel = ['Norway', 'Ireland', 'France', 'Switzerland']
# Disable the traces that are not in the selection
fig.update_traces(selector=lambda t: t.name not in sel, visible='legendonly')
fig.show()

Shared axis labels with independent scale

When facet/concat-ing charts, I would like the axis labels to be shared (so only 1 label per column/row, here: Horsepower), but the scale to be independent. Is this possible?
I thought a combination of resolve_axis and resolve_scale would be the way to go, as the title is a part of Axis, but I didn't get it to work.
I'm also wondering what resolve_axis actually does differently from resolve_scale. Does anyone have an example?
base = alt.Chart(source).mark_circle().encode(
x=alt.X('Horsepower:Q',),
y=alt.Y('Miles_per_Gallon:Q'),
color='Origin:N',
row=alt.Row('Origin:N'),
).properties(
width=200, height=100
)
base.resolve_axis(
x='shared' # doesn't do anything obvious
).resolve_scale(
x='independent'
)
I found a hacky way to do this, by misusing the facet header:
base = alt.Chart(source).mark_circle(size=60).encode(
x=alt.X('Horsepower:Q',),
y=alt.Y('Miles_per_Gallon:Q',
axis=alt.Axis(title=''),),
color='Origin:N',
column=alt.Column('Origin:N', header=alt.Header(title='Miles_per_Gallon')),
).properties(
width=200, height=200
).configure_header(
labelExpr="['Origin',datum.value]",
titleOrient='left'
)
display(base.resolve_scale(y='shared'))
display(base.resolve_scale(y='independent'))
I don't know of any way to do what you're hoping for (independent scales with only a single outer axis title) via scale and guide resolution.
As to your question of the difference between resolve_scale and resolve_axis, an example may help.
Here's a chart with independent y scale:
import altair as alt
from vega_datasets import data
source = data.cars()
base = alt.Chart(source).mark_circle().encode(
x=alt.X('Horsepower:Q',),
y=alt.Y('Miles_per_Gallon:Q'),
color='Origin:N',
column=alt.Column('Origin:N'),
).properties(
width=150, height=150
)
base.resolve_scale(
y='independent'
)
And here's one with independent y axis:
base.resolve_axis(
y='independent'
)
In both cases, each chart gets its own axis (because independent scales imply independent axes), but only with an independent scale do the axis ranges differ from one chart to the next.
