Altair dropdown for linear or log scale - altair

I'd like to be able to toggle between log and linear scale in my Altair plot. I'd also like to avoid multiple columns of transformed data if possible. I've tried this but get the error AttributeError: 'Scale' object has no attribute 'selection'.
import altair as alt
from vega_datasets import data
cars_data = data.cars()
input_dropdown = alt.binding_select(options=['linear','log'], name='Scale')
selection = alt.selection_single(fields=['Miles_per_Gallon'], bind=input_dropdown)
scale = alt.condition(selection, alt.Scale(type = 'linear'), alt.Scale(type = 'log'))
alt.Chart(cars_data).mark_point().encode(
x='Horsepower:Q',
y = alt.Y('Miles_per_Gallon:Q',
scale=scale),
tooltip='Name:N'
).add_selection(
scale
)
I've tried a variety of different things but can't seem to make it work. Any suggestions are greatly appreciated.

Related

Altair repeated chart, add different subplot/chart title

I can't believe I haven't been able to google the answer for this... In the documented example of repeated charts, how would I add different sub-chart titles?
import altair as alt
from vega_datasets import data
iris = data.iris.url
alt.Chart(iris).mark_point().encode(
alt.X(alt.repeat("column"), type='quantitative'),
alt.Y(alt.repeat("row"), type='quantitative'),
color='species:N'
).properties(
width=200,
height=200,
title="Chart title",
).repeat(
row=['petalLength', 'petalWidth'],
column=['sepalLength', 'sepalWidth']
).interactive()
This adds the same title to each sub-chart. Can I pass in a list of titles here? The figure in this question shows the same chart title appearing in all columns, and the same seems to be the case for my data/code.
I don't think it is possible to change the title for repeated charts, but depending on your application you might be able to work around this by using transform_fold plus faceting instead:
import altair as alt
from vega_datasets import data
iris = data.iris.url
alt.Chart(iris).mark_point().encode(
alt.X('species:N'),
alt.Y('value:Q'),
color='species:N'
).transform_fold(
['petalLength', 'petalWidth', 'sepalLength', 'sepalWidth']
).facet(
'key:N'
)

Hide the grid in a specific altair plot within a set of vstacked plots

I am trying to create a plot composed of 2 charts stacked vertically: a time series chart showing the data and, below it, a time series chart showing text labels for events on the time axis. I want the data chart to have a grid, but the mark_text chart below it to show neither an outer line nor a grid. I use chart.configure_axis(grid=False) to hide the grid but get the following error: Objects with "config" attribute cannot be used within LayerChart. Consider defining the config attribute in the LayerChart object instead.
I can't figure out where to apply the configure_axis(grid=False) option so that it only applies to the bottom plot. Any help on this would be greatly appreciated, as would any suggestion for implementing the label plot in a different way.
Here is my code:
import altair as alt
import pandas as pd
import locale
from altair_saver import save
from datetime import datetime
file = r'.\lagebericht.csv'  # raw string so the backslash is not read as an escape
df = pd.read_csv(file, sep=';')
source = df
locale.setlocale(locale.LC_ALL, "de_CH")
min_date = '2020-02-29'
domain_pd = pd.to_datetime([min_date, '2020-12-1']).astype(int) / 10 ** 6
base = alt.Chart(source, title='Neumeldungen BS').encode(
alt.X('test_datum:T', axis=alt.Axis(title="",format="%b %y"), scale = alt.Scale(domain=list(domain_pd) ))
)
bar = base.mark_bar(width = 1).encode(
alt.Y('faelle_bs:Q', axis=alt.Axis(title="Anzahl Fälle"), scale = alt.Scale(domain=(0, 120)))
)
line = base.mark_line(color='blue').encode(
y='faelle_Total:Q')
chart1 = (bar + line).properties(width=600)
events= pd.DataFrame({
'datum': [datetime(2020,7,1), datetime(2020,5,15)],
'const': [1,1],
'label': ['allgememeiner Lockdown', 'Gruppen > 50 verboten'],
})
base = alt.Chart(events).encode(
alt.X('datum:T', axis=alt.Axis(title="", format="%b %y"), scale = alt.Scale(domain=list(domain_pd) ))
)
points = base.mark_rule(color='blue').encode(
y=alt.Y('const:Q', axis=alt.Axis(title="",ticks=False, domain=False, labels=False), scale = alt.Scale(domain=(0, 10)))
)
text = base.mark_text(
align='right',
baseline='bottom',
angle = 20,
dx=0, # Nudges text to right so it doesn't appear on top of the bar
dy=20,
).encode(text='label:O').configure_axis(grid=False)
chart2 = (points + text).properties(width=600, height = 50)
save(chart1 & chart2, r"images\figs.html")
This is what it looks like without the grid=False option:
[Image: the stacked charts without the grid=False option]
The configure() method should be thought of as a way to specify a global chart theme; you cannot have different configurations within a single Chart (See https://altair-viz.github.io/user_guide/customization.html#global-config-vs-local-config-vs-encoding for a discussion of this).
The way to do what you want is not via global configuration, but via axis settings. For example, you can pass grid=False to alt.Axis:
points = alt.Chart(events).mark_rule(color='blue').encode(
x=alt.X('datum:T', axis=alt.Axis(title="", format="%b %y"), scale = alt.Scale(domain=list(domain_pd) )),
y=alt.Y('const:Q', axis=alt.Axis(title="",ticks=False, domain=False, labels=False), scale = alt.Scale(domain=(0, 10)))
)
text = alt.Chart(events).mark_text().encode(
x=alt.X('datum:T', axis=alt.Axis(title="", grid=False, format="%b %y"), scale = alt.Scale(domain=list(domain_pd) )),
text='label:O'
)

Altair plot, show vertical bars

import pandas as pd
import altair as alt
dicta = {
'date':['2019-06-29', '2019-06-30', '2019-07-01', '2019-07-02', '2019-07-03'],
'amount':[-9.35, -6.42, -13.55, -12.88, -12.24] }
dataset = pd.DataFrame(dicta)
alt.Chart(dataset).mark_bar().encode(
x = "date:T",
y = "amount:N"
)
I'm not sure why this generates horizontal bars, instead of vertical bars by default.
How can I change it? I would like to see a bar per day, up to the amount for the day.
Found the answer. I encoded the numerical column with N, but that type code is for nominal data, which confused Altair. The column needs to use Q (quantitative) instead.

How to change the limits for geo_shape in altair (python vega-lite)

I am trying to plot locations in three US states in Python with Altair. I saw the tutorial about the US map, but I am wondering if there is any way to zoom the image to only the three states of interest, i.e. NY, NJ and CT.
Currently, I have the following code:
import altair as alt
from vega_datasets import data
states = alt.topo_feature(data.us_10m.url, 'states')
# US states background
background = alt.Chart(states).mark_geoshape(
fill='lightgray',
stroke='white',
limit=1000
).properties(
title='US State Capitols',
width=700,
height=400
).project("albers")
points=alt.Chart(accts).mark_point().encode(
longitude = "longitude",
latitude = "latitude",
color = "Group")
background+points
I inspected the us_10m.url data set and it seems there is no field that specifies the individual states. So I am hoping I could somehow change the xlim and ylim for the background to [-80,-70] and [35,45], for example. I want to zoom in to the region where there are data points (blue dots).
Could someone kindly show me how to do that? Thanks!!
Update
There is a field called ID in the JSON file and I manually found out that NJ is 34, NY is 36 and CT is 9. Is there a way to filter on these IDs? That will get the job done!
Alright, it seems the selection/zoom/xlim/ylim feature for geo types is not supported yet:
Document and add warning that geo-position doesn't support selection yet #3305
So I ended up with a hackish way to solve this problem by first filtering on the IDs in pure Python. Basically, load the JSON file into a dictionary and then filter its geometries list before converting the dictionary to TopoJSON format. Below is an example for six states: PA, NJ, NY, CT, RI and MA.
import altair as alt
from vega_datasets import data
# Load the data, which is loaded as a dict object
us_10m = data.us_10m()
# Select the geometries under states under objects, filter on id (9,25,34,36,42,44)
us_10m['objects']['states']['geometries']=[item for item in us_10m['objects'] \
['states']['geometries'] if item['id'] in [9,25,34,36,42,44]]
# Make the topojson data
states = alt.Data(
values=us_10m,
format=alt.TopoDataFormat(feature='states',type='topojson'))
# Plot background (now only has 6 states)
background = alt.Chart(states).mark_geoshape(
fill='lightgray',
stroke='white',
limit=1000
).properties(
title='US State Capitols',
width=700,
height=400
).project("mercator")
# Plot the points
points=alt.Chart(accts).mark_circle(size=60).encode(
longitude = "longitude",
latitude = "latitude",
color = "Group").project("mercator")
# Overlay the two plots
background+points
The resulting plot looks OK.
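The filtering step can also be wrapped in a small helper that leaves the loaded dictionary untouched. This is only a sketch: the function name keep_states and the toy topology below are made up for illustration, and the toy dict merely mimics the objects/states/geometries layout used above.

```python
import copy

def keep_states(topology, object_name, keep_ids):
    """Return a copy of a TopoJSON dict whose named object keeps only the given ids."""
    filtered = copy.deepcopy(topology)  # do not mutate the caller's dict
    geometries = filtered["objects"][object_name]["geometries"]
    filtered["objects"][object_name]["geometries"] = [
        geom for geom in geometries if geom.get("id") in keep_ids
    ]
    return filtered

# Toy stand-in for the real us_10m dictionary (hypothetical arcs)
toy = {
    "type": "Topology",
    "objects": {
        "states": {
            "type": "GeometryCollection",
            "geometries": [
                {"type": "Polygon", "id": 9, "arcs": [[0]]},
                {"type": "Polygon", "id": 34, "arcs": [[1]]},
                {"type": "Polygon", "id": 6, "arcs": [[2]]},
            ],
        }
    },
    "arcs": [],
}

subset = keep_states(toy, "states", {9, 34, 36})
```

The returned dictionary can then be passed as values to alt.Data in the same way as in the code above.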

Optimal way to display data with different ranges

I have an application that pulls data from an FPGA and displays it for the engineers. Good application ... until you start displaying data with extremely different ranges,
say a signal fluctuating around +4000 and another around zero (both with small peak-to-peak amplitude).
At the moment the only real workaround is to "export to csv" and then view the data in Excel, but I would like to improve the application so that this isn't needed.
Option 1 is a more dynamic pointer that will give you readings of ALL visible plots for the present x
Option 2. Multiple Y axis. This is where it gets a bit ... tight with respect to UI area.
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import host_subplot
import mpl_toolkits.axisartist as AA
import numpy as np
t = np.arange(0,1,0.00001)
data = [5000*np.sin(t*2*np.pi*10),
10*np.sin(t*2*np.pi*20),
20*np.sin(t*2*np.pi*30),
np.sin(t*2*np.pi*40)+5000,
np.sin(t*2*np.pi*50)-5000,
np.sin(t*2*np.pi*60),
np.sin(t*2*np.pi*70),
]
fig = plt.figure()
host = host_subplot(111, axes_class=AA.Axes)
axis_list = [None]*7
for i in range(len(axis_list)):
    axis_list[i] = host.twinx()
    new_axis = axis_list[i].get_grid_helper().new_fixed_axis
    axis_list[i].axis['right'] = new_axis(loc='right',
                                          axes=axis_list[i],
                                          offset=(60*i,0))
    axis_list[i].axis['right'].toggle(all=True)
    axis_list[i].plot(t,data[i])
plt.show()
for i in data:
    plt.plot(t,i)
plt.show()
This code snippet doesn't resize the figure to ensure all 7 y-axes are visible, but ignoring that, you can see it is quite large...
Any advice with respect to multi-Y or a better solution to displaying no more than 7 datasets?
