Altair plot, show vertical bars - python-3.x

import pandas as pd
import altair as alt
dicta = {
'date':['2019-06-29', '2019-06-30', '2019-07-01', '2019-07-02', '2019-07-03'],
'amount':[-9.35, -6.42, -13.55, -12.88, -12.24] }
dataset = pd.DataFrame(dicta)
alt.Chart(dataset).mark_bar().encode(
x = "date:T",
y = "amount:N"
)
I'm not sure why this generates horizontal bars, instead of vertical bars by default.
How can I change it? I would like to see a bar per day, up to the amount for the day.

Found the answer. I encoded the numerical column with N. But this is for nominal data, and got Altair confused. Need to use Q/ Quantitative

Related

Altair dropdown for linear or log scale

I'd like for be able to toggle between log and linear scale in my altair plot. I'd also like to avoid multiple columns of transformed data if possible. I've tried this but get an error AttributeError: 'Scale' object has no attribute 'selection'
import altair as alt
from vega_datasets import data
cars_data = data.cars()
input_dropdown = alt.binding_select(options=['linear','log'], name='Scale')
selection = alt.selection_single(fields=['Miles_per_Gallon'], bind=input_dropdown)
scale = alt.condition(selection, alt.Scale(type = 'linear'), alt.Scale(type = 'log'))
alt.Chart(cars_data).mark_point().encode(
x='Horsepower:Q',
y = alt.Y('Miles_per_Gallon:Q',
scale=scale),
tooltip='Name:N'
).add_selection(
scale
)
I've tried a variety of different things but can't seem to make it work. Any suggestions are greatly appreciated.

Altair repeated chart, add different subplot/chart title

I can't believe I haven't been able to google the answer for this .... in the documented example of repeated charts, how would I add a different sub-chart titles?
import altair as alt
from vega_datasets import data
iris = data.iris.url
alt.Chart(iris).mark_point().encode(
alt.X(alt.repeat("column"), type='quantitative'),
alt.Y(alt.repeat("row"), type='quantitative'),
color='species:N'
).properties(
width=200,
height=200,
title="Chart title",
).repeat(
row=['petalLength', 'petalWidth'],
column=['sepalLength', 'sepalWidth']
).interactive()
adds the same title to each sub-chart. Can I pass in a list of titles here? The figure in this question shows that the same chart title would show up in all columns. The same seems to be the case for my data/code:
I don't think it is possible to change title for repeated charts, but depending on your application you might be able to workaround this by using a transform_fold + faceting instead:
import altair as alt
from vega_datasets import data
iris = data.iris.url
alt.Chart(iris).mark_point().encode(
alt.X('species:N'),
alt.Y('value:Q'),
color='species:N'
).transform_fold(
['petalLength', 'petalWidth', 'sepalLength', 'sepalWidth']
).facet(
'key:N'
)

How do i set the domain of an axis to a value that isn't a multiple of five in Altair?

I'm trying to set the x-axis domain to between 0-36, as some data I'm processing was collected in 6-week increments. Following the documentation i used the scale=alt.Scale(domain=[0,36]). However, this continues to show the chart up to 40.
df = pd.DataFrame({'x':[0,6,12,18,24,30,36],'y':[0,3,1,4,2,5,3]})
alt.Chart(df).mark_line(point=True).encode(
x=alt.X('x:Q',
axis=alt.Axis(values=[0,6,12,18,24,30,36]),
scale=alt.Scale(domain=[0,36])),
y=alt.Y('y:Q'),
)
Output of code above
Changing the above code to cut off between 30 and 35 i.e., scale=alt.Scale(domain=[0,31]) generates this behavior, where the chart axis gets truncated at 30 (but shows the data after 30, appropriately since the data hasn't been clipped).
But why can't I cut off the graph at values that aren't multiples of 5?
I'm using Altair v4.0.1
The Vega-Lite renderer defaults to choosing "nice" values for the scale. If you want to disable this behavior, you can pass nice=False:
import pandas as pd
import altair as alt
df = pd.DataFrame({'x':[0,6,12,18,24,30,36],'y':[0,3,1,4,2,5,3]})
alt.Chart(df).mark_line(point=True).encode(
x=alt.X('x:Q',
axis=alt.Axis(values=[0,6,12,18,24,30,36]),
scale=alt.Scale(domain=[0,36], nice=False)),
y=alt.Y('y:Q'),
)

Show only some bar labels for matplotlib bar chart

I have a bar chart with a lot of columns (around 100). I want to show only some of the bar labels (they are ordered in such a way that this is a perfectly reasonable way to present the data). Is there a simple way to do this, say show every 3rd or 5th label? I know I can manually pull together the list, but I figure there's likely an elegant option.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(100)
groupings = np.arange(0,100)
x_pos = [i for i, _ in enumerate(groupings)]
plt.bar(x_pos,data)
plt.xticks(x_pos,groupings)

Creating a structured grid of subplots with Seaborn FacetGrid

My attempt to use FacetGrid in Seaborn does not produces the expected results.
Moreover, I would like to control the white space in the grid.
My data and code is the following:
toy.to_json()
'{"has_cus_id_but_not_acc_id":{"0":0,"1":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":1,"19":0,"20":0,"21":0,"22":1,"23":0,"24":0,"25":1,"26":0,"27":1,"28":0,"29":1,"30":0,"31":1,"32":0,"33":1,"34":0,"35":1,"36":0,"37":1,"38":0,"39":0,"40":1,"41":1,"42":0,"43":1,"44":0,"45":1,"46":0,"47":1,"48":0,"49":1,"50":0,"51":1,"52":0,"53":1,"54":0,"55":1,"56":0,"57":1,"58":0,"59":1,"60":0,"61":1,"62":0,"63":1,"64":0,"65":1,"66":0,"67":1,"68":0,"69":1,"70":0,"71":1,"72":0,"73":1,"74":0,"75":1,"76":0,"77":0,"78":1,"79":0,"80":1,"81":0,"82":0,"83":1,"84":0,"85":1},"reg_year":{"0":2014.0,"1":2014.0,"2":2014.0,"3":2014.0,"4":2014.0,"5":2014.0,"6":2014.0,"7":2014.0,"8":2015.0,"9":2015.0,"10":2015.0,"11":2015.0,"12":2015.0,"13":2015.0,"14":2015.0,"15":2015.0,"16":2015.0,"17":2016.0,"18":2016.0,"19":2016.0,"20":2016.0,"21":2016.0,"22":2016.0,"23":2016.0,"24":2016.0,"25":2016.0,"26":2016.0,"27":2016.0,"28":2016.0,"29":2016.0,"30":2016.0,"31":2016.0,"32":2016.0,"33":2016.0,"34":2016.0,"35":2016.0,"36":2016.0,"37":2016.0,"38":2017.0,"39":2017.0,"40":2017.0,"41":2017.0,"42":2017.0,"43":2017.0,"44":2017.0,"45":2017.0,"46":2017.0,"47":2017.0,"48":2017.0,"49":2017.0,"50":2017.0,"51":2017.0,"52":2017.0,"53":2017.0,"54":2017.0,"55":2017.0,"56":2017.0,"57":2017.0,"58":2017.0,"59":2017.0,"60":2018.0,"61":2018.0,"62":2018.0,"63":2018.0,"64":2018.0,"65":2018.0,"66":2018.0,"67":2018.0,"68":2018.0,"69":2018.0,"70":2018.0,"71":2018.0,"72":2018.0,"73":2018.0,"74":2018.0,"75":2018.0,"76":2018.0,"77":2018.0,"78":2018.0,"79":2018.0,"80":2018.0,"81":2018.0,"82":2019.0,"83":2019.0,"84":2019.0,"85":2019.0},"reg_month":{"0":3.0,"1":5.0,"2":6.0,"3":7.0,"4":9.0,"5":10.0,"6":11.0,"7":12.0,"8":1.0,"9":3.0,"10":5.0,"11":6.0,"12":7.0,"13":8.0,"14":9.0,"15":11.0,"16":12.0,"17":1.0,"18":1.0,"19":2.0,"20":3.0,"21":4.0,"22":4.0,"23":5.0,"24":6.0,"25":6.0,"26":7.0,"27":7.0,"28":8.0,"29":8.0,"30":9.0,"31":9.0,"32":10.0,"33":10.0,"34":11.0,"35":11.0,"36":12.0,"37":12.0,"38":1.0,"39":2.0,"40":2.0,"41":3.0,"42":4.0,"43":4.0,"44":5.0,"45":5.0,"46":6.0,"47":6.0,"48":7.0,"49":7.0,"50":8.0,"51":8.0,"52":9.0,"53":9.0,"54":10.0,"55":10.0,"56":11.0,"57":11.0,"58":12.0,"59":12.0,"60":1.0,"61":1.0,"62":2.0,"63":2.0,"64":3.0,"65":3.0,"66":4.0,"67":4.0,"68":5.0,"69":5.0,"70":6.0,"71":6.0,"72":7.0,"73":7.0,"74":8.0,"75":8.0,"76":9.0,"77":10.0,"78":10.0,"79":11.0,"80":11.0,"81":12.0,"82":1.0,"83":1.0,"84":2.0,"85":2.0},"Total_Revenue":{"0":35852.02,"1":2623.97,"2":3526.67,"3":21466.71,"4":72784.1200000003,"5":103921.2899999999,"6":10852.87,"7":16522.07,"8":7443.76,"9":68962.1600000002,"10":10956.38,"11":193856.8799999985,"12":110766.6099999997,"13":123861.8599999987,"14":2722.34,"15":303488.6900000007,"16":6876.58,"17":17729.5,"18":4687.93,"19":26914.06,"20":2228.12,"21":15708.93,"22":859.58,"23":19164.89,"24":163164.4799999995,"25":33180.7300000001,"26":10033.01,"27":1114.48,"28":462613.2900000042,"29":9822.95,"30":70901.4400000003,"31":22370.29,"32":46711.8900000002,"33":2335.02,"34":7259.28,"35":11.83,"36":13590.51,"37":7677.77,"38":282.01,"39":358522.7900000003,"40":5844.0,"41":7027.28,"42":1908.71,"43":4032.35,"44":11072.6,"45":3973.15,"46":30706.23,"47":2644.13,"48":23831.75,"49":670.12,"50":6949.54,"51":4687.7,"52":9672.69,"53":7333.01,"54":12814.33,"55":689.39,"56":6962.86,"57":2283.16,"58":1259.5,"59":224.84,"60":12812.12,"61":247.68,"62":25452.65,"63":1245.02,"64":24211.36,"65":5255.25,"66":28402.76,"67":9148.55,"68":14822.61,"69":345.37,"70":12408.13,"71":989.93,"72":10601.33,"73":730.32,"74":169020.5000000001,"75":697.54,"76":3862038.6799997138,"77":6148750.9899984254,"78":194.06,"79":2379382.4500000761,"80":1174.11,"81":1729567.9000000793,"82":889650.029999995,"83":95.8,"84":415996.6999999974,"85":654.78}}'
g = sns.FacetGrid(toy, col='has_cus_id_but_not_acc_id', hue='reg_year')
g.map(sns.barplot, 'reg_month', 'Total_Revenue')
g.add_legend();
If I use bar in pyplot I get this:
g = sns.FacetGrid(toy, col='has_cus_id_but_not_acc_id', hue='reg_year')
g.map(plt.bar, 'reg_month', 'Total_Revenue')
g.add_legend();
Again, I would like to be able to define the white space of the grid.
In addition I would not like to have the bars stacked one over the other but rather one next to the other.
Some values of the year 2018 are really large compared to the any of the values where has_cus_id_but_not_acc_id is 1. Hence the right plot is almost empty. It might make sense to use a logarithmic scale.
Now you have 6 years, so each month would need to show 6 bars next to each other. That will make bars pretty small and does not let the chart be easily readable. Still it's possible.
The following does not use seaborn, but pandas and matplotlib:
import matplotlib.pyplot as plt
import pandas as pd
toy = '{"has_cus_id_but_not_acc_id":{"0":0,"1":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":1,"19":0,"20":0,"21":0,"22":1,"23":0,"24":0,"25":1,"26":0,"27":1,"28":0,"29":1,"30":0,"31":1,"32":0,"33":1,"34":0,"35":1,"36":0,"37":1,"38":0,"39":0,"40":1,"41":1,"42":0,"43":1,"44":0,"45":1,"46":0,"47":1,"48":0,"49":1,"50":0,"51":1,"52":0,"53":1,"54":0,"55":1,"56":0,"57":1,"58":0,"59":1,"60":0,"61":1,"62":0,"63":1,"64":0,"65":1,"66":0,"67":1,"68":0,"69":1,"70":0,"71":1,"72":0,"73":1,"74":0,"75":1,"76":0,"77":0,"78":1,"79":0,"80":1,"81":0,"82":0,"83":1,"84":0,"85":1},"reg_year":{"0":2014.0,"1":2014.0,"2":2014.0,"3":2014.0,"4":2014.0,"5":2014.0,"6":2014.0,"7":2014.0,"8":2015.0,"9":2015.0,"10":2015.0,"11":2015.0,"12":2015.0,"13":2015.0,"14":2015.0,"15":2015.0,"16":2015.0,"17":2016.0,"18":2016.0,"19":2016.0,"20":2016.0,"21":2016.0,"22":2016.0,"23":2016.0,"24":2016.0,"25":2016.0,"26":2016.0,"27":2016.0,"28":2016.0,"29":2016.0,"30":2016.0,"31":2016.0,"32":2016.0,"33":2016.0,"34":2016.0,"35":2016.0,"36":2016.0,"37":2016.0,"38":2017.0,"39":2017.0,"40":2017.0,"41":2017.0,"42":2017.0,"43":2017.0,"44":2017.0,"45":2017.0,"46":2017.0,"47":2017.0,"48":2017.0,"49":2017.0,"50":2017.0,"51":2017.0,"52":2017.0,"53":2017.0,"54":2017.0,"55":2017.0,"56":2017.0,"57":2017.0,"58":2017.0,"59":2017.0,"60":2018.0,"61":2018.0,"62":2018.0,"63":2018.0,"64":2018.0,"65":2018.0,"66":2018.0,"67":2018.0,"68":2018.0,"69":2018.0,"70":2018.0,"71":2018.0,"72":2018.0,"73":2018.0,"74":2018.0,"75":2018.0,"76":2018.0,"77":2018.0,"78":2018.0,"79":2018.0,"80":2018.0,"81":2018.0,"82":2019.0,"83":2019.0,"84":2019.0,"85":2019.0},"reg_month":{"0":3.0,"1":5.0,"2":6.0,"3":7.0,"4":9.0,"5":10.0,"6":11.0,"7":12.0,"8":1.0,"9":3.0,"10":5.0,"11":6.0,"12":7.0,"13":8.0,"14":9.0,"15":11.0,"16":12.0,"17":1.0,"18":1.0,"19":2.0,"20":3.0,"21":4.0,"22":4.0,"23":5.0,"24":6.0,"25":6.0,"26":7.0,"27":7.0,"28":8.0,"29":8.0,"30":9.0,"31":9.0,"32":10.0,"33":10.0,"34":11.0,"35":11.0,"36":12.0,"37":12.0,"38":1.0,"39":2.0,"40":2.0,"41":3.0,"42":4.0,"43":4.0,"44":5.0,"45":5.0,"46":6.0,"47":6.0,"48":7.0,"49":7.0,"50":8.0,"51":8.0,"52":9.0,"53":9.0,"54":10.0,"55":10.0,"56":11.0,"57":11.0,"58":12.0,"59":12.0,"60":1.0,"61":1.0,"62":2.0,"63":2.0,"64":3.0,"65":3.0,"66":4.0,"67":4.0,"68":5.0,"69":5.0,"70":6.0,"71":6.0,"72":7.0,"73":7.0,"74":8.0,"75":8.0,"76":9.0,"77":10.0,"78":10.0,"79":11.0,"80":11.0,"81":12.0,"82":1.0,"83":1.0,"84":2.0,"85":2.0},"Total_Revenue":{"0":35852.02,"1":2623.97,"2":3526.67,"3":21466.71,"4":72784.1200000003,"5":103921.2899999999,"6":10852.87,"7":16522.07,"8":7443.76,"9":68962.1600000002,"10":10956.38,"11":193856.8799999985,"12":110766.6099999997,"13":123861.8599999987,"14":2722.34,"15":303488.6900000007,"16":6876.58,"17":17729.5,"18":4687.93,"19":26914.06,"20":2228.12,"21":15708.93,"22":859.58,"23":19164.89,"24":163164.4799999995,"25":33180.7300000001,"26":10033.01,"27":1114.48,"28":462613.2900000042,"29":9822.95,"30":70901.4400000003,"31":22370.29,"32":46711.8900000002,"33":2335.02,"34":7259.28,"35":11.83,"36":13590.51,"37":7677.77,"38":282.01,"39":358522.7900000003,"40":5844.0,"41":7027.28,"42":1908.71,"43":4032.35,"44":11072.6,"45":3973.15,"46":30706.23,"47":2644.13,"48":23831.75,"49":670.12,"50":6949.54,"51":4687.7,"52":9672.69,"53":7333.01,"54":12814.33,"55":689.39,"56":6962.86,"57":2283.16,"58":1259.5,"59":224.84,"60":12812.12,"61":247.68,"62":25452.65,"63":1245.02,"64":24211.36,"65":5255.25,"66":28402.76,"67":9148.55,"68":14822.61,"69":345.37,"70":12408.13,"71":989.93,"72":10601.33,"73":730.32,"74":169020.5000000001,"75":697.54,"76":3862038.6799997138,"77":6148750.9899984254,"78":194.06,"79":2379382.4500000761,"80":1174.11,"81":1729567.9000000793,"82":889650.029999995,"83":95.8,"84":415996.6999999974,"85":654.78}}'
df = pd.read_json(toy)
df['reg_year'].astype(int)
u = df["has_cus_id_but_not_acc_id"].unique()
y = df['reg_year'].unique()
fig, axes = plt.subplots(1,len(u), sharey=True)
axes[0].set_yscale("log")
for ax, (n, grp) in zip(axes.flat, df.groupby("has_cus_id_but_not_acc_id")):
piv = grp.pivot('reg_month', 'reg_year', 'Total_Revenue')
empty = pd.DataFrame(index=range(1,12), columns=y)
empty.combine_first(piv).plot.bar(ax=ax, width=0.8, legend=False)
axes[1].legend()
plt.show()

Resources