How to customize image mark width and height to act as bar chart bars in Altair? - altair

I am exploring the image mark in Altair. I tried to make a bar chart with images as bars:
import altair as alt
import pandas as pd

source = pd.DataFrame({
    'a': ['A', 'B', 'C'],
    'b': [28, 12, 77],
    'url': ['https://vega.github.io/vega-datasets/data/7zip.png',
            'https://vega.github.io/vega-datasets/data/gimp.png',
            'https://vega.github.io/vega-datasets/data/ffox.png']
})
init = alt.Chart(source).mark_image(
    # width= 50,
).encode(
    x='a',
    y='b',
    url='url',
    size=alt.Size('b:N', scale=None),
    # color = 'a'
).properties(
    width=512,
    height=512
).configure_axis(
    grid=False
)
My current result is not what I want: I would like the height of each image to correspond to its y value while the width stays the same.
Can I achieve this with Altair? Thanks!

The documentation mentions an aspect parameter that should do what you are looking for, but I couldn't find an example.
Vega-Lite, on which Altair is based, allows changing the image aspect ratio, as one of its examples shows. You can try looking at the source of that example and work out how to make it work in Altair.

Related

Adjusting properties (e.g. width and height) on a layered faceted chart produces a "'data' is a required property" error

I'm trying to adjust the width and height of a faceted layered chart. I have these two charts:
bar_chart = alt.Chart().mark_bar().encode(x='x', y='mean(y)')
text_overlay = bar_chart.mark_text().encode(text='mean(y)')
If I try to adjust the width after layering the charts with:
alt.layer(bar_chart, text_overlay, data=df).facet('z').properties(width=100)
I get a 'data' is a required property error.
I can change the width and height by adjusting one of the original charts with:
bar_chart = alt.Chart().mark_bar().encode(x='x', y='mean(y)').properties(width=100, height=200)
but I'm trying to return this chart as the output of a function, so I'd like to allow the user to adjust the properties outside of the function.
Is there any way around this error that doesn't require applying the properties to the original charts?
Thank you.
I think you can only use .properties when using facet as an encoding, but that is not compatible with layering. You could use the object oriented syntax to set the property after creation of the faceted layered chart:
import altair as alt
import pandas as pd
chart = alt.Chart(pd.DataFrame({'x': [1, 2], 'y': ['b', 'a']})).mark_point().encode(x='x', y='y')
chart_layered = (chart + chart).facet(facet='y')
chart_layered.spec.width = 100
chart_layered
To figure out what these attributes are, you could create a faceted layered chart using .properties the right way and study its dictionary or JSON output:
chart = alt.Chart(pd.DataFrame({'x': [1, 2], 'y': ['b', 'a']})).mark_point().encode(x='x', y='y').properties(width=100).facet(facet='y')
chart.to_dict()
{'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}},
'data': {'name': 'data-5f40ae3874157bbf64df213f9a844d59'},
'facet': {'type': 'nominal', 'field': 'y'},
'spec': {'mark': 'point',
'encoding': {'x': {'type': 'quantitative', 'field': 'x'},
'y': {'type': 'nominal', 'field': 'y'}},
'width': 100},
'$schema': 'https://vega.github.io/schema/vega-lite/v4.8.1.json',
'datasets': {'data-5f40ae3874157bbf64df213f9a844d59': [{'x': 1, 'y': 'b'},
{'x': 2, 'y': 'a'}]}}

Python & Plotly: Adding Hover Data to Density Heat Map

As a trivial example, I'm using the first heat map shown on the Plotly 2D Histograms webpage. The documentation references the hover_data parameter but I'm unable to display additional data. The data frame in the example includes these columns:
>>> df.columns
Index(['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size'], dtype='object')
According to that documentation, hover data such as "size" can be added like this:
>>> fig = px.density_heatmap(df, x="total_bill", y="tip", hover_data=['size'])
>>> fig.show()
However, the generated plot only shows "total_bill", "tip", and "count" in the hover data. What am I missing?
This is definitely a bug with px.density_heatmap. After running fig = px.density_heatmap(df, x="total_bill", y="tip", hover_data=['size']), the hovertemplate should reference the size column, but the generated hovertemplate string omits it:
fig.data[0].hovertemplate
'total_bill=%{x}<br>tip=%{y}<br>count=%{z}<extra></extra>'
For the sake of comparison, if we run: fig = px.scatter(df, x="total_bill", y="tip", hover_data=['size']), we can see that the hovertemplate does include the size column embedded in the customdata:
fig.data[0].hovertemplate
'total_bill=%{x}<br>tip=%{y}<br>size=%{customdata[0]}<extra></extra>'
You probably need to use Plotly graph_objects for the time being to display additional df columns in your heatmap when you hover. I can circle back on this answer to show you if you would like!
Thanks to Derek's suggestion and this SO Q&A, I used hovertemplate, graph_objects, and customdata to plot the data, but the custom hover data for the "Smokes" field is not displayed.
import plotly.graph_objects as go
import plotly.express as px
df = px.data.tips()
fig = go.Figure(
    data=go.Histogram2d(
        x=df['total_bill'],
        y=df['tip'],
        z=df['size'],
        histfunc='sum',
        customdata=[df['smoker']]
    )
)
fig.update_traces(
    hovertemplate='<br>'.join([
        'Bill $: %{x}',
        'Tip $: %{y}',
        'Size: %{z}',
        'Smokes: %{customdata[0]}'
    ])
)
fig.show()
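One likely reason the "Smokes" value stays empty is that Histogram2d aggregates many rows into each bin, so per-row customdata has no single value to show for a cell. A possible workaround, sketched below under that assumption, is to do the binning yourself with NumPy and pass one customdata value per bin to go.Heatmap. The helper names bin_with_extra and make_heatmap are hypothetical, the sample arrays are made up for illustration (they are not the tips data), and the Plotly part has not been checked against every Plotly version.

```python
import numpy as np


def bin_with_extra(x, y, extra, bins):
    """Return per-bin counts and per-bin sums of `extra`, plus the bin edges."""
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
    # Reuse the same edges so both arrays describe identical bins.
    extra_sum, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=extra)
    return counts, extra_sum, x_edges, y_edges


def make_heatmap(counts, extra_sum, x_edges, y_edges):
    """Build a heatmap figure whose hover text shows the extra per-bin value."""
    import plotly.graph_objects as go  # imported here so the binning runs without Plotly

    x_centers = (x_edges[:-1] + x_edges[1:]) / 2
    y_centers = (y_edges[:-1] + y_edges[1:]) / 2
    return go.Figure(go.Heatmap(
        x=x_centers,
        y=y_centers,
        z=counts.T,  # histogram2d indexes [x, y]; Heatmap expects rows = y
        customdata=np.dstack([extra_sum.T]),  # shape (ny, nx, 1), one value per cell
        hovertemplate='<br>'.join([
            'Bill $: %{x}',
            'Tip $: %{y}',
            'Count: %{z}',
            'Smokers: %{customdata[0]}',
        ]) + '<extra></extra>',
    ))


# Made-up illustration data, not the tips data set.
total_bill = np.array([10.0, 10.0, 20.0, 20.0])
tip = np.array([1.0, 3.0, 1.0, 3.0])
smokes = np.array([1, 0, 1, 1])  # 1 = smoker, 0 = non-smoker

counts, smokers, x_edges, y_edges = bin_with_extra(total_bill, tip, smokes, bins=(2, 2))
```

With the real data, df['smoker'] would first need converting to 0/1 (for example with df['smoker'] == 'Yes'), and make_heatmap(counts, smokers, x_edges, y_edges).show() would display the figure.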

matplotlib: controlling position of y axis label with multiple twinx subplots

I wrote a Python script based on matplotlib that generates curves based on a common timeline. The number of curves sharing the same x axis in my plot can vary from 1 to 6 depending on user options.
Each plotted data series uses a different y scale and requires its own axis. As a result, I may need to draw up to 5 different y axes on the right of my plot. I found a way in another post to offset the position of the axes as I add new ones, but I still have two issues:
1. How can I control the position of the multiple axes so that the tick labels don't overlap?
2. How can I control the position of each axis label so that it is placed vertically at the bottom of each axis, and how can I preserve this alignment when the display window is resized, zoomed in, etc.?
I probably need to write some code that first queries the position of the axis and then places the label relative to that position, but I really have no idea how to do that.
I cannot share my entire code because it is too big, but I derived it from the code in this example. I modified that example by adding one extra plot and one extra axis to more closely match what I intend to do in my script.
import matplotlib.pyplot as plt
def make_patch_spines_invisible(ax):
    ax.set_frame_on(True)
    ax.patch.set_visible(False)
    for sp in ax.spines.values():
        sp.set_visible(False)
fig, host = plt.subplots()
fig.subplots_adjust(right=0.75)
par1 = host.twinx()
par2 = host.twinx()
par3 = host.twinx()
# Offset the right spine of par2. The ticks and label have already been
# placed on the right by twinx above.
par2.spines["right"].set_position(("axes", 1.2))
# Having been created by twinx, par2 has its frame off, so the line of its
# detached spine is invisible. First, activate the frame but make the patch
# and spines invisible.
make_patch_spines_invisible(par2)
# Second, show the right spine.
par2.spines["right"].set_visible(True)
par3.spines["right"].set_position(("axes", 1.4))
make_patch_spines_invisible(par3)
par3.spines["right"].set_visible(True)
p1, = host.plot([0, 1, 2], [0, 1, 2], "b-", label="Density")
p2, = par1.plot([0, 1, 2], [0, 3, 2], "r-", label="Temperature")
p3, = par2.plot([0, 1, 2], [50, 30, 15], "g-", label="Velocity")
p4, = par3.plot([0,0.5,1,1.44,2],[100, 102, 104, 108, 110], "m-", label="Acceleration")
host.set_xlim(0, 2)
host.set_ylim(0, 2)
par1.set_ylim(0, 4)
par2.set_ylim(1, 65)
host.set_xlabel("Distance")
host.set_ylabel("Density")
par1.set_ylabel("Temperature")
par2.set_ylabel("Velocity")
par3.set_ylabel("Acceleration")
host.yaxis.label.set_color(p1.get_color())
par1.yaxis.label.set_color(p2.get_color())
par2.yaxis.label.set_color(p3.get_color())
par3.yaxis.label.set_color(p4.get_color())
tkw = dict(size=4, width=1.5)
host.tick_params(axis='y', colors=p1.get_color(), **tkw)
par1.tick_params(axis='y', colors=p2.get_color(), **tkw)
par2.tick_params(axis='y', colors=p3.get_color(), **tkw)
par3.tick_params(axis='y', colors=p4.get_color(), **tkw)
host.tick_params(axis='x', **tkw)
lines = [p1, p2, p3, p4]
host.legend(lines, [l.get_label() for l in lines])
# fourth y axis is not shown unless I add this line
plt.tight_layout()
plt.show()
When I run this, I obtain the following plot:
[Figure: output from the above script]
In this image, question 2 above means that I want the y-axis labels 'Temperature', 'Velocity' and 'Acceleration' to be drawn directly below their corresponding axes.
Thanks in advance for any help.
Regards,
L.
What worked for me was ImportanceOfBeingErnest's suggestion of using text (with a line like
host.text(1.2, 0, "Velocity" , ha="left", va="top", rotation=90,
transform=host.transAxes))
instead of trying to control the label position.
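A minimal sketch of that text-based approach applied to all three right-hand axes. The offsets dictionary and the -0.02 vertical gap are illustrative values mirroring the question's script, not part of the original answer. Because the text is positioned with host.transAxes, it is placed in axes coordinates and should stay attached to the bottom of each spine when the window is resized or the plot is zoomed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, host = plt.subplots()
# Leave room on the right for the extra axes and below for the labels.
fig.subplots_adjust(right=0.6, bottom=0.3)

# Spine position of each extra y axis, in host axes coordinates (illustrative values).
offsets = {"Temperature": 1.0, "Velocity": 1.2, "Acceleration": 1.4}

labels = {}
for name, offset in offsets.items():
    par = host.twinx()
    par.spines["right"].set_position(("axes", offset))
    par.spines["right"].set_visible(True)
    # Place the label just below the bottom end of the spine, in axes coordinates.
    labels[name] = host.text(
        offset, -0.02, name,
        ha="center", va="top", rotation=90,
        transform=host.transAxes,
    )

fig.canvas.draw()  # render once to confirm the figure builds
```

The regular axis labels are simply left unset here, so only the text objects label the axes.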

Adding a second label to colorbar

I have an imshow plot with a colorbar. I want two labels in the colorbar, one on the left side and the other one on the right side.
This is a minimal example:
import numpy as np
import matplotlib.pyplot as plt

V = np.array([[1, 2, 3], [4, 5, 6]])  # Just a sample array
plt.imshow(V, cmap="hot", interpolation='none')
clb = plt.colorbar()
clb.set_label("Firstlabel", fontsize=10, labelpad=-40, y=0.5, rotation=90)
# clb.set_label("SECONDLABEL")  # This is the label I want to add
plt.savefig("Example")
This produces:
I want a second label on the right side of the colorbar. If I use the commented line a second colorbar is added to my plot, and that is not what I want. How can I do this?
You can't have two label objects, but you could add a second label using clb.ax.text.
Also, note that to move the first label to the left hand side, you could use clb.ax.yaxis.set_label_position('left') rather than labelpad=-40
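So, the following lines produce a colorbar with one label on each side:
clb = plt.colorbar()
clb.set_label("Firstlabel", fontsize=10, y=0.5, rotation=90)
clb.ax.yaxis.set_label_position('left')
# Axes coordinates keep the text position independent of the colorbar's data range.
clb.ax.text(2.5, 0.5, "SECONDLABEL", fontsize=10, rotation=90, va='center', transform=clb.ax.transAxes)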

Bokeh secondary y range affecting primary y range

I'm working on building a Bokeh plot using bokeh.plotting. I have two series with a shared index that I want to plot as two sets of vertical bars. When I use a single bar everything works fine, but when I add a second y range and the second bar, it seems to affect the primary y range (it changes the values from 0 to 4), and my second vbar() overlays the first. Any help understanding why the bars overlap instead of sitting side by side, and why the second series/y axis seems to affect the first even though they are separate, would be appreciated.
import pandas as pd
import bokeh.plotting as bp
from bokeh.models import NumeralTickFormatter, HoverTool, Range1d, LinearAxis
df_x_series = ['a','b','c']
fig = bp.figure(title='WIP',x_range=df_x_series,plot_width=1200,plot_height=600,toolbar_location='below',toolbar_sticky=False,tools=['reset','save'],active_scroll=None,active_drag=None,active_tap=None)
fig.title.align= 'center'
fig.extra_y_ranges = {'c_count':Range1d(start=0, end=10)}
fig.add_layout(LinearAxis(y_range_name='c_count'), 'right')
fig.vbar(bottom=0, top=[1,2,3], x=['a','b','c'], color='blue', legend='Amt', width=0.3, alpha=0.5)
fig.vbar(bottom=0, top=[5,7,8], x=['a','b','c'], color='green', legend='Ct', width=0.3, alpha=0.8, y_range_name='c_count')
fig.yaxis[0].formatter = NumeralTickFormatter(format='0.0')
bp.output_file('bar.html')
bp.show(fig)
Here's the code for the plot I believe you want:
import bokeh.plotting as bp
from bokeh.models import NumeralTickFormatter, Range1d, LinearAxis
df_x_series = ['a', 'b', 'c']
fig = bp.figure(
    title='WIP',
    x_range=df_x_series,
    y_range=Range1d(start=0, end=4),
    plot_width=1200, plot_height=600,
    toolbar_location='below',
    toolbar_sticky=False,
    tools=['reset', 'save'],
    active_scroll=None, active_drag=None, active_tap=None
)
fig.title.align = 'center'
fig.extra_y_ranges = {'c_count': Range1d(start=0, end=10)}
fig.add_layout(LinearAxis(y_range_name='c_count'), 'right')
fig.vbar(bottom=0, top=[1, 2, 3], x=['a:0.35', 'b:0.35', 'c:0.35'], color='blue', legend='Amt', width=0.3, alpha=0.5)
fig.vbar(bottom=0, top=[5, 7, 8], x=['a:0.65', 'b:0.65', 'c:0.65'], color='green', legend='Ct', width=0.3, alpha=0.8, y_range_name='c_count')
fig.yaxis[0].formatter = NumeralTickFormatter(format='0.0')
bp.output_file('bar.html')
bp.show(fig)
A couple of notes:
Categorical axes are currently a bit (ahem) ugly in Bokeh. We hope to address this in the coming months. Each one has a scale of 0 - 1 after a colon which allows you to move things left and right. So I move the first bar to the left by 0.3/2 and the second bar to the right by 0.3/2 (0.3 because that's the width you had used)
The y_range changed because you were using the default y_range for your initial y_range which is a DataRange1d. DataRange uses all the data for the plot to pick its values and adds some padding which is why it was starting at below 0 and going up to the max of your new data. By manually specifying a range in the figure call you get around this.
Thanks for providing a code sample to work from :D
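The offset arithmetic from the first note can be sketched as a small helper. The function name dodge_labels is hypothetical, and the 'name:offset' string form is the categorical syntax used by the code in this answer; as far as I know, later Bokeh releases provide bokeh.transform.dodge for the same purpose, so check which form your version expects.

```python
def dodge_labels(categories, bar_width, side):
    """Return 'name:offset' strings that shift bars half a bar width off centre.

    side is -1 for the left-hand bar and +1 for the right-hand bar.
    """
    offset = 0.5 + side * bar_width / 2  # 0.5 is the centre of each category
    return [f"{cat}:{offset:.2f}" for cat in categories]


# Two bars of width 0.3 placed side by side within each category.
left = dodge_labels(['a', 'b', 'c'], 0.3, side=-1)
right = dodge_labels(['a', 'b', 'c'], 0.3, side=+1)
```

Here left and right correspond to the x values passed to the first and second vbar() calls above.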
