Altair - Undefined Color Legend when adding a Regression Line - altair

Altair adds an undefined entry to the color legend after I add a regression line.
It looks like this happens because the regression line introduces a new color to the chart.
I tried to change the regression line color by passing a color parameter, mark_line(color="red"),
but it did not work (the regression line did not turn red).
I would be grateful for any help.
Below is my code:
test = alt.Chart(sortedGNI).mark_circle().encode(
    alt.X('CO2 Emission:Q', title="CO2 Emission per Capita 2019"),
    alt.Y('Human Development Index:Q'),
    alt.Color('GNI cat:O', scale=alt.Scale(scheme='redyellowgreen'),
              sort=["Very High GNI", "High GNI", "Mid GNI", "Low GNI"],
              title=["Gross National Income ", "Per Capita"]),
    tooltip=['Country:N', 'Gross National Income Per Capita:Q', 'Human Development Index:Q', 'CO2 Emission:Q']
)
test + test.transform_regression("CO2 Emission", "Human Development Index", method="pow").mark_line()

This should work as well:
test + test.transform_regression(
    "CO2 Emission", "Human Development Index", method="pow"
).mark_line().encode(color=alt.Color(legend=None))

Related

Plotly does not color-code my chart according to the variable indicated

I have a problem: when I concatenate two CSV files and create a "Version" column with string values, Plotly does NOT generate a separate color class for each value (in this case 1 and 2). When the variable is numeric, it does generate a continuous color scale, but I need a discrete classification.
Image shows concatenation and data type
Two legends are observed, but only one color is observed
This is an example of how it doesn't work. However, if I change the variable to continuous, it does work.
I am generating this block to generate graphs for each type of group. Which should be my final result.
grouped = df_final.groupby('Name')
plots = []
for name, group in grouped:
    # create a new figure for the group
    fig = px.scatter(group.reset_index(), x="Time", y="Observed", opacity=1, width=800, height=600,
                     color="Version")
    fig.show()
This block generates the chart with its legend, but does not show the colors in the chart.
I am starting with python and plotly, any help would be appreciated.
I'm trying to understand why plotly doesn't sort my string variables

Pie Chart with Labels and facet in altair: not able to proper render the value labels

Using the single chart example, I have a label for each slice of the pie.
If I try to turn it into a faceted chart using this code
df = pd.read_csv("input.csv", keep_default_na=False)
base = alt.Chart(df).encode(
    theta=alt.Theta(field="v", type="quantitative"),
    color=alt.Color(field="k", type="nominal")
)
pie = base.mark_arc(outerRadius=100)
text = base.mark_text(radius=115, fill="black").encode(alt.Text(field="v", type="quantitative", format=",.1f"))
alt.layer(pie, text, data=df).facet(column='label')
all the labels end up in the same wedge and are illegible (here is the Vega-Lite version).
How can I get a result similar to the single chart?
Thank you
f,n,k,v,label
1,3,0-5 %,99.7289972899729,Forest
1,4,5-10 %,0.27100271002710025,Forest
0,1,0-5 %,100.0,Non-Forest
254,5,0-5 %,99.0077177508269,unclassifiable
254,6,5-10 %,0.9922822491730982,unclassifiable
I needed to add:
stack=True as an option in the theta channel encoding;
and .resolve_scale(theta="independent") on the chart.
With these it works (I learned this thanks to Mattijn van Hoek).
df = pd.read_csv("input.csv", keep_default_na=False)
base = alt.Chart(df).encode(
    theta=alt.Theta(field="v", type="quantitative", stack=True),
    color=alt.Color(field="k", type="nominal")
)
pie = base.mark_arc(outerRadius=100)
text = base.mark_text(radius=115, fill="black").encode(alt.Text(field="v", type="quantitative", format=",.1f"))
alt.layer(pie, text, data=df).facet(column='label').resolve_scale(theta="independent")

How to add text on interactive Scatter on Altair?

I am trying to adapt the Selection Detail Example from the Altair docs (https://altair-viz.github.io/gallery/select_detail.html#selection-detail-example).
I won't detail my DataFrame structure, which is identical to the one in the example (including variable names).
The original code works well:
# Data is prepared, now make a chart
selector = alt.selection_single(empty='all', fields=['id'])
base = alt.Chart(data).properties(
    width=250,
    height=250
).add_selection(selector)
points = base.mark_point(filled=True, size=200, opacity=0.9).encode(
    x=alt.X('mean(y)', title='Durée de perception', scale=alt.Scale(domain=(11, 23))),
    y=alt.Y('mean(x)', title='Taux de marge (%PM)'),
    color=alt.condition(selector, 'id:O', alt.value('lightgray')),
    tooltip=['mean(y)', 'mean(x)']
)
timeseries = base.mark_bar(opacity=1).encode(
    x=alt.X('time', title='Items'),
    y=alt.Y('value', scale=alt.Scale(domain=(-1, 1)), stack=None),
    color=alt.Color('id:O', scale=alt.Scale(domain=domain, range=range_))
    #, legend=None)
).transform_filter(
    selector
)
points | timeseries
No problem at this stage, although it would be useful to hide all the bars on the right chart when no selection is made on the left chart (I don't know if that's possible).
After that, I tried to add text to the scatter plot by adding this at the end of the code:
text = points.mark_text(dy=-5).encode(
    x=alt.X('mean(y)', title='Durée de perception', scale=alt.Scale(domain=(11, 23))),
    y=alt.Y('mean(x)', title='NBV (%CA)'),
    text='id:O'
)
(points + text) | timeseries
which leads to the following error message:
Javascript Error: Duplicate signal name: "selector094_tuple"
This usually means there's a typo in your chart specification. See the javascript console for the full traceback.
If you have any idea how to do this, I would be grateful.
Thanks
The issue is that you cannot add the same selection to two different layers, which you do implicitly by deriving text from points. Try this instead:
text = alt.Chart(data).mark_text(dy=-5).encode(
    x=alt.X('mean(y)', title='Durée de perception', scale=alt.Scale(domain=(11, 23))),
    y=alt.Y('mean(x)', title='NBV (%CA)'),
    text='id:O'
)
(points + text) | timeseries

Bumbling around plotting two sets of seasonal data on the same chart

I have a series of monthly inventory data since 2017.
I have a series of inventory_forecasts since Dec 2018.
I am trying to plot the inventory data on a monthly-seasonal basis, and then overlay the inventory_forecasts for Jan 2019 through Dec 2019.
The dataframe looks like:
The first way I tried to make the chart does show all the data I want, but I'm unable to control the color of the inventory_zj line. Its color seems to be dominated by the color=year(date):N of the alt.Chart I configured. It is ignoring the color='green' I pass to the mark_line()
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory").mark_line().encode(
    x='month',
    y="inventory",
    color="year(date):N"
)
# this ignores my 'green' color instruction, and marks it the same light blue 2019 color
joe = base.mark_line(color='green').encode(
    alt.Y('inventory_zj', scale=alt.Scale(zero=False))
)
base + joe
I tried to use a layering system, but it's not working at all -- I cannot get it to display the "joe" layer
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory").encode(
    x='month(date)'
)
doe = base.mark_line().encode(
    alt.Y('inventory', scale=alt.Scale(zero=False)),
    color="year(date):N"
)
joe = base.mark_line(color="green").encode(
    alt.Y('inventory_zj', scale=alt.Scale(zero=False)),
)
# looks identical to the first example
alt.layer(
    doe, joe
).resolve_scale(
    y='shared'
).configure_axisLeft(labelColor='black').configure_axisRight(labelColor='green', titleColor='green')
# independent shows a second y-axis (which is different from the left y-axis) but no line
alt.layer(
    doe, joe
).resolve_scale(
    y='independent'
).configure_axisLeft(labelColor='black').configure_axisRight(labelColor='green', titleColor='green')
I feel like I must be trying to assemble this chart in a fundamentally wrong way. I should be able to share the same left y-axis, have the historic data colored by its year, and have a unique color for the 2019 forecast data. But I seem to be making a mess of it.
As mentioned in the Customizing Visualizations docs, there are multiple ways to specify things like line color, with a well-defined hierarchy: encodings override mark properties, which override top-level configurations.
In your chart, you write base.mark_line(color='green'), where base contains a color encoding, which overrides the mark property. If you don't derive the layer from a chart that has a color encoding, then the line will be green as you hoped. Something like this:
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory")
inventory = base.mark_line().encode(
    x='month',
    y="inventory",
    color="year(date):N"
)
joe = base.mark_line(color='green').encode(
    x='month',
    y=alt.Y('inventory_zj', scale=alt.Scale(zero=False))
)
inventory + joe

Plot points playing games when I try to size by a value count in bokeh

I'm trying to get the plot points in a scatter graph to size according to the frequency of values in a column of data. The data is coming from a questionnaire.
My questions are: What am I doing wrong, and what can I do to fix it?
I can push out a simple plot with x and y values coming from 2 columns of data. The X axis represents a level (1-100), and the Y axis represents a choice users can make for each level (1-4). For this plot I want to track how many people choose 1-4 on each level - so I need to capture that 1-4 has been selected, then indicate how many times.
Simple plot works fine, though those points have multiple occurrences.
Here's the code for that:
# Set up the graph
WT_Number = data.wt    # This is the X axis
CFG_Number = data.cfg  # This is the Y axis
wt_cfg_plot = figure(plot_width=1000, plot_height=400,
                     title="Control Form Groups chosen by WT unit")
# Set up the plot points, including the Hover Tool
cr = wt_cfg_plot.scatter(WT_Number, CFG_Number, size=7,
                         fill_color="blue",
                         line_color=None, alpha=0.7, hover_fill_color="firebrick",
                         hover_line_color=None, hover_alpha=1)
Problem: I then added a value count and set it as the size, to get the plot points to adjust according to the value frequency. But now it pumps out this chart and throws an error:
Plot points are reacting to the code, but now they're doing their own thing.
I added a variable for the value counts (cfg_freq), and used that as the size:
cfg_freq = data['cfg'].value_counts()*4
cr = wt_cfg_plot.scatter(WT_Number, CFG_Number, size=cfg_freq, fill_color="blue",
                         line_color=None, alpha=0.7, hover_fill_color="firebrick",
                         hover_line_color=None, hover_alpha=1)
Here's the last part of the error being thrown:
File "/Applications/anaconda/lib/python3.5/site-packages/bokeh/core/properties.py", line 722, in __setattr__
(name, self.__class__.__name__, text, nice_join(matches)))
AttributeError: unexpected attribute 'size' to Chart, possible attributes are above, background_fill_alpha, background_fill_color, below, border_fill_alpha, border_fill_color, disabled, extra_x_ranges, extra_y_ranges, h_symmetry, height, hidpi, left, legend, lod_factor, lod_interval, lod_threshold, lod_timeout, logo, min_border, min_border_bottom, min_border_left, min_border_right, min_border_top, name, outline_line_alpha, outline_line_cap, outline_line_color, outline_line_dash, outline_line_dash_offset, outline_line_join, outline_line_width, plot_height, plot_width, renderers, responsive, right, tags, title, title_standoff, title_text_align, title_text_alpha, title_text_baseline, title_text_color, title_text_font, title_text_font_size, title_text_font_style, tool_events, toolbar_location, tools, v_symmetry, webgl, width, x_mapper_type, x_range, xgrid, xlabel, xscale, y_mapper_type, y_range, ygrid, ylabel or yscale
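
Separately from the traceback, note that data['cfg'].value_counts() returns one count per distinct cfg value, not one value per plotted point, so it cannot line up with the x and y columns. The following is a minimal sketch, assuming the goal is one marker per (wt, cfg) pair sized by how often that pair occurs. The sample data and the names pair_counts, n and marker_size are made up for illustration, and the factor 4 is taken from the question. It does not address the AttributeError itself.

```python
import pandas as pd

# Hypothetical questionnaire data: level (wt) and choice (cfg) per response
data = pd.DataFrame({
    "wt":  [1, 1, 1, 2, 2, 3],
    "cfg": [2, 2, 4, 1, 1, 3],
})

# One row per distinct (wt, cfg) pair, with the number of responses in `n`
pair_counts = data.groupby(["wt", "cfg"]).size().reset_index(name="n")

# Scale the count into a marker size, using the factor from the question
pair_counts["marker_size"] = pair_counts["n"] * 4

print(pair_counts)
```

The wt, cfg and marker_size columns of pair_counts all have the same length, so they should be usable together as the x, y and size values of a scatter call.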
