Bumbling around plotting two sets of seasonal data on the same chart - altair

I have a series of monthly inventory data since 2017.
I have a series of inventory_forecasts since Dec 2018.
I am trying to plot the inventory data on a monthly-seasonal basis, and then overlay the inventory_forecasts of Jan2019 through Dec2019.
The dataframe looks like:
The first way I tried to make the chart does show all the data I want, but I'm unable to control the color of the inventory_zj line. Its color seems to be dominated by the color=year(date):N of the alt.Chart I configured. It is ignoring the color='green' I pass to the mark_line()
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory").mark_line().encode(
x='month',
y="inventory",
color="year(date):N"
)
#this ignores my 'green' color instruction, and marks it the same light blue 2019 color
joe = base.mark_line(color='green').encode(
alt.Y('inventory_zj', scale=alt.Scale(zero=False), )
)
base+joe
I tried to use a layering system, but it's not working at all -- I cannot get it to display the "joe" layer
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory").encode(
x='month(date)'
)
doe = base.mark_line().encode(
alt.Y('inventory', scale=alt.Scale(zero=False), ),
color="year(date):N"
)
joe = base.mark_line(color="green").encode(
alt.Y('inventory_zj', scale=alt.Scale(zero=False), ),
)
#looks identical to the first example
alt.layer(
doe, joe
).resolve_scale(
y='shared'
).configure_axisLeft(labelColor='black').configure_axisRight(labelColor='green',titleColor='green')
#independent shows a second y-axis (which is different from the left y-axis) but no line
alt.layer(
doe, joe
).resolve_scale(
y='independent'
).configure_axisLeft(labelColor='black').configure_axisRight(labelColor='green',titleColor='green')
I feel like I must be trying to assemble this chart in a fundamentally wrong way. I should be able to share the same left y-axis, have the historic data colored by its year, and have a unique color for the 2019-forecasted data. But I seem to be making a mess of it.

As mentioned in the Customizing Visualizations docs, there are multiple ways to specify things like line color, with a well-defined hierarchy: encodings override mark properties, which override top-level configurations.
In your chart, you write base.mark_line(color='green'), where base contains a color encoding, and that encoding overrides the mark property. If the layer is not derived from a chart that already carries a color encoding, then the line will be green as you hoped. Something like this:
base = alt.Chart(inv.loc['2000':].reset_index(), title=f"usa total inventory")
inventory = base.mark_line().encode(
x='month',
y="inventory",
color="year(date):N"
)
joe = base.mark_line(color='green').encode(
x='month',
y=alt.Y('inventory_zj', scale=alt.Scale(zero=False))
)
inventory + joe
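The precedence described above can be modeled in a few lines of plain Python. This is only a toy illustration of the lookup order, not Altair's actual implementation; resolve_color and its parameters are hypothetical names introduced for the sketch.

```python
from typing import Optional


def resolve_color(encoding: Optional[str] = None,
                  mark: Optional[str] = None,
                  config: Optional[str] = None) -> Optional[str]:
    """Return the first color that is set, in precedence order.

    Hypothetical helper mimicking "encoding > mark property > config".
    """
    for candidate in (encoding, mark, config):
        if candidate is not None:
            return candidate
    return None


# Layer derived from a chart that has a color encoding: the encoding wins.
inherited = resolve_color(encoding="year(date):N", mark="green")

# Layer built without a color encoding: the mark property applies.
standalone = resolve_color(mark="green", config="steelblue")
```

Here inherited corresponds to the original joe layer, which stayed light blue, and standalone to the corrected one.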


Displaying multiple values in Altair/Streamlit tooltips on a bar chart

My DataFrame looks similar to this:
name            reached points
Jose Laderman   13
William Kane    13
I am currently displaying the aggregated count of students reached points of an assignment on an Altair bar chart within Streamlit like this:
brush = alt.selection(type='interval', encodings=['x'])
interactive_test = alt.Chart(df_display_all).mark_bar(opacity=1, width=5).encode(
x= alt.X('reached points', scale=alt.Scale(domain=[0, maxPoints])),
y=alt.Y('count()', type='quantitative', axis=alt.Axis(tickMinStep=1), title='student count'),
).properties(width=1200)
upper = interactive_test.encode(
alt.X('reached points', sort=alt.EncodingSortField(op='count', order='ascending'), scale=alt.Scale(domain=brush, domainMin=-0.5))
)
lower = interactive_test.properties(
height=60
).add_selection(brush)
concat_distribution_interactive = alt.vconcat(upper, lower)
Which produces this output and everything looks fine
The information I want my tooltip to show is a list of students that reached the specific amounts of reached points I'm hovering over. When adding something like:
tooltip='name'
the way my bar chart seems to display values has now been altered to this
When adding something like
tooltip='reached points'
The data seems to be displayed normally but without a tooltip that gives me the necessary information. Is it possible to display tooltip data that isn't used in my x or y axis but still part of the DataFrame I'm putting into the chart?
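One possible approach, which the thread itself does not confirm, is to aggregate before charting so that each bar corresponds to a single row that already carries the list of names. A minimal sketch using only the standard library; the third row and the output field names "student count" and "names" are invented for illustration.

```python
from collections import defaultdict

# Rows mirroring the table in the question, plus one invented row.
rows = [
    {"name": "Jose Laderman", "reached points": 13},
    {"name": "William Kane", "reached points": 13},
    {"name": "Example Student", "reached points": 9},  # invented row
]

# Collect the names of all students that share a score.
names_by_points = defaultdict(list)
for row in rows:
    names_by_points[row["reached points"]].append(row["name"])

# One record per bar: the count gives the bar height, "names" the tooltip text.
aggregated = [
    {"reached points": points, "student count": len(names), "names": ", ".join(names)}
    for points, names in sorted(names_by_points.items())
]
```

Records like these could presumably be loaded into a DataFrame and charted with the precomputed count on the y axis and the names column as the tooltip, instead of combining count() with a per-student field.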

Change color and legend of plotLearnerPrediction ggplot2 object

I've been producing a number of nice plots with the plotLearnerPrediction function in the mlr package for R. They look like this. From looking into the source code of the plotLearnerPrediction function it looks like the color surfaces are made with geom_tile.
A plot can for example be made by:
library(mlr)
data(iris)
#make a learner
lrn <- "classif.qda"
#make a task
my.task <- makeClassifTask(data = iris, target = "Species")
#make plot
plotLearnerPrediction(learner = lrn, task = my.task)
Now I wish to change the colors, using another red, blue and green tone to match those of some other plots that I've made for a project. For this I tried scale_fill_continuous and scale_fill_manual without any luck (Error: Discrete value supplied to continuous scale). I also wish to change the legend title and the labels for each legend entry (which I tried by giving appropriate parameters to the above scale_fill functions). There's a lot of info out there on how to set the geom_tile colours when producing the plot, but I haven't found any info on how to do this post-production (i.e. in somebody else's plot object). Any help would be much appreciated.
When you look into the source code you see how the plot is generated and then you can see which scale has to be overwritten or set.
In this example it's fairly easy:
g = plotLearnerPrediction(learner = lrn, task = my.task)
library(ggplot2)
g + scale_fill_manual(values = c(setosa = "yellow", versicolor = "blue", virginica = "red"))

Setting multiple axvspan labels as one element in legend

I am trying to set up a series of vertical axis spans to symbolize different switching positions at different times. For example, in the figure below, switching position 1 (green) happens quite a few times, alternating between other positions.
I plot these spans running a for loop in a list of tuples, each containing the initial and final indexes of each interval to plot the axvspan.
def plotShades(timestamp, intervals, colour):
    for i in range(len(intervals)):
        md.plt.axvspan(timestamp[intervals[i][0]], timestamp[intervals[i][1]], alpha=0.5, color=colour, label="interval")
This function is then called upon another one, that plots the shades for each different switching position:
def plotAllOutcomes(timestamp, switches):
    #switches is a list of 7 arrays indicating when the switcher is at each one of the 7 positions. If the array has a 1 value, the switcher is there. 0 otherwise.
    colors = ['#d73027', '#fc8d59', '#fee08b', '#ffffbf', '#d9ef8b', '#91cf60', '#1a9850']
    intervals = []
    for i in range(len(switches)):
        intervals.append(getIntervals(switches[i], timestamp))
        plotShades(timestamp, intervals[i], colors[i])
    md.plt.legend()
Doing so with the code snippets I've put here (not the best code, I know - I'm fairly new in Python!) the legend ends up having one item for each interval, and that's pretty awful. This is how it looks:
I'd like to get a legend with only 7 items, each for a single color in my plot of axvspans. How can I proceed to do so? I've searched quite extensively but haven't managed to find this situation being asked before. Thank you in advance for any help!!
A small trick you can apply using the fact that labels starting with "_" are ignored:
plt.axvspan( ... , label = "_"*i + "interval")
Thereby a label is only created for the case where i==0.
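The label trick can be checked without drawing anything. The sketch below only builds the label strings; span_labels and the label text "position 1" are hypothetical names, and the legend filter merely imitates the rule that labels starting with "_" are skipped.

```python
def span_labels(name: str, n_intervals: int) -> list:
    """Labels for all spans of one switching position.

    Only the first label is left untouched; every later one gets a
    leading underscore prefix, so the legend would ignore it.
    """
    return ["_" * i + name for i in range(n_intervals)]


labels = span_labels("position 1", 3)

# Imitate the legend rule: keep only labels that do not start with "_".
legend_entries = [label for label in labels if not label.startswith("_")]
```

Inside plotShades, each axvspan call would then receive labels[i], with a different name per switching position so that the legend ends up with one entry per color.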

Plot points playing games when I try to size by a value count in bokeh

I'm trying to get the plot points in a scatter graph to size according to the frequency of values in a column of data. The data is coming from a questionnaire.
My questions are: What am I doing wrong, and what can I do to fix it?
I can push out a simple plot with x and y values coming from 2 columns of data. The X axis represents a level (1-100), and the Y axis represents a choice users can make for each level (1-4). For this plot I want to track how many people choose 1-4 on each level - so I need to capture that 1-4 has been selected, then indicate how many times.
Simple plot works fine, though those points have multiple occurrences.
Here's the code for that:
# Set up the graph
WT_Number = data.wt # This is the X axis
CFG_Number = data.cfg # This is the Y axis
wt_cfg_plot = figure(plot_width=1000, plot_height=400,
title="Control Form Groups chosen by WT unit")
# Set up the plot points, including the Hover Tool
cr = wt_cfg_plot.scatter(WT_Number, CFG_Number, size=7,
fill_color="blue",
line_color=None, alpha=0.7, hover_fill_color="firebrick",
hover_line_color=None, hover_alpha=1)
Problem: I then added a value count and set it as the size, to get the plot points to adjust according to the value frequency. But now it pumps out this chart and throws an error:
Plot points are reacting to the code, but now they're doing their own thing.
I added a variable for the value counts (cfg_freq), and used that as the size:
cfg_freq = data['cfg'].value_counts()*4
cr = wt_cfg_plot.scatter(WT_Number, CFG_Number, size=cfg_freq, fill_color="blue",
line_color=None, alpha=0.7, hover_fill_color="firebrick",
hover_line_color=None, hover_alpha=1)
Here's the last part of the error being thrown:
File "/Applications/anaconda/lib/python3.5/site-packages/bokeh/core/properties.py", line 722, in setattr
(name, self.__class__.__name__, text, nice_join(matches)))
AttributeError: unexpected attribute 'size' to Chart, possible attributes are above, background_fill_alpha, background_fill_color, below, border_fill_alpha, border_fill_color, disabled, extra_x_ranges, extra_y_ranges, h_symmetry, height, hidpi, left, legend, lod_factor, lod_interval, lod_threshold, lod_timeout, logo, min_border, min_border_bottom, min_border_left, min_border_right, min_border_top, name, outline_line_alpha, outline_line_cap, outline_line_color, outline_line_dash, outline_line_dash_offset, outline_line_join, outline_line_width, plot_height, plot_width, renderers, responsive, right, tags, title, title_standoff, title_text_align, title_text_alpha, title_text_baseline, title_text_color, title_text_font, title_text_font_size, title_text_font_style, tool_events, toolbar_location, tools, v_symmetry, webgl, width, x_mapper_type, x_range, xgrid, xlabel, xscale, y_mapper_type, y_range, ygrid, ylabel or yscale
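Note that data['cfg'].value_counts() yields one number per distinct cfg value, not one per plotted point, so its length generally will not match the x and y columns. A minimal sketch of counting each (wt, cfg) pair instead, using only the standard library and invented sample data; whether this also resolves the AttributeError above is not established by the thread.

```python
from collections import Counter

# Invented questionnaire answers: one (wt level, cfg choice) per respondent.
wt = [1, 1, 1, 2, 2, 3]
cfg = [4, 4, 2, 1, 1, 3]

# How often each (wt, cfg) combination was chosen.
pair_counts = Counter(zip(wt, cfg))

# One entry per distinct point, so x, y and size all have the same length.
points = sorted(pair_counts)
xs = [x for x, _ in points]
ys = [y for _, y in points]
sizes = [pair_counts[p] * 4 for p in points]  # same *4 scaling as in the question
```

The three lists xs, ys and sizes line up element by element, which is the shape a per-point size argument needs.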

two textplots in one plot

I have been trying to work with textplot in R and am unsure whether what I want is possible. I know that par() can't be used to place two textplots in one plot. I have been using a page and this code to try and figure things out.
My question is: Is it possible to have two textplots within the same plot?
For example, in the par(mfrow=c(1,1)) scenario below, plot 1 is a textplot of sepal length by species. Say I wanted to replicate that textplot twice in that plot. Is that possible?
based on this site:
http://svitsrv25.epfl.ch/R-doc/library/gplots/html/textplot.html
textplot(version)
data(iris)
par(mfrow=c(1,1))
info <- sapply( split(iris$Sepal.Length, iris$Species),
function(x) round(c(Mean=mean(x), SD=sd(x), N=gdata::nobs(x)),2) )
textplot( info, valign="top" )
title("Sepal Length by Species")
What I want to do is put a second textplot within that plot, underneath the original. For argument's sake, replicating that textplot twice in the plot.
Is this possible?
Thanks!
Maybe you've figured it out in the last four months but I thought I'd chip in an answer anyway.
The code provided is most of the way towards doing what you require already; you just have to provide some additional arguments to title() and/or par(). Namely, specify that the title goes above both of the plots by using title("your title", outer = TRUE). You can further adjust the position of the title with an option in par(), for example par(mfrow = c(2,1), oma = c(0,0,2,0)), where the third value of oma sets the top outer margin. Hopefully this answers your question.
require('gplots')
data(iris)
info <- sapply(split(iris$Sepal.Length, iris$Species),
function(x) round(c(Mean = mean(x), SD = sd(x), N = gdata::nobs(x)),2))
## The third value of oma is the top outer margin in lines of text. 2 is only an
## example value; change it to control the position of the title with respect to the
## top of the page.
par(mfrow = c(2,1), oma = c(0,0,2,0))
textplot(info, valign = "top")
textplot(info, valign = "top")
title("Sepal Length by Species", outer = TRUE)
