Is there functionality in plotly to do something similar to hjust/vjust or position_dodge in R?

I'm hoping to adjust the location of points and lines in a dumbbell plot to separate the bars rather than overlaying them, similar to position dodge or hjust/vjust in R.
The code below produces something close to what I'd like, but the barbells are overlaid.
library(plotly)
urlfile <- 'https://raw.githubusercontent.com/charlottemcclintock/GenSquared/master/data.csv'
df <- read.csv(urlfile)
p <- plot_ly(df, color = I("gray80")) %>%
add_segments(x = ~mom, xend = ~daughter, y = ~country, yend = ~country, showlegend = FALSE) %>%
add_markers(x = ~mom, y = ~country, name = "Mother", color = I("purple")) %>%
add_markers(x = ~daughter, y = ~country, name = "Daughter", color = I("pink")) %>%
add_segments(x = ~dad, xend = ~son, y = ~country, yend = ~country, showlegend = FALSE) %>%
add_markers(x = ~dad, y = ~country, name = "Father", color = I("navy")) %>%
add_markers(x = ~son, y = ~country, name = "Son", color = I("blue")) %>%
layout(
title = "Gender educational disparity",
xaxis = list(title = "Mean Years of Education"),
margin = list(l = 65)
)
p
By coercing the country names to a factor, I can get the ideal spacing but I lose the country labels which I'm hoping to keep. I tried using country and numeric factor index together but plotly doesn't allow discrete and continuous scales together.
df$cnum <- as.numeric(as.factor(df$country))
p <- plot_ly(df, color = I("gray80")) %>%
add_segments(x = ~mom, xend = ~daughter, y = ~cnum+.2, yend = ~cnum+0.2, showlegend = FALSE) %>%
add_markers(x = ~mom, y = ~cnum+.2, name = "Mother", color = I("purple")) %>%
add_markers(x = ~daughter, y = ~cnum+.2, name = "Daughter", color = I("pink")) %>%
add_segments(x = ~dad, xend = ~son, y = ~cnum-.2, yend = ~cnum-.2, showlegend = FALSE) %>%
add_markers(x = ~dad, y = ~cnum-.2, name = "Father", color = I("navy")) %>%
add_markers(x = ~son, y = ~cnum-.2, name = "Son", color = I("blue")) %>%
layout(
title = "Gender educational disparity",
xaxis = list(title = "Mean Years of Education"),
margin = list(l = 65)
)
p
I would like it to look like this:
But with the country names on the y-axis.
Is there a way to adjust the vertical height relative to a discrete axis point?

Update: it's not elegant, but I figured out a workaround by overlaying a second y axis on top of the first! I would still love a better answer, but this is a usable fix.
df$arb=15
plot_ly(df, color = I("gray80")) %>%
add_segments(x = ~mom, xend = ~daughter, y = ~cnum+.2, yend = ~cnum+.2, showlegend = FALSE) %>%
add_markers(x = ~mom, y = ~cnum+.2, name = "Mother", color = I("purple"), size=2) %>%
add_markers(x = ~daughter, y = ~cnum+.2, name = "Daughter", color = I("pink"), size=2) %>%
add_segments(x = ~dad, xend = ~son, y = ~cnum-.1, yend = ~cnum-.1, showlegend = FALSE) %>%
add_markers(x = ~dad, y = ~cnum-.1, name = "Father", color = I("navy"), size=2) %>%
add_markers(x = ~son, y = ~cnum-.1, name = "Son", color = I("blue"), size=2) %>%
add_markers(x = ~arb, y = ~country, name = " ", color = I("white"), yaxis = "y2") %>%
layout(
yaxis = list(title = "", tickfont = list(color = "white")),
yaxis2 = list(overlaying = "y", side = "left", title = "")
)
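One alternative to the second-axis workaround is to keep the numeric y positions and relabel the ticks with the country names. The sketch below is in Python and builds a plain figure dictionary rather than calling the plotly library; `dumbbell_figure` and `dodged_positions` are hypothetical helper names, and it assumes the `tickmode`, `tickvals` and `ticktext` axis attributes, which plotly exposes in both its R and Python interfaces.

```python
def dodged_positions(labels, offset):
    """Return numeric y positions: each label's index shifted by `offset`."""
    return [i + offset for i in range(len(labels))]


def dumbbell_figure(labels, groups, spread=0.2):
    """Build a figure dict with one dodged dumbbell per label and group.

    labels: category names for the y axis (e.g. countries).
    groups: list of (start_values, end_values) pairs, one pair per group.
    spread: vertical distance between a group and the category centre.
    """
    n = len(groups)
    traces = []
    for g, (starts, ends) in enumerate(groups):
        # Centre the offsets on zero: two groups give -spread and +spread.
        offset = spread * (2 * g - (n - 1))
        ys = dodged_positions(labels, offset)
        for x0, x1, y in zip(starts, ends, ys):
            traces.append({
                "type": "scatter",
                "mode": "lines+markers",
                "x": [x0, x1],
                "y": [y, y],
                "showlegend": False,
            })
    # Numeric axis, but ticks sit on the integer positions and show the names.
    layout = {
        "yaxis": {
            "tickmode": "array",
            "tickvals": list(range(len(labels))),
            "ticktext": list(labels),
        }
    }
    return {"data": traces, "layout": layout}
```

The same idea should carry over to the R code above by passing `tickvals` and `ticktext` inside `layout(yaxis = list(...))`, so the `cnum` offsets can stay while the axis still shows country names.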

Related

R plotly line color by value range

I would like to make this kind of graph (here from Our World In data ) where the line color varies by value range.
edit : adding a screenshot to make it clearer :
With plotly, I found this example, but it uses type = "scatter" with mode = "markers" rather than lines:
x <- seq(from = -2,
to = 2,
by = 0.1)
y <- sin(x)
p11 <- plot_ly() %>%
add_trace(type = "scatter",
x = ~x,
y = ~y,
mode = "markers",
marker = list(size = 10,
color = colorRampPalette(brewer.pal(10,"Spectral"))(41))) %>%
layout(title = "Multicolored sine curve",
xaxis = list(title = "x-axis"),
yaxis = list(title = "y-axis"))
p11
Is there a way to use colorRampPalette or value ranges with a line? (It is actually a time series.)
x <- seq(from = -2,
to = 2,
by = 0.1)
y <- sin(x)
p11 <- plot_ly() %>%
add_trace(type = "scatter",
x = ~x,
y = ~y,
mode = "lines",
line = list(width = 1,
color = colorRampPalette(brewer.pal(10,"Spectral"))(41))) %>%
layout(title = "Multicolored sine curve",
xaxis = list(title = "x-axis"),
yaxis = list(title = "y-axis"))
p11
Thank you
You can, but the more points you have, the better it will look. Note that I changed the step in x from .1 to .001.
library(plotly)
library(RColorBrewer)
x <- seq(from = -2,
to = 2,
by = 0.001)
y <- sin(x)
# split x into 5 equal-width ranges; each range gets its own colour
z <- cut(x, breaks = 5, include.lowest = TRUE)
p11 <- plot_ly() %>%
add_lines(x = ~x,
y = ~y,
color = ~z,
colors = colorRampPalette(brewer.pal(10,"Spectral"))(length(x))) %>%
layout(title = "Multicolored sine curve",
xaxis = list(title = "x-axis"),
yaxis = list(title = "y-axis"))
p11
If I change that .001 back to .1, it's a bit ugly! You can see the gaps.
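The gaps appear because each colour group is drawn as its own line and neighbouring groups share no point. A minimal Python sketch of one way around that is shown here: split the series into runs that fall in the same value range and repeat the boundary point in both runs. The helper names `split_by_bins` and `to_traces` are hypothetical, and the output is a list of plain trace dictionaries rather than the result of any plotly call.

```python
def split_by_bins(xs, ys, breaks):
    """Split a polyline into consecutive runs whose y value is in the same bin.

    breaks: ascending thresholds; a value v falls in bin k when it is
    greater than or equal to exactly k of the thresholds.
    """
    def bin_of(v):
        return sum(v >= b for b in breaks)

    segments = []
    current = None
    for x, y in zip(xs, ys):
        b = bin_of(y)
        if current is None or current["bin"] != b:
            if current is not None:
                # Repeat the boundary point so adjacent runs join up.
                current["x"].append(x)
                current["y"].append(y)
            current = {"bin": b, "x": [], "y": []}
            segments.append(current)
        current["x"].append(x)
        current["y"].append(y)
    return segments


def to_traces(segments, palette):
    """Turn each run into a line trace coloured by its bin index."""
    return [
        {
            "type": "scatter",
            "mode": "lines",
            "x": s["x"],
            "y": s["y"],
            "line": {"color": palette[s["bin"]]},
            "showlegend": False,
        }
        for s in segments
    ]
```

Because the colour depends on the y value here rather than on x, this is closer to the "colour by value range" chart in the question than to the x-based `cut()` above; the palette needs one colour per bin, that is, `len(breaks) + 1` entries.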

Plotly plot a vertical line on a time series plot due to conditions

Hi I have a dataframe with time series on my x axis and values on my y axis.
I am using Plotly and am trying to plot a vertical line on the x axis wherever df.Alert == 1.
Currently I am using a second trace with red markers to show the alerts, but I would like to switch to a vertical line that is restricted to the range of the y values in my chart. The y axis range should still be determined by my trace, not by the vertical line.
Is there a way for me to do this?
My code sample is written below
Trace = go.Scatter(
name = "Values",
x = df.DateTime,
y = df.Values,
mode='markers',
text= "Unit: " + df['Unit'].astype(str),
)
Alert = go.Scatter(
name = "Alert",
x = df.DateTime,
y = df.Values.where(df.Alert == 1),
mode='markers',
line = dict(color = "red"),
text= "Unit: " + df['Unit'].astype(str),
)
layout = go.Layout(
xaxis = dict(title = "Date and Time"),
yaxis = dict(title = "Values")
)
data = [Trace, Alert]
figure = go.Figure(data = data, layout = layout)
py.iplot(figure)
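You describe exactly what you need: a vertical line. Iterate over the rows in the DataFrame that are alerts and call figure.add_vline() for each one.
import numpy as np
import pandas as pd
import plotly.graph_objects as go
n=50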
df = pd.DataFrame({"DateTime":pd.date_range("1-jan-2021", freq="15min", periods=n),
"Alert":np.random.choice([0]*10+[1], n),
"Unit":np.random.choice([0,1,2,3], n),
"Values":np.random.uniform(1,10, n)})
Trace = go.Scatter(
name = "Values",
x = df.DateTime.astype(str),
y = df.Values,
mode='markers',
text= "Unit: " + df['Unit'].astype(str),
)
layout = go.Layout(
xaxis = dict(title = "Date and Time"),
yaxis = dict(title = "Values")
)
data = [Trace]
figure = go.Figure(data = data, layout = layout)
# draw one red vertical line per alert row
for _, row in df.loc[df.Alert.astype(bool)].iterrows():
    figure.add_vline(x=row["DateTime"], line_width=1, line_dash="solid", line_color="red")
figure
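add_vline() draws each line across the full height of the plot. If the lines should instead stop at the lowest and highest plotted values, as the question asks, one option is to build the line shapes by hand with y coordinates taken from the data. The sketch below is a hedged illustration: `alert_shapes` is a hypothetical helper that returns plain shape dictionaries using plotly's documented shape attributes (`type`, `xref`, `yref`, `x0`, `x1`, `y0`, `y1`, `line`).

```python
def alert_shapes(times, values, alerts, color="red"):
    """Return one vertical line shape per alert, bounded by the data's y range.

    times, values, alerts: equal-length sequences (x values, y values, 0/1 flags).
    """
    y_lo, y_hi = min(values), max(values)
    return [
        {
            "type": "line",
            "xref": "x",
            "yref": "y",      # y0/y1 are in data coordinates
            "x0": t,
            "x1": t,
            "y0": y_lo,
            "y1": y_hi,
            "line": {"color": color, "width": 1},
        }
        for t, a in zip(times, alerts)
        if a == 1
    ]
```

With the figure from the answer above, the result could be attached with `figure.update_layout(shapes=alert_shapes(df.DateTime.astype(str), df.Values, df.Alert))`, assuming the same column names.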

R highcharter - trim row labels but not in tooltip

I want to display a highcharter stacked bar chart in which the row labels are trimmed so that the first five characters are not shown. The tooltip, however, should still show the full category names.
In the example above, I would like the xAxis categories to read only "2012", "2013", ..., whereas the tooltip should display the whole category names.
Here is my code
bs.table = data.frame(
Closing.Date = c("Line 2012", "Year 2013", "Year 2014", "Year 2015", "Year 2016"),
Non.Current.Assets = c(40.4, 30.3, 20.4, 34.5, 20),
Current.Assets = c(3.2, 3.3, 2.4, 3.5, 2)
)
hc <- highchart() %>%
hc_chart(type = "bar") %>%
hc_plotOptions(series = list(stacking = "normal")) %>%
hc_xAxis(categories = bs.table$Closing.Date,
lineColor = 'transparent',
tickWidth = 0,
labels = list(enabled = TRUE,
align = 'left',
x = 5,
style = list(fontSize = '1em',color = '#fff'))) %>%
hc_add_series(name = "Current Assets",
data = bs.table$Current.Assets,
stack = "Assets",
dataLabels = list(enabled = TRUE,
inside = TRUE,
align = "right",
style = list(fontSize = '1em',color = '#fff'))) %>%
hc_add_series(name = "Non Current Assets",
data = bs.table$Non.Current.Assets,
stack = "Assets",
dataLabels = list(enabled = TRUE, inside = FALSE, align = "right",
style = list(fontSize = '1em',color = '#fff')) ) %>%
hc_legend(enabled = FALSE) %>%
hc_tooltip(shared = TRUE,
headerFormat = '<b>Statement {point.x}</b><br>',
pointFormat = '<b>{series.name}:</b> {point.y} <br>',
footerFormat = '<b>Total: {point.total} </b>')
Many thanks in advance!
Could you not just change the column name before you create the chart?
# function to get year
substrRight <- function(x, n){
substr(x, nchar(x)-n+1, nchar(x))
}
# create year column
bs.table$year = substrRight(as.character(bs.table$Closing.Date), 4)
# alter x axis to use this column
hc <- highchart() %>%
hc_chart(type = "bar") %>%
hc_plotOptions(series = list(stacking = "normal")) %>%
hc_xAxis(categories = bs.table$year,
lineColor = 'transparent',
tickWidth = 0,
labels = list(enabled = TRUE,
align = 'left',
x = 5,
style = list(fontSize = '1em',color = '#fff'))) %>%
hc_add_series(name = "Current Assets",
data = bs.table$Current.Assets,
stack = "Assets",
dataLabels = list(enabled = TRUE,
inside = TRUE,
align = "right",
style = list(fontSize = '1em',color = '#fff'))) %>%
hc_add_series(name = "Non Current Assets",
data = bs.table$Non.Current.Assets,
stack = "Assets",
dataLabels = list(enabled = TRUE, inside = FALSE, align = "right",
style = list(fontSize = '1em',color = '#fff')) ) %>%
hc_legend(enabled = FALSE) %>%
hc_tooltip(shared = TRUE,
headerFormat = '<b>Statement {point.x}</b><br>',
pointFormat = '<b>{series.name}:</b> {point.y} <br>',
footerFormat = '<b>Total: {point.total} </b>')
Edit
This is a sort of workaround that would nearly give you what you want:
highchart() %>%
hc_chart(type = "bar") %>%
hc_xAxis(categories = bs.table$year,
lineColor = 'transparent',
tickWidth = 0,
labels = list(enabled = TRUE,
align = 'left',
x = 5,
style = list(fontSize = '1em',color = '#fff'))) %>%
hc_plotOptions(series = list(stacking = "normal")) %>%
hc_add_series(name = "Current Assets", bs.table, "column", hcaes(x = year, y = Current.Assets, stuff = Closing.Date),
tooltip = list(pointFormat = "<b>{point.stuff}</b><br> <b>{series.name}:</b> {point.y} <br>"),
dataLabels = list(enabled = TRUE,
inside = TRUE,
align = "right",
style = list(fontSize = '1em',color = '#fff'))) %>%
hc_add_series(name ="Non Current Assets", bs.table, "column", hcaes(x = year, y = Non.Current.Assets),
tooltip = list(pointFormat = "<b>{point.stuff}</b><br>"),
dataLabels = list(enabled = TRUE, inside = FALSE, align = "right",
style = list(fontSize = '1em',color = '#fff')) ) %>%
hc_legend(enabled = FALSE) %>%
hc_tooltip(shared = TRUE,
headerFormat = '<b>Statement </b>',
footerFormat = '<b>Total: {point.total} </b>')

Error in py_call_impl(callable, dots$args, dots$keywords) : ValueError: could not convert string to float: Null

I am doing deep learning using Keras in RStudio. I have some embedding layers at the beginning of the model. I checked the continuous variables and there are no missing values, and the response variable y is a float.
df_cl_dl = df_cl %>% filter(agency == "FHA") %>%
select(lender, channel, fthb, region, credit_score, credit_score_null, source, ltv_uw, seasonality,
current_ltv, loan_age, cash_incentive_a, hpas, loan_size, ur, risk,
vsmm) %>%
sample_n(100000)
inp_lender = layer_input(shape = c(1), name = "inp_lender")
inp_channel = layer_input(shape = c(1), name = "inp_channel")
inp_fthb = layer_input(shape = c(1), name = "inp_fthb")
inp_region = layer_input(shape = c(1), name = "inp_region")
inp_cs_null = layer_input(shape = c(1), name = "inp_cs_null")
inp_source = layer_input(shape = c(1), name = "inp_source")
inp_season = layer_input(shape = c(1), name = "inp_season")
inp_ltv_uw = layer_input(shape = c(1), name = "inp_ltv_uw")
inp_continuous = layer_input(shape = c(8), name = "inp_continuous")
embedding_out1 = inp_lender %>% layer_embedding(input_dim = 3+1, output_dim = 2, input_length = 1, name = "embedding_lender") %>% layer_flatten()
embedding_out2 = inp_channel %>% layer_embedding(input_dim = 3+1, output_dim = 2, input_length = 1, name = "embedding_channel") %>% layer_flatten()
embedding_out3 = inp_fthb %>% layer_embedding(input_dim = 3+1, output_dim = 2, input_length = 1, name = "embedding_fthb") %>% layer_flatten()
embedding_out4 = inp_region %>% layer_embedding(input_dim = 4+1, output_dim = 2, input_length = 1, name = "embedding_region") %>% layer_flatten()
embedding_out5 = inp_cs_null %>% layer_embedding(input_dim = 2+1, output_dim = 2, input_length = 1, name = "embedding_cs_null") %>% layer_flatten()
embedding_out6 = inp_source %>% layer_embedding(input_dim = 2+1, output_dim = 2, input_length = 1, name = "embedding_source") %>% layer_flatten()
embedding_out7 = inp_season %>% layer_embedding(input_dim = 12+1, output_dim = 3, input_length = 1, name = "embedding_season") %>% layer_flatten()
embedding_out8 = inp_ltv_uw %>% layer_embedding(input_dim = 2+1, output_dim = 2, input_length = 1, name = "embedding_ltv_uw") %>% layer_flatten()
combined_model = layer_concatenate(c(embedding_out1, embedding_out2, embedding_out3, embedding_out4,
embedding_out5, embedding_out6, embedding_out7, embedding_out8, inp_continuous)) %>%
layer_dense(units=32, activation = "relu") %>%
layer_dropout(0.3) %>%
layer_dense(units=10, activation = "relu") %>%
layer_dropout(0.15) %>%
layer_dense(units=1)
model = keras::keras_model(inputs = c(inp_lender, inp_channel, inp_fthb, inp_region, inp_cs_null,
inp_source, inp_season, inp_ltv_uw, inp_continuous),
outputs = combined_model)
model %>% compile(loss = "mean_squared_error", optimizer = "sgd", metric = "accuracy")
inputVariables = list(as.matrix(df_cl_dl$lender),
as.matrix(df_cl_dl$channel),
as.matrix(df_cl_dl$fthb),
as.matrix(df_cl_dl$region),
as.matrix(df_cl_dl$credit_score_null),
as.matrix(df_cl_dl$source),
as.matrix(df_cl_dl$seasonality),
as.matrix(df_cl_dl$ltv_uw),
as.matrix(df_cl_dl[,c("credit_score", "current_ltv", "loan_age", "cash_incentive_a", "hpas", "loan_size", "ur", "risk")]))
model %>% fit(x = inputVariables, y = as.matrix(df_cl_dl$vsmm), epochs = 10, batch_size = 2)
Error message:
Error in py_call_impl(callable, dots$args, dots$keywords) : ValueError: could not convert string to float: Null
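The message suggests that at least one value handed to the model is the literal text "Null" rather than a number, which a check for missing values may not catch. As a hedged diagnostic sketch (in Python, with a hypothetical helper name `find_non_numeric` and made-up sample data), the check below reports which columns hold values that cannot be converted to float; the same idea could be applied to each column of df_cl_dl before calling fit.

```python
def find_non_numeric(columns):
    """Report values that cannot be converted to float, per column.

    columns: dict mapping a column name to a list of raw values.
    Returns a dict of column name -> sorted list of offending values.
    """
    bad = {}
    for name, values in columns.items():
        offenders = set()
        for v in values:
            try:
                float(v)
            except (TypeError, ValueError):
                # e.g. the string "Null", an empty string, or None
                offenders.add(v)
        if offenders:
            bad[name] = sorted(offenders, key=str)
    return bad


# Illustrative data only; the real columns come from the data frame.
sample = {"ur": [1.0, "2.5", "Null"], "risk": [0, 1]}
report = find_non_numeric(sample)
```

In this made-up sample, only the `ur` column is reported. If a column in the real data is stored as text, converting the data frame with as.matrix() can yield a character matrix, which is one plausible way a string such as "Null" reaches Keras.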

plotly: 3D plotting returns a figure with no datapoints

I am trying to plot the results of K-means clustering as a 3D plot (Plotly). The code below generates a blank figure in the HTML. I printed the variables scatter1, scatter2, scatter3 and cluster1, cluster2, cluster3, and they all contain values. Is there an equivalent of matplotlib's plt.show() in plotly to display the values in the graph?
import pandas as pd
import numpy as np
import argparse
import json
import re
import os
import sys
import plotly
import plotly.graph_objs as go
cluster1=df.loc[df['y'] == 0]
cluster2=df.loc[df['y'] == 1]
cluster3=df.loc[df['y'] == 2]
scatter1 = dict(
mode = "markers",
name = "Cluster 1",
type = "scatter3d",
x = cluster1.as_matrix()[:,0], y = cluster1.as_matrix()[:,1], z = cluster1.as_matrix()[:,2],
marker = dict( size=2, color='green')
)
scatter2 = dict(
mode = "markers",
name = "Cluster 2",
type = "scatter3d",
x = cluster2.as_matrix()[:,0], y = cluster2.as_matrix()[:,1], z = cluster2.as_matrix()[:,2],
marker = dict( size=2, color='blue')
)
scatter3 = dict(
mode = "markers",
name = "Cluster 3",
type = "scatter3d",
x = cluster3.as_matrix()[:,0], y = cluster3.as_matrix()[:,1], z = cluster3.as_matrix()[:,2],
marker = dict( size=2, color='red')
)
cluster1 = dict(
alphahull = 5,
name = "Cluster 1",
opacity = .1,
type = "mesh3d",
x = cluster1.as_matrix()[:,0], y = cluster1.as_matrix()[:,1], z = cluster1.as_matrix()[:,2],
color='green', showscale = True
)
cluster2 = dict(
alphahull = 5,
name = "Cluster 2",
opacity = .1,
type = "mesh3d",
x = cluster2.as_matrix()[:,0], y = cluster2.as_matrix()[:,1], z = cluster2.as_matrix()[:,2],
color='blue', showscale = True
)
cluster3 = dict(
alphahull = 5,
name = "Cluster 3",
opacity = .1,
type = "mesh3d",
x = cluster3.as_matrix()[:,0], y = cluster3.as_matrix()[:,1], z = cluster3.as_matrix()[:,2],
color='red', showscale = True
)
layout = dict(
title = 'Interactive Cluster Shapes in 3D',
scene = dict(
xaxis = dict(zeroline=True ),
yaxis = dict(zeroline=True ),
zaxis = dict(zeroline=True ),
)
)
fig = dict(data=[scatter1, scatter2, scatter3, cluster1, cluster2, cluster3], layout=layout )
# Use py.iplot() for IPython notebook
plotly.offline.iplot(fig, filename='mesh3d_sample.html')
#py.iplot(fig, filename='mesh3d_sample')
HTML with just the axis and no data points displayed
