Plotly : How to enable text label in line graph for the last value? - python-3.x

I am trying to build a line graph that shows the value of only the last data point of each line, with some nice formatting.
line graph with no text at end
The current method shows text for every data point as plain text, which collides with the other lines in the same graph and looks clumsy.
It would be very nice to achieve something like the image below.
desired line graph with text

This is now handled through:
legendgroup = d.name
Plot 1: All
Plot 2: Deselect GOOG in the legend and see that the marker disappears as well:
Complete code:
# imports
import pandas as pd
import plotly.express as px
# data
df = px.data.stocks()
df = df.drop('AMZN', axis = 1)
colors = px.colors.qualitative.T10
# plotly
fig = px.line(df,
              x = 'date',
              y = [c for c in df.columns if c != 'date'],
              template = 'plotly_dark',
              color_discrete_sequence = colors,
              title = 'Stocks',
              )
# move legend
fig.layout.legend.x = -0.3
# add traces for annotations and text for end of lines
for i, d in enumerate(fig.data):
    fig.add_scatter(x=[d.x[-1]], y = [d.y[-1]],
                    mode = 'markers+text',
                    text = d.y[-1],
                    textfont = dict(color=d.line.color),
                    textposition='middle right',
                    marker = dict(color = d.line.color, size = 12),
                    legendgroup = d.name,
                    showlegend=False)
fig.show()

Related

How could I edit my code to plot 4D contour something similar to this example in python?

Similar to many other researchers on Stack Overflow who are trying to plot a contour graph from 4D data (i.e., X, Y, Z and their corresponding value C), I am attempting to plot a 4D contour map from my data. I have tried many of the solutions suggested on Stack Overflow. Of all the plots suggested, this and this were the closest to what I want, but still not quite what I need in terms of data interpretation. Here is the ideal plot example: (source)
Here is a subset of the data. I have put it on Dropbox. Once this data is downloaded to the directory of the Python file, the following code will work. I have modified this script from this post.
import numpy as np
import pandas as pd
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import matplotlib.tri as mtri
#####Importing the data
df = pd.read_csv('Data_4D_plot.csv')
do_random_pt_example = False;
index_x = 0; index_y = 1; index_z = 2; index_c = 3;
list_name_variables = ['x', 'y', 'z', 'c'];
name_color_map = 'seismic';
if do_random_pt_example:
    number_of_points = 200;
    x = np.random.rand(number_of_points);
    y = np.random.rand(number_of_points);
    z = np.random.rand(number_of_points);
    c = np.random.rand(number_of_points);
else:
    x = df['X'].to_numpy();
    y = df['Y'].to_numpy();
    z = df['Z'].to_numpy();
    c = df['C'].to_numpy();
#end
#-----
# We create triangles that join 3 pt at a time and where their colors will be
# determined by the values of their 4th dimension. Each triangle contains 3
# indexes corresponding to the line number of the points to be grouped.
# Therefore, different methods can be used to define the value that
# will represent the 3 grouped points and I put some examples.
triangles = mtri.Triangulation(x, y).triangles;
choice_calcuation_colors = 2;
if choice_calcuation_colors == 1: # Mean of the "c" values of the 3 pt of the triangle
    colors = np.mean( [c[triangles[:,0]], c[triangles[:,1]], c[triangles[:,2]]], axis = 0);
elif choice_calcuation_colors == 2: # Median of the "c" values of the 3 pt of the triangle
    colors = np.median( [c[triangles[:,0]], c[triangles[:,1]], c[triangles[:,2]]], axis = 0);
elif choice_calcuation_colors == 3: # Max of the "c" values of the 3 pt of the triangle
    colors = np.max( [c[triangles[:,0]], c[triangles[:,1]], c[triangles[:,2]]], axis = 0);
#end
#----------
###=====adjust this part for the labeling of the graph
list_name_variables[index_x] = 'X (m)'
list_name_variables[index_y] = 'Y (m)'
list_name_variables[index_z] = 'Z (m)'
list_name_variables[index_c] = 'C values'
# Displays the 4D graphic.
fig = plt.figure(figsize = (15,15));
ax = fig.add_subplot(projection='3d'); # fig.gca(projection='3d') no longer works in Matplotlib 3.6+
triang = mtri.Triangulation(x, y, triangles);
surf = ax.plot_trisurf(triang, z, cmap = name_color_map, shade=False, linewidth=0.2);
surf.set_array(colors); surf.autoscale();
#Add a color bar with a title to explain which variable is represented by the color.
cbar = fig.colorbar(surf, shrink=0.5, aspect=5);
cbar.ax.get_yaxis().labelpad = 15; cbar.ax.set_ylabel(list_name_variables[index_c], rotation = 270);
# Add titles to the axes and a title in the figure.
ax.set_xlabel(list_name_variables[index_x]); ax.set_ylabel(list_name_variables[index_y]);
ax.set_zlabel(list_name_variables[index_z]);
ax.view_init(elev=15., azim=45)
plt.show()
Here would be the output:
Although it looks brilliant, it is not quite what I am looking for (the contour map example above). I have also modified the following script from this post in the hope of getting the required graph; however, the chart looks nothing like what I was expecting (something similar to the previous output graph). Warning: the following code may take some time to run.
import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import pandas as pd
df = pd.read_csv('Data_4D_plot.csv')
x = df['X'].to_numpy();
y = df['Y'].to_numpy();
z = df['Z'].to_numpy();
cc = df['C'].to_numpy();
# convert to 2d matrices
Z = np.outer(z.T, z)
X, Y = np.meshgrid(x, y)
C = np.outer(cc.T,cc)
# fourth dimension - colormap
# create colormap according to cc-value
color_dimension = C # change to desired fourth dimension
minn, maxx = color_dimension.min(), color_dimension.max()
norm = matplotlib.colors.Normalize(minn, maxx)
m = plt.cm.ScalarMappable(norm=norm, cmap='jet')
m.set_array([])
fcolors = m.to_rgba(color_dimension)
# plot
fig = plt.figure()
ax = fig.add_subplot(projection='3d') # fig.gca(projection='3d') no longer works in Matplotlib 3.6+
ax.plot_surface(X,Y,Z, rstride=1, cstride=1, facecolors=fcolors, vmin=minn, vmax=maxx, shade=False)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
I would like to ask the community whether someone can help me plot a contour figure similar to the example graph (the first image in this post), where the contours are based on ranges of the C values.

How to select specific number of colors to show in color bar from a big list ? - Matplotlib

I plotted some data which has 70 classes, so when I built the color bar it's very difficult to distinguish between each legend as shown below:
The code that I'm using is:
formation_colors = # 70 colors
formation_labels = # 70 labels
data = # the section of the entire dataset which only has 13 labels
data = data.sort_values(by='DEPTH_MD')
ztop=data.DEPTH_MD.min(); zbot=data.DEPTH_MD.max()
cmap_formations = colors.ListedColormap(formation_colors[0:len(formation_colors)], 'indexed')
cluster_f = np.repeat(np.expand_dims(data['Formations'].values,1), 100, 1)
fig = plt.figure(figsize=(2,10))
ax = fig.add_subplot()
im_f = ax.imshow(cluster_f, interpolation='none', aspect='auto', cmap = cmap_formations, vmin=0, vmax=69)
ax.set_xlabel('FORMATION')
ax.set_xticklabels(['']);
divider_f = make_axes_locatable(ax)
cax_f = divider_f.append_axes("right", size="20%", pad=0.05)
cbar_f = plt.colorbar(im_f, cax = cax_f,)
cbar_f.set_ticks(range(0,len(formation_labels))); cbar_f.set_ticklabels(formation_labels)
So far, if I just change:
1. cmap_formations = colors.ListedColormap(formation_colors[0:len(formation_colors)], 'indexed')
2. cbar_f.set_ticks(range(0,len(formation_labels))); cbar_f.set_ticklabels(formation_labels)
to:
cmap_formations = colors.ListedColormap(formation_colors[0:len(data['FORMATION'].unique())], 'indexed')
cbar_f.set_ticks(range(0,len(data['FORMATION'].unique()))); cbar_f.set_ticklabels(data['FORMATION'].unique())
With that change I get the corresponding colors in the colorbar; however, the plot itself is no longer correct and the labels no longer line up with their color squares.
Thank you so much if you have any idea how to do this.
Although not explicitly mentioned in the question, I suppose data['FORMATION'] contains indices from 0 to 69 into the lists formation_colors and formation_labels.
The main problem is that data['FORMATION'] needs to be renumbered so that its values become new indices (0 to 12) into the new list of unique colors. np.unique(..., return_inverse=True) returns both the sorted list of unique numbers and the renumbering of the values.
To be able to reindex the list of colors and of labels, it helps to convert them to numpy arrays.
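As a small illustration of the renumbering described above, here is a sketch on made-up data. The array contents and the names `formation` and `new_indices` are assumptions for the example only.

```python
import numpy as np

# Hypothetical stand-ins: 70 color names, and a section that uses only 3 formations
formation_colors = np.array([f"color_{i}" for i in range(70)])
formation = np.array([42, 42, 7, 63, 7, 63, 63])

# unique_values: sorted original indices that actually occur
# new_indices:   for every row, the position of its value inside unique_values
unique_values, new_indices = np.unique(formation, return_inverse=True)

# Because formation_colors is a NumPy array, it can be indexed with an index array
used_colors = formation_colors[unique_values]

print(unique_values)  # [ 7 42 63]
print(new_indices)    # [1 1 0 2 0 2 2]
print(used_colors)    # ['color_7' 'color_42' 'color_63']
```

The renumbered values run from 0 to len(unique_values)-1, so they can be passed to imshow together with a ListedColormap built from used_colors.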
To make the code easier to debug, the following test uses a simple relation between the list of colors and the list of labels.
from matplotlib import pyplot as plt
from matplotlib import colors
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable
import numpy as np
import pandas as pd
formation_colors = np.random.choice(list(colors.CSS4_COLORS), 70, replace=False) # 70 random color names
formation_labels = ['lbl_' + c for c in formation_colors] # 70 labels
formation_colors = np.asarray(formation_colors)
formation_labels = np.asarray(formation_labels)
f = np.random.randint(0, 70, 13)
d = np.sort(np.random.randint(0, 5300, 13))
data = pd.DataFrame({'FORMATION': np.repeat(f, np.diff(np.append(0, d))),
'DEPTH_MD': np.arange(d[-1])})
data = data.sort_values(by='DEPTH_MD')
ztop = data['DEPTH_MD'].min()
zbot = data['DEPTH_MD'].max()
unique_values, formation_new_values = np.unique(data['FORMATION'], return_inverse=True)
cmap_formations = colors.ListedColormap(formation_colors[unique_values], 'indexed')
cluster_f = formation_new_values.reshape(-1, 1)
fig = plt.figure(figsize=(3, 10))
ax = fig.add_subplot()
im_f = ax.imshow(cluster_f, extent=[0, 1, zbot, ztop],
interpolation='none', aspect='auto', cmap=cmap_formations, vmin=0, vmax=len(unique_values)-1)
ax.set_xlabel('FORMATION')
ax.set_xticks([])
divider_f = make_axes_locatable(ax)
cax_f = divider_f.append_axes("right", size="20%", pad=0.05)
cbar_f = plt.colorbar(im_f, cax=cax_f)
cbar_f.set_ticks(np.linspace(0, len(unique_values)-1, 2*len(unique_values)+1)[1::2])
cbar_f.set_ticklabels(formation_labels[unique_values])
plt.subplots_adjust(left=0.2, right=0.5)
plt.show()
Here is a comparison plot:

Why is Bokeh's plot not changing with plot selection?

Struggling to understand why this bokeh visual will not allow me to change plots and see the predicted data. The plot and select (dropdown-looking) menu appears, but I'm not able to change the plot for items in the menu.
Running Bokeh 1.2.0 via Anaconda. The code has been run both inside & outside of Jupyter. No errors display when the code is run. I've looked through the handful of SO posts relating to this same issue, but I've not been able to apply the same solutions successfully.
I wasn't sure how to create a toy problem out of this, so in addition to the code sample below, the full code (including the regression code and corresponding data) can be found at my github here (code: Regression&Plotting.ipynb, data: pred_data.csv, historical_data.csv, features_created.pkd.)
import pandas as pd
import datetime
from bokeh.io import curdoc, output_notebook, output_file
from bokeh.layouts import row, column
from bokeh.models import Select, DataRange1d, ColumnDataSource
from bokeh.plotting import figure
#Must be run from the command line
def get_historical_data(src_hist, drug_id):
    historical_data = src_hist.loc[src_hist['ndc'] == drug_id]
    historical_data.drop(['Unnamed: 0', 'date'], inplace = True, axis = 1)#.dropna()
    historical_data['date'] = pd.to_datetime(historical_data[['year', 'month', 'day']], infer_datetime_format=True)
    historical_data = historical_data.set_index(['date'])
    historical_data.sort_index(inplace = True)
    # csd_historical = ColumnDataSource(historical_data)
    return historical_data
def get_prediction_data(src_test, drug_id):
    #Assign the new date
    #Write a new dataframe with values for the new dates
    df_pred = src_test.loc[src_test['ndc'] == drug_id].copy()
    df_pred.loc[:, 'year'] = input_date.year
    df_pred.loc[:, 'month'] = input_date.month
    df_pred.loc[:, 'day'] = input_date.day
    df_pred.drop(['Unnamed: 0', 'date'], inplace = True, axis = 1)
    prediction = lin_model.predict(df_pred)
    prediction_data = pd.DataFrame({'drug_id': prediction[0][0], 'predictions': prediction[0][1], 'date': pd.to_datetime(df_pred[['year', 'month', 'day']], infer_datetime_format=True, errors = 'coerce')})
    prediction_data = prediction_data.set_index(['date'])
    prediction_data.sort_index(inplace = True)
    # csd_prediction = ColumnDataSource(prediction_data)
    return prediction_data
def make_plot(historical_data, prediction_data, title):
    #Historical Data
    plot = figure(plot_width=800, plot_height = 800, x_axis_type = 'datetime',
                  toolbar_location = 'below')
    plot.xaxis.axis_label = 'Time'
    plot.yaxis.axis_label = 'Price ($)'
    plot.axis.axis_label_text_font_style = 'bold'
    plot.x_range = DataRange1d(range_padding = 0.0)
    plot.grid.grid_line_alpha = 0.3
    plot.title.text = title
    plot.line(x = 'date', y='nadac_per_unit', source = historical_data, line_color = 'blue', ) #plot historical data
    plot.line(x = 'date', y='predictions', source = prediction_data, line_color = 'red') #plot prediction data (line from last date/price point to date, price point for input_date above)
    return plot
def update_plot(attrname, old, new):
    ver = vselect.value
    new_hist_source = get_historical_data(src_hist, ver) #calls the function above to get the data instead of handling it here on its own
    historical_data.data = ColumnDataSource.from_df(new_hist_source)
    # new_pred_source = get_prediction_data(src_pred, ver)
    # prediction_data.data = new_pred_source.data
#Import data source
src_hist = pd.read_csv('data/historical_data.csv')
src_pred = pd.read_csv('data/pred_data.csv')
#Prep for default view
#Initialize plot with ID number
ver = 781593600
#Set the prediction date
input_date = datetime.datetime(2020, 3, 31) #Make this selectable in future
#Select-menu options
menu_options = src_pred['ndc'].astype(str) #already contains unique values
#Create select (dropdown) menu
vselect = Select(value=str(ver), title='Drug ID', options=sorted((menu_options)))
#Prep datasets for plotting
historical_data = get_historical_data(src_hist, ver)
prediction_data = get_prediction_data(src_pred, ver)
#Create a new plot with the source data
plot = make_plot(historical_data, prediction_data, "Drug Prices")
#Update the plot every time 'vselect' is changed'
vselect.on_change('value', update_plot)
controls = row(vselect)
curdoc().add_root(row(plot, controls))
UPDATED: ERRORS:
1) No errors show up in Jupyter Notebook.
2) The CLI shows a UserWarning: "Pandas doesn't allow columns to be created via a new attribute name", referencing `historical_data.data = ColumnDataSource.from_df(new_hist_source)`.
Ultimately, the plot should have a line for historical data, and another line or dot for predicted data derived from sklearn. It also has a dropdown menu to select each item to plot (one at a time).
Your update_plot is a no-op that does not actually make any changes to Bokeh model state, which is what is necessary to change a Bokeh plot. Changing Bokeh model state means assigning a new value to a property on a Bokeh object. Typically, to update a plot, you would compute a new data dict and then set an existing CDS from it:
source.data = new_data # plain python dict
Or, if you want to update from a DataFrame:
source.data = ColumnDataSource.from_df(new_df)
As an aside, don't assign the .data from one CDS to another:
source.data = other_source.data # BAD
By contrast, your update_plot computes some new data and then throws it away. Note there is never any purpose to returning anything at all from any Bokeh callback. The callbacks are called by Bokeh library code, which does not expect or use any return values.
Lastly, I don't think any of those last JS console errors were generated by BokehJS.
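To make the pattern above concrete, here is a minimal sketch of a callback that does change Bokeh model state. The toy DataFrame, the name `hist_source`, and the simplified `get_historical_data` are assumptions for illustration; the real code would keep its own data preparation. It also assumes the `ndc` column holds integers, in which case the string coming from the Select widget has to be converted before comparing.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Hypothetical stand-in for historical_data.csv (column names follow the question)
src_hist = pd.DataFrame({
    "ndc": [111, 111, 222, 222],
    "date": pd.to_datetime(["2019-01-01", "2019-02-01", "2019-01-01", "2019-02-01"]),
    "nadac_per_unit": [1.0, 1.1, 5.0, 5.5],
})

def get_historical_data(src, drug_id):
    """Return the rows for one drug as a plain DataFrame, sorted by date."""
    return src.loc[src["ndc"] == drug_id].sort_values("date").reset_index(drop=True)

# The glyphs are drawn from this ColumnDataSource, for example:
# plot.line(x="date", y="nadac_per_unit", source=hist_source)
hist_source = ColumnDataSource(get_historical_data(src_hist, 111))

def update_plot(attrname, old, new):
    # Select.value is a string, so convert it before comparing with the integer column
    new_df = get_historical_data(src_hist, int(new))
    # Assigning to .data on the existing source is the state change Bokeh reacts to
    hist_source.data = ColumnDataSource.from_df(new_df)

# Simulate the user choosing "222" in the dropdown
update_plot("value", "111", "222")
```

In the app, the callback would be registered with vselect.on_change('value', update_plot), and the plot would be built from hist_source rather than from the DataFrame itself.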

Changing the attributes of what appears when hovering over a Choropleth Map in plotly

I am using plotly in Python 3.6.3 and am trying to make a Choropleth map as in here. I would like to change the attributes of what appears when hovering over the map. For example, if we consider the first map and hover over California, it looks like:
I want to change both the font size of the content that appears and the size of the box. Is there a way to access those?
Here is the code that generates it:
import plotly.plotly as py
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2011_us_ag_exports.csv')
for col in df.columns:
    df[col] = df[col].astype(str)
scl = [[0.0, 'rgb(242,240,247)'],[0.2, 'rgb(218,218,235)'],[0.4, 'rgb(188,189,220)'],\
[0.6, 'rgb(158,154,200)'],[0.8, 'rgb(117,107,177)'],[1.0, 'rgb(84,39,143)']]
df['text'] = df['state'] + '<br>' +\
'Beef '+df['beef']+' Dairy '+df['dairy']+'<br>'+\
'Fruits '+df['total fruits']+' Veggies ' + df['total veggies']+'<br>'+\
'Wheat '+df['wheat']+' Corn '+df['corn']
data = [ dict(
type='choropleth',
colorscale = scl,
autocolorscale = False,
locations = df['code'],
z = df['total exports'].astype(float),
locationmode = 'USA-states',
text = df['text'],
marker = dict(
line = dict (
color = 'rgb(255,255,255)',
width = 2
) ),
colorbar = dict(
title = "Millions USD")
) ]
layout = dict(
title = '2011 US Agriculture Exports by State<br>(Hover for breakdown)',
geo = dict(
scope='usa',
projection=dict( type='albers usa' ),
showlakes = True,
lakecolor = 'rgb(255, 255, 255)'),
)
fig = dict( data=data, layout=layout )
py.iplot( fig, filename='d3-cloropleth-map' )
The choropleth trace's hoverlabel attribute lets you set the background color, border color, and font (including the font size). The size of the box itself is determined by the text within it, however. If the trace name shows up truncated, it can be expanded with the hoverlabel.namelength attribute.

How do I create a legend for a heatmap in Bokeh 12.4.1

The recent version of Bokeh allows the programmer to put the legend outside of the chart area. This can be accomplished like described here:
p = figure(toolbar_location="above")
r0 = p.circle(x, y)
legend = Legend(items=[
    ("sin(x)", [r0]),
], location=(0, -30))
p.add_layout(legend, 'right')
show(p)
Note: A legend object is attached to a plot via add_layout. The legend object itself consists of tuples and strings together with glyph lists.
The question is what to do when you are just drawing one "data" series as is the case with the code below, adapted from here:
from bokeh.io import show
from bokeh.models import ColumnDataSource, HoverTool, LinearColorMapper, Legend
from bokeh.plotting import figure
col = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
row = ['A', 'B', 'C' , 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
'N', 'O', 'P']
# this is the colormap from the original NYTimes plot
colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
"#ddb7b1", "#cc7878", "#933b41", "#550b1d"]
mapper = LinearColorMapper(palette=colors)
source = ColumnDataSource(data = dict (
row = test['plate_row'],
col = test['plate_col'],
values = test['Melt Temp']
))
TOOLS = "hover,save,pan,box_zoom,wheel_zoom"
p = figure(title="Plate Heatmap", x_range = (0.0,25.0), y_range =
list(reversed(row)), x_axis_location="above", tools=TOOLS)
r1 = p.rect(x="col", y="row", width=1, height=1,
source=source,
fill_color={'field': 'values', 'transform': mapper},
line_color=None)
legend = Legend(items=[
("test" , [r1]),
], location=(0, -30))
p.add_layout(legend, 'left')
show(p) # show the plot
The issue here is that there is only one glyph. What I actually need is an explanation of what value range is included for different colors. Clearly, this is possible, because the plots defined here show that it's possible.
Update:
Now that I am writing about the problem, I am starting to think that perhaps I can just plot multiple series, one for each color, and only plot those coordinates that fall within a certain range. That seems rather clunky though, so any ideas are appreciated!
I figured out a way through using CategoricalColorMapper and then not creating an explicit legend object.
There may be a way to create the legend object explicitly with the same layout, I will have a look later.
import numpy as np
from bokeh.io import show
from bokeh.models import Legend
from bokeh.models import ColumnDataSource, HoverTool,CategoricalColorMapper
from bokeh.plotting import figure
from bokeh.palettes import Blues8
# values to assign colours on
values = np.arange(100,107)
# values that will appear in the legend!!!
legend_values = ['100-101','101-102','102-103','103-04','104-05','105-06',
'106-07']
source = ColumnDataSource(data = dict (
row = np.arange(100,107),
col = np.arange(100,107),
values = np.arange(100,107),
legend_values = legend_values
))
mapper = CategoricalColorMapper(factors=list(values),palette=Blues8)
TOOLS = "hover,save,pan,box_zoom,wheel_zoom"
p = figure(title="Plate Heatmap", x_range = (100,107), y_range =
[90,107], x_axis_location="above", tools=TOOLS)
r1 = p.rect(x="col", y="row", width=1, height=1,
source=source,
fill_color={'field': 'values', 'transform': mapper},
line_color=None,legend='legend_values')
p.legend.location = "bottom_right"
show(p) # show the plot
See the image here 1
After researching this a bit more, I found two ways of creating a legend that shows what each color means on the heatmap:
1.) Painting several glyph series:
First, I divide the number range into bins like so:
from math import floor
min_value = test['Melt Temp'].min()
max_value = test['Melt Temp'].max()
increment = round((max_value - min_value)/9)
num_bins = [(lower, lower+increment) for lower in
            range(int(floor(min_value)), int(round(max_value)),
                  int(round(increment)))]
Then, I create sub tables from the main tables like so:
source_dict = {}
for range_tuple in num_bins:
    range_data = test[(test['Melt Temp'] > int(range_tuple[0])) &
                      (test['Melt Temp'] <= int(range_tuple[1]))]
    source = ColumnDataSource(data = dict (
        row = range_data['x'],
        col = range_data['y'],
        values = range_data['Value']))
    source_dict[range_tuple] = source
Then I zip up the colors with a column data source sub-table:
from bokeh.palettes import RdYlBu9
colors = RdYlBu9
glyph_list = []
for color, range_tuple in zip(colors, num_bins):
    r1 = p.rect(x="col", y="row", width=1, height=1,
                source=source_dict[range_tuple],
                fill_color=color,
                line_color=None)
    glyph_list.append(r1)
Lastly, I create an explicit legend object which requires string-glyph-tuples. The legend object then gets attached to the plot:
legend_list = [("{0}<={1}".format(bin[0], bin[1]), [glyph]) for bin,
glyph in zip(num_bins, glyph_list)]
legend = Legend(items=legend_list, location=(0, -50))
p.add_layout(legend, 'left')
show(p)
Downsides to this approach:
It somehow seems a bit clunky.
Another potential downside I discovered while trying to select objects: If you click on one datapoint of a certain color, all datapoints of that color get selected. Depending on what you want to do this may be a plus or a minus.
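The binning step of this first approach can also be sketched with NumPy, which avoids the rounding of the increment. This is an alternative I am assuming here, not the code used above; the sample temperatures and the names `edges`, `bin_index`, and `labels` are made up for the example.

```python
import numpy as np

# Hypothetical melt temperatures standing in for test['Melt Temp']
melt_temp = np.array([50.0, 52.5, 55.0, 61.0, 68.0, 74.5, 80.0, 86.0, 95.0])
n_bins = 9

# n_bins equal-width bins need n_bins + 1 edges from min to max
edges = np.linspace(melt_temp.min(), melt_temp.max(), n_bins + 1)

# Using only the interior edges yields indices 0 .. n_bins-1;
# right=True means each bin is (lower, upper], and the minimum falls into bin 0
bin_index = np.digitize(melt_temp, edges[1:-1], right=True)

# One legend label per bin
labels = [f"{lo:.1f} to {hi:.1f}" for lo, hi in zip(edges[:-1], edges[1:])]

print(bin_index)   # [0 0 0 2 3 4 5 7 8]
print(labels[0])   # 50.0 to 55.0
```

Each bin index can then select the rows for one sub-table (for example test[bin_index == k]) and the matching label for the legend entry.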
2.) Colorbar:
The second approach makes use of @Okonomiyaki's comment above and is a lot simpler. The gist is that you use a color mapper to determine the colors of your glyphs, and you also create a ColorBar from the same mapper, as Okonomiyaki pointed out:
from bokeh.models import ColorBar, LogColorMapper, LogTicker
mapper = LogColorMapper(palette="Viridis256", low=min_value,
                        high=max_value)
source = ColumnDataSource(data = dict (
    row = test['x'], col = test['y'], values = test['value']))
p = figure(title="Plate Heatmap", x_range = (0.0,25.0), y_range =
           list(reversed(row)),
           x_axis_location="above", plot_width=650, plot_height=400)
r1 = p.rect(x="col", y="row", width=1, height=1,
            source=source,
            fill_color={'field': 'values', 'transform': mapper},
            line_color=None)
color_bar = ColorBar(color_mapper=mapper, ticker=LogTicker(),
                     label_standoff=12, border_line_color=None,
                     location=(0,0))
p.add_layout(color_bar, 'left')
layout = p
show(layout)
I like the elegance of this approach. The only downside to this approach is that you don't get a clean range of numbers that define a given color.
If other people come up with even more elegant approaches, please share!
