Here is a simple bar plot:
x = [1, 2, 3]
y = [10, 45, 23]
plt.bar(x, y)
I just want to show the percentage change from one bar to another. Maybe you can help. Thanks
You could use the bar_label function (requires matplotlib 3.4 or later):
import matplotlib.pyplot as plt
x = [1, 2, 3]
y = [10, 45, 23]
bars = plt.bar(x, y, fc='crimson', ec='navy')
plt.bar_label(bars, [''] + [f'{(y1 - y0) / y0 * 100:+.2f}%' for y0, y1 in zip(y[:-1], y[1:])])
plt.margins(y=0.1)
plt.show()
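The label computation can also be pulled into a small helper, which makes the zero-baseline case explicit. This is only a sketch: the name pct_change_labels and the 'n/a' placeholder are assumptions for illustration, not part of matplotlib.

```python
def pct_change_labels(values, fmt='{:+.2f}%', undefined='n/a'):
    """Return one label per bar: '' for the first bar, then the change vs. the previous bar."""
    labels = ['']  # the first bar has no predecessor
    for prev, curr in zip(values[:-1], values[1:]):
        if prev == 0:
            # a percentage change from zero is undefined
            labels.append(undefined)
        else:
            labels.append(fmt.format((curr - prev) / prev * 100))
    return labels

y = [10, 45, 23]
labels = pct_change_labels(y)
print(labels)  # ['', '+350.00%', '-48.89%']
```

The resulting list can be passed as the labels argument, e.g. plt.bar_label(bars, pct_change_labels(y)).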
How to:
display symbols in the legend
colour the markers in the same way as the errorbars (the color argument gives an error: ValueError: RGBA sequence should have length 3 or 4)
remove the connecting lines, to get only the scatter with errorbars
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D # for legend handle
fig, ax = plt.subplots(figsize = (10,10))
times = [1, 2, 3, 4, 5]
rvs = [2, 4, 2, 4, 7]
sigma = [0.564, 0.6, 0.8, 0.8, 0.4]
rv_telescopes = ['A', 'B', 'A', 'C', 'C']
d = {'rv_times': times, 'rv_rvs': rvs, 'rv_sigma': sigma, 'rv_telescopes': rv_telescopes }
df = pd.DataFrame(data=d)
colors = {'A':'#008f00', 'B':'#e36500', 'C':'red'}
plt.errorbar(df['rv_times'], df['rv_rvs'], df['rv_sigma'], marker = '_', ecolor = df['rv_telescopes'].map(colors), color = df['rv_telescopes'].map(colors), zorder = 1, ms = 30)
handles = [Line2D([0], [0], marker='_', color='w', markerfacecolor=v, label=k, markersize=10) for k, v in colors.items()]
ax.legend(handles=handles, loc='upper left', ncol = 2, fontsize=14)
plt.show()
After edit
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D # for legend handle
import pandas as pd
import numpy as np
times = [1, 2, 3, 4, 5]
rvs = [2, 4, 2, 4, 7]
sigma = [0.564, 0.6, 0.8, 0.8, 0.4]
rv_telescopes = ['A', 'B', 'A', 'C', 'C']
d = {'rv_times': times, 'rv_rvs': rvs, 'rv_sigma': sigma, 'rv_telescopes': rv_telescopes}
df = pd.DataFrame(data=d)
colors = {'A': '#008f00', 'B': '#e36500', 'C': 'red'}
fig, ax = plt.subplots(figsize=(10, 10))
ax.errorbar(df['rv_times'], df['rv_rvs'], df['rv_sigma'], color='none', ecolor=df['rv_telescopes'].map(colors), linewidth=1)
ax.scatter(df['rv_times'], df['rv_rvs'], marker='_', linewidth=3, color=df['rv_telescopes'].map(colors), s=1000)
for rv_teles in np.unique(df['rv_telescopes']):
    color = colors[rv_teles]
    df1 = df[df['rv_telescopes'] == rv_teles]  # keep only the rows for this telescope
    ax.errorbar(df1['rv_times'], df1['rv_rvs'], df1['rv_sigma'],
                color=color, ls='', marker='_', ms=30, linewidth=3, label=rv_teles)
ax.legend(loc='upper left', ncol=1, fontsize=14)
plt.show()
plt.errorbar() works very similarly to plt.plot(), with extra parameters. As such, it primarily draws a line graph in a single color. The error bars can be given individual colors via the ecolor= parameter. The markers, however, get the same color as the line graph. The line graph can be suppressed via an empty linestyle. On top of that, plt.scatter() can draw markers with individual colors.
In order not to mix the 'object-oriented' and the 'functional' interfaces, the following example code uses ax.errorbar() and ax.scatter().
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D # for legend handle
import pandas as pd
import numpy as np
times = [1, 2, 3, 4, 5]
rvs = [2, 4, 2, 4, 7]
sigma = [0.564, 0.6, 0.8, 0.8, 0.4]
rv_telescopes = ['A', 'B', 'A', 'C', 'C']
d = {'rv_times': times, 'rv_rvs': rvs, 'rv_sigma': sigma, 'rv_telescopes': rv_telescopes}
df = pd.DataFrame(data=d)
colors = {'A': '#008f00', 'B': '#e36500', 'C': 'red'}
fig, ax = plt.subplots(figsize=(10, 10))
ax.errorbar(df['rv_times'], df['rv_rvs'], df['rv_sigma'], color='none', ecolor=df['rv_telescopes'].map(colors))
ax.scatter(df['rv_times'], df['rv_rvs'], marker='_', color=df['rv_telescopes'].map(colors), s=100)
handles = [Line2D([0], [0], linestyle='', marker='_', color=v, label=k, markersize=10) for k, v in colors.items()]
ax.legend(handles=handles, loc='upper left', ncol=1, fontsize=14)
plt.show()
A far easier approach would be to call ax.errorbar() multiple times, once for each color. This would automatically create appropriate legend handles:
for rv_teles in np.unique(df['rv_telescopes']):
    color = colors[rv_teles]
    df1 = df[df['rv_telescopes'] == rv_teles]  # keep only the rows for this telescope
    ax.errorbar(df1['rv_times'], df1['rv_rvs'], df1['rv_sigma'],
                color=color, ls='', marker='_', ms=30, label=rv_teles)
ax.legend(loc='upper left', ncol=1, fontsize=14)
plt.show()
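The same per-group loop can also be written with pandas' groupby, which yields each telescope name together with its rows. A minimal sketch, reusing the data and color mapping from the question; plt.show() is left out so the snippet also runs without a display.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'rv_times': [1, 2, 3, 4, 5],
                   'rv_rvs': [2, 4, 2, 4, 7],
                   'rv_sigma': [0.564, 0.6, 0.8, 0.8, 0.4],
                   'rv_telescopes': ['A', 'B', 'A', 'C', 'C']})
colors = {'A': '#008f00', 'B': '#e36500', 'C': 'red'}

fig, ax = plt.subplots(figsize=(10, 10))
# groupby iterates over (group key, sub-DataFrame) pairs, sorted by key
for name, group in df.groupby('rv_telescopes'):
    ax.errorbar(group['rv_times'], group['rv_rvs'], yerr=group['rv_sigma'],
                color=colors[name], ls='', marker='_', ms=30, label=name)
legend = ax.legend(loc='upper left', ncol=1, fontsize=14)
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Each errorbar call contributes one legend entry, so no manual Line2D handles are needed. Call plt.show() afterwards to display the figure.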
I am graphing the predicted and actual results of an ML project using pyplot. I have a scatter plot of each dataset as a subplot, and the Y values are elements of [-1, 0, 1]. I would like to change the color of the points if both points have the same X and Y value, but I am not sure how to implement this. Here is my code so far:
import matplotlib.pyplot as plt
Y = [1, 0, -1, 0, 1]
Z = [1, 1, 1, 1, 1]
plt.subplots()
plt.title('Title')
plt.xlabel('Timestep')
plt.ylabel('Score')
plt.scatter(x = [i for i in range(len(Y))], y = Y, label = 'Actual')
plt.scatter(x = [i for i in range(len(Y))], y = Z, label = 'Predicted')
plt.legend()
I would simply make use of NumPy indexing in this case. Specifically, first plot all the data points and then additionally highlight only those points that fulfill the conditions X==Y and X==Z:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
Y = np.array([1, 0, -1, 0, 1])
Z = np.array([1, 1, 1, 1, 1])
X = np.arange(len(Y))
# Labels and titles here
plt.scatter(X, Y, label = 'Actual')
plt.scatter(X, Z, label = 'Predicted')
plt.scatter(X[X==Y], Y[X==Y], color='black', s=500)
plt.scatter(X[X==Z], Z[X==Z], color='red', s=500)
plt.xticks(X)
plt.legend()
plt.show()
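If the goal is instead to recolor the timesteps where the predicted value coincides with the actual one, the mask can compare Y with Z directly. This is a sketch under that reading of the question; the name match and the 'Match' label are illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np

Y = np.array([1, 0, -1, 0, 1])   # actual
Z = np.array([1, 1, 1, 1, 1])    # predicted
X = np.arange(len(Y))            # timesteps

match = (Y == Z)  # True where actual and predicted coincide

fig, ax = plt.subplots()
ax.scatter(X, Y, label='Actual')
ax.scatter(X, Z, label='Predicted')
# draw the coinciding points again on top, in a distinct color
ax.scatter(X[match], Y[match], color='black', label='Match', zorder=3)
ax.set_xticks(X)
ax.legend()
```

With the sample data, the mask selects timesteps 0 and 4. Call plt.show() afterwards to display the figure.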
I want to specify manually the color of a line segment in holoviews, based on a third column.
I'm aware of the hv.Path examples; however, that approach shortens the line by one segment, which I don't want.
I can do it using bokeh, or using matplotlib, but I can't get it right using holoviews.
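def get_color(min_val, max_val, val, palette):
    # Map val in [min_val, max_val] to the nearest palette index
    return palette[int((val - min_val) * ((len(palette) - 1) / (max_val - min_val)) + .5)]
from bokeh.io import show
from bokeh.palettes import Viridis256
from bokeh.plotting import figure
y = [0,1,2,3,4,5]
x = [0]*len(y)
z = [1,2,3,4,5]
# Note: Bokeh 3 removed plot_width/plot_height in favor of width/height
p = figure(plot_width=500, plot_height=200, tools='')
# Draw each segment as its own line, colored by the matching z value
for i, zi in enumerate(z):
    p.line([x[i], x[i+1]], [y[i], y[i+1]], line_color=get_color(1, 5, zi, Viridis256), line_width=4)
show(p)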
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
# The line format you currently have:
lines = [[(0, 1, 2, 3, 4), (4, 5, 6, 7, 8)],
         [(0, 1, 2, 3, 4), (0, 1, 2, 3, 4)],
         [(0, 1, 2, 3, 4), (8, 7, 6, 5, 4)],
         [(4, 5, 6, 7, 8), (0, 1, 2, 3, 4)]]
# Reformat it to what `LineCollection` expects (lists of (x, y) pairs):
lines = [list(zip(x, y)) for x, y in lines]
z = np.array([0.1, 9.4, 3.8, 2.0])
fig, ax = plt.subplots()
lines = LineCollection(lines, array=z, cmap=plt.cm.rainbow, linewidths=5)
ax.add_collection(lines)
fig.colorbar(lines)
# Manually adding artists doesn't rescale the plot, so we need to autoscale
ax.autoscale()
plt.show()
from bokeh.palettes import Viridis256
curvlst = [hv.Curve([[x[i],y[i]],[x[i+1],y[i+1]]],line_color = get_color(1,5,z,Viridis256)) for i,z in enumerate(z) ]
hv.Overlay(curvlst)
WARNING:param.Curve26666: Setting non-parameter attribute line_color=#440154 using a mechanism intended only for parameters
You could use a so-called dim transform by rewriting the function a little bit:
def get_color(val, min_val, max_val, palette):
    return [palette[(int((val-min_val)*((len(palette)-1)/(max_val-min_val))+.5))]]
y = [0,1,2,3,4,5]
x = [0]*len(y)
z = [1,2,3,4,5]
hv.NdOverlay({z: hv.Curve(([x[i],x[i+1]], [y[i],y[i+1]])) for i, z in enumerate(z)}, kdims=['z']).opts(
'Curve', color=hv.dim('z', get_color, 1, 5, Viridis256))
That being said, I don't think you should have to manually colormap Curves so I've opened: https://github.com/pyviz/holoviews/issues/3764.
I think I figured it out:
from bokeh.palettes import Viridis256
def get_color(min_val, max_val, val, palette):
    return palette[(int((val-min_val)*((len(palette)-1)/(max_val-min_val))+.5))]
curvlst = [hv.Curve([[x[i],y[i]],[x[i+1],y[i+1]]]).opts(color=get_color(1,5,z,Viridis256)) for i,z in enumerate(z) ]
hv.Overlay(curvlst)
Please let me know if this is good practice, or if you know a better way.
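To make the index arithmetic in get_color easier to follow, here is a small standalone sketch that uses a hypothetical five-color palette instead of Viridis256. The clamping step is an assumption, not part of the original function: it maps values outside [min_val, max_val] to the end colors instead of raising an IndexError or wrapping around to a negative index.

```python
def get_color(min_val, max_val, val, palette):
    """Map val in [min_val, max_val] to the nearest entry of palette."""
    # clamp so out-of-range values use the first or last color (assumption)
    val = min(max(val, min_val), max_val)
    # scale to [0, len(palette) - 1] and round to the nearest index
    scale = (len(palette) - 1) / (max_val - min_val)
    return palette[int((val - min_val) * scale + 0.5)]

# hypothetical 5-color palette, for illustration only
palette = ['#440154', '#3b528b', '#21918c', '#5ec962', '#fde725']
segment_colors = [get_color(1, 5, z, palette) for z in [1, 2, 3, 4, 5]]
print(segment_colors == palette)  # True: each z value gets its own color
```

With five z values and five colors, every segment gets a distinct color; with a longer palette such as Viridis256 the same formula spreads the values evenly across the palette.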
Given the following bar chart:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({'A': ['A', 'B'], 'B': [1000,2000]})
fig, ax = plt.subplots(1, 1, figsize=(2, 2))
df.plot(kind='bar', x='A', y='B',
        align='center', width=.5, edgecolor='none',
        color='grey', ax=ax)
plt.xticks(rotation=25)
plt.show()
I'd like to display the y-tick labels as thousands of dollars like this:
$2,000
I know I can use this to add a dollar sign:
import matplotlib.ticker as mtick
fmt = '$%.0f'
tick = mtick.FormatStrFormatter(fmt)
ax.yaxis.set_major_formatter(tick)
...and this to add a comma:
ax.get_yaxis().set_major_formatter(
mtick.FuncFormatter(lambda x, p: format(int(x), ',')))
...but how do I get both?
Thanks in advance!
You can use StrMethodFormatter, which uses the str.format() specification mini-language.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
df = pd.DataFrame({'A': ['A', 'B'], 'B': [1000,2000]})
fig, ax = plt.subplots(1, 1, figsize=(2, 2))
df.plot(kind='bar', x='A', y='B',
        align='center', width=.5, edgecolor='none',
        color='grey', ax=ax)
fmt = '${x:,.0f}'
tick = mtick.StrMethodFormatter(fmt)
ax.yaxis.set_major_formatter(tick)
plt.xticks(rotation=25)
plt.show()
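The format string is plain str.format() syntax, so it can be tried without a plot. The sketch below also shows a hypothetical helper, dollars, that puts the minus sign in front of the dollar sign for negative ticks; it takes (x, pos) arguments, the signature that mtick.FuncFormatter expects.

```python
fmt = '${x:,.0f}'
print(fmt.format(x=2000))    # $2,000
print(fmt.format(x=-1500))   # $-1,500

def dollars(x, pos=None):
    """Format a tick value as dollars, with the sign before the '$'."""
    sign = '-' if x < 0 else ''
    return f'{sign}${abs(x):,.0f}'

print(dollars(2000))    # $2,000
print(dollars(-1500))   # -$1,500
```

The helper could then be installed with ax.yaxis.set_major_formatter(mtick.FuncFormatter(dollars)).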
You can also use get_yticks() to get an array of the values displayed on the y-axis (0, 500, 1000, etc.) and set_yticklabels() to set the formatted values.
df = pd.DataFrame({'A': ['A', 'B'], 'B': [1000,2000]})
fig, ax = plt.subplots(1, 1, figsize=(2, 2))
df.plot(kind='bar', x='A', y='B', align='center', width=.5, edgecolor='none',
        color='grey', ax=ax)
# -------------------- Added code --------------------
# get the array of tick values on the y-axis
ticks = ax.get_yticks()
# format the values as strings with a dollar sign and thousands separators
new_labels = [f'${int(amt):,}' for amt in ticks]
# fix the tick positions first, then set the new labels
ax.set_yticks(ticks)
ax.set_yticklabels(new_labels)
# ----------------------------------------------------
plt.xticks(rotation=25)
plt.show()