plotly python change y-axis values to your own - python-3.x

A graph of random values is plotted. I need to replace the values shown on the Y axis with my own.
xg = np.random.rand(100, 1200)
fig = px.imshow(xg, aspect="auto",color_continuous_scale='ice')
fig.show()
The values to use instead:
y1=np.arange(100)*0.5
y2=np.arange(100)*5
Thanks!

100 ticks on the y-axis is too many, so I have gone with 10.
This is a simple case of setting tickmode to "array": tickvals defines where the ticks go within the range of row indices (100), and ticktext defines the text displayed at each of those positions. See https://plotly.com/python/tick-formatting/#tickmode--array
The multiplications below are equivalent to your y1 and y2; they just generate arrays of the appropriate length (ten ticks).
import numpy as np
import plotly.express as px
xg = np.random.rand(100, 1200)
fig = px.imshow(xg, aspect="auto",color_continuous_scale='ice')
# y1
fig.update_layout(yaxis={"tickmode":"array","tickvals":np.arange(10)*10, "ticktext":np.arange(10)*5}).show()
# y2
fig.update_layout(yaxis={"tickmode":"array","tickvals":np.arange(10)*10, "ticktext":np.arange(10)*50}).show()
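The two update_layout calls above differ only in the factor applied to the tick text. As a minimal sketch (the helper name scaled_ticks and its parameters are hypothetical, not part of plotly), the axis dictionary could be built for any scale factor like this:

```python
def scaled_ticks(n_rows, n_ticks, scale):
    """Build a plotly-style axis dict with n_ticks evenly spaced ticks.

    Hypothetical helper: tickvals are row indices of the image,
    ticktext is the text to display at those rows (index * scale).
    """
    step = n_rows // n_ticks
    tickvals = [i * step for i in range(n_ticks)]
    # ":g" drops a trailing ".0" so 5.0 is shown as "5"
    ticktext = [f"{v * scale:g}" for v in tickvals]
    return {"tickmode": "array", "tickvals": tickvals, "ticktext": ticktext}


y1_axis = scaled_ticks(100, 10, 0.5)  # same labels as np.arange(10) * 5
y2_axis = scaled_ticks(100, 10, 5)    # same labels as np.arange(10) * 50
```

Either dictionary could then be passed as fig.update_layout(yaxis=y1_axis).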
Supplementary: show both y-axes
As per the comment, you can use the documented approach: https://plotly.com/python/multiple-axes/#multiple-axes
You need one trace per axis, hence the creation of two traces and the modification of the second trace to use y2.
Setting the second trace to "visible": False does not work, so there are two identical traces on top of each other.
import numpy as np
import plotly.express as px
xg = np.random.rand(100, 1200)
fig = px.imshow(xg, aspect="auto", color_continuous_scale="ice").add_traces(
    px.imshow(xg, aspect="auto", color_continuous_scale="ice")
    .update_traces(yaxis="y2")
    .data
)
fig.update_layout(
    xaxis={"domain": [0.05, 1]},
    yaxis={
        "tickmode": "array",
        "tickvals": np.arange(10) * 10,
        "ticktext": np.arange(10) * 50,
    },
    yaxis2={
        "tickmode": "array",
        "tickvals": np.arange(10) * 10,
        "ticktext": np.arange(10) * 5,
        "anchor": "free",
        "position": 0,
        "autorange": "reversed",
    },
).show()

Related

Change colorbar limits without changing the values of the data it represents in scatter

I'm trying to change a colorbar attached to a scatter plot so that the minimum and maximum of the colorbar are the minimum and maximum of the data, but I want the data to be centred at zero as I'm using a colormap with white at zero. Here is my example
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 1, 61)
y = np.linspace(0, 1, 61)
C = np.linspace(-10, 50, 61)
M = np.abs(C).max() # used for vmin and vmax
fig, ax = plt.subplots(1, 1, figsize=(5,3), dpi=150)
sc=ax.scatter(x, y, c=C, marker='o', edgecolor='k', vmin=-M, vmax=M, cmap=plt.cm.RdBu_r)
cbar=fig.colorbar(sc, ax=ax, label='$R - R_0$ (mm)')
ax.set_xlabel('x')
ax.set_ylabel('y')
As you can see from the attached figure, the colorbar goes down to -M, whereas I want the bar to go down only to -10; but if I set vmin=-10, the colormap will no longer be centred on white at zero. Normally, when setting vmin and vmax to -M and +M with contourf, the colorbar automatically comes out the way I want. This sort of behaviour is what I expect when contourf uses levels=np.linspace(-M,M,61) rather than setting it with vmin and vmax with levels=62. An example showing the default contourf colorbar behaviour I want in my scatter example is shown below
plt.figure(figsize=(6,5), dpi=150)
plt.contourf(x, x, np.reshape(np.linspace(-10, 50, 61*61), (61,61)),
levels=62, vmin=-M, vmax=M, cmap=plt.cm.RdBu_r)
plt.colorbar(label='$R - R_0$ (mm)')
Does anyone have any thoughts? I found this link, which I thought might solve the problem, but when executing the cbar.outline.set_ydata line I get this error: AttributeError: 'Polygon' object has no attribute 'set_ydata'.
EDIT: I am a little annoyed that someone has closed this question without allowing me to clarify any questions they might have, as none of the proposed solutions are what I'm asking for.
As for Normalize.TwoSlopeNorm, I do not want to rescale the smaller negative side to use the entire colormap range, I just want the colorbar attached to the side of my graph to stop at -10.
This link also does not solve my issue, as it's the TwoSlopeNorm solution again.
After changing the ylim of the colorbar, the rectangle formed by the surrounding spines is too large. You can make this outline invisible. And then add a new rectangular border:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 1, 61)
y = np.linspace(0, 1, 61)
C = np.linspace(-10, 50, 61)
M = np.abs(C).max() # used for vmin and vmax
fig, ax = plt.subplots(1, 1, figsize=(5, 3), dpi=150)
sc = ax.scatter(x, y, c=C, marker='o', edgecolor='k', vmin=-M, vmax=M, cmap=plt.cm.RdBu_r)
cbar = fig.colorbar(sc, ax=ax, label='$R - R_0$ (mm)')
cb_ymin = C.min()
cb_ymax = C.max()
cb_xmin, cb_xmax = cbar.ax.get_xlim()
cbar.ax.set_ylim(cb_ymin, cb_ymax)
cbar.outline.set_visible(False) # hide the surrounding spines, which are too large after set_ylim
cbar.ax.add_patch(plt.Rectangle((cb_xmin, cb_ymin), cb_xmax - cb_xmin, cb_ymax - cb_ymin,
fc='none', ec='black', clip_on=False))
plt.show()
Another approach, until Matplotlib v3.5 is released, is to build a custom colormap that does what you want (see also https://matplotlib.org/stable/tutorials/colors/colormap-manipulation.html#sphx-glr-tutorials-colors-colormap-manipulation-py)
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.cm as cm
from matplotlib.colors import ListedColormap
fig, axs = plt.subplots(2, 1)
X = np.random.randn(32, 32) + 2
pc = axs[0].pcolormesh(X, vmin=-6, vmax=6, cmap='RdBu_r')
fig.colorbar(pc, ax=axs[0])
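def keep_center_colormap(vmin, vmax, center=0):
    # shift the limits so that the center sits at zero
    vmin = vmin - center
    vmax = vmax - center
    # width of the symmetric range that would keep the center in the middle
    dv = max(-vmin, vmax) * 2
    # number of colors needed so that the kept slice has about 256 entries
    N = int(256 * dv / (vmax-vmin))
    # plt.get_cmap is used here because cm.get_cmap was removed in Matplotlib 3.9
    RdBu_r = plt.get_cmap('RdBu_r', N)
    newcolors = RdBu_r(np.linspace(0, 1, N))
    # cut away the part of the symmetric map that lies outside [vmin, vmax]
    beg = int((dv / 2 + vmin)*N / dv)
    end = N - int((dv / 2 - vmax)*N / dv)
    newmap = ListedColormap(newcolors[beg:end])
    return newmap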
newmap = keep_center_colormap(-2, 6, center=0)
pc = axs[1].pcolormesh(X, vmin=-2, vmax=6, cmap=newmap)
fig.colorbar(pc, ax=axs[1])
plt.show()
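To see why the slicing in keep_center_colormap keeps white at the center value, the index arithmetic can be run on its own. This is only a sketch: center_slice and its base parameter are hypothetical names that mirror the arithmetic above without touching Matplotlib.

```python
def center_slice(vmin, vmax, center=0, base=256):
    """Return (N, beg, end): the symmetric colormap has N colors and
    the truncated colormap keeps the entries beg:end of it."""
    vmin, vmax = vmin - center, vmax - center
    dv = max(-vmin, vmax) * 2             # width of the symmetric range
    N = int(base * dv / (vmax - vmin))    # colors in the symmetric map
    beg = int((dv / 2 + vmin) * N / dv)   # entries dropped on the low side
    end = N - int((dv / 2 - vmax) * N / dv)
    return N, beg, end


N, beg, end = center_slice(-2, 6)
# Where the middle of the symmetric map (white) lands inside the kept slice,
# expressed as a fraction of the slice length.
center_fraction = (N // 2 - beg) / (end - beg)
# Where the center value 0 sits between vmin=-2 and vmax=6.
expected_fraction = (0 - (-2)) / (6 - (-2))
```

For vmin=-2 and vmax=6 both fractions come out as 0.25, so the white entry of the truncated map lines up with the data value 0.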

How to draw vertical average lines for overlapping histograms in a loop

I'm trying to use matplotlib to draw a vertical line at the average of each of two overlapping histograms, using a loop. I have managed to draw the first one, but I don't know how to draw the second one. I'm using two variables from a dataset to draw the histograms. One variable (feat) is categorical (0 - 1), and the other one (objective) is numerical. The code is the following:
for chas in df[feat].unique():
    plt.hist(df.loc[df[feat] == chas, objective], bins = 15, alpha = 0.5, density = True, label = chas)
    plt.axvline(df[objective].mean(), linestyle = 'dashed', linewidth = 2)
plt.title(objective)
plt.legend(loc = 'upper right')
I also have to add to the legend the mean and standard deviation values for each histogram.
How can I do it? Thank you in advance.
I recommend using an Axes object to plot your figure. Please see the code below and the artist tutorial.
import numpy as np
import matplotlib.pyplot as plt
# Fixing random state for reproducibility
np.random.seed(19680801)
mu1, sigma1 = 100, 8
mu2, sigma2 = 150, 15
x1 = mu1 + sigma1 * np.random.randn(10000)
x2 = mu2 + sigma2 * np.random.randn(10000)
fig, ax = plt.subplots(1, 1, figsize=(7.2, 7.2))
# the histogram of the data
lbs = ['a', 'b']
colors = ['r', 'g']
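for i, x in enumerate([x1, x2]):
    n, bins, patches = ax.hist(x, 50, density=True, facecolor=colors[i], alpha=0.75, label=lbs[i])
    # mark the mean of the data itself (bins.mean() would only give the midpoint of the bin edges)
    ax.axvline(x.mean(), color=colors[i], linestyle='dashed')
ax.legend()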
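The question also asks for the mean and standard deviation of each histogram in the legend. A minimal sketch of building such labels follows; stats_label and the toy groups dictionary are hypothetical, and the standard library is used so the snippet runs on its own. Note that statistics.stdev is the sample standard deviation, which matches the default of pandas Series.std(), whereas np.std defaults to the population value.

```python
from statistics import mean, stdev


def stats_label(name, values):
    """Hypothetical helper: legend text with the mean and sample std of one group."""
    return f"{name} (mean={mean(values):.2f}, std={stdev(values):.2f})"


# Toy data standing in for the values of each category
groups = {0: [1.0, 2.0, 3.0], 1: [10.0, 12.0, 14.0]}

labels = {name: stats_label(name, values) for name, values in groups.items()}
means = {name: mean(values) for name, values in groups.items()}
```

Inside the plotting loop, each label could be passed as label= to the hist call, and the matching entry of means as the position for axvline.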

Seaborn barplot with two y-axis

Consider the following pandas DataFrame:
  labels  values_a  values_b  values_x  values_y
0  date1         1         3       150       170
1  date2         2         6       200       180
It is easy to plot this with Seaborn (see example code below). However, due to the big difference between values_a/values_b and values_x/values_y, the bars for values_a and values_b are not easily visible (the dataset given above is just a sample; in my real dataset the difference is even bigger). Therefore, I would like to use two y-axes, i.e., one y-axis for values_a/values_b and one for values_x/values_y. I tried to use plt.twinx() to get a second axis, but unfortunately the plot shows only two bars, for values_x and values_y, even though there are now two y-axes with the right scaling. :) Do you have an idea how to fix that and get four bars for each label, where the values_a/values_b bars relate to the left y-axis and the values_x/values_y bars relate to the right y-axis?
Thanks in advance!
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records([("date1", 1, 3, 150, 170),\
("date2", 2, 6, 200, 180)],\
columns=columns)
# working example but with unreadable values_a and values_b
test_data_melted = pd.melt(test_data, id_vars=columns[0],\
var_name="source", value_name="value_numbers")
g = sns.barplot(x=columns[0], y="value_numbers", hue="source",\
data=test_data_melted)
plt.show()
# values_a and values_b are not displayed
values1_melted = pd.melt(test_data, id_vars=columns[0],\
value_vars=["values_a", "values_b"],\
var_name="source1", value_name="value_numbers1")
values2_melted = pd.melt(test_data, id_vars=columns[0],\
value_vars=["values_x", "values_y"],\
var_name="source2", value_name="value_numbers2")
g1 = sns.barplot(x=columns[0], y="value_numbers1", hue="source1",\
data=values1_melted)
ax2 = plt.twinx()
g2 = sns.barplot(x=columns[0], y="value_numbers2", hue="source2",\
data=values2_melted, ax=ax2)
plt.show()
This is probably best suited for multiple sub-plots, but if you are truly set on a single plot, you can scale the data before plotting, create another axis and then modify the tick values.
Sample Data
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np
columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records([("date1", 1, 3, 150, 170),\
("date2", 2, 6, 200, 180)],\
columns=columns)
test_data_melted = pd.melt(test_data, id_vars=columns[0],\
var_name="source", value_name="value_numbers")
Code:
# Scale the data, just a simple example of how you might determine the scaling
mask = test_data_melted.source.isin(['values_a', 'values_b'])
scale = int(test_data_melted[~mask].value_numbers.mean()
/test_data_melted[mask].value_numbers.mean())
test_data_melted.loc[mask, 'value_numbers'] = test_data_melted.loc[mask, 'value_numbers']*scale
# Plot
fig, ax1 = plt.subplots()
g = sns.barplot(x=columns[0], y="value_numbers", hue="source",\
data=test_data_melted, ax=ax1)
# Create a second y-axis with the scaled ticks
ax1.set_ylabel('X and Y')
ax2 = ax1.twinx()
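# Ensure ticks occur at the same positions, then modify labels
# (fix the tick positions first, so the labels below stay attached to them)
ax2.set_yticks(ax1.get_yticks())
ax2.set_ylim(ax1.get_ylim())
ax2.set_yticklabels(np.round(ax1.get_yticks()/scale,1))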
ax2.set_ylabel('A and B')
plt.show()
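The scaling step can be checked by hand with the sample values. This is just an illustrative sketch in plain Python; the names values_ab, values_xy and right_axis_label are hypothetical and only reproduce the arithmetic of the answer.

```python
# Sample values from the question
values_ab = [1, 3, 2, 6]            # values_a and values_b
values_xy = [150, 170, 200, 180]    # values_x and values_y

# Ratio of the two group means, truncated to an integer as in the answer
mean_ab = sum(values_ab) / len(values_ab)
mean_xy = sum(values_xy) / len(values_xy)
scale = int(mean_xy / mean_ab)

# Heights at which the A/B bars are actually drawn on the shared axis
scaled_ab = [v * scale for v in values_ab]


def right_axis_label(left_tick):
    """Hypothetical helper: text shown on the right axis at a left-axis tick."""
    return round(left_tick / scale, 1)
```

With these numbers the scale is 58, so the bar for values_b = 6 is drawn at height 348 and the right axis shows 6.0 at that height.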

Place the plots on the page in two columns [duplicate]

I found the following example on matplotlib:
import numpy as np
import matplotlib.pyplot as plt
x1 = np.linspace(0.0, 5.0)
x2 = np.linspace(0.0, 2.0)
y1 = np.cos(2 * np.pi * x1) * np.exp(-x1)
y2 = np.cos(2 * np.pi * x2)
plt.subplot(2, 1, 1)
plt.plot(x1, y1, 'ko-')
plt.title('A tale of 2 subplots')
plt.ylabel('Damped oscillation')
plt.subplot(2, 1, 2)
plt.plot(x2, y2, 'r.-')
plt.xlabel('time (s)')
plt.ylabel('Undamped')
plt.show()
My question is: what do I need to change to have the plots side by side?
Change your subplot settings to:
plt.subplot(1, 2, 1)
...
plt.subplot(1, 2, 2)
The parameters for subplot are: number of rows, number of columns, and which subplot you're currently on. So 1, 2, 1 means "a 1-row, 2-column figure: go to the first subplot." Then 1, 2, 2 means "a 1-row, 2-column figure: go to the second subplot."
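The third argument counts subplots row by row, starting at 1 in the upper-left corner. As a small sketch of that numbering (subplot_position is a hypothetical helper, not a Matplotlib function):

```python
def subplot_position(nrows, ncols, index):
    """Return the zero-based (row, column) of a 1-based subplot index."""
    if not 1 <= index <= nrows * ncols:
        raise ValueError("index must be between 1 and nrows * ncols")
    # Indices run left to right, then top to bottom
    return divmod(index - 1, ncols)


stacked = subplot_position(2, 1, 2)       # second plot of a 2-row, 1-column layout
side_by_side = subplot_position(1, 2, 2)  # second plot of a 1-row, 2-column layout
```

Here stacked is (1, 0), i.e. the second row, while side_by_side is (0, 1), i.e. the second column of the only row.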
You are currently asking for a 2-row, 1-column (that is, one atop the other) layout. You need to ask for a 1-row, 2-column layout instead.
To minimize the overlap of subplots, you might want to add:
plt.tight_layout()
before the show.
Check this page out: http://matplotlib.org/examples/pylab_examples/subplots_demo.html
plt.subplots is similar. I think it's better since it's easier to set parameters of the figure. The first two arguments define the layout (in your case 1 row, 2 columns), and other parameters change features such as figure size:
import numpy as np
import matplotlib.pyplot as plt
x1 = np.linspace(0.0, 5.0)
x2 = np.linspace(0.0, 2.0)
y1 = np.cos(2 * np.pi * x1) * np.exp(-x1)
y2 = np.cos(2 * np.pi * x2)
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(5, 3))
axes[0].plot(x1, y1)
axes[1].plot(x2, y2)
fig.tight_layout()
When stacking subplots in one direction, the matplotlib documentation advocates unpacking immediately if you are just creating a few axes.
fig, (ax1, ax2) = plt.subplots(1,2, figsize=(20,8))
sns.histplot(df['Price'], ax=ax1)
sns.histplot(np.log(df['Price']),ax=ax2)
plt.show()
You can use matplotlib.gridspec.GridSpec.
See https://matplotlib.org/stable/api/_as_gen/matplotlib.gridspec.GridSpec.html
The code below displays a heatmap on the left and an image on the right.
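import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import seaborn as sns
from skimage import io  # assumption: io in this snippet is skimage.io
# Create a grid with 1 row and 2 columns
gs = gridspec.GridSpec(1, 2)
fig = plt.figure(figsize=(25,3))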
#Using the 1st row and 1st column for plotting heatmap
ax=plt.subplot(gs[0,0])
ax=sns.heatmap([[1,23,5,8,5]],annot=True)
#Using the 1st row and 2nd column to show the image
ax1=plt.subplot(gs[0,1])
ax1.grid(False)
ax1.set_yticklabels([])
ax1.set_xticklabels([])
#The below lines are used to display the image on ax1
image = io.imread("https://images-na.ssl-images-amazon.com/images/I/51MvhqY1qdL._SL160_.jpg")
plt.imshow(image)
plt.show()
Basically, we have to define how many rows and columns we require.
Let's say we have a total of 4 categorical columns to plot, so we want 4 plots in 2 rows and 2 columns.
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
sns.set_style("darkgrid")
%matplotlib inline
#15 by 15 size set for entire plots
plt.figure(figsize=(15,15));
#Set rows variable to 2
rows = 2
#Set columns variable to 2, this way we will plot 2 by 2 = 4 plots
columns = 2
#Set the plot_count variable to 1
#This variable will be used to define which plot out of total 4 plot
plot_count = 1
cat_columns = [col for col in df.columns if df[col].dtype=='O']
for col in cat_columns:
    plt.subplot(rows, columns, plot_count)
    sns.countplot(x=col, data=df)
    plt.xticks(rotation=70);
    #plot variable is incremented by 1 till 4, specifying which plot of total 4 plots
    plot_count += 1
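The loop above hard-codes a 2 by 2 grid, which only fits four plots; plt.subplot rejects an index larger than rows * columns. A possible way to derive the number of rows from the number of plots is sketched below (grid_shape is a hypothetical helper name):

```python
import math


def grid_shape(n_plots, columns):
    """Return (rows, columns) of the smallest grid that holds n_plots."""
    rows = math.ceil(n_plots / columns)
    return rows, columns


four_plots = grid_shape(4, 2)  # the case described above
five_plots = grid_shape(5, 2)  # one more column to plot needs a third row
```

The result could replace the fixed rows and columns variables, for example rows, columns = grid_shape(len(cat_columns), 2).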

Smooth curves in Python Plots [duplicate]

I've got the following simple script that plots a graph:
import matplotlib.pyplot as plt
import numpy as np
T = np.array([6, 7, 8, 9, 10, 11, 12])
power = np.array([1.53E+03, 5.92E+02, 2.04E+02, 7.24E+01, 2.72E+01, 1.10E+01, 4.70E+00])
plt.plot(T,power)
plt.show()
As it is now, the line goes straight from point to point, which looks OK but could be better in my opinion. What I want is to smooth the line between the points. In Gnuplot I would have plotted with smooth csplines.
Is there an easy way to do this in PyPlot? I've found some tutorials, but they all seem rather complex.
You could use scipy.interpolate.spline to smooth out your data yourself:
from scipy.interpolate import spline
# 300 represents number of points to make between T.min and T.max
xnew = np.linspace(T.min(), T.max(), 300)
power_smooth = spline(T, power, xnew)
plt.plot(xnew,power_smooth)
plt.show()
spline is deprecated as of scipy 0.19.0 (and has since been removed); use the BSpline class instead.
Switching from spline to BSpline isn't a straightforward copy/paste and requires a little tweaking:
from scipy.interpolate import make_interp_spline, BSpline
# 300 represents number of points to make between T.min and T.max
xnew = np.linspace(T.min(), T.max(), 300)
spl = make_interp_spline(T, power, k=3) # type: BSpline
power_smooth = spl(xnew)
plt.plot(xnew, power_smooth)
plt.show()
For this example a spline works well, but if the function is not inherently smooth and you want a smoothed version, you can also try:
from scipy.ndimage import gaussian_filter1d  # the scipy.ndimage.filters namespace is deprecated
ysmoothed = gaussian_filter1d(y, sigma=2)
plt.plot(x, ysmoothed)
plt.show()
If you increase sigma, you get a smoother function.
Proceed with caution with this one: it modifies the original values and may not be what you want.
See the scipy.interpolate documentation for some examples.
The following example demonstrates interp1d for linear and cubic spline interpolation:
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import interp1d
# Define x, y, and xnew to resample at.
x = np.linspace(0, 10, num=11, endpoint=True)
y = np.cos(-x**2/9.0)
xnew = np.linspace(0, 10, num=41, endpoint=True)
# Define interpolators.
f_linear = interp1d(x, y)
f_cubic = interp1d(x, y, kind='cubic')
# Plot.
plt.plot(x, y, 'o', label='data')
plt.plot(xnew, f_linear(xnew), '-', label='linear')
plt.plot(xnew, f_cubic(xnew), '--', label='cubic')
plt.legend(loc='best')
plt.show()
Slightly modified for increased readability.
One of the easiest implementations I found was to use the exponential moving average that TensorBoard uses:
from typing import List
def smooth(scalars: List[float], weight: float) -> List[float]:  # Weight between 0 and 1
    last = scalars[0]  # First value in the plot (first timestep)
    smoothed = list()
    for point in scalars:
        smoothed_val = last * weight + (1 - weight) * point  # Calculate smoothed value
        smoothed.append(smoothed_val)  # Save it
        last = smoothed_val  # Anchor the last smoothed value
    return smoothed
ax.plot(x_labels, smooth(train_data, .9), x_labels, train_data)
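To get a feel for what the weight does, the same recurrence can be written compactly and run on a tiny input. This is a sketch, not the TensorBoard code itself; ema is a hypothetical name and itertools.accumulate is used only as a shorthand for the loop above.

```python
from itertools import accumulate


def ema(scalars, weight):
    """Exponential moving average: each output is
    weight * previous_output + (1 - weight) * current_point,
    with the first output equal to the first point."""
    return list(accumulate(
        scalars,
        lambda last, point: last * weight + (1 - weight) * point,
    ))


half = ema([1.0, 2.0, 3.0], 0.5)  # moderate smoothing
none = ema([1.0, 2.0, 3.0], 0.0)  # weight 0 leaves the data unchanged
```

With weight 0.5 the result is [1.0, 1.5, 2.25]: the smoothed curve lags behind the rising data, and a weight closer to 1 makes that lag stronger.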
I presume you mean curve-fitting and not anti-aliasing from the context of your question. PyPlot doesn't have any built-in support for this, but you can easily implement some basic curve-fitting yourself, like the code seen here, or if you're using GuiQwt it has a curve fitting module. (You could probably also steal the code from SciPy to do this as well).
Here is a simple solution for dates:
from scipy.interpolate import make_interp_spline
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as dates
from datetime import datetime
data = {
datetime(2016, 9, 26, 0, 0): 26060, datetime(2016, 9, 27, 0, 0): 23243,
datetime(2016, 9, 28, 0, 0): 22534, datetime(2016, 9, 29, 0, 0): 22841,
datetime(2016, 9, 30, 0, 0): 22441, datetime(2016, 10, 1, 0, 0): 23248
}
#create data
date_np = np.array(list(data.keys()))
value_np = np.array(list(data.values()))
date_num = dates.date2num(date_np)
# smooth
date_num_smooth = np.linspace(date_num.min(), date_num.max(), 100)
spl = make_interp_spline(date_num, value_np, k=3)
value_np_smooth = spl(date_num_smooth)
# plot
plt.plot(date_np, value_np)
plt.plot(dates.num2date(date_num_smooth), value_np_smooth)
plt.show()
It's worth your time looking at seaborn for plotting smoothed lines.
The seaborn lmplot function will plot data and regression model fits.
The following illustrates both polynomial and lowess fits:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
T = np.array([6, 7, 8, 9, 10, 11, 12])
power = np.array([1.53E+03, 5.92E+02, 2.04E+02, 7.24E+01, 2.72E+01, 1.10E+01, 4.70E+00])
df = pd.DataFrame(data = {'T': T, 'power': power})
sns.lmplot(x='T', y='power', data=df, ci=None, order=4, truncate=False)
sns.lmplot(x='T', y='power', data=df, ci=None, lowess=True, truncate=False)
The order = 4 polynomial fit is overfitting this toy dataset. I don't show it here but order = 2 and order = 3 gave worse results.
The lowess = True fit is underfitting this tiny dataset but may give better results on larger datasets.
Check the seaborn regression tutorial for more examples.
Another way to go, which slightly modifies the function depending on the parameters you use:
from statsmodels.nonparametric.smoothers_lowess import lowess
def smoothing(x, y):
    lowess_frac = 0.15  # fraction of the data (0 to 1) used for each local estimate, i.e. the smoothing window
    lowess_it = 0  # number of robustifying iterations
    x_smooth = x
    y_smooth = lowess(y, x, is_sorted=False, frac=lowess_frac, it=lowess_it, return_sorted=False)
    return x_smooth, y_smooth
That was better suited than other answers for my specific application case.
