Plotly: Pass array to 'fillcolor' argument - colors

I am trying to pass an array of colors to the 'fillcolor' argument to conditionally color the area below a graph.
First, passing a string with the color or rgba code to the 'fillcolor' argument works perfectly fine:
fig.add_trace(
go.Scatter(x=x,
y=y,
mode='lines',
fill='tozeroy',
fillcolor='red',
line=dict(color='rgba(255, 0, 0, 0.9)')))
When passing an array to the 'fillcolor' argument, such as
fig.add_trace(
go.Scatter(x=x,
y=y,
name='price',
mode='lines',
fill='tozeroy',
fillcolor=np.where(y > -5, 'green', 'red'),
line=dict(color='rgba(255, 0, 0, 0.9)')))
leads to this error:
ValueError:
Invalid value of type 'numpy.ndarray' received for the 'fillcolor' property of scatter
Passing an array directly, or passing 'fillcolor=dict(color=array)', does not work either.
Is there a way to pass an array of colors and so fill the area down to the axis conditionally?
Thanks in advance!

As noted in the comments, fillcolor does not accept an array; it is a single-valued attribute of a trace.
To get a multi-color figure, you need one trace (a filled line) for each color.
This solution uses pandas and Plotly Express, with your original code as the setup.
The resulting plot has a marginally different shape because of the way tozeroy works.
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
x = np.linspace(0, 20, 100)
y = np.random.randint(-7, 3, 100)
fig = go.Figure()
fig.add_trace(
go.Scatter(
x=x,
y=y,
mode="lines",
fill="tozeroy",
fillcolor="red",
line=dict(color="rgba(255, 0, 0, 0.9)"),
)
)
fig.show()
# create a dataframe to simplify solution. green and red lines logic as additional columns
df = pd.DataFrame({"x": x, "y": y})
df = df.assign(
y_green=np.where(df["y"] > -5, df["y"], 0), y_red=np.where(df["y"] > -5, 0, df["y"])
)
# create the plot
fig = px.line(
df, x="x", y=["y_green", "y_red"], color_discrete_sequence=["green", "red"]
).for_each_trace(lambda t: t.update(fillcolor=t.line.color, fill="tozeroy"))
fig

Related

How to set axis ticks with non-periodic increments in matplotlib

I have a 2D array representing the efficiency of a process for a given set of parameters A and B. Parameter A, along the columns, changes uniformly from 0 to 225 in increments of one. The problem is with the rows, where parameter B was changed in the following order:
[16 ,18 ,20 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30 ,31 ,32 ,33 ,35 ,40 ,45 ,50 ,55 ,60 ,65 ,70 ,75 ,80 ,85 ,90 ,95 ,100 ,105 ,110 ,115 ,120 ,125]
So even though the row index increases in increments of one, the rows represent a non-uniform increment of parameter B. What I need is to show the values of parameter B on the y-axis. Using axes.set_yticks() does not give me what I am looking for; I understand why, but I do not know how to solve it.
A minimal example:
# Define parameter B values
parb_increment = [16, 18, 20] + list(range(21,34)) + list(range(35,126,5))
print(len(parb_increment))
print(x.shape)
# Figure and axes
figure, axes = plt.subplots(figsize=(10, 8))
# Plotting
im = axes.imshow(x, aspect='auto',
origin="lower",
cmap='Blues',
interpolation='none',
extent=(0, x.shape[1], 0, parb_increment[-1]))
# Unsuccessful trial for yticks
axes.set_yticks(parb_increment, labels=parb_increment)
# Colorbar
cb = figure.colorbar(im, ax=axes)
The previous code gives the figure and output below, and you can see how the ticks are not only misplaced but also start from an incorrect position.
35
(35, 225)
With imshow, every pixel has the same width and height (their ratio is controlled by aspect), so you can't make the row height vary from row to row. The pixel size won't change even if you modify the y-axis ticks, which is why the ticks in your example are misaligned with the rows of pixels.
Therefore, the solution to your problem is to duplicate each row in proportion to the size of its increment.
See the example below:
import numpy as np
import matplotlib.pyplot as plt
# Generate fake data
x = np.random.random((3, 4))
# Create uniform x-ticks and non-uniform y-ticks
x_increment = np.arange(0, x.shape[1]+1, 1)
y_increment = np.arange(0, x.shape[0]+1, 1) * np.arange(0, x.shape[0]+1, 1)
# Plot the data
fig, ax = plt.subplots(figsize=(6, 10))
img = ax.imshow(
x,
extent=(
0, x.shape[1], 0, y_increment[-1]
)
)
fig.colorbar(img, ax=ax)
ax.set_xlim(0, x.shape[1])
ax.set_xticks(x_increment)
ax.set_ylim(0, y_increment[-1])
ax.set_yticks(y_increment);
This replicates your problem and produces the following outcome.
The solution
First, determine the number of repeats of each row in the array:
nr_of_repeats_per_row = np.diff(y_increment)
nr_of_repeats_per_row = nr_of_repeats_per_row[::-1]
You need to reverse the order because the top row of the image is the first row of the array, while the differences in y_increment start from the bottom of the axis, i.e. from the last row of the array.
Now you can repeat each row in the array a specific number of times:
x_extended = np.repeat(x, nr_of_repeats_per_row, axis=0)
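To see what the repeat step does, here is a small check on an array whose rows are easy to tell apart. It is only a sketch: the 3x4 integer array stands in for the random data above (an assumption for illustration), and y_increment is the same [0, 1, 4, 9] sequence.

```python
import numpy as np

# Stand-in data (assumption): row i is filled with the value i
x = np.repeat(np.arange(3), 4).reshape(3, 4)
y_increment = np.arange(4) ** 2                      # array([0, 1, 4, 9])

# Heights of the y-intervals from bottom to top, reversed to match array order
nr_of_repeats_per_row = np.diff(y_increment)[::-1]   # array([5, 3, 1])

x_extended = np.repeat(x, nr_of_repeats_per_row, axis=0)
print(x_extended.shape)    # (9, 4)
print(x_extended[:, 0])    # [0 0 0 0 0 1 1 1 2]
```

The first array row (drawn at the top by imshow) now occupies 5 of the 9 pixel rows, matching the interval from 4 to 9 on the y-axis.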
Replot with the x_extended:
fig, ax = plt.subplots(figsize=(6, 10))
img = ax.imshow(
x_extended,
extent=(
0, x.shape[1], 0, y_increment[-1]
),
interpolation="none"
)
fig.colorbar(img, ax=ax)
ax.set_xlim(0, x.shape[1])
ax.set_xticks(x_increment)
ax.set_ylim(0, y_increment[-1])
ax.set_yticks(y_increment);
And you should get this.

Color Matplotlib Histogram Subplots by a Categorical Variable

I am trying to create histogram subplots whose values I want to color by a second, categorical variable.
A small subset of the data is below
data = {'ift': [0.031967, 0.067416, 0.091275, 0.046852, 0.100406],
'ine': [0.078384, 0.09554, 0.234695, 0.182821, 0.190237],
'ift_out': [1, 1, 0, 1, 0],
'ine_out': [1, 1, 0, 0, 1]}
xyz = pd.DataFrame(data)
xyz
My initial stab at it is also below. A bit stumped on the inclusion of the categorical columns as colors
fig, axs = plt.subplots(nrows=2, ncols=1, sharey=True, tight_layout=True)
axs[0].hist(xyz['ift']) # color = xyz['ift_out']
axs[1].hist(xyz['ine']) # color = xyz['ine_out']
plt.show()
Sample output is attached below
Following @JohanC's answer, I made some changes to my original code as shown below, and that worked the way I wanted.
import matplotlib.pyplot as plt
import seaborn as sns
sns.color_palette("tab10")
sns.set(style="darkgrid")
fig, axs = plt.subplots(nrows=1, ncols=2, tight_layout=True)
g = sns.histplot(data=xyz, x='ift',
hue='ift_out', palette=['skyblue','tomato'], multiple='stack', ax=axs[0])
g = sns.histplot(data=xyz, x='ine',
hue='ine_out', palette=['skyblue','tomato'], multiple='stack', ax=axs[1])
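For comparison, a similar stacked result can be sketched with plain Matplotlib by splitting each column into one list per category and passing the lists to hist with stacked=True. This is a minimal sketch under the assumption that the categories are just 0 and 1, as in the sample data; the loop variable names are made up for illustration.

```python
import matplotlib.pyplot as plt

data = {'ift': [0.031967, 0.067416, 0.091275, 0.046852, 0.100406],
        'ine': [0.078384, 0.09554, 0.234695, 0.182821, 0.190237],
        'ift_out': [1, 1, 0, 1, 0],
        'ine_out': [1, 1, 0, 0, 1]}

fig, axs = plt.subplots(nrows=1, ncols=2, tight_layout=True)
for ax, value_col, flag_col in zip(axs, ['ift', 'ine'], ['ift_out', 'ine_out']):
    # One list of values per category (assumes the categories are 0 and 1)
    groups = [[v for v, f in zip(data[value_col], data[flag_col]) if f == flag]
              for flag in (0, 1)]
    # hist bins all groups on common edges and stacks them bar by bar
    ax.hist(groups, stacked=True, color=['skyblue', 'tomato'], label=['0', '1'])
    ax.set_xlabel(value_col)
    ax.legend(title=flag_col)
```

Call plt.show() to display the figure. The seaborn version is shorter; this variant is mainly useful when seaborn is not available.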

How to visualize a list of strings on a colorbar in matplotlib

I have a dataset like
x = 3,4,6,77,3
y = 8,5,2,5,5
labels = "null","exit","power","smile","null"
Then I use
from matplotlib import pyplot as plt
plt.scatter(x,y)
colorbar = plt.colorbar(labels)
plt.show()
to make a scatter plot, but I cannot get the colorbar to show the labels next to its colors.
How can I do this?
I'm not sure it's a good idea to do this for scatter plots in general (you have the same label for different data points, so maybe just use a legend here?), but a specific solution to what you have in mind might be the following:
from matplotlib import pyplot as plt
# Data
x = [3, 4, 6, 77, 3]
y = [8, 5, 2, 5, 5]
labels = ('null', 'exit', 'power', 'smile', 'null')
# Customize colormap and scatter plot
cm = plt.get_cmap('hsv')  # plt.cm.get_cmap was removed in newer Matplotlib releases
sc = plt.scatter(x, y, c=range(5), cmap=cm)
cbar = plt.colorbar(sc, ticks=range(5))
cbar.ax.set_yticklabels(labels)
plt.show()
This will result in such an output:
The code combines this Matplotlib demo and this SO answer.
Hope that helps!
EDIT: Incorporating the comments, I can only think of some kind of label color dictionary, generating a custom colormap from the colors, and before plotting explicitly grabbing the proper color indices from the labels.
Here's the updated code (I added some additional colors and data points to check scalability):
from matplotlib import pyplot as plt
from matplotlib.colors import LinearSegmentedColormap
import numpy as np
# Color information; create custom colormap
label_color_dict = {'null': '#FF0000',
'exit': '#00FF00',
'power': '#0000FF',
'smile': '#FF00FF',
'addon': '#AAAAAA',
'addon2': '#444444'}
all_labels = list(label_color_dict.keys())
all_colors = list(label_color_dict.values())
n_colors = len(all_colors)
cm = LinearSegmentedColormap.from_list('custom_colormap', all_colors, N=n_colors)
# Data
x = [3, 4, 6, 77, 3, 10, 40]
y = [8, 5, 2, 5, 5, 4, 7]
labels = ('null', 'exit', 'power', 'smile', 'null', 'addon', 'addon2')
# Get indices from color list for given labels
color_idx = [all_colors.index(label_color_dict[label]) for label in labels]
# Customize colorbar and plot
sc = plt.scatter(x, y, c=color_idx, cmap=cm)
c_ticks = np.arange(n_colors) * (n_colors / (n_colors + 1)) + (2 / n_colors)
cbar = plt.colorbar(sc, ticks=c_ticks)
cbar.ax.set_yticklabels(all_labels)
plt.show()
And, the new output:
Finding the correct middle point of each color segment is (still) not good, but I'll leave this optimization to you.
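One way to get the midpoints, assuming the scatter colors are the integer indices 0 to n_colors - 1 (so the colorbar spans that range and is divided into n_colors equal segments): each segment is (n_colors - 1) / n_colors wide, and its center sits half a segment above its lower edge. The helper name segment_centers is hypothetical.

```python
import numpy as np

def segment_centers(n_colors):
    """Midpoints of n_colors equal segments spanning [0, n_colors - 1]."""
    width = (n_colors - 1) / n_colors   # width of one color segment in data units
    return (np.arange(n_colors) + 0.5) * width

c_ticks = segment_centers(6)   # six colors, as in the label_color_dict above
```

The result can be passed as ticks=c_ticks to plt.colorbar in place of the formula above. If not every label index occurs in the data, the colorbar range changes and this assumption no longer holds.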

How to reduce the width of histogram?

I have drawn a histogram of a diagnosis, which I modeled as a Poisson distribution in Python. I need to reduce the width of the rectangles in the output graph.
I have written the following line in Python. I need to add a width-reduction parameter to this line of code.
fig = df['overall_diagnosis'].value_counts(normalize=True).plot(kind='bar',rot=0, color=['b', 'r'], alpha=0.5)
You can control the overall figure size with matplotlib.pyplot.figure; the bars scale with the figure. You can use it like this:
from matplotlib.pyplot import figure
figure(num=None, figsize=(10, 10), dpi=80, facecolor='w', edgecolor='k')
Here is an example of how to do it:
import matplotlib.pyplot as plt
names = ['group_a', 'group_b', 'group_c']
values = [1, 10, 100]
plt.figure(1, figsize=(9, 3))
plt.subplot(131)
plt.bar(names, values)
plt.subplot(132)
plt.scatter(names, values)
plt.subplot(133)
plt.plot(names, values)
plt.suptitle('Categorical Plotting')
plt.show()
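If the goal is narrower bars rather than a different figure size, Matplotlib's bar function takes a width argument (0.8 by default, in axis units where adjacent categories are 1 apart). The sketch below reuses the sample data from the example above; the value 0.4 is an arbitrary choice. As far as I know, pandas forwards a width keyword in the same way for .plot(kind='bar', ...), but check that against your pandas version.

```python
import matplotlib.pyplot as plt

names = ['group_a', 'group_b', 'group_c']
values = [1, 10, 100]

fig, ax = plt.subplots(figsize=(6, 3))
# width < 0.8 makes each bar narrower and leaves more space between bars
bars = ax.bar(names, values, width=0.4, color='b', alpha=0.5)
ax.set_title('Narrower bars via width=0.4')
```

Call plt.show() to display the figure.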

Using matplotlib to represent three variables in two dimensions with colors [duplicate]

I want to make a scatterplot (using matplotlib) where the points are shaded according to a third variable. I've got very close with this:
plt.scatter(w, M, c=p, marker='s')
where w and M are the data points and p is the variable I want to shade with respect to.
However I want to do it in greyscale rather than colour. Can anyone help?
There's no need to manually set the colors. Instead, specify a grayscale colormap...
import numpy as np
import matplotlib.pyplot as plt
# Generate data...
x = np.random.random(10)
y = np.random.random(10)
# Plot...
plt.scatter(x, y, c=y, s=500) # s is the marker size
plt.gray()
plt.show()
Or, if you'd prefer a wider range of colormaps, you can also specify the cmap kwarg to scatter. There are several pre-made grayscale colormaps (e.g. gray, gist_yarg, binary). To use the reversed version of any of them, append "_r" to the name, e.g. gray_r instead of gray.
import matplotlib.pyplot as plt
import numpy as np
# Generate data...
x = np.random.random(10)
y = np.random.random(10)
plt.scatter(x, y, c=y, s=500, cmap='gray')
plt.show()
In matplotlib, grey shades can be given as a string holding a number between 0 and 1.
For example, c = '0.1'.
You can then map your third variable to a value in this range and use it to color your points.
In the following example, I used the y position of each point as the value that determines its color:
from matplotlib import pyplot as plt
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [125, 32, 54, 253, 67, 87, 233, 56, 67]
color = [str(item/255.) for item in y]
plt.scatter(x, y, s=500, c=color)
plt.show()
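If the third variable is not already in the 0-255 range, a min-max rescaling gives values between 0 and 1 for any numeric input. A minimal sketch; the helper name to_grey_strings is hypothetical and not part of Matplotlib.

```python
def to_grey_strings(values):
    """Rescale numbers to [0, 1] and return them as Matplotlib grey strings."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [str((v - lo) / span) for v in values]

y = [125, 32, 54, 253, 67, 87, 233, 56, 67]
color = to_grey_strings(y)
# color[1] is '0.0' (black, the smallest y) and color[3] is '1.0' (white, the largest y)
```

The list can then be passed to plt.scatter(x, y, s=500, c=color) exactly as in the example above.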
Sometimes you may need to set the color based on a categorical x-value. For example, you may have a dataframe with 3 types of variables and some data points, and you want to do the following:
Plot points corresponding to Physical variable 'A' in RED.
Plot points corresponding to Physical variable 'B' in BLUE.
Plot points corresponding to Physical variable 'C' in GREEN.
In this case, you can write a short function that maps the x-values to a list of color names and then pass that list to the plt.scatter command.
import matplotlib.pyplot as plt
x = ['A', 'B', 'B', 'C', 'A', 'B']
y = [15, 30, 25, 18, 22, 13]
# Function to map the colors as a list from the input list of x variables
def pltcolor(lst):
    cols = []
    for item in lst:
        if item == 'A':
            cols.append('red')
        elif item == 'B':
            cols.append('blue')
        else:
            cols.append('green')
    return cols
# Create the colors list using the function above
cols = pltcolor(x)
plt.scatter(x=x, y=y, s=500, c=cols)  # Pass on the list created by the function here
plt.grid(True)
plt.show()
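The same mapping can be written more compactly with a dictionary lookup. A sketch under the same assumptions as the function above (anything that is not 'A' or 'B' is drawn in green); the name color_map is illustrative.

```python
x = ['A', 'B', 'B', 'C', 'A', 'B']

color_map = {'A': 'red', 'B': 'blue'}
# .get falls back to 'green' for any label without an explicit entry
cols = [color_map.get(label, 'green') for label in x]
print(cols)  # ['red', 'blue', 'blue', 'green', 'red', 'blue']
```

The resulting cols list is passed to plt.scatter through c=cols, just like the output of pltcolor.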
