What is equivalent of seaborn's hue in Matplotlib? - python-3.x

May I ask what the equivalent of seaborn's hue is in Matplotlib? I have a line of seaborn code that I need to convert to plain Matplotlib. Each box corresponds to one ID in a given time window. How do I create a legend as well? The dataframe dataset_filtered has 3 columns: time_window (x axis), id and LagTime (y axis).
sns.catplot(x='time_window', hue='ID', y='LagTime', data= dataset_filtered, kind="box",showfliers=False)
This is what I have written so far but it is not working...
# Generate a color dictionary using RGB for each Id
colors = []
for num in range(len(unique_id)):
    num = num + 1
    color = (1/num, 1/num, 1/num)
    colors.append(color)
color_dictionary = dict(zip(unique_id, colors))
plt.figure(figsize=(30,15))
for time_window in dataset_filtered.index.unique():
    dataset_plot = dataset_filtered.loc[time_window]
    box = dataset_plot.boxplot('LagTime',patch_artist=True,boxprops=dict(facecolor=color_dictionary['id']),medianprops=dict(color='black'),labels='id')
plt.xlim(-0.5,8)
plt.xticks(np.arange(0.2,10,0.5),dataset_filtered.index.unique())
handles, labels = plt.gca().get_legend_handles_labels()
by_label = OrderedDict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys())
plt.xticklabels()
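Matplotlib has no direct hue argument, so one possible approach is to draw one boxplot call per ID and shift the box positions within each time window. The following is only a minimal sketch under assumptions: the helper name grouped_boxplot and the sample dataframe are made up for illustration, and the data is assumed to be in long format with every ID present in every time window.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.patches import Patch


def grouped_boxplot(df, x, hue, y, group_width=0.8):
    """Draw one box per (x, hue) pair, side by side within each x group."""
    x_levels = list(df[x].unique())
    hue_levels = list(df[hue].unique())
    cmap = plt.get_cmap("tab10")
    color_of = {h: cmap(k % 10) for k, h in enumerate(hue_levels)}
    box_width = group_width / len(hue_levels)

    fig, ax = plt.subplots(figsize=(10, 5))
    for k, h in enumerate(hue_levels):
        sub = df[df[hue] == h]
        # One array of y values per x level for this hue level
        data = [sub.loc[sub[x] == level, y].to_numpy() for level in x_levels]
        # Shift this hue level's boxes inside each group centred on 0, 1, 2, ...
        positions = (np.arange(len(x_levels)) - group_width / 2
                     + box_width * (k + 0.5))
        ax.boxplot(data, positions=positions, widths=0.9 * box_width,
                   patch_artist=True, showfliers=False,
                   boxprops=dict(facecolor=color_of[h]),
                   medianprops=dict(color="black"))

    # One tick per x level, placed at the centre of its group of boxes
    ax.set_xticks(np.arange(len(x_levels)))
    ax.set_xticklabels(x_levels)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    # Build the legend from proxy patches, one per hue level
    handles = [Patch(facecolor=color_of[h], label=str(h)) for h in hue_levels]
    ax.legend(handles=handles, title=hue)
    return fig, ax


# Hypothetical stand-in for dataset_filtered: 3 time windows x 2 IDs
rng = np.random.default_rng(0)
sample = pd.DataFrame({
    "time_window": np.repeat(["w1", "w2", "w3"], 40),
    "ID": np.tile(np.repeat(["a", "b"], 20), 3),
    "LagTime": rng.normal(10, 2, 120),
})
fig, ax = grouped_boxplot(sample, x="time_window", hue="ID", y="LagTime")
```

Calling plt.show() afterwards displays the figure. The legend is built from proxy Patch objects because the artists returned by boxplot do not carry labels of their own.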

Related

Bokeh colorbar, assign a tick to each color

I'm trying to plot a heatmap of a matrix containing some counts (called mat in my code, then df after restructuring it for use with Bokeh). The structure is like this:
X           element 1  element 2  element 3
category 1  0          6          4
category 2  1          7          3
category 3  5          2          10
category 4  0          1          4
With my code I'm using df.value.unique() both for the color mapper and for the ticks, but in the heatmap the colorbar's ticks don't correspond to the colors.
How can I make each tick coincide with exactly one color? I'm quite sure I have to use the CategoricalColorMapper, but with that I get only a white screen. Thank you for the help.
Here's my code:
import pandas as pd
import bokeh.palettes
from bokeh.models import ColorBar, ColumnDataSource, FixedTicker, LinearColorMapper
from bokeh.plotting import figure, show
from bokeh.transform import transform
mat = pd.read_csv("tests/count_50.dat", sep="\t", index_col=0)
mat.index.name = 'MGI_id'
mat.columns.name = 'phen_sys'
#set data as float numbers
mat=mat.astype(float)
#Create a custom palette and add a specific mapper to map color with values
df = mat.stack(dropna=False).rename("value").reset_index()
pal=bokeh.palettes.brewer['YlGnBu'][len(df.value.unique())]
mapper = LinearColorMapper(palette=pal, low=df.value.min(), high=df.value.max(), nan_color = 'gray')
#Define a figure
p = figure(
    plot_width=1280,
    plot_height=800,
    title="Heatmap",
    x_range=list(df.MGI_id.drop_duplicates()),
    y_range=list(df.phen_sys.drop_duplicates()[::-1]),
    tooltips=[('Phenotype system','@phen_sys'),('Gene','@MGI_id'),('Phenotypes','@value')],
    x_axis_location="above",
    output_backend="webgl")
#Create rectangles for heatmap
p.rect(
    x="MGI_id",
    y="phen_sys",
    width=1,
    height=1,
    source=ColumnDataSource(df),
    fill_color=transform('value', mapper))
p.xaxis.major_label_orientation = 45
#Add legend
t = df.value.unique()
t.sort()
color_bar = ColorBar(
    color_mapper=mapper,
    ticker=FixedTicker(ticks=t, desired_num_ticks=len(df.value.unique())),
    label_standoff=6,
    border_line_color=None)
p.add_layout(color_bar, 'right')
show(p)
I found a solution:
I create a factor list by sorting the unique values, then convert both the dataframe values and the factors to strings. With that I create a CategoricalColorMapper instead of the linear one, and the plot is now correct.
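The original code for that solution is not shown here, so the following is only a rough sketch of the idea, not the poster's actual code. The small dataframe is a made-up stand-in for the stacked df from the question, and the palette slicing is my own assumption for handling fewer than three distinct values.

```python
import pandas as pd
from bokeh.models import CategoricalColorMapper
from bokeh.palettes import brewer

# Hypothetical stand-in for the stacked dataframe from the question
df = pd.DataFrame({
    "MGI_id": ["g1", "g1", "g2", "g2"],
    "phen_sys": ["p1", "p2", "p1", "p2"],
    "value": [0.0, 6.0, 1.0, 10.0],
})

# Sort the distinct values numerically first, then turn them into strings,
# because categorical factors are strings
factors = [str(v) for v in sorted(df["value"].dropna().unique())]

# Convert the data column the same way so every value matches a factor;
# missing values stay missing and fall back to nan_color
df["value"] = df["value"].map(lambda v: None if pd.isna(v) else str(v))

# Brewer palettes start at 3 colors, so take at least 3 and trim to size
palette = brewer["YlGnBu"][max(len(factors), 3)][:len(factors)]
mapper = CategoricalColorMapper(palette=palette, factors=factors,
                                nan_color="gray")
```

The mapper can then be passed to transform('value', mapper) in p.rect exactly as in the question. Each factor gets exactly one palette entry, which is what makes ticks and colors line up one to one.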
Your values range from 0 to 10, so the ColorBar goes up to 10. You can change the mapper's high value to 9:
mapper = LinearColorMapper(palette=colors, low=0, high=9, nan_color = 'gray')
Or a ColorBar that goes from 1 to 10:
mapper = LinearColorMapper(palette=colors, low=1, high=10, nan_color = 'gray')

Plotly : How to enable text label in line graph for the last value?

I am trying to build a line graph that shows the value of only the last element of each line, with some nice formatting.
(Image: line graph with no text at end)
With the current approach the text is shown for every element as plain text, which collides with the other lines in the same graph and looks clumsy.
It would be very nice to achieve something like the image below.
(Image: desired line graph with text)
This is now handled through:
legendgroup = d.name
Plot 1: All
Plot 2: Deselect GOOG in the legend and see that the marker disappears as well:
Complete code:
# imports
import pandas as pd
import plotly.express as px
# data
df = px.data.stocks()
df = df.drop('AMZN', axis = 1)
colors = px.colors.qualitative.T10
# plotly
fig = px.line(df,
              x = 'date',
              y = [c for c in df.columns if c != 'date'],
              template = 'plotly_dark',
              color_discrete_sequence = colors,
              title = 'Stocks',
              )
# move legend
fig.layout.legend.x = -0.3
# add traces for annotations and text for end of lines
for i, d in enumerate(fig.data):
    fig.add_scatter(x=[d.x[-1]], y = [d.y[-1]],
                    mode = 'markers+text',
                    text = d.y[-1],
                    textfont = dict(color=d.line.color),
                    textposition='middle right',
                    marker = dict(color = d.line.color, size = 12),
                    legendgroup = d.name,
                    showlegend=False)
fig.show()

How could I edit my code to plot 4D contour something similar to this example in python?

Like many other researchers on Stack Overflow who are trying to plot a contour graph from 4D data (i.e., X, Y, Z and their corresponding value C), I am attempting to plot a 4D contour map from my data. I have tried many of the solutions suggested on Stack Overflow. Of all the plots suggested, this and this were the closest to what I want, but they are still not quite what I need in terms of data interpretation. Here is the ideal plot example: (source)
Here is a subset of the data, which I put on Dropbox. Once the data is downloaded to the directory of the Python file, the following code will work. I have modified this script from this post.
import numpy as np
import pandas as pd
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import matplotlib.tri as mtri
#####Importing the data
df = pd.read_csv('Data_4D_plot.csv')
do_random_pt_example = False;
index_x = 0; index_y = 1; index_z = 2; index_c = 3;
list_name_variables = ['x', 'y', 'z', 'c'];
name_color_map = 'seismic';
if do_random_pt_example:
    number_of_points = 200;
    x = np.random.rand(number_of_points);
    y = np.random.rand(number_of_points);
    z = np.random.rand(number_of_points);
    c = np.random.rand(number_of_points);
else:
    x = df['X'].to_numpy();
    y = df['Y'].to_numpy();
    z = df['Z'].to_numpy();
    c = df['C'].to_numpy();
#end
#-----
# We create triangles that join 3 pt at a time and where their colors will be
# determined by the values of their 4th dimension. Each triangle contains 3
# indexes corresponding to the line number of the points to be grouped.
# Therefore, different methods can be used to define the value that
# will represent the 3 grouped points and I put some examples.
triangles = mtri.Triangulation(x, y).triangles;
choice_calculation_colors = 2;
if choice_calculation_colors == 1: # Mean of the "c" values of the 3 pt of the triangle
    colors = np.mean( [c[triangles[:,0]], c[triangles[:,1]], c[triangles[:,2]]], axis = 0);
elif choice_calculation_colors == 2: # Median of the "c" values of the 3 pt of the triangle
    colors = np.median( [c[triangles[:,0]], c[triangles[:,1]], c[triangles[:,2]]], axis = 0);
elif choice_calculation_colors == 3: # Max of the "c" values of the 3 pt of the triangle
    colors = np.max( [c[triangles[:,0]], c[triangles[:,1]], c[triangles[:,2]]], axis = 0);
#end
#----------
###=====adjust this part for the labeling of the graph
list_name_variables[index_x] = 'X (m)'
list_name_variables[index_y] = 'Y (m)'
list_name_variables[index_z] = 'Z (m)'
list_name_variables[index_c] = 'C values'
# Displays the 4D graphic.
fig = plt.figure(figsize = (15,15));
ax = fig.add_subplot(projection='3d');  # fig.gca(projection=...) was removed in Matplotlib 3.6
triang = mtri.Triangulation(x, y, triangles);
surf = ax.plot_trisurf(triang, z, cmap = name_color_map, shade=False, linewidth=0.2);
surf.set_array(colors); surf.autoscale();
#Add a color bar with a title to explain which variable is represented by the color.
cbar = fig.colorbar(surf, shrink=0.5, aspect=5);
cbar.ax.get_yaxis().labelpad = 15; cbar.ax.set_ylabel(list_name_variables[index_c], rotation = 270);
# Add titles to the axes and a title in the figure.
ax.set_xlabel(list_name_variables[index_x]); ax.set_ylabel(list_name_variables[index_y]);
ax.set_zlabel(list_name_variables[index_z]);
ax.view_init(elev=15., azim=45)
plt.show()
Here would be the output:
Although it looks brilliant, it is not quite what I am looking for (the above contour map example). I have modified the following script from this post in the hope to reach the required graph, however, the chart looks nothing similar to what I was expecting (something similar to the previous output graph). Warning: the following code may take some time to run.
import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import pandas as pd
df = pd.read_csv('Data_4D_plot.csv')
x = df['X'].to_numpy();
y = df['Y'].to_numpy();
z = df['Z'].to_numpy();
cc = df['C'].to_numpy();
# convert to 2d matrices
Z = np.outer(z.T, z)
X, Y = np.meshgrid(x, y)
C = np.outer(cc.T,cc)
# fourth dimension - colormap
# create colormap according to cc-value
color_dimension = C # change to desired fourth dimension
minn, maxx = color_dimension.min(), color_dimension.max()
norm = matplotlib.colors.Normalize(minn, maxx)
m = plt.cm.ScalarMappable(norm=norm, cmap='jet')
m.set_array([])
fcolors = m.to_rgba(color_dimension)
# plot
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in Matplotlib 3.6
ax.plot_surface(X,Y,Z, rstride=1, cstride=1, facecolors=fcolors, vmin=minn, vmax=maxx, shade=False)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
Could the community help me plot a contour figure similar to the example graph (the first image in this post), where the contours are based on the values within the range of C?
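One possible direction, offered only as a hedged sketch, is to draw filled contours of C on several horizontal slices and stack them along z using the zdir and offset arguments of contourf on a 3D axes. The sketch assumes C is available on a regular grid. The field c_field below is made up, and scattered data like the CSV above would first have to be interpolated onto such a grid.

```python
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers '3d' on old Matplotlib)

# Hypothetical regular grid; real scattered data would need interpolation first
x = np.linspace(-1, 1, 30)
y = np.linspace(-1, 1, 30)
z_levels = np.linspace(0, 1, 5)
X, Y = np.meshgrid(x, y)


def c_field(X, Y, z):
    """Made-up scalar field standing in for the measured C values."""
    return np.exp(-(X**2 + Y**2) / (0.2 + z))


# One 2D slice of C per z level, with contour levels shared by all slices
slices = [c_field(X, Y, z0) for z0 in z_levels]
vmin = min(s.min() for s in slices)
vmax = max(s.max() for s in slices)
levels = np.linspace(vmin, vmax, 11)

fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(projection="3d")
for z0, C in zip(z_levels, slices):
    # Draw the filled contours of this slice flat at height z0
    cs = ax.contourf(X, Y, C, zdir="z", offset=z0,
                     levels=levels, cmap="seismic", alpha=0.8)

ax.set_zlim(z_levels.min(), z_levels.max())
ax.set_xlabel("X (m)")
ax.set_ylabel("Y (m)")
ax.set_zlabel("Z (m)")
cbar = fig.colorbar(cs, ax=ax, shrink=0.5, aspect=5)
cbar.set_label("C values")
```

Because every slice uses the same levels, a single colorbar taken from the last contour set describes all of them. Whether this matches the example image depends on how the real data is laid out.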

Data and X axis labels not align

I'm trying to plot each event at its own tick on the x axis. The y axis shows the time of day at which the event occurred and how long it lasted. The first label and its data are plotted correctly. However, the second set of data appears to skip over its major x axis tick and is placed after it, but before the next major tick. This repeats for each additional x axis value plotted. The data itself does not show a problem with which x axis position it should appear at.
I have defined the data (source) below and can reproduce the issue with about 50 lines of code.
from bokeh.io import output_file
from bokeh.models import ColumnDataSource, LabelSet
from bokeh.plotting import figure, show
from bokeh.models.formatters import NumeralTickFormatter
import pandas as pd
import math
output_file("events.html", mode="inline")
x1 = []
y1 = []
x2 = []
y2 = []
colorList = []
shortNames = []
nameAndId = ["Event1", 0]
x1.append(nameAndId)
y1.append(33470)
x2.append(nameAndId)
y2.append(33492)
colorList.append("red")
shortNames.append("Evt1")
nameAndId = ["Event2", 1]
x1.append(nameAndId)
y1.append(34116)
x2.append(nameAndId)
y2.append(34151)
colorList.append("green")
shortNames.append("Evt2")
xAxisLabels = ["Event1", "Event2"]
data = {"x1": x1, "y1": y1, "x2": x2, "y2": y2, "color": colorList,\
        "shortName": shortNames}
eventDF = pd.DataFrame(data=data,
                       columns=("x1", "y1", "x2", "y2", "color",\
                                "shortName"))
source = ColumnDataSource(eventDF)
yRange = [34151, 33470]
p = figure(plot_width=700, plot_height=750, x_range=xAxisLabels,\
           y_range=yRange, output_backend="webgl")
p.xaxis.major_label_orientation = math.pi / -2
p.segment(x0="x1",y0="y1",x1="x2",y1="y2", source=source, color="color",\
          line_width=12)
p.yaxis[0].formatter = NumeralTickFormatter(format="00:00:00")
p.xaxis.axis_label = "Events"
labels = LabelSet(x="x2",y="y2", text="shortName", text_font_size="8pt",\
                  text_color="black", level="glyph", x_offset=-6,\
                  y_offset=-5, render_mode="canvas", angle=270,\
                  angle_units="deg", source=source)
p.add_layout(labels)
show(p)
I think this is something simple I've overlooked, like an x axis formatter. I've tried to define one, but none seem to work for my use case. The data doesn't seem to be associated with the x axis label. I expect Event 1 to show on the first x axis tick and Event 2 on the second. Event 1 is correct, but for each event afterwards a major x axis tick is skipped and the data ends up between tick marks.
The issue in your code is that the actual value for the x-coordinate you are supplying is:
nameAndId = ["Event2", 1]
This kind of list with a category name and a number in a list is understood by Bokeh as a categorical offset. You are explicitly telling Bokeh to position the glyph a distance of 1 (in "synthetic" coordinates) away from the location of "Event2". The reason things "work" for the Event1 case is that the offset in that case is 0:
nameAndId = ["Event1", 0]
I'm not sure what you are trying to accomplish by passing these lists with the second numerical value, so I can't really offer any additional suggestion except to say that it should probably not be passed on to Bokeh.
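As a small illustration of that point, here is a sketch that passes only the plain category names as x-coordinates. The column names x, y_start and y_end are made up for this example, and the figure sizing and label set from the question are left out.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

events = ["Event1", "Event2"]
data = {
    # Plain category names, not ["Event2", 1] style (name, offset) pairs
    "x": events,
    "y_start": [33470, 34116],
    "y_end": [33492, 34151],
    "color": ["red", "green"],
}
source = ColumnDataSource(data)

# Categorical x range; the y range is reversed as in the question
p = figure(x_range=events, y_range=(34151, 33470))
p.segment(x0="x", y0="y_start", x1="x", y1="y_end", source=source,
          color="color", line_width=12)
```

With no offset in the coordinates, each segment should sit directly on the tick of its own category.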

matplotlib set stacked bar chart labels

I am trying to create a stacked bar chart that groups data by operating system. I'm having trouble creating an individual label for each component in each bar.
What I'm trying to do is different from the example in the docs because in my data each category appears in only one bar, whereas in the example each bar contains one member of each category.
Currently I have this code
plt.cla()
plt.clf()
plt.close()
def get_cmap(n, name='hsv'):
    '''Returns a function that maps each index in 0, 1, ..., n-1 to a distinct
    RGB color; the keyword argument name must be a standard mpl colormap name.'''
    return plt.cm.get_cmap(name, n)
fig = plt.figure(figsize=(18, 10), dpi=80)
# group by the prefixes for now
prefixes = []
indices = []
bars = []
legend = {}
cmap = get_cmap(len(os_counts.index) + 1)
k = 0
for i, prefix in enumerate(d):
    indices.append(i)
    if len(d[prefix]["names"]) == 1:
        prefixes.append(d[prefix]["names"][0])
    else:
        prefixes.append(prefix)
    #colors = [next(cycol) for j in range(len(d[prefix]["names"]))]
    colors = [cmap(k + j) for j in range(len(d[prefix]["names"]))]
    k += len(colors)
    bar = plt.bar([i] * len(d[prefix]["names"]), d[prefix]["values"], color=colors, label=d[prefix]["names"])
    bars.append(bar)
plt.xticks(rotation=90)
plt.ylabel("Frequency")
plt.xlabel("Operating System")
plt.xticks(indices, prefixes)
plt.legend()
plt.show()
Which produces this result. As you can see, the legend is created for the first colour within the bar and shows an array.
I think that each call to plt.bar gets one label, so you are giving it a list as the label for each plt.bar call. If you want a label for every color, representing every operating system, then I think the solution is to call plt.bar once for each color or OS.
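A minimal sketch of that suggestion follows, assuming d maps each prefix to parallel lists of names and values as in the question. The sample dictionary is made up, and the bottom argument is my addition so that the components actually stack instead of overlapping.

```python
import matplotlib.pyplot as plt

# Hypothetical stand-in for the question's dict d: prefix -> names and counts
d = {
    "Windows": {"names": ["Windows 7", "Windows 10"], "values": [30, 50]},
    "Linux": {"names": ["Ubuntu"], "values": [20]},
}

n_colors = sum(len(group["names"]) for group in d.values())
cmap = plt.get_cmap("hsv")

fig, ax = plt.subplots(figsize=(8, 5))
k = 0
for i, (prefix, group) in enumerate(d.items()):
    bottom = 0
    for name, value in zip(group["names"], group["values"]):
        # One bar call per component, so each gets its own color and label
        ax.bar(i, value, bottom=bottom, color=cmap(k / n_colors), label=name)
        bottom += value
        k += 1

ax.set_xticks(range(len(d)))
ax.set_xticklabels(list(d), rotation=90)
ax.set_ylabel("Frequency")
ax.set_xlabel("Operating System")
ax.legend()
```

Each call contributes exactly one legend entry, so the legend lists every operating system with its own color.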
