How to set a different color for every gene and add a legend in chromoMap using RStudio?

I have plotted my data to show gene positions on chromosomes using chromoMap.
These are my input files:
chr file
chr anot file
and this is my script:
library(chromoMap)
col.set = c("purple", "#4CBB17","#0096FF", "blue", "brown")
chr.data <- read.csv("chr_file.csv", header=T)
anno.data <- read.csv("chr_anot.csv", header = T)
chromoMap(list(chr.data),list(anno.data), labels = T, data_based_color_map = T,
data_color = list(c(col.set)))
I got the result like this
output file
I would like to give every gene a different color and add a legend.
Any idea what I should do? Please help me.
Thank you

To give each gene a unique color, add a fifth data column to your annotation data (anno.data) that holds a distinct value per gene, for example the gene names:
anno.data <- cbind.data.frame(anno.data,Symbol=anno.data$element)
Each gene will then get its own color. For the legend, you just need to set the legend argument to TRUE.
Here is the code with some additional configuration options for your plot:
chromoMap(list(chr.data), list(anno.data),
          # labelling arguments
          labels = T,
          label_font = 12,
          label_angle = -55,
          # group annotation arguments
          data_based_color_map = T,
          data_type = "categorical",
          data_colors = list(c(col.set)),
          # for the legend
          legend = T,
          # adjusting the legend along the y-axis
          lg_y = 250,
          # increasing canvas width for the legend
          canvas_width = 600,
          # playing with plot properties
          text_font_size = 12,
          chr_color = c("#d3d3d3"))
Thanks & Regards,
Lakshay(chromoMap developer)

Related

Plot multi label (values) with multi bar chart

I have this issue and hope you can help.
I have this data:
to_stack = pd.DataFrame([['CHILDREN', 0.42806248287201976, 0.0],
['AMT_TOTAL', 165006, 179357],
['SAL', 582065, 703917.0],
['ANNUITY', 26851, 28416]], columns=('Variable','Id','Mean'))
When I run the code below
to_stack.plot.barh(x='Variable', figsize=(12,8), width=.9)
## First loop for the first variable, "Id"
for index, value in enumerate(to_stack['Id']):
    plt.text(value, index, str(value), va='top')
## Second loop for the second variable, "Mean"
for i, val in enumerate(to_stack['Mean']):
    plt.text(val, i, str(val), va='bottom')
I get this result
The values are not well centered on their bars.
I've tried several options of matplotlib's plt.text: ha (center, left, right) and va (top, bottom, baseline), without good results. Sometimes it's even worse and the values overlap each other.
How can we get the values aligned with the bars?
Any ideas are really welcome.
It's better to extract the geometry from the bars themselves and annotate from that. That way, you have more control over where the text appears relative to the bars:
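fig, ax = plt.subplots(figsize=(12, 8))
to_stack.plot.barh(x='Variable', width=.9, ax=ax)
for patch in ax.patches:
    # bar length, bar thickness, and lower edge, all in data coordinates
    w, h = patch.get_width(), patch.get_height()
    y = patch.get_y()
    # put the label at the end of the bar, vertically centered on it
    ax.text(w - 0.1, h/2 + y, f'{w:.3f}', va='center')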
Output:
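A variation on the same idea, as a minimal sketch rather than the answerer's exact method: since the bars here range from below 1 to several hundred thousand, a fixed shift in data units (the 0.1 above) is barely visible, so the label can instead be offset by a few screen points with ax.annotate. The sketch uses plain matplotlib with the values from the question; the bar layout (bar_height, the two barh calls) and the 3-point offset are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Values copied from the question's to_stack frame
variables = ["CHILDREN", "AMT_TOTAL", "SAL", "ANNUITY"]
id_values = [0.42806248287201976, 165006, 582065, 26851]
mean_values = [0.0, 179357, 703917.0, 28416]

fig, ax = plt.subplots(figsize=(12, 8))
bar_height = 0.45  # assumed thickness so the two series sit side by side
positions = list(range(len(variables)))
ax.barh([p - bar_height / 2 for p in positions], id_values,
        height=bar_height, label="Id")
ax.barh([p + bar_height / 2 for p in positions], mean_values,
        height=bar_height, label="Mean")
ax.set_yticks(positions)
ax.set_yticklabels(variables)
ax.legend()

for patch in ax.patches:
    w, h, y = patch.get_width(), patch.get_height(), patch.get_y()
    # Anchor at the bar end and mid-height, then shift 3 points to the right
    ax.annotate(f"{w:.3f}", xy=(w, y + h / 2), xytext=(3, 0),
                textcoords="offset points", va="center")
```

Because the offset is given in points, the gap between bar and label stays the same regardless of the x-axis scale.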

How to label line chart with column from pandas dataframe (from 3rd column values)?

I have a data set I filtered to the following (sample data):
Name Time l
1 1.129 1G-d
1 0.113 1G-a
1 3.374 1B-b
1 3.367 1B-c
1 3.374 1B-d
2 3.355 1B-e
2 3.361 1B-a
3 1.129 1G-a
I got this data after filtering the data frame and converting it to CSV file:
# Assigns the new data frame to "df" with the data from only three columns
header = ['Names','Time','l']
df = pd.DataFrame(df_2, columns = header)
# Sorts the data frame by column "Names" as integers
df.Names = df.Names.astype(int)
df = df.sort_values(by=['Names'])
# Changes the data to match format after converting it to int
df.Time=df.Time.astype(int)
df.Time = df.Time/1000
csv_file = df.to_csv(index=False, columns=header, sep=" " )
Now, I am trying to graph lines for each label column data/items with markers.
I want the column l as my line names (labels) - each as a new line, Time as my Y-axis values and Names as my X-axis values.
So, in this case, I would have 7 different lines in the graph with these labels: 1G-d, 1G-a, 1B-b, 1B-c, 1B-d, 1B-e, 1B-a.
I have done the following so far which is the additional settings, but I am not sure how to graph the lines.
plt.xlim(0, 60)
plt.ylim(0, 18)
plt.legend(loc='best')
plt.show()
I used sns.lineplot, which comes with hue, but I do not want a title on the legend box. Also, in that case, I cannot have markers without adding a new column for style.
I also tried plt.plot, but there I am not sure how to draw more than one line. I can only give x and y values, which creates a single line.
If there's any other source, please let me know below.
Thanks
The final graph I want to have is like the following but with markers:
You can apply a few tweaks to seaborn's lineplot. Using some created data since your sample isn't really long enough to demonstrate:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Create data
np.random.seed(2019)
categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
df = pd.DataFrame({'Name':np.repeat(range(1,11), 10),
'Time':np.random.randn(100).cumsum(),
'l':np.random.choice(categories, 100)
})
# Plot
sns.lineplot(data=df, x='Name', y='Time', hue='l', style='l', dashes=False,
markers=True, ci=None, err_style=None)
# Temporarily removing limits based on sample data
#plt.xlim(0, 60)
#plt.ylim(0, 18)
# Remove seaborn legend title & set new title (if desired)
ax = plt.gca()
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles[1:], labels=labels[1:], title='New Title', loc='best')
plt.show()
To apply markers, you have to specify a style variable. This can be the same as hue.
You likely want to remove dashes, ci, and err_style
To remove the seaborn legend title, you can get the handles and labels, then re-add the legend without the first handle and label. You can also specify the location here and set a new title if desired (or just remove title=... for no title).
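The plt.plot route mentioned in the question can also produce several lines: one plot call per label, each with its own marker and legend entry. The following is a minimal sketch under that assumption, using the eight sample rows from the question; the helper name points_by_label is made up for illustration.

```python
from collections import defaultdict

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# (Names, Time, l) rows copied from the question's sample data
rows = [
    (1, 1.129, "1G-d"), (1, 0.113, "1G-a"), (1, 3.374, "1B-b"),
    (1, 3.367, "1B-c"), (1, 3.374, "1B-d"), (2, 3.355, "1B-e"),
    (2, 3.361, "1B-a"), (3, 1.129, "1G-a"),
]

# Group the (x, y) points by their label
points_by_label = defaultdict(list)
for name, time, label in rows:
    points_by_label[label].append((name, time))

fig, ax = plt.subplots()
for label, points in points_by_label.items():
    xs, ys = zip(*sorted(points))  # sort by x so each line runs left to right
    ax.plot(xs, ys, marker="o", label=label)  # one call per label = one line
ax.set_xlabel("Names")
ax.set_ylabel("Time")
ax.legend(loc="best")  # no legend title unless one is passed explicitly
```

With a DataFrame, the same loop can iterate over df.groupby('l') instead of building the dictionary by hand.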
Edits per comments:
Filtering your data to only a subset of level categories can be done fairly easily via:
categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
df = df.loc[df['l'].isin(categories)]
markers=True will fail if there are too many levels. If you are only interested in marking points for aesthetic purposes, you can simply multiply a single marker by the number of categories you are interested in (which you have already created to filter your data to categories of interest): markers='o'*len(categories).
Alternatively, you can specify a custom dictionary to pass to the markers argument:
points = ['o', '*', 'v', '^']
mult = len(categories) // len(points) + (len(categories) % len(points) > 0)
markers = {key:value for (key, value)
in zip(categories, points * mult)}
This will return a dictionary of category-point combinations, cycling over the marker points specified until each item in categories has a point style.
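The same cycling can be written with itertools.cycle, which avoids computing the multiplier by hand. This is a small equivalent sketch, not a change to the approach above; it reuses the categories and points lists from the answer.

```python
from itertools import cycle

categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
points = ['o', '*', 'v', '^']

# zip stops when categories runs out, so cycle(points) repeats only as needed
markers = dict(zip(categories, cycle(points)))
print(markers)
```

The resulting dictionary can be passed to the markers argument in the same way as the one built above.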

How can I remove an extra line from a figure made by matplotlib?

How can I make two plots in Matplotlib where each plot has a bar chart and a line joining points? I have bar data in the variables nollat and ykkoset, and line data in the variables selnollat and selykkoset. I would like to write two files, each containing one bar chart and one line. The following is the relevant part of the code. The file eka.png looks correct, but toka.png has an extra line (the blue one). How can I remove it?
ax = plt.gca()
alanolla = min(nollat)
alayks = min(ykkoset)
ylanolla = max(nollat)
ylayks = max(ykkoset)
ax.set_ylim([0.9*min(alanolla,alayks),1.1*max(ylanolla,ylayks)])
num_bins = len(nollat)
plt.plot(range(len(selnollat)), selnollat)
plt.bar(range(len(nollat)), nollat, color = 'C1')
plt.savefig('eka.png')
ax.set_ylim([0.9*min(alanolla,alayks),1.1*max(ylanolla,ylayks)])
num_bins = len(ykkoset)
plt.plot(range(len(selykkoset)), selykkoset)
plt.bar(range(len(ykkoset)), ykkoset, color = 'C1')
plt.savefig('toka.png')
Use plt.cla() to clear the contents of the axes after you save your first plot:
...
plt.savefig('eka.png')
plt.cla()
ax.set_ylim([0.9*min(alanolla,alayks),1.1*max(ylanolla,ylayks)])
...
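An alternative to clearing the axes is to give each output file its own figure, so nothing drawn for eka.png can carry over into toka.png. The sketch below assumes that approach; the helper save_bar_and_line and the sample numbers are made up for illustration, and the files are written to a temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def save_bar_and_line(bars, line, path, ylim):
    """Draw one bar chart plus one line on a fresh figure and save it."""
    fig, ax = plt.subplots()  # new figure and axes for every file
    ax.set_ylim(ylim)
    ax.bar(range(len(bars)), bars, color="C1")
    ax.plot(range(len(line)), line)
    fig.savefig(path)
    line_count = len(ax.lines)
    plt.close(fig)  # release the figure once it is written
    return line_count


# Made-up stand-ins for the question's variables
nollat = [3, 5, 4, 6]
ykkoset = [2, 4, 5, 3]
selnollat = [3.5, 4.5, 4.0, 5.5]
selykkoset = [2.5, 3.5, 4.5, 3.0]

# Same shared y-limits as in the question
ylim = (0.9 * min(nollat + ykkoset), 1.1 * max(nollat + ykkoset))

out_dir = tempfile.mkdtemp()
eka_path = os.path.join(out_dir, "eka.png")
toka_path = os.path.join(out_dir, "toka.png")
lines_in_eka = save_bar_and_line(nollat, selnollat, eka_path, ylim)
lines_in_toka = save_bar_and_line(ykkoset, selykkoset, toka_path, ylim)
```

Each call starts from empty axes, so the second file contains only its own line.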

Plot the distance between every two points in 2 D

I have a table with three columns: the first column holds the name of each point, the second column holds numerical data (the mean), and the last column holds the second column plus a fixed number. The following is an example of what the data looks like:
I want to plot this table to get the following figure.
If it is possible, how can I plot it using Microsoft Excel, Python, or R (or Bokeh)?
Alright, I only know how to do this in ggplot2, so I will answer for R here.
This method only works if the data frame is in the format you provided above.
I renamed your columns to Name.of.Method, Mean, Mean.2.2.
Preparation
Loading csv data into R
df <- read.csv('yourdata.csv', sep = ',')
Change the column names (do this if you don't want to change the code below; otherwise you will need to go through each parameter and match it to your column names):
names(df) <- c("Name.of.Method", "Mean", "Mean.2.2")
Method 1 - Using geom_segment()
library(ggplot2)
ggplot() +
  geom_segment(data = df, aes(x = Mean,
                              y = Name.of.Method,
                              xend = Mean.2.2,
                              yend = Name.of.Method))
So as you can see, geom_segment allows us to specify the end position of the line (Hence, xend and yend)
However, it does not look similar to the image you have above.
The line shape seems to represent error bar. Therefore, ggplot provides us with an error bar function.
Method 2 - Using geom_errorbarh()
ggplot(df, aes(y = Name.of.Method, x = Mean)) +
geom_errorbarh(aes(xmin = Mean, xmax = Mean.2.2), linetype = 1, height = .2)
This function is not usually used just to draw a line, but its functionality fits your requirement. We use xmin and xmax to specify the two ends of the line.
The height argument adjusts the height of the caps at both ends of the line.
I would use hbar for this:
from bokeh.io import show, output_file
from bokeh.plotting import figure
output_file("intervals.html")
names = ["SMB", "DB", "SB", "TB"]
p = figure(y_range=names, height=350)  # on Bokeh versions before 2.4, use plot_height=350
p.hbar(y=names, left=[4,3,2,1], right=[6.2, 5.2, 4.2, 3.2], height=0.3)
show(p)
However Whisker would also be an option if you really want whiskers instead of interval bars.
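Since the question also asks about Python in general, here is a minimal matplotlib sketch of the same interval plot, not taken from either answer. It reuses the names and left-hand values from the Bokeh example and assumes the fixed offset is 2.2 (matching the right-hand values there); the end caps drawn with the "|" marker are an illustrative choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

names = ["SMB", "DB", "SB", "TB"]
means = [4, 3, 2, 1]
offset = 2.2  # assumed fixed number added to each mean
ends = [m + offset for m in means]
positions = list(range(len(names)))  # numeric y positions, relabelled below

fig, ax = plt.subplots()
# One horizontal segment per row, from the mean to mean + offset
segments = ax.hlines(y=positions, xmin=means, xmax=ends)
# Small vertical caps at both ends of each segment
ax.plot(means, positions, "|", color="C0", markersize=12)
ax.plot(ends, positions, "|", color="C0", markersize=12)
ax.set_yticks(positions)
ax.set_yticklabels(names)
ax.set_xlabel("Mean")
```

Calling fig.savefig or plt.show afterwards writes or displays the figure.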

Right orientation with z up

How can I make the x, y, z axes look like this?
from visual import *
f = frame()
# Axis
pointer_x = arrow(frame=f, pos=(0,0,0), axis=(40,0,0), shaftwidth=1, color = color.red)
pointer_y = arrow(frame=f, pos=(0,0,0), axis=(0,40,0), shaftwidth=1, color = color.blue)
pointer_z = arrow(frame=f, pos=(0,0,0), axis=(0,0,40), shaftwidth=1, color = color.green)
# Show X,Y,Z labels
label(frame=f,pos=(40,0,0), text='X')
label(frame=f,pos=(0,40,0), text='Y')
label(frame=f,pos=(0,0,40), text='Z')
This code solved it. If there is a better approach, please comment.
# Show it like ECEF
f.rotate(angle=radians(-90),axis=(1,0,0),origin=(0,0,0))
f.rotate(angle=radians(180),axis=(0,1,0),origin=(0,0,0))
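To see where the frame's axes end up, the two rotations can be worked through numerically. This sketch uses only the standard library and assumes that the rotation axes are given in scene coordinates, with the default VPython view of x to the right, y up, and z toward the viewer; the helper names rotate and frame_axis_in_scene are made up for illustration.

```python
import math


def rotate(v, angle_deg, axis):
    """Rotate vector v by angle_deg about a unit axis (Rodrigues' formula)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    kx, ky, kz = axis
    vx, vy, vz = v
    dot = kx * vx + ky * vy + kz * vz
    cross = (ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx)
    # Round away floating-point noise; "+ 0.0" turns -0.0 into 0.0
    return tuple(
        round(vc * c + cr * s + kc * dot * (1 - c), 9) + 0.0
        for vc, cr, kc in zip(v, cross, axis)
    )


def frame_axis_in_scene(v):
    """Apply the two rotations from the code above, in the same order."""
    v = rotate(v, -90, (1, 0, 0))
    return rotate(v, 180, (0, 1, 0))


x_dir = frame_axis_in_scene((1, 0, 0))
y_dir = frame_axis_in_scene((0, 1, 0))
z_dir = frame_axis_in_scene((0, 0, 1))
print("X ->", x_dir)
print("Y ->", y_dir)
print("Z ->", z_dir)
```

Under these assumptions the frame's Z arrow ends up along the scene's y direction (up), Y points toward the viewer, and X points to the left, which is still a right-handed system.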
