R simplify heatmap to pdf - graphics

I want to plot a simplified heatmap that is easier to edit in the scalable vector graphics program I am using (Inkscape). The original heatmap, as produced below, contains lots of rectangles, and I wonder whether they could be merged within each sector to simplify the output PDF file:
nentries=100000
ci=rainbow(nentries)
set.seed(1)
mean=10
## Generate some data (4 factors)
i = data.frame(
a=round(abs(rnorm(nentries,mean-2))),
b=round(abs(rnorm(nentries,mean-1))),
c=round(abs(rnorm(nentries,mean+1))),
d=round(abs(rnorm(nentries,mean+2)))
)
minvalue = 10
# Discretise values to 1 or 0
m0 = matrix(as.numeric(i>minvalue),nrow=nrow(i))
# Remove rows with all zeros
m = m0[rowSums(m0)>0,]
# Reorder with 1,1,1,1 on top
ms =m[order(as.vector(m %*% matrix(2^((ncol(m)-1):0),ncol=1)), decreasing=TRUE),]
rowci = rainbow(nrow(ms))
colci = rainbow(ncol(ms))
colnames(ms)=LETTERS[1:4]
limits=c(which(!duplicated(ms)),nrow(ms))
l=length(limits)
toname=round((limits[-l]+ limits[-1])/2)
freq=(limits[-1]-limits[-l])/nrow(ms)
rn=rep("", nrow(ms))
for(i in toname) rn[i]=paste(colnames(ms)[which(ms[i,]==1)],collapse="")
rn[toname]=paste(rn[toname], ": ", sprintf( "%.5f", freq ), "%")
heatmap(ms,
Rowv=NA,
labRow=rn,
keep.dendro = FALSE,
col=c("black","red"),
RowSideColors=rowci,
ColSideColors=colci
)
dev.copy2pdf(file="/tmp/file.pdf")
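One possible direction, sketched in Python rather than R and meant only as an illustration: the block boundaries computed in limits above already mark runs of identical rows, so each run could be drawn as a single tall rectangle per column instead of one rectangle per row. The helper name merge_rows and the tiny rows list are hypothetical stand-ins for the sorted matrix ms, which has at most 15 distinct row patterns.

```python
from itertools import groupby


def merge_rows(rows):
    """Collapse consecutive identical rows into (pattern, start, height) blocks."""
    blocks, start = [], 0
    for pattern, run in groupby(rows):
        height = sum(1 for _ in run)   # number of rows in this run
        blocks.append((pattern, start, height))
        start += height
    return blocks


# Tiny hypothetical stand-in for the sorted 0/1 matrix `ms`
rows = [(1, 1, 1, 1), (1, 1, 1, 1), (1, 1, 0, 1),
        (0, 0, 0, 1), (0, 0, 0, 1), (0, 0, 0, 1)]

blocks = merge_rows(rows)
for pattern, start, height in blocks:
    # one rectangle per column per block, instead of one per row
    print(pattern, start, height)
```

With the real data this would reduce roughly 100000 rows to a handful of blocks, which is the amount of geometry an editor such as Inkscape would then have to handle.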

Why don't you try RSvgDevice? With it you can save your image as an SVG file, which is much more convenient to edit in Inkscape than a PDF.

I use the Cairo package for producing svg. It's incredibly easy. Here is a much simpler plot than the one you have in your example:
require(Cairo)
CairoSVG(file = "tmp.svg", width = 6, height = 6)
plot(1:10)
dev.off()
Upon opening in Inkscape, you can ungroup the elements and edit as you like.
Example (point moved, swirl added):

I don't think we (the internet) are being clear enough on this one.
Let me start with a successful export example:
png("heatmap.png") # Ruby developers: think of this as opening a `File.open("asdfsd") do |f|` block
heatmap(sample_matrix, Rowv=NA, Colv=NA, col=terrain.colors(256), scale="column", margins=c(5,10))
dev.off()
The dev.off() call reminds me of the end that closes a Ruby block or method: everything drawn between png() and dev.off() is sent to the png device, and the file ends up holding the last plot that was drawn.
For example, if you ran this code:
library(gplots) # provides greenred()
png("heatmap4.png")
heatmap(sample_matrix, Rowv=NA, Colv=NA, col=terrain.colors(32), scale="column", margins=c(5,15))
heatmap(sample_matrix, Rowv=NA, Colv=NA, col=greenred(32), scale="column", margins=c(5,15))
dev.off()
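it would write the second heatmap (the greenred color scheme; I just tested it) to heatmap4.png, just as a Ruby method returns the value of its last line by default. Each heatmap() call starts a new page, and because the file name contains no page-number pattern such as %d, the later page overwrites the earlier one.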


Minimal self-compiling to .pdf Rmarkdown file

I need to compose a simple rmarkdown file, with text, code and the results of the executed code included in the resulting PDF file. I would prefer the source file to be executable and self-sufficient, avoiding the need for a makefile.
This is the best I have been able to achieve, and it is far from good:
#!/usr/bin/env Rscript
library(knitr)
pandoc('hw_ch4.rmd', format='latex')
# TODO: how to NOT print the above commands to the resulting .pdf?
# TODO: how to avoid putting everything from here on in ""s?
# TODO: how to avoid mentioning the file name above?
# TODO: how to render special symbols, such as tilde, miu, sigma?
# Unicode character (U+3BC) not set up for use with LaTeX.
# See the inputenc package documentation for explanation.
# nano hw_ch4.rmd && ./hw_ch4.rmd && evince hw_ch4.pdf
"
4E1. In the model definition below, which line is the likelihood?
A: y_i is the likelihood, based on the expectation and deviation.
4M1. For the model definition below, simulate observed heights from the prior (not the posterior).
A:
```{r}
points <- 10
rnorm(points, mean=rnorm(points, 0, 10), sd=runif(points, 0, 10))
```
4M3. Translate the map model formula below into a mathematical model definition.
A:
```{r}
flist <- alist(
y tilda dnorm( mu , sigma ),
miu tilda dnorm( 0 , 10 ),
sigma tilda dunif( 0 , 10 )
)
```
"
Result:
What I eventually came to use is the following header. At first it seemed neat, but later I realized:
+ it is indeed easy to compile in one step
- it is code duplication
- mixing an executable script and presentation data in one file is a security risk.
Code:
#!/usr/bin/env Rscript
#<!---
library(rmarkdown)
argv <- commandArgs(trailingOnly=FALSE)
fname <- sub("--file=", "", argv[grep("--file=", argv)])
render(fname, output_format="pdf_document")
quit(status=0)
#-->
---
title:
author:
date: "compiled on: `r Sys.time()`"
---
The quit() line is supposed to guarantee that R never executes the rest of the file, which is then treated as data. The <!--- and --> markers turn the executable code into a comment when the file is read as R Markdown. They are, in turn, hidden from R by the leading # characters.

How to draw a horizontal line at a large y-axis integer?

For the following data.dat file:
08:01:59 451206975237005878
08:04:07 451207335040839108
08:05:56 451207643872368805
08:07:49 451207961547842270
08:09:56 451208317883903787
08:10:12 451208364811411904
08:14:09 451209030026853864
08:16:19 451209395116787156
08:17:14 451209552481002386
08:20:22 451210080432357203
08:25:36 451210963309583903
08:30:23 451211772783766177
08:34:04 451212394854723707
08:35:53 451212702239472024
08:48:46 451214876715294857
08:49:56 451215072475511660
08:51:24 451215321890488523
08:52:39 451215533925588479
08:52:42 451215542324801784
08:54:30 451215845971562410
08:55:08 451215951262906948
08:58:30 451216519498960432
I'd like to draw a horizontal line at a specific level (e.g. 451211772783766177). However, the number is too large to process.
Here is my attempt (based on this post):
$ gnuplot -p -e 'set arrow from 451211772783766177 to 451211772783766177; plot "data.dat" using 2 with lines'
which gives the following error:
line 0: warning: integer overflow; changing to floating point
How should I proceed?
I would treat your large number as a constant function and plot it right after your data. Writing it in exponential notation (X.XE+YY means X.X times 10 to the power +YY; more info on format specifiers here) also takes care of the error:
plot "data.dat" using 2 with lines, 4.51211772783766177E17 with lines
Let me know if this works!
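A quick sanity check, sketched in Python and assuming gnuplot falls back to IEEE double precision as the warning suggests: a double cannot hold all 18 digits of this level, so the line is drawn at a slightly rounded value. The variable names are only illustrative; the numbers are taken from the question.

```python
level = 451211772783766177        # the y value from the question
as_float = float(level)           # what a double-precision value can hold
error = int(as_float) - level     # rounding error introduced by the conversion

# range covered by the y values in data.dat (last minus first)
span = 451216519498960432 - 451206975237005878

# doubles near 4.5e17 are spaced 64 apart, so the error is at most 32
print("rounding error:", error)
print("data span:", span)
print("relative to span:", abs(error) / span)
```

The rounding error is at most 32 while the data spans about 9.5e12, so the shift of the horizontal line is far below anything visible in the plot.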

How do I properly reset the clip in cairocffi?

I am trying to write a module for myself for doing some simple drawing with cairocffi to make using the package a bit less cumbersome. However, I seem to have run into some trouble with properly implementing clipping. Specifically, I am having trouble properly resetting the clipping region.
I wrote an example Python script, whose result should be a PostScript file with:
1 red circle (circle_1)
1 black line from the bottom left to the top right of the circle (line1)
1 black line from the top left of the image to the bottom right of the image (line2)
Instead of line2 extending from corner to corner, though, it is still being clipped by the previous call to clip().
Here's the example script:
import cairocffi as cairo
from math import pi
fig_w, fig_h = 237.6, 237.6
test_surf = cairo.PSSurface('test.ps', fig_w, fig_h)
temp_surf = cairo.PSSurface('temp.ps', fig_w, fig_h)
line1 = cairo.Context(temp_surf)
line1.move_to(0, fig_h)
line1.line_to(fig_w, 0)
line1.set_source_rgb(0,0,0)
line1.stroke()
circle_1 = cairo.Context(test_surf)
circle_1.arc(fig_w/2, fig_h/2, fig_w/4, 0, 2*pi)
circle_1.close_path()
circle_1.set_source_rgb(1,0,0)
circle_1.stroke_preserve()
circle_1.set_source_surface(temp_surf)
with circle_1:
    circle_1.clip()
    circle_1.paint()
line2 = cairo.Context(test_surf)
line2.reset_clip()
line2.move_to(0, 0)
line2.line_to(fig_w, fig_h)
line2.set_source_rgb(0,0,0)
line2.stroke()
I'm not really sure what I'm doing wrong. This seems to be how the cairocffi documentation would suggest that it should be done (i.e., see reset_clip() and save()).
If anyone can point out what I'm doing incorrectly, I'd really appreciate it.

Spatially Subsetting Images in batch mode using IDL and ENVI

I would like to spatially subset LANDSAT photos in ENVI using an IDL program. I have over 150 images that I would like to subset, so I'd like to run the program in batch mode (with no interaction). I know how to do it manually, but what command would I use to spatially subset the image via lat/long coordinates in IDL code?
Here is some inspiration, for a single file.
You can do the same for a large number of files by building up
a list of filenames and looping over it.
; define the image to be opened (could be in a loop), I believe it can also be a tif, img...
img_file='path/to/image.hdr'
envi_open_file,img_file,r_fid=fid
if (fid eq -1) then begin
print, 'Error when opening file ',img_file
return
endif
; let's define some coordinates
XMap=[-70.0580916, -70.5006694]
YMap=[-32.6030694, -32.9797194]
; now convert the map coordinates into pixel positions:
; the conversion uses the image's georeferencing, so XMap/YMap must be
; expressed in the image's own map projection
ENVI_CONVERT_FILE_COORDINATES, FID, XF, YF, XMap, YMap
; pixel indices must be integers. Think twice here, maybe you need floor() or ceil()
XF=ROUND(XF)
YF=ROUND(YF)
; read the image
envi_file_query, fid, DIMS=DIMS, NB=NB, NL=NL, NS=NS
pos = lindgen(nb)
; and store it in an array
image=fltarr(NS, NL, NB)
; read each band sequentially
FOR i=0, NB-1 DO BEGIN
image[*,*,i]= envi_get_data(fid=fid, dims=dims, pos=pos[i])
endfor
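; simply crop the data with array indexing, keeping every band;
; sort the pixel ranges first so that start <= end
x0 = min(XF, max=x1)
y0 = min(YF, max=y1)
imagen = image[x0:x1, y0:y1, *]
; subscript ranges are inclusive, hence the +1
nl2 = y1 - y0 + 1
ns2 = x1 - x0 + 1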
; read mapinfo to save it in the final file
map_info=envi_get_map_info(fid=fid)
envi_write_envi_file, imagen, data_type=4, $
descrip = 'cropped', $
map_info = map_info, $
nl=nl2, ns=ns2, nb=nb, r_fid=r_fid, $
OUT_NAME = 'path/to/cropped.hdr'
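To make the coordinate-to-pixel step less opaque, here is a rough Python sketch of the same idea outside ENVI. It assumes a north-up image with square pixels; x_origin, y_origin, pixel_size and both function names are hypothetical and are not part of the ENVI API, which reads the equivalent values from the image header.

```python
def map_to_pixel(x, y, x_origin, y_origin, pixel_size):
    """Convert map coordinates to (column, row) for a north-up image.

    x_origin, y_origin: map coordinates of the upper-left corner (assumed).
    pixel_size: edge length of a square pixel in map units (assumed).
    """
    col = round((x - x_origin) / pixel_size)
    row = round((y_origin - y) / pixel_size)   # rows grow southwards
    return col, row


def crop(image, x_map, y_map, x_origin, y_origin, pixel_size):
    """Crop a row-major image (list of rows) to the box spanned by two corners."""
    cols, rows = zip(*(map_to_pixel(x, y, x_origin, y_origin, pixel_size)
                       for x, y in zip(x_map, y_map)))
    c0, c1 = min(cols), max(cols)   # sort so the corner order does not matter
    r0, r1 = min(rows), max(rows)
    # Python slices exclude the end index, so add 1 to keep both corners
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]


# Hypothetical 5-row by 6-column image, upper-left corner at (100.0, 50.0)
image = [[10 * r + c for c in range(6)] for r in range(5)]
sub = crop(image, x_map=[104.0, 102.0], y_map=[49.0, 47.0],
           x_origin=100.0, y_origin=50.0, pixel_size=1.0)
print(len(sub), len(sub[0]))
```

For the batch case, the same crop would be applied inside a loop over the list of file names.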

plotting 3D bar graph in matlab or excel

I need to plot a 3D bar graph in MATLAB or Excel. I am going to put dates on the x-axis, times on the y-axis and an amount on the z-axis. Each record in the CSV file looks like ...
18-Apr, 21, 139.45
I am not sure how to do this right. Can anyone help me, please? I tried using Excel's pivot chart; however, I could not manipulate the axes or set appropriate spacing between the ticks.
thanks
kaisar
Since the question is lacking details, let me illustrate with an example.
Consider the following code:
%# read file contents: date,time,value
fid = fopen('data.csv','rt');
C = textscan(fid, '%s %s %f', 'Delimiter',',');
fclose(fid);
%# correctly reshape the data, and extract x/y labels
%# num = number of time slots per date (5 in the sample file below)
num = 5;
d = reshape(C{1},num,[]); d = d(1,:);
t = reshape(C{2},num,[]); t = t(:,1);
Z = reshape(C{3},num,[]);
%# plot 3D bars
bar3(Z)
xlabel('date'), ylabel('time'), zlabel('value')
set(gca, 'XTickLabel',d, 'YTickLabel',t)
I ran it on the following data file:
data.csv
18-Apr,00:00,0.85535
18-Apr,03:00,0.38287
18-Apr,06:00,0.084649
18-Apr,09:00,0.73387
18-Apr,12:00,0.33199
19-Apr,00:00,0.83975
19-Apr,03:00,0.37172
19-Apr,06:00,0.82822
19-Apr,09:00,0.17652
19-Apr,12:00,0.12952
20-Apr,00:00,0.87988
20-Apr,03:00,0.044079
20-Apr,06:00,0.68672
20-Apr,09:00,0.73377
20-Apr,12:00,0.43717
21-Apr,00:00,0.37984
21-Apr,03:00,0.97966
21-Apr,06:00,0.39899
21-Apr,09:00,0.44019
21-Apr,12:00,0.15681
22-Apr,00:00,0.32603
22-Apr,03:00,0.31406
22-Apr,06:00,0.8945
22-Apr,09:00,0.24702
22-Apr,12:00,0.31068
23-Apr,00:00,0.40887
23-Apr,03:00,0.70801
23-Apr,06:00,0.14364
23-Apr,09:00,0.87132
23-Apr,12:00,0.083156
24-Apr,00:00,0.46174
24-Apr,03:00,0.030389
24-Apr,06:00,0.7532
24-Apr,09:00,0.70004
24-Apr,12:00,0.21451
25-Apr,00:00,0.6799
25-Apr,03:00,0.55729
25-Apr,06:00,0.85068
25-Apr,09:00,0.55857
25-Apr,12:00,0.90177
26-Apr,00:00,0.41952
26-Apr,03:00,0.35813
26-Apr,06:00,0.48899
26-Apr,09:00,0.25596
26-Apr,12:00,0.92917
27-Apr,00:00,0.46676
27-Apr,03:00,0.25401
27-Apr,06:00,0.43122
27-Apr,09:00,0.70253
27-Apr,12:00,0.40233
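The reshaping step is the heart of the answer: the long list of date,time,value records becomes a grid with one row per time and one column per date. A minimal Python sketch of that idea, using only the first few records of the file above (the names raw, cells and Z are illustrative):

```python
import csv
import io

# A few records in the same date,time,value layout as data.csv
raw = """18-Apr,00:00,0.85535
18-Apr,03:00,0.38287
19-Apr,00:00,0.83975
19-Apr,03:00,0.37172
"""

dates, times, cells = [], [], {}
for date, time, value in csv.reader(io.StringIO(raw)):
    if date not in dates:
        dates.append(date)       # column labels, in order of first appearance
    if time not in times:
        times.append(time)       # row labels, in order of first appearance
    cells[(time, date)] = float(value)

# Z[i][j] is the value at times[i] and dates[j]; missing combinations become None
Z = [[cells.get((t, d)) for d in dates] for t in times]
print(dates, times, Z)
```

Unlike a plain reshape, this lookup does not require every date to have the same number of records in the same order.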
Use MATLAB's CSV reading functions (or write your own) and then use bar3 to display the data.
