How to make a dataset similar to the murchison data in spatstat for ppm and AUC analysis

I have four datasets: a point pattern of species occurrences, river
and pond polygon layers, and a land cover image. I would like to build
a dataset similar to the murchison dataset from these layers, but I
have not managed to do so.
I need to combine the river and pond polygon layers and the land cover
image with the point pattern of species occurrences. I tried using a
hyperframe, but I am unable to use a distance function from the rivers
or the ponds.
rivers <- readShapeSpatial("river.shp")
ponds <- readShapeSpatial("pond.shp")
fro <- read.table("fro.txt", header=TRUE)
image <- raster("image.tif")
I would like to combine these four files into a single spatstat object
like the murchison data that comes with the spatstat package. If I can
put them in one frame, then ponds, land cover and rivers are the
covariates.
I have used analyst function, but it returns errors saying that the
objects cannot be used as covariates (for example, that x is a list and
cannot be used as a covariate), particularly for ponds and rivers when
I call the dist function.

Why do you need a hyperframe? You refer to the murchison data, and that is
not a hyperframe. It is simply a standard R list (with the extended classes
listof, anylist and solist for better printing and plotting in
spatstat, but the actual data structure is just a plain list).
To recreate the murchison data:
library(spatstat)
P <- murchison$gold # Points
L <- murchison$faults # Lines
W <- murchison$greenstone # Windows
mur <- solist(points = P, lines = L, windows = W)
mur
#> List of spatial objects
#>
#> points:
#> Planar point pattern: 255 points
#> window: rectangle = [352782.9, 682589.6] x [6699742, 7101484] metres
#>
#> lines:
#> planar line segment pattern: 3252 line segments
#> window: rectangle = [352782.9, 682589.6] x [6699742, 7101484] metres
#>
#> windows:
#> window: polygonal boundary
#> enclosing rectangle: [352782.9, 681699.6] x [6706467, 7100804] metres
To use the data in a model they don’t have to be collected in a single list,
but it may be convenient. The following two models are identical:
(mod1 <- ppm(P ~ W))
#> Nonstationary Poisson process
#>
#> Log intensity: ~W
#>
#> Fitted trend coefficients:
#> (Intercept) WTRUE
#> -21.918688 3.980409
#>
#> Estimate S.E. CI95.lo CI95.hi Ztest Zval
#> (Intercept) -21.918688 0.1666667 -22.24535 -21.592028 *** -131.51213
#> WTRUE 3.980409 0.1798443 3.62792 4.332897 *** 22.13252
(mod2 <- ppm(points ~ windows, data = mur))
#> Nonstationary Poisson process
#>
#> Log intensity: ~windows
#>
#> Fitted trend coefficients:
#> (Intercept) windowsTRUE
#> -21.918688 3.980409
#>
#> Estimate S.E. CI95.lo CI95.hi Ztest Zval
#> (Intercept) -21.918688 0.1666667 -22.24535 -21.592028 *** -131.51213
#> windowsTRUE 3.980409 0.1798443 3.62792 4.332897 *** 22.13252
If you insist on a hyperframe, you should have a column for each measured
variable, but hyperframes are primarily used when you have several
replications of an experiment and are not of much use here. The function call is simply:
murhyp <- hyperframe(points = P, lines = L, windows = W)
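The question also asks about using the distance to rivers or ponds as a covariate. As a hedged sketch on the murchison data itself (the name dfault and the rescaling to kilometres are illustrative choices, not part of the answer above), distfun turns a line pattern or window into a distance function that ppm accepts as a covariate. With your own data you would first have to convert the shapefile layers to spatstat psp or owin objects, which is not shown here:

```r
library(spatstat)

# Rescale every component from metres to kilometres (illustrative choice
# that keeps the fitted coefficients on a readable scale).
mur <- lapply(murchison, rescale, s = 1000, unitname = "km")

# Distance to the nearest fault, as a function of location (x, y).
mur$dfault <- distfun(mur$faults)

# Poisson model: greenstone indicator plus distance to the nearest fault.
fit <- ppm(gold ~ greenstone + dfault, data = mur)
coef(fit)

# Area under the ROC curve for the fitted model.
auc(fit)
```

The same pattern should apply to ponds (an owin) or rivers (a psp): wrap each in distfun and add it to the list passed as data.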

Related

Question about array multiplication in JAGS

I am working with race-stratified population estimates and want to integrate race-stratified populations from three different data sources (census, PEP, and ACS). I developed a model that uses information from all three sources to estimate the true population, defined as gamma.ctr for county c, time t and race r (1 = white, 2 = non-white).
The problem is that the PEP data are not race-stratified, so I need a way to estimate race-stratified PEP data.
Previously, I used one of the other two sources (census or ACS) to estimate ethnicity proportions and multiplied the PEP data by these proportions to obtain race-stratified PEP populations as input data for the model.
Now I have decided to do this multiplication within the model, based on ethnicity proportions defined by gamma.ctr (the true population in county c, year t and race r), which is updated by all data sources rather than just one of them.
So I treat the input PEP data as peppop.ct (the population for county c and time t, not race-stratified). Then I define the ethnicity proportion as:
prob[c,t]=gamma.ctr[c,t,1]/(gamma.ctr[c,t,1]+gamma.ctr[c,t,2])
I want to multiply PEP data by these proportions to find race-stratified estimates within the JAGS model:
for (c in 1:Narea){
for (t in 1:nyears){
prob.ct[c,t]<-gamma.ctr[c,t,1]/(gamma.ctr[c,t,1]+gamma.ctr[c,t,2])
peppop.ctr[c,t,1]<-peppop.ct[c,t] * prob.ct[c,t]
peppop.ctr[c,t,2]<-peppop.ct[c,t] * (1-prob.ct[c,t])
}
}
I want to use this peppop.ctr as a response variable later, like this:
for (t in 1:nyears){
peppop.ctr[c,t,r] ~ dnorm(gamma.ctr[c,t,r], taupep.ctr[c,t,r])
}
But I receive this error:
Attempt to redefine node peppop.cpr[1,1,1]
I think the reason for this error is that peppop.ctr is defined twice on the left-hand side, and that the error is caused by redefining peppop.ctr in the line:
peppop.ctr[c,t,1]<-peppop.ct[c,t] * prob.ct[c,t]
Is it possible to help me solve this error? I need to estimate peppop.ctr first and then use these estimates to update the gamma.ctr parameters. Any help is really appreciated.
You can use the zeros trick to both define a variable (e.g., y below) and then also use that variable as the dependent variable in some subsequent analysis. Here's an example:
library(runjags)
x <- rnorm(1000)
y <- 2 + 3 * x + rnorm(1000)
p <- runif(1000, .1, .9)
w <- y*p
z <- y-w
datl <- list(
x=x,
w=w,
z=z,
zeros = rep(0, length(x)),
N = length(x)
)
mod <- "model{
y <- w + z
C <- 10000 # this just has to be large enough to ensure all phi[i]'s > 0
for (i in 1:N) {
L[i] <- dnorm(y[i], mu[i], tau)
mu[i] <- b[1] + b[2]*x[i]
phi[i] <- -log(L[i]) + C
zeros[i] ~ dpois(phi[i])
}
#sig ~ dunif(0, sd(y))
#tau <- pow(sig, -2)
tau ~ dgamma(1,.1)
b[1] ~ dnorm(0, .0001)
b[2] ~ dnorm(0, .0001)
}
"
out <- run.jags(model = mod, data=datl, monitor = c("b", "tau"), n.chains = 2)
summary(out)
#> Lower95 Median Upper95 Mean SD Mode MCerr MC%ofSD
#> b[1] 1.9334538 1.991586 2.051245 1.991802 0.03026722 NA 0.0002722566 0.9
#> b[2] 2.9019547 2.963257 3.023057 2.963190 0.03057184 NA 0.0002744883 0.9
#> tau 0.9939587 1.087178 1.183521 1.087744 0.04845667 NA 0.0004280217 0.9
#> SSeff AC.10 psrf
#> b[1] 12359 -0.010240572 1.0000684
#> b[2] 12405 -0.006480322 0.9999677
#> tau 12817 0.010135609 1.0000195
summary(lm(y ~ x))
#>
#> Call:
#> lm(formula = y ~ x)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -3.2650 -0.6213 -0.0032 0.6528 3.3956
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 1.99165 0.03034 65.65 <2e-16 ***
#> x 2.96340 0.03013 98.34 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.9593 on 998 degrees of freedom
#> Multiple R-squared: 0.9065, Adjusted R-squared: 0.9064
#> F-statistic: 9671 on 1 and 998 DF, p-value: < 2.2e-16
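Why the zeros trick in the model above works can be checked numerically in plain R. A Poisson observation of 0 with mean phi has probability exp(-phi), so with phi = -log(L) + C each zero contributes L * exp(-C), which is the desired likelihood L up to a constant factor. The values below are arbitrary illustrations, and C is kept small here only so that exp(-C) does not underflow to zero:

```r
# Arbitrary illustrative values for one observation.
y <- 1.3
mu <- 2
tau <- 1.5

# JAGS parameterises dnorm by precision tau; R uses the standard deviation.
L <- dnorm(y, mean = mu, sd = 1 / sqrt(tau))

C <- 10                    # any constant that keeps phi positive
phi <- -log(L) + C

# Probability of observing a zero from Poisson(phi).
p0 <- dpois(0, phi)

# Same value as the normal likelihood scaled by the constant exp(-C).
all.equal(p0, L * exp(-C))
```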
Created on 2022-05-14 by the reprex package (v2.0.1)

Combining two point pattern objects in spatstat creating a combined mark

Suppose there are two point patterns in spatstat. I understand we can superimpose these two point patterns to get a single point pattern. Now if there are common coordinates for these patterns but with different marks, then the points will be duplicated when superimposed. Is there a way in spatstat to get a unique set of points while creating a common mark for the coordinates that coincide?
I’m not aware of a built-in solution to do this, so you have to do a bit of
manual work as detailed below.
Load package and make example data with overlapping points:
library(spatstat)
X1 <- cells[1:22]
marks(X1) <- factor("a")
X2 <- cells[20:42]
marks(X2) <- factor("b")
plot(superimpose(X1, X2), main = "")
For each point in X1 find the nearest point in X2:
nn <- nncross(X1, X2)
tail(nn)
#> dist which
#> 17 0.1386110 4
#> 18 0.1802776 5
#> 19 0.1069766 5
#> 20 0.0000000 1
#> 21 0.0000000 2
#> 22 0.0000000 3
id1 <- which(nn$dist==0) ## Tests EXACT equality. Consider small tolerance.
id2 <- nn$which[id1]
Add extra mark level to X1 and assign it to points with duplicates in X2:
levels(marks(X1)) <- c("a", "c")
marks(X1)[20:22] <- factor("c")
X <- superimpose(X1, X2[-id2])
plot(X, main = "")
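The steps above hard-code the indices 20:22 and test for exact equality. A slightly more general sketch, assuming a small tolerance tol (the value 1e-8 and the names m1 and keep2 are illustrative choices), derives the coinciding points from nncross and also guards against the case where no points coincide, since X2[-id2] would return an empty pattern when id2 is empty:

```r
library(spatstat)

# Example data with three shared locations, as above.
X1 <- cells[1:22]
marks(X1) <- factor("a")
X2 <- cells[20:42]
marks(X2) <- factor("b")

tol <- 1e-8                      # assumed tolerance for "same location"
nn <- nncross(X1, X2)
id1 <- which(nn$dist <= tol)     # points of X1 that coincide with a point of X2
id2 <- nn$which[id1]             # their partners in X2

# Give the coinciding points of X1 the combined mark "c".
m1 <- as.character(marks(X1))
m1[id1] <- "c"
marks(X1) <- factor(m1, levels = c("a", "c"))

# Keep only the points of X2 that have no partner in X1.
keep2 <- setdiff(seq_len(npoints(X2)), id2)
X <- superimpose(X1, X2[keep2])
table(marks(X))
```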

How to convert intensities to Probabilities in a point pattern using Spatstat in R?

I have two point pattern (ppp) objects, p1 and p2, containing X and Y points respectively. I have fitted a ppm model (with the location coordinates as independent variables) to p1 and then used it to predict the "intensity" at each of the Y points in p2.
Now I want to get the probability of an event occurring at that point/zone in p2. How can I use the predicted intensities for this purpose?
Can I do this using spatstat?
Are there any other alternatives?
The intensity is the expected number of points per unit area. In small areas (such as pixels) you can just multiply the intensity by the pixel area to get the probability of presence of a point in the pixel.
fit <- ppm(p1, .......)
inten <- predict(fit)
pixarea <- with(inten, xstep * ystep)
prob <- inten * pixarea
This rule is accurate provided the prob values are smaller than about 0.4.
In a larger region W, the expected number of points is the integral of the intensity function over that region:
EW <- integral(inten, domain=W)
The result EW is a numeric value, the expected total number of points in W. To get the probability of at least one point,
P <- 1- exp(-EW)
You can also compute prediction intervals for the number of points, using predict.ppm with argument interval="prediction".
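As a runnable illustration of the steps above, here is a minimal sketch on the built-in cells data. The model formula and the region W are arbitrary choices made only for the example. For a Poisson model the exact pixel probability is 1 - exp(-intensity * area), which is close to the simple product when that product is small:

```r
library(spatstat)

# Illustrative model: log-linear trend in the coordinates.
fit <- ppm(cells ~ x + y)
inten <- predict(fit)                    # pixel image of fitted intensity

# Approximate probability of a point in each pixel.
pixarea <- with(inten, xstep * ystep)
prob <- inten * pixarea

# Compare approximation and exact Poisson value at the largest pixel value.
pmax <- max(prob)
c(approx = pmax, exact = 1 - exp(-pmax))

# Probability of at least one point in a larger region W (arbitrary square).
W <- owin(c(0, 0.1), c(0, 0.1))
EW <- integral(inten, domain = W)        # expected number of points in W
P <- 1 - exp(-EW)
P
```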
Your question, objective and current method are not very clear to me. It
would be helpful if you could provide code and graphics that explain
more clearly what you have done and what you are trying to obtain. If you
cannot share your data, you can use e.g. the built-in dataset chorley as an
example (or simply simulate artificial data):
library(spatstat)
plot(chorley, cols = c(rgb(0,0,0,1), rgb(.8,0,0,.2)))
X <- split(chorley)
X1 <- X$lung
X2 <- X$larynx
mod <- ppm(X1 ~ polynom(x, y, 2))
inten <- predict(mod)
summary(inten)
#> real-valued pixel image
#> 128 x 128 pixel array (ny, nx)
#> enclosing rectangle: [343.45, 366.45] x [410.41, 431.79] km
#> dimensions of each pixel: 0.18 x 0.1670312 km
#> Image is defined on a subset of the rectangular grid
#> Subset area = 315.291058349571 square km
#> Subset area fraction = 0.641
#> Pixel values (inside window):
#> range = [0.002812544, 11.11172]
#> integral = 978.5737
#> mean = 3.103715
plot(inten)
Predicted intensities at the 58 locations in X2:
intenX2 <- predict.ppm(mod, locations = X2)
summary(intenX2)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.1372 4.0025 6.0544 6.1012 8.6977 11.0375
These predicted intensities intenX2[i] say that in a small neighbourhood
around each point X2[i] the estimated number of points from X1 is Poisson
distributed with mean intenX2[i] times the area of the small neighbourhood.
So in fact you have estimated a model where in any small area you have a
probability distribution for any number of points happening in that area. If
you want the distribution in a bigger region you just have to integrate the
intensity over that region.
To get a better answer you have to provide more details about your problem.
Created on 2018-12-12 by the reprex package (v0.2.1)

Is it possible to run spatstat functions on multiple processors?

I am curious to know whether spatstat functions like envelope or the MAD test can be run on multiple processors on one machine to speed up calculations. Is there any document or tutorial on how to do this?
Thank you.
Unfortunately, parallelisation is not an integrated part of spatstat,
but rather left to the user. For envelopes and MAD tests the easiest
option is probably to run envelope with a smaller number of
realisations on each core and then combine the results using
pool.envelope. How to run envelope in parallel may depend on your
setup. A simple possibility is to use parallel::mclapply, which I know
works out of the box on Linux, but much better cross-platform
alternatives are surely available in packages on CRAN:
library(spatstat)
ppplist <- replicate(4, cells, simplify = FALSE)
envlist <- parallel::mclapply(ppplist, spatstat::envelope, savefuns = TRUE, nsim = 10)
envfinal <- do.call(pool, envlist)
envfinal
#> Pointwise critical envelopes for K(r)
#> and observed value for 'X[[i]]'
#> Obtained from 40 simulations of CSR
#> Alternative: two.sided
#> Significance level of pointwise Monte Carlo test: 2/41 = 0.0488
#> .....................................................................
#> Math.label Description
#> r r distance argument r
#> obs hat(K)[obs](r) observed value of K(r) for data pattern
#> theo K[theo](r) theoretical value of K(r) for CSR
#> lo hat(K)[lo](r) lower pointwise envelope of K(r) from simulations
#> hi hat(K)[hi](r) upper pointwise envelope of K(r) from simulations
#> .....................................................................
#> Default plot formula: .~r
#> where "." stands for 'obs', 'theo', 'hi', 'lo'
#> Columns 'lo' and 'hi' will be plotted as shading (by default)
#> Recommended range of argument r: [0, 0.25]
#> Available range of argument r: [0, 0.25]
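Because mclapply relies on forking, it does not run in parallel on Windows. One cross-platform variant, sketched here with a socket cluster from the base parallel package, follows the same idea. The number of workers, the seed and nsim are illustrative values. Each worker needs spatstat loaded, and clusterSetRNGStream gives the workers independent random number streams so that the pooled simulations are not duplicates of each other:

```r
library(spatstat)
library(parallel)

cl <- makeCluster(2)                          # 2 workers: illustrative
invisible(clusterEvalQ(cl, library(spatstat)))
clusterSetRNGStream(cl, iseed = 2024)         # independent random streams

# Each worker computes a small envelope; savefuns = TRUE is needed for pooling.
envlist <- parLapply(cl, 1:2, function(i) {
  envelope(cells, Kest, nsim = 10, savefuns = TRUE, verbose = FALSE)
})
stopCluster(cl)

# Combine the per-worker envelopes into one envelope based on 20 simulations.
envfinal <- do.call(pool, envlist)
envfinal
```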

How to plot contour maps and display average of the mark(associated with x,y) for each contour level

I have a contour map in spatstat generated from the intensity function of a point pattern X (like "location of the trees"). Each (x, y) coordinate in this point pattern is marked with a corresponding value from a third vector (like "diameter of the tree").
(See image; of course, the vertical lines representing the trees can be omitted.)
I would like to display the average of the mark (diameter) in each level of the contour with different colors. Suggestions?
Thanks!
You are effectively asking for a kind of nonparametric regression.
Here is a quick-and-dirty calculation using the function rhohat and demonstrated on the longleaf dataset.
First calculate the intensity estimate:
Z <- density(longleaf)
yielding an image Z. Next treat Z as a covariate in calls to the rhohat command:
f <- rhohat(unmark(longleaf), Z)
and
g <- rhohat(unmark(longleaf), Z, weights=marks(longleaf))
Now take the ratio and plot it:
h <- eval.fv(g/f)
plot(h)
This shows the estimated average tree diameter as a function of the forest density. To apply this function h to the original contours of Z, first convert h to a true function, then evaluate it on Z and plot the result:
H <- as.function(h)
hZ <- eval.im(H(Z))
plot(hZ)
