Converting intensities to probabilities in ppp - spatstat

Apologies for the overlap with existing questions; mine is at a more basic skill level. I am working with very sparse occurrences spanning very large areas, so I would like to calculate probability at pixels using the density.ppp function (as opposed to relrisk.ppp, where specifying presences+absences would be computationally intractable). Is there a straightforward way to convert density (intensity) to probabilities at each point?
Maxdist=50
dtruncauchy=function(x,L=60) L/(diff(atan(c(-1,1)*Maxdist/L)) * (L^2 + x^2))
dispersfun=function(x,y) dtruncauchy(sqrt(x^2+y^2))
n=1e3; PPP=ppp(1:n,1:n, c(1,n),c(1,n), marks=rep(1,n));
density.ppp(PPP,cutoff=Maxdist,kernel=dispersfun,at="points",leaveoneout=FALSE) #convert to probabilies?
Thank you!!

I think there is a misunderstanding about fundamentals. The spatstat package is designed mainly for analysing "mapped point patterns", datasets which record the locations where events occurred or things were located. It is designed for "presence-only" data, not "presence/absence" data (with some exceptions).
The relrisk function expects input data about the presence of two different types of events, such as the mapped locations of trees belonging to two different species, and then estimates the spatially-varying probability that a tree will belong to each species.
If you have 'presence-only' data stored in a point pattern object X of class "ppp", then density(X, ....) will produce a pixel image of the spatially-varying intensity (expected number of points per unit area). For example if the spatial coordinates were expressed in metres, then the intensity values are "points per square metre". If you want to calculate the probability of presence in each pixel (i.e. for each pixel, the probability that there is at least one presence point in the pixel), you just need to multiply the intensity value by the area of one pixel, which gives the expected number of points in the pixel. If pixels are small (the usual case) then the presence probability is just equal to this value. For physically larger pixels the probability is 1 - exp(-m) where m is the expected number of points.
Example:
X <- redwood
D <- density(X, 0.2)
pixarea <- with(D, xstep * ystep)
M <- pixarea * D
p <- 1 - exp(-M)
then M and p are images which should be almost equal, and can both be interpreted as probability of presence.
For more information see Chapter 6 of the spatstat book.
If, instead, you had a pixel image of presence/absence data, with pixel values equal to 1 or 0 for presence or absence respectively, then you can just use the function blur in the spatstat package to perform kernel smoothing of the image, and the resulting pixel values are presence probabilities.

Related

Find all the planar surfaces in an rgbd image using depth and normal data

Many questions deal with generating normal from depth or depth from normal, but I want to ask about a simple way to generate all the planar surfaces given the depth and normal of an image.
I already have depth and normal of each pixel in the image. For each pixel (ui, vi), assume that we can get its 3D coordinates (xi, yi, zi) with zi as the depth and normal vector (nix, niy, niz). Thus, a unique tangent plane is defined by: nix(x - xi) + niy(y - yi) + niz(z - zi) = 0. Then, for each pixel we can define a unique planar surface by the above equation.
What is a common practice in finding the function f such that f(u, v) = (x, y, z) (from pixel to 3D coordinates)? Is pinhole model (plus the depth data) an effective and accurate one?
How does one generate all the planar surfaces effectively? One way is to iterate through all the pixels in the image and find all the planes, but this seems like an ineffective method.
If its pinhole model
make sure your 3D data is not distorted by projection.
group your points by normal
this is easy or hard depending on the points/normal accuracy. Simply sort the points by normals which leads to O(n.log(n)) where n is number of points.
test/group by planes in single normal group
The idea is to pick 3 points from a group compute plane from it and test which points of the group belongs to it. If too low count you got wrong points picked (not belonging to the same plane) and need to pick different ones. Also if the picked points are too close to each or on the same line you can not get correct plane from it.
The math function for plane is:
x*nx + y*ny + z*nz + d = 0
where (nx,ny,nz) is your normal of the group (unit vector) and (x,y,z) is your point position. So you just compute d from a known point (one of the picked ones (x0,y0,z0) ) ...
d = -x0*nx -y0*ny -z0*nz
and then just test which points are sattisfying this condition:
threshod=1e-20; // just accuracy margin
fabs(x*nx + y*ny + z*nz + d) <= threshod
now remove matched points from the group (move them into found plane object) and apply this bullet again on the remaining points until they count is low or no valid plane is found...
then test another group until no groups are left...
I think RANSAC can speed things up to avoid brute force in this case but never used it myself so google ...
A possible approach for the planes is to consider the set of normal vectors and perform clustering on them (for instance by k-means). Then every cluster can correspond to several parallel surfaces. By evaluating the distance from the origin (a scalar function), you can form sub-clusters which will separate those surfaces. Finally, points at constant distance can belong to different coplanar patches, which you can separate by connected component labelling.
It is likely that clustering on the normal vectors and distance simultaneously (hence in a 4D space) will yield better results and be simpler. Be sure to normalize the vectors. Another option is to represent the vectors by just two parameters (such as spherical angles), but this will lead to a quite non-uniform mapping, and create phase wrapping issues.

intensity of point process - weights with covariate - spatstat

I am trying spatstat for a specific case. In my shapefile of roads, i have attributes speed and % of heavy vehicles on each road. It is an observation that severe accidents are likely to happen on roads with high speeds and more heavy vehicles (because road is not properly access controlled and pedestrians cross the road). We know that there are accidents at a rate (per 5km stretch).
I would like to generate a random poisson with that rate, but giving weight that the points happen more on roads with high speed ( or high % truck)
and if possible also to include the second variable % of trucks
What is the best way to model the two aspects to make a small proof of concept? I have read (portions of) the spatstat book and section on influence of covariates on intensity, but this is still unclear to me.
Thanks
The spatstat function rpoislpp generates a Poisson random point pattern on the network with a given intensity. In this case, you want a spatially-varying intensity, which can be specified by a function of spatial location. That is, you want something like rpoislpp(f, L) where L is the linear network and f is the intensity function.
I assume you have obtained values of the covariate (like speed limit and fraction of trucks) for each road. Then you need to build a function that looks up these values at any spatial location on the network. Once you have this, you can write the intensity function in terms of it.
To start, suppose you have a network L (object of class linnet). The segments of the network can be indexed in the original order given when you specified them: or you can extract these segments by S <- as.psp(L). We need a vector z giving the covariate values for each of these segments (so this will be a numeric vector of length n=nsegments(S)). Then z[i] is the covariate value along segment i. (Note: if you have covariate values for each road, where a road consists of multiple segments of L, then you first need to figure out which segments of L belong to each road, and construct z.)
Next do the following:
Zfun <- linfun(function(x,y,seg,tp) { z[seg] }, L)
This creates a function on the linear network (class linfun) that evaluates the covariate at any spatial location on L. To check it's built correctly, type plot(Zfun).
Now suppose you want the point process intensity to be lambda = exp(3*Z+2). Then do
lam <- function(x,y,seg,tp) { exp(3 * z[seg] + 2) }
lambda <- linfun(lam, L)
(Needless to say, you can write any mathematical expression in the braces; and you can have more than one covariate, etc.)
Finally generate the random points:
X <- rpoislpp(lambda, L)

Representing classification confidence

I am working on a simple AI program that classifies shapes using unsupervised learning method. Essentially I use the number of sides and angles between the sides and generate aggregates percentages to an ideal value of a shape. This helps me create some fuzzingness in the result.
The problem is how do I represent the degree of error or confidence in the classification? For example: a small rectangle that looks very much like a square would yield night membership values from the two categories but can I represent the degree of error?
Thanks
Your confidence is based on used model. For example, if you are simply applying some rules based on the number of angles (or sides), you have some multi dimensional representation of objects:
feature 0, feature 1, ..., feature m
Nice, statistical approach
You can define some kind of confidence intervals, baesd on your empirical results, eg. you can fit multi-dimensional gaussian distribution to your empirical observations of "rectangle objects", and once you get a new object you simply check the probability of such value in your gaussian distribution, and have your confidence (which would be quite well justified with assumption, that your "observation" errors have normal distribution).
Distance based, simple approach
Less statistical approach would be to directly take your model's decision factor and compress it to the [0,1] interaval. For example, if you simply measure distance from some perfect shape to your new object in some metric (which yields results in [0,inf)) you could map it using some sigmoid-like function, eg.
conf( object, perfect_shape ) = 1 - tanh( distance( object, perfect_shape ) )
Hyperbolic tangent will "squash" values to the [0,1] interval, and the only remaining thing to do would be to select some scaling factor (as it grows quite quickly)
Such approach would be less valid in the mathematical terms, but would be similar to the approach taken in neural networks.
Relative approach
And more probabilistic approach could be also defined using your distance metric. If you have distances to each of your "perfect shapes" you can calculate the probability of an object being classified as some class with assumption, that classification is being performed at random, with probiability proportional to the inverse of the distance to the perfect shape.
dist(object, perfect_shape1) = d_1
dist(object, perfect_shape2) = d_2
dist(object, perfect_shape3) = d_3
...
inv( d_i )
conf(object, class_i) = -------------------
sum_j inv( d_j )
where
inv( d_i ) = max( d_j ) - d_i
Conclusions
First two ideas can be also incorporated into the third one to make use of knowledge of all the classes. In your particular example, the third approach should result in confidence of around 0.5 for both rectangle and circle, while in the first example it would be something closer to 0.01 (depending on how many so small objects would you have in the "training" set), which shows the difference - first two approaches show your confidence in classifing as a particular shape itself, while the third one shows relative confidence (so it can be low iff it is high for some other class, while the first two can simply answer "no classification is confident")
Building slightly on what lejlot has put forward; my preference would be to use the Mahalanobis distance with some squashing function. The Mahalanobis distance M(V, p) allows you to measure the distance between a distribution V and a point p.
In your case, I would use "perfect" examples of each class to generate the distribution V and p is the classification you want the confidence of. You can then use something along the lines of the following to be your confidence interval.
1-tanh( M(V, p) )

Sampling a hemisphere using an arbitary distribtuion

I am writing a ray tracer and I wish to fire rays from a point p into a hemisphere above that point according to some distribution.
1) I have derived a method to uniformly sample within a solid angle (defined by theta) above p Image
phi = 2*pi*X_1
alpha = arccos (1-(1-cos(theta))*X_2)
x = sin(alpha)*cos(phi)
y = sin(alpha)*sin*phi
z = -cos(alpha)
Where X is a uniform random number
That works and Im pretty happy with that. But my question is what happens if I do not want a uniform distribution.
I have used the algorithm on page 27 from here and I can draw samples from a piecewise arbitrary distribution. However if I simply say:
alpha = arccos (1-(1-cos(theta)) B1)
Where B is a random number generated from an arbiatry distribution.
It doesn't behave nicely...What am I doing wrong? Thanks in advance. I really really need help on this
Additional:
Perhaps I am asking a leading question. Taking a step back:
Is there a way to generate points on a hemisphere according to an arbitrary distribution. I have a method for uniformly sampling a hemisphere and one for cosine-weighted hemisphere sampling. (pg 663-669 pbrt.org)
With an uniform distribution, you can just average the sample results and obtain the correct result. This is equivalent to divide each sample result by the sample Probability Density Function (PDF) and, in the case of an uniform distribution, it is just 1 / sample_count (i.e. the same of averaging the results).
With an arbitrary distribution, you have still to divide the sample result by the sample PDF however the PDF now depends on the arbitrary distribution you are using. I assume your error is here.

'Probability' of a K-nearest neighbor like classification

I've a small set of data points (around 10) in a 2D space, and each of them have a category label. I wish to classify a new data point based on the existing data point labels and also associate a 'probability' for belonging to any particular label class.
Is it appropriate to label the new point based on the label to its nearest neighbor( like a K-nearest neighbor, K=1)? For getting the probability I wish to permute all the labels and calculate all the minimum distance of the unknown point and the rest and finding the fraction of cases where the minimum distance is lesser or equal to the distance that was used to label it.
Thanks
The Nearest Neighbour method is already using the Bayes theorem to estimate the probability using the points in a ball containing your chosen K points. There is no need to transform, as the number of points in the ball of K points belonging to each label divided by the total number of points in that ball already is an approximation of the posterior probability of that label. In other words:
P(label|z) = P(z|label)P(label) / P(z) = K(label)/K
This is obtained using the Bayes rule of probability on an estimated probability estimated using a subset of the data. In particular, using:
VP(x) = K/N (this gives you the probability of a point in a ball of volume V)
P(x) = K/NV (from above)
P(x=label) = K(label)/N(label)V (where K(label) and N(label) are the number of points in the ball of that given class and the number of points in the total samples of that class)
and
P(label) = N(label)/N.
Therefore, just pick a K, calculate the distances, count the points and by checking their labels and recounting you will have your probability.
Roweis uses a probabilistic framework with KNN in his publication Neighbourhood Component Analysis. The idea is to use a "soft" nearest neighbour classification, where the probability that a point i uses another point j as its neighbour is defined by
,
where d_ij is the euclidean distance between point i and j.
The are no probabilities for such K-nearest classification method because it is discriminative classification as well as SVM. There are should be used postporcess for learning probabilities on unseen data with generative model like logistic regression.
1. learn K nearest classifier
2. Train logistic regression on distance and average distance to K nearest for validation data.
Check for details LibSVM article.
Sort the distances to the 10 centres; they could be
1 5 6 ... — one near, others far
1 1 1 5 6 ... — 3 near, others far
... lots of possibilities.
You could combine the 10 distances to a single number, e.g. 1 - (nearest / average) ** p,
but that's throwing away information.
(Different powers p makes the hills around the centres steeper or flatter.)
If your centres are really Gaussian hills though, take a look at
Multivariate kernel density estimation.
Added:
There are zillions of functions that go smoothly between 0 and 1,
but that doesn't make them probabilities of something.
"Probability" means either that chance, likelihood, is involved,
as in probability of rain;
or that you're trying to impress somebody.
Added again: scholar.google.com "(single|1) nearest neighbor classifier" gets > 300 hits;
"k nearest neighbor classifier" gets almost 3000.
It seems to me (non-expert) that, out of 10 different ways of mapping k-NN distances to labels,
each one might be better than the 9 others — for some data, with some error measure.
Anyway, you could try asking stats.stackexchange.com ,
The answer is : it depends.
Imagine your labels are the surname of a person, and the X,Y coordinates represent some essential characteristics of the person's DNA sequence. Clearly a more close DNA description enhance the probability of having the same surnames.
Now suppose the X,Y is the lat/long of the work office for that person. Working closer isn't related to label (surname) sharing.
So, it depends on the semantic of your tags and axes.
HTH!

Resources