I am trying to implement Aggregation Pheromone density based classification for a land-use mapping problem. In the paper, the pheromone intensity deposited at x by ant a_j (located at x_j) is given as:
$$T(a_j, x) = \exp\left(-\frac{d(x_j, x)^2}{2\delta^2}\right)$$
where $d(x_j, x)$ is the Euclidean distance between the two points and $\delta$ denotes the spread of the Gaussian function.
I want to know two things: first, what this expression is, and second, how to calculate it.
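In case it helps to see it concretely, the expression is just a Gaussian (radial basis function) kernel centred on the ant's position, so it can be computed directly from the Euclidean distance. Below is a minimal Python sketch; the function name, the example points and the value of delta are illustrative, not from the paper.

```python
import numpy as np

def pheromone_intensity(x_j, x, delta):
    """Pheromone deposited at x by an ant located at x_j.

    This is a Gaussian (RBF) kernel: exp(-d(x_j, x)^2 / (2 * delta^2)),
    where d is the Euclidean distance and delta controls the spread.
    """
    d = np.linalg.norm(np.asarray(x_j) - np.asarray(x))
    return np.exp(-d**2 / (2.0 * delta**2))

# Example: an ant at (1.0, 2.0) depositing pheromone sensed at (1.5, 2.5)
print(pheromone_intensity([1.0, 2.0], [1.5, 2.5], delta=1.0))
```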
I have an experimental dataset of values (y, x1, x2, w), where y is the measured quantity, x1 and x2 are the two independent variables, and w is the error of each measurement.
The function I've chosen to describe my data is
These are my tasks:
1) Estimate values of bi
2) Estimate their standard errors
3) Calculate predicted values of f(x1, x2) on a mesh grid and estimate their confidence intervals
4) Calculate predicted values of
and definite integral
and their confidence intervals on a mesh grid
I have several questions:
1) Can all of my tasks be solved by weighted least squares? I've solved tasks 1-3 using WLS in matrix form by linearising the chosen function, but I have no idea how to solve step 4.
2) I've performed Monte Carlo simulations to estimate the bi and their standard errors. I generated perturbed values y'i from a normal distribution with mean yi and standard deviation wi. I did this N = 5000 times. For each perturbed dataset I estimated b'i, and from the 5000 values of b'i I calculated their means and standard deviations. In the end, the bi estimated from the Monte Carlo simulation coincide with those found by WLS. Am I correct that the standard deviations of the b'i must be divided by the number of degrees of freedom to obtain the standard errors?
3) How do I estimate confidence bands for the predicted values of y using the Monte Carlo approach? I generated a bunch of perturbed bi values from normal distributions using their BLUEs as means and their standard deviations. Then I calculated lots of predicted values of f(x1, x2) and found their means and standard deviations. The values of f(x1, x2) found by WLS and MC coincide, but the standard deviations from MC are larger than those from WLS by a factor on the order of 5-45. What scaling factor am I missing here?
4) It seems that some of the parameters b are not independent of each other, since there are only 2 independent variables. Should I take this into account in question 3 when I generate the bi values? If yes, how can this be done? Should I use a chi-squared test to decide whether generated values of bi are suitable for further calculations, or whether they should be rejected?
In fact, I not only want to solve the tasks mentioned above, but also want to compare the two methods of regression analysis. I would appreciate any help and suggestions!
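To make the setup concrete (not to answer the statistical questions above), here is a minimal Python sketch of the WLS fit in matrix form and the Monte Carlo perturbation described in point 2. The model is a stand-in that is linear in the parameters, f(x1, x2) = b0 + b1*x1 + b2*x2, and the data are synthetic; the actual function and dataset would replace them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data (y, x1, x2, w); the real data and model go here.
n = 50
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 10, n)
w = rng.uniform(0.1, 0.5, n)                      # measurement errors (std dev of each y)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(0, w)  # "measured" values

# Design matrix for a model that is linear in the parameters b
X = np.column_stack([np.ones(n), x1, x2])
W = np.diag(1.0 / w**2)                           # weights = 1 / variance

def wls(X, y, W):
    """Weighted least squares in matrix form: b = (X'WX)^{-1} X'Wy."""
    XtWX = X.T @ W @ X
    b = np.linalg.solve(XtWX, X.T @ W @ y)
    cov_b = np.linalg.inv(XtWX)                   # parameter covariance matrix
    return b, cov_b

b_wls, cov_b = wls(X, y, W)
se_wls = np.sqrt(np.diag(cov_b))                  # standard errors of b

# Monte Carlo as described in point 2: perturb each y_i ~ N(y_i, w_i) and refit
N = 5000
b_mc = np.empty((N, X.shape[1]))
for k in range(N):
    y_pert = y + rng.normal(0, w)
    b_mc[k], _ = wls(X, y_pert, W)

print("WLS b:", b_wls, "s.e.:", se_wls)
print("MC  b:", b_mc.mean(axis=0), "s.d.:", b_mc.std(axis=0, ddof=1))
```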
kNN seems relatively simple to understand: you have your data points and you draw them in your feature space (in a feature space of dimension 2, it's the same as drawing points on an xy plane graph). When you want to classify a new data point, you put it into the same feature space, find the nearest k neighbors, and see what their labels are, ultimately taking the label(s) with the most votes.
So where does probability come into play here? All I am doing is calculating the distance between two points and taking the label(s) of the closest neighbors.
For a new test sample, you look at the K nearest neighbors and at their labels.
You count how many of those K samples fall into each of the classes, and divide the counts by K (the number of neighbors).
For example, let's say that you have 2 classes in your classifier and you use the K = 3 nearest neighbors, and the labels of those 3 nearest samples are (0, 1, 1): the probability of class 0 is 1/3 and the probability of class 1 is 2/3.
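If it helps, scikit-learn's KNeighborsClassifier exposes exactly these vote fractions through predict_proba (with the default uniform weights). The toy data below is made up for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Tiny 2-D toy dataset with two classes (0 and 1)
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = KNeighborsClassifier(n_neighbors=3)  # uniform weights by default
clf.fit(X, y)

# For a new point, the "probabilities" are just the vote fractions
# among the 3 nearest neighbors (counts / K).
print(clf.predict_proba([[4, 4]]))
print(clf.predict([[4, 4]]))
```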
I am trying to understand PCA; I have gone through several tutorials. So far I understand that the eigenvectors of a matrix indicate the directions in which vectors are rotated and scaled when multiplied by that matrix, in proportion to the eigenvalues. Hence the eigenvector associated with the maximum eigenvalue defines the direction of maximum rotation. I understand that along the principal component the variation is maximal and the reconstruction error is minimal. What I do not understand is:
Why does finding the eigenvectors of the covariance matrix correspond to the axes along which the original variables are better defined?
In addition to tutorials, I reviewed other answers here, including this and this, but I still do not understand it.
Your premise is incorrect. PCA (and eigenvectors of a covariance matrix) certainly don't represent the original data "better".
Briefly, the goal of PCA is to find some lower dimensional representation of your data (X, which is in n dimensions) such that as much of the variation is retained as possible. The upshot is that this lower dimensional representation is an orthogonal subspace and it's the best k dimensional representation (where k < n) of your data. We must find that subspace.
Another way to think about this: given a data matrix X find a matrix Y such that Y is a k-dimensional projection of X. To find the best projection, we can just minimize the difference between X and Y, which in matrix-speak means minimizing ||X - Y||^2.
Since Y is just a projection of X onto a lower-dimensional subspace, we can write Y = X*v*v^T, where v*v^T is a lower-rank projection matrix. Google matrix rank if this doesn't make sense. We know X*v has lower dimension than X, but we don't know which direction it points in.
To do that, we find the v such that ||X - X*v*v^T||^2 is minimized. This is equivalent to maximizing ||X*v||^2 = v^T*X^T*X*v, and X^T*X is (up to centering and a constant factor) the sample covariance matrix of your data. This is mathematically why we care about the covariance of the data. It also turns out that the v that does this best is an eigenvector. There is one eigenvector for each dimension in the lower-dimensional projection/approximation, and these eigenvectors are orthogonal.
Remember, if they are orthogonal, then the covariance between any two of them is 0. Now think of a matrix with non-zero entries on the diagonal and zeros off the diagonal. This is the covariance matrix of orthogonal columns, i.e. each column is an eigenvector.
Hopefully that helps bridge the connection between the covariance matrix and how it yields the best lower-dimensional subspace.
Again, eigenvectors don't "better define" our original variables. The axes determined by applying PCA to a dataset are linear combinations of our original variables that exhibit maximum variance and produce the closest possible approximation to our original data (as measured by the L2 norm).
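A small numerical sketch of that last point, assuming centred data and a rank-1 projection: the top eigenvector of the sample covariance matrix gives a larger projected variance and a smaller reconstruction error ||X - X*v*v^T||^2 than an arbitrary direction. The data and variable names below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Correlated 2-D data, centred
X = rng.multivariate_normal([0, 0], [[3.0, 1.5], [1.5, 1.0]], size=500)
X = X - X.mean(axis=0)

# Eigendecomposition of the sample covariance matrix
C = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
v = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue

def recon_error(X, v):
    """||X - X v v^T||^2: reconstruction error of a rank-1 projection onto v."""
    return np.linalg.norm(X - X @ np.outer(v, v))**2

# Compare against a random unit direction: the top eigenvector should give
# the largest projected variance and the smallest reconstruction error.
u = rng.normal(size=2)
u /= np.linalg.norm(u)
print("variance along eigvec:", (X @ v).var(), " recon error:", recon_error(X, v))
print("variance along random:", (X @ u).var(), " recon error:", recon_error(X, u))
```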
My teacher has given me this set of questions as homework, and I don't know if I'm understanding it correctly.
The following customers have rated a number of DVD's as shown in the table. Calculate the similarity figures for these customers using the Euclidean distance method. Then, using the similarity figure as a weighting factor, calculate the weighted average scores for each movie.
Which other customer is most similar to Dave?
What is the similarity score for that customer?
Which movie does this scheme recommend for Dave?
What is the weighted rating for that movie?
Now, to my understanding I need to use coordinates to find the Euclidean distance, since the formula is:
$$d(p, q) = \sqrt{\sum_i (p_i - q_i)^2}$$
And according to my teacher the Similarity formula is:
Do I use the coordinates by row or by column to calculate the Euclidean distance?
Do I need to use another formula to calculate the weighted rating?
Or do I need to use the formula both ways (by row and by column)?
I'm really trying to wrap my head around this.
Am I in the right direction or off completely?
Since you are comparing customers, you will be comparing the columns, not the rows.
Excel has a function SUMXMY2(array_x, array_y), which computes the sum of squared differences between two arrays (e.g. SUMXMY2(DVD_Table[Alice], DVD_Table[Bob])). You can simply take the square root of this to get the Euclidean distance between two customers.
It is not clear to me how the weighted ratings are calculated.
Unless you have further Excel or programming questions related to this, you'd do better posting at https://math.stackexchange.com.
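For what it's worth, the same column-wise calculation is easy to check outside Excel. The sketch below uses made-up ratings and, for the similarity figure, assumes the common convention similarity = 1/(1 + distance); your teacher's formula may differ, so treat that part as a placeholder.

```python
import numpy as np

# Hypothetical ratings: each customer's list holds their scores for the same
# movies, in the same order. NaN means the customer has not rated that movie.
ratings = {
    "Alice": [4.0, 3.0, np.nan, 5.0],
    "Bob":   [5.0, 2.0, 4.0,    np.nan],
    "Dave":  [4.0, 4.0, 5.0,    2.0],
}

def euclidean(a, b):
    """Euclidean distance over the movies both customers have rated."""
    a, b = np.asarray(a), np.asarray(b)
    mask = ~np.isnan(a) & ~np.isnan(b)
    return float(np.sqrt(np.sum((a[mask] - b[mask]) ** 2)))

# One common convention (check it against your teacher's formula) is
# similarity = 1 / (1 + distance), so identical customers get similarity 1.
for name in ("Alice", "Bob"):
    d = euclidean(ratings[name], ratings["Dave"])
    print(name, "distance:", round(d, 3), "similarity:", round(1 / (1 + d), 3))
```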
I'm currently drawing up a mock database schema with two tables: Booking and Waypoint.
Booking stores the taxi booking information.
Waypoint stores the pickup and drop-off points during the journey, along with the lat/lon position. Each sequence is a stop in the journey.
How would I calculate the distance between the different stops in each journey (using the lat/lon data) in Excel?
Is there a way to programmatically define this in Excel, i.e. so that a formula can be placed into the mileage column (Booking table), lookup the matching sequence (via bookingId) for that journey in the Waypoint table and return a result?
Example 1:
A journey with 2 stops:
id | bookingId | sequence | address | lat | lon
1 | 1 | 1 | MK4 4FL, 2, Levens Hall Drive, Westcroft, Milton Keynes | 52.002529 | -0.797623
2 | 1 | 2 | MK2 2RD, 55, Westfield Road, Bletchley, Milton Keynes | 51.992571 | -0.72753
4.1 miles according to Google, entry made in mileage column in Booking table where id = 1
Example 2:
A journey with 3 stops:
id | bookingId | sequence | address | lat | lon
6 | 3 | 1 | MK7 7DT, 2, Spearmint Close, Walnut Tree, Milton Keynes | 52.017486 | -0.690113
7 | 3 | 2 | MK18 1JL, H S B C, Market Hill, Buckingham | 52.000674 | -0.987062
8 | 3 | 1 | MK17 0FE, 1, Maids Close, Mursley, Milton Keynes | 52.040622 | -0.759417
27.7 miles according to Google, entry made in mileage column in Booking table where id = 3
If you want to find the distance between two points, just use this formula; you will get the result in kilometres (convert to miles if needed).
Point A: LAT1, LONG1
Point B: LAT2, LONG2
ACOS(COS(RADIANS(90-Lat1)) *COS(RADIANS(90-Lat2)) +SIN(RADIANS(90-Lat1)) *SIN(RADIANS(90-lat2)) *COS(RADIANS(long1-long2)))*6371
Regards
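Using a formula like this, the "look up the waypoints for a booking and sum the legs" part of the question can also be sketched programmatically. The Python below reuses the spherical-law-of-cosines calculation above (converted to miles), takes the waypoints from the two examples, and assumes the third waypoint of booking 3 should have sequence 3; note these are straight-line distances, not road mileage, so they will not match the Google figures.

```python
import math
from collections import defaultdict

# (id, bookingId, sequence, lat, lon) from the examples in the question
# (addresses omitted; the last waypoint's sequence is assumed to be 3, not 1)
waypoints = [
    (1, 1, 1, 52.002529, -0.797623),
    (2, 1, 2, 51.992571, -0.727530),
    (6, 3, 1, 52.017486, -0.690113),
    (7, 3, 2, 52.000674, -0.987062),
    (8, 3, 3, 52.040622, -0.759417),
]

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Spherical law of cosines (the Excel formula above), converted to miles."""
    km = math.acos(
        math.cos(math.radians(90 - lat1)) * math.cos(math.radians(90 - lat2))
        + math.sin(math.radians(90 - lat1)) * math.sin(math.radians(90 - lat2))
        * math.cos(math.radians(lon1 - lon2))
    ) * 6371
    return km * 0.621371

# Group waypoints by bookingId, order by sequence, then sum the legs
stops_by_booking = defaultdict(list)
for _id, booking_id, seq, lat, lon in waypoints:
    stops_by_booking[booking_id].append((seq, lat, lon))

for booking_id, stops in stops_by_booking.items():
    stops.sort()  # order the stops by sequence
    miles = sum(
        great_circle_miles(a[1], a[2], b[1], b[2])
        for a, b in zip(stops, stops[1:])
    )
    print(f"bookingId {booking_id}: {miles:.1f} straight-line miles")
```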
Until quite recently, accurate maps were constructed by triangulation, which in essence is the application of Pythagoras's theorem: for the distance between any pair of co-ordinates, take the square root of the sum of the square of the difference in x co-ordinates and the square of the difference in y co-ordinates. The x and y co-ordinates must, however, be in the same units (e.g. miles), which involves scaling the latitude and longitude values. This can be complicated because the factor for longitude depends upon latitude (walking all the way around the North Pole is less far than walking around the Equator), but in your case a factor for 52° North should serve. From this, the results (which might be checked here) are around 20% different from the examples you give (in the second case, pairing IDs 6 and 7 and adding that result to the result from pairing IDs 7 and 8).
Since you say accuracy is not important, and assuming the distances are small (say, less than 1000 miles), you can use the loxodromic distance.
For this, compute the difference of latitudes (dlat) and the difference of longitudes (dlon). If there were any chance (unlikely) that you're crossing the 180° meridian, take the result modulo 360° to ensure the difference of longitudes is between -180° and 180°. Also compute the average latitude (alat).
Then compute:
distance= 60*sqrt(dlat^2 + (dlon*cos(alat))^2)
This distance is in nautical miles. Apply conversions as needed.
EXPLANATION: This takes advantage of the fact that one nautical mile is, by definition, always equal to one arc-minute of latitude. The cosine accounts for the fact that meridians get closer to each other as they approach the poles. The rest is just an application of Pythagoras's theorem, which requires that the relevant portion of the globe be treated as flat; that is, of course, only a good approximation for small distances.
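A direct translation of that recipe into Python, as a sketch (angles in degrees; the example reuses the coordinates of the first journey from the question above):

```python
import math

def loxodromic_nm(lat1, lon1, lat2, lon2):
    """Approximate short-range distance in nautical miles:
    60 * sqrt(dlat^2 + (dlon * cos(alat))^2), with dlat/dlon in degrees."""
    dlat = lat2 - lat1
    dlon = (lon2 - lon1 + 180) % 360 - 180   # keep the difference in [-180, 180)
    alat = math.radians((lat1 + lat2) / 2)   # average latitude
    return 60 * math.sqrt(dlat**2 + (dlon * math.cos(alat))**2)

# First example journey (IDs 1 and 2); 1 nautical mile = 1.15078 statute miles
nm = loxodromic_nm(52.002529, -0.797623, 51.992571, -0.727530)
print(f"{nm:.2f} NM = {nm * 1.15078:.2f} statute miles")
```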
It all depends on what the distance is and what accuracy you require. Calculations based on an "Earth is locally flat" model will not give great results for long distances, but for short distances they may be fine. Models assuming the Earth is a perfect sphere (e.g. the haversine formula) give better accuracy, but they still do not produce geodesic-grade results.
See Geodesics on an ellipsoid for more details.
One of the high-accuracy (fraction of a millimetre) solutions is known as Vincenty's formulae. For my Excel VBA implementation, look here: https://github.com/tdjastrzebski/Vincenty-Excel
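For completeness, the haversine formula mentioned above is also straightforward to implement outside Excel; this sketch uses a mean Earth radius of 3958.8 miles, so it still gives only spherical-model accuracy:

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance on a spherical Earth via the haversine formula."""
    r = 3958.8  # mean Earth radius in miles
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# First example journey (IDs 1 and 2)
print(round(haversine_miles(52.002529, -0.797623, 51.992571, -0.727530), 2))
```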