My Excel spreadsheet contains 500 coordinate points from a 2D space, and I want to find the mode of these points. Estimating the mode of a set of numbers is simple: it is the value that occurs most often in the set. In Excel:
=MODE(A1:A10)
yields the mode of the data in A1 to A10.
However, a coordinate point is a pair of x and y coordinates. Calculating the mode of the x and y coordinates separately can give a wrong answer, because a given x coordinate may be paired with many different y coordinates and vice versa. Is there a formula in Excel to obtain the mode of paired numbers such as 2D coordinate points?
One way is to use a helper column to convert each coordinate pair to a single number and then use MODE on the helper column. The helper column formula would be something like =A5*100000+B5, where 100000 is a multiplier large enough to shift the digits of the first coordinate past the digits of the second. This only gives a unique number per pair if the first coordinate is a whole number and the second coordinate is non-negative and always smaller than the multiplier.
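For readers who want to check the idea outside Excel, here is a minimal Python sketch of the same approach. It counts whole (x, y) pairs directly and also mirrors the helper-column trick. The sample data and the function name mode_of_pairs are made up for illustration, and the multiplier 100000 is taken from the answer above.

```python
from collections import Counter

def mode_of_pairs(points):
    """Return the most frequent (x, y) pair and how often it occurs."""
    counts = Counter(points)  # tuples are hashable, so each pair is counted as a unit
    pair, count = counts.most_common(1)[0]
    return pair, count

# Hypothetical sample data (not from the question)
points = [(1, 2), (3, 4), (1, 2), (1, 4), (3, 4), (1, 2)]
result = mode_of_pairs(points)

# Same idea as the helper column =A5*100000+B5: one number per pair.
# Assumes whole-number x and 0 <= y < 100000, otherwise keys can collide.
helper_keys = [x * 100000 + y for x, y in points]
helper_mode = Counter(helper_keys).most_common(1)[0][0]
decoded = divmod(helper_mode, 100000)  # split the key back into (x, y)
```

Counting the tuples directly avoids the range restriction that the helper-number encoding needs.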
Related
Suppose that we have a scattered series of X,Y data, randomly spaced (in the picture they are ordered, but this doesn't matter), and a line that shows the maximum limit we are considering for a sub-application.
Is there a combination of functions to choose the closest points below the orange line? I've tried MAXIFS + LOOKUP, but it didn't solve anything.
The formula of your line is: y=1.17*x, so you create a helper column, containing a formula like:
=IF(1.17*A3-B3>0;1.17*A3-B3;100000)
This means: calculate the difference between the line and the point if that difference is positive. If it is negative (which means that the point is above the line), show a value so large that it won't be taken into account when calculating the minimum.
Drag this formula down the whole column.
Then calculate the minimum of that column (one easy way to do this is with the autofilter). The row holding that minimum is the point closest to the line from below.
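The helper-column logic above can be sketched in Python as follows. This is only an illustration: the slope 1.17 comes from the answer, while the function name closest_below_line and the sample points are assumptions.

```python
def closest_below_line(points, slope=1.17):
    """Return the (x, y) point with the smallest positive gap to y = slope * x.

    Points on or above the line are skipped, which plays the role of the
    large placeholder value (100000) in the helper-column formula.
    """
    best = None
    best_gap = None
    for x, y in points:
        gap = slope * x - y  # vertical distance from the point up to the line
        if gap > 0 and (best_gap is None or gap < best_gap):
            best, best_gap = (x, y), gap
    return best

# Hypothetical sample data
sample = [(1, 1.0), (2, 2.5), (3, 3.4), (4, 4.0)]
nearest = closest_below_line(sample)
```

If no point lies below the line, the function returns None instead of a sentinel number.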
World Map
I am using Excel and VBA for D&D and have made a world map separating resources between different cells. I am using this to calculate the distance between the towns and the resources, so that I can calculate the price per pound of each resource.
In order to find the distance between two points (the resource and the town) I use this formula:
=SQRT(([#ROW]-$C$2)^2+([#COLUMN]-$D$2)^2)
This finds the hypotenuse between the two points, using the row and column differences as the other two sides of the triangle.
However, I need to go one step further and have a means to tell whether the hypotenuse travels through water tiles or land.
You need a function that gives you a list of cells along your hypotenuse. Then you test each cell to see if it is land or water.
Step 1: Determine which distance is larger, the vertical one or the horizontal one.
Step 2: Divide the smaller distance by the larger distance. This ratio is how far you move in the smaller direction for each unit moved in the larger direction.
Step 3: Run a For loop along the larger distance with a step of 1. On each iteration, add the ratio from step 2 to a running offset in the smaller direction. Then get the cell reference from the current position along the larger direction, combined with the (integer +1) part of the start position plus the accumulated offset along the smaller direction.
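The three steps above can be sketched in Python like this. It is a rough illustration rather than the VBA you would put in the workbook: the function names and the set of water cells are hypothetical, and it rounds to the nearest cell instead of using the "integer +1" rule from step 3.

```python
import math

def cells_on_line(start, end):
    """List the (row, col) cells visited when walking from start to end."""
    r0, c0 = start
    r1, c1 = end
    dr, dc = r1 - r0, c1 - c0
    steps = max(abs(dr), abs(dc))  # step 1: the larger of the two distances
    if steps == 0:
        return [start]
    cells = []
    for i in range(steps + 1):
        # steps 2 and 3: dr/steps and dc/steps are the per-step movements;
        # the longer axis moves by exactly 1, the shorter by the ratio.
        r = math.floor(r0 + i * dr / steps + 0.5)  # round half up
        c = math.floor(c0 + i * dc / steps + 0.5)
        cells.append((r, c))
    return cells

def crosses_water(start, end, water_cells):
    """True if any cell on the path is in the (assumed) set of water cells."""
    return any(cell in water_cells for cell in cells_on_line(start, end))

path = cells_on_line((0, 0), (2, 4))
```

With a path in hand, the land or water test is a lookup per cell, for example crosses_water((0, 0), (2, 4), {(1, 2)}).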
I am trying to plot the envelope (maximum) values of a series of data. What I need is not the running maximum of the y values as x increases, but an envelope or spectrum that joins only the maximum points as x increases.
My data look like:
If I ask for the maximum y values as x increases, I get this (the black line is the maximum of all data as x is ascending):
But I need a line that joins only the successive maximum points up to x=30, and then the descending maximum values from x=30 to x=100. The curve should be smooth: it should not follow every data value, only join one maximum to the next.
The next curve is the envelope, but only to the right of the absolute maximum point. To the left of the absolute maximum, the envelope is not the one I want:
After posting my questions (as comments), I think the following will do what you want (here I'm assuming I understood what you need):
1) At any point along the X axis, you already know how to recognize a maximum,
2) If (1) is correct, you will take into account a maximum (i.e. make it part of the envelope curve) if and only if:
a) All the points to the right are lower than the current maximum, and/or
b) All the points to the left are lower than the current maximum.
Intuitively, this should work.
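As a quick way to try rules (a) and (b) before building them in Excel, here is a small Python sketch. It assumes the data is a list of (x, y) pairs and keeps a point when everything to its left, or everything to its right, is strictly lower. The function name envelope_points and the sample data are made up.

```python
def envelope_points(points):
    """Keep points that are higher than all points to the left or to the right."""
    pts = sorted(points)  # order by x
    ys = [y for _, y in pts]
    n = len(pts)
    # left_max[i] is the highest y before index i; right_max[i] the highest after it
    left_max = [float("-inf")] * n
    right_max = [float("-inf")] * n
    for i in range(1, n):
        left_max[i] = max(left_max[i - 1], ys[i - 1])
    for i in range(n - 2, -1, -1):
        right_max[i] = max(right_max[i + 1], ys[i + 1])
    return [pts[i] for i in range(n)
            if ys[i] > left_max[i] or ys[i] > right_max[i]]

# Hypothetical sample data with its absolute maximum at x=3
data = [(0, 1), (1, 3), (2, 2), (3, 5), (4, 4), (5, 1), (6, 3), (7, 2)]
envelope = envelope_points(data)
```

The result rises to the absolute maximum and then only descends, which is the shape described in the question. Smoothing the joined line is a separate step.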
EDIT:
Assuming that data is arranged in columns, say between B and D and rows 10 to 100, define in cell E10 the following:
=IF(AND(MAX(B10:D10)>MAX(B9:D9),MAX(B10:D10)>MAX(B11:D11)),MAX(B10:D10),"")
This formula returns a value where there is a local maximum in rows 11 to 99 and a blank otherwise. Then drag the formula down to row 100 and voilà!
Note that the first and last point (i.e. rows 10 and 100) might yield a wrong result though. To prevent that, just alter the formula in those two rows.
Hope this is what you were looking for.
I have a large set of XYZ Cartesian points in Excel (some 40k actually) and was looking for a formula or macro to compare every point to every other point to get the distances between them.
The math to get the distance value between two 3D points is:
Distance=SQRT((X2 - X1)^2 + (Y2 - Y1)^2 + (Z2 - Z1)^2)
X1=the X value of the 1st point
X2=the X value of the 2nd point
Y1=the Y value of the 1st point
Y2=the Y value of the 2nd point
etc
Here is an example starting with 10 points:
http://i.imgur.com/U3lchMk.jpg
Would anyone know of a way to build this into Excel so that I can just copy the formula across the page to the horizontal limit? Or would you recommend a better way than using Excel?
As a secondary goal, I want to group the points into clusters that can connect by a distance lower than 2. But if I can accomplish the first goal, I can worry about the second later.
Actually, I was able to come up with the solution with a bit more research: i.imgur.com/9JL5Qni.jpg
=SQRT(((INDIRECT("A"&$D2))-(INDIRECT("A"&E$1)))^2+((INDIRECT("B"&$D2))-(INDIRECT("B"&E$1)))^2+((INDIRECT("C"&$D2))-(INDIRECT("C"&E$1)))^2)
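On the question of whether something other than Excel would be better: 40k points give roughly 800 million distinct pairs, and a worksheet has only 16,384 columns, so the full grid will not fit on one sheet. A small Python sketch of the same distance formula is below. The function name pairwise_distances and the sample points are assumptions, and for the full 40k set you would likely process pairs in batches rather than store them all.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Map each index pair (i, j), i < j, to the 3D distance between the points."""
    return {
        (i, j): math.dist(points[i], points[j])  # sqrt(dx^2 + dy^2 + dz^2)
        for i, j in combinations(range(len(points)), 2)
    }

# Hypothetical sample points
pts = [(0, 0, 0), (1, 2, 2), (3, 4, 0)]
distances = pairwise_distances(pts)

# Pairs closer than 2, as a starting point for the clustering goal
close_pairs = [pair for pair, d in distances.items() if d < 2]
```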
Given a range of numbers, say from [80,240], it is easy to determine how much of that range lies within [100,105]: (105-100)/(240-80) = 5/160 = .03125. Easy.
So now, how much of a Merriam-Webster dictionary lies between umbrella and velvet? Even if we assume uniform distribution of text across the corpus, is there a standard metric for text?
I don't think there is a standard for that. If you had all entries from Merriam-Webster in a sorted array, you could use the first and last positions as the bounds, so you would have a set going from 1 to n. Then you could take the positions of "umbrella" and "velvet", call them x and y, and calculate the fraction as (y - x + 1) / n.
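A minimal Python sketch of that calculation, assuming the entries are available as an alphabetically sorted list and that both words are present in it. The tiny word list here is invented and stands in for the real dictionary.

```python
from bisect import bisect_left

def fraction_between(sorted_words, first, last):
    """Fraction of entries from `first` to `last` inclusive: (y - x + 1) / n."""
    n = len(sorted_words)
    x = bisect_left(sorted_words, first) + 1  # 1-based position of the first word
    y = bisect_left(sorted_words, last) + 1   # 1-based position of the last word
    return (y - x + 1) / n

# Hypothetical stand-in for the dictionary's entries
words = ["apple", "banana", "umbrella", "under", "velvet", "zebra"]
share = fraction_between(words, "umbrella", "velvet")
```

Here "umbrella" is at position 3 and "velvet" at position 5, so the share is 3 of 6 entries.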
That works if you treat words as elements of an ordered set, so that they behave like real numbers. You are basically dividing the distance between two numbers in a set by the distance between the boundaries of the set. Some forms of algebra deal with them differently: when calculating the Levenshtein distance between any two given words, for example, each word is seen as a vector with as many dimensions as it has characters.
You could define the boundaries of your n-dimensional space by using the longest word in Merriam-Webster (hint: it's "pneumonoultramicroscopicsilicovolcanoconiosis", so your space would have 45 dimensions). However, when considering any A-B pair of words, a third word C of intermediate length may or may not be between them, depending on the operations involved in the transformation from A to B.
You'd have to check every word with a length between that of A and B to see whether it is part of the range between A and B. So it's not a matter of simple calculus, and I don't know if this would even be feasible on a regular computer nowadays. And that's just considering Merriam-Webster's close to half a million entries.