I would like to ask in which order I should add the elements 1,2,3,4,5,6,7 so that the tree is fully balanced and the children of the root node are red.
The order should be 4,2,6,1,5,3,7. The idea is to select the median of the whole set of keys; then, from the first element up to that median, you select another median (say, the left median), and from that median to the last element you select a third (say, the right median). This process continues recursively.
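A minimal sketch of the recursive median idea, assuming the keys are already sorted. The function names median_first_order and bst_height are made up for illustration, and the check uses a plain (uncoloured) binary search tree, so it only shows the shape of the tree and says nothing about which nodes end up red.

```python
from collections import deque


def median_first_order(sorted_keys):
    """Return the keys level by level: median first, then the medians of each half."""
    order = []
    queue = deque([(0, len(sorted_keys))])  # half-open index ranges [lo, hi)
    while queue:
        lo, hi = queue.popleft()
        if lo >= hi:
            continue
        mid = (lo + hi) // 2
        order.append(sorted_keys[mid])
        queue.append((lo, mid))       # left part, before the median
        queue.append((mid + 1, hi))   # right part, after the median
    return order


def bst_height(keys):
    """Insert keys into a plain BST (no rebalancing) and return its height in nodes."""
    root = None  # each node is a list: [key, left_child, right_child]
    height = 0
    for key in keys:
        depth = 1
        if root is None:
            root = [key, None, None]
        else:
            node = root
            while True:
                depth += 1
                side = 1 if key < node[0] else 2
                if node[side] is None:
                    node[side] = [key, None, None]
                    break
                node = node[side]
        height = max(height, depth)
    return height


order = median_first_order([1, 2, 3, 4, 5, 6, 7])
print(order)              # [4, 2, 6, 1, 3, 5, 7]
print(bst_height(order))  # 3
```

The sketch emits 1,3,5,7 on the last level where the answer lists 1,5,3,7. Both give the same tree shape, because each key is inserted after its parent.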
World Map
I am using Excel and VBA for D&D and have made a world map separating resources between different cells. I am using this to calculate the distance between the towns and the resources, so that I can calculate the price per pound of each resource.
In order to find the distance between two points (the resource and the town) I use this formula:
=SQRT(([#ROW]-$C$2)^2+([#COLUMN]-$D$2)^2)
This finds the hypotenuse between the two points, using the columns and rows difference as the other sides of the triangle.
However, I need to go one step further and have a means to tell whether the hypotenuse travels through water tiles or land.
You need a function that gives you a list of cells along your hypotenuse. Then you test each cell to see if it is land or water.
Step 1: Determine the biggest distance vertically or horizontally
Step 2: Divide the smallest distance by the largest distance. This ratio is the distance you move in the smallest direction for each unit of the largest.
Step 3: Run a For loop, with a step of 1, over the largest distance. On each iteration, cumulatively add the ratio from Step 2 to the start position in the smallest direction. Then get the cell reference from the current position in the largest direction and the (integer part + 1) of the start plus the cumulative movement in the direction of the shortest.
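The three steps can be sketched roughly as follows, in Python rather than VBA. The grid layout (a list of strings with "W" for water and "L" for land) and the names cells_on_line and path_crosses_water are assumptions made for this example. It rounds to the nearest cell instead of taking the integer part plus one, which is one of several reasonable choices.

```python
import math


def cells_on_line(r0, c0, r1, c1):
    """List the (row, col) cells visited when walking from (r0, c0) to (r1, c1)."""
    dr, dc = r1 - r0, c1 - c0
    steps = max(abs(dr), abs(dc))  # Step 1: the larger of the two distances
    if steps == 0:
        return [(r0, c0)]
    cells = []
    for i in range(steps + 1):     # Step 3: one iteration per unit of the larger distance
        # Step 2: the shorter direction advances by (shorter / larger) per iteration
        r = r0 + dr * i / steps
        c = c0 + dc * i / steps
        # Round half up to the nearest cell (an assumption, not the only option)
        cells.append((math.floor(r + 0.5), math.floor(c + 0.5)))
    return cells


def path_crosses_water(grid, start, end, water="W"):
    """Return True if any cell on the straight line between start and end is water."""
    return any(grid[r][c] == water for r, c in cells_on_line(*start, *end))


# Hypothetical 3x5 map: "L" is land, "W" is water
world = [
    "LLLLL",
    "LLWLL",
    "LLLLL",
]
print(cells_on_line(0, 0, 2, 4))                   # [(0, 0), (1, 1), (1, 2), (2, 3), (2, 4)]
print(path_crosses_water(world, (0, 0), (2, 4)))   # True
print(path_crosses_water(world, (0, 0), (0, 4)))   # False
```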
I am trying to plot the envelope (maximum) values of a series of data. What I need is not the maximum y-value as x increases, but an envelope or spectrum that joins only the maximum points as x increases.
My data look like:
If I ask for the maximum y-values as the values of the x-axis increase, I will get this one (the black line is the maximum of all data as x is ascending):
But I need a line which joins only the next maximum points till x=30 and then the maximum values, which descend (from x=30 to x=100). The curve I need should be smooth and not follow the values of the data but only join the next maximum.
The next curve is the envelope, but only after the absolute maximum point. To the left of the absolute maximum point the envelope is not the desired one:
After posting my questions (as comments), I think the following will do what you want (here I'm assuming I understood what you need):
1) At any point along the X axis, you already know how to recognize a maximum,
2) If (1) is correct, you will take into account a maximum (i.e. make it part of the envelope curve) if and only if:
a) All the points to the right are lower than the current maximum, and/or
b) All the points to the left are lower than the current maximum.
Intuitively, this should work.
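As a rough sketch of rule (2), here is the same test in Python, assuming the data is a simple list of y-values in order of increasing x. The name envelope_points is made up for this example. A point is kept if it is strictly higher than everything to its left, or strictly higher than everything to its right.

```python
def envelope_points(xs, ys):
    """Keep (x, y) pairs that exceed all points to the left or all points to the right."""
    n = len(ys)
    neg_inf = float("-inf")

    # max_left[i] is the largest y strictly before index i
    max_left = [neg_inf] * n
    for i in range(1, n):
        max_left[i] = max(max_left[i - 1], ys[i - 1])

    # max_right[i] is the largest y strictly after index i
    max_right = [neg_inf] * n
    for i in range(n - 2, -1, -1):
        max_right[i] = max(max_right[i + 1], ys[i + 1])

    return [
        (xs[i], ys[i])
        for i in range(n)
        if ys[i] > max_left[i] or ys[i] > max_right[i]
    ]


xs = list(range(8))
ys = [1, 3, 2, 5, 4, 2, 3, 1]
print(envelope_points(xs, ys))
# [(0, 1), (1, 3), (3, 5), (4, 4), (6, 3), (7, 1)]
```

The kept points rise up to the absolute maximum and fall after it; joining them smoothly is a separate step.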
EDIT:
Assuming that data is arranged in columns, say between B and D and rows 10 to 100, define in cell E10 the following:
=IF(AND(MAX(B10:D10)>MAX(B9:D9),MAX(B10:D10)>MAX(B11:D11)),MAX(B10:D10),"")
This formula returns a value where there is a local maximum (in rows 11 to 99) and a blank otherwise. Then drag the formula down to row 100 and voilà!
Note that the first and last point (i.e. rows 10 and 100) might yield a wrong result though. To prevent that, just alter the formula in those two rows.
Hope this is what you were looking for.
I have a purely theoretical question on KD tree hierarchy.
Let's say that we have two dimensional tree with 'left rule'.
One of tree nodes has two children which should be sorted by X value.
At the same time, both children have the same X value.
So, what should I do in this case?
In my opinion, there are two options. The first is to sort them by the second (Y) value and distribute them according to the 'left rule': the one with the smaller Y value goes to the left, and the one with the bigger Y value goes to the right.
The second option is to find the distances between these child points and their parent and distribute them according to distance: the closest goes to the left, the other to the right.
Maybe this is not a very good illustration, but there could still be an issue of having two nodes as 'median' candidates with the same evaluation parameter, whether X or Y.
In one of the research papers I follow, the authors say "Classes have been derived from scores with a median split".
Can anyone please explain whether this median split is the same as the median? Thank you :)
A median split is when a set of elements is dichotomised (i.e. split into two) according to the statistical median (50th percentile). One group will contain all elements greater than the median and the other group will contain all elements less than the median.
So if you have a series of numbers (e.g. from 1 to 6) and you do a median split (with the median being 3.5 on this occasion) you will essentially split the series of numbers into two groups:
Group a would be 1, 2, 3
and
Group b would be 4, 5, 6
You can see another example here:
For example, we administer a scale of Optimism, and then use a median-split to label the people above the median as Optimists, and those below the median as Pessimists.
Essentially, the median split is not the median itself but it is a technique that uses the median to perform a split on a set of elements.
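A small sketch of the technique in Python, using the standard statistics module. The name median_split is made up, and putting values exactly equal to the median into the lower group is an assumption; the description above does not say where ties go.

```python
from statistics import median


def median_split(scores):
    """Split scores into (low, high) groups around the median.

    Assumption: values equal to the median are placed in the low group.
    """
    m = median(scores)
    low = [s for s in scores if s <= m]
    high = [s for s in scores if s > m]
    return low, high


print(median([1, 2, 3, 4, 5, 6]))        # 3.5
print(median_split([1, 2, 3, 4, 5, 6]))  # ([1, 2, 3], [4, 5, 6])
```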
I have a sorted array of real values, say X, drawn from some unknown distribution. I would like to draw a box plot for this data.
In the simplest case, I need to know five values: min, Q1, median, Q3, and max.
Trivially, min = X[0], max = X[length(X)-1], and possibly median = X[floor(length(X)/2)] (for an odd number of elements). But I'm wondering how to determine the lower quartile Q1 and the upper quartile Q3.
When I plot X = [1,2,4] using MATLAB, I obtain the following result:
It seems to me like there is some magic how to obtain the values Q1 = 1.25 and Q3 = 3.5, but I don't know what the magic is. Does anybody have experience with this?
If you go to the original definition of box plots (look up John Tukey), you use the median for the midpoint (i.e., 2 in your data set of 1, 2, 4). The endpoints are the min and max.
The top and bottom of the box are not exactly defined by quartiles, instead they are called "hinges". Hinges are the medians of the top and bottom halves of the data. If there is an odd number of observations, the median of the entire set is used in determining both hinges. The lower hinge is the median of (1,2), or 1.5. The top hinge is the median of (2,4), or 3.
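The hinge rule described above can be sketched in Python like this. The name tukey_hinges is made up for the example, and it relies only on the standard statistics.median function.

```python
from statistics import median


def tukey_hinges(values):
    """Return (lower_hinge, upper_hinge) as the medians of the two halves.

    With an odd number of observations, the overall median belongs to both halves.
    """
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2        # size of each half, counting the median when n is odd
    lower_half = data[:half]
    upper_half = data[n - half:]
    return median(lower_half), median(upper_half)


print(tukey_hinges([1, 2, 4]))  # (1.5, 3.0)
```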
There are actually dozens of definitions of a box plot's quartiles (Wikipedia: "There is no universal agreement on choosing the quartile values"). If you want to rationalize MATLAB's box plot, you'll have to check its documentation. Otherwise, you could Google your brains out to try to find a method that matches the results.
Minitab gives 1 and 4 for the hinges in your data set. Excel's PERCENTILE function gives 1.5 and 3, which incidentally matches Tukey's algorithm at least in this case.
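To see how much the definition matters, Python's standard statistics.quantiles function (Python 3.8 or later) offers two interpolation methods. On this data set they happen to reproduce the two pairs of figures quoted above. This is only an illustration of differing definitions, not a claim about how Minitab or Excel work internally.

```python
from statistics import quantiles

data = [1, 2, 4]

# "inclusive" treats the data as containing the population minimum and maximum
q1_inc, q2_inc, q3_inc = quantiles(data, n=4, method="inclusive")
print(q1_inc, q3_inc)  # 1.5 3.0

# "exclusive" allows for values beyond the sample extremes
q1_exc, q2_exc, q3_exc = quantiles(data, n=4, method="exclusive")
print(q1_exc, q3_exc)  # 1.0 4.0
```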
The median divides the data into two halves (with an odd number of values, the median itself is left out of both). The median of the first half = Q1, and the median of the second half = Q3.
More info: http://www.purplemath.com/modules/boxwhisk.htm
Note on the MATLAB boxplot: Q1 and Q3 may be calculated in a different way in MATLAB; I'd try with a larger amount of test data. With my method, Q1 should be 1 and Q3 should be 4.
EDIT:
One possible calculation that MATLAB does: take the difference between the median and the first number of the first half, and take a quarter of that. Add that to the first number to get Q1.
The same (roughly) applies to Q3: Take the difference between the median and the highest number, and subtract a quarter of that from the highest number. That is Q3.
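Here is a sketch of that guess in Python, next to a more general rule that gives the same numbers for this data set: treat the i-th sorted value (counting from 1) as the (i - 0.5)/n quantile and interpolate linearly in between. Both function names are made up, and whether either one matches what MATLAB actually does should be checked against its documentation.

```python
def quarter_rule(sorted_values):
    """The guess above: move a quarter of the way from each end towards the median.

    Assumes an odd number of values, so the median is the middle element.
    """
    first, last = sorted_values[0], sorted_values[-1]
    med = sorted_values[len(sorted_values) // 2]
    q1 = first + (med - first) / 4
    q3 = last - (last - med) / 4
    return q1, q3


def quantile_midpoint(sorted_values, p):
    """Linear interpolation, treating sorted value i (1-based) as the (i - 0.5)/n quantile."""
    n = len(sorted_values)
    pos = p * n - 0.5              # fractional 0-based index
    if pos <= 0:
        return sorted_values[0]
    if pos >= n - 1:
        return sorted_values[-1]
    lo = int(pos)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[lo + 1] - sorted_values[lo])


x = [1, 2, 4]
print(quarter_rule(x))                                         # (1.25, 3.5)
print(quantile_midpoint(x, 0.25), quantile_midpoint(x, 0.75))  # 1.25 3.5
```

The two rules agree on three points but will generally differ on larger data sets, which is why testing with more data is a good idea.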