Finding a median within a 2-3-4 tree

I'm trying to come up with a simple method to find the median of all the values stored in a 2-3-4 tree.
Let's say I have already built the tree. Would it be fine to keep it as simple as possible: dynamically allocate a new array, add every element of the tree to it in sorted order, and take the middle value (assuming n is always odd)?
Or is there a better, more efficient method?
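The approach described in the question can be sketched as an in-order traversal followed by an index lookup. This is only a minimal sketch: the `Node` class is a hypothetical representation (1 to 3 sorted keys, and either no children or exactly one more child than keys), not taken from any particular library. It costs O(n) time and O(n) extra space; doing better would typically require storing subtree sizes in the nodes.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical 2-3-4 tree node: 1-3 sorted keys, and either no
    children (leaf) or exactly len(keys) + 1 children."""
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def in_order(node: Node) -> Iterator[int]:
    """Yield every key in the subtree in ascending order."""
    for i, key in enumerate(node.keys):
        if node.children:
            yield from in_order(node.children[i])
        yield key
    if node.children:
        yield from in_order(node.children[-1])


def median(root: Node) -> int:
    """Collect the keys in sorted order and return the middle one.

    Assumes the number of keys is odd, as in the question."""
    keys = list(in_order(root))
    return keys[len(keys) // 2]


# A small hand-built tree holding the keys 1..7.
root = Node([2, 4, 6], [Node([1]), Node([3]), Node([5]), Node([7])])
print(median(root))
```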

Related

Computing a Fibonacci sequence in Terraform

The title is really all there is to the question: how would you compute the values of a Fibonacci sequence (first N values, where N is an input variable) and store them in a Terraform local variable?
This could, of course, be done with an external data source, but I'm looking for a way to do it in pure Terraform.
There's no real need to actually do this, but the Fibonacci sequence is a representation of a problem I need to solve in Terraform (where values in a list depend on previous values of that same list).
I think the easiest way would be to create your own external data source for that, as in Terraform you can't access existing list elements during iteration while you are creating the list itself.
As for the Fibonacci sequence specifically, I would just precompute its values and then read however many I need from a list in Terraform or from a file. Usually you know the maximum number of elements your app could require, so there is no reason to recalculate them every single time.
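The precomputation step could be done once, outside Terraform, with a small script. The sketch below is in Python and makes several assumptions: the sequence starts at 0, the upper bound `MAX_VALUES` is an arbitrary placeholder, and the output is JSON because Terraform can parse that with `jsondecode(file(...))` and trim it with `slice`.

```python
import json


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 0."""
    values = []
    a, b = 0, 1
    for _ in range(count):
        values.append(a)
        a, b = b, a + b
    return values


MAX_VALUES = 50  # assumed upper bound on how many values will ever be needed

# Serialize once; the resulting text could be saved to a file and read by Terraform.
fib_json = json.dumps(fibonacci(MAX_VALUES))
print(fib_json)
```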

Optimal way to add an element in k-th position of list of size n if k<n

I know it's possible to add an element inside a list AND NOT AS THE FIRST ELEMENT NOR THE LAST by redefining the list and adding three lists:
# I want to add 5 into [1,2,3,4,6,7,8,9,0] between the 4 and the 6
A=[1,2,3,4,6,7,8,9,0]
A=[1,2,3,4]+[5]+[6,7,8,9,0]
but I think this isn't optimal, since I'm creating three lists and re-defining a variable. Could someone show me the best way to do this?
You can use the list's insert method.
L = [1,2,3,4,6,7,8,9,0]
L.insert(4,5)
This is the idiomatic way to do it in Python. If you need faster insertion, consider a different data structure, depending on your needs.

Determining string values when implementing an insert() algorithm

I am currently going through some revision on my data structures and algorithm module, and have hit a sticky point, that I was wondering if anybody on here could clear up.
I'm working through the various insert() algorithms on the data structures, and have been faced with the issue of adding nodes to a binary search tree using the insert algorithm. This would normally not be a problem when inserting ints, however when inserting String objects into the tree, how would I go about comparing the String value within the node I am adding, to the String value within the node on the Binary Search tree, (in order to determine its position within the tree).
In other words, what gives one String a higher value than another String?
This may be a very simple answer, so apologies if so, but thanks in advance for any help!
Strings are usually compared lexicographically.
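As an illustration, here is a minimal sketch of a binary search tree insert that relies on lexicographic comparison. It is written in Python, where `<` on strings compares character by character using code points; the `Node` class and helper names are hypothetical. Note that under this ordering uppercase letters sort before lowercase ones, and a string sorts before any longer string it is a prefix of.

```python
class Node:
    """Hypothetical BST node holding a string."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert `value` into the subtree rooted at `root` and return the root."""
    if root is None:
        return Node(value)
    # String comparison is lexicographic: the first differing character decides.
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root


def in_order(root):
    """Return the stored strings in ascending order."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


words = ["pear", "apple", "fig", "Zebra", "apples"]
root = None
for word in words:
    root = insert(root, word)
print(in_order(root))
```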

Quadtree object movement

So I need some help brainstorming, from a theoretical standpoint. Right now I have some code that just draws some objects. The objects lie in the leaves of a quadtree. Now as the objects move I want to keep them placed in the correct leaf of the quadtree.
Right now I am just reconstructing the quadtree on the objects after I change their position. I was trying to figure out a way to correct the tree without rebuilding it completely. All I can think of is having a bunch of pointers to adjacent leaf nodes.
Does anyone have an idea of how to figure out the node into which an object moves without just having a ton of pointers everywhere or a link to articles on this? All I could find was different ways to build the quadtree, nothing about updating it.
If I understand your question, you want some way of mapping between spatial coordinates and leaves of the quadtree.
Here's one possible solution I've been looking at:
For simplicity, let's do the 1D case first, and let's assume we have 32 gridpoints in x. Every grid point then corresponds to some leaf on a tree of depth five (depth 0 = the whole grid, depth 1 = 2 points, depth 2 = 4 points... depth 5 = 32 points).
Each leaf could be represented by the branch indices leading to the leaf. At each level there are two branches we can label A and B. So, a particular leaf might be labeled BBABB, which would mean: go down the B branch, then the B branch, then the A branch, then the B branch and then the B branch.
So, how do you map e.g. BBABB to an x grid point between 0..31? Just convert it to binary, so that BBABB->11011 = 27. Thus, the mapping from gridpoint to leaf-node is simply a matter of translating the letters A and B into 0s and 1s and then interpreting the result as a binary number.
For the 2D case, it's only slightly more complicated. Now we have four branches from each node, so we can label each branch path using a four-letter alphabet, e.g. starting from the root and taking the 3rd branch and then the fourth branch and then the first branch and then the second branch and then the second branch again we would generate the string CDABB.
Now to convert the string (e.g. 'CDABB') into a pair of gridvalues (x,y).
Let's assume A is lower-left, B is lower right, C is upper left and D is upper right. Then, symbolically, we could write, A.x=0, A.y=0 / B.x=1, B.y=0 / C.x=0, C.y=1 / D.x=1, D.y=1.
Taking the example CDABB, we first look at its x values, (CDABB).x = 01011 = 11, which gives us the x grid point. Similarly for y, (CDABB).y = 11000 = 24.
Finally, if you want to find out e.g. the node immediately to the right of CDABB, then simply convert it to a pair of binary numbers in x and y, add +1 to the x value and convert the new pair of binary numbers back into a string.
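The mapping described above can be sketched in a few lines of Python. The function names are made up for illustration, and returning `None` at the right edge of the grid is an assumption the answer does not address.

```python
# Branch label -> (x bit, y bit), following the convention above:
# A = lower left, B = lower right, C = upper left, D = upper right.
BITS = {"A": (0, 0), "B": (1, 0), "C": (0, 1), "D": (1, 1)}
LABELS = {bits: label for label, bits in BITS.items()}


def path_to_xy(path):
    """Convert a branch path such as 'CDABB' into grid coordinates (x, y)."""
    x = y = 0
    for label in path:
        bit_x, bit_y = BITS[label]
        x = (x << 1) | bit_x
        y = (y << 1) | bit_y
    return x, y


def xy_to_path(x, y, depth):
    """Convert grid coordinates back into a branch path of length `depth`."""
    labels = []
    for level in range(depth - 1, -1, -1):
        labels.append(LABELS[((x >> level) & 1, (y >> level) & 1)])
    return "".join(labels)


def right_neighbor(path):
    """Return the path of the leaf immediately to the right, or None at the edge."""
    x, y = path_to_xy(path)
    if x + 1 >= 1 << len(path):
        return None  # assumed behavior: no neighbor beyond the grid
    return xy_to_path(x + 1, y, len(path))


print(path_to_xy("CDABB"))
print(right_neighbor("CDABB"))
```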
I'm sure this has all been discovered, but I haven't yet found this information on the web.
If you have the spatial data necessary to insert an element into the quad-tree in the first place (ex: its point or rectangle), then you have the same data needed to remove it.
An easy way is before you move an element, remove it from the quad-tree using the same data you used to originally insert it, then move it, then re-insert.
Removal from the quad-tree can first remove the element from the leaf node(s), then if the leaf nodes become empty, remove them from their parents. If the parents become empty, remove them from their parents, and so forth.
This simple method is efficient enough for a complex world of objects moving every frame as long as you implement the quad-tree efficiently (ex: use a free list for the nodes). There shouldn't have to be a heap allocation on a per-node basis to insert it, nor a heap deallocation involved in removing every single node. Most node allocations/deallocations should be a simple constant-time operation just involving, say, the manipulation of a couple of integers or pointers.
You can also make this a little more complex if you like. You can start off storing the previous position of an object and then move it. If the new position occupies nodes other than the previous position, then remove the object from the nodes it no longer occupies and insert it to the new ones. Otherwise just keep it in the same node(s).
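The "only touch the tree when the object changes nodes" idea can be sketched as follows. To keep the example short, a flat dictionary of fixed-depth leaf cells stands in for a real quadtree, and objects are treated as points; `WORLD_SIZE`, `DEPTH` and the `LeafIndex` class are all assumptions made for illustration.

```python
from collections import defaultdict

WORLD_SIZE = 32.0  # assumed square world, coordinates in [0, WORLD_SIZE)
DEPTH = 5          # assumed fixed leaf depth: 2**5 = 32 cells per axis


def leaf_cell(x, y):
    """Return the (column, row) of the leaf cell containing the point."""
    cells = 1 << DEPTH
    return int(x / WORLD_SIZE * cells), int(y / WORLD_SIZE * cells)


class LeafIndex:
    """Stand-in for a quadtree: maps each occupied leaf cell to its objects."""

    def __init__(self):
        self.leaves = defaultdict(set)
        self.positions = {}

    def insert(self, obj, x, y):
        self.positions[obj] = (x, y)
        self.leaves[leaf_cell(x, y)].add(obj)

    def remove(self, obj):
        # Uses the stored position, i.e. the same data used to insert.
        cell = leaf_cell(*self.positions.pop(obj))
        self.leaves[cell].discard(obj)
        if not self.leaves[cell]:
            del self.leaves[cell]  # drop leaves that became empty

    def move(self, obj, x, y):
        """Move an object; return True only if it changed leaf."""
        if leaf_cell(x, y) == leaf_cell(*self.positions[obj]):
            self.positions[obj] = (x, y)  # same leaf: just update the position
            return False
        self.remove(obj)
        self.insert(obj, x, y)
        return True


index = LeafIndex()
index.insert("a", 1.2, 3.4)
print(index.move("a", 1.8, 3.9))  # stays in the same leaf
print(index.move("a", 2.1, 3.9))  # crosses into the next leaf
```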
Update
I usually try to avoid linking my previous answers, but in this case I ended up doing a pretty comprehensive write up on the topic which would be hard to replicate anywhere else. Here it is: https://stackoverflow.com/a/48330314/4842163

Update the quantile for a dataset when a new datapoint is added

Suppose I have a list of numbers and I've computed the q-quantile (using Quantile).
Now a new datapoint comes along and I want to update my q-quantile, without having stored the whole list of previous datapoints.
What would you recommend?
Perhaps it can't be done exactly without, in the worst case, storing all previous datapoints.
In that case, can you think of something that would work well enough?
One idea I had, if you can assume normality, is to use the inverse CDF instead of the q-quantile.
Keep track of the sample mean and sample variance as you go and then you can compute InverseCDF[NormalDistribution[sampleMean, Sqrt[sampleVariance]], q] (NormalDistribution takes the standard deviation, not the variance, as its second argument), which should be the value such that a fraction q of the values are smaller, which is what the q-quantile is.
(I see belisarius was thinking along the same lines.
Here's the link he pointed to: http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm )
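For readers outside Mathematica, the same idea can be sketched in Python: maintain the mean and variance with the online (Welford) update from the linked article, then invert the normal CDF. The class name is hypothetical, and the estimate is only meaningful under the normality assumption stated above. `statistics.NormalDist` expects a standard deviation, hence the square root.

```python
import math
from statistics import NormalDist


class RunningNormalQuantile:
    """Track mean and variance online; estimate quantiles assuming normality."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        """Welford's update: constant memory, one pass."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def quantile(self, q):
        """Return the q-quantile of the fitted normal distribution (0 < q < 1)."""
        if self.n < 2:
            raise ValueError("need at least two datapoints")
        variance = self.m2 / (self.n - 1)  # sample variance
        return NormalDist(self.mean, math.sqrt(variance)).inv_cdf(q)


estimator = RunningNormalQuantile()
for value in [1.0, 2.0, 3.0, 4.0, 5.0]:
    estimator.add(value)
print(estimator.quantile(0.5))
```

If all datapoints are identical the variance is zero and `inv_cdf` raises an error, so a real implementation would need to handle that case.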
Unless you know that your underlying data comes from some distribution, it is not possible to update arbitrary quantiles without retaining the original data. You can, as others suggested, assume that the data has some sort of distribution and store the quantiles this way, but this is a rather restrictive approach.
Alternately, have you thought of programming this somewhere besides Mathematica? For example, you could create a class for your datapoints that contains (1) the Double value and (2) some timestamp for when the data came in. In a SortedList of these datapoints classes (which compares based on value), you could get the quantile very fast by simply referencing the index of the datapoints. Want to get a historical quantile? Simply filter on the timestamps in your sorted list.
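A minimal sketch of the sorted-list idea, in Python rather than a language with a built-in SortedList, might look like this. It keeps every datapoint, omits the timestamps for brevity, and uses a simple index-based quantile with no interpolation, which is an assumption; other quantile definitions would give slightly different values.

```python
import bisect


class SortedQuantile:
    """Keep all datapoints in sorted order and read quantiles by index."""

    def __init__(self):
        self.values = []

    def add(self, x):
        # Binary search for the position, then insert; the list shift is O(n).
        bisect.insort(self.values, x)

    def quantile(self, q):
        """Return the element at position floor(q * n), clamped to the last index."""
        if not self.values:
            raise ValueError("no datapoints yet")
        index = min(int(q * len(self.values)), len(self.values) - 1)
        return self.values[index]


data = SortedQuantile()
for value in [5, 1, 4, 2, 3]:
    data.add(value)
print(data.quantile(0.5))
```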
