How to normalize samples of an ongoing cumulative sum? - audio

For simplicity, let's assume we have the function sin(x) and have calculated 1000 samples of it, with values between -1 and 1. We can plot those samples. In the next step we want to plot the integral of sin(x), which would be -cos(x) + C. I can approximate the integral from my existing samples with a cumulative sum:
y[n] = x[n] + y[n-1]
Because it's a cumulative sum we will need to normalize it to get samples between -1 and 1 on the y axis.
y = 2 * (x - min(x)) / (max(x) - min(x)) - 1
To normalize we need a maximum and a minimum.
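As a minimal sketch of this setup in Python/NumPy (the names are illustrative):
import numpy as np
x = np.sin(np.linspace(0, 2 * np.pi, 1000))           # 1000 samples of sin(x)
y = np.cumsum(x)                                      # y[n] = x[n] + y[n-1]
y_norm = 2 * (y - y.min()) / (y.max() - y.min()) - 1  # rescale to [-1, 1]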
Now we want to calculate the next 1000 samples of sin(x) and compute the integral again. Because it's a cumulative sum, we may get a new maximum, which means we would have to re-normalize all of our 2000 samples.
Now my question basically is:
How can I normalize samples in this context without knowing the maximum and minimum in advance?
How can I avoid re-normalizing all previous samples whenever a new block of samples brings a new maximum/minimum?

I've found a solution :)
I also want to mention: this is about periodic functions like sine, so basically the maximum and minimum should always be the same, right?
In one special case this isn't true: if your samples don't contain a full period of the function (including its global maximum and minimum). This can happen when you choose a very low frequency.
What you can do (a sketch of the whole procedure follows below):
Simply calculate the samples for a function like sin(x) with a frequency of 1. One full period will contain the global maximum and minimum of the function (it's important that y varies between -1 and 1, not 0 and 1!).
Then calculate the integral with the cumulative sum.
Get the maximum and minimum of those samples.
You can scale them up or down for other frequencies: maximum/frequency, minimum/frequency.
The scaled values can now be used to normalize samples which were calculated with any other frequency.
This only needs to be calculated once at the beginning.
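Here is a hedged Python sketch of that procedure; it assumes the samples are generated as sin(2*pi*f*t) at a fixed sample rate, and all names are illustrative:
import numpy as np
SAMPLE_RATE = 1000                                   # samples per second (assumed)
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE             # one second of time values
reference = np.cumsum(np.sin(2 * np.pi * t))         # full period at frequency 1
ref_max, ref_min = reference.max(), reference.min()  # computed once at the start
def normalize(block, frequency):
    # Scale the frequency-1 extrema to this block's frequency, then
    # apply the usual min-max normalization to [-1, 1].
    hi, lo = ref_max / frequency, ref_min / frequency
    return 2 * (block - lo) / (hi - lo) - 1
freq = 0.5                                           # any other frequency
y = np.cumsum(np.sin(2 * np.pi * freq * t))          # its cumulative sum
y_norm = normalize(y, freq)                          # in [-1, 1], no re-scan of old samples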

Related

What's the difference between these two methods for calculating a weighted median?

I'm trying to calculate a weighted median, but don't understand the difference between the following two methods. The answer I get from weighted.median() is different from with(df, median(rep(value, count))), but I don't understand why. Are there many ways to get a weighted median? Is one preferable over the other?
df = read.table(text="row count value
1 1. 25.
2 2. 26.
3 3. 30.
4 2. 32.
5 1. 39.", header=TRUE)
# weighted median
with(df, median(rep(value, count)))
# [1] 30
library(spatstat)
weighted.median(df$value, df$count)
# [1] 28
Note that with(df, median(rep(value, count))) only makes sense for weights which are positive integers (rep will accept float values for count but will coerce them to integers). This approach is thus not a fully general way of computing weighted medians.
?weighted.median shows that what the function tries to do is to compute a value m such that the total weight of the data below m is 50% of the total weight. In the case of your sample there is no such m that works exactly: 28.5% of the total weight of the data is <= 26 and 61.9% is <= 30. In a case like this, by default ("type 2"), it averages these two values to get the 28 that is returned.
There are two other types. weighted.median(df$value, df$count, type = 1) returns 30. I am not completely sure whether this type will always agree with your other approach.
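To make the "50% of the total weight" idea concrete, here is a small Python sketch of a simple type-1-style rule (an illustration, not a reimplementation of spatstat's weighted.median):
import numpy as np
def weighted_median(values, weights):
    # Smallest value whose cumulative weight reaches half the total weight.
    order = np.argsort(values)
    v, w = np.asarray(values)[order], np.asarray(weights)[order]
    cw = np.cumsum(w)
    return v[np.searchsorted(cw, 0.5 * cw[-1])]
print(weighted_median([25, 26, 30, 32, 39], [1, 2, 3, 2, 1]))  # -> 30, like type = 1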

In DDA, why are lines sampled at unit intervals of x if the gradient is <= 1

From Wikipedia:
A linear DDA starts by calculating the smaller of dy or dx for a unit increment of the other. A line is then sampled at unit intervals in one coordinate and corresponding integer values nearest the line path are determined for the other coordinate.
Considering a line with a positive slope, if the slope is less than or equal to 1, we sample at unit x intervals (dx=1) [...]
For lines with a slope greater than 1, we reverse the role of x and y i.e. we sample at dy=1 [...]
Similar calculations are carried out to determine pixel positions along a line with a negative slope
How does the slope (positive or negative) affect the algorithm?
Why is the gradient being less than or equal to 1 important?
If your gradient is negative (in one of the dimensions) and you walk along that direction with unit increments, you have to adapt your loop to count backwards.
If you walk along the wrong dimension (with unit increments), you will end up with gaps on the line. E.g., if you have slope 2 and you walk along the x-direction, only every second row will contain a pixel.
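A minimal Python sketch of that idea: step one unit along the dominant axis (the one with the larger extent), so no gaps appear and negative slopes simply produce negative increments; integer endpoints are assumed:
import math
def dda_line(x0, y0, x1, y1):
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))           # walk along the longer dimension
    if steps == 0:
        return [(x0, y0)]
    x_inc, y_inc = dx / steps, dy / steps   # one increment has magnitude 1
    x, y = float(x0), float(y0)
    pixels = []
    for _ in range(steps + 1):
        pixels.append((math.floor(x + 0.5), math.floor(y + 0.5)))  # round half up
        x += x_inc
        y += y_inc
    return pixels
print(dda_line(0, 0, 3, 6))  # slope 2: stepping in y fills every row; stepping in x would not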

Fitting the X and Y axes in Excel

I have the following graph: the Y values are located at X = 32, 64, 128, 256, 512 and 1024. However, the graph shows different tick values. I would like the X-axis labels to show only the relevant values (i.e. 32, 64, 128, 256, 512 and 1024).
In addition, I would like to add the maximal value of 1 to the Y-axis. As can be seen, I defined the maximal value to be 1, but the graph doesn't show it.
How can I fix these two issues, in the X-axis and in the Y-axis?
For the X-axis: tick the check box "Logarithmic scale" and set the Base to 2.
For the Y-axis: set the Minimum to a value that is divisible by the Major unit 0.1, for example to 0.4.
Thanks to Hans Vogelaar (http://www.eileenslounge.com) for the answer.

Create empirical cumulative distribution function (CDF) and then use the CDF to find probabilities

I have a set of observed data and created an empirical cumulative distribution using Excel. I want to use this CDF to find probabilities like P(x < X) or P(X1 < x < X2).
The way I created the CDF is to arrange the data in ascending order and then create a column next to it with the probabilities:
I have 4,121 records; the sample here shows four of them. Once this calculation is done, the curve is plotted as an XY scatter plot with Data on the x-axis and Probability on the y-axis. This is how I created the CDF.
How can I find the probability below 2.5, P(x <= 2.5), or P(970 < x < 980)?
I hope there is an easy way because I will have hundreds of probabilities to find.
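A minimal Python sketch of the construction described above (sorted data with an i/n probability column) and of reading probabilities off it; the random stand-in data and the linear interpolation between points are assumptions:
import numpy as np
observed = np.random.default_rng(0).normal(500, 200, 4121)  # stand-in for the 4,121 records
data = np.sort(observed)                        # ascending order
prob = np.arange(1, len(data) + 1) / len(data)  # probability column: i/n
def cdf(x):
    # Empirical P(X <= x), linearly interpolated between data points.
    return np.interp(x, data, prob, left=0.0, right=1.0)
p_below = cdf(2.5)               # P(x <= 2.5)
p_between = cdf(980) - cdf(970)  # P(970 < x < 980)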

How to calculate mean and standard deviation for hue values from 0 to 360?

Suppose 5 samples of hue are taken using a simple HSV model for color, having values 355, 5, 5, 5, 5, all a hue of red and "next" to each other as far as perception is concerned. But the simple average is 75 which is far away from 0 or 360, close to a yellow-green.
What is a better way to calculate this mean and associated std?
The simple solution is to convert those angles to a set of vectors, i.e. from polar coordinates into Cartesian coordinates.
Since you are working with colors, think of this as a conversion into the (a*, b*) plane. Then take the mean of those coordinates, and revert back into polar form again. Done in MATLAB:
theta = [355,5,5,5,5];
x = cosd(theta); % cosine in terms of degrees
y = sind(theta); % sine with a degree argument
Now take the mean of x and y, compute the angle, and convert back from radians to degrees.
meanangle = atan2(mean(y),mean(x))*180/pi
meanangle =
3.0049
Of course, this solution is valid only for the mean angle. As you can see, it yields a consistent result with the mean of the angles directly, where I recognize that 355 degrees really wraps to -5 degrees.
mean([-5 5 5 5 5])
ans =
3
To compute the standard deviation, it is simplest to do it as
std([-5 5 5 5 5])
ans =
4.4721
Yes, that requires me to do the wrap explicitly.
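The same vector-mean computation, as a direct NumPy translation of the MATLAB above:
import numpy as np
theta = np.radians([355, 5, 5, 5, 5])
# Average the unit vectors, then take the angle of the mean vector.
mean_angle = np.degrees(np.arctan2(np.sin(theta).mean(), np.cos(theta).mean()))
print(mean_angle)  # ~3.0049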
I think the method proposed by user85109 is a good way to compute the mean, but not the standard deviation:
imagine you have three angles: 180, 180, 181
the mean would be correctly computed, as a number approximately equal to 180
but from [180, 180, -179] you would compute a high variance, when in fact it is near zero
At first glance, I would compute separately the means and variances for the positive half of the angles, [0, 180], and for the negative ones, [0, -180], and later compute the combined variance
https://www.emathzone.com/tutorials/basic-statistics/combined-variance.html
taking into account that the global mean, and the difference between it and the local means, has to be computed in both directions, clockwise and counterclockwise, and the correct one has to be chosen.
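For reference, standard circular statistics sidesteps the wrapping problem for the spread as well: the circular standard deviation sqrt(-2*ln(R)), where R is the length of the mean resultant vector. This is a textbook formula, not something proposed in the answers above; a Python sketch:
import numpy as np
def circular_stats(degrees):
    # Circular mean and standard deviation (in degrees) via the mean
    # resultant vector; no manual wrapping of angles is needed.
    theta = np.radians(degrees)
    s, c = np.sin(theta).mean(), np.cos(theta).mean()
    R = np.hypot(s, c)                         # mean resultant length, in (0, 1]
    mean = np.degrees(np.arctan2(s, c))
    std = np.degrees(np.sqrt(-2 * np.log(R)))  # circular standard deviation
    return mean, std
print(circular_stats([355, 5, 5, 5, 5]))  # mean ~3.0, std ~4.0 (population-style)
print(circular_stats([180, 180, 181]))    # mean ~-179.7 (i.e. ~180.3), std ~0.47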
