How to calculate audio dynamic range?

Wikipedia defines "dynamic range" as "the ratio of the amplitude of the loudest possible undistorted sine wave to the root mean square (rms) noise amplitude", but I'm not clear on how I should use these operands.
I have read in an uncompressed .wav file. It uses 16 bits per sample, and I've converted these bytes to integers (may range from -32768 to 32767). The largest int is 31692 and the smallest -32764. So what should I do next? I saw the formula "20 * log (high / low)" and it doesn't seem to work directly. Could you please show me the calculation steps? Thanks.

I've solved this problem. The formula "20 * log (high / low)" does work, with a base-10 logarithm. "high" should be abs(-32764) = 32764, and "low" should be the nonzero value nearest to 0, which is 1 in my file. So the dynamic range is 20 * log10(32764 / 1) ≈ 90.3 dB.
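The steps above can be sketched in Python. This is only an illustration under the assumptions in the post: "high" is the largest absolute sample value and "low" is the smallest nonzero absolute sample value. The function name dynamic_range_db and the sample list are made up for the example.

```python
import math

def dynamic_range_db(samples):
    """Return 20 * log10(high / low) for a list of signed integer samples.

    high = largest absolute sample value
    low  = smallest nonzero absolute sample value (zeros are skipped,
           because log10(high / 0) is undefined)
    """
    magnitudes = [abs(s) for s in samples if s != 0]
    if not magnitudes:
        raise ValueError("need at least one nonzero sample")
    high = max(magnitudes)
    low = min(magnitudes)
    return 20 * math.log10(high / low)

# Hypothetical samples containing the extremes mentioned in the post.
example = [31692, -32764, 1, 0, -5]
print(round(dynamic_range_db(example), 2))
```

Note that this measures the spread of the values actually present in the file, not a noise floor, so it follows the post's working definition rather than the rms-noise definition quoted in the question.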


Why does this hash calculating bit hack work?

For practice I've implemented the qoi specification in rust. In it there is a small hash function to store recently used pixels:
index_position = (r * 3 + g * 5 + b * 7 + a * 11) % 64
where r, g, b, and a are the red, green, blue and alpha channels respectively.
I assume this works as a hash because it creates a unique prime factorization for the numbers with the mod to limit the number of bytes. Anyways I implemented it naively in my code.
While looking at other implementations I came across this bit hack to optimize the hash calculation:
fn hash(rgba: [u8; 4]) -> u8 {
    let v = u32::from_ne_bytes(rgba);
    let s = (((v as u64) << 32) | (v as u64)) & 0xFF00FF0000FF00FF;
    s.wrapping_mul(0x030007000005000Bu64.to_le()).swap_bytes() as u8 & 63
}
I think I understand most of what's going on but I'm confused about the magic number (the multiplicand). To my understanding it should be flipped. As a step by step example:
let rgba = [0x12, 0x34, 0x56, 0x78].
On my machine (little endian) this gives v the value 0x78563412.
The bit shifting spreads the values, giving s = 0x7800340000560012.
Now here's where I get confused. The magic number has the values that should be multiplied aligned in a 64 bit field (3, 5, 7, 11), spaced the same way that the original values are. However they seem to be in reverse order from the values:
0x7800340000560012
0x030007000005000B
When multiplying it would seem that the highest value, the alpha channel (0x78), is being multiplied by 3, while the lowest value, the red channel (0x12), is being multiplied by 11. I'm also not entirely sure why this multiplication works anyway, after multiplying the values by various powers of 2.
I understand that the bytes are then swapped to big endian and trimmed, but that's not until after the multiplication step which loses me.
I know that the code produces the correct hash, but I don't understand why that's the case. Can anyone explain to me what I'm missing?
If you think about the way the math works, you want this flipped order, because it means all the results from each of the "logical" multiplications cluster in the same byte. The highest byte in the first value multiplied by the lowest byte in the second produces a result in the highest byte. The lowest byte in the first value's product with the highest byte in the second value produces a result in the same highest byte, and the same goes for the intermediate bytes.
Yes, the 0x78... and 0x03... are also multiplied by each other, but that product overflows way past the top of the value and is lost. Having the order "backwards" means the results of the multiplications we care about all end up summed in the uppermost byte (the total shift of the results we want is always 56 bits, because the value at bit offset 56 is multiplied by the one at offset 0, the 40th by the 16th, the 16th by the 40th, and the 0th by the 56th), with the rest of the multiplications we don't want having their results either overflow (and be lost) or appear in lower bytes (which we ignore). If you flipped the bytes in the second value, the 0x78 * 0x0B (alpha value & multiplier) component would be lost to overflow, while the 0x12 * 0x03 (red value & multiplier) component wouldn't reach the target byte (every component we cared about would end up somewhere that wasn't the uppermost byte).
For a possibly more intuitive example, imagine doing the same work, but where all the bytes of one input except a single component are zero. If you multiply:
0x7800000000000000 * 0x030007000005000B
the logical result is:
0x1680348000258052800000000000000
but removing the overflow reduces that to:
0x2800000000000000
//^^ result we care about (actual product of 0x78 and 0x0B is 0x528, but only keeping low byte)
Similarly,
0x0000340000000000 * 0x030007000005000B
produces:
0x9c016c000104023c0000000000
overflowing to:
0x04023c0000000000
//^^ result we care about (actual product of 0x34 and 0x5 was 0x104, but only 04 kept)
In that case, the other multiplications did leave data in the result (not all of it overflowed), but since we only look at the high byte, the rest gets ignored.
If you keep doing this math step by step and adding the results, you'll find that the high byte ends up the correct answer to the four individual multiplications you expected (mod 256); flip the order, and it won't work out that way.
The advantage to putting all the results in that high byte is that it allows you to use swap_bytes to move it cheaply to the low byte, and read the value directly (no need to even mask it on many architectures).
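To check this numerically, here is a small Python sketch that emulates the 64-bit wrapping multiply with a mask and compares it against the naive formula. It assumes a little-endian byte load (as in the question's worked example) and replaces swap_bytes with a plain shift of the top byte. The names naive_hash, packed_hash and check are made up for the illustration.

```python
MASK64 = (1 << 64) - 1  # emulates u64 wrap-around (wrapping_mul)

def naive_hash(r, g, b, a):
    # The formula from the QOI specification.
    return (r * 3 + g * 5 + b * 7 + a * 11) % 64

def packed_hash(r, g, b, a):
    # Little-endian load of [r, g, b, a]: r is the lowest byte.
    v = r | (g << 8) | (b << 16) | (a << 24)
    # Spread the channels: a at bit 56, g at bit 40, b at bit 16, r at bit 0.
    s = ((v << 32) | v) & 0xFF00FF0000FF00FF
    # Multipliers sit at mirrored offsets: 3 at 56, 7 at 40, 5 at 16, 11 at 0,
    # so every wanted partial product lands at a total shift of 56 bits.
    product = (s * 0x030007000005000B) & MASK64
    # Read the top byte (what swap_bytes + "as u8" does), then keep 6 bits.
    return (product >> 56) & 63

def check(samples):
    # True if both hashes agree on every (r, g, b, a) tuple given.
    return all(naive_hash(*p) == packed_hash(*p) for p in samples)

grid = range(0, 256, 17)  # 16 values per channel, including 0 and 255
print(check((r, g, b, a) for r in grid for g in grid for b in grid for a in grid))
```

The partial products that land below bit 56 are too small to carry into the top byte, which is why the top byte holds exactly the wanted sum modulo 256.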

Excel pixels not proportional to points

I've noticed (by changing the column width) that the column width measured in points is not proportional to the width in pixels. For example, at 21.44 points the pixel width of a column is 200. But at 20 pixels the width becomes 1.44 points, not the expected 2.14 points.
This is very confusing, as I'm trying to write VBA code which will divide a particular width into 'n' columns of equal size. Can anyone explain this anomaly? How can I write code to divide the width (since the parameters for the column width are in points)?
Thanks
So I just was trying things and stumbled across this.
Maybe the numbers are off from a "true" width. If the theory is correct, then there must exist such an offset.
(21.44 + x) = 10 (1.44 + x)
x = 0.7822
Now let's see if this offset works for some other lengths. For 80 pixels, the length reported by MS Excel is 8.11 points. Thus the true length is 8.11 + 0.7822 = 8.89. The true length of a column with a width of 20 pixels is 1.44 + 0.7822 = 2.222. Note that 2.222 * 4 = 8.89 approx. And this works for some other numbers as well, so I guess the theory should be correct.
Thus, to answer the question: add the offset 0.7822 to the observed column width that you need to divide, then divide it by 'n'. Subtract the offset to obtain the length 'x'. Then use the command Columns(var).ColumnWidth = x for each of the n columns.
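The arithmetic of this answer can be sketched in Python (not VBA) to make the two steps explicit. It assumes the offset model proposed above, namely that reported width plus a constant offset is proportional to pixels. The function names solve_offset and split_width are invented for the example.

```python
def solve_offset(width_large, width_small, pixel_ratio):
    """Solve (width_large + x) = pixel_ratio * (width_small + x) for x."""
    return (width_large - pixel_ratio * width_small) / (pixel_ratio - 1)

def split_width(total_width, n, offset):
    """Reported width of each of n equal columns that together span total_width."""
    true_total = total_width + offset   # convert to the proportional "true" width
    return true_total / n - offset      # divide, then convert back

# 200 px reads as 21.44 and 20 px reads as 1.44, so the pixel ratio is 10.
offset = solve_offset(21.44, 1.44, 10)
print(round(offset, 4))
print(round(split_width(21.44, 10, offset), 2))
```

Each value returned by split_width would then be assigned to Columns(var).ColumnWidth, as described in the answer.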

Verilog code to compute cosx using Taylor series approximation

I'm trying to implement COS X function in Verilog using Taylor series. The problem statement presented to me is as below
"Write a Verilog code to compute cosX using Taylor series approximation. Please attach the source and test bench code of the 8-bit outputs in signed decimal radix format for X = 0° to 360° at the increment of 10° "
I need to understand a couple of things before I proceed. Please correct me if I am wrong somewhere.
Resolution calculation : 10° increments to cover 0° to 360° => 36 positions
36 in decimal can be represented by 6 bits. Since we can use 6 bits, the resolution is slightly better by using 64 words. The 64 words represent 0° to 360° hence each word represents a resolution of 5.625° ie all values of Cos from 0° to 360° in increments of 5.625°. Thus resolution is 5.625°
Taylor series calculation
Taylor series for cos is given by Cos x approximation by Taylor series
COS X = 1 − (X^2/2!) + (X^4/4!) − (X^6/6!) ..... (using only 3~4 terms)
I have a couple of queries
1) While it is easy to generate X*X (X squared) or X cubed terms using a multiplier, I am not sure how to deal with the extra bits generated when calculating them. The output is 8 bits only.
e.g. X = 6 bits; X squared = 12 bits; X cubed = 18 bits.
Do I generate them anyway and later ignore them by keeping just the 8 most significant bits of the entire result? Such a cos wave would suck, right?
2) I am not sure how to handle the +1 at the start of the Taylor series, COS X = 1 − (X^2/2!) + (X^4/4!) .... Do I add binary 1 directly, or do I have to scale the 1 as 2^8 − 1 = 255 or 2^6 = 64, since I am using 6 bits at the input and 8 bits at the output?
I think this series normally gives a number in the range +1 to -1, so you have to decide how you are going to use your 8 bits.
With a signed 8-bit number using 1 integer bit and 7 fractional bits, you will not be able to represent +1 exactly, but you can get very close (127/128).
I have a previous answer explaining how to use fixed-point with Verilog. Once you're comfortable with that, you need to look at how bit growth occurs during multiplication.
Just because you are outputting 1 integer bit and 7 fractional bits does not mean you are limited to that internally; you could (and should) use more bits to compute the answer.
With 7 fractional bits, an integer 1 would look like 9'b0_1_0000000, or 1*2**7.
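As a rough software model of that advice (Python rather than Verilog), the sketch below evaluates four Taylor terms with a wider internal fixed-point format and only truncates to 7 fractional bits at the end. The choices here are assumptions for illustration, not part of the answer: 16 internal fractional bits, folding the angle into 0° to 90° by symmetry before applying the series, and saturating +1 to 127. The float conversion of the angle stands in for a table of precomputed constants.

```python
import math

FRAC = 16            # internal fractional bits (assumed, wider than the output)
ONE = 1 << FRAC      # fixed-point representation of 1.0
OUT_FRAC = 7         # output format: 8-bit signed with 7 fractional bits

def fx_mul(a, b):
    # A product of two FRAC-bit fractions has 2*FRAC fractional bits;
    # shift right to drop the extra bits (this is the "bit growth").
    return (a * b) >> FRAC

def cos_first_quadrant(x):
    # x is a fixed-point angle in radians, between 0 and pi/2.
    x2 = fx_mul(x, x)
    x4 = fx_mul(x2, x2)
    x6 = fx_mul(x4, x2)
    return ONE - x2 // 2 + x4 // 24 - x6 // 720

def cos_deg_q7(deg):
    # Return cos(deg) as a signed 8-bit value scaled by 2**7.
    deg %= 360
    if deg > 180:
        deg = 360 - deg              # cos(360 - d) = cos(d)
    negate = deg > 90
    if negate:
        deg = 180 - deg              # cos(180 - d) = -cos(d)
    x = int(round(math.radians(deg) * ONE))
    c = -cos_first_quadrant(x) if negate else cos_first_quadrant(x)
    q = c >> (FRAC - OUT_FRAC)       # arithmetic shift: truncate to 7 fractional bits
    return max(-128, min(127, q))    # saturate, since +1.0 (128) does not fit

for d in range(0, 361, 10):
    print(d, cos_deg_q7(d))
```

Without the quadrant folding, three or four terms are not enough near 180°, where the truncated series drifts well outside the range -1 to +1.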

formula Amplitude using FFT

I want to ask about the amplitude formula below. I am using a Fast Fourier Transform, so it returns real and complex numbers.
After that I must find the amplitude for each frequency.
My formula is
amplitude = 10 * log (real*real + imagined*imagined)
I want to ask about this formula. What is its source? I have searched, but I haven't found any source. Can anybody tell me where it comes from?
This is a combination of two equations:
1: Finding the magnitude of a complex number (the result of an FFT at a particular bin) - the equation for which is
m = sqrt(r^2 + i^2)
2: Calculating relative power in decibels from an amplitude value - the equation for which is p = 10 * log10(A^2 / Aref^2) == 20 * log10(A / Aref), where Aref is some reference value.
By inserting m from equation 1 as A in equation 2, with Aref = 1, we get:
p = 10 * log10(r^2 + i^2)
Note that this gives you a measure of relative signal power rather than amplitude.
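A short Python sketch of the two equations shows that they agree. The function names and the ref parameter are made up for the illustration, and the inputs are assumed to be the real and imaginary parts of a single FFT bin.

```python
import math

def power_db(re, im, ref=1.0):
    # Equation 2 written with the squared magnitude: 10 * log10(A^2 / Aref^2)
    return 10 * math.log10((re * re + im * im) / (ref * ref))

def magnitude_db(re, im, ref=1.0):
    # The same quantity via equation 1: 20 * log10(m / Aref), m = sqrt(r^2 + i^2)
    m = math.hypot(re, im)
    return 20 * math.log10(m / ref)

# Example bin value 3 + 4j has magnitude 5 and squared magnitude 25.
print(round(power_db(3.0, 4.0), 3), round(magnitude_db(3.0, 4.0), 3))
```

A bin that is exactly zero has no finite dB value (math.log10 raises ValueError for 0), so real code usually adds a small floor before taking the logarithm.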
The first part of the formula likely comes from the definition of Decibel, with the reference P0 set to 1, assuming with log you meant a logarithm with base 10.
The second part, i.e. the P1=real^2 + imagined^2 in the link above, is the square of the modulus of the Fourier coefficient cn at the n-th frequency you are considering.
A Fourier coefficient is in general a complex number (See its definition in the case of a DFT here), and P1 is by definition the square of its modulus. The FFT that you mention is just one way of calculating the DFT. In your case, likely the real and complex numbers you refer to are actually the real and imaginary parts of this coefficient cn.
sqrt(P1) is the modulus of the Fourier coefficient cn of the signal at the n-th frequency.
sqrt(P1)/N, is the amplitude of the Fourier component of the signal at the n-th frequency (i.e. the amplitude of the harmonic component of the signal at that frequency), with N being the number of samples in your signal. To convince yourself you need to divide by N, see this equation. However, the division factor depends on the definition/convention of Fourier transform that you use, see the note just above here, and the discussion here.
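The division by N can be checked with a naive DFT in plain Python. This sketch assumes the unnormalized forward transform (no 1/N factor in the sum). The function name dft_bin and the test signals are made up for the example.

```python
import cmath
import math

def dft_bin(x, k):
    # Unnormalized forward DFT coefficient c_k of the sequence x.
    n_samples = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
               for n in range(n_samples))

N, k0, A = 64, 5, 0.5

# A complex exponential of amplitude A sitting exactly on bin k0.
x_complex = [A * cmath.exp(2j * math.pi * k0 * n / N) for n in range(N)]
# A real cosine of amplitude A at the same frequency.
x_real = [A * math.cos(2 * math.pi * k0 * n / N) for n in range(N)]

print(abs(dft_bin(x_complex, k0)) / N)   # close to A
print(abs(dft_bin(x_real, k0)) / N)      # close to A / 2
```

For a real signal the energy is split between bins k0 and N - k0, which is one reason the scaling factor depends on the convention in use.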

Binning in Excel

Which formulae in MS Excel can we use for -
equi-depth binning
equi-width binning
Here's what I used. The data I was binning was in A2:A2001.
Equi-width:
I calculated the width in a separate cell (U2), using this formula:
=(MAX($A$2:$A$2001) - MIN($A$2:$A$2001) + 0.00000001)/10
10 is the number of bins. The + 0.00000001 is there because without it, values equal to the maximum were getting put into their own bin.
Then, for the actual binning, I used this:
=ROUNDDOWN(($A2-MIN($A$2:$A$2001))/$U$2, 0)
This function is finding how many bin-widths above the minimum your value is, by dividing (value - minimum) by the bin width. We only care about how many full bin-widths fit into the value, not fractional ones, so we use ROUNDDOWN to chop off all the fractional bin-widths (that is, show 0 decimal places).
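The same equi-width logic can be sketched in Python to see what the two formulas compute together. The function name equi_width_bins and the eps parameter are made up for the example; eps plays the role of the small constant added to the range.

```python
import math

def equi_width_bins(values, n_bins, eps=1e-8):
    """Assign each value a 0-based bin index using equal-width bins."""
    lo = min(values)
    hi = max(values)
    # Slightly widen the range so the maximum falls in the last bin
    # instead of starting a bin of its own.
    width = (hi - lo + eps) / n_bins
    # Count how many whole bin-widths fit between the minimum and the value.
    return [math.floor((v - lo) / width) for v in values]

print(equi_width_bins([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3))
```

Because of the widened range, a value that sits exactly on an interior boundary (4 and 7 in this example) falls into the lower of the two bins.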
Equi-depth
This one is simpler.
=ROUNDDOWN(PERCENTRANK($A$2:$A$2001, $A2)*10, 0)
First, get the percentile rank of the current cell ($A2) out of all the cells being binned ($A$2:$A$2001). This will be a value between 0 and 1, so to convert it into bins, just multiply by the total number of bins you want (I used 10). Then, chop off the decimals the same way as before.
For either of these, if you want your bins to start at 1 rather than 0, just add a +1 to the end of the formula.
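A Python sketch of the equi-depth idea follows. It rests on two assumptions that are not in the answer: the percentile rank is approximated as (number of smaller values) / (n - 1), without Excel's rounding, and the top bin is clamped so that the maximum (rank 1.0) does not land in an extra bin. The function name equi_depth_bins is made up.

```python
import math

def equi_depth_bins(values, n_bins):
    """Assign each value a 0-based bin index so bins hold roughly equal counts."""
    n = len(values)
    if n < 2:
        return [0] * n
    bins = []
    for v in values:
        smaller = sum(1 for other in values if other < v)
        rank = smaller / (n - 1)                  # between 0 and 1, like a percentile rank
        index = math.floor(rank * n_bins)         # chop off the fractional part
        bins.append(min(index, n_bins - 1))       # clamp: rank 1.0 would give n_bins
    return bins

print(equi_depth_bins([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5))
```

Adding 1 to each index gives bins that start at 1 rather than 0, as mentioned above.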
Best approach is to use the built-in method:
http://support.microsoft.com/kb/214269
I think the VBA version of the addin (step 3 with most versions) will also give you the code.
Put this formula in B1:
=MAX( ROUNDUP( PERCENTRANK($A$1:$A$8, A1) *4, 0),1)
Fill down the formula all across B column and you are done. The formula divides the range into 4 equal buckets and it returns the bucket number which the cell A1 falls into. The first bucket contains the lowest 25% of values.
General pattern is:
=MAX( ROUNDUP ( PERCENTRANK ([Range], [TestCell]) * [NumberOfBuckets], 0), 1)
You may have to build the matrix to graph.
For the bin brackets you could use =PERCENTILE() for equi-depth, and a proportion of the difference =MAX(Data) - MIN(Data) for equi-width.
You could obtain the frequency with =COUNTIF(). The bin's mean could be obtained using =SUMPRODUCT((Data>LOWER_BRACKET)*(Data<UPPER_BRACKET)*Data)/frequency
More complex statistics could be reached by hacking around with SUMPRODUCT and/or array formulas (which I do not recommend, since they are very hard to comprehend for a non-programmer).
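For comparison, the frequency and mean of one bin can be sketched in Python. The strict comparisons mirror the SUMPRODUCT formula above, so values exactly equal to a bracket are excluded. The function name bin_stats is made up, and the count is assumed to use the same brackets as the mean.

```python
def bin_stats(data, lower, upper):
    """Return (frequency, mean) of the values strictly between lower and upper."""
    members = [x for x in data if lower < x < upper]
    count = len(members)
    # An empty bin has no mean; None avoids a division by zero.
    mean = sum(members) / count if count else None
    return count, mean

print(bin_stats([1, 2, 3, 4, 5], 1, 5))
```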
