So let's say I want to mix these 2 audio tracks:
In Audacity, I can use the "Mix and Render" option to mix them together, and I'll get this:
However, when I try to write my own code to mix, I get this:
This is essentially how I mix the samples:
private function mixSamples(sample1:UInt, sample2:UInt):UInt {
    return (sample1 + sample2) & 0xFF;
}
(The syntax is Haxe but it should be easy to follow if you don't know it.)
These are 8-bit sample audio files, and I want the product to be 8-bit as well, hence the & 0xFF.
I do understand that by simply adding the samples, I should expect clipping. My issue is that mixing in Audacity doesn't cause clipping (at least not to the extent that my code does), and by looking at the "tail" of the second (longer) track, it doesn't seem to reduce the amplitude. It doesn't sound any softer either.
So basically, my question is this: what's Audacity doing that I'm not? I want to mix tracks to sound exactly as if they're being played on top of one another, but I (obviously) don't want this horrendous clipping.
Here is what I get if I sign the values before I add, then unsign the sum value, as suggested by Radiodef:
As you can see it's much better than before, but is still quite distorted and noisy compared to the result Audacity produces. So my problem still stands, Audacity must be doing something differently.
I mixed the first track on itself, both with my code and Audacity, and compared the points where distortion occurs. This is Audacity's result:
And this is my result:

I think what is happening is that you are summing them as unsigned. A typical sound wave is both positive and negative, which is why waves add together the way they do (some parts cancel). If you have an 8-bit sample that is -96 and another that is 96, summing them gives 0. If the audio is unsigned, those same samples are stored as 32 and 224, and their sum is 256 (offset and overflow); masked with & 0xFF that becomes 0, which is full negative scale in unsigned 8-bit audio rather than silence.
What you need to do is sign them before summing. To sign 8-bit samples convert them to a signed int type and subtract 128 from all of them. I assume what you have are WAV files and you will need to unsign them again after the sum.
Audacity probably does floating point processing. I've heard some really dubious claims about floating point, such as that it has "infinite dynamic range", but it is true that it doesn't clip in the same determinate and obvious way as integers do. Floating point has a finite range of values, same as integers, but the largest and smallest values are much farther apart. (That's about the simplest way to put it.) Floating point allows much greater amplitude changes in the audio; the catch is that its precision is limited by the mantissa (24 bits for a 32-bit float), so the signal-to-noise ratio is lower than that of an integer format of the same total width.
As for the weird distortion, my best guess is that it comes from the mask you are doing with & 0xFF. If you want to actually clip instead of getting overflow, you will need to do so yourself.
for (int i = 0; i < samplesLength; i++) {
    if (samples[i] > 127) {
        samples[i] = 127;
    } else if (samples[i] < -128) {
        samples[i] = -128;
    }
}
Otherwise say you have two samples that are 125, summing gets you 250 (11111010). Then you unsign (add 128) and get 378 (101111010). An & will get you 1111010 which is 122. Other numbers might get you results that are effectively negative or close to 0.
If you want to clip at something other than 8-bit, full scale for a bit depth n will be positive (2 ^ (n - 1)) - 1 and negative 2 ^ (n - 1) so for example 32767 and -32768 for 16-bit.
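To make the sign, sum, clip, unsign sequence concrete, here is a minimal Python sketch. The function names full_scale and mix_unsigned_8bit are made up for this illustration; it assumes unsigned 8-bit samples in the range 0..255, as in the question.

```python
def full_scale(bits):
    """Return (negative, positive) full scale for signed samples of this bit depth."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def mix_unsigned_8bit(a, b):
    """Mix two unsigned 8-bit samples (0..255), clipping instead of wrapping."""
    lo, hi = full_scale(8)            # (-128, 127)
    total = (a - 128) + (b - 128)     # sign both samples, then sum
    total = max(lo, min(hi, total))   # clip to signed 8-bit full scale
    return total + 128                # unsign again


print(full_scale(16))                 # (-32768, 32767)
print(mix_unsigned_8bit(32, 224))     # -96 + 96 = 0, stored as 128
print(mix_unsigned_8bit(253, 253))    # 125 + 125 clips to 127, stored as 255
```

Because the sum is clamped before it is converted back, no mask is needed and nothing wraps around.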
Another thing you can do instead of clipping is to search for clipping and normalize. Something like:
double[] normalize(double[] samples, int length, int destBits) {
    double fsNeg = -Math.pow(2, destBits - 1);
    double fsPos = -fsNeg - 1;
    double peak = 0;
    double norm = 1;
    for (int i = 0; i < length; i++) {
        // find highest clip if there is one
        if (samples[i] < fsNeg || samples[i] > fsPos) {
            norm = Math.abs(samples[i]);
            if (norm > peak) {
                peak = norm;
            }
        }
    }
    if (peak != 0) {
        // ratio to reduce to where there is not a clip
        // (positive full scale is used so a positive peak also fits)
        norm = fsPos / peak;
        for (int i = 0; i < length; i++) {
            samples[i] *= norm;
        }
    }
    return samples;
}

It's a lot simpler than you think; although your original files are 8-bit, Audacity handles them internally as 32-bit floating point. You can see this in the screenshot, in the information panel to the left of each track. This means that adding 2 tracks together means adding two floating point samples at each point, and will simply yield sample values from -2.0 to +2.0, which are then clamped to the -1 to +1 range. By comparison, adding two 8-bit integers together will yield another 8-bit number where the value overflows and wraps around. (This can apply whether you use signed or unsigned values.)
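As a rough illustration of that idea (not Audacity's actual code), the following Python sketch converts unsigned 8-bit samples to floats, adds them, clamps the sum to the -1 to +1 range, and converts back. The helper names and the exact 8-bit to float mapping are assumptions made for this example.

```python
def to_float(u8):
    """Map an unsigned 8-bit sample (0..255) to a float in [-1.0, 1.0)."""
    return (u8 - 128) / 128.0


def to_u8(x):
    """Map a float in [-1.0, 1.0] back to an unsigned 8-bit sample."""
    return max(0, min(255, int(round(x * 128.0)) + 128))


def mix_float(track_a, track_b):
    """Mix two lists of unsigned 8-bit samples via floats, clamping to [-1, 1]."""
    length = max(len(track_a), len(track_b))
    mixed = []
    for i in range(length):
        # A track that has already ended contributes silence.
        a = to_float(track_a[i]) if i < len(track_a) else 0.0
        b = to_float(track_b[i]) if i < len(track_b) else 0.0
        mixed.append(to_u8(max(-1.0, min(1.0, a + b))))
    return mixed


print(mix_float([253, 32, 128], [253, 224, 128, 200]))  # [255, 128, 128, 200]
```

The tail of the longer track passes through unchanged, which matches the observation in the question that the second track's tail is not reduced in amplitude.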


What is the correct audio volume slider formula?

I'm building a VoIP application. If I take the slider value and just multiply audio samples by it, I get incorrect, nonlinear sounding results. What's the correct formula to get smooth results?
The correct formula is the decibel formula solved for Prms. Here's example code in C++:
#include <cmath>
#include <cstddef>
#include <cstdint>

// level is 0 to 1, silence is dBFS at level 0
void AdjustVolume(int16_t* buffer, size_t length, float level, float silence = -96)
{
    float factor = std::pow(10.0f, (1 - level) * silence / 20.0f);
    for (size_t i = 0; i < length; i++)
        buffer[i] = static_cast<int16_t>(buffer[i] * factor);
}
There's one tweakable: silence. It's the amount of noise when there's no sound. Or: the loudness level below which you can't hear the sound because of the background noise. The usual theoretical figure for 16-bit audio samples is about -96 dB, the ratio between one quantization step and the full 16-bit range (20 * log10(1/65536) is roughly -96.3 dB). In the real world, however, there's background noise produced by the audio equipment and the surroundings of the listener, so you might want to pick a noisier silence level, like -30 dB or something. Picking the correct silence value will maximize the useful surface area of your volume slider, or minimize the amount of slider area where no perceptible change in volume occurs.
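To see how the slider position maps to gain, here is a small Python sketch of the same formula. The names volume_factor and factor_to_db are hypothetical helpers for this illustration.

```python
import math


def volume_factor(level, silence_db=-96.0):
    """Map a slider level in [0, 1] to a linear gain factor."""
    return 10.0 ** ((1.0 - level) * silence_db / 20.0)


def factor_to_db(factor):
    """Convert a linear gain factor back to decibels."""
    return 20.0 * math.log10(factor)


for level in (0.0, 0.25, 0.5, 0.75, 1.0):
    gain = volume_factor(level)
    print(f"level={level:.2f} gain={gain:.6f} ({factor_to_db(gain):.1f} dB)")
```

The gain in decibels is linear in the slider position, which is what makes the slider feel even. Note that at level 0 the factor is small but not zero, so with a noisier silence value such as -30 dB you may want to treat level 0 as an explicit mute.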

Stacking and dynamic programming

Basically I'm trying to solve this problem:
Given N unit cube blocks, find the smallest number of piles to make in order to use all the blocks. A pile is either a cube or a pyramid. For example, two valid piles are the cube 4*4*4 = 64 using 64 blocks, and the pyramid 1²+2²+3²+4² = 30 using 30 blocks.
However, I can't find the right angle to approach it. I feel like it's similar to the knapsack problem, but I couldn't find an implementation.
Any help would be much appreciated!
First I will give a recurrence relation that allows the problem to be solved recursively. Given N, consider the square numbers and the triangle numbers in {1,...,N}, and let PERMITTED_SIZES be the union of these two subsets. (For the piles described in the question, the permitted sizes would instead be the cube numbers k^3 and the pyramid numbers 1^2 + 2^2 + ... + k^2; the method is the same.) Note that, as 1 occurs in PERMITTED_SIZES, any instance is feasible and yields a nonnegative optimum.
The following function in pseudocode will solve the problem in the question recursively.
int MinimumNumberOfPiles(int N)
{
    if (N == 0) return 0; // no blocks left, no piles needed
    int Result = 1 + min { MinimumNumberOfPiles(N-i) }
        where i in PERMITTED_SIZES and i not larger than N;
    return Result;
}
The idea is to choose a permitted bin size for the items, remove these items (which makes the problem instance smaller) and solve recursively for the smaller instances. To use dynamic programming in order to avoid evaluating the same subproblem multiple times, one would use a one-dimensional state space, namely an array A with N+1 entries A[0..N], where A[i] is the minimum number of piles needed for i unit blocks. Using this state space, the problem can be solved iteratively as follows.
for (int i = 0; i <= N; i++)
    if i is 0 set A[i] to 0,
    if i occurs in PERMITTED_SIZES, set A[i] to 1,
    set A[i] to positive infinity otherwise;
This initializes the states which are known beforehand and correspond to the base cases in the above recursion. Next, the missing states are filled using the following loop.
for (int i = 0; i <= N; i++)
    if (A[i] is positive infinity)
        A[i] = 1 + min { A[i-j] : j is in PERMITTED_SIZES and j is smaller than i }
The desired optimal value will be found in A[N]. Note that this algorithm only calculates the minimum number of piles, but not the piles themselves; if a suitable partition is needed, it has to be found either by backtracking or by maintaining additional auxiliary data structures.
In total, provided that PERMITTED_SIZES is known, the problem can be solved in O(N^2) steps, as PERMITTED_SIZES contains at most N values.
The problem can be seen as an adaptation of the Rod Cutting Problem where each square or triangle size has value 0 and every other size has value 1, and the objective is to minimize the total value.
In addition, some computation is needed to generate PERMITTED_SIZES from the input.
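Generating the permitted sizes is cheap. The following Python sketch assumes the pile shapes from the question (cubes k^3 and pyramids 1^2 + ... + k^2) rather than the square and triangle numbers used above; the function name permitted_sizes is made up for this example.

```python
def permitted_sizes(n):
    """Return the sorted pile sizes <= n: cubes k^3 and pyramids 1^2 + ... + k^2."""
    sizes = set()
    k = 1
    while k ** 3 <= n:
        sizes.add(k ** 3)
        k += 1
    k, pyramid = 1, 1
    while pyramid <= n:
        sizes.add(pyramid)
        k += 1
        pyramid += k * k      # next pyramid adds one more square layer
    return sorted(sizes)


print(permitted_sizes(100))   # [1, 5, 8, 14, 27, 30, 55, 64, 91]
```

Both loops stop after roughly the cube root of N iterations, so this step is negligible next to filling the array A.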
More precisely, the corresponding choice of piles, once A is filled, can be generated using backtracking as follows.
int i = N; // i is the total amount still to be distributed
while ( i > 0 )
{
    choose j such that
        j is in PERMITTED_SIZES and j is not larger than i
        and A[i] = 1 + A[i-j] holds
    Output "Take a set of size" + j; // or just output j, which is the set size
    // the part above can be commented as "let's find out how
    // the value in A[i] was generated"
    set i = i-j; // decrease amount to distribute
}
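Putting the table fill and the backtracking together, a minimal Python sketch could look like the following. The function name min_piles is hypothetical, and the list of sizes is passed in by the caller; the example list holds the cube and pyramid sizes up to 100 and must contain 1 so that every amount is reachable.

```python
import math


def min_piles(n, sizes):
    """Return (count, piles): the fewest piles summing to n, and one such choice."""
    best = [0] + [math.inf] * n            # best[i] = minimum piles for i blocks
    for i in range(1, n + 1):
        for s in sizes:
            if s <= i and best[i - s] + 1 < best[i]:
                best[i] = best[i - s] + 1
    piles = []
    i = n
    while i > 0:                            # find out how best[i] was generated
        s = next(s for s in sizes if s <= i and best[i] == best[i - s] + 1)
        piles.append(s)
        i -= s                              # decrease the amount to distribute
    return best[n], piles


sizes = [1, 5, 8, 14, 27, 30, 55, 64, 91]
count, piles = min_piles(94, sizes)
print(count, piles)                         # 2 [30, 64]
```

For 94 blocks the result is the cube of 64 plus the pyramid of 30, the two example piles from the question.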

What is the unit of the return values (coefficients) of an FFT?

My application performs an FFT on the raw audio signal (all microphone readings are 16-bit integers stored in values, an array of 1024 cells). It first normalizes the readings by the 16-bit full scale. Then it extracts the magnitude at 400 Hz.
int sample_rate = 22050;
int values[1024];
// omitted: code to read 16bit audio samples into values array
double doublevalues[1024];
for (int i = 0; i < 1024; i++) {
    doublevalues[i] = (double)values[i] / 32768.0; // 16bit
}
fft(doublevalues); // inplace FFT, returns only real coefficients
double magnitude = 400.0 / sample_rate * 2048;
printf("magnitude of 400Hz: %f", magnitude);
When I try this out and generate a 400Hz signal to see the value of magnitude, it is around 0 when there is no 400Hz signal and goes up to 30 or 40 when there is.
What is the unit or meaning of the magnitude field? It surprises me that it is larger than 1 even though I normalize the raw signal to be between -1..+1.
It depends on which FFT you are using, as there are different conventions on scaling. The most common convention is that the output values are scaled by N, where N is the size of the FFT. So a 1024 point FFT will have output values which are 1024 times greater than the corresponding input values. A further complication is that for real-to-complex FFTs people typically ignore the symmetric upper half of the FFT, which is fine (because it's conjugate symmetric) but you need to account for a factor of 2 if you do this.
Other common conventions for FFT scaling are (a) no scaling (i.e. the factor of N has been removed) and (b) sqrt(N), which is sometimes used for symmetric scaling behaviour of FFT versus IFFT (sqrt(N) in each direction).
Since sqrt(1024) == 32, it's possible that you're using an FFT routine with sqrt(N) scaling, since you seem to be seeing values of around 30 for a unit magnitude sine wave input.
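The effect of the scaling convention is easy to check with a naive DFT. The Python sketch below is only an illustration (it is not the FFT routine from the question): it transforms a unit-amplitude sine that fits the window exactly and prints the peak magnitude under three conventions. N, BIN and the helper dft are choices made for this example.

```python
import cmath
import math


def dft(x):
    """Naive unscaled forward DFT: X[k] = sum over n of x[n] * exp(-2*pi*i*k*n/N)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


N = 64
BIN = 5  # the sine completes exactly 5 cycles in the window
signal = [math.sin(2 * math.pi * BIN * n / N) for n in range(N)]
spectrum = dft(signal)

raw = abs(spectrum[BIN])
print(f"unscaled:           {raw:.3f}")                  # N/2 for a unit sine
print(f"divided by sqrt(N): {raw / math.sqrt(N):.3f}")
print(f"divided by N/2:     {raw / (N / 2):.3f}")        # recovers the amplitude
```

With an unscaled transform the peak is N/2 rather than 1, and dividing by N/2 (the factor of N combined with the factor of 2 for the ignored upper half) recovers the input amplitude.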

How to draw a frequency spectrum from a Fourier transform

I want to plot the frequency spectrum of a music file (like they do, for example, in Audacity). Hence I want the frequency in Hertz on the x-axis and the amplitude (or decibels) on the y-axis.
I divide the song (about 20 million samples) into blocks of 4096 samples at a time. Each block results in 2049 (N/2 + 1) complex numbers (sine and cosine -> real and imaginary part). So now I have these thousands of individual 2049-element arrays; how do I combine them?
Let's say I do the FFT 5000 times, resulting in 5000 2049-element arrays of complex numbers. Do I add up all the values of the 5000 arrays and then take the magnitude of the combined 2049-element array? Do I then scale the x-axis with the song's sample rate / 2 (e.g. 22050 for a 44100 Hz file)?
Any information will be appreciated.
What application are you using for this? I assume you are not doing this by hand, so here is a Matlab example:
>> fbins = fs/N * (0:(N/2 - 1)); % Where N is the number of fft samples
now you can perform
>> plot(fbins, abs(fftOfSignal(1:N/2)))
Wow I've written a load about this just recently.
I even turned it into a blog post available here.
My explanation leans towards spectrograms, but it's just as easy to render a chart like the one you describe!
I might not be correct on this one, but as far as I'm aware, you have 2 ways to get the spectrum of the whole song.
1) Do a single FFT on the whole song, which will give you an extremely good frequency resolution, but is in practice not efficient, and you don't need this kind of resolution anyway.
2) Divide it into small chunks (like blocks of 4096 samples, as you said), get the FFT for each of those and average the spectra. You will compromise on the frequency resolution, but make the calculation more manageable (and also decrease the variance of the spectrum). Wilhelmsen's link describes how to compute an FFT in C++, and I think some libraries already exist to do that, like FFTW (but I never managed to compile it, to be fair =) ).
To obtain the magnitude spectrum, average the energy (square of the magnitude) across all your chunks for every single bin. To get the result in dB, just take 10 * log10 of the results. That is of course assuming that you are not interested in the phase spectrum. I think this is known as Bartlett's method.
I would do something like this:
// At this point you have the FFT chunks
float sum[N/2 + 1];
// For each bin
for (int binIndex = 0; binIndex < N/2 + 1; binIndex++)
{
    sum[binIndex] = 0.0f; // start from zero before accumulating
    for (int chunkIndex = 0; chunkIndex < chunkNb; chunkIndex++)
    {
        // Get the squared magnitude (energy) of the complex number
        float re = FFTChunk[chunkIndex].bins[binIndex].real;
        float im = FFTChunk[chunkIndex].bins[binIndex].im;
        // Add the energy
        sum[binIndex] += re * re + im * im;
    }
    // Average the energy
    sum[binIndex] /= chunkNb;
}
// Then get the values in decibels
for (int binIndex = 0; binIndex < N/2 + 1; binIndex++)
{
    sum[binIndex] = 10 * log10f(sum[binIndex]);
}
Hope this answers your question.
Edit: Goz's post will give you plenty of information on the matter =)
Commonly, you would take just one of the arrays, corresponding to the point in time of the music in which you are interested. Then you would calculate the log of the magnitude of each complex array element. Plot the N/2 results as Y values, and scale the X axis from 0 to Fs/2 (where Fs is the sampling rate).
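The block-averaging approach from the answers above, together with the mapping from bin index to frequency (bin k corresponds to k * Fs / N Hz), can be sketched in Python as follows. This is a toy illustration with a naive DFT and a short synthetic tone; the function names and the tiny floor added before the logarithm are assumptions for this example, and a real program would use an FFT library.

```python
import cmath
import math


def power_spectrum(block):
    """Squared magnitudes of DFT bins 0..N/2 of one real-valued block."""
    n = len(block)
    return [
        abs(sum(block[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


def averaged_spectrum_db(samples, block_size, sample_rate):
    """Average the power of consecutive blocks; return (freqs_hz, levels_db)."""
    blocks = [samples[i:i + block_size]
              for i in range(0, len(samples) - block_size + 1, block_size)]
    bins = block_size // 2 + 1
    mean_power = [0.0] * bins
    for block in blocks:
        for k, p in enumerate(power_spectrum(block)):
            mean_power[k] += p / len(blocks)
    freqs = [k * sample_rate / block_size for k in range(bins)]
    # A tiny floor avoids log10(0) for empty bins.
    levels = [10 * math.log10(p + 1e-20) for p in mean_power]
    return freqs, levels


FS, N = 8000, 64
tone = [math.sin(2 * math.pi * 1000 * t / FS) for t in range(N * 4)]
freqs, levels = averaged_spectrum_db(tone, N, FS)
peak = max(range(len(levels)), key=levels.__getitem__)
print(freqs[peak])  # 1000.0, the frequency of the strongest bin
```

Plotting levels against freqs gives the chart described in the question, with the x-axis running from 0 to Fs/2.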

Microsoft.DirectX.Vector3.Normalize() inconsistency

There are two ways to normalize a Vector3 object: by calling Vector3.Normalize(), or by normalizing from scratch:
class Tester {
    static Vector3 NormalizeVector(Vector3 v) {
        float l = v.Length();
        return new Vector3(v.X / l, v.Y / l, v.Z / l);
    }

    public static void Main(string[] args) {
        Vector3 v = new Vector3(0.0f, 0.0f, 7.0f);
        Vector3 v2 = NormalizeVector(v);
        v.Normalize();
        // (the calls that print v2 and then v are not shown)
    }
}
The code above produces this:
X: 0
Y: 0
Z: 1
X: 0
Y: 0
Z: 0.9999999
(Bonus points: Why Me?)
Look how they implemented it (e.g. in asm).
Maybe they wanted to be faster and produced something like:
l = 1 / v.length();
return new Vector3(v.X * l, v.Y * l, v.Z * l);
to trade 2 divisions for 3 multiplications (because they thought multiplications were faster than divisions, which for modern FPUs is often not the case). This introduces one more rounding step, and therefore less precision.
This would be the often cited "premature optimization".
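Whether multiplying by a reciprocal changes the result can be checked by emulating single precision. The Python sketch below rounds every intermediate value to a 32-bit float via the struct module; it says nothing about how DirectX is actually implemented, and the helper names are made up for this example.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def divide(x, length):
    return f32(x / length)               # one rounding step


def multiply_by_reciprocal(x, length):
    inverse = f32(1.0 / length)          # rounded once here ...
    return f32(x * inverse)              # ... and rounded again here


print(divide(5.0, 3.0) == multiply_by_reciprocal(5.0, 3.0))  # False
print(divide(7.0, 7.0), multiply_by_reciprocal(7.0, 7.0))    # 1.0 1.0
```

So the two approaches can differ in the last bit, as they do for 5/3. For the question's input of 7.0, however, both paths give exactly 1.0 in this emulation, so a plain reciprocal multiplication would not by itself explain the 0.9999999.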
Don't care about this. There's always some error involved when using floats. If you're curious, try changing to double and see if this still happens.
You should expect this when using floats, the basic reason being that the computer processes in binary and this doesn't map exactly to decimal.
For an intuitive example of issues between different bases, consider the fraction 1/3. It cannot be represented exactly in decimal (it's 0.333333.....) but it can be in ternary (as 0.1).
Generally these issues are a lot less obvious with doubles, at the expense of computing costs (double the number of bits to manipulate). However in view of the fact that a float level of precision was enough to get man to the moon then you really shouldn't obsess :-)
These issues are sort of computer theory 101 (as opposed to programming 101, which you're obviously well beyond), and if you're heading towards DirectX code, where similar things can come up regularly, I'd suggest it might be a good idea to pick up a basic computer theory book and read it quickly.
You have here an interesting discussion about String formatting of floats.
Just for reference:
Your number requires 24 bits to be represented, which means that you are using up the whole mantissa of a float (23bits + 1 implied bit).
Single.ToString () is ultimately implemented by a native function, so I cannot tell for sure what is going on, but my guess is that it uses the last digit to round the whole mantissa.
The reason behind this could be that you often get numbers that cannot be represented exactly in binary, so you would get a long mantissa; for instance, 0.01 is represented internally as 0.00999... as you can see by writing:
float f = 0.01f;
Console.WriteLine ("{0:G}", f);
Console.WriteLine ("{0:G}", (double) f);
by rounding at the seventh digit, you will get back "0.01", which is what you would have expected.
Given the above, numbers with only 7 digits will not show this problem, as you already saw.
Just to be clear: the rounding is taking place only when you convert your number to a string: your calculations, if any, will use all the available bits.
Floats have a precision of 7 digits externally (9 internally), so if you go above that then rounding (with potential quirks) is automatic.
If you drop the float down to 7 digits (for instance, 1 to the left, 6 to the right) then it will work out and the string conversion will as well.
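The difference between the stored value and the 7-digit string can be reproduced outside .NET. This Python sketch emulates a 32-bit float with the struct module (the helper f32 is made up for this example) and is only meant to illustrate the rounding described above, not the behaviour of Single.ToString() itself.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


stored = f32(0.01)
print(repr(stored))            # the stored single, widened to double: just below 0.01
print(format(stored, ".7g"))   # rounded to 7 significant digits: 0.01
```

The stored bits are unchanged by the formatting; only the string shown to the user is rounded.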
As for the bonus points:
Why you ? Because this code was 'eager to blow on you'.
(Vulcan... blow... ok.
If your code is broken by minute floating point rounding errors, then I'm afraid you need to fix it, as they're just a fact of life.
