I've written the following code for frequency modulation of an audio signal. The audio itself is 1 sec long, sampled at 8000 Hz. I want to apply FM to this audio signal by using a sine wave with a frequency of 50 Hz (expressed as a fraction of the sampling frequency). The modulating signal has a modulation index of 0.25 so as to create only one pair of sidebands.
for (i = 0; i < 7999; i++) {
phi_delta = 8000 - 8000 * (1 + 0.25 * sin(2* pi * mf * i));
f_phi_accum += phi_delta; //this can have a negative value
/*keep only the integer part that'll be used as an index into the input array*/
i_phi_accum = f_phi_accum;
/*keep only the fractional part that'll be used to interpolate between samples*/
r_phi_accum = f_phi_accum - i_phi_accum;
//If I'm getting negative values should I convert them to positive
//r_phi_accum = fabs(f_phi_accum - i_phi_accum);
i_phi_accum = abs(i_phi_accum);
/*since i_phi_accum often exceeds 7999 I have to add this if statement so as to prevent out of bounds errors */
if (i_phi_accum < 7999)
output[i] = ((input[i_phi_accum] + input[i_phi_accum + 1])/2) * r_phi_accum;
}
Your calculation of phi_delta is off by a factor of 8000 and an offset - it should be 1 +/- a small value, i.e.
phi_delta = 1.0 + 0.25 * sin(2.0 * pi * mf * i);
which will result in phi_delta having a range of 0.75 to 1.25.
Does anybody know an algorithm for generating a random series of numbers (say 100 Java bytes, >= -127 and <= 127) which, when drawn as a bar chart, would look like a regular audio spectrum, like the SoundCloud ones?
I'm trying to write one. It uses multiple random and sine calculations, but the result is very ugly: something between a sine wave and an old toothbrush. I would be very thankful if you could direct me to one that is aesthetically convincing.
An algorithm with an explanation (and/or picture) is fine. Pseudocode would be very nice. Actual Java code is a bonus. :D
Edit:
This is the code I'm using right now. It's convoluted, but I'm basically adding a random deviation to a sine wave with random amplitude (which I'm not sure was a good idea).
private static final int FREQ = 7;
private static final double DEG_TO_RAD = Math.PI / 180;
private static final int MAX_AMPLITUDE = 127;
private static final float DEVIATION = 0.1f; // 10 percent is maximum deviation
private void makeSinusoidRandomBytes() {
byte[] bytes = new byte[AUDIO_VISUALIZER_DENSITY];
for (int i = 0; i < AUDIO_VISUALIZER_DENSITY; i++) {
int amplitude = random.nextInt(MAX_AMPLITUDE) - MAX_AMPLITUDE/2;
byte dev = (byte) (random.nextInt((int) Math.max(Math.abs(2 * DEVIATION * amplitude), 1))
- Math.abs(DEVIATION * amplitude));
bytes[i] = (byte) (Math.sin(i * FREQ * DEG_TO_RAD) * amplitude - dev);
}
this.bytes = bytes;
}
A real sound wave is actually a combination of sine waves of different frequencies and amplitudes added together, not random deviations from a sine wave. The difficult part will be choosing a combination of wave amplitudes and frequencies that produces output you subjectively like! However, most sound waves have a base frequency and then a number of overtones which "fit into" that wavelength. For example, a wave might have an overtone at 3/2 times the base frequency with 2/3 of the base amplitude. By combining these overtones and scaling the resulting waveform to the -127 to +127 range, you'll get an actual sound wave.
The following code is C#, but close enough to Java to give you an idea. It's from a game, where I needed to combine many sine waves together to create various types of oscillating effects:
/// <summary>
/// Return a value between 0 and 1 based on a sine-wave oscillating with a given combination of periods at a given point in time
/// </summary>
/// <param name="time">time to get wave value at</param>
/// <param name="periods">lengths of waves</param>
/// <returns>height of wave</returns>
public static float MultiPulse(float time, params float[] periods)
{
float c = 0;
foreach (float p in periods)
{
float cp = (MathHelper.Pi / p) * time;
float s = ((float)Math.Sin(cp) + 1) / 2;
c += s / periods.Length;
}
return c;
}
You probably want to modify that to allow you to specify different amplitudes as well as periods for the waves you are combining.
By combining many widely varying amplitudes and periods (frequencies) you should by trial and error be able to get something convincing.
Based on the idea see sharper gave me, this is the code I'm using right now:
int mainAmp = random.nextInt(MAX_AMPLITUDE) - MAX_AMPLITUDE / 2;
int overtoneAmp = random.nextInt(MAX_AMPLITUDE * 2 / 3) - MAX_AMPLITUDE / 3;
int overtone2Amp = random.nextInt(MAX_AMPLITUDE * 4 / 7) - MAX_AMPLITUDE * 2 / 7;
int mainFreq = random.nextInt(7) + 7;
int overtoneFreq = mainFreq * 3 / 2;
int overtone2Freq = mainFreq * 7 / 4;
byte[] bytes = new byte[AUDIO_VISUALIZER_DENSITY];
for (int i = 0; i < AUDIO_VISUALIZER_DENSITY; i++) {
bytes[i] = (byte) (Math.sin(i * mainFreq * DEG_TO_RAD) * mainAmp
+ Math.sin(i * overtoneFreq * DEG_TO_RAD) * overtoneAmp
+ Math.sin(i * overtone2Freq * DEG_TO_RAD) * overtone2Amp);
}
With the code above, the main frequency is between 7 and 13. You can play with those values. The other two overtones I'm using are (2 - 1/2)x and (2 - 1/4)x of the main frequency. You can add more, like (2 - 1/8)x etc., or use another series of frequencies. I also randomize the amplitudes to get a unique wave each time.
These are some waves I'm drawing using this code:
I'm generating a sine wave using the following method -
sampling rate = 22050;
theta = 0;
for (i = 0; i < N; i++)
{
theta = phase * 2 * PI;
signal[i] = amplitude * sin(theta);
phase = phase + frequency/sampling rate;
}
When I generate a signal with a frequency of 8000 Hz, there is distortion in the output. Frequencies below this (e.g. 6000 Hz) are generated correctly. The 8000 Hz signal is generated correctly if I place a check on the phase like so -
if (phase > 1)
{
float temp = phase - 1;
phase = temp;
}
I think it has something to do with the sine function in Xcode, probably a range of values it can accept? The same code with and without the phase wrapping has no difference in Matlab. Can someone explain what's happening here?
I believe the calculation should be (2.0 * PI) * Frequency/Samplerate
This gives you the phase increment per sample in radians. Accumulate these increments and feed the running total into the sin function to calculate each sample.
Technically, your first statement is incorrect as worded. FS/2 is the Nyquist frequency. You can generate frequencies above it, but they will alias.
In terms of phase wrapping, there are different ways to manage this.
My understanding is that the accumulated radian value is a 'linear' representation of the phase that doesn't repeat, while the phase itself revolves within 2 pi. So you may not have a wrap issue if you manage phase by accumulating radians.
Happy to be corrected by more knowledgeable folks.
I'm not certain, but I believe the problem might be:
theta = phase * 2 * PI;
I think Xcode will change the result to an integer. You might want to try:
theta = phase * 2.0 * PI;
instead, and make sure your PI variable is a double.
All of which makes this off-topic for DSP.SE. :-)
@cixelsyd has the correct formula. Here is the code to create a set of samples of a given frequency based on a sample rate:
incr_theta := (2.0 * math.Pi * given_freq) / samples_per_second
phase := -1.74 // given phase ... typically 0 note its a constant
theta := 0.0
for curr_sample := 0; curr_sample < number_of_samples; curr_sample++ {
source_buffer[curr_sample] = math.Sin(theta + phase)
theta += incr_theta
}
For efficiency it's best to move the calculation of delta theta outside of the loop. Notice that phase is a constant, as it just gives us an initial offset.
I'm seeing strange behaviour in my attempt to code Excel's NORMINV() in C. For norminv() I took an implementation from a mathematician; it's probably correct, since I also tried different ones with the same result. Here's the code:
double calculate_probability(double x0, double x1)
{
return x0 + (x1 - x0) * rand() / ((double)RAND_MAX);
}
int main() {
long double probability = 0.0;
long double mean = 0.0;
long double stddev = 0.001;
long double change_percentage = 0.0;
long double current_price = 100.0;
srand(time(0));
int runs = 0;
long double prob_sum = 0.0;
long double price_sum = 0.0;
while (runs < 100000)
{
probability = calculate_probability(0.00001, 0.99999);
change_percentage = mean + stddev * norminv(probability); //norminv(p, mu, sigma) = mu + sigma * norminv(p)
current_price = current_price * (1.0 + change_percentage);
runs++;
prob_sum += probability;
price_sum += current_price;
}
printf("\n\n%f %f\n", price_sum / runs, prob_sum / runs);
return 0;
}
Now I want to simulate Excel's NORMINV(rand(), 0, 0.001) where rand() is a value > 0 and < 1, 0 is the mean and 0.001 would be the standard deviation.
With 1000 values it looks okay:
100.729780 0.501135
With 10000 values it spreads too much:
107.781909 0.502301
And with 100000 values it sometimes spreads even more:
87.876500 0.498738
Now I don't know why that happens. My assumption is that the random number generator has to be normally distributed, too. In my case probability is calculated fine since the mean is pretty much 0.5 all the time. Thus I don't know why the mean deviation is increasing. Can somebody help me?
You're doing something along the lines of a random walk, except your moves are with a multiplicative scaling factor rather than additive steps.
Consider two successive moves, the first of which gives 20% inflation, the second with 20% deflation. Starting with a baseline of 100, after the first step you're at 120. If you now take 80% of 120, you get 96 rather than the original 100. In other words, seemingly symmetric scaling factors are not actually symmetric. While your scaling factors are random, they are still being created symmetrically around 1, so I'm not surprised to see deviations accumulate.
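The arithmetic of that example as a tiny C sketch (the function name up_then_down is made up for illustration):

```c
/* Hypothetical helper: applies a gain of +x followed by a loss of -x
 * (both as fractions, e.g. 0.2 for 20%) to a starting price. */
double up_then_down(double price, double x)
{
    return price * (1.0 + x) * (1.0 - x);   /* equals price * (1 - x*x) */
}
```

up_then_down(100.0, 0.2) gives 96, as in the example above. Separately, the spread of a sum of n independent steps grows roughly with sqrt(n), so with a standard deviation of 0.001 per step and 100000 steps the log of the price can be expected to wander by something like 0.001 * sqrt(100000), which is about 0.32.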
Ok, so basically, I am implementing the following algorithm:
1) Slice signal of size 256 with an overlap of 128
2) Multiply each chunk with the Hanning window
3) Get DFT
4) Compute the abs value sqrt(re*re+im*im)
Plotting these values as an image with imshow, I get the following result:
This looks OK: it clearly shows some difference, i.e. the spike appears where the signal has the most amplitude. However, in Python I get this result:
I know that I'm doing something right, but also something wrong. I just can't find where, which makes me doubt that I've done it correctly.
Any rough ideas to where I could be going wrong here? I mean, is plotting the abs value the right way here or not?
Thanks
EDIT:
Result after clamping..
UPDATE:
Code:
for(unsigned j=0; (j < stft_temp[i].size()/2); j++)
{
double v = 10 * log10(stft_temp[i][j].re * stft_temp[i][j].re + stft_temp[i][j].im * stft_temp[i][j].im);
double pixe = 1.5 * (v + 100);
STFT[i][j] = (int) pixe;
}
Typically you might want to use a log magnitude and then scale to the required range, which would usually be 0..255. In pseudo-code:
mag_dB = 10 * log10(re * re + im * im); // get log magnitude (dB)
pixel_intensity = 1.5 * (mag_dB + 100); // offset and scale
pixel_intensity = min(pixel_intensity, 255); // clamp to 0..255
pixel_intensity = max(pixel_intensity, 0);
I'm using FMOD library to extract PCM from an MP3. I get the whole 2 channel - 16 bit thing, and I also get that a sample rate of 44100hz is 44,100 samples of "sound" in 1 second. What I don't get is, what exactly does the 16 bit value represent. I know how to plot coordinates on an xy axis, but what am I plotting? The y axis represents time, the x axis represents what? Sound level? Is that the same as amplitude? How do I determine the different sounds that compose this value. I mean, how do I get a spectrum from a 16 bit number.
This may be a separate question, but it's actually what I really need answered: How do I get the amplitude at every 25 milliseconds? Do I take 44,100 values, divide by 40 (40 * 0.025 seconds = 1 sec) ? That gives 1102.5 samples; so would I feed 1102 values into a blackbox that gives me the amplitude for that moment in time?
Edited original post to add code I plan to test soon: (note, I changed the frame rate from 25 ms to 40 ms)
// 44100 / 25 frames = 1764 samples per frame -> 1764 * 2 channels * 2 bytes [16 bit sample] = 7056 bytes
private const int CHUNKSIZE = 7056;
uint bytesread = 0;
var squares = new double[CHUNKSIZE / 4];
const double scale = 1.0d / 32768.0d;
do
{
result = sound.readData(data, CHUNKSIZE, ref read);
Marshal.Copy(data, buffer, 0, CHUNKSIZE);
//PCM samples are 16 bit little endian
Array.Reverse(buffer);
for (var i = 0; i < buffer.Length; i += 4)
{
var avg = scale * (Math.Abs((double)BitConverter.ToInt16(buffer, i)) + Math.Abs((double)BitConverter.ToInt16(buffer, i + 2))) / 2.0d;
squares[i >> 2] = avg * avg;
}
var rmsAmplitude = ((int)(Math.Floor(Math.Sqrt(squares.Average()) * 32768.0d))).ToString("X2");
fs.Write(buffer, 0, (int) read);
bytesread += read;
statusBar.Text = "writing " + bytesread + " bytes of " + length + " to output.raw";
} while (result == FMOD.RESULT.OK && read == CHUNKSIZE);
After loading mp3, seems my rmsAmplitude is in the range 3C00 to 4900. Have I done something wrong? I was expecting a wider spread.
Yes, a sample represents amplitude (at that point in time).
To get a spectrum, you typically convert it from the time domain to the frequency domain.
Last Q: Multiple approaches are used - You may want the RMS.
Generally, the x axis is the time value and y axis is the amplitude. To get the frequency, you need to take the Fourier transform of the data (most likely using the Fast Fourier Transform [fft] algorithm).
To use one of the simplest "sounds", let's assume you have a pure tone with a single frequency f. This is represented (in the amplitude/time domain) as y = sin(2 * pi * f * x).
If you convert that into the frequency domain, you end up with a single peak at frequency f.
Each sample represents the voltage of the analog signal at a given time.