I am trying to make an old TV static type effect in P5.js, and although I am able to make the effect work, the frame rate is quite low.
My approach is the following:
Loop through each pixel
Set the stroke to a random value
Call the point() function to paint the pixel
Initially, I was doing this in the draw function directly but it was very slow. I was getting less than 1 frame a second. So I switch to the following paint buffer approach:
const SCREEN_WIDTH = 480
const SCREEN_HEIGHT = 480
var ScreenBuffer;
function setup(){
createCanvas(SCREEN_WIDTH, SCREEN_HEIGHT);
ScreenBuffer = createGraphics(SCREEN_WIDTH,SCREEN_HEIGHT);
}
function draw(){
paintBuffer();
image(ScreenBuffer,0,0);
}
function paintBuffer(){
console.log("Painting Buffer")
for(var x = 0; x< SCREEN_WIDTH; x++){
for(var y = 0; y< SCREEN_HEIGHT; y++){
ScreenBuffer.stroke(Math.random() * 255)
ScreenBuffer.point(x,y)
}
}
}
Although I am getting a performance improvement, its nowhere near the 30 frames a second I want to be at. Is there a better way to do this?
The only way I can get reasonable performance is by filling up the screen with small squares instead with the following code:
for(var x = 0; x< SCREEN_WIDTH-10; x+=10){
for(var y = 0; y< SCREEN_HEIGHT-10; y+=10){
//ScreenBuffer.stroke(Math.random() * 255)
//ScreenBuffer.point(x,y)
ScreenBuffer.fill(Math.random() * 255);
ScreenBuffer.noStroke()
ScreenBuffer.rect(x,y,10,10)
}
}
But I would really like a pixel effect - ideally to fill the whole screen.
Believe it or not, it's actually the call to stroke() that's slowing down your sketch. You can get around this by setting the value of the pixels directly, using the set() function or accessing the pixels array directly.
More info can be found in the reference, but here's a simple example:
function setup() {
createCanvas(500, 500);
}
function draw() {
for (var i = 0; i < width; i++) {
for (var j = 0; j < height; j++) {
var c = random(255);
set(i, j, c);
}
}
updatePixels();
text(frameRate(), 20, 20);
}
Another approach you might consider is generating a few buffers that contain static images ahead of time, and then using those to draw your static. There's really no need to make the static completely dynamic, so do the work once and then just load from image files or buffers created using the createGraphics() function.
Related
In my game,if I touch a particular object,coin objects will come out of them at random speeds and occupy random positions.
public void update(delta){
if(isTouched()&& getY()<Constants.WORLD_HEIGHT/2){
setY(getY()+(randomSpeed * delta));
setX(getX()-(randomSpeed/4 * delta));
}
}
Now I want to make this coins occupy positions in some patterns.Like if 3 coins come out,a triangle pattern or if 4 coins, rectangular pattern like that.
I tried to make it work,but coins are coming out and moved,but overlapping each other.Not able to create any patterns.
patterns like:
This is what I tried
int a = Math.abs(rndNo.nextInt() % 3)+1;//no of coins
int no =0;
float coinxPos = player.getX()-coins[0].getWidth()/2;
float coinyPos = player.getY();
int minCoinGap=20;
switch (a) {
case 1:
for (int i = 0; i < coins.length; i++) {
if (!coins[i].isCoinVisible() && no < a) {
coins[i].setCoinVisible(true);
coinxPos = coinxPos+rndNo.nextInt()%70;
coinyPos = coinyPos+rndNo.nextInt()%70;
coins[i].setPosition(coinxPos, coinyPos);
no++;
}
}
break;
case 2:
for (int i = 0; i < coins.length; i++) {
if (!coins[i].isCoinVisible() && no < a) {
coins[i].setCoinVisible(true);
coinxPos = coinxPos+minCoinGap+rndNo.nextInt()%70;
coinyPos = coinyPos+rndNo.nextInt()%150;
coins[i].setPosition(coinxPos, coinyPos);
no++;
}
}
break:
......
......
default:
break;
may be this is a simple logic to implement,but I wasted a lot of time on it and got confused of how to make it work.
Any help would be appreciated.
In my game, when I want some object at X,Y to reach some specific coordinates Xe,Ye at every frame I'm adding to it's coordinates difference between current and wanted position, divided by constant and multiplied by time passed from last frame. That way it starts moving quickly and goes slowly and slowly as it's closer, looks kinda cool.
X += ((Xe - X)* dt)/ CONST;
Y += ((Ye - Y)* dt)/ CONST;
You'll experimentally get that CONST value, bigger value means slower movement. If you want it to look even cooler you can add velocity variable and instead of changing directly coordinates depending on distance from end position you can adjust that velocity. That way even if object at some point reaches the end position it will still have some velocity and it will keep moving - it will have inertia. A bit more complex to achieve, but movement would be even wilder.
And if you want that Xe,Ye be some specific position (not random), then just set those constant values. No need to make it more complicated then that. Set like another constat OFFSET:
static final int OFFSET = 100;
Xe1 = X - OFFSET; // for first coin
Ye1 = Y - OFFSET;
Xe2 = X + OFFSET; // for second coin
Ye2 = Y - OFFSET;
...
I am using the following code to convert a Bitmap to Complex and vice versa.
Even though those were directly copied from Accord.NET framework, while testing these static methods, I have discovered that, repeated use of these static methods cause 'data-loss'. As a result, the end output/result becomes distorted.
public partial class ImageDataConverter
{
#region private static Complex[,] FromBitmapData(BitmapData bmpData)
private static Complex[,] ToComplex(BitmapData bmpData)
{
Complex[,] comp = null;
if (bmpData.PixelFormat == PixelFormat.Format8bppIndexed)
{
int width = bmpData.Width;
int height = bmpData.Height;
int offset = bmpData.Stride - (width * 1);//1 === 1 byte per pixel.
if ((!Tools.IsPowerOf2(width)) || (!Tools.IsPowerOf2(height)))
{
throw new Exception("Imager width and height should be n of 2.");
}
comp = new Complex[width, height];
unsafe
{
byte* src = (byte*)bmpData.Scan0.ToPointer();
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++, src++)
{
comp[y, x] = new Complex((float)*src / 255,
comp[y, x].Imaginary);
}
src += offset;
}
}
}
else
{
throw new Exception("EightBppIndexedImageRequired");
}
return comp;
}
#endregion
public static Complex[,] ToComplex(Bitmap bmp)
{
Complex[,] comp = null;
if (bmp.PixelFormat == PixelFormat.Format8bppIndexed)
{
BitmapData bmpData = bmp.LockBits( new Rectangle(0, 0, bmp.Width, bmp.Height),
ImageLockMode.ReadOnly,
PixelFormat.Format8bppIndexed);
try
{
comp = ToComplex(bmpData);
}
finally
{
bmp.UnlockBits(bmpData);
}
}
else
{
throw new Exception("EightBppIndexedImageRequired");
}
return comp;
}
public static Bitmap ToBitmap(Complex[,] image, bool fourierTransformed)
{
int width = image.GetLength(0);
int height = image.GetLength(1);
Bitmap bmp = Imager.CreateGrayscaleImage(width, height);
BitmapData bmpData = bmp.LockBits(
new Rectangle(0, 0, width, height),
ImageLockMode.ReadWrite,
PixelFormat.Format8bppIndexed);
int offset = bmpData.Stride - width;
double scale = (fourierTransformed) ? Math.Sqrt(width * height) : 1;
unsafe
{
byte* address = (byte*)bmpData.Scan0.ToPointer();
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++, address++)
{
double min = System.Math.Min(255, image[y, x].Magnitude * scale * 255);
*address = (byte)System.Math.Max(0, min);
}
address += offset;
}
}
bmp.UnlockBits(bmpData);
return bmp;
}
}
(The DotNetFiddle link of the complete source code)
(ImageDataConverter)
Output:
As you can see, FFT is working correctly, but, I-FFT isn't.
That is because bitmap to complex and vice versa isn't working as expected.
What could be done to correct the ToComplex() and ToBitmap() functions so that they don't loss data?
I do not code in C# so handle this answer with extreme prejudice!
Just from a quick look I spotted few problems:
ToComplex()
Is converting BMP into 2D complex matrix. When you are converting you are leaving imaginary part unchanged, but at the start of the same function you have:
Complex[,] complex2D = null;
complex2D = new Complex[width, height];
So the imaginary parts are either undefined or zero depends on your complex class constructor. This means you are missing half of the data needed for reconstruction !!! You should restore the original complex matrix from 2 images one for real and second for imaginary part of the result.
ToBitmap()
You are saving magnitude which is I think sqrt( Re*Re + Im*Im ) so it is power spectrum not the original complex values and so you can not reconstruct back... You should store Re,Im in 2 separate images.
8bit per pixel
That is not much and can cause significant round off errors after FFT/IFFT so reconstruction can be really distorted.
[Edit1] Remedy
There are more options to repair this for example:
use floating complex matrix for computations and bitmap only for visualization.
This is the safest way because you avoid additional conversion round offs. This approach has the best precision. But you need to rewrite your DIP/CV algorithms to support complex domain matrices instead of bitmaps which require not small amount of work.
rewrite your conversions to support real and imaginary part images
Your conversion is really bad as it does not store/restore Real and Imaginary parts as it should and also it does not account for negative values (at least I do not see it instead they are cut down to zero which is WRONG). I would rewrite the conversion to this:
// conversion scales
float Re_ofset=256.0,Re_scale=512.0/255.0;
float Im_ofset=256.0,Im_scale=512.0/255.0;
private static Complex[,] ToComplex(BitmapData bmpRe,BitmapData bmpIm)
{
//...
byte* srcRe = (byte*)bmpRe.Scan0.ToPointer();
byte* srcIm = (byte*)bmpIm.Scan0.ToPointer();
complex c = new Complex(0.0,0.0);
// for each line
for (int y = 0; y < height; y++)
{
// for each pixel
for (int x = 0; x < width; x++, src++)
{
complex2D[y, x] = c;
c.Real = (float)*(srcRe*Re_scale)-Re_ofset;
c.Imaginary = (float)*(srcIm*Im_scale)-Im_ofset;
}
src += offset;
}
//...
}
public static Bitmap ToBitmapRe(Complex[,] complex2D)
{
//...
float Re = (complex2D[y, x].Real+Re_ofset)/Re_scale;
Re = min(Re,255.0);
Re = max(Re, 0.0);
*address = (byte)Re;
//...
}
public static Bitmap ToBitmapIm(Complex[,] complex2D)
{
//...
float Im = (complex2D[y, x].Imaginary+Im_ofset)/Im_scale;
Re = min(Im,255.0);
Re = max(Im, 0.0);
*address = (byte)Im;
//...
}
Where:
Re_ofset = min(complex2D[,].Real);
Im_ofset = min(complex2D[,].Imaginary);
Re_scale = (max(complex2D[,].Real )-min(complex2D[,].Real ))/255.0;
Im_scale = (max(complex2D[,].Imaginary)-min(complex2D[,].Imaginary))/255.0;
or cover bigger interval then the complex matrix values.
You can also encode both Real and Imaginary parts to single image for example first half of image could be Real and next the Imaginary part. In that case you do not need to change the function headers nor names at all .. but you would need to handle the images as 2 joined squares each with different meaning ...
You can also use RGB images where R = Real, B = Imaginary or any other encoding that suites you.
[Edit2] some examples to make my points more clear
example of approach #1
The image is in form of floating point 2D complex matrix and the images are created only for visualization. There is little rounding error this way. The values are not normalized so the range is <0.0,255.0> per pixel/cell at first but after transforms and scaling it could change greatly.
As you can see I added scaling so all pixels are multiplied by 315 to actually see anything because the FFT output values are small except of few cells. But only for visualization the complex matrix is unchanged.
example of approach #2
Well as I mentioned before you do not handle negative values, normalize values to range <0,1> and back by scaling and rounding off and using only 8 bits per pixel to store the sub results. I tried to simulate that with my code and here is what I got (using complex domain instead of wrongly used power spectrum like you did). Here C++ source only as an template example as you do not have the functions and classes behind it:
transform t;
cplx_2D c;
rgb2i(bmp0);
c.ld(bmp0,bmp0);
null_im(c);
c.mul(1.0/255.0);
c.mul(255.0); c.st(bmp0,bmp1); c.ld(bmp0,bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0/255.0);
bmp0->SaveToFile("_out0_Re.bmp");
bmp1->SaveToFile("_out0_Im.bmp");
t. DFFT(c,c);
c.wrap();
c.mul(255.0); c.st(bmp0,bmp1); c.ld(bmp0,bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0/255.0);
bmp0->SaveToFile("_out1_Re.bmp");
bmp1->SaveToFile("_out1_Im.bmp");
c.wrap();
t.iDFFT(c,c);
c.mul(255.0); c.st(bmp0,bmp1); c.ld(bmp0,bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0/255.0);
bmp0->SaveToFile("_out2_Re.bmp");
bmp1->SaveToFile("_out2_Im.bmp");
And here the sub results:
As you can see after the DFFT and wrap the image is really dark and most of the values are rounded off. So the result after unwrap and IDFFT is really pure.
Here some explanations to code:
c.st(bmpre,bmpim) is the same as your ToBitmap
c.ld(bmpre,bmpim) is the same as your ToComplex
c.mul(scale) multiplies complex matrix c by scale
rgb2i converts RGB to grayscale intensity <0,255>
i2iii converts grayscale intensity ro grayscale RGB image
I'm not really good in this puzzles but double check this dividing.
comp[y, x] = new Complex((float)*src / 255, comp[y, x].Imaginary);
You can loose precision as it is described here
Complex class definition in Remarks section.
May be this happens in your case.
Hope this helps.
I picked up Processing today, and wrote a program to generate a double slit interference pattern. After tweaking with the values a little, it works, but the pattern generated is fuzzier than what is possible in some other programs. Here's a screenshot:
As you can see, the fringes are not as smooth at the edges as I believe is possible. I expect them to look like this or this.
This is my code:
// All quantities in mm
float slit_separation = 0.005;
float screen_dist = 50;
float wavelength = 5e-4f;
PVector slit1, slit2;
float scale = 1e+1f;
void setup() {
size(500, 500);
colorMode(HSB, 360, 100, 1);
noLoop();
background(255);
slit_separation *= scale;
screen_dist *= scale;
wavelength *= scale;
slit1 = new PVector(-slit_separation / 2, 0, -screen_dist);
slit2 = new PVector(slit_separation / 2, 0, -screen_dist);
}
void draw() {
translate(width / 2, height / 2);
for (float x = -width / 2; x < width / 2; x++) {
for (float y = -height / 2; y < height / 2; y++) {
PVector pos = new PVector(x, y, 0);
float path_diff = abs(PVector.sub(slit1, pos).mag() - PVector.sub(slit2, pos).mag());
float parameter = map(path_diff % wavelength, 0, wavelength, 0, 2 * PI);
stroke(100, 100, pow(cos(parameter), 2));
point(x, y);
}
}
}
My code is mathematically correct, so I am wondering if there's something wrong I am doing in transforming the physical values to pixels on screen.
I'm not totally sure what you're asking- what exactly do you expect it to look like? Would it be possible to narrow this down to a single line that's misbehaving instead of the nested for loop?
But just taking a guess at what you're talking about: keep in mind that Processing enables anti-aliasing by default. To disable it, you have to call the noSmooth() function. You can call it in your setup() function:
void setup() {
size(500, 500);
noSmooth();
//rest of your code
It's pretty obvious if you compare them side-by-side:
If that's not what you're talking about, please post an MCVE of just one or two lines instead of a nested for loop. It would also be helpful to include a mockup of what you'd expect versus what you're getting. Good luck!
I want to make something like Terraria item sidebar thing. (the Left-top rectangles one). And here is my code.
Variables are
public Rectangle InventorySlots;
public Item[] Quickbar = new Item[9];
public Item mouseItem = null;
public Item[] Backpack = new Item[49];
public int selectedBar = 0;
Here is the initialization
inventory[0] = Content.Load<Texture2D>("Contents/Overlays/InventoryBG");
inventory[1] = Content.Load<Texture2D>("Contents/Overlays/InventoryBG2");
update method
int a = viewport.Width / 22;
for (int b = 0; b <= Quickbar.Length; ++b)
{
InventorySlots = new Rectangle(((a/10)*b)+(b),0,a,a);
}
draw method
spriteBatch.Begin();
for (int num = 0; num <= Quickbar.Length; ++num )
spriteBatch.Draw(inventory[0], InventorySlots, Color.White);
spriteBatch.Draw(inventory[1], InventorySlots, Color.White);
spriteBatch.End();
Yes it is not done, but when i try to run it, the texture didn't show up.
I am unable to find out what is wrong in my code.
is it in with SpriteBatch? In the draw method? or In the Update?
Resolved
The problem isnt at the code Itself. the Problem is in this:
int a = viewport.Width / 22;
The thing is, i trought that viewport in here (I've used a Starter Kit) is the Game Window!
You are assigning InventorySlots overwriting its content...
also it seems that you want to draw two sprites... but you are drawing only one inside the loop... and your looping over Quickbar when seems that its not related with your drawing calls.
And it seems that your slot layout calculations have few sense...
You should use an array or a list:
public List<Rectangle> InventorySlots = new List<Rectangle>();
// You put this code in update... but it's not going to change..
// maybe initialize is best suited
// Initialize
int a = viewport.Width / 22;
InventorySlots.Clear();
for (int b = 0; b < Quickbar.Length; ++b)
{ // Generate slots in a line, with a pixel gap among them
InventorySlots.Add( new Rectangle( 1 + (a+2) * b ,0,a,a) );
}
//Draw
spriteBatch.Begin();
for (int num = 0; num < InventorySlots.Count; ++num )
{
spriteBatch.Draw(inventory[0], InventorySlots[num], Color.White);
spriteBatch.Draw(inventory[1], InventorySlots[num], Color.White);
}
spriteBatch.End();
I am trying to build a system that will be able to process a record of someone whistling and output notes.
Can anyone recommend an open-source platform which I can use as the base for the note/pitch recognition and analysis of wave files ?
Thanks in advance
As many others have already said, FFT is the way to go here. I've written a little example in Java using FFT code from http://www.cs.princeton.edu/introcs/97data/. In order to run it, you will need the Complex class from that page also (see the source for the exact URL).
The code reads in a file, goes window-wise over it and does an FFT on each window. For each FFT it looks for the maximum coefficient and outputs the corresponding frequency. This does work very well for clean signals like a sine wave, but for an actual whistle sound you probably have to add more. I've tested with a few files with whistling I created myself (using the integrated mic of my laptop computer), the code does get the idea of what's going on, but in order to get actual notes more needs to be done.
1) You might need some more intelligent window technique. What my code uses now is a simple rectangular window. Since the FFT assumes that the input singal can be periodically continued, additional frequencies are detected when the first and the last sample in the window don't match. This is known as spectral leakage ( http://en.wikipedia.org/wiki/Spectral_leakage ), usually one uses a window that down-weights samples at the beginning and the end of the window ( http://en.wikipedia.org/wiki/Window_function ). Although the leakage shouldn't cause the wrong frequency to be detected as the maximum, using a window will increase the detection quality.
2) To match the frequencies to actual notes, you could use an array containing the frequencies (like 440 Hz for a') and then look for the frequency that's closest to the one that has been identified. However, if the whistling is off standard tuning, this won't work any more. Given that the whistling is still correct but only tuned differently (like a guitar or other musical instrument can be tuned differently and still sound "good", as long as the tuning is done consistently for all strings), you could still find notes by looking at the ratios of the identified frequencies. You can read http://en.wikipedia.org/wiki/Pitch_%28music%29 as a starting point on that. This is also interesting: http://en.wikipedia.org/wiki/Piano_key_frequencies
3) Moreover it might be interesting to detect the points in time when each individual tone starts and stops. This could be added as a pre-processing step. You could do an FFT for each individual note then. However, if the whistler doesn't stop but just bends between notes, this would not be that easy.
Definitely have a look at the libraries the others suggested. I don't know any of them, but maybe they contain already functionality for doing what I've described above.
And now to the code. Please let me know what worked for you, I find this topic pretty interesting.
Edit: I updated the code to include overlapping and a simple mapper from frequencies to notes. It works only for "tuned" whistlers though, as mentioned above.
package de.ahans.playground;
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
public class FftMaxFrequency {
// taken from http://www.cs.princeton.edu/introcs/97data/FFT.java.html
// (first hit in Google for "java fft"
// needs Complex class from http://www.cs.princeton.edu/introcs/97data/Complex.java
public static Complex[] fft(Complex[] x) {
int N = x.length;
// base case
if (N == 1) return new Complex[] { x[0] };
// radix 2 Cooley-Tukey FFT
if (N % 2 != 0) { throw new RuntimeException("N is not a power of 2"); }
// fft of even terms
Complex[] even = new Complex[N/2];
for (int k = 0; k < N/2; k++) {
even[k] = x[2*k];
}
Complex[] q = fft(even);
// fft of odd terms
Complex[] odd = even; // reuse the array
for (int k = 0; k < N/2; k++) {
odd[k] = x[2*k + 1];
}
Complex[] r = fft(odd);
// combine
Complex[] y = new Complex[N];
for (int k = 0; k < N/2; k++) {
double kth = -2 * k * Math.PI / N;
Complex wk = new Complex(Math.cos(kth), Math.sin(kth));
y[k] = q[k].plus(wk.times(r[k]));
y[k + N/2] = q[k].minus(wk.times(r[k]));
}
return y;
}
static class AudioReader {
private AudioFormat audioFormat;
public AudioReader() {}
public double[] readAudioData(File file) throws UnsupportedAudioFileException, IOException {
AudioInputStream in = AudioSystem.getAudioInputStream(file);
audioFormat = in.getFormat();
int depth = audioFormat.getSampleSizeInBits();
long length = in.getFrameLength();
if (audioFormat.isBigEndian()) {
throw new UnsupportedAudioFileException("big endian not supported");
}
if (audioFormat.getChannels() != 1) {
throw new UnsupportedAudioFileException("only 1 channel supported");
}
byte[] tmp = new byte[(int) length];
byte[] samples = null;
int bytesPerSample = depth/8;
int bytesRead;
while (-1 != (bytesRead = in.read(tmp))) {
if (samples == null) {
samples = Arrays.copyOf(tmp, bytesRead);
} else {
int oldLen = samples.length;
samples = Arrays.copyOf(samples, oldLen + bytesRead);
for (int i = 0; i < bytesRead; i++) samples[oldLen+i] = tmp[i];
}
}
double[] data = new double[samples.length/bytesPerSample];
for (int i = 0; i < samples.length-bytesPerSample; i += bytesPerSample) {
int sample = 0;
for (int j = 0; j < bytesPerSample; j++) sample += samples[i+j] << j*8;
data[i/bytesPerSample] = (double) sample / Math.pow(2, depth);
}
return data;
}
public AudioFormat getAudioFormat() {
return audioFormat;
}
}
public class FrequencyNoteMapper {
private final String[] NOTE_NAMES = new String[] {
"A", "Bb", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"
};
private final double[] FREQUENCIES;
private final double a = 440;
private final int TOTAL_OCTAVES = 6;
private final int START_OCTAVE = -1; // relative to A
public FrequencyNoteMapper() {
FREQUENCIES = new double[TOTAL_OCTAVES*12];
int j = 0;
for (int octave = START_OCTAVE; octave < START_OCTAVE+TOTAL_OCTAVES; octave++) {
for (int note = 0; note < 12; note++) {
int i = octave*12+note;
FREQUENCIES[j++] = a * Math.pow(2, (double)i / 12.0);
}
}
}
public String findMatch(double frequency) {
if (frequency == 0)
return "none";
double minDistance = Double.MAX_VALUE;
int bestIdx = -1;
for (int i = 0; i < FREQUENCIES.length; i++) {
if (Math.abs(FREQUENCIES[i] - frequency) < minDistance) {
minDistance = Math.abs(FREQUENCIES[i] - frequency);
bestIdx = i;
}
}
int octave = bestIdx / 12;
int note = bestIdx % 12;
return NOTE_NAMES[note] + octave;
}
}
public void run (File file) throws UnsupportedAudioFileException, IOException {
FrequencyNoteMapper mapper = new FrequencyNoteMapper();
// size of window for FFT
int N = 4096;
int overlap = 1024;
AudioReader reader = new AudioReader();
double[] data = reader.readAudioData(file);
// sample rate is needed to calculate actual frequencies
float rate = reader.getAudioFormat().getSampleRate();
// go over the samples window-wise
for (int offset = 0; offset < data.length-N; offset += (N-overlap)) {
// for each window calculate the FFT
Complex[] x = new Complex[N];
for (int i = 0; i < N; i++) x[i] = new Complex(data[offset+i], 0);
Complex[] result = fft(x);
// find index of maximum coefficient
double max = -1;
int maxIdx = 0;
for (int i = result.length/2; i >= 0; i--) {
if (result[i].abs() > max) {
max = result[i].abs();
maxIdx = i;
}
}
// calculate the frequency of that coefficient
double peakFrequency = (double)maxIdx*rate/(double)N;
// and get the time of the start and end position of the current window
double windowBegin = offset/rate;
double windowEnd = (offset+(N-overlap))/rate;
System.out.printf("%f s to %f s:\t%f Hz -- %s\n", windowBegin, windowEnd, peakFrequency, mapper.findMatch(peakFrequency));
}
}
public static void main(String[] args) throws UnsupportedAudioFileException, IOException {
new FftMaxFrequency().run(new File("/home/axr/tmp/entchen.wav"));
}
}
i think this open-source platform suits you
http://code.google.com/p/musicg-sound-api/
Well, you could always use fftw to perform the Fast Fourier Transform. It's a very well respected framework. Once you've got an FFT of your signal you can analyze the resultant array for peaks. A simple histogram style analysis should give you the frequencies with the greatest volume. Then you just have to compare those frequencies to the frequencies that correspond with different pitches.
in addition to the other great options:
csound pitch detection: http://www.csounds.com/manual/html/pvspitch.html
fmod: http://www.fmod.org/ (has a free version)
aubio: http://aubio.org/doc/pitchdetection_8h.html
You might want to consider Python(x,y). It's a scientific programming framework for python in the spirit of Matlab, and it has easy functions for working in the FFT domain.
If you use Java, have a look at TarsosDSP library. It has a pretty good ready-to-go pitch detector.
Here is an example for android, but I think it doesn't require too much modifications to use it elsewhere.
I'm a fan of the FFT but for the monophonic and fairly pure sinusoidal tones of whistling, a zero-cross detector would do a far better job at determining the actual frequency at a much lower processing cost. Zero-cross detection is used in electronic frequency counters that measure the clock rate of whatever is being tested.
If you going to analyze anything other than pure sine wave tones, then FFT is definitely the way to go.
A very simple implementation of zero cross detection in Java on GitHub