Loss of data during the Inverse-FFT of an Image - c#-4.0

I am using the following code to convert a Bitmap to Complex and vice versa.
Even though these were copied directly from the Accord.NET framework, while testing these static methods I discovered that repeated use of them causes data loss. As a result, the end output/result becomes distorted.
public partial class ImageDataConverter
{
    #region private static Complex[,] ToComplex(BitmapData bmpData)
    private static Complex[,] ToComplex(BitmapData bmpData)
    {
        Complex[,] comp = null;
        if (bmpData.PixelFormat == PixelFormat.Format8bppIndexed)
        {
            int width = bmpData.Width;
            int height = bmpData.Height;
            int offset = bmpData.Stride - (width * 1); // 1 byte per pixel.
            if ((!Tools.IsPowerOf2(width)) || (!Tools.IsPowerOf2(height)))
            {
                throw new Exception("Image width and height should be a power of 2.");
            }
            comp = new Complex[width, height];
            unsafe
            {
                byte* src = (byte*)bmpData.Scan0.ToPointer();
                for (int y = 0; y < height; y++)
                {
                    for (int x = 0; x < width; x++, src++)
                    {
                        comp[y, x] = new Complex((float)*src / 255,
                                                 comp[y, x].Imaginary);
                    }
                    src += offset;
                }
            }
        }
        else
        {
            throw new Exception("EightBppIndexedImageRequired");
        }
        return comp;
    }
    #endregion

    public static Complex[,] ToComplex(Bitmap bmp)
    {
        Complex[,] comp = null;
        if (bmp.PixelFormat == PixelFormat.Format8bppIndexed)
        {
            BitmapData bmpData = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height),
                                              ImageLockMode.ReadOnly,
                                              PixelFormat.Format8bppIndexed);
            try
            {
                comp = ToComplex(bmpData);
            }
            finally
            {
                bmp.UnlockBits(bmpData);
            }
        }
        else
        {
            throw new Exception("EightBppIndexedImageRequired");
        }
        return comp;
    }

    public static Bitmap ToBitmap(Complex[,] image, bool fourierTransformed)
    {
        int width = image.GetLength(0);
        int height = image.GetLength(1);
        Bitmap bmp = Imager.CreateGrayscaleImage(width, height);
        BitmapData bmpData = bmp.LockBits(
            new Rectangle(0, 0, width, height),
            ImageLockMode.ReadWrite,
            PixelFormat.Format8bppIndexed);
        int offset = bmpData.Stride - width;
        double scale = (fourierTransformed) ? Math.Sqrt(width * height) : 1;
        unsafe
        {
            byte* address = (byte*)bmpData.Scan0.ToPointer();
            for (int y = 0; y < height; y++)
            {
                for (int x = 0; x < width; x++, address++)
                {
                    double min = System.Math.Min(255, image[y, x].Magnitude * scale * 255);
                    *address = (byte)System.Math.Max(0, min);
                }
                address += offset;
            }
        }
        bmp.UnlockBits(bmpData);
        return bmp;
    }
}
(The DotNetFiddle link of the complete source code: ImageDataConverter)
Output:
As you can see, the FFT is working correctly, but the I-FFT isn't.
That is because the bitmap-to-complex conversion and its inverse aren't working as expected.
What could be done to correct the ToComplex() and ToBitmap() functions so that they don't lose data?

I do not code in C# so handle this answer with extreme prejudice!
Just from a quick look I spotted a few problems:
ToComplex()
It converts the BMP into a 2D complex matrix. While converting, you leave the imaginary part unchanged, but at the start of the same function you have:
Complex[,] complex2D = null;
complex2D = new Complex[width, height];
So the imaginary parts are either undefined or zero, depending on your complex class constructor. This means you are missing half of the data needed for reconstruction!!! You should restore the original complex matrix from two images: one for the real part and a second for the imaginary part of the result.
ToBitmap()
You are saving the magnitude, which is I think sqrt(Re*Re + Im*Im), so it is a power spectrum, not the original complex values, and you cannot reconstruct back from that... You should store Re and Im in two separate images.
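To see why, note that many different complex values share the same magnitude, so the phase information is gone. A quick C# check (using System.Numerics.Complex):

using System;
using System.Numerics;

class MagnitudeDemo
{
    static void Main()
    {
        // Two different complex values with identical magnitude:
        var a = new Complex(1.0, 0.0);
        var b = new Complex(0.0, -1.0);
        Console.WriteLine(a.Magnitude); // 1
        Console.WriteLine(b.Magnitude); // 1
        // Stored as magnitude only, a and b become indistinguishable,
        // so the inverse FFT receives the wrong input.
    }
}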
8 bits per pixel
That is not much, and it can cause significant round-off errors after FFT/IFFT, so the reconstruction can be really distorted.
[Edit1] Remedy
There are more options to repair this, for example:
use a floating-point complex matrix for computations and bitmaps only for visualization.
This is the safest way, because you avoid additional conversion round-offs. This approach has the best precision. But you need to rewrite your DIP/CV algorithms to support complex-domain matrices instead of bitmaps, which requires no small amount of work.
rewrite your conversions to support real and imaginary part images
Your conversion is really bad, as it does not store/restore the Real and Imaginary parts as it should, and it also does not account for negative values (at least I do not see it; instead they are cut down to zero, which is WRONG). I would rewrite the conversion to this:
// conversion scales
float Re_ofset = 256.0f, Re_scale = 512.0f / 255.0f;
float Im_ofset = 256.0f, Im_scale = 512.0f / 255.0f;

private static Complex[,] ToComplex(BitmapData bmpRe, BitmapData bmpIm)
{
    //...
    byte* srcRe = (byte*)bmpRe.Scan0.ToPointer();
    byte* srcIm = (byte*)bmpIm.Scan0.ToPointer();
    // for each line
    for (int y = 0; y < height; y++)
    {
        // for each pixel
        for (int x = 0; x < width; x++, srcRe++, srcIm++)
        {
            // undo the offset/scale encoding for both parts
            complex2D[y, x] = new Complex(((float)*srcRe * Re_scale) - Re_ofset,
                                          ((float)*srcIm * Im_scale) - Im_ofset);
        }
        srcRe += offset;
        srcIm += offset;
    }
    //...
}
public static Bitmap ToBitmapRe(Complex[,] complex2D)
{
    //...
    float Re = (complex2D[y, x].Real + Re_ofset) / Re_scale;
    Re = min(Re, 255.0);
    Re = max(Re, 0.0);
    *address = (byte)Re;
    //...
}
public static Bitmap ToBitmapIm(Complex[,] complex2D)
{
    //...
    float Im = (complex2D[y, x].Imaginary + Im_ofset) / Im_scale;
    Im = min(Im, 255.0);
    Im = max(Im, 0.0);
    *address = (byte)Im;
    //...
}
Where:
Re_ofset = -min(complex2D[,].Real);
Im_ofset = -min(complex2D[,].Imaginary);
Re_scale = (max(complex2D[,].Real)      - min(complex2D[,].Real))      / 255.0;
Im_scale = (max(complex2D[,].Imaginary) - min(complex2D[,].Imaginary)) / 255.0;
or cover a bigger interval than the complex matrix values.
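As a rough C# sketch of how those ideal constants could be derived from an actual matrix (assuming System.Numerics.Complex and an already-filled complex2D; the class and method names are just for illustration):

using System;
using System.Numerics;

static class ScaleDemo
{
    // Derive offset/scale so the Real values map exactly onto 0..255.
    // The offset is the negated minimum, so the smallest value encodes as 0.
    static (double ofset, double scale) RealRange(Complex[,] complex2D)
    {
        double min = double.MaxValue, max = double.MinValue;
        foreach (Complex c in complex2D)
        {
            min = Math.Min(min, c.Real);
            max = Math.Max(max, c.Real);
        }
        // encode: b = (byte)((re + ofset) / scale)
        // decode: re = b * scale - ofset
        return (-min, (max - min) / 255.0);
    }
}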
You can also encode both the Real and Imaginary parts into a single image: for example, the first half of the image could be the Real part and the second half the Imaginary part. In that case you do not need to change the function headers or names at all, but you would need to handle the image as two joined squares, each with a different meaning...
You can also use RGB images where R = Real, B = Imaginary, or any other encoding that suits you.
[Edit2] some examples to make my points clearer
example of approach #1
The image is kept as a floating-point 2D complex matrix, and bitmaps are created only for visualization. There is little rounding error this way. The values are not normalized, so the range is <0.0,255.0> per pixel/cell at first, but after transforms and scaling it can change greatly.
As you can see, I added scaling so all pixels are multiplied by 315 in order to actually see anything, because the FFT output values are small except for a few cells. But this is only for visualization; the complex matrix is unchanged.
example of approach #2
Well, as I mentioned before, you do not handle negative values, you normalize values to the range <0,1> and back by scaling and rounding off, and you use only 8 bits per pixel to store the sub-results. I tried to simulate that with my code, and here is what I got (using the complex domain instead of the wrongly used power spectrum like you did). Here is C++ source, only as a template example, as you do not have the functions and classes behind it:
transform t;
cplx_2D c;

rgb2i(bmp0);
c.ld(bmp0, bmp0);
null_im(c);
c.mul(1.0 / 255.0);
c.mul(255.0); c.st(bmp0, bmp1); c.ld(bmp0, bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0 / 255.0);
bmp0->SaveToFile("_out0_Re.bmp");
bmp1->SaveToFile("_out0_Im.bmp");

t.DFFT(c, c);
c.wrap();
c.mul(255.0); c.st(bmp0, bmp1); c.ld(bmp0, bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0 / 255.0);
bmp0->SaveToFile("_out1_Re.bmp");
bmp1->SaveToFile("_out1_Im.bmp");

c.wrap();
t.iDFFT(c, c);
c.mul(255.0); c.st(bmp0, bmp1); c.ld(bmp0, bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0 / 255.0);
bmp0->SaveToFile("_out2_Re.bmp");
bmp1->SaveToFile("_out2_Im.bmp");
And here are the sub-results:
As you can see, after the DFFT and wrap the image is really dark and most of the values are rounded off. So the result after unwrap and iDFFT is really poor.
Here are some explanations of the code:
c.st(bmpre, bmpim) is the same as your ToBitmap
c.ld(bmpre, bmpim) is the same as your ToComplex
c.mul(scale) multiplies the complex matrix c by scale
rgb2i converts RGB to grayscale intensity <0,255>
i2iii converts grayscale intensity to a grayscale RGB image

I'm not really good at these puzzles, but double-check this division:
comp[y, x] = new Complex((float)*src / 255, comp[y, x].Imaginary);
You can lose precision, as described in the Complex class definition (Remarks section).
Maybe this is what happens in your case.
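A quick C# illustration of the point (the value 77 is arbitrary):

using System;

class PrecisionDemo
{
    static void Main()
    {
        byte src = 77;                        // an arbitrary pixel value
        double viaFloat  = (float)src / 255;  // float division, then widened
        double viaDouble = (double)src / 255; // full double precision
        Console.WriteLine(viaFloat == viaDouble); // typically False
    }
}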
Hope this helps.

Related

DirectX 11: text output, using your own font texture

I'm learning DirectX using the book "Sherrod A., Jones W. - Beginning DirectX 11 Game Programming - 2011". Now I'm exploring the 4th chapter, about drawing text.
Please help me fix the function I'm using to draw a string on the screen. I've already loaded a font texture, and in the function I create some sprites with letters and define texture coordinates for them. It compiles correctly but doesn't draw anything. What's wrong?
bool DirectXSpriteGame::DrawString(char* StringToDraw, float StartX, float StartY)
{
    //VAR
    HRESULT D3DResult;                          //The result of D3D functions
    int i;                                      //Counters
    const int IndexA = static_cast<char>('A');  //ASCII index of letter A
    const int IndexZ = static_cast<char>('Z');  //ASCII index of letter Z
    int StringLenth = strlen(StringToDraw);     //Length of the string to draw
    float ScreenCharWidth = static_cast<float>(LETTER_WIDTH) / static_cast<float>(SCREEN_WIDTH);    //Width of a single char on the screen (in %)
    float ScreenCharHeight = static_cast<float>(LETTER_HEIGHT) / static_cast<float>(SCREEN_HEIGHT); //Height of a single char on the screen (in %)
    float TexelCharWidth = 1.0f / static_cast<float>(LETTERS_NUM); //Width of the char texel (in texture %)
    float ThisStartX;                           //The start x of the current letter being drawn
    float ThisStartY;                           //The start y of the current letter being drawn
    float ThisEndX;                             //The end x of the current letter being drawn
    float ThisEndY;                             //The end y of the current letter being drawn
    int LetterNum;                              //Letter number in the loaded font
    int ThisLetter;                             //The current letter
    D3D11_MAPPED_SUBRESOURCE MapResource;       //Map resource
    VertexPos* ThisSprite;                      //Vertices of the current sprite being drawn
    //VAR

    //Clamp the string if too long
    if(StringLenth > LETTERS_NUM)
    {
        StringLenth = LETTERS_NUM;
    }
    //Map the resource
    D3DResult = _DeviceContext->Map(_vertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &MapResource);
    if(FAILED(D3DResult))
    {
        throw("Failed to map resource");
    }
    ThisSprite = (VertexPos*)MapResource.pData;
    for(i = 0; i < StringLenth; i++)
    {
        //Create geometry for the letter sprite
        ThisStartX = StartX + ScreenCharWidth * static_cast<float>(i);
        ThisStartY = StartY;
        ThisEndX = ThisStartX + ScreenCharWidth;
        ThisEndY = StartY + ScreenCharHeight;
        ThisSprite[0].Position = XMFLOAT3(ThisEndX, ThisEndY, 1.0f);
        ThisSprite[1].Position = XMFLOAT3(ThisEndX, ThisStartY, 1.0f);
        ThisSprite[2].Position = XMFLOAT3(ThisStartX, ThisStartY, 1.0f);
        ThisSprite[3].Position = XMFLOAT3(ThisStartX, ThisStartY, 1.0f);
        ThisSprite[4].Position = XMFLOAT3(ThisStartX, ThisEndY, 1.0f);
        ThisSprite[5].Position = XMFLOAT3(ThisEndX, ThisEndY, 1.0f);
        ThisLetter = static_cast<char>(StringToDraw[i]);
        //Determine the letter's place (number) in the font
        if(ThisLetter < IndexA || ThisLetter > IndexZ)
        {
            //Invalid character: use the last character in the loaded font
            LetterNum = IndexZ - IndexA + 1;
        }
        else
        {
            LetterNum = ThisLetter - IndexA;
        }
        //Unwrap the texture onto the geometry
        ThisStartX = TexelCharWidth * static_cast<float>(LetterNum);
        ThisStartY = 0.0f;
        ThisEndY = 1.0f;
        ThisEndX = ThisStartX + TexelCharWidth;
        ThisSprite[0].TextureCoords = XMFLOAT2(ThisEndX, ThisEndY);
        ThisSprite[1].TextureCoords = XMFLOAT2(ThisEndX, ThisStartY);
        ThisSprite[2].TextureCoords = XMFLOAT2(ThisStartX, ThisStartY);
        ThisSprite[3].TextureCoords = XMFLOAT2(ThisStartX, ThisStartY);
        ThisSprite[4].TextureCoords = XMFLOAT2(ThisStartX, ThisEndY);
        ThisSprite[5].TextureCoords = XMFLOAT2(ThisEndX, ThisEndY);
        ThisSprite += VERTEX_IN_RECT_NUM;
    }
    for(i = 0; i < StringLenth; i++, ThisSprite -= VERTEX_IN_RECT_NUM);
    _DeviceContext->Unmap(_vertexBuffer, 0);
    _DeviceContext->Draw(VERTEX_IN_RECT_NUM * StringLenth, 0);
    return true;
}
Although the piece of code constructing the vertex array seems correct to me at first glance, it looks like you are trying to draw your vertices with a shader that has not been set yet!
It is difficult to answer precisely without seeing the whole code, but I can guess that you will need to do something like this:
1) Create the Vertex and Pixel Shaders by compiling them first from their respective buffers.
2) Create the Input Layout description, which describes the input buffers that will be read by the Input Assembler stage. It will have to match your VertexPos structure and your shader structure.
3) Set the shader parameters.
4) Only now can you set the shader rendering parameters: set the InputLayout, as well as the Vertex and Pixel Shaders that will be used to render your triangles, with something like this:
_DeviceContext->Unmap(_vertexBuffer, 0);
_DeviceContext->IASetInputLayout(myInputLayout);
_DeviceContext->VSSetShader(myVertexShader, NULL, 0); // Set Vertex shader
_DeviceContext->PSSetShader(myPixelShader, NULL, 0);  // Set Pixel shader
_DeviceContext->Draw(VERTEX_IN_RECT_NUM * StringLenth, 0);
This link should help you achieve what you want to do : http://www.rastertek.com/dx11tut12.html
Also, I recommend setting an IndexBuffer and using the DrawIndexed method to render your triangles, for performance reasons: it allows the graphics adapter to store vertices in a vertex cache, so a recently used vertex can be fetched from the cache instead of being read from the vertex buffer.
More about this concern can be found on MSDN : http://msdn.microsoft.com/en-us/library/windows/desktop/bb147325(v=vs.85).aspx
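To illustrate the vertex-reuse idea with plain data (a C# sketch, not actual D3D calls; the coordinates are arbitrary):

// A letter quad drawn indexed: 4 unique vertices instead of 6.
// The two triangles share corners 0 and 2 through the index list,
// which is what lets the vertex cache reuse them.
var quad = new (float X, float Y)[]
{
    (0f, 0f), // 0: top-left
    (1f, 0f), // 1: top-right
    (1f, 1f), // 2: bottom-right
    (0f, 1f), // 3: bottom-left
};
var indices = new ushort[] { 0, 1, 2,   0, 2, 3 }; // two triangles, 6 indices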
Hope this helps!
P.S : Also, don't forget to release the resources after using them by calling Release().

How to enhance the colors and contrast of an image

I need to process the first "Original" image to get something similar to the second "Enhanced" one. I applied some naive calculations and the new image has more contrast and stronger colors, but in the higher-color regions a color hole appears. I have no idea about image processing; it would be great if you could suggest which concepts and/or algorithms I could apply to get the result without this problem.
Convert the image to the HSB (Hue, Saturation, Brightness) color space.
Multiply the saturation by some amount. Use a cutoff value if your platform requires it.
Example in Mathematica:
satMult = 4; (* saturation multiplier *)
imgHSB = ColorConvert[Import["http://i.imgur.com/8XkxR.jpg"], "HSB"];
cs = ColorSeparate[imgHSB]; (* separate into H, S and B *)
newSat = Image[ImageData[cs[[2]]] * satMult]; (* cs[[2]] is the saturation *)
ColorCombine[{cs[[1]], newSat, cs[[3]]}, "HSB"] (* rebuild the image *)
A table increasing the saturation value:
The "holes" that you see in the processed picture are the darker areas of the original picture, which went to negative values with your darkening algorithm. I suspect these out of range values are then written to the new image as positive numbers, so they end up in the higher part of the brightness scale. For example, let's say a pixel value is 10, and you are substracting 12 from all pixels to darken them a bit. This pixel will underflow and become -2. When you write it back to the file, -2 gets represented as 0xfe in hex, and this is 254 if you take it as an unsigned number.
You should use an algorithm that keeps the pixel values within the valid range, or at least you should "clamp" the values to the valid range. A typical clamp function defined as a C macro would be:
#define clamp(p) (p < 0 ? 0 : (p > 255 ? 255 : p))
If you add the above macro to your processing function it will take care of the "holes", but instead you will now have dark colors in those places.
If you are ready for something a bit more advanced, Wikipedia has the brightness and contrast formulas that are used by The GIMP, which will do a pretty good job with your image if you choose the proper coefficients.
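For a flavor of what such a formula looks like, here is one common linear form as a hedged C# sketch (a generic brightness/contrast mapping, not necessarily the exact formula The GIMP uses):

// Contrast scales pixel values around mid-gray; brightness shifts them.
// The result is clamped to the valid range, as with the macro above.
static byte Adjust(byte v, double contrast, double brightness)
{
    double r = (v - 128.0) * contrast + 128.0 + brightness;
    return (byte)Math.Max(0.0, Math.Min(255.0, r));
}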
This wikipedia article does a good job of explaining histogram equalization for contrast enhancement.
Code for grayscale images:
#include <stdlib.h>
#include <math.h>

unsigned char* EnhanceContrast(unsigned char* data, int width, int height)
{
    /* Build the histogram, then turn it into a cumulative distribution. */
    int* cdf = (int*) calloc(256, sizeof(int));
    for(int y = 0; y < height; y++) {
        for(int x = 0; x < width; x++) {
            int val = data[width*y + x];
            cdf[val]++;
        }
    }
    for(int i = 1; i < 256; i++) {
        cdf[i] += cdf[i-1];
    }
    /* cdf_min is the smallest nonzero value of the CDF
       (the CDF at the lowest occurring gray level). */
    int cdf_min = 0;
    for(int i = 0; i < 256; i++) {
        if(cdf[i] != 0) { cdf_min = cdf[i]; break; }
    }
    /* Map each pixel through the normalized CDF. */
    unsigned char* enhanced_data = (unsigned char*) malloc(width*height);
    for(int y = 0; y < height; y++) {
        for(int x = 0; x < width; x++) {
            enhanced_data[width*y + x] = (unsigned char)
                round((cdf[data[width*y + x]] - cdf_min) * 255.0 / (width*height - cdf_min));
        }
    }
    free(cdf);
    return enhanced_data;
}

Pattern Recognition for image comparison in .net

Can anybody share code or an algorithm (using pattern recognition) for image comparison in .NET?
I need to compare two images of different resolution and texture and find the difference. For now, I have code to find the difference between two images using C#:
// Load the images.
Bitmap bm1 = (Bitmap)(Image.FromFile(txtFile1.Text));
Bitmap bm2 = (Bitmap)(Image.FromFile(txtFile2.Text));

// Make a difference image.
int wid = Math.Min(bm1.Width, bm2.Width);
int hgt = Math.Min(bm1.Height, bm2.Height);
Bitmap bm3 = new Bitmap(wid, hgt);

// Create the difference image.
bool are_identical = true;
Color eq_color = Color.Transparent;
Color ne_color = Color.Transparent;
for (int x = 0; x <= wid - 1; x++)
{
    for (int y = 0; y <= hgt - 1; y++)
    {
        if (bm1.GetPixel(x, y).Equals(bm2.GetPixel(x, y)))
        {
            bm3.SetPixel(x, y, eq_color);
        }
        else
        {
            bm1.SetPixel(x, y, ne_color);
            are_identical = false;
        }
    }
}

// Display the result.
picResult.Image = bm1;
Bitmap Logo = new Bitmap(picResult.Image);
Logo.MakeTransparent(Logo.GetPixel(1, 1));
picResult.Image = (Image)Logo;
//this.Cursor = Cursors.Default;
if ((bm1.Width != bm2.Width) || (bm1.Height != bm2.Height))
{
    are_identical = false;
}
if (are_identical)
{
    MessageBox.Show("The images are identical");
}
else
{
    MessageBox.Show("The images are different");
}
//bm1.Dispose()
//bm2.Dispose()
BUT this only compares the two images if they are of the same resolution and size. If there is some shadow on one image (but the two images are otherwise the same), it shows a difference between them. So I am trying to compare using pattern recognition.
As nailxx said, there is no "100% working free code" or anything like it. Some years ago I helped implement a "face recognition" app, and one of the things we used was "Local binary patterns". It's not too easy, but it gave quite good results. Find a paper about it here:
Local binary patterns
Edit: I'm afraid I can't find the paper that I used back then; it was shorter and focused on the LBP itself rather than on how to use it with textures.
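To give a flavor of the technique: the core of basic LBP is comparing each pixel with its 8 neighbors and packing the comparisons into one byte; histograms of these codes then describe the texture. A minimal C# sketch (my own illustration, not code from the paper):

// Minimal 3x3 LBP: each output byte encodes which of the 8 neighbors
// are >= the center pixel.
static byte[,] Lbp(byte[,] gray)
{
    int h = gray.GetLength(0), w = gray.GetLength(1);
    var lbp = new byte[h, w];
    // neighbor offsets, clockwise from the top-left corner
    int[] dy = { -1, -1, -1, 0, 1, 1, 1, 0 };
    int[] dx = { -1, 0, 1, 1, 1, 0, -1, -1 };
    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++)
        {
            int code = 0;
            for (int k = 0; k < 8; k++)
                if (gray[y + dy[k], x + dx[k]] >= gray[y, x])
                    code |= 1 << k;
            lbp[y, x] = (byte)code;
        }
    return lbp;
}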
Your request is a really complex scientific (not even engineering) task.
The basic obvious algorithm is the following:
Somehow select all objects on both images being compared.
This part is relatively simple and can be solved in many ways.
Compare all objects. This part is a task for scientists, considering the fact that objects can be shifted, rotated, resized, and so on. :)
However, this can be solved in the case where you have a fixed number of entities to recognize, like "circle", "triangle", "rectangle", "line".

Analyze "whistle" sound for pitch/note

I am trying to build a system that will be able to process a recording of someone whistling and output notes.
Can anyone recommend an open-source platform which I can use as the base for the note/pitch recognition and analysis of wave files ?
Thanks in advance
As many others have already said, FFT is the way to go here. I've written a little example in Java using FFT code from http://www.cs.princeton.edu/introcs/97data/. In order to run it, you will also need the Complex class from that page (see the source for the exact URL).
The code reads in a file, goes window-wise over it, and does an FFT on each window. For each FFT it looks for the maximum coefficient and outputs the corresponding frequency. This works very well for clean signals like a sine wave, but for an actual whistle sound you probably have to add more. I've tested with a few whistling files I created myself (using the integrated mic of my laptop computer); the code does get the idea of what's going on, but in order to get actual notes more needs to be done.
1) You might need a more intelligent window technique. What my code uses now is a simple rectangular window. Since the FFT assumes that the input signal can be periodically continued, additional frequencies are detected when the first and the last sample in the window don't match. This is known as spectral leakage ( http://en.wikipedia.org/wiki/Spectral_leakage ); usually one uses a window that down-weights samples at the beginning and the end of the window ( http://en.wikipedia.org/wiki/Window_function ); a sketch of such a window follows this list. Although the leakage shouldn't cause the wrong frequency to be detected as the maximum, using a window will increase the detection quality.
2) To match the frequencies to actual notes, you could use an array containing the frequencies (like 440 Hz for a') and then look for the frequency that's closest to the one that has been identified. However, if the whistling is off standard tuning, this won't work any more. Given that the whistling is still correct but only tuned differently (like a guitar or other musical instrument can be tuned differently and still sound "good", as long as the tuning is done consistently for all strings), you could still find notes by looking at the ratios of the identified frequencies. You can read http://en.wikipedia.org/wiki/Pitch_%28music%29 as a starting point on that. This is also interesting: http://en.wikipedia.org/wiki/Piano_key_frequencies
3) Moreover, it might be interesting to detect the points in time when each individual tone starts and stops. This could be added as a pre-processing step, and you could then do an FFT for each individual note. However, if the whistler doesn't stop but just bends between notes, this would not be that easy.
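For point 1), a minimal sketch of such a window (C# here, while the example program is Java; the formula is the standard Hann window, multiplied into each window of samples before the FFT):

// Hann window: down-weights samples at the edges of each FFT window
// to reduce spectral leakage. w[i] = 0.5 * (1 - cos(2*pi*i/(N-1))).
static void ApplyHannWindow(double[] window)
{
    int n = window.Length;
    for (int i = 0; i < n; i++)
        window[i] *= 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * i / (n - 1)));
}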
Definitely have a look at the libraries the others suggested. I don't know any of them, but maybe they already contain functionality for doing what I've described above.
And now to the code. Please let me know what worked for you; I find this topic pretty interesting.
Edit: I updated the code to include overlapping and a simple mapper from frequencies to notes. It works only for "tuned" whistlers though, as mentioned above.
package de.ahans.playground;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

public class FftMaxFrequency {

    // taken from http://www.cs.princeton.edu/introcs/97data/FFT.java.html
    // (first hit in Google for "java fft")
    // needs Complex class from http://www.cs.princeton.edu/introcs/97data/Complex.java
    public static Complex[] fft(Complex[] x) {
        int N = x.length;
        // base case
        if (N == 1) return new Complex[] { x[0] };
        // radix 2 Cooley-Tukey FFT
        if (N % 2 != 0) { throw new RuntimeException("N is not a power of 2"); }
        // fft of even terms
        Complex[] even = new Complex[N/2];
        for (int k = 0; k < N/2; k++) {
            even[k] = x[2*k];
        }
        Complex[] q = fft(even);
        // fft of odd terms
        Complex[] odd = even; // reuse the array
        for (int k = 0; k < N/2; k++) {
            odd[k] = x[2*k + 1];
        }
        Complex[] r = fft(odd);
        // combine
        Complex[] y = new Complex[N];
        for (int k = 0; k < N/2; k++) {
            double kth = -2 * k * Math.PI / N;
            Complex wk = new Complex(Math.cos(kth), Math.sin(kth));
            y[k] = q[k].plus(wk.times(r[k]));
            y[k + N/2] = q[k].minus(wk.times(r[k]));
        }
        return y;
    }

    static class AudioReader {
        private AudioFormat audioFormat;

        public AudioReader() {}

        public double[] readAudioData(File file) throws UnsupportedAudioFileException, IOException {
            AudioInputStream in = AudioSystem.getAudioInputStream(file);
            audioFormat = in.getFormat();
            int depth = audioFormat.getSampleSizeInBits();
            long length = in.getFrameLength();
            if (audioFormat.isBigEndian()) {
                throw new UnsupportedAudioFileException("big endian not supported");
            }
            if (audioFormat.getChannels() != 1) {
                throw new UnsupportedAudioFileException("only 1 channel supported");
            }
            byte[] tmp = new byte[(int) length];
            byte[] samples = null;
            int bytesPerSample = depth/8;
            int bytesRead;
            while (-1 != (bytesRead = in.read(tmp))) {
                if (samples == null) {
                    samples = Arrays.copyOf(tmp, bytesRead);
                } else {
                    int oldLen = samples.length;
                    samples = Arrays.copyOf(samples, oldLen + bytesRead);
                    for (int i = 0; i < bytesRead; i++) samples[oldLen+i] = tmp[i];
                }
            }
            double[] data = new double[samples.length/bytesPerSample];
            for (int i = 0; i < samples.length-bytesPerSample; i += bytesPerSample) {
                int sample = 0;
                for (int j = 0; j < bytesPerSample; j++) sample += samples[i+j] << j*8;
                data[i/bytesPerSample] = (double) sample / Math.pow(2, depth);
            }
            return data;
        }

        public AudioFormat getAudioFormat() {
            return audioFormat;
        }
    }

    public class FrequencyNoteMapper {
        private final String[] NOTE_NAMES = new String[] {
                "A", "Bb", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"
        };
        private final double[] FREQUENCIES;
        private final double a = 440;
        private final int TOTAL_OCTAVES = 6;
        private final int START_OCTAVE = -1; // relative to A

        public FrequencyNoteMapper() {
            FREQUENCIES = new double[TOTAL_OCTAVES*12];
            int j = 0;
            for (int octave = START_OCTAVE; octave < START_OCTAVE+TOTAL_OCTAVES; octave++) {
                for (int note = 0; note < 12; note++) {
                    int i = octave*12+note;
                    FREQUENCIES[j++] = a * Math.pow(2, (double)i / 12.0);
                }
            }
        }

        public String findMatch(double frequency) {
            if (frequency == 0)
                return "none";
            double minDistance = Double.MAX_VALUE;
            int bestIdx = -1;
            for (int i = 0; i < FREQUENCIES.length; i++) {
                if (Math.abs(FREQUENCIES[i] - frequency) < minDistance) {
                    minDistance = Math.abs(FREQUENCIES[i] - frequency);
                    bestIdx = i;
                }
            }
            int octave = bestIdx / 12;
            int note = bestIdx % 12;
            return NOTE_NAMES[note] + octave;
        }
    }

    public void run(File file) throws UnsupportedAudioFileException, IOException {
        FrequencyNoteMapper mapper = new FrequencyNoteMapper();
        // size of window for FFT
        int N = 4096;
        int overlap = 1024;
        AudioReader reader = new AudioReader();
        double[] data = reader.readAudioData(file);
        // sample rate is needed to calculate actual frequencies
        float rate = reader.getAudioFormat().getSampleRate();
        // go over the samples window-wise
        for (int offset = 0; offset < data.length-N; offset += (N-overlap)) {
            // for each window calculate the FFT
            Complex[] x = new Complex[N];
            for (int i = 0; i < N; i++) x[i] = new Complex(data[offset+i], 0);
            Complex[] result = fft(x);
            // find index of maximum coefficient
            double max = -1;
            int maxIdx = 0;
            for (int i = result.length/2; i >= 0; i--) {
                if (result[i].abs() > max) {
                    max = result[i].abs();
                    maxIdx = i;
                }
            }
            // calculate the frequency of that coefficient
            double peakFrequency = (double)maxIdx*rate/(double)N;
            // and get the time of the start and end position of the current window
            double windowBegin = offset/rate;
            double windowEnd = (offset+(N-overlap))/rate;
            System.out.printf("%f s to %f s:\t%f Hz -- %s\n", windowBegin, windowEnd, peakFrequency, mapper.findMatch(peakFrequency));
        }
    }

    public static void main(String[] args) throws UnsupportedAudioFileException, IOException {
        new FftMaxFrequency().run(new File("/home/axr/tmp/entchen.wav"));
    }
}
I think this open-source platform suits you:
http://code.google.com/p/musicg-sound-api/
Well, you could always use FFTW to perform the Fast Fourier Transform. It's a very well respected framework. Once you've got an FFT of your signal, you can analyze the resultant array for peaks. A simple histogram-style analysis should give you the frequencies with the greatest volume. Then you just have to compare those frequencies to the frequencies that correspond to different pitches.
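The peak-picking step amounts to finding the largest bin and scaling by the bin width; a small C# sketch (assuming you already have the magnitude of each FFT bin for a window of n samples):

// Given FFT bin magnitudes for n samples at sampleRate Hz, the dominant
// frequency is the index of the largest bin times the bin width
// (sampleRate / n). Only the first n/2 bins are meaningful for real input.
static double PeakFrequency(double[] magnitudes, int n, double sampleRate)
{
    int maxIdx = 0;
    for (int i = 1; i <= n / 2; i++)
        if (magnitudes[i] > magnitudes[maxIdx]) maxIdx = i;
    return maxIdx * sampleRate / n;
}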
In addition to the other great options:
csound pitch detection: http://www.csounds.com/manual/html/pvspitch.html
fmod: http://www.fmod.org/ (has a free version)
aubio: http://aubio.org/doc/pitchdetection_8h.html
You might want to consider Python(x,y). It's a scientific programming framework for Python in the spirit of Matlab, and it has easy functions for working in the FFT domain.
If you use Java, have a look at the TarsosDSP library. It has a pretty good ready-to-go pitch detector.
Here is an example for Android, but I think it doesn't require too many modifications to use elsewhere.
I'm a fan of the FFT, but for the monophonic and fairly pure sinusoidal tones of whistling, a zero-cross detector would do a far better job of determining the actual frequency, at a much lower processing cost. Zero-cross detection is used in electronic frequency counters that measure the clock rate of whatever is being tested.
If you are going to analyze anything other than pure sine-wave tones, then the FFT is definitely the way to go.
A very simple implementation of zero cross detection in Java on GitHub
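The idea fits in a few lines; a minimal C# sketch (assuming mono samples centered around zero; a pure tone crosses zero twice per cycle):

// Estimate the frequency of a (near-)sinusoidal signal by counting
// sign changes and dividing by the elapsed time.
static double ZeroCrossFrequency(double[] samples, double sampleRate)
{
    int crossings = 0;
    for (int i = 1; i < samples.Length; i++)
        if ((samples[i - 1] < 0) != (samples[i] < 0))
            crossings++;
    double seconds = samples.Length / sampleRate;
    return crossings / (2.0 * seconds); // two crossings per cycle
}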

BlackBerry - image 3D transform

I know how to rotate an image to any angle with drawTexturedPath:
int displayWidth = Display.getWidth();
int displayHeight = Display.getHeight();
int[] x = new int[] { 0, displayWidth, displayWidth, 0 };
int[] y = new int[] { 0, 0, displayHeight, displayHeight };
int angle = Fixed32.toFP( 45 );
int dux = Fixed32.cosd( angle );
int dvx = -Fixed32.sind( angle );
int duy = Fixed32.sind( angle );
int dvy = Fixed32.cosd( angle );
graphics.drawTexturedPath( x, y, null, null, 0, 0, dvx, dux, dvy, duy, image);
but what I need is a 3D projection of a simple image with a 3D transformation (something like this).
Can you please advise me how to do this with drawTexturedPath (I'm almost sure it's possible)?
Are there any alternatives?
The method used by this function (2 walk vectors) is the same as the oldskool coding trick used for the famous 'rotozoomer' effect. rotozoomer example video
This method is a very fast way to rotate, zoom, and skew an image. The rotation is done simply by rotating the walk vectors. The zooming is done simply by scaling the walk vectors. The skewing is done by rotating the walk vectors with respect to one another (e.g. so they no longer make a 90 degree angle).
Nintendo built hardware into their SNES to apply the same effect to any of its sprites and/or backgrounds. This made way for some very cool effects.
One big shortcoming of this technique is that one cannot perspectively warp a texture. To do this, the walk vectors should be changed slightly for every new horizontal line (hard to explain without a drawing).
On the SNES they overcame this by altering the walk vectors on every scanline (in those days one could set an interrupt when the monitor was drawing any scanline). This mode was later referred to as MODE 7 (since it behaved like a new virtual kind of graphics mode). The most famous games using this mode were Mario Kart and F-Zero.
So to get this working on the BlackBerry, you'll have to draw your image "displayHeight" times (i.e. once per scanline of the image). This is the only way to achieve the desired effect. (It will undoubtedly cost you a performance hit, since you are now calling the drawTexturedPath function many times with new values instead of just once.)
I guess with a bit of googling you can find some formulas (or even an implementation) for how to calculate the varying walk vectors. With a bit of paper (given you're not too bad at math) you might deduce it yourself too. I did it myself when I was making games for the Game Boy Advance, so I know it can be done.
Be sure to precalc everything! Speed is everything (especially on slow machines like phones)
EDIT: did some googling for you. Here's a detailed explanation of how to create the Mode 7 effect; it will help you achieve the same with the BlackBerry function: Mode 7 implementation
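The gist of the per-scanline math, as a rough C# sketch (the horizon row and camera height here are illustrative assumptions, not values from the linked article; displayHeight is as in the question's snippet):

// Perspective floor (Mode 7) in a nutshell: rows near the horizon are far
// away and step quickly through the texture; rows near the bottom are close
// and step slowly, so they appear magnified.
int horizon = displayHeight / 2;  // assumed horizon row
double camHeight = 32.0;          // assumed camera height, in texture units
for (int y = horizon + 1; y < displayHeight; y++)
{
    // texture step per screen pixel for this scanline
    double step = camHeight / (y - horizon);
    // multiply the walk vectors (dux, dvx, duy, dvy) by 'step'
    // and issue one drawTexturedPath call for scanline y
}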
With the following code you can skew your image and get a perspective-like effect:
int displayWidth = Display.getWidth();
int displayHeight = Display.getHeight();
int[] x = new int[] { 0, displayWidth, displayWidth, 0 };
int[] y = new int[] { 0, 0, displayHeight, displayHeight };
int dux = Fixed32.toFP(-1);
int dvx = Fixed32.toFP(1);
int duy = Fixed32.toFP(1);
int dvy = Fixed32.toFP(0);
graphics.drawTexturedPath( x, y, null, null, 0, 0, dvx, dux, dvy, duy, image);
This will skew your image at a 45º angle; if you want a certain angle you just need to use some trigonometry to determine the lengths of your vectors.
Thanks for the answers and guidance, +1 to you all.
MODE 7 was the way I chose to implement the 3D transformation, but unfortunately I couldn't make drawTexturedPath resize my scanlines... so I came down to simple drawImage.
Assuming you have a Bitmap inBmp (input texture), create a new Bitmap outBmp (output texture).
Bitmap mInBmp = Bitmap.getBitmapResource("map.png");
int inHeight = mInBmp.getHeight();
int inWidth = mInBmp.getWidth();
int outHeight = 0;
int outWidth = 0;
int outDrawX = 0;
int outDrawY = 0;
Bitmap mOutBmp = null;

public Scr() {
    super();
    mOutBmp = getMode7YTransform();
    outWidth = mOutBmp.getWidth();
    outHeight = mOutBmp.getHeight();
    outDrawX = (Display.getWidth() - outWidth) / 2;
    outDrawY = Display.getHeight() - outHeight;
}
Somewhere in the code create a Graphics outBmpGraphics for outBmp.
Then do the following in an iteration from the start y up to (texture height) * (y transform factor):
1. create a Bitmap lineBmp = new Bitmap(width, 1) for one line
2. create a Graphics lineBmpGraphics from lineBmp
3. paint line i from the texture to lineBmpGraphics
4. encode lineBmp to an EncodedImage img
5. scale img according to MODE 7
6. paint img to outBmpGraphics
Note: Richard Puckett's PNGEncoder BB port is used in my code
private Bitmap getMode7YTransform() {
    Bitmap outBmp = new Bitmap(inWidth, inHeight / 2);
    Graphics outBmpGraphics = new Graphics(outBmp);
    for (int i = 0; i < inHeight / 2; i++) {
        Bitmap lineBmp = new Bitmap(inWidth, 1);
        Graphics lineBmpGraphics = new Graphics(lineBmp);
        lineBmpGraphics.drawBitmap(0, 0, inWidth, 1, mInBmp, 0, 2 * i);
        PNGEncoder encoder = new PNGEncoder(lineBmp, true);
        byte[] data = null;
        try {
            data = encoder.encode(true);
        } catch (IOException e) {
            e.printStackTrace();
        }
        EncodedImage img = PNGEncodedImage.createEncodedImage(data, 0, -1);
        float xScaleFactor = ((float) (inHeight / 2 + i)) / (float) inHeight;
        img = scaleImage(img, xScaleFactor, 1);
        int startX = (inWidth - img.getScaledWidth()) / 2;
        int imgHeight = img.getScaledHeight();
        int imgWidth = img.getScaledWidth();
        outBmpGraphics.drawImage(startX, i, imgWidth, imgHeight, img, 0, 0, 0);
    }
    return outBmp;
}
Then just draw it in paint()
protected void paint(Graphics graphics) {
    graphics.drawBitmap(outDrawX, outDrawY, outWidth, outHeight, mOutBmp, 0, 0);
}
To scale, I did something similar to the method described in Resizing a Bitmap, using .scaleImage32 instead of .setScale:
private EncodedImage scaleImage(EncodedImage image, float ratioX, float ratioY) {
    int currentWidthFixed32 = Fixed32.toFP(image.getWidth());
    int currentHeightFixed32 = Fixed32.toFP(image.getHeight());
    double w = (double) image.getWidth() * ratioX;
    double h = (double) image.getHeight() * ratioY;
    int width = (int) w;
    int height = (int) h;
    int requiredWidthFixed32 = Fixed32.toFP(width);
    int requiredHeightFixed32 = Fixed32.toFP(height);
    int scaleXFixed32 = Fixed32.div(currentWidthFixed32, requiredWidthFixed32);
    int scaleYFixed32 = Fixed32.div(currentHeightFixed32, requiredHeightFixed32);
    EncodedImage result = image.scaleImage32(scaleXFixed32, scaleYFixed32);
    return result;
}
See also
J2ME Mode 7 Floor Renderer - something much more detailed & exciting if you are writing a 3D game!
You want to do texture mapping, and that function won't cut it. Maybe you can kludge your way around it, but the better option is to use a texture-mapping algorithm.
This involves, for each row of pixels, determining the edges of the shape and where on the shape those screen pixels map to (the texture pixels). It's not so hard actually, but may take a bit of work. And you'll be drawing the picture only once.
GameDev has a bunch of articles with sourcecode here:
http://www.gamedev.net/reference/list.asp?categoryid=40#212
Wikipedia also has a nice article:
http://en.wikipedia.org/wiki/Texture_mapping
Another site with 3d tutorials:
http://tfpsly.free.fr/Docs/TomHammersley/index.html
In your place I'd seek out a simple demo program that does something close to what you want and use its source as a base to develop my own - or even find a portable source library; I'm sure there must be a few.
