Ok, so I have a histogram (represented by an array of ints), and I'm looking for the best way to find local maxima and minima. Each histogram should have 3 peaks, one of them (the first one) probably much higher than the others.
I want to do several things:
Find the first "valley" following the first peak (in order to get rid of the first peak altogether in the picture)
Find the optimum "valley" value in between the remaining two peaks to separate the picture
I already know how to do step 2 by implementing a variant of Otsu.
But I'm struggling with step 1
In case the valley in between the two remaining peaks is not low enough, I'd like to give a warning.
Also, the image is quite clean with little noise to account for
What would be the brute-force algorithms to do steps 1 and 3? I could find a way to implement Otsu, but the brute-force is escaping me, math-wise. As it turns out, there is more documentation on doing methods like otsu, and less on simply finding peaks and valleys. I am not looking for anything more than whatever gets the job done (i.e. it's a temporary solution, just has to be implementable in a reasonable timeframe, until I can spend more time on it)
I am doing all this in c#
Any help on which steps to take would be appreciated!
Thank you so much!
EDIT: some more data:
most histogram are likely to be like the first one, with the first peak representing background.
Use peakiness-test. It's a method to find all the possible peak between two local minima, and measure the peakiness based on a formula. If the peakiness higher than a threshold, the peak is accepted.
Source: UCF CV CAP5415 lecture 9 slides
Below is my code:
public static List<int> PeakinessTest(int[] histogram, double peakinessThres)
{
int j=0;
List<int> valleys = new List<int> ();
//The start of the valley
int vA = histogram[j];
int P = vA;
//The end of the valley
int vB = 0;
//The width of the valley, default width is 1
int W = 1;
//The sum of the pixels between vA and vB
int N = 0;
//The measure of the peaks peakiness
double peakiness=0.0;
int peak=0;
bool l = false;
try
{
while (j < 254)
{
l = false;
vA = histogram[j];
P = vA;
W = 1;
N = vA;
int i = j + 1;
//To find the peak
while (P < histogram[i])
{
P = histogram[i];
W++;
N += histogram[i];
i++;
}
//To find the border of the valley other side
peak = i - 1;
vB = histogram[i];
N += histogram[i];
i++;
W++;
l = true;
while (vB >= histogram[i])
{
vB = histogram[i];
W++;
N += histogram[i];
i++;
}
//Calculate peakiness
peakiness = (1 - (double)((vA + vB) / (2.0 * P))) * (1 - ((double)N / (double)(W * P)));
if (peakiness > peakinessThres & !valleys.Contains(j))
{
//peaks.Add(peak);
valleys.Add(j);
valleys.Add(i - 1);
}
j = i - 1;
}
}
catch (Exception)
{
if (l)
{
vB = histogram[255];
peakiness = (1 - (double)((vA + vB) / (2.0 * P))) * (1 - ((double)N / (double)(W * P)));
if (peakiness > peakinessThres)
valleys.Add(255);
//peaks.Add(255);
return valleys;
}
}
//if(!valleys.Contains(255))
// valleys.Add(255);
return valleys;
}
Related
I'm using simple moving average in Math.Net, but now that I also need to calculate EMA (exponential moving average) or any kind of weighted moving average, I don't find it in the library.
I looked over all methods under MathNet.Numerics.Statistics and beyond, but didn't find anything similar.
Is it missing in library or I need to reference some additional package?
I don't see any EMA in MathNet.Numerics, however it's trivial to program. The routine below is based on the definition at Investopedia.
public double[] EMA(double[] x, int N)
{
// x is the input series
// N is the notional age of the data used
// k is the smoothing constant
double k = 2.0 / (N + 1);
double[] y = new double[x.Length];
y[0] = x[0];
for (int i = 1; i < x.Length; i++) y[i] = k * x[i] + (1 - k) * y[i - 1];
return y;
}
Occasionally I found this package: https://daveskender.github.io/Stock.Indicators/docs/INDICATORS.html It targets to the latest .NET framework and has very detailed documents.
Try this:
public IEnumerable<double> EMA(IEnumerable<double> items, int notationalAge)
{
double k = 2.0d / (notationalAge + 1), prev = 0.0d;
var e = items.GetEnumerator();
if (!e.MoveNext()) yield break;
yield return prev = e.Current;
while(e.MoveNext())
{
yield return prev = (k * e.Current) + (1 - k) * prev;
}
}
It will still work with arrays, but also List, Queue, Stack, IReadOnlyCollection, etc.
Although it's not explicitly stated I also get the sense this is working with money, in which case it really ought to use decimal instead of double.
In my game,if I touch a particular object,coin objects will come out of them at random speeds and occupy random positions.
public void update(delta){
if(isTouched()&& getY()<Constants.WORLD_HEIGHT/2){
setY(getY()+(randomSpeed * delta));
setX(getX()-(randomSpeed/4 * delta));
}
}
Now I want to make this coins occupy positions in some patterns.Like if 3 coins come out,a triangle pattern or if 4 coins, rectangular pattern like that.
I tried to make it work,but coins are coming out and moved,but overlapping each other.Not able to create any patterns.
patterns like:
This is what I tried
int a = Math.abs(rndNo.nextInt() % 3)+1;//no of coins
int no =0;
float coinxPos = player.getX()-coins[0].getWidth()/2;
float coinyPos = player.getY();
int minCoinGap=20;
switch (a) {
case 1:
for (int i = 0; i < coins.length; i++) {
if (!coins[i].isCoinVisible() && no < a) {
coins[i].setCoinVisible(true);
coinxPos = coinxPos+rndNo.nextInt()%70;
coinyPos = coinyPos+rndNo.nextInt()%70;
coins[i].setPosition(coinxPos, coinyPos);
no++;
}
}
break;
case 2:
for (int i = 0; i < coins.length; i++) {
if (!coins[i].isCoinVisible() && no < a) {
coins[i].setCoinVisible(true);
coinxPos = coinxPos+minCoinGap+rndNo.nextInt()%70;
coinyPos = coinyPos+rndNo.nextInt()%150;
coins[i].setPosition(coinxPos, coinyPos);
no++;
}
}
break:
......
......
default:
break;
may be this is a simple logic to implement,but I wasted a lot of time on it and got confused of how to make it work.
Any help would be appreciated.
In my game, when I want some object at X,Y to reach some specific coordinates Xe,Ye at every frame I'm adding to it's coordinates difference between current and wanted position, divided by constant and multiplied by time passed from last frame. That way it starts moving quickly and goes slowly and slowly as it's closer, looks kinda cool.
X += ((Xe - X)* dt)/ CONST;
Y += ((Ye - Y)* dt)/ CONST;
You'll experimentally get that CONST value, bigger value means slower movement. If you want it to look even cooler you can add velocity variable and instead of changing directly coordinates depending on distance from end position you can adjust that velocity. That way even if object at some point reaches the end position it will still have some velocity and it will keep moving - it will have inertia. A bit more complex to achieve, but movement would be even wilder.
And if you want that Xe,Ye be some specific position (not random), then just set those constant values. No need to make it more complicated then that. Set like another constat OFFSET:
static final int OFFSET = 100;
Xe1 = X - OFFSET; // for first coin
Ye1 = Y - OFFSET;
Xe2 = X + OFFSET; // for second coin
Ye2 = Y - OFFSET;
...
Here's the link to the question..
http://www.codechef.com/problems/J7
I figured out that 2 edges have to be equal in order to give the maximum volume, and then used x, x, a*x as the lengths of the three edges to write the equations -
4*x + 4*x + 4*a*x = P (perimeter) and,
2*x^2 + 4*(a*x *x) = S (total area of the box)
so from the first equation I got x in terms of P and a, and then substituted it in the second equation and then got a quadratic equation with the unknown being a. and then I used the greater root of a and got x.
But this method seems to be giving the wrong answer! :|
I know that there isn't any logical error in this. Maybe some formatting error?
Here's the main code that I've written :
{
public static void main(String[] args)
{
TheBestBox box = new TheBestBox();
reader = box.new InputReader(System.in);
writer = box.new OutputWriter(System.out);
getAttributes();
writer.flush();
reader.close();
writer.close();
}
public static void getAttributes()
{
t = reader.nextInt(); // t is the number of test cases in the question
for (int i = 0; i < t; i++)
{
p = reader.nextInt(); // p is the perimeter given as input
area = reader.nextInt(); // area of the whole sheet, given as input
a = findRoot(); // the fraction by which the third side differs by the first two
side = (double) p / (4 * (2 + a)); // length of the first and the second sides (equal)
height = a * side; // assuming that the base is a square, the height has to be the side which differs from the other two
// writer.println(side * side * height);
// System.out.printf("%.2f\n", (side * side * height));
writer.println(String.format("%.2f", (side * side * height))); // just printing out the final answer
}
}
public static double findRoot() // the method to find the 2 possible fractions by which the height can differ from the other two sides and return the bigger one of them
{
double a32, b, discriminant, root1, root2;
a32 = 32 * area - p * p;
b = 32 * area - 2 * p * p;
discriminant = Math.sqrt(b * b - 4 * 8 * area * a32);
double temp;
temp = 2 * 8 * area;
root1 = (- b + discriminant) / temp;
root2 = (- b - discriminant) / temp;
return Math.max(root1, root2);
}
}
could someone please help me out with this? Thank You. :)
I also got stuck in this question and realized that can be done by making equation of V(volume) in terms of one side say 'l' and using differentiation to find maximum volume in terms of any one side 'l'.
So, equations are like this :-
P = 4(l+b+h);
S = 2(l*b+b*h+l*h);
V = l*b*h;
so equation in l for V = (l^3) - (l^2)P/4 + lS/2 -------equ(1)
After differentiation we get:-
d(V)/d(l) = 3*(l^2) - l*P/2 + S/2;
to get max V we need to equate above equation to zero(0) and get the value of l.
So, solutions to a quadratic equation will be:-
l = ( P + sqrt((P^2)-24S) ) / 24;
so substitute this l in equation(1) to get max volume.
This Question was asked to me at the Google interview. I could do it O(n*n) ... Can I do it in better time.
A string can be formed only by 1 and 0.
Definition:
X & Y are strings formed by 0 or 1
D(X,Y) = Remove the things common at the start from both X & Y. Then add the remaining lengths from both the strings.
For e.g.
D(1111, 1000) = Only First alphabet is common. So the remaining string is 111 & 000. Therefore the result length("111") & length("000") = 3 + 3 = 6
D(101, 1100) = Only First two alphabets are common. So the remaining string is 01 & 100. Therefore the result length("01") & length("100") = 2 + 3 = 5
It is pretty that obvious that do find out such a crazy distance is going to be linear. O(m).
Now the question is
given n input, say like
1111
1000
101
1100
Find out the maximum crazy distance possible.
n is the number of input strings.
m is the max length of any input string.
The solution of O(n2 * m) is pretty simple. Can it be done in a better way?
Let's assume that m is fixed. Can we do this in better than O(n^2) ?
Put the strings into a tree, where 0 means go left and 1 means go right. So for example
1111
1000
101
1100
would result in a tree like
Root
1
0 1
0 1* 0 1
0* 0* 1*
where the * means that an element ends there. Constructing this tree clearly takes O(n m).
Now we have to find the diameter of the tree (the longest path between two nodes, which is the same thing as the "crazy distance"). The optimized algorithm presented there hits each node in the tree once. There are at most min(n m, 2^m) such nodes.
So if n m < 2^m, then the the algorithm is O(n m).
If n m > 2^m (and we necessarily have repeated inputs), then the algorithm is still O(n m) from the first step.
This also works for strings with a general alphabet; for an alphabet with k letters build a k-ary tree, in which case the runtime is still O(n m) by the same reasoning, though it takes k times as much memory.
I think this is possible in O(nm) time by creating a binary tree where each bit in a string encodes the path (0 left, 1 right). Then finding the maximum distance between nodes of the tree which can be done in O(n) time.
This is my solution, I think it works:
Create a binary tree from all strings. The tree will be constructed in this way:
at every round, select a string and add it to the tree. so for your example, the tree will be:
<root>
<1> <empty>
<1> <0>
<1> <0> <1> <0>
<1> <0> <0>
So each path from root to a leaf will represent a string.
Now the distance between each two leaves is the distance between two strings. To find the crazy distance, you must find the diameter of this graph, that you can do it easily by dfs or bfs.
The total complexity of this algorithm is:
O(n*m) + O(n*m) = O(n*m).
I think this problem is something like "find prefix for two strings", you can use trie(http://en.wikipedia.org/wiki/Trie) to accerlate searching
I have a google phone interview 3 days before, but maybe I failed...
Best luck to you
To get an answer in O(nm) just iterate across the characters of all string (this is an O(n) operation). We will compare at most m characters, so this will be done O(m). This gives a total of O(nm). Here's a C++ solution:
int max_distance(char** strings, int numstrings, int &distance) {
distance = 0;
// loop O(n) for initialization
for (int i=0; i<numstrings; i++)
distance += strlen(strings[i]);
int max_prefix = 0;
bool done = false;
// loop max O(m)
while (!done) {
int c = -1;
// loop O(n)
for (int i=0; i<numstrings; i++) {
if (strings[i][max_prefix] == 0) {
done = true; // it is enough to reach the end of one string to be done
break;
}
int new_element = strings[i][max_prefix] - '0';
if (-1 == c)
c = new_element;
else {
if (c != new_element) {
done = true; // mismatch
break;
}
}
}
if (!done) {
max_prefix++;
distance -= numstrings;
}
}
return max_prefix;
}
void test_misc() {
char* strings[] = {
"10100",
"10101110",
"101011",
"101"
};
std::cout << std::endl;
int distance = 0;
std::cout << "max_prefix = " << max_distance(strings, sizeof(strings)/sizeof(strings[0]), distance) << std::endl;
}
Not sure why use trees when iteration gives you the same big O computational complexity without the code complexity. anyway here is my version of it in javascript O(mn)
var len = process.argv.length -2; // in node first 2 arguments are node and program file
var input = process.argv.splice(2);
var current;
var currentCount = 0;
var currentCharLoc = 0;
var totalCount = 0;
var totalComplete = 0;
var same = true;
while ( totalComplete < len ) {
current = null;
currentCount = 0;
for ( var loc = 0 ; loc < len ; loc++) {
if ( input[loc].length === currentCharLoc) {
totalComplete++;
same = false;
} else if (input[loc].length > currentCharLoc) {
currentCount++;
if (same) {
if ( current === null ) {
current = input[loc][currentCharLoc];
} else {
if (current !== input[loc][currentCharLoc]) {
same = false;
}
}
}
}
}
if (!same) {
totalCount += currentCount;
}
currentCharLoc++;
}
console.log(totalCount);
I am trying to build a system that will be able to process a record of someone whistling and output notes.
Can anyone recommend an open-source platform which I can use as the base for the note/pitch recognition and analysis of wave files ?
Thanks in advance
As many others have already said, FFT is the way to go here. I've written a little example in Java using FFT code from http://www.cs.princeton.edu/introcs/97data/. In order to run it, you will need the Complex class from that page also (see the source for the exact URL).
The code reads in a file, goes window-wise over it and does an FFT on each window. For each FFT it looks for the maximum coefficient and outputs the corresponding frequency. This does work very well for clean signals like a sine wave, but for an actual whistle sound you probably have to add more. I've tested with a few files with whistling I created myself (using the integrated mic of my laptop computer), the code does get the idea of what's going on, but in order to get actual notes more needs to be done.
1) You might need some more intelligent window technique. What my code uses now is a simple rectangular window. Since the FFT assumes that the input singal can be periodically continued, additional frequencies are detected when the first and the last sample in the window don't match. This is known as spectral leakage ( http://en.wikipedia.org/wiki/Spectral_leakage ), usually one uses a window that down-weights samples at the beginning and the end of the window ( http://en.wikipedia.org/wiki/Window_function ). Although the leakage shouldn't cause the wrong frequency to be detected as the maximum, using a window will increase the detection quality.
2) To match the frequencies to actual notes, you could use an array containing the frequencies (like 440 Hz for a') and then look for the frequency that's closest to the one that has been identified. However, if the whistling is off standard tuning, this won't work any more. Given that the whistling is still correct but only tuned differently (like a guitar or other musical instrument can be tuned differently and still sound "good", as long as the tuning is done consistently for all strings), you could still find notes by looking at the ratios of the identified frequencies. You can read http://en.wikipedia.org/wiki/Pitch_%28music%29 as a starting point on that. This is also interesting: http://en.wikipedia.org/wiki/Piano_key_frequencies
3) Moreover it might be interesting to detect the points in time when each individual tone starts and stops. This could be added as a pre-processing step. You could do an FFT for each individual note then. However, if the whistler doesn't stop but just bends between notes, this would not be that easy.
Definitely have a look at the libraries the others suggested. I don't know any of them, but maybe they contain already functionality for doing what I've described above.
And now to the code. Please let me know what worked for you, I find this topic pretty interesting.
Edit: I updated the code to include overlapping and a simple mapper from frequencies to notes. It works only for "tuned" whistlers though, as mentioned above.
package de.ahans.playground;
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
public class FftMaxFrequency {
// taken from http://www.cs.princeton.edu/introcs/97data/FFT.java.html
// (first hit in Google for "java fft"
// needs Complex class from http://www.cs.princeton.edu/introcs/97data/Complex.java
public static Complex[] fft(Complex[] x) {
int N = x.length;
// base case
if (N == 1) return new Complex[] { x[0] };
// radix 2 Cooley-Tukey FFT
if (N % 2 != 0) { throw new RuntimeException("N is not a power of 2"); }
// fft of even terms
Complex[] even = new Complex[N/2];
for (int k = 0; k < N/2; k++) {
even[k] = x[2*k];
}
Complex[] q = fft(even);
// fft of odd terms
Complex[] odd = even; // reuse the array
for (int k = 0; k < N/2; k++) {
odd[k] = x[2*k + 1];
}
Complex[] r = fft(odd);
// combine
Complex[] y = new Complex[N];
for (int k = 0; k < N/2; k++) {
double kth = -2 * k * Math.PI / N;
Complex wk = new Complex(Math.cos(kth), Math.sin(kth));
y[k] = q[k].plus(wk.times(r[k]));
y[k + N/2] = q[k].minus(wk.times(r[k]));
}
return y;
}
static class AudioReader {
private AudioFormat audioFormat;
public AudioReader() {}
public double[] readAudioData(File file) throws UnsupportedAudioFileException, IOException {
AudioInputStream in = AudioSystem.getAudioInputStream(file);
audioFormat = in.getFormat();
int depth = audioFormat.getSampleSizeInBits();
long length = in.getFrameLength();
if (audioFormat.isBigEndian()) {
throw new UnsupportedAudioFileException("big endian not supported");
}
if (audioFormat.getChannels() != 1) {
throw new UnsupportedAudioFileException("only 1 channel supported");
}
byte[] tmp = new byte[(int) length];
byte[] samples = null;
int bytesPerSample = depth/8;
int bytesRead;
while (-1 != (bytesRead = in.read(tmp))) {
if (samples == null) {
samples = Arrays.copyOf(tmp, bytesRead);
} else {
int oldLen = samples.length;
samples = Arrays.copyOf(samples, oldLen + bytesRead);
for (int i = 0; i < bytesRead; i++) samples[oldLen+i] = tmp[i];
}
}
double[] data = new double[samples.length/bytesPerSample];
for (int i = 0; i < samples.length-bytesPerSample; i += bytesPerSample) {
int sample = 0;
for (int j = 0; j < bytesPerSample; j++) sample += samples[i+j] << j*8;
data[i/bytesPerSample] = (double) sample / Math.pow(2, depth);
}
return data;
}
public AudioFormat getAudioFormat() {
return audioFormat;
}
}
public class FrequencyNoteMapper {
private final String[] NOTE_NAMES = new String[] {
"A", "Bb", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"
};
private final double[] FREQUENCIES;
private final double a = 440;
private final int TOTAL_OCTAVES = 6;
private final int START_OCTAVE = -1; // relative to A
public FrequencyNoteMapper() {
FREQUENCIES = new double[TOTAL_OCTAVES*12];
int j = 0;
for (int octave = START_OCTAVE; octave < START_OCTAVE+TOTAL_OCTAVES; octave++) {
for (int note = 0; note < 12; note++) {
int i = octave*12+note;
FREQUENCIES[j++] = a * Math.pow(2, (double)i / 12.0);
}
}
}
public String findMatch(double frequency) {
if (frequency == 0)
return "none";
double minDistance = Double.MAX_VALUE;
int bestIdx = -1;
for (int i = 0; i < FREQUENCIES.length; i++) {
if (Math.abs(FREQUENCIES[i] - frequency) < minDistance) {
minDistance = Math.abs(FREQUENCIES[i] - frequency);
bestIdx = i;
}
}
int octave = bestIdx / 12;
int note = bestIdx % 12;
return NOTE_NAMES[note] + octave;
}
}
public void run (File file) throws UnsupportedAudioFileException, IOException {
FrequencyNoteMapper mapper = new FrequencyNoteMapper();
// size of window for FFT
int N = 4096;
int overlap = 1024;
AudioReader reader = new AudioReader();
double[] data = reader.readAudioData(file);
// sample rate is needed to calculate actual frequencies
float rate = reader.getAudioFormat().getSampleRate();
// go over the samples window-wise
for (int offset = 0; offset < data.length-N; offset += (N-overlap)) {
// for each window calculate the FFT
Complex[] x = new Complex[N];
for (int i = 0; i < N; i++) x[i] = new Complex(data[offset+i], 0);
Complex[] result = fft(x);
// find index of maximum coefficient
double max = -1;
int maxIdx = 0;
for (int i = result.length/2; i >= 0; i--) {
if (result[i].abs() > max) {
max = result[i].abs();
maxIdx = i;
}
}
// calculate the frequency of that coefficient
double peakFrequency = (double)maxIdx*rate/(double)N;
// and get the time of the start and end position of the current window
double windowBegin = offset/rate;
double windowEnd = (offset+(N-overlap))/rate;
System.out.printf("%f s to %f s:\t%f Hz -- %s\n", windowBegin, windowEnd, peakFrequency, mapper.findMatch(peakFrequency));
}
}
public static void main(String[] args) throws UnsupportedAudioFileException, IOException {
new FftMaxFrequency().run(new File("/home/axr/tmp/entchen.wav"));
}
}
i think this open-source platform suits you
http://code.google.com/p/musicg-sound-api/
Well, you could always use fftw to perform the Fast Fourier Transform. It's a very well respected framework. Once you've got an FFT of your signal you can analyze the resultant array for peaks. A simple histogram style analysis should give you the frequencies with the greatest volume. Then you just have to compare those frequencies to the frequencies that correspond with different pitches.
in addition to the other great options:
csound pitch detection: http://www.csounds.com/manual/html/pvspitch.html
fmod: http://www.fmod.org/ (has a free version)
aubio: http://aubio.org/doc/pitchdetection_8h.html
You might want to consider Python(x,y). It's a scientific programming framework for python in the spirit of Matlab, and it has easy functions for working in the FFT domain.
If you use Java, have a look at TarsosDSP library. It has a pretty good ready-to-go pitch detector.
Here is an example for android, but I think it doesn't require too much modifications to use it elsewhere.
I'm a fan of the FFT but for the monophonic and fairly pure sinusoidal tones of whistling, a zero-cross detector would do a far better job at determining the actual frequency at a much lower processing cost. Zero-cross detection is used in electronic frequency counters that measure the clock rate of whatever is being tested.
If you going to analyze anything other than pure sine wave tones, then FFT is definitely the way to go.
A very simple implementation of zero cross detection in Java on GitHub