Using Java 8 features takes more time - multithreading

A sample program is provided below that counts the number of elements in a list that are less than a specified value.
The processing time varies between approaches; the Java 8 forEach and stream versions take longer to execute. Please explain whether I should go with the Java 8 features and, if not, in which areas they should be avoided. Additionally, would parallelStream or multithreading on a multi-core processor help?
Code:
public static void main(String[] args) {
int lessThan = 4;
// Using for loop iteration
List<Integer> integerForSort1 = Arrays.asList(4, 1, 1, 2, 3);
long startTime1 = System.nanoTime();
long count1 = countNumbers(integerForSort1, lessThan);
long stopTime1 = System.nanoTime();
System.out.println(stopTime1 - startTime1);
System.out.println(count1);
integerForSort1 = null;
System.gc();
// Using binary search
List<Integer> integerForSort2 = Arrays.asList(4, 1, 1, 2, 3);
long startTime2 = System.nanoTime();
long count2 = countByBinarySearch(integerForSort2, lessThan);
long stopTime2 = System.nanoTime();
System.out.println(stopTime2 - startTime2);
System.out.println(count2);
integerForSort2 = null;
System.gc();
// Using Java 8
List<Integer> integerForSort3 = Arrays.asList(4, 1, 1, 2, 3);
long startTime3 = System.nanoTime();
long count3 = integerForSort3.stream()
.filter(p -> p < lessThan)
.count();
long stopTime3 = System.nanoTime();
System.out.println(stopTime3 - startTime3);
System.out.println(count3);
integerForSort3 = null;
System.gc();
//Using Java 8 for each loop
List<Integer> integerForSort4 = Arrays.asList(4, 1, 1, 2, 3);
long startTime4 = System.nanoTime();
long count4 = process(integerForSort4, p -> p < lessThan);
long stopTime4 = System.nanoTime();
System.out.println(stopTime4 - startTime4);
System.out.println(count4);
integerForSort4 = null;
}
public static long countNumbers(List<Integer> integerForSort, int lessThan) {
long count = 0;
Collections.sort(integerForSort);
for (Integer anIntegerForSort : integerForSort) {
if (anIntegerForSort < lessThan)
count++;
}
return count;
}
public static long countByBinarySearch(List<Integer> integerForSort, int lessThan){
if(integerForSort==null||integerForSort.isEmpty())
return 0;
int low = 0, mid = 0, high = integerForSort.size();
Collections.sort(integerForSort);
while(low != high){
mid = (low + high) / 2;
if (integerForSort.get(mid) < lessThan) {
low = mid + 1;
}
else {
high = mid;
}
}
return low;
}
public static long process(List<Integer> integerForSort, Predicate<Integer> predicate) {
final AtomicInteger i = new AtomicInteger(0);
integerForSort.forEach((Integer p) -> {
if (predicate.test(p)) {
i.getAndAdd(1);
}
});
return i.intValue();
}
Output:
345918
4
21509
4
29651234
4
2242999
4
Questions:
Is it possible to reduce the processing time using Java 8 features?
Why does the Java 8 stream take more time?
How can I use a lambda expression with binary search, and will it be faster?
Even using multithreading with java.util.concurrent.ExecutorService gave consistent results:
Result 4 : Thread 'pool-1-thread-1' ran process - Using Java 8 stream in 6 millisecond, from 11:16:05:361 to 11:16:05:367
Result 4 : Thread 'pool-1-thread-2' ran process - Using Java 8 forEach in 3 millisecond, from 11:16:05:361 to 11:16:05:364
Result 4 : Thread 'pool-1-thread-4' ran process - Using Java 7 binary Search in 0 millisecond, from 11:16:05:379 to 11:16:05:379
Result 4 : Thread 'pool-1-thread-3' ran process - Using Java 7 for loop in 1 millisecond, from 11:16:05:362 to 11:16:05:363

I do not know the exact answer, since I didn't run any tests, but I think performance tests with 5 elements have no diagnostic value and are meaningless. You should generate arrays of tens or hundreds of thousands, or hundreds of millions, of elements to see the performance difference.
Java 8 streams create several objects, which adds some overhead compared with a simple for loop. So with only 5 test elements, your results mostly reflect how much work is needed to initialize the execution.
By analogy, multiplying 5 numbers on the CPU is faster than even copying them to GPU memory, so you have your result on the CPU before the GPU starts to compute. But if your data grows and your GPU multiplies hundreds of numbers in parallel, you will see the speed difference.
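To see this effect, it helps to repeat the measurement with millions of elements instead of five. Below is a minimal sketch of such a comparison (not a rigorous benchmark; a harness like JMH would give more reliable numbers, and the array size and bound here are arbitrary):
import java.util.Random;
import java.util.stream.IntStream;

public class StreamVsLoopDemo {
    public static void main(String[] args) {
        int size = 10_000_000;
        int lessThan = 500_000;
        int[] data = new Random(42).ints(size, 0, 1_000_000).toArray();

        // Plain for loop
        long t0 = System.nanoTime();
        long loopCount = 0;
        for (int v : data) {
            if (v < lessThan) loopCount++;
        }
        long t1 = System.nanoTime();

        // Sequential stream
        long streamCount = IntStream.of(data).filter(v -> v < lessThan).count();
        long t2 = System.nanoTime();

        // Parallel stream
        long parallelCount = IntStream.of(data).parallel().filter(v -> v < lessThan).count();
        long t3 = System.nanoTime();

        System.out.printf("for loop:       %d in %d ms%n", loopCount, (t1 - t0) / 1_000_000);
        System.out.printf("stream:         %d in %d ms%n", streamCount, (t2 - t1) / 1_000_000);
        System.out.printf("parallelStream: %d in %d ms%n", parallelCount, (t3 - t2) / 1_000_000);
    }
}
On a multi-core machine the parallel variant typically only pays off once the data size or the per-element work is large enough to outweigh the cost of splitting the work and combining the results; for tiny inputs it is usually the slowest of the three.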

Related

How do I rewrite this using object-oriented programming? Remove statics, create an object, prefix methods with the object name...? What else?

Can someone help me understand how to rewrite the following using object-oriented programming?
It is a short program that compares how quickly two different sort algorithms run on arrays of various sizes. When trying to rewrite it using object orientation, I start by making an object called "OO" at the top of main, then prefixing all the method calls with "OO.", then removing all the static modifiers, but it still throws errors. I'm sure there is an easy fix; I'm just very new to this. Thanks. (Below is the version that runs, with no attempt yet at OOP.)
import java.util.Arrays;
public class SortingExercise {
//Create constants to give array size
final static int CONST100 = 100000; // The length of arrays that will be sorted.
final static int CONST10 = 10000; // The length of arrays that will be sorted.
final static int CONST1 = 1000; // The length of arrays that will be sorted.
final static int CONSTMIL = 1000000; //Constant for use in making/sorting a million-element array
/*
* Creates an array populated with random integers varying widely in size
* #param count : the length of the array that is created.
*/
private static int[] randomInts(int count) {
int[] numbers = new int[count];
for (int i = 0; i < count; i++)
numbers[i] = (int)(Integer.MAX_VALUE * Math.random());
return numbers;
}
/*
* Sorts an array of integers using the selection sort algorithm.
*/
private static void selectionSort(int[] numbers) {
for (int end = numbers.length-1; end > 0; end-- ) {
int maxloc = 0;
for (int i = 1; i <= end; i++) {
if (numbers[i] > numbers[maxloc])
maxloc = i;
}
int temp = numbers[end];
numbers[end] = numbers[maxloc];
numbers[maxloc] = temp;
}
}
public static void main(String[] args) {
double startTime;
double runTime;
int[] Array1;
int[] Array2;
//FIRST RUN - 1,000 ELEMENTS
//Create Arrays
Array1 = randomInts(CONST1);
Array2 = Arrays.copyOf(Array1, CONST1);
//Sort and print comparative run times of selectionSort and the Array.sort method on identical 1,000-element arrays
startTime = System.currentTimeMillis();
selectionSort(Array1);
runTime = System.currentTimeMillis() - startTime;
System.out.println(runTime + " milliseconds for Array1 with 1,000 elements using selectionSort");
startTime = System.nanoTime();
Arrays.sort(Array2);
runTime = System.nanoTime() - startTime;
System.out.println(runTime/1000000 + " milliseconds for Array2 with 1,000 elements using built-in Array.sort method");
System.out.println();
//SECOND RUN - 10,000 ELEMENTS
//Create Arrays
Array1 = randomInts(CONST10);
Array2 = Arrays.copyOf(Array1, CONST10);
//Sort and print comparative run times of selectionSort and the Array.sort method on identical 10,000-element arrays
startTime = System.currentTimeMillis();
selectionSort(Array1);
runTime = System.currentTimeMillis() - startTime;
System.out.println(runTime + " milliseconds for Array1 with 10,000 elements using selectionSort");
startTime = System.currentTimeMillis();
Arrays.sort(Array2);
runTime = System.currentTimeMillis() - startTime;
System.out.println(runTime + " milliseconds for Array2 with 10,000 elements using built-in Array.sort method");
System.out.println();
//THIRD RUN - 100,000 ELEMENTS
//Create Arrays
Array1 = randomInts(CONST100);
Array2 = Arrays.copyOf(Array1, CONST100);
//Sort and print comparative run times of selectionSort and the Array.sort method on identical 100,000-element arrays
startTime = System.currentTimeMillis();
selectionSort(Array1);
runTime = System.currentTimeMillis() - startTime;
System.out.println(runTime + " milliseconds for Array1 with 100,000 elements using selectionSort");
startTime = System.currentTimeMillis();
Arrays.sort(Array2);
runTime = System.currentTimeMillis() - startTime;
System.out.println(runTime + " milliseconds for Array2 with 100,000 elements using built-in Array.sort method");
System.out.println();
//OPTIONAL FINAL RUN ONLY USING .sort() -> 1 MILLION ELEMENTS
//Create Array
Array1 = randomInts(CONSTMIL);
//Sort and print runtime for a million-element array
startTime = System.currentTimeMillis();
Arrays.sort(Array1);
runTime = System.currentTimeMillis() - startTime;
System.out.println(runTime + " milliseconds for Array1 with 1 million elements using built-in Array.sort method");
//END PROGRAM
}
}
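For reference, the refactoring pattern the question describes (remove the static modifiers, create an instance, and call the methods through that instance while main stays static) looks roughly like this. This is a trimmed illustration of the idea, not the full program:
import java.util.Arrays;

public class SortingExercise {
    // Instance method instead of a static one
    private int[] randomInts(int count) {
        int[] numbers = new int[count];
        for (int i = 0; i < count; i++)
            numbers[i] = (int) (Integer.MAX_VALUE * Math.random());
        return numbers;
    }

    // Instance method that times Arrays.sort on a random array of the given size
    private void timeBuiltInSort(int count) {
        int[] array = randomInts(count);
        long start = System.currentTimeMillis();
        Arrays.sort(array);
        long runTime = System.currentTimeMillis() - start;
        System.out.println(runTime + " milliseconds for " + count + " elements using Arrays.sort");
    }

    // main stays static: it creates the object and calls everything through it
    public static void main(String[] args) {
        SortingExercise exercise = new SortingExercise();
        exercise.timeBuiltInSort(1000);
        exercise.timeBuiltInSort(10000);
        exercise.timeBuiltInSort(100000);
    }
}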

How to divide a huge loop into multiple threads and then add result in collection?

I am performing some task in a loop. I need to divide this loop of 1.2 million iterations among multiple threads. Each thread will collect some results in a list. When all threads are complete, I need to add all the threads' list data into one common list. I cannot use ExecutorService. How can I do this?
It should be compatible with JDK 1.6.
This is what I am doing right now:
List<Thread> threads = new ArrayList<Thread>();
int elements = 1200000;
public void function1() {
int oneTheadElemCount = 10000;
float fnum_threads = (float)elements / (float)oneTheadElemCount ;
String s = String.valueOf(fnum_threads);
int num_threads = Integer.parseInt(s.substring(0, s.indexOf("."))) + 1 ;
for(int count =0 ; count < num_threads ; count++) {
int endIndex = ((oneTheadElemCount * (num_threads - count)) + 1000) ;
int startindex = endIndex - oneTheadElemCount ;
if(count == (num_threads-1) )
{
startindex = 0;
}
if(startindex == 0 && endIndex > elements) {
endIndex = elements -1 ;
}
dothis( startindex,endIndex);
}
for(Thread t : threads) {
t.run();
}
}
public List dothis(int startindex, int endIndex) throws Exception {
Thread thread = new Thread(new Runnable() {
@Override
public void run() {
for (int i = startindex;
(i < endIndex && (startindex < elements && elements) ) ; i++)
{
//task adding elements in list
}
}
});
thread.start();
threads.add(thread);
return list;
}
I don't know which version of Java you are using, but in Java 7 and higher you can use the Fork/Join framework (ForkJoinPool).
Basically,
Fork/Join, introduced in Java 7, isn't intended to replace or compete
with the existing concurrency utility classes; instead it updates and
completes them. Fork/Join addresses the need for divide-and-conquer,
or recursive task-processing in Java programs (see Resources).
Fork/Join's logic is very simple: (1) separate (fork) each large task
into smaller tasks; (2) process each task in a separate thread
(separating those into even smaller tasks if necessary); (3) join the
results.
Citation.
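To make that divide-and-conquer idea concrete for this case, here is a minimal sketch using ForkJoinPool and RecursiveTask (Java 7+). The task class, the threshold, and the per-element work are illustrative placeholders, not taken from the question:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Splits an index range in half until it is small enough, processes the small
// ranges, and merges the partial result lists on the way back up.
class RangeTask extends RecursiveTask<List<Integer>> {
    private static final int THRESHOLD = 10000; // chunk size, chosen arbitrarily
    private final int start;
    private final int end;

    RangeTask(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    protected List<Integer> compute() {
        if (end - start <= THRESHOLD) {
            List<Integer> partial = new ArrayList<Integer>();
            for (int i = start; i < end; i++) {
                partial.add(i * 2); // placeholder for the real per-element task
            }
            return partial;
        }
        int mid = (start + end) >>> 1;
        RangeTask left = new RangeTask(start, mid);
        RangeTask right = new RangeTask(mid, end);
        left.fork();                                   // process the left half asynchronously
        List<Integer> rightResult = right.compute();   // process the right half in this thread
        List<Integer> merged = new ArrayList<Integer>(left.join()); // wait for the left half
        merged.addAll(rightResult);
        return merged;
    }
}

// Usage: List<Integer> all = new ForkJoinPool().invoke(new RangeTask(0, 1200000));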
There are various examples online that can help with it. I haven't used it myself.
I hope this helps.
For Java6, you can follow this related SO question.

Persistent thread style OpenCL implementation is very slow

I came across the persistent thread (PT) style implementation for non-homogeneous work distribution and wrote a simple kernel to compare the computation time with a kernel doing the same computations the usual way. But my test implementation is about 6 times slower than the ordinary implementation, even without the overhead of sorting the buffer to get corresponding operations of 32. Is this a reasonable slowdown, or am I overlooking something? I launched the PT kernel with global_work_size = local_work_size = CL_DEVICE_MAX_WORK_GROUP_SIZE, which is 512. If I choose a smaller size, it obviously gets even slower.
This is the ordinary kernel:
__kernel void myKernel(const __global int* buffer)
{
int myIndex = get_local_id(0);
doSomeComputations(buffer[myIndex]); //just many adds and mults, no conditionals
}
And this is the PT style kernel:
__constant int finalIndex = 655360;
__kernel void myKernel(const __global int* buffer)
{
__local volatile int nextIndex;
if (get_local_id(0) == 0)
nextIndex = 0;
mem_fence(CLK_LOCAL_MEM_FENCE);
int myIndex;
while(true){
// get next index
myIndex = nextIndex + get_local_id(0);
if (myIndex > finalIndex)
return;
if ( get_local_id(0) == 0)
nextIndex += 512;
mem_fence(CLK_LOCAL_MEM_FENCE);
doSomeComputations(buffer[myIndex]); //same computations as above
}
}
I thought both implementations should take about the same time. Why is the PT style implementation so much slower? Thank you in advance.
------------Edited below this line-------------
So just to be clear. This kernel launched with global_work_size=655360 and local_work_size=512
__kernel void myKernel()
{
int myIndex = get_local_id(0);
volatile float result;
float4 test = (float4)(1.1f);
for(int i=0; i<1000; i++)
test = (test*test + test*test)/2.0;
result = test.x;
}
runs 6 times faster than this kernel launched with global_work_size=512 and local_work_size=512
__kernel void myKernel()
{
for(size_t idx = 0; idx < 655360; idx += get_local_size(0))
{
volatile float result;
float4 test = (float4)(1.1f);
for(int i=0; i<1000; i++)
test = (test*test + test*test)/2.0;
result = test.x;
}
}
You can reduce your second kernel to just this:
__kernel void myKernel(const __global int* buffer)
{
for(int x = 0; x < 655360; x += get_local_size(0))
doSomeComputations(buffer[x+get_local_id(0)]);
}
Update: added a summary of the conversation below.
The first kernel (global_work_size=655360 and local_work_size=512) is split into 655360/512 = 1280 work groups, which fully utilizes the GPU. The second kernel (global_work_size=512 and local_work_size=512) uses only one compute unit, which explains why the first one runs faster.
More details about persistent threads in GPU: persistent-threads-in-opencl-and-cuda.

Random number in long range, is this the way?

Can somebody verify this method? I need a long inside a range defined by two longs. I use the .NET Random.Next(min, max) method, which returns ints. Is my reasoning correct if I simply divide the longs by 2, generate the random number, and finally multiply it by 2 again? Or am I too enthusiastic...
I understand that my random resolution will decrease, but are there any other mistakes that would prevent this from producing a random number in the range?
long min = st.MinimumTime.Ticks; //long is Signed 64-bit integer
long max = st.MaximumTime.Ticks;
int minInt = (int) (min / 2); //int is Signed 32-bit integer
int maxInt = (int) (max / 2); //int is Signed 32-bit integer
Random random = new Random();
int randomInt = random.Next(minInt, maxInt);
long randomLong = (randomInt * 2);
Why don't you just generate two random Int32 values and make one Int64 out of them?
long LongRandom(long min, long max, Random rand) {
long result = rand.Next((Int32)(min >> 32), (Int32)(max >> 32));
result = (result << 32);
result = result | (long)rand.Next((Int32)min, (Int32)max);
return result;
}
Sorry, I forgot to add boundaries the first time. Added min and max params. You can test it like that:
long r = LongRandom(100000000000000000, 100000000000000050, new Random());
Values of r will lie in the desired range.
EDIT: the implementation above is flawed. It's probably worth it to generate four 16-bit integers rather than two 32-bit ones to avoid signed-unsigned problems, but at that point the solution loses its elegance, so I think it's best to stick with the Random.NextBytes version:
long LongRandom(long min, long max, Random rand) {
byte[] buf = new byte[8];
rand.NextBytes(buf);
long longRand = BitConverter.ToInt64(buf, 0);
return (Math.Abs(longRand % (max - min)) + min);
}
It looks pretty good in terms of value distribution (judging by the very simple tests I ran).
Some other answers here have two issues: having a modulo bias, and failing to correctly handle values of max = long.MaxValue. (Martin's answer has neither problem, but his code is unreasonably slow with large ranges.)
The following code will fix all of those issues:
//Working with ulong so that modulo works correctly with values > long.MaxValue
ulong uRange = (ulong)(max - min);
//Prevent a modulo bias; see https://stackoverflow.com/a/10984975/238419
//for more information.
//In the worst case, the expected number of calls is 2 (though usually it's
//much closer to 1) so this loop doesn't really hurt performance at all.
ulong ulongRand;
do
{
byte[] buf = new byte[8];
random.NextBytes(buf);
ulongRand = (ulong)BitConverter.ToInt64(buf, 0);
} while (ulongRand > ulong.MaxValue - ((ulong.MaxValue % uRange) + 1) % uRange);
return (long)(ulongRand % uRange) + min;
The following fully-documented class can be dropped into your codebase to implement the above solution easily and brain-free. Like all code on Stack Overflow, it's licensed under CC-attribution, so feel free to use it for basically whatever you want.
using System;
namespace MyNamespace
{
public static class RandomExtensionMethods
{
/// <summary>
/// Returns a random long from min (inclusive) to max (exclusive)
/// </summary>
/// <param name="random">The given random instance</param>
/// <param name="min">The inclusive minimum bound</param>
/// <param name="max">The exclusive maximum bound. Must be greater than min</param>
public static long NextLong(this Random random, long min, long max)
{
if (max <= min)
throw new ArgumentOutOfRangeException("max", "max must be > min!");
//Working with ulong so that modulo works correctly with values > long.MaxValue
ulong uRange = (ulong)(max - min);
//Prevent a modulo bias; see https://stackoverflow.com/a/10984975/238419
//for more information.
//In the worst case, the expected number of calls is 2 (though usually it's
//much closer to 1) so this loop doesn't really hurt performance at all.
ulong ulongRand;
do
{
byte[] buf = new byte[8];
random.NextBytes(buf);
ulongRand = (ulong)BitConverter.ToInt64(buf, 0);
} while (ulongRand > ulong.MaxValue - ((ulong.MaxValue % uRange) + 1) % uRange);
return (long)(ulongRand % uRange) + min;
}
/// <summary>
/// Returns a random long from 0 (inclusive) to max (exclusive)
/// </summary>
/// <param name="random">The given random instance</param>
/// <param name="max">The exclusive maximum bound. Must be greater than 0</param>
public static long NextLong(this Random random, long max)
{
return random.NextLong(0, max);
}
/// <summary>
/// Returns a random long over all possible values of long (except long.MaxValue, similar to
/// random.Next())
/// </summary>
/// <param name="random">The given random instance</param>
public static long NextLong(this Random random)
{
return random.NextLong(long.MinValue, long.MaxValue);
}
}
}
Usage:
Random random = new Random();
long foobar = random.NextLong(0, 1234567890L);
This creates a random Int64 by using random bytes, avoiding modulo bias by retrying if the number is outside the safe range.
static class RandomExtensions
{
public static long RandomLong(this Random rnd)
{
byte[] buffer = new byte[8];
rnd.NextBytes (buffer);
return BitConverter.ToInt64(buffer, 0);
}
public static long RandomLong(this Random rnd, long min, long max)
{
EnsureMinLEQMax(ref min, ref max);
long numbersInRange = unchecked(max - min + 1);
if (numbersInRange < 0)
throw new ArgumentException("Size of range between min and max must be less than or equal to Int64.MaxValue");
long randomOffset = RandomLong(rnd);
if (IsModuloBiased(randomOffset, numbersInRange))
return RandomLong(rnd, min, max); // Try again
else
return min + PositiveModuloOrZero(randomOffset, numbersInRange);
}
static bool IsModuloBiased(long randomOffset, long numbersInRange)
{
long greatestCompleteRange = numbersInRange * (long.MaxValue / numbersInRange);
return randomOffset > greatestCompleteRange;
}
static long PositiveModuloOrZero(long dividend, long divisor)
{
long mod;
Math.DivRem(dividend, divisor, out mod);
if(mod < 0)
mod += divisor;
return mod;
}
static void EnsureMinLEQMax(ref long min, ref long max)
{
if(min <= max)
return;
long temp = min;
min = max;
max = temp;
}
}
Here is a solution that builds on the other answers using Random.NextBytes, but also pays careful attention to boundary cases. I've structured it as a set of extension methods. I've also accounted for modulo bias by sampling another random number if the first one falls out of range.
One of my gripes (at least for the situation I was trying to use it in) is that the maximum is usually exclusive, so if you want to roll a die you do something like Random.Next(0, 7). However, this means you can never get this overload to return the .MaxValue for the data type (int, long, ulong, what have you). Therefore, I've added an inclusiveUpperBound flag to toggle this behavior.
public static class Extensions
{
//returns a uniformly random ulong between ulong.Min inclusive and ulong.Max inclusive
public static ulong NextULong(this Random rng)
{
byte[] buf = new byte[8];
rng.NextBytes(buf);
return BitConverter.ToUInt64(buf, 0);
}
//returns a uniformly random ulong between ulong.Min and Max without modulo bias
public static ulong NextULong(this Random rng, ulong max, bool inclusiveUpperBound = false)
{
return rng.NextULong(ulong.MinValue, max, inclusiveUpperBound);
}
//returns a uniformly random ulong between Min and Max without modulo bias
public static ulong NextULong(this Random rng, ulong min, ulong max, bool inclusiveUpperBound = false)
{
ulong range = max - min;
if (inclusiveUpperBound)
{
if (range == ulong.MaxValue)
{
return rng.NextULong();
}
range++;
}
if (range <= 0)
{
throw new ArgumentOutOfRangeException("Max must be greater than min when inclusiveUpperBound is false, and greater than or equal to when true", "max");
}
ulong limit = ulong.MaxValue - ulong.MaxValue % range;
ulong r;
do
{
r = rng.NextULong();
} while(r > limit);
return r % range + min;
}
//returns a uniformly random long between long.Min inclusive and long.Max inclusive
public static long NextLong(this Random rng)
{
byte[] buf = new byte[8];
rng.NextBytes(buf);
return BitConverter.ToInt64(buf, 0);
}
//returns a uniformly random long between long.Min and Max without modulo bias
public static long NextLong(this Random rng, long max, bool inclusiveUpperBound = false)
{
return rng.NextLong(long.MinValue, max, inclusiveUpperBound);
}
//returns a uniformly random long between Min and Max without modulo bias
public static long NextLong(this Random rng, long min, long max, bool inclusiveUpperBound = false)
{
ulong range = (ulong)(max - min);
if (inclusiveUpperBound)
{
if (range == ulong.MaxValue)
{
return rng.NextLong();
}
range++;
}
if (range <= 0)
{
throw new ArgumentOutOfRangeException("Max must be greater than min when inclusiveUpperBound is false, and greater than or equal to when true", "max");
}
ulong limit = ulong.MaxValue - ulong.MaxValue % range;
ulong r;
do
{
r = rng.NextULong();
} while(r > limit);
return (long)(r % range + (ulong)min);
}
}
private long randomLong()
{
Random random = new Random();
byte[] bytes = new byte[8];
random.NextBytes(bytes);
return BitConverter.ToInt64(bytes, 0);
}
This will get you a secure random long:
using (RNGCryptoServiceProvider rg = new RNGCryptoServiceProvider())
{
byte[] rno = new byte[9];
rg.GetBytes(rno);
long randomvalue = BitConverter.ToInt64(rno, 0);
}
Start at the minimum and add a random percentage of the difference between the min and the max. The problem with this is that NextDouble returns a number x such that 0 <= x < 1, so you can never hit the max number.
long randomLong = min + (long)(random.NextDouble() * (max - min));
Your randomLong will always be even, and you will have eliminated even more values because you are very far away from the maximum for long: the maximum for long is 2^32 times the maximum for int. You should use Random.NextBytes.
You can try CryptoRandom of the Inferno library:
public class CryptoRandom : Random
// implements all Random methods, as well as:
public byte[] NextBytes(int count)
public long NextLong()
public long NextLong(long maxValue)
public long NextLong(long minValue, long maxValue)
I wrote some test methods to check my own method and many of the answers from this and similar questions. Generating duplicate values is a big problem. I found that BlueRaja - Danny Pflughoeft's answer at this address is good enough and did not generate duplicate values, at least for the first 10,000,000 iterations. This is the test method:
[TestMethod]
public void TestRand64WithExtensions()
{
Int64 rnum = 0;
HashSet<Int64> hs = new HashSet<long>();
Random randAgent = new Random((int)DateTime.Now.Ticks);
for (int i = 0; i < 10000000; i++)
{
rnum = randAgent.NextLong(100000000000000, 999999999999999);
//Test returned value is greater than zero
Assert.AreNotEqual(0, rnum);
//Test Length of returned value
Assert.AreEqual(15, rnum.ToString().Length);
//Test redundancy
if (!hs.Contains(rnum)) { hs.Add(rnum); }
else
{
//log redundant value and current length we received
Console.Write(rnum + " | " + hs.Count.ToString());
Assert.Fail();
}
}
}
I didn't want to post this as an answer, but I can't fit it in the comment section and I didn't want to add it as an edit to an answer without the author's consent. So pardon me; this is not an independent answer, just supporting evidence for one of the answers.
I wrote a benchmarking C# console app that tests 5 different methods for generating unsigned 64-bit integers. Some of those methods are mentioned above. Method #5 appeared to be consistently the quickest. I claim to be no coding genius, but if this helps you, you're welcome to it. If you have better ideas, please share them. - Dave (sbda26#gmail.com)
static private Random _clsRandom = new Random();
private const int _ciIterations = 100000;
static void Main(string[] args)
{
RunMethod(Method1);
RunMethod(Method2);
RunMethod(Method3);
RunMethod(Method4);
RunMethod(Method5);
Console.ReadLine();
}
static void RunMethod(Func<ulong> MethodX)
{
ulong ulResult;
DateTime dtStart;
TimeSpan ts;
Console.WriteLine("--------------------------------------------");
Console.WriteLine(MethodX.Method.Name);
dtStart = DateTime.Now;
for (int x = 1; x <= _ciIterations; x++)
ulResult = MethodX.Invoke();
ts = DateTime.Now - dtStart;
Console.WriteLine(string.Format("Elapsed time: {0} milliseconds", ts.TotalMilliseconds));
}
static ulong Method1()
{
int x1 = _clsRandom.Next(int.MinValue, int.MaxValue);
int x2 = _clsRandom.Next(int.MinValue, int.MaxValue);
ulong y;
// lines must be separated or result won't go past 2^32
y = (uint)x1;
y = y << 32;
y = y | (uint)x2;
return y;
}
static ulong Method2()
{
ulong ulResult = 0;
for(int iPower = 0; iPower < 64; iPower++)
{
double dRandom = _clsRandom.NextDouble();
if(dRandom > 0.5)
{
double dValue = Math.Pow(2, iPower);
ulong ulValue = Convert.ToUInt64(dValue);
ulResult = ulResult | ulValue;
}
}
return ulResult;
}
static ulong Method3() // only difference between #3 and #2 is that this one (#3) uses .Next() instead of .NextDouble()
{
ulong ulResult = 0;
for (int iPower = 0; iPower < 64; iPower++)
if (_clsRandom.Next(0, 2) == 1) // upper bound of Next is exclusive, so (0, 2) yields a random bit
ulResult = ulResult | Convert.ToUInt64(Math.Pow(2, iPower));
return ulResult;
}
static ulong Method4()
{
byte[] arr_bt = new byte[8];
ulong ulResult;
_clsRandom.NextBytes(arr_bt);
ulResult = BitConverter.ToUInt64(arr_bt, 0);
return ulResult;
}
// Next method courtesy of https://stackoverflow.com/questions/14708778/how-to-convert-unsigned-integer-to-signed-integer-without-overflowexception/39107847
[System.Runtime.InteropServices.StructLayout(System.Runtime.InteropServices.LayoutKind.Explicit)]
struct EvilUnion
{
[System.Runtime.InteropServices.FieldOffset(0)] public int Int32;
[System.Runtime.InteropServices.FieldOffset(0)] public uint UInt32;
}
static ulong Method5()
{
var evil = new EvilUnion();
ulong ulResult = 0;
evil.Int32 = _clsRandom.Next(int.MinValue, int.MaxValue);
ulResult = evil.UInt32;
ulResult = ulResult << 32;
evil.Int32 = _clsRandom.Next(int.MinValue, int.MaxValue);
ulResult = ulResult | evil.UInt32;
return ulResult;
}
}
I'll add my solution for generating a random unsigned long integer (ulong) below a max value.
public static ulong GetRandomUlong(ulong maxValue)
{
Random rnd = new Random();
//This algorithm works with inclusive upper bound, but random generators traditionally have exclusive upper bound, so we adjust.
//Zero is allowed, function will return zero, as well as for 1. Same behavior as System.Random.Next().
if (maxValue > 0) maxValue--;
byte[] maxValueBytes = BitConverter.GetBytes(maxValue);
byte[] result = new byte[8];
int i;
for(i = 7; i >= 0; i--)
{
//Higher-order bytes are either zero (then Random will produce zero without our help), or equal to or below those of maxValue
result[i] = (byte)rnd.Next( maxValueBytes[i] + 1 );
//If, going from high bytes to low bytes, we got a byte that is lower than the corresponding byte of maxValue, then the lower bytes may take any value.
if ((uint)result[i] < maxValueBytes[i]) break;
}
for(i--; i >= 0; i--) // I like this row
{
result[i] = (byte)rnd.Next(256);
}
return BitConverter.ToUInt64(result, 0);
}
.NET 6 (C# 10) now has long randoms built in via Random.NextInt64.
Use NextInt64 if you can.
You're better off taking the difference between minimum and maximum (if it fits in an int), getting a random between 0 and that, and adding it to the minimum.
Is there anything wrong with using this simple approach?
long min = 10000000000001;
long max = 99999999999999;
Random random = new Random();
long randomNumber = min + random.Next() % (max - min);
My working solution, tested 1000+ times:
public static long RandomLong(long min, long max)
{
return min + (long)RandomULong(0, (ulong)Math.Abs(max - min));
}
public static ulong RandomULong(ulong min, ulong max)
{
var hight = Rand.Next((int)(min >> 32), (int)(max >> 32));
var minLow = Math.Min((int)min, (int)max);
var maxLow = Math.Max((int)min, (int)max);
var low = (uint)Rand.Next(minLow, maxLow);
ulong result = (ulong)hight;
result <<= 32;
result |= (ulong)low;
return result;
}
How about generating bytes and converting to int64?
/* generate a byte array, then convert to UInt64 */
var r = new Random(); // DONT do this for each call - use a static Random somewhere
var barray = new byte[64/8];
r.NextBytes(barray);
var rint64 = BitConverter.ToUInt64(barray, 0);
Seems to work for me :)
What's wrong with generating a double as a factor and using it to calculate the actual long value, starting from the max value a long can have?
long result = (long)Math.Round( random.NextDouble() * maxLongValue );
NextDouble generates a random number between [0.0, 0.99999999999999978] (msdn doc)
You multiply this random number by your maxLongValue.
You Math.Round that result so you still get a chance of hitting maxLongValue (e.g., as if you had gotten 1.0 from NextDouble).
You cast back to long.

Analyze "whistle" sound for pitch/note

I am trying to build a system that will be able to process a recording of someone whistling and output notes.
Can anyone recommend an open-source platform which I can use as the base for note/pitch recognition and analysis of wave files?
Thanks in advance
As many others have already said, FFT is the way to go here. I've written a little example in Java using FFT code from http://www.cs.princeton.edu/introcs/97data/. In order to run it, you will need the Complex class from that page also (see the source for the exact URL).
The code reads in a file, goes over it window-wise, and does an FFT on each window. For each FFT it looks for the maximum coefficient and outputs the corresponding frequency. This works very well for clean signals like a sine wave, but for an actual whistle sound you will probably have to add more. I've tested it with a few whistling files I created myself (using the integrated mic of my laptop); the code gets the general idea of what's going on, but more work is needed to get actual notes.
1) You might need a more intelligent windowing technique. What my code uses now is a simple rectangular window. Since the FFT assumes that the input signal can be periodically continued, additional frequencies are detected when the first and the last sample in the window don't match. This is known as spectral leakage ( http://en.wikipedia.org/wiki/Spectral_leakage ); usually one uses a window that down-weights samples at the beginning and the end of the window ( http://en.wikipedia.org/wiki/Window_function ). Although the leakage shouldn't cause the wrong frequency to be detected as the maximum, using a window will increase the detection quality.
2) To match the frequencies to actual notes, you could use an array containing the frequencies (like 440 Hz for a') and then look for the frequency that's closest to the one that has been identified. However, if the whistling is off standard tuning, this won't work any more. Given that the whistling is still correct but only tuned differently (like a guitar or other musical instrument can be tuned differently and still sound "good", as long as the tuning is done consistently for all strings), you could still find notes by looking at the ratios of the identified frequencies. You can read http://en.wikipedia.org/wiki/Pitch_%28music%29 as a starting point on that. This is also interesting: http://en.wikipedia.org/wiki/Piano_key_frequencies
3) Moreover it might be interesting to detect the points in time when each individual tone starts and stops. This could be added as a pre-processing step. You could do an FFT for each individual note then. However, if the whistler doesn't stop but just bends between notes, this would not be that easy.
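For that third point, one simple pre-processing idea is to segment the recording by frame energy: treat a contiguous run of frames whose RMS is above a threshold as one tone, then run the FFT on each segment. A rough sketch (the frame size and threshold are arbitrary placeholders and would need tuning for real input):
import java.util.ArrayList;
import java.util.List;

public class ToneSegmenter {
    // Returns [start, end) sample ranges whose frames are above the RMS threshold.
    public static List<int[]> findToneSegments(double[] samples, int frameSize, double threshold) {
        List<int[]> segments = new ArrayList<int[]>();
        int segmentStart = -1;
        for (int offset = 0; offset + frameSize <= samples.length; offset += frameSize) {
            double sum = 0;
            for (int i = 0; i < frameSize; i++) {
                sum += samples[offset + i] * samples[offset + i];
            }
            boolean loud = Math.sqrt(sum / frameSize) > threshold;
            if (loud && segmentStart < 0) {
                segmentStart = offset;                            // a tone starts here
            } else if (!loud && segmentStart >= 0) {
                segments.add(new int[] { segmentStart, offset }); // the tone ends here
                segmentStart = -1;
            }
        }
        if (segmentStart >= 0) {
            segments.add(new int[] { segmentStart, samples.length });
        }
        return segments;
    }
}
As noted above, this breaks down if the whistler bends between notes without pausing; in that case you would have to look for pitch changes rather than silence.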
Definitely have a look at the libraries the others suggested. I don't know any of them, but maybe they contain already functionality for doing what I've described above.
And now to the code. Please let me know what worked for you, I find this topic pretty interesting.
Edit: I updated the code to include overlapping and a simple mapper from frequencies to notes. It works only for "tuned" whistlers though, as mentioned above.
package de.ahans.playground;
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
public class FftMaxFrequency {
// taken from http://www.cs.princeton.edu/introcs/97data/FFT.java.html
// (first hit in Google for "java fft")
// needs Complex class from http://www.cs.princeton.edu/introcs/97data/Complex.java
public static Complex[] fft(Complex[] x) {
int N = x.length;
// base case
if (N == 1) return new Complex[] { x[0] };
// radix 2 Cooley-Tukey FFT
if (N % 2 != 0) { throw new RuntimeException("N is not a power of 2"); }
// fft of even terms
Complex[] even = new Complex[N/2];
for (int k = 0; k < N/2; k++) {
even[k] = x[2*k];
}
Complex[] q = fft(even);
// fft of odd terms
Complex[] odd = even; // reuse the array
for (int k = 0; k < N/2; k++) {
odd[k] = x[2*k + 1];
}
Complex[] r = fft(odd);
// combine
Complex[] y = new Complex[N];
for (int k = 0; k < N/2; k++) {
double kth = -2 * k * Math.PI / N;
Complex wk = new Complex(Math.cos(kth), Math.sin(kth));
y[k] = q[k].plus(wk.times(r[k]));
y[k + N/2] = q[k].minus(wk.times(r[k]));
}
return y;
}
static class AudioReader {
private AudioFormat audioFormat;
public AudioReader() {}
public double[] readAudioData(File file) throws UnsupportedAudioFileException, IOException {
AudioInputStream in = AudioSystem.getAudioInputStream(file);
audioFormat = in.getFormat();
int depth = audioFormat.getSampleSizeInBits();
long length = in.getFrameLength();
if (audioFormat.isBigEndian()) {
throw new UnsupportedAudioFileException("big endian not supported");
}
if (audioFormat.getChannels() != 1) {
throw new UnsupportedAudioFileException("only 1 channel supported");
}
byte[] tmp = new byte[(int) length];
byte[] samples = null;
int bytesPerSample = depth/8;
int bytesRead;
while (-1 != (bytesRead = in.read(tmp))) {
if (samples == null) {
samples = Arrays.copyOf(tmp, bytesRead);
} else {
int oldLen = samples.length;
samples = Arrays.copyOf(samples, oldLen + bytesRead);
for (int i = 0; i < bytesRead; i++) samples[oldLen+i] = tmp[i];
}
}
double[] data = new double[samples.length/bytesPerSample];
for (int i = 0; i < samples.length-bytesPerSample; i += bytesPerSample) {
int sample = 0;
for (int j = 0; j < bytesPerSample; j++) sample += samples[i+j] << j*8;
data[i/bytesPerSample] = (double) sample / Math.pow(2, depth);
}
return data;
}
public AudioFormat getAudioFormat() {
return audioFormat;
}
}
public class FrequencyNoteMapper {
private final String[] NOTE_NAMES = new String[] {
"A", "Bb", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"
};
private final double[] FREQUENCIES;
private final double a = 440;
private final int TOTAL_OCTAVES = 6;
private final int START_OCTAVE = -1; // relative to A
public FrequencyNoteMapper() {
FREQUENCIES = new double[TOTAL_OCTAVES*12];
int j = 0;
for (int octave = START_OCTAVE; octave < START_OCTAVE+TOTAL_OCTAVES; octave++) {
for (int note = 0; note < 12; note++) {
int i = octave*12+note;
FREQUENCIES[j++] = a * Math.pow(2, (double)i / 12.0);
}
}
}
public String findMatch(double frequency) {
if (frequency == 0)
return "none";
double minDistance = Double.MAX_VALUE;
int bestIdx = -1;
for (int i = 0; i < FREQUENCIES.length; i++) {
if (Math.abs(FREQUENCIES[i] - frequency) < minDistance) {
minDistance = Math.abs(FREQUENCIES[i] - frequency);
bestIdx = i;
}
}
int octave = bestIdx / 12;
int note = bestIdx % 12;
return NOTE_NAMES[note] + octave;
}
}
public void run (File file) throws UnsupportedAudioFileException, IOException {
FrequencyNoteMapper mapper = new FrequencyNoteMapper();
// size of window for FFT
int N = 4096;
int overlap = 1024;
AudioReader reader = new AudioReader();
double[] data = reader.readAudioData(file);
// sample rate is needed to calculate actual frequencies
float rate = reader.getAudioFormat().getSampleRate();
// go over the samples window-wise
for (int offset = 0; offset < data.length-N; offset += (N-overlap)) {
// for each window calculate the FFT
Complex[] x = new Complex[N];
for (int i = 0; i < N; i++) x[i] = new Complex(data[offset+i], 0);
Complex[] result = fft(x);
// find index of maximum coefficient
double max = -1;
int maxIdx = 0;
for (int i = result.length/2; i >= 0; i--) {
if (result[i].abs() > max) {
max = result[i].abs();
maxIdx = i;
}
}
// calculate the frequency of that coefficient
double peakFrequency = (double)maxIdx*rate/(double)N;
// and get the time of the start and end position of the current window
double windowBegin = offset/rate;
double windowEnd = (offset+(N-overlap))/rate;
System.out.printf("%f s to %f s:\t%f Hz -- %s\n", windowBegin, windowEnd, peakFrequency, mapper.findMatch(peakFrequency));
}
}
public static void main(String[] args) throws UnsupportedAudioFileException, IOException {
new FftMaxFrequency().run(new File("/home/axr/tmp/entchen.wav"));
}
}
I think this open-source platform suits you:
http://code.google.com/p/musicg-sound-api/
Well, you could always use FFTW to perform the Fast Fourier Transform. It's a very well-respected library. Once you've got the FFT of your signal, you can analyze the resultant array for peaks. A simple histogram-style analysis should give you the frequencies with the greatest volume. Then you just have to compare those frequencies to the frequencies that correspond to different pitches.
In addition to the other great options:
csound pitch detection: http://www.csounds.com/manual/html/pvspitch.html
fmod: http://www.fmod.org/ (has a free version)
aubio: http://aubio.org/doc/pitchdetection_8h.html
You might want to consider Python(x,y). It's a scientific programming framework for Python in the spirit of MATLAB, and it has easy functions for working in the FFT domain.
If you use Java, have a look at the TarsosDSP library. It has a pretty good ready-to-go pitch detector.
Here is an example for Android, but I don't think it requires many modifications to use elsewhere.
I'm a fan of the FFT but for the monophonic and fairly pure sinusoidal tones of whistling, a zero-cross detector would do a far better job at determining the actual frequency at a much lower processing cost. Zero-cross detection is used in electronic frequency counters that measure the clock rate of whatever is being tested.
If you're going to analyze anything other than pure sine wave tones, then FFT is definitely the way to go.
A very simple implementation of zero cross detection in Java on GitHub
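To illustrate the zero-cross idea, here is a rough Java sketch: it counts sign changes in a buffer of samples and estimates the frequency from the crossing rate. It assumes a clean, monophonic signal already loaded as doubles; the synthetic 440 Hz input in main is just a stand-in for real whistle data:
public class ZeroCrossPitch {
    // Estimates the fundamental frequency of a (roughly sinusoidal) buffer
    // by counting zero crossings: a sine wave crosses zero twice per cycle.
    static double estimateFrequency(double[] samples, double sampleRate) {
        int crossings = 0;
        for (int i = 1; i < samples.length; i++) {
            if ((samples[i - 1] <= 0 && samples[i] > 0)
                    || (samples[i - 1] >= 0 && samples[i] < 0)) {
                crossings++;
            }
        }
        double seconds = samples.length / sampleRate;
        return (crossings / 2.0) / seconds;
    }

    public static void main(String[] args) {
        double sampleRate = 44100;
        double[] samples = new double[4096];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = Math.sin(2 * Math.PI * 440 * i / sampleRate);
        }
        System.out.printf("Estimated frequency: %.1f Hz%n", estimateFrequency(samples, sampleRate));
    }
}
In practice you would gate this on signal energy so that silence or noise between notes doesn't produce spurious estimates.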
