groovy fibonacchi trampoline and memoize - groovy

I am trying to evaluate the Fibonacci sequence for n = 10000000.
Using a basic trampoline it looks like this:
def rFibonacchi
rFibonacchi = { BigInteger n, prev = 0, next = 1 ->
    (n < 2) ? prev : rFibonacchi.trampoline(n - 1, next, prev + next)
}.trampoline()
But using the trampoline & memoize combo I constantly get an OutOfMemoryError.
def tFibonacchi, mFibonacchi
mFibonacchi = { BigInteger n, prev, next ->
    n < 2 ? prev : tFibonacchi.trampoline(n - 1, next, prev + next)
}.memoize()
tFibonacchi = { BigInteger n, prev = 0, next = 1 ->
    mFibonacchi(n, prev, next)
}.trampoline()
tFibonacchi(10000000); // GC overhead limit exceeded
Is this an issue with my algorithm?

Your algorithm doesn't get any benefit from memoization. To quote the Groovy docs on memoization:
Memoization allows the result of the call of a closure to be cached. It is interesting if the computation done by a function (closure) is slow, but you know that this function is going to be called often with the same arguments. A typical example is the Fibonacci suite. A naive implementation may look like this:
def fib
fib = { long n -> n<2?n:fib(n-1)+fib(n-2) }
assert fib(15) == 610 // slow!
It is a naive implementation because 'fib' is often called recursively with the same arguments, leading to an exponential algorithm:
computing fib(15) requires the result of fib(14) and fib(13)
computing fib(14) requires the result of fib(13) and fib(12)
Since calls are recursive, you can already see that we will compute the same values again and again, although they could be cached. This naive implementation can be "fixed" by caching the result of calls using memoize:
fib = { long n -> n<2?n:fib(n-1)+fib(n-2) }.memoize()
assert fib(25) == 75025 // fast!
The cache works using the actual values of the arguments.
You are using an improved Fibonacci algorithm compared to the one above. Yours is essentially iterative, and it never calls mFibonacchi twice with the same arguments. Groovy therefore caches the result of each call but never reads anything back from that cache, which leads to the memory overflow. The memoization is actually the problem.
Your algorithm is equivalent to:
BigInteger fibonacchi(BigInteger n) {
    BigInteger prev = 0, next = 1
    for (; n > 1; n--) {
        BigInteger temp = prev
        prev = next
        next = prev + temp
    }
    return prev
}
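The write-only cache is easy to observe directly. Here is a small sketch using Groovy's Closure.memoize() (the counter is only there to show that no call is ever answered from the cache):

def calls = 0
def inc = { int n -> calls++; n + 1 }.memoize()
(0..<1000).each { inc(it) } // 1000 distinct arguments
assert calls == 1000        // every call was a cache miss: entries are written, never read

This is exactly what happens with mFibonacchi, except that with n = 10000000 the cached argument tuples and BigInteger results are numerous enough to exhaust the heap.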

Related

Math.Net Exponential Moving Average

I'm using the simple moving average in Math.NET, but now I also need to calculate the EMA (exponential moving average) or some other kind of weighted moving average, and I can't find it in the library.
I looked over all the methods under MathNet.Numerics.Statistics and beyond, but didn't find anything similar.
Is it missing from the library, or do I need to reference some additional package?
I don't see any EMA in MathNet.Numerics; however, it's trivial to program. The routine below is based on the definition at Investopedia.
public double[] EMA(double[] x, int N)
{
    // x is the input series
    // N is the notional age of the data used
    // k is the smoothing constant
    double k = 2.0 / (N + 1);
    double[] y = new double[x.Length];
    y[0] = x[0];
    for (int i = 1; i < x.Length; i++)
        y[i] = k * x[i] + (1 - k) * y[i - 1];
    return y;
}
Incidentally, I found this package: https://daveskender.github.io/Stock.Indicators/docs/INDICATORS.html. It targets the latest .NET framework and has very detailed documentation.
Try this:
public IEnumerable<double> EMA(IEnumerable<double> items, int notationalAge)
{
    double k = 2.0d / (notationalAge + 1), prev = 0.0d;
    var e = items.GetEnumerator();
    if (!e.MoveNext()) yield break;
    yield return prev = e.Current;
    while (e.MoveNext())
    {
        yield return prev = (k * e.Current) + (1 - k) * prev;
    }
}
It will still work with arrays, but also with List, Queue, Stack, IReadOnlyCollection, etc.
Although it's not explicitly stated, I also get the sense this is working with money, in which case it really ought to use decimal instead of double.
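If you do switch to decimal, the recurrence carries over unchanged. Here is a minimal sketch in Groovy, where decimal literals are BigDecimal by default (the method name and sample data are made up for illustration):

List<BigDecimal> ema(List<BigDecimal> x, int n) {
    BigDecimal k = 2.0G / (n + 1) // smoothing constant, as above
    List<BigDecimal> y = [x[0]]   // y[0] = x[0]
    for (int i = 1; i < x.size(); i++)
        y << k * x[i] + (1.0G - k) * y[i - 1]
    return y
}

println ema([10.0G, 10.5G, 10.2G, 10.8G], 3) // k = 0.5, yields [10.0, 10.25, 10.225, 10.5125]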

What is the worst case for binary search

Where should an element be located in the array so that the run time of the Binary search algorithm is O(log n)?
An element that is only reached on the final probe gives the worst case in binary search, since it requires the maximum number of comparisons. With the mid taken as the floor of (low + high) / 2, the last element is always such an element.
Example:
1 2 3 4 5 6 7 8 9
Here searching for 9 gives the worst case, with the result coming on the 4th pass (compare with 5, then 7, then 8, and finally 9).
1 2 3 4 5 6 7 8
In this case too, searching for 8 gives the worst case, with the result coming in 4 passes (compare with 4, then 6, then 7, and finally 8).
Note that searching for 1 (the first element) takes just 3 passes in both arrays (compare with 5, then 2, then 1 in the first; with 4, then 2, then 1 in the second). Because the mid is the floor of (low + high) / 2, the right half is never smaller than the left half, so the deepest probes land on the right.
This is assuming all arrays are 0-indexed.
// Java implementation of iterative Binary Search
class BinarySearch
{
    // Returns index of x if it is present in arr[],
    // else returns -1
    int binarySearch(int arr[], int x)
    {
        int l = 0, r = arr.length - 1;
        while (l <= r)
        {
            int m = l + (r - l) / 2;
            // Check if x is present at mid
            if (arr[m] == x)
                return m;
            // If x is greater, ignore the left half
            if (arr[m] < x)
                l = m + 1;
            // If x is smaller, ignore the right half
            else
                r = m - 1;
        }
        // If we reach here, the element was not present
        return -1;
    }

    // Driver method to test the above
    public static void main(String args[])
    {
        BinarySearch ob = new BinarySearch();
        int arr[] = {2, 3, 4, 10, 40};
        int x = 10;
        int result = ob.binarySearch(arr, x);
        if (result == -1)
            System.out.println("Element not present");
        else
            System.out.println("Element found at index " + result);
    }
}
Time Complexity:
The time complexity of Binary Search can be written as
T(n) = T(n/2) + c
The above recurrence can be solved using either the recurrence tree method or the Master method. It falls in case II of the Master method, and the solution of the recurrence is Theta(log n).
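Unrolling the recurrence shows where the logarithm comes from:

T(n) = T(n/2) + c
     = T(n/4) + 2c
     = T(n/8) + 3c
     = ...
     = T(n/2^k) + k*c

The halving stops when n/2^k = 1, i.e. k = log2(n), so T(n) = T(1) + c*log2(n) = Theta(log n).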
Auxiliary Space: O(1) for the iterative implementation; the recursive implementation uses O(log n) space for the recursion call stack.

How do I return the smallest value using a for loop?

I am given a limit, and I have to return the smallest value of n that makes 1 + 2 + 3 + 4 + ... + n >= limit true. I feel like there's one thing missing, but I can't tell what.
public int whenToReachLimit(int limit) {
    int sum = 0;
    for (int i = 1; sum < limit; i++) {
        sum = sum + i;
    }
    return sum;
}
The expected output (limit : n) would be:
1 : 1
4 : 3
10 : 4
You can avoid the loop entirely by using the closed form for the sum of the first n integers:

1 + 2 + ... + n = n(n + 1) / 2

Thus the inequality becomes:

n(n + 1) / 2 >= limit

Notice that the left-hand side is non-negative (if n is negative, the sum is empty) and strictly increasing. Notice also that you are looking for the first integer satisfying the inequality. The idea here is first to replace the inequality by the equality n(n + 1) / 2 = limit, which can be solved for n; in a second step, the possibly non-integer solution is rounded to the smallest integer that still satisfies the inequality.
Rewriting the equality as n^2 + n - 2*limit = 0 and solving for n gives two solutions. The negative one can be discarded (remember n is positive). That is:

n = (-1 + sqrt(1 + 8*limit)) / 2

Finally, round this solution up to the closest integer, which will also satisfy the inequality:

n = ceil((-1 + sqrt(1 + 8*limit)) / 2)
NB: this may be overkill for small inputs.
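For concreteness, a minimal Groovy sketch of that closed form (assuming limit >= 0; it matches the expected output above):

int whenToReachLimit(int limit) {
    // n = ceil((-1 + sqrt(1 + 8*limit)) / 2), from the quadratic above
    (int) Math.ceil((-1 + Math.sqrt(1 + 8.0d * limit)) / 2)
}

assert whenToReachLimit(1) == 1
assert whenToReachLimit(4) == 3
assert whenToReachLimit(10) == 4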
I'm not sure I know exactly what you want to do, but I would recommend doing a "practice run" of your current code by hand:
If limit = 0 the function returns 0
If limit = 1 the function returns 1
If limit = 2 the function returns 3
If limit = 3 the function returns 3
If limit = 4 the function returns 6
If limit = 5 the function returns 6
Now decide for yourself whether the function does what you expect.
I've found the answer. It turns out it doesn't work with a for loop, which I find odd. But this is the answer to my own question.
public int whenToReachLimit(int limit) {
    int n = 0;
    int sum = 0;
    while (sum < limit) {
        sum += n;
        n++;
    }
    return n - 1;
}
You don't want to return sum; you want to return n (the smallest value satisfying the given requirement).
In your original loop, return i - 1 instead of sum.
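Putting those two remarks together, a minimal fix to the original method is to hoist i out of the loop header so it survives the loop (a sketch; it does work with a for loop):

public int whenToReachLimit(int limit) {
    int sum = 0;
    int i = 1;
    for (; sum < limit; i++) {
        sum = sum + i;
    }
    return i - 1; // i has advanced one step past the n we want
}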

Performance difference in toString.map and toString.toArray.map

While coding Euler problems, I ran across something I find bizarre:
The method toString.map is slower than toString.toArray.map.
Here's an example:
def main(args: Array[String]) {
    def toDigit(num: Int) = num.toString.map(_ - 48)             // 2137 ms
    def toDigitFast(num: Int) = num.toString.toArray.map(_ - 48) // 592 ms

    val startTime = System.currentTimeMillis
    (1 to 1200000).map(toDigit)
    println(System.currentTimeMillis - startTime)
}
Shouldn't map on String fall back to a map over the array? Why is there such a noticeable difference? (Note that increasing the number even causes a stack overflow in the non-array case.)
Original
Could be because toString.map uses the WrappedString implicit, while toString.toArray.map uses the WrappedArray implicit to resolve map.
Let's look at map, as defined in TraversableLike:
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
    val b = bf(repr)
    b.sizeHint(this)
    for (x <- this) b += f(x)
    b.result
}
WrappedString uses a StringBuilder as its builder:

def +=(x: Char): this.type = { append(x); this }
def append(x: Any): StringBuilder = {
    underlying append String.valueOf(x)
    this
}
The String.valueOf call for Any uses Java's Object.toString on the Char instances, which may get boxed first. These extra ops might be the cause of the speed difference, versus the presumably shorter code paths of the Array builder.
This is a guess though, would have to measure.
Edit
After revising, the general point still stands, but I referred to the wrong implicits, since the toDigit methods return an Int sequence (or the like), not a transformed string as I misread.
toDigit uses LowPriorityImplicits.fallbackStringCanBuildFrom[T]: CanBuildFrom[String, T, immutable.IndexedSeq[T]], with T = Int, which just defers to a general IndexedSeq builder.
toDigitFast uses a direct Array implicit of type CanBuildFrom[Array[_], T, Array[T]], which is unarguably faster.
Passing the following CBF to toDigit explicitly puts the two methods on par:

import scala.collection.generic.CanBuildFrom

object FastStringToArrayBuild {
    def canBuildFrom[T: ClassManifest] = new CanBuildFrom[String, T, Array[T]] {
        private def newBuilder = scala.collection.mutable.ArrayBuilder.make[T]()
        def apply(from: String) = newBuilder
        def apply() = newBuilder
    }
}
You're being fooled by running out of memory. The toDigit version does create more intermediate objects, but if you have plenty of memory then the GC won't be heavily impacted (and it'll all run faster). For example, if instead of creating 1.2 million numbers once, I create 12k numbers 100x in a row, I get approximately equal times for the two methods. If I create 1.2k 5-digit numbers 1000x in a row, I find that toDigit is about 5% faster.
Given that the toDigit method produces an immutable collection, which is better when all else is equal since it is easier to reason about, and given that all else is equal for all but highly demanding tasks, I think the library is as it should be.
When trying to improve performance, of course one needs to keep all sorts of tricks in mind; one of these is that arrays have better memory characteristics for collections of known length than do the fancy collections in the Scala library. Also, one needs to know that map isn't the fastest way to get things done; if you really wanted this to be fast you should
final def toDigitReallyFast(num: Int, accum: Long = 0L, iter: Int = 0): Array[Byte] = {
    if (num == 0) {
        val ans = new Array[Byte](math.max(1, iter))
        var i = 0
        var ac = accum
        while (i < ans.length) {
            ans(ans.length - i - 1) = (ac & 0xF).toByte
            ac >>= 4
            i += 1
        }
        ans
    }
    else {
        val next = num / 10
        toDigitReallyFast(next, (accum << 4) | (num - 10 * next), iter + 1)
    }
}
which on my machine is 4x faster than either of the others. And you can get almost 3x faster yet again if you leave everything in a Long and pack the results into an array instead of using 1 to N:
final def toDigitExtremelyFast(num: Int, accum: Long = 0L, iter: Int = 0): Long = {
    if (num == 0) accum | (iter.toLong << 48)
    else {
        val next = num / 10
        toDigitExtremelyFast(next, accum | ((num - 10 * next).toLong << (4 * iter)), iter + 1)
    }
}

// loop, instead of 1 to N map, for the 1.2k number case
{
    var i = 10000
    val a = new Array[Long](1201)
    while (i <= 11200) {
        a(i - 10000) = toDigitExtremelyFast(i)
        i += 1
    }
    a
}
As with many things, performance tuning is highly dependent on exactly what you want to do. In contrast, library design has to balance many different concerns. I do think it's worth noticing where the library is sub-optimal with respect to performance, but this isn't really one of those cases IMO; the flexibility is worth it for the common use cases.

Tail recursion with Groovy

I coded 3 factorial algorithms:
The first is plainly recursive. I expect it to fail with a stack overflow, and it does. No problem.
In the second I try a tail-recursive call, converting the previous algorithm from plain recursion to tail recursion with an accumulator. It doesn't work, but I don't understand why.
In the third I use the trampoline() method and it works fine, as I expected.
def factorial
factorial = { BigInteger n ->
    if (n == 1) return 1
    n * factorial(n - 1)
}
factorial(1000) // stack overflow

factorial = { Integer n, BigInteger acc = 1 ->
    if (n == 1) return acc
    factorial(n - 1, n * acc)
}
factorial(1000) // stack overflow, why?

factorial = { Integer n, BigInteger acc = 1 ->
    if (n == 1) return acc
    factorial.trampoline(n - 1, n * acc)
}.trampoline()
factorial(1000) // It works.
There is no tail-call optimisation in Java, and hence there is none in Groovy either (without using something like trampoline(), as you have shown).
The closest thing I have seen to it is an AST transformation which cleverly wraps the tail recursion into a while loop.
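Conceptually, such a transformation rewrites the accumulator version into a loop along these lines (a hand-written sketch of the idea, not the transform's actual output):

BigInteger factorial(Integer n, BigInteger acc = 1) {
    while (true) {
        if (n == 1) return acc
        (n, acc) = [n - 1, n * acc]
    }
}

factorial(1000) // runs fine: no stack frame is consumed per "recursive" step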
Edit
You're right: Java (and hence Groovy) does support this sort of tail-recursive call; however, it doesn't seem to work with Closures...
This code however (using a method rather than a closure for the fact call):
public class Test {
    BigInteger fact(Integer a, BigInteger acc = 1) {
        if (a == 1) return acc
        fact(a - 1, a * acc)
    }

    static main(args) {
        def t = new Test()
        println "${t.fact(1000)}"
    }
}
When saved as Test.groovy and executed with groovy Test.groovy, it runs and prints the answer:
402387260077093773543702433923003985719374864210714632543799910429938512398629020592044208486969404800479988610197196058631666872994808558901323829669944590997424504087073759918823627727188732519779505950995276120874975462497043601418278094646496291056393887437886487337119181045825783647849977012476632889835955735432513185323958463075557409114262417474349347553428646576611667797396668820291207379143853719588249808126867838374559731746136085379534524221586593201928090878297308431392844403281231558611036976801357304216168747609675871348312025478589320767169132448426236131412508780208000261683151027341827977704784635868170164365024153691398281264810213092761244896359928705114964975419909342221566832572080821333186116811553615836546984046708975602900950537616475847728421889679646244945160765353408198901385442487984959953319101723355556602139450399736280750137837615307127761926849034352625200015888535147331611702103968175921510907788019393178114194545257223865541461062892187960223838971476088506276862967146674697562911234082439208160153780889893964518263243671616762179168909779911903754031274622289988005195444414282012187361745992642956581746628302955570299024324153181617210465832036786906117260158783520751516284225540265170483304226143974286933061690897968482590125458327168226458066526769958652682272807075781391858178889652208164348344825993266043367660176999612831860788386150279465955131156552036093988180612138558600301435694527224206344631797460594682573103790084024432438465657245014402821885252470935190620929023136493273497565513958720559654228749774011413346962715422845862377387538230483865688976461927383814900140767310446640259899490222221765904339901886018566526485061799702356193897017860040811889729918311021171229845901641921068884387121855646124960798722908519296819372388642614839657382291123125024186649353143970137428531926649875337218940694281434118520158014123344828015051399694290153483077644569099073152433278288269864602789864321139083506217095002597389863554277196742822248757586765752344220207573630569498825087968928162753848863396909959826280956121450994871701244516461260379029309120889086942028510640182154399457156805941872748998094254742173582401063677404595741785160829230135358081840096996372524230560855903700624271243416909004153690105933983835777939410970027753472000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
As a guess, I would say that the JVM does not know how to optimise closures (like it does with methods), so this tail call does not get optimised out in the bytecode before it is executed
Starting with version 2.3, Groovy supports tail recursion with the @TailRecursive annotation for methods:
http://java.dzone.com/articles/groovy-goodness-more-efficient
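For example (a minimal sketch; the annotation applies to methods, not closures):

import groovy.transform.TailRecursive

@TailRecursive
BigInteger factorial(BigInteger n, BigInteger acc = 1G) {
    n <= 1G ? acc : factorial(n - 1G, n * acc)
}

println factorial(1000G) // no trampoline needed, no stack overflow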
