stack overflow while calculing permutations or a string in Scala

stack overflow while calculing permutations or a string in Scala - string

All is in the title, and the code is here:
implicit class utils(val chaîne: String) {
def permutations1(): List[String] = {
if (chaîne.length() == 0) List()
else
if (chaîne.length() == 1) List(chaîne)
else {
val retour1=for {i:Int <- 0 to chaîne.length() - 2
chaîne_réduite = chaîne.drop(i)
liste_avec_chaîne_réduite = chaîne_réduite.permutations1()
une_chaîne_réduite_et_permutée <- liste_avec_chaîne_réduite
j <- 0 to une_chaîne_réduite_et_permutée.length()
}
yield new StringBuilder(une_chaîne_réduite_et_permutée).insert(j, chaîne(j)).toString
retour1.toList
}
}
}
Can you explain me why it does not work and eventually correct my code to make it avoid the stack overflow?

Isn't the problem NP-complete? Thus you may only run any code with very limited length of strings.
To make it work on a reasonable string lengths careful optimization is required. For instance, to improve the performance you may try #tailrec optimization.
Representation in the form of String and StringBuilder is very inefficient for the task. Try List of Char for instance.

I found an answer myself:
implicit class utils (val chaîne: String) {
def permutations1 : Seq [String] = {
if (chaîne.size == 1) Seq (chaîne)
else chaîne.flatMap (x => chaîne.filterNot (_ == x).permutations1.map (x + _))
}
}

Related

Print characters at even and odd indices from a String

Using scala, how to print string in even and odd indices of a given string? I am aware of the imperative approach using var. I am looking for an approach that uses immutability, avoids side-effects (of course, until need to print result) and concise.

Here is a tail-recursive solution returning even and odd chars (List[Char], List[Char]) in one go
def f(in: String): (List[Char], List[Char]) = {
#tailrec def run(s: String, idx: Int, accEven: List[Char], accOdd: List[Char]): (List[Char], List[Char]) = {
if (idx < 0) (accEven, accOdd)
else if (idx % 2 == 0) run(s, idx - 1, s.charAt(idx) :: accEven, accOdd)
else run(s, idx - 1, accEven, s.charAt(idx) :: accOdd)
}
run(in, in.length - 1, Nil, Nil)
}
which could be printed like so
val (even, odd) = f("abcdefg")
println(even.mkString)

Another way to explore is using zipWithIndex
def printer(evenOdd: Int) {
val str = "1234"
str.zipWithIndex.foreach { i =>
i._2 % 2 match {
case x if x == evenOdd => print(i._1)
case _ =>
}
}
}
In this case you can check the results by using the printer function
scala> printer(1)
24
scala> printer(0)
13
.zipWithIndex takes a List and returns tuples of the elements coupled with their index. Knowing that a String is a list of Char
Looking at str
scala> val str = "1234"
str: String = 1234
str.zipWithIndex
res: scala.collection.immutable.IndexedSeq[(Char, Int)] = Vector((1,0), (2,1), (3,2), (4,3))
Lastly, as you only need to print, using foreach instead of map is more ideal as you aren't expecting values to be returned

You can use the sliding function, which is quite simple:
scala> "abcdefgh".sliding(1,2).mkString("")
res16: String = aceg
scala> "abcdefgh".tail.sliding(1,2).mkString("")
res17: String = bdfh

val s = "abcd"
// ac
(0 until s.length by 2).map(i => s(i))
// bd
(1 until s.length by 2).map(i => s(i))
just pure functions with map operator

Parallel Merge Sort in Scala

I have been trying to implement parallel merge sort in Scala. But with 8 cores, using .sorted is still about twice as fast.
edit:
I rewrote most of the code to minimize object creation. Now it runs about as fast as the .sorted
Input file with 1.2M integers:
1.333580 seconds (my implementation)
1.439293 seconds (.sorted)
How should I parallelize this?
New implementation
object Mergesort extends App
{
//=====================================================================================================================
// UTILITY
implicit object comp extends Ordering[Any] {
def compare(a: Any, b: Any) = {
(a, b) match {
case (a: Int, b: Int) => a compare b
case (a: String, b: String) => a compare b
case _ => 0
}
}
}
//=====================================================================================================================
// MERGESORT
val THRESHOLD = 30
def inssort[A](a: Array[A], left: Int, right: Int): Array[A] = {
for (i <- (left+1) until right) {
var j = i
val item = a(j)
while (j > left && comp.lt(item,a(j-1))) {
a(j) = a(j-1)
j -= 1
}
a(j) = item
}
a
}
def mergesort_merge[A](a: Array[A], temp: Array[A], left: Int, right: Int, mid: Int) : Array[A] = {
var i = left
var j = right
while (i < mid) { temp(i) = a(i); i+=1; }
while (j > mid) { temp(i) = a(j-1); i+=1; j-=1; }
i = left
j = right-1
var k = left
while (k < right) {
if (comp.lt(temp(i), temp(j))) { a(k) = temp(i); i+=1; k+=1; }
else { a(k) = temp(j); j-=1; k+=1; }
}
a
}
def mergesort_split[A](a: Array[A], temp: Array[A], left: Int, right: Int): Array[A] = {
if (right-left == 1) a
if ((right-left) > THRESHOLD) {
val mid = (left+right)/2
mergesort_split(a, temp, left, mid)
mergesort_split(a, temp, mid, right)
mergesort_merge(a, temp, left, right, mid)
}
else
inssort(a, left, right)
}
def mergesort[A: ClassTag](a: Array[A]): Array[A] = {
val temp = new Array[A](a.size)
mergesort_split(a, temp, 0, a.size)
}
Previous implementation
Input file with 1.2M integers:
4.269937 seconds (my implementation)
1.831767 seconds (.sorted)
What sort of tricks there are to make it faster and cleaner?
object Mergesort extends App
{
//=====================================================================================================================
// UTILITY
val StartNano = System.nanoTime
def dbg(msg: String) = println("%05d DBG ".format(((System.nanoTime - StartNano)/1e6).toInt) + msg)
def time[T](work: =>T) = {
val start = System.nanoTime
val res = work
println("%f seconds".format((System.nanoTime - start)/1e9))
res
}
implicit object comp extends Ordering[Any] {
def compare(a: Any, b: Any) = {
(a, b) match {
case (a: Int, b: Int) => a compare b
case (a: String, b: String) => a compare b
case _ => 0
}
}
}
//=====================================================================================================================
// MERGESORT
def merge[A](left: List[A], right: List[A]): Stream[A] = (left, right) match {
case (x :: xs, y :: ys) if comp.lteq(x, y) => x #:: merge(xs, right)
case (x :: xs, y :: ys) => y #:: merge(left, ys)
case _ => if (left.isEmpty) right.toStream else left.toStream
}
def sort[A](input: List[A], length: Int): List[A] = {
if (length < 100) return input.sortWith(comp.lt)
input match {
case Nil | List(_) => input
case _ =>
val middle = length / 2
val (left, right) = input splitAt middle
merge(sort(left, middle), sort(right, middle + length%2)).toList
}
}
def msort[A](input: List[A]): List[A] = sort(input, input.length)
//=====================================================================================================================
// PARALLELIZATION
//val cores = Runtime.getRuntime.availableProcessors
//dbg("Detected %d cores.".format(cores))
//lazy implicit val ec = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(cores))
def futuremerge[A](fa: Future[List[A]], fb: Future[List[A]])(implicit order: Ordering[A], ec: ExecutionContext) =
{
for {
a <- fa
b <- fb
} yield merge(a, b).toList
}
def parallel_msort[A](input: List[A], length: Int)(implicit order: Ordering[A]): Future[List[A]] = {
val middle = length / 2
val (left, right) = input splitAt middle
if(length > 500) {
val fl = parallel_msort(left, middle)
val fr = parallel_msort(right, middle + length%2)
futuremerge(fl, fr)
}
else {
Future(msort(input))
}
}
//=====================================================================================================================
// MAIN
val results = time({
val src = Source.fromFile("in.txt").getLines
val header = src.next.split(" ").toVector
val lines = if (header(0) == "i") src.map(_.toInt).toList else src.toList
val f = parallel_msort(lines, lines.length)
Await.result(f, concurrent.duration.Duration.Inf)
})
println("Sorted as comparison...")
val sorted_src = Source.fromFile(input_folder+"in.txt").getLines
sorted_src.next
time(sorted_src.toList.sorted)
val writer = new PrintWriter("out.txt", "UTF-8")
try writer.print(results.mkString("\n"))
finally writer.close
}

My answer is probably going to be a bit long, but i hope that it will be useful for both you and me.
So, first question is: "how scala is doing sorting for a List?" Let's have a look at the code from scala repo!
def sorted[B >: A](implicit ord: Ordering[B]): Repr = {
val len = this.length
val b = newBuilder
if (len == 1) b ++= this
else if (len > 1) {
b.sizeHint(len)
val arr = new Array[AnyRef](len) // Previously used ArraySeq for more compact but slower code
var i = 0
for (x <- this) {
arr(i) = x.asInstanceOf[AnyRef]
i += 1
}
java.util.Arrays.sort(arr, ord.asInstanceOf[Ordering[Object]])
i = 0
while (i < arr.length) {
b += arr(i).asInstanceOf[A]
i += 1
}
}
b.result()
}
So what the hell is going on here? Long story short: with java. Everything else is just size justification and casting. Basically this is the line which defines it:
java.util.Arrays.sort(arr, ord.asInstanceOf[Ordering[Object]])
Let's go one level deeper into JDK sources:
public static <T> void sort(T[] a, Comparator<? super T> c) {
if (c == null) {
sort(a);
} else {
if (LegacyMergeSort.userRequested)
legacyMergeSort(a, c);
else
TimSort.sort(a, 0, a.length, c, null, 0, 0);
}
}
legacyMergeSort is nothing but single threaded implementation of merge sort algorithm.
The next question is: "what is TimSort.sort and when do we use it?"
To my best knowledge default value for this property is false, which leads us to TimSort.sort algorithm. Description can be found here. Why is it better? Less comparisons that in merge sort according to comments in JDK sources.
Moreover you should be aware that it is all single threaded, so no parallelization here.
Third question, "your code":
You create too many objects. When it comes to performance, mutation (sadly) is your friend.
Premature optimization is the root of all evil -- Donald Knuth. Before making any optimizations (like parallelism), try to implement single threaded version and compare the results.
Use something like JMH to test performance of your code.
You should not probably use Stream class if you want to have the best performance as it does additional caching.
I intentionally did not give you answer like "super-fast merge sort in scala can be found here", but just some tips for you to apply to your code and coding practices.
Hope it will help you.

How can I translate this Groovy function to C#?

I would like to translate to C# the following Groovy code
def find_perfect_numbers(def number) {
(2..number).findAll { x-> (1..(x/2)).findAll{ x % it == 0 }.sum() == x }
}
which I got from here.
This is what I have, but it's not ready yet, doesn't compile either. I don't understand the groovy code good enough.
public List<int> find_perfect_numbers(int number)
{
List<int> lst = new List<int>();
lst = 2.To(number).FindAll(x => (1.To(x/2)).FindAll( x % it == 0).Sum() == x);
return lst;
}
I can't translate the part x % it == 0 (because "it" is an index).
I want the C# code to look as much like the groovy function as possible. Specifically, the line lst = 2.To( .....
I don't want to use a different solution to find perfect numbers (I have another working function already). For me this is only about the syntax, not about a good "perfect numbers function".
It's OK to create new (extension) functions that help doing this, just like the To function I used:
For the To function above I have used this StackOverflow function:
Generating sets of integers in C#
and changed it a little so that it returns a List of int instead of an array of int
public static class ListExtensions
{
public static List<int> To(this int start, int end)
{
return Enumerable.Range(start, end - start + 1).ToList();
}
}
Can anyone help me?
=== Update ===
This is what I have now, but it's not working yet, I get
DivideByZeroException was unhandled at the part s.value % s.idx == 0:
lst = 2.To(number).FindAll(x => ((1.To(x / 2)).Select((y, index) => new {value = y, idx = index}).Where( s => s.value % s.idx == 0).Sum(t => t.value) == (decimal)x));

I found it myself.
lst = 2.To(number)
.FindAll(x => ((1.To(x / 2))
.Select((y, index) => new {value = y, idx = index+1})
.Where( s => x % s.idx == 0)
.Sum(t => t.value) == (decimal)x));
Not as pretty as the Groovy one, but it works.

Reading Multiple Inputs from the Same Line the Scala Way

I tried to use readInt() to read two integers from the same line but that is not how it works.
val x = readInt()
val y = readInt()
With an input of 1 727 I get the following exception at runtime:
Exception in thread "main" java.lang.NumberFormatException: For input string: "1 727"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:231)
at scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
at scala.Console$.readInt(Console.scala:356)
at scala.Predef$.readInt(Predef.scala:201)
at Main$$anonfun$main$1.apply$mcVI$sp(Main.scala:11)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:75)
at Main$.main(Main.scala:10)
at Main.main(Main.scala)
I got the program to work by using readf but it seems pretty awkward and ugly to me:
val (x,y) = readf2("{0,number} {1,number}")
val a = x.asInstanceOf[Int]
val b = y.asInstanceOf[Int]
println(function(a,b))
Someone suggested that I just use Java's Scanner class, (Scanner.nextInt()) but is there a nice idiomatic way to do it in Scala?
Edit:
My solution following paradigmatic's example:
val Array(a,b) = readLine().split(" ").map(_.toInt)
Followup Question: If there were a mix of types in the String how would you extract it? (Say a word, an int and a percentage as a Double)

If you mean how would you convert val s = "Hello 69 13.5%" into a (String, Int, Double) then the most obvious way is
val tokens = s.split(" ")
(tokens(0).toString,
tokens(1).toInt,
tokens(2).init.toDouble / 100)
// (java.lang.String, Int, Double) = (Hello,69,0.135)
Or as mentioned you could match using a regex:
val R = """(.*) (\d+) (\d*\.?\d*)%""".r
s match {
case R(str, int, dbl) => (str, int.toInt, dbl.toDouble / 100)
}
If you don't actually know what data is going to be in the String, then there probably isn't much reason to convert it from a String to the type it represents, since how can you use something that might be a String and might be in Int? Still, you could do something like this:
val int = """(\d+)""".r
val pct = """(\d*\.?\d*)%""".r
val res = s.split(" ").map {
case int(x) => x.toInt
case pct(x) => x.toDouble / 100
case str => str
} // Array[Any] = Array(Hello, 69, 0.135)
now to do anything useful you'll need to match on your values by type:
res.map {
case x: Int => println("It's an Int!")
case x: Double => println("It's a Double!")
case x: String => println("It's a String!")
case _ => println("It's a Fail!")
}
Or if you wanted to take things a bit further, you could define some extractors which will do the conversion for you:
abstract class StringExtractor[A] {
def conversion(s: String): A
def unapply(s: String): Option[A] = try { Some(conversion(s)) }
catch { case _ => None }
}
val intEx = new StringExtractor[Int] {
def conversion(s: String) = s.toInt
}
val pctEx = new StringExtractor[Double] {
val pct = """(\d*\.?\d*)%""".r
def conversion(s: String) = s match { case pct(x) => x.toDouble / 100 }
}
and use:
"Hello 69 13.5%".split(" ").map {
case intEx(x) => println(x + " is Int: " + x.isInstanceOf[Int])
case pctEx(x) => println(x + " is Double: " + x.isInstanceOf[Double])
case str => println(str)
}
prints
Hello
69 is Int: true
0.135 is Double: true
Of course, you can make the extrators match on anything you want (currency mnemonic, name begging with 'J', URL) and return whatever type you want. You're not limited to matching Strings either, if instead of StringExtractor[A] you make it Extractor[A, B].

You can read the line as a whole, split it using spaces and then convert each element (or the one you want) to ints:
scala> "1 727".split(" ").map( _.toInt )
res1: Array[Int] = Array(1, 727)
For most complex inputs, you can have a look at parser combinators.

The input you are describing is not two Ints but a String which just happens to be two Ints. Hence you need to read the String, split by the space and convert the individual Strings into Ints as suggested by #paradigmatic.

One way would be splitting and mapping:
// Assuming whatever is being read is assigned to "input"
val input = "1 727"
val Array(x, y) = input split " " map (_.toInt)
Or, if you have things a bit more complicated than that, a regular expression is usually good enough.
val twoInts = """^\s*(\d+)\s*(\d+)""".r
val Some((x, y)) = for (twoInts(a, b) <- twoInts findFirstIn input) yield (a, b)
There are other ways to use regex. See the Scala API docs about them.
Anyway, if regex patterns are becoming too complicated, then you should appeal to Scala Parser Combinators. Since you can combine both, you don't loose any of regex's power.
import scala.util.parsing.combinator._
object MyParser extends JavaTokenParsers {
def twoInts = wholeNumber ~ wholeNumber ^^ { case a ~ b => (a.toInt, b.toInt) }
}
val MyParser.Success((x, y), _) = MyParser.parse(MyParser.twoInts, input)
The first example was more simple, but harder to adapt to more complex patterns, and more vulnerable to invalid input.

I find that extractors provide some machinery that makes this type of processing nicer. And I think it works up to a certain point nicely.
object Tokens {
def unapplySeq(line: String): Option[Seq[String]] =
Some(line.split("\\s+").toSeq)
}
class RegexToken[T](pattern: String, convert: (String) => T) {
val pat = pattern.r
def unapply(token: String): Option[T] = token match {
case pat(s) => Some(convert(s))
case _ => None
}
}
object IntInput extends RegexToken[Int]("^([0-9]+)$", _.toInt)
object Word extends RegexToken[String]("^([A-Za-z]+)$", identity)
object Percent extends RegexToken[Double](
"""^([0-9]+\.?[0-9]*)%$""", _.toDouble / 100)
Now how to use:
List("1 727", "uptime 365 99.999%") collect {
case Tokens(IntInput(x), IntInput(y)) => "sum " + (x + y)
case Tokens(Word(w), IntInput(i), Percent(p)) => w + " " + (i * p)
}
// List[java.lang.String] = List(sum 728, uptime 364.99634999999995)
To use for reading lines at the console:
Iterator.continually(readLine("prompt> ")).collect{
case Tokens(IntInput(x), IntInput(y)) => "sum " + (x + y)
case Tokens(Word(w), IntInput(i), Percent(p)) => w + " " + (i * p)
case Tokens(Word("done")) => "done"
}.takeWhile(_ != "done").foreach(println)
// type any input and enter, type "done" and enter to finish
The nice thing about extractors and pattern matching is that you can add case clauses as necessary, you can use Tokens(a, b, _*) to ignore some tokens. I think they combine together nicely (for instance with literals as I did with done).

Tail recursion with Groovy

I coded 3 factorial algorithms:
I expect to fail by stack overflow. No problem.
I try a tail recursive call, and convert the previous algorithm from recursive to iterative. It doesn't work, but I don't understand why.
I use trampoline() method and it works fine as I expect.
def factorial
factorial = { BigInteger n ->
if (n == 1) return 1
n * factorial(n - 1)
}
factorial(1000) // stack overflow
factorial = { Integer n, BigInteger acc = 1 ->
if (n == 1) return acc
factorial(n - 1, n * acc)
}
factorial(1000) // stack overflow, why?
factorial = { Integer n, BigInteger acc = 1 ->
if (n == 1) return acc
factorial.trampoline(n - 1, n * acc)
}.trampoline()
factorial(1000) // It works.

There is no tail recursion in Java, and hence there is none in Groovy either (without using something like trampoline() as you have shown)
The closest I have seen to this, is an AST transformation which cleverly wraps the return recursion into a while loop
Edit
You're right, Java (and hence Groovy) do support this sort of tail-call iteration, however, it doesn't seem to work with Closures...
This code however (using a method rather than a closure for the fact call):
public class Test {
BigInteger fact( Integer a, BigInteger acc = 1 ) {
if( a == 1 ) return acc
fact( a - 1, a * acc )
}
static main( args ) {
def t = new Test()
println "${t.fact( 1000 )}"
}
}
When saved as Test.groovy and executed with groovy Test.groovy runs, and prints the answer:
402387260077093773543702433923003985719374864210714632543799910429938512398629020592044208486969404800479988610197196058631666872994808558901323829669944590997424504087073759918823627727188732519779505950995276120874975462497043601418278094646496291056393887437886487337119181045825783647849977012476632889835955735432513185323958463075557409114262417474349347553428646576611667797396668820291207379143853719588249808126867838374559731746136085379534524221586593201928090878297308431392844403281231558611036976801357304216168747609675871348312025478589320767169132448426236131412508780208000261683151027341827977704784635868170164365024153691398281264810213092761244896359928705114964975419909342221566832572080821333186116811553615836546984046708975602900950537616475847728421889679646244945160765353408198901385442487984959953319101723355556602139450399736280750137837615307127761926849034352625200015888535147331611702103968175921510907788019393178114194545257223865541461062892187960223838971476088506276862967146674697562911234082439208160153780889893964518263243671616762179168909779911903754031274622289988005195444414282012187361745992642956581746628302955570299024324153181617210465832036786906117260158783520751516284225540265170483304226143974286933061690897968482590125458327168226458066526769958652682272807075781391858178889652208164348344825993266043367660176999612831860788386150279465955131156552036093988180612138558600301435694527224206344631797460594682573103790084024432438465657245014402821885252470935190620929023136493273497565513958720559654228749774011413346962715422845862377387538230483865688976461927383814900140767310446640259899490222221765904339901886018566526485061799702356193897017860040811889729918311021171229845901641921068884387121855646124960798722908519296819372388642614839657382291123125024186649353143970137428531926649875337218940694281434118520158014123344828015051399694290153483077644569099073152433278288269864602789864321139083506217095002597389863554277196742822248757586765752344220207573630569498825087968928162753848863396909959826280956121450994871701244516461260379029309120889086942028510640182154399457156805941872748998094254742173582401063677404595741785160829230135358081840096996372524230560855903700624271243416909004153690105933983835777939410970027753472000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
As a guess, I would say that the JVM does not know how to optimise closures (like it does with methods), so this tail call does not get optimised out in the bytecode before it is executed

Starting with version 2.3 Groovy supports tail recursion with the #TailRecursive annotation for methods:
http://java.dzone.com/articles/groovy-goodness-more-efficient

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

stack overflow while calculing permutations or a string in Scala - string

I found an answer myself: implicit class utils (val chaîne: String) { def permutations1 : Seq [String] = { if (chaîne.size == 1) Seq (chaîne) else chaîne.flatMap (x => chaîne.filterNot (_ == x).permutations1.map (x + _)) } }

Related

Print characters at even and odd indices from a String

Parallel Merge Sort in Scala

How can I translate this Groovy function to C#?

Reading Multiple Inputs from the Same Line the Scala Way

Tail recursion with Groovy

Categories

Resources