Haskell multiple statement efficiency


Is this an efficient way to check multiple conditions in Haskell? Or is there a better way?
case ((x > -10) && (x < 20), x /= 9, (x `mod` 2) == 0, x) of
  (False,_,_,_) -> error "Not in range"
  (_,False,_,_) -> error "Must not be 9"
  (_,_,False,_) -> error "Must be even"
  (True,True,True,10) -> stuff ()
  (True,True,True,20) -> stuff ()
  _ -> error "Error Message"

It's sometimes difficult to come up with small examples of this problem which don't look contrived, but they do happen. Sometimes you need a bunch of computed results to figure out how to split a function into its cases.
So yes, I often find it's cleanest to use case on a tuple of things-I-might-care-about to build complex decision processes. I trust laziness to compute the minimum required to resolve which branch to invoke.
It's worth trying to express your tests via Boolean guards (or even pattern guards), but sometimes there's nothing to beat tabulating the computed values you need in a big tuple and then writing a row for each interesting combination of circumstances.
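As a point of comparison, here is a minimal sketch of the Boolean-guard style mentioned above, applied to the checks from the question. The name classify and the use of Either String instead of error are assumptions made for illustration; they are not part of the original code. Note that the question's range test (x < 20) already rejects 20, so only 10 can reach the success branch.

```haskell
-- Hypothetical function: the question's checks written as guards.
-- Guards are tried top to bottom, so the first failing check decides
-- the result and later conditions are never evaluated.
classify :: Int -> Either String Int
classify x
  | x <= -10 || x >= 20 = Left "Not in range"
  | x == 9              = Left "Must not be 9"
  | odd x               = Left "Must be even"
  | x == 10             = Right x
  | otherwise           = Left "Error Message"
```

Returning Either rather than calling error keeps the function total, which makes it easier to test each branch.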

Assuming that caring about efficiency is really important, and is not premature optimization, you should optimize for the case which is most common; I think that even in Haskell, it means that you want to have the True,True,True cases on top.
Actually, in the given case, if x == 10 or x == 20 you don't need to do the other tests; you don't even need to build the thunks that compute them. The compiler cannot know (without profile-guided optimization) which code path will be executed most often, while you should have a reasonable guess (in general you need profiling to verify it). Note, however, that the original range test x < 20 rejects 20 as "Not in range", so matching on 20 first changes the result for that input.
So what you want is something like the following (untested):
case x of
  10 -> stuff ()
  20 -> stuff ()
  _ -> case ((x > -10) && (x < 20), x /= 9, (x `mod` 2) == 0) of
    (False,_,_) -> error "Not in range"
    (_,False,_) -> error "Must not be 9"
    (_,_,False) -> error "Must be even"
    _ -> error "Error Message"
Disclaimer: I did not verify what happens to this code and to the original one after all optimizations.

How's this? You're checking the conditions in order, and returning something on the first to fail, so make the conditions into a list and search through it.
fn x = case lookup False conds of
    Just ohno -> error ohno
    Nothing
      | x == 10 -> stuff
      | x == 20 -> stuff
      | otherwise -> error "Error Message"
  where
    conds = [
      (x > -10 && x < 20, "Not in range"),
      (x /= 9, "Must not be 9"),
      (even x, "Must be even")]

Related

Idiomatic formatting of error messages and other complex strings [closed]

When creating a command-line app, one usually has to do some kind of parsing of command-line arguments, and print an error message if a different number of arguments is expected, or they do not make sense. For the sake of simplicity let's say that a program takes a positive integer as its only argument. Parsing and further program execution in Haskell can be done like this:
import System.Environment (getArgs)
import System.Exit (die)
import Text.Read (readMaybe)

main :: IO ()
main = do
  args <- getArgs
  case args of
    [arg] -> case readMaybe arg :: Maybe Int of
      Just n | n > 0 -> runProg n
      Just n -> die $ "expected a positive integer (got: " <> show n <> ")"
      Nothing -> die $ "expected an integer (got: " <> arg <> ")"
    _ -> die $ "expected exactly one argument (got: " <> show (length args) <> ")"
Creation of appropriate error message feels clunky to me, especially combined with show anywhere I want to include a non-string argument. There is printf but this on the other hand feels... not Haskell-y. What would be the idiomatic approach here? Perhaps my bias against the methods I listed is unjustified and it is, in fact, idiomatic Haskell?

As per the comment, if you're actually parsing command line arguments, you probably want to use optparse-applicative (or maybe optparse).
More generally, I think a reasonably idiomatic way of constructing complex error messages in Haskell is to represent the errors with an algebraic data type:
data OptError
  = BadArgCount Int Int -- expected, actual
  | NotInteger String
  | NotPositive Int
supply a pretty-printer:
errorMessage :: OptError -> String
errorMessage (BadArgCount exp act) = "expected " <> show exp
  <> " arguments, got " <> show act
errorMessage (NotInteger str) = "expected integer, got " <> show str
errorMessage (NotPositive n) = "expected positive integer, got " <> show n
and perform the processing in a monad that supports throwing errors:
data Args = Args Int
processArgs :: [String] -> Either OptError Args
processArgs [x] = case readMaybe x of
  Just n | n > 0 -> pure $ Args n
         | otherwise -> throwError $ NotPositive n
  Nothing -> throwError $ NotInteger x
processArgs xs = throwError $ BadArgCount 1 (length xs)
This is certainly overkill for argument processing in a small command-line utility, but it works well in other contexts that demand complex error reporting, and it has several advantages over the die ... approach:
All the error messages are tabulated in one place, so you know exactly what errors the processArgs function can throw.
Error construction is type checked, reducing the potential for errors in your error handling code.
Error reporting is separated from error rendering. This is useful for internationalization, separate error reporting styles for terminal and non-terminal output, reuse of the functions in driver code that wants to handle errors itself, etc. It's also more ergonomic for development, since you don't have to take a break from "real coding" to make up a sensible error message. This typically results in better error reporting in the final product, since it encourages you to write a clear, consistent set of error messages all at once, after the core logic is finished.
It facilitates refactoring the errors systematically, for example to add location information (not relevant for command line arguments, but relevant for errors in input files, for example), or to add hints/recommendations for correction.
It's relatively easy to define a custom monad that also supports warnings and "non-fatal" errors that allow further error checking to continue, generating a list of errors all at once, instead of failing after the first error.
I haven't used this approach for command line arguments, since I usually use optparse-applicative. But, I have used it when coding up interpreters.
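For reference, the pieces above can be assembled into one self-contained module. This is only a sketch under a few assumptions: it uses Left and Right directly instead of throwError and pure so that it needs nothing beyond base, it adds deriving clauses so results can be compared and printed, and renderResult is a hypothetical helper that is not part of the answer.

```haskell
import Text.Read (readMaybe)

-- The error type from the answer, with derived instances added here.
data OptError
  = BadArgCount Int Int  -- expected, actual
  | NotInteger String
  | NotPositive Int
  deriving (Eq, Show)

newtype Args = Args Int
  deriving (Eq, Show)

-- Rendering is kept separate from error construction.
errorMessage :: OptError -> String
errorMessage (BadArgCount expected actual) =
  "expected " <> show expected <> " arguments, got " <> show actual
errorMessage (NotInteger str) = "expected integer, got " <> show str
errorMessage (NotPositive n) = "expected positive integer, got " <> show n

-- Left plays the role of throwError, Right the role of pure.
processArgs :: [String] -> Either OptError Args
processArgs [x] = case readMaybe x of
  Just n
    | n > 0     -> Right (Args n)
    | otherwise -> Left (NotPositive n)
  Nothing -> Left (NotInteger x)
processArgs xs = Left (BadArgCount 1 (length xs))

-- Hypothetical helper: turn either outcome into a printable line.
renderResult :: [String] -> String
renderResult = either errorMessage (\(Args n) -> "ok: " <> show n) . processArgs
```

A driver could then call renderResult on the result of getArgs, or pattern match on the Either itself when it wants to handle particular errors differently.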

Can I use where in Haskell to find function parameter given the function output?

This is my program:
modify :: Integer -> Integer
modify a = a + 100
x = x where modify(x) = 101
In ghci, this compiles successfully but when I try to print x the terminal gets stuck. Is it not possible to find input from function output in Haskell?

x = x where modify(x) = 101
is valid syntax but is equivalent to
x = x where f y = 101
where x = x is a recursive definition, which will get stuck in an infinite loop (or generate a <<loop>> exception), and f y = 101 is a definition of a local function, completely unrelated to the modify function defined elsewhere.
If you turn on warnings (for example with -Wall, which includes -Wname-shadowing) you should get a warning saying that the local binding for modify shadows the existing top-level binding, pointing at the issue.
Further, there is no general way to invert a function as you'd like. First, the function might not be injective. Second, even if it were injective, there is no easy way to invert an arbitrary function. We could try all possible inputs, but that would be extremely inefficient.
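To make the "try all the possible inputs" idea concrete, here is a minimal sketch of such a brute-force search. The helper invertOver and its caller-supplied candidate list are assumptions for illustration, not a standard library function. It returns the first candidate that maps to the requested output, which also shows why a non-injective function has no unique inverse.

```haskell
import Data.List (find)

modify :: Integer -> Integer
modify a = a + 100

-- Hypothetical helper: search the given candidates for an input whose
-- image under f equals y. Cost is linear in the number of candidates,
-- and on an infinite candidate list it never returns if no input matches.
invertOver :: Eq b => [a] -> (a -> b) -> b -> Maybe a
invertOver candidates f y = find (\a -> f a == y) candidates
```

For example, invertOver [0 .. 1000] modify 101 searches the inputs 0 to 1000 and finds 1, while a target outside the image of that range yields Nothing.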

G-machine, (non-)strict contexts - why case expressions need special treatment

I'm currently reading Implementing functional languages: a tutorial by SPJ and the (sub)chapter I'll be referring to in this question is 3.8.7 (page 136).
The first remark there is that a reader following the tutorial has not yet implemented C scheme compilation (that is, of expressions appearing in non-strict contexts) of ECase expressions.
The solution proposed is to transform a Core program so that ECase expressions simply never appear in non-strict contexts. Specifically, each such occurrence creates a new supercombinator with exactly one variable, whose body corresponds to the original ECase expression, and the occurrence itself is replaced with a call to that supercombinator.
Below I present a (slightly modified) example of such a transformation from the tutorial:
t a b = Pack{2,1} ;
f x = Pack{2,2} (case t x 7 6 of
<1> -> 1;
<2> -> 2) Pack{1,0} ;
main = f 3
== transformed into ==>
t a b = Pack{2,1} ;
f x = Pack{2,2} ($Case1 (t x 7 6)) Pack{1,0} ;
$Case1 x = case x of
<1> -> 1;
<2> -> 2 ;
main = f 3
I implemented this solution and it works like a charm, that is, the output is Pack{2,2} 2 Pack{1,0}.
However, what I don't understand is: why all that trouble? I hope it's not just me, but my first thought for solving the problem was to simply implement compilation of ECase expressions in the C scheme. I did it by mimicking the rule for compilation in the E scheme (page 134 of the tutorial; I present that rule here for completeness). So I used
E[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]
and wrote
C[[case e of alts]] p = C[[e]] p ++ [Eval] ++ [Casejump D[[alts]] p]
I added [Eval] because Casejump needs an argument on top of the stack in weak head normal form (WHNF) and C scheme doesn't guarantee that, as opposed to E scheme.
But then the output changes to the enigmatic Pack{2,2} 2 6.
The same applies when I use the same rule as for E scheme, i.e.
C[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]
So I guess that my "obvious" solution is inherently wrong - and I can see that from outputs. But I'm having trouble stating formal arguments as to why that approach was bound to fail.
Can someone provide me with such argument/proof or some intuition as to why the naive approach doesn't work?

The purpose of the C scheme is to not perform any computation, but just delay everything until an EVAL happens (which it might or might not). What are you doing in your proposed code generation for case? You're calling EVAL! And the whole purpose of C is to not call EVAL on anything, so you've now evaluated something prematurely.
The only way you could generate code directly for case in the C scheme would be to add some new instruction to perform the case analysis once it's evaluated.
But we (Thomas Johnsson and I) decided it was simpler to just lift out such expressions. The exact historical details are lost in time though. :)

haskell: factors of a natural number

I'm trying to write a function in Haskell that calculates all factors of a given number except itself.
The result should look something like this:
factorlist 15 => [1,3,5]
I'm new to Haskell and the whole subject of recursion, which I'm pretty sure I'm supposed to apply in this example, but I don't know where or how.
My idea was to compare the given number with the first element of a list from 1 to n `div` 2 using the mod function, but somehow recursively, and if the result is 0 then I add the number to a new list. (I hope this makes sense.)
I would appreciate any help on this matter.
Here is my code so far (it doesn't work, but it should illustrate my idea):
factorList :: Int -> [Int]
factorList n |n `mod` head [1..n`div`2] == 0 = x:[]

There are several ways to handle this. But first of all, let's write a small helper:
isFactorOf :: Integral a => a -> a -> Bool
isFactorOf x n = n `mod` x == 0
That way we can write 12 `isFactorOf` 24 and get either True or False. For the recursive part, let's assume that we use a function with two arguments: the first being the number we want to factorize, the second the factor we're currently testing. We only test factors less than or equal to n `div` 2, which leads to:
createList n f | f <= n `div` 2 = if f `isFactorOf` n
                                   then f : next
                                   else next
               | otherwise      = []
  where next = createList n (f + 1)
So if the second parameter is a factor of n, we add it onto the list and proceed, otherwise we just proceed. We do this only as long as f <= n `div` 2. Now in order to create factorList, we can simply use createList with a sufficient second parameter:
factorList n = createList n 1
The recursion is hidden in createList. As such, createList is a worker, and you could hide it in a where inside of factorList.
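Hiding the worker in a where clause could look like the following sketch. The local name go is an assumption (a common convention for local workers), and the divisibility test is inlined so the block stands alone; otherwise it follows the same recursion as createList.

```haskell
-- All factors of n except n itself, using a local recursive worker.
factorList :: Int -> [Int]
factorList n = go 1
  where
    -- go f tests the candidate f and recurses on f + 1;
    -- n is in scope here, so it need not be passed along.
    go f
      | f > n `div` 2  = []
      | n `mod` f == 0 = f : go (f + 1)
      | otherwise      = go (f + 1)
```

Because go is local, callers only ever see factorList and cannot start the recursion from an arbitrary candidate.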
Note that one could easily define factorList with filter or list comprehensions:
factorList' n = filter (`isFactorOf` n) [1 .. n `div` 2]
factorList'' n = [ x | x <- [1 .. n`div` 2], x `isFactorOf` n]
But in this case you wouldn't have written the recursion yourself.
Further exercises:
Try to implement the filter function yourself.
Create another function that returns only prime factors. You can either use your previous result and write a prime filter, or write a recursive function that generates them directly (the latter is faster).

@Zeta's answer is interesting. But if you're new to Haskell like I am, you may want a "simple" answer to start with. (Just to get the basic recursion pattern...and to understand the indenting, and things like that.)
I'm not going to divide anything by 2 and I will include the number itself. So factorlist 15 => [1,3,5,15] in my example:
factorList :: Int -> [Int]
factorList value = factorsGreaterOrEqual 1
  where
    factorsGreaterOrEqual test
      | (test == value) = [value]
      | (value `mod` test == 0) = test : restOfFactors
      | otherwise = restOfFactors
      where restOfFactors = factorsGreaterOrEqual (test + 1)
The first line is the type signature, which you already knew about. The type signature doesn't have to live right next to the list of pattern definitions for a function (though the patterns themselves need to be all together on sequential lines).
Then factorList is defined in terms of a helper function. This helper function is defined in a where clause...that means it is local and has access to the value parameter. Were we to define factorsGreaterOrEqual globally, then it would need two parameters as value would not be in scope, e.g.
factorsGreaterOrEqual 4 15 => [5,15]
You might argue that factorsGreaterOrEqual is a useful function in its own right. Maybe it is, maybe it isn't. But in this case we're going to say it isn't of general use besides to help us define factorList...so using the where clause and picking up value implicitly is cleaner.
The indentation rules of Haskell are (to my tastes) weird, but here they are summarized. I'm indenting with two spaces here because it grows too far right if you use 4.
A list of Boolean tests, each introduced by the pipe character, is called "guards" in Haskell. I simply establish the terminal condition as being when the test hits the value; so factorsGreaterOrEqual N = [N] if we were doing a call to factorList N. Then we decide whether to prepend the test number to the list by checking whether dividing the value by it leaves no remainder. (otherwise is not a keyword but an ordinary Prelude value equal to True; it plays a role similar to default in C-like switch statements, catching the fall-through case.)
Showing another level of nesting and another implicit parameter demonstration, I added a where clause to locally define a function called restOfFactors. There is no need to pass test as a parameter to restOfFactors because it lives "in the scope" of factorsGreaterOrEqual...and as that lives in the scope of factorList then value is available as well.

why is function name repeated in haskell? (newbie)

Why is the function name repeated in this example?
lucky :: (Integral a) => a -> String
lucky 7 = "LUCKY NUMBER SEVEN!"
lucky x = "Sorry, you're out of luck, pal!"
When should I not repeat the function name? What does it mean?
Thanks

What you are seeing is pattern matching in action.
I will show you another example:
test 1 = "one"
test 2 = "two"
test 3 = "three"
Demo in ghci:
ghci> test 1
"one"
ghci> test 2
"two"
ghci> test 3
"three"
ghci> test 4
"*** Exception: Non-exhaustive patterns in function test
When you call a function, the argument is matched against the patterns of the function's equations, from top to bottom. So a call to test 3 first checks the equation test 1; since 1 is not equal to 3, it moves on to the next equation. Again, since 2 is not equal to 3, it moves on to the next one. In that equation 3 is equal to 3, so the String "three" is returned. When you call the function with a value that none of the patterns match, the program throws an exception.
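One common way to avoid that exception is to end the definition with a catch-all equation. The sketch below uses the hypothetical name testTotal and an Int argument type, both chosen here for illustration; the wildcard pattern _ matches any argument that the earlier equations did not.

```haskell
-- Same equations as test, plus a final wildcard equation so that
-- every Int argument is handled.
testTotal :: Int -> String
testTotal 1 = "one"
testTotal 2 = "two"
testTotal 3 = "three"
testTotal _ = "something else"
```

Since equations are tried in order, the wildcard has to come last; placed first, it would match everything and the other equations would never be reached.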

This kind of pattern matching can be transformed to a case expression (and indeed, that's what compilers will normally do!):
lucky' n = case n of
  7 -> "LUCKY NUMBER SEVEN!"
  x -> "Sorry, you're out of luck, pal!"
Because the x isn't really used, you'd normally write _ -> "Sorry, ..." instead.
Note that this is not[2] the same as
lucky'' n = if n==7 then ...
Equality comparison with (==) is in general more expensive[1] than pattern matching, and also comes out uglier.
[1] Why it's more expensive: suppose we have two big data structures. To determine that they are equal, the program needs to dig through both entire structures and make sure all branches really are equal. However, if you pattern match, you only inspect the small part you're interested in right now.
[2] Actually, it is the same in this case, but only because the compiler has a particular trick for pattern matching on numeric literals: it rewrites the match with (==). This is special to Num types and not true for anything else. (Except if you use the OverloadedStrings extension.)
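The difference described in the first footnote can be seen on lists. In this sketch (the function names are hypothetical), the pattern-matching version inspects only the outermost constructor, so it needs no Eq instance and never looks at the elements or the tail, whereas the (==) version requires Eq on the element type.

```haskell
-- Pattern matching: only checks whether the list is [] or a cons cell.
isEmptyByPattern :: [a] -> Bool
isEmptyByPattern [] = True
isEmptyByPattern _  = False

-- Equality: needs an Eq instance for the elements, so it cannot be
-- applied to, say, a list of functions.
isEmptyByEq :: Eq a => [a] -> Bool
isEmptyByEq xs = xs == []
```

For instance, isEmptyByPattern [id] type checks even though functions have no Eq instance, and isEmptyByPattern (1 : undefined) returns False without ever evaluating the undefined tail.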

That definition of lucky uses "pattern matching", and is equivalent (in this case) to
lucky :: (Integral a) => a -> String
lucky a = if a == 7
  then "LUCKY NUMBER SEVEN!"
  else "Sorry, you're out of luck, pal!"

I assume you're looking at Learn You a Haskell. After that example, it says that
When you call lucky, the patterns will be checked from top to bottom and when it conforms to a pattern, the corresponding function body will be used.
So the first line indicates the type of the function, and later lines are patterns to check. Each line has the function name so the compiler knows you're still talking about the same function.
Think of it this way: When you write the expression lucky (a+b) or whatever, the compiler will attempt to replace lucky (a+b) with the first thing before the = in the function definition that "fits." So if a=3 and b=4, you get this series of replacements:
lucky (a+b) =
lucky (3+4) =
--pattern matching occurs...
lucky 7 =
"LUCKY NUMBER SEVEN!"
This is part of what makes Haskell so easy to reason about in practice; you get a system that works similarly to math.
