Haskell - pattern matching syntactic sugar and where

Often I have a function of this shape:
f :: a -> b
f x = case x of
    ... -> g ...
    ... -> g ...
    ...
    ... -> g ...
  where g = ...
There is syntactic sugar for almost this case:
f :: a -> b
f ... = g ...
f ... = g ...
...
f ... = g ...
Unfortunately I can't attach my where clause to it: I'd obviously get a bunch of "not in scope" errors.
I can make g a separate function, but it's not nice: my module's namespace will be polluted with utility functions.
Is there any workaround?

I think that your first example isn't bad at all. The only syntactic weight is case x of, plus -> instead of =; the latter is offset by the fact that you can omit the function name for each clause. Indeed, even dflemstr's proposed go helper function is syntactically heavier.
Admittedly, it's slightly inconsistent compared to the normal function clause syntax, but this is probably a good thing: it more precisely visually delimits the scope in which x is available.
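For concreteness, here is a small made-up instance of that shape (classify and describe are hypothetical names): the only extra syntax relative to multiple equations is the case x of line and -> instead of =, and x stays available in the where clause.
classify :: Int -> String
classify x = case x of
    0 -> describe "zero"
    1 -> describe "one"
    _ -> describe "several"
  where
    -- the shared helper can freely mention x, since x is bound by the equation
    describe kind = "got " ++ kind ++ " (" ++ show x ++ ")"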

No, there is no workaround. When you have multiple clauses for a function like that, they cannot share a where-clause. Your only option is to use a case statement, or do something like this:
f x = go x
  where
    go ... = g ...
    go ... = g ...
    g = ...
...if you really want to use a function form for some reason.

f = g . h -- h is most of your original f
  where
    h ... = ...
    h ... = ...
    g = ...

From Haskell 2010 on, or with GHC you can also do:
f x
  | m1 <- x = g
  | m2 <- x = g
  ...
  where g = ...
but note that you cannot use the variables bound in the patterns in g. It's equivalent to:
f x = let g = ... in case () of
        () -> case x of
                m1 -> g
                _ -> case x of
                       m2 -> g
                       ...
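To make the caveat concrete, here is a small made-up example (reusing the name f): a variable bound by a pattern guard is usable in that clause's right-hand side, but not inside the shared where-bound g.
f :: Maybe Int -> Int
f x
  | Just n  <- x = g + n   -- n is in scope in this clause
  | Nothing <- x = g
  where
    g = 10                 -- n is not in scope here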

Your original solution seems to be the best and only workaround. Syntactically it's no heavier than direct pattern matching on function parameters, if not lighter.
But in case what you need is just to check preconditions rather than pattern match, don't forget about guards, which give you free access to the where scope. That said, I see nothing bad about your case-of solution.
f :: a -> b
f a
  | a == 2    = ...
  | isThree a = ...
  | a >= 4    = ...
  | otherwise = ...
  where isThree x = x == 3

With LambdaCase, you can also do this:
{-# LANGUAGE LambdaCase #-}

f :: a -> b
f = \case
    ... -> g ...
    ... -> g ...
    ...
    ... -> g ...
  where g = ...

Is it safe to assume that you consistently use g on most, if not all, of the different branches of the case statement?
Operating with the assumption that f :: a -> b for some a and b (possibly polymorphic), g is necessarily some function of the form c -> d, which means that there must be a way to consistently extract a c out of an a. Call that getC :: a -> c. In that case, the solution would be to simply use h . g . getC for all cases, where h :: d -> b.
But suppose you can't always get the c out of an a. Perhaps a is of the form f c, where f is a Functor? Then you could fmap g :: f c -> f d, and then somehow transform f d into a b.
Just sort of rambling here, but fmap was the first thing that came to mind when I saw that you appeared to be applying g on every branch.
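As a rough sketch of that factoring, with entirely made-up types and names, the shape would be:
data A = A1 Int | A2 Int

getC :: A -> Int      -- extract the piece every branch needs
getC (A1 n) = n
getC (A2 n) = n

g :: Int -> String    -- plays the role of the shared g
g = show

h :: String -> String -- post-processing back to the result type
h s = "result: " ++ s

f :: A -> String
f = h . g . getC      -- no per-branch plumbing left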

Related

Are these two nested functions (whose bodies depend on the outer function's argument) equivalent from the compiler's point of view?

I have some function test which has a signature like:
data D = D | C
test :: D -> ....
test d ... =
And I want to use let to create some nested function whose body is either body-A or body-B, based on a case analysis of d. I can do it as:
let nestedFun p =
      case d of
        C -> case (ft, p) of
               (Just SM.FileTypeRegular, Just p1) | Just nm <- takeFileName (cs p1) -> S.member nm itemNames
               _ -> False
        D -> case (ft, p) of
               (Just SM.FileTypeRegular, Just p1) -> S.member (hash $ cs #_ #FilePath p1) itemHashes
               _ -> False
or as
let nestedFun p =
      case (ft, p) of
        (Just SM.FileTypeRegular, Just p1) -> case d of
          C | Just nm <- takeFileName (cs p1) ->
                S.member nm itemNames
          D ->
                S.member (hash $ cs #_ #FilePath p1) itemHashes
        _ -> False
In short, the difference is that the 1st version looks like Python's:
if isinstance(d, D):
    nestedFun = lambda p: ...
else:
    nestedFun = lambda p: ...
while the 2nd one is like:
def nestedFun(p):
    if isinstance(d, D): ...
    else: ...
I will call this nestedFun on a big list of values, so the question is: is the Haskell compiler/optimizer able to see that both versions are the same and reduce the 2nd one to the 1st, so that the case analysis on d happens just once?
GHC is able to -- the optimizer does consider case-of-case transformations to see if they enable other optimizations -- but not in a way that you can rely on. If you need this, I highly recommend performing that transformation by hand. In fact, for the case you describe here, I would go even farther, and make it clear that the case can happen before p is in scope:
nestedFunDmwit = case d of
    C -> \p -> case (ft, p) of ...
    D -> \p -> case (ft, p) of ...
The difference here is that nestedFun will re-evaluate the case each time it is applied to an argument, while nestedFunDmwit will evaluate the case just once. So, for example, map (nestedFun x) [a, b, c] would reliably evaluate the case just once; map nestedFun [a, b, c] would evaluate the case three times unless things line up just so for the optimizer; and map nestedFunDmwit [a, b, c] would reliably evaluate the case just once.

An ArrowCircuit instance for stream processors which could block

The Control.Arrow.Operations.ArrowCircuit class is for:
An arrow type that can be used to interpret synchronous circuits.
I want to know what synchronous means here. I looked it up on Wikipedia, where it is discussed in terms of digital electronics. My electronics knowledge is quite rusty, so here is the question: what is wrong (if anything) with the following instance for the so-called asynchronous stream processors:
data StreamProcessor a b = Get (a -> StreamProcessor a b)
                         | Put b (StreamProcessor a b)
                         | Halt

instance Category StreamProcessor where
    id = Get (\ x -> Put x id)
    Put c bc . ab = Put c (bc . ab)
    Get bbc . Put b ab = (bbc b) . ab
    Get bbc . Get aab = Get $ \ a -> (Get bbc) . (aab a)
    Get bbc . Halt = Halt
    Halt . ab = Halt

instance Arrow StreamProcessor where
    ...

getThroughBlocks :: [a] -> StreamProcessor a b -> StreamProcessor a b
getThroughBlocks ~(a : input) (Get f) = getThroughBlocks input (f a)
getThroughBlocks _input putOrHalt = putOrHalt

getThroughSameArgBlocks :: a -> StreamProcessor a b -> StreamProcessor a b
getThroughSameArgBlocks = getThroughBlocks . repeat

instance ArrowLoop StreamProcessor where
    loop Halt = Halt
    loop (Put (c, d) bdcd') = Put c (loop bdcd')
    loop (Get f) = Get $ \ b ->
        let
            Put (c, d) bdcd' = getThroughSameArgBlocks (b, d) (f (b, d))
        in Put c (loop bdcd')

instance ArrowCircuit StreamProcessor where
    delay b = Put b id
I reckon this solution works for us: we want someArrowCircuit >>> delay b to be someArrowCircuit delayed by one tick, with b coming before anything it outputs. It is easy to see we get what we want:
someArrowCircuit >>> delay b
    = someArrowCircuit >>> Put b id
    = Put b id . someArrowCircuit
    = Put b (id . someArrowCircuit)
    = Put b someArrowCircuit
Are there any laws for such a class? If I made no mistake writing delay down, how does synchronous live alongside asynchronous?
The only law that I know of related to ArrowCircuit is actually for the similar ArrowInit class from Causal Commutative Arrows, which says that delay i *** delay j = delay (i,j). I'm pretty sure your version satisfies this (and it looks like a totally reasonable implementation), but it still feels a little strange considering that StreamProcessor isn't synchronous.
Particularly, synchronous circuits follow a pattern of a single input producing a single output. For example, if you have a Circuit a b and provide it a value of type a, then you will get one and only one output b. The "one-tick delay" that delay introduces is thus a delay of one output by one step.
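For contrast, here is a minimal sketch (not from the question) of a synchronous stream transformer in the style ArrowCircuit usually assumes: exactly one output per input, so delay shifts the whole stream by one tick.
newtype Auto a b = Auto (a -> (b, Auto a b))

runAuto :: Auto a b -> [a] -> [b]
runAuto (Auto f) (x:xs) = let (y, k) = f x in y : runAuto k xs
runAuto _        []     = []

-- one-tick delay: emit the seed first, then echo the previous input
delayAuto :: b -> Auto b b
delayAuto x = Auto (\a -> (x, delayAuto a))

-- runAuto (delayAuto 100) [1,2,3] == [100,1,2]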
But things are a little funky for asynchronous circuits. Let's consider an example:
runStreamProcessor :: StreamProcessor a b -> [a] -> [b]
runStreamProcessor (Put x s) xs = x : runStreamProcessor s xs
runStreamProcessor _ [] = []
runStreamProcessor Halt _ = []
runStreamProcessor (Get f) (x:xs) = runStreamProcessor (f x) xs

multiplyOneThroughFive :: StreamProcessor Int Int
multiplyOneThroughFive = Get $ \x ->
    Put (x*1) $ Put (x*2) $ Put (x*3) $ Put (x*4) $ Put (x*5) multiplyOneThroughFive
Here, multiplyOneThroughFive produces 5 outputs for each input it receives. Now, consider the difference between multiplyOneThroughFive >>> delay 100 and delay 100 >>> multiplyOneThroughFive:
> runStreamProcessor (multiplyOneThroughFive >>> delay 100) [1,2]
[100,1,2,3,4,5,2,4,6,8,10]
> runStreamProcessor (delay 100 >>> multiplyOneThroughFive) [1,2]
[100,200,300,400,500,1,2,3,4,5,2,4,6,8,10]
Inserting the delay at a different point in the circuit actually caused us to produce a different number of results. Indeed, it seems as if the circuit as a whole underwent a 5-tick delay instead of just a 1-tick delay. This would definitely be unexpected behavior in a synchronous environment!

Is this an accurate example of a Haskell Pullback?

I'm still trying to grasp an intuition of pullbacks (from category theory), limits, and universal properties, and I'm not quite catching their usefulness, so maybe you could help shed some insight on that as well as verifying my trivial example?
The following is intentionally verbose, the pullback should be (p, p1, p2), and (q, q1, q2) is one example of a non-universal object to "test" the pullback against to see if things commute properly.
-- MY DIAGRAM, A -> B <- C
type A = Int
type C = Bool
type B = (A, C)
f :: A -> B
f x = (x, True)
g :: C -> B
g x = (1, x)
-- PULLBACK, (p, p1, p2)
type PL = Int
type PR = Bool
type P = (PL, PR)
p = (1, True) :: P
p1 = fst
p2 = snd
-- (g . p2) p == (f . p1) p
-- TEST CASE
type QL = Int
type QR = Bool
type Q = (QL, QR)
q = (152, False) :: Q
q1 :: Q -> A
q1 = ((+) 1) . fst
q2 :: Q -> C
q2 = ((||) True) . snd
u :: Q -> P
u (_, _) = (1, True)
-- (p2 . u == q2) && (p1 . u == q1)
I was just trying to come up with an example that fit the definition, but it doesn't seem particularly useful. When would I "look for" a pull back, or use one?
I'm not sure Haskell functions are the best context in which to talk about pull-backs. The pull-back of A -> B and C -> B can be identified with a subset of A x C, and subset relationships are not directly expressible in Haskell's type system. In your specific example the pull-back would be the single element (1, True), because x = 1 and b = True are the only values for which f(x) = g(b).
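As a hedged sketch, when A and C can be enumerated you can compute that subset directly; pullbackOf below is a made-up helper, not a standard function.
-- the pull-back of f : A -> B and g : C -> B as the pairs on which they agree
pullbackOf :: Eq b => (a -> b) -> (c -> b) -> [a] -> [c] -> [(a, c)]
pullbackOf f g as cs = [ (a, c) | a <- as, c <- cs, f a == g c ]

-- with the question's f and g:
--   pullbackOf f g [0..2] [False, True] == [(1, True)]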
Some good "practical" examples of pull-backs may be found starting on page 41 of Category Theory for Scientists by David I. Spivak.
Relational joins are the archetypal example of pull-backs which occur in computer science. The query:
SELECT ...
FROM A, B
WHERE A.x = B.y
selects pairs of rows (a,b) where a is a row from table A and b is a row from table B, and where some function of a equals some other function of b. In this case the functions being pulled back are f(a) = a.x and g(b) = b.y.
Another interesting example of a pullback is type unification in type inference. You get type constraints from several places where a variable is used, and you want to find the tightest unifying constraint. I mention this example in my blog.

higher-order functions in Haskell

I have written the following code:
hosum :: (Int -> Int) -> (Int -> Int)
hosum f 0 = 1
hosum f n = afunction f (-abs(n)) (abs(n))

afunction :: (Int -> Int) -> Int -> Int -> Int
afunction f a z
  | a == z    = 0
  | otherwise = afunction f (a+1) z + afunction f a z
to find the sum of f(i) from -|n| to |n|. Where is my mistake?
As pointed out in the comments, your code never calls the f function. There are several other things in your code that I don't understand:
hosum f 0 = 1: why is it 1 for any f? Shouldn't it be f 0?
In afunction, why is the result 0 if a == z? If the range is inclusive, it should be zero only if a > z.
In the otherwise case, afunction calls itself twice. Why doesn't it apply f to a and call afunction f (a + 1) z only? (A repaired recursive version is sketched below.)
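For comparison, fixing those three points in place would give something like this sketch (my own code, not from the answer):
hosum :: (Int -> Int) -> Int -> Int
hosum f n = go (negate (abs n))
  where
    go i
      | i > abs n = 0                -- inclusive range, so stop only past |n|
      | otherwise = f i + go (i + 1)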
Now about a correct solution.
The easiest (and idiomatic) way to implement it is to use the standard sum and map functions. It gives a one-liner (if we don't count the type signature):
hosum :: (Int -> Int) -> Int -> Int
hosum f n = sum $ map f [-abs(n)..abs(n)]
In plain English, this function takes a list of all numbers from -abs(n) to abs(n), applies f to each of them and sums them up. That's exactly what the problem statement tells us to do.
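A quick check in GHCi (assuming the definition above; 9 + 4 + 1 + 0 + 1 + 4 + 9 = 28):
> hosum (^2) 3
28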

Values of variables for pattern matching

Just starting with Haskell. I want to define some elements to easily create morphisms between them.
a = "foo"
b = "bar"
g a = a --Problem is here
g b = a --Problem is here
Edit: The problem is that Haskell treats the a in g a as a variable, but I actually want the value of the a defined above. Conceptually I want this:
g (valueOf a) = a --Problem is here
g (valueOf b) = a --Problem is here
Where valueOf is a magic function that would give me
g "foo" = a
g "bar" = a
Use
a = "foo"
b = "bar"
g x | x == a = a
    | x == b = a
or
g "foo" = a
g "bar" = a
When you pattern match using a variable as in
g a = ...
the variable a is a local variable, bound to the argument of the function. Even if a was already defined globally, the code above will not use the value of the global a to perform a comparison.
These semantics allow you to reason locally about your code. Consider this code as an example:
f 2 x = 4
f c d = 0
Just by looking at the above definition you can see that f 2 3 is 4. This is not changed if later on you add a definition for x as follows:
x = 5
f 2 x = 4
f c d = 0
If the match semantics compared the second argument to 5, we would now have f 2 3 equal to 0. This would make reasoning about function definitions harder, so most (if not all) functional languages, Haskell included, use "local" variables for pattern matching, ignoring any global definitions of those variables.
A more adventurous alternative is to use view patterns:
{-# LANGUAGE ViewPatterns #-}
a = "foo"
b = "bar"
g ((==a) -> True) = ...
g ((==b) -> True) = ...
I am not a fan of this approach though, since I find standard patterns with guards to be clearer.
Apologies in advance if this is a complete misunderstanding of what you want to accomplish, but wouldn't something like this do?
data Obj = A | B

g A = A
g B = A

f A = "foo"
f B = "bar"
You want a predefined set of objects, yes?
