Question about the ~ and @ operators in Haskell


What exactly do they do? I know one possible use of @ (assigning a name at the start of a pattern match), but haven't been able to find anything on ~.
I found them in the following code snippet, taken from http://www.haskell.org/haskellwiki/Prime_numbers, but the article assumes that you're fluent in Haskell syntax and doesn't bother explaining its esoteric operators (the part I'm confused about is the start of the declaration for sieve):
primesPT () = 2 : primes'
  where
    primes' = sieve [3,5..] primes' 9
    sieve (p:xs) ps@ ~(_:t) q
      | p < q = p : sieve xs ps q
      | True  = sieve [x | x<-xs, rem x p /= 0] t (head t^2)
Any explanation (or link to one) about the syntax used here would be greatly appreciated.

The operator ~ makes a match lazy. Usually a pattern match evaluates the argument, because it has to check whether the pattern fails. If you prefix a pattern with ~, nothing is evaluated until one of the pattern's variables is needed. This functionality is often used in “Tying the knot” code, where one needs to refer to structures that are not yet created. If the pattern fails upon evaluation, the result is undefined.
Here is an example:
f (_:_) = True
f [] = False
g ~(_:_) = True
g [] = False
f [] yields False, while g [] yields True, because the first pattern of g always matches. (You actually get a warning for this code.)
That said, you can see ~ as the opposite of !, which forces the evaluation of an argument even if it's unneeded.
Note that these operators only make things strict/lazy at the level they are applied at, not recursively. For example:
h ~((x,y):xys) = ...
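The outer cons pattern is lazy, so nothing is evaluated when h is called. The tuple pattern inside it is still an ordinary strict pattern: as soon as any of x, y or xys is demanded, the whole pattern is matched, including the tuple.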
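To make the difference concrete, here is a minimal sketch that compares an ordinary pattern with a lazy one. The names strictConst, lazyConst and lazyFst are made up for this illustration.

```haskell
-- Ordinary (strict) tuple pattern: the argument must be evaluated to a
-- pair before this equation can be selected.
strictConst :: (a, b) -> Int
strictConst (_, _) = 42

-- Lazy (irrefutable) pattern: the match is deferred until a variable bound
-- by the pattern is demanded, which never happens here.
lazyConst :: (a, b) -> Int
lazyConst ~(_, _) = 42

-- Here the deferred match is performed only when `x` is demanded.
lazyFst :: (a, b) -> a
lazyFst ~(x, _) = x
```

In GHCi, lazyConst undefined should return 42 without touching its argument, whereas strictConst undefined raises an exception because the argument has to be evaluated to find the tuple constructor.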

It's a lazy pattern match (also known as irrefutable pattern match which I think is the better name).
Essentially, ~(_:t) will always match, even if the input is the empty list []. Of course, this is dangerous if you don't know what you're doing:
Prelude> let f ~(_:t) = t in f []
*** Exception: <interactive>:1:4-15: Irrefutable pattern failed for pattern (_ : t)
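For completeness, the other operator from the question is the as-pattern, written name@pattern: it binds a name to the whole value while the pattern on its right takes that value apart. A minimal sketch, where dupHead is a made-up name for illustration:

```haskell
-- `whole@(x:_)` binds `x` to the head of the list and `whole` to the
-- entire (unmodified) list, so the list does not have to be rebuilt.
dupHead :: [a] -> [a]
dupHead whole@(x:_) = x : whole
dupHead []          = []
```

In the sieve from the question, ps@ ~(_:t) combines both features: ps names the whole list, and the lazy pattern ~(_:t) delays taking it apart until t is needed.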

Related

Meaning of overlapping pattern in Haskell

My current understanding of pattern overlapping in Haskell is that 2 patterns are considered to be overlapping if some argument values passed to a function can be matched by multiple patterns.
Given:
last :: [a] -> a
last [x] = x
last (_ : xs) = last xs
passing the argument value [1] would match both the first pattern [x] and the 2nd pattern (_ : xs) - so that would mean the function has overlapping patterns even though both patterns can be matched.
What makes this confusing is that although the patterns are (by the definition above) overlapping, GHC does not show any warning about them being overlapping.
Reversing the two equations of the last function does produce the overlapping warning:
last :: [a] -> a
last (_ : xs) = last xs
last [x] = x
Warning:
src\OverlappingPatterns.hs:6:1: Warning:
Pattern match(es) are overlapped
In an equation for `last': last [x] = ...
It is almost as though GHC considers patterns overlapping only if an earlier pattern makes it impossible to ever match a pattern that occurs later.
What is the correct way to determine if a function has overlapping patterns or not?
Update
I am looking for the overlapping pattern definition used in fp101x course.
According to the definition used in fp101x the following function has overlapping patterns:
last :: [a] -> a
last [x] = x
last (_ : xs) = last xs
This is in contradiction with GHC definition of overlapping pattern which does not consider it to have any overlapping patterns.
Without a proper definition of what overlapping pattern means in the fp101x course context, it is impossible to solve that exercise. And the definition used there is not the GHC one.

The updated question clarifies that the OP wants a formal definition of overlapping patterns. Here "overlapping" is meant in the sense used by GHC when it emits its warnings: that is, when it detects that a case branch is unreachable because its pattern does not match anything that is not already handled by an earlier branch.
A possible formal definition can indeed follow that intuition. That is, for any pattern p one can first define the set of values (denotations) [[p]] matching with p. (For this, it is important to know the type of the variables involved in p -- [[p]] depends on a type environment Gamma.) Then, one can say that in the sequence of patterns
q0 q1 ... qn p
the pattern p is overlapping iff [[p]], as a set, is included in [[q0]] union ... union [[qn]].
The above definition is hardly operational, though: it does not immediately lead to an algorithm for checking overlaps. Indeed, computing [[p]] is infeasible since it is, in general, an infinite set.
To define an algorithm, I'd try to define a representation for the set of terms "not yet matched" by any pattern q0 .. qn. As an example, suppose we work with lists of booleans:
Remaining: _ (that is, any list)
q0 = []
Remaining: _:_ (any non empty list)
q1 = (True:xs)
Remaining: False:_
p = (True:False:ys)
Remaining: False:_
Here, the "remaining" set did not change, so the last pattern is overlapping.
As another example:
Remaining: _
q0 = True:[]
Remaining: [] , False:_ , True:_:_
q1 = False:xs
Remaining: [], True:_:_
q2 = True:False:xs
Remaining: [], True:True:_
q3 = []
Remaining: True:True:_
p = True:xs
Remaining: nothing -- not overlapping (and exhaustive as well!)
As you can see, in each step we match each of the "remaining" samples with the pattern at hand. This generates a new set of remaining samples (possibly none). The collection of all these samples forms the new remaining set.
For this, note that it is important to know the list of constructors for each type. This is because when matching with True, you must know there's another False case remaining. Similarly, if you match against [], there's another _:_ case remaining. Roughly, when matching against constructor K, all other constructors of the same type remain.
The above examples are not yet an algorithm, but they can get you started, hopefully.
All of this of course ignores case guards (which make the overlap undecidable), pattern guards, and GADTs (which can further refine the remaining set in quite subtle ways).
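The "remaining set" idea from the two worked examples can be sketched in code for the special case of lists of booleans. This is only an illustration under simplifying assumptions (no guards, no variables with interesting types, only Bool and list constructors), not GHC's actual algorithm; all type and function names below are invented for the sketch.

```haskell
-- Patterns (and "remaining" samples) over Bool and over [Bool].
data BoolPat = AnyBool | IsTrue | IsFalse deriving (Eq, Show)
data ListPat = AnyList | Nil | Cons BoolPat ListPat deriving (Eq, Show)

-- Part of a Bool sample that the pattern does NOT cover.
minusBool :: BoolPat -> BoolPat -> [BoolPat]
minusBool _ AnyBool       = []
minusBool AnyBool IsTrue  = [IsFalse]
minusBool AnyBool IsFalse = [IsTrue]
minusBool s p | s == p    = []
              | otherwise = [s]

-- Part of a Bool sample that the pattern DOES cover, if any.
meetBool :: BoolPat -> BoolPat -> Maybe BoolPat
meetBool AnyBool p = Just p
meetBool s AnyBool = Just s
meetBool s p | s == p    = Just s
             | otherwise = Nothing

-- Does some value match both the sample and the pattern?
meets :: ListPat -> ListPat -> Bool
meets AnyList _ = True
meets _ AnyList = True
meets Nil Nil   = True
meets (Cons b bs) (Cons p ps) = meetBool b p /= Nothing && meets bs ps
meets _ _       = False

-- Samples describing what is left of a sample after removing a pattern.
minusList :: ListPat -> ListPat -> [ListPat]
minusList _ AnyList = []
-- Split "any list" into its two constructors, then subtract from each.
minusList AnyList p = concatMap (`minusList` p) [Nil, Cons AnyBool AnyList]
minusList Nil Nil = []
minusList Nil (Cons _ _) = [Nil]
minusList s@(Cons _ _) Nil = [s]
minusList (Cons b bs) (Cons p ps) =
  -- heads the pattern misses: the whole tail remains
  [Cons b' bs | b' <- minusBool b p]
  -- heads the pattern covers: only the uncovered tails remain
  ++ [Cons b' bs' | Just b' <- [meetBool b p], bs' <- minusList bs ps]

-- Walk the equations in order. For each one, report whether it is
-- unreachable (no remaining sample can match it); also return what is
-- still unmatched at the end (an empty list means exhaustive).
check :: [ListPat] -> ([Bool], [ListPat])
check = go [AnyList]
  where
    go remaining []       = ([], remaining)
    go remaining (p : ps) =
      let unreachable  = not (any (`meets` p) remaining)
          remaining'   = concatMap (`minusList` p) remaining
          (rest, left) = go remaining' ps
      in (unreachable : rest, left)
```

For the first example above, check [Nil, Cons IsTrue AnyList, Cons IsTrue (Cons IsFalse AnyList)] flags only the third pattern and leaves Cons IsFalse AnyList unmatched, which corresponds to "Remaining: False:_".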

I am looking for the overlapping pattern definition used in fp101x course.
"Patterns that do not rely on the order in which they are matched are
called disjoint or non-overlapping." (from "Programming in Haskell"
Graham Hutton)
So this example would be non-overlapping
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f v [] = v
foldr f v (x : xs) = f x (foldr f v xs)
Because you can change the order of pattern-matching like this:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f v (x : xs) = f x (foldr f v xs)
foldr f v [] = v
And here you can't:
last :: [a] -> a
last [x] = x
last (_ : xs) = last xs
So the last one (last) is overlapping.
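One way to see what "relying on the order" means is to write the same equations in two orders and compare results. The functions describeA and describeB below are hypothetical examples, not taken from the book:

```haskell
-- The patterns [_] and (_:_) both match a singleton list, so the result
-- depends on which equation comes first.
describeA :: [a] -> String
describeA [_]   = "singleton"
describeA (_:_) = "non-empty"
describeA []    = "empty"

-- Same equations with the first two swapped. GHC is expected to warn that
-- the [_] equation is redundant here, because (_:_) already covers it.
describeB :: [a] -> String
describeB (_:_) = "non-empty"
describeB [_]   = "singleton"
describeB []    = "empty"
```

For a one-element list the two versions disagree, so in Hutton's sense these patterns are overlapping. By contrast, [] and (x : xs) never match the same value, so the foldr equations can be reordered freely.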

I think the thing is that in the first ordering, not every value that matches (_:xs) is caught by [x], so the second equation is still reachable. In the second ordering the opposite holds: every value that matches [x] has already been caught by (_:xs), so nothing ever falls through to [x]. So, overlapping really means that there is an unreachable pattern.
This is what GHC documentation has to say about it:
By default, the compiler will warn you if a set of patterns are either
incomplete (i.e., you're only matching on a subset of an algebraic
data type's constructors), or overlapping, i.e.,
f :: String -> Int
f [] = 0
f (_:xs) = 1
f "2" = 2
where the last pattern match in `f' won't ever be reached, as
the second pattern overlaps it. More often than not, redundant
patterns is a programmer mistake/error, so this option is enabled by
default.
Maybe "unreachable pattern" would be a better choice of words.

I would suggest that reasoning about the code, combined with compiler messages and test results, is a better way to understand whether a function has overlapping patterns. Here are two examples; the first, which has already been listed, does indeed result in a compiler warning when its equations are swapped.
-- The first definition should work as expected.
last1 :: [a] -> a
last1 [x] = x
last1 (_:xs) = last1 xs
In the second case, if we swap the last two lines around, calling the function fails with an error: Program error: pattern match failure: init1 []
last :: [a] -> a
last (_:xs) = last xs
last [x] = x
This matches the reasoning that a singleton list can match both patterns. After the swap, the equation
last (_:xs) = last xs
comes first and matches in both cases, so last [x] = x is never reached. Now consider the second example:
-- The first definition should work as expected
drop :: Int -> [a] -> [a]
drop 0 xs = xs
drop n [] = []
drop n (_:xs) = drop (n - 1) xs
In the second case, if we again swap the last line with the first line, we don't get a compiler error, but we also don't get the results we expect: Main> drop 1 [1,2,3] returns an empty list [].
drop :: Int -> [a] -> [a]
drop n (_:xs) = drop (n - 1) xs
drop 0 xs = xs
drop n [] = []
In summary, I think this is why reasoning (as opposed to a formal definition) works well enough for determining overlapping patterns.

Span and pattern matching

The span function is defined below. I am curious as to how (ys, zs) is pattern matched with (x:ys, zs) where there is already an 'x' and a cons. I somehow believed pattern matching would be an in-place replacement, but this blows my mind and had my jaw dropped. This is really beautiful.
I am curious as to if this construct and more is explained in any book (I am currently reading Real World Haskell Chapter 4 and wonder if this book or any other explains this in detail). Sorry if I come off as naive, but to me, this is a fine pattern matching construct and I would love to know more.
span p [] = ([],[])
span p xs@(x:xs')
  | p x = (x:ys,zs)
  | otherwise = ([],xs)
  where (ys,zs) = span p xs'

You're right, this is beautiful. It is the closest thing in Haskell to Prolog's TRMC (tail recursion modulo cons).
Let me explain. That definition is equivalent to
span p xs = case xs of
  (x:t) | p x -> let (ys,zs) = span p t in
                 (x:ys,zs)  -- value1
  _           -> ([],xs)    -- value2 constructed from known parts
Because Haskell is lazy, value1 is constructed and returned immediately, without any intermediate recursive calls, just like the simple value2. At this point x is already known (it was bound as part of pattern matching) but ys and zs are not calculated yet; only their definition is retained alongside value1, which has two "holes" in it, (x:_,_). Only if the value of either "hole" is demanded later will it be calculated, by making the further call to span and filling the holes with the destructured result (let bindings are pattern matches too).
This is known as guarded recursion in Haskell: the recursive call is guarded by the constructor(s), here (,) and (:), creating a value with hole(s), to be filled later as needed.
Incidentally, in Prolog this is written as
span(P,[], [],[]). % -- two "inputs", two "outputs"
span(P,XS, A,B):-
XS = [X|T],
( call(P,X) -> % -- predicate P holds for X:
A=[X|YS], B=ZS, % -- first, the value with holes is created
span(P,T, YS,ZS) % -- then, the holes are filled
; % -- else:
A=[], B=XS ). % -- output values are set
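The laziness described in this answer can be observed directly: because the recursive call sits behind the (,) and (:) constructors, the first component can be consumed even when the input is infinite. A small sketch, using the name spanLazy (an assumption, chosen to avoid clashing with the Prelude's span):

```haskell
-- Same definition as in the question, under a different name.
spanLazy :: (a -> Bool) -> [a] -> ([a], [a])
spanLazy _ [] = ([], [])
spanLazy p xs@(x:xs')
  | p x       = (x : ys, zs)
  | otherwise = ([], xs)
  where (ys, zs) = spanLazy p xs'   -- lazy binding: evaluated on demand

-- Every element of [1 ..] satisfies (> 0), so the span never "finishes",
-- yet the first three elements of the prefix are available immediately.
firstThree :: [Integer]
firstThree = take 3 (fst (spanLazy (> 0) [1 ..]))
```

Demanding the second component of that same result would loop forever, since it is only known once an element failing the predicate is found.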

Much of pattern syntax can also be used for expressions, so that you can use the same syntax for taking apart data with a pattern as you use for building it with an expression.
Note that since Haskell values are immutable, there are no in place replacements.
The part (x:ys,zs) is not itself a pattern, but is an expression that builds a new value from the values x, ys and zs, which themselves come from patterns.
x comes from the pattern xs@(x:xs') and is bound to the first element of the list passed as the second argument of span. This also binds xs' to the remainder of the list, and xs to the original whole. (The @ means "match the pattern to the right but also give a name bound to the whole", and is an exception to the rule that patterns can also be used as expressions.)
ys and zs come from the pattern (ys,zs) in where (ys,zs) = span p xs'. They are bound to the first and second element of the tuple returned from a recursive call of span p xs' with the remainder of the list after x has been removed.
Putting this together, the expression (x:ys,zs) makes a tuple that is the same as the one returned from the recursive span p xs', except that x has been consed to the first tuple element.
Someone else will have to answer about books, I learned Haskell too long ago to have read them. But if everything else fails, you can read the precise definitions in the Haskell report.

Haskell, parse error in input 'if'

I can't for the life of me figure out why there is a problem with this if statement (haskell noobie.) Can anyone help me out?
fst3 (a,b,c) = a
snd3 (a,b,c) = b
trd3 (a,b,c) = c
fst4 (a,b,c,d) = a
snd4 (a,b,c,d) = b
trd4 (a,b,c,d) = c
qud4 (a,b,c,d) = d
fractionalKnapsack (x:xs) =
    fractionalKnapsack (x:xs) []
fractionalKnapsack (x:xs) fracList =
    ((fst3 x),(snd3 x),(trd3 x),(snd3 x) / (trd3 x)):fracList
    if length (x:xs) <= 1
        then computeKnapsack sort(fracList)
        else fractionalKnapsack xs fracList
computeKnapsack (x:xs) = (x:xs)

There are a few things wrong with this code. You have two different definitions for fractionalKnapsack, each taking a different number of arguments, which clearly causes the compiler some trouble. Also, the parse error on the if statement is because there shouldn't actually be an if statement where you are trying to put one: you have already completed the definition of the function before you reach it.
It might help a little bit if you better explained what you are trying to do, or what you expect to be happening with the code you wrote.

: is the "cons" operator. It cons-tructs a list by providing the "head" element on the left, and a "tail" list on the right
ghci> 1 : [2,3,4]
[1,2,3,4]
You can pattern match on lists with more than 0 elements using :.
ghci> let (x:xs) = [1,2,3,4]
ghci> x
1
ghci> xs
[2,3,4]
The way that you are using (x:xs) in your code hints that you do not yet have a firm grasp of the definition of lists nor of pattern matching. Rather than using
if length (x:xs) <= 1
it is more common to simply pattern match. A simple example:
howMany :: [a] -> String
howMany [] = "Zero"
howMany [x] = "One"
howMany (x:xs) = "Many"
Haskell functions can be defined with a sequence of "equations" like this where you pattern match on the possible cases that you are interested in. This brings us to the other issues with your code, which are:
The equations for fractionalKnapsack don't match. One has 1 argument, the other has 2. You probably meant to name the second fractionalKnapsack'.
Neither of the fractionalKnapsack definitions handles the empty list case. I'm not sure about this; this may be acceptable if you know that it will never be given an empty list.
None of your functions have type signatures. Type inference can infer them, but it is usually a good idea to write the type signature first, to express your intent for the function and guide you in its implementation.
The second definition of fractionalKnapsack doesn't make sense. There can only be one expression after the = sign, but you have provided two, separated by a newline. This is invalid Haskell and explains why there is a parse error on "if": because whatever compiler/interpreter you were using did not expect the beginning of another expression!

computeKnapsack sort(fracList)
That's probably an error too. It should be computeKnapsack (sort fracList) (or, equivalently, computeKnapsack $ sort fracList).
Writing computeKnapsack sort(fracList) is the same as writing computeKnapsack sort (fracList), which is equivalent to computeKnapsack sort fracList, which means: "give computeKnapsack two arguments: sort and fracList".
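Putting the suggestions from these answers together, one possible reading of the code is sketched below. What the asker actually intended is not stated, so this is a guess: it assumes each item is a triple whose second and third components are numbers, that the fourth component should be their ratio, and that the list should be sorted by that ratio. computeKnapsack is left as a placeholder, as in the question.

```haskell
import Data.List (sortOn)

-- Assumed shapes: (label, second, third) in; (label, second, third, ratio) out.
fractionalKnapsack :: [(a, Double, Double)] -> [(a, Double, Double, Double)]
fractionalKnapsack items = go items []
  where
    -- A separate two-argument helper replaces the second "definition".
    -- Pattern matching on [] replaces the `length (x:xs) <= 1` test.
    go [] acc                 = computeKnapsack (sortOn ratio acc)
    go ((a, b, c) : rest) acc = go rest ((a, b, c, b / c) : acc)
    ratio (_, _, _, r) = r

-- Placeholder: the question's version also returned its input unchanged.
computeKnapsack :: [b] -> [b]
computeKnapsack = id
```

Note the parentheses in computeKnapsack (sortOn ratio acc): they pass the sorted list as a single argument, which is the fix described in the last answer.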

Compare the head of a haskell string?

Struggling to learn Haskell: how does one take the head of a string and compare it with the next character until it finds a character for which that's not true?
In pseudo code I'm trying to:
while x == 'next char in string' put in new list to be returned

The general approach would be to create a function that recursively evaluates the head of the string until it finds the false value or reaches the end.
To do that, you would need to
understand recursion (prerequisite: understand recursion) and how to write recursive functions in Haskell
know how to use the head function
quite possibly know how to use list comprehension in Haskell
I have notes on Haskell that you may find useful, but you may well find Yet Another Haskell Tutorial more comprehensive (Sections 3.3 Lists; 3.5 Functions; and 7.8 More Lists would probably be good places to start in order to address the bullet points I mention)
EDIT0:
An example using guards to test the head element and continue only if it is the same as the second element:
someFun :: String -> String
someFun [] = []
someFun [_] = []
someFun (x:y:xs)
  | x == y = someFun (y:xs)
  | otherwise = []
EDIT1:
I sort of want to say x = (newlist) and then rather than otherwise = [] have otherwise = [newlist] if that makes any sense?
It makes sense in an imperative programming paradigm (e.g. C or Java), less so for functional approaches
Here is a concrete example to, hopefully, highlight the difference between the if/then/else concept the quote suggests and what is happening in the someFun function:
When we call someFun [a,a,b,b] we match this to someFun (x:y:xs) and since x is 'a', and y is 'a', and x==y, then someFun [a,a,b,b] = someFun [a,b,b], which again matches someFun (x:y:xs) but the condition x==y is false, so we use the otherwise guard, and so we get someFun [a,a,b,b] = someFun [a,b,b] = []. Hence, the result of someFun [a,a,b,b] is [].
So where did the data go? Well, I'll hold my hands up and admit a bug in the code, which is now a feature I'm using to explain how Haskell functions work.
I find it helpful to think more in terms of constructing mathematical expressions rather than programming operations. So, the expression on the right of the = is your result, and not an assignment in the imperative (e.g. Java or C sense).
I hope the concrete example has shown that Haskell evaluates expressions using substitution, so if you don't want something in your result, then don't include it in that expression. Conversely, if you do want something in the result, then put it in the expression.
Since your pseudo code is
while x == 'next char in string' put in new list to be returned
I'll modify the someFun function to do the opposite and let you figure out how it needs to be modified to work as you desire.
someFun2 :: String -> String
someFun2 [] = []
someFun2 [x] = [x]
someFun2 (x:y:xs)
  | x == y = []
  | otherwise = x : someFun2 (y:xs)
Example Output:
someFun2 [a,a,b,b] = []
someFun2 [a,b,b,a,b] = [a]
someFun2 [a,b,a,b,b,a,b] = [a,b,a]
someFun2 [a,b,a,b] = [a,b,a,b]
(I'd like to add at this point, that these various code snippets aren't tested as I don't have a compiler to hand, so please point out any errors so I can fix them, thanks)

There are two typical ways to get the head of a string. head, and pattern matching (x:xs).
In fact, the source for the head function shows that it is simply defined with pattern matching:
head (x:_) = x
head _ = badHead
I highly recommend you check out Learn You a Haskell # Pattern Matching. It gives this example, which might help:
tell (x:y:[]) = "The list has two elements: " ++ show x ++ " and " ++ show y
Notice how it pattern matched against (x:y:[]), meaning the list must have two elements, and no more. To match the first two elements in a longer list, just swap [] for a variable (x:y:xs)
If you choose the pattern matching approach, you will need to use recursion.
Another approach is the zip xs (drop 1 xs). This little idiom creates tuples from adjacent pairs in your list.
ghci> let xs = [1,2,3,4,5]
ghci> zip xs (drop 1 xs)
[(1,2),(2,3),(3,4),(4,5)]
You could then write a function that looks at these tuples one by one. It would also be recursive, but it could be written as a foldl or foldr.
For understanding recursion in Haskell, LYAH is again highly recommended:
Learn You a Haskell # Recursion
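As a sketch of both approaches mentioned in this answer, here is one possible reading of the question's pseudo code: collect characters from the front of the string for as long as each one equals the next. That reading is an assumption, and the names leadingRun and leadingRun' are invented for the example.

```haskell
-- Recursive version using the (x:y:rest) pattern.
leadingRun :: String -> String
leadingRun (x:y:rest)
  | x == y    = x : leadingRun (y:rest)
  | otherwise = [x]
leadingRun short = short   -- empty or single-character string

-- The same idea using the zip idiom: keep adjacent pairs while both
-- halves are equal, then put the first character back in front.
leadingRun' :: String -> String
leadingRun' []      = []
leadingRun' s@(x:_) =
  x : map snd (takeWhile (uncurry (==)) (zip s (drop 1 s)))
```

With this reading, a string such as "aabbb" yields "aa" from either version.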

What do the parentheses signify in (x:xs) when pattern matching?

When you split a list using the x:xs syntax, why is it wrapped in parentheses? What is the significance of the parentheses? Why not [x:xs] or just x:xs?

The cons cell doesn't have to be parenthesized in every context, but in most contexts it is because
Function application binds tighter than any infix operator.
Burn this into your brain in letters of fire.
Example:
length [] = 0
length (x:xs) = 1 + length xs
If parentheses were omitted the compiler would think you had an argument x followed by an ill-placed infix operator, and it would complain bitterly. On the other hand this is OK
length l = case l of [] -> 0
                     x:xs -> 1 + length xs
In this case neither x nor xs can possibly be construed as part of a function application so no parentheses are needed.
Note that the same wonderful rule function application binds tighter than any infix operator is what allows us to write length xs in 1 + length xs without any parentheses. The infix rule giveth and the infix rule taketh away.

You're simply using the cons operator :, which has low precedence. Parentheses are needed so that things stay right.
And you don't use [x:xs], because that would match a list whose only element is a list with head x and tail xs.
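To illustrate the difference between the two notations, here is a short sketch; sumList and headOfOnly are made-up names for this example.

```haskell
-- (x:xs) matches any non-empty list; the parentheses only group the
-- pattern so that it is read as a single argument.
sumList :: [Int] -> Int
sumList []     = 0
sumList (x:xs) = x + sumList xs

-- [x:_] is a different pattern: a list with exactly one element, where
-- that element is itself a non-empty list. Note the type [[a]].
headOfOnly :: [[a]] -> Maybe a
headOfOnly [x:_] = Just x
headOfOnly _     = Nothing
```

So headOfOnly ["abc"] finds 'a', while a list with two strings, or with one empty string, falls through to the second equation.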

I don't know exact answer, but I guess that is due to what can be matched in patterns. Only constructors can be matched. Constructors can be of single word or composite. Look at the next code:
data Foo = Bar | Baz Int
f :: Foo -> Int
f Bar = 1
f (Baz x) = x - 1
Single word constructors match as is. But composite constructors must be surrounded with parens in order to avoid ambiguity. If we skip parens it looks like matching against two independent arguments:
f Baz x = x - 1
So, as (:) is composite it must be in parens. Skipping parens for Bar is a kind of syntactic sugar.
UPDATE: I realized that (as sykora noted) it is a consequence of operator precedence. It clarifies my assumptions. Function application (which is just space between function and argument) has highest precedence. Others including (:) have lower precedence. So f x:xs is to be interpreted as ((:) (f x)) xs that is presumably not what we need. While f (x:xs) is interpreted as f applied to x:xs which is in turn (:) applied to x and xs.

It's to do with parsing.
Remember, the colon : is just a constructor that's written with operator syntax. So a function like
foo [] = 0
foo (x:xs) = x + foo xs
could also be written as
foo [] = 0
foo ((:) x xs) = x + foo xs
If you drop the parentheses in that last line, it becomes very hard to parse!

: is a data constructor, like any other pattern match, but written infix. The parentheses are purely there because of infix precedence; they're actually not required and can be safely omitted when precedence rules allow. For instance:
> let (_, a:_) = (1, [2, 3, 4]) in a
2
> let a:_ = "xyzzy" in a
'x'
> case [1, 2, 3] of; a:b -> a; otherwise -> 0;
1
Interestingly, that doesn't seem to work in the head of a lambda. Not sure why.
As always, the "juxtaposition" operator binds tighter than anything else, so more often than not the delimiters are necessary, but they're not actually part of the pattern match--otherwise you wouldn't be able to use patterns like (x:y:zs) instead of (x:(y:zs)).
