Pattern matching (x:_): why is the list head bound to x? - haskell

I'm struggling with Haskell. My terminology is poor, and since English is not my native language, it's hard to come up with the right searches :P I was following some Haskell tutorials/books (Learn You a Haskell, Real World Haskell, Happy Learn Haskell, also a mailing list, and some random pages), and now I'm stuck here:
head' :: [Char] -> Char
head' (x:_) = x
This function receives a String, that is, a list of Char elements, and if I apply it like this:
head' "hello"
It returns 'h', which is bound to x, and "ello" is bound to _, but that does not matter because I don't use it. I understand that the (:) function (or : used as an infix operator) receives an element, which is put at the start of a new list whose tail is the other argument:
'a' : ['b', 'c'] will return "abc"
But why, when I use ":" inside parentheses, is the first element bound to x and the rest to _? What happens here?
I read a few SO questions like this (x:xs) pattern Haskell logic and this (which is closer to answering my question, I think) What does (x:_) and [x:_] mean? but the accepted answer to this last one says:
": is a constructor for lists, which takes the head of the new list as its left argument and the tail as its right argument. If you use it as a pattern like here that means that the head of the list you match is given to the right pattern and the tail to the left."
"The head of the list is given to the right, and the tail to the left"... it really confuses me: if the head is given to "_" and the tail to "x" when used ":" on pattern matching, why x has the value of the list head?
I think maybe it's my poor English that makes this difficult for me to grasp. I would also appreciate a hint (like a specific search) instead of a direct answer.
EDIT: For another noob like me: as the accepted answer says, "abcd" is just 'a':'b':'c':'d':[]; the pattern (x:_) binds x to 'a', and the underscore means "I don't care about the rest" and receives the remaining characters. Just that :)

The list data type is defined like this:
data [a] = [] | a : [a]
That means that a list of as is either empty, or a head element attached to a tail list using the : (cons) constructor.
When you pattern match on a list, you define your function for each of these two cases. In the case of head', it just fails when given an empty list, so we only match one case: the a : [a] case.
If you call head' "hello" (as it exists right now, with the type signature included), it type-checks, because "hello" is a String, which is an alias for [Char] in Haskell. "hello" is merely syntactic sugar for the following construction:
"hello" = 'h':'e':'l':'l':'o':[]
So, when you pattern match on the list using head', you get 'h' on the left side of the first :, and the rest of the list (which we don't care about) bound to _.
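To make the binding concrete, here is a minimal sketch. The explicit empty-list clause and the helper headAndTail are additions for illustration, not part of the question's code; headAndTail simply names the tail instead of discarding it.

```haskell
-- Matching on the cons constructor (:) takes a list apart the same way
-- (:) puts it together: left operand is the head, right operand is the tail.
head' :: [Char] -> Char
head' (x:_) = x                          -- x is the head; _ ignores the tail
head' []    = error "head': empty list"  -- explicit failure for the empty case

-- Hypothetical helper: keep the tail instead of throwing it away.
headAndTail :: [a] -> Maybe (a, [a])
headAndTail (x:xs) = Just (x, xs)
headAndTail []     = Nothing

main :: IO ()
main = do
  print (head' "hello")                   -- 'h'
  print (head' ('h':'e':'l':'l':'o':[]))  -- 'h', the desugared form of the same string
  print (headAndTail "hello")             -- Just ('h',"ello")
```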

Related

Split string on multiple delimiters of any length in Haskell

I am attempting a Haskell coding challenge where, given a certain string with a prefix indicating which substrings are delimiting markers, a list needs to be built from the input.
I have already solved the problem for multiple single-length delimiters, but I am stuck with the problem where the delimiters can be any length. I use splitOneOf from Data.List.Split, but this works for character (length 1) delimiters only.
For example, given
input ";,\n1;2,3,4;10",
delimiters are ';' and ','
splitting the input on the above delivers
output [1,2,3,4,10]
The problem I'm facing has two parts:
Firstly, a single delimiter of any length, e.g.
"****\n1****2****3****4****10" should result in the list [1,2,3,4,10].
Secondly, more than one delimiter can be specified, e.g.
input "[***][||]\n1***2||3||4***10",
delimiters are "***" and "||"
splitting the input on the above delivers
output [1,2,3,4,10]
My code for retrieving the delimiter in the case of character delimiters:
import Data.List.Split (splitOn, splitOneOf) -- from the `split` package
--This gives the delimiters as a list of characters, i.e. a String.
getDelimiter :: String -> [Char]
getDelimiter text = head . splitOn "\n" $ text
--drop "[delimiters]\n" from the input
body :: String -> String
body text = drop ((length . getDelimiter $ text) + 1) text
--returns tuple with fst being the delimiters, snd the body of the input
doc :: String -> (String, String)
doc text = (getDelimiter text, body text)
--given the delimiters and the body of the input, return a list of strings
numbers :: (String, String) -> [String]
numbers (delim, rest) = splitOneOf delim rest
--input ",##\n1,2#3#4" gives output ["1","2","3","4"]
getList :: String -> [String]
getList text = numbers . doc $ text
So my question is, how do I do the processing for when the delimiters are e.g. "***" and "||"?
Any hints are welcome, especially in a functional programming context.
If you don't mind making multiple passes over the input string, you can use splitOn from Data.List.Split, and gradually split the input string using one delimiter at a time.
You can write this fairly succinctly using foldl':
import Data.List
import Data.List.Split
splitOnAnyOf :: Eq a => [[a]] -> [a] -> [[a]]
splitOnAnyOf ds xs = foldl' (\ys d -> ys >>= splitOn d) [xs] ds
Here, the accumulator for the fold operation is a list of strings, or more generally [[a]], so you have to 'lift' xs into a list, using [xs].
Then you fold over the delimiters ds - not the input string to be parsed. For each delimiter d, you split the accumulated list of strings with splitOn, and concatenate them. You could also have used concatMap, but here I arbitrarily chose to use the more general >>= (bind) operator.
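To see what the accumulator looks like after each delimiter, here is a sketch that depends only on base. The function splitOnSub is a hypothetical stand-in for splitOn from the split package (it assumes a non-empty delimiter), and splitSteps is an illustrative name; concatMap is used in place of >>=, which is equivalent for lists.

```haskell
import Data.List (foldl', isPrefixOf)

-- Hypothetical stand-in for Data.List.Split.splitOn, written against base only.
-- Assumption: the delimiter is non-empty (an empty one would never advance).
splitOnSub :: Eq a => [a] -> [a] -> [[a]]
splitOnSub delim = go []
  where
    go acc [] = [reverse acc]
    go acc rest@(c:cs)
      | delim `isPrefixOf` rest = reverse acc : go [] (drop (length delim) rest)
      | otherwise               = go (c : acc) cs

-- Same fold as above, with concatMap instead of (>>=).
splitOnAnyOf :: Eq a => [[a]] -> [a] -> [[a]]
splitOnAnyOf ds xs = foldl' (\ys d -> concatMap (splitOnSub d) ys) [xs] ds

-- scanl keeps every intermediate accumulator, one per delimiter processed.
splitSteps :: Eq a => [[a]] -> [a] -> [[[a]]]
splitSteps ds xs = scanl (\ys d -> concatMap (splitOnSub d) ys) [xs] ds

main :: IO ()
main = mapM_ print (splitSteps ["***", "||"] "1***2||3||4***10")
-- ["1***2||3||4***10"]
-- ["1","2||3||4","10"]
-- ["1","2","3","4","10"]
```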
This seems to do what is required in the OP:
*Q49228467> splitOnAnyOf [";", ","] "1;2,3,4;10"
["1","2","3","4","10"]
*Q49228467> splitOnAnyOf ["***", "||"] "1***2||3||4***10"
["1","2","3","4","10"]
Since this makes multiple passes over temporary lists, it's most likely not the fastest implementation you can make, but if you don't have too many delimiters, or extremely long lists, this may be good enough.
This problem has two kinds of solutions: the simple, and the efficient. I will not cover the efficient (because it is not simple), though I will hint at it.
But first, the part where you extract the delimiter and body parts of the input, may be simplified with Data.List.break:
delims = splitOn "/" . fst . break (== '\n') -- Presuming the delimiters are delimited with
-- a slash.
body = snd . break (== '\n')
In any way, we may reduce this problem to finding the positions of all the given patterns in a given string. (By saying "string", I do not mean the haskell String. Rather, I mean an arbitrarily long sequence (or even an infinite stream) of any symbols for which an Equality relation is defined, which is typed in Haskell as Eq a => [a]. I hope this is not too confusing.) As soon as we have the positions, we may slice the string to our hearts' content. If we want to deal with an infinite stream, we must obtain the positions incrementally, and yield the results as we go, which is a restriction that must be kept in mind. Haskell is equipped well enough to handle the stream case as well as the finite string.
A simple approach is to apply isPrefixOf to the string, for each of the patterns.
If one of them matches, we replace it with a Nothing.
Otherwise we wrap the first symbol in Just and move to the next position.
Thus, we will have replaced all the different delimiters by a single one: Nothing. We may then readily slice the string by it.
This is fairly idiomatic, and I will bring the code to your judgement shortly. The problem with this approach is that it is inefficient: in fact, if a pattern failed to match, we would rather advance by more than one symbol.
It would be more efficient to base our work on the research that has been made into finding patterns in a string; this problem is well known and there are great, intricate algorithms that solve it an order of magnitude faster. These algorithms are designed to work with a single pattern, so some work must be put into adapting them to our case; however, I believe they are adaptable. The simplest and oldest of such algorithms is KMP, and it is already encoded in Haskell. You may wish to take up arms and generalize it: a quick path to some amount of fame.
Here is the code:
module SplitSubstr where
-- stackoverflow.com/questions/49228467
import Data.List (unfoldr, isPrefixOf, elemIndex)
import Data.List.Split (splitWhen) -- Package `split`.
import Data.Maybe (catMaybes, isNothing)
-- | Split a (possibly infinite) string at the occurrences of any of the given delimiters.
--
-- λ take 10 $ splitOnSubstrs ["||", "***"] "la||la***fa"
-- ["la","la","fa"]
--
-- λ take 10 $ splitOnSubstrs ["||", "***"] (cycle "la||la***fa||")
-- ["la","la","fa","la","la","fa","la","la","fa","la"]
--
splitOnSubstrs :: [String] -> String -> [String]
splitOnSubstrs delims
    = fmap catMaybes      -- At this point, there will be only `Just` elements left.
    . splitWhen isNothing -- Now we may split at nothings.
    . unfoldr f           -- Replace the occurrences of delimiters with a `Nothing`.
  where
    -- | This is the base case. It will terminate the `unfoldr` process.
    f [ ] = Nothing
    -- | This is the recursive case. It is divided into 2 cases:
    -- * One of the delimiters may match. We will then replace it with a Nothing.
    -- * Otherwise, we will `Just` return the current element.
    --
    -- Notice that, if there are several patterns that match at this point, we will use the first one.
    -- You may sort the patterns by length to always match the longest or the shortest. If you desire
    -- more complicated behaviour, you must plug a more involved logic here. In any way, the index
    -- should point to one of the patterns that matched.
    --
    --                   vvvvvvvvvvvvvv
    f body@(x:xs) = case elemIndex True $ (`isPrefixOf` body) <$> delims of
        Just index -> return (Nothing, drop (length $ delims !! index) body)
        Nothing    -> return (Just x, xs)
It might happen that you will not find this code straightforward. Specifically, the unfoldr part is somewhat dense, so I will add a few words about it.
unfoldr f is an embodiment of a recursion scheme. f is a function that may chip a part from the body: f :: (body -> Maybe (chip, body)).
As long as it keeps chipping, unfoldr keeps applying it to the body. This is called the recursive case.
Once it fails (returning Nothing), unfoldr stops and hands you all the chips it thus collected. This is called the base case.
In our case, f takes symbols from the string, and fails once the string is empty.
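If unfoldr is new to you, a smaller example may help before reading the splitting code. The sketch below is unrelated to string splitting; digitsRev and chip are names made up for illustration, and the function assumes a non-negative input.

```haskell
import Data.List (unfoldr)

-- Chip the last decimal digit off a number until nothing is left.
-- Assumption: the input is non-negative.
digitsRev :: Int -> [Int]
digitsRev = unfoldr chip
  where
    chip 0 = Nothing                        -- base case: stop unfolding
    chip n = Just (n `mod` 10, n `div` 10)  -- recursive case: (chip, remaining body)

main :: IO ()
main = do
  print (digitsRev 1234)                              -- [4,3,2,1]
  -- A step function that never returns Nothing yields an infinite list,
  -- which laziness lets us consume partially.
  print (take 5 (unfoldr (\n -> Just (n, n + 1)) 0))  -- [0,1,2,3,4]
```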
That's it. I hope you send me a postcard when you receive a Turing award for a fast splitting algorithm.

Capitalize Every Other Letter in a String -- take / drop versus head / tail for Lists

I have spent the past afternoon or two poking at my computer as if I had never seen one before. Today's topic: lists.
The exercise is to take a string and capitalize every other letter. I did not get very far...
Let's take a list x = String.toList "abcde" and try to analyze it. If we add the results of take 1 and drop 1 we get back the original list
> x = String.toList "abcde"
['a','b','c','d','e'] : List Char
> (List.take 1 x) ++ (List.drop 1 x)
['a','b','c','d','e'] : List Char
I thought head and tail did the same thing, but I get a big error message:
> [List.head x] ++ (List.tail x)
==================================== ERRORS ====================================
-- TYPE MISMATCH --------------------------------------------- repl-temp-000.elm
The right argument of (++) is causing a type mismatch.
7│ [List.head x] ++ (List.tail x)
^^^^^^^^^^^
(++) is expecting the right argument to be a:
List (Maybe Char)
But the right argument is:
Maybe (List Char)
Hint: I always figure out the type of the left argument first and if it is
acceptable on its own, I assume it is "correct" in subsequent checks. So the
problem may actually be in how the left and right arguments interact.
The error message tells me a lot of what's wrong. Not 100% sure how I would fix it. The list joining operator ++ is expecting [Maybe Char] and instead got Maybe [Char]
Let's just try to capitalize the first letter in a string (which is less cool, but actually realistic).
[String.toUpper ( List.head x)] ++ (List.drop 1 x)
This is wrong since String.toUpper requires a String, whereas List.head x is a Maybe Char.
[Char.toUpper ( List.head x)] ++ (List.drop 1 x)
This is also wrong, since Char.toUpper requires a Char instead of a Maybe Char.
In real life a user could break a script like this by typing a non-Unicode character (like an emoji). So maybe Elm's feedback is right. This should be an easy problem: it takes "abcde" and turns it into "AbCdE" (or possibly "aBcDe"). How do I handle errors properly?
The same question in JavaScript: How do I make the first letter of a string uppercase in JavaScript?
In Elm, List.head and List.tail both return a Maybe type because either function could be passed an invalid value; specifically, the empty list. Some languages, like Haskell, throw an error when passing an empty list to head or tail, but Elm tries to eliminate as many runtime errors as possible.
Because of this, you must explicitly handle the exceptional case of the empty list if you choose to use head or tail.
Note: There are probably better ways to achieve your end goal of string mixed capitalization, but I'll focus on the head and tail issue because it's a good learning tool.
Since you're using the concatenation operator, ++, you'll need a List for both arguments, so it's safe to say you could create a couple functions that handle the Maybe return values and translate them to an empty list, which would allow you to use your concatenation operator.
myHead list =
    case List.head list of
        Just h -> [h]
        Nothing -> []
myTail list =
    case List.tail list of
        Just t -> t
        Nothing -> []
Using the case statements above, you can handle all possible outcomes and map them to something usable for your circumstances. Now you can swap myHead and myTail into your code and you should be all set.
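For comparison with the Haskell threads on this page, here is a sketch of the same exercise in Haskell rather than Elm. It sidesteps head and tail by pattern matching on the list directly, so the empty case is handled without unwrapping a Maybe. The function names are made up for illustration.

```haskell
import Data.Char (toUpper)

-- Capitalize only the first letter; the empty string is its own clause.
capitalizeFirst :: String -> String
capitalizeFirst []     = []
capitalizeFirst (c:cs) = toUpper c : cs

-- Alternate between toUpper and id along the string.
capitalizeEveryOther :: String -> String
capitalizeEveryOther = zipWith ($) (cycle [toUpper, id])

main :: IO ()
main = do
  putStrLn (capitalizeFirst "abcde")       -- Abcde
  putStrLn (capitalizeEveryOther "abcde")  -- AbCdE
```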

Does a function in Haskell always evaluate its return value?

I'm trying to better understand Haskell's laziness, such as when it evaluates an argument to a function.
From this source:
But when a call to const is evaluated (that’s the situation we are interested in, here, after all), its return value is evaluated too ... This is a good general principle: a function obviously is strict in its return value, because when a function application needs to be evaluated, it needs to evaluate, in the body of the function, what gets returned. Starting from there, you can know what must be evaluated by looking at what the return value depends on invariably. Your function will be strict in these arguments, and lazy in the others.
So a function in Haskell always evaluates its own return value? If I have:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
head (foo [1..]) -- = 4
According to the above paragraph, map (* 2) xs, must be evaluated. Intuitively, I would think that means applying the map to the entire list- resulting in an infinite loop.
But, I can successfully take the head of the result. I know that : is lazy in Haskell, so does this mean that evaluating map (* 2) xs just means constructing something else that isn't fully evaluated yet?
What does it mean to evaluate a function applied to an infinite list? If the return value of a function is always evaluated when the function is evaluated, can a function ever actually return a thunk?
Edit:
bar x y = x
var = bar (product [1..]) 1
This code doesn't hang. When I create var, does it not evaluate its body? Or does it set bar to product [1..] and not evaluate that? If the latter, bar is not returning its body in WHNF, right, so did it really 'evaluate' x? How could bar be strict in x if it doesn't hang on computing product [1..]?
First of all, Haskell does not specify when evaluation happens so the question can only be given a definite answer for specific implementations.
The following is true for all non-parallel implementations that I know of, like ghc, hbc, nhc, hugs, etc (all G-machine based, btw).
BTW, something to remember is that when you hear "evaluate" for Haskell it normally means "evaluate to WHNF".
Unlike strict languages you have to distinguish between two "callers" of a function, the first is where the call occurs lexically, and the second is where the value is demanded. For a strict language these two always coincide, but not for a lazy language.
Let's take your example and complicate it a little:
foo [] = []
foo (_:xs) = map (* 2) xs
bar x = (foo [1..], x)
main = print (head (fst (bar 42)))
The foo function occurs in bar. Evaluating bar will return a pair, and the first component of the pair is a thunk corresponding to foo [1..]. So bar is what would be the caller in a strict language, but in the case of a lazy language it doesn't call foo at all, instead it just builds the closure.
Now, in the main function we actually need the value of head (fst (bar 42)) since we have to print it. So the head function will actually be called. The head function is defined by pattern matching, so it needs the value of the argument. So fst is called. It too is defined by pattern matching and needs its argument so bar is called, and bar will return a pair, and fst will evaluate and return its first component. And now finally foo is "called"; and by called I mean that the thunk is evaluated (entered as it's sometimes called in TIM terminology), because the value is needed. The only reason the actual code for foo is called is that we want a value. So foo had better return a value (i.e., a WHNF). The foo function will evaluate its argument and end up in the second branch. Here it will tail call into the code for map. The map function is defined by pattern match and it will evaluate its argument, which is a cons. So map will return the following {(*2) y} : {map (*2) ys}, where I have used {} to indicate a closure being built. So as you can see map just returns a cons cell with the head being a closure and the tail being a closure.
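A runnable sketch of this point: the pair built by bar holds closures, and only what is demanded gets evaluated. Passing undefined as the second component is an addition for illustration, as are the type signatures; it shows that x is never forced.

```haskell
foo :: Num a => [a] -> [a]
foo []     = []
foo (_:xs) = map (* 2) xs

-- Builds a pair of closures; neither component is evaluated here.
bar :: a -> ([Integer], a)
bar x = (foo [1 ..], x)

main :: IO ()
main = do
  -- Only the first component is demanded, so `undefined` is never forced.
  print (head (fst (bar undefined)))  -- 4
  -- Demanding three elements forces exactly three cons cells of the infinite list.
  print (take 3 (fst (bar 'x')))      -- [4,6,8]
```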
To understand the operational semantics of Haskell better I suggest you look at some paper describing how to translate Haskell to some abstract machine, like the G-machine.
I always found that the term "evaluate," which I had learned in other contexts (e.g., Scheme programming), always got me all confused when I tried to apply it to Haskell, and that I made a breakthrough when I started to think of Haskell in terms of forcing expressions instead of "evaluating" them. Some key differences:
"Evaluation," as I learned the term before, strongly connotes mapping expressions to values that are themselves not expressions. (One common technical term here is "denotations.")
In Haskell, the process of forcing is IMHO most easily understood as expression rewriting. You start with an expression, and you repeatedly rewrite it according to certain rules until you get an equivalent expression that satisfies a certain property.
In Haskell the "certain property" has the unfriendly name weak head normal form ("WHNF"), which really just means that the expression is either a nullary data constructor or an application of a data constructor.
Let's translate that to a very rough set of informal rules. To force an expression expr:
If expr is a nullary constructor or a constructor application, the result of forcing it is expr itself. (It's already in WHNF.)
If expr is a function application f arg, then the result of forcing it is obtained this way:
Find the definition of f.
Can you pattern match this definition against the expression arg? If not, then force arg and try again with the result of that.
Substitute the pattern match variables in the body of f with the parts of (the possibly rewritten) arg that correspond to them, and force the resulting expression.
One way of thinking of this is that when you force an expression, you're trying to rewrite it minimally to reduce it to an equivalent expression in WHNF.
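The standard function seq forces its first argument to WHNF and then returns its second, so it can be used to check these rules. In the sketch below, the undefined fields are additions for illustration: if forcing went any deeper than WHNF, the program would crash or loop instead of printing.

```haskell
main :: IO ()
main = do
  -- A constructor application is already in WHNF: neither field is touched.
  putStrLn ((undefined : undefined :: [Int]) `seq` "cons cell: already in WHNF")
  -- Just is a constructor too; its argument stays unevaluated.
  putStrLn (Just (undefined :: Int) `seq` "Just _: already in WHNF")
  -- A function application is rewritten only until a constructor is on top,
  -- so forcing map over an infinite list terminates.
  putStrLn (map (+ 1) ([1 ..] :: [Int]) `seq` "map over [1..]: forced to WHNF only")
```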
Let's apply this to your example:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
-- We want to force this expression:
head (foo [1..])
We will need definitions for head and map:
head [] = undefined
head (x:_) = x
map _ [] = []
map f (x:xs) = f x : map f xs
-- Not real code, but a rule we'll be using for forcing infinite ranges.
[n..] ==> n : [(n+1)..]
So now:
head (foo [1..]) ==> head (foo (1 : [2..])) -- using the forcing rule for [n..]
==> head (map (*2) [2..]) -- using the definition of foo
==> head (map (*2) (2 : [3..])) -- using the forcing rule for [n..]
==> head (2*2 : map (*2) [3..]) -- using the definition of map
==> 2*2 -- using the definition of head
==> 4 -- using the definition of *
I believe the idea must be that in a lazy language if you're evaluating a function application, it must be because you need the result of the application for something. So whatever reason caused the function application to be reduced in the first place is going to continue to need to reduce the returned result. If we didn't need the function's result we wouldn't be evaluating the call in the first place, the whole application would be left as a thunk.
A key point is that the standard "lazy evaluation" order is demand-driven. You only evaluate what you need. Evaluating more risks violating the language spec's definition of "non-strict semantics" and looping or failing for some programs that should be able to terminate; lazy evaluation has the interesting property that if any evaluation order can cause a particular program to terminate, so can lazy evaluation. [1]
But if we only evaluate what we need, what does "need" mean? Generally it means either
a pattern match needs to know what constructor a particular value is (e.g. I can't know what branch to take in your definition of foo without knowing whether the argument is [] or _:xs)
a primitive operation needs to know the entire value (e.g. the arithmetic circuits in the CPU can't add or compare thunks; I need to fully evaluate two Int values to call such operations)
the outer driver that executes the main IO action needs to know what the next thing to execute is
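The first two kinds of "need" can be observed directly. In this sketch the undefined values are additions for illustration: functions that only pattern match on constructors never touch them, while arithmetic would have to.

```haskell
main :: IO ()
main = do
  -- length pattern matches on the list spine only; the elements are never needed.
  print (length [undefined, undefined :: Int])  -- 2
  -- null needs only the outermost constructor; the tail is never inspected.
  print (null ((1 :: Int) : undefined))         -- False
  -- Addition is primitive on Int, so every element must be fully evaluated.
  print (sum [1, 2, 3 :: Int])                  -- 6
```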
So say we've got this program:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
main :: IO ()
main = print (head (foo [1..]))
To execute main, the IO driver has to evaluate the thunk print (head (foo [1..])) to work out that it's print applied to the thunk head (foo [1..]). print needs to evaluate its argument in order to print it, so now we need to evaluate that thunk.
head starts by pattern matching its argument, so now we need to evaluate foo [1..], but only to WHNF - just enough to tell whether the outermost list constructor is [] or :.
foo starts by pattern matching on its argument. So we need to evaluate [1..], also only to WHNF. That's basically 1 : [2..], which is enough to see which branch to take in foo. [2]
The : case of foo (with xs bound to the thunk [2..]) evaluates to the thunk map (*2) [2..].
So foo is evaluated, and didn't evaluate its body. However, we only did that because head was pattern matching to see if we had [] or x : _. We still don't know that, so we must immediately continue to evaluate the result of foo.
This is what the article means when it says functions are strict in their result. Given that a call to foo is evaluated at all, its result will also be evaluated (and so, anything needed to evaluate the result will also be evaluated).
But how far it needs to be evaluated depends on the calling context. head is only pattern matching on the result of foo, so it only needs a result to WHNF. We can get an infinite list to WHNF (we already did so, with 1 : [2..]), so we don't necessarily get in an infinite loop when evaluating a call to foo. But if head were some sort of primitive operation implemented outside of Haskell that needed to be passed a completely evaluated list, then we'd be evaluating foo [1..] completely, and thus would never finish in order to come back to head.
So, just to complete my example, we're evaluating map (2 *) [2..].
map pattern matches its second argument, so we need to evaluate [2..] as far as 2 : [3..]. That's enough for map to return the thunk (2 *) 2 : map (2 *) [3..], which is in WHNF. And so it's done, we can finally return to head.
head ((2 *) 2 : map (2 *) [3..]) doesn't need to inspect either side of the :, it just needs to know that there is one so it can return the left side. So it just returns the unevaluated thunk (2 *) 2.
Again though, we only evaluated the call to head this far because print needed to know what its result is, so although head doesn't evaluate its result, its result is always evaluated whenever the call to head is.
(2 *) 2 evaluates to 4, print converts that into the string "4" (via show), and the line gets printed to the output. That was the entire main IO action, so the program is done.
[1] Implementations of Haskell, such as GHC, do not always use "standard lazy evaluation", and the language spec does not require it. If the compiler can prove that something will always be needed, or cannot loop/error, then it's safe to evaluate it even when lazy evaluation wouldn't (yet) do so. This can often be faster so GHC optimizations do actually do this.
[2] I'm skipping over a few details here, like that print does have some non-primitive implementation we could step inside and lazily evaluate, and that [1..] could be further expanded to the functions that actually implement that syntax.
Not necessarily. Haskell is lazy, meaning that it only evaluates when it needs to. This has some interesting effects. If we take the below code, for example:
-- File: lazinessTest.hs
(>?) :: a -> b -> b
a >? b = b
main = (putStrLn "Something") >? (putStrLn "Something else")
This is the output of the program:
$ ./lazinessTest
Something else
This indicates that putStrLn "Something" is never evaluated. But it's still being passed to the function, in the form of a 'thunk'. These 'thunks' are unevaluated values that, rather than being concrete values, are like a breadcrumb-trail of how to compute the value. This is how Haskell laziness works.
In our case, two 'thunks' are passed to >?, but only one is passed out, meaning that only one is evaluated in the end. This also applies to const, where the second argument can be safely ignored, and therefore is never computed. As for map, lazy evaluation only computes what is demanded, so the rest of the list is never touched; in your case only the second element of the original list is needed.
However, it's best to leave the thinking about laziness to the compiler and keep coding, unless you're dealing with IO, in which case you really, really should think about laziness, because you can easily go wrong, as I've just demonstrated.
There are lots and lots of online articles on the Haskell wiki to look at, if you want more detail.
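The same effect can be shown with pure values, where there is no IO to confuse matters. In this sketch the ignored arguments are deliberately things that would crash or loop if evaluated; they are additions for illustration.

```haskell
(>?) :: a -> b -> b
_ >? b = b

main :: IO ()
main = do
  -- const ignores its second argument, so undefined is never evaluated.
  print (const 1 (undefined :: Int))           -- 1
  -- (>?) ignores its first argument in the same way.
  print ((undefined :: Int) >? "kept")         -- "kept"
  -- The second component would loop forever if anything demanded it.
  print (fst (42, product [1 ..] :: Integer))  -- 42
```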
A function application can evaluate to a value of its return type:
head (x:_) = x
or exception/error:
head _ = error "Head: List is empty!"
or bottom (⊥)
a = a
b = last [1 ..]

How to access nth element in a Haskell tuple

I have this:
get3th (_,_,a,_,_,_) = a
which works fine in GHCi, but I want to compile it with GHC and it gives an error. If I want to write a function to get the nth element of a tuple and be able to run it in GHC, what should I do?
My whole program is below; what should I do with it?
get3th (_,_,a,_,_,_) = a
main = do
  mytuple <- getLine
  print $ get3th mytuple
Your problem is that getLine gives you a String, but you want a tuple of some kind. You can fix your problem by converting the String to a tuple – for example by using the built-in read function. The third line here tries to parse the String into a six-tuple of Ints.
main = do
  mystring <- getLine
  let mytuple = read mystring :: (Int, Int, Int, Int, Int, Int)
  print $ get3th mytuple
Note however that while this is useful for learning about types and such, you should never write this kind of code in practice. There are at least two warning signs:
You have a tuple with more than three or so elements. Such a tuple is very rarely needed and can often be replaced by a list, a vector or a custom data type. Tuples are rarely used more than temporarily to bring two kinds of data into one value. If you start using tuples often, think about whether or not you can create your own data type instead.
Using read to read a structure is not a good idea. read will explode your program with a terrible error message at any tiny little mistake, and that's usually not what you want. If you need to parse structures, it's a good idea to use a real parser. read can be enough for simple integers and such, but no more than that.
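Short of a real parser, one small step up from read is readMaybe from Text.Read (part of base), which returns Nothing on malformed input instead of crashing. The sketch below is only that, a sketch: the Six type synonym and the helper parseThird are names made up for illustration, and the inputs are hard-coded so the example runs without stdin.

```haskell
import Text.Read (readMaybe)

type Six = (Int, Int, Int, Int, Int, Int)

get3th :: (a, b, c, d, e, f) -> c
get3th (_, _, a, _, _, _) = a

-- Hypothetical helper: parse the whole tuple, then project the third field.
parseThird :: String -> Maybe Int
parseThird s = fmap get3th (readMaybe s :: Maybe Six)

main :: IO ()
main = mapM_ (print . parseThird) ["(1,2,3,4,5,6)", "(1,2,3)", "oops"]
-- Just 3
-- Nothing
-- Nothing
```

In the original program, the same helper could be applied to the result of getLine, with a case expression deciding what to print when parsing fails.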
The type of getLine is IO String, so your program won't type check because you are supplying a String instead of a tuple.
Your program will work if a proper argument is supplied, e.g.:
main = do
  print $ get3th (1, 2, 3, 4, 5, 6)
It seems to me that your confusion is between tuples and lists. That is an understandable confusion when you first meet Haskell, as many other languages have only one similar construct. Tuples use round parens: (1,2). A tuple with n values in it is a type, and each value can be of a different type, which results in a different tuple type. So (Int, Int) is a different type from (Int, Float), though both are two-tuples. There are some functions in the Prelude which are polymorphic over two-tuples, e.g. fst :: (a,b) -> a, which takes the first element. fst is easy to define using pattern matching, like your own function:
fst (a,b) = a
Note that fst (1,2) evaluates to 1, but fst (1,2,3) is ill-typed and won't compile.
Now, lists, on the other hand, can be of any length, including zero, and still be the same type; but each element must be of the same type. Lists use square brackets: [1,2,3]. The type of a list with elements of type a is written [a]. Lists are constructed by prepending values onto the empty list [], so a list with one element can be written [x], but this is syntactic sugar for x:[], where : is the cons operator, which prepends a value to the head of the list. Just as tuples can be pattern matched, you can use the empty list and the cons operator to pattern match:
head :: [a] -> a
head (x:xs) = x
The pattern match means x is of type a and xs is of type [a], and it is the former we want for head. (This is a prelude function and there is an analogous function tail.)
Note that head is a partial function as we cannot define what it does in the case of the empty list. Calling it on an empty list will result in a runtime error as you can check for yourself in GHCi. A safer option is to use the Maybe type.
safeHead :: [a] -> Maybe a
safeHead (x:xs) = Just x
safeHead [] = Nothing
String in Haskell is simply a synonym for [Char]. So all of these list functions can be used on strings, and getLine returns a String.
Now, in your case you want the 3rd element. There are a couple of ways you could do this, you could call tail a few times then call head, or you could pattern match like (a:b:c:xs). But there is another utility function in the prelude, (!!) which gets the nth element. (Writing this function is a very good beginner exercise). So your program can be written
main = do
  myString <- getLine
  print $ myString !! 2 --zero indexed
Testing gives
Prelude> main
test
's'
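As for the beginner exercise mentioned above, one possible sketch follows. Unlike the Prelude's (!!), which fails at runtime on a bad index, this version returns a Maybe, in the same spirit as safeHead. The name nth is made up for illustration.

```haskell
-- Hypothetical safe counterpart of (!!), zero-indexed.
nth :: Int -> [a] -> Maybe a
nth _ []     = Nothing           -- ran off the end of the list
nth n (x:xs)
  | n < 0     = Nothing          -- negative indices are rejected
  | n == 0    = Just x           -- found it
  | otherwise = nth (n - 1) xs   -- step one cons cell down the list

main :: IO ()
main = do
  print (nth 2 "test")   -- Just 's'
  print (nth 10 "test")  -- Nothing
```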
So remember, tuples use () and are strictly of a given length, but can have members of different types; whereas lists use [], can be any length, but each element must be the same type. And Strings are really lists of characters.
EDIT
As an aside, I thought I'd mention that there is a neater way of writing this main function if you are interested.
main = getLine >>= print . (!! 2)

Does there exist something like (xs:x)

I'm new to Haskell. I know I can create a reverse function by doing this:
reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = (Main.reverse xs) ++ [x]
Is there such a thing as (xs:x) (a list concatenated with an element, i.e. x is the last element in the list) so that I put the last list element at the front of the list?
rotate :: [a] -> [a]
rotate [] = []
rotate (xs:x) = [x] ++ xs
I get these errors when I try to compile a program containing this function:
Occurs check: cannot construct the infinite type: a = [a]
When generalising the type(s) for `rotate'
I'm also new to Haskell, so my answer is not authoritative. Anyway, I would do it using last and init:
Prelude> last [1..10] : init [1..10]
[10,1,2,3,4,5,6,7,8,9]
or
Prelude> [ last [1..10] ] ++ init [1..10]
[10,1,2,3,4,5,6,7,8,9]
The short answer is: this is not possible with pattern matching, you have to use a function.
The long answer is: it's not in standard Haskell, but it is if you are willing to use an extension called View Patterns, and also if you have no problem with your pattern matching eventually taking longer than constant time.
The reason is that pattern matching is based on how the structure is constructed in the first place. A list is an algebraic data type, which has roughly the following structure:
data List a = Empty | Cons a (List a)
  deriving (Show) -- this is just so you can print the List
When you declare a type like that you generate three objects: a type constructor List, and two data constructors: Empty and Cons. The type constructor takes types and turns them into other types, i.e., List takes a type a and creates another type List a. The data constructor works like a function that returns something of type List a. In this case you have:
Empty :: List a
representing an empty list and
Cons :: a -> List a -> List a
which takes a value of type a and a list and prepends the value to the list (it becomes the new head), returning another list. So you can build your lists like this:
empty = Empty -- similar to []
list1 = Cons 1 Empty -- similar to 1:[] = [1]
list2 = Cons 2 list1 -- similar to 2:(1:[]) = 2:[1] = [2,1]
This is more or less how lists work, but in the place of Empty you have [] and in the place of Cons you have (:). When you type something like [1,2,3] this is just syntactic sugar for 1:2:3:[] or Cons 1 (Cons 2 (Cons 3 Empty)).
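To make the correspondence concrete, here is a minimal sketch that converts between this hand-rolled List and the built-in list. The function names toBuiltin and fromBuiltin are made up for this example.

```haskell
data List a = Empty | Cons a (List a)
  deriving (Show)

-- Convert to the built-in list: Empty becomes [], Cons becomes (:).
toBuiltin :: List a -> [a]
toBuiltin Empty       = []
toBuiltin (Cons x xs) = x : toBuiltin xs

-- Convert from the built-in list: replace every (:) with Cons and [] with Empty.
fromBuiltin :: [a] -> List a
fromBuiltin = foldr Cons Empty
```

Both directions just swap one constructor for the other, which is why pattern matching on (x:xs) behaves like pattern matching on Cons x xs.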
When you do pattern matching, you are "de-constructing" the type. Having knowledge of how the type is structured allows you to uniquely disassemble it. Consider the function:
head :: List a -> a
head Empty = error "the empty list has no head"
head (Cons x xs) = x
What happens in the pattern matching is that the data constructor is matched against the structure you give. If it matches Empty, then you have an empty list. If it matches Cons x xs, then x must have type a and must be the head of the list, and xs must have type List a and be the tail of the list, because that's the type of the data constructor:
Cons :: a -> List a -> List a
If Cons x xs is of type List a then x must be a and xs must be List a. The same is true for (x:xs). If you look at the type of (:) in GHCi:
> :t (:)
(:) :: a -> [a] -> [a]
So, if (x:xs) is of type [a], x must be a and xs must be [a]. The error message you get when you try to do (xs:x) and then treat xs like a list is exactly because of this. By your use of (:) the compiler infers that xs has type a, and by your use of ++ it infers that xs must be [a]. Then it freaks out because there's no finite type a for which a = [a]; this is what the compiler is trying to tell you with that error message.
If you need to disassemble the structure in other ways that don't match the way the data constructor builds the structure, then you have to write your own function. There are two functions in the standard library that do what you want: last returns the last element of a list, and init returns all but the last element of the list.
But note that pattern matching happens in constant time. To find out the head and the tail of a list, it doesn't matter how long the list is, you just have to look to the outermost data constructor. Finding the last element is O(N): you have to dig until you find the innermost Cons or the innermost (:), and this requires you to "peel" the structure N times, where N is the size of the list.
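The difference in cost can be seen in a rough sketch of both operations written by hand. The names firstOf and lastOf are invented here, and the Maybe result is a choice made for this example.

```haskell
-- The head is available from the outermost constructor: one step.
firstOf :: [a] -> Maybe a
firstOf (x:_) = Just x
firstOf []    = Nothing

-- The last element requires peeling off one (:) per element: N steps.
lastOf :: [a] -> Maybe a
lastOf []     = Nothing
lastOf [x]    = Just x
lastOf (_:xs) = lastOf xs
```

firstOf never recurses, while lastOf calls itself once for every element before the last one.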
If you frequently have to look for the last element in long lists, you might consider if using a list is a good idea after all. You can go after Data.Sequence (constant time access to first and last elements), Data.Map (log(N) time access to any element if you know its key), Data.Array (constant time access to an element if you know its index), Data.Vector or other data structures that match your needs better than lists.
Ok. That was the short answer (:P). For the long one you'll have to look things up a bit by yourself, but here's an intro.
You can have this working with a syntax very close to pattern matching by using view patterns. View Patterns are an extension that you can use by having this as the first line of your code:
{-# LANGUAGE ViewPatterns #-}
The instructions of how to use it are here: http://hackage.haskell.org/trac/ghc/wiki/ViewPatterns
With view patterns you could do something like:
view :: [a] -> (a, [a])
view xs = (last xs, init xs)
someFunction :: [a] -> ...
someFunction (view -> (x,xs)) = ...
Then x and xs will be bound to the last and the init of the list you provide to someFunction. Syntactically it feels like pattern matching, but it is really just applying last and init to the given list.
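Filling in the elided parts, a complete version of this idea applied to the rotate function from the question might look like the sketch below. It assumes GHC with the ViewPatterns extension, and the empty-list clause is an addition here, needed because last and init fail on [].

```haskell
{-# LANGUAGE ViewPatterns #-}

-- Split a non-empty list into its last element and everything before it.
-- Only safe on non-empty lists, since last and init fail on [].
view :: [a] -> (a, [a])
view xs = (last xs, init xs)

-- Move the last element to the front.
rotate :: [a] -> [a]
rotate []                = []      -- handle [] before the view is applied
rotate (view -> (x, xs)) = x : xs
```

Each call still walks the whole list (once for last, once for init), so this is O(N), not constant time.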
If you're willing to use something different from plain lists, you could have a look at the Seq type in the containers package (module Data.Sequence). This has O(1) cons (element at the front) and snoc (element at the back), and allows pattern matching on the element at the front and at the back through the use of views.
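A minimal sketch of that approach, assuming the containers package is available (it ships with GHC). The function names rotateSeq and rotateViaSeq are made up for this example; viewr, ViewR, (<|) and fromList are from Data.Sequence.

```haskell
import Data.Sequence (Seq, ViewR(..), viewr, (<|))
import qualified Data.Sequence as Seq
import Data.Foldable (toList)

-- Move the last element to the front, inspecting the right end via viewr.
rotateSeq :: Seq a -> Seq a
rotateSeq s = case viewr s of
  EmptyR    -> s           -- nothing to rotate
  rest :> x -> x <| rest   -- x is the last element, rest is everything before it

-- Convenience wrapper for plain lists (the conversions themselves are O(N)).
rotateViaSeq :: [a] -> [a]
rotateViaSeq = toList . rotateSeq . Seq.fromList
```

The wrapper is only for trying it out; the benefit appears when the data stays in a Seq across many operations.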
"Is there such a thing as (xs:x) (a list concatenated with an element, i.e. x is the last element in the list) so that I put the last list element at the front of the list?"
No, not in the sense that you mean. These "patterns" on the left-hand side of a function definition are a reflection of how a data structure is defined by the programmer and stored in memory. Haskell's built-in list implementation is a singly-linked list, ordered from the beginning - so the pattern available for function definitions reflects exactly that, exposing the very first element plus the rest of the list (or alternatively, the empty list).
For a list constructed in this way, the last element is not immediately available as one of the stored components of the list's top-most node. So instead of that value being present in the pattern on the left-hand side of the function definition, it's calculated by the function body on the right-hand side.
Of course, you can define new data structures, so if you want a new list that makes the last element available through pattern matching, you could build that. But there'd be some cost: maybe you'd just be storing the list backwards, so that it's now the first element which is not available by pattern matching and requires computation. Or maybe you'd be storing both the first and last value in the structure, which would require additional storage space and bookkeeping.
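The "stored backwards" option could be sketched like this. The type SnocList and all function names are hypothetical, introduced only to illustrate the trade-off.

```haskell
-- A list built from the back: the outermost constructor holds the last element.
data SnocList a = Nil | Snoc (SnocList a) a
  deriving (Show)

-- The last element is now available by pattern matching in one step...
lastS :: SnocList a -> Maybe a
lastS Nil        = Nothing
lastS (Snoc _ x) = Just x

-- ...but the first element is the one that needs a walk through the structure.
firstS :: SnocList a -> Maybe a
firstS Nil          = Nothing
firstS (Snoc Nil x) = Just x
firstS (Snoc xs _)  = firstS xs

-- Build a SnocList from an ordinary list, preserving element order.
fromList :: [a] -> SnocList a
fromList = foldl Snoc Nil
```

Here the pattern (Snoc xs x) plays the role the question wanted from (xs:x), at the price of making the first element the expensive one.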
It's perfectly reasonable to think about multiple implementations of a single data structure concept - to look forward a little bit, this is one use of Haskell's class/instance definitions.
Reversing as you suggested might be much less efficient. last is not an O(1) operation but an O(N) one, and that means that rotating as you suggested becomes an O(N^2) algorithm.
Source:
http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/src/GHC-List.html#last
Your first version looks like it has O(N) complexity. Well, it does not, because ++ is also an O(N) operation, so the whole thing is O(N^2).
You should write the reversal with an accumulator instead, like this:
reverse' l = rev l []
  where
    rev [] a = a
    rev (x:xs) a = rev xs (x:a)
source : http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/src/GHC-List.html#reverse
In your latter example, x is in fact a list. [x] becomes a list of lists, e.g. [[1,2], [3,4]].
(++) wants a list of the same type on both sides. When you are using it, you're doing [[a]] ++ [a] which is why the compiler is complaining. According to your code a would be the same type as [a], which is impossible.
In (x:xs), x is the first item of the list (the head) and xs is everything but the head, i.e., the tail. The names are irrelevant here, you might as well call them (head:tail).
If you really want to take the last item of the input list and put that in the front of the result list, you could do something like:
rotate :: [a] -> [a]
rotate [] = []
rotate lst = last lst : init lst
N.B. I haven't tested this code at all as I don't have a Haskell environment available at the moment.

Resources