I'm very new to Haskell and I'm trying to understand these basic lines of code. I have a main module that's very simple:
main = do
    words <- readFile "test.txt"
    putStrLn $ reverseCharacters words
where reverseCharacters is defined in another module that I have:
reverseCharacters :: String -> String
reverseCharacters x = reverse x
What I am having trouble understanding is why the $ needs to be there. I've read previous posts and looked it up and I'm still having difficulty understanding this. Any help would be greatly appreciated.
$ is an operator, just like +. What it does is treat its first argument (the expression on the left) as a function, and apply it to its second argument (the expression on the right).
So in this case putStrLn $ reverseCharacters words is equivalent to putStrLn (reverseCharacters words). It needs to be there because function application is left associative, so using no $ or parentheses like putStrLn reverseCharacters words would be equivalent to parenthesising this way (putStrLn reverseCharacters) words, which doesn't work (we can't apply putStrLn to reverseCharacters [something of type String -> String], and even if we could we can't apply the result of putStrLn to words [something of type String]).
The $ operator is just another way of explicitly "grouping" the words, an alternative to using parentheses; because it's an infix operator it forces a "split" in the expression (and because it's a very low precedence infix operator, it works even if the things on the left or right are using other infix operators).
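To make the equivalence concrete, here is a small sketch (the names viaParens and viaDollar are made up for illustration). Both definitions should mean the same thing, because ($) is declared infixr 0 in the Prelude and so binds more loosely than (++):

```haskell
-- Explicit parentheses: everything inside them is the argument of length.
viaParens :: String -> Int
viaParens s = length (reverse s ++ "!")

-- Same expression with ($): everything to its right is the argument.
viaDollar :: String -> Int
viaDollar s = length $ reverse s ++ "!"

main :: IO ()
main = do
  print (viaParens "abc")  -- prints 4
  print $ viaDollar "abc"  -- prints 4
```

Writing length reverse s ++ "!" without either form would be parsed as (length reverse s) ++ "!", which does not type check.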
I'm new to Haskell and I'm now watching the "Haskell Rank" series on YouTube, and I feel I need a better understanding of what's going on in the first video before I can move on.
He starts with the basic "solve me first" exercise on hackerrank, which already comes with the solution:
solveMeFirst a b = a + b
main = do
    val1 <- readLn
    val2 <- readLn
    let sum = solveMeFirst val1 val2
    print sum
This is clear: if I put it in a file, compile it and run it, it behaves as expected:
takes two lines of input
prints the sum of the numbers from the user input
Through several steps of explaining the role of each function, he reaches a more functional one-line solution.
He gets there by first working up to the following:
> show $ sum $ map read $ words "1 2"
3
Up until here, I understand everything, including the use of the ($) operator.
To finish, he then defines the main function, using interact to handle the input from stdin, and there are two things I really don't understand:
1 - the function composition: he then reaches the solution of:
main = interact $ show . sum . map read . words
where he grabs the previous build up, removes the input "1 2" to compose the function that will be the argument for interact. And, what he doesn't explain thoroughly is "and replace all of its dollar signs with functional composition."
Would love a detailed explanation of step-by-stepping this "inversion" process.
2 - using this one-line solution, and compiling it from a file,
how can I run the file, sending the input as args?
The conversion to composition is based on the simple definition of the . operator (shown here using $ to make the link clear).
(.) f g = \x -> f $ g x
show . sum . map read . words is the same as \x -> show $ sum $ map read $ words x.
The composed function is never explicitly applied to an argument because interact takes the function and produces an IO () value which, when executed by the runtime, will apply the function to an argument also supplied by the runtime.
The runtime supplies the input by reading from standard input.
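Spelled out step by step, the conversion might look like the sketch below. The helper readInt and the names withDollars, withParens and composed are assumptions added here so that each stage has a fixed type and can be compared; they are not from the video.

```haskell
-- Hypothetical helper: pins the element type so that read is unambiguous.
readInt :: String -> Int
readInt = read

-- Step 1: the original chain, with the input turned into a parameter.
withDollars :: String -> String
withDollars input = show $ sum $ map readInt $ words input

-- Step 2: each ($) is just right-nested application.
withParens :: String -> String
withParens input = show (sum (map readInt (words input)))

-- Step 3: f (g (h x)) is (f . g . h) x, so the parameter can be dropped.
composed :: String -> String
composed = show . sum . map readInt . words

main :: IO ()
main = interact composed
```

Because interact feeds the whole of standard input to the function, the compiled program takes its input on stdin rather than as command-line arguments. Piping it in, for example with echo "1 2" | ./solve (where solve stands for whatever the executable is called), should print 3.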
I'm trying to make a program that reads a file line by line and checks whether each line is a palindrome; if it is, the line should be printed.
I'm really new to Haskell, so the only thing I could do is print out each line, with this code:
main :: IO()
main = do
filecontent <- readFile "palindrom.txt"
mapM_ putStrLn (lines filecontent)
isPalindrom w = w==reverse w
The thing is, I don't know how to go line by line and check whether the line is a palindrome (note that in my file, each line contains only one word). Thanks for any help.
Here is one suggested approach
main :: IO ()
main = do
    filecontent <- readFile "palindrom.txt"
    putStrLn (unlines $ filter isPalindrome $ lines filecontent)
isPalindrome w = w==reverse w
The part in parentheses is pure code: viewed as a function of filecontent, it has type String -> String. It is generally a good idea to isolate pure code as much as possible, because that code tends to be the easiest to reason about, and is often more easily reusable.
You can think of data as flowing from right to left in that section, broken apart by the ($) operators. First you split the content into separate lines, then filter only the palindromes, finally rebuild the full output as a string. Also, because Haskell is lazy, even though it looks like it is treating the input as a single String in memory, it actually is only pulling the data as needed.
Edited to add extra info....
OK, so the heart of the solution is the pure portion:
unlines $ filter isPalindrome $ lines filecontent
The way that ($) works is by taking the value of the expression on its right and passing it as the argument to the function on its left. In this case, filecontent is the full input from the file (a String, including newline characters), and the result, which goes to standard output, is also a single String including newline characters.
Let's follow sample input through this process, "abcba\n1234\nK"
unlines $ filter isPalindrome $ lines "abcba\n1234\nK"
First, lines will break this into a list of lines
unlines $ filter isPalindrome ["abcba", "1234", "K"]
Note that the output of lines is being fed into the input for filter.
So, what does filter do? Notice its type
filter :: (a -> Bool) -> [a] -> [a]
This takes two input parameters: the first is a function (which isPalindrome is), the second a list of items. It will test each item in the list using the function, and its output is the input list minus the items that the function rejected (returned False for). In our case, the first and third items are in fact palindromes, the second is not. Our expression evaluates as follows
unlines ["abcba", "K"]
Finally, unlines is the opposite of lines. It will concatenate the items again, appending a newline after each one.
"abcba\nK\n"
Since putStrLn expects a String, this is ready to be written to standard output.
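The walkthrough above can be checked with a small sketch in which the pure part is pulled out into its own function. The name keepPalindromes is made up for this example; the file name is the one from the question.

```haskell
isPalindrome :: String -> Bool
isPalindrome w = w == reverse w

-- The pure pipeline: split into lines, keep palindromes, join again.
keepPalindromes :: String -> String
keepPalindromes = unlines . filter isPalindrome . lines

main :: IO ()
main = do
  filecontent <- readFile "palindrom.txt"
  -- putStr rather than putStrLn: unlines already ends each line with '\n'.
  putStr (keepPalindromes filecontent)
```

With the sample input, keepPalindromes "abcba\n1234\nK" evaluates to "abcba\nK\n", and the function can be tested in GHCi without touching any file.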
Note that it is perfectly OK to output a list of Strings using non-pure functions, as follows
forM ["1", "2", "3"] $ \item -> do
    putStrLn item
This method however mixes pure and impure code, and is considered slightly less idiomatic Haskell code than the former. You will still see this type of thing a lot though!
Have a look at the filter function. You may not want to put all processing on a single line, but use a let expression. Also, your indentation is off:
main :: IO ()
main = do
    filecontent <- readFile "palindrom.txt"
    let selected = filter ... filecontent
    ...
How would I get this function to be accepted with putStr in Haskell, so that it displays each word in a list on a new line?
unlines1 :: [String] -> String
unlines1 [] = []
unlines1 (l:ls) = l ++ (putStr('\n')) : unlines ls
Let me try to be more clear in the larger space provided by an answer.
When you cause GHCi to evaluate a value, e.g.,
> "foo"
GHCi will attempt to show you that value. It does this by determining whether the type of that value is an instance of Show. If it is, GHCi prints the display string that show provides for that value. In the case of strings, show will escape non-printable characters like '\n'. This means that what GHCi actually does is more like:
> putStrLn (show "foo")
This means that
> "foo\nbar"
becomes
> putStrLn (show "foo\nbar")
which, by the definition of show for Strings, becomes
> putStrLn "\"foo\\nbar\""
with the '\n' escaped and the whole string wrapped in quotes. This is what GHCi is designed to do. You can't and shouldn't prevent it from doing so.
If, on the other hand, you want to print a String, as in perform the Haskell equivalent of echo or puts or printf, then you must use an IO action to do so. One IO action you can use is putStr :: String -> IO ().
When you evaluate
> putStr "foo"
GHCi will attempt to evaluate the IO () action and display a result. Because it is an IO action, GHCi is designed to execute (perform) the IO for you, in this case printing a string.
So the difference between
> "foo\nbar"
and
> putStr "foo\nbar"
is not that the newline is escaped in one string and unescaped in the other. The newline is always a literal newline. The issue is that the former is showing you the inspectable version of the string (with non-printables escaped) and the latter is actually printing the string.
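Applied to the unlines1 function from the question, the idea would be to build the String with pure code first and only then hand it to putStr. The definition below is one possible fix; it is an assumption about what the question intended, and is essentially a re-implementation of the Prelude's unlines.

```haskell
-- Pure: joins the lines, ending each one with a newline character.
unlines1 :: [String] -> String
unlines1 []     = []
unlines1 (l:ls) = l ++ "\n" ++ unlines1 ls

main :: IO ()
main = do
  -- print uses show, so the newlines appear escaped, inside quotes.
  print (unlines1 ["foo", "bar"])
  -- putStr writes the characters themselves: two separate lines.
  putStr (unlines1 ["foo", "bar"])
```

The same String value is used in both statements; only the way it is displayed differs.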
I am still struggling with Haskell and now I have encountered a problem with wrapping my mind around the Input/Output monad from this example:
main = do
    line <- getLine
    if null line
        then return ()
        else do
            putStrLn $ reverseWords line
            main
reverseWords :: String -> String
reverseWords = unwords . map reverse . words
I understand that because functional language like Haskell cannot be based on side effects of functions, some solution had to be invented. In this case it seems that everything has to be wrapped in a do block. I get simple examples, but in this case I really need someone's explanation:
Why isn't it enough to use one, single do block for I/O actions?
Why do you have to open completely new one in if/else case?
Also, when does the -- I don't know what to call it -- "scope" of the do monad end, i.e. when can you just use standard Haskell terms/functions?
The do block concerns anything on the same indentation level as the first statement. So in your example it's really just linking two things together:
line <- getLine
and all the rest, which happens to be rather bigger:
if null line
    then return ()
    else do
        putStrLn $ reverseWords line
        main
but no matter how complicated, the do syntax doesn't look into these expressions. So all this is exactly the same as
main :: IO ()
main = do
    line <- getLine
    recurseMain line
with the helper function
recurseMain :: String -> IO ()
recurseMain line
    | null line = return ()
    | otherwise = do
        putStrLn $ reverseWords line
        main
Now, obviously the stuff in recurseMain can't know that the function is called within a do block from main, so you need to use another do.
do doesn't actually do anything; it's just syntactic sugar for easily combining statements. A dubious analogy is to compare do to []:
If you have multiple expressions, you can combine them into a list using the (:) operator:
(1 + 2) : (3 * 4) : (5 - 6) : ...
However, this is annoying, so we can instead use [] notation, which compiles to the same thing:
[1+2, 3*4, 5-6, ...]
Similarly, if you have multiple IO statements, you can combine them using >> and >>=:
(putStrLn "What's your name?") >> getLine >>= (\name -> putStrLn $ "Hi " ++ name)
However, this is annoying, so we can instead use do notation, which compiles to the same thing:
do
    putStrLn "What's your name?"
    name <- getLine
    putStrLn $ "Hi " ++ name
Now the answer to why you need multiple do blocks is simple:
If you have multiple lists of values, you need multiple []s (even if they're nested).
If you have multiple sequences of monadic statements, you need multiple dos (even if they're nested).
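As a sketch of what this means for the program in the question, here is its main written once with do and once by hand with >>= and >>. The names mainSugared and mainDesugared are made up for the comparison; the two should behave identically.

```haskell
reverseWords :: String -> String
reverseWords = unwords . map reverse . words

-- Two do blocks: one for the outer sequence, one for the else branch.
mainSugared :: IO ()
mainSugared = do
  line <- getLine
  if null line
    then return ()
    else do
      putStrLn $ reverseWords line
      mainSugared

-- The same program without do: each block becomes its own operator chain.
mainDesugared :: IO ()
mainDesugared =
  getLine >>= \line ->
    if null line
      then return ()
      else (putStrLn (reverseWords line) >> mainDesugared)

main :: IO ()
main = mainDesugared
```

The then branch is a single action, so it needs neither a do nor an operator; the else branch sequences two actions, which is exactly where the inner do (or the >>) comes in.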
I wrote a Haskell function that calculates the factorial of every number in a given list and prints it to the screen.
factPrint list =
    if null list
        then putStrLn ""
        else do putStrLn ((show.fact.head) list)
                factPrint (tail list)
The function works, but I find the third line a bit confusing.
Why hasn't the compiler(GHC) reported an error on it since there is no "do" before the "putStrLn" (quasi?)function?
If I omit "do" from the 4th line, an error pops up as expected.
I'm quite new to Haskell and its ways, so please pardon me if I said something overly silly.
do putStrLn ((show.fact.head) list)
   factPrint (tail list)
is actually another way of writing
putStrLn ((show.fact.head) list) >> factPrint (tail list)
which, in turn, means
putStrLn ((show.fact.head) list) >>= \_ -> factPrint (tail list)
The do notation is a convenient way of stringing these monadic actions together without the uglier operator syntax.
If you only have one statement inside the do, then you are not stringing anything together, and the do is redundant.
If you are new to Haskell, think of do as similar to required braces after if in a C-like language:
if (condition)
    printf("a"); // braces not required
else {
    printf("b"); // braces required
    finish();
}
do works the same way in Haskell.
Maybe it would help to look at the type of factPrint, and then refactor to use pattern matching:
factPrint :: [Int] -> IO ()
factPrint [] = putStrLn ""
factPrint list = do
    putStrLn ((show.fact.head) list)
    factPrint (tail list)
So, if factPrint returns IO (), and the type of putStrLn "" is IO (), then it's perfectly legal for factPrint [] to equal putStrLn "". No do required--in fact, you could just say factPrint [] = return () if you didn't want the trailing newline.
do is used to tie multiple monadic expressions together. It has no effect when followed by only a single expression.
For the if to be well-formed it is only necessary that the then-clause and the else-clause have the same type. Since both clauses have the type IO () this is the case.
The do keyword is used for sequencing; an if-then-else in Haskell doesn't have to contain a do at all if each branch is a single expression, e.g.
if a
    then b
    else c
You need the do in your example as you are sequencing two operations on your else branch. If you omit the do then the factPrint(tail list) statement is considered to not be part of the function and thus the compiler complains as it's encountered an unexpected statement.
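Putting the answers together, a complete version might look like the sketch below. The question never shows fact, so the definition here is an assumption, as is the choice of Integer and of return () for the empty list (which drops the trailing blank line). factPrint' is a hypothetical alternative using mapM_.

```haskell
-- Assumed definition; the original fact is not shown in the question.
fact :: Integer -> Integer
fact n = product [1 .. n]

factPrint :: [Integer] -> IO ()
factPrint []       = return ()  -- a single expression: no do needed
factPrint (x : xs) = do         -- two actions in sequence: do needed
  print (fact x)
  factPrint xs

-- Alternative without explicit recursion or do.
factPrint' :: [Integer] -> IO ()
factPrint' = mapM_ (print . fact)

main :: IO ()
main = factPrint [1, 2, 3, 4]
```

Pattern matching on (x : xs) also removes the need for head and tail.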