Turtle: how to read a list of files? - haskell

Assume we have a file my_file.txt with contents:
foo
bar
and another file my_other_file.txt containing:
baz
I would like to read the contents of these two files using turtle so that I get a Shell of lines which will produce:
foo
bar
baz
In Haskell's turtle library one can read a single file by using input, for instance:
view $ input "my_file.txt"
We have that
input :: FilePath -> Shell Line
And Shell has no Monoid instance (which I think makes sense since we cannot associate IO operations), so the only operator I can think of using is (<|>):
view $ foldl (<|>) empty $ map input ["my_file.txt", "my_other_file.txt"]
While this produces the desired effect, I wonder whether there is a library in the turtle ecosystem that takes care of this, or whether there is a traverse-like operation that can be used on Alternatives.
EDIT: the effect above could also be achieved by using asum:
asum $ input <$> ["my_file.txt", "my_other_file.txt"]
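As a rough, library-free illustration of why asum gives the concatenated result: as I understand it, turtle's documentation describes Shell as behaving like a list with effects, so the sketch below uses plain lists as a stand-in for Shell. The names fileA, fileB and allLines are hypothetical, and the strings stand in for the Line values that input would produce.

```haskell
import Data.Foldable (asum)

-- Stand-ins for `input "my_file.txt"` and `input "my_other_file.txt"`.
-- Assumption: plain lists model Shell's Alternative behaviour.
fileA, fileB :: [String]
fileA = ["foo", "bar"]
fileB = ["baz"]

-- asum folds the collection with (<|>), starting from empty;
-- for lists, (<|>) is (++), so the streams are concatenated in order.
allLines :: [String]
allLines = asum [fileA, fileB]
```

In this model allLines contains the lines of the first file followed by the lines of the second, which is the shape asked for in the question.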

Line has a Monoid instance. If we have a list of Lines, we can mconcat them into a single one:
do
  exampleA <- input "my_file.txt"
  exampleB <- input "my_other_file.txt"
  return $ mconcat [exampleA, exampleB]
Since Shell has an Applicative instance, we can use traverse to use input over a list of files:
traverse input ["my_file.txt","my_other_file.txt"]
We end up with a Shell [Line]. Since Shell is a Functor, we can fmap mconcat (or fold if you don't use a list):
mconcat <$> traverse input ["my_file.txt","my_other_file.txt"]
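Before relying on this, it may be worth checking it against the desired output. The sketch below models Shell with plain lists, which is an assumption; String stands in for Line, and fileLines and merged are hypothetical names. In that model, traverse pairs each line of the first file with each line of the second, so the mconcat version yields merged lines rather than three separate lines.

```haskell
-- Assumption: lists model Shell's Applicative; String models Line's Monoid.
fileLines :: FilePath -> [String]
fileLines "my_file.txt"       = ["foo", "bar"]
fileLines "my_other_file.txt" = ["baz"]
fileLines _                   = []

-- One result per combination of lines, each combination concatenated.
merged :: [String]
merged = mconcat <$> traverse fileLines ["my_file.txt", "my_other_file.txt"]
```

Here merged is ["foobaz", "barbaz"]. If the real Shell behaves the same way, the (<|>) or asum approach from the question is the one that produces foo, bar, baz.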

Related

Haskell: processing a sequence of IO actions, filtering their results in real time and performing some IO actions at certain moments

I want to process an infinite sequence of IO actions, filtering their results in real time and performing some IO actions at certain moments.
We have a function for collapsing runs of duplicates (see my question "haskell elegant way to filter (reduce) sequences of duplicates from infinite list of numbers"):
f :: Eq a => [a] -> [a]
f = map head . group
and expression
join $ sequence <$> ((\l -> (print <$> l)) <$> (f <$> (sequence $ replicate 6 getLine)))
If we run this, the user can enter any sequence of numbers, for example:
1
2
2
3
3
"1"
"2"
"3"
[(),(),()]
This means that all the getLine actions are performed first (6 times in the example), and only at the end are the IO actions for the filtered list performed. I want each IO action to run at the moment a run of identical numbers is complete.
How can I achieve this output:
1
2
"1"
2
3
"2"
3
3
"3"
[(),(),()]
So I want this expression not to hang:
join $ sequence <$> ((\l -> (print <$> l)) <$> (f <$> (sequence $ repeat getLine)))
How can I achieve real-time output as described above, without blocking on infinite lists?
Without a 3rd-party library, you can lazily read the contents of standard input, appending a dummy string to the end of the expected input to force output. (There's probably a better solution that I'm stupidly overlooking.)
import System.IO
print_unique :: (String, String) -> IO ()
print_unique (last, current) | last == current = return ()
                             | otherwise = print last
main = do
  contents <- take 6 <$> lines <$> hGetContents stdin
  traverse print_unique (zip <*> tail $ (contents ++ [""]))
zip <*> tail produces tuples consisting of the ith and i+1st lines without blocking. print_unique then immediately outputs a line if the following line is different.
Essentially, you are sequencing the output actions as the input is executed, rather than sequencing the input actions.
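To see what zip <*> tail does without any IO, here is a small pure sketch. pairs and uniqueRuns are hypothetical names; uniqueRuns mirrors print_unique but returns the lines instead of printing them.

```haskell
-- (zip <*> tail) xs is zip xs (tail xs): each element paired with its successor.
pairs :: [a] -> [(a, a)]
pairs = zip <*> tail

-- Pure counterpart of print_unique: keep an element only when the next one
-- differs. The trailing "" is the dummy string that flushes the last run.
uniqueRuns :: [String] -> [String]
uniqueRuns xs = [prev | (prev, next) <- pairs (xs ++ [""]), prev /= next]
```

For the input 1, 2, 2, 3, 3, 3 this keeps one entry per run. Because each pair only needs the current and the next element, the IO version can emit a line as soon as the following line has been read.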
This seems like a job for a streaming library, like streaming.
{-# LANGUAGE ImportQualifiedPost #-}
module Main where
import Streaming
import Streaming.Prelude qualified as S
main :: IO ()
main =
  S.mapM_ print
    . S.catMaybes
    . S.mapped S.head
    . S.group
    $ S.replicateM 6 getLine
"streaming" has an API reminiscent of that of lists, but works with effectful sequences.
The nice thing about streaming's version of group is that it doesn't force you to keep the whole group in memory if it isn't needed.
The least intuitive function in this answer is mapped, because it's very general. It's not obvious that streaming's version of head fits as its parameter. The key idea is that the Stream type can represent both normal effectful sequences, and sequences of elements on which groups have been demarcated. This is controlled by changing a functor type parameter (Of in the first case, a nested Stream (Of a) m in the case of grouped Streams).
mapped lets you transform that functor parameter while having some effect in the underlying monad (here IO). head processes the inner Stream (Of a) m groups, getting us back to an Of (Maybe a) functor parameter.
I found a nice solution with iterateUntilM
iterateUntilM (\_->False) (\pn -> getLine >>= (\n -> if n==pn then return n else (if pn/="" then print pn else return ()) >> return n) ) ""
I don't like the verbosity of
(if pn/="" then print pn else return ())
If you know how to shorten this, please comment.
ps.
It is noteworthy that I made a video about this function :)
And could not immediately apply it :(
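One way to shorten that conditional, offered as a sketch: Control.Monad exports when and unless, which replace the "if ... then action else return ()" pattern. The names emitPrevious and step are hypothetical, and the emitter is passed as a parameter only so the helper is not tied to print.

```haskell
import Control.Monad (unless)

-- Hypothetical helper: run the action on the previous line unless it is the
-- empty initial sentinel.
emitPrevious :: Applicative f => (String -> f ()) -> String -> f ()
emitPrevious emit pn = unless (null pn) (emit pn)

-- One iteration of the loop from the answer, with print as the emitter:
-- read a line, and if it differs from the previous one, emit the previous one.
step :: String -> IO String
step pn = do
  n <- getLine
  unless (n == pn) (emitPrevious print pn)
  return n
```

With monad-loops this would be used as iterateUntilM (const False) step "", which should behave like the one-liner above.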

Strange behaviour of `ReadP` with regards to `fmap head`

Consider the following repl session:
λ import Text.ParserCombinators.ReadP
λ x $$ y = readP_to_S x y
-- This auxiliary function makes things tidier.
λ many get $$ "abc"
[("","abc"),("a","bc"),("ab","c"),("abc","")]
-- This is reasonable.
λ fmap head (many get) $$ "abc"
[(*** Exception: Prelude.head: empty list
-- Wut?
λ fmap last (many get) $$ "abc"
[(*** Exception: Prelude.last: empty list
-- This does not work either.
λ fmap id (many get) $$ "abc"
[("","abc"),("a","bc"),("ab","c"),("abc","")]
-- The list is there until I try to chop its head!
My questions:
What is happening here?
How can I extract a single (preferably longest) parse result?
P.S. My goal is to construct a parser combinator that greedily returns the repetitive application of a given parser. (get in this instance, but in actuality I have a more involved logic.) Chopping the list of intermediate results is one approach I thought would do, but I am fine with any, except that it is preferable not to convert to ReadS and back.
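A possible explanation and workaround, offered as a sketch rather than a definitive answer. many get also yields the empty parse "" (it is the first result in the list above), and head or last applied to that empty string is what throws as soon as the first result is shown. For greedy repetition, ReadP's left-biased choice (<++) can be used; greedy and longest below are hypothetical names.

```haskell
import Text.ParserCombinators.ReadP

-- Hypothetical helper: apply p as many times as possible and return only
-- the longest sequence. (<++) uses its right side only if the left side fails.
greedy :: ReadP a -> ReadP [a]
greedy p = ((:) <$> p <*> greedy p) <++ return []

-- A single parse result that consumes the whole input.
longest :: [(String, String)]
longest = readP_to_S (greedy get) "abc"
```

This stays inside ReadP, so there is no conversion to ReadS and back. When the repeated parser is just a character predicate, the built-in munch is greedy in the same way.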

Haskell Input to create a String List

I would like to allow a user to build a list from a series of inputs in Haskell.
The getLine function would be called recursively until the stopping case ("Y") is input, at which point the list is returned.
I know the function needs to be in a similar format to below. I am having trouble assigning the correct type signatures - I think I need to include the IO type somewhere.
getList :: [String] -> [String]
getList list = do line <- getLine
                  if line == "Y"
                    then return list
                    else getList (line : list)
So there's a bunch of things that you need to understand. One of them is the IO x type. A value of this type is a computer program that, when later run, will do something and produce a value of type x. So getLine doesn't do anything by itself; it just is a certain sort of program. Same with let p = putStrLn "hello!". I can sequence p into my program multiple times and it will print hello! multiple times, because the IO () is a program, as a value which Haskell happens to be able to talk about and manipulate. If this were TypeScript I would say type IO<x> = { run: () => Promise<x> } and emphatically that type says that the side-effecting action has not been run yet.
So how do we manipulate these values when the value is a program, for example one that fetches the current system time?
The most fundamental way to chain such programs together is to take a program that produces an x (an IO x) and a Haskell function which takes an x and constructs a program which produces a y (an x -> IO y), and combine them into a resulting program producing a y (an IO y). This function is called >>= and pronounced "bind". In fact this way is universal, if we add a function which takes any Haskell value of type x and produces a program which does nothing and produces that value (return :: x -> IO x). This allows you to define, for example, the Prelude function fmap as fmap f = (>>= return . f), which takes an a -> b and applies it to an IO a to produce an IO b.
It is so common to say things like getLine >>= \line -> putStrLn (upcase line ++ "!") that we invented do-notation, writing this as
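To make the fmap claim concrete, here is a minimal sketch. fmapViaBind and lineLength are hypothetical names; the definition is the one from the paragraph above, just given a type signature.

```haskell
-- fmap written with only (>>=) and return.
fmapViaBind :: Monad m => (a -> b) -> m a -> m b
fmapViaBind f = (>>= return . f)

-- For IO: a program that, when run, reads a line and produces its length.
lineLength :: IO Int
lineLength = fmapViaBind length getLine
```

Nothing is read when lineLength is defined. It is only a description of a program, and getLine runs when that program is eventually executed from main.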
do
line <- getLine
putStrLn (upcase line ++ "!")
Notice that it's the same basic deal; the last line needs to be an IO y for some y.
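Side by side, the two spellings look like this. Note that upcase is not a Prelude function; the definition below is an assumption made for the example, and shoutBind and shoutDo are hypothetical names.

```haskell
import Data.Char (toUpper)

-- Assumed definition of upcase, which the text uses without defining.
upcase :: String -> String
upcase = map toUpper

-- Explicit bind.
shoutBind :: IO ()
shoutBind = getLine >>= \line -> putStrLn (upcase line ++ "!")

-- The same program in do-notation.
shoutDo :: IO ()
shoutDo = do
  line <- getLine
  putStrLn (upcase line ++ "!")
```

Both values have type IO () and describe the same program; the do block is only syntax for the bind version.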
The last thing you need to know in Haskell is the convention which actually gets these things run. That is that, in your Haskell source code, you are supposed to create an IO () (a program whose value doesn't matter) called Main.main, and the Haskell compiler is supposed to take this program which you described, and give it to you as an executable which you can run whenever you want. As a very special case, the GHCi interpreter will notice if you produce an IO x expression at the top level and will immediately run it for you, but that is very different from how the rest of the language works. For the most part, Haskell says, describe the program and I will give it to you.
Now that you know that Haskell has no magic and the Haskell IO x type just is a static representation of a computer program as a value, rather than something which does side-effecting stuff when you "reduce" it (like it is in other languages), we can turn to your getList. Clearly getList :: IO [String] makes the most sense based on what you said: a program which allows a user to build a list from a series of inputs.
Now to build the internals, you've got the right guess: we've got to start with a getLine and either finish off the list or continue accepting inputs, prepending the line to the list:
getList = do
  line <- getLine
  if line == "exit" then return []
    else fmap (line:) getList
You've also identified another way to do it, which depends on taking a list of strings and producing a new list:
getList :: IO [String]
getList = fmap reverse (go []) where
  go xs = do
    x <- getLine
    if x == "exit" then return xs
      else go (x : xs)
There are probably several other ways to do it.
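One such alternative, as a sketch only: separate the pure stopping rule from the reading. untilExit and getListLazy are hypothetical names. The IO part relies on lazy input via getContents, which has caveats: the lines are read when the list is consumed rather than when getListLazy runs, and stdin cannot be read again afterwards.

```haskell
-- Pure core: keep lines up to (not including) the first "exit".
untilExit :: [String] -> [String]
untilExit = takeWhile (/= "exit")

-- Lazy-IO variant: no more input is demanded once "exit" has been seen.
getListLazy :: IO [String]
getListLazy = untilExit . lines <$> getContents
```

The explicit recursive versions above are easier to reason about when input has to be interleaved with other actions.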

How to create a nested/conditional option with optparse-applicative?

Is it possible to create a Haskell expression, using the methods in optparse-applicative, that parses program options like this?
program [-a [-b]] ...
-a and -b are optional flags (implemented using switch), with the constraint that -b is only valid if -a was given before it.
Thanks
This is possible, with slight tweaks, in two different ways:
You can make a parser that only allows -b if you've got -a, but you can't insist then that the -a comes first, since optparse-applicative's <*> combinator doesn't specify an order.
You can insist that the -b option follows the a option, but you do this by implementing a as a command, so you lose the - in front of it.
Applicative is definitely strong enough for this, since there's no need to inspect the values returned by the parsers to determine whether -b is allowed, so >>= is not necessary; if -a succeeds with any output, -b is allowed.
Examples
I'll use a data type to represent which arguments are present, but in reality these would be more meaningful.
import Options.Applicative
data A = A (Maybe B) deriving Show
data B = B deriving Show
So the options to our program maybe contain an A, which might have a B, and always have a string.
boption :: Parser (Maybe B)
boption = flag Nothing (Just B) (short 'b')
Way 1: standard combinators - -b can only come with -a (any order)
I'll use flag' () (short 'a') which just insists that -a is there, but then use *> instead of <*> to ignore the return value () and just return whatever the boption parser returns, giving options -a [-b]. I'll then tag that with A :: Maybe B -> A and finally I'll make the whole thing optional, so you have options [-a [-b]]
aoption :: Parser (Maybe A)
aoption = optional $ A <$> (flag' () (short 'a' ) *> boption)
main = execParser (info (helper <*> aoption)
                        (fullDesc <> progDesc "-b is only valid with -a"))
       >>= print
Notice that since <*> allows any order, we can put -a after -b (which isn't quite what you asked for, but works OK and makes sense for some applications).
ghci> :main -a
Just (A Nothing)
ghci> :main -a -b
Just (A (Just B))
ghci> :main -b -a
Just (A (Just B))
ghci> :main -b
Usage: <interactive> [-a] [-b]
-b is only valid with -a
*** Exception: ExitFailure 1
Way 2: command subparser - -b can only follow a
You can use command to make a subparser which is only valid when the command string is present. You can use it to handle arguments like cabal does, so that cabal install and cabal update have completely different options. Since command takes a ParserInfo argument, any parser you can give to execParser can be used, so you can actually nest commands arbitrarily deeply. Sadly, commands can't start with -, so it'll be program [a [-b]] ... instead of program [-a [-b]] ....
acommand :: Parser A
acommand = subparser $ command "a" (info (A <$> (helper <*> boption))
                                         (progDesc "you can '-b' if you like with 'a'"))
main = execParser (info (helper <*> optional acommand) fullDesc) >>= print
Which runs like this:
ghci> :main
Nothing
ghci> :main a
Just (A Nothing)
ghci> :main a -b
Just (A (Just B))
ghci> :main -b a
Usage: <interactive> [COMMAND]
*** Exception: ExitFailure 1
So you have to precede -b with a.
I'm afraid you can't. This is precisely the scenario that Applicative alone can't handle while Monad can: changing the structure of later actions based on earlier results. In an applicative computation, the "shape" always needs to be known beforehand; this has some advantages (like speeding up some array combinations, or giving out a nice readable help screen for command-line options), but here it limits you to parsing "flat" options.
The interface of optparse-applicative also has Alternative though, which does allow dependent parsing, albeit in a different way as shown by AndrewC.

Processing monad value before assignment

Is it possible to turn these two lines into one:
main = do line <- getLine
          let result = words line
What I mean is something like this non-monadic code:
result = words getLine
which, in my opinion, would improve readability.
Try this: result <- fmap words getLine
fmap takes a function with a type like a -> b and turns it into f a -> f b for anything that's an instance of Functor, which should include all Monad instances.
There's an equivalent function called liftM that's specific to Monad, for murky historical reasons. You might need to use that instead in some cases, but for standard monads like IO you can stick with fmap.
You can also use the operator version of fmap, <$>, and write words <$> getLine instead, which often looks prettier. It is exported by Data.Functor and Control.Applicative, and since base 4.8 by the Prelude as well.
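Put together, a minimal sketch might look like this. readWords and wordsInMaybe are hypothetical names; the Maybe example is only there to show that the same lifting works for any Functor, not just IO.

```haskell
-- Reads one line from stdin and splits it into words in a single step.
readWords :: IO [String]
readWords = words <$> getLine

-- The same lifting on a different Functor.
wordsInMaybe :: Maybe [String]
wordsInMaybe = words <$> Just "hello brave world"
```

Inside a do block this becomes result <- words <$> getLine, which replaces the two original lines.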
