List comprehension and converting string to object - haskell

I have a string such as 2.7+i*3.4. I want to parse it and get a complex number value. I tried this:
newtype MyComplexNumber = MyComplexNumber (Complex Float)
myReadsCmplx s = [(MyComplexNumber (a :+ b)) |
    (a, '+':r1) <- reads s :: [(Float, String)],
    (i, '*':r2) <- reads r1 :: [(String, String)],
    (b, r3) <- reads r2 :: [(Float, String)]]
But I get an empty list:
*Main Data.Complex> myReadsCmplx "2.7+i*3.4"
[]
*Main Data.Complex>

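You seem to be using reads as though it were a full monadic parser. It's not. It comes up with a match or none, and if the match it finds does not match your pattern, you get nada. In your code the second generator is the one that fails: reads at type String expects a quoted Haskell string literal such as "i", so it finds nothing in the bare i*3.4 and the whole comprehension comes out empty. You will be much better off using something like parsec, attoparsec, or even something super-simple like regex-applicative.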
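If pulling in a parser library is more than you need, one possible workaround is to keep reads for the two numbers and match the literal characters +i* directly in the pattern. This is only a sketch: the ReadS result type and the derived Show and Eq instances are additions for illustration, not part of the question's code.

```haskell
import Data.Complex (Complex (..))

newtype MyComplexNumber = MyComplexNumber (Complex Float)
  deriving (Show, Eq)

-- Read a Float, then require the literal "+i*", then read another Float.
-- The unconsumed remainder is returned, as ReadS functions usually do.
myReadsCmplx :: ReadS MyComplexNumber
myReadsCmplx s =
  [ (MyComplexNumber (a :+ b), rest)
  | (a, '+' : 'i' : '*' : r1) <- reads s
  , (b, rest) <- reads r1
  ]
```

With this version the input from the question yields a single result with an empty remainder, and input that lacks the +i* separator yields an empty list.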

Related

Identity parser

As an exercise¹, I've written a string parser that only uses char parsers and Trifecta:
import Text.Trifecta
import Control.Applicative ( pure )
stringParserWithChar :: String -> Parser Char
stringParserWithChar stringToParse =
  foldr (\c otherParser -> otherParser >> char c) identityParser
    $ reverse stringToParse
  where identityParser = pure '?' -- ← This works but I think I can do better
The parser does its job just fine:
parseString (stringParserWithChar "123") mempty "1234"
-- Yields: Success '3'
Yet, I'm not happy with the specific identityParser to which I applied foldr. It seems hacky to have to choose an arbitrary character for pure.
My first intuition was to use mempty but Parser is not a monoid. It is an applicative but empty constitutes an unsuccessful parser².
What I'm looking for instead is a parser that works as a neutral element when combined with other parsers. It should successfully do nothing, i.e., not advance the cursor and let the next parser consume the character.
Is there an identity parser as described above in Trifecta or in another library? Or are parsers not meant to be used in a fold?
¹ The exercise is from the parser combinators chapter of the book Haskell Programming from first principles.
² As helpfully pointed out by cole, Parser is an Alternative and thus a monoid. The empty function stems from Alternative, not Parser's applicative instance.
Don't you want this to parse a String? Right now, as you can tell from the function signature, it parses a Char, returning the last character. Just because you only have a Char parser doesn't mean you can't make a String parser.
I'm going to assume that you want to parse a string, in which case your base case is simple: your identityParser is just pure "".
I think something like this should work (and it should be in the right order but might be reversed).
stringParserWithChar :: String -> Parser String
stringParserWithChar = traverse char
Unrolled, you get something like
stringParserWithChar' :: String -> Parser String
stringParserWithChar' "" = pure ""
stringParserWithChar' (c:cs) = liftA2 (:) (char c) (stringParserWithChar' cs)
-- the above with do notation, note that you can also just sequence the results of
-- 'char c' and 'stringParserWithChar' cs' and instead just return 'pure (c:cs)'
-- stringParserWithChar' (c:cs) = do
--   c' <- char c
--   cs' <- stringParserWithChar' cs
--   pure (c':cs')
Let me know if they don't work since I can't test them right now…
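For a quick check without Trifecta installed, the same idea can be tried with ReadP from base, which also provides a char parser. This is a stand-in for illustration, not the Trifecta API; the names stringParserFold and runOn are made up for this sketch.

```haskell
import Text.ParserCombinators.ReadP (ReadP, char, readP_to_S)

-- The traverse-based parser, with ReadP standing in for Trifecta's Parser.
stringParserWithChar :: String -> ReadP String
stringParserWithChar = traverse char

-- The same parser as an explicit fold. The starting value pure "" is the
-- neutral parser: it succeeds, consumes nothing, and returns the empty string.
stringParserFold :: String -> ReadP String
stringParserFold = foldr (\c rest -> (:) <$> char c <*> rest) (pure "")

-- Run a parser and list every (result, remaining input) pair.
runOn :: ReadP a -> String -> [(a, String)]
runOn = readP_to_S
```

Running either parser for "123" on "1234" gives the single result ("123", "4"), and the parser for the empty string leaves the whole input untouched, which is the identity behaviour the question asks for.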
A digression on monoids
My first intuition was to use mempty but Parser is not a monoid.
Ah, but that is not quite the case. Parser is an Alternative, and an Alternative is a monoid on applicative functors. But you don't really need to look at the Alt newtype in Data.Monoid to understand this; Alternative's class definition looks just like Monoid's:
class Applicative f => Alternative f where
  empty :: f a
  (<|>) :: f a -> f a -> f a
  -- more definitions...
class Semigroup a => Monoid a where
  mempty :: a
  mappend :: a -> a -> a
  -- more definitions...
Unfortunately, you want something that acts more like a product instead of an Alt, but that's what the default behavior of Parser does.
Let's rewrite your fold+reverse into just a fold to clarify what's going on:
stringParserWithChar :: String -> Parser Char
stringParserWithChar =
  foldl (\otherParser c -> otherParser >> char c) identityParser
  where identityParser = pure '?'
Any time you see foldl used to build up something using its Monad instance, that's a bit suspicious[*]. It hints that you really want a monadic fold of some sort. Let's see here...
import Control.Monad
-- foldM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
attempt1 :: String -> Parser Char
attempt1 = foldM _f _acc
This is going to run into the same sort of trouble you saw before: what can you use for a starting value? So let's use a standard trick and start with Maybe:
-- (Control.Monad.<=<)
-- :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c
stringParserWithChar :: String -> Parser Char
stringParserWithChar =
  maybe empty pure <=< foldM _f _acc
Now we can start our fold off with Nothing, and immediately switch to Just and stay there. I'll let you fill in the blanks; GHC will helpfully show you their types.
[*] The main exception is when it's a "lazy monad" like Reader, lazy Writer, lazy State, etc. But parser monads are generally strict.
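If you get stuck on the blanks, here is one possible way to fill them in. It is written against ReadP from base rather than Trifecta so that it runs without extra packages; that substitution, and the names lastCharParser and step, are assumptions of this sketch.

```haskell
import Control.Applicative (empty)
import Control.Monad (foldM, (<=<))
import Text.ParserCombinators.ReadP (ReadP, char, readP_to_S)

-- Start the fold at Nothing. Every step parses one expected character and
-- replaces the accumulator with Just that character. If the input string
-- was empty, the accumulator is still Nothing and the parser fails.
lastCharParser :: String -> ReadP Char
lastCharParser = maybe empty pure <=< foldM step Nothing
  where
    step :: Maybe Char -> Char -> ReadP (Maybe Char)
    step _ c = Just <$> char c

-- Run a parser and list every (result, remaining input) pair.
runOn :: ReadP a -> String -> [(a, String)]
runOn = readP_to_S
```

No arbitrary placeholder character is needed: the only case in which no character has been parsed is the empty pattern, and there the parser fails instead of inventing a result.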

Haskell parser combinator - do notation

I was reading a tutorial on building a parser combinator library and I came across a function which I don't quite understand.
newtype Parser a = Parser {parse :: String -> [(a,String)]}
chainl :: Parser a -> Parser (a -> a -> a) -> a -> Parser a
chainl p op a = (p `chainl1` op) <|> return a
chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
p `chainl1` op = do {a <- p; rest a}
  where rest a = (do f <- op
                     b <- p
                     rest (f a b))
                 <|> return a
bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ \s -> concatMap (\(a, s') -> parse (f a) s') $ parse p s
Here bind is the implementation of the (>>=) operator. I don't quite get how the chainl1 function works. From what I can see, you extract f from op, apply it as f a b, and then recurse. However, I do not get how you can extract a function from the parser when running it should return a list of tuples.
Start by looking at the definition of Parser:
newtype Parser a = Parser {parse :: String -> [(a,String)]}
A Parser a is really just a wrapper around a function (that we can run later with parse) that takes a String and returns a list of pairs, where each pair contains an a encountered when processing the string, along with the rest of the string that remains to be processed.
Now look at the part of the code in chainl1 that's confusing you: the part where you extract f from op:
f <- op
You remarked: "I do not get how you extract a function from the parser when it should return a list of tuples."
It's true that when we run a Parser a with a string (using parse), we get a list of type [(a,String)] as a result. But this code does not say parse op s. Rather, we are using bind here (with the do-notation syntactic sugar). The problem is that you're thinking about the definition of the Parser datatype, but you're not thinking much about what bind specifically does.
Let's look at what bind is doing in the Parser monad a bit more carefully.
bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ \s -> concatMap (\(a, s') -> parse (f a) s') $ parse p s
What does p >>= f do? It returns a Parser that, when given a string s, does the following: First, it runs parser p with the string to be parsed, s. This, as you correctly noted, returns a list of type [(a, String)]: i.e. a list of the values of type a encountered, along with the string that remained after each value was encountered. Then it takes this list of pairs and applies a function to each pair. Specifically, each (a, s') pair in this list is transformed by (1) applying f to the parsed value a (f a returns a new parser), and then (2) running this new parser with the remaining string s'. This is a function from a tuple to a list of tuples: (a, s') -> [(b, s'')]... and since we're mapping this function over every tuple in the original list returned by parse p s, this ends up giving us a list of lists of tuples: [[(b, s'')]]. So we concatenate (or join) this list into a single list [(b, s'')]. All in all then, we have a function from s to [(b, s'')], which we then wrap in a Parser newtype.
The crucial point is that when we say f <- op, or op >>= \f -> ..., the name f is bound to each value parsed by op. f is not a list of tuples, because it is not the result of running parse op s.
In general, you'll see a lot of Haskell code that defines some datatype SomeMonad a, along with a bind method that hides a lot of the dirty details for you, and lets you get access to the a values you care about using do-notation like so: a <- ma. It may be instructive to look at the State a monad to see how bind passes around state behind the scenes for you. Similarly, here, when combining parsers, you care most about the values the parser is supposed to recognize... bind is hiding all the dirty work that involves the strings that remain upon recognizing a value of type a.
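To see this end to end, here is a small self-contained sketch that wires the Parser type from the question into the standard type classes and runs chainl1 on a concrete input. The helper parsers satisfy, number and addOp are not part of the question and are assumed here for illustration, as is the left-biased definition of (<|>).

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit)

newtype Parser a = Parser { parse :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> [(f a, s') | (a, s') <- p s]

instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  Parser pf <*> Parser pa =
    Parser $ \s -> [(f a, s'') | (f, s') <- pf s, (a, s'') <- pa s']

-- (>>=) is exactly the bind from the question.
instance Monad Parser where
  p >>= f = Parser $ \s -> concatMap (\(a, s') -> parse (f a) s') (parse p s)

-- Left-biased choice: try the second parser only if the first finds nothing.
instance Alternative Parser where
  empty = Parser (const [])
  p <|> q = Parser $ \s -> case parse p s of
    [] -> parse q s
    rs -> rs

-- Consume one character if it satisfies the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> [(c, cs)]
  _ -> []

number :: Parser Int
number = read <$> some (satisfy isDigit)

-- A parser whose result is a function: this is what f <- op binds.
addOp :: Parser (Int -> Int -> Int)
addOp = ((+) <$ satisfy (== '+')) <|> ((-) <$ satisfy (== '-'))

chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
p `chainl1` op = p >>= rest
  where
    rest a =
      ( do
          f <- op
          b <- p
          rest (f a b)
      )
        <|> pure a
```

Running parse (number `chainl1` addOp) "10-3-2" yields [(5, "")]: each f bound from addOp is an ordinary function, applied left to right as (10 - 3) - 2, while bind threads the remaining input between the steps.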

Deserializing many network messages without using an ad-hoc parser implementation

I have a question pertaining to deserialization. I can envision a solution using Data.Data, Data.Typeable, or with GHC.Generics, but I'm curious if it can be accomplished without generics, SYB, or meta-programming.
Problem Description:
Given a list of [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize the [String] to construct the target data type. I could write a parser to do this, but I'm looking for a generalized solution that will deserialize to an arbitrary number of data types defined within the program without writing a parser for each type. With knowledge of the number and type of value constructors an algebraic type has, it's as simple as performing a read on each string to yield the appropriate values necessary to build up the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it's otherwise impossible).
Say I have around 50 types defined similar to this (all simple algebraic types composed of basic primitives: no nested or recursive types, just different combinations and orderings of primitives):
data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}
data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }
I can determine the data-type to be associated with a [String] I've received over the network using a tag id that I parse before each [String].
Possible conjectured solution path:
Since data constructors are first-class values in Haskell and actually have a type, can the NetworkMsg constructor be thought of as a function, such as:
NetworkMsg :: Int -> Int -> Double -> NetworkMsg
Could I transform this function into a function on tuples using uncurryN then copy the [String] into a tuple of the same shape the function now takes?
NetworkMsg' :: (Int, Int, Double) -> NetworkMsg
I don't think this would work because I'd need knowledge of the value constructors and type information, which would require Data.Typeable, reflection, or some other metaprogramming technique.
Basically, I'm looking for automatic deserialization of many types without writing type instance declarations or analyzing the type's shape at run-time. If it's not feasible, I'll do it an alternative way.
You are correct in that the constructors are essentially just functions so you can write generic instances for any number of types by just writing instances for the functions. You'll still need to write a separate instance for all the different numbers of arguments, though.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-} -- needed for pattern signatures such as Just (a :: Test1)
import Text.Read
import Control.Applicative
class FieldParser p r where
    parseFields :: p -> [String] -> Maybe r
instance Read a => FieldParser (a -> r) r where
    parseFields con [a] = con <$> readMaybe a
    parseFields _ _ = Nothing
instance (Read a, Read b) => FieldParser (a -> b -> r) r where
    parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
    parseFields _ _ = Nothing
instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
    parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
    parseFields _ _ = Nothing
{- etc. for as many arguments as you need -}
Now you can use this type class to parse any message based on the constructor as long as the type-checker is able to figure out the resulting message type from context (i.e. it is not able to deduce it simply from the given constructor for these sort of multi-param type class instances).
data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB :: Int, fieldC :: Float} deriving Show
test :: String -> [String] -> IO ()
test tag fields = case tag of
    "Test1" -> case parseFields Test1 fields of
        Just (a :: Test1) -> putStrLn $ "Successfully parsed " ++ show a
        Nothing -> putStrLn "Parse error"
    "Test2" -> case parseFields Test2 fields of
        Just (a :: Test2) -> putStrLn $ "Successfully parsed " ++ show a
        Nothing -> putStrLn "Parse error"
    _ -> putStrLn "Unknown tag" -- avoid a pattern-match failure on unexpected tags
I'd like to know how exactly you use the message types in the application, though, because having each message as its separate type makes it very difficult to have any sort of generic message handler.
Is there some reason why you don't simply have a single message data type? Such as
data NetworkMsg
    = NetworkMsg1 {fieldA :: Int}
    | NetworkMsg2 {fieldB :: Int, fieldC :: Float}
Now, while the instances are built in pretty much the same way, you get much better type inference since the result type is always known.
-- The class declaration is not shown in the original answer; this one is
-- reconstructed from how parseMsg is used below.
class MessageParser p where
    parseMsg :: p -> [String] -> Maybe NetworkMsg
instance Read a => MessageParser (a -> NetworkMsg) where
    parseMsg con [a] = con <$> readMaybe a
    parseMsg _ _ = Nothing
instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
    parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
    parseMsg _ _ = Nothing
instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
    parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
    parseMsg _ _ = Nothing
parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
    "NetworkMsg1" -> parseMsg NetworkMsg1 fields
    "NetworkMsg2" -> parseMsg NetworkMsg2 fields
    _ -> Nothing
I'm also not sure why you want to do type-generic programming specifically without actually using any of the tools meant for generics. GHC.Generics, SYB or Template Haskell is usually the best solution for this kind of problem.
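For comparison, here is roughly what the GHC.Generics route could look like. This is a minimal sketch under stated assumptions: the class GFields, the function fromFields and the example type Msg are hypothetical names, and only single-constructor types are covered (there is deliberately no instance for sums).

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}

import GHC.Generics
import Text.Read (readMaybe)

-- Parse the fields of a generic representation from a list of strings,
-- returning the strings that were not consumed.
class GFields f where
  gParse :: [String] -> Maybe (f p, [String])

-- A constructor with no fields consumes nothing.
instance GFields U1 where
  gParse ss = Just (U1, ss)

-- A single field: read one string at the field's type.
instance Read a => GFields (K1 i a) where
  gParse (s : ss) = (\v -> (K1 v, ss)) <$> readMaybe s
  gParse [] = Nothing

-- Two groups of fields side by side: parse left, then right.
instance (GFields f, GFields g) => GFields (f :*: g) where
  gParse ss = do
    (l, ss') <- gParse ss
    (r, ss'') <- gParse ss'
    pure (l :*: r, ss'')

-- Metadata wrappers (datatype, constructor, selector) are passed through.
instance GFields f => GFields (M1 i c f) where
  gParse ss = do
    (x, rest) <- gParse ss
    pure (M1 x, rest)

-- Succeeds only if every field parses and no strings are left over.
fromFields :: (Generic a, GFields (Rep a)) => [String] -> Maybe a
fromFields ss = case gParse ss of
  Just (rep, []) -> Just (to rep)
  _ -> Nothing

-- Hypothetical message type for illustration.
data Msg = Msg { msgA :: Int, msgB :: Int, msgC :: Double }
  deriving (Show, Eq, Generic)
```

With this in place each new message type only needs deriving Generic; for example, fromFields ["1", "2", "3.5"] at type Maybe Msg gives Just (Msg 1 2 3.5), while a wrong field count or an unreadable field gives Nothing.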

Convert String to Tuple, Special Formatting in Haskell

For a test app, I'm trying to convert a special type of string to a tuple. The string is always in the following format, with an int (n>=1) followed by a character.
Examples of Input String:
"2s"
"13f"
"1b"
Examples of Desired Output Tuples (Int, Char):
(2, 's')
(13, 'f')
(1, 'b')
Any pointers would be extremely appreciated. Thanks.
You can use reads to parse the Int and get the rest of the string:
readTup :: String -> (Int, Char)
readTup s = (n, head rest)
  where [(n, rest)] = reads s
A safer version would be:
maybeReadTup :: String -> Maybe (Int, Char)
maybeReadTup s = do
  [(n, [c])] <- return $ reads s
  return (n, c)
Here's one way to do it:
import Data.Maybe (listToMaybe)
parseTuple :: String -> Maybe (Int, Char)
parseTuple s = do
  (int, (char:_)) <- listToMaybe $ reads s
  return (int, char)
This uses the Maybe monad to express the possible parse failure. Note that if the (char:_) pattern fails to match (i.e., if there is only a number with no character after it), this gets translated into a Nothing result. This is due to how do notation works in Haskell: a failed pattern match in a bind calls fail (a method of MonadFail in current GHC, of Monad in older versions), and in the case of Maybe a we have fail _ = Nothing. The function also evaluates to Nothing if reads can't read an Int at the beginning of the input. If this happens, reads gives [], which is then turned into Nothing by listToMaybe.
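The two Maybe-based versions in this thread differ in how strict they are about trailing input, which is easy to miss. The sketch below repeats both definitions unchanged so that it can be loaded on its own and tried in ghci.

```haskell
import Data.Maybe (listToMaybe)

-- Requires exactly one character after the number.
maybeReadTup :: String -> Maybe (Int, Char)
maybeReadTup s = do
  [(n, [c])] <- return $ reads s
  return (n, c)

-- Requires at least one character after the number and ignores the rest.
parseTuple :: String -> Maybe (Int, Char)
parseTuple s = do
  (int, (char : _)) <- listToMaybe $ reads s
  return (int, char)
```

Both return Just (2, 's') for "2s" and Nothing for "13" or "x". For "2sx", maybeReadTup returns Nothing, whereas parseTuple returns Just (2, 's'). Which behaviour is right depends on whether trailing characters should count as an error.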

Convert nested lists to custom data type

I am trying to convert nested lists into a custom type called Mydata using a list comprehension as follows:
main = do
    let a = [["12.345", "1", "4.222111"],
             ["31.2", "12", "9.1234"],
             ["43.111111", "3", "8.13"],
             ["156.112121", "19", "99.99999"]]
    let b = foo a
    print b
foo xss = [(xs,xs) | xs <- xss, xs <- xss]
    where
        xs = Mydata (read xs!!0 :: Float) (read xs!!1 :: Int) (read xs!!2 :: Float)
data Mydata = Mydata {valA :: Float, valB :: Int, valC :: Float}
When I run my program I get the following error:
1.hs:11:28:
Couldn't match expected type `String' with actual type `Mydata'
In the first argument of `read', namely `xs'
In the first argument of `(!!)', namely `read xs'
In the first argument of `Mydata', namely `(read xs !! 0 :: Float)'
Can anyone help me figure out what the problem is? Thanks.
The definition of xs in the where clause is (perhaps unintentionally) recursive: it refers to itself rather than to the xs bound in the list comprehension, so it makes no sense. A possible implementation follows:
data Mydata = Mydata {valA :: Float, valB :: Int, valC :: Float} deriving Show
-- rely on type inference instead of specifying explicit type for each `read'
dataFromList [a, b, c] = Mydata (read a) (read b) (read c)
dataFromList _ = error "dataFromList: expected exactly three elements"
main = do
    let a = [["12.345", "1", "4.222111"],
             ["31.2", "12", "9.1234"],
             ["43.111111", "3", "8.13"],
             ["156.112121", "19", "99.99999"]]
    let b = map dataFromList a
    -- alternatively
    -- let b = [dataFromList triple | triple <- a]
    print b
{-
I'm going to take the opportunity to tell you some other things I think will help in the long run as well as fix the problem.
-}
import Control.Applicative
import Data.Maybe
import Network.CGI.Protocol (maybeRead)
{-
The Control.Applicative lets me use <$> and <*> which are really handy functions for working with loads of things (more later).
I'm going to use maybeRead later. I don't know why it's not in Data.Maybe.
Data structure first. I've derived Show so we can print Mydata.
-}
data Mydata = Mydata {valA :: Float,
                      valB :: Int,
                      valC :: Float}
    deriving Show
{-
I've put somedata as a definition in the main body (you had let a = inside main), because I felt you were seriously overusing the IO monad.
It's worth trying to do as much as possible in pure world, because it makes debugging much easier.
Maybe in your actual problem, you'll have read somedata in from somewhere, but for writing the functions,
having a bit of test data like this lying around is a big bonus. (Try though to only mention somedata as a definition
here once, so you don't get a rash of global constants!)
-}
somedata = [["12.345", "1", "4.222111"],
            ["31.2", "12", "9.1234"],
            ["43.111111", "3", "8.13"],
            ["156.112121", "19", "99.99999"]]
somewrong = [ ["1", "2", "3" ], -- OK
              ["1.0", "2", "3.0" ], -- OK, same value as first one
              ["1", "2.0", "3" ], -- wrong, decimal for valB
              ["", "two", "3.3.3"] ] -- wrong, wrong, wrong.
{-
Let's write a function to read a single Mydata, but use Maybe Mydata so we can recover gracefully if it doesn't work out.
maybeRead :: Read a => String -> Maybe a, so it turns strings into Just what you want, or gives you Nothing if it can't.
This is better than simply crashing with an error message.
(Better still would be to return Either an error message explaining the problem or the Right answer, but I'm going to skip that for today.)
I'm going to write this three ways, getting nicer and nicer.
-}
readMydata_v1 :: [String] -> Maybe Mydata
readMydata_v1 [as, bs, cs] = case (maybeRead as, maybeRead bs, maybeRead cs) of
    (Just a, Just b, Just c) -> Just $ Mydata a b c
    _ -> Nothing
readMydata_v1 _ = Nothing -- anything else is the wrong number of Strings
{-
so we look at (maybeRead as, maybeRead bs, maybeRead cs) and if they all worked, we make a Mydata out of them, then return Just the right answer,
but if something else happened, one of them was a Nothing, so we can't make a Mydata, and so we get Nothing overall.
Try it out in ghci with map readMydata_v1 somedata and map readMydata_v1 somewrong.
Notice how because I used the expression Mydata a b c, it forces the types of a, b and c to be Float, Int and Float in the (Just a, Just b, Just c) pattern.
That pattern is the output of (maybeRead as, maybeRead bs, maybeRead cs), which forces the types of the three uses of maybeRead to be right -
I don't need to give individual type signatures. Type signatures are really handy, but they're not pretty in the middle of a function.
Now I like using Maybe, but I don't like writing loads of case statements inside each other, so I could use the fact that Maybe is a Monad.
See Learn You a Haskell for Great Good http://learnyouahaskell.com for more details about monads, but for my purposes here, it's like I can treat a Maybe value like it's IO even though it's not.
-}
readMydata_v2 :: [String] -> Maybe Mydata
readMydata_v2 [as,bs,cs] = do
    a <- maybeRead as
    b <- maybeRead bs
    c <- maybeRead cs
    return $ Mydata a b c
readMydata_v2 _ = Nothing -- anything else is the wrong number of Strings
{-
I seem to write no error handling code! Aren't Maybe monads great!
Here we take whatever a we can get out of maybeRead as, whatever b we can get from reading bs
and whatever c we get from cs, and if that has all worked we get Just $ Mydata a b c.
The Maybe monad deals with any Nothings we get by stopping and returning Nothing, and wraps any correct answer in Just.
Whilst this is really nice, it doesn't feel very functional programming, so let's go the whole hog and make it Applicative.
You should read about Applicative in http://learnyouahaskell.com, but for now, let's just use it.
Whenever you find yourself writing
foo x y z = do
    thing1 <- something x
    thing2 <- somethingelse y
    thing3 <- anotherthing x z
    thing4 <- yetmore y y z
    return $ somefunction thing1 thing2 thing3 thing4
it means you're using a monad when you could more cleanly use an "applicative functor".
All this means in practice is that you could have written that
foo x y z = somefunction <$> something x <*> somethingelse y <*> anotherthing x z <*> yetmore y y z
or if you prefer,
foo x y z = somefunction <$> something x
                         <*> somethingelse y
                         <*> anotherthing x z
                         <*> yetmore y y z
this is nicer because (a) it feels more like ordinary function application (notice that <$> works like $ and <*> works like a space ) and (b) you don't need
to invent names thing1 etc.
It means find out the results of something x and somethingelse y and anotherthing x z and yetmore y y z then apply somefunction to the result.
Let's do readMydata the applicative way:
-}
readMydata_nice :: [String] -> Maybe Mydata
readMydata_nice [a,b,c] = Mydata <$> maybeRead a <*> maybeRead b <*> maybeRead c
readMydata_nice _ = Nothing
{-
Aaaahhhhh, so clean, so functional, so easy. Mmmmm. :) Think more, write less.
This means take the results of maybeRead a and maybeRead b and maybeRead c and apply Mydata to the result, but because everything is Maybe, if anything along the way is Nothing, the answer will be Nothing.
Again, you can test this out in ghci with map readMydata_nice somedata or map readMydata_nice somewrong
Anyway, let's write main, which is now more functional too.
-}
main = mapM_ print $ catMaybes $ map readMydata_nice somedata
{-
This takes each list of Strings in somedata and reads them as Maybe Mydata, then throws away the Nothings and turns them into IO print commands and does them one after another.
mapM_ works a bit like map but does every IO it creates. Because it's several prints, each one goes on a separate line, which is much easier to read.
Here I've decided to use catMaybes to ignore the Nothing values and just print the ones that worked. In a real program,
I'd use Either like I said, so that I can pass an error message instead of silently ignoring wrong data. All the tricks we used on Maybe also work on Either.
-}
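The closing remark about Either can be made concrete with readEither from Text.Read. This is a minimal sketch, not part of the answer above: the helper readField, the function readMydataE and the wording of the error messages are assumptions made for illustration.

```haskell
import Text.Read (readEither)

data Mydata = Mydata { valA :: Float, valB :: Int, valC :: Float }
  deriving (Show, Eq)

-- Read one field, replacing the generic parse error with a message
-- that names the field and shows the offending input.
readField :: Read a => String -> String -> Either String a
readField name s = case readEither s of
  Left _ -> Left ("cannot read " ++ name ++ " from " ++ show s)
  Right v -> Right v

-- Same applicative shape as the Maybe version, but the first failure
-- is reported instead of being silently turned into Nothing.
readMydataE :: [String] -> Either String Mydata
readMydataE [a, b, c] =
  Mydata <$> readField "valA" a <*> readField "valB" b <*> readField "valC" c
readMydataE xs = Left ("expected 3 fields, got " ++ show (length xs))
```

Mapping readMydataE over the test data gives a Right for each good row and a Left naming the first bad field for each bad one, so the caller can decide whether to log, skip, or abort.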
In the definition
    where
        xs = Mydata (read xs!!0 :: Float) (read xs!!1 :: Int) (read xs!!2 :: Float)
you are using xs on both the left-hand and the right-hand side of the definition. Therefore, it has to have the same type on both sides. The compiler infers that the type of xs is Mydata, and you cannot apply read to a value of that type.

Resources