So ... I messed up a recording in CSV format:
23,95489,0,20,9888
Due to language settings, floating point numbers were written with commas as the decimal separator ... in a comma-separated value file ...
The problem is that the file does not use a consistent format for every float: some have no decimal point at all, and the number of digits behind the point varies too.
My idea was to build a Megaparsec parser that would try to read every possible floating point format, move on, and backtrack if it finds an error.
E.g. for the example above:
1. read 23,95489 -> good
2. read 0,20 -> good (so far)
3. read 9888 -> error (because the value is too high for the column, checked by a guard)
4. (backtrack to 2.) read 0 -> good again
5. read 20,9888 -> good
done
I've implemented that as (pseudo code here):
floatP = try pointyFloatP <|> unpointyFloatP
lineP = (,,) <$> floatP <* comma <*> floatP <* comma <*> floatP <* comma
My problem is that apparently the try only works in the 'current' float. There is no backtracking to previous positions. Is this correct?
And if so ... how would I go about implementing further backtracking?
How far does “try” backtrack?
The parser try p consumes exactly as much input as p if p parses successfully, otherwise it does not consume any input at all. So if you look at that in terms of backtracking, it backtracks to the point where you were when you invoked it.
My problem is that apparently the try only works in the 'current' float. There is no backtracking to previous positions. Is this correct?
Yes, try does not "unconsume" input. All it does is to recover from a failure in the parser you give it without consuming any input. It does not undo the effects of any parsers that you've applied previously, nor does it affect subsequent parsers that you apply after try p succeeded.
And if so ... how would I go about implementing further backtracking?
Basically what you want is to not only know whether pointyFloatP succeeds on the current input, but also whether the rest of your lineP would succeed after pointyFloatP succeeded - and if it doesn't, you want to backtrack to before you applied pointyFloatP. So basically you want the parser for the whole remaining line in the try, not just the float parser.
To achieve that you can make floatP take the parser for the remaining line as an argument like this:
floatP restP =  try ((:) <$> pointyFloatP   <*> restP)
            <|>     ((:) <$> unpointyFloatP <*> restP)
Note that this kind of backtracking isn't going to be very efficient (but I assume you knew that going in).
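For concreteness, here is a minimal sketch of that idea with Megaparsec. The concrete pointyFloatP/unpointyFloatP definitions, the String stream, and the bound of 30 are assumptions made up for this sketch, not something given in the question:

import Control.Applicative (some, (<|>))
import Control.Monad (guard)
import Data.Void (Void)
import Text.Megaparsec (Parsec, eof, runParser, try)
import Text.Megaparsec.Char (char, digitChar)

type Parser = Parsec Void String

comma :: Parser Char
comma = char ','

-- a float written with a decimal comma, e.g. "23,95489" (assumed format)
pointyFloatP :: Parser Double
pointyFloatP = do
  i <- some digitChar
  _ <- comma
  f <- some digitChar
  pure (read (i ++ "." ++ f))

-- a float written without any decimal part (assumed format)
unpointyFloatP :: Parser Double
unpointyFloatP = read <$> some digitChar

-- try the pointy reading together with the parser for the *rest* of the line;
-- if the guard or anything later fails, backtrack and re-read the cell as unpointy
floatP :: Parser [Double] -> Parser [Double]
floatP restP = try (go pointyFloatP) <|> go unpointyFloatP
  where go p = do
          d <- p
          guard (d < 30)        -- the "value too high for column" check
          (d :) <$> restP

lineP :: Parser [Double]
lineP = floatP (comma *> floatP (comma *> floatP ([] <$ eof)))

On the example line, runParser lineP "" "23,95489,0,20,9888" should give Right [23.95489,0.0,20.9888]: the last cell is first read as 9888, fails the guard, and the whole second float is re-read as a plain 0, exactly as in the numbered steps above.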
Update: added a custom monadic parser for more complex rows.
Using the List Monad for Simple Parsing
For this problem, the list monad makes a simpler backtracking "parser" than Megaparsec. For example, to parse the cells:
row :: [String]
row = ["23", "95489", "0", "20", "9888"]
into exactly three columns of values satisfying a particular bound (e.g., less than 30), you can generate all possible parses with:
{-# OPTIONS_GHC -Wall #-}

import Control.Monad
import Control.Applicative

rowResults :: [String] -> [[Double]]
rowResults = cols 3
  where cols :: Int -> [String] -> [[Double]]
        cols 0 [] = pure []    -- good, finished on time
        cols 0 _  = empty      -- bad, didn't use all the data
        -- otherwise, parse exactly n columns from cells xs
        cols n xs = do
          -- form d from one or two cells
          (d, ys) <- num1 xs <|> num2 xs
          -- only accept d < 30
          guard $ d < 30
          ds <- cols (n-1) ys
          return $ d : ds

        -- read number from a single cell
        num1 (x:xs) | ok1 x = pure (read x, xs)
        num1 _ = empty

        -- read number from two cells
        num2 (x:y:zs) | ok1 x && ok2 y = pure (read (x ++ "." ++ y), zs)
        num2 _ = empty

        -- first cell: "0" is okay, but otherwise can't start with "0"
        ok1 "0" = True
        ok1 (c:_) | c /= '0' = True
        ok1 _ = False

        -- second cell: can't end with "0" (or *be* "0")
        ok2 xs = last xs /= '0'
The above list-based parser tries to reduce ambiguity by assuming that if "xxx,yyy" is a number, the "xxx" won't start with zeros (unless it's just "0"), and the "yyy" won't end with a zero (or, for that matter, be a single "0"). If this isn't right, just modify ok1 and ok2 as appropriate.
Applied to row, this gives the single unambiguous parse:
> rowResults row
[[23.95489,0.0,20.9888]]
Applied to an ambiguous row, it gives all parses:
> rowResults ["0", "12", "5", "0", "8601"]
[[0.0,12.5,0.8601],[0.0,12.5,0.8601],[0.12,5.0,0.8601]]
Anyway, I'd suggest using a standard CSV parser to parse your file into a matrix of String cells like so:
dat :: [[String]]
dat = [ ["23", "95489", "0", "20", "9888"]
      , ["0", "12", "5", "0", "8601"]
      , ["23", "2611", "2", "233", "14", "422"]
      ]
and then use rowResults above to get the numbers of the rows that were ambiguous:
> map fst . filter ((>1) . snd) . zip [1..] . map (length . rowResults) $ dat
[2]
>
or unparsable:
> map fst . filter ((==0) . snd) . zip [1..] . map (length . rowResults) $ dat
[]
>
Assuming there are no unparsable rows, you can regenerate one possible fixed file, even if some rows are ambiguous, by just grabbing the first successful parse for each row:
> putStr $ unlines . map (intercalate "," . map show . head . rowResults) $ dat
23.95489,0.0,20.9888
0.0,12.5,0.8601
23.2611,2.233,14.422
>
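To get from the raw file to the [[String]] matrix above you can use any CSV library, but since these cells contain no quoting, a minimal hand-rolled split on commas would also do (a sketch, assuming no quoted or escaped fields):

splitCells :: String -> [String]
splitCells s = case break (== ',') s of
  (cell, [])       -> [cell]
  (cell, _ : rest) -> cell : splitCells rest

loadCells :: FilePath -> IO [[String]]
loadCells path = map splitCells . lines <$> readFile path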
Using a Custom Monad based on the List Monad for More Complex Parsing
For more complex parsing, for example if you wanted to parse a row like:
type Stream = [String]
row0 :: Stream
row0 = ["Apple", "15", "1", "5016", "2", "5", "3", "1801", "11/13/2018", "X101"]
with a mixture of strings and numbers, it's actually not that difficult to write a monadic parser, based on the list monad, that generates all possible parses.
The key idea is to define a parser as a function that takes a stream and generates a list of possible parses, with each possible parse represented as a tuple of the object successfully parsed from the beginning of the stream paired with the remainder of the stream. Wrapped in a newtype, our parallel parser would look like:
newtype PParser a = PParser (Stream -> [(a, Stream)]) deriving (Functor)
Note the similarity to the type ReadS from Text.ParserCombinators.ReadP, which is also technically an "all possible parses" parser (though you usually only expect one, unambiguous parse back from a reads call):
type ReadS a = String -> [(a, String)]
Anyway, we can define a Monad instance for PParser like so:
instance Applicative PParser where
  pure x = PParser (\s -> [(x, s)])
  (<*>) = ap

instance Monad PParser where
  PParser p >>= f = PParser $ \s1 -> do   -- in the list monad
    (x, s2) <- p s1
    let PParser q = f x
    (y, s3) <- q s2
    return (y, s3)
There's nothing too tricky here: pure x returns a single possible parse, namely the result x with an unchanged stream s, while p >>= f applies the first parser p to generate a list of possible parses, takes them one by one within the list monad to calculate the next parser q to use (which, as per usual for a monadic operation, can depend on the result of the first parse), and generates a list of possible final parses that are returned.
The Alternative and MonadPlus instances are pretty straightforward -- they just lift emptiness and alternation from the list monad:
instance Alternative PParser where
  empty = PParser (const empty)
  PParser p <|> PParser q = PParser $ \s -> p s <|> q s

instance MonadPlus PParser where
To run our parser, we have:
parse :: PParser a -> Stream -> [a]
parse (PParser p) s = map fst (p s)
and now we can introduce primitives:
-- read a token as-is
token :: PParser String
token = PParser $ \s -> case s of
  (x:xs) -> pure (x, xs)
  _      -> empty

-- require an end of stream
eof :: PParser ()
eof = PParser $ \s -> case s of
  [] -> pure ((), s)
  _  -> empty
and combinators:
-- combinator to convert a String to any readable type
convert :: (Read a) => PParser String -> PParser a
convert (PParser p) = PParser $ \s1 -> do
  (x, s2) <- p s1      -- for each possible String
  (y, "") <- reads x   -- get each possible full read (normally only one)
  return (y, s2)
and parsers for various "terms" in our CSV row:
-- read a string from a single cell
str :: PParser String
str = token
-- read an integer (any size) from a single cell
int :: PParser Int
int = convert (mfilter ok1 token)
-- read a double from one or two cells
dbl :: PParser Double
dbl = dbl1 <|> dbl2
  where dbl1 = convert (mfilter ok1 token)
        dbl2 = convert $ do
          t1 <- mfilter ok1 token
          t2 <- mfilter ok2 token
          return $ t1 ++ "." ++ t2

-- read a double that's < 30
dbl30 :: PParser Double
dbl30 = do
  x <- dbl
  guard $ x < 30
  return x
-- rules for first cell of numbers:
-- "0" is okay, but otherwise can't start with "0"
ok1 :: String -> Bool
ok1 "0" = True
ok1 (c:_) | c /= '0' = True
ok1 _ = False
-- rules for second cell of numbers:
-- can't be "0" or end in "0"
ok2 :: String -> Bool
ok2 xs = last xs /= '0'
Then, for a particular row schema, we can write a row parser as we normally would with a monadic parser:
-- a row
data Row = Row String Int Double Double Double
               Int String String
  deriving (Show)

rowResults :: PParser Row
rowResults = Row <$> str <*> int <*> dbl30 <*> dbl30 <*> dbl30
                 <*> int <*> str <*> str <* eof
and get all possible parses:
> parse rowResults row0
[Row "Apple" 15 1.5016 2.0 5.3 1801 "11/13/2018" "X101"
,Row "Apple" 15 1.5016 2.5 3.0 1801 "11/13/2018" "X101"]
>
The full program is:
{-# LANGUAGE DeriveFunctor #-}
{-# OPTIONS_GHC -Wall #-}

import Control.Monad
import Control.Applicative

type Stream = [String]

newtype PParser a = PParser (Stream -> [(a, Stream)]) deriving (Functor)

instance Applicative PParser where
  pure x = PParser (\s -> [(x, s)])
  (<*>) = ap

instance Monad PParser where
  PParser p >>= f = PParser $ \s1 -> do   -- in the list monad
    (x, s2) <- p s1
    let PParser q = f x
    (y, s3) <- q s2
    return (y, s3)

instance Alternative PParser where
  empty = PParser (const empty)
  PParser p <|> PParser q = PParser $ \s -> p s <|> q s

instance MonadPlus PParser where

parse :: PParser a -> Stream -> [a]
parse (PParser p) s = map fst (p s)

-- read a token as-is
token :: PParser String
token = PParser $ \s -> case s of
  (x:xs) -> pure (x, xs)
  _      -> empty

-- require an end of stream
eof :: PParser ()
eof = PParser $ \s -> case s of
  [] -> pure ((), s)
  _  -> empty

-- combinator to convert a String to any readable type
convert :: (Read a) => PParser String -> PParser a
convert (PParser p) = PParser $ \s1 -> do
  (x, s2) <- p s1      -- for each possible String
  (y, "") <- reads x   -- get each possible full read (normally only one)
  return (y, s2)

-- read a string from a single cell
str :: PParser String
str = token

-- read an integer (any size) from a single cell
int :: PParser Int
int = convert (mfilter ok1 token)

-- read a double from one or two cells
dbl :: PParser Double
dbl = dbl1 <|> dbl2
  where dbl1 = convert (mfilter ok1 token)
        dbl2 = convert $ do
          t1 <- mfilter ok1 token
          t2 <- mfilter ok2 token
          return $ t1 ++ "." ++ t2

-- read a double that's < 30
dbl30 :: PParser Double
dbl30 = do
  x <- dbl
  guard $ x < 30
  return x

-- rules for first cell of numbers:
-- "0" is okay, but otherwise can't start with "0"
ok1 :: String -> Bool
ok1 "0" = True
ok1 (c:_) | c /= '0' = True
ok1 _ = False

-- rules for second cell of numbers:
-- can't be "0" or end in "0"
ok2 :: String -> Bool
ok2 xs = last xs /= '0'

-- a row
data Row = Row String Int Double Double Double
               Int String String
  deriving (Show)

rowResults :: PParser Row
rowResults = Row <$> str <*> int <*> dbl30 <*> dbl30 <*> dbl30
                 <*> int <*> str <*> str <* eof

row0 :: Stream
row0 = ["Apple", "15", "1", "5016", "2", "5", "3", "1801", "11/13/2018", "X101"]

main = print $ parse rowResults row0
Off-the-shelf Solutions
I find it a little surprising I can't find an existing parser library out there that provides this kind of "all possible parses" parser. The stuff in Text.ParserCombinators.ReadP takes the right approach, but it assumes that you're parsing characters from a String rather than arbitrary tokens from some other stream (in our case, Strings from a [String]).
Maybe someone else can point out an off-the-shelf solution that would save you from having to roll your own parser type, instances, and primitives.
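Incidentally, the PParser type above is essentially StateT Stream [] from the transformers package, so you could also get the Functor, Applicative, Monad, Alternative and MonadPlus instances for free and only write the primitives. A minimal sketch of that variant:

import Control.Monad.Trans.State (StateT (..))

type Stream = [String]

type PParser = StateT Stream []   -- i.e. Stream -> [(a, Stream)], same as before

parse :: PParser a -> Stream -> [a]
parse p s = map fst (runStateT p s)

-- read a token as-is
token :: PParser String
token = StateT $ \s -> case s of
  (x:xs) -> [(x, xs)]
  _      -> []

-- require an end of stream
eof :: PParser ()
eof = StateT $ \s -> case s of
  [] -> [((), s)]
  _  -> []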
When I want to read a string to type A I write read str :: A. Now suppose I want a generic function which can read a string to one of several different types, so I want to write something like read str :: A ||| B ||| C or something similar. The only thing I could think of is:
{-# LANGUAGE TypeOperators #-}

import Text.Read
import Control.Applicative ((<|>))

infixr 9 |||

data a ||| b = A a | B b deriving Show
-- OR THIS:
-- data a ||| b = N | A a (a ||| b) | B b (a ||| b) deriving (Data, Show)

instance (Read a, Read b) => Read (a ||| b) where
  readPrec = parens $ do
    a <- (A <$> readPrec) <|> (B <$> readPrec)
    -- OR:
    -- a <- (flip A N <$> readPrec) <|> (flip B N <$> readPrec)
    return a
And if I want to read something:
> read "'a'"::Int|||Char|||String
B (A 'a')
But what can I do with such a weird type? I want to fold it to Int or to Char or to String ... or to something else, but "atomic" (scalar/simple). The final goal is to read strings like "1,'a'" into a list like [D 1, D 'a']. And the main constraint here is that the structure is flexible, so the string can be "1, 'a'" or "'a', 1" or "\"xxx\", 1, 2, 'a'". I know how to read something separated with a delimiter, but this something should be passed as a type, not as a sum of types like C Char | I Int | S String | etc. Is it possible? Or is there no way to accomplish it without a sum of types?
There’s no way to do this in general using read, because the same input string might parse correctly to more than one of the valid types. You could, however, do this with a function in the style of Text.Read.readMaybe that returns Nothing on ambiguous input. You might also return a tuple or list of the valid interpretations, or have a rule for which order to attempt to parse the types in, such as: attempt to parse each type in the order in which they were declared.
Here’s some example code, as proof of concept:
import Data.Maybe (catMaybes, fromJust, isJust, isNothing)
import qualified Text.Read

data AnyOf3 a b c = FirstOf3 a | SecondOf3 b | ThirdOf3 c

instance (Show a, Show b, Show c) => Show (AnyOf3 a b c) where
  show (FirstOf3 x) = show x -- Can infer the type from the pattern guard.
  show (SecondOf3 x) = show x
  show (ThirdOf3 x) = show x

main :: IO ()
main =
  (putStrLn . unwords . map show . catMaybes . map readDBS)
    ["True", "2", "\"foo\"", "bar"] >>
  (putStrLn . unwords . map show . readIID) "100"

readMaybe' :: (Read a, Read b, Read c) => String -> Maybe (AnyOf3 a b c)
-- Based on the function from Text.Read
readMaybe' x | isJust a && isNothing b && isNothing c =
                 (Just . FirstOf3 . fromJust) a -- Can infer the type of a from this.
             | isNothing a && isJust b && isNothing c =
                 (Just . SecondOf3 . fromJust) b -- Can infer the type of b from this.
             | isNothing a && isNothing b && isJust c =
                 (Just . ThirdOf3 . fromJust) c -- Can infer the type of c from this.
             | otherwise = Nothing
  where a = Text.Read.readMaybe x
        b = Text.Read.readMaybe x
        c = Text.Read.readMaybe x

readDBS :: String -> Maybe (AnyOf3 Double Bool String)
readDBS = readMaybe'

readToList :: (Read a, Read b, Read c) => String -> [AnyOf3 a b c]
readToList x = repack FirstOf3 x ++ repack SecondOf3 x ++ repack ThirdOf3 x
  where repack constructor y | isJust z = [(constructor . fromJust) z]
                             | otherwise = []
          where z = Text.Read.readMaybe y

readIID :: String -> [AnyOf3 Int Integer Double]
readIID = readToList
The first output line echoes every input that parsed successfully, that is, the Boolean constant, the number and the quoted string, but not bar. The second output line echoes every possible interpretation of the input, that is, 100 as an Int, an Integer and a Double.
For something more complicated, you want to write a parser. Haskell has some very good libraries to build them out of combinators. You might look at one such as Parsec. But it’s still helpful to understand what goes on under the hood.
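As a rough sketch of how the pieces above could be strung together for the "1,'a'" goal (assuming the cells contain no embedded commas; readDCS is just another specialisation of readMaybe', added here for illustration):

readDCS :: String -> Maybe (AnyOf3 Double Char String)
readDCS = readMaybe'

-- split on commas and read each cell as a Double, a Char or a String
parseCells :: String -> Maybe [AnyOf3 Double Char String]
parseCells = mapM readDCS . splitCells
  where splitCells s = case break (== ',') s of
          (cell, [])       -> [cell]
          (cell, _ : rest) -> cell : splitCells rest

With the Show instance above, print (parseCells "1,'a'") shows Just [1.0,'a'], while an ambiguous or unreadable cell makes the whole result Nothing.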
What I'm trying to do is take a list of strings as input, do some operations, and then return a list of strings. The problem is that I am looking for specific yet generic patterns of the string for each case:
func :: [String] -> [String]
func [] = []
func [x] = [x]
func (["Not","(","Not"]:es:[")"]) = es --HERE
func ("Not":"(":pred:"And":rest:")") = ("Not":pred:"Or":(pushNotInwards rest))
func ("Not":"(":pred:"Or":rest:")") = ("Not":pred:"And":(pushNotInwards rest))
func ("Not":"(":"ForAll":x:scope:")") = ("Exists":"Not":"("scope:")")
func ("Not":"(":"Exists":x:scope:")") = ("ForAll":"Not":"(":scope:")")
For the third case for instance, I want to take a list of strings in the form of:
["Not","(","Not",some_strings,")"]
I tried using ++ on the left hand side as:
func (["Not"]++["("]++["Not"])++es++[")"]) = es
I also tried concat and : but they didn't work either. Any suggestions?
You seem to have some confusion about the different string operators.
A String is just a synonym for a list of characters, i.e. [Char]. The colon : operator (aka cons) adds one element to the beginning of a list. Here's its type:
*Main> :t (:)
(:) :: a -> [a] -> [a]
For example:
*Main> 1:[2,3]
[1,2,3]
*Main> 'a':"bc"
"abc"
The ++ operator concatenates two lists. Here's its type:
*Main> :t (++)
(++) :: [a] -> [a] -> [a]
Pattern matching can only be done using a data constructor. The : operator is a data constructor, but the ++ operator is not. So you cannot define a function using pattern matching over the ++ operator.
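For instance, matching a fixed prefix with : patterns is fine, but a ++ pattern is rejected outright (a small illustration):

-- fine: (:) is a constructor, so a literal prefix can be matched element by element
startsWithNotOpen :: [String] -> Bool
startsWithNotOpen ("Not" : "(" : _) = True
startsWithNotOpen _                 = False

-- rejected by the compiler: (++) is an ordinary function, not a constructor
-- badFunc (xs ++ [")"]) = xs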
To define a function using pattern matching, I'd suggest defining a new data type for the different logical operators and quantifiers rather than using strings:
-- Logic Operation
data LogicOp =
    Not LogicOp | And [LogicOp] | Or [LogicOp] |
    Forall String LogicOp | Exists String LogicOp | T | F
  deriving (Eq, Show)
func :: LogicOp -> LogicOp
func (Not (Not x)) = x
func (Not (And (pred:rest))) = Or (Not pred:[func (Not (And rest))])
func (Not (Or (pred:rest))) = And (Not pred:[func (Not (Or rest))])
func (Not (Forall x scope)) = Exists x (Not scope)
func (Not (Exists x scope)) = Forall x (Not scope)
func x = x
Here are some examples:
*Main> func (Not (Not T))
T
*Main> func (Not (And [T, F, T]))
Or [Not T,Or [Not F,Or [Not T,Not (And [])]]]
*Main> func (Not (Or [T, F, T]))
And [Not T,And [Not F,And [Not T,Not (Or [])]]]
*Main> func (Not (Forall "x" (And [T, F])))
Exists "x" (Not (And [T,F]))
*Main> func (Not (Exists "x" (And [T, F])))
Forall "x" (Not (And [T,F]))
You should probably not use strings for that. Create a new type:
data SomeExpr = Not SomeExpr
              | And SomeExpr SomeExpr
              | Or SomeExpr SomeExpr
              deriving (Show)
Then you could match on that expression:
func :: SomeExpr -> SomeExpr
func (Not (Not x)) = func x
func (Not (And x y)) = Or (Not $ func x) (Not $ func y)
func (Not (Or x y)) = And (Not $ func x) (Not $ func y)
...
func x = x
You can't pattern match on the middle of a list. E.g. you might want to match [1,2,3,4,5] with something like (1:middle:5:[]), but that pattern only matches three-element lists, with middle binding a single element rather than everything between the 1 and the 5.
Yes, using your own type has its own problems - you have to parse it, etc. - but it is much easier and safer than working with strings (which could have arbitrary content).
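If you do have to stay with [String], one way to express the asker's third case without a ++ pattern is to combine Data.List.stripPrefix with an explicit check for the closing ")" - a sketch, not a full solution:

import Data.List (stripPrefix)

-- matches ["Not", "(", "Not", ..., ")"] and returns the strings in between
notNotCase :: [String] -> Maybe [String]
notNotCase xs = do
  rest <- stripPrefix ["Not", "(", "Not"] xs
  case reverse rest of
    ")" : revMiddle -> Just (reverse revMiddle)
    _               -> Nothing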
I have a function like so:
test :: [Block] -> [Block]
test ((Para foobar):rest) = [Para [(foobar !! 0)]] ++ rest
test ((Plain foobar):rest) = [Plain [(foobar !! 0)]] ++ rest
Block is a datatype that includes Para, Plain and others.
What the function does is not particularly important, but I notice that the body of the function ([Para [(foobar !! 0)]] ++ rest) is the same for both Para and Plain, except that the constructor used matches the one that wrapped foobar.
Question: is there someway to concisely write this function with the two cases combined?
Something like
test :: [Block] -> [Block]
test ((ParaOrPlain foobar):rest) = [ParaOrPlain [(foobar !! 0)]] ++ rest
where the first ParaOrPlain matches either Para foobar or Plain foobar and the second ParaOrPlain is Para or Plain respectively.
Note that Block can also be (say) a BulletList or OrderedList and I don't want to operate on those. (edit: test x = x for these other types.)
The key is that I don't want to replicate the function body twice since they're identical (barring the call to Para or Plain).
I get the feeling I could use Either, or perhaps my own data type, but I'm not sure how to do it.
Edit
To clarify, I know my function body is cumbersome (I'm new to Haskell) and I thank various answerers for their simplifications.
However at the heart of the problem, I wish to avoid the replication of the Para and Plain line. Something like (in my madeup language...)
# assume we have `test (x:rest)` where `x` is an arbitrary type
if (class(x) == 'Para' or class(x) == 'Plain') {
    conFun = Plain
    if (class(x) == 'Para')
        conFun = Para
    return [conFun ...]
}
# else do nothing (ID function, test x = x)
return x:rest
i.e., I want to know if it's possible in Haskell to assign a constructor function based on the type of the input parameter in this way. Hope that clarifies the question.
One approach is to use a (polymorphic) helper function, e.g.:
helper ctor xs rest = ctor [ head xs ] : rest
test :: [Block] -> [Block]
test ((Para xs):rest) = helper Para xs rest
test ((Plain xs):rest) = helper Plain xs rest
test bs = bs
Note that Para and Plain are just functions of type [whatever] -> Block.
Assuming a data type declaration in the form:
data Block
  = Para  { field :: [Something] }
  | Plain { field :: [Something] }
  ...
You can simply use a generic record syntax:
test :: [Block] -> [Block]
test (x:rest) = x { field = [(field x) !! 0] } : rest
First note that, as user5402 already wrote, you should replace [Para [(foobar !! 0)]] ++ rest with Para [foobar !! 0] : rest, which looks already a good deal nicer. Next note that basically, all you're doing is modify the head of a list, so as a meaningful helper I'd use
modifyHead :: (a->a) -> [a] -> [a]
modifyHead f (x:xs) = f x : xs
modifyHead _ [] = []

test = modifyHead m
  where m (Para foobar)  = Para [head foobar]   -- `head` is bad, but still better than `(!!0)`!
        m (Plain foobar) = Plain [head foobar]
        m other          = other                -- leave other kinds of Block unchanged
This is about as nice as I can get it without using Template Haskell or Data.Data :-)
First of all, we need to enable some extensions and fix the Block datatype for concreteness (if my simplifying assumptions are wrong, tell me and I'll see what can be salvaged!)
{-# LANGUAGE GADTs, LambdaCase #-}

data Block = Para [Int]
           | Plain [String]
           | Other Bool Block
           | YetAnother
  deriving (Show, Eq)
The important point here is that Para and Plain are unary constructors, but that the datatype itself can contain constructors with a different number of arguments.
As @leftaroundabout and @user5402 explained, we can separate the concerns of modifying a single Block and applying that modification to the first element of a list only, so I'll focus on rewriting
modifyBaseline :: Block -> Block
modifyBaseline (Para xs) = Para [xs !! 0]
modifyBaseline (Plain xs) = Plain [xs !! 0]
modifyBaseline rest = rest
Next, we need to be able to talk about unary constructors as values. There are 3 important things here:
does a value match a given constructor,
if so what values are 'inside',
and how can we (re-)construct such a value.
We package this up nicely in a custom type (t is the type the constructor belongs to, and a is what's inside):
data UnaryConstructor t a = UnaryConstructor {
    destruct  :: t -> Maybe a
  , construct :: a -> t
  }
So now we can define
para :: UnaryConstructor Block [Int]
para = UnaryConstructor (\case { Para ns -> Just ns ; _ -> Nothing }) Para
plain :: UnaryConstructor Block [String]
plain = UnaryConstructor (\case { Plain ss -> Just ss ; _ -> Nothing }) Plain
(You can get rid of the LambdaCase extension if you write (\xs -> case xs of { Para ns -> Just ns; _ -> Nothing}) but this way it's nice and compact!)
Next, we need to 'package up' a unary constructor and the function to apply to the value contained in it:
data UnaryConstructorModifier t where
  UnaryConstructorModifier :: UnaryConstructor t a -> (a -> a) -> UnaryConstructorModifier t
so that we can write
modifyHelper :: [UnaryConstructorModifier t] -> t -> t
modifyHelper [] t = t
modifyHelper ((UnaryConstructorModifier uc f):ucms) t
  | Just x <- destruct uc t = construct uc (f x)
  | otherwise               = modifyHelper ucms t
and finally (the lambda could have been (\xs -> [xs !! 0]) or (\xs -> [head xs]) to taste):
modify :: Block -> Block
modify = modifyHelper [ UnaryConstructorModifier para  (\(x:_) -> [x])
                      , UnaryConstructorModifier plain (\(x:_) -> [x]) ]
If we now test it with
ghci> let blocks = [Para [1,2,3,4], Plain ["a","b"], Other True (Para [42,43]), YetAnother ]
ghci> map modify blocks == map modifyBaseline blocks
we get True - hooray!
The repetitive bit is now in the definitions of para and plain where you have to write the name of the constructor three times, but there's no way around that without using Template Haskell or Data.Data (that I can see).
Further options for improvement would be to do something for constructors of different arity, and to put a function of type a -> Maybe a in UnaryConstructorModifier to deal with the partiality of (\(x:_) -> [x]) but I think this answers your question nicely.
I hope you can make sense of this, as I've glossed over a couple of details, including what's going on in the definition of UnaryConstructorModifier and the use of pattern guards in modifyHelper - so ask if you need clarification!
The closest you can get to your original idea is probably a "generic Para-or-Plain-modifying" function.
{-# LANGUAGE RankNTypes #-}
modifyBlock :: (forall a . [a]->[a]) -> Block -> Block
modifyBlock f (Plain foobar) = Plain $ f foobar
modifyBlock f (Para foobar) = Para $ f foobar
Observe that f has in each use a different type!
With that, you can then write
test (b:bs) = modifyBlock (\(h:_) -> [h]) b : bs
This can be done pretty nicely with ViewPatterns
Note: view patterns aren't actually required here, they just make it look a bit nicer imho
Note': This assumes the list inside the block is of the same type in both cases
{-# LANGUAGE ViewPatterns #-}
paraOrPlain :: Block a -> Maybe (a -> Block a, a)
paraOrPlain (Plain xs) = Just (Plain,xs)
paraOrPlain (Para xs) = Just (Para,xs)
paraOrPlain _ = Nothing
-- or even better
para (Para xs) = Just (Para,xs)
para _ = Nothing
plain (Plain xs) = Just (Plain,xs)
plain _ = Nothing
paraOrPlain' b = para b <|> plain b -- requires Control.Applicative
test ((paraOrPlain -> Just (con,xs)) : rest) = con (take 1 xs) : rest
Of course you still need to do the pattern matching somewhere - you can't do that generically without TH or Generic - but you can do it in a way that lends itself to reuse.
I've been trying to learn about static analysis of applicative functors. Many sources say that an advantage of using them over monads is the susceptibility to static analysis.
However, the only example I can find of actually performing static analysis is too complicated for me to understand. Are there any simpler examples of this?
Specifically, I want to know if I can perform static analysis on recursive applications. For example, something like:
y = f <$> x <*> y <*> z
When analyzing the above code, is it possible to detect that it is recursive on y? Or does referential transparency still prevent this from being possible?
Applicative functors allow static analysis at runtime, i.e. you can inspect the structure of a computation without running its effects. This is better explained by a simpler example.
Imagine you want to calculate a value, but want to track what dependencies that value has. E.g. you may use IO a to calculate the value, and have a list of Strings for the dependencies:
data Input a = Input { dependencies :: [String], runInput :: IO a }
Now we can easily make this an instance of Functor and Applicative. The functor instance is trivial. As it doesn't introduce any new dependencies, you just need to map over the runInput value:
instance Functor Input where
  fmap f (Input deps runInput) = Input deps (fmap f runInput)
The Applicative instance is more complicated. The pure function will just return a value with no dependencies. The <*> combiner will concatenate the two lists of dependencies (removing duplicates), and combine the two actions:
instance Applicative Input where
  pure = Input [] . return
  (Input deps1 getF) <*> (Input deps2 runInput) =
    Input (nub $ deps1 ++ deps2) (getF <*> runInput)
With that, we can also make Input a an instance of Num whenever a is:
instance (Num a) => Num (Input a) where
  (+) = liftA2 (+)
  (*) = liftA2 (*)
  abs = liftA abs
  signum = liftA signum
  fromInteger = pure . fromInteger
Next, let's make a couple of Inputs:
getTime :: Input UTCTime
getTime = Input { dependencies = ["Time"], runInput = getCurrentTime }

-- | Ideally this would fetch it from somewhere
stockPriceOf :: String -> Input Double
stockPriceOf stock = Input { dependencies = ["Stock ( " ++ stock ++ " )"], runInput = action }
  where action = case stock of
          "Apple"  -> return 500
          "Toyota" -> return 20
Finally, let's make a value that uses some inputs:
portfolioValue :: Input Double
portfolioValue = stockPriceOf "Apple" * 10 + stockPriceOf "Toyota" * 20
This is a pretty cool value. Firstly, we can find the dependencies of portfolioValue as a pure value:
> :t dependencies portfolioValue
dependencies portfolioValue :: [String]
> dependencies portfolioValue
["Stock ( Apple )","Stock ( Toyota )"]
That is the static analysis that Applicative allows - we know the dependencies without having to execute the action.
We can still get the value of the action though:
> runInput portfolioValue >>= print
5400.0
Now, why can't we do the same with Monad? The reason is that Monad can express choice, in that one action can determine what the next action will be.
Imagine there was a Monad interface for Input, and you had the following code:
mostPopularStock :: Input String
mostPopularStock = Input { dependencies = ["Popular Stock"], runInput = readFromWebMostPopularStock }

newPortfolio = do
  stock <- mostPopularStock
  stockPriceOf "Apple" * 40 + stockPriceOf stock * 10
Now, how can we calculate the dependencies of newPortfolio? It turns out we can't do it without using IO! The dependencies depend on the most popular stock, and the only way to know that is to run the IO action. Therefore it isn't possible to statically track dependencies when the type uses Monad, but it is completely possible with just Applicative. This is a good example of why less power often means more usefulness - as Applicative doesn't allow choice, dependencies can be calculated statically.
Edit: With regards to checking whether y is recursive on itself, such a check is possible with applicative functors if you are willing to annotate your function names.
data TrackedComp a = TrackedComp { deps :: [String], recursive :: Bool, run :: a}
instance (Show a) => Show (TrackedComp a) where
  show comp = "TrackedComp " ++ show (run comp)

instance Functor TrackedComp where
  fmap f (TrackedComp deps rec1 run) = TrackedComp deps rec1 (f run)

instance Applicative TrackedComp where
  pure = TrackedComp [] False
  (TrackedComp deps1 rec1 getF) <*> (TrackedComp deps2 rec2 value) =
    TrackedComp (combine deps1 deps2) (rec1 || rec2) (getF value)
-- | combine [1,1,1] [2,2,2] = [1,2,1,2,1,2]
combine :: [a] -> [a] -> [a]
combine x [] = x
combine [] y = y
combine (x:xs) (y:ys) = x : y : combine xs ys
instance (Num a) => Num (TrackedComp a) where
  (+) = liftA2 (+)
  (*) = liftA2 (*)
  abs = liftA abs
  signum = liftA signum
  fromInteger = pure . fromInteger

newComp :: String -> TrackedComp a -> TrackedComp a
newComp name tracked = TrackedComp (name : deps tracked) isRecursive (run tracked)
  where isRecursive = (name `elem` deps tracked) || recursive tracked
y :: TrackedComp [Int]
y = newComp "y" $ liftA2 (:) x z
x :: TrackedComp Int
x = newComp "x" $ 38
z :: TrackedComp [Int]
z = newComp "z" $ liftA2 (:) 3 y
> recursive x
False
> recursive y
True
> take 10 $ run y
[38,3,38,3,38,3,38,3,38,3]
Yes, applicative functors allow more analysis than monads. But no, you can't observe the recursion. I've written a paper about parsing which explains the problem in detail:
https://lirias.kuleuven.be/bitstream/123456789/352570/1/gc-jfp.pdf
The paper then discusses an alternative encoding of recursion which does allow analysis and has some other advantages and some downsides. Other related work is:
https://lirias.kuleuven.be/bitstream/123456789/376843/1/p97-devriese.pdf
And more related work can be found in the related work sections of those papers...