How to work with Regex and OverloadedStrings - haskell

I have been using Text.Regex.Posix in a file, and everything has worked fine so far.
Now I would like to use OverloadedStrings for something else in the same file. The problem is that when I enable OverloadedStrings, none of the regex-related code compiles, because the string literals become ambiguous.
Is there a way to deal with this without adding a type signature to every string or disabling OverloadedStrings?

I see two approaches here. You can do some import shuffling and just alias the functions you need to have less general types, such as
import qualified Text.Regex.Posix as P
import Text.Regex.Posix hiding ((=~))
(=~) :: RegexContext Regex String target => String -> String -> target
(=~) = (P.=~)
Then you don't have to change the code throughout your file. This can lead to confusion though, and it requires FlexibleContexts to work (not a big deal).
Alternatively you can create your own Python-like syntax for specifying the type:
r :: String -> String
r = id
u :: Text -> Text
u = id
b :: ByteString -> ByteString
b = id
example :: Bool
example = r"test" =~ r"te.t"
splitComma :: Text -> [Text]
splitComma = Data.Text.splitOn (u",")
But this will require you to edit more of your code. It doesn't use any extra language extensions and the code to implement it is very simple, even in comparison to the first method. It also means that you'll have to use parentheses or $ signs more carefully, but you can also use the r, u, and b functions as, well, functions.
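A minimal sketch of this second approach, using only the text and bytestring packages so that it runs without a regex library. The names r, u, and b are the ones from the answer; the sample values in main are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T

-- Identity functions whose only job is to pin down a literal's type.
r :: String -> String
r = id

u :: Text -> Text
u = id

b :: ByteString -> ByteString
b = id

splitComma :: Text -> [Text]
splitComma = T.splitOn (u ",")

main :: IO ()
main = do
  -- Without r, `length "test"` is ambiguous under OverloadedStrings.
  print (length (r "test"))
  print (splitComma (u "a,b,c"))
  print (B.length (b "bytes"))
```

The tags are ordinary functions, so they only cost a few characters at each literal where inference would otherwise get stuck.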

Related

How can I save a variable as a bytestring?

Ik this is a dumb question, but if I have this:
a :: B.ByteString
a = "a"
I get an error that says "Couldn't match type B.ByteString with type [Char]". I know what's the problem but I don't know how to fix it, could you help? thx.
Character string literals in Haskell are, by default, always treated as String, which is equivalent to [Char]. Most string-like data structures define a function called pack to convert from String, and the bytestring package is no exception. (Note that this is pack from Data.ByteString.Char8; the one in Data.ByteString converts from [Word8].)
import qualified Data.ByteString.Char8 as B
a :: B.ByteString
a = B.pack "a"
However, GHC also supports an extension called OverloadedStrings. If you're willing to enable this, ByteString implements a typeclass called IsString. With this extension enabled, the type of a string literal like "a" is no longer [Char] and is instead forall a. IsString a => a (similar to how the type of numerical literals like 3 is forall a. Num a => a). This will happily specialize to ByteString if the type is in scope.
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
a :: B.ByteString
a = "a"
If you go this route, make sure you understand the proviso listed in the docs for this instance. For ASCII characters, it won't pose a problem, but if your string has Unicode characters outside the ASCII range, you need to be aware of it.
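To make that proviso concrete, here is a small sketch of what happens to a non-ASCII character. It uses Data.ByteString.Char8.pack, which is documented to truncate each Char to 8 bits in the same way as the IsString instance, and compares it with UTF-8 encoding through the text package. The choice of the Greek letter lambda is just an example.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)

main :: IO ()
main = do
  let lambda = "\955"  -- U+03BB, written as an escape to avoid source-encoding issues
  -- Char8.pack keeps only the low 8 bits of each Char: 0x3BB becomes 0xBB.
  print (B.unpack (BC.pack lambda))              -- [187]
  -- Going through Text and encodeUtf8 keeps the character as two UTF-8 bytes.
  print (B.unpack (encodeUtf8 (T.pack lambda)))  -- [206,187]
```

If the data may contain non-ASCII text, encoding explicitly is usually the safer route.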

Parsec returns [Char] instead of Text

I am trying to create a parser for a custom file format. In the format I am working with, some fields have a closing tag like so:
<SOL>
<DATE>0517
<YEAR>86
</SOL>
I am trying to grab the value between the </ and > and use it as part of the bigger parser.
I have come up with the code below. The trouble is, the parser returns [Char] instead of Text. I can pack each Char by doing fmap pack $ return r to get a text value out, but I was hoping type inference would save me from having to do this. Could someone give hints as to why I am getting back [Char] instead of Text, and how I can get back Text without having to manually pack the value?
{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Text
import Text.Parsec
import Text.Parsec.Text
-- |A closing tag is on its own line and is a "</" followed by some uppercase characters
-- followed by some '>'
closingTag = do
    _ <- char '\n'
    r <- between (string "</") (char '>') (many upper)
    return r
string has the type
string :: Stream s m Char => String -> ParsecT s u m String
(See here for documentation)
So getting a String back is exactly what's supposed to happen.
Type inference doesn't change types, it only infers them. String is a concrete type, so there's no way to infer Text for it.
What you could do, if you need this in a couple of places, is to write a function
text :: Stream s m Char => String -> ParsecT s u m Text
text = fmap pack . string
or even
string' :: (IsString a, Stream s m Char) => String -> ParsecT s u m a
string' = fmap fromString . string
Also, it doesn't matter in this example but you'd probably want to import Text qualified, names like pack are used in a number of different modules.
As Ørjan Johansen correctly pointed out, string isn't actually the problem here, many upper is. The same principle applies though.
The reason you get [Char] here is that upper parses a Char and many turns that into a [Char]. I would write my own combinator along the lines of:
manyPacked = fmap pack . many
You could probably use type-level programming with type classes etc. to automatically choose between many and manyPacked depending on the expected return type, but I don't think that's worth it. (It would probably look a bit like Scala's CanBuildFrom.)
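A runnable sketch of the manyPacked idea applied to the closing-tag parser from the question. It assumes the parsec and text packages; the type signatures and the sample input are additions for illustration, not part of the original answer.

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import Text.Parsec
import Text.Parsec.Text (Parser)

-- Run a Char parser zero or more times and pack the result into Text.
manyPacked :: Parser Char -> Parser Text
manyPacked = fmap T.pack . many

-- A closing tag sits on its own line: "</", uppercase letters, then '>'.
closingTag :: Parser Text
closingTag = do
  _ <- char '\n'
  between (string "</") (char '>') (manyPacked upper)

main :: IO ()
main = print (parse closingTag "" (T.pack "\n</SOL>"))  -- Right "SOL"
```

The packing happens once, inside the combinator, so callers of closingTag get Text directly.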

Removing backslashes when show Strings?

I have a polymorphic function which appends a String to its input:
foo :: (Show a) => a -> String
foo f = show f ++ " string"
It is fine when I don't pass in a string, but when I pass in a string I get extra backslashes.
Is there a way I can avoid this?
show is not really a toString equivalent but rather an inspect or var_dump equivalent. It's not meant for formatting for human output.
You might consider http://hackage.haskell.org/package/text-format
I don't know of a standard library function for this, but it can be done simply with your own show-like class:
class StrShow a where
    showStr :: a -> String
instance StrShow String where
    showStr = id
instance Show a => StrShow a where
    showStr = show
GHCi> showStr 1
"1"
GHCi> showStr "hello"
"hello"
This way you don't need an extra library, but you do have to enable several GHC extensions (TypeSynonymInstances, FlexibleInstances, UndecidableInstances, OverlappingInstances), if that is not an issue for you.
One way of doing this, which isn't very nice but it's certainly possible, is to use the Typeable class.
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)
foo :: (Show a, Typeable a) => a -> String
foo f = fromMaybe (show f) (cast f)
However, this restricts it to members of the Typeable class (which is included in base, so you won't need to depend on any more libraries, and most things will have defined it).
This works by checking if f is a String (or pretending to be a String, which will only happen if someone's been REALLY evil when writing a library), and if it is, returning it, otherwise showing it.
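A short usage sketch of the Typeable approach, needing only base. The name display and the sample values are illustrative; the function body is the same as foo above.

```haskell
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- Return a String argument unchanged; fall back to show for everything else.
display :: (Show a, Typeable a) => a -> String
display x = fromMaybe (show x) (cast x)

main :: IO ()
main = do
  putStrLn (display "hello" ++ " string")      -- hello string
  putStrLn (display (42 :: Int) ++ " string")  -- 42 string
  putStrLn (show "hello" ++ " string")         -- "hello" string (plain show, for contrast)
```

Here cast succeeds only when the argument's type is exactly String, which is why the Int case falls through to show.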

haskell load module in list

Hey haskellers and haskellettes,
is it possible to load a module's functions into a list?
in my concrete case i have a list of functions all checked with or
checkRules :: [Nucleotide] -> Bool
checkRules nucs = or $ map ($ nucs) [checkRule1, checkRule2]
i do import checkRule1 and checkRule2 from a separate module - i don't know if i will need more of them in the future.
i'd like to have the same functionality look something like
-- import all functions from Rules as rules where
-- :t rules ~~> [([Nucleotide] -> Bool)]
checkRules :: [Nucleotide] -> Bool
checkRules nucs = or $ map ($ nucs) rules
the program sorts pseudo nucleotide sequences into viable and nonviable sequences according to given rules.
thanks in advance ε/2
Addendum:
So do i think right - i need:
genList :: File -> TypeSignature -> [TypeSignature]
chckfun :: (a->b) -> TypeSignature -> Bool
at compile time.
but i can't generate a list of all functions in the module - as they most probably will have not the same type signature and hence not all fit in one list. so i cannot filter given list with chckfun.
In order to do this i either want to check the written type signatures in the source file (?) or the types inferred by the compiler (?).
another problem that comes to my mind is: not every function written in the source file might get exported?
Is this a problem a haskell beginner should try to solve after 5 months of learning - my brain is shaped like a Klein bottle after all this "compile time thinking".
There is a nice package on Hackage just for this: language-haskell-extract. In particular, the Template Haskell function functionExtractor takes a regular expression and returns a list of the matching top level bindings as (name, value) pairs. As long as they all have matching types, you're good to go.
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.Extract
myFoo = "Hello"
myBar = "World"
allMyStuff = $(functionExtractor "^my")
main = print allMyStuff
Output:
[("myFoo","Hello"),("myBar","World")]
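If pulling in Template Haskell feels heavy, a simpler alternative (not the approach from the answer above) is to have the rules module export an explicit list. The sketch below puts everything in one file; the Nucleotide type and the two rule bodies are hypothetical stand-ins for the asker's definitions.

```haskell
-- Hypothetical stand-ins for the asker's types and rules.
data Nucleotide = A | C | G | T deriving (Eq, Show)

checkRule1, checkRule2 :: [Nucleotide] -> Bool
checkRule1 = elem A
checkRule2 nucs = length nucs > 3

-- In a real project this list would live in, and be exported from, the Rules module.
rules :: [[Nucleotide] -> Bool]
rules = [checkRule1, checkRule2]

-- True when at least one rule accepts the sequence.
checkRules :: [Nucleotide] -> Bool
checkRules nucs = any ($ nucs) rules

main :: IO ()
main = do
  print (checkRules [C, G])  -- False
  print (checkRules [A, C])  -- True
```

The cost is one extra line per new rule; the benefit is that the type checker rejects any rule with the wrong signature.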

Best way to implement ad-hoc polymorphism in Haskell?

I have a polymorphic function like:
convert :: (Show a) => a -> String
convert a = " [label=" ++ (show a) ++ "]"
But sometimes I want to pass it a Data.Map and do some more fancy key value conversion. I know I can't pattern match here because Data.Map is an abstract data type (according to this similar SO question), but I have been unsuccessful using guards to this end, and I'm not sure if ViewPatterns would help here (and would rather avoid them for portability).
This is more what I want:
import qualified Data.Map as M
convert :: (Show a) => a -> String
convert a
    | M.size a /= 0 = processMap2FancyKVString a -- Here's a Data.Map
    | otherwise = " [label=" ++ (show a) ++ "]" -- Probably a string
But this doesn't work because M.size can't take anything other than a Data.Map.
Specifically, I am trying to modify the sl utility function in the Functional Graph Library in order to handle coloring and other attributes of edges in GraphViz output.
Update
I wish I could accept all three answers by TomMD, Antal S-Z, and luqui to this question as they all understood what I really was asking. I would say:
Antal S-Z gave the most 'elegant' solution as applied to the FGL but would also require the most rewriting and rethinking to implement in personal problem.
TomMD gave a great answer that lies somewhere between Antal S-Z's and luqui's in terms of applicability vs. correctness. It also is direct and to the point which I appreciate greatly and why I chose his answer.
luqui gave the best 'get it working quickly' answer which I will probably be using in practice (as I'm a grad student, and this is just some throwaway code to test some ideas). The reason I didn't accept was because TomMD's answer will probably help other people in more general situations better.
With that said, they are all excellent answers and the above classification is a gross simplification. I've also updated the question title to better represent my question. Thanks again for broadening my horizons, everyone!
What you just explained is you want a function that behaves differently based on the type of the input. While you could use a data wrapper, thus closing the function for all time:
data Convertable k a = ConvMap (Map k a) | ConvOther a
convert (ConvMap m) = ...
convert (ConvOther o) = ...
A better way is to use type classes, thus leaving the convert function open and extensible while preventing users from inputting nonsensical combinations (e.g. ConvOther M.empty).
class (Show a) => Convertable a where
    convert :: a -> String
-- The Show superclass means every instance needs a Show instance as well.
instance (Show k, Show a) => Convertable (M.Map k a) where
    convert m = processMap2FancyKVString m
newtype ConvWrapper a = CW a deriving Show
instance (Show a) => Convertable (ConvWrapper a) where
    convert (CW a) = " [label=" ++ (show a) ++ "]"
In this manner you can have the instances you want used for each different data type and every time a new specialization is needed you can extend the definition of convert simply by adding another instance Convertable NewDataType where ....
Some people might frown at the newtype wrapper and suggest an instance like:
instance Convertable a where
    convert ...
But this will require the strongly discouraged overlapping and undecidable instances extensions for very little programmer convenience.
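A self-contained sketch of the type class plus newtype wrapper design. For brevity it omits the Show superclass, and processMap2FancyKVString is a hypothetical stand-in, since the question never shows its definition.

```haskell
import qualified Data.Map as M
import Data.List (intercalate)

class Convertable a where
  convert :: a -> String

-- Hypothetical stand-in for the asker's processMap2FancyKVString.
processMap2FancyKVString :: (Show k, Show v) => M.Map k v -> String
processMap2FancyKVString m =
  " [" ++ intercalate "," [show k ++ "=" ++ show v | (k, v) <- M.toList m] ++ "]"

-- Maps get the fancy key/value rendering.
instance (Show k, Show v) => Convertable (M.Map k v) where
  convert = processMap2FancyKVString

-- Everything else opts in to the default rendering through the wrapper.
newtype ConvWrapper a = CW a

instance Show a => Convertable (ConvWrapper a) where
  convert (CW a) = " [label=" ++ show a ++ "]"

main :: IO ()
main = do
  putStrLn (convert (M.fromList [(1 :: Int, "red"), (2, "blue")]))
  putStrLn (convert (CW (42 :: Int)))
```

Because no instance head is a bare type variable, this compiles without any overlapping or undecidable instance extensions.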
You may not be asking the right thing. I'm going to assume that you either have a graph whose nodes are all Maps or you have a graph whose nodes are all something else. If you need a graph where Maps and non-maps coexist, then there is more to your problem (but this solution will still help). See the end of my answer in that case.
The cleanest answer here is simply to use different convert functions for different types, and have any type that depends on convert take it as an argument (a higher order function).
So in GraphViz (avoiding redesigning this crappy code) I would modify the graphviz function to look like:
graphvizWithLabeler :: (a -> String) -> ... -> String
graphvizWithLabeler labeler ... =
    ...
    where sa = labeler a
And then have graphviz trivially delegate to it:
graphviz = graphvizWithLabeler sl
Then graphviz continues to work as before, and you have graphvizWithLabeler when you need the more powerful version.
So for graphs whose nodes are Maps, use graphvizWithLabeler processMap2FancyKVString, otherwise use graphviz. This decision can be postponed as long as possible by taking relevant things as higher order functions or typeclass methods.
If you need to have Maps and other things coexisting in the same graph, then you need to find a single type inhabited by everything a node could be. This is similar to TomMD's suggestion. For example:
data NodeType
    = MapNode (Map.Map Foo Bar)
    | IntNode Int
Parameterized to the level of genericity you need, of course. Then your labeler function should decide what to do in each of those cases.
A key point to remember is that Haskell has no downcasting. A function of type foo :: a -> a has no way of knowing anything about what was passed to it (within reason, cool your jets pedants). So the function you were trying to write is impossible to express in Haskell. But as you can see, there are other ways to get the job done, and they turn out to be more modular.
Did that tell you what you needed to know to accomplish what you wanted?
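The labeler-as-argument idea from this answer can be sketched without FGL at all. Everything below is illustrative: renderNodes stands in for a function like graphvizWithLabeler, and the NodeType fields are arbitrary choices.

```haskell
import qualified Data.Map as M

-- Hypothetical node type covering everything a node may hold.
data NodeType
  = MapNode (M.Map String Int)
  | IntNode Int

-- Stand-in for a graph renderer that takes its labeler as an argument.
renderNodes :: (a -> String) -> [a] -> String
renderNodes labeler = unlines . map labeler

-- One labeler that decides what to do for each kind of node.
labelNode :: NodeType -> String
labelNode (MapNode m) = " [attrs=" ++ show (M.toList m) ++ "]"
labelNode (IntNode n) = " [label=" ++ show n ++ "]"

main :: IO ()
main = putStr (renderNodes labelNode [IntNode 1, MapNode (M.fromList [("w", 2)])])
```

The renderer never inspects the node type itself, so supporting a new kind of node only means passing a different labeler.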
Your problem isn't actually the same as in that question. In the question you linked to, Derek Thurn had a function which he knew took a Set a, but couldn't pattern-match. In your case, you're writing a function which will take any a which has an instance of Show; you can't tell what type you're looking at at runtime, and can only rely on the functions which are available to any Showable type. If you want to have a function do different things for different data types, this is known as ad-hoc polymorphism, and is supported in Haskell with type classes like Show. (This is as opposed to parametric polymorphism, which is when you write a function like head (x:_) = x which has type head :: [a] -> a; the unconstrained universal a is what makes that parametric instead.)
So to do what you want, you'll have to create your own type class, and instantiate it when you need it. However, it's a little more complicated than usual, because you want to make everything that's part of Show implicitly part of your new type class. This requires some potentially dangerous and probably unnecessarily powerful GHC extensions.
Instead, why not simplify things? You can probably figure out the subset of types which you actually need to print in this manner. Once you do that, you can write the code as follows:
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}
-- FlexibleInstances is also required for the String (i.e. [Char]) instance.
module GraphvizTypeclass where
import qualified Data.Map as M
import Data.Map (Map)
import Data.List (intercalate) -- For output formatting
surround :: String -> String -> String -> String
surround before after = (before ++) . (++ after)
squareBrackets :: String -> String
squareBrackets = surround "[" "]"
quoted :: String -> String
quoted = let replace '"' = "\\\""
             replace c   = [c]
         in surround "\"" "\"" . concatMap replace
class GraphvizLabel a where
    toGVItem :: a -> String
    toGVLabel :: a -> String
    toGVLabel = squareBrackets . ("label=" ++) . toGVItem
-- We only need to print Strings, Ints, Chars, and Maps.
instance GraphvizLabel String where
    toGVItem = quoted
instance GraphvizLabel Int where
    toGVItem = quoted . show
instance GraphvizLabel Char where
    toGVItem = toGVItem . (: []) -- Custom behavior: no single quotes.
instance (GraphvizLabel k, GraphvizLabel v) => GraphvizLabel (Map k v) where
    -- foldrWithKey replaces the deprecated foldWithKey.
    toGVItem = let kvfn k v = ((toGVItem k ++ "=" ++ toGVItem v) :)
               in intercalate "," . M.foldrWithKey kvfn []
    toGVLabel = squareBrackets . toGVItem
In this setup, everything which we can output to Graphviz is an instance of GraphvizLabel; the toGVItem function quotes things, and toGVLabel puts the whole thing in square brackets for immediate use. (I might have screwed some of the formatting you want up, but that part's just an example.) You then declare what's an instance of GraphvizLabel, and how to turn it into an item. The TypeSynonymInstances flag just lets us write instance GraphvizLabel String instead of instance GraphvizLabel [Char]; it's harmless.
Now, if you really need everything with a Show instance to be an instance of GraphvizLabel as well, there is a way. If you don't really need this, then don't use this code! If you do need to do this, you have to bring to bear the scarily-named UndecidableInstances and OverlappingInstances language extensions (and the less scarily named FlexibleInstances). The reason for this is that you have to assert that everything which is Showable is a GraphvizLabel—but this is hard for the compiler to tell. For instance, if you use this code and write toGVLabel [1,2,3] at the GHCi prompt, you'll get an error, since 1 has type Num a => a, and Char might be an instance of Num! You have to explicitly specify toGVLabel ([1,2,3] :: [Int]) to get it to work. Again, this is probably unnecessarily heavy machinery to bring to bear on your problem. Instead, if you can limit the things you think will be converted to labels, which is very likely, you can just specify those things instead! But if you really want Showability to imply GraphvizLabelability, this is what you need:
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances
           , UndecidableInstances, OverlappingInstances #-}
-- Leave the module declaration, imports, formatting code, and class declaration
-- the same.
instance GraphvizLabel String where
    toGVItem = quoted
instance Show a => GraphvizLabel a where
    toGVItem = quoted . show
instance (GraphvizLabel k, GraphvizLabel v) => GraphvizLabel (Map k v) where
    -- foldrWithKey replaces the deprecated foldWithKey.
    toGVItem = let kvfn k v = ((toGVItem k ++ "=" ++ toGVItem v) :)
               in intercalate "," . M.foldrWithKey kvfn []
    toGVLabel = squareBrackets . toGVItem
Notice that your specific cases (GraphvizLabel String and GraphvizLabel (Map k v)) stay the same; you've just collapsed the Int and Char cases into the GraphvizLabel a case. Remember, UndecidableInstances means exactly what it says: the compiler cannot tell if instances are checkable or will instead make the typechecker loop! In this case, I am reasonably sure that everything here is in fact decidable (but if anybody notices where I'm wrong, please let me know). Nevertheless, using UndecidableInstances should always be approached with caution.
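On newer GHC versions (7.10 and later) the module-wide OverlappingInstances extension is deprecated in favor of per-instance pragmas. The sketch below is a cut-down variant of the catch-all design using those pragmas; the class name Label and its single method are simplifications for illustration, not the code from the answer.

```haskell
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}
import qualified Data.Map as M
import Data.List (intercalate)

class Label a where
  toItem :: a -> String

-- Specific instance: strings are wrapped in double quotes.
instance {-# OVERLAPPING #-} Label String where
  toItem s = "\"" ++ s ++ "\""

-- Catch-all instance: anything Showable, unless a more specific instance matches.
instance {-# OVERLAPPABLE #-} Show a => Label a where
  toItem = show

-- Specific instance: maps are rendered as key=value pairs.
instance {-# OVERLAPPING #-} (Label k, Label v) => Label (M.Map k v) where
  toItem m = intercalate "," [toItem k ++ "=" ++ toItem v | (k, v) <- M.toList m]

main :: IO ()
main = do
  putStrLn (toItem "node")
  putStrLn (toItem (7 :: Int))
  putStrLn (toItem (M.fromList [("a", 1 :: Int), ("b", 2)]))
```

The same caution applies: UndecidableInstances is still required for the catch-all instance, and numeric literals still need an explicit type annotation.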
