Differentiate between String and [Char] - haskell

I know that String is defined as [Char], yet I would like to make a difference between the two of them in a class instance. Is that possible with some clever trick other than using newtype to create a separate type? I would like to do something like:
class Something a where
doSomething :: a -> a
instance Something String where
doSomething = id
instance (Something a) => Something [a] where
doSomething = doSoemthingElse
And get different results when I call it with doSomething ("a" :: [Char]) and doSomething ("a" :: String).
I do know about FlexibleInstances and OverlappingInstances but they obviously don't cut the case.

That is not possible. String and [Char] are the same type. There is no way to distinguish between the two types. With OverlappingInstances you can create separate instances for String and [a], but [Char] will always use the instance for String.

Why don't you define functions for each case:
doSomethingString :: String -> String
doSomethingString = id
doSomethingChars :: (Char -> Char) -> String -> String
doSomethingChars f = ...

As said before, String and [Char] IS effectively the same.
What you can do though is defining a newtype. It wraps a type (at no-cost as it is completely removed when compiled) and makes it behave like a different type.
newtype Blah = Blah String
instance Monoid Blah where
mempty = Blah ""
mappend (Blah a) (Blah b) = Blah $ a ++ "|" ++ b

Related

Deserializing many network messages without using an ad-hoc parser implementation

I have a question pertaining to deserialization. I can envision a solution using Data.Data, Data.Typeable, or with GHC.Generics, but I'm curious if it can be accomplished without generics, SYB, or meta-programming.
Problem Description:
Given a list of [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize the [String] to construct the target data type. I could write a parser to do this, but I'm looking for a generalized solution that will deserialize to an arbitrary number of data types defined within the program without writing a parser for each type. With knowledge of the number and type of value constructors an algebraic type has, it's as simple as performing a read on each string to yield the appropriate values necessary to build up the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it's otherwise impossible).
Say I have around 50 types defined similar to this (all simple algebraic types composed of basic primitives (no nested or recursive types, just different combinations and orderings of primitives) :
data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}
data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }
I can determine the data-type to be associated with a [String] I've received over the network using a tag id that I parse before each [String].
Possible conjectured solution path:
Since data constructors are first-class values in Haskell, and actually have a type-- Can NetworkMsg constructor be thought of as a function, such as:
NetworkMsg :: Int -> Int -> Double -> NetworkMsg
Could I transform this function into a function on tuples using uncurryN then copy the [String] into a tuple of the same shape the function now takes?
NetworkMsg' :: (Int, Int, Double) -> NetworkMsg
I don't think this would work because I'd need knowledge of the value constructors and type information, which would require Data.Typeable, reflection, or some other metaprogramming technique.
Basically, I'm looking for automatic deserialization of many types without writing type instance declarations or analyzing the type's shape at run-time. If it's not feasible, I'll do it an alternative way.
You are correct in that the constructors are essentially just functions so you can write generic instances for any number of types by just writing instances for the functions. You'll still need to write a separate instance
for all the different numbers of arguments, though.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Text.Read
import Control.Applicative
class FieldParser p r where
parseFields :: p -> [String] -> Maybe r
instance Read a => FieldParser (a -> r) r where
parseFields con [a] = con <$> readMaybe a
parseFields _ _ = Nothing
instance (Read a, Read b) => FieldParser (a -> b -> r) r where
parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
parseFields _ _ = Nothing
instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseFields _ _ = Nothing
{- etc. for as many arguments as you need -}
Now you can use this type class to parse any message based on the constructor as long as the type-checker is able to figure out the resulting message type from context (i.e. it is not able to deduce it simply from the given constructor for these sort of multi-param type class instances).
data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB ::Int, fieldC :: Float} deriving Show
test :: String -> [String] -> IO ()
test tag fields = case tag of
"Test1" -> case parseFields Test1 fields of
Just (a :: Test1) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
"Test2" -> case parseFields Test2 fields of
Just (a :: Test2) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
I'd like to know how exactly you use the message types in the application, though, because having each message as its separate type makes it very difficult to have any sort of generic message handler.
Is there some reason why you don't simply have a single message data type? Such as
data NetworkMsg
= NetworkMsg1 {fieldA :: Int}
| NetworkMsg2 {fieldB :: Int, fieldC :: Float}
Now, while the instances are built in pretty much the same way, you get much better type inference since the result type is always known.
instance Read a => MessageParser (a -> NetworkMsg) where
parseMsg con [a] = con <$> readMaybe a
instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
"NetworkMsg1" -> parseMsg NetworkMsg1 fields
"NetworkMsg2" -> parseMsg NetworkMsg2 fields
_ -> Nothing
I'm also not sure why you want to do type-generic programming specifically without actually using any of the tools meant for generics. GHC.Generics, SYB or Template Haskell is usually the best solution for this kind of problem.

Is there a compiler-extension for untagged union types in Haskell?

In some languages (#racket/typed, for example), the programmer can specify a union type without discriminating against it, for instance, the type (U Integer String) captures integers and strings, without tagging them (I Integer) (S String) in a data IntOrStringUnion = ... form or anything like that.
Is there a way to do the same in Haskell?
Either is what you're looking for... ish.
In Haskell terms, I'd describe what you're looking for as an anonymous sum type. By anonymous, I mean that it doesn't have a defined name (like something with a data declaration). By sum type, I mean a data type that can have one of several (distinguishable) types; a tagged union or such. (If you're not familiar with this terminology, try Wikipedia for starters.)
We have a well-known idiomatic anonymous product type, which is just a tuple. If you want to have both an Int and a String, you just smush them together with a comma: (Int, String). And tuples (seemingly) can go on forever--(Int, String, Double, Word), and you can pattern-match the same way. (There's a limit, but never mind.)
The well-known idiomatic anonymous sum type is Either, from Data.Either (and the Prelude):
data Either a b = Left a | Right b
deriving (Eq, Ord, Read, Show, Typeable)
It has some shortcomings, most prominently a Functor instance that favors Right in a way that's odd in this context. The problem is that chaining it introduces a lot of awkwardness: the type ends up like Either (Int (Either String (Either Double Word))). Pattern matching is even more awkward, as others have noted.
I just want to note that we can get closer to (what I understand to be) the Racket use case. From my extremely brief Googling, it looks like in Racket you can use functions like isNumber? to determine what type is actually in a given value of a union type. In Haskell, we usually do that with case analysis (pattern matching), but that's awkward with Either, and function using simple pattern-matching will likely end up hard-wired to a particular union type. We can do better.
IsNumber?
I'm going to write a function I think is an idiomatic Haskell stand-in for isNumber?. First, we don't like doing Boolean tests and then running functions that assume their result; instead, we tend to just convert to Maybe and go from there. So the function's type will end with -> Maybe Int. (Using Int as a stand-in for now.)
But what's on the left hand of the arrow? "Something that might be an Int -- or a String, or whatever other types we put in the union." Uh, okay. So it's going to be one of a number of types. That sounds like typeclass, so we'll put a constraint and a type variable on the left hand of the arrow: MightBeInt a => a -> Maybe Int. Okay, let's write out the class:
class MightBeInt a where
isInt :: a -> Maybe Int
fromInt :: Int -> a
Okay, now how do we write the instances? Well, we know if the first parameter to Either is Int, we're golden, so let's write that out. (Incidentally, if you want a nice exercise, only look at the instance ... where parts of these next three code blocks, and try to implement that class members yourself.)
instance MightBeInt (Either Int b) where
isInt (Left i) = Just i
isInt _ = Nothing
fromInt = Left
Fine. And ditto if Int is the second parameter:
instance MightBeInt (Either a Int) where
isInt (Right i) = Just i
isInt _ = Nothing
fromInt = Right
But what about Either String (Either Bool Int)? The trick is to recurse on the right hand type: if it's not Int, is it an instance of MightBeInt itself?
instance MightBeInt b => MightBeInt (Either a b) where
isInt (Right xs) = isInt xs
isInt _ = Nothing
fromInt = Right . fromInt
(Note that these all require FlexibleInstances and OverlappingInstances.) It took me a long time to get a feel for writing and reading these class instances; don't worry if this instance is surprising. The punch line is that we can now do this:
anInt1 :: Either Int String
anInt1 = fromInt 1
anInt2 :: Either String (Either Int Double)
anInt2 = fromInt 2
anInt3 :: Either String Int
anInt3 = fromInt 3
notAnInt :: Either String Int
notAnInt = Left "notint"
ghci> isInt anInt3
Just 3
ghci> isInt notAnInt
Nothing
Great!
Generalizing
Okay, but now do we need to write another type class for each type we want to look up? Nope! We can parameterize the class by the type we want to look up! It's a pretty mechanical translation; the only question is how to tell the compiler what type we're looking for, and that's where Proxy comes to the rescue. (If you don't want to install tagged or run base 4.7, just define data Proxy a = Proxy. It's nothing special, but you'll need PolyKinds.)
class MightBeA t a where
isA :: proxy t -> a -> Maybe t
fromA :: t -> a
instance MightBeA t t where
isA _ = Just
fromA = id
instance MightBeA t (Either t b) where
isA _ (Left i) = Just i
isA _ _ = Nothing
fromA = Left
instance MightBeA t b => MightBeA t (Either a b) where
isA p (Right xs) = isA p xs
isA _ _ = Nothing
fromA = Right . fromA
ghci> isA (Proxy :: Proxy Int) anInt3
Just 3
ghci> isA (Proxy :: Proxy String) notAnInt
Just "notint"
Now the usability situation is... better. The main thing we've lost, by the way, is the exhaustiveness checker.
Notational Parity With (U String Int Double)
For fun, in GHC 7.8 we can use DataKinds and TypeFamilies to eliminate the infix type constructors in favor of type-level lists. (In Haskell, you can't have one type constructor--like IO or Either--take a variable number of parameters, but a type-level list is just one parameter.) It's just a few lines, which I'm not really going to explain:
type family OneOf (as :: [*]) :: * where
OneOf '[] = Void
OneOf '[a] = a
OneOf (a ': as) = Either a (OneOf as)
Note that you'll need to import Data.Void. Now we can do this:
anInt4 :: OneOf '[Int, Double, Float, String]
anInt4 = fromInt 4
ghci> :kind! OneOf '[Int, Double, Float, String]
OneOf '[Int, Double, Float, String] :: *
= Either Int (Either Double (Either Float [Char]))
In other words, OneOf '[Int, Double, Float, String] is the same as Either Int (Either Double (Either Float [Char])).
You need some kind of tagging because you need to be able to check if a value is actually an Integer or a String to use it for anything. One way to alleviate having to create a custom ADT for every combination is to use a type such as
{-# LANGUAGE TypeOperators #-}
data a :+: b = L a | R b
infixr 6 :+:
returnsIntOrString :: Integer -> Integer :+: String
returnsIntOrString i
| i `rem` 2 == 0 = R "Even"
| otherwise = L (i * 2)
returnsOneOfThree :: Integer -> Integer :+: String :+: Bool
returnsOneOfThree i
| i `rem` 2 == 0 = (R . L) "Even"
| i `rem` 3 == 0 = (R . R) False
| otherwise = L (i * 2)

How do I let a function in Haskell depend on the type of its argument?

I tried to write a variation on show that treats strings differently from other instances of Show, by not including the " and returning the string directly. But I don't know how to do it. Pattern matching? Guards? I couldn't find anything about that in any documentation.
Here is what I tried, which doesn't compile:
show_ :: Show a => a -> String
show_ (x :: String) = x
show_ x = show x
If possible, you should wrap your values of type String up in a newtype as #wowofbob suggests.
However, sometimes this isn't feasible, in which case there are two general approaches to making something recognise String specifically.
The first way, which is the natural Haskell approach, is to use a type class just like Show to get different behaviour for different types. So you might write
class Show_ a where
show_ :: a -> String
and then
instance Show_ String where
show_ x = x
instance Show_ Int where
show_ x = show x
and so on for any other type you want to use. This has the disadvantage that you need to explicitly write out Show_ instances for all the types you want.
#AndrewC shows how you can cut each instance down to a single line, but you'll still have to list them all explicitly. You can in theory work around this, as detailed in this question, but it's not pleasant.
The second option is to get true runtime type information with the Typeable class, which is quite short and simple in this particular situation:
import Data.Typeable
[...]
show_ :: (Typeable a, Show a) => a -> String
show_ x =
case cast x :: Maybe String of
Just s -> s
Nothing -> show x
This is not a natural Haskell-ish approach because it means callers can't tell much about what the function will do from the type.
Type classes in general give constrained polymorphism in the sense that the only variations in behaviour of a particular function must come from the variations in the relevant type class instances. The Show_ class gives some indication what it's about from its name, and it might be documented.
However Typeable is a very general class. You are delegating everything to the specific function you are calling; a function with a Typeable constraint might have completely different implementations for lots of different concrete types.
Finally, a further elaboration on the Typeable solution which gets closer to your original code is to use a couple of extensions:
{-# LANGUAGE ViewPatterns, ScopedTypeVariables #-}
import Data.Typeable
[...]
show_ :: (Typeable a, Show a) => a -> String
show_ (cast -> Just (s :: String)) = s
show_ x = show x
The use of ViewPatterns allows us to write the cast inside a pattern, which may fit in more nicely with more complicated examples. In fact we can omit the :: String type constraint because the body of this cases forces s to be the result type of show_, i.e. String, anyway. But that's a little obscure so I think it's better to be explicit.
You can wrap it into newtype and make custom Show instance for it:
newtype PrettyString = PrettyString { toString :: String }
instance Show PrettyString where
show (PrettyString s) = "$$" ++ s ++ "$$" -- for example
And then use it like below:
main = getLine >>= print . PrettyString
TL;DR:
Copy the prelude's way and use showList_ as a class function to generate instances for lists so you can override the definition for String.
Caveat
For the record, wowofbob's answer using a newtype wrapper is the simple, clean solution I would use in real life, but I felt it was instructive to also look at some of how the Prelude does this.
intercalate commas by default
The way this is done in the prelude is to make the Show class have a function for showing lists, with a default definition that you can override.
import Data.List (intercalate)
I'll use intercalate :: [a] -> [[a]] -> [a] to put commas in between stuff:
ghci> intercalate "_._" ["intercalate","works","like","this"]
"intercalate_._works_._like_._this"
Make a showList_ class function, and default to show and comma-separated lists.
So now the class with the default implementation of the showList function and, importantly, a default show_ implementation that just uses the ordinary show function. To be able to use that, we have to insist that the type is already in the Show typeclass, but that's OK as far as I understand you.
class Show a => Show_ a where
show_ :: a -> String
showList_ :: [a] -> String
show_ = show
showList_ xs = '[' : intercalate ", " (map show_ xs) ++ "]"
The real Show class uses functions of type String -> String instead of String directly for efficiency reasons, and a precedence argument to control the use of brackets, but I'll skip all that for simplicity.
Automatically make instances for lists
Now we can use the showList function to provide an instance for lists:
instance Show_ a => Show_ [a] where
show_ xs = showList_ xs
The Show a => superclass makes instances super-easy
Now we come to some instances. Because of our default show_ implementation, we don't need to do any actual programming unless we want to override the default, which we'll do for Char, because String ~ [Char].
instance Show_ Int
instance Show_ Integer
instance Show_ Double
instance Show_ Char where
show_ c = [c] -- so show_ 'd' = "d". You can put show_ = show if you want "'d'"
showList_ = id -- just return the string
In practice:
Now that's not much use to hide " from your output in ghci, because the default show function is used for that, but if we use putStrLn, the quotes disappear:
put :: Show_ a => a -> IO ()
put = putStrLn . show_
ghci> show "hello"
"\"hello\""
ghci> show_ "hello"
"hello"
ghci> put "hello"
hello
ghci> put [2,3,4]
[2, 3, 4]
ghci>

Can't properly define transformation from universal type, that defined with GADT

I've defined an universal data type that can contain anything (well, with current implementation not totally anything)!
Here it is (complete code):
{-#LANGUAGE NoMonomorphismRestriction#-}
{-#LANGUAGE GADTs#-}
{-#LANGUAGE StandaloneDeriving#-}
data AnyT where
Any :: (Show a, Read a) => a -> AnyT
readAnyT :: (Read a, Show a) => (String -> a) -> String -> AnyT
readAnyT readFun str = Any $ readFun str
showAnyT :: AnyT -> String
showAnyT (Any thing) = show thing
deriving instance Show AnyT --Just for convinience!
a = [Any "Hahaha", Any 123]
And I can play with it in console:
*Main> a
[Any "Hahaha",Any 123]
it :: [AnyT]
*Main> readAnyT (read::String->Float) "134"
Any 134.0
it :: AnyT
*Main> showAnyT $ Any 125
"125"
it :: String
Well, I have it, but I need to process it somehow. For example, let's define transformation functions (functions definition, add to previous code):
toAnyT :: (Show a, Read a) => a -> AnyT -- Rather useless
toAnyT a = Any a
fromAny :: AnyT -> a
fromAny (Any thing) = thing
And there is the problem! the fromAny definition from previous code is incorrect! And I don't know how to make it correct. I get the error in GHCi:
2.hs:18:23:
Could not deduce (a ~ a1)
from the context (Show a1, Read a1)
bound by a pattern with constructor
Any :: forall a. (Show a, Read a) => a -> AnyT,
in an equation for `fromAny'
at 2.hs:18:10-18
`a' is a rigid type variable bound by
the type signature for fromAny :: AnyT -> a at 2.hs:17:12
`a1' is a rigid type variable bound by
a pattern with constructor
Any :: forall a. (Show a, Read a) => a -> AnyT,
in an equation for `fromAny'
at 2.hs:18:10
In the expression: thing
In an equation for `fromAny': fromAny (Any thing) = thing
Failed, modules loaded: none.
I tried some other ways that giving errors too.
I have rather bad solution for this: defining necessary functions via showAnyT and read (replace previous function definitions):
toAnyT :: (Show a, Read a) => a -> AnyT -- Rather useless
toAnyT a = Any a
fromAny :: Read a => AnyT -> a
fromAny thing = read (showAnyT thing)
Yes, it's work. I can play with it:
*Main> fromAny $ Any 1352 ::Float
1352.0
it :: Float
*Main> fromAny $ Any 1352 ::Int
1352
it :: Int
*Main> fromAny $ Any "Haha" ::String
"Haha"
it :: String
But I think it's bad, because it uses string to transform.
Could you please help me to find neat and good solution?
First a disclaimer: I don't know the whole context of the problem you are trying to solve, but the first impression I get is that this kind of use of existentials is the wrong tool for the job and you might be trying to implement some code pattern that is common in object-oriented languaged but a poor fit for Haskell.
That said, existential types like the one you have here are usually like black holes where once you put something in, the type information is lost forever and you can't cast the value back to its original type. However, you can operate on existential values via typeclasses (as you've done with Show and Read) so you can use the typeclass Typeable to retain the original type information:
import Data.Typeable
data AnyT where
Any :: (Show a, Read a, Typeable a) => a -> AnyT
Now you can implement all the functions you have, as long as you add the new constraint to them as well:
readAnyT :: (Read a, Show a, Typeable a) => (String -> a) -> String -> AnyT
readAnyT readFun str = Any $ readFun str
showAnyT :: AnyT -> String
showAnyT (Any thing) = show thing
toAnyT :: (Show a, Read a, Typeable a) => a -> AnyT -- Rather useless
toAnyT a = Any a
fromAny can be implemented as returning a Maybe a (since you cannot be sure if the value you are getting out is of the type you are expecting).
fromAny :: Typeable a => AnyT -> Maybe a
fromAny (Any thing) = cast thing
You're using GADTs to create an existential data type. The type a in the constructor existed, but there's no way to recover it. The only information available to you is that it has Show and Read instances. The exact type is forgotten, because that's what your constructor's type instructs the type system to do. "Make sure this type has the proper instances, then forget what it is."
There is one function you've missed, by the way:
readLike :: String -> AnyT -> AnyT
readLike s (Any a) = Any $ read s `asTypeOf` a
Within the context of the pattern match, the compiler knows that whatever type a has, there is a Read instance, and it can apply that instance. Even though it's not sure what type a is. But all it can do with it is either show it, or read strings as the same type as it.
What you have is something called Existential type. If you follow that link than you will find that in this pattern the only way to work with the "data" inside the container type is to use type classes.
In your current example you mentioned that a should have Read and Show instances and that means only the functions in these type classes can be used on a and nothing else and if you want to support some more operations on a then it should be constrained with the required type class.
Think it like this: You can put anything in a box. Now when you extract something out of that box you have no way to specify what you will get out of it as you can put anything inside it. Now if you say that you can put any eatable inside this box, then you are sure that when you pick something from this box it will be eatable.

Haskell type declarations

In Haskell, why does this compile:
splice :: String -> String -> String
splice a b = a ++ b
main = print (splice "hi" "ya")
but this does not:
splice :: (String a) => a -> a -> a
splice a b = a ++ b
main = print (splice "hi" "ya")
>> Type constructor `String' used as a class
I would have thought these were the same thing. Is there a way to use the second style, which avoids repeating the type name 3 times?
The => syntax in types is for typeclasses.
When you say f :: (Something a) => a, you aren't saying that a is a Something, you're saying that it is a type "in the group of" Something types.
For example, Num is a typeclass, which includes such types as Int and Float.
Still, there is no type Num, so I can't say
f :: Num -> Num
f x = x + 5
However, I could either say
f :: Int -> Int
f x = x + 5
or
f :: (Num a) => a -> a
f x = x + 5
Actually, it is possible:
Prelude> :set -XTypeFamilies
Prelude> let splice :: (a~String) => a->a->a; splice a b = a++b
Prelude> :t splice
splice :: String -> String -> String
This uses the equational constraint ~. But I'd avoid that, it's not really much shorter than simply writing String -> String -> String, rather harder to understand, and more difficult for the compiler to resolve.
Is there a way to use the second style, which avoids repeating the type name 3 times?
For simplifying type signatures, you may use type synonyms. For example you could write
type S = String
splice :: S -> S -> S
or something like
type BinOp a = a -> a -> a
splice :: BinOp String
however, for something as simple as String -> String -> String, I recommend just typing it out. Type synonyms should be used to make type signatures more readable, not less.
In this particular case, you could also generalize your type signature to
splice :: [a] -> [a] -> [a]
since it doesn't depend on the elements being characters at all.
Well... String is a type, and you were trying to use it as a class.
If you want an example of a polymorphic version of your splice function, try:
import Data.Monoid
splice :: Monoid a=> a -> a -> a
splice = mappend
EDIT: so the syntax here is that Uppercase words appearing left of => are type classes constraining variables that appear to the right of =>. All the Uppercase words to the right are names of types
You might find explanations in this Learn You a Haskell chapter handy.

Resources