How can I make the signature of this function more precise

I have two functions:
prompt :: Text -> (Text -> Either Text a) -> IO a
subPrompt :: Text -> (Text -> Bool) -> IO a -> IO (Maybe (Text, a))
subPrompt takes a second prompt (argument 3) and displays it if the function in argument 2 comes back as true after running the first prompt.
What I don't like is that argument 3 is IO a; I would like it to be something more like:
subPrompt :: Text -> (Text -> Bool) -> prompt -> IO (Maybe (Text, a))
But I know I can't do that. I'm stuck trying to think of a way to make it clearer from the signature what the third argument is. Is there some way I can define a clearer type? Or maybe I'm overthinking it and IO a is actually fine - I'm pretty new to haskell.

One way is to reify the two things as a data structure. So:
{-# LANGUAGE GADTs #-}
data Prompt a where
  Prompt :: Text -> (Text -> Either Text a) -> Prompt a
  SubPrompt :: Text -> (Text -> Bool) -> Prompt a -> Prompt (Maybe (Text, a))
Now because the third argument to SubPrompt is a Prompt, you know it must either be a call to SubPrompt or Prompt -- definitely not some arbitrary IO action that might do filesystem access or some other nasty thing.
Then you can write an interpreter for this tiny DSL into IO:
runPrompt :: Prompt a -> IO a
runPrompt (Prompt cue validator) = {- what your old prompt used to do -}
runPrompt (SubPrompt cue deeper sub) = {- what your old subPrompt used to do,
calling runPrompt on sub where needed -}
Besides the benefit of being sure you don't have arbitrary IO as an argument to SubPrompt, this has the side benefit that it makes testing easier. Later you could implement a second interpreter that is completely pure; say, something like this, which takes a list of texts to be treated as user inputs and returns a list of texts that the prompt output:
data PromptResult a = Done a | NeedsMoreInput (Prompt a)
purePrompt :: Prompt a -> [Text] -> ([Text], PromptResult a)
purePrompt = {- ... -}
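To make the two interpreters concrete, here is a minimal sketch. The re-ask-on-error behaviour of runPrompt and the whole simulate function are assumptions for illustration: simulate is a simplified stand-in for the purePrompt idea above that consumes canned inputs and returns Nothing when they run out, and it does not record the prompt output.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text.IO as TIO

data Prompt a where
  Prompt    :: Text -> (Text -> Either Text a) -> Prompt a
  SubPrompt :: Text -> (Text -> Bool) -> Prompt a -> Prompt (Maybe (Text, a))

-- IO interpreter. Assumed behaviour: show the complaint and ask again
-- whenever the validator rejects the answer.
runPrompt :: Prompt a -> IO a
runPrompt p@(Prompt cue validator) = do
  TIO.putStrLn cue
  answer <- TIO.getLine
  case validator answer of
    Left complaint -> TIO.putStrLn complaint >> runPrompt p
    Right value    -> pure value
runPrompt (SubPrompt cue deeper sub) = do
  TIO.putStrLn cue
  answer <- TIO.getLine
  if deeper answer
    then do
      value <- runPrompt sub
      pure (Just (answer, value))
    else pure Nothing

-- Pure interpreter (hypothetical): feeds canned inputs to the prompt and
-- returns the result together with the unused inputs.
simulate :: Prompt a -> [Text] -> Maybe (a, [Text])
simulate (Prompt _ _) [] = Nothing
simulate p@(Prompt _ validator) (x : xs) =
  case validator x of
    Left _      -> simulate p xs       -- rejected: try the next input
    Right value -> Just (value, xs)
simulate (SubPrompt _ _ _) [] = Nothing
simulate (SubPrompt _ deeper sub) (x : xs)
  | deeper x = do
      (value, rest) <- simulate sub xs
      Just (Just (x, value), rest)
  | otherwise = Just (Nothing, xs)
```

Because simulate never touches IO, a test can drive a nested prompt with a plain list of answers and compare the result with ==.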

There is nothing wrong with making the second prompt a simple IO a, especially if you document what it is somewhere.
That said, yes, it is good practice to make the types as self-explanatory as possible; you can create an alias:
type Prompt a = IO a
and then use it in subPrompt's signature:
subPrompt :: Text -> (Text -> Bool) -> Prompt a -> IO (Maybe (Text, a))
This makes the signature more self-explanatory, while still allowing you to pass any IO a as the third parameter (the keyword type just creates an alias).
But wait, there is more: you'd rather not accidentally pass any IO a that isn't actually a prompt! You don't want to pass it an IO action that, say, launches the missiles...
So, we can declare an actual Prompt type (not just an alias, but a real type):
newtype Prompt a = Prompt { getPrompt :: IO a }
This allows you to wrap any value of type IO a inside a type, ensuring it doesn't get mixed up with other functions with the same type, but different semantics.
The signature of subPrompt remains the same as before:
subPrompt :: Text -> (Text -> Bool) -> Prompt a -> IO (Maybe (Text, a))
But now you cannot pass just any old IO a to it; to pass your prompt, for example, you have to wrap it:
subPrompt "Do we proceed?" askYesNo (Prompt (prompt "Please enter your name" processName))
(subPrompt won't be able to call it directly, but will have to extract "prompt" from inside the wrapper: let actualPrompt = getPrompt wrappedPrompt)

Related

IO (Maybe Picture) -> Picture

I'm creating a game with Gloss.
I have this function:
block :: IO (Maybe Picture)
block = loadJuicyPNG "block.png"
How do I take this IO (Maybe Picture) and turn it into a Picture?

You need to bind the value. This is done either with the bind function (>>=), or by do-notation:
main :: IO ()
main = do
  pic <- block
  case pic of
    Just p -> ... -- loading succeeded, p is a Picture
    Nothing -> ... -- loading failed
It is a Maybe Picture because the loading might fail, and you have to handle that possible failure somehow.

This is basically the same answer as Bartek's, but using a different approach.
Let's say you have a function foo :: Picture -> Picture that transforms a picture in some way. It expects a Picture as an argument, but all you have is block :: IO (Maybe Picture); there may or may not be a picture buried in there, but it's all you have.
To start, let's assume you have some function foo' :: Maybe Picture -> Maybe Picture. Its definition is simple:
foo' :: Maybe Picture -> Maybe Picture
foo' = fmap foo
So simple, in fact, that you never actually write it; wherever you would use foo', you just use fmap foo directly. What this function does, you will recall, is return Nothing if it gets Nothing, and return Just (foo x) if it gets some value Just x.
Now, given that you have foo', how do you apply it to the value buried in the IO type? For that, we'll use the Monad instance for IO, which provides us with two functions (types here specialized to IO):
return :: a -> IO a
(>>=) :: IO a -> (a -> IO b) -> IO b
In our case, we recognize that both a and b are Maybe Picture. If foo' :: Maybe Picture -> Maybe Picture, then return . foo' :: Maybe Picture -> IO (Maybe Picture). That means we can finally "apply" foo to our picture:
> :t block >>= return . (fmap foo)
block >>= return . (fmap foo) :: IO (Maybe Picture)
But we aren't really applying foo ourselves. What we are really doing is lifting foo into a context where, once block is executed, foo' can be called on whatever block produces.
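As a small sketch of the lifting described above, the following uses a stand-in Picture type and a stand-in block that always succeeds (both are assumptions so the example runs without Gloss; foo and loadOrBlank are hypothetical names). It shows that the bind form and a double fmap produce the same action, and one way to fall back to a default picture when loading fails.

```haskell
import Data.Maybe (fromMaybe)

-- Stand-in for Gloss's Picture type (assumption, for illustration only).
newtype Picture = Picture String deriving (Eq, Show)

-- Stand-in for `loadJuicyPNG "block.png"`; here loading always succeeds.
block :: IO (Maybe Picture)
block = pure (Just (Picture "block"))

-- Some hypothetical transformation on pictures.
foo :: Picture -> Picture
foo (Picture name) = Picture (name ++ "-scaled")

-- The form from the answer: bind, then re-wrap with return.
viaBind :: IO (Maybe Picture)
viaBind = block >>= return . fmap foo

-- Equivalent: the outer fmap works on IO, the inner one on Maybe.
viaFmap :: IO (Maybe Picture)
viaFmap = fmap (fmap foo) block

-- Collapse the Maybe by supplying a default for the failure case.
loadOrBlank :: IO Picture
loadOrBlank = fmap (fromMaybe (Picture "blank")) viaFmap
```

The result still lives in IO; the Picture can only be used inside another IO action such as main.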

Best practices for talking to an API

I'm trying to create some bindings for an API in Haskell. I noticed some functions have a tremendous number of arguments, e.g.
myApiFunction :: Key -> Account -> Int -> String -> Int -> Int -> IO (MyType)
It's not necessarily bad, per se, to have this many arguments. But as a user I don't like long argument functions. However, each of these args is absolutely 100% necessary.
Is there a more Haskell-ish way to abstract over the common parts of these functions? Everything past account here is used to build a URL, so I would need it available, and what it stands for depends entirely on the function. Certain things are consistent though, like Key and Account, and I'm wondering what the best way to abstract over these arguments is.
Thank you!

You can combine these into more descriptive data types:
data Config = Config
  { cKey :: Key
  , cAccount :: Account
  }
Then maybe have types or newtypes to make the other arguments more descriptive:
-- I have no idea what these actually should be, I'm just making up something
type Count = Int
type Name = String
type Position = (Int, Int)
myApiFunction :: Config -> Count -> Name -> Position -> IO MyType
myApiFunction conf count name (x, y) =
  myPreviousApiFunction (cKey conf)
                        (cAccount conf)
                        count
                        name
                        x
                        y
If the Config is always needed, then I would recommend working in a Reader monad, which you can easily do as
myApiFunction
  :: (MonadReader Config io, MonadIO io)
  => Count -> Name -> Position
  -> io MyType
myApiFunction count name (x, y) = do
  conf <- ask
  liftIO $ myPreviousApiFunction
    (cKey conf)
    (cAccount conf)
    count
    name
    x
    y
This uses the mtl library for monad transformers. If you don't want to have to type that constraint over and over, you can also use the ConstraintKinds extension to alias it:
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE FlexibleContexts #-}
...
type ApiCtx io = (MonadReader Config io, MonadIO io)
...
myApiFunction
  :: ApiCtx io
  => Count -> Name -> Position
  -> io MyType
myApiFunction ...
Depending on your specific application, you could also split it up into multiple functions. I've seen plenty of APIs before that had something like
withCount :: ApiCtx io => Count -> io a -> io a
withName :: ApiCtx io => Name -> io a -> io a
withPos :: ApiCtx io => Position -> io a -> io a
(&) :: a -> (a -> b) -> b
request :: ApiCtx io => io MyType
> :set +m -- Multi-line input
> let r = request & withCount 1
| & withName "foo"
| & withPos (1, 2)
> runReaderT r (Config key acct)
These are just a handful of techniques; there are others, but they generally become more complex from here. Others will have different preferences on how to do this, and I'm sure plenty would disagree with me on whether some of these are even good practice (ConstraintKinds in particular isn't universally accepted).
If you often find yourself with type signatures that are too large, even after applying some of these techniques, then maybe you're approaching the problem from the wrong direction. Maybe those functions can be broken down into simpler intermediate steps, maybe some of those arguments can be grouped logically into more specific data types, or maybe you just need a larger record structure to handle setting up complex operations. It's pretty open-ended right now.
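For a self-contained illustration of the Reader approach, here is a minimal sketch that only builds the request URL, so it can run purely. The URL layout, the String representation of the key and account, and the names requestUrl and exampleUrl are all assumptions; it also uses newtypes rather than type aliases for the remaining arguments, so that a Count cannot be passed where a Name is expected.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.Reader (MonadReader, asks, runReader)

-- Shared settings; String fields are an assumption for this sketch.
data Config = Config
  { cKey     :: String
  , cAccount :: String
  }

-- Newtypes keep the otherwise identical Int/String arguments apart.
newtype Count = Count Int
newtype Name  = Name String

-- Hypothetical URL layout: the key and account come from the environment,
-- the rest from the arguments.
requestUrl :: MonadReader Config m => Count -> Name -> m String
requestUrl (Count n) (Name name) = do
  key  <- asks cKey
  acct <- asks cAccount
  pure (concat
    [ "https://api.example.invalid/", acct, "/", name
    , "?count=", show n, "&key=", key ])

-- Running in the plain Reader monad supplies the Config once.
exampleUrl :: String
exampleUrl =
  runReader (requestUrl (Count 3) (Name "widgets")) (Config "k123" "acme")
```

Since requestUrl only asks for MonadReader Config, the same definition also works inside a ReaderT Config IO stack where the actual HTTP call would happen.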

How to explicitly instantiate/specialise a polymorphic Haskell function?

I was wondering whether it is possible to explicitly instantiate/specialise a polymorphic function in Haskell? What I mean is, imagine I've a function like the following:
parseFile :: FromJSON a => FilePath -> IO Either String a
The structure into which it attempts to parse the file's contents will depend on the type of a. Now, I know it's possible to specify a by annotation:
parseFile myPath :: IO Either String MyType
What I was wondering was whether it's possible to specialise parseFile more explicitly, for instance with something like (specialise parseFile MyType) to turn it into parseFile :: FilePath -> IO Either String MyType
The reason I ask is that the method of annotation can become clumsy with larger functions. For instance, imagine parseFile gets called by foo which gets called by bar, and bar's return value has a complex type like
:: FromJSON a => IO (([Int],String), (Int, String, Int), a, (Double, [String]))
This means that if I want to call bar with a as MyType, I have to annotate the call with
:: IO (([Int],String), (Int, String, Int), MyType, (Double, [String]))
If I want to call bar multiple times to process different types, I end up writing this annotation multiple times, which seems like unnecessary duplication.
res1 <- bar inputA :: IO (([Int],String), (Int, String, Int), MyType, (Double, [String]))
res2 <- bar inputB :: IO (([Int],String), (Int, String, Int), OtherType, (Double, [String]))
res3 <- bar inputC :: IO (([Int],String), (Int, String, Int), YetAnotherType, (Double, [String]))
Is there a way to avoid this? I'm aware it would be possible to bind the result of bar inputA and use it in a function expecting a MyType, allowing the type engine to infer that the a in question was a MyType without requiring explicit annotation. This seems to sacrifice type safety however, as if I accidentally used the result of the above bar inputB (an OtherType) in a function that expects a MyType, for instance, the type system wouldn't complain, instead the program would fail at runtime when attempting to parse inputB into a MyType, as inputB contains an OtherType, not a MyType.

First, a small correction, the type should be
parseFile :: FromJSON a => FilePath -> IO (Either String a)
The parentheses are important and necessary.
There are a couple ways around this. For example, if you had a function
useMyType :: MyType -> IO ()
useMyType = undefined
Then you used parseFile as
main = do
  result <- parseFile "data.json"
  case result of
    Left err -> putStrLn err
    Right mt -> useMyType mt
No extra type annotations are required; GHC can infer the type of mt from its use with useMyType.
Another way is to simply assign it to a concretely typed name:
parseMyTypeFromFile :: FilePath -> IO (Either String MyType)
parseMyTypeFromFile = parseFile
main = do
  result <- parseMyTypeFromFile "data.json"
  case result of
    Left err -> putStrLn err
    Right mt -> useMyType mt
And wherever you use parseMyTypeFromFile, no explicit annotation is necessary. This is the same as a common practice for specifying the type of read:
readInt :: String -> Int
readInt = read
For solving the bar problem, if you have a type that complex I would at least suggest creating an alias for it, if not its own data type entirely, possibly with record fields and whatnot. Something similar to
type BarType a = (([Int], String), (Int, String, Int), a, (Double, [String]))
Then you can write bar as
bar :: FromJSON a => InputType -> IO (BarType a)
bar input = undefined -- implementation details
which makes bar nicer to read too. Then you can just do
res1 <- bar inputA :: IO (BarType MyType)
res2 <- bar inputB :: IO (BarType OtherType)
res3 <- bar inputC :: IO (BarType YetAnotherType)
I would consider this perfectly clear and idiomatic Haskell, personally. Not only is it immediately readable and clear what you're doing, but by having a name to refer to the complex type, you minimize the chance of typos, take advantage of IDE autocompletion, and can put documentation on the type itself to let others (and your future self) know what all those fields mean.

You can't make a polymorphic function provided elsewhere and given an explicit annotation into a more restricted version with the same name. But you can do something like:
parseFileOfMyType :: FilePath -> IO (Either String MyType)
parseFileOfMyType = parseFile
A surprising number of useful functions in various libraries are similar type-specific aliases of unassuming functions like id. Anyway, you should be able to make type-constrained versions of those examples using this technique.
Another solution to the verbosity problem would be to create type aliases:
type MyInputParse a = IO (([Int],String), (Int, String, Int), a, (Double, [String]))
res1 <- bar inputA :: MyInputParse MyType
res2 <- bar inputB :: MyInputParse OtherType
res3 <- bar inputC :: MyInputParse YetAnotherType
In the not-too-distant future, GHC will possibly be getting a mechanism to provide partial type signatures, which will let you leave some sort of hole in the type signature that inference will fill in while you make the part you're interested in specific. But it's not there yet.
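For readers on GHC 8.0 or later, the TypeApplications extension offers something close to the "(specialise parseFile MyType)" syntax asked about. The sketch below is an illustration under assumptions: it uses Read and readEither from base as a stand-in for FromJSON and the file parser, and the names parseValue, parseInt, and describe are hypothetical.

```haskell
{-# LANGUAGE TypeApplications #-}

import Text.Read (readEither)

-- Polymorphic in the result type, like parseFile in the question.
parseValue :: Read a => String -> Either String a
parseValue = readEither

-- Technique from the answers: a concretely typed alias.
parseInt :: String -> Either String Int
parseInt = parseValue

-- With TypeApplications the type is chosen at the call site.
-- Without @Double this would be ambiguous, because `show` accepts any type.
describe :: String -> String
describe s = either id show (parseValue @Double s)
```

The alias suits a type that is parsed in many places, while the type application avoids repeating a long result annotation for a one-off call.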

Simplest way to join functions of same meaning but different return value type

I'm writing a small "hello world" type of program, which groups identical files by different "reasons", e.g. same size, same content, same checksum etc.
So, I've got to the point when I want to write a function like this (DuplicateReason is an algebraic type which states the reason why two files are identical):
getDuplicatesByMethods :: (Eq a) => [((FilePath -> a), DuplicateReason)] -> IO [DuplicateGroup]
Where in each tuple, first function would be the one that by file's path returns you some (Eq a) value, like bytestring (with content), or Word32 with checksum, or Int with size.
Clearly, Haskell doesn't like that these functions are of different types, so I need to somehow gather them.
The only way I see is to create a type like
data GroupableValue = GroupString String | GroupInt Int | GroupWord32 Word32
And then, to make life easier, to make a typeclass like
class GroupableValueClass a where
  toGroupableValue :: a -> GroupableValue
  fromGroupableValue :: GroupableValue -> a
and implement instance for each value I'm going to get.
Question: am I doing it right and (if no) is there a simpler way to solve this task?
Update:
Here's full minimal code that should describe what I want (simplified, with no IO etc.):
data DuplicateGroup = DuplicateGroup
-- method for "same size" -- returns size
m1 :: String -> Int
m1 content = 10
-- method for "same content" -- returns content
m2 :: String -> String
m2 content = "sample content"
groupByMethods :: (Eq a) => [(String -> a)] -> [DuplicateGroup]
groupByMethods predicates = undefined
main :: IO ()
main = do
let groups = (groupByMethods [m1, m2])
return ()

Lists are always homogeneous, so you can't put items with a different a into the same list (as you noticed). There are several ways to design around this, but I usually prefer using GADTs. For example:
{-# LANGUAGE GADTs #-}
import Data.ByteString (ByteString)
import Data.Word
data DuplicateReason = Size | Checksum | Content
data DuplicateGroup
data DuplicateTest where
  DuplicateTest :: Eq a => (FilePath -> IO a) -> DuplicateReason -> DuplicateTest
getSize :: FilePath -> IO Integer
getSize = undefined
getChecksum :: FilePath -> IO Word32
getChecksum = undefined
getContent :: FilePath -> IO ByteString
getContent = undefined
getDuplicatesByMethods :: [DuplicateTest] -> IO [DuplicateGroup]
getDuplicatesByMethods = undefined
This solution still needs a new type, but at least you don't have to specify all cases in advance or create boilerplate type-classes. Now, since the generic type a is essentially "hidden" inside the GADT, you can define a list that contains functions with different return types, wrapped in the DuplicateTest GADT.
getDuplicatesByMethods
  [ DuplicateTest getSize Size
  , DuplicateTest getChecksum Checksum
  , DuplicateTest getContent Content
  ]
You can also solve this without using any language extensions or introducing new types by simply re-thinking your functions. The main intention is to group files according to some property a, so we could define getDuplicatesByMethods as
getDuplicatesByMethods :: [([FilePath] -> IO [[FilePath]], DuplicateReason)] -> IO [DuplicateGroup]
I.e. we take in a function that groups files according to some criteria. Then we can define a helper function
groupWith :: Eq a => (FilePath -> IO a) -> [FilePath] -> IO [[FilePath]]
and call getDuplicatesByMethods like this
getDuplicatesByMethods
  [ (groupWith getSize, Size)
  , (groupWith getChecksum, Checksum)
  , (groupWith getContent, Content)
  ]
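To show how the hidden type a is actually used once a DuplicateTest is unpacked, here is a pure sketch in the spirit of the simplified code in the question's update. It works on in-memory strings instead of file paths and IO, and the helpers groupOn, runTest, and findDuplicates, as well as the fields of DuplicateGroup, are assumptions made for illustration.

```haskell
{-# LANGUAGE GADTs #-}

data DuplicateReason = Size | Content deriving (Eq, Show)

-- The key type `a` is hidden; only its Eq instance is remembered.
data DuplicateTest where
  DuplicateTest :: Eq a => (String -> a) -> DuplicateReason -> DuplicateTest

-- Hypothetical result type: the reason plus the items that matched.
data DuplicateGroup = DuplicateGroup DuplicateReason [String]
  deriving (Eq, Show)

-- Group items whose keys are equal, using nothing but Eq (quadratic time).
groupOn :: Eq k => (v -> k) -> [v] -> [[v]]
groupOn key = foldr place []
  where
    place x [] = [[x]]
    place x (g : gs) = case g of
      (y : _) | key x == key y -> (x : g) : gs
      _                        -> g : place x gs

-- Pattern matching on the constructor brings the Eq instance into scope,
-- so groupOn can compare keys without knowing their concrete type.
runTest :: [String] -> DuplicateTest -> [DuplicateGroup]
runTest items (DuplicateTest key reason) =
  [ DuplicateGroup reason grp | grp <- groupOn key items, length grp > 1 ]

findDuplicates :: [DuplicateTest] -> [String] -> [DuplicateGroup]
findDuplicates tests items = concatMap (runTest items) tests
```

For example, findDuplicates [DuplicateTest length Size, DuplicateTest id Content] can mix an Int-valued key and a String-valued key in one list, which is what the original signature could not express.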

Haskell data serialization of some data implementing a common type class

Let's start with the following
data A = A String deriving Show
data B = B String deriving Show
class X a where
  spooge :: a -> Q
[ Some implementations of X for A and B ]
Now let's say we have custom implementations of show and read, named show' and read' respectively which utilize Show as a serialization mechanism. I want show' and read' to have types
show' :: X a => a -> String
read' :: X a => String -> a
So I can do things like
f :: String -> [Q]
f d = map (\x -> spooge $ read' x) d
where d could have been
[show' (A "foo"), show' (B "bar")]
In summary, I wanna serialize stuff of various types which share a common typeclass so I can call their separate implementations on the deserialized stuff automatically.
Now, I realize you could write some template haskell which would generate a wrapper type, like
data XWrap = AWrap A | BWrap B deriving (Show)
and serialize the wrapped type which would guarantee that the type info would be stored with it, and that we'd be able to get ourselves back at least an XWrap... but is there a better way using haskell ninja-ery?
EDIT
Okay, I need to be more application specific. This is an API. Users will define their As, and Bs and fs as they see fit. I don't ever want them hacking through the rest of the code updating their XWraps, or switches or anything. The most I'm willing to compromise is one list somewhere of all the A, B, etc. in some format. Why?
Here's the application. A is "Download a file from an FTP server." B is "convert from flac to mp3". A contains username, password, port, etc. information. B contains file path information. There could be MANY As and Bs. Hundreds. As many as people are willing to compile into the program. Two was just an example. A and B are Xs, and Xs shall be called "Tickets." Q is IO (). Spooge is runTicket. I want to read the tickets off into their relevant data types and then write generic code that will runTicket on the stuff read' from the stuff on disk. At some point I have to jam type information into the serialized data.

I'd first like to stress for all our happy listeners out there that XWrap is a very good way, and a lot of the time you can write one yourself faster than writing it using Template Haskell.
You say you can get back "at least an XWrap", as if that meant you couldn't recover the types A and B from XWrap or you couldn't use your typeclass on them. Not true! You can even define
separateAB :: [XWrap] -> ([A],[B])
If you didn't want them mixed together, you should serialise them separately!
This is nicer than haskell ninja-ery; maybe you don't need to handle arbitrary instances, maybe just the ones you made.
Do you really need your original types back? If you feel like using existential types because you just want to spooge your deserialised data, why not either serialise the Q itself, or have some intermediate data type PoisedToSpooge that you serialise, which can deserialise to give you all the data you need for a really good spooging. Why not make it an instance of X too?
You could add a method to your X class that converts to PoisedToSpooge.
You could call it something fun like toPoisedToSpooge, which trips nicely off the tongue, don't you think? :)
Anyway this would remove your typesystem complexity at the same time as resolving the annoying ambiguous type in
f d = map (\x -> spooge $ read' x) d -- oops, the type of read' x depends on the String
You can replace read' with
stringToPoisedToSpoogeToDeserialise :: String -> PoisedToSpooge -- use to deserialise
and define
f d = map (\x -> spooge $ stringToPoisedToSpoogeToDeserialise x) d -- no ambiguous type
which we could of course write more succinctly as
f = map (spooge.stringToPoisedToSpoogeToDeserialise)
although I recognise the irony here in suggesting making your code more succinct. :)

If what you really want is a heterogeneous list then use existential types. If you want serialization then use Cereal + ByteString. If you want dynamic typing, which is what I think your actual goal is, then use Data.Dynamic. If none of this is what you want, or you want me to expand please press the pound key.
Based on your edit, I don't see any reason a list of thunks won't work. In what way does IO () fail to represent both the operations of "Download a file from an FTP server" and "convert from flac to MP3"?

I'll assume you want to do more things with deserialised Tickets
than run them, because if not you may as well ask the user to supply a bunch of String -> IO()
or similar, nothing clever needed at all.
If so, hooray! It's not often I feel it's appropriate to recommend advanced language features like this.
class Ticketable a where
  show' :: a -> String
  read' :: String -> Maybe a
  runTicket :: a -> IO ()
  -- other useful things to do with tickets
This all hinges on the type of read'. read' :: Ticketable a => String -> a isn't very useful,
because the only thing it can do with invalid data is crash.
If we change the type to read' :: Ticketable a => String -> Maybe a this can allow us to read from disk and
try all the possibilities or fail altogether.
(Alternatively you could use a parser: parse :: Ticketable a => String -> Maybe (a,String).)
Let's use a GADT to give us ExistentialQuantification without the syntax and with nicer error messages:
{-# LANGUAGE GADTs #-}
data Ticket where
  MkTicket :: Ticketable a => a -> Ticket
showT :: Ticket -> String
showT (MkTicket a) = show' a
runT :: Ticket -> IO()
runT (MkTicket a) = runTicket a
Notice how the MkTicket constructor supplies the context Ticketable a for free! GADTs are great.
It would be nice to make Ticket an instance of Ticketable, but that won't work, because there would be
an ambiguous type a hidden in it. Let's take functions that read Ticketable types and make them read
Tickets.
ticketize :: Ticketable a => (String -> Maybe a) -> (String -> Maybe Ticket)
ticketize = ((.).fmap) MkTicket -- a little pointfree fun
You could use some unusual sentinel string such as
"\n-+-+-+-+-+-Ticket-+-+-+-Border-+-+-+-+-+-+-+-\n" to separate your serialised data or better, use separate files
altogether. For this example, I'll just use "\n" as the separator.
readTickets :: [String -> Maybe Ticket] -> String -> [Maybe Ticket]
readTickets readers xs = map (foldr orelse (const Nothing) readers) (lines xs)
orelse :: (a -> Maybe b) -> (a -> Maybe b) -> (a -> Maybe b)
(f `orelse` g) x = case f x of
  Nothing -> g x
  just_y -> just_y
Now let's get rid of the Justs and ignore the Nothings:
runAll :: [String -> Maybe Ticket] -> String -> IO ()
runAll ps xs = mapM_ runT . catMaybes $ readTickets ps xs
Let's make a trivial ticket that just prints the contents of some directory
newtype Dir = Dir {unDir :: FilePath} deriving Show
readDir xs = let (front,back) = splitAt 4 xs in
  if front == "dir:" then Just $ Dir back else Nothing
instance Ticketable Dir where
  show' (Dir p) = "dir:" ++ p
  read' = readDir
  runTicket (Dir p) = doesDirectoryExist p >>= flip when
    (getDirectoryContents >=> mapM_ putStrLn $ p)
and an even more trivial ticket
data HelloWorld = HelloWorld deriving Show
readHW "HelloWorld" = Just HelloWorld
readHW _ = Nothing
instance Ticketable HelloWorld where
  show' HelloWorld = "HelloWorld"
  read' = readHW
  runTicket HelloWorld = putStrLn "Hello World!"
and then put it all together:
myreaders = [ticketize readDir,ticketize readHW]
main = runAll myreaders $ unlines ["HelloWorld","dir:.","HelloWorld","dir:..",",HelloWorld"]

Just use Either. Your users don't even have to wrap it themselves. You have your deserializer wrap it in the Either for you. I don't know exactly what your serialization protocol is, but I assume that you have some way to detect which kind of request, and the following example assumes the first byte distinguishes the two requests:
deserializeRequest :: IO (Either A B)
deserializeRequest = do
  byte <- get1stByte
  case byte of
    0 -> do
      ...
      return $ Left $ A <A's fields>
    1 -> do
      ...
      return $ Right $ B <B's fields>
Then you don't even need to type-class spooge. Just make it a function of Either A B:
spooge :: Either A B -> Q
