I got a problem with thinking recursively in Haskell.
I am trying to build a survey application where questions conditionally lead to new questions based on the user's answers.
I got:
- Questions - a list of questions
- QuestionPaths - a list of question paths of questions which lead to new questions
- Answers a list of user's answers
You can think of QuestionPaths as a list of tuples where:
type QuestionPath = (QuestionId, AnswerChoice, NextQuestionId)
Basically this would read: If user answers a question QuestionId with an answer AnswerChoice, ask him NextQuestionId next.
I have attempted to model this problem domain with multiway trees (nodes can have multiple children):
data YesNo = Yes | No
data AnswerChoice = Skip
| Boolean YesNo
| ChooseOne [Text]
type Condition = AnswerChoice
data QuestionTree = QuestionNode {
question :: Question
, condition :: Condition
, userAnswer :: Maybe AnswerChoice
, children :: QuestionForest
}
type QuestionForest = [QuestionTree]
Unfortunately, I am clueless now on how to write algorithms that compose trees like this.
I basically need these kinds of functions for composition and traversal:
-- Constructs the tree from seed data
constructTree :: Questions -> QuestionPaths -> Answers -> QuestionTree
-- | Inserts answer to question in the tree
answerQuestion :: Question -> AnswerChoice
-- | Fetches the next unanswered question by traversing the tree.
getNextUnanswered :: QuestionTree -> Question
Can you please help me understand what would be the best way to construct and traverse a tree such as this?
What I'd do in such a case, is to store the answers in a separate data structure - not to insert them in the questions tree; put the answers in a separate list/Set, or in a file or database, and let the questions tree be immutable.
In order to keep track of which questions remain to be asked, you can "consume" the tree - keep your program state pointing to the next question, throwing away the questions already answered (letting the garbage collector reclaim them).
I'd design the tree like this:
data AllowedAnswers = YesOrNo {
ifUserAnsweredYes :: QuestionTree,
ifUserAnsweredNo :: QuestionTree
}
| Choices [(Text, QuestionTree)]
data QuestionTree = Question {
description :: Text
, allowedAnswers :: AllowedAnswers
, ifUserSkipsThisQuestion :: QuestionTree
}
| EndOfQuestions
Notice several things:
You don't have to worry about multiple possible paths leading to the same question - you can place the same QuestionTree node in multiple places, and it will be shared (Haskell won't create multiple copies of it)
This design has no place to hold the user's answers - they are stored elsewhere (i.e. a list somewhere, or a file) - no need to mutate the question tree.
As the user answers questions, just move your "pointer" to the next QuestionTree, depending on what the user answered.
As for "how to construct this tree from the (QuestionId, AnswerChoice, NextQuestionId) list" - I think I'd first convert it into a map: ```Map QuestionId [(AnswerChoice, Maybe QuestionId)], then I'd construct the tree by, starting with the ID of the first question, and fetching its immediate children from the Map, build the subtrees.
Example (for a very simplified case where the only possible answers are "yes" or "no", with no skipping allowed):
buildTree questionMap questionId = case Map.lookup questionId questionMap of
Nothing -> EndOfQuestions
Just (description, [("yes", nextQuestionIdIfYes), ("no", nextQuestionIdIfNo)]) ->
Question { description = description
, allowedAnswers = YesOrNo {
ifUserAnsweredYes = buildTree questionMap nextQuestionIdIfYes
, ifUserAnsweredNo = buildTree questionMap nextQuestionIdIfNo
}
, ifUserSkipsThisQuestion = EndOfQuestions
}
If you are wondering "why not just use the Map directly?" - yes, you could (and often it will be the right solution), but consider:
The QuestionTree structure expresses the programmer's intention more idiomatically than a Map of Id -> Thing
It is structurally guaranteed to have a child QuestionTree whenever pertinent - no need to do Map.lookup, which will return a Maybe, which you must verify that contains a Just (even when you know there is going to be a next question, even if it is EndOfQuestions)
Related
If I wanted to perform a search on a problem space and I wanted to keep track of different states a node has already visited, I several options to do it depending on the constraints of those states. However; is there a way I can dispatch a function or another depending on the constraint of the states the user is using as input? For example, if I had:
data Node a = Node { state :: a, cost :: Double }
And I wanted to perform a search on a Problem a, is there a way I could check if a is Eq, Ord or Hashable and then call a different kind of search? In pseudocode, something like:
search :: Eq a => Problem a -> Node a
search problem#(... initial ...) -- Where initial is a State of type a
| (Hashable initial) = searchHash problem
| (Ord initial) = searchOrd problem
| otherwise = searchEq problem
I am aware I could just let the user choose one search or another depending on their own use; but being able to do something like that could be very handy for me since search is not really one of the user endpoints as such (one example could be a function bfs, that calls search with some parameters to make it behave like a Breadth-First Search).
No, you can't do this. However, you could make your own class:
class Memorable a where
type Memory a
remember :: a -> Memory a -> Memory a
known :: a -> Memory a -> Bool
Instantiate this class for a few base types, and add some default implementations for folks that want to add new instances, e.g.
-- suitable implementations of Memorable methods and type families for hashable things
type HashMemory = Data.HashSet.HashSet
hashRemember = Data.HashSet.insert
hashKnown = Data.HashSet.member
-- suitable implementations for orderable things
type OrdMemory = Data.Set.Set
ordRemember = Data.Set.insert
ordKnown = Data.Set.member
-- suitable implementations for merely equatable things
type EqMemory = Prelude.[]
eqRemember = (Prelude.:)
eqKnown = Prelude.elem
This question is unlikely to help any future visitors; it is only relevant to a small geographic area, a specific moment in time, or an extraordinarily narrow situation that is not generally applicable to the worldwide audience of the internet. For help making this question more broadly applicable, visit the help center.
Closed 9 years ago.
I have a list of movies in a Database.
type Database = [Film]
type Title = String
type Actor = String
type Cast = [Actor]
type Fan = String
type Fans = [Fan]
type Year = Int
type Period = (Year, Year)
type Film = (Title, Cast, Year, Fans)
What i want to be able to do is find out what movie out of my list.
Function One
Has the most Fans
Filtered by a particular actors name.
Function Two
Overall top 5 films (By amount of fans)
Descending order
I currently have two snippets of code that i'm either trying to make work together. Or find a better sollution:
inCast :: Actor -> Film -> Bool
inCast givenActor (_, cast, _, _) = any (\actor -> actor == givenActor) cast
and
import Data.List
import Data.Ord
bestFilm :: Database -> Film
bestFilm = maximumBy $ comparing (length . fans)
Am i going completely the wrong way about this?
Many thanks for any help in advance.
EDIT:
additional code that i have. I can't seem to use it though, to aid me in getting this part solved.
Any ideas?
filmsWithFan :: Fan -> [Film]
filmsWithFan givenFan = filter (isFan givenFan) testDatabase
I think you are definitely on the right track. Though the last function, filmsWithFan, will not help you in this case. I'll give you some hints:
Function 1:
Think about the type signature you want to have for your function first:
topByFanAndActor :: Actor -> Database -> Film
topByFanAndActor actor films = undefined
Think about how you can combine the two functions you provided, mainly inCast and bestFilm to achieve that type signature. You will definitely need to use a higher order function that processes lists. If you'll need another hint I can tell what function that would be.
EDIT:
So you need to transform the give list films into a list of films the given actor stars in. To do that you need to filter that list using your function inCast. Afterwards you need to extract the movie with the maximum number of fans from that list, for that you will use bestFilm.
Function 2:
In this case the type signature will be quite simple:
topFiveDesc :: Database -> Database
but you can make it a bit nicer if you make the number of movies return variable:
topDesc :: Database -> Int -> Database
topDesc films num = undefined
Think about exactly what you have to do with the films now. You need to sort it by a certain criterion (the number of fans). A criterion defined similarly to the one in bestFilm. And afterwards you need to take the first num films from that list.
The AI code for my little soccer game basically works like this: There is a function that derives facts that describe the current situation on the pitch:
deriveFacts :: GameState -> [Fact]
... where the facts look kind of like this:
data Fact =
FactCanIntercept ObjId ObjId
| FactBestPosition Team Spot Spot
| FactKickOff
| FactBallCarrier ObjId
| FactBestShootingVector Velocity3
| FactBestPassingVector Velocity3
| ...
... and there are rules that look uniformly like this:
rule_shoot facts = do
FactBallCarrier ballCarrier <- checkBallCarrier facts
FactBestShootingVector goalVector <- checkBestShootingVector facts
return [message (ballCarrier, shoot goalVector)]
And then there is a rule engine that runs all the rules and gathers up the resulting messages.
That's fine so far and works quite nicely. What's kind of annoying, though: In this approach, I need an "accessor" function like this for every Fact:
checkBestShootingVector :: [Fact] -> Maybe Fact
checkBestShootingVector facts =
listToMaybe [e | e#(FactBestShootingVector {}) <- facts]
This approach leads to a lot of ugly boilerplate code. My question: Is there an elegant solution that removes the need for creating these accessor functions by hand?
You could refactor most of your Fact data as a record object
data FactRec = FR {
canIntercept :: [(ObjId,ObjId)], -- use lists for things you previously had multiple times
factBestPosition :: [(Team,Spot,Spot)], -- not sure about this one, maybe two entries not one list
kickOff :: Bool,
ballCarrier :: ObjID,
factBestShootingVector :: Velocity3,
factBestPassingVector :: Velocity3,
.....
}
Which gives you bulti-in accessor functions
if kickOff fr then something else somethingelse
So you wouldn't need to write all the check functions, but instead do
rule_shoot facts = message (ballCarrier facts, shoot $ goalVector facts)
If there are facts that genuinely might or might not be there, they could be of type Maybe something, and if there are facts that can be there an arbitrary number of times, they can be lists.
Consider the following problem: given a list of length three of tuples (String,Int), is there a pair of elements having the same "Int" part? (For example, [("bob",5),("gertrude",3),("al",5)] contains such a pair, but [("bob",5),("gertrude",3),("al",1)] does not.)
This is how I would implement such a function:
import Data.List (sortBy)
import Data.Function (on)
hasPair::[(String,Int)]->Bool
hasPair = napkin . sortBy (compare `on` snd)
where napkin [(_, a),(_, b),(_, c)] | a == b = True
| b == c = True
| otherwise = False
I've used pattern matching to bind names to the "Int" part of the tuples, but I want to sort first (in order to group like members), so I've put the pattern-matching function inside a where clause. But this brings me to my question: what's a good strategy for picking names for functions that live inside where clauses? I want to be able to think of such names quickly. For this example, "hasPair" seems like a good choice, but it's already taken! I find that pattern comes up a lot - the natural-seeming name for a helper function is already taken by the outer function that calls it. So I have, at times, called such helper functions things like "op", "foo", and even "helper" - here I have chosen "napkin" to emphasize its use-it-once, throw-it-away nature.
So, dear Stackoverflow readers, what would you have called "napkin"? And more importantly, how do you approach this issue in general?
General rules for locally-scoped variable naming.
f , k, g, h for super simple local, semi-anonymous things
go for (tail) recursive helpers (precedent)
n , m, i, j for length and size and other numeric values
v for results of map lookups and other dictionary types
s and t for strings.
a:as and x:xs and y:ys for lists.
(a,b,c,_) for tuple fields.
These generally only apply for arguments to HOFs. For your case, I'd go with something like k or eq3.
Use apostrophes sparingly, for derived values.
I tend to call boolean valued functions p for predicate. pred, unfortunately, is already taken.
In cases like this, where the inner function is basically the same as the outer function, but with different preconditions (requiring that the list is sorted), I sometimes use the same name with a prime, e.g. hasPairs'.
However, in this case, I would rather try to break down the problem into parts that are useful by themselves at the top level. That usually also makes naming them easier.
hasPair :: [(String, Int)] -> Bool
hasPair = hasDuplicate . map snd
hasDuplicate :: Ord a => [a] -> Bool
hasDuplicate = not . isStrictlySorted . sort
isStrictlySorted :: Ord a => [a] -> Bool
isStrictlySorted xs = and $ zipWith (<) xs (tail xs)
My strategy follows Don's suggestions fairly closely:
If there is an obvious name for it, use that.
Use go if it is the "worker" or otherwise very similar in purpose to the original function.
Follow personal conventions based on context, e.g. step and start for args to a fold.
If all else fails, just go with a generic name, like f
There are two techniques that I personally avoid. One is using the apostrophe version of the original function, e.g. hasPair' in the where clause of hasPair. It's too easy to accidentally write one when you meant the other; I prefer to use go in such cases. But this isn't a huge deal as long as the functions have different types. The other is using names that might connote something, but not anything that has to do with what the function actually does. napkin would fall into this category. When you revisit this code, this naming choice will probably baffle you, as you will have forgotten the original reason that you named it napkin. (Because napkins have 4 corners? Because they are easily folded? Because they clean up messes? They're found at restaurants?) Other offenders are things like bob and myCoolFunc.
If you have given a function a name that is more descriptive than go or h, then you should be able to look at either the context in which it is used, or the body of the function, and in both situations get a pretty good idea of why that name was chosen. This is where my point #3 comes in: personal conventions. Much of Don's advice applies. If you are using Haskell in a collaborative situation, then coordinate with your team and decide on certain conventions for common situations.
I just uncovered this confusion and would like a confirmation that it is what it is. Unless, of course, I am just missing something.
Say, I have these data declarations:
data VmInfo = VmInfo {name, index, id :: String} deriving (Show)
data HostInfo = HostInfo {name, index, id :: String} deriving (Show)
vm = VmInfo "vm1" "01" "74653"
host = HostInfo "host1" "02" "98732"
What I always thought and what seems to be so natural and logical is this:
vmName = vm.name
hostName = host.name
But this, obviously, does not work. I got this.
Questions
So my questions are.
When I create a data type with record syntax, do I have to make sure that all the fields have unique names? If yes - why?
Is there a clean way or something similar to a "scope resolution operator", like :: or ., etc., so that Haskell distinguishes which data type the name (or any other none unique fields) belongs to and returns the correct result?
What is the correct way to deal with this if I have several declarations with the same field names?
As a side note.
In general, I need to return data types similar to the above example.
First I returned them as tuples (seemed to me the correct way at the time). But tuples are hard to work with as it is impossible to extract individual parts of a complex type as easy as with the lists using "!!". So next thing I thought of the dictionaries/hashes.
When I tried using dictionaries I thought what is the point of having own data types then?
Playing/learning data types I encountered the fact that led me to the above question.
So it looks like it is easier for me to use dictionaries instead of own data types as I can use the same fields for different objects.
Can you please elaborate on this and tell me how it is done in real world?
Haskell record syntax is a bit of a hack, but the record name emerges as a function, and that function has to have a unique type. So you can share record-field names among constructors of a single datatype but not among distinct datatypes.
What is the correct way to deal with this if I have several declarations with the same field names?
You can't. You have to use distinct field names. If you want an overloaded name to select from a record, you can try using a type class. But basically, field names in Haskell don't work the way they do in say, C or Pascal. Calling it "record syntax" might have been a mistake.
But tuples are hard to work with as it is impossible to extract individual parts of a complex type
Actually, this can be quite easy using pattern matching. Example
smallId :: VmInfo -> Bool
smallId (VmInfo { vmId = n }) = n < 10
As to how this is done in the "real world", Haskell programmers tend to rely heavily on knowing what type each field is at compile time. If you want the type of a field to vary, a Haskell programmer introduces a type parameter to carry varying information. Example
data VmInfo a = VmInfo { vmId :: Int, vmName :: String, vmInfo :: a }
Now you can have VmInfo String, VmInfo Dictionary, VmInfo Node, or whatever you want.
Summary: each field name must belong to a unique type, and experienced Haskell programmers work with the static type system instead of trying to work around it. And you definitely want to learn about pattern matching.
There are more reasons why this doesn't work: lowercase typenames and data constructors, OO-language-style member access with .. In Haskell, those member access functions actually are free functions, i.e. vmName = name vm rather than vmName = vm.name, that's why they can't have same names in different data types.
If you really want functions that can operate on both VmInfo and HostInfo objects, you need a type class, such as
class MachineInfo m where
name :: m -> String
index :: m -> String -- why String anyway? Shouldn't this be an Int?
id :: m -> String
and make instances
instance MachineInfo VmInfo where
name (VmInfo vmName _ _) = vmName
index (VmInfo _ vmIndex _) = vmIndex
...
instance MachineInfo HostInfo where
...
Then name machine will work if machine is a VmInfo as well as if it's a HostInfo.
Currently, the named fields are top-level functions, so in one scope there can only be one function with that name. There are plans to create a new record system that would allow having fields of the same name in different record types in the same scope, but that's still in the design phase.
For the time being, you can make do with unique field names, or define each type in its own module and use the module-qualified name.
Lenses can help take some of the pain out of dealing with getting and setting data structure elements, especially when they get nested. They give you something that looks, if you squint, kind of like object-oriented accessors.
Learn more about the Lens family of types and functions here: http://lens.github.io/tutorial.html
As an example for what they look like, this is a snippet from the Pong example found at the above github page:
data Pong = Pong
{ _ballPos :: Point
, _ballSpeed :: Vector
, _paddle1 :: Float
, _paddle2 :: Float
, _score :: (Int, Int)
, _vectors :: [Vector]
-- Since gloss doesn't cover this, we store the set of pressed keys
, _keys :: Set Key
}
-- Some nice lenses to go with it
makeLenses ''Pong
That makes lenses to access the members without the underscores via some TemplateHaskell magic.
Later on, there's an example of using them:
-- Update the paddles
updatePaddles :: Float -> State Pong ()
updatePaddles time = do
p <- get
let paddleMovement = time * paddleSpeed
keyPressed key = p^.keys.contains (SpecialKey key)
-- Update the player's paddle based on keys
when (keyPressed KeyUp) $ paddle1 += paddleMovement
when (keyPressed KeyDown) $ paddle1 -= paddleMovement
-- Calculate the optimal position
let optimal = hitPos (p^.ballPos) (p^.ballSpeed)
acc = accuracy p
target = optimal * acc + (p^.ballPos._y) * (1 - acc)
dist = target - p^.paddle2
-- Move the CPU's paddle towards this optimal position as needed
when (abs dist > paddleHeight/3) $
case compare dist 0 of
GT -> paddle2 += paddleMovement
LT -> paddle2 -= paddleMovement
_ -> return ()
-- Make sure both paddles don't leave the playing area
paddle1 %= clamp (paddleHeight/2)
paddle2 %= clamp (paddleHeight/2)
I recommend checking out the whole program in its original location and looking through the rest of the lens material; it's very interesting even if you don't end up using them.
Yes, you cannot have two records in the same module with the same field names. The field names are added to the module's scope as functions, so you would use name vm rather than vm.name. You could have two records with the same field names in different modules and import one of the modules qualified as some name, but this is probably awkward to work with.
For a case like this, you should probably just use a normal algebraic data type:
data VMInfo = VMInfo String String String
(Note that the VMInfo has to be capitalized.)
Now you can access the fields of VMInfo by pattern matching:
myFunc (VMInfo name index id) = ... -- name, index and id are bound here