Abstracting over repeated application within Monad chains

The Haskell wikibook has an example that shows how to chain lookup commands when trying to find different pieces of connected information throughout a database, seen here:
getTaxOwed :: String       -- their name
           -> Maybe Double -- the amount of tax they owe
getTaxOwed name =
  lookup name phonebook >>=
    (\number -> lookup number governmentDatabase) >>=
      (\registration -> lookup registration taxDatabase)
and rewritten in do notation:
getTaxOwed name = do
  number <- lookup name phonebook
  registration <- lookup number governmentDatabase
  lookup registration taxDatabase
Now, anytime I see a function repeated more than once I immediately try to think of ways to abstract over its repeated application, but as I haven't used Monads much in practice yet, and as they seem to already be at a pretty high level of abstraction, I didn't know how to approach that in this case.
What are some ways, if any, a coder could abstract over the common pattern above, that is, a call to lookup in every line?
(an aside: is this an appropriate context for the phrase "abstract over"? I felt it captured my meaning, but I'm not sure, and I'd like to make sure I'm using terminology appropriately as a relatively new coder; I looked through other posts which clarified its use and meaning but I still can't figure it out for this particular example)

Big thanks to Carsten for the link to foldM! Credit to them for the insight of this answer.
So, if we use foldM, we can write a function that performs a chain of lookups through several directories, where each lookup uses the previous result as its key. Because the fold runs in the Maybe monad, the chain stops and returns Nothing as soon as a key is missing from a directory:
import Control.Monad (foldM)
lookupALot :: Eq a => a -> [[(a, a)]] -> Maybe a
lookupALot key directories = foldM lookup key directories
A call to foldM expands to the form
foldM f k1 [d1, d2, ..., dm] -- k == key, d == directory
==
do
  k2 <- f k1 d1
  k3 <- f k2 d2
  ...
  f km dm
which is exactly the same structure as
do
  number <- lookup name phonebook
  registration <- lookup number governmentDatabase
  lookup registration taxDatabase
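Hence, a more compact way of writing getTaxOwed would be:
getTaxOwed :: String -> Maybe String
getTaxOwed name = foldM lookup name [phonebook, governmentDatabase, taxDatabase]
Which kinda blows me away! That line of code will find the phone number associated with a person's name, then check the governmentDatabase with that number for their registration, and finally find their tax information from that registration. Note, though, that this only type-checks when every directory has the same type [(a, a)], because foldM threads a single accumulator type through the whole list. Here that means all three directories are [(String, String)] and the result is Maybe String; a taxDatabase of type [(String, Double)], as the original signature's Maybe Double implies, could not go in the same list.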
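When the directories do not all share one type, foldM lookup does not type-check, but Kleisli composition (>=>) from Control.Monad still removes the repeated plumbing. The following is only a sketch: the sample data is invented for illustration, and getRegistration is a hypothetical helper name that does not appear in the wikibook.

```haskell
import Control.Monad (foldM, (>=>))

-- Hypothetical sample data; the names mirror the question, the contents are made up.
phonebook :: [(String, String)]
phonebook = [("Alice", "555-0100"), ("Bob", "555-0101")]

governmentDatabase :: [(String, String)]
governmentDatabase = [("555-0100", "REG-1"), ("555-0101", "REG-2")]

taxDatabase :: [(String, Double)]
taxDatabase = [("REG-1", 1200.5)]

-- foldM works for the directories that share the type [(String, String)].
getRegistration :: String -> Maybe String
getRegistration name = foldM lookup name [phonebook, governmentDatabase]

-- Kleisli composition chains lookups whose key and value types differ.
getTaxOwed :: String -> Maybe Double
getTaxOwed =
  (`lookup` phonebook) >=> (`lookup` governmentDatabase) >=> (`lookup` taxDatabase)
```

Each section such as (`lookup` phonebook) has the shape key -> Maybe value, so >=> can glue them together even though the last step returns a Double.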

Related

Need help storing the previous element of a list (Haskell)

I'm currently working on an assignment. I have a function called gamaTipo that converts the values of a tuple into a data type previously defined by my professor.
The problem is: in order for gamaTipo to work, it needs to receive some preceding element. gamaTipo is defined like this: gamaTipo :: Peca -> (Int,Int) -> Peca where Peca is the data type defined by my professor.
What I need to do is create a function that takes a list of tuples and converts it into the Peca data type. The part that I'm struggling with is getting the preceding element of the list. For example, let's say we have a list [(1,2),(3,4)] where the first element of the list, (1,2), always corresponds to Dirt Ramp (a data type defined by my professor). I have to create a function convert :: [(Int,Int)] -> [Peca] where, in order to calculate the element (3,4), I need to first translate (1,2) into Peca and use it as the previous element to translate (3,4).
Here's what I've tried so far:
updateTuple :: [(Int,Int)] -> [Peca]
updateTuple [] = []
updateTuple ((x,y):xs) = let previous = Dirt Ramp
                         in (gamaTipo previous (x,y)) : updateTuple xs
Although I get no error messages with this code, the output isn't what I expect. I'm also sorry if it's not easy to understand what I'm asking; English isn't my native tongue and it's hard to express myself. Thank you in advance! :)

If I understand correctly, your program needs to have a basic structure something like this:
updateTuple :: [(Int, Int)] -> [Peca]
updateTuple = go initialValue
  where
    go prev (xy:xys) =
      let next = getNextValue prev xy
      in prev : go next xys
    go prev [] = [prev]
Basically, what’s happening here is:
updateTuple is defined in terms of a helper function go. (Note that ‘helper function’ isn’t standard terminology, it’s just what I’ve decided to call it).
go has an extra argument, which is used to store the previous value.
The implementation of go can then make use of the previous value.
When go recurses, the recursive call can then pass the newly-calculated value as the new ‘previous value’.
This is a reasonably common pattern in Haskell: if a recursive function requires an extra argument, then a new function (often named go) can be defined which has that extra argument. Then the original function can be defined in terms of go.
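The pattern can be sketched generically as follows. Since the definitions of Peca and gamaTipo aren't shown, this sketch abstracts over them: convertWith is a hypothetical name, the step argument stands in for gamaTipo, and the first argument stands in for the initial Dirt Ramp value.

```haskell
-- Thread the previously produced value through the list.
-- 'step' plays the role of gamaTipo; 'first' plays the role of Dirt Ramp.
convertWith :: (p -> (Int, Int) -> p) -> p -> [(Int, Int)] -> [p]
convertWith step first = go first
  where
    -- No tuples left: emit the last value that was produced.
    go prev []         = [prev]
    -- Emit the previous value, then continue with the newly computed one.
    go prev (xy : xys) = prev : go (step prev xy) xys

-- The same traversal is available in the Prelude as scanl.
convertWith' :: (p -> (Int, Int) -> p) -> p -> [(Int, Int)] -> [p]
convertWith' = scanl
```

If the first tuple always corresponds to Dirt Ramp, one possible use (an assumption about the assignment) would be to drop that tuple and call convertWith gamaTipo (Dirt Ramp) on the rest of the list.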

Growing a list in haskell

I'm learning Haskell by writing an OSC musical sequencer to use it with SuperCollider. But because I'd like to make fairly complex stuff with it, it will work like a programming language where you can declare variables and define functions so you can write music in an algorithmic way. The grammar is unusual in that we're coding sequences and sometimes a bar will reference the last bar (something like "play that last chord again but a fifth above").
I don't feel satisfied with my own explanation, but that's the best I can do without getting too technical.
Anyway, what I'm coding now is the parser for that language, stateless so far, but now I need some way to implement a growing list of the declared variables and the like, using a dictionary in the [("key","value")] fashion, so I can add new values as I go parsing bar by bar.
I know this involves monads, which I don't really understand yet, but I need something meaningful enough to start toying with them or else I find the raw theory a bit too raw.
So what would be a clean and simple way to start?
Thanks and sorry if the question was too long.
Edit on how the thing works:
we input a string to the main parsing function, say
"afunction(3) ; anotherone(1) + [3,2,1]"
we identify closures first, then kinds of chars (letters, nums, etc) and group them together, so we get a list like:
[("word","afunction"),("parenth","(3)"),("space"," "),("semicolon",";"),("space"," "),("word","anotherone"),("parenth","(1)"),("space"," "),("opadd","+"),("space"," "),("bracket","[3,2,1]")]
then we use a function that tags all those tuples with the indices of the original string they occupy, like:
[("word","afunction",(0,8)),("parenth","(3)",(9,11)),("space"," ",(12,13)) ...]
then cut it in a list of bars, which in my language are separated using a semicolon, and then in notes, using commas.
And now I'm at the stage where those functions should be executed sequentially, but because some of them are reading or modifying previously declared values, I need to keep track of that change. For example, let's say the function f(x) moves the pitch of the last note by x semitones, so
f(9), -- from an original base value of 0 (say that's an A440) we go to 9
f(-2), -- 9-2 = 7, so a fifth from A
f(-3); -- 9-2-3, a minor third down from the last value.
etc
But sometimes it can get a bit more complicated than that, don't make me explain how cause I could bore you to death.

Adding an item to a list
You can make a new list that contains one more item than an existing list with the : constructor.
("key", "value") : existing
Where existing is a list you've already made
Keeping track of changing state
You can keep track of changing state between functions by passing the state from each function to the next. This is all the State monad is doing. State s a is a value of type a that depends on (and changes) a state s.
-- s is the type of the state, a is the type of the value
data State s a = State { runState :: s -> (a, s) }
-- runState is a function that takes a state and returns
-- a value that depends on the state, paired with a new state
The bind operation >>= for State takes a value that depends on (and changes) the state and a function to compute another value that depends on (and changes) the state and combines them to make a new value that depends on (and changes) the state.
m >>= k = State $ \s ->
  let ~(a, s') = runState m s
  in runState (k a) s'
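To make this concrete, here is a minimal sketch that threads a growing [(key, value)] list through a sequence of steps, using the f(9), f(-2), f(-3) example from the question. It defines a tiny State type locally so that it only needs base; in practice you would more likely use Control.Monad.State from a library. The names Env, declare, shiftPitch and bar, and the choice to store the pitch under the key "pitch", are assumptions made for illustration.

```haskell
import Data.Maybe (fromMaybe)

-- A minimal State monad, defined locally so the sketch needs only base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  m >>= k = State $ \s ->
    let (a, s') = runState m s
    in runState (k a) s'

-- Read the current state.
get :: State s s
get = State $ \s -> (s, s)

-- Replace the state with a function of the old state.
modify :: (s -> s) -> State s ()
modify f = State $ \s -> ((), f s)

-- Hypothetical environment: declared variables as an association list.
type Env = [(String, Int)]

-- Declare (or shadow) a variable by consing a new pair onto the list.
declare :: String -> Int -> State Env ()
declare key value = modify ((key, value) :)

-- Move the pitch by n semitones and return the new pitch.
-- Assumes an undeclared "pitch" starts at 0.
shiftPitch :: Int -> State Env Int
shiftPitch n = do
  env <- get
  let new = fromMaybe 0 (lookup "pitch" env) + n
  declare "pitch" new
  pure new

-- One bar: f(9), f(-2), f(-3)
bar :: State Env [Int]
bar = mapM shiftPitch [9, -2, -3]
```

Running runState bar [] yields the pitches [9, 7, 4] together with the final environment, in which the most recent "pitch" entry shadows the older ones because lookup returns the first match.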

Finding list entry with the highest count

I have an Entry data type
data Entry = Entry {
  count :: Integer,
  name :: String }
Then I want to write a function that takes a name and a list of Entrys as arguments and gives me the Entry with the highest count. What I have so far is
searchEntry :: String -> [Entry] -> Maybe Integer
searchEntry _ [] = Nothing
searchEntry name1 (x:xs) =
  if name x == name1
    then Just (count x)
    else searchEntry name1 xs
That gives me the FIRST Entry that the function finds, but I want the Entry with the highest count. How can I implement that?

My suggestion would be to break the problem into two parts:
Find all entries matching a given name
Find the entry with the highest count
You could set it up as
entriesByName :: String -> [Entry] -> [Entry]
entriesByName name entries = undefined
-- Use Maybe since the list might be empty
entryWithHighestCount :: [Entry] -> Maybe Entry
entryWithHighestCount entries = undefined
entryByNameWithHighestCount :: String -> [Entry] -> Maybe Entry
entryByNameWithHighestCount name entries = entryWithHighestCount $ entriesByName name entries
All you have to do is implement the relatively simple functions that are used to implement entryByNameWithHighestCount.

You need to add an inner function that takes the current best result as a parameter and returns it, instead of Nothing, when it reaches the end of the list.
You would also need to update the logic for a match so that it compares the existing result, if there is one, with the newly found value.
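A minimal sketch of that idea, keeping the original signature of searchEntry; the helper name go and the parameter names wanted and best are illustrative choices, not something from the question.

```haskell
data Entry = Entry
  { count :: Integer
  , name  :: String
  } deriving (Show, Eq)

-- Highest count among the entries with the given name, if any.
searchEntry :: String -> [Entry] -> Maybe Integer
searchEntry wanted = go Nothing
  where
    -- End of the list: return whatever has been found so far.
    go best [] = best
    go best (x : xs)
      -- A match: keep the larger of the old best (if any) and this count.
      | name x == wanted = go (Just (maybe (count x) (max (count x)) best)) xs
      | otherwise        = go best xs
```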

I would consider changing the signature of the function to String -> [Entry] -> Maybe Entry (or String -> [Entry] -> [Entry]) if you indeed want to return the "Entry" items with the highest count.
Otherwise, you can actually do what you want as a oneliner using some pretty common Haskell functions....
As Bheklilr mentioned, the name filter can be done first, and it is really easy to do this using the filter function....
filter (hasName theName) entries
Note that hasName can be written out fully as a separate function, but Haskell also offers you the following shortcut.
hasName = (== theName) . name
Now you just need the maximum value.... Haskell has a maximum function, but it only works on the Ord class. You can make Entry an instance of Ord, or you can just use the related maximumBy function, that takes an extra ordering function
maximumBy orderFunction entries2
Again, you can write orderFunction yourself (which you might want to do as an exercise), but Haskell again offers a shortcut.
orderFunction = compare `on` count
You will need to import a couple of modules to get this all to work (Data.Function, Data.List). You will also need some extra code to handle the case where no entry matches, since maximumBy fails on an empty list.
It might be worth it to write out the functions longhand first, but I recommend that you use Hoogle to look up and understand compare, on, and maximumBy.... Using tricks like this can really shorten your code.
Putting it all together, you can get the entry with the maximum count like this
maxEntry = maximumBy (compare `on` count) $ filter ((theName ==) . name) $ entries
You will need to modify this to account for the Nothing case, or if you want to return all max Entries (this just chooses one), or if you really wanted to return count, and not the entry.
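One way to account for the Nothing case is sketched below. It is an illustration rather than the only option: it takes the name as a parameter, returns Maybe Entry, and uses comparing from Data.Ord, which is equivalent to compare `on` count here.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

data Entry = Entry
  { count :: Integer
  , name  :: String
  } deriving (Show, Eq)

-- The entry with the highest count among those with the given name.
-- Returns Nothing when no entry matches, so maximumBy never sees an empty list.
maxEntry :: String -> [Entry] -> Maybe Entry
maxEntry theName entries =
  case filter ((== theName) . name) entries of
    []      -> Nothing
    matches -> Just (maximumBy (comparing count) matches)
```

If only the count is wanted, fmap count (maxEntry theName entries) gives a Maybe Integer.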

What makes a good name for a helper function?

Consider the following problem: given a list of length three of tuples (String,Int), is there a pair of elements having the same "Int" part? (For example, [("bob",5),("gertrude",3),("al",5)] contains such a pair, but [("bob",5),("gertrude",3),("al",1)] does not.)
This is how I would implement such a function:
import Data.List (sortBy)
import Data.Function (on)

hasPair :: [(String, Int)] -> Bool
hasPair = napkin . sortBy (compare `on` snd)
  where napkin [(_, a), (_, b), (_, c)] | a == b    = True
                                        | b == c    = True
                                        | otherwise = False
I've used pattern matching to bind names to the "Int" part of the tuples, but I want to sort first (in order to group like members), so I've put the pattern-matching function inside a where clause. But this brings me to my question: what's a good strategy for picking names for functions that live inside where clauses? I want to be able to think of such names quickly. For this example, "hasPair" seems like a good choice, but it's already taken! I find that pattern comes up a lot - the natural-seeming name for a helper function is already taken by the outer function that calls it. So I have, at times, called such helper functions things like "op", "foo", and even "helper" - here I have chosen "napkin" to emphasize its use-it-once, throw-it-away nature.
So, dear Stackoverflow readers, what would you have called "napkin"? And more importantly, how do you approach this issue in general?

General rules for locally-scoped variable naming.
f, k, g, h for super simple local, semi-anonymous things
go for (tail) recursive helpers (precedent)
n, m, i, j for length and size and other numeric values
v for results of map lookups and other dictionary types
s and t for strings.
a:as and x:xs and y:ys for lists.
(a,b,c,_) for tuple fields.
These generally only apply for arguments to HOFs. For your case, I'd go with something like k or eq3.
Use apostrophes sparingly, for derived values.

I tend to call boolean valued functions p for predicate. pred, unfortunately, is already taken.

In cases like this, where the inner function is basically the same as the outer function, but with different preconditions (requiring that the list is sorted), I sometimes use the same name with a prime, e.g. hasPair'.
However, in this case, I would rather try to break down the problem into parts that are useful by themselves at the top level. That usually also makes naming them easier.
import Data.List (sort)

hasPair :: [(String, Int)] -> Bool
hasPair = hasDuplicate . map snd

hasDuplicate :: Ord a => [a] -> Bool
hasDuplicate = not . isStrictlySorted . sort

isStrictlySorted :: Ord a => [a] -> Bool
isStrictlySorted xs = and $ zipWith (<) xs (tail xs)

My strategy follows Don's suggestions fairly closely:
1. If there is an obvious name for it, use that.
2. Use go if it is the "worker" or otherwise very similar in purpose to the original function.
3. Follow personal conventions based on context, e.g. step and start for args to a fold.
4. If all else fails, just go with a generic name, like f.
There are two techniques that I personally avoid. One is using the apostrophe version of the original function, e.g. hasPair' in the where clause of hasPair. It's too easy to accidentally write one when you meant the other; I prefer to use go in such cases. But this isn't a huge deal as long as the functions have different types. The other is using names that might connote something, but not anything that has to do with what the function actually does. napkin would fall into this category. When you revisit this code, this naming choice will probably baffle you, as you will have forgotten the original reason that you named it napkin. (Because napkins have 4 corners? Because they are easily folded? Because they clean up messes? They're found at restaurants?) Other offenders are things like bob and myCoolFunc.
If you have given a function a name that is more descriptive than go or h, then you should be able to look at either the context in which it is used, or the body of the function, and in both situations get a pretty good idea of why that name was chosen. This is where my point #3 comes in: personal conventions. Much of Don's advice applies. If you are using Haskell in a collaborative situation, then coordinate with your team and decide on certain conventions for common situations.

Haskell: Confusion with own data types. Record syntax and unique fields

I just uncovered this confusion and would like a confirmation that it is what it is. Unless, of course, I am just missing something.
Say, I have these data declarations:
data VmInfo = VmInfo {name, index, id :: String} deriving (Show)
data HostInfo = HostInfo {name, index, id :: String} deriving (Show)
vm = VmInfo "vm1" "01" "74653"
host = HostInfo "host1" "02" "98732"
What I always thought and what seems to be so natural and logical is this:
vmName = vm.name
hostName = host.name
But this, obviously, does not work. I got this.
Questions
So my questions are.
When I create a data type with record syntax, do I have to make sure that all the fields have unique names? If yes - why?
Is there a clean way or something similar to a "scope resolution operator", like :: or ., etc., so that Haskell distinguishes which data type the name (or any other non-unique field) belongs to and returns the correct result?
What is the correct way to deal with this if I have several declarations with the same field names?
As a side note.
In general, I need to return data types similar to the above example.
First I returned them as tuples (which seemed to me the correct way at the time). But tuples are hard to work with, as it is impossible to extract individual parts of a complex type as easily as with lists using "!!". So the next thing I thought of was dictionaries/hashes.
When I tried using dictionaries I thought what is the point of having own data types then?
Playing/learning data types I encountered the fact that led me to the above question.
So it looks like it is easier for me to use dictionaries instead of own data types as I can use the same fields for different objects.
Can you please elaborate on this and tell me how it is done in real world?

Haskell record syntax is a bit of a hack, but the field name emerges as a function, and that function has to have a unique type. So you can share record-field names among constructors of a single datatype but not among distinct datatypes.
What is the correct way to deal with this if I have several declarations with the same field names?
You can't. You have to use distinct field names. If you want an overloaded name to select from a record, you can try using a type class. But basically, field names in Haskell don't work the way they do in say, C or Pascal. Calling it "record syntax" might have been a mistake.
But tuples are hard to work with as it is impossible to extract individual parts of a complex type
Actually, this can be quite easy using pattern matching. Example
smallId :: VmInfo -> Bool
smallId (VmInfo { vmId = n }) = n < 10
As to how this is done in the "real world", Haskell programmers tend to rely heavily on knowing what type each field is at compile time. If you want the type of a field to vary, a Haskell programmer introduces a type parameter to carry varying information. Example
data VmInfo a = VmInfo { vmId :: Int, vmName :: String, vmInfo :: a }
Now you can have VmInfo String, VmInfo Dictionary, VmInfo Node, or whatever you want.
Summary: each field name must belong to a unique type, and experienced Haskell programmers work with the static type system instead of trying to work around it. And you definitely want to learn about pattern matching.
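Putting the pattern match and the type parameter together, a small sketch might look like this. The sample values vmWithNote and vmWithTags are made up for illustration.

```haskell
-- A record whose last field can carry any payload type.
data VmInfo a = VmInfo { vmId :: Int, vmName :: String, vmInfo :: a }
  deriving Show

-- Pattern match on just the field we need; works for every payload type.
smallId :: VmInfo a -> Bool
smallId (VmInfo { vmId = n }) = n < 10

-- Hypothetical sample values with different payload types.
vmWithNote :: VmInfo String
vmWithNote = VmInfo 3 "vm1" "staging box"

vmWithTags :: VmInfo [String]
vmWithTags = VmInfo 42 "vm2" ["prod", "eu"]
```

Because smallId never touches vmInfo, the same function applies to both values even though their types differ.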

There are more reasons why this doesn't work: lowercase type names and data constructors, and OO-language-style member access with a dot (.). In Haskell, those member access functions are actually free functions, i.e. vmName = name vm rather than vmName = vm.name, which is why they can't have the same names in different data types.
If you really want functions that can operate on both VmInfo and HostInfo objects, you need a type class, such as
class MachineInfo m where
  name :: m -> String
  index :: m -> String -- why String anyway? Shouldn't this be an Int?
  id :: m -> String
and make instances
instance MachineInfo VmInfo where
  name (VmInfo vmName _ _) = vmName
  index (VmInfo _ vmIndex _) = vmIndex
  ...
instance MachineInfo HostInfo where
  ...
Then name machine will work if machine is a VmInfo as well as if it's a HostInfo.
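A complete version of this idea could look like the sketch below. To keep it compilable in a single module it makes two assumptions that differ from the snippet above: the record fields get distinct prefixes (vmName, hostName, ...), and the class methods are named machineName, machineIndex and machineId so that nothing clashes with the Prelude's id. The describe function is a hypothetical example of code that works for both types.

```haskell
data VmInfo = VmInfo { vmName, vmIndex, vmId :: String }
  deriving Show

data HostInfo = HostInfo { hostName, hostIndex, hostId :: String }
  deriving Show

-- One overloaded accessor per piece of information.
class MachineInfo m where
  machineName  :: m -> String
  machineIndex :: m -> String
  machineId    :: m -> String

instance MachineInfo VmInfo where
  machineName  = vmName
  machineIndex = vmIndex
  machineId    = vmId

instance MachineInfo HostInfo where
  machineName  = hostName
  machineIndex = hostIndex
  machineId    = hostId

-- Works for any type that has a MachineInfo instance.
describe :: MachineInfo m => m -> String
describe m = machineName m ++ " (" ++ machineId m ++ ")"

vm :: VmInfo
vm = VmInfo "vm1" "01" "74653"

host :: HostInfo
host = HostInfo "host1" "02" "98732"
```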

Currently, the named fields are top-level functions, so in one scope there can only be one function with that name. There are plans to create a new record system that would allow having fields of the same name in different record types in the same scope, but that's still in the design phase.
For the time being, you can make do with unique field names, or define each type in its own module and use the module-qualified name.
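Since this answer was written, GHC has gained language extensions in this area: DuplicateRecordFields (GHC 8.0 and later) and OverloadedRecordDot (GHC 9.2 and later). The sketch below assumes a compiler new enough to support both. The third field is renamed to ident here, purely as an illustrative choice, to stay clear of the Prelude's id.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE OverloadedRecordDot #-}

-- Two record types sharing field names in the same module.
data VmInfo = VmInfo { name :: String, index :: String, ident :: String }
  deriving Show

data HostInfo = HostInfo { name :: String, index :: String, ident :: String }
  deriving Show

vm :: VmInfo
vm = VmInfo "vm1" "01" "74653"

host :: HostInfo
host = HostInfo "host1" "02" "98732"

-- Record dot syntax picks the field based on the type of the record.
vmName :: String
vmName = vm.name

hostName :: String
hostName = host.name
```

Note that the dot must be written without surrounding spaces; vm . name would still be parsed as function composition.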

Lenses can help take some of the pain out of dealing with getting and setting data structure elements, especially when they get nested. They give you something that looks, if you squint, kind of like object-oriented accessors.
Learn more about the Lens family of types and functions here: http://lens.github.io/tutorial.html
As an example for what they look like, this is a snippet from the Pong example found at the above github page:
data Pong = Pong
  { _ballPos :: Point
  , _ballSpeed :: Vector
  , _paddle1 :: Float
  , _paddle2 :: Float
  , _score :: (Int, Int)
  , _vectors :: [Vector]
  -- Since gloss doesn't cover this, we store the set of pressed keys
  , _keys :: Set Key
  }
-- Some nice lenses to go with it
makeLenses ''Pong
That makes lenses to access the members without the underscores via some TemplateHaskell magic.
Later on, there's an example of using them:
-- Update the paddles
updatePaddles :: Float -> State Pong ()
updatePaddles time = do
  p <- get
  let paddleMovement = time * paddleSpeed
      keyPressed key = p^.keys.contains (SpecialKey key)
  -- Update the player's paddle based on keys
  when (keyPressed KeyUp) $ paddle1 += paddleMovement
  when (keyPressed KeyDown) $ paddle1 -= paddleMovement
  -- Calculate the optimal position
  let optimal = hitPos (p^.ballPos) (p^.ballSpeed)
      acc = accuracy p
      target = optimal * acc + (p^.ballPos._y) * (1 - acc)
      dist = target - p^.paddle2
  -- Move the CPU's paddle towards this optimal position as needed
  when (abs dist > paddleHeight/3) $
    case compare dist 0 of
      GT -> paddle2 += paddleMovement
      LT -> paddle2 -= paddleMovement
      _ -> return ()
  -- Make sure both paddles don't leave the playing area
  paddle1 %= clamp (paddleHeight/2)
  paddle2 %= clamp (paddleHeight/2)
I recommend checking out the whole program in its original location and looking through the rest of the lens material; it's very interesting even if you don't end up using them.

Yes, you cannot have two records in the same module with the same field names. The field names are added to the module's scope as functions, so you would use name vm rather than vm.name. You could have two records with the same field names in different modules and import one of the modules qualified as some name, but this is probably awkward to work with.
For a case like this, you should probably just use a normal algebraic data type:
data VMInfo = VMInfo String String String
(Note that the VMInfo has to be capitalized.)
Now you can access the fields of VMInfo by pattern matching:
myFunc (VMInfo name index id) = ... -- name, index and id are bound here
