Memoizing functions in a hash table - reference

I must complete an exercise in which I write a function memo : (('a -> 'b) -> 'a -> 'b) -> stats -> 'a -> 'b. The function memo takes as input a function f (the function we want to memoize) and a record of type stats (its type declaration is given below), which keeps track of how often we access the local store and how many entries we add. It returns a function of type 'a -> 'b.
When this function is called with an input of type 'a, it will run f, memoize intermediate results, and return as a final result a value of type 'b. It will also update the values in its stats record accordingly. For example, to create a memoizing function that sums up numbers up to x, we have memo (fun g x -> if x=0 then 0 else x + g(x-1)) stats
The record type stats has two fields:
entries: the number of results that have been memoized
lkp: the number of times a memoized result has been found in the store instead of executing the function
Note the type of the given function f: it in itself requires another function as argument. Specifically, the function g passed to f will be the function that f will call in its recursive case.
The issue I am having is that the number of lookups for me is always off by 1 or 2. If anyone could clarify why this is happening or give me a hint, I'd appreciate it. Below is my attempt, which computes the correct values for the function calls and the correct number of entries. Hopefully the code below will make it clearer:
type stats =
{ entries : int ref;
lkp : int ref }
let memo (f: (('a -> 'b) -> 'a -> 'b)) (stats: stats) : ('a -> 'b) =
let map = Hashtbl.create 1000 in
let rec g x =
match Hashtbl.find_opt map x with
| None -> let result = f (g) x in Hashtbl.add map x result ; stats.entries := !(stats.entries) + 1 ; result
| Some v -> stats.lkp := !(stats.lkp)+ 1 ; v
in stats.entries := !(stats.entries) + 1 ; f g

Let's deobfuscate your code:
type stats = {
  entries : int ref;
  lkp : int ref
}
let memo (f: (('a -> 'b) -> 'a -> 'b)) (stats: stats) : ('a -> 'b) =
  let map = Hashtbl.create 1000 in
  let rec g x =
    match Hashtbl.find_opt map x with
    | None ->
      let result = f (g) x in
      Hashtbl.add map x result;
      incr stats.entries;
      result
    | Some v ->
      incr stats.lkp;
      v in
  incr stats.entries;
  f g
Now we can easily see that on a cache miss (the None branch) we increment the number of entries (looks correct) but not the number of lookups (looks suspicious: I would still count this as a lookup, unless by lookup you mean a cache hit). On a cache hit (the Some branch), we increment the number of lookups. So far so good. We can also see that, for some reason, we increment the number of entries unconditionally when we call memo. That looks odd, since we do not add any entries there. Most likely it is some leftover code. Note also that memo returns f g rather than g, so the outermost call never goes through the table: its result is neither stored nor looked up.
I hope the rest is clear. The main takeaway is that you should try to write programs that are syntactically easy to digest. Do not be afraid to use the Enter key on your keyboard, and do not try to do too much at once. Also, try to use an editor (VS Code, Emacs, Vim) that supports automatic indentation. It will save you from nasty surprises.

Related

Errors while creating a power function

First of all, I want to say that I'm very very inexperienced with Haskell, and I know that I have done something (or multiple things) terribly wrong, been struggling for hours but I can't seem to find it.
power :: Int -> Int -> Int
power x y | y == 0 = 1
| x == 0 = 0
list = replicate y x
foldr (*) x list
main = print $ power 3 5
Error most of the time is either x and y not being passed to the replicate function or that foldr is a naked function, I understand what they both mean but have no idea on how I can pass the variables or come up with a solution.
Here you have created four functions: power, list, foldr and main. But you use the variables x and y in the definition of list, where they are not in scope.
You can work with a where clause to specify subexpressions, for example:
power :: Int -> Int -> Int
power x y | y == 0 = 1
          | x == 0 = 0
          | otherwise = foldr (*) 1 list
    where list = replicate y x
or perhaps more elegant with pattern matching:
power :: Int -> Int -> Int
power 0 _ = 0
power x y = foldr (*) 1 (replicate y x)
main = print $ power 3 5
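Here we can also eliminate the case for x^0 (that is, y == 0), since our foldr starts working with 1, not x, and replicate 0 x is the empty list. Note that this version returns 0 for power 0 0, whereas the guarded version above returns 1.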
This algorithm is however not very efficient, since it is linear in the value of y. By checking recursively if the exponent is even or odd, you can make it faster. I leave this as an exercise.
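For readers who want to compare against their own attempt at that exercise, here is one possible sketch of exponentiation by squaring. The name fastPower and the decision to raise an error on a negative exponent are my own assumptions, not part of the answer above.

```haskell
-- Exponentiation by squaring: halve the exponent at every step.
fastPower :: Int -> Int -> Int
fastPower _ 0 = 1
fastPower x y
    | y < 0     = error "fastPower: negative exponent"
    | even y    = half * half        -- x^(2k)   = (x^k)^2
    | otherwise = x * half * half    -- x^(2k+1) = x * (x^k)^2
  where
    -- Computed once and shared by both guards (and never forced when y < 0).
    half = fastPower x (y `div` 2)
```

The number of recursive calls grows with the logarithm of y rather than with y itself, because each call halves the exponent.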
You were very close! The main things that need to be fixed are:
When writing a definition with guards, the “fallback” case needs to be a guard as well, conventionally written with otherwise.
Recall that a definition without guards looks like this, with one left side (a name and parameter patterns/names) and one right side (an expression):
name patterns = expression
With guard conditions, there is one right-hand side for each guard:
name patterns | condition1 = expression1
              | condition2 = expression2
              …
              | otherwise = expressionN
otherwise is really just an alias for True, that is, such a guard always matches. The only thing special about otherwise is that the compiler uses it as a hint when analysing whether a pattern match covers all possible cases.
In order to define a variable list, local to the definition of power, using the parameters x and y, you need to use either a let…in… expression, that is, let block in expression, or a where clause, equation where block. A block is a series of items (in this case, local definitions) which must all be written starting at the same column of indentation, or be delimited by explicit curly braces {…} and semicolons ;.
Using let…in… follows the structure of your original code pretty closely. I will adjust the indentation style to avoid needing to align anything, by putting a newline and a constant amount of indentation instead.
power :: Int -> Int -> Int
power x y
  | y == 0 = 1
  | x == 0 = 0
  | otherwise = let
    list = replicate y x
    in foldr (*) 1 list  -- fold from 1, not x; starting from x would give x^(y+1)
main :: IO ()
main = print $ power 3 5
Attaching a where clause to an equation is slightly more common than using a let…in… expression on the right side of an equation.
power :: Int -> Int -> Int
power x y
  | y == 0 = 1
  | x == 0 = 0
  | otherwise = foldr (*) 1 list
  where
    list = replicate y x
main :: IO ()
main = print $ power 3 5
Note that in this case, there is a slight difference: the variable list is visible in all of the right-hand sides, although we only use it in one of them. With let list = … in e, list is only defined within e. In general, it’s helpful for readability to keep the scope of a variable as small as possible, although you can certainly go overboard:
a = …
  where
    b = …
      where
        c = …
          where
            d = …
            -- If you see this much nesting, rethink!
If you run into issues with alignment and indentation, you can always use explicit delimiters instead. The code I wrote is equivalent to the following.
power :: Int -> Int -> Int; -- Begin ‘power’ signature.
power x y
  | y == 0 = 1
  | x == 0 = 0
  | otherwise = let { -- Begin ‘let’ block.
    list = replicate y x; -- End ‘list’ equation.
  } in foldr (*) 1 list; -- End ‘let’ block, then end ‘power’ equation.
main :: IO (); -- Begin ‘main’ signature.
main = print $ power 3 5; -- End ‘main’ equation.
Or similarly with where { … }.

How do I use String.iter in F#

I cannot find any examples of String.iter so I've been looking at Seq.iter and Array.iter's examples and trying to apply it to chars in a string but I just can't get it right. Could somebody please give me an example on how to use String.iter. I need to do functions with each char in a string.
Here is what I was doing previously but I know this can be improved on and made way more efficient, I don't want to have to convert a string to a char list just to cycle through it.
let chars = [ 'a'; 'b'; 'c' ]
let mutable result = 0
for c in chars do
    match c with
    | 'a' -> (result <- result + 1)
    | 'b' -> (result <- result + 2)
    | 'c' -> (result <- result + 3)
    | _ -> printfn "test"
printfn "result of %A is %d" chars result
System.Console.ReadKey() |> ignore
First of all, F# has the type string (which is a .NET type), and this is a distinct type from char list (which is a functional F# list of characters). In your example, you are creating a list of characters, so you can best process it using functions from the List module.
Regarding iteration - in your case, you are accumulating some state and so iter is not the operation you need (iter is used for performing some imperative action for each element).
To solve your specific problem, the nicest option is to use List.sumBy:
let result = chars |> List.sumBy (fun c ->
    match c with
    | 'a' -> 1
    | 'b' -> 2
    | 'c' -> 3
    | _ ->
        printfn "test"
        0 )
The sumBy function sums the numbers returned for each element and so you just need to return 1, 2 or 3. In the remaining case, we print (leaving the same side-effect) and return 0 because we just want to keep the same sum.
More generally, you could use List.fold which lets you accumulate results as you iterate over the list:
let result = chars |> List.fold (fun result c ->
    match c with
    | 'a' -> result + 1
    | 'b' -> result + 2
    | 'c' -> result + 3
    | _ ->
        printfn "test"
        result ) 0
In all of these, you can replace List. with Seq. because functions in Seq. work on any sequence (lists, arrays, strings, etc.). This might be a bit slower, but that's typically not an issue. The String module has fewer functions and you could use it if you defined your input as "abc" rather than ['a';'b';'c']
EDIT: To answer the questions in the comments: you can use Seq.sumBy directly on a string, and you can use int to convert a character to its numerical code, which lets you eliminate the pattern matching (this handles all characters in the same way; you might want to filter invalid ones out using Seq.filter first, depending on what logic you're implementing):
let str = "abc"
let result = str |> Seq.sumBy (fun c -> (int c) - 96)

Haskell Continuation passing style index of element in list

There's a series of examples I'm trying to do to practice Haskell. I'm currently learning about continuation passing, but I'm a bit confused as to how to implement a function like find index of element in list that works like this:
index 3 [1,2,3] id = 2
Examples like factorial made sense since there wasn't really any processing of the data other than multiplication, but in the case of the index function, I need to compare the element I'm looking at with the element I'm looking for, and I just can't seem to figure out how to do that with the function parameter.
Any help would be great.
first let me show you a possible implementation:
index :: Eq a => a -> [a] -> (Int -> Int) -> Int
index _ [] _ = error "not found"
index x (x':xs) cont
  | x == x' = cont 0
  | otherwise = index x xs (\ind -> cont $ ind + 1)
if you prefer point-free style:
index :: Eq a => a -> [a] -> (Int -> Int) -> Int
index _ [] _ = error "not found"
index x (x':xs) cont
  | x == x' = cont 0
  | otherwise = index x xs (cont . (+1))
how it works
The trick is to use the continuations to count up the indices - those continuations will get the index to the right and just increment it.
As you see this will cause an error if it cannot find the element.
examples:
λ> index 1 [1,2,3] id
0
λ> index 2 [1,2,3] id
1
λ> index 3 [1,2,3] id
2
λ> index 4 [1,2,3] id
*** Exception: not found
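To make the counting visible, here is a small sketch that writes out the intermediate calls of index 3 [1,2,3] id by hand, using the point-free definition above. The list name traceSteps is made up for illustration. Every entry is one step of the same evaluation, so they should all have the same value.

```haskell
index :: Eq a => a -> [a] -> (Int -> Int) -> Int
index _ [] _ = error "not found"
index x (x':xs) cont
    | x == x'   = cont 0
    | otherwise = index x xs (cont . (+1))

-- Each element is the next step in evaluating the first one.
traceSteps :: [Int]
traceSteps =
    [ index 3 [1,2,3] id                -- 3 /= 1: recurse, wrapping the continuation
    , index 3 [2,3] (id . (+1))         -- 3 /= 2: recurse, wrapping it again
    , index 3 [3] (id . (+1) . (+1))    -- 3 == 3: call the continuation with 0
    , (id . (+1) . (+1)) 0              -- two increments applied to 0
    ]
```

Each skipped element adds one (+1) to the continuation, so the 0 found at the match comes back out as the index relative to the original list.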
how I figured it out
A good way to figure out stuff like this is by first writing down the recursive call with the continuation:
useCont a (x:xs) cont = useCont a xs (\valFromXs -> cont $ ??)
And now you have to think about what you want valFromXs to be (as a type and as a value) - but remember your typical start (as here) will be to make the first continuation id, so the type can only be Int -> Int. So it should be clear that we are talking about an index transformation here. As useCont will only know about the tail xs in the next call, it seems natural to see this index as relative to xs, and from here the rest should follow rather quickly.
IMO this is just another instance of
Let the types guide you Luke
;)
remarks
I don't think that this is a typical use of continuations in Haskell.
For one, you can use an accumulator argument for this as well (which is conceptually simpler):
index :: Eq a => a -> [a] -> Int -> Int
index _ [] _ = error "not found"
index x (x':xs) ind
  | x == x' = ind
  | otherwise = index x xs (ind+1)
or (see List.elemIndex) you can use Haskell's laziness and list comprehensions to make it look even nicer:
index :: Eq a => a -> [a] -> Int
index x xs = head [ i | (x',i) <- zip xs [0..], x'== x ]
If you have a value a then to convert it to CPS style you replace it with something like (a -> r) -> r for some unspecified r. In your case, the base function is index :: Eq a => a -> [a] -> Maybe Int and so the CPS form is
index :: Eq a => a -> [a] -> (Maybe Int -> r) -> r
or even
index :: Eq a => a -> [a] -> (Int -> r) -> r -> r
Let's implement the latter.
index x as success failure =
Notably, there are two continuations, one for the successful result and one for a failing one. We'll apply them as necessary and induct on the structure of the list just like usual. First, clearly, if the as list is empty then this is a failure
  case as of
    [] -> failure
    (a:as') -> ...
In the success case, we're, as normal, interested in whether x == a. When it is true we pass the success continuation the index 0, since, after all, we found a match at the 0th index of our input list.
  case as of
    ...
    (a:as') | x == a -> success 0
            | otherwise -> ...
So what happens when we don't yet have a match? If we were to pass the success continuation in unchanged then it would, assuming a match is found, eventually be called with 0 as an argument. This loses information about the fact that we've attempted to call it once already, though. We can rectify that by modifying the continuation
  case as of
    ...
    (a:as') ...
            | otherwise -> index x as' (\idx -> success (idx + 1)) failure
Another way to think about it is that we have to collect "post" actions in the continuation, since ultimately the result of the computation will pass through that code
-- looking for the value 5, we begin by recursing
1 :
    2 :
        3 :
            4 :
                5 : _ -- match at index 0; push it through the continuation
                0     -- lines from here down live in the continuation
            +1
        +1
    +1
+1
This might be even more clear if we write the recursive branch in pointfree style
| otherwise -> index x as' (success . (+1)) failure
which shows how we're modifying the continuation to include one more increment for each recursive call. All together the code is
index :: Eq a => a -> [a] -> (Int -> r) -> r -> r
index x as success failure =
  case as of
    [] -> failure
    (a:as') | x == a -> success 0
            | otherwise -> index x as' (success . (+1)) failure
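As a quick check of the assembled definition, the sketch below repeats it and picks two different pairs of continuations. The helper names indexMaybe and describeIndex are hypothetical, added only to show how a caller decides what success and failure mean.

```haskell
index :: Eq a => a -> [a] -> (Int -> r) -> r -> r
index x as success failure =
    case as of
        [] -> failure
        (a:as')
            | x == a    -> success 0
            | otherwise -> index x as' (success . (+1)) failure

-- Recover the direct-style Maybe result by passing Just and Nothing.
indexMaybe :: Eq a => a -> [a] -> Maybe Int
indexMaybe x xs = index x xs Just Nothing

-- The same search, but producing a message instead of a Maybe.
describeIndex :: Int -> [Int] -> String
describeIndex x xs =
    index x xs (\i -> "found at " ++ show i) "not found"
```

Because the result type r is left open, the same index serves both callers. The error call from the single-continuation version is no longer needed.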

How do I recursively use newStdGen in Haskell? (to get different random results on each iteration)

I use System.Random and System.Random.Shuffle to shuffle the order of characters in a string, I shuffle it using:
shuffle' string (length string) g
g being a getStdGen.
Now the problem is that the shuffle can result in an order that's identical to the original order, resulting in a string that isn't really shuffled. When this happens I want to just shuffle it recursively until it hits a shuffled string that's not the original string (which should usually happen on the first or second try), but this means I need to create a new random number generator on each recursion so it won't just shuffle it exactly the same way every time.
But how do I do that? Defining a
newg = newStdGen
in "where", and using it results in:
Jumble.hs:20:14:
Could not deduce (RandomGen (IO StdGen))
arising from a use of shuffle'
from the context (Eq a)
bound by the inferred type of
shuffleString :: Eq a => IO StdGen -> [a] -> [a]
at Jumble.hs:(15,1)-(22,18)
Possible fix:
add an instance declaration for (RandomGen (IO StdGen))
In the expression: shuffle' string (length string) g
In an equation for `shuffled':
shuffled = shuffle' string (length string) g
In an equation for `shuffleString':
shuffleString g string
= if shuffled == original then
shuffleString newg shuffled
else
shuffled
where
shuffled = shuffle' string (length string) g
original = string
newg = newStdGen
Jumble.hs:38:30:
Couldn't match expected type `IO StdGen' with actual type `StdGen'
In the first argument of `jumble', namely `g'
In the first argument of `map', namely `(jumble g)'
In the expression: (map (jumble g) word_list)
I'm very new to Haskell and functional programming in general and have only learned the basics, one thing that might be relevant which I don't know yet is the difference between "x = value", "x <- value", and "let x = value".
Complete code:
import System.Random
import System.Random.Shuffle
middle :: [Char] -> [Char]
middle word
    | length word >= 4 = (init (tail word))
    | otherwise = word
shuffleString g string =
    if shuffled == original
        then shuffleString g shuffled
        else shuffled
    where
        shuffled = shuffle' string (length string) g
        original = string
jumble g word
    | length word >= 4 = h ++ m ++ l
    | otherwise = word
    where
        h = [(head word)]
        m = (shuffleString g (middle word))
        l = [(last word)]
main = do
    g <- getStdGen
    putStrLn "Hello, what would you like to jumble?"
    text <- getLine
    -- let text = "Example text"
    let word_list = words text
    let jumbled = (map (jumble g) word_list)
    let output = unwords jumbled
    putStrLn output
This is pretty simple. You know that g has type StdGen, which is an instance of the RandomGen typeclass. The RandomGen typeclass has the functions next :: g -> (Int, g), genRange :: g -> (Int, Int), and split :: g -> (g, g). Two of these functions return a new random generator, namely next and split. For your purposes, you can use either quite easily to get a new generator, but I would just recommend using next for simplicity. You could rewrite your shuffleString function to something like
shuffleString :: RandomGen g => g -> String -> String
shuffleString g string =
    if shuffled == original
        then shuffleString (snd $ next g) shuffled
        else shuffled
    where
        shuffled = shuffle' string (length string) g
        original = string
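To see the generator being threaded through the retries without installing the random and random-shuffle packages, here is a self-contained sketch. Gen, nextInt, shuffleList and shuffleDifferent are toy stand-ins I made up for StdGen, next, shuffle' and shuffleString; they are not the real library API, and the generator is not a good one. The only point is that each retry receives the generator left over from the previous attempt.

```haskell
-- Toy linear congruential generator; a hypothetical stand-in for StdGen.
-- Assumes a small seed so the multiplication stays within a 64-bit Int.
newtype Gen = Gen Int

-- Plays the role of `next`: a number plus the advanced generator.
nextInt :: Gen -> (Int, Gen)
nextInt (Gen s) = (s' `div` 65536, Gen s')
  where
    s' = (s * 1103515245 + 12345) `mod` 2147483648

-- Pick a random remaining element, then shuffle the rest.
-- Returns the leftover generator so the caller can keep using it.
shuffleList :: Gen -> [a] -> ([a], Gen)
shuffleList g [] = ([], g)
shuffleList g xs = (picked : rest, g2)
  where
    (n, g1)                  = nextInt g
    (before, picked : after) = splitAt (n `mod` length xs) xs
    (rest, g2)               = shuffleList g1 (before ++ after)

-- Retry with the leftover generator until the order differs.
shuffleDifferent :: Eq a => Gen -> [a] -> [a]
shuffleDifferent _ [] = []
shuffleDifferent g xs@(x : _)
    | all (== x) xs  = xs          -- no different order exists
    | shuffled /= xs = shuffled
    | otherwise      = shuffleDifferent g' xs
  where
    (shuffled, g') = shuffleList g xs
```

The all (== x) xs guard is my addition: without it, a list whose elements are all equal (such as "oo", the middle of "book") would retry forever, because no shuffle can change it.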
End of answer to this question
One thing that might be relevant which I don't know yet is the difference between "x = value", "x <- value", and "let x = value".
These three different forms of assignment are used in different contexts. At the top level of your code, you can define functions and values using the simple x = value syntax. These statements are not being "executed" inside any context other than the current module, and most people would find it pedantic to have to write
module Main where
let main :: IO ()
    main = do
        putStrLn "Hello, World"
        putStrLn "Exiting now"
since there isn't any ambiguity at this level. It also helps to delimit this context since it is only at the top level that you can declare data types, type aliases, and type classes, these can not be declared inside functions.
The second form, let x = value, actually comes in two variants, the let x = value in <expr> inside pure functions, and simply let x = value inside monadic functions (do notation). For example:
myFunc :: Int -> Int
myFunc x =
    let y = x + 2
        z = y * y
    in z * z
Lets you store intermediate results, so you get a faster execution than
myFuncBad :: Int -> Int
myFuncBad x = (x + 2) * (x + 2) * (x + 2) * (x + 2)
But the former is also equivalent to
myFunc :: Int -> Int
myFunc x = z * z
    where
        y = x + 2
        z = y * y
There are subtle differences between let ... in ... and where ..., but you don't need to worry about them at this point, other than that the following is only possible using let ... in ..., not where ...:
myFunc x = (\y -> let z = y * y in z * z) (x + 2)
The let ... syntax (without the in ...) is used only in monadic do notation to perform much the same purpose, but usually using values bound inside it:
something :: IO Int
something = do
    putStr "Enter an int: "
    x <- getLine
    let y = myFunc (read x)
    return (y * y)
This simply allows y to be available to all subsequent statements in the function, and the in ... part is not needed because it's not ambiguous at this point.
The final form of x <- value is used especially in monadic do notation, and is specifically for extracting a value out of its monadic context. That may sound complicated, so here's a simple example. Take the function getLine. It has the type IO String, meaning it performs an IO action that returns a String. The types IO String and String are not the same, you can't call length getLine, because length doesn't work for IO String, but it does for String. However, we frequently want that String value inside the IO context, without having to worry about it being wrapped in the IO monad. This is what the <- is for. In this function
main = do
    line <- getLine
    print (length line)
getLine still has the type IO String, but line now has the type String, and can be fed into functions that expect a String. Whenever you see x <- something, the something is a monadic context, and x is the value being extracted from that context.
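To put the three forms side by side, here is a small sketch. The names greeting, shout and greet are invented for this illustration. greet takes the action to run as a parameter so that it can be tried with something other than getLine.

```haskell
-- Top level: plain `x = value`.
greeting :: String
greeting = "Hello"

-- Pure code: `let ... in ...` names an intermediate value.
shout :: String -> String
shout s =
    let suffix = "!"
    in s ++ suffix

-- do notation: `<-` runs an action and names its result,
-- while `let` (without `in`) names a pure value.
greet :: IO String -> IO String
greet readName = do
    name <- readName                        -- readName :: IO String, name :: String
    let message = greeting ++ ", " ++ name  -- ordinary pure expression
    return (shout message)
```

For example, greet getLine >>= putStrLn reads a name from standard input and prints the greeting, while greet (return "world") produces "Hello, world!" without reading anything.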
So why does Haskell have so many different ways of defining values? It all comes down to its type system, which tries really hard to ensure that you can't accidentally launch the missiles, or corrupt a file system, or do something you didn't really intend to do. It also helps to visually separate what is an action, and what is a computation in source code, so that at a glance you can tell if an action is being performed or not. It does take a while to get used to, and there are probably valid arguments that it could be simplified, but changing anything would also break backwards compatibility.
And that concludes today's episode of Way Too Much Information(tm)
(Note: To other readers, if I've said something incorrect or potentially misleading, please feel free to edit or leave a comment pointing out the mistake. I don't pretend to be perfect in my descriptions of Haskell syntax.)

How can I get a field from each element of a list of custom data types in Haskell?

First of all, if the title is confusing I apologise - I don't know how to phrase it.
I'm learning Haskell and tackling the Knapsack Problem but having a problem with list comprehension.
data Object = Item { name :: String,
                     weight:: Double,
                     profit :: Double,
                     efficiency :: Double }
    deriving (Read, Show)
I have a function that takes a list from a .csv file and calculates efficiency and sorts it:
getItemsAsList
    = do
        body <- readFile "items.csv"
        let ls = split '\n' body
        let lc = map (split ',') ls
        let itemList = map (loadItem) lc
        let sorted = sortItems efficiency itemList
        return sorted
Functions used:
loadItem :: [[Char]] -> Object
loadItem (n:ow:op:xs) = Item n w p (p/w)
    where
        w = read ow :: Double
        p = read op :: Double
sortItems :: Ord a => (t -> a) -> [t] -> [t]
sortItems fn [ ] = [ ]
sortItems fn (pivot:rest)
    = sortItems fn [x | x <- rest, (fn x) > (fn pivot)]
    ++ [pivot] ++
    sortItems fn [x | x <- rest, (fn x) <= (fn pivot)]
split :: Char -> [Char] -> [[Char]]
split _ [] = []
split delim str = if before == [] then
                      split delim (drop 1 remainder)
                  else
                      before: split delim (drop 1 remainder)
    where
        (before, remainder) = span (/=delim) str
What I am trying to do is write a function that will go through the list returned by the getItemsAsList function and get the value of the weight field from each element and sum them together. From this I can hopefully implement the greedy solution to the problem, once I understand how to get the elements.
Also, the getItemsAsList function returns IO [Object]
Thanks.
To get the weight from a single Object, you do weight obj. Thus, to get the weight from each element of a list of Objects, you do map weight objlist or [weight obj | obj <- objlist]. Also, the Prelude has a sum function which works exactly as you'd expect. Put them all together, and you're done.
You are treating the result of getItemsAsList, which is a monadic function, as a normal value instead of as an IO action.
The concept of a monad is usually explained as a box from which you can "unpack" the value (using the <- operator). When you call it from a pure function, you cannot unpack the value and are instead just left with the box (that is what IO [Object] is: an IO box containing an Object list value). You can, however, freely use pure functions from inside a monad.
The solution is to call and unpack the value of getItemsAsList from within a monad, and then pass it onto your other pure functions to carry out whatever the rest of your task is.
Once you have unpacked the list of objects from getItemsAsList using the <- operator, you can pass it into other pure functions.
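Putting both answers together, a minimal sketch might look like the following. getSampleItems is a hypothetical stand-in for getItemsAsList that returns a fixed list instead of reading items.csv, and totalWeight and reportTotalWeight are names I picked for illustration.

```haskell
data Object = Item
    { name       :: String
    , weight     :: Double
    , profit     :: Double
    , efficiency :: Double
    } deriving (Read, Show)

-- Pure: take the weight field of every item and add them up.
totalWeight :: [Object] -> Double
totalWeight = sum . map weight

-- Hypothetical stand-in for getItemsAsList (no file access here).
getSampleItems :: IO [Object]
getSampleItems = return
    [ Item "tent"  4.0 8.0 2.0
    , Item "stove" 1.5 3.0 2.0
    ]

-- Unpack the list inside IO with <-, then hand it to the pure function.
reportTotalWeight :: IO Double
reportTotalWeight = do
    items <- getSampleItems
    return (totalWeight items)
```

Keeping totalWeight pure means the greedy knapsack logic can be written and tested on plain lists, with only the outermost caller living in IO.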

Resources