Stop-and-copy garbage collector in OCaml - garbage-collection

I have a homework assignment to finish writing a stop-and-copy garbage collector in OCaml. There are 3 functions that need to be written. The first thing to know is that a 64-slot array named ram is what the garbage collector uses as memory. Each slot contains a value of type 'cell', which may be one of the following:
Object (id, size, references)
ObjData _
Free
FwdPointer (_)
I believe I am okay with the first function but the second function I need help with.
The function is:
let rec scan_tospace (free : int) (unscanned : int) =
Here is the objective of the function:
(* Scan To-space, copy all referenced objects to the To-space and
update references in objects. Recurse until the free pointer is
identical to the unscanned pointer.
[free] is the pointer to the next free address in the To-space,
[unscanned] is the address of the first unscanned object in the
To-space.
Return the address of the free pointer after all objects have been
scanned.
*)
What I want to do is pattern match on the element at the unscanned pointer. If it is an Object (x, y, z), I want to go through each element of the integer list z and pass it to the function 'let copy_obj (free : int) (addr : int) =' as the argument addr. The problem is that copy_obj takes 2 arguments, and I can't figure out how to also supply the other argument when calling List.iter like so:
List.iter obj_copy free z
I've also tried this as the result when it successfully matches with an object:
List.iter (fun k -> match k with
| int k -> copy_obj free k) z;
There I get this:
Error: This expression has type int * int
but an expression was expected of type unit
I didn't post any code on purpose, but if you'd like to see it I can post more. I didn't want to give any answers away. Also, I'm not looking for someone to write any code for me, which is another reason I didn't post much of it. Any ideas in the right direction would be very helpful, thanks!

List.iter obj_copy free z should be List.iter (obj_copy free) z, so that free is the first argument to obj_copy and the items in the list are the second. With this change you should get the same error as in your later code.
The problem here is the use of List.iter. As you go over the referenced objects and copy them, free has to change; otherwise you copy each object over the previous one. You also need to remember where the copied objects now are, so you can update the references in the outer object. Accordingly, copy_obj returns (I assume) a tuple of the object's new address and the new free.
You have to use List.fold_left, List.fold_right, or manual recursion to go through the list while keeping track of free and the copied objects.

Related

Need help storing the previous element of a list (Haskell)

I'm currently working on an assignment. I have a function called gamaTipo that converts the values of a tuple into a data type previously defined by my professor.
The problem is: in order for gamaTipo to work, it needs to receive some preceding element. gamaTipo is defined like this: gamaTipo :: Peca -> (Int,Int) -> Peca where Peca is the data type defined by my professor.
What I need to do is create a function that takes a list of tuples and converts it into the Peca data type. The part that I'm struggling with is getting the preceding element of the list. For example, let's say we have a list [(1,2),(3,4)] where the first element of the list, (1,2), always corresponds to Dirt Ramp (a data type defined by my professor). I have to create a function convert :: [(Int,Int)] -> [Peca] where, in order to calculate the element (3,4), I first need to translate (1,2) into Peca and use it as the previous element to translate (3,4).
Here's what I've tried so far:
updateTuple :: [(Int,Int)] -> [Peca]
updateTuple [] = []
updateTuple ((x,y):xs) = let previous = Dirt Ramp
                         in (gamaTipo previous (x,y)) : updateTuple xs
Although I get no error messages with this code, the output isn't what I expect. I'm also sorry if it's not easy to understand what I'm asking; English isn't my native tongue and it's hard to express myself. Thank you in advance! :)
If I understand correctly, your program needs to have a basic structure something like this:
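updateTuple :: [(Int, Int)] -> [Peca]
updateTuple = go initialValue
  where
    -- prev is the value computed from the preceding element
    go prev (xy:xys) =
      let next = getNextValue prev xy
      in prev : go next xys
    -- the result is a list, so the final value is wrapped in one
    go prev [] = [prev]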
Basically, what’s happening here is:
updateTuple is defined in terms of a helper function go. (Note that ‘helper function’ isn’t standard terminology, it’s just what I’ve decided to call it).
go has an extra argument, which is used to store the previous value.
The implementation of go can then make use of the previous value.
When go recurses, the recursive call can then pass the newly-calculated value as the new ‘previous value’.
This is a reasonably common pattern in Haskell: if a recursive function requires an extra argument, then a new function (often named go) can be defined which has that extra argument. Then the original function can be defined in terms of go.
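As a minimal sketch of this pattern, here is a version that can be loaded into GHCi. The real Peca and gamaTipo come from the assignment, so the Peca newtype, the body of gamaTipo, and initialPeca below are hypothetical stand-ins chosen only so the code compiles. It also shows that a go function of this shape behaves like scanl from the Prelude.

```haskell
-- Hypothetical stand-ins: the real Peca and gamaTipo are defined by the assignment.
newtype Peca = Peca Int deriving (Eq, Show)

-- Assumed behaviour, for illustration only: combine the previous value with the tuple.
gamaTipo :: Peca -> (Int, Int) -> Peca
gamaTipo (Peca h) (a, b) = Peca (h + a - b)

-- Assumed starting value (the question uses "Dirt Ramp" for this role).
initialPeca :: Peca
initialPeca = Peca 0

-- Explicit recursion: go carries the previously computed value along.
convert :: [(Int, Int)] -> [Peca]
convert = go initialPeca
  where
    go prev (xy : xys) =
      let next = gamaTipo prev xy
      in prev : go next xys
    go prev [] = [prev]

-- The same traversal expressed with scanl.
convertScan :: [(Int, Int)] -> [Peca]
convertScan = scanl gamaTipo initialPeca
```

With these stand-ins, convert [(1,2),(3,4)] evaluates to [Peca 0, Peca (-1), Peca (-2)], and convertScan gives the same list. If the initial value should not appear in the output, dropping the first element of the result is one option.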

Pass by Reference in Haskell?

Coming from a C# background, I would say that the ref keyword is very useful in certain situations where changes to a method parameter are desired to directly influence the passed value for value types, or for setting a parameter to null.
Also, the out keyword can come in handy when returning a multitude of various logically unconnected values.
My question is: is it possible to pass a parameter to a function by reference in Haskell? If not, what is the direct alternative (if any)?
There is no difference between "pass-by-value" and "pass-by-reference" in languages like Haskell and ML, because it's not possible to assign to a variable in these languages. Since "changes to a method parameter" are not possible in the first place, they cannot influence any passed variable.
It depends on context. Without any context, no, you can't (at least not in the way you mean). With context, you may very well be able to do this if you want. In particular, if you're working in IO or ST, you can use IORef or STRef respectively, as well as mutable arrays, vectors, hash tables, weak hash tables (IO only, I believe), etc. A function can take one or more of these and produce an action that (when executed) will modify the contents of those references.
Another sort of context, StateT, gives the illusion of a mutable "state" value implemented purely. You can use a compound state and pass around lenses into it, simulating references for certain purposes.
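For the IO case, a minimal sketch using Data.IORef from base might look like this. The names tripleRef and demo are hypothetical, chosen only for illustration.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Takes a reference and produces an action that triples its contents.
tripleRef :: IORef Int -> IO ()
tripleRef ref = modifyIORef' ref (* 3)

-- Creates a reference, lets tripleRef modify it, then reads it back.
demo :: IO Int
demo = do
  ref <- newIORef 4
  tripleRef ref
  readIORef ref
```

Running demo yields 12: the caller observes the change because it shares the reference with tripleRef, not because the Int itself was mutated.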
"My question is: is it possible to pass a parameter to a function by reference in Haskell? If not, what is the direct alternative (if any)?"
No, values in Haskell are immutable (well, the do notation can create some illusion of mutability, but it all happens inside a function and is an entirely different topic). If you want to change the value, you will have to return the changed value and let the caller deal with it. For instance, see the random number generating function next that returns the value and the updated RNG.
"Also, the out keyword can come in handy when returning a multitude of various logically unconnected values."
Consequently, you can't have out either. If you want to return several entirely disconnected values (at which point you should probably think why are disconnected values being returned from a single function), return a tuple.
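A small sketch of the "return the changed value and let the caller deal with it" style, in the same spirit as the RNG function mentioned above. The counter example and the names nextId and twoIds are hypothetical.

```haskell
-- Instead of mutating a counter through a ref parameter,
-- return the issued id together with the updated counter.
nextId :: Int -> (Int, Int)
nextId counter = (counter, counter + 1)

-- The caller threads the updated counter from one call to the next.
twoIds :: Int -> ((Int, Int), Int)
twoIds c0 =
  let (a, c1) = nextId c0
      (b, c2) = nextId c1
  in ((a, b), c2)
```

Here twoIds 10 evaluates to ((10, 11), 12). The tuple plays the role that ref or out would play in C#.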
No, it's not possible, because Haskell variables are immutable, therefore, the creators of Haskell must have reasoned there's no point of passing a reference that cannot be changed.
consider a Haskell variable:
let x = 37
In order to "change" this, all we can do is compute the modified value under a temporary name and then introduce a new binding called x that shadows the first one.
let tripleX = x * 3
let x = tripleX
If Haskell had pass by reference, could we do this?
The answer is no.
Suppose we tried:
tripleVar :: Int -> IO ()
tripleVar var = do
  let times_3 = var * 3
  let var = times_3
  return ()
The problem with this code is the line let var = times_3. Although we can imagine the variable being passed by reference, the new variable isn't.
In other words, we're introducing a new local variable with the same name.
Take a look again at that line:
  let var = times_3
Haskell doesn't know that we want to "change" the caller's variable; since we can't reassign it, we are creating a new variable with the same name in the local scope, thus not changing the reference. :-(
tripleVar :: Int -> IO ()
tripleVar var = do
  let tripleVar = var
  let var = tripleVar * 3
  return ()

main = do
  let x = 4
  tripleVar x
  print x -- 4 :(

Growing a list in haskell

I'm learning Haskell by writing an OSC musical sequencer to use it with SuperCollider. But because I'd like to make fairly complex stuff with it, it will work like a programming language where you can declare variables and define functions so you can write music in an algorithmic way. The grammar is unusual in that we're coding sequences and sometimes a bar will reference the last bar (something like "play that last chord again but a fifth above").
I don't feel satisfied with my own explanation, but that's the best I can do without getting too technical.
Anyway, what I'm coding now is the parser for that language. It is stateless so far, but now I need some way to implement a growing list of the declared variables and the like, using a dictionary in the [("key","value")] fashion, so I can add new values as I go parsing bar by bar.
I know this involves monads, which I don't really understand yet, but I need something meaningful enough to start toying with them or else I find the raw theory a bit too raw.
So what would be a clean and simple way to start?
Thanks and sorry if the question was too long.
Edit on how the thing works:
we input a string to the main parsing function, say
"afunction(3) ; anotherone(1) + [3,2,1]"
we identify closures first, then kinds of chars (letters, nums, etc) and group them together, so we get a list like:
[("word","afunction"),("parenth","(3)"),("space"," "),("semicolon",";"),("space"," "),("word","anotherone"),("parenth","(1)"),("space"," "),("opadd","+"),("space"," "),("bracket","[3,2,1]")]
then we use a function that tags all those tuples with the indices of the original string they occupy, like:
[("word","afunction",(0,8)),("parenth","(3)",(9,11)),("space"," ",(12,13)) ...]
then cut it in a list of bars, which in my language are separated using a semicolon, and then in notes, using commas.
And now I'm at the stage where those functions should be executed sequentially, but because some of them are reading or modifying previously declared values, I need to keep track of that change. For example, let's say the function f(x) moves the pitch of the last note by x semitones, so
f(9), -- from an original base value of 0 (say that's an A440) we go to 9
f(-2), -- 9-2 = 7, so a fifth from A
f(-3); -- 9-2-3, a minor third down from the last value.
etc
But sometimes it can get a bit more complicated than that, don't make me explain how cause I could bore you to death.
Adding an item to a list
You can make a new list that contains one more item than an existing list with the : constructor.
("key", "value") : existing
Where existing is a list you've already made
Keeping track of changing state
You can keep track of changing state between functions by passing the state from each function to the next. This is all the State monad is doing. State s a is a value of type a that depends on (and changes) a state s.
{-         ┌---- type of the state
           v v-- type of the value -}
data State s a = State { runState :: s -> (a, s) }
{-                                   ^  ^  ^  ^
                       a function ---|--┘  |  |
               that takes a state ---┘     |  |
               and returns                 |  |
      a value that depends on the state ---┘  |
                        and a new state ------┘ -}
The bind operation >>= for State takes a value that depends on (and changes) the state and a function to compute another value that depends on (and changes) the state and combines them to make a new value that depends on (and changes) the state.
m >>= k = State $ \s ->
  let ~(a, s') = runState m s
  in runState (k a) s'
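To tie this back to the growing dictionary, here is a small self-contained sketch. It defines its own State instead of importing one from a library, and Env, declare, lookupVar and example are hypothetical names that are not part of the parser described in the question.

```haskell
-- A hand-rolled State, matching the definition above.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s ->
    let (a, s') = g s
    in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  m >>= k = State $ \s ->
    let (a, s') = runState m s
    in runState (k a) s'

-- The dictionary of declared variables, in the [("key","value")] fashion.
type Env = [(String, String)]

-- Add a declaration to the front of the dictionary.
declare :: String -> String -> State Env ()
declare key value = State $ \env -> ((), (key, value) : env)

-- Read a declaration without changing the dictionary.
lookupVar :: String -> State Env (Maybe String)
lookupVar key = State $ \env -> (lookup key env, env)

-- Each step sees the dictionary as left by the previous step.
example :: State Env (Maybe String)
example = do
  declare "tempo" "120"
  declare "root" "A440"
  lookupVar "tempo"
```

Evaluating runState example [] gives (Just "120", [("root","A440"),("tempo","120")]). In a real project the State type from a library would normally replace the hand-rolled one; it is spelled out here only to keep the sketch free of dependencies.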

Why does concatenation of lists take O(n)?

According to the theory of ADTs (Algebraic Data Types) the concatenation of two lists has to take O(n) where n is the length of the first list. You, basically, have to recursively iterate through the first list until you find the end.
From a different point of view, one can argue that the second list can simply be linked to the last element of the first. This would take constant time, if the end of the first list is known.
What am I missing here ?
Operationally, a Haskell list is typically represented by a pointer to the first cell of a singly linked list (roughly). In this way, tail just returns the pointer to the next cell (it does not have to copy anything), and consing x : in front of the list allocates a new cell, makes it point to the old list, and returns the new pointer. The list accessed by the old pointer is unchanged, so there's no need to copy it.
If you instead append a value with ++ [x], then you cannot modify the original linked list by changing its last pointer unless you know that the original list will never be accessed. More concretely, consider
x = [1..5]
n = length (x ++ [6]) + length x
If you modified x when doing x++[6], the value of n would turn out to be 12, which is wrong. The last x refers to the unchanged list, which has length 5, so the result of n must be 11.
Practically, you can't expect the compiler to optimize this, even in those cases in which x is no longer used and it could, theoretically, be updated in place (a "linear" use). What happens is that the evaluation of x++[6] must be ready for the worst-case in which x is reused afterwards, and so it must copy the whole list x.
As @Ben notes, saying "the list is copied" is imprecise. What actually happens is that the cells with the pointers are copied (the so-called "spine" of the list), but the elements are not. For instance,
x = [[1,2],[2,3]]
y = x ++ [[3,4]]
only requires allocating [1,2],[2,3],[3,4] once. The lists of lists x and y will share pointers to the lists of integers, which do not have to be duplicated.
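The Prelude's ++ behaves like the following definition, named append here only to avoid clashing with the real operator. It makes the cost visible: one new cell is built per element of the first list, while the second list is reused as is.

```haskell
-- Behaves like (++): rebuilds the spine of the first list, shares the second.
append :: [a] -> [a] -> [a]
append []       ys = ys                 -- nothing left to copy; reuse ys
append (x : xs) ys = x : append xs ys   -- one new cell per element of the first list

-- The example from above: x is still the 5-element list afterwards.
n :: Int
n = length (append x [6]) + length x
  where
    x = [1 .. 5] :: [Int]
```

Here n evaluates to 11, since appending produced a new 6-element list and left x untouched.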
What you're asking for is related to a question I wrote for TCS Stackexchange some time back: the data structure that supports constant-time concatenation of functional lists is a difference list.
A way of handling such lists in a functional programming language was worked out by Yasuhiko Minamide in the 90s; I effectively rediscovered it a while back. However, the good run-time guarantees require language-level support that's not available in Haskell.
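For reference, the encoding of difference lists commonly used in Haskell represents a list as a function that prepends it to whatever comes next. This is a sketch of that function-composition encoding, not of the language-level approach mentioned above, and the names DList, fromList, toList and appendD are chosen for illustration.

```haskell
-- A list represented as "prepend me to the rest".
newtype DList a = DList ([a] -> [a])

fromList :: [a] -> DList a
fromList xs = DList (xs ++)

-- Converting back still costs time proportional to the total length.
toList :: DList a -> [a]
toList (DList f) = f []

-- Concatenation is just function composition, so it is O(1).
appendD :: DList a -> DList a -> DList a
appendD (DList f) (DList g) = DList (f . g)
```

For example, toList (fromList [1,2] `appendD` fromList [3] `appendD` fromList [4,5]) evaluates to [1,2,3,4,5]. The work of walking the lists is deferred to toList rather than avoided.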
It's because of immutable state. A list is an object + a pointer, so if we imagined a list as a Tuple it might look like this:
let tupleList = ("a", ("b", ("c", [])))
Now let's get the first item in this "list" with a "head" function. This head function takes O(1) time because we can use fst:
> fst tupleList
If we want to swap out the first item in the list with a different one we could do this:
let tupleList2 = ("x",snd tupleList)
Which can also be done in O(1). Why? Because absolutely no other element in the list stores a reference to the first entry. Because of immutable state, we now have two lists, tupleList and tupleList2. When we made tupleList2 we didn't copy the whole list. Because the original pointers are immutable we can continue to reference them but use something else at the start of our list.
Now let's try to get the last element of our 3 item list:
> fst . snd . snd $ tupleList
That took 3 steps, which is equal to the length of our list, i.e. O(n).
But couldn't we store a pointer to the last element in the list and access that in O(1)? To do that we would need an array, not a list. An array allows O(1) lookup of any element because its elements are stored contiguously in memory, so the address of any element can be computed directly from its index.
(ASIDE: If you're unsure of why we would use a Linked List instead of an Array then you should do some more reading about data structures, algorithms on data structures and Big-O time complexity of various operations like get, poll, insert, delete, sort, etc).
Now that we've established that, let's look at concatenation. Let's concat tupleList with a new list, ("e", ("f", [])). To do this we have to traverse the whole list just like getting the last element:
tupleList3 = (fst tupleList, (fst $ snd tupleList, (fst . snd . snd $ tupleList, ("e", ("f", [])))))
The above operation is actually worse than O(n) time, because for each element in the list we have to re-read the list up to that index. But if we ignore that for a moment and focus on the key aspect: in order to get to the last element in the list, we must traverse the entire structure.
You may be asking, why don't we just store in memory what the last list item is? That way appending to the end of the list would be done in O(1). But not so fast, we can't change the last list item without changing the entire list. Why?
Let's take a stab at how that might look:
data Queue a = Queue { last :: Queue a, head :: a, next :: Queue a} | Empty
appendEnd :: a -> Queue a -> Queue a
appendEnd a2 (Queue l h n) = ????
If I modify "last", which is an immutable variable, I won't actually be modifying the pointer for the last item in the queue. I will be creating a copy of the last item. Everything else that referenced the original item will continue referencing the original item.
So in order to update the last item in the queue, I have to update everything that has a reference to it, which takes O(n) time at best.
So in our traditional list, we have our final item:
List a []
But if we want to change it, we make a copy of it. Now the second last item has a reference to an old version. So we need to update that item.
List a (List a [])
But if we update the second last item we make a copy of it. Now the third last item has an old reference. So we need to update that. Repeat until we get to the head of the list. And we come full circle. Nothing keeps a reference to the head of the list so editing that takes O(1).
This is the reason that Haskell doesn't have doubly linked lists in the usual mutable sense. It's also why a "Queue" (or at least a FIFO queue) can't be implemented in the traditional way. Making a Queue in Haskell involves some serious re-thinking of traditional data structures.
If you become even more curious about how all of this works, consider getting the book Purely Functional Data Structures.
EDIT: If you've ever seen this: http://visualgo.net/list.html you might notice that in the visualization "Insert Tail" happens in O(1). But in order to do that we need to modify the final entry in the list to give it a new pointer. Updating a pointer mutates state which is not allowed in a purely functional language. Hopefully that was made clear with the rest of my post.
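As an illustration of the kind of re-thinking involved, one well-known purely functional design keeps a front list plus a reversed back list. This is only a sketch; TwoListQueue and its functions are hypothetical names, and the amortized O(1) cost per operation holds when old versions of the queue are not reused.

```haskell
import Data.List (unfoldr)

-- Front of the queue in order, back of the queue in reverse order.
data TwoListQueue a = TwoListQueue [a] [a]

emptyQ :: TwoListQueue a
emptyQ = TwoListQueue [] []

-- Adding at the end is a cons onto the reversed back list: O(1).
push :: a -> TwoListQueue a -> TwoListQueue a
push x (TwoListQueue front back) = TwoListQueue front (x : back)

-- Removing from the front; when the front runs out, reverse the back once.
pop :: TwoListQueue a -> Maybe (a, TwoListQueue a)
pop (TwoListQueue [] [])          = Nothing
pop (TwoListQueue [] back)        = pop (TwoListQueue (reverse back) [])
pop (TwoListQueue (x : front) back) = Just (x, TwoListQueue front back)

-- Pop everything, in FIFO order.
drain :: TwoListQueue a -> [a]
drain = unfoldr pop
```

For example, drain (push 3 (push 2 (push 1 emptyQ))) evaluates to [1,2,3]. No cell is ever mutated: push and pop return new queue values that share structure with the old ones.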
In order to concatenate two lists (call them xs and ys), we need to modify the final node in xs in order to link it to (i.e. point at) the first node of ys.
But Haskell lists are immutable, so we have to create a copy of xs first. This operation is O(n) (where n is the length of xs).
Example:
xs
|
v
1 -> 2 -> 3

1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7
^              ^
|              |
xs ++ ys       ys

Does Data.Map work in a pass-by-value or pass-by-reference way? (Better explanation inside)

I have a recursive function working within the scope of strictly defined interface, so I can't change the function signatures.
The code compiles fine, and even runs fine without error. My problem is that it produces a large result set, so it's very hard to test whether there is a semantic error.
My primary question is: In a sequence of function calls A to B to A to B to breaking condition, considering the same original Map is passed to all functions until breaking condition, and that some functions only return an Integer, would an insert on a Map in a function that only returns an Integer still be reflected once control is returned to the first function?
primaryFunc :: SuperType -> MyMap -> (Integer, MyMap)
primaryFunc (SubType1 a) mapInstance = do
  let returnInt = func1 a mapInstance
  (returnInt, mapInstance)
primaryFunc (SubType2 c) mapInstance = do
  let returnInt = primaryFunc_nonprefix_SuperType c mapInstance
  let returnSuperType = (Const returnInt)
  let returnTable = H.insert c returnSuperType mapInstance
  (returnInt, returnTable)
primaryFunc (ConstSubType d) mapInstance = do
  let returnInt = d
  (returnInt, mapInstance)

func1 :: SubType1 -> MyMap -> Integer
func1 oe vt = do
  --do stuff with input and map data to get return int
  returnInt = primaryFunc
  returnInt

func2 :: SubType2 -> MyMap -> Integer
func2 pe vt = do
  --do stuff with input and map data to get return int
  returnInt = primaryFunc
  returnInt
Your question is almost impossibly dense and ambiguous, but it should be possible to answer what you term your "primary" question from the simplest first principles of Haskell:
No Haskell function updates a value (e.g. a map). At most it can return a modified copy of its input.
Outside of the IO monad, no function can have side effects. No function can affect the value of any variable assigned before it was called; all it can do is return a value.
So if you pass a map as a parameter to a function, nothing the function does can alter your existing reference to that value. If you want an updated value, you can only get that from the output of a function to which you have passed the original value as input. New value, new reference.
Because of this, you should have absolute clarity at any depth within your web of functions about which value you are working with. Knowing this, you should be able to answer your own question. Frankly, this is such a fundamental characteristic of Haskell that I am perplexed that you even need to ask.
If a function only returns an integer, then any operations you perform on any values made available to the function can only affect the output - that is, the integer value returned. Nothing done within the function can affect anything else (short of causing the whole program to crash).
So if function A has a reference to a map and it passes this value to function B which returns an int, nothing function B does can affect A's copy of the map. If function B were allowed to secretly alter A's copy of the map, that would be a side effect. Side effects are not allowed.
You need to understand that Haskell does not have variables as you understand them. It has immutable values, references to immutable values and functions (which take inputs and return new outputs). Functions do not have variables which are in scope for other functions which might alter those variables on the fly. That cannot happen.
As an aside, not only does the code you posted show that you do not understand the basics of Haskell syntax, the question you asked shows that you haven't understood the primary characteristics of Haskell as a language. Not only are these fundamentals things which can be understood before having learned any syntax, they are things you need to know to make sense of the syntax.
If you have a deadline, meet it using a tool you do understand. Then go learn Haskell properly.
In addition, you will find that "an insert on a Map in a function that only returns an Integer" is nearly impossible to express. Yes, you can technically do it, as in
insert k v map `seq` 42 -- force an insert and throw away the result
but if you think that, for example:
let done = insert k v map in 42
does anything with the map, you're probably wrong.
In no case, however, is the original map altered.
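A minimal sketch of the difference, assuming Data.Map from the containers package (which ships with GHC). The function names and the "extra" key are hypothetical and unrelated to the interface in the question.

```haskell
import qualified Data.Map as M

-- Returns only an Integer: the map built by insert is local to this
-- function and is discarded when it returns.
countAfterInsert :: M.Map String Integer -> Integer
countAfterInsert m = fromIntegral (M.size (M.insert "extra" 0 m))

-- Returns the new map as well, so the caller can keep using it.
insertAndCount :: M.Map String Integer -> (Integer, M.Map String Integer)
insertAndCount m =
  let m' = M.insert "extra" 0 m
  in (fromIntegral (M.size m'), m')

-- A sample map for trying the two functions.
original :: M.Map String Integer
original = M.fromList [("a", 1)]
```

countAfterInsert original evaluates to 2, yet M.size original is still 1 afterwards. The only way for the caller to see the inserted key is to use the map returned by insertAndCount.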

Resources