Identity of simulation objects in Haskell

Identity of simulation objects in Haskell - haskell

Writing a simulation in an object-oriented language, each object has an identity--that is, a way to distinguish it from every other object in the simulation, even if other objects have the exact same attributes. An object retains its identity, no matter how much it changes over time. This is because each object has a unique location in memory, and we can express that location with pointers or references. This works even if you don't impose an additional identity system like GUIDs. (Which you would often do to support things like networking or databases which don't think in terms of pointers.)
I don't believe there is an equivalent concept in Haskell. So, would the standard approach be to use something like GUIDs?
Update to clarify the problem: Identity is an important concept in my problem domain for one reason: Objects have relationships to each other, and these must be preserved. For example, Haskell would normally say a red car is a red car, and all red cars are identical (provided color is the only attribute cars have). But what if each red car must be linked to its owner? And what if the owner can repaint his cars?
Final update synthesizing the answers: The consensus seems to be that you should only add identifiers to data types if some part of the simulation will actually use those identifiers, and there's no other way to express the same information. E.g. for the case where a Person owns multiple Cars, each of which has a color, a Person can keep a list of immutable Cars. That fully expresses the ownership relationship as long as you have access to the Person.
The Person may or may not need some kind of unique identifier. One scenario where this would occur is: There's a function that takes a Car and a collection of all Persons and imposes a ParkingTicket on the appropriate Person. The Car's color cannot uniquely identify the Person who gets the ticket. So we can give the Person an ID and have the Car store it.
But even this could potentially be avoided with a better design. Perhaps our Cars now have an additional attribute of type ParkingPosition, which can be evaluated as legal or illegal. So we pass the collection of Persons to a function that looks at each Person's list of Cars, checks each one's ParkingPosition, and imposes the ParkingTicket on that Person if appropriate. Since this function always knows which Person it's looking at, there's no need for the Car to record that info.
So in many cases, assigning IDs is not as necessary as it first may seem.

Why do you want to "solve" this non-problem? Object identity is a problem with OO languages which Haskell happily avoids.
In a world of immutable objects, two objects with identical values are the same object. Put the same immutable object twice into a list and you have two different objects wherever you want to see things that way (they "both" contribute to the total number of elements, for example, and they have unique indexes) without any of the problems that Java-style reference equality causes. You can even save that list to a database table and get two different rows, if you like. What more do you need?
UPDATE
Jarret, you seem to be convinced that the objects in your model must have genuinely separate identities just because real life ones would be distinct. However, in a simulation, this only matters in the contexts where the objects need to be differentiated. Generally, you only need unique identifiers in a context where objects must be differentiated and tracked, and not outside those contexts. So identify those contexts, map the lifecycle of an object that is important to your simulation (not to the "real" world), and create the appropriate identifiers.
You keep providing answers to your own questions. If cars have owners, then Bob's red car can be distinguished from Carol's red car. If bob can repaint his car, you can replace his red car with a blue car. You only need more if
Your simulation has cars without owners
You need to be able to distinguish between one ownerless red car and another.
In a simple model, 1 may be true and 2 not. In which case, all ownerless red cars are the same red car so why bother making them distinct?
In your missile simulation, why do missiles need to track their owning launchers? They're not aimed at their launchers! If the launcher can continue to control the missile after it is launched, then the launcher needs to know which missiles it owns but the reverse is not true. The missile just needs to know its trajectory and target. When it lands and explodes, what is the significance of the owner? Will it make a bigger bang if it was launched from launcher A rather than launcher B?
Your launcher can be empty or it can have n missiles still available to fire. It has a location. Targets have locations. At any one time there are k missiles in flight; each missile has a position, a velocity/trajectory and an explosive power. Any missile whose position is coincident with the ground should be transformed into an exploding missile, or a dud etc etc.
In each of those contexts, which information is important? Is the launcher identity really important at detonation time? Why? Is the enemy going to launch a retaliatory strike? No? Then that's not important information for the detonation. It probably isn't even important information after launch. Launching can simply be a step where the number of missiles belonging to Launcher A is decremented while the number of missiles in flight is incremented.
Now, you might have a good answer to these questions, but you should fully map your model before you start lumbering objects with identities they may not need.

My approach would be to store all state information in a data record, like
data ObjState = ObjState
{ objStName :: String
, objStPos :: (Int, Int)
, objStSize :: (Int, Int)
} deriving (Eq, Show)
data Obj = Obj
{ objId :: Int
, objState :: ObjState
} deriving (Show)
instance Eq Obj where
obj1 == obj2 = objId obj1 == objId obj2
And the state should be managed by the API/library/application. If you need true pointers to mutable structures, then there are built-in libraries for it, but they're considered unsafe and dangerous to use unless you know what you're doing (and even then, you have to be cautious). Check out the Foreign modules in base for more information.

In Haskell the concepts of values and identities are decoupled. All variables are simply immutable bindings to values.
There are a few types whose value is a mutable reference to another value, such as IORef, MVar and TVar, these can be used as identities.
You can perform identity checks by comparing two MVars and an equality check by comparing their referenced values.
An excellent talk by Rich Hickey goes in detail over the issue: http://www.infoq.com/presentations/Value-Values

You can always write:
> let mylist = take 25 $ cycle "a"
> mylist
"aaaaaaaaaaaaaaaaaaaaaaaaa"
> zip mylist [1..]
[('a',1),('a',2),('a',3),('a',4),('a',5),('a',6),('a',7),('a',8),('a',9),('a',10),
('a',11),('a',12),('a',13),('a',14),('a',15),('a',16),('a',17),('a',18),('a',19),
('a',20),('a',21),('a',22),('a',23),('a',24),('a',25)]
If we are not joking - save it as part of data
data MyObj = MyObj {id ::Int, ...}
UPDATED
If we want to work with colors and ids separately, we can do next in Haskell:
data Color = Green | Red | Blue deriving (Eq, Show)
data Car = Car {carid :: Int, color :: Color} deriving (Show)
garage = [Car 1 Green, Car 2 Green, Car 3 Red]
redCars = filter ((== Red) . color) garage
greenCars = filter ((== Green) . color) garage
paint2Blue car = car {color=Blue}
isGreen = (== Green) . color
newGarage = map (\car -> if isGreen car then paint2Blue car else car) garage
And see result in gchi:
> garage
[Car {carid = 1, color = Green},Car {carid = 2, color = Green},Car {carid = 3, color = Red}]
> redCars
[Car {carid = 3, color = Red}]
> greenCars
[Car {carid = 1, color = Green},Car {carid = 2, color = Green}]
> newGarage
[Car {carid = 1, color = Blue},Car {carid = 2, color = Blue},Car {carid = 3, color = Red}]

Related

Representing a chessboard in Haskell

I'm trying to implement a function in Haskell that returns a list containing all possible moves for the player who's up. The function's only argument is a String composed of an actual state of the board (in Forsyth-Edwards Notation ) followed by the moving player(b/w).
Notation example : rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w (starting board state)
A move is transmitted as a string of the form [origin]-[destination]. The destination is always a position of the form [column][row], where the lower left square is called a1 and the upper right square is called h8. A move would be for example the move "b3-c4". (no castling/En-passant).
In Java I would use a 2d Array for the Board, but in Haskell I can't find a similar solution (I'm new to functional programming).
What would be a good way/data structure to represent the chess board in?

There are two primary options for storing a board state. The first is a 2D list of Maybe, where a piece would be represented as, e.g. Just $ Piece Black King and a blank square would be represented as Nothing. This optimizes determining if a square is occupied over listing where pieces are (which might be important if you plan to add rendering later):
type Board = Vector (Vector (Maybe Piece))
data Piece = Piece { color :: Color
, type :: PieceType }
The second option is to store a list of pieces and their locations. This implementation is faster to enumerate the locations of all pieces, but slower to check if there is a piece on a particular square:
type Pieces = [Placement]
type Placement = { position :: Position
, piece :: Piece }
data Position =
Pos { rank :: Int
, file :: Int }
deriving (Show, Eq)
data Piece =
Piece { color :: Color
, ptype :: PieceType }
deriving Show
EDIT: It's worth noting that with an 8x8 grid and a maximum of 32 pieces on the board, the performance hit either way is going to be minimal unless you're doing a lot of calculations.

Data.Vector, has constant time lookup by index.
A chessboard can be represented as a Vector (Vector (Maybe Piece)). To define Piece, see ADTs

Proof there is no instance for a given UML-Diagram

Given the diagram in the top-right corner, I'm supposed to decide whether there is any valid instance of it. Now the given image is a counterproof by example ('wegen' means 'because of'). The counterproof uses the cardinality ('Mächtigkeit') of the objects.
I don't understand, why for example 2*|A| equals |C|, as in UML, A would be in relation with 2 objects of C (rel1). So for every A there have to be 2 C to make a valid instance. 2*|A| = |C| should therefore be |A| = 2*|C|.
Why is it the other way around?

2*|A| = |C| since there is double the amount of C objects compared to A because each A has two C associated.
|A| = |B| because they have a 1-1 relation
3*|C| = 2*|B| because each C has 3 B and each B has 2 C
(4) and (5) are just substitutions where the last gives a contradiction
q.e.d
P.S. As #ShiDoiSi pointed out there is no {unique} constraint in the multiplicities. This will make it possible to have multiple associations to the same instance. Ergo, you have 1-1 relations. So with that being the case you actually CAN have a valid instantiation of the model.
Now go and tell that to your teacher xD

Searching through a multi-branch graph and returning a path [C#]

In my situation I have territory objects. Each territory knows what other territories they are connected to through an array. Here is an visualization of said territories as they would appear on a map:
If you were to map out the connections on a graph, they would look like this:
So say I have a unit stationed in territory [b] and I want to move it to territory [e], I'm looking for a method of searching through this graph and returning a final array that represents the path my unit in territory [b] must take. In this scenario, I would be looking for it to return
[b, e].
If I wanted to go from territory [a] to territory [f] then it would return:
[a, b, e, f].
I would love examples, but even just posts pointing me in the right direction are appreciated. Thanks in advance! :)

Have you heard of Breadth-First Search (BFS) before?
Basically, you simply put your initial territory, "a" in your example, into an otherwise empty queue Q
The second data structure you need is an array of booleans with as many elements as you have territories, in this case 9. It helps with remembering which territories we have already checked. We call it V (for "visited"). It needs to be initialized as follows: All elements equal false except the one corresponding to the initial square. That is for all territories t, we have V[t] = false, but V[a] = true because "a" is already in the queue.
The third and final data structure you need is an array to store the parent nodes (i.e. which node we are coming from). It also has as many elements as you have territories. We call it P (for "parent") and every element points to itself initially, that is for all t in P, set P[t] = t.
Then, it is quite simple:
while Q is not empty:
t = front element in the queue (remove it also from Q)
if t = f we can break from while loop //because we have found the goal
for all neighbors s of t for which V[s] = false:
add s into the back of Q //we have never explored the territory s yet as V[s] = false
set V[s] = true //we do not have to visit s again in the future
//we found s because there is a connection from t to s
//therefore, we need to remember that in s we are coming from the node t
//to do this we simply set the parent of s to t:
P[s] = t
How do you read the solution now?
Simply check the parent of f, then the parent of that and then the parent of that and so on until you find the beginning. You will know what the beginning is once you have found an element which has itself as the parent (remember that we let them point to itself initially) or you can also just compare it to a.
Basically, you just need a empty list L, add f into it and then
while f != a:
f = P[f]
add f into L
Note that this obviously fails if there exists no path because f will never equal a.
Therefore, this:
while f != P[f]:
f = P[f]
add f into L
is a bit nicer. It exploits the fact that initially all territories point to themselves in P.
If you try this on paper with you example above, then you will end up with
L = [f, e, b, a]
If you simply reverse this list, then you have what you wanted.
I don't know C#, so I didn't bother to use C# syntax. I assume that you know it is easiest to index your territories with integers and then use an array to access them.
You will realize quite quickly why this works. It's called breadth-first search because you consider only neighbors of the territory "a" first, with trivially shortest path to them (only 1 edge) and only once you processed all these, then territories that are further away will appear in the queue (only 2 edges from the start now) and so on. This is why we use a queue for this task and not something like a stack.
Also, this is linear in the number of territories and edges because you only need to look at every territory and edge (at most) once (though edges from both directions).
The algorithm I have given to you is basically the same as https://en.wikipedia.org/wiki/Breadth-first_search with only the P data structure added to keep track where you are coming from to be able to figure out the path taken.

Solving maximizing problems in Alloy (or other optimization problems)

I've bought and read the Software Abstractions book (great book actually) a couple of months if not 1.5 years ago. I've read online tutorials and slides on Alloy, etc. Of course, I've also done exercises and a few models of my own. I've even preached for Alloy in some confs. Congrats for Alloy btw!
Now, I am wondering if one can model and solve maximizing problems over integers in Alloy. I don't see how it could be done but I thought asking real experts could give me a more definitive answer.
For instance, say you have a model similar to this:
open util/ordering[State] as states
sig State {
i, j, k: Int
}{
i >= 0
j >= 0
k >= 0
}
pred subi (s, s': State) {
s'.i = minus[s.i, 2]
s'.j = s.j
s'.k = s.k
}
pred subj (s, s': State) {
s'.i = s.i
s'.j = minus[s.j, 1]
s'.k = s.k
}
pred subk (s, s': State) {
s'.i = s.i
s'.j = s.j
s'.k = minus[s.k, 3]
}
pred init (s: State) {
// one example
s.i = 10
s.j = 8
s.k = 17
}
fact traces {
init[states/first]
all s: State - states/last | let s' = states/next[s] |
subi[s, s'] or subj[s, s'] or subk[s, s']
let s = states/last | (s.i > 0 => (s.j = 0 and s.k = 0)) and
(s.j > 0 => (s.i = 0 and s.k = 0)) and
(s.k > 0 => (s.i = 0 and s.j = 0))
}
run {} for 14 State, 6 Int
I could have used Naturals but let's forget it. What if I want the trace which leads to the maximal i, j or k in the last state? Can I constrain it?
Some intuition is telling me I could do it by trial and error, i.e., find one solution and then manually add a constraint in the model for the variable to be stricly greater than the one value I just found, until it is unsatisfiable. But can it be done more elegantly and efficiently?
Thanks!
Fred
EDIT: I realize that for this particular problem, the maximum is easy to find, of course. Keep the maximal value in the initial state as-is and only decrease the other two and you're good. But my point was to illustrate one simple problem to optimize so that it can be applied to harder problems.

Your intuition is right: trial and error is certainly a possible approach, and I use it regularly in similar situations (e.g. to find minimal sets of axioms that entail the properties I want).
Whether it can be done more directly and elegantly depends, I think, on whether a solution to the problem can be represented by an atom or must be a set or other non-atomic object. Given a problem whose solutions will all be atoms of type T, a predicate Solution which is true of atomic solutions to a problem, and a comparison relation gt which holds over atoms of the appropriate type(s), then you can certainly write
pred Maximum[ a : T ] {
Solution[a]
and
all s : T | Solution[s] implies
(gt[a,s] or a = s)
}
run Maximum for 5
Since Alloy is resolutely first-order, you cannot write the equivalent predicate for solutions which involve sets, relations, functions, paths through a graph, etc. (Or rather, you can write them, but the Analyzer cannot analyze them.)
But of course one can also introduce signatures called MySet, MyRelation, etc., so that one has one atom for each set, relation, etc., that one needs in a problem. This sometimes works, but it does run into the difficulty that such problems sometimes need all possible sets, relations, functions, etc., to exist (as in set theory they do), while Alloy will not, in general, create an atom of type MySet for every possible set of objects in univ. Jackson discusses this technique in sections 3.2.3 (see "Is there a loss of expressive power in the restriction to flat relations?"), 5.2.2 "Skolemization", and 5.3 "Unbounded universal quantifiers" of his book, and the discussion has thus far repaid repeated rereadings. (I have penciled in an additional index entry in my copy of the book pointing to these sections, under the heading 'Second-order logic, faking it', so I can find them again when I need them.)
All of that said, however: in section 4.8 of his book, Jackson writes "Integers are not actually very useful. If you think you need them, think again; ... Of course, if you have a heavily numerical problem, you're likely to need integers (and more), but then Alloy is probably not suitable anyway."

How to model a 2D world in Haskell

I'm making a game. The game consists of an infinite plane. Units must be on a discrete square, so they can be located with a simple Location { x :: Int, y :: Int }
There might be many kinds of Units. Some might be creatures, and some are just objects, like a block of stone, or wood (think 2d minecraft there). Many will be empty (just grass or whatever).
How would you model this in Haskell? I've considered doing the below, but what about Object vs Creature? they might have different fields? Normalize them all on Unit?
data Unit = Unit { x :: Int, y :: Int, type :: String, ... many shared properties... }
I've also considered having a location type
data Location = Location { x :: Int, y :: Int, unit :: Unit }
-- or this
data Location = Location { x :: Int, y :: Int }
data Unit = Unit { unitFields... , location :: Location }
Do you have any ideas? In an OO language, I probably would have had Location or Unit inherit from the other, and made the specific types of Unit inherit from each other.
Another consideration is this will be sending lots of these objects over the wire, so I'll need to serialize them to JSON for use on the client, and don't want to write tons of parsing boilerplate.

Location is just a simple two-dimensional Point type.
I would advise against tying Units to the location they're in; just use a Map Location Unit to handle the map between positions on the grid and what (if anything) is present there.
As far as the specific types of Unit go, I would, at the very least, recommend factoring common fields out into a data type:
data UnitInfo = UnitInfo { ... }
data BlockType = Grass | Wood | ...
data Unit
= NPC UnitInfo AgentID
| Player UnitInfo PlayerID
| Block UnitInfo BlockType
or similar.
Generally, factor common things out into their own data-type, and keep data as simple and "isolated" as possible (i.e. move things like "what location is this unit at?" into separate structures associating the two, so that individual data-types are as "timeless", reusable and abstract as possible).
Having a String for the "type" of a Unit is a strong antipattern in Haskell; it generally indicates you're trying to implement dynamic typing or OOP structures with data types, which is a bad fit.
Your JSON requirement complicates things, but this FAQ entry shows a good example of how to idiomatically achieve this kind of genericity in Haskell with no String typing or fancy type-class hacks, using functions and data-types as the primary units of abstraction. Of course, the former causes you problems here; it's hard to serialise functions as JSON. However, you could maintain a map from an ADT representing a "type" of creature or block, etc., and its actual implementation:
-- records containing functions to describe arbitrary behaviour; see FAQ entry
data BlockOps = BlockOps { ... }
data CreatureOps = CreatureOps { ... }
data Block = Block { ... }
data Creature = Creature { ... }
data Unit = BlockUnit Block | CreatureUnit Creature
newtype GameField = GameField (Map Point Unit)
-- these types are sent over the network, and mapped *back* to the "rich" but
-- non-transferable structures describing their behaviour; of course, this means
-- that BlockOps and CreatureOps must contain a BlockType/CreatureType to map
-- them back to this representation
data BlockType = Grass | Wood | ...
data CreatureType = ...
blockTypes :: Map BlockType BlockOps
creatureTypes :: Map CreatureType CreatureOps
This lets you have all the extensibility and don't-repeat-yourself nature of typical OOP structures, while keeping the functional simplicity and allowing simple network transfer of game states.
Generally, you should avoid thinking in terms of inheritance and other OOP concepts; instead, try to think of dynamic behaviour in terms of functions and composition of simpler structures. A function is the most powerful tool in functional programming, hence the name, and can represent any complex pattern of behaviour. It's also best not to let requirements like network play influence your basic design; like the example above, it's almost always possible to layer these things on top of a design built for expressiveness and simplicity, rather than constraints like communication format.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string