I already asked about the Cartesian product and disjoint union in Alloy here. There, I represented sets as unary predicates.
What if I simply want to take the disjoint union of two simple signatures in Alloy?
Suppose I have the following signatures:
sig A {}
sig B {}
I would like to define a relation from A to B ⊎ B, where I use ⊎ for disjoint union. Is this possible directly in Alloy?
I can think of two approaches. (But re-reading your question, I realize that I have no idea what your last paragraph means, so perhaps this is all irrelevant to your goals.)
First: a disjoint union labels each member with a flag so that you know which parent set it came from, and so that no element of the disjoint union belongs to both parent sets. If the point of the exercise is just to preserve that information, then ordinary union already does what you need here: in your example, the signatures A and B are disjoint by construction (distinct top-level signatures in Alloy never share atoms), and it's always possible to tell whether a given atom is in A or in B. So the first approach is simply to use the expression A + B.
Second: If A + B won't do, for reasons not given in the question, and you really really want a set of pairs, then define that set of pairs. In each pair, either the first element is from A and the second element is 1 (or some other flag) or else the first element is from B and the second element is 2 (or some other flag).
One way to write this would be:
{v : A + B, n : Int | (v in A and n = 1) or (v in B and n = 2) }
Another equivalent way would be:
{x : A, y : Int | y = 1}
+
{x : B, y : Int | y = 2}
A third way is even simpler:
{v : A, n : 1} + {v : B, n : 2}
And simpler yet:
(A -> 1) + (B -> 2)
Like any expression, this can be packaged in a function:
fun du[Left, Right : set univ] : (Left + Right) -> Int {
(Left -> 1) + (Right -> 2)
}
And then the disjoint union of A and B can be written du[A, B].
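For comparison, the tagged-pair encoding above is precisely a sum type in a functional language; here is a minimal Haskell sketch of the same idea (the du name is reused from the Alloy function, and lists stand in for sets):
-- Either a b is Haskell's disjoint union of a and b; the Left/Right
-- tags play the role of the 1/2 flags used above.
du :: [a] -> [b] -> [Either a b]
du xs ys = map Left xs ++ map Right ys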
I repeat my advice to spend some time learning about comprehensions.
Given a two-place data constructor, I can partially apply it to one argument then apply that to the second. Why can't I use the same syntax for pattern matching?
data Point = MkPoint Float Float
x = 1.0 :: Float; y = 2.0 :: Float
thisPoint = ((MkPoint x) y) -- partially apply the constructor
(MkPoint x1 y1) = thisPoint -- pattern match OK
((MkPoint x2) y2) = thisPoint -- 'partially apply' the pattern, but rejected: parse error at y2
((MkPoint x3 y3)) = thisPoint -- this is accepted, with doubled parens
Why do I want to do that? I want to grab the constructor and first arg as an as-pattern, so I can apply it to a different second arg. (Yes, the workaround in this example is easy. Realistically I have a much more complex pattern, with several args, of which I want to split out the last.):
(mkPx@(MkPoint x4) y4) = thisPoint -- also parse error
thatPoint = mkPx (y4 * 2)
I think there's no fundamental reason to prevent this kind of match.
Certainly it wouldn't do to allow you to write
f (MkPoint x1) = x1
and have that match a partially-applied constructor, i.e. a function. So, one reason to specify it as it was specified here is for simplicity: the RHS of an @ has to be a pattern. Simple, easy to parse, easy to understand. (Remember, the very first origins of the language were to serve as a testbed for PL researchers to tinker. Simple and uniform is the word of the day for that purpose.) MkPoint x1 isn't a pattern, therefore mkPx@(MkPoint x1) isn't allowed.
I suspect that if you did the work needed to carefully specify what is and isn't allowed, wrote up a proposal, and volunteered to hack on the parser and desugarer as needed, the GHC folks would be amenable to adding a language extension. Seems like a lot of work for not much benefit, though.
Perhaps record update syntax will scratch your itch with much less effort.
data Point = MkPoint {x, y :: Float}
m@(MkPoint { x = x5 }) = m { x = x5 + 1 }
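A runnable version of that suggestion, for reference (the bumpX and main names are mine; note that the update needs to live in a function, since a top-level pattern binding whose right-hand side mentions m itself would be circular):
data Point = MkPoint { x, y :: Float } deriving Show

bumpX :: Point -> Point
bumpX m@(MkPoint { x = x5 }) = m { x = x5 + 1 } -- as-pattern plus record update

main :: IO ()
main = print (bumpX (MkPoint 1.0 2.0)) -- Point {x = 2.0, y = 2.0}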
You also indicate that, aside from the motivation, you wonder what part of the Report says that the pattern you want can't happen. The relevant grammar productions from the Report are here:
pat  → lpat
lpat → apat
     | gcon apat1 … apatk    (arity gcon = k, k ≥ 1)
apat → var [ @ apat ]        (as pattern)
     | gcon                  (arity gcon = 0)
     | ( pat )               (parenthesized pattern)
(I have elided some productions that don't really change any of the following discussion.)
Notice that as-patterns must have an apat on their right-hand side. The apats shown are variables, 0-arity constructors (which cannot be partially applied), or parenthesized pats. The lpat production shown above indicates that for a constructor of arity k, there must be exactly k apat fields. Since MkPoint has arity 2, MkPoint x is therefore not an lpat, and so (MkPoint x) is not an apat, and so m@(MkPoint x) is not an apat (and so not produced by pat → lpat → apat).
I can partially apply [a constructor] to one argument then apply that to the second.
thisPoint = ((MkPoint x) y) -- partially apply the constructor
There's another way to achieve that without parens; I can also permute the arguments:
thisPoint = MkPoint x $ y
thisPoint = flip MkPoint y $ x
Would I expect to be able to pattern match on that? No, because flip and ($) are just arbitrary functions/operators.
I want to grab the constructor and first arg as an as-pattern, ...
What's special about the first arg? Or the all-but-last args (since you indicate your real application is more complex)? Would you expect to be able to grab the constructor plus the third and fourth args as an as-pattern?
Haskell's pattern matching wants to keep it simple. If you want a binding to the constructor applied to an arbitrary subset of arguments, use a lambda expression mentioning your previously-bound var(s):
mkPy = \ x5 -> MkPoint x5 y1 -- y1 bound as in your question
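To make the workaround concrete, here is a small self-contained sketch (the names are mine): bind every field of the constructor, then rebuild the "partially applied" constructor as an ordinary function of the last argument.
data Point = MkPoint Float Float deriving Show

thisPoint :: Point
thisPoint = MkPoint 1.0 2.0

main :: IO ()
main = do
  let MkPoint x4 y4 = thisPoint
      mkPx = MkPoint x4         -- what the rejected as-pattern tried to name
      thatPoint = mkPx (y4 * 2)
  print thatPoint               -- MkPoint 1.0 4.0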
f x zero = Nothing
f x y = Just $ x / y
  where zero = 0
The where-bound identifier zero simply matches everything, after GHC emits the warning "Pattern match(es) are overlapped".
That's how Haskell's syntax works; every lowercase-initial variable name in a pattern (re)binds that name. Any existing binding will be shadowed.
But even if that weren't the case, the binding for zero would not be visible to the first alternative, because of how Haskell's syntax works. A similar thing happens in the following version:
f = \v1 v2 -> case (v1, v2) of
  (x, zero) -> Nothing
  (x, y) -> Just $ x / y
    where zero = 0
The where clause only applies to the one alternative that it's part of, not to the whole list of alternatives. That code is pretty much the same thing as
f = \v1 v2 -> case (v1, v2) of
  (x, zero) -> Nothing
  (x, y) -> let zero = 0 in Just $ x / y
If bound identifiers had different semantics than unbound identifiers in a pattern match, that could be quite error prone as binding a new identifier could mess up pattern matches anywhere that identifier is in scope.
For example, let's say you're importing some module Foo (unqualified), and the module Foo is changed to add the binding x = 42 for some reason. Now in your pattern match you'd suddenly be comparing the first argument against 42 rather than binding it to x. That's a pretty hard-to-find bug.
So to avoid this kind of scenario, identifier patterns have the same semantics regardless of whether they're already bound somewhere.
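A minimal sketch of that uniform behavior (the names here are mine): even with x bound at the top level, a pattern variable x simply shadows it.
x :: Int
x = 42             -- imagine this binding arrived via an unqualified import

f :: Int -> Int
f x = x + 1        -- this x is a fresh binding that shadows the top-level x

main :: IO ()
main = print (f 7) -- 8, regardless of the top-level x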
Because such matches would be very fragile. What does this compute?
f x y z = 2*x + 3*y + z
Would you expect this to be equal to
f x 3 z = 2*x + 9 + z
f _ _ _ = error "non-exhaustive patterns!"
only because there's a y = 3 defined somewhere in the same 1000+ line module?
Also consider this:
import SomeLibrary
f x y z = 2*x + 3*y + z
What if in a future release SomeLibrary defines y? We don't want that to suddenly stop working.
Finally, what if there is no Eq instance for y?
y :: a -> a
y = id
f :: a -> (a -> a) -> a
f x y = y x
f x w = w (w x)
Sure, it is a contrived example, but there's no way the runtime can compare the input function to check whether it is equal to y or not.
To disambiguate the two cases, some newer languages like Swift use two different syntaxes. E.g. (pseudo-code):
switch someValue {
  case .a(x) : ... // compare by equality using the outer x
  case .b(let x) : ... // redefine x as a new local variable, shadowing the outer one
}
zero is just a variable that occurs inside a pattern, just like y does in the second line. There is no difference between the two. When a variable occurs inside a pattern, it introduces a new variable. If there was already a binding for that variable, the new variable shadows the old one.
So you cannot use an already bound variable inside a pattern. Instead, you should do something like this:
f x y | y == zero = Nothing
  where zero = 0
f x y = Just $ x / y
Notice that I also moved the where clause to bring zero into scope for the first equation.
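For reference, a runnable variant that keeps everything in one equation, so a single where clause scopes over both guards (the type signature and main are my additions):
f :: Double -> Double -> Maybe Double
f x y
  | y == zero = Nothing
  | otherwise = Just (x / y)
  where zero = 0

main :: IO ()
main = do
  print (f 1 0) -- Nothing
  print (f 1 2) -- Just 0.5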
I found this statement while studying Functional Reactive Programming, in "Plugging a Space Leak with an Arrow" by Hai Liu and Paul Hudak (page 5):
Suppose we wish to define a function that repeats its argument indefinitely:
repeat x = x : repeat x
or, in lambdas:
repeat = λx → x : repeat x
This requires O(n) space. But we can achieve O(1) space by writing instead:
repeat = λx → let xs = x : xs
              in xs
The difference here seems small, but it hugely improves the space efficiency. Why and how does that happen? My best guess is to evaluate them by hand:
r = \x -> x: r x
r 3
-> 3: r 3
-> 3: 3: 3: ........
-> [3,3,3,......]
As above, we need to create infinitely many new thunks for this recursion. Then I try to evaluate the second one:
r = \x -> let xs = x:xs in xs
r 3
-> let xs = 3:xs in xs
-> xs, according to the definition above:
-> 3:xs, where xs = 3:xs
-> 3:xs:xs, where xs = 3:xs
In the second form, xs appears once and can be shared among all the places where it occurs, so I guess that's why it requires only O(1) space rather than O(n). But I'm not sure whether I'm right or not.
BTW: the term "shared" comes from page 4 of the same paper:
The problem here is that the standard call-by-need evaluation rules
are unable to recognize that the function:
f = λdt → integralC (1 + dt) (f dt)
is the same as:
f = λdt → let x = integralC (1 + dt) x in x
The former definition causes work to be repeated in the recursive call
to f, whereas in the latter case the computation is shared.
It's easiest to understand with pictures:
The first version
repeat x = x : repeat x
creates a chain of (:) constructors ending in a thunk which will replace itself with more constructors as you demand them. Thus, O(n) space.
The second version
repeat x = let xs = x : xs in xs
uses let to "tie the knot", creating a single (:) constructor which refers to itself.
Put simply, variables are shared, but function applications are not. In
repeat x = x : repeat x
it is a coincidence (from the language's perspective) that the (co)recursive call to repeat is with the same argument. So, without additional optimization (which is called static argument transformation), the function will be called again and again.
But when you write
repeat x = let xs = x : xs in xs
there are no recursive function calls. You take an x, and construct a cyclic value xs using it. All sharing is explicit.
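For what it's worth, the standard library relies on the same knot-tying trick. GHC's Prelude defines repeat essentially as in the second version, and Data.List's cycle the same way (the primed names below are mine, to avoid clashing with the Prelude; cycle's empty-list error case is elided):
repeat' :: a -> [a]
repeat' x = xs
  where xs = x : xs    -- one self-referential (:) cell

cycle' :: [a] -> [a]
cycle' xs = xs'
  where xs' = xs ++ xs' -- one cyclic structure, shared by all consumers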
If you want to understand it more formally, you need to familiarize yourself with the semantics of lazy evaluation, such as A Natural Semantics for Lazy Evaluation.
Your intuition about xs being shared is correct. To restate the author's example in terms of repeat, instead of integral, when you write:
repeat x = x : repeat x
the language does not recognize that the repeat x on the right is the same as the value produced by the expression x : repeat x. Whereas if you write
repeat x = let xs = x : xs in xs
you're explicitly creating a structure that when evaluated looks like this:
{hd: x, tl: •}
 ^          |
 \__________/
This code:
y :: Int
y = y + 1
When executed, it causes GHCi to hang.
y :: Int; this means y is of type Int
y = y + 1; this means y is defined to be an Int + 1
Please correct me if my reading of these statements is incorrect.
Why does y not evaluate?
Is the reason that y is being added to an Int, but it's being added to a type rather than to a value?
That's because it recurses infinitely. You are evaluating y, which is defined as y + 1. So how will the evaluation proceed?
It goes like this:
y
y + 1
(y + 1) + 1
((y + 1) + 1) + 1
and so on...
Speaking more broadly, a Haskell file (or GHCi session) does NOT contain a list of imperatives/orders, like some other programming languages; it is a different style of programming language. Instead, there are a handful of kinds of top-level statements which you have access to (collected in the sketch after this list):
You can define values. y = y + 1 defines the symbol y to be an application of the function (+) to two other parameters, y and 1. This definition holds throughout the file, including above the definition and within the definition itself. So you can totally write y = x + 1 and then x = 2 in a .hs file and ask GHCi for y, and it will say 3. Note that this gets more complicated with the let keyword, which forms a "wall" around this expansive nature of definition: let accepts a block of definitions and a value-expression, scopes those definitions to the combined (block-of-definitions, value-expression) context, and walls them off from the world outside the let. So this is valid, too:
Prelude> let y = x + 1; x = 2
Prelude> y
3
You can define data structures and their constructors. A constructor is a special function which we allow to participate in pattern matching: in other words, Haskell knows how to invert or destructure every constructor. You can also define a type synonym, or a newtype, which is halfway in between the two.
You can provide metadata about individual values (type declarations). These are really helpful for narrowing down where a type error is because they set up a "wall" for the type inference algorithm. They also can have semantic effects either in adding polymorphism (Haskell has a "monomorphism restriction" which often bites newcomers) or in restricting that polymorphism to a concrete type.
You can provide metadata about the package as a whole: both how it incorporates other packages (import statements) and how it can be used by other packages (module statements).
None of these are orders that you're giving to the Haskell system; instead your file is all one big description of a module. Similarly within an expression (the first part above) there are only a couple of things you can do, and they are not usually imperatives: you can apply values to other values, create functions, you can create local definitions (let) and pattern-match (case), and you can locally add type metadata. Everything else, including do notation, is just a more convenient way ("syntactic sugar") to do those things above.
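To make the taxonomy concrete, here is a sketch of a module containing each kind of top-level statement mentioned above (all the names are mine):
module Example (shout, y) where -- module statement: how others may use this module

import Data.Char (toUpper)      -- import statement: incorporate another module

data Color = Red | Green        -- a data structure with two constructors

shout :: String -> String       -- a type declaration: metadata about a value
shout s = map toUpper s         -- a value definition

y :: Int
y = x + 1                       -- definitions may refer to later ones...

x :: Int
x = 2                           -- ...because the file is one big description, not a sequence of orders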
Your two statements are a type declaration ("the y defined by this module will have type Int") and a definition ("to compute a y as defined by this module, first compute the value of a y, then add one to it"). Haskell reads both of them together and says, "oh, y has type Int, so (+) is the specific Int-plus operation that I know, (+) :: Int -> Int -> Int, and then 1 is the specific Int of that name which I know...". It will then confirm that the types are self-consistent and produce some imperative code which loops forever.
Haskell has no mutable variables, only constants; thus you cannot use the style of other languages, where an updatable value refers to its previous one. It does mean, however, that you can do some pretty awesome things, which you've stumbled upon here.
Take this declaration as an example:
myList = 1 : myList
When evaluated, this will refer to itself, thus doing this:
myList = 1 : myList -- but I'm referring to myList, so...
myList = 1 : (1 : myList) -- but I'm referring to myList, so...
myList = 1 : (1 : (1 : myList)) -- but I'm referring to myList, so...
myList = 1 : 1 : 1 : 1 : 1 : 1... -- etc.
The same goes for your constant y:
y = y + 1 -- but I'm referring to y, so...
y = y + 1 + 1 -- but I'm referring to y, so...
y = y + 1 + 1 + 1 -- but I'm referring to y, so...
y = y + 1 + 1 + 1 + 1 ... -- etc.
GHCi can never completely evaluate the value of y, because it's infinite, causing GHCi to hang.
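A hedged side-by-side sketch of the difference (assuming GHC; the compiled runtime can sometimes detect the self-reference and report the exception <<loop>> instead of hanging):
myList :: [Int]
myList = 1 : myList -- productive: each step yields a (:) cell to consume

y :: Int
y = y + 1           -- not productive: evaluating y demands y itself first

main :: IO ()
main = print (take 3 myList) -- [1,1,1]; `print y` would never finish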
I want to define a function that considers its equally-typed arguments without considering their order. For example:
weirdCommutative :: Int -> Int -> Int
weirdCommutative 0 1 = 999
weirdCommutative x y = x + y
I would like this function to actually be commutative.
One option is adding the pattern:
weirdCommutative 1 0 = 999
or even better:
weirdCommutative 1 0 = weirdCommutative 0 1
Now let's look at the general case: there could be more than two arguments and/or more than two values that need to be considered without order, so covering all possible cases becomes tricky.
Does anyone know of a clean, natural way to define commutative functions in Haskell?
I want to emphasize that the solution I am looking for is general and cannot assume anything about the type (nor its deconstruction type-set) except that values can be compared using == (meaning that the type is in the Eq typeclass but not necessarily in the Ord typeclass).
There is actually a package that provides a monad and some scaffolding for defining and using commutative functions. Also see this blog post.
In a case like Int, you can simply order the arguments and feed them to a (local) partial function that only accepts the arguments in that canonically ordered form:
weirdCommutative x y
  | x > y     = f y x
  | otherwise = f x y
  where f 0 1   = 999
        f x' y' = x' + y'
Now, obviously most types aren't in the Ord class – but if you're deconstructing the arguments by pattern-matching, chances are you can define at least some partial ordering. It doesn't really need to be >.
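If the type really is only in Eq, as the question insists, you can't canonicalize by sorting, but you can still test whether the arguments form a given multiset using only (==). A hedged sketch (the isPermutationOf helper is mine):
-- True iff the second list is a rearrangement of the first, using only (==).
isPermutationOf :: Eq a => [a] -> [a] -> Bool
isPermutationOf [] [] = True
isPermutationOf (p:ps) xs = case break (== p) xs of
  (before, _:after) -> isPermutationOf ps (before ++ after)
  (_, [])           -> False
isPermutationOf _ _ = False

weirdCommutative :: Int -> Int -> Int
weirdCommutative x y
  | [x, y] `isPermutationOf` [0, 1] = 999
  | otherwise                       = x + y
This costs quadratically many comparisons per special case, but it works for any Eq type and generalizes to more than two arguments.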