How do I make a Record Type bit addressable in Haskell? - haskell

I have a record type that is 4 Word32.
data MyType = MyType {a :: Word32, b :: Word32, c :: Word32, d :: Word32 }
Most of the time, I want to treat this type as 4 separate Word32. However, sometimes I wish to treat it as a single stream of binary data (128 bits long, the concatenation of the 4 Word32). I know that in Python, I would write different accessor functions for this "structure", so that I could read/modify it in both ways. But this is Haskell. I am wondering how an experienced Haskeller would go about this?

There's a class for that :-)
import Data.Bits
newtype MyWord128 = MyWord128 MyType
instance Num MyWord128 where
-- implement this one
instance Bits MyWord128 where
-- and then this one, which is what you really want
Check out the documentation for Data.Bits. A complete minimal definition is to provide an implementation of .&., .|., complement, shift, rotate, bitSize and isSigned (or one of a few other possible combinations; see the documentation for details, since the exact set depends on your version of base). Annoyingly, with older versions of base you also have to implement Num, although it's not entirely clear to me why they defined it that way; from base 4.6 onwards Num is no longer a superclass of Bits.

If you really want it to be like a struct of four word32's, you might want to use strict/unpacked fields:
data MyType = MyType { a :: {-# UNPACK #-} !Word32
                     , b :: {-# UNPACK #-} !Word32
                     , c :: {-# UNPACK #-} !Word32
                     , d :: {-# UNPACK #-} !Word32 }
  deriving (Show)
Then, let's define a couple of bit-fiddling functions:
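-- The Num constraint is needed for the literal 1 and for (-)
-- (Bits no longer implies Num in base >= 4.6).
mask :: (Bits a, Num a) => Int -> a
mask count = (1 `shiftL` count) - 1
bitRange :: (Bits a, Num a) => Int -> Int -> a -> a
bitRange low count val = (val `shiftR` low) .&. mask count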
Now you can just write 128-bit accessors for this type:
from128 :: Integer -> MyType
from128 val = MyType (bitsFrom 0)
                     (bitsFrom 32)
                     (bitsFrom 64)
                     (bitsFrom 96)
  where
    bitsFrom i = fromIntegral (bitRange i 32 val)
to128 :: MyType -> Integer
to128 (MyType a b c d) =
  foldl' (.|.) 0 [
    bitsTo a 0,
    bitsTo b 32,
    bitsTo c 64,
    bitsTo d 96
  ]
  where
    bitsTo val i = fromIntegral val `shiftL` i
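As a quick check of the idea, here is a minimal, self-contained sketch of the same packing scheme. The names Quad, toInteger128, fromInteger128 and bitAt are made up for this example (they are not part of the answer's code), and it assumes the first field holds the least significant 32 bits, as in from128 and to128.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import Data.Word (Word32)

-- Hypothetical stand-in for MyType: four strict 32-bit words.
data Quad = Quad !Word32 !Word32 !Word32 !Word32
  deriving (Eq, Show)

-- Concatenate the four words into one Integer; the first field ends up
-- in bits 0..31, the last one in bits 96..127.
toInteger128 :: Quad -> Integer
toInteger128 (Quad a b c d) =
      fromIntegral a
  .|. (fromIntegral b `shiftL` 32)
  .|. (fromIntegral c `shiftL` 64)
  .|. (fromIntegral d `shiftL` 96)

-- Split an Integer back into four words; bits above 127 are ignored.
fromInteger128 :: Integer -> Quad
fromInteger128 n = Quad (word 0) (word 32) (word 64) (word 96)
  where
    word i = fromIntegral ((n `shiftR` i) .&. 0xFFFFFFFF)

-- Address a single bit of the 128-bit view.
bitAt :: Quad -> Int -> Bool
bitAt q i = testBit (toInteger128 q) i
```

Once the value is an Integer, the usual Data.Bits operations address any of the 128 bits; for example, bit 32 is the lowest bit of the second field.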
For the a b c d fields, you can just use fclabels. You can also make an fclabel bijective Functor (:<->:):
myType128 :: MyType :<->: Integer
myType128 = to128 :<->: from128

Related

fromInteger is a cast?

I'm looking at this, as well as contemplating the whole issue of non-decimal literals, e.g., 1, being just sugar for fromInteger 1 and then I find the type is
λ> :t 1
1 :: Num p => p
This and the statement
An integer literal represents the application of the function
fromInteger to the appropriate value of type Integer.
have me wondering what is really going on. Likewise,
λ> :t 3.149
3.149 :: Fractional p => p
Richard Bird says
A floating-point literal such as 3.149 represents the application of
fromRational to an appropriate rational number. Thus 3.149 :: Fractional a => a
Not understanding what the application of fromRational to an appropriate rational number means. Then he says this is all necessary to be able to add, e.g., 42 + 3.149.
I feel there's a lot going on here that I just don't understand. Like there's too much hand-waving for me. It seems like a cast of an unidentified non-decimal or decimal to specific types, Integer and Rational. So first, why is 1 actually fromInteger 1 internally? I realize every expression must be evaluated as a type, but why is fromInteger and fromRational involved?
Auxiliary
So at this page
The workhorse for converting from integral types is fromIntegral,
which will convert from any Integral type into any Numeric type (which
includes Int, Integer, Rational, and Double): fromIntegral :: (Num b, Integral a) => a -> b
Then comes the example
λ> sqrt 1
1.0
λ> sqrt (1 :: Int)
... error...
λ> sqrt (fromInteger 1)
1.0
λ> :t sqrt 1
sqrt 1 :: Floating a => a
λ> :t sqrt (1 :: Int)
...error...
λ> :t sqrt
sqrt :: Floating a => a -> a
λ> :t sqrt (fromInteger 1)
sqrt (fromInteger 1) :: Floating a => a
So yes, this is a cast, but I don't know the mechanism of how fromI* is doing this --- since technically it's not a cast in a C/C++ sense. All instances of Num must have a fromInteger. It seems like under the hood Haskell is taking whatever you put in and generic-izing it to Integer or Rational, then "giving it back" to the original function, e.g., with sqrt (fromInteger 1) being of type Floating a => a. This is very mysterious to someone prone to over-thinking.
So yes, 1 is a literal, a constant that is polymorphic. It may represent 1 in any type that instantiates Num. The role of fromInteger must be to allow a value (a cast) to be extracted from an integer constant, consistent with what the situation calls for. But this is hand-waving talk at some point. I don't get how this is actually happening.
Perhaps this will help...
Imagine a language, like Haskell, except that the literal program text 1 represents a term of type Integer with value one, and the literal program text 3.14 represents a term of type Rational with value 3.14. Let's call this language "AnnoyingHaskell".
To be clear, when I say "represents" in the above paragraph, I mean that the AnnoyingHaskell compiler actually compiles those literals into machine code that produces an Integer term whose value is the number 1 in the first case, and a Rational term whose value is the number 3.14 in the second case. An Integer is -- at its core -- an arbitrary precision integer as implemented by the GMP library, while a Rational is a pair of two Integers, understood to be the numerator and denominator of a rational number. For this particular rational, the two integers would be 157 and 50 (i.e., 157/50=3.14).
AnnoyingHaskell would be... erm... annoying to use. For example, the following expression would not type check:
take 3 "hello"
because 3 is an Integer but take's first argument is an Int. Similarly, the expression:
42 + 3.149
would not type check, because 42 is an Integer and 3.149 is a Rational, and in AnnoyingHaskell, as in Haskell itself, you cannot add an Integer and a Rational.
Because this is annoying, the designers of Haskell made the decision that the literal program text 42 and 3.149 should be treated as if they were the AnnoyingHaskell expressions fromInteger 42 and fromRational 3.149.
The AnnoyingHaskell expression:
fromInteger 42 + fromRational 3.149
does type check. Specifically, the polymorphic function:
fromInteger :: (Num a) => Integer -> a
accepts the AnnoyingHaskell literal 42 :: Integer as its argument, and the resulting subexpression fromInteger 42 has resulting type Num a => a for some fresh type a. Similarly, fromRational 3.149 is of type Fractional b => b for some fresh type b. The + operator unifies these two types into a single type (Num c, Fractional c) => c, but Num c is redundant because Num is a superclass of Fractional, so the whole expression has a polymorphic type:
fromInteger 42 + fromRational 3.149 :: Fractional c => c
That is, this expression can be instantiated at any type with a Fractional constraint. For example, in the Haskell program:
main = print $ 42 + 3.149
which is equivalent to the AnnoyingHaskell program:
main = print $ fromInteger 42 + fromRational 3.149
the usual "defaulting" rules apply, and because the expression passed to the print statement is an unknown type c with a Fractional c constraint, it is defaulted to Double, allowing the program to actually run, computing and printing the desired Double.
If the compiler was awful, this program would run by creating a 42 :: Integer on the heap, calling fromInteger (specialized to fromInteger :: Integer -> Double) to create a 42 :: Double, then create 3.149 :: Rational on the heap, calling fromRational (specialized to fromRational :: Rational -> Double) to create a 3.149 :: Double, and then add them together to create the final answer 45.149 :: Double. Because the compiler isn't so awful, it just creates the number 45.149 :: Double directly.
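To make the desugaring concrete, here is a small sketch that spells out by hand what the literals stand for. The names sugared, desugared and asRational are made up for this illustration; 3149 % 1000 is the Rational that the literal 3.149 denotes.

```haskell
import Data.Ratio ((%))

-- What you write: both literals are polymorphic and are fixed to Double
-- by the type signature.
sugared :: Double
sugared = 42 + 3.149

-- What the literals stand for: an Integer passed to fromInteger and a
-- Rational passed to fromRational.
desugared :: Double
desugared = fromInteger (42 :: Integer) + fromRational (3149 % 1000 :: Rational)

-- The same source text at a different type selects different instances.
asRational :: Rational
asRational = 42 + 3.149
```

Here sugared and desugared are the same Double, while asRational is the exact fraction 45149 % 1000, because at type Rational nothing is rounded.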
Perhaps this will help more. One thing you seem to be struggling with is the nature of a value of type Num a => a, like the one produced by fromInteger (1 :: Integer). I think you're somehow imagining that fromInteger "packages" up the 1 :: Integer in a box so it can later be cast by special compiler magic to a 1 :: Int or 1 :: Double.
That's not what's happening.
Consider the following type class:
{-# LANGUAGE FlexibleInstances #-}
class Thing a where
  thing :: a
with associated instances:
instance Thing Bool where thing = True
instance Thing Int where thing = 16
instance Thing String where thing = "hello, world"
instance Thing (Int -> String) where thing n = replicate n '*'
and observe the result of running:
main = do
  print (thing :: Bool)
  print (thing :: Int)
  print (thing :: String)
  print $ (thing :: Int -> String) 15
Hopefully, you're comfortable enough with type classes that you don't find the output surprising. And presumably you don't think that thing contains some specific, identifiable "thing" that is being "cast" to a Bool, Int, etc. It's simply that thing is a polymorphic value whose definition depends on its type; that's just how type classes work.
Now, consider the similar example:
{-# LANGUAGE FlexibleInstances #-}
import Data.Ratio
import Data.Word
import Unsafe.Coerce
class Three a where
  three :: a
-- for base >= 4.10.0.0, can import GHC.Float (castWord64ToDouble)
-- more generally, we can use this unsafe coercion:
castWord64ToDouble :: Word64 -> Double
castWord64ToDouble w = unsafeCoerce w
instance Three Int where
  three = length "aaa"
instance Three Double where
  three = castWord64ToDouble 0x4008000000000000
instance Three Rational where
  three = (6 :: Integer) % (2 :: Integer)
main = do
  print (three :: Int)
  print (three :: Double)
  print (three :: Rational)
  print $ take three "abcdef"
  print $ (sqrt three :: Double)
Can you see here how three :: Three a => a represents a value that can be used as an Int, Double, or Rational? If you want to think of it as a cast, that's fine, but obviously there's no identifiable single "3" that's packaged up in the value three being cast to different types by compiler magic. It's just that a different definition of three is invoked, depending on the type demanded by the caller.
From here, it's not a big leap to:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MagicHash #-}
import Data.Ratio
import Data.Word
import Unsafe.Coerce
class MyFromInteger a where
  myFromInteger :: Integer -> a
instance MyFromInteger Integer where
  myFromInteger x = x
instance MyFromInteger Int where
  -- for base >= 4.10.0.0 can use the following:
  --   -- Note: data Integer = IS Int | ...
  --   myFromInteger (IS i) = I# i
  --   myFromInteger _ = error "not supported"
  -- to support more GHC versions, we'll just use this extremely
  -- dangerous coercion:
  myFromInteger i = unsafeCoerce i
instance MyFromInteger Rational where
  myFromInteger x = x % (1 :: Integer)
main = do
  print (myFromInteger 1 :: Integer)
  print (myFromInteger 2 :: Int)
  print (myFromInteger 3 :: Rational)
  print $ take (myFromInteger 4) "abcdef"
Conceptually, the base library's fromInteger (1 :: Integer) :: Num a => a is no different than this code's myFromInteger (1 :: Integer) :: MyFromInteger a => a, except that the implementations are better and more types have instances.
See, it's not that the expression fromInteger (1 :: Integer) boxes up a 1 :: Integer into a package of type Num a => a for later casting. It's that the type context for this expression causes dispatch to an appropriate Num type class instance, and a different definition of fromInteger is invoked, depending on the required type. That fromInteger function is always called with argument 1 :: Integer, but the returned type depends on the context, and the code invoked by the fromInteger call (i.e., the definition of fromInteger used) to convert or "cast" the argument 1 :: Integer to a "one" value of the desired type depends on which return type is demanded.
And, to go a step further, as long as we take care of a technical detail by turning off the monomorphism restriction, we can write:
{-# LANGUAGE NoMonomorphismRestriction #-}
main = do
  let two = myFromInteger 2
  print (two :: Integer)
  print (two :: Int)
  print (two :: Rational)
This may look strange, but just as myFromInteger 2 is an expression of type MyFromInteger a => a whose final value is produced using a definition of myFromInteger, depending on what type is ultimately demanded, the expression two is also an expression of type MyFromInteger a => a whose final value is produced using a definition of myFromInteger that depends on what type is ultimately demanded, even though the literal program text myFromInteger does not appear in the expression two. Moreover, continuing with:
  let four = two + two
  print (four :: Integer)
  print (four :: Int)
  print (four :: Rational)
the expression four of type (Num a, MyFromInteger a) => a will produce a final value that depends on the definition of myFromInteger and the definition of (+) that are determined by the finally demanded return type.
In other words, rather than thinking of four as a packaged 4 :: Integer that's going to be cast to various types, you need to think of four as completely equivalent to its full definition:
four = myFromInteger 2 + myFromInteger 2
with a final value that will be determined by using the definitions of myFromInteger and (+) that are appropriate for whatever type is demanded of four, whether it's four :: Integer or four :: Rational.
The same goes for sqrt (fromIntegral 1). After:
x = sqrt (fromIntegral (1 :: Integer))
the value of x :: Floating a => a is equivalent to the full expression:
sqrt (fromIntegral (1 :: Integer))
and every place it is used, it will be calculated using definitions of sqrt and fromIntegral determined by the Floating and Num instances for the final type demanded.
Here's all the code in one file, tested with GHC 8.2.2 and 9.2.4.
{-# LANGUAGE Haskell98 #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
import Data.Ratio
import GHC.Num
import GHC.Int
import GHC.Float (castWord64ToDouble)
class Thing a where
  thing :: a
instance Thing Bool where thing = True
instance Thing Int where thing = 16
instance Thing String where thing = "hello, world"
instance Thing (Int -> String) where thing n = replicate n '*'
class Three a where
  three :: a
instance Three Int where
  three = length "aaa"
instance Three Double where
  three = castWord64ToDouble 0x4008000000000000
instance Three Rational where
  three = (6 :: Integer) % (2 :: Integer)
class MyFromInteger a where
  myFromInteger :: Integer -> a
instance MyFromInteger Integer where
  myFromInteger x = x
instance MyFromInteger Int where
  -- Note: data Integer = IS Int | ...
  myFromInteger (IS i) = I# i
  myFromInteger _ = error "not supported"
instance MyFromInteger Rational where
  myFromInteger x = x % (1 :: Integer)
main = do
  print (thing :: Bool)
  print (thing :: Int)
  print (thing :: String)
  print $ (thing :: Int -> String) 15
  print (three :: Int)
  print (three :: Double)
  print (three :: Rational)
  print $ take three "abcdef"
  print $ (sqrt three :: Double)
  print (myFromInteger 1 :: Integer)
  print (myFromInteger 2 :: Int)
  print (myFromInteger 3 :: Rational)
  print $ take (myFromInteger 4) "abcdef"
  let two = myFromInteger 2
  print (two :: Integer)
  print (two :: Int)
  print (two :: Rational)
  let four = two + two
  print (four :: Integer)
  print (four :: Int)
  print (four :: Rational)

Using choose in frequency Haskell QuickCheck

So I have the code below and I am trying to make it an instance of Arbitrary:
data MyData = I Int | B Bool
instance Arbitrary MyData where
  arbitrary = do {
    frequency [(1, return (I 1)),
               (1, return (choose((B True), (B False))))]
  }
With this, however, I get (understandably) the error:
Couldn't match type ‘Gen MyData’ with ‘MyData’
Expected type: Gen MyData
Actual type: Gen (Gen MyData)
How can I implement this? Also, instead of (I 1) I would like to return I with a random Int. However, using the arbitrary function instead of 1 leads to the same error.
Since you seem to want to distribute evenly between I and B constructors, a simpler solution would be to use oneof instead of frequency:
data MyData = I Int | B Bool deriving (Eq, Show)
instance Arbitrary MyData where
  arbitrary = oneof [genI, genB]
    where genI = fmap I arbitrary
          genB = fmap B arbitrary
The genI and genB generators use the underlying Arbitrary instances of Int and Bool by mapping raw integers and Boolean values to the respective case constructors.
Here's a set of sample data:
> sample (arbitrary :: Gen MyData)
B False
B False
I 2
B False
I 1
I 7
B False
B False
B True
I 7
B False
As you can see, it also picks arbitrary integers.
The code in the OP has several problems. The first error message is that the return type is nested. One way to get around that is to remove the do notation. This, however, doesn't solve the problem.
Even if you reduce it to the following, it doesn't type-check:
instance Arbitrary MyData where
  arbitrary =
    frequency [(1, return (I 1)),
               (1, choose(B True, B False))]
This attempt produces the error:
Q72160684.hs:10:21: error:
* No instance for (random-1.1:System.Random.Random MyData)
arising from a use of `choose'
* In the expression: choose (B True, B False)
In the expression: (1, choose (B True, B False))
In the first argument of `frequency', namely
`[(1, return (I 1)), (1, choose (B True, B False))]'
|
10 | (1, choose(B True, B False))]
| ^^^^^^^^^^^^^^^^^^^^^^^
The choose method requires the input to be Random instances, which MyData isn't.
If you really want to use frequency rather than oneof, the easiest way may be to first get oneof to work, since you can view frequency as a generalisation of oneof.
First, to make the code a little more succinct, I've used <$> instead of fmap and then inlined both generators:
instance Arbitrary MyData where
  arbitrary = oneof [I <$> arbitrary, B <$> arbitrary]
Now replace oneof with frequency and change each generator to a weighted tuple:
instance Arbitrary MyData where
  arbitrary = frequency [(10, I <$> arbitrary), (1, B <$> arbitrary)]
Sampling from this instance illustrates that the distribution is now skewed:
> sample (arbitrary :: Gen MyData)
I 0
I (-2)
I (-4)
I (-1)
I 0
I 8
B True
I 1
I 3
I (-3)
I (-16)
There are 10 I values and only 1 B value.
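Coming back to the original attempt: choose needs a Random instance, but elements only needs a list of values, so it can pick between B True and B False directly. Here is a minimal sketch, assuming the QuickCheck package is available; genMyData is a name introduced here for illustration.

```haskell
import Test.QuickCheck (Arbitrary (..), Gen, elements, frequency)

data MyData = I Int | B Bool
  deriving (Eq, Show)

-- Each entry of the list is already a Gen MyData, so there is no
-- `return` around it and therefore no nested Gen (Gen MyData).
genMyData :: Gen MyData
genMyData =
  frequency
    [ (1, I <$> arbitrary)              -- I with a random Int
    , (1, elements [B True, B False])   -- pick one of two fixed values
    ]

instance Arbitrary MyData where
  arbitrary = genMyData
```

The weights can then be changed independently of how each constructor's contents are generated.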
You can derive it with generic-random (since 1.5.0.0).
Deriving via GenericArbitraryU: Gives a uniform distribution (like oneof from Mark Seemann's answer):
{-# Language DataKinds #-}
{-# Language DeriveGeneric #-}
{-# Language DerivingVia #-}
import Test.QuickCheck
import GHC.Generics
import Generic.Random.DerivingVia
-- ghci> :set -XTypeApplications
-- ghci> sample @MyData arbitrary
-- I 0
-- I 1
-- B True
-- I 6
-- I (-5)
-- I (-7)
-- B True
-- I (-10)
-- B True
-- B True
-- I (-9)
data MyData = I Int | B Bool
  deriving
  stock (Show, Generic)
  deriving Arbitrary
  via GenericArbitraryU MyData
Deriving via GenericArbitrary: gives a weighted distribution specified by a type-level list of numbers. They denote the frequency of each constructor (like frequency):
-- ghci> sample @MyData arbitrary
-- I 0
-- I (-2)
-- I 4
-- I 5
-- I 2
-- I 0
-- B False
-- I (-9)
-- I (-10)
-- I (-3)
-- I (-8)
data MyData = I Int | B Bool
  deriving
  stock (Show, Generic)
  deriving Arbitrary
  via GenericArbitrary '[10, 1] MyData

DataKind Unions

I'm not sure if it is the right terminology, but is it possible to declare function types that take in a 'union' of datakinds?
For example, I know I can do the following:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
...
data Shape'
  = Circle'
  | Square'
  | Triangle'
data Shape :: Shape' -> * where
  Circle :: { radius :: Int} -> Shape Circle'
  Square :: { side :: Int} -> Shape Square'
  Triangle
    :: { a :: Int
       , b :: Int
       , c :: Int}
    -> Shape Triangle'
test1 :: Shape Circle' -> Int
test1 = undefined
However, what if I want to take in a shape that is either a circle or a square? What if I also want to take in all shapes for a separate function?
Is there a way for me to either define a set of Shape' kinds to use, or a way for me to allow multiple datakind definitions per data?
Edit:
The usage of unions doesn't seem to work:
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
...
type family Union (a :: [k]) (r :: k) :: Constraint where
  Union (x ': xs) x = ()
  Union (x ': xs) y = Union xs y
data Shape'
  = Circle'
  | Square'
  | Triangle'
data Shape :: Shape' -> * where
  Circle :: { radius :: Int} -> Shape Circle'
  Square :: { side :: Int} -> Shape Square'
  Triangle
    :: { a :: Int
       , b :: Int
       , c :: Int}
    -> Shape Triangle'
test1 :: Union [Circle', Triangle'] s => Shape s -> Int
test1 Circle {} = undefined
test1 Triangle {} = undefined
test1 Square {} = undefined
The code above compiles, even though test1 should not be allowed to match on Square.
You can accomplish something like this in (I think) a reasonably clean way using a type family together with ConstraintKinds and PolyKinds:
type family Union (a :: [k]) (r :: k) :: Constraint where
  Union (x ': xs) x = ()
  Union (x ': xs) y = Union xs y
test1 :: Union [Circle', Triangle'] s => Shape s -> Int
test1 = undefined
The () above is the empty constraint (it's like an empty "list" of type class constraints).
The first "equation" of the type family makes use of the nonlinear pattern matching available in type families (it uses x twice on the left hand side). The type family also makes use of the fact that if none of the cases match, it will not give you a valid constraint.
You should also be able to use a type-level Boolean instead of ConstraintKinds. That would be a bit more cumbersome and I think it would be best to avoid using a type-level Boolean here (if you can).
Side-note (I can never remember this and I had to look it up for this answer): You get Constraint in-scope by importing it from GHC.Exts.
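For reference, here is a self-contained sketch of the constraint-based approach with the pragmas and import it needs. The names ShapeKind, CircleK, SquareK, TriangleK, Member and size are renamed or invented for this example, the record fields are dropped for brevity, and Constraint is imported from Data.Kind rather than GHC.Exts.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Constraint)

-- Promoted by DataKinds and used as the index of Shape.
data ShapeKind = CircleK | SquareK | TriangleK

data Shape (s :: ShapeKind) where
  Circle   :: Int -> Shape 'CircleK
  Square   :: Int -> Shape 'SquareK
  Triangle :: Int -> Int -> Int -> Shape 'TriangleK

-- Reduces to the empty constraint when x occurs in the list and gets
-- stuck (there is no equation for '[]) when it does not.
type family Member (xs :: [k]) (x :: k) :: Constraint where
  Member (x ': xs) x = ()
  Member (y ': xs) x = Member xs x

-- Callable only with circles and triangles. A call such as
-- size (Square 2) is expected to be rejected at the call site, because
-- the constraint for 'SquareK never reduces.
size :: Member '[ 'CircleK, 'TriangleK ] s => Shape s -> Int
size (Circle r)       = 2 * r
size (Triangle a b c) = a + b + c
```

Note that this restricts callers only: a Square equation inside size itself would still be accepted by the type checker.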
Edit: Partially disallowing unreachable definitions
Here is a modification to get it to (partially) disallow unreachable definitions as well as invalid calls. It is slightly more roundabout, but it seems to work.
Modify Union to give a * instead of a constraint, like this:
type family Union (a :: [k]) (r :: k) :: * where
  Union (x ': xs) x = ()
  Union (x ': xs) y = Union xs y
It doesn't matter too much what the type is, as long as it has an inhabitant you can pattern match on, so I give back the () type (the unit type).
This is how you would use it:
test1 :: Shape s -> Union [Circle', Triangle'] s -> Int
test1 Circle {} () = undefined
test1 Triangle {} () = undefined
-- test1 Square {} () = undefined -- This line won't compile
If you forget to match on it (like, if you put a variable name like x instead of matching on the () constructor), it is possible that an unreachable case can be defined. It will still give a type error at the call-site when you actually try to reach that case, though (so, even if you don't match on the Union argument, the call test1 (Square undefined) () will not type check).
Note that it seems the Union argument must come after the Shape argument in order for this to work (fully as described, anyway).
This is getting kind of awful, but I guess you could require a proof that it's either a circle or a square using Data.Type.Equality:
test1 :: Either (s :~: Circle') (s :~: Square') -> Shape s -> Int
Now the user has to give an extra argument (a "proof term") saying which one it is.
In fact you can use the proof term idea to "complete" bradm's solution, with:
class MyOpClass sh where
  myOp :: Shape sh -> Int
  shapeConstraint :: Either (sh :~: Circle') (sh :~: Square')
Now nobody can go adding any more instances (unless they use undefined, which would be impolite).
You could use typeclasses:
class MyOpClass sh where
  myOp :: Shape sh -> Int
instance MyOpClass Circle' where
  myOp (Circle r) = _
instance MyOpClass Square' where
  myOp (Square s) = _
This doesn't feel like a particularly 'complete' solution to me - anyone could go back and add another instance MyOpClass Triangle' - but I can't think of any other solution. Potentially you could avoid this problem simply by not exporting the typeclass however.
Another solution I've noticed, though pretty verbose, is to create a kind that has a list of feature booleans. You can then pattern match on the features when restricting the type:
-- [circleOrSquare] [triangleOrSquare]
data Shape' =
  Shape'' Bool
          Bool
data Shape :: Shape' -> * where
  Circle :: { radius :: Int} -> Shape (Shape'' True False)
  Square :: { side :: Int} -> Shape (Shape'' True True)
  Triangle
    :: { a :: Int
       , b :: Int
       , c :: Int}
    -> Shape (Shape'' False True)
test1 :: Shape (Shape'' True x) -> Int
test1 Circle {} = 2
test1 Square {} = 2
test1 Triangle {} = 2
Here, Triangle will fail to match:
• Couldn't match type ‘'True’ with ‘'False’
Inaccessible code in
a pattern with constructor:
Triangle :: Int -> Int -> Int -> Shape ('Shape'' 'False 'True),
in an equation for ‘test1’
• In the pattern: Triangle {}
In an equation for ‘test1’: test1 Triangle {} = 2
|
52 | test1 Triangle {} = 2
| ^^^^^^^^^^^
Unfortunately, I don't think you can write the feature list as a record, which would be clearer and would avoid depending on the order of the features.
This might be usable in conjunction with the class examples for readability.

How can I encode and enforce legal FSM state transitions with a type system?

Suppose I have a type Thing with a state property A | B | C,
and legal state transitions are A->B, A->C, C->A.
I could write:
transitionToA :: Thing -> Maybe Thing
which would return Nothing if Thing was in a state which cannot transition to A.
But I'd like to define my type, and the transition functions in such a way that transitions can only be called on appropriate types.
An option is to create separate types AThing BThing CThing but that doesn't seem maintainable in complex cases.
Another approach is to encode each state as its own type:
data A a = A a
data B a = B a
data C a = C a
and
transitionCToA :: C Thing -> A Thing
This seems cleaner to me. But it occurred to me that A, B, C are then functors where all of Thing's functions could be mapped except the transition functions.
With typeclasses I could create something like:
class ToA t where
  toA :: t -> A Thing
Which seems cleaner still.
Are there other preferred approaches that would work in Haskell and PureScript?
Here's a fairly simple way that uses a (potentially phantom) type parameter to track which state a Thing is in:
{-# LANGUAGE DataKinds, KindSignatures #-}
-- note: not exporting the constructors of Thing
module Thing (Thing, transAB, transAC, transCA) where
data State = A | B | C
data Thing (s :: State) = {- elided; can even be a data family instead -}
transAB :: Thing A -> Thing B
transAC :: Thing A -> Thing C
transCA :: Thing C -> Thing A
transAB = {- elided -}
transAC = {- elided -}
transCA = {- elided -}
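Filling in the elided parts gives something like the following sketch. The String payload and the helpers newThing, payload and legalRoute are placeholders invented for this example, and the module header is left out so the snippet stands alone.

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

-- The states, used only at the type level (promoted by DataKinds).
data State = A | B | C

-- The parameter s is phantom: it does not appear on the right-hand side.
-- The String is a placeholder for whatever a Thing really holds.
newtype Thing (s :: State) = Thing String

-- In this sketch every Thing starts out in state A.
newThing :: String -> Thing 'A
newThing = Thing

-- Only the legal transitions exist as functions.
transAB :: Thing 'A -> Thing 'B
transAB (Thing x) = Thing x

transAC :: Thing 'A -> Thing 'C
transAC (Thing x) = Thing x

transCA :: Thing 'C -> Thing 'A
transCA (Thing x) = Thing x

payload :: Thing s -> String
payload (Thing x) = x

-- A -> C -> A -> B type checks. Something like transCA (transAB t)
-- would be rejected, because Thing 'B is not Thing 'C.
legalRoute :: Thing 'A -> Thing 'B
legalRoute = transAB . transCA . transAC
```

Hiding the Thing constructor behind a module export list, as in the answer, is what stops callers from relabelling the state themselves.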
You could use a type class (available in PureScript) along with phantom types as John suggested, but using the type class as a final encoding of the type of paths:
data A -- States at the type level
data B
data C
class Path p where
  ab :: p A B -- One-step paths
  ac :: p A C
  ca :: p C A
  trans :: forall a b c. p c b -> p b a -> p c a -- Joining paths
  refl :: forall a. p a a
Now you can create a type of valid paths:
type ValidPath a b = forall p. (Path p) => p a b
roundTrip :: ValidPath A A
roundTrip = trans ca ac
Paths can only be constructed by using the one-step paths you provide.
You can write instances to use your paths, but importantly, any instance has to respect the valid transitions at the type level.
For example, here is an interpretation which calculates lengths of paths:
newtype Length a b = Length Int
instance pathLength :: Path Length where
  ab = Length 1
  ac = Length 1
  ca = Length 1
  trans (Length n) (Length m) = Length (n + m)
  refl = Length 0
Since your goal is to prevent developers from performing illegal transitions, you may want to look into phantom types. Phantom types allow you to model type-safe transitions without leveraging more advanced features of the type system; as such they are portable to many languages.
Here's a PureScript encoding of your above problem:
foreign import data A :: *
foreign import data B :: *
foreign import data C :: *
data Thing a = Thing
transitionToA :: Thing C -> Thing A
Phantom types work well to model valid state transitions when you have the property that two different states cannot transition to the same state (unless all states can transition to that state). You can work around this limitation by using type classes (class CanTransitionToA a where trans :: Thing a -> Thing A), but at this point, you should investigate other approaches.
If you want to store a list of transitions so that you can process it later, you can do something like this:
{-# LANGUAGE DataKinds, GADTs, KindSignatures, PolyKinds #-}
data State = A | B | C
data Edge (a :: State) (b :: State) where
  EdgeAB :: Edge A B
  EdgeAC :: Edge A C
  EdgeCA :: Edge C A
data Domino (f :: k -> k -> *) (a :: k) (b :: k) where
  I :: Domino f a a
  (:>>:) :: f a b -> Domino f b c -> Domino f a c
infixr :>>:
example :: Domino Edge A B
example = EdgeAC :>>: EdgeCA :>>: EdgeAB :>>: I
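Processing such a stored path later is ordinary recursion over the list of edges. The sketch below repeats the definitions so that it is self-contained. The names Done (instead of I), pathLength, edgeName and edgeNames are chosen for this example, Type from Data.Kind is used in place of *, and the fixity level 5 is an arbitrary choice.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, PolyKinds #-}

import Data.Kind (Type)

data State = A | B | C

data Edge (a :: State) (b :: State) where
  EdgeAB :: Edge 'A 'B
  EdgeAC :: Edge 'A 'C
  EdgeCA :: Edge 'C 'A

-- A type-aligned list: neighbouring edges must agree on the state
-- they share, like dominoes.
data Domino (f :: k -> k -> Type) (a :: k) (b :: k) where
  Done   :: Domino f a a
  (:>>:) :: f a b -> Domino f b c -> Domino f a c

infixr 5 :>>:

-- Number of steps; works for any edge type f.
pathLength :: Domino f a b -> Int
pathLength Done          = 0
pathLength (_ :>>: rest) = 1 + pathLength rest

edgeName :: Edge a b -> String
edgeName EdgeAB = "A->B"
edgeName EdgeAC = "A->C"
edgeName EdgeCA = "C->A"

-- Replay the stored transitions in order.
edgeNames :: Domino Edge a b -> [String]
edgeNames Done          = []
edgeNames (e :>>: rest) = edgeName e : edgeNames rest

example :: Domino Edge 'A 'B
example = EdgeAC :>>: EdgeCA :>>: EdgeAB :>>: Done
```

The explicit type signatures on the recursive functions matter, since each recursive call is at a different pair of state indices.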
You can turn that into an instance of Path by writing a concatenation function for Domino:
{-# LANGUAGE FlexibleInstances #-}
instance Path (Domino Edge) where
  ab = EdgeAB :>>: I
  ac = EdgeAC :>>: I
  ca = EdgeCA :>>: I
  refl = I
  trans I es' = es'
  trans (e :>>: es) es' = e :>>: (es `trans` es')
In fact, this makes me wonder if Hackage already has a package that defines "indexed monoids":
class IMonoid (m :: k -> k -> *) where
  imempty :: m a a
  imappend :: m a b -> m b c -> m a c
instance IMonoid (Domino e) where
  imempty = I
  imappend I es' = es'
  imappend (e :>>: es) es' = e :>>: (es `imappend` es')

Sort by constructor ignoring (part of) value

Suppose I have
data Foo = A String Int | B Int
I want to take an xs :: [Foo] and sort it such that all the As are at the beginning, sorted by their strings, but with the ints in the order they appeared in the list, and then have all the Bs at the end, in the same order they appeared.
In particular, I want to create a new list containing the first A of each string and the first B.
I did this by defining a function taking Foos to (Int, String)s and using sortBy and groupBy.
Is there a cleaner way to do this? Preferably one that generalizes to at least 10 constructors.
Typeable, maybe? Something else that's nicer?
EDIT: This is used for processing a list of Foos that is used elsewhere. There is already an Ord instance which is the normal ordering.
You can use
sortBy (comparing foo)
where foo is a function that extracts the interesting parts into something comparable (e.g. Ints).
In the example, since you want the As sorted by their Strings, a mapping to Int with the desired properties would be too complicated, so we use a compound target type.
foo (A s _) = (0,s)
foo (B _) = (1,"")
would be a possible helper. This is more or less equivalent to Tikhon Jelvis' suggestion, but it leaves space for the natural Ord instance.
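Put together with the deduplication step from the question, this could look like the sketch below. The names key, sortByKey and firstOfEach are made up for this example. It relies on sortBy being stable, so elements with equal keys keep their original relative order, and on nubBy keeping the first of each group of equal elements.

```haskell
import Data.Function (on)
import Data.List (nubBy, sortBy)
import Data.Ord (comparing)

data Foo = A String Int | B Int
  deriving (Eq, Show)

-- Constructor rank first, then the String for A; the Ints are ignored.
key :: Foo -> (Int, String)
key (A s _) = (0, s)
key (B _)   = (1, "")

-- All As first, ordered by their strings, then all Bs.
sortByKey :: [Foo] -> [Foo]
sortByKey = sortBy (comparing key)

-- The first A of each string, followed by the first B.
firstOfEach :: [Foo] -> [Foo]
firstOfEach = nubBy ((==) `on` key) . sortByKey
```

Supporting more constructors only means adding one equation per constructor to key.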
To make it easier to build comparison functions for ADTs with a large number of constructors, you can map values to their constructor index with SYB:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics
data Foo = A String Int | B Int deriving (Show, Eq, Typeable, Data)
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr
Example:
*Main Data.Generics> cIndex $ A "foo" 42
1
*Main Data.Generics> cIndex $ B 0
2
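The constructor index can then serve directly as a sort key. The following sketch uses Data.Data from base instead of the syb package's Data.Generics, which is an assumption made here to keep the example dependency-free; the extra constructor C and the name groupByConstructor are invented for illustration.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, constrIndex, toConstr)
import Data.List (sortBy)
import Data.Ord (comparing)

data Foo = A String Int | B Int | C
  deriving (Eq, Show, Data)

-- 1-based position of the value's constructor in the data declaration.
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr

-- Stable sort by constructor only: values built with the same
-- constructor keep their original relative order.
groupByConstructor :: [Foo] -> [Foo]
groupByConstructor = sortBy (comparing cIndex)
```

To also order the As by their strings, cIndex can be paired with a second key component in the same way as the (Int, String) helper.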
Edit: After re-reading your question, I think the best option is to make Foo an instance of Ord. I do not think there is any way to do this automatically that will act the way you want (just using deriving will create different behavior).
Once Foo is an instance of Ord, you can just use sort from Data.List.
In your exact example, you can do something like this:
data Foo = A String Int | B Int deriving (Eq)
instance Ord Foo where
  (A _ _) <= (B _) = True
  (A s _) <= (A s' _) = s <= s'
  (B _) <= (B _) = True
  (B _) <= (A _ _) = False
When something is an instance of Ord, it means the data type has some ordering. Once we know how to order something, we can use a bunch of existing functions (like sort) on it and it will behave how you want. Anything in Ord has to be part of Eq, which is what the deriving (Eq) bit does automatically.
You can also derive Ord. However, the behavior will not be exactly what you want--it will order by all of the fields if it has to (e.g. it will put As with the same string in order by their integers).
Further edit: I was thinking about it some more and realized my solution is probably semantically wrong.
An Ord instance is a statement about your whole data type. For example, I'm saying that Bs are always equal with each other when the derived Eq instance says otherwise.
If the data you're representing always behaves like this (that is, Bs are all equal and As with the same string are all equal) then an Ord instance makes sense. Otherwise, you should not actually do this.
However, you can do something almost exactly like this: write your own special compare function (Foo -> Foo -> Ordering) that encapsulates exactly what you want to do then use sortBy. This properly codifies that your particular sorting is special rather than the natural ordering of the data type.
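Such a special-purpose comparison might look like the sketch below. The names compareFoo and sortFoos are chosen for this example; the derived Eq and Show instances are left untouched, so the natural equality of the type is not affected.

```haskell
import Data.List (sortBy)

data Foo = A String Int | B Int
  deriving (Eq, Show)

-- As come before Bs; As are ordered by their strings only; any two Bs
-- compare as equal, so a stable sort keeps them in their original order.
compareFoo :: Foo -> Foo -> Ordering
compareFoo (A s _) (A t _) = compare s t
compareFoo (A _ _) (B _)   = LT
compareFoo (B _)   (A _ _) = GT
compareFoo (B _)   (B _)   = EQ

sortFoos :: [Foo] -> [Foo]
sortFoos = sortBy compareFoo
```

Because the function covers every pair of constructors explicitly, adding a constructor later produces an incomplete-pattern warning (with warnings enabled) rather than a silent misordering.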
You could use some Template Haskell to fill in the missing transitive cases. mkTransitiveLt creates the transitive closure of the given cases (if you order them least to greatest). This gives you a working less-than, which can be turned into a function that returns an Ordering.
{-# LANGUAGE TemplateHaskell #-}
import MkTransitiveLt
import Data.List (sortBy)
data Foo = A String Int | B Int | C | D | E deriving(Show)
cmp a b = $(mkTransitiveLt [|
  case (a, b) of
    (A _ _, B _) -> True
    (B _, C) -> True
    (C, D) -> True
    (D, E) -> True
    (A s _, A s' _) -> s < s'
    otherwise -> False|])
lt2Ord f a b =
  case (f a b, f b a) of
    (True, _) -> LT
    (_, True) -> GT
    otherwise -> EQ
main = print $ sortBy (lt2Ord cmp) [A "Z" 1, A "A" 1, B 1, A "A" 0, C]
Generates:
[A "A" 1,A "A" 0,A "Z" 1,B 1,C]
mkTransitiveLt must be defined in a separate module:
module MkTransitiveLt (mkTransitiveLt)
where
import Language.Haskell.TH
mkTransitiveLt :: ExpQ -> ExpQ
mkTransitiveLt eq = do
  CaseE e ms <- eq
  return . CaseE e . reverse . foldl go [] $ ms
  where
    go ms m@(Match (TupP [a, b]) body decls) = (m:ms) ++
      [Match (TupP [x, b]) body decls | Match (TupP [x, y]) _ _ <- ms, y == a]
    go ms m = m:ms
