Vector-space library and constraining the scalar type - haskell

I'm working on a program that uses the vector-space library, and I'm having some trouble using it.
See the code below.
import Data.VectorSpace
-- scale a vector with a float
step :: (VectorSpace a) => a -> Float -> a
step x dt = x ^* dt
When compiling this code segment I get errors regarding the associated scalar type for the vector typeclass.
Could not deduce (Scalar a ~ Float)
from the context (VectorSpace a)
bound by the type signature for
step :: VectorSpace a => a -> Float -> a
at Test.hs:5:9-42
In the expression: x ^* dt
In an equation for `step': step x dt = x ^* dt
Is there a type signature that will fix this compiler error? Or is there a better library for describing the operations I'm looking for in a type (like addition and scaling)? In the end I'm hoping to use the code for things like:
step (1,1) 0.5
step 1 0.5
Essentially I'm hoping to reuse some of the instances that vector-space defines.

EDIT: found signature on hackage to be incorrect
You can just add the constraint that GHC complained about:
{-# LANGUAGE GADTs #-}
import Data.VectorSpace
step :: (VectorSpace a, Scalar a ~ Float) => a -> Float -> a
step x dt = x ^* dt
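To see why the extra constraint is enough, here is a minimal, self-contained sketch that does not depend on vector-space itself. The class VS and its two instances are hypothetical stand-ins that only mimic the shape of VectorSpace and its associated Scalar type; the real library's instances may be defined differently.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Hypothetical stand-in for VectorSpace: each vector type names its scalar type.
class VS v where
  type Scalar v
  (^*) :: v -> Scalar v -> v

instance VS Float where
  type Scalar Float = Float
  x ^* s = x * s

-- Pairs scale component-wise, provided both components share a scalar type.
instance (VS a, VS b, Scalar a ~ Scalar b) => VS (a, b) where
  type Scalar (a, b) = Scalar a
  (x, y) ^* s = (x ^* s, y ^* s)

-- The equality constraint tells GHC that the scalar of `a` is Float,
-- which is exactly the fact the error message said it could not deduce.
step :: (VS a, Scalar a ~ Float) => a -> Float -> a
step x dt = x ^* dt
```

Loaded in GHCi, step (1, 1) 0.5 :: (Float, Float) and step (1 :: Float) 0.5 should typecheck. A bare step 1 0.5 will most likely still need an annotation, because nothing else pins down the vector type.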

Related

Easy function gives compile error on conversion from Int to Double

Why does this easy function which computes the distance between 2 integer points in the plane not compile?
distance :: (Int, Int) -> (Int, Int) -> Double
distance (x, y) (u, v) = sqrt ((x - u) ^ 2 + (y - v) ^ 2)
I get the error Couldn't match expected type ‘Double’ with actual type ‘Int’.
It is frustrating such an easy mathematical function consumes so much of my time. Any explanation why this goes wrong and the most elegant way to fix this is appreciated.
This is my solution to overcome the problem
distance :: (Int, Int) -> (Int, Int) -> Double
distance (x, y) (u, v) =
  let xd = fromIntegral x :: Double
      yd = fromIntegral y :: Double
      ud = fromIntegral u :: Double
      vd = fromIntegral v :: Double
  in sqrt ((xd - ud) ^ 2 + (yd - vd) ^ 2)
but there must be a more elegant way.
Most languages only do type inference (if any) “in direction of data flow”. E.g., you start with a value 2 in Java or Python, that'll be an int. You calculate something like 2 + 4, and the + operator infers from the integer arguments that the result is also int. In dynamic languages this is the only way that's possible at all (because the types are only an “associated property” of values). In static languages like C++, the inference-step is only done once at compile time, but it's still done largely “as if the types were associated properties of values”.
Not so in Haskell. Like other Hindley-Milner languages, it has a type system that works completely independently of any runtime data flow directions. It can still do forward-inference ((2::Int) + (4::Int) is unambiguously of type Int), but it's only a special case – types can just as well be inferred in the “reverse direction”, i.e. if you write (x + y) :: Int the compiler is able to infer that both x and y must have type Int as well.
This reverse-polymorphism enables many nice tricks – example:
Prelude Debug.SimpleReflect> 2 + 4 :: Expr
2 + 4
Prelude Debug.SimpleReflect> 7^3 :: Expr
7 * 7 * 7
...but it only works if the language never does implicit conversions, not even in “safe†, obvious cases” like Int -> Integer.
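The same reverse inference can be seen with nothing but the Prelude. This is a small sketch with made-up binding names: in each case the annotation on the result decides how the expression on the right is interpreted.

```haskell
-- The annotation on the result, not on the argument, selects the behaviour of read.
asInt :: Int
asInt = read "42"

asDouble :: Double
asDouble = read "42"

-- Inference flows from the result into the operands:
-- because the sum is an Int, both literals are taken to be Int.
total :: Int
total = 2 + 4
```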
Usually, the type checker automatically infers the most sensible type. For your original implementation, the checker would infer the type
distance :: Floating a => (a, a) -> (a, a) -> a
and that – or perhaps the specialised version
distance :: (Double,Double) -> (Double,Double) -> Double
is a much more sensible type than your (Int, Int) -> ... attempt, because the Euclidean distance actually makes no sense on a discrete grid (you'd want something like a taxicab distance there).
What you'd actually want is distance from the vector-space package. This is more general, works not only on 2-tuples but any suitable space.
†Int -> Double is actually not a safe conversion – try float(1000000000000000001) in Python! So even without Hindley-Milner, this is not really a very smart thing to do implicitly.
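As a sketch of the advice above: keep the general Floating signature for the computation, and convert integers explicitly only where they enter. The wrapper name distanceInt is an assumption made for this example.

```haskell
-- General version: the signature GHC would infer for the original body.
distance :: Floating a => (a, a) -> (a, a) -> a
distance (x, y) (u, v) = sqrt ((x - u) ^ (2 :: Int) + (y - v) ^ (2 :: Int))

-- Hypothetical wrapper for integer grid points: convert once, at the boundary.
distanceInt :: (Int, Int) -> (Int, Int) -> Double
distanceInt (x, y) (u, v) =
  distance (fromIntegral x, fromIntegral y) (fromIntegral u, fromIntegral v)
```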
SOLVED: now I have this
distance :: (Int, Int) -> (Int, Int) -> Double
distance (x, y) (u, v) = sqrt (fromIntegral ((x - u) ^ 2 + (y - v) ^ 2))

Typeclass instances with constrained return type

I'm implementing a notion of inner product that's general over the container and numerical types. The definition states that the return type of this operation is a (non-negative) real number.
One option (shown below) is to write all instances by hand, for each numerical type (Float, Double, Complex Float, Complex Double, Complex CFloat, Complex CDouble, etc.). The primitive types aren't many, but I dislike the repetition.
Another option, or so I thought, is to have a parametric instance with a constraint such as RealFloat (which represents Float and Double).
{-# language MultiParamTypeClasses, TypeFamilies, FlexibleInstances #-}
module Test where
import Data.Complex
class Hilbert c e where
  type HT e :: *
  dot :: c e -> c e -> HT e
instance Hilbert [] Double where
  type HT Double = Double
  dot x y = sum $ zipWith (*) x y
instance Hilbert [] (Complex Double) where
  type HT (Complex Double) = Double
  a `dot` b = realPart $ sum $ zipWith (*) (conjugate <$> a) b
Question
Why does the instance below not work ("Couldn't match type e with Double.. expected type HT e, actual type e")?
instance RealFloat e => Hilbert [] e where
  type HT e = Double
  dot x y = sum $ zipWith (*) x y
Well, that particular instance doesn't work because the sum only yields an e, but you want the result to be Double. As e is constrained to RealFloat (and every RealFloat is also a Real), this is easy to fix though, as any Real (questionable though it is mathematically) can be converted to a Fractional:
dot x y = realToFrac . sum $ zipWith (*) x y
However, that generic instance prevents you from also defining complex instances: with instance RealFloat e => Hilbert [] e where you cover all types, even if they aren't really real numbers. You could still instantiate Complex as an overlapping instance, but I'd rather stay away from those if I could help it.
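Put together, the realToFrac fix might look like the following self-contained sketch. It keeps only the generic instance (so the hand-written Double and Complex instances from the question are deliberately left out, since they would overlap with it) and spells the kind as Type from Data.Kind instead of *.

```haskell
{-# LANGUAGE MultiParamTypeClasses, TypeFamilies, FlexibleInstances #-}
import Data.Kind (Type)

class Hilbert c e where
  type HT e :: Type
  dot :: c e -> c e -> HT e

-- One generic instance for every RealFloat element type.
-- `sum` yields an `e`, so it is converted to the declared result type Double.
instance RealFloat e => Hilbert [] e where
  type HT e = Double
  dot x y = realToFrac . sum $ zipWith (*) x y
```

With this instance, dot [1, 2, 3 :: Float] [4, 5, 6] has type Double even though the elements are Float.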
It's also questionable if such vectorspace classes should be defined on * -> * at all. Yes, linear also does it this way, but IMO parametricity doesn't work in our favour in this application. Have you checked out the vector-space package? Mind, it isn't exactly complete for doing serious linear algebra; that's a gap I hope to fill with my linearmap-category package.

What can type families do that multi param type classes and functional dependencies cannot

I have played around with TypeFamilies, FunctionalDependencies, and MultiParamTypeClasses. And it seems to me as though TypeFamilies doesn't add any concrete functionality over the other two. (But not vice versa). But I know type families are pretty well liked so I feel like I am missing something:
"open" relation between types, such as a conversion function, which does not seem possible with TypeFamilies. Done with MultiParamTypeClasses:
class Convert a b where
  convert :: a -> b
instance Convert Foo Bar where
  convert = foo2Bar
instance Convert Foo Baz where
  convert = foo2Baz
instance Convert Bar Baz where
  convert = bar2Baz
Surjective relation between types, such as a sort of type safe pseudo-duck typing mechanism, that would normally be done with a standard type family. Done with MultiParamTypeClasses and FunctionalDependencies:
class HasLength a b | a -> b where
  getLength :: a -> b
instance HasLength [a] Int where
  getLength = length
instance HasLength (Set a) Int where
  getLength = S.size
instance HasLength Event DateDiff where
  -- the event must be bound as an argument before its fields can be read
  getLength event = dateDiff (start event) (end event)
Bijective relation between types, such as for an unboxed container, which could be done through TypeFamilies with a data family, although then you have to declare a new data type for every contained type, such as with a newtype. Either that or with an injective type family, which I think is not available prior to GHC 8. Done with MultiParamTypeClasses and FunctionalDependencies:
class Unboxed a b | a -> b, b -> a where
  toList :: a -> [b]
  fromList :: [b] -> a
instance Unboxed FooVector Foo where
  toList = fooVector2List
  fromList = list2FooVector
instance Unboxed BarVector Bar where
  toList = barVector2List
  fromList = list2BarVector
And lastly a surjective relation between two types and a third type, such as a Python 2 or Java style division function, which can be done with TypeFamilies by also using MultiParamTypeClasses. Done with MultiParamTypeClasses and FunctionalDependencies:
class Divide a b c | a b -> c where
  divide :: a -> b -> c
instance Divide Int Int Int where
  divide = div
instance Divide Int Double Double where
  divide = (/) . fromIntegral
instance Divide Double Int Double where
  divide = (. fromIntegral) . (/)
instance Divide Double Double Double where
  divide = (/)
One other thing I should also add is that it seems like FunctionalDependencies and MultiParamTypeClasses are also quite a bit more concise (for the examples above anyway) as you only have to write the type once, and you don't have to come up with a dummy type name which you then have to type for every instance like you do with TypeFamilies:
instance FooBar LongTypeName LongerTypeName where
  type FooBarResult LongTypeName LongerTypeName = LongestTypeName
  fooBar = someFunction
vs:
instance FooBar LongTypeName LongerTypeName LongestTypeName where
  fooBar = someFunction
So unless I am convinced otherwise it really seems like I should just not bother with TypeFamilies and use solely FunctionalDependencies and MultiParamTypeClasses. Because as far as I can tell it will make my code more concise, more consistent (one less extension to care about), and will also give me more flexibility such as with open type relationships or bijective relations (potentially the latter is solved by GHC 8).
Here's an example of where TypeFamilies really shines compared to MultiParamTypeClasses with FunctionalDependencies. In fact, I challenge you to come up with an equivalent MultiParamTypeClasses solution, even one that uses FlexibleInstances, OverlappingInstances, etc.
Consider the problem of type level substitution (I ran across a specific variant of this in Quipper in QData.hs). Essentially what you want to do is recursively substitute one type for another. For example, I want to be able to
substitute Int for Bool in Either [Int] String and get Either [Bool] String,
substitute [Int] for Bool in Either [Int] String and get Either Bool String,
substitute [Int] for [Bool] in Either [Int] String and get Either [Bool] String.
All in all, I want the usual notion of type level substitution. With a closed type family, I can do this for any types (albeit I need an extra line for each higher-kinded type constructor - I stopped at * -> * -> * -> * -> *).
{-# LANGUAGE TypeFamilies #-}
-- Replace every occurrence of type `x` with type `y` inside type `a`
type family Substitute x y a where
  Substitute x y x = y
  Substitute x y (k a b c d) = k (Substitute x y a) (Substitute x y b) (Substitute x y c) (Substitute x y d)
  Substitute x y (k a b c) = k (Substitute x y a) (Substitute x y b) (Substitute x y c)
  Substitute x y (k a b) = k (Substitute x y a) (Substitute x y b)
  Substitute x y (k a) = k (Substitute x y a)
  Substitute x y a = a
And trying at ghci I get the desired output:
> :t undefined :: Substitute Int Bool (Either [Int] String)
undefined :: Either [Bool] [Char]
> :t undefined :: Substitute [Int] Bool (Either [Int] String)
undefined :: Either Bool [Char]
> :t undefined :: Substitute [Int] [Bool] (Either [Int] String)
undefined :: Either [Bool] [Char]
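The same three results can also be pinned down at compile time instead of being read off GHCi output. The sketch below assumes a trimmed copy of the family (constructors with at most two arguments, to keep it short) and uses Refl from Data.Type.Equality; the check1 to check3 names are made up. If any equation reduced differently, the module would fail to typecheck.

```haskell
{-# LANGUAGE TypeFamilies, TypeOperators #-}
import Data.Type.Equality ((:~:) (Refl))

-- Trimmed copy of the family above: only constructors of up to two arguments.
type family Substitute x y a where
  Substitute x y x = y
  Substitute x y (k a b) = k (Substitute x y a) (Substitute x y b)
  Substitute x y (k a) = k (Substitute x y a)
  Substitute x y a = a

-- Each Refl is accepted only if both sides reduce to the same type.
check1 :: Substitute Int Bool (Either [Int] String) :~: Either [Bool] String
check1 = Refl

check2 :: Substitute [Int] Bool (Either [Int] String) :~: Either Bool String
check2 = Refl

check3 :: Substitute [Int] [Bool] (Either [Int] String) :~: Either [Bool] String
check3 = Refl
```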
With that said, maybe you should be asking yourself why you are using MultiParamTypeClasses and not TypeFamilies. Of the examples you gave above, all except Convert translate to type families (albeit you will need an extra line per instance for the type declaration).
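As a rough illustration of that translation, here is how two of the question's classes might look with associated types. The family names Len and Quotient are invented for this sketch, and only instances that need nothing beyond base are shown.

```haskell
{-# LANGUAGE TypeFamilies, MultiParamTypeClasses, FlexibleInstances #-}

-- HasLength: the functional dependency `a -> b` becomes an associated type.
class HasLength a where
  type Len a
  getLength :: a -> Len a

instance HasLength [e] where
  type Len [e] = Int
  getLength = length

-- Divide: still two class parameters, but the result type is computed from them.
class Divide a b where
  type Quotient a b
  divide :: a -> b -> Quotient a b

instance Divide Int Int where
  type Quotient Int Int = Int
  divide = div

instance Divide Int Double where
  type Quotient Int Double = Double
  divide x y = fromIntegral x / y
```

The extra type line per instance is the cost mentioned above; in return, Quotient Int Double can be used as an ordinary type in other signatures.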
Then again, for Convert, I am not convinced it is a good idea to define such a thing. The natural extension to Convert would be instances such as
instance (Convert a b, Convert b c) => Convert a c where
  convert = convert . convert
instance Convert a a where
  convert = id
which are as unresolvable for GHC as they are elegant to write...
To be clear, I am not saying there are no uses of MultiParamTypeClasses, just that when possible you should be using TypeFamilies - they let you think about type-level functions instead of just relations.
This old HaskellWiki page does an OK job of comparing the two.
EDIT
Some more contrasting and history I stumbled upon from augustss blog
Type families grew out of the need to have type classes with
associated types. The latter is not strictly necessary since it can be
emulated with multi-parameter type classes, but it gives a much nicer
notation in many cases. The same is true for type families; they can
also be emulated by multi-parameter type classes. But MPTC gives a
very logic programming style of doing type computation; whereas type
families (which are just type functions that can pattern match on the
arguments) is like functional programming.
Using closed type families
adds some extra strength that cannot be achieved by type classes. To
get the same power from type classes we would need to add closed type
classes. Which would be quite useful; this is what instance chains
gives you.
Functional dependencies only affect the process of constraint solving, while type families introduced the notion of non-syntactic type equality, represented in GHC's intermediate form by coercions. This means type families interact better with GADTs. See this question for the canonical example of how functional dependencies fail here.

How to overload a function for multiplying [Double] in Haskell (ad-hoc polymorphism)?

The way to have ad-hoc polymorphism (function overloading) in Haskell is through type classes (see answers to this, this and this question, among others).
But I'm struggling to define an overloaded mult (product) function for the following cases:
mult: [Double] -> Double -> [Double]
mult: Double -> [Double] -> [Double]
mult: [Double] -> [Double] -> [Double]
Thanks
(At least, case 1 [Double]*Double and case 3 [Double]*[Double] would be necessary).
As always, statements like "I'm trying (with no success) this" are not quite as useful as you would like: it's good that you included your code, but if you are getting an error message from the compiler, tell us what it is! They're very instructive, and are printed for a reason.
I just tried what you wrote, and this is in fact the error message you are (probably) getting:
*Multiplication> mul 1 [2]
Non type-variable argument
in the constraint: Multipliable ta [t] tc
(Use FlexibleContexts to permit this)
When checking that ‘it’ has the inferred type
it :: forall ta tc t. (Num ta, Num t, Multipliable ta [t] tc) => tc
Now, you could try just turning on FlexibleContexts, but that doesn't seem to solve the problem. But, as is often the case when the compiler is telling you it's having trouble inferring types, you should try adding some explicit types and see if that helps:
*Multiplication> mul (1::Double) [2 :: Double]
[2.0]
Basically, the compiler can't be sure which overload of mul you want: 1 and 2 are polymorphic and could be any numeric type, and while there is only one suitable overload for mul now, the compiler doesn't make such an inference unless it can prove no other overload could ever exist in this context. Fully specifying the argument types is enough to resolve the problem.
An alternative approach to this particular problem is to use a typeclass for each argument, to convert it into the canonical type [Double], rather than a typeclass for the arguments as a whole. This is a more specific solution than general ad hoc polymorphism, and not all problems will fit, but for something like treating a single number like a list of numbers it should be fine:
module Multiplication where
import Control.Monad (liftM2)
class AsDoubles a where
  doubles :: a -> [Double]
instance AsDoubles Double where
  doubles = return
instance AsDoubles [Double] where
  doubles = id
mult :: (AsDoubles a, AsDoubles b) => a -> b -> [Double]
mult x y = liftM2 (*) (doubles x) (doubles y)
*Multiplication> mult [(1 :: Double)..5] [(1 :: Double)..3]
[1.0,2.0,3.0, -- whitespace added for readability
2.0,4.0,6.0,
3.0,6.0,9.0,
4.0,8.0,12.0,
5.0,10.0,15.0]
I've managed to do it this way. Certainly not very nice.
I think everyone should consider the comments and criticism by leftaroundabout on the question, which I quote below for convenience and relevance.
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
class Multipliable ta tb tc | ta tb -> tc where
  mul :: ta -> tb -> tc
instance Multipliable [Double] Double [Double] where
  mul p k = map (*k) p -- same as: mul p k = map (\a -> k * a) p
instance Multipliable Double [Double] [Double] where
  mul k p = map (*k) p -- same as: mul k p = map (\a -> k * a) p
instance Multipliable [Double] [Double] [Double] where
  mul p q = p -- dummy implementation
r = [1.0, 2.0, 3.0] :: [Double]
r1 = (mul :: [Double] -> Double -> [Double]) r 2.0
r2 = (mul :: Double -> [Double] -> [Double]) 2.0 r
r3 = (mul :: [Double] -> [Double] -> [Double]) r1 r2
main = do
  print r1
  print r2
  print r3
Why do you want this anyway? Just because Matlab allows multiplying
anything you throw at it doesn't mean this is a good idea. Check out
vector-space for properly dealing with
multidimensional-multiplications. Alternatively, if you don't care so
much for mathematical elegance, you can use hmatrix (which is in fact
a lot like Matlab/Octave in Haskell), or linear.
I think it's a bad idea in general, and really unnecessary in Haskell because you can just write map (*x) ys or zipWith (*) xs ys
to make your intent explicit. This of course doesn't work for
polymorphic code that's supposed to handle both scalars and vectors –
however, writing such code to just deal with scalars or lists of any
length is rather asking for trouble. It's awkward to specify which
list needs to have a length matching which other list and what length
the result will be etc.. This is where vector-space or linear shine,
because they check dimensions at compile time.
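For comparison, the explicit style recommended in the quoted comment needs no type class at all. This is only a sketch; the names scale and elementwise are made up for the example.

```haskell
-- Scalar times vector: the scalar is applied to every element.
scale :: Double -> [Double] -> [Double]
scale k = map (* k)

-- Element-wise product. zipWith silently truncates to the shorter list,
-- which is the kind of length ambiguity the comment above warns about.
elementwise :: [Double] -> [Double] -> [Double]
elementwise = zipWith (*)
```

Because each operation has its own name, the call site states which multiplication is meant, and no annotations are needed to pick an overload.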

OCaml functors (parametrized modules) emulation in Haskell

Is there any recommended way to use typeclasses to emulate OCaml-like parametrized modules?
For instance, I need a module that implements a complex
generic computation that may be parameterized with different
types, functions, etc. To be more specific, let it be a
kMeans implementation that could be parameterized with different
types of values, vector types (list, unboxed vector, vector, tuple, etc.),
and distance calculation strategies.
For convenience, to avoid a crazy number of intermediate types, I want to
make this computation polymorphic in a DataSet class that contains all
required interfaces. I also tried to use TypeFamilies to avoid a lot
of typeclass parameters (which cause problems as well):
{-# Language MultiParamTypeClasses
           , TypeFamilies
           , FlexibleContexts
           , FlexibleInstances
           , EmptyDataDecls
           , FunctionalDependencies
  #-}
module Main where
import qualified Data.List as L
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U
import Distances
-- contains instances for Euclid distance
-- import Distances.Euclid as E
-- contains instances for Kullback-Leibler "distance"
-- import Distances.Kullback as K
class ( Num (Elem c)
      , Ord (TLabel c)
      , WithDistance (TVect c) (Elem c)
      , WithDistance (TBoxType c) (Elem c)
      )
      => DataSet c where
  type Elem c :: *
  type TLabel c :: *
  type TVect c :: * -> *
  data TDistType c :: *
  data TObservation c :: *
  data TBoxType c :: * -> *
  observations :: c -> [TObservation c]
  measurements :: TObservation c -> [Elem c]
  label :: TObservation c -> TLabel c
  distance :: TBoxType c (Elem c) -> TBoxType c (Elem c) -> Elem c
  distance = distance_
instance DataSet () where
  type Elem () = Float
  type TLabel () = Int
  data TObservation () = TObservationUnit [Float]
  data TDistType ()
  type TVect () = V.Vector
  data TBoxType () v = VectorBox (V.Vector v)
  observations () = replicate 10 (TObservationUnit [0,0,0,0])
  measurements (TObservationUnit xs) = xs
  label (TObservationUnit _) = 111
kMeans :: ( Floating (Elem c)
          , DataSet c
          ) => c
       -> [TObservation c]
kMeans s = undefined -- here the implementation
  where
    labels = map label (observations s)
    www = L.map (V.fromList . measurements) (observations s)
    zzz = L.zipWith distance_ www www
    wtf1 = L.foldl wtf2 0 (observations s)
    wtf2 acc xs = acc + L.sum (measurements xs)
    qq = V.fromList [1,2,3 :: Float]
    l = distance (VectorBox qq) (VectorBox qq)
instance Floating a => WithDistance (TBoxType ()) a where
  distance_ xs ys = undefined
instance Floating a => WithDistance V.Vector a where
  -- Euclidean distance uses coordinate differences (x-y), not sums
  distance_ xs ys = sqrt $ V.sum (V.zipWith (\x y -> (x-y)**2) xs ys)
This code somehow compiles and works, but it's pretty ugly and hacky.
kMeans should be parameterized by value type (number, floating-point number, anything),
box type (vector, list, unboxed vector, maybe tuple) and distance calculation strategy.
There are also types for Observation (the type of sample provided by the user;
there should be a lot of them, with measurements contained in each observation).
So the problems are:
1) If a function does not contain the parametric types in its signature,
the types will not be deduced.
2) I still have no idea how to declare the typeclass WithDistance so that it has different instances
for different distance types (Euclid, Kullback, anything else via phantom types).
Right now WithDistance is polymorphic only in box type and value type, so if we need
different strategies, we can only put them in different modules and import the required
module. But this is a hack and a non-typed approach, right?
All of this can be done pretty easily in OCaml with its modules. What is the proper approach
to implementing such things in Haskell?
Typeclasses with TypeFamilies look somewhat similar to parametric modules, but they
work differently. I really need something like that.
It is really the case that Haskell lacks useful features found in *ML module systems.
There is ongoing effort to extend Haskell's module system: http://plv.mpi-sws.org/backpack/
But I think you can get a bit further without those ML modules.
Your design follows the God class anti-pattern, and that is why it is anti-modular.
A type class is useful only when every type needs no more than a single instance of that class. E.g. the DataSet () instance fixes type TVect () = V.Vector, and you can't easily create a similar instance with TVect = U.Vector.
You need to start by implementing the kMeans function, then generalize it by replacing concrete types with type variables and constraining those type variables with type classes when needed.
Here is a little example. At first you have some non-general implementation:
kMeans :: Int -> [(Double,Double)] -> [[(Double,Double)]]
kMeans k points = ...
Then you generalize it by distance calculation strategy:
kMeans
  :: Int
  -> ((Double,Double) -> (Double,Double) -> Double)
  -> [(Double,Double)]
  -> [[(Double,Double)]]
kMeans k distance points = ...
Now you can generalize it by type of points, but this requires introducing a class that will capture some properties of points that are used by distance computation e.g. getting list of coordinates:
kMeans
  :: Point p
  => Int -> (p -> p -> Coord p) -> [p]
  -> [[p]]
kMeans k distance points = ...
class Num (Coord p) => Point p where
  type Coord p
  coords :: p -> [Coord p]
-- Euclidean distance: square root of the summed squared coordinate differences
euclidianDistance
  :: (Point p, Floating (Coord p))
  => p -> p -> Coord p
euclidianDistance a b
  = sqrt $ sum $ map (**2) $ zipWith (-) (coords a) (coords b)
Now you may wish to make it a bit faster by replacing lists with vectors:
kMeans
  :: (Point p, DataSet vec p)
  => Int -> (p -> p -> Coord p) -> vec p
  -> [vec p]
kMeans k distance points = ...
class DataSet vec p where
  map :: ...
  foldl' :: ...
instance Unbox p => DataSet U.Vector p where
  map = U.map
  foldl' = U.foldl'
And so on.
The suggested approach is to generalize various parts of the algorithm and constrain those parts with small, loosely coupled type classes (when required).
It is bad style to collect everything in a single monolithic type class.
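To make the idea concrete, here is a small runnable sketch of one such piece: a Point class that only exposes coordinates, two interchangeable distance strategies passed as plain functions, and a helper that picks the nearest centroid. The P2 type and the names euclidean, taxicab and nearest are assumptions made for this example, and the Num superclass from the class above is replaced by per-function constraints to keep the extension list short. It is not a full kMeans.

```haskell
{-# LANGUAGE TypeFamilies, FlexibleContexts #-}
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Small class: all a point has to provide is its list of coordinates.
class Point p where
  type Coord p
  coords :: p -> [Coord p]

-- Hypothetical concrete point type for the example.
data P2 = P2 Double Double deriving (Eq, Show)

instance Point P2 where
  type Coord P2 = Double
  coords (P2 x y) = [x, y]

-- Two distance strategies; either can be passed where a distance is expected.
euclidean :: (Point p, Floating (Coord p)) => p -> p -> Coord p
euclidean a b = sqrt (sum [d * d | d <- zipWith (-) (coords a) (coords b)])

taxicab :: (Point p, Num (Coord p)) => p -> p -> Coord p
taxicab a b = sum (map abs (zipWith (-) (coords a) (coords b)))

-- Nearest centroid under the given strategy (the centroid list must be non-empty).
nearest :: Ord d => (p -> p -> d) -> [p] -> p -> p
nearest dist centroids x = minimumBy (comparing (dist x)) centroids
```

Since the strategy is an ordinary argument, switching from Euclidean to another distance means passing a different function, with no extra instances or module imports.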
