Haskell - Making Int an instance of my class - haskell

When I try to make Int an instance of my class Gueltig by:
instance Gueltig Int where
ist_gueltig (Int a) = Ja
, why do I get the error message "Undefined data constructor "Int""?
How do I make instances of Int?
Thanks for your help!

There's nothing to pattern-match here. You don't care about the value of the integer, just that it is an integer. Which is already proven by the compiler before even choosing that instance. So, just make it
instance Gültig Int where
istGültig _ = Ja
Alternatively you could also write istGültig a = Ja, but variables that aren't used should not have names (in fact, -Wall will trigger a warning if they do).
OTOH, Int a would only be valid if there was a data type like
data Int = Int {intData :: ???}
In fact Int does kind of look like this, but this is a GHC implementation detail:
data {-# CTYPE "HsInt" #-} Int = I# Int#
Here, Int# is a hard-baked machine representation type. It's an unlifted type, meaning it behaves differently in some regards from standard Haskell values. You don't want to deal with this unless you really know why.
Normally, you should just treat Int as an abstract type that can't be further unwrapped or anything. Just use it directly with standard numeric / comparison operators.
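Putting the pieces above together, a minimal compiling sketch might look like this (the result type Gueltigkeit with constructors Ja/Nein is assumed here, since the question only shows Ja):

```haskell
-- Assumed result type: the question only shows the constructor Ja.
data Gueltigkeit = Ja | Nein deriving Show

class Gueltig a where
  ist_gueltig :: a -> Gueltigkeit

instance Gueltig Int where
  ist_gueltig _ = Ja  -- no constructor pattern needed for Int

main :: IO ()
main = print (ist_gueltig (3 :: Int))
```

The wildcard `_` is all that's needed: the compiler has already established the argument is an Int by selecting this instance.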

Related

Pass no char to a function that is expecting it in Haskell

I am working with Haskell and I have defined the following type
--Build type Transition--
data Transition = Transition {
start_state :: Int,
symbol :: Char,
end_state :: Int
} deriving Show
and I would like to be able to define the following Transition
Transition 0 '' 1
which would mean "a transition given by no symbol" (I need it to compute the epsilon closure of an NFA). How can I do this?
Thank you!
Well, the idea of defining a type is that every value you pass to that field is a "member" of that type. Char contains only characters (and the empty string is not a character), plus undefined (but it is advisable not to use undefined here).
Usually in case you want to make values optional, you can use a Maybe a type instead, so:
data Transaction = Transaction {
start_state :: Int,
symbol :: Maybe Char,
end_state :: Int
} deriving Show
So now we can pass two kinds of values: Nothing which thus should be interpreted as "no character", or Just x, with x a character, and this thus acts as a character, so in your case, that would be:
Transaction 0 Nothing 1
Maybe is also an instance of Functor, Applicative and Monad, which should make working with Maybe types quite convenient (yes, it can sometimes introduce some extra work, but by using fmap, etc., the amount of extra pattern matching after switching to Maybe Char should be rather low).
Note: like @amalloy says, an NFA (and DFA) has Transitions, not Transactions.
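A minimal runnable sketch of the Maybe Char field in use, with epsilon transitions as the Nothing case (the helper epsilonFrom is a name invented here, not part of the question's code):

```haskell
data Transition = Transition
  { start_state :: Int
  , symbol      :: Maybe Char
  , end_state   :: Int
  } deriving Show

-- States reachable from s by a single epsilon (no-symbol) transition.
epsilonFrom :: Int -> [Transition] -> [Int]
epsilonFrom s ts =
  [ end_state t | t <- ts, start_state t == s, symbol t == Nothing ]

main :: IO ()
main = print (epsilonFrom 0 [Transition 0 Nothing 1, Transition 0 (Just 'a') 2])
```

An epsilon-closure computation would then just iterate epsilonFrom to a fixed point.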

Safest way to generate random GADT with Hedgehog (or any other property-based testing framework)

I have GADT like this one:
data TType a where
TInt :: TType Int
TBool :: TType Bool
I want to have a function like this one:
genTType :: Gen (TType a)
Which can generate a random constructor of the TType type. I can do this simply by creating an existentially quantified data type like
data AnyType = forall a . MkAnyType (TType a)
then generate a random number from 0 to 1 (inclusive) and create AnyType depending on the integer value. Like this:
intToAnyType :: Int -> AnyType
intToAnyType 0 = MkAnyType TInt
intToAnyType 1 = MkAnyType TBool
intToAnyType _ = error "Impossible happened"
But this approach has a couple of drawbacks to me:
No external type safety. If I add another constructor to the TType data type I can forget to fix the tests, and the compiler won't warn me about this.
The compiler can't stop me from writing intToAnyType 1 = MkAnyType TInt.
I don't like this error. The Int type is too broad for me. It would be nice to make this pattern matching exhaustive.
What can I do in Haskell to eliminate as much as possible drawbacks here? Preferably using generators from this module:
https://hackage.haskell.org/package/hedgehog-0.5.1/docs/Hedgehog-Gen.html
Generating genTType with Template Haskell is probably your best bet to automate maintenance of the generators, because there is no generic programming support for GADTs.
For your last point, instead of generating an integer and then mapping it to a value, use oneof or element.
element [MkAnyType TInt, MkAnyType TBool]
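Leaving the Hedgehog generator itself aside, the enumerate-once idea can be sketched in plain Haskell (allTypes and describe are names made up here); element or oneof would then pick from allTypes:

```haskell
{-# LANGUAGE GADTs, ExistentialQuantification #-}

data TType a where
  TInt  :: TType Int
  TBool :: TType Bool

data AnyType = forall a. MkAnyType (TType a)

-- One list that enumerates every constructor; a generator picks from this.
allTypes :: [AnyType]
allTypes = [MkAnyType TInt, MkAnyType TBool]

-- With -Wall, adding a TType constructor but not a case here triggers
-- an incomplete-pattern warning, addressing the first drawback.
describe :: AnyType -> String
describe (MkAnyType TInt)  = "TInt"
describe (MkAnyType TBool) = "TBool"

main :: IO ()
main = mapM_ (putStrLn . describe) allTypes
```

This doesn't make the list itself exhaustive-by-construction (Template Haskell can automate that), but it concentrates the maintenance in one place and removes the error case entirely.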

Understanding `GHC.TypeLits`

I'm trying to wrap my head around the GHC extensions KindSignatures and DataKinds. Looking at the Data.Modular package, I understand roughly that
newtype i `Mod` (n :: Nat) = Mod i deriving (Eq, Ord)
is sort of equivalent to declaring a C++ template <typename T, int N> (with the constructor taking only one argument of type T). However, looking at the GHC.TypeLits package, I don't understand most of what is happening. Any general explanation about this package would be helpful. Before this question gets flagged as off-topic though, here are some specific sub-questions:
A KnownNat class makes sense, with its required method letting you extract the number from the type, but what does natVal do, and what is the proxy type variable?
Where would you use someNatVal?
Finally, what is SomeNat - how can a type level number be unknown? Isn't the whole point of a type level number that it is known at compile time?
This question is quite broad -- I'll address a few points, only.
The proxy type variable is just a type variable of kind * -> *, i.e. the kind of type constructors. Pragmatically, if you have a function
foo :: proxy a -> ...
you can pass to it values of type e.g. Maybe Int, choosing proxy = Maybe and a = Int. You can pass also values of type [] Char (also written as [Char]). Or, more commonly, a value of type Proxy Int, where Proxy is a data type defined as
data Proxy a = Proxy
i.e. a data type which does not carry any run-time information (there's a single value for it!), but which carries compile-time information (the phantom type variable a).
Assume N is a type of kind Nat -- a compile-time natural. We can write a function
bar :: N -> ...
but calling this would require us to build a value of type N -- which is immaterial. The purpose of type N is to carry compile-time information, only, and its run-time values are not things that we really want to use. In fact, N could have no values at all, except for bottom. We could call
bar (undefined :: N)
but this looks weird. Reading this, we must realize that bar is lazy in its first argument and that it will not cause divergence trying to use it. The problem is that the bar :: N -> ... type signature is misleading: it claims that the result may depend on the value of the argument of type N, when this is not really the case. Instead, if we use
baz :: Proxy N -> ...
the intent is clear -- there's only a single run-time value for that: Proxy :: Proxy N. It is equally clear that the N value is only present at compile time.
Sometimes, instead of using the specific Proxy N, the code is slightly generalized to
foo :: proxy N -> ...
which achieves the same goal, but permits also different Proxy types. (Personally, I'm not terribly thrilled by this generalization.)
Back to the question: natVal is a function which turns a compile-time-only natural into a run-time value. I.e. it converts Proxy N into an Integer, returning just the constant.
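This is concrete enough to run with nothing but base (the literal 42 lives only at the type level, and natVal recovers it at run time):

```haskell
{-# LANGUAGE DataKinds #-}
import Data.Proxy (Proxy (..))
import GHC.TypeLits (natVal)

main :: IO ()
main =
  -- Proxy :: Proxy 42 carries no run-time data; the 42 is in the type
  print (natVal (Proxy :: Proxy 42))
```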
Your analogy with C++ templates might get closer if you used type template arguments to model compile-time naturals. E.g.
#include <iostream>
using std::cout; using std::endl;
template <typename N> struct S { using pred = N; };
struct Z {};
template <typename N> int natVal();
template <typename N> int natVal() { return 1 + natVal<typename N::pred>(); }
template <> int natVal<Z>() { return 0; }
int main() {
cout << natVal<S<S<Z>>>() << endl; // outputs 2
return 0;
}
Just pretend there are no public constructors for S and Z: their run-time values are unimportant, only their compile-time information matters.

Changing a single record field to be strict leads to worse performance

I have a program that uses haskell-src-exts, and to improve performance I decided to make some record fields strict. This resulted in much worse performance.
Here's the complete module that I'm changing:
{-# LANGUAGE DeriveDataTypeable, BangPatterns #-}
module Cortex.Hackage.HaskellSrcExts.Language.Haskell.Exts.SrcSpan(
SrcSpan, srcSpan, srcSpanFilename, srcSpanStartLine,
srcSpanStartColumn, srcSpanEndLine, srcSpanEndColumn,
) where
import Control.DeepSeq
import Data.Data
data SrcSpan = SrcSpanX
{ srcSpanFilename :: String
, srcSpanStartLine :: Int
, srcSpanStartColumn :: Int
, srcSpanEndLine :: Int
, srcSpanEndColumn :: Int
}
deriving (Eq,Ord,Show,Typeable,Data)
srcSpan :: String -> Int -> Int -> Int -> Int -> SrcSpan
srcSpan fn !sl !sc !el !ec = SrcSpanX fn sl sc el ec
instance NFData SrcSpan where
rnf (SrcSpanX x1 x2 x3 x4 x5) = rnf x1
Note that the only way to construct a SrcSpan is by using the srcSpan function which is strict in all the Ints.
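That bang patterns really do force the Ints before the record is built can be checked in isolation (P and mkP are toy names invented here, standing in for SrcSpanX and srcSpan):

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, evaluate, try)

data P = P { field :: Int }

-- Like srcSpan: the bang forces the argument before the record is built,
-- so the field never stores a thunk even without a ! on the field itself.
mkP :: Int -> P
mkP !n = P n

main :: IO ()
main = do
  -- Forcing the result to WHNF triggers the bang, which hits undefined.
  r <- try (evaluate (mkP undefined)) :: IO (Either SomeException P)
  putStrLn (either (const "argument was forced") (const "argument was lazy") r)
```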
With this code my program (sorry, I can't share it) runs in 163s.
Now change a single line, e.g.,
, srcSpanStartLine :: !Int
I.e., the srcSpanStartLine field is now marked as strict. My program now takes 198s to run. So making that one field strict increases the running time by about 20%.
How is this possible? The code for the srcSpan function should be the same regardless since it is already strict. The code for the srcSpanStartLine selector should be a bit simpler since it no longer has to evaluate.
I've experimented with -funbox-strict-fields and -funbox-small-strict-fields on and off. It doesn't make any noticeable difference. I'm using GHC 7.8.3.
Has anyone seen something similar? Any bright ideas what might cause it?
With some more investigation I can answer my own question. The short answer is uniplate.
Slightly longer answer. In one place I used uniplate to get the children of a Pat (haskell-src-exts type for patterns). The call looked like children p and the type of this instance of children was Pat SrcSpanInfo -> [Pat SrcSpanInfo]. So it's doing no recursion, just returning the immediate children of a node.
Uniplate uses two very different methods depending on whether there are strict fields in the type you're operating on. Without strict fields it's reasonably fast; with strict fields it switches to using gfoldl and is incredibly slow. And even though my use of uniplate didn't directly involve a strict field, it slowed down.
Conclusion: Beware uniplate if you have a strict field anywhere in sight!

Overloading function signatures haskell

I get the following error message when I compile:
Duplicate type signature:
weightedMedian.hs:71:0-39: findVal :: [ValPair] -> Double -> Double
weightedMedian.hs:68:0-36: findVal :: [ValPair] -> Int -> Double
My solution is to have findValI and findValD. However, findValI just converts the Int type to a Double and calls findValD.
Also I can't pattern match on types of Num (Int, Double) so I can't just change the type signature to
findVal :: [ValPair] -> Num -> Double
In many languages I wouldn't need different names. Why do I need different names in Haskell? Would this be hard to add to the language? Or are there dragons there?
Ad-hoc polymorphism (and name overloading) are provided in Haskell by typeclasses:
class CanFindVal a where
findVal :: [ValPair] -> a -> Double
instance CanFindVal Double where
findVal xs d = ...
instance CanFindVal Int where
findVal xs d = findVal xs (fromIntegral d :: Double)
Note that in this case, since findVal "really" needs a Double, I'd just always have it take a Double, and when I needed to pass it an Int, just use fromIntegral at the call site. You generally want typeclasses when there's actually different behavior or logic involved, rather than using them promiscuously just to share a name.
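To make the class above actually compile and run, here is a toy version (ValPair and the Double instance body are placeholders invented here, not the asker's weighted-median logic):

```haskell
type ValPair = (Double, Double)  -- placeholder: (value, weight)

class CanFindVal a where
  findVal :: [ValPair] -> a -> Double

instance CanFindVal Double where
  -- toy body just so the example runs; the real logic would go here
  findVal xs d = sum (map fst xs) + d

instance CanFindVal Int where
  -- the Int instance just delegates to the Double one
  findVal xs d = findVal xs (fromIntegral d :: Double)

main :: IO ()
main = print (findVal [(1, 1), (2, 1)] (3 :: Int))
```

The Int call site resolves to the Int instance, which converts and re-dispatches, exactly the delegation pattern the asker described.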
Supporting both findVal :: [ValPair] -> Double -> Double and findVal :: [ValPair] -> Int -> Double requires ad-hoc polymorphism (see http://www.haskell.org/haskellwiki/Ad-hoc_polymorphism) which is generally dangerous. The reason is that ad-hoc polymorphism allows for changing semantics with the same syntax.
Haskell prefers what is called parametric polymorphism. You see this all the time with type signatures, where you have a type variable.
Haskell supports a safer version of ad-hoc polymorphism via type classes.
You have three options.
Continue what you are doing with an explicit function name. This is reasonable; it is even used by some C libraries, for example OpenGL.
Use a custom type class. This is probably the best way, but it is heavy and requires a fair amount of code (by Haskell's very compact standards). Look at sclv's answer for code.
Try using an existing type class and (if you use GHC) get the performance with specializations.
Like this:
findVal :: Real a => [ValPair] -> a -> Double
{-# SPECIALISE findVal :: [ValPair] -> Int -> Double #-}
{-# SPECIALISE findVal :: [ValPair] -> Double -> Double #-}
findVal = ...
Haskell does not support C++-style overloading (well, it sort of does with typeclasses, but we don't use them in the same way). And yeah, there are some dragons associated with adding it, mostly having to do with type inference (it becomes exponential time or undecidable or something like that). However, seeing "convenience" code like this is pretty uncommon in Haskell. Which one is it, an Int or a Double? Since your Int method delegates to the Double method, my guess is that Double is the "correct" one. Just use that one. Because of literal overloading, you can still call it as:
findVal whatever 42
And the 42 will be treated as a Double. The only case where this breaks is if you got something that is fundamentally an Int somewhere and you need to pass it as this argument. Then use fromIntegral. But if you strive to have your code use the "correct" type everywhere, this case will be uncommon (and when you do have to convert, it will be worth drawing attention to that).
In this case, I think it's easy to write a function that handles both Int and Double for the second argument. Just write findVal so that it calls realToFrac on the second argument. That will convert an Int to a Double and just leave a Double alone. Then let the compiler deduce the type for you, if you are lazy.
In many other programming languages you can declare (sort of) functions having the same name but different other things in their signatures, such as different parameter types. That is called overloading and certainly is the most popular way to achieve ad-hoc polymorphism.
Haskell deliberately does NOT support overloading because its designers don't consider it the best way to achieve ad-hoc polymorphism. The Haskell way is rather constrained polymorphism, and it involves declaring type classes and class instances.

Resources