Add with carry on Word8 - haskell

I can't find a function addWithCarry :: Word8 -> Word8 -> (Word8, Bool) already defined in base. The only function documented as caring about carries seems to be addIntC# in GHC.Prim, but it never seems to be pushed upwards through the various abstraction layers.
I could obviously roll my own by testing whether the output value is in range, which is in fact what I am currently doing, but I'd rather reuse a (potentially more efficient) predefined one.
Is there such a thing?
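(For reference, the range-check approach described above might look something like this; widening to Word16 is my own guess at the details, since the question doesn't show its code:)
import Data.Word (Word8, Word16)

addWithCarryNaive :: Word8 -> Word8 -> (Word8, Bool)
addWithCarryNaive x y = (fromIntegral s, s > 255)
  where s = fromIntegral x + fromIntegral y :: Word16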

If you look at the source for Word8's Num instance, you'll see that everything is done by converting to a Word# unboxed value, performing operations on that, and then narrowing down to an 8-bit value. I suspected that doing the comparison on that Word# value would be more efficient, so I implemented such a thing. It's available on lpaste (which I find easier to read than StackOverflow).
Note that it includes both a test suite and a Criterion benchmark. On my system, the various tests take ~31ns for the boxed version (user5402's implementation) and ~24ns for the primops versions.
The important function from the lpaste above is primops, which is:
{-# LANGUAGE MagicHash #-}
import GHC.Word (Word8(..))  -- imports added for completeness; W8# wraps a raw Word# on the GHC of that era
import GHC.Exts (plusWord#, gtWord#, narrow8Word#, isTrue#)

primops :: Word8 -> Word8 -> (Word8, Bool)
primops (W8# x#) (W8# y#) =
    (W8# (narrow8Word# z#), isTrue# (gtWord# z# 255##))
  where
    z# = plusWord# x# y#

A way to do this is:
addWithCarry :: Word8 -> Word8 -> (Word8, Bool)
addWithCarry x y = (z, carry)
  where
    z     = x + y
    carry = z < x  -- Word8 addition wraps around, so the sum is smaller than x exactly when it overflowed
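A quick check (my own examples):
ghci> addWithCarry 200 100
(44,True)
ghci> addWithCarry 1 2
(3,False)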

Related

A recursion scheme from Int -> Int?

The foldr identity is
foldr (:) []
More generally, with folds you can either destroy structure and end up with a summary value or inject structure in such a way that you end up with the same output structure.
[Int] -> [Int]
or
[Int] -> Int
or
[Int] -> ?
I'm wondering if there is a similar identity with unfoldr/unfoldl.
I know how to get
Int -> [Int]
with unfold/ana.
I'm looking for some kind of way to go from
Int -> Int
with a recursion scheme.
Taking a cue from your remark about factorials, we can note that natural numbers can be treated as a recursive data structure:
data Nat = Zero | Succ Nat
In terms of the recursion-schemes machinery, the corresponding base functor would be:
data NatF a = ZeroF | SuccF a
    deriving (Functor)
NatF, however, is isomorphic to Maybe. That being so, recursion-schemes conveniently makes Maybe the base functor of the Natural type from base. For instance, here is the type of ana specialised to Natural:
ana @Natural :: (a -> Maybe a) -> a -> Natural
We can use it to write the identity unfold for Natural:
{-# LANGUAGE LambdaCase #-}

import Numeric.Natural
import Data.Functor.Foldable

idNatAna :: Natural -> Natural
idNatAna = ana $ \case
    0 -> Nothing
    x -> Just (x - 1)
The coalgebra we just gave to ana is project for Natural, project being the function that unwraps one layer of the recursive structure. In terms of the recursion-schemes vocabulary, ana project is the identity unfold, and cata embed is the identity fold. (In particular, project for lists is uncons from Data.List, except that it is encoded with ListF instead of Maybe.)
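For completeness, the identity fold for Natural can be written the same way (my own addition, using the imports from the snippet above):
idNatCata :: Natural -> Natural
idNatCata = cata embed
-- project 3 == Just 2 and project 0 == Nothing; embed goes the other way.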
By the way, the factorial function can be expressed as a paramorphism on naturals (as pointed out in the note at the end of this question). We can also implement that in terms of recursion-schemes:
fact :: Natural -> Natural
fact = para $ \case
    Nothing -> 1
    Just (predec, prod) -> prod * (predec + 1)
para makes available, at each recursive step, the rest of the structure to be folded (if we were folding a list, that would be its tail). In this case, I have called the value thus provided predec because at the n-th recursive step from bottom to top predec is n - 1.
Note that user11228628's hylomorphism is probably a more efficient implementation, if you happen to care about that. (I haven't benchmarked them, though.)
The kind of recursion scheme that deals with building up an intermediate structure and tearing it down, so that the structure doesn't appear in the input or output, is a hylomorphism, spelled hylo in recursion-schemes.
To use a hylomorphism, you need to specify an algebra (something that consumes one step of a recursive structure) and a coalgebra (something that produces one step of a recursive structure), and you need to have a data type for the kind of structure you're using, of course.
You suggested factorial, so let's look into how to write that as a hylomorphism.
One way to look at factorial is as the product of a list of numbers counting down from the initial n. In this framing, we can think of the product as our algebra, tearing down the list one cons at a time, and the count-down as our coalgebra, building up the list as n is decremented.
recursion-schemes gives us ListF as a handy base functor for lists, so we'll use that as the data type produced by the coalgebra and consumed by the algebra. Its constructors are Nil and Cons, which of course resemble the constructors for full lists, except that a ListF, like any base structure in a recursion scheme, uses a type parameter in the place that lists would use actual recursion (meaning that Cons :: a -> b -> ListF a b instead of (:) :: a -> [a] -> [a]).
So that determines our types. Now defining fact is a rather fill-in-the-blanks exercise:
import Prelude hiding (product)
import Data.Functor.Foldable

product :: ListF Int Int -> Int
product Nil = 1
product (Cons a b) = a * b

countDown :: Int -> ListF Int Int
countDown 0 = Nil
countDown n = Cons n (n - 1)

fact :: Int -> Int
fact = hylo product countDown
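Assuming those definitions, a quick check:
ghci> fact 5
120
Conceptually, countDown unfolds 5 into Cons 5 (Cons 4 (Cons 3 (Cons 2 (Cons 1 Nil)))) and product multiplies it back down; hylo interleaves the two steps, so the intermediate list is never actually materialised.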

Haskell type checking in code

Could you please show me how can I check if type of func is Tree or not, in code not in command page?
data Tree = Leaf Float | Gate [Char] Tree Tree deriving (Show, Eq, Ord)
func a = Leaf a
Well, there are a few answers here, which zigzag on the question of whether this is possible.
You could ask ghci
ghci> :t func
func :: Float -> Tree
which tells you the type.
But you said in your comment that you are wanting to write
if func == Tree then 0 else 1
which is not possible. In particular, you can't write any function like
isTree :: a -> Bool
isTree x = if x :: Tree then True else False
because it would violate parametricity, a neat property that all polymorphic functions in Haskell have, explored in the paper Theorems for Free.
But you can write such a function with some simple generic mechanisms that have popped up; essentially, if you want to know the type of something at runtime, it needs to have a Typeable constraint (from the module Data.Typeable). Almost every type is Typeable -- we just use the constraint to indicate the violation of parametricity and to tell the compiler that it needs to pass runtime type information.
import Data.Typeable
import Data.Maybe (isJust)

data Tree = Leaf Float | ...
    deriving (Typeable) -- we need Trees to be typeable for this to work

isTree :: (Typeable a) => a -> Bool
isTree x = isJust (cast x :: Maybe Tree)
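For instance (my own quick check, assuming the definitions above):
ghci> isTree (Leaf 1.0)
True
ghci> isTree (3 :: Int)
False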
But from my experience, you probably don't actually need to ask this question. In Haskell this question is a lot less necessary than in other languages. But I can't be sure unless I know what you are trying to accomplish by asking.
Here's how to determine what the type of a binding is in Haskell: take something like f a1 a2 a3 ... an = someExpression and turn it into f = \a1 -> \a2 -> \a3 -> ... \an -> someExpression. Then find the type of the expression on the right hand side.
To find the type of an expression, simply add a SomeType -> for each lambda, where SomeType is whatever the appropriate type of the bound variable is. Then use the known types in the remaining (lambda-less) expression to find its actual type.
For your example: func a = Leaf a turns into func = \a -> Leaf a. Now, to find the type of \a -> Leaf a, we add a SomeType -> for the lambda, where SomeType is Float in this case (because Leaf :: Float -> Tree, so if Leaf is applied to a, then a :: Float). This gives us Float -> ???
Now we find the type of the lambda-less expression Leaf (a :: Float), which is Tree, because Leaf :: Float -> Tree. Now we can substitute Tree for ??? to get Float -> Tree, the actual type of func.
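Spelled out in code, that reasoning corresponds to (the annotations are mine):
func :: Float -> Tree   -- the signature we just derived by hand
func = \a -> Leaf a     -- Leaf :: Float -> Tree forces a :: Float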
As you can see, we did that all by just looking at the source code. This means that no matter what, func will always have that type, so there is no need to check whether or not it does. In fact, the compiler will throw out all information about the type of func when it compiles your code, and your code will still work properly because of type-checking. (The caveat to this (Typeable) is pointed out in the other answer)
TL;DR: Haskell is statically typed, so func always has the type Float -> Tree, so asking how to check whether that is true doesn't make sense.

Haskell Type Promotion

I am currently working my way through Write Yourself a Scheme in 48 Hours and am stuck on type promotion.
In short, scheme has a numerical tower (Integer->Rational->Real->Complex) with the numeric promotions one would expect. I modeled numbers with the obvious
data Number = Integer Integer | Rational Rational | Real Double | Complex (Complex Double)
so using Rank2Types seemed like an easy way to make functions work over this range of types. For Num a this looks like
liftNum :: (forall a . Num a => a -> a -> a) -> LispVal -> LispVal -> ThrowsError LispVal
liftNum f a b = case typeEnum a `max` typeEnum b of
    ComplexType  -> return . Number . Complex  $ toComplex a `f` toComplex b
    RealType     -> return . Number . Real     $ toReal a `f` toReal b
    RationalType -> return . Number . Rational $ toFrac a `f` toFrac b
    IntType      -> return . Number . Integer  $ toInt a `f` toInt b
    _            -> typeErr a b "Number"
which works but quickly becomes verbose because a separate block is required for each type class.
Even worse, this implementation of Complex is simplified, since Scheme can use separate types for the real and imaginary parts. Implementing this would require a custom version holding two Numbers, which would make the verbosity even worse if I wanted to avoid making the type recursive.
As far as I know there is no way to abstract over the context so I am hoping for a cleaner way to implement this number logic.
Thanks for reading!
Here's a proposal. The primary thing we want your typeEnum function to do that it doesn't yet is bring a Num a dictionary into scope. So let's use GADTs to make that happen. I'll simplify a few things to make it easier to explain the idea and write the code, but nothing essential: I'll focus on Number rather than LispVal and I won't report detailed errors when things go wrong. First some boilerplate:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE Rank2Types #-}
import Control.Applicative
import Data.Complex
Now, you didn't give your definition of a type enumeration. But I'll give mine, because it's part of the secret sauce: my type enumeration is going to have a connection between Haskell's term level and Haskell's type level via a GADT.
data TypeEnum a where
    Integer  :: TypeEnum Integer
    Rational :: TypeEnum Rational
    Real     :: TypeEnum Double
    Complex  :: TypeEnum (Complex Double)
Because of this connection, my Number type won't need to repeat each of these cases again. (I suspect your TypeEnum and Number types are quite repetitive compared to each other.)
data Number where
    Number :: TypeEnum a -> a -> Number
Now we're going to define a new type that you didn't have, which will tie a TypeEnum together with a Num dictionary for the appropriate type. This will be the return type of our typeEnum function.
data TypeDict where
    TypeDict :: Num a => TypeEnum a -> TypeDict

ordering :: TypeEnum a -> Int
ordering Integer  = 0 -- lowest
ordering Rational = 1
ordering Real     = 2
ordering Complex  = 3 -- highest

instance Eq TypeDict where
    TypeDict l == TypeDict r = ordering l == ordering r

instance Ord TypeDict where
    compare (TypeDict l) (TypeDict r) = compare (ordering l) (ordering r)
The ordering function reflects (my guess at) the direction that casts can go. If you try to implement Eq and Ord yourself for this type, without peeking at my solution, I suspect you will discover why GHC balks at deriving these classes for GADTs. (At the very least, it took me a few tries! The obvious definitions don't type-check, and the slightly less obvious definitions had the wrong behavior.)
Now we are ready to write a function that produces a dictionary for a number.
typeEnum :: Number -> TypeDict
typeEnum (Number Integer _) = TypeDict Integer
typeEnum (Number Rational _) = TypeDict Rational
typeEnum (Number Real _) = TypeDict Real
typeEnum (Number Complex _) = TypeDict Complex
We will also need the casting function; you can essentially just concatenate your definitions of toComplex and friends here:
-- combines toComplex, toFrac, toReal, toInt
to :: TypeEnum a -> Number -> Maybe a
to Rational (Number Integer n) = Just (fromInteger n)
to Rational (Number Rational n) = Just n
to Rational _ = Nothing
-- etc.
to _ _ = Nothing
Once we have this machinery in place, liftNum is surprisingly short. We just find the appropriate type to cast to, get a dictionary for that type, and perform the casts and the operation.
liftNum :: (forall a. Num a => a -> a -> a) -> Number -> Number -> Maybe Number
liftNum f a b = case typeEnum a `max` typeEnum b of
    TypeDict ty -> Number ty <$> liftA2 f (to ty a) (to ty b)
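A quick usage sketch (my own example; it relies only on the Rational cases of to spelled out above, and needs (%) from Data.Ratio):
example :: Maybe Number
example = liftNum (+) (Number Integer 2) (Number Rational (3 % 4))
-- max picks the Rational dictionary, both operands are cast with `to Rational`,
-- and the result is Just (Number Rational (11 % 4)).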
At this point you may be complaining: your ultimate goal was to not have one case per class instance in liftNum, and we've achieved that goal, but it looks like we just pushed it off into the definition of typeEnum, where there is one case per class instance. However, I defend myself: you have not shown us your typeEnum, which I suspect already had one case per class instance. So we have not incurred any new cost in functions other than liftNum, and have indeed significantly simplified liftNum. This also gives a smooth upgrade path for more complex Complex manipulations: extend the TypeEnum definition, the cast ordering, and the to function and you're good to go; liftNum may stay the same. (If it turns out that types are not linearly ordered but instead some kind of lattice or similar, then you can switch away from the Ord class.)

How can an arbitrary Num contain any other numeric type?

I'm just starting with Haskell, and I thought I'd start by making a random image generator. I looked around a bit and found JuicyPixels, which offers a neat function called generateImage. The example that they give doesn't seem to work out of the box.
Their example:
imageCreator :: String -> IO ()
imageCreator path = writePng path $ generateImage pixelRenderer 250 300
where pixelRenderer x y = PixelRGB8 x y 128
When I try this, I get that generateImage expects an Int -> Int -> PixelRGB8 whereas pixelRenderer is of type Pixel8 -> Pixel8 -> PixelRGB8. PixelRGB8 is of type Pixel8 -> Pixel8 -> Pixel8 -> PixelRGB8, so it makes sense that pixelRenderer is doing some type inference to determine that x and y are of type Pixel8. If I add a type signature asserting that they are of type Int (so the function gets accepted by generateImage), PixelRGB8 complains that it needs Pixel8s, not Ints.
Pixel8 is just a type alias for Word8. After some hair pulling, I discovered that the way to convert an Int to a Word8 is by using fromIntegral.
The type signature for fromIntegral is (Integral a, Num b) => a -> b. It seems to me that the function doesn't actually know what you want to convert it to, so it converts to the very generic Num class. So theoretically, the output of this is a variable of any type that fits the type class Num (correct me if I'm mistaken here--as I understand it, classes are kind of like "interfaces" where types are more like classes/primitives in OOP). If I assign a variable
let n = fromIntegral 5
:t n -- n :: Num b => b
So I'm wondering... what is 'b'? I can use this variable as anything, and it will implicitly cast to any numeric type, as it seems. Not only will it implicitly cast to a Word8, it will implicitly cast to a Pixel8, meaning fromIntegral effectively gets turned from (as I understood it) (Integral a, Num b) => a -> b to (Integral a) => a -> Pixel8 depending on context.
Can someone please clarify exactly what's happening here? Why can I use a generic Num as any type that fits Num, both mechanically and "ethically"? I don't understand how the implicit conversion is implemented (if I were to create my own class, I feel like I would need to add explicit conversion functions). I also don't really know why this works; here I can use a pretty unsafe type and convert it implicitly to anything else. (for example, fromIntegral 50000 gets translated to 80 if I implicitly convert it to a Word8)
A common implementation of type classes such as Num is dictionary-passing. Roughly, when the compiler sees something like
f :: Num a => a -> a
f x = x + 2
it transforms it into something like
f :: (Integer -> a, a -> a -> a) -> a -> a
-- ^-- the "dictionary"
f (dictFromInteger, dictPlus) x = dictPlus x (dictFromInteger 2)
The latter basically says: "pass me an implementation for these methods of class Num for your type a, and I will use them to produce a function a -> a for you".
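To make that concrete, here is a small sketch you can run yourself (my own illustration; GHC's real dictionaries are records generated from instance declarations, not tuples):
-- The translated f from above, taking its Num methods explicitly:
fTranslated :: (Integer -> a, a -> a -> a) -> a -> a
fTranslated (dictFromInteger, dictPlus) x = dictPlus x (dictFromInteger 2)

-- A hand-written "dictionary" for Int:
intDict :: (Integer -> Int, Int -> Int -> Int)
intDict = (fromInteger, (+))

-- ghci> fTranslated intDict 40
-- 42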
Values such as your n :: Num b => b are no different. They are compiled into things such as
n :: (Integer -> b) -> b
n dictFromInteger = dictFromInteger 5 -- roughly
As you can see, this turns innocent-looking integer literals into functions, which can (and does) impact performance. However, in many circumstances the compiler can realize that the full polymorphic version is not actually needed, and remove all the dictionaries.
For instance, if you write f 3 but f expects Int, the "polymorphic" 3 can be converted at compile time. So type inference can aid the optimization phase (and user-written type annotation can greatly help here). Further, some other optimizations can be triggered manually, e.g. using the GHC SPECIALIZE pragma. Finally, the dreaded monomorphism restriction tries hard to force non-functions to remain non-functions after translation, at the cost of some loss of polymorphism. However, the MR is now being regarded as harmful, since it can cause puzzling type errors in some contexts.
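For example, a SPECIALIZE pragma looks like this (a minimal sketch, reusing the f from above):
f :: Num a => a -> a
f x = x + 2
{-# SPECIALIZE f :: Int -> Int #-}
-- GHC additionally compiles an Int-only copy of f with the dictionary already
-- resolved, and uses it at call sites where the type is known to be Int.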

defining functions with/without lambdas

What difference does it make, when compiling the module with GHC, if I define a function with a lambda expression or without one:
f :: A -> B
f = \x -> ...
vs.
f :: A -> B
f x = ...
I think I saw that it helps the compiler to inline the function, but other than that, can it have an impact on my code if I change from the first version to the second?
I am trying to understand someone else's code and figure out the reasoning behind why a function is defined in the first way and not the second.
To answer that question, I wrote a little program with both ways, and looked at the Core generated:
f1 :: Int -> Int
f1 = \x -> x + 2
{-# NOINLINE f1 #-}

f2 :: Int -> Int
f2 x = x + 2
{-# NOINLINE f2 #-}
I get the core by running ghc test.hs -ddump-simpl. The relevant part is:
f1_rjG :: Int -> Int
[GblId, Arity=1, Str=DmdType]
f1_rjG =
  \ (x_alH :: Int) -> + @ Int GHC.Num.$fNumInt x_alH (GHC.Types.I# 2)

f2_rlx :: Int -> Int
[GblId, Arity=1, Str=DmdType]
f2_rlx =
  \ (x_amG :: Int) -> + @ Int GHC.Num.$fNumInt x_amG (GHC.Types.I# 2)
The results are identical, so to answer your question: there is no impact from changing from one form to the other.
That being said, I recommend looking at leftaroundabout's answer, which deals about the cases where there actually is a difference.
First off, the second form is just more flexible (it allows you to do pattern matching, with other clauses below for alternative cases).
When there's only one clause, it's actually equivalent to a lambda... unless you have a where scope. Namely,
f = \x -> someCalculation x y
 where y = expensiveConstCalculation
is more efficient than
f x = someCalculation x y
 where y = expensiveConstCalculation
because in the latter, y is always recalculated when you evaluate f with a different argument. In the lambda form, y is re-used:
If the signature of f is monomorphic, then f is a constant applicative form, i.e. global constant. That means y is shared throughout your entire program, and only someCalculation needs to be re-done for each call of f. This is typically ideal performance-wise, though of course it also means that y keeps occupying memory.
If f is polymorphic, then it is in fact implicitly a function of the types you're using it with. That means you don't get global sharing, but if you write e.g. map f longList, then y still needs to be computed only once before getting mapped over the list.
That's the gist of the performance differences. Now, of course GHC can rearrange stuff and since it's guaranteed that the results are the same, it might always transform one form to the other if deemed more efficient. But normally it doesn't.
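To see the sharing difference concretely, here is a small sketch using Debug.Trace (my own example; run it without optimisations, since with -O the full-laziness pass may float y out of both versions):
import Debug.Trace (trace)

someCalculation :: Int -> Int -> Int
someCalculation = (+)

expensiveConstCalculation :: Int
expensiveConstCalculation = sum [1 .. 1000000]

fShared :: Int -> Int
fShared = \x -> someCalculation x y
 where y = trace "computing y" expensiveConstCalculation

fPerCall :: Int -> Int
fPerCall x = someCalculation x y
 where y = trace "computing y" expensiveConstCalculation

main :: IO ()
main = do
  print (fShared 1, fShared 2)   -- "computing y" is traced once
  print (fPerCall 1, fPerCall 2) -- "computing y" is traced twice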
