What is the purpose of the function (toEnum . fromEnum) in Haskell code? - haskell

I saw the function toEnum . fromEnum being used on Chars in HaskellNet.Network.Auth.
b64Encode :: String -> String
b64Encode = map (toEnum.fromEnum) . B64.encode . map (toEnum.fromEnum)
b64Decode :: String -> String
b64Decode = map (toEnum.fromEnum) . B64.decode . map (toEnum.fromEnum)
At first glance this function should be identical to id, right? Why is it here?

It can be equivalent to id, but only in certain situations. Since fromEnum :: Enum a => a -> Int can convert any Enum to an Int, and toEnum :: Enum a => Int -> a can convert an Int to any Enum, it follows that toEnum . fromEnum is a general method of converting from any enum to any enum — that is, (toEnum . fromEnum) :: (Enum a, Enum b) => a -> b. As you have observed, this should indeed identical to id (if the Enum instance has been implemented properly, that is), but only when you select a and b as being the same type; otherwise, it converts from one Enum instance to a different Enum instance.
As for why it is used in that particular place: I really have no idea. B64.decode and B64.encode appear to both be String -> String, and b64Decode and b64Encode are also String -> String, so toEnum . fromEnum converts from Char to Char — so in this case it should be identical to id. In other words, toEnum . fromEnum does nothing here and probably should be removed (although I won’t rule out the possibility that the Enum instance for Char is implemented in such a way that this isn’t id).
EDIT: #K.A.Buhr has found an explanation for this in the Git history of the project. It appears that encode and decode used to have signatures involving ByteString, so toEnum and fromEnum were used to convert between lists of Word8 (for ByteString) and lists of Char (for String). At some point encode and decode were switched to use String rather than ByteString, but no-one removed the toEnum and fromEnum.

Related

Putting a type in the Read typeclass doesn't work in the REPL

I'm defining a type GosperInteger, representing the Eisenstein integers in a complex base, and I'd like to enter these numbers in the REPL and do operations on them. So I put the type in the Read and Show typeclasses. Here's the code (there's also an Internals module, see https://github.com/phma/gosperbase to run it):
module Data.GosperBase where
import Data.Array.Unboxed
import Data.Word
import Data.GosperBase.Internals
import qualified Data.Sequence as Seq
import Data.Sequence ((><), (<|), (|>), Seq((:<|)), Seq((:|>)))
import Data.Char
import Data.List
import Data.Maybe
{- This computes complex numbers in base 2.5-√(-3/4), called the Gosper base
because it is the scale factor from one Gosper island to the next bigger one.
The digits are cyclotomic:
2 3
6 0 1
4 5
For layout of all numbers up to 3 digits, see doc/GosperBase.ps .
-}
newtype GosperInteger = GosperInteger (Seq.Seq Word)
chunkDigitsInt :: Seq.Seq Char -> Maybe (Seq.Seq (Seq.Seq Char))
-- ^If the string ends in 'G', reverses the rest of the characters
-- and groups them into chunks of digitsPerLimb.
chunkDigitsInt (as:|>'G') = Just (Seq.reverse (Seq.chunksOf (fromIntegral digitsPerLimb) (Seq.reverse as)))
chunkDigitsInt as = Nothing
parseChunkRjust :: Seq.Seq Char -> Maybe Word
parseChunkRjust Seq.Empty = Just 0
parseChunkRjust (n:<|ns) =
let ms = parseChunkRjust ns
in case ms of
Just num -> if (n >= '0' && n < '7')
then Just (7 * num + fromIntegral (ord n - ord '0'))
else Nothing
Nothing -> Nothing
showLimb :: Word -> Word -> String
showLimb _ 0 = ""
showLimb val ndig = chr (fromIntegral ((val `div` 7 ^ (ndig-1)) `mod` 7) + ord '0') : (showLimb val (ndig-1))
parseRjust :: Seq.Seq Char -> Maybe (Seq.Seq Word)
parseRjust as =
let ns = chunkDigitsInt as
in case ns of
Just chunks -> traverse parseChunkRjust chunks
Nothing -> Nothing
showRjust' :: Seq.Seq Word -> String
showRjust' Seq.Empty = ""
showRjust' (a:<|as) = (showLimb a digitsPerLimb) ++ (showRjust' as)
showRjust :: Seq.Seq Word -> String
showRjust Seq.Empty = "0"
showRjust (a:<|as) = (showLimb a (snd (msdPosLimb a))) ++ (showRjust' as)
parse1InitTail :: (String, String) -> Maybe (GosperInteger, String)
parse1InitTail (a,b) =
let aParse = parseRjust (Seq.fromList a)
in case aParse of
Just mant -> Just (GosperInteger mant,b)
Nothing -> Nothing
parseGosperInteger :: String -> [(GosperInteger, String)]
parseGosperInteger str =
let its = zip (inits str) (tails str) -- TODO stop on invalid char
in catMaybes (fmap parse1InitTail its)
instance Read GosperInteger where
readsPrec _ str = parseGosperInteger str
instance Show GosperInteger where
show (GosperInteger m) = showRjust m ++ "G"
iAdd :: GosperInteger -> GosperInteger -> GosperInteger
iAdd (GosperInteger a) (GosperInteger b) =
GosperInteger (stripLeading0 (addRjust a b))
iMult :: GosperInteger -> GosperInteger -> GosperInteger
iMult (GosperInteger a) (GosperInteger b) =
GosperInteger (stripLeading0 (mulMant a b))
I'd like to do
> 425G * 256301G
16061525G
which requires putting GosperInteger in the Num typeclass, which I haven't done yet.
Showing a number works, and calling read on a string works, but reading a number typed into the REPL does not. Why?
> read "45G" :: GosperInteger
45G
> 45G
<interactive>:2:3: error: Data constructor not in scope: G
It is not possible to do that in a proper way (you can probably bodge this by writing an odd Num instance).
I think a better approach would be to just write that num instance, then you can write:
ghci> 425 * 256301 :: GosperInteger
16061525
If you don't want to have to write that :: GosperInteger signature you can do a few things:
Use ghci> default (GosperInteger, Double) that will mean it will automatically pick your GosperInteger type if there is ambiguity. You can also use this in normal source files.
Define a function g :: GosperInteger -> GosperInteger; g = id which you can use to disambiguate manually with less syntactic overhead:
ghci> g (425 * 256301)
16061525
The GHCi repl doesn't simply call read on the text that you type in. Instead, it has a much more complicated parser that separates your text into various tokens. One type of token is numeric: any integral number you type in will get "read" as an Integer. Of course, if you type 32 and want it to be an Int, not an Integer, this would be a problem, so the Num type class has a super convenient fromInteger function. With this, an Integer token can be converted into any instance of the Num class.
But, you want something slightly different: you want the parser to group together the numeric token along with the G token and treat them as one unit. For full support, you'd need to make an extension to the GHC parser, much like how if you type 2e7 into the prompt, you correctly get a floating point number. This isn't a simple change you can address in your source file or GHCi settings.
With all that said, there are some hacks we can play with. As Noughtmare mentions, "you can probably bodge this by writing an odd Num instance", and indeed you can! Fair warning: you probably don't want to do this, but let's explore it anyway.
The problem is that the parser returned two tokens, one that's numeric and the other that's G. Since it's uppercase, that G token is being interpreted as a data constructor (your error message pointed that out too: " Data constructor not in scope: G"). The key is to use this to our advantage.
Consider the following:
data G = G
deriving Show
instance Num (G -> GosperInteger) where
fromInteger i G = integerToGosperInteger i
Now, assuming you wrote that function integerToGosperInteger, this instance would let you type, e.g., 45G and produce a GosperInteger 45G. Hurrah! You can even do 425G * 256301G and it will work as expected. Furthermore, if you cleverly omit a fromInteger definition from your Num GosperInteger class, then you'll get a runtime error if you try to simply use a number like 425 as as GosperInteger (that is, you'll get an error for implicit coercions that don't have the G).
There are some problems.
If you try this, you'll find that type inference is pretty terrible. It probably won't work right at the prompt unless you set default (GosperInteger, Double), and you'll probably want to use lots of type annotations in your source files.
If you leave out the G, you'll get terrible type error messages or, even worse, runtime errors.
You'll get a warning that your Num instance for G -> GosperInteger is incomplete. It is incomplete, but there's no sensible definitions for anything else. You could suppress the warning or set all of the missing methods to error "This isn't how this is supposed to be used" or something, but it's still a bit of a blemish in the code.
But, if you can deal with the problems and you squint hard enough, it sorta kinda gets you what you want.

How to generate strings drawn from every possible character?

At the moment I'm generating strings like this:
arbStr :: Gen String
arbStr = listOf $ elements (alpha ++ digits)
where alpha = ['a'..'z']
digits = ['0'..'9']
But obviously this only generates strings from alpha num chars. How can I do it to generate from all possible chars?
Char is a instance of both the Enum and Bounded typeclass, you can make use of the arbitraryBoundedEnum :: (Bounded a, Enum a) => Gen a function:
import Test.QuickCheck(Gen, arbitraryBoundedEnum, listOf)
arbStr :: Gen String
arbStr = listOf arbitraryBoundedEnum
For example:
Prelude Test.QuickCheck> sample arbStr
""
""
"\821749"
"\433465\930384\375110\256215\894544"
"\431263\866378\313505\1069229\238290\882442"
""
"\126116\518750\861881\340014\42369\89768\1017349\590547\331782\974313\582098"
"\426281"
"\799929\592960\724287\1032975\364929\721969\560296\994687\762805\1070924\537634\492995\1079045\1079821"
"\496024\32639\969438\322614\332989\512797\447233\655608\278184\590725\102710\925060\74864\854859\312624\1087010\12444\251595"
"\682370\1089979\391815"
Or you can make use of the arbitrary in the Arbitrary Char typeclass:
import Test.QuickCheck(Gen, arbitrary, listOf)
arbStr :: Gen String
arbStr = listOf arbitrary
Note that the arbitrary for Char is implemented such that ASCII characters are (three times) more common than non-ASCII characters, so the "distribution" is different.
Since Char is an instance of Bounded as well as Enum (confirm this by asking GHCI for :i Char), you can simply write
[minBound..maxBound] :: [Char]
to get a list of all legal characters. Obviously this will not lead to efficient random access, though! So you could instead convert the bounds to Int with Data.Char.ord :: Char -> Int, and use QuickCheck's feature to select from a range of integers, then map back to a character with Data.Chra.chr :: Int -> Char.
When we do like
λ> length ([minBound..maxBound] :: [Char])
1114112
we get the number of all characters and say Wow..! If you think the list is too big then you may always do like drop x . take y to limit the range.
Accordingly, if you need n many random characters just shuffle :: [a] -> IO [a] the list and do a take n from that shuffled list.
Edit:
Well of course... since shuffling could be expensive, it's best if we chose a clever strategy. It would be ideal to randomly limit the all characters list. So just
make a limits = liftM sort . mapM randomRIO $ replicate 2 (0,1114112) :: (Ord a, Random a, Num a) => IO [a]
limits >>= \[min,max] -> return . drop min . take max $ ([minBound..maxBound] :: [Char])
Finally just take n many like random Chars like liftM . take n from the result of Item 2.

Haskell list type conversion

Support I have a list of type [Char], and the values enclosed are values of a different type if I remove their quotations which describe them as characters. e.g. ['2','3','4'] represents a list of integers given we change their type.
I have a similar but more complicated requirement, I need to change a [Char] to [SomeType] where SomeType is some arbitrary type corresponding to the values without the character quotations.
Assuming you have some function foo :: Char -> SomeType, you just need to map this function over your list of Char.
bar :: [Char] -> [SomeType]
bar cs = map foo cs
I hope I get this correctly and there is a way (if the data-constructors are just one-letters too) - you use the auto-deriving for Read:
data X = A | B | Y
deriving (Show, Read)
parse :: String -> [X]
parse = map (read . return)
(the return will just wrap a single character back into a singleton-list making it a String)
example
λ> parse "BAY"
[B,A,Y]

Haskell: Function that converts Char to Word8

I'm trying to find a function that converts a Char to a Word8, and another that converts it back. I've looked on Hoogle and there are no functions of type Char -> Word8. I want to do this because Word8 has an instance of the Num class, and I would be able to add numbers to them to change them to another number. For example, my function would look something like:
shift :: Char -> Int -> Char
shift c x = toChar $ (toWord8 c) + x
Any ideas?
You can use fromEnum and toEnum to go via Int instead. This has the added benefit of also supporting Unicode.
> toEnum (fromEnum 'a' + 3) :: Char
'd'
I found these using FP Complete's Hoogle: http://haddocks.fpcomplete.com/fp/7.7/20131212-1/utf8-string/Codec-Binary-UTF8-String.html#v:encodeChar
A Char doesn't necessarily contain only one byte, so the conversions you're describing could lose information. Would these functions work instead?

Haskell ASCII codes

I'm trying to make a function that takes an a (which can be any type: int, char...) and creates a list which has that input replicated the number of times that corresponds to its ASCII code.
I've created this:
toList n = replicate (fromEnum n) n
When trying to use the function in the cmd it says it could match the expected type int with char, however if i use my function directly in the cmd with an actual value it does what it's supposed.
What i mean is: toList 'a' --> gives me an error
replicate (fromEnum 'a') 'a' --> gives a result without problem
I've loaded the module Data.Char (ord)
How can I fix this, and why does this happens?
Thanks in advance :)
What you're missing is a type declaration. You say that you want it to be able to take any type, but what you really want is toList to take something that is an instance of Enum. When you play around with it in GHCi, it'll let you do let toList n = replicate (fromEnum n) n, because GHCi will automatically pick some defaults that seem to make sense, but when compiling a module with GHC, it won't work without the type declaration. You want
toList :: (Enum a) => a -> [a]
toList n = replicate (fromEnum n) n
The reason why you have to have the (Enum a) => in the type signature is because fromEnum has the type signature (Enum a) => a -> Int. So you see it doesn't just take any type, only those that have an instance for Enum.

Resources