How to generate strings drawn from every possible character? - haskell

At the moment I'm generating strings like this:
arbStr :: Gen String
arbStr = listOf $ elements (alpha ++ digits)
  where alpha = ['a'..'z']
        digits = ['0'..'9']
But obviously this only generates strings from alpha num chars. How can I do it to generate from all possible chars?

Char is an instance of both the Enum and Bounded type classes, so you can use the arbitraryBoundedEnum :: (Bounded a, Enum a) => Gen a function:
import Test.QuickCheck(Gen, arbitraryBoundedEnum, listOf)
arbStr :: Gen String
arbStr = listOf arbitraryBoundedEnum
For example:
Prelude Test.QuickCheck> sample arbStr
""
""
"\821749"
"\433465\930384\375110\256215\894544"
"\431263\866378\313505\1069229\238290\882442"
""
"\126116\518750\861881\340014\42369\89768\1017349\590547\331782\974313\582098"
"\426281"
"\799929\592960\724287\1032975\364929\721969\560296\994687\762805\1070924\537634\492995\1079045\1079821"
"\496024\32639\969438\322614\332989\512797\447233\655608\278184\590725\102710\925060\74864\854859\312624\1087010\12444\251595"
"\682370\1089979\391815"
Or you can use arbitrary from the Arbitrary Char instance:
import Test.QuickCheck(Gen, arbitrary, listOf)
arbStr :: Gen String
arbStr = listOf arbitrary
Note that the arbitrary for Char is implemented such that ASCII characters are (three times) more common than non-ASCII characters, so the "distribution" is different.

Since Char is an instance of Bounded as well as Enum (confirm this by asking GHCI for :i Char), you can simply write
[minBound..maxBound] :: [Char]
to get a list of all legal characters. This will not give efficient random access, though! Instead, you could convert the bounds to Int with Data.Char.ord :: Char -> Int, use QuickCheck's choose to pick from that range of integers, and map the result back to a character with Data.Char.chr :: Int -> Char.
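The ord/choose/chr route described above can be sketched as follows. This is a minimal sketch, assuming QuickCheck is installed; the names arbCharIn and sampleRange are hypothetical helpers introduced here, not part of the library.

```haskell
import Data.Char (chr, ord)
import Test.QuickCheck (Gen, choose, generate, listOf, vectorOf)

-- | Pick a character between two bounds (inclusive) by choosing its
-- code point as an Int and converting back with chr.
arbCharIn :: Char -> Char -> Gen Char
arbCharIn lo hi = chr <$> choose (ord lo, ord hi)

-- | Strings drawn from the whole Char range.
arbStr :: Gen String
arbStr = listOf (arbCharIn minBound maxBound)

-- | Usage example: twenty characters restricted to 'a'..'f'.
sampleRange :: IO String
sampleRange = generate (vectorOf 20 (arbCharIn 'a' 'f'))
```

Unlike the Arbitrary Char instance, this picks every code point with equal probability, so most generated characters will be outside ASCII.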

If we evaluate
λ> length ([minBound..maxBound] :: [Char])
1114112
we get the number of all characters, and it is a big one. If you think the list is too big, you can always limit the range with something like drop x . take y.
Accordingly, if you need n random characters, shuffle the list with some shuffle :: [a] -> IO [a] and take n from the shuffled list.
Edit:
Well, of course, since shuffling could be expensive, it's better to choose a cleverer strategy. It would be ideal to randomly limit the list of all characters. So:
1. Make limits = liftM sort . mapM randomRIO $ replicate 2 (0,1114112) :: (Ord a, Random a, Num a) => IO [a]
2. limits >>= \[min,max] -> return . drop min . take max $ ([minBound..maxBound] :: [Char])
3. Finally, take n random Chars by applying liftM (take n) to the result of item 2.

Related

Putting a type in the Read typeclass doesn't work in the REPL

I'm defining a type GosperInteger, representing the Eisenstein integers in a complex base, and I'd like to enter these numbers in the REPL and do operations on them. So I put the type in the Read and Show typeclasses. Here's the code (there's also an Internals module, see https://github.com/phma/gosperbase to run it):
module Data.GosperBase where
import Data.Array.Unboxed
import Data.Word
import Data.GosperBase.Internals
import qualified Data.Sequence as Seq
import Data.Sequence ((><), (<|), (|>), Seq((:<|)), Seq((:|>)))
import Data.Char
import Data.List
import Data.Maybe
{- This computes complex numbers in base 2.5-√(-3/4), called the Gosper base
because it is the scale factor from one Gosper island to the next bigger one.
The digits are cyclotomic:
2 3
6 0 1
4 5
For layout of all numbers up to 3 digits, see doc/GosperBase.ps .
-}
newtype GosperInteger = GosperInteger (Seq.Seq Word)
chunkDigitsInt :: Seq.Seq Char -> Maybe (Seq.Seq (Seq.Seq Char))
-- ^If the string ends in 'G', reverses the rest of the characters
-- and groups them into chunks of digitsPerLimb.
chunkDigitsInt (as:|>'G') = Just (Seq.reverse (Seq.chunksOf (fromIntegral digitsPerLimb) (Seq.reverse as)))
chunkDigitsInt as = Nothing
parseChunkRjust :: Seq.Seq Char -> Maybe Word
parseChunkRjust Seq.Empty = Just 0
parseChunkRjust (n:<|ns) =
  let ms = parseChunkRjust ns
  in case ms of
       Just num -> if (n >= '0' && n < '7')
                     then Just (7 * num + fromIntegral (ord n - ord '0'))
                     else Nothing
       Nothing -> Nothing
showLimb :: Word -> Word -> String
showLimb _ 0 = ""
showLimb val ndig = chr (fromIntegral ((val `div` 7 ^ (ndig-1)) `mod` 7) + ord '0') : (showLimb val (ndig-1))
parseRjust :: Seq.Seq Char -> Maybe (Seq.Seq Word)
parseRjust as =
  let ns = chunkDigitsInt as
  in case ns of
       Just chunks -> traverse parseChunkRjust chunks
       Nothing -> Nothing
showRjust' :: Seq.Seq Word -> String
showRjust' Seq.Empty = ""
showRjust' (a:<|as) = (showLimb a digitsPerLimb) ++ (showRjust' as)
showRjust :: Seq.Seq Word -> String
showRjust Seq.Empty = "0"
showRjust (a:<|as) = (showLimb a (snd (msdPosLimb a))) ++ (showRjust' as)
parse1InitTail :: (String, String) -> Maybe (GosperInteger, String)
parse1InitTail (a,b) =
  let aParse = parseRjust (Seq.fromList a)
  in case aParse of
       Just mant -> Just (GosperInteger mant,b)
       Nothing -> Nothing
parseGosperInteger :: String -> [(GosperInteger, String)]
parseGosperInteger str =
  let its = zip (inits str) (tails str) -- TODO stop on invalid char
  in catMaybes (fmap parse1InitTail its)
instance Read GosperInteger where
  readsPrec _ str = parseGosperInteger str
instance Show GosperInteger where
  show (GosperInteger m) = showRjust m ++ "G"
iAdd :: GosperInteger -> GosperInteger -> GosperInteger
iAdd (GosperInteger a) (GosperInteger b) =
  GosperInteger (stripLeading0 (addRjust a b))
iMult :: GosperInteger -> GosperInteger -> GosperInteger
iMult (GosperInteger a) (GosperInteger b) =
  GosperInteger (stripLeading0 (mulMant a b))
I'd like to do
> 425G * 256301G
16061525G
which requires putting GosperInteger in the Num typeclass, which I haven't done yet.
Showing a number works, and calling read on a string works, but reading a number typed into the REPL does not. Why?
> read "45G" :: GosperInteger
45G
> 45G
<interactive>:2:3: error: Data constructor not in scope: G
It is not possible to do that in a proper way (you can probably bodge this by writing an odd Num instance).
I think a better approach would be to just write that num instance, then you can write:
ghci> 425 * 256301 :: GosperInteger
16061525
If you don't want to have to write that :: GosperInteger signature you can do a few things:
Use ghci> default (GosperInteger, Double) that will mean it will automatically pick your GosperInteger type if there is ambiguity. You can also use this in normal source files.
Define a function g :: GosperInteger -> GosperInteger; g = id which you can use to disambiguate manually with less syntactic overhead:
ghci> g (425 * 256301)
16061525
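To illustrate how a Num instance makes plain literals work, here is a minimal sketch. It assumes a toy type Toy wrapping an Integer as a stand-in for GosperInteger (whose real arithmetic lives in the Internals module and is not ordinary integer arithmetic); the helper t plays the role of g above.

```haskell
-- Toy stand-in for GosperInteger (hypothetical; the real type wraps a Seq of limbs).
newtype Toy = Toy Integer
  deriving (Eq, Show)

instance Num Toy where
  Toy a + Toy b  = Toy (a + b)
  Toy a * Toy b  = Toy (a * b)
  negate (Toy a) = Toy (negate a)
  abs (Toy a)    = Toy (abs a)
  signum (Toy a) = Toy (signum a)
  fromInteger    = Toy  -- every integer literal of type Toy goes through this

-- Identity function whose only job is to pin the type of a literal expression.
t :: Toy -> Toy
t = id

-- Both literals become Toy values via fromInteger before being multiplied.
example :: Toy
example = t (6 * 7)
```

With this in scope, t (6 * 7) and (6 * 7 :: Toy) mean the same thing; the function form just saves the annotation.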
The GHCi repl doesn't simply call read on the text that you type in. Instead, it has a much more complicated parser that separates your text into various tokens. One type of token is numeric: any integral number you type in will get "read" as an Integer. Of course, if you type 32 and want it to be an Int, not an Integer, this would be a problem, so the Num type class has a super convenient fromInteger function. With this, an Integer token can be converted into any instance of the Num class.
But, you want something slightly different: you want the parser to group together the numeric token along with the G token and treat them as one unit. For full support, you'd need to make an extension to the GHC parser, much like how if you type 2e7 into the prompt, you correctly get a floating point number. This isn't a simple change you can address in your source file or GHCi settings.
With all that said, there are some hacks we can play with. As Noughtmare mentions, "you can probably bodge this by writing an odd Num instance", and indeed you can! Fair warning: you probably don't want to do this, but let's explore it anyway.
The problem is that the parser returned two tokens, one that's numeric and the other that's G. Since it's uppercase, that G token is being interpreted as a data constructor (your error message pointed that out too: "Data constructor not in scope: G"). The key is to use this to our advantage.
Consider the following:
data G = G
  deriving Show
instance Num (G -> GosperInteger) where
  fromInteger i G = integerToGosperInteger i
Now, assuming you wrote the function integerToGosperInteger, this instance would let you type, e.g., 45G and produce the GosperInteger 45G. Hurrah! You can even do 425G * 256301G and it will work as expected. Furthermore, if you deliberately omit the fromInteger definition from your Num GosperInteger instance, then you'll get a runtime error if you try to use a bare number like 425 as a GosperInteger (that is, you'll get an error for implicit conversions that don't have the G).
There are some problems.
If you try this, you'll find that type inference is pretty terrible. It probably won't work right at the prompt unless you set default (GosperInteger, Double), and you'll probably want to use lots of type annotations in your source files.
If you leave out the G, you'll get terrible type error messages or, even worse, runtime errors.
You'll get a warning that your Num instance for G -> GosperInteger is incomplete. It is incomplete, but there are no sensible definitions for anything else. You could suppress the warning or set all of the missing methods to error "This isn't how this is supposed to be used" or something, but it's still a bit of a blemish in the code.
But, if you can deal with the problems and you squint hard enough, it sorta kinda gets you what you want.
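For anyone who wants to try the trick without the GosperBase sources, here is a self-contained sketch. It assumes a toy type Toy in place of GosperInteger and a hypothetical placeholder named unsupported for the methods that have no sensible definition; FlexibleInstances is needed for the function instance under Haskell 2010.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Toy stand-in for GosperInteger (an assumption for illustration only).
newtype Toy = Toy Integer
  deriving (Eq, Show, Num)

-- The suffix token: a type with a single constructor named G.
data G = G

-- Placeholder for the Num methods that make no sense on functions.
unsupported :: a
unsupported = error "Num (G -> Toy): only integer literals are supported"

-- A literal followed by G parses as the literal applied to G,
-- so the literal itself must be a function of type G -> Toy.
instance Num (G -> Toy) where
  fromInteger i G = Toy i
  (+)    = unsupported
  (*)    = unsupported
  abs    = unsupported
  signum = unsupported
  negate = unsupported

-- The type signature resolves both literals to G -> Toy.
suffixDemo :: Toy
suffixDemo = 45 G * 3 G
```

The multiplication here is the one from Num Toy, applied after each literal has consumed its G; the unsupported methods of the function instance are never reached.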

Beginner Haskell, quickcheck generator

I want to generate lists of increasing length, i.e.
prelude>sample' incList
[[],[19],[6,110],[24,67,81]....]
How should I use vectorOf?
incList :: Gen [Integer]
incList =
  do x <- vectorOf [0..] arbitrary
     return x
I can't think of a way to just take out the first number from the list one at a time :/ Maybe something with fmap take 1, I dunno..
I think you are trying to do too much at once. Let us first construct a generator for a random list, of a given length and in ascending order, of values of any type with an Ord instance:
import Data.List(sort)
incList :: (Arbitrary a, Ord a) => Int -> Gen [a]
incList n = fmap sort (vectorOf n arbitrary)
Now we can construct a generator that produces an endless list of lists by incrementing the length by one each time:
incLists :: (Arbitrary a, Ord a) => Gen [[a]]
incLists = mapM incList [0..]
We can then generate values from this generator with generate :: Gen a -> IO a:
Prelude File> generate incLists :: IO [[Int]]
[[],[-19],[6,25],[-19,-14,15],[-4,6,20,28],[-23,-19,-6,-1,22],[-29,-21,-13,-9,-9,15],[-23,-15,-4,3,3,27,27],[-29,-29,-26,-25,18,19,23,27],[-24,-23,-16,-14,0,13,17,17,23],[-29,-15,-12,-4,-1,1,2,20,22,26],[-26,-24,-22,-16,-12,5,5,10,11,25,29],[-29,-28,-20,-14,-9,-7,-3,14,15,20,26,28],...]
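If a finite result is enough, the same idea can be bounded. The following is a sketch assuming QuickCheck is available; incListsUpTo and demo are hypothetical names introduced here, and the bound of 4 is arbitrary.

```haskell
import Data.List (sort)
import Test.QuickCheck (Arbitrary, Gen, arbitrary, generate, vectorOf)

-- | A sorted random list of exactly n elements.
incList :: (Arbitrary a, Ord a) => Int -> Gen [a]
incList n = sort <$> vectorOf n arbitrary

-- | Sorted lists of lengths 0, 1, ..., n (a finite variant of incLists).
incListsUpTo :: (Arbitrary a, Ord a) => Int -> Gen [[a]]
incListsUpTo n = mapM incList [0 .. n]

-- | Usage example: five lists of Ints with lengths 0 through 4.
demo :: IO [[Int]]
demo = generate (incListsUpTo 4)
```

Because the list of lengths is finite, this version does not depend on laziness to produce output.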

Haskell - wrapping and unwrapping newtype wrappers - is there an easier way?

I'm writing a function pad that takes a list and pads it until it is a certain size. I tried 2 implementations:
pad :: Monoid a => Int -> [a] -> [a]
pad len list = replicate (len - length list) mempty ++ list
and
pad :: Int -> a -> [a] -> [a]
pad len value list = replicate (len - length list) value ++ list
The first one seems to be a logical usage of Monoid but calling it with lists of integers (or anything that is a Monoid in multiple ways) is a pain:
(fmap getSum) <$> pad 8 <$> (fmap Sum) $ [1,2,3]
I don't really mind the extra typing, but it doesn't even seem to convey the meaning very well. How would you implement this function?
I'd probably use your second example, to be honest. Adding a Monoid constraint just to use mempty as a "default value" is overkill. It also sends the wrong message to users of this function, who may be confused about what you need mappend for when you really don't. They would also have to make a newtype and a Monoid instance if they wanted to pad with a different value.
Instead, consider changing the order of the arguments so that the value comes first. Then you can just apply it partially whenever you need to pad a lot of lists with the same value. You can also recover the first version with pad mempty if you need it.
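A sketch of the value-first argument order suggested above; padZero and padEmpty are hypothetical names for the partially applied versions.

```haskell
-- | Left-pad a list with the given value until it has at least len elements.
-- Lists that are already long enough are returned unchanged.
pad :: a -> Int -> [a] -> [a]
pad value len list = replicate (len - length list) value ++ list

-- | Partial application: pad lists of Ints with zeros.
padZero :: Int -> [Int] -> [Int]
padZero = pad 0

-- | The Monoid-based version from the question, recovered via mempty.
padEmpty :: Monoid a => Int -> [a] -> [a]
padEmpty = pad mempty
```

For example, padZero 5 [1,2,3] gives [0,0,1,2,3], and padEmpty 3 ["a"] gives ["","","a"].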

Create a type that can contain an int and a string in either order

I'm following this introduction to Haskell, and this particular place (user defined types 2.2) I'm finding particularly obscure. To the point, I don't even understand what part of it is code, and what part is the thoughts of the author. (What is Pt - it is never defined anywhere?). Needless to say, I can't execute / compile it.
As an example that would make it easier for me to understand, I wanted to define a type, which is a pair of an Integer and a String, or a String and an Integer, but nothing else.
The theoretical function that would use it would look like so:
combine :: StringIntPair -> String
combine a b = (show a) ++ b
combine a b = a ++ (show b)
If you need working code that does the same thing, here is a Common Lisp version:
(defgeneric combine (a b)
(:documentation "Combines strings and integers"))
(defmethod combine ((a string) (b integer))
(concatenate 'string a (write-to-string b)))
(defmethod combine ((a integer) (b string))
(concatenate 'string (write-to-string a) b))
(combine 100 "500")
Here's one way to define the datatype:
data StringIntPair = StringInt String Int |
                     IntString Int String
  deriving (Show, Eq, Ord)
Note that I've defined two constructors for type StringIntPair, and they are StringInt and IntString.
Now in the definition of combine:
combine :: StringIntPair -> String
combine (StringInt s i) = s ++ (show i)
combine (IntString i s) = (show i) ++ s
I'm using pattern matching to match the constructors and select the correct behavior.
Here are some examples of usage:
*Main> let y = StringInt "abc" 123
*Main> let z = IntString 789 "a string"
*Main> combine y
"abc123"
*Main> combine z
"789a string"
*Main> :t y
y :: StringIntPair
*Main> :t z
z :: StringIntPair
A few things to note about the examples:
StringIntPair is a type; doing :t <expression> in the interpreter shows the type of an expression
StringInt and IntString are constructors of the same type
the vertical bar (|) separates constructors
a well-written function should match each constructor of its argument's types; that's why I've written combine with two patterns, one for each constructor
data StringIntPair = StringInt String Int
                   | IntString Int String
combine :: StringIntPair -> String
combine (StringInt s i) = s ++ (show i)
combine (IntString i s) = (show i) ++ s
So it can be used like that:
> combine $ StringInt "asdf" 3
"asdf3"
> combine $ IntString 4 "fasdf"
"4fasdf"
Since Haskell is strongly typed, you always know what type a variable has. Additionally, you will never know more. For instance, consider the function length that calculates the length of a list. It has the type:
length :: [a] -> Int
That is, it takes a list of arbitrary a (although all elements have the same type) and returns an Int. The function can never look inside one of the list's nodes and inspect what is stored there, since it has no information about the type of the stored values and cannot get any. This makes Haskell pretty efficient since, as opposed to typical OOP languages such as Java, no type information has to be stored at runtime.
To make it possible to have different types of values in one parameter, one can use an Algebraic Data Type (ADT). One that stores either a String and an Int or an Int and a String can be defined as:
data StringIntPair = StringInt String Int
                   | IntString Int String
You can find out which of the two you were given by pattern matching on the parameter. (Notice that you have only one parameter, since both the string and the int are encapsulated in an ADT):
combine :: StringIntPair -> String
combine (StringInt str int) = str ++ show int
combine (IntString int str) = show int ++ str

Generating an infinite sequence in Haskell

I know that infinite sequences are possible in Haskell - however, I'm not entirely sure how to generate one
Given a method
generate::Integer->Integer
which take an integer and produces the next integer in the sequence, how would I build an infinite sequence out of this?
If you want your sequence to start from 1 then it is -
iterate generate 1
Please notice that the first letter of the function name is lowercase, not uppercase. Otherwise it would be a data type, not a function.
//edit: I just realized not just data types start with capital, it could be data constructor or type class as well, but that wasn't the point. :)
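As a concrete illustration, here is a sketch with a made-up step function. The body of generate below is hypothetical (the question does not say what it computes); only the use of iterate is the point.

```haskell
-- Hypothetical step function standing in for the question's generate.
generate :: Integer -> Integer
generate n = 2 * n + 1

-- The infinite sequence 1, generate 1, generate (generate 1), ...
sequenceFrom1 :: [Integer]
sequenceFrom1 = iterate generate 1

-- Laziness means only the requested prefix is ever computed.
firstFive :: [Integer]
firstFive = take 5 sequenceFrom1  -- [1,3,7,15,31]
```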
Adding to Matajon's answer: a way to discover the iterate function other than asking here would be to use Hoogle.
Hoogle's first answer for the query (a -> a) -> [a] is iterate.
Update (2023): Hoogle's scoring appears to have changed, and iterate is no longer returned for this query. Its full type has another a parameter, and with the full type it does appear in the results.
There are several ways to do it, but one is:
gen :: (a -> a) -> a -> [a]
gen f s = s : gen f (f s)
This function takes a function f and some value s and returns s, after which it calls itself with that same f and the result of f s. Demonstration:
Prelude> :t succ
succ :: (Enum a) => a -> a
Prelude> let gen f s = s : gen f (f s)
Prelude> take 10 $ gen succ 3
[3,4,5,6,7,8,9,10,11,12]
In the above example succ acts as the function generate :: Integer -> Integer which you mention. But observe that gen will work with any function of type a -> a.
Edit: and indeed, gen is identical to the function iterate from the Prelude (and Data.List).
