is it possible to have a set comprehension in haskell? - haskell

In Haskell we have list generators, such as:
[x+y | x<-[1,2,3], y<-[1,2,3]]
with which we get
[2,3,4,3,4,5,4,5,6]
Is it possible to have a set generator which doesn't automatically add an element if it is already in the list?
In our example we would obtain:
[2,3,4,5,6]
If so, how? If it is not already implemented, how would you implement it?

Haskell can do this, but not quite out-of-the-box.
The basic underpinning is that a list comprehension can also be written as a monadic binding chain, in the list monad:
Prelude> [x+y | x<-[1,2,3], y<-[1,2,3]]
[2,3,4,3,4,5,4,5,6]
Prelude> [1,2,3] >>= \x -> [1,2,3] >>= \y -> return (x+y)
[2,3,4,3,4,5,4,5,6]
...or, with better readable do-syntax (which is syntactic sugar for monadic binding)
Prelude> do x<-[1,2,3]; y<-[1,2,3]; return (x+y)
[2,3,4,3,4,5,4,5,6]
In fact, there's a language extension that also turns all list comprehensions into syntactic sugar for such a monadic chain. Example in the tuple (aka writer) monad:
Prelude> :set -XMonadComprehensions
Prelude> [x+y | x <- ("Hello", 4), y <- ("World", 5)] :: (String, Int)
("HelloWorld",9)
So really, all we need is a set monad. This is sensible enough, however Data.Set.Set is not a monad on Hask (the category of all Haskell types) but only only the subcategory that satisfies the Ord constraint (which is needed for lookup / to avoid duplicates). In the case of sets, there is however a hack that allows hiding that constraint from the actual monad instance; it's used in the set-monad package. Et voilà:
Prelude Data.Set.Monad> [x+y | x<-fromList[1,2,3], y<-fromList[1,2,3]]
fromList [2,3,4,5,6]
The hack that's needed for instance Monad Set comes at a price. It works like this:
{-# LANGUAGE GADTs, RankNTypes #-}
import qualified Data.Set as DS
data Set r where
Prim :: (Ord r => DS.Set r) -> Set r
Return :: a -> Set a
Bind :: Set a -> (a -> Set b) -> Set b
...
What this means is: a value of type Data.Set.Monad.Set Int does not really contain a concrete set of integers. Rather, it contains an abstract syntax expression for a computation that yields a set as the result. This isn't great for performance, in particular it means that values won't be shared. So don't use this for big sets.
There's a better option: use it directly as a monad in the proper category (which only contains orderable types to begin with). This unfortunately requires even more language-bending, but it's possible; I've made an example in the constrained-categories library.

If your values can be put in Data.Set.Set (i.e. they are in class Ord) you can just apply Data.Set.toList . Data.Set.fromList to your list:
Prelude> import Data.Set
Prelude Data.Set> Data.Set.toList . Data.Set.fromList $ [x+y | x<-[1,2,3], y<-[1,2,3]]
[2,3,4,5,6]
The complexity of this would be O(n log n).
If the type obeys (Eq a, Hashable a), you can use Data.HashSet in much the same way. The average complexity is O(n).
If all you have is Eq, you have to get by with something like Data.List.nub:
Prelude> import Data.List
Prelude Data.List> nub [x+y | x<-[1,2,3], y<-[1,2,3]]
[2,3,4,5,6]
but the complexity is inherently quadratic.

Related

How can I turn a [TExp a] into a TExp [a], or otherwise apply refineTH to multiple values programatically?

I've been using refined for refinement types in Haskell recently, and have encountered a major usability problem. I can't figure out how to refine an entire list of values at compile time.
For example I can write:
{-# LANGUAGE TemplateHaskell #-}
import Refined
oneToThree :: [Refined Positive Int]
oneToThree = [$$(refineTH 1), $$(refineTH 2), $$(refineTH 3)]
But I can't do this precludes the ability of using range syntax, because Refined doesn't (for good reason) have an instance for Enum.
I would like to be able to do something like
oneToThree :: [Refined Positive Int]
oneToThree = $$(traverse refineTH [1..3])
but I can't get this to compile because I can't lift [TExp (Refined Positive Int)] into TExp [Refined Positive Int].
Is there template haskell magic that I missing that will let me do this?
Would also be open to suggestions for better lightweight refinement type libraries if someone has a suggestion.
sequenceQTExpList :: [Q (TExp a)] -> Q (TExp [a])
sequenceQTExpList [] = [|| [] ||]
sequenceQTExpList (x:xs) = [|| $$(x) : $$(sequenceQTExpList xs) ||]
Then use it as
$$(sequenceQTExpList $ map refineTH [1..3])
You're right that it feels like a traverse. The type is a bit off, though, with the extra Qs floating around. I don't see anything offhand that lets you combine those layers usefully.
Unfortunately, a lot of the mechanism used there is TH syntax rather than functions. There just isn't an obvious way to do both the lifting and the splicing as functions, so you're stuck writing bespoke helpers for each container type instead of getting to use Traversable. It's an interesting problem. If there's a clean solution, it'd have a good chance of making it into a future version of template Haskell if it was brought up to the maintainers. But I just don't see it right now.
This works (it needs to be in a different file than you use it in because of the stage restriction, though):
import Language.Haskell.TH.Syntax (Exp(ListE), TExp(TExp))
makeTypedTHList :: [TExp a] -> TExp [a]
makeTypedTHList xs = TExp $ ListE [x | TExp x <- xs]
You'd then use it like this:
{-# LANGUAGE TemplateHaskell #-}
import Refined
import AboveCodeInSeparateModuleBecauseOfStageRestriction (makeTypedTHList)
oneToThree :: [Refined Positive Int]
oneToThree = $$(makeTypedTHList <$> traverse refineTH [1..3])
However, calling the TExp constructor yourself subverts some of the safety of typed Template Haskell (although I think this particular case is safe). Ideally, I'd prefer an approach that didn't require doing that, but I can't think of one.

How to "iterate" over a function whose type changes among iteration but the formal definition is the same

I have just started learning Haskell and I come across the following problem. I try to "iterate" the function \x->[x]. I expect to get the result [[8]] by
foldr1 (.) (replicate 2 (\x->[x])) $ (8 :: Int)
This does not work, and gives the following error message:
Occurs check: cannot construct the infinite type: a ~ [a]
Expected type: [a -> a]
Actual type: [a -> [a]]
I can understand why it doesn't work. It is because that foldr1 has type signature foldr1 :: Foldable t => (a -> a -> a) -> a -> t a -> a, and takes a -> a -> a as the type signature of its first parameter, not a -> a -> b
Neither does this, for the same reason:
((!! 2) $ iterate (\x->[x]) .) id) (8 :: Int)
However, this works:
(\x->[x]) $ (\x->[x]) $ (8 :: Int)
and I understand that the first (\x->[x]) and the second one are of different type (namely [Int]->[[Int]] and Int->[Int]), although formally they look the same.
Now say that I need to change the 2 to a large number, say 100.
My question is, is there a way to construct such a list? Do I have to resort to meta-programming techniques such as Template Haskell? If I have to resort to meta-programming, how can I do it?
As a side node, I have also tried to construct the string representation of such a list and read it. Although the string is much easier to construct, I don't know how to read such a string. For example,
read "[[[[[8]]]]]" :: ??
I don't know how to construct the ?? part when the number of nested layers is not known a priori. The only way I can think of is resorting to meta-programming.
The question above may not seem interesting enough, and I have a "real-life" case. Consider the following function:
natSucc x = [Left x,Right [x]]
This is the succ function used in the formal definition of natural numbers. Again, I cannot simply foldr1-replicate or !!-iterate it.
Any help will be appreciated. Suggestions on code styles are also welcome.
Edit:
After viewing the 3 answers given so far (again, thank you all very much for your time and efforts) I realized this is a more general problem that is not limited to lists. A similar type of problem can be composed for each valid type of functor (what if I want to get Just Just Just 8, although that may not make much sense on its own?).
You'll certainly agree that 2 :: Int and 4 :: Int have the same type. Because Haskell is not dependently typed†, that means foldr1 (.) (replicate 2 (\x->[x])) (8 :: Int) and foldr1 (.) (replicate 4 (\x->[x])) (8 :: Int) must have the same type, in contradiction with your idea that the former should give [[8]] :: [[Int]] and the latter [[[[8]]]] :: [[[[Int]]]]. In particular, it should be possible to put both of these expressions in a single list (Haskell lists need to have the same type for all their elements). But this just doesn't work.
The point is that you don't really want a Haskell list type: you want to be able to have different-depth branches in a single structure. Well, you can have that, and it doesn't require any clever type system hacks – we just need to be clear that this is not a list, but a tree. Something like this:
data Tree a = Leaf a | Rose [Tree a]
Then you can do
Prelude> foldr1 (.) (replicate 2 (\x->Rose [x])) $ Leaf (8 :: Int)
Rose [Rose [Leaf 8]]
Prelude> foldr1 (.) (replicate 4 (\x->Rose [x])) $ Leaf (8 :: Int)
Rose [Rose [Rose [Rose [Leaf 8]]]]
†Actually, modern GHC Haskell has quite a bunch of dependently-typed features (see DaniDiaz' answer), but these are still quite clearly separated from the value-level language.
I'd like to propose a very simple alternative which doesn't require any extensions or trickery: don't use different types.
Here is a type which can hold lists with any number of nestings, provided you say how many up front:
data NestList a = Zero a | Succ (NestList [a]) deriving Show
instance Functor NestList where
fmap f (Zero a) = Zero (f a)
fmap f (Succ as) = Succ (fmap (map f) as)
A value of this type is a church numeral indicating how many layers of nesting there are, followed by a value with that many layers of nesting; for example,
Succ (Succ (Zero [['a']])) :: NestList Char
It's now easy-cheesy to write your \x -> [x] iteration; since we want one more layer of nesting, we add one Succ.
> iterate (\x -> Succ (fmap (:[]) x)) (Zero 8) !! 5
Succ (Succ (Succ (Succ (Succ (Zero [[[[[8]]]]])))))
Your proposal for how to implement natural numbers can be modified similarly to use a simple recursive type. But the standard way is even cleaner: just take the above NestList and drop all the arguments.
data Nat = Zero | Succ Nat
This problem indeed requires somewhat advanced type-level programming.
I followed #chi's suggestion in the comments, and searched for a library that provided inductive type-level naturals with their corresponding singletons. I found the fin library, which is used in the answer.
The usual extensions for type-level trickery:
{-# language DataKinds, PolyKinds, KindSignatures, ScopedTypeVariables, TypeFamilies #-}
Here's a type family that maps a type-level natural and an element type to the type of the corresponding nested list:
import Data.Type.Nat
type family Nested (n::Nat) a where
Nested Z a = [a]
Nested (S n) a = [Nested n a]
For example, we can test from ghci that
*Main> :kind! Nested Nat3 Int
Nested Nat3 Int :: *
= [[[[Int]]]]
(Nat3 is a convenient alias defined in Data.Type.Nat.)
And here's a newtype that wraps the function we want to construct. It uses the type family to express the level of nesting
newtype Iterate (n::Nat) a = Iterate { runIterate :: (a -> [a]) -> a -> Nested n a }
The fin library provides a really nifty induction1 function that lets us compute a result by induction on Nat. We can use it to compute the Iterate that corresponds to every Nat. The Nat is passed implicitly, as a constraint:
iterate' :: forall n a. SNatI n => Iterate (n::Nat) a
iterate' =
let step :: forall m. SNatI m => Iterate m a -> Iterate (S m) a
step (Iterate recN) = Iterate (\f a -> [recN f a])
in induction1 (Iterate id) step
Testing the function in ghci (using -XTypeApplications to supply the Nat):
*Main> runIterate (iterate' #Nat3) pure True
[[[[True]]]]

How can Data.IntMap.Strict be kept strict after a traversal?

Data.IntMap.Strict docs say:
Be aware that the Functor, Traversable and Data instances are the same
as for the Data.IntMap.Lazy module, so if they are used on strict
maps, the resulting maps will be lazy.
I use IntMap.traverseWithKey (Functor f => Applicative f) because I want a mapWithKey and a maprWithKey which doesn't exist. Instead I use the Backwards functor on a strict map. How can I keep the map strict after an Applicative use?
You can use mapAccumWithKey and mapAccumRWithKey, without using the accumulator argument. After optimization it's most likely exactly as fast as a mapWithKey function would be.
EDIT: if you're doing monadic traversals, and would like to make the possibly resulting IntMap-s strict, then you can achieve that by returning strict values inside the monadic action.
import Data.IntMap.Strict
import Control.Applicative
m :: IntMap Int
m = fromList $ zip [0..] (replicate 10 0)
traverse (\n -> Just (n + 100)) m returns a Just m map that contains n + 100 thunks. traverse (\n -> Just $! n + 100) m returns a map containing evaluated Int-s. Likewise, use return $! x in other monads to get strict results.

How do I implement this function without using mixed lists?

Let's say I have a function that takes a list of function-list pairs, and maps the function in each pair onto the list in the pair, for example:
myFunction [("SOME"++,["DAY","ONE"]), (show,[1,2])] == [["SOMEDAY", "SOMEONE"],["1","2"]]
Is there a way of implementing myFunction so that the code I provided above will work as is without any modifications?
My problem is I can't figure out how to implement myFunction because the types of each sub-list could be different (in my example I have a list of strings ["DAY", ONE"], and a list of numbers: [1,2]). I know that each function in the list will convert its list into a list of strings (so the final list will have type [[Char]]), but I don't know how to express this in Haskell.
You can do it with existential types
{-# LANGUAGE ExistentialQuantification #-}
data T = forall a b. Show b => (:?:) (a -> b) [a]
table =
[ ("SOME"++) :?: ["DAY","ONE"]
, (show) :?: [1,2]
, (+1) :?: [2.9, pi]
]
And run it as:
apply :: T -> String
apply (f :?: xs) = show $ map f xs
main = print $ map apply table
You want to use existential quantification to define a type that can hold any value as long as it is a member of the Show typeclass. For example:
{-# LANGUAGE ExistentialQuantification #-}
data S = forall a. Show a => S a
instance Show S where
show (S s) = show s
f :: [S] -> [String]
f xs = map show xs
And now in ghci:
*Main> f [S 1, S True, S 'c']
["1","True","'c'"]
You won't be able to run the code in your question without modification, because it contains a heterogeneous list, which the Haskell type system forbids. Instead you can wrap heterogeneous types up as a variant type (if you know in advance all the types that will be required) or as an existentially quantified type (if you don't know what types will be required, but you do know a property that they must satisfy).

Random-Pivot Quicksort in Haskell

Is it possible to implement a quicksort in Haskell (with RANDOM-PIVOT) that still has a simple Ord a => [a]->[a] signature?
I'm starting to understand Monads, and, for now, I'm kind of interpreting monads as somethink like a 'command pattern', which works great for IO.
So, I understand that a function that returns a random number should actually return a monadic value like IO, because, otherwise, it would break referential transparency. I also understand that there should be no way to 'extract' the random integer from the returned monadic value, because, otherwise, it would, again, break referential transparency.
But yet, I still think that it should be possible to implement a 'pure' [a]->[a] quicksort function, even if it uses random pivot, because, it IS referential transparent. From my point of view, the random pivot is just a implementation detail, and shouldn't change the function's signature
OBS: I'm not actually interested in the specific quicksort problem (so, I don't want to sound rude but I'm not looking for "use mergesort" or "random pivot doesn't increase performance in practice" kind of answers) I'm actually interested in how to implement a 'pure' function that uses 'impure' functions inside it, in cases like quicksort, where I can assure that the function actually is a pure one.
Quicksort is just a good example.
You are making a false assumption that picking the pivot point is just an implementation detail. Consider a partial ordering on a set. Like a quicksort on cards where
card a < card b if the face value is less but if you were to evaluate booleans:
4 spades < 4 hearts (false)
4 hearts < 4 spades (false)
4 hearts = 4 spades (false)
In that case the choice of pivots would determine the final ordering of the cards. In precisely the same way
for a function like
a = get random integer
b = a + 3
print b
is determined by a. If you are randomly choosing something then your computation is or could be non deterministic.
OK, check this out.
Select portions copied form the hashable package, and voodoo magic language pragmas
{-# LANGUAGE FlexibleInstances, UndecidableInstances, NoMonomorphismRestriction, OverlappingInstances #-}
import System.Random (mkStdGen, next, split)
import Data.List (foldl')
import Data.Bits (shiftL, xor)
class Hashable a where
hash :: a -> Int
instance (Integral a) => Hashable a where
hash = fromIntegral
instance Hashable Char where
hash = fromEnum
instance (Hashable a) => Hashable [a] where
hash = foldl' combine 0 . map hash
-- ask the authors of the hashable package about this if interested
combine h1 h2 = (h1 + h1 `shiftL` 5) `xor` h2
OK, so now we can take a list of anything Hashable and turn it into an Int. I've provided Char and Integral a instances here, more and better instances are in the hashable packge, which also allows salting and stuff.
This is all just so we can make a number generator.
genFromHashable = mkStdGen . hash
So now the fun part. Let's write a function that takes a random number generator, a comparator function, and a list. Then we'll sort the list by consulting the generator to select a pivot, and the comparator to partition the list.
qSortByGen _ _ [] = []
qSortByGen g f xs = qSortByGen g'' f l ++ mid ++ qSortByGen g''' f r
where (l, mid, r) = partition (`f` pivot) xs
pivot = xs !! (pivotLoc `mod` length xs)
(pivotLoc, g') = next g
(g'', g''') = split g'
partition f = foldl' step ([],[],[])
where step (l,mid,r) x = case f x of
LT -> (x:l,mid,r)
EQ -> (l,x:mid,r)
GT -> (l,mid,x:r)
Library functions: next grabs an Int from the generator, and produces a new generator. split forks the generator into two distinct generators.
My functions: partition uses f :: a -> Ordering to partition the list into three lists. If you know folds, it should be quite clear. (Note that it does not preserve the initial ordering of the elements in the sublists; it reverses them. Using a foldr could remedy this were it an issue.) qSortByGen works just like I said before: consult the generator for the pivot, partition the list, fork the generator for use in the two recursive calls, recursively sort the left and right sides, and concatenate it all together.
Convenience functions are easy to compose from here
qSortBy f xs = qSortByGen (genFromHashable xs) f xs
qSort = qSortBy compare
Notice the final function's signature.
ghci> :t qSort
qSort :: (Ord a, Hashable a) => [a] -> [a]
The type inside the list must implement both Hashable and Ord. There's the "pure" function you were asking for, with one logical added requirement. The more general functions are less restrictive in their requirements.
ghci> :t qSortBy
qSortBy :: (Hashable a) => (a -> a -> Ordering) -> [a] -> [a]
ghci> :t qSortByGen
qSortByGen
:: (System.Random.RandomGen t) =>
t -> (a -> a -> Ordering) -> [a] -> [a]
Final notes
qSort will behave exactly the same way for all inputs. The "random" pivot selection is. in fact, deterministic. But it is obscured by hashing the list and then seeding a random number generator, making it "random" enough for me. ;)
qSort also only works for lists with length less than maxBound :: Int, which ghci tells me is 9,223,372,036,854,775,807. I thought there would be an issue with negative indexes, but in my ad-hoc testing I haven't run into it yet.
Or, you can just live with the IO monad for "truer" randomness.
qSortIO xs = do g <- getStdGen -- add getStdGen to your imports
return $ qSortByGen g compare xs
ghci> :t qSortIO
qSortIO :: (Ord a) => [a] -> IO [a]
ghci> qSortIO "Hello world"
" Hdellloorw"
ghci> qSort "Hello world"
" Hdellloorw"
In such cases, where you know that the function is referentially transparent, but you can't proof it to the compiler, you may use the function unsafePerformIO :: IO a -> a from the module Data.Unsafe.
For instance, you may use unsafePerformIO to get an initial random state and then do anything using just this state.
But please notice: Don't use it if it's not really needed. And even then, think twice about it. unsafePerformIO is somewhat the root of all evil, since it's consequences can be dramatical - anything is possible from coercing different types to crashing the RTS using this function.
Haskell provides the ST monad to perform non-referentially-transparent actions with a referentially transparent result.
Note that it doesn't enforce referential transparency; it just insures that potentially non-referentially-transparent temporary state can't leak out. Nothing can prevent you from returning manipulated pure input data that was rearranged in a non-reproducible way. Best is to implement the same thing in both ST and pure ways and use QuickCheck to compare them on random inputs.

Resources