Infinite (finally-periodic) HList in Haskell - haskell

let's say I have an infinite sequence of actions, each of which returns the result of a certain type. Something like:
newtype Stream a = Stream (IO (a, Stream a))
But with a varying over time. I want to strongly type this sequence. It's obviously does not make sence for arbitrary infinite type sequence and naive approach such that:
data HStream :: [u] -> * where Cons :: Proxy x -> HStream xs -> HStream (x ': xs)
infiniteInt = Cons (Proxy :: Proxy Int) infiniteInt
will lead to an infinite type, which is not supported by Haskell's type system. But I don't see nothing wrong with a finally-periodic HLists (i.e. such what type sequence will repeat itself from some point: [Bool, Int, Int, Sting, Int, Sting, Int, Sting ... ]). And I also think that if we have some strongly normalizing way to describe infinite type or some way to provide an evidence of infinite type equality which can be checked in finite number of steps, it should be possible to typecheck program with such infinite types.
Does anyone have any idea how such types can be represented and used in Haskell? Let's start from infinite finally-periodic hlist for now, but I will also appreciate if someone has an idea how it can be generalized for wider class of infinite tupes and where generalization limits lays.

Make HLists infinite and periodic with this One Cool Trick!
When you add an element to your periodic heterogeneous stream, don't extend the list of types by which it's indexed. Rotate it.
type family Append x xs where
Append x '[] = '[x]
Append x (y ': xs) = y ': Append x xs
infixr 5 :::
data HStream as where
(:::) :: { headHS :: a, tailHS :: HStream (Append a as) } -> HStream (a ': as)
myHStream :: HStream '[Char, Bool, Int]
myHStream = 'c' ::: True ::: 3 ::: 'x' ::: False ::: -5 ::: myHStream

One general option is to switch from an HList, which encodes the types of all the elements, to a type-aligned list (or, more generally, a type-aligned sequence), which only ensures transitions along valid paths.
data TAList c x z where
Nil :: TAList c x x
Cons :: c x y -> TAList c y z -> TAList c x z
So you could encode your transitions, with some care, using a possibility-large GADT for c and an appropriate kind of your choice for x and z. Infinite type-aligned lists are no problem, because they're polymorphic in their final type argument.
You could probably use a McBride-style indexing scheme instead of an Atkey one to get more flexibility, at the cost of more complexity.

Related

Design options for constructor constraints: GADT compare PatternSynonym Required

(This is a follow-up to this answer, trying to get the q more precise.)
Use Case Constructors to build/access a Set datatype. Being a set, the invariant is 'no duplicates'. To implement that I need an Eq constraint on the element type. (More realistically, the set might be implemented as a BST or hash-index, which'll need a more restrictive constraint; using Eq here to keep it simple.)
I want to disallow building even an empty set with an unacceptable type.
" it's now considered bad practice to require a constraint for an operation (data type construction or destruction) that does not need the constraint. Instead, the constraints should be moved closer to the "usage site".", to quote that answer.
OK so there's no need to get the constraint 'built into' the data structure. And operations that access/deconstruct (like showing or counting elements) won't necessarily need Eq.
Then consider two (or rather five) possible designs:
(I'm aware some of the constraints could be achieved via deriving, esp Foldable to get elem. But I'll hand-code here, so I can see what minimal constraints GHC wants.)
Option 1: No constraints on datatype
data NoCSet a where -- no constraints on the datatype
NilSet_ :: NoCSet a
ConsSet_ :: a -> NoCSet a -> NoCSet a
Option 1a. use PatternSynonym as 'smart constructor'
pattern NilSet :: (Eq a) => () => NoCSet a
pattern NilSet = NilSet_
pattern ConsSet x xs <- ConsSet_ x xs where
ConsSet x xs | not (elemS x xs) = ConsSet_ x xs
elemS x NilSet_ = False
elemS x (ConsSet_ y ys) | x == y = True -- infers (Eq a) for elemS
| otherwise = elemS x ys
GHC infers the constraint elemS :: Eq t => t -> NoCSet t -> Bool. But it doesn't infer a constraint for ConsSet that uses it. Rather, it rejects that definition:
* No instance for (Eq a) arising from a use of `elemS'
Possible fix:
add (Eq a) to the "required" context of
the signature for pattern synonym `ConsSet'
Ok I'll do that, with an explicitly empty 'Provided' constraint:
pattern ConsSet :: (Eq a) => () => ConsType a -- Req => Prov'd => type; Prov'd is empty, so omittable
Consequently inferred type (\(ConsSet x xs) -> x) :: Eq a => NoCSet a -> a, so the constraint 'escapes' from the destructor (also from elemS), whether or not I need it at the "usage site".
Option 1b. Pattern synonym wrapping a GADT constructor as 'smart constructor'
data CSet a where CSet :: Eq a => NoCSet a -> CSet a -- from comments to the earlier q
pattern NilSetC = CSet NilSet_ -- inferred Eq a provided
pattern ConsSetC x xs <- CSet (ConsSet_ x xs) where -- also inferred Eq a provided
ConsSetC x xs | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature, does infer pattern ConsSetC :: () => Eq a => a -> NoCSet a -> CSet a a Provided constraint, but empty Required.
Inferred (\(ConsSetC x xs) -> x) :: CSet p -> p, so the constraint doesn't escape from a "usage site".
But there's a bug: to Cons an element, I need to unwrap the NoCSet inside the CSet in the tail then re-wrap a CSet. And trying to do that with the ConsSetC pattern alone is ill typed. Instead:
insertCSet x (CSet xs) = ConsSetC x xs -- ConsSetC on rhs, to check for duplicates
As 'smart constructors' go, that's dumb. What am I doing wrong?
Inferred insertCSet :: a -> CSet a -> CSet a, so again the constraint doesn't escape.
Option 1c. Pattern synonym wrapping a GADT constructor as 'smarter constructor'
Same setup as option 1b, except this monster as ViewPattern for the Cons pattern
pattern ConsSetC2 x xs <- ((\(CSet (ConsSet_ x' xs')) -> (x', CSet xs')) -> (x, xs)) where
ConsSetC2 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature, does infer pattern ConsSetC2 :: a -> CSet a -> CSet a with no constraint at all. I'm nervous. But it does correctly reject attempts to build a set with duplicates.
Inferred (\(ConsSetC2 x xs) -> x) :: CSet a -> a, so the constraint that isn't there doesn't escape from a "usage site".
Edit: ah, I can get a somewhat less monstrous ViewPattern expression to work
pattern ConsSetC3 x xs <- (CSet (ConsSet_ x (CSet -> xs))) where
ConsSetC3 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
Curiously inferred pattern ConsSetC3 :: () => Eq a => a -> CSet a -> CSet a -- so the Provided constraint is visible, unlike with ConsSetC2, even though they're morally equivalent. It does reject attempts to build a set with duplicates.
Inferred (\(ConsSetC3 x xs) -> x) :: CSet p -> p, so that constraint that is there doesn't excape from "usage sites".
Option 2: GADT constraints on datatype
data GADTSet a where
GADTNilSet :: Eq a => GADTSet a
GADTConsSet :: Eq a => a -> GADTSet a -> GADTSet a
elemG x GADTNilSet = False
elemG x (GADTConsSet y ys) | x == y = True -- no (Eq a) 'escapes'
| otherwise = elemG x ys
GHC infers no visible constraint elemG :: a -> GADTSet a -> Bool; (\(GADTConsSet x xs) -> x) :: GADTSet p -> p.
Option 2a. use PatternSynonym as 'smart constructor' for the GADT
pattern ConsSetG x xs <- GADTConsSet x xs where
ConsSetG x xs | not (elemG x xs) = GADTConsSet x xs -- does infer Provided (Eq a) for ConsSetG
GHC doesn't complain about the lack of signature, does infer pattern ConsSetG :: () => Eq a => a -> GADTSet a -> GADTSet a a Provided constraint, but empty Required.
Inferred (\(ConsSetG x xs) -> x) :: GADTSet p -> p, so the constraint doesn't escape from a "usage site".
Option 2b. define an insert function
insertGADTSet x xs | not (elemG x xs) = GADTConsSet x xs -- (Eq a) inferred
GHC infers insertGADTSet :: Eq a => a -> GADTSet a -> GADTSet a; so the Eq has escaped, even though it doesn't escape from elemG.
Questions
With insertGADTSet, why does the constraint escape? It's only needed for the elemG check, but elemG's type doesn't expose the constraint.
With constructors GADTConsSet, GADTNilSet, there's a constraint wrapped 'all the way down' the data structure. Does that mean the data structure has a bigger memory footprint than with ConsSet_, NilSet_?
With constructors GADTConsSet, GADTNilSet, it's the same type a 'all the way down'. Is the same Eq a dictionary repeated at each node? Or shared?
By comparison, pattern ConsSetC/constructor CSet/Option 1b wraps only a single dictionary(?), so it'll have a smaller memory footprint than a GADTSet structure(?)
insertCSet has a performance hit of unwrapping and wrapping CSets?
ConsSetC2 in the build direction seems to work; there's a performance hit in unwrapping and wrapping CSets? But worse there's a unwrapping/wrapping performance hit in accessing/walking the nodes?
(I'm feeling there's no slam-dunk winner amongst these options for my use case.)
I don't think there is any realistic scenario where it is important that you disallow the creation of an empty set with an "unacceptable" type. It is at least partly because of lack of such realistic scenarios that DatatypeContexts is considered bad practice. Seriously, try to imagine how such a restriction could possibly help avoid real-world programming errors. That is, try to imagine how someone using your types and functions might (1) write an erroneous program that (2) "accidentally" uses a set of, say, functions and yet (3) somehow gets it to type check in a manner that (4) could have been caught if only there'd been an extra Eq constraint on NilSet. As soon as a programmer tries to do anything with that set that makes it non-empty (i.e., anything that needs Eq functionality), it won't type check, so what exactly are you trying to prevent? You want to stop someone who only needs empty sets of functions from using your types? Why? Is it spite? ... It is spite, isn't it?
Getting down to your various options, putting the constraint in a GADT is inappropriate and unnecessary to my mind. The point of constraints in GADTs is to allow destruction to dynamically and/or conditionally bring an instance dictionary into scope, based on a runtime case match. You do not need this functionality. In particular, you do not need the overhead of a dictionary in every one of your Cons nodes as per option 2. However, you also don't need a dictionary in the data type as per option 1(b). It's better to use the normal non-GADT mechanisms of passing dictionaries to functions instead of carrying them around in the data types. I expect you'll be missing many opportunities for specialization and optimization if you try option 1(b). Partly this may be because GADTs are intrinsically harder to optimize, but there's also much less work that's been put into optimizing code using GADTs than code using non-GADTs. Some of your questions suggest you're very concerned about small performance gains. If so, it's generally a good idea to stay well away from GADTs!
Option 1(a) is a reasonable solution. Without the unnecessary Eq constraint on Nil, and folding the insert function into the pattern definition, you get something like:
{-# LANGUAGE PatternSynonyms #-}
data Set a = Nil | Cons_ a (Set a) deriving (Show)
pattern Cons :: (Eq a) => a -> Set a -> Set a
pattern Cons x xs <- Cons_ x xs
where Cons x xs = go xs
where go Nil = Cons_ x xs
go (Cons_ y ys) | x == y = xs
| otherwise = go ys
which seems like a perfectly idiomatic, straightforward, smart constructor implementation using patterns, as designed.
Indeed, it's unfortunately that the constraint applies to destruction, when it isn't really needed. Ideally, GHC would allow constraints for use of Cons as a constructor to be specified separately, instead of assuming they're the combination of the required and provided constraints for destruction.
This would allow us to write something like:
pattern Cons :: a -> Set a -> Set a
pattern Cons x xs <- Cons_ x xs
where Cons :: Eq a => a -> Set a -> Set a -- <== only need Eq here
Cons x Nil = Cons_ x Nil
Cons x rest#(Cons_ y xs) | x == y = rest
| otherwise = Cons_ y (Cons x xs)
and then usages of Cons as a destructor could be constraint-free while uses as a constructor could take advantage of the Eq a constraint. I see this as a limitation of PatternSynonyms rather than an indication that adding unnecessary constraints a la DatatypeContexts is actually good programming practice. It looks like at least a few other people agree that this is a bug not a feature.
To your first four questions:
In option 2(b), insertGADTSet needs an Eq a dictionary to insert into the dictionary slot in the GADTConsSet constructor on the RHS. So, the Eq a constraint comes from the use of the GADTConsSet constructor.
Yes, GADT constraints become additional fields in the data type. In option 2, a pointer to the Eq a dictionary is added to every node in your set.
The dictionary itself is shared, but each node includes its own pointer to the dictionary.
Yes, for the CSet type, there's only one pointer to the dictionary per CSet value.
I believe that option 1b is your best bet. Of course, I may be biased, as I'm the one which suggested it on your other question.
To address the issue you've pointed out with your pattern synonym, let us imagine we don't have pattern synonyms. How might we deconstruct a Cons set?
One way is to write a method with this signature:
openSetC :: CSet a -> (Eq a => a -> CSet a -> r) -> (Eq a => r) -> r
Which says that given:
a CSet a,
a function which takes: a proof that Eq a, an a, and another CSet a, and returns some arbitrary type,
and a function which takes a proof that Eq a and returns the same arbitrary type,
we can produce a value of that arbitrary type. Since the type is arbitrary, we know that the values comes from calling the function, or from the given value of that type. The contract of this function is that it invokes the first function and return its result if and only if the set is ConsSet_, otherwise, if it is NilSet_, it invokes the second function.
If you squint a little, you can see this function is in a sense "equivalent" to pattern matching on CSet. You don't need pattern matching at all anymore; you can do everything with this function that you can do with pattern matching.
It's implementation is quite trivial:
openSetC (CSet (ConsSet_ x xs)) k _ = k x (CSet xs)
openSetC (CSet NilSet_) _ z = z
Consider now a different form of this function, which accomplishes all the same things, but is maybe a bit easier to use.
Note that the type forall r . (a -> r) -> (b -> r) -> r is isomorphic to Either a b. Also note that x0 -> y0 -> r is isomorphic (or close enough) to (x0, y0) -> r. And finally note that C a => r is isomorphic to Dict (C a) -> r where:
data Dict c where Dict :: c => Dict c
If we exploit these isomorphisms, we can write openSetC differently as:
openSetC' :: CSet a -> Either (a, CSet a, Dict (Eq a)) (Dict (Eq a))
openSetC' (CSet (ConsSet_ x xs)) = Left (x, CSet xs, Dict)
openSetC' (CSet NilSet_) = Right Dict
Now the fun part: using ViewPatterns, we can use this function directly to easily write the pattern with the signature you want. It's easy only because we've set up the type of openSetC' to match with the type of the pattern you want:
pattern ConsSetC :: () => Eq a => a -> CSet a -> CSet a
pattern ConsSetC x xs <- (openSetC' -> Left (x, xs, Dict))
-- included for completeness, but the expression form of the pattern synonym is not at issue here
where
ConsSetC x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
| otherwise = CSet xs
As for the rest of your questions, I would strongly suggest splitting them up into different posts so they could all have focused answers.

Conditions on list comprehension using Haskell and SBV

I want to write a Haskell list comprehension with a condition on symbolic expressions (SBV). I reproduced the problem with the following small example.
import Data.SBV
allUs :: [SInteger]
allUs = [0,1,2]
f :: SInteger -> SBool
f 0 = sTrue
f 1 = sFalse
f 2 = sTrue
someUs :: [SInteger]
someUs = [u | u <- allUs, f u == sTrue]
with show someUs, this gives the following error
*** Data.SBV: Comparing symbolic values using Haskell's Eq class!
***
*** Received: 0 :: SInteger == 0 :: SInteger
*** Instead use: 0 :: SInteger .== 0 :: SInteger
***
*** The Eq instance for symbolic values are necessiated only because
*** of the Bits class requirement. You must use symbolic equality
*** operators instead. (And complain to Haskell folks that they
*** remove the 'Eq' superclass from 'Bits'!.)
CallStack (from HasCallStack):
error, called at ./Data/SBV/Core/Symbolic.hs:1009:23 in sbv-8.8.5-IR852OLMhURGkbvysaJG5x:Data.SBV.Core.Symbolic
Changing the condition into f u .== sTrue also gives an error
<interactive>:8:27: error:
• Couldn't match type ‘SBV Bool’ with ‘Bool’
Expected type: Bool
Actual type: SBool
• In the expression: f u .== sTrue
In a stmt of a list comprehension: f u .== sTrue
In the expression: [u | u <- allUs, f u .== sTrue]
How to get around this problem?
Neither your f nor your someUs are symbolically computable as written. Ideally, these should be type-errors, rejected out-of-hand. This is due to the fact that symbolic values cannot be instances of the Eq class: Why? Because determining equality of symbolic values requires a call to the underlying solver; so the result cannot be Bool; it should really be SBool. But Haskell doesn't allow generalized guards in pattern-matching to allow for that possibility. (And there are good reasons for that too, so it's not really Haskell's fault here. It's just that the two styles of programming don't work well all that great together.)
You can ask why SBV makes symbolic values an instance of the Eq class. The only reason why it's an instance of Eq is what the error message is telling you: Because we want them to be instances of the Bits class; which has Eq as a superclass requirement. But that's a whole another discussion.
Based on this, how can you write your functions in SBV? Here's how you'd code f in the symbolic style:
f :: SInteger -> SBool
f i = ite (i .== 0) sTrue
$ ite (i .== 1) sFalse
$ ite (i .== 2) sTrue
$ sFalse -- arbitrarily filled to make the function total
Ugly, but this is the only way to write it unless you want to play some quasi-quoting tricks.
Regarding someUs: This isn't something you can directly write symbolically either: This is known as a spine-concrete list. And there's no way for SBV to know how long your resulting list would be without actually running the solver on individual elements. In general you cannot do filter like functions on a spine-concrete list with symbolic elements.
The solution is to use what's known as a symbolic list and a bounded-list abstraction. This isn't very satisfactory, but is the best you can do to avoid termination problems:
{-# LANGUAGE OverloadedLists #-}
import Data.SBV
import Data.SBV.List
import Data.SBV.Tools.BoundedList
f :: SInteger -> SBool
f i = ite (i .== 0) sTrue
$ ite (i .== 1) sFalse
$ ite (i .== 2) sTrue
$ sFalse -- arbitrarily filled to make the function total
allUs :: SList Integer
allUs = [0,1,2]
someUs :: SList Integer
someUs = bfilter 10 f allUs
When I run this, I get:
*Main> someUs
[0,2] :: [SInteger]
But you'll ask what's that number 10 in the call to bfilter? Well, the idea is that all lists are assumed to have some sort of an upper bound on their length, and the Data.SBV.Tools.BoundedList exports a bunch of methods to deal with them easily; all taking a bound parameter. So long as the inputs are at most this length long, they'll work correctly. There's no guarantee as to what happens if your list is longer than the bound given. (In general it'll chop off your lists at the bound, but you should not rely on that behavior.)
There's a worked-out example of uses of such lists in coordination with BMC (bounded-model-checking) at https://hackage.haskell.org/package/sbv-8.12/docs/Documentation-SBV-Examples-Lists-BoundedMutex.html
To sum up, dealing with lists in a symbolic context comes with some costs in modeling and how much you can do, due to restrictions in Haskell (where Bool is a fixed type instead of a class), and underlying solvers, which cannot deal with recursively defined functions all that well. The latter is mainly due to the fact that such proofs require induction, and SMT-solvers cannot do induction out-of-the-box. But if you follow the rules of the game using BMC like ideas, you can handle practical instances of the problem up to reasonable bounds.
(.==) takes two instances of EqSymbolic, returning an SBool. Inside a list comprehension, conditionals are implemented using the guard function.
Here's what it looks like:
guard :: Alternative f => Bool -> f ()
guard False = empty
guard True = pure ()
For lists, empty is [], and pure () returns a singleton list [()]. Any member of the list that evaluates to False will return an empty list instead of a unit item, excluding it from computations down the chain.
[True, False, True] >>= guard
= concatMap guard [True, False, True]
= concat $ map guard [True, False, True]
= concat $ [[()], [], [()]]
= [(), ()]
The second branch is then excluded when the context is flattened, so it's "pruned" from the computation.
It seems like you have two problems here - when you pattern match in f, you're doing a comparison using the Eq class. That's where the SBV error is coming from. Since your values are close together, you could use select, which takes a list of items, a default, an expression which evaluates to an index, and attempt to take the indexth item from that list.
You could rewrite f as
f :: SInteger -> SBool
f = select [sTrue, sFalse, sTrue] sFalse
The second problem is that guards explicitly look for Bool, but (.==) still returns an SBool. Looking at Data.SBV, you should be able to coerce that into a regular Bool using unliteral, which attempts to unwrap an SBV value into an equivalent Haskell one.
fromSBool :: SBool -> Bool
fromSBool = fromMaybe False . unliteral
someUs :: [SInteger]
someUs = [u | u <- allUs, fromSBool (f u)]
-- [0 :: SInteger, 2 :: SInteger]

How to "iterate" over a function whose type changes among iteration but the formal definition is the same

I have just started learning Haskell and I come across the following problem. I try to "iterate" the function \x->[x]. I expect to get the result [[8]] by
foldr1 (.) (replicate 2 (\x->[x])) $ (8 :: Int)
This does not work, and gives the following error message:
Occurs check: cannot construct the infinite type: a ~ [a]
Expected type: [a -> a]
Actual type: [a -> [a]]
I can understand why it doesn't work. It is because that foldr1 has type signature foldr1 :: Foldable t => (a -> a -> a) -> a -> t a -> a, and takes a -> a -> a as the type signature of its first parameter, not a -> a -> b
Neither does this, for the same reason:
((!! 2) $ iterate (\x->[x]) .) id) (8 :: Int)
However, this works:
(\x->[x]) $ (\x->[x]) $ (8 :: Int)
and I understand that the first (\x->[x]) and the second one are of different type (namely [Int]->[[Int]] and Int->[Int]), although formally they look the same.
Now say that I need to change the 2 to a large number, say 100.
My question is, is there a way to construct such a list? Do I have to resort to meta-programming techniques such as Template Haskell? If I have to resort to meta-programming, how can I do it?
As a side node, I have also tried to construct the string representation of such a list and read it. Although the string is much easier to construct, I don't know how to read such a string. For example,
read "[[[[[8]]]]]" :: ??
I don't know how to construct the ?? part when the number of nested layers is not known a priori. The only way I can think of is resorting to meta-programming.
The question above may not seem interesting enough, and I have a "real-life" case. Consider the following function:
natSucc x = [Left x,Right [x]]
This is the succ function used in the formal definition of natural numbers. Again, I cannot simply foldr1-replicate or !!-iterate it.
Any help will be appreciated. Suggestions on code styles are also welcome.
Edit:
After viewing the 3 answers given so far (again, thank you all very much for your time and efforts) I realized this is a more general problem that is not limited to lists. A similar type of problem can be composed for each valid type of functor (what if I want to get Just Just Just 8, although that may not make much sense on its own?).
You'll certainly agree that 2 :: Int and 4 :: Int have the same type. Because Haskell is not dependently typed†, that means foldr1 (.) (replicate 2 (\x->[x])) (8 :: Int) and foldr1 (.) (replicate 4 (\x->[x])) (8 :: Int) must have the same type, in contradiction with your idea that the former should give [[8]] :: [[Int]] and the latter [[[[8]]]] :: [[[[Int]]]]. In particular, it should be possible to put both of these expressions in a single list (Haskell lists need to have the same type for all their elements). But this just doesn't work.
The point is that you don't really want a Haskell list type: you want to be able to have different-depth branches in a single structure. Well, you can have that, and it doesn't require any clever type system hacks – we just need to be clear that this is not a list, but a tree. Something like this:
data Tree a = Leaf a | Rose [Tree a]
Then you can do
Prelude> foldr1 (.) (replicate 2 (\x->Rose [x])) $ Leaf (8 :: Int)
Rose [Rose [Leaf 8]]
Prelude> foldr1 (.) (replicate 4 (\x->Rose [x])) $ Leaf (8 :: Int)
Rose [Rose [Rose [Rose [Leaf 8]]]]
†Actually, modern GHC Haskell has quite a bunch of dependently-typed features (see DaniDiaz' answer), but these are still quite clearly separated from the value-level language.
I'd like to propose a very simple alternative which doesn't require any extensions or trickery: don't use different types.
Here is a type which can hold lists with any number of nestings, provided you say how many up front:
data NestList a = Zero a | Succ (NestList [a]) deriving Show
instance Functor NestList where
fmap f (Zero a) = Zero (f a)
fmap f (Succ as) = Succ (fmap (map f) as)
A value of this type is a church numeral indicating how many layers of nesting there are, followed by a value with that many layers of nesting; for example,
Succ (Succ (Zero [['a']])) :: NestList Char
It's now easy-cheesy to write your \x -> [x] iteration; since we want one more layer of nesting, we add one Succ.
> iterate (\x -> Succ (fmap (:[]) x)) (Zero 8) !! 5
Succ (Succ (Succ (Succ (Succ (Zero [[[[[8]]]]])))))
Your proposal for how to implement natural numbers can be modified similarly to use a simple recursive type. But the standard way is even cleaner: just take the above NestList and drop all the arguments.
data Nat = Zero | Succ Nat
This problem indeed requires somewhat advanced type-level programming.
I followed #chi's suggestion in the comments, and searched for a library that provided inductive type-level naturals with their corresponding singletons. I found the fin library, which is used in the answer.
The usual extensions for type-level trickery:
{-# language DataKinds, PolyKinds, KindSignatures, ScopedTypeVariables, TypeFamilies #-}
Here's a type family that maps a type-level natural and an element type to the type of the corresponding nested list:
import Data.Type.Nat
type family Nested (n::Nat) a where
Nested Z a = [a]
Nested (S n) a = [Nested n a]
For example, we can test from ghci that
*Main> :kind! Nested Nat3 Int
Nested Nat3 Int :: *
= [[[[Int]]]]
(Nat3 is a convenient alias defined in Data.Type.Nat.)
And here's a newtype that wraps the function we want to construct. It uses the type family to express the level of nesting
newtype Iterate (n::Nat) a = Iterate { runIterate :: (a -> [a]) -> a -> Nested n a }
The fin library provides a really nifty induction1 function that lets us compute a result by induction on Nat. We can use it to compute the Iterate that corresponds to every Nat. The Nat is passed implicitly, as a constraint:
iterate' :: forall n a. SNatI n => Iterate (n::Nat) a
iterate' =
let step :: forall m. SNatI m => Iterate m a -> Iterate (S m) a
step (Iterate recN) = Iterate (\f a -> [recN f a])
in induction1 (Iterate id) step
Testing the function in ghci (using -XTypeApplications to supply the Nat):
*Main> runIterate (iterate' #Nat3) pure True
[[[[True]]]]

List of polymorphic functions in haskell?

Consider the code below:
t1 :: [Int] -> (Int,String)
t1 xs = (sum xs,show $ length xs)
t2 :: [Int] -> (Int,String)
t2 xs = (length xs, (\x -> '?') <$> xs)
t3 :: [Int] -> (Char,String)
t3 (x:xs) = ('Y',"1+" ++ (show $ length xs))
t3 [] = ('N',"empty")
These three functions have a type that only varies partially -- they are entirely usable without needing to know the type of the first component of the tuple they produce. This means that I can operate on them without needing to refer to that type:
fnListToStrs vs fs = (\x -> snd $ x vs) <$> fs
Loading these definitions into GHCi, all three of the functions work independently as an argument to fnListToStrs, and indeed I can pass in a list containing both t1 and t2 because they have the same type:
*Imprec> fnListToStrs [1,2] [t1,t2]
["2","??"]
*Imprec> fnListToStrs [1,2] [t3]
["1+1"]
But I can't pass all 3 at the same time, even though the divergence of types is actually irrelevant to the calculation performed:
*Imprec> fnListToStrs [1,2] [t1,t2]
["2","??"]
*Imprec> fnListToStrs [1,2] [t3]
["1+1"]
I have the feeling that making this work has something to do with either existential or impredicative types, but neither extension has worked for me when using the type declaration I expect fnListToStrs to be able to take, namely:
fnListToStrs :: [Int] -> [forall a.[Int]->(a,String)] -> [String]
Is there some other way to make this work?
Existential is correct, not impredicative. And Haskell doesn't have existentials, except through an explicit wrapper...
{-# LANGUAGE GADTs #-}
data SomeFstRes x z where
SFR :: (x -> (y,z)) -> SomeFstRes x z
> fmap (\(SFR f) -> snd $ f [1,2]) [SFR t1, SFR t2, SFR t3]
["2","??","1+1"]
but, this really is a bit useless. Since you can't possibly do anything with the first result anyway, it's more sensible to just throw it away immediately and put the remaining function in a simple monomorphic list:
> fmap ($[1,2]) [snd . t1, snd . t2, snd . t3]
["2","??","1+1"]
Any way to put these functions into a list will require "wrapping" each of them in some fashion. The simplest wrapping is just
wrap :: (a -> (b, c)) -> a -> c
wrap f = snd . f
There are, indeed, other ways to wrap these (notably with existential types), but you've not given any information to suggest that any of those would be even slightly better in your application than this simplest version.
Here's an example where something more sophisticated might make sense. Suppose you have
data Blob a b = Blob [a -> b] [a]
Now imagine you want to make a list of values of type Blob a b that all have the same b type, but may have different a types. Actually applying each function to each argument could lead to a prohibitively large list of potential results, so it would make sense to write
data WrapBlob b where
WrapBlob :: Blob a b -> WrapBlob b
Now you can make the list and postpone the decision of which function(s) to apply to which argument(s) without paying a prohibitive price.

How to implement delete with foldr in Haskell

I've been studying folds for the past few days. I can implement simple functions with them, like length, concat and filter. What I'm stuck at is trying to implement with foldr functions like delete, take and find. I have implemented these with explicit recursion but it doesn't seem obvious to me how to convert these types of functions to right folds.
I have studied the tutorials by Graham Hutton and Bernie Pope. Imitating Hutton's dropWhile, I was able to implement delete with foldr but it fails on infinite lists.
From reading Implement insert in haskell with foldr, How can this function be written using foldr? and Implementing take using foldr, it would seem that I need to use foldr to generate a function which then does something. But I don't really understand these solutions and don't have an idea how to implement for example delete this way.
Could you explain to me a general strategy for implementing with foldr lazy versions of functions like the ones I mentioned. Maybe you could also implement delete as an example since this probably is one of the easiest.
I'm looking for a detailed explanation that a beginner can understand. I'm not interested in just solutions, I want to develop an understanding so I can come up with solutions to similar problems myself.
Thanks.
Edit: At the moment of writing there is one useful answer but it's not quite what I was looking for. I'm more interested in an approach that uses foldr to generate a function, which then does something. The links in my question have examples of this. I don't quite understand those solutions so I would like to have more information on this approach.
delete is a modal search. It has two different modes of operation - whether it's already found the result or not. You can use foldr to construct a function that passes the state down the line as each element is checked. So in the case of delete, the state can be a simple Bool. It's not exactly the best type, but it will do.
Once you have identified the state type, you can start working on the foldr construction. I'm going to walk through figuring it out the way I did. I'll be enabling ScopedTypeVariables just so I can annotate the type of subexpressions better. One you know the state type, you know you want foldr to generate a function taking a value of that type, and returning a value of the desired final type. That's enough to start sketching things.
{-# LANGUAGE ScopedTypeVariables #-}
delete :: forall a. Eq a => a -> [a] -> [a]
delete a xs = foldr f undefined xs undefined
where
f :: a -> (Bool -> [a]) -> (Bool -> [a])
f x g = undefined
It's a start. The exact meaning of g is a little bit tricky here. It's actually the function for processing the rest of the list. It's accurate to look at it as a continuation, in fact. It absolutely represents performing the rest of the folding, with your whatever state you choose to pass along. Given that, it's time to figure out what to put in some of those undefined places.
{-# LANGUAGE ScopedTypeVariables #-}
delete :: forall a. Eq a => a -> [a] -> [a]
delete a xs = foldr f undefined xs undefined
where
f :: a -> (Bool -> [a]) -> (Bool -> [a])
f x g found | x == a && not found = g True
| otherwise = x : g found
That seems relatively straightforward. If the current element is the one being searched for, and it hasn't yet been found, don't output it, and continue with the state set to True, indicating it's been found. otherwise, output the current value and continue with the current state. This just leaves the rest of the arguments to foldr. The last one is the initial state. The other one is the state function for an empty list. Ok, those aren't too bad either.
{-# LANGUAGE ScopedTypeVariables #-}
delete :: forall a. Eq a => a -> [a] -> [a]
delete a xs = foldr f (const []) xs False
where
f :: a -> (Bool -> [a]) -> (Bool -> [a])
f x g found | x == a && not found = g True
| otherwise = x : g found
No matter what the state is, produce an empty list when an empty list is encountered. And the initial state is that the element being searched for has not yet been found.
This technique is also applicable in other cases. For instance, foldl can be written as a foldr this way. If you look at foldl as a function that repeatedly transforms an initial accumulator, you can guess that's the function being produced - how to transform the initial value.
{-# LANGUAGE ScopedTypeVariables #-}
foldl :: forall a b. (a -> b -> a) -> a -> [b] -> a
foldl f z xs = foldr g id xs z
where
g :: b -> (a -> a) -> (a -> a)
g x cont acc = undefined
The base cases aren't too tricky to find when the problem is defined as manipulating the initial accumulator, named z there. The empty list is the identity transformation, id, and the value passed to the created function is z.
The implementation of g is trickier. It can't just be done blindly on types, because there are two different implementations that use all the expected values and type-check. This is a case where types aren't enough, and you need to consider the meanings of the functions available.
Let's start with an inventory of the values that seem like they should be used, and their types. The things that seem like they must need to be used in the body of g are f :: a -> b -> a, x :: b, cont :: (a -> a), and acc :: a. f will obviously take x as its second argument, but there's a question of the appropriate place to use cont. To figure out where it goes, remember that it represents the transformation function returned by processing the rest of the list, and that foldl processes the current element and then passes the result of that processing to the rest of the list.
{-# LANGUAGE ScopedTypeVariables #-}
foldl :: forall a b. (a -> b -> a) -> a -> [b] -> a
foldl f z xs = foldr g id xs z
where
g :: b -> (a -> a) -> (a -> a)
g x cont acc = cont $ f acc x
This also suggests that foldl' can be written this way with only one tiny change:
{-# LANGUAGE ScopedTypeVariables #-}
foldl' :: forall a b. (a -> b -> a) -> a -> [b] -> a
foldl' f z xs = foldr g id xs z
where
g :: b -> (a -> a) -> (a -> a)
g x cont acc = cont $! f acc x
The difference is that ($!) is used to suggest evaluation of f acc x before it's passed to cont. (I say "suggest" because there are some edge cases where ($!) doesn't force evaluation even as far as WHNF.)
delete doesn't operate on the entire list evenly. The structure of the computation isn't just considering the whole list one element at a time. It differs after it hits the element it's looking for. This tells you it can't be implemented as just a foldr. There will have to be some sort of post-processing involved.
When that happens, the general pattern is that you build a pair of values and just take one of them at completion of the foldr. That's probably what you did when you imitated Hutton's dropWhile, though I'm not sure since you didn't include code. Something like this?
delete :: Eq a => a -> [a] -> [a]
delete a = snd . foldr (\x (xs1, xs2) -> if x == a then (x:xs1, xs1) else (x:xs1, x:xs2)) ([], [])
The main idea is that xs1 is always going to be the full tail of the list, and xs2 is the result of the delete over the tail of the list. Since you only want to remove the first element that matches, you don't want to use the result of delete over the tail when you do match the value you're searching for, you just want to return the rest of the list unchanged - which fortunately is what's always going to be in xs1.
And yeah, that doesn't work on infinite lists - but only for one very specific reason. The lambda is too strict. foldr only works on infinite lists when the function it is provided doesn't always force evaluation of its second argument, and that lambda does always force evaluation of its second argument in the pattern match on the pair. Switching to an irrefutable pattern match fixes that, by allowing the lambda to produce a constructor before ever examining its second argument.
delete :: Eq a => a -> [a] -> [a]
delete a = snd . foldr (\x ~(xs1, xs2) -> if x == a then (x:xs1, xs1) else (x:xs1, x:xs2)) ([], [])
That's not the only way to get that result. Using a let-binding or fst and snd as accessors on the tuple would also do the job. But it is the change with the smallest diff.
The most important takeaway here is to be very careful with handling the second argument to the reducing function you pass to foldr. You want to defer examining the second argument whenever possible, so that the foldr can stream lazily in as many cases as possible.
If you look at that lambda, you see that the branch taken is chosen before doing anything with the second argument to the reducing function. Furthermore, you'll see that most of the time, the reducing function produces a list constructor in both halves of the result tuple before it ever needs to evaluate the second argument. Since those list constructors are what make it out of delete, they are what matter for streaming - so long as you don't let the pair get in the way. And making the pattern-match on the pair irrefutable is what keeps it out of the way.
As a bonus example of the streaming properties of foldr, consider my favorite example:
dropWhileEnd :: (a -> Bool) -> [a] -> [a]
dropWhileEnd p = foldr (\x xs -> if p x && null xs then [] else x:xs) []
It streams - as much as it can. If you figure out exactly when and why it does and doesn't stream, you'll understand pretty much every detail of the streaming structure of foldr.
here is a simple delete, implemented with foldr:
delete :: (Eq a) => a -> [a] -> [a]
delete a xs = foldr (\x xs -> if x == a then (xs) else (x:xs)) [] xs

Resources