'Valuation a' data structure and 'eval' function - haskell

I'm studying haskell and I don't know how to complete one exercise:
We can define a data structure for generalised expressions as
follows:
data Expr a = Lit a | EVar Var | Op (Ops a) [Expr a]
type Ops a = [a] -> a
type Var = Char
To evaluate an expression, we need to know all the values of its variables.
Define a new type or datatype Valuation a, associating variables with values
of the type a. Then write a function:
eval :: Valuation a -> Expr a -> a
that, for the given variable valuation and expression, evaluates (folds) the
expression to a single value.
Extra information (given by my professor with the exercise):
Valuation could be any type, associating Var (= Char) and 'a', example: >[(Var,a)] or Var -> a .
eval function, after receiving a structure (of type: Valuation a) and an >expression (of type: Expr a), should simplify that expression to one "a" type >value.
Expression (Expr Int) example: standard expression such as (x + 10) * y would >look like this: Op* [Op+ [EVar ‘x’, Lit 10], EVar ‘y’]. Is Valuation Int type >structure would associate EVar ‘x’ with 15 and EVar ‘y’ with 2, then the end >result should be 50.
My questions would be:
1) How should such data structure of Valuation a look like? I'm thinking about some kind of a map with keys and values, but I might be totally wrong.
2) For eval I was thinking about writing a function which would include all operators and their priorities, but overall this exercise looks a bit too difficult (in my mind) compared to other exercises we usually do - most of the time the hardest part is figuring out what the exercise wants and the solution itself takes up only about 5-15 lines of code; so, maybe my thinking for this exercise is off?
Any help would be appreciated.

Let's consolidate the comments into an answer:
1) How should such data structure of Valuation a look like? I'm thinking about some kind of a map with keys and values, but I might be totally wrong.
You are indeed looking for something that associates Var keys with a values. A [(Var, a)] list of pairs is one way of achieving that. You can then work with such a list using functions like lookup:
lookup :: Eq a => a -> [(a, b)] -> Maybe b
lookup key assocs looks up a key in an association list.
Alternatively, you might want to consider using a proper map type, such as Map from containers.
2) For eval I was thinking about writing a function which would include all operators and their priorities [...]
Given your problem statement, there is no list of "all operators" from you to include. A second look at the Expr definition...
data Expr a = Lit a | EVar Var | Op (Ops a) [Expr a]
type Ops a = [a] -> a
... shows that an Op expression already encapsulates the operation to be done, in the form of the [a] -> a function in is its first field. Your task, then, boils down to converting the [Expr a] in the second field into an [a], handling each of the three types of Expr appropriately, and then apply the [a] -> a function to the resulting [a] list.

Related

How do I test that something is valid for all elements in a map?

I have the following things in my application:
newtype User = User Text
newtype Counts = Counts (Map User Int)
subjectUnderTest :: Counts -> Text
An example of correct output would be
> subjectUnderTest $ fromList [(User "foo", 4), (User "bar", 4), (User "qux", 2)]
"4: foo, bar\n2: qux"
I would like to write property-based tests that verify things like "all users are represented in the output", "all counts are represented in the output" and "all users are on the same line as their corresponding count". In common for these properties is that the wording of them starts with "all ..."
How do I write a property that verifies that something is valid for each element in the Map?
I'm assuming that this question is only a simplified representation of something more complex, so here's a couple of things strategies to consider:
Split up the functionality
It looks like subjectUnderTest does two unrelated things:
It groups the values in the map by value, instead of by key.
It formats, or pretty-prints, the inverted map.
If you can split up the functionality into those two steps, they're easier to test in in isolation.
The first step, you can make parametrically polymorphic. Instead of testing a function with the type Counts -> Text, consider testing a function with the type Eq b => Map a b -> [(b, [a])]. Property-based testing is easier with parametric polymorphism, because you get certain properties for free. For example, you can be sure that the values in the output can only come from the input, because there's no way to conjure a and b values out of thin air.
You're still going to have to write tests for the properties you ask about. Write a function with a type like Eq b => Map a b -> Testable. If you want to test that all the values are there, pull them out of the map and make list of them. Sort the list and nub it. It's now a [b] value. That's your expected output.
Now call your function. It returns something like [(b, [a])]. Map it using fst, sort and nub it. That list should be equal to your expected output.
For the next step (pretty-printing), see the next section.
Roundtrips
When you want to property-base pretty-printing, the easiest approach is usually to bite the bullet and also write a parser. The printer and the parser should be the dual of each other, so if you have a function MyType -> String, your should have a parser with the type String -> Maybe MyType.
You can now write a general property like MyType -> Testable. It takes as input a value of MyType (let's call it expected). You now produce a value (let's call it actual) as actual = parse $ print expected. You can now verify that Just expected === actual.
If the particular String format is important, I'd follow it up with a few actual examples, using good old parametrised tests.
Just because you're doing property-based testing doesn't mean that a 'normal' unit test can't be useful as well.
Example
Here's a simple example of what I meant above. Assume that
invertMap :: (Ord b, Eq b) => Map a b -> [(b, [a])]
you can define one of the properties as:
allValuesAreNowKeys :: (Show a, Ord a) => Map k a -> Property
allValuesAreNowKeys m =
let expected = nub $ sort $ Map.elems m
actual = invertMap m
in expected === nub (sort $ fmap fst actual)
Since this property is still parametrically polymorphic, you'll have to add it to your test suite with a particular type, e.g.:
tests = [
testGroup "Sorting Group 1" [
testProperty "all values are now keys" (allValuesAreNowKeys :: Map String Int -> Property)]]
There are prettier ways to define lists of properties; that one is just the template used by the quickcheck-test-framework Stack template...

Why are ML/Haskell datatypes useful for defining "languages" like arithmetic expressions?

This is more of a soft question about static type systems in functional languages like those of the ML family. I understand why you need datatypes to describe data structures like lists and trees but defining "expressions" like those of propositional logic within datatypes seems to bring just some convenience and is not necessary. For example
datatype arithmetic_exp = Constant of int
| Neg of arithmetic_exp
| Add of (arithmetic_exp * arithmetic_exp)
| Mult of (arithmetic_exp * arithmetic_exp)
defines a set of values, on which you can write an eval function which would give you the result. You could just as well define 4 functions: const: int -> int, neg: int -> int, add: int * int -> int and mult: int * int -> int and then an expression of the sort add (mult (const 3, neg 2), neg 4) would give you the same thing without any loss of static security. The only complication is that you have to do four things instead of two. While learning SML and Haskell I've been trying to think about which features give you something necessary and which are just a convenience, so this is the reason why I'm asking. I guess this would matter if you want to decouple the process of evaluating a value from the value itself but I'm not sure where that would be useful.
Thanks a lot.
There is a duality between initial / first-order / datatype-based encodings (aka deep embeddings) and final / higher-order / evaluator-based encodings (aka shallow embeddings). You can indeed typically use a typeclass of combinators instead of a datatype (and convert back and forth between the two).
Here is a module showing the two approaches:
{-# LANGUAGE GADTs, Rank2Types #-}
module Expr where
data Expr where
Val :: Int -> Expr
Add :: Expr -> Expr -> Expr
class Expr' a where
val :: Int -> a
add :: a -> a -> a
You can see that the two definitions look eerily similar. Expr' a is basically describing an algebra on Expr which means that you can get an a out of an Expr if you have such an Expr' a. Similarly, because you can write an instance Expr' Expr, you're able to reify a term of type forall a. Expr' a => a into a syntactic value of type Expr:
expr :: Expr' a => Expr -> a
expr e = case e of
Val n -> val n
Add p q -> add (expr p) (expr q)
instance Expr' Expr where
val = Val
add = Add
expr' :: (forall a. Expr' a => a) -> Expr
expr' e = e
In the end, picking a representation over another really depends on what your main focus is: if you want to inspect the structure of the expression (e.g. if you want to optimise / compile it), it's easier if you have access to an AST. If, on the other hand, you're only interested in computing an invariant using a fold (e.g. the depth of the expression or its evaluation), a higher order encoding will do.
The ADT is in a form you can inspect and manipulate in ways other than simply evaluating it. Once you hide all the interesting data in a function call, there is no longer anything you can do with it but evaluate it. Consider this definition, similar to the one in your question, but with a Var term to represent variables and with the Mul and Neg terms removed to focus on addition.
data Expr a = Constant a
| Add (Expr a) (Expr a)
| Var String
deriving Show
The obvious function to write is eval, of course. It requires a way to look up a variable's value, and is straightforward:
-- cheating a little bit by assuming all Vars are defined
eval :: Num a => Expr a -> (String -> a) -> a
eval (Constant x) _env = x
eval (Add x y) env = eval x env + eval y env
eval (Var x) env = env x
But suppose you don't have a variable mapping yet. You have a large expression that you will evaluate many times, for different choices of variable. And some silly recursive function built up an expression like:
Add (Constant 1)
(Add (Constant 1)
(Add (Constant 1)
(Add (Constant 1)
(Add (Constant 1)
(Add (Constant 1)
(Var "x"))))))
It would be wasteful to recompute 1+1+1+1+1+1 every time you evaluate this: wouldn't it be nice if your evaluator could realize that this is just another way of writing Add (Constant 6) (Var "x")?
So, you write an expression optimizer, which runs before any variables are available and tries to simplify expressions. There are many simplification rules you could apply, of course; below I've implemented just two very easy ones to illustrate the point.
simplify :: Num a => Expr a -> Expr a
simplify (Add (Constant x) (Constant y)) = Constant $ x + y
simplify (Add (Constant x) (Add (Constant y) z)) = simplify $ Add (Constant $ x + y) z
simplify x = x
Now how does our silly expression look?
> simplify $ Add (Constant 1) (Add (Constant 1) (Add (Constant 1) (Add (Constant 1) (Add (Constant 1) (Add (Constant 1) (Var "x"))))))
Add (Constant 6) (Var "x")
All the unnecessary stuff has been removed, and you now have a nice clean expression to try for various values of x.
How do you do the same thing with a representation of this expression in functions? You can't, because there is no "intermediate form" between the initial specification of the expression and its final evaluation: you can only look at the expression as a single, opaque function call. Evaluating it at a particular value of x necessarily evaluates each of the subexpressions anew, and there is no way to disentangle them.
Here's an extension of the functional type you propose in your question, again enriched with variables:
type FExpr a = (String -> a) -> a
lit :: a -> FExpr a
lit x _env = x
add :: Num a => FExpr a -> FExpr a -> FExpr a
add x y env = x env + y env
var :: String -> FExpr a
var x env = env x
with the same silly expression to evaluate many times:
sample :: Num a => FExpr a
sample = add (lit 1)
(add (lit 1)
(add (lit 1)
(add (lit 1)
(add (lit 1)
(add (lit 1)
(var "x"))))))
It works as expected:
> sample $ \_var -> 5
11
But it has to do a bunch of addition every time you try it for a different x, even though the addition and the variable are mostly unrelated. And there's no way you can simplify the expression tree. You can't simplify it while defining it: that is, you can't make add smarter, because it can't inspect its arguments at all: its arguments are functions which, as far as add is concerned, could do anything at all. And you can't simplify it after constructing it, either: at that point you just have an opaque function that takes in a variable lookup function and produces a value.
By modeling the important parts of your problem as data types in their own right, you make them values that your program can manipulate intelligently. If you leave them as functions, you get a shorter program that is less powerful, because you lock all the information inside lambdas that only GHC can manipulate.
And once you've written it with ADTs, it's not hard to collapse that representation back into the shorter function-based representation if you want. That is, it might be nice to have a function of type
convert :: Expr a -> FExpr a
But in fact, we've already done this! That's exactly the type that eval has. You just might not have noticed because of the FExpr type alias, which is not used in the definition of eval.
So in a way, the ADT representation is more general and more powerful, acting as a tree that you can fold up in many different ways. One of those ways is evaluating it, in exactly the way that the function-based representation does. But there are others:
Simplify the expression before evaluating it
Produce a list of all the variables that must be defined for this expression to be well formed
Count how deeply nested the deepest part of the tree is, to estimate how many stack frames an evaluator might need
Convert the expression to a String approximating a Haskell expression you could type to get the same result
So if possible you'd like to work with information-rich ADTs for as long as you can, and then eventually fold the tree up into a more compact form once you have something specific to do with it.

Could not deduce (b ~ a)

So I have the following code, I'm trying to write an abstract syntax tree for an interpreter, & I prefer not to jam everything in the same data type, so I was going to write a typeclass that had the basic behaviour (in this case AST).
{-# LANGUAGE ExistentialQuantification #-}
import qualified Data.Map as M
-- ...
data Expr = forall a. AST a => Expr a
type Env = [M.Map String Expr]
class AST a where
reduce :: AST b => a -> Env -> b
-- when i remove the line below, it compiles fine
reduce ast _ = ast
-- ...
When I remove the default implementation of reduce in the typeclass AST it compiles fine, but when ever I provide an implementation that returns it's self it complains. I get the following compiler error
src/Data/AbstractSyntaxTree.hs:13:18:
Could not deduce (b ~ a)
from the context (AST a)
bound by the class declaration for `AST'
at src/Data/AbstractSyntaxTree.hs:(11,1)-(13,20)
or from (AST b)
bound by the type signature for reduce :: AST b => a -> Env -> b
at src/Data/AbstractSyntaxTree.hs:12:13-36
`b' is a rigid type variable bound by
the type signature for reduce :: AST b => a -> Env -> b
at src/Data/AbstractSyntaxTree.hs:12:13
`a' is a rigid type variable bound by
the class declaration for `AST'
at src/Data/AbstractSyntaxTree.hs:11:11
In the expression: ast
In an equation for `reduce': reduce ast _ = ast
The behaviour of AST's reduce will evaluate an AST, and occasionally return a different type of AST, and sometimes the same type of AST.
Edit: Regarding data Expr = forall a. AST a => Expr a & GADTs
I originally went with data Expr = forall a. AST a => Expr a because I wanted to represent types like this.
(+ 2 true) -> Compound [Expr (Ref "+"), Expr 2, Expr true]
(+ a 2) -> Compound [Expr (Ref "+"), Expr (Ref "a"), Expr 2]
(eval (+ 2 2)) -> Compound [Expr (Ref "eval"),
Compound [
Expr (Ref "+"),
Expr 2,
Expr 2]]
((lambda (a b) (+ a b)) 2 2) -> Compound [Expr SomeLambdaAST, Expr 2, Expr 2]
Since I'm generating ASTs from text I feel it would be a burden to represent a strictly typed ASTs in a GADT, although I do see where they could be useful in case like DSLs in Haskell.
But since I'm generating the AST from text (which could contain some of the examples above), it might be a bit hard to predict what AST I'll end up with. I don't want to start juggling between Eithers & Maybes. That is what I ended up doing last time & it was a mess, & I gave up trying to attempt this in Haskell.
But again I'm not the most experienced Haskell programmer so maybe I'm looking at this the wrong way, maybe I can implement an AST with so more rigours typing, so I'll have a look and see if I can come up with GADTs, but I have my doubts & I have a feeling that it might end the way it did last time.
Ultimately I'm just trying to learn Haskell at the moment with a fun finish able project, so I don't mind if my first Haskell project isn't really idiomatic Haskell. Getting something working is a higher priority just so I can make my way around the language and have something to show for it.
Update:
I've taken #cdk's & #bheklilr advice and ditched the existential type, although I've gone with a much simpler type, as opposed to utilising GADTs (also suggested by #cdk's & #bheklilr). It could possibly be a stronger type but again I'm just trying to get familiar with Haskell, so I gave up up after a few hours & went with a simple data type like so :P
import qualified Data.Map as M
type Environment = [M.Map String AST]
data AST
= Compound [AST]
| FNum Double
| Func Function
| Err String
| Ref String
data Function
= NativeFn ([AST] -> AST)
| LangFn [String] AST
-- previously called reduce
eval :: Environment -> AST -> AST
eval env ast = case ast of
Ref ref -> case (lookup ref env ) of
Just ast -> ast
Nothing -> Err ("Not in scope `" ++ ref ++ "'")
Compound elements -> case elements of
[] -> Err "You tried to invoke `()'"
function : args -> case (eval env function) of
Func fn -> invoke env fn args
other -> Err $ "Cannot invoke " ++ (show other)
_ -> ast
-- invoke & lookup are defined else where
Although I will still probably look at GADTs as they seem to be pretty interesting & have lead me to some interesting reading material regarding implementing abstract syntax trees in haskell.
What part of the error message are you having difficulties understanding? I think it's quite clear.
The type of reduce is
reduce :: AST b => a -> Env -> b
The first argument has type a and GHC expects reduce to return something of type b, which may be entirely different from a. GHC is correct to complain that you've tried to return a value of a when it expects b.
The "existential quantification with type class" is (as noted by bheklilr) an anti-pattern. A better approach would be to create an Algebraic Data Type for AST:
data AST a
now reduce becomes a simple function:
reduce :: Env -> AST a -> AST b
if you want reduce to be able to return a different type of AST, you could use Either
reduce :: Env -> AST a -> Either (AST a) (AST b)
but I don't think this is what you really want. My advice is to take a look at the GADT style of creating ASTs and re-evaluate your approach.
You are interpreting this type signature incorrectly (in a way that is common to OO programmers):
reduce :: AST b => a -> Env -> b
This does not mean that reduce can choose any type it likes (that is a member of AST) and return a value of that type. If it did, your implementation would be valid. Rather, it means that for any type b the caller likes (that is a member of AST), reduce must be able to return a value in that type. b could well be the same as a sometimes, but it's the caller's choice, not the choice of reduce.
If your implementation returns a value of type a, then this can only be true if b is always equal to a, which is what the compiler is on about when it reports failing to prove that b ~ a.
Haskell does not have subtypes. Type variables are not supertypes of all the concrete types that could instantiate them, as you might be used to using Object or abstract interface types in OO languages. Rather type variables are parameters; any implementation which claims to have a parametric type must work regardless of what types are chosen for the parameters.
If you want to use a design where reduce can return a value in whatever type of AST it feels like (rather than whatever type of AST is asked of it), then you need to use your Expr box again, since Expr is not parameterized by the type of AST it contains, but can still contain any AST:
reduce :: a -> Env -> Expr
reduce ast _ = Expr ast
Now reduce can work regardless of the types chosen for its type parameters, since there's only a. Consumers of the returned Expr will have no way of constraining the type inside the Expr, so they'll have to be written to work regardless of what that type is.
Your default implementation doesn't compile, because it has the wrong definition.
reduce :: AST b => a -> b -> Env -> b
reduce ast _ = ast
Now ast has the type a and reduce function returns type b but according to your implementation you return ast which is of type a but the compiler expects b.
Even something like this will work:
reduce :: AST b => a -> b -> Env -> b
reduce _ ast _ = ast

How should types be used in Haskell type classes?

I'm new to Haskell, and a little confused about how type classes work. Here's a simplified example of something I'm trying to do:
data ListOfInts = ListOfInts {value :: [Int]}
data ListOfDoubles = ListOfDoubles {value :: [Double]}
class Incrementable a where
increment :: a -> a
instance Incrementable ListOfInts where
increment ints = map (\x -> x + 1) ints
instance Incrementable ListOfDoubles where
increment doubles = map (\x -> x + 1) doubles
(I realize that incrementing each element of a list can be done very simply, but this is just a simplified version of a more complex problem.)
The compiler tells me that I have multiple declarations of value. If I change the definitions of ListOfInts and ListOfDoubles as follows:
type ListOfInts = [Int]
type ListOfDoubles = [Double]
Then the compiler says "Illegal instance declaration for 'Incrementable ListOfInts'" (and similarly for ListOfDoubles. If I use newtype, e.g., newtype ListOfInts = ListOfInts [Int], then the compiler tells me "Couldn't match expected type 'ListOfInts' with actual type '[b0]'" (and similarly for ListOfDoubles.
My understanding of type classes is that they facilitate polymorphism, but I'm clearly missing something. In the first example above, does the compiler just see that the type parameter a refers to a record with a field called value and that it appears I'm trying to define increment for this type in multiple ways (rather than seeing two different types, one which has a field whose type of a list of Ints, and the other whose type is a list of Doubles)? And similarly for the other attempts?
Thanks in advance.
You're really seeing two separate problems, so I'll address them as such.
The first one is with the value field. Haskell records work in a slightly peculiar way: when you name a field, it is automatically added to the current scope as a function. Essentially, you can think of
data ListOfInts = ListOfInts {value :: [Int]}
as syntax sugar for:
data ListOfInts = ListOfInts [Int]
value :: ListOfInt -> [Int]
value (ListOfInts v) = v
So having two records with the same field name is just like having two different functions with the same name--they overlap. This is why your first error tells you that you've declared values multiple times.
The way to fix this would be to define your types without using the record syntax, as I did above:
data ListOfInts = ListOfInts [Int]
data ListOfDoubles = ListOfDoubles [Double]
When you used type instead of data, you simply created a type synonym rather than a new type. Using
type ListOfInts = [Int]
means that ListOfInts is the same as just [Int]. For various reasons, you can't use type synonyms in class instances by default. This makes sense--it would be very easy to make a mistake like trying to write an instance for [Int] as well as one for ListOfInts, which would break.
Using data to wrap a single type like [Int] or [Double] is the same as using newtype. However, newtype has the advantage that it carries no runtime overhead at all. So the best way to write these types would indeed be with newtype:
newtype ListOfInts = ListOfInts [Int]
newtype ListOfDoubles = ListOfDoubles [Double]
An important thing to note is that when you use data or newtype, you also have to "unwrap" the type if you want to get at its content. You can do this with pattern matching:
instance Incrementable ListOfInts where
increment (ListOfInts ls) = ListOfInts (map (\ x -> x + 1) ls)
This unwraps the ListOfInts, maps a function over its contents and wraps it back up.
As long as you unwrap the value this way, your instances should work.
On a side note, you can write map (\ x -> x + 1) as map (+ 1), using something that is called an "operator section". All this means is that you implicitly create a lambda filling in whichever argument of the operator is missing. Most people find the map (+ 1) version easier to read because there is less unnecessary noise.

Creating a list type using functions

For a silly challenge I am trying to implement a list type using as little of the prelude as possible and without using any custom types (the data keyword).
I can construct an modify a list using tuples like so:
import Prelude (Int(..), Num(..), Eq(..))
cons x = (x, ())
prepend x xs = (x, xs)
head (x, _) = x
tail (_, x) = x
at xs n = if n == 0 then xs else at (tail xs) (n-1)
I cannot think of how to write an at (!!) function. Is this even possible in a static language?
If it is possible could you try to nudge me in the right direction without telling me the answer.
There is a standard trick known as Church encoding that makes this easy. Here's a generic example to get you started:
data Foo = A Int Bool | B String
fooValue1 = A 3 False
fooValue2 = B "hello!"
Now, a function that wants to use this piece of data must know what to do with each of the constructors. So, assuming it wants to produce some result of type r, it must at the very least have two functions, one of type Int -> Bool -> r (to handle the A constructor), and the other of type String -> r (to handle the B constructor). In fact, we could write the type that way instead:
type Foo r = (Int -> Bool -> r) -> (String -> r) -> r
You should read the type Foo r here as saying "a function that consumes a Foo and produces an r". The type itself "stores" a Foo inside a closure -- so that it will effectively apply one or the other of its arguments to the value it closed over. Using this idea, we can rewrite fooValue1 and fooValue2:
fooValue1 = \consumeA consumeB -> consumeA 3 False
fooValue2 = \consumeA consumeB -> consumeB "hello!"
Now, let's try applying this trick to real lists (though not using Haskell's fancy syntax sugar).
data List a = Nil | Cons a (List a)
Following the same format as before, consuming a list like this involves either giving a value of type r (in case the constructor was Nil) or telling what to do with an a and another List a, so. At first, this seems problematic, since:
type List a r = (r) -> (a -> List a -> r) -> r
isn't really a good type (it's recursive!). But we can instead demand that we first reduce all the recursive arguments to r first... then we can adjust this type to make something more reasonable.
type List a r = (r) -> (a -> r -> r) -> r
(Again, we should read the type List a r as being "a thing that consumes a list of as and produces an r".)
There's one final trick that's necessary. What we would like to do is to enforce the requirement that the r that our List a r returns is actually constructed from the arguments we pass. That's a little abstract, so let's give an example of a bad value that happens to have type List a r, but which we'd like to rule out.
badList = \consumeNil consumeCons -> False
Now, badList has type List a Bool, but it's not really a function that consumes a list and produces a Bool, since in some sense there's no list being consumed. We can rule this out by demanding that the type work for any r, no matter what the user wants r to be:
type List a = forall r. (r) -> (a -> r -> r) -> r
This enforces the idea that the only way to get an r that gets us off the ground is to use the (user-supplied) consumeNil function. Can you see how to make this same refinement for our original Foo type?
If it is possible could you try and nudge me in the right direction without telling me the answer.
It's possible, in more than one way. But your main problem here is that you've not implemented lists. You've implemented fixed-size vectors whose length is encoded in the type.
Compare the types from adding an element to the head of a list vs. your implementation:
(:) :: a -> [a] -> [a]
prepend :: a -> b -> (a, b)
To construct an equivalent of the built-in list type, you'd need a function like prepend with a type resembling a -> b -> b. And if you want your lists to be parameterized by element type in a straightforward way, you need the type to further resemble a -> f a -> f a.
Is this even possible in a static language?
You're also on to something here, in that the encoding you're using works fine in something like Scheme. Languages with "dynamic" systems can be regarded as having a single static type with implicit conversions and metadata attached, which obviously solves the type mismatch problem in a very extreme way!
I cannot think of how to write an at (!!) function.
Recalling that your "lists" actually encode their length in their type, it should be easy to see why it's difficult to write functions that do anything other than increment/decrement the length. You can actually do this, but it requires elaborate encoding and more advanced type system features. A hint in this direction is that you'll need to use type-level numbers as well. You'd probably enjoy doing this as an exercise as well, but it's much more advanced than encoding lists.
Solution A - nested tuples:
Your lists are really nested tuples - for example, they can hold items of different types, and their type reveals their length.
It is possible to write indexing-like function for nested tuples, but it is ugly, and it won't correspond to Prelude's lists. Something like this:
class List a b where ...
instance List () b where ...
instance List a b => List (b,a) b where ...
Solution B - use data
I recommend using data construct. Tuples are internally something like this:
data (,) a b = Pair a b
so you aren't avoiding data. The division between "custom types" and "primitive types" is rather artificial in Haskell, as opposed to C.
Solution C - use newtype:
If you are fine with newtype but not data:
newtype List a = List (Maybe (a, List a))
Solution D - rank-2-types:
Use rank-2-types:
type List a = forall b. b -> (a -> b -> b) -> b
list :: List Int
list = \n c -> c 1 (c 2 n) -- [1,2]
and write functions for them. I think this is closest to your goal. Google for "Church encoding" if you need more hints.
Let's set aside at, and just think about your first four functions for the moment. You haven't given them type signatures, so let's look at those; they'll make things much clearer. The types are
cons :: a -> (a, ())
prepend :: a -> b -> (a, b)
head :: (a, b) -> a
tail :: (a, b) -> b
Hmmm. Compare these to the types of the corresponding Prelude functions1:
return :: a -> [a]
(:) :: a -> [a] -> [a]
head :: [a] -> a
tail :: [a] -> [a]
The big difference is that, in your code, there's nothing that corresponds to the list type, []. What would such a type be? Well, let's compare, function by function.
cons/return: here, (a,()) corresponds to [a]
prepend/(:): here, both b and (a,b) correspond to [a]
head: here, (a,b) corresponds to [a]
tail: here, (a,b) corresponds to [a]
It's clear, then, that what you're trying to say is that a list is a pair. And prepend indicates that you then expect the tail of the list to be another list. So what would that make the list type? You'd want to write type List a = (a,List a) (although this would leave out (), your empty list, but I'll get to that later), but you can't do this—type synonyms can't be recursive. After all, think about what the type of at/!! would be. In the prelude, you have (!!) :: [a] -> Int -> a. Here, you might try at :: (a,b) -> Int -> a, but this won't work; you have no way to convert a b into an a. So you really ought to have at :: (a,(a,b)) -> Int -> a, but of course this won't work either. You'll never be able to work with the structure of the list (neatly), because you'd need an infinite type. Now, you might argue that your type does stop, because () will finish a list. But then you run into a related problem: now, a length-zero list has type (), a length-one list has type (a,()), a length-two list has type (a,(a,())), etc. This is the problem: there is no single "list type" in your implementation, and so at can't have a well-typed first parameter.
You have hit on something, though; consider the definition of lists:
data List a = []
| a : [a]
Here, [] :: [a], and (:) :: a -> [a] -> [a]. In other words, a list is isomorphic to something which is either a singleton value, or a pair of a value and a list:
newtype List' a = List' (Either () (a,List' a))
You were trying to use the same trick without creating a type, but it's this creation of a new type which allows you to get the recursion. And it's exactly your missing recursion which allows lists to have a single type.
1: On a related note, cons should be called something like singleton, and prepend should be cons, but that's not important right now.
You can implement the datatype List a as a pair (f, n) where f :: Nat -> a and n :: Nat, where n is the length of the list:
type List a = (Int -> a, Int)
Implementing the empty list, the list operations cons, head, tail, and null, and a function convert :: List a -> [a] is left as an easy exercise.
(Disclaimer: stole this from Bird's Introduction to Functional Programming in Haskell.)
Of course, you could represent tuples via functions as well. And then True and False and the natural numbers ...

Resources