How to convert my thoughts in OOP to Haskell?

For example, I have a container type that holds elements sharing a common characteristic, and I provide some types to serve as elements. I also want this to be easily extended: others should be able to define their own element types and have them held by my container.
So I do:
{-# LANGUAGE ExistentialQuantification #-}

class ElementClass e

data E1 = E1 String
instance ElementClass E1

data E2 = E2 Int
instance ElementClass E2

data Element = forall e. (ElementClass e) => Element e
data Container = Container [Element]
This is fine until I need to deal with the elements individually. Due to the forall, a function f :: Element -> IO () has no way to know which element type it is dealing with exactly.
What is the proper way to do this in Haskell style?

to know which element type it is dealing with exactly
To know that, you should of course use a simple ADT
data Element' = E1Element E1
              | E2Element E2
              | ...
this way, you can pattern-match on which one it is in your container.
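For instance, a minimal sketch (describe is a hypothetical name, and it covers only the two cases above; a real definition would match every case of the ADT):
describe :: Element' -> String
describe (E1Element (E1 s)) = "an E1 holding " ++ s
describe (E2Element (E2 n)) = "an E2 holding " ++ show n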
Now, that clashes with
others should be able to define their own element types and have them held by my container
and it must clash! When other people are allowed to add new types to the list of elements, there's no way to safely match all possible cases. So if you want to match, the only correct thing is to have a closed set of possibilities, as an ADT gives you.
OTOH, with an existential like you originally had in mind, the class of allowed types is open. That's ok, but only because the exact type isn't in fact accessible but only the common interface defined by forall e. ElementClass e.
Existentials are indeed a bit frowned-upon in Haskell, because they are so OO-ish. But sometimes this is quite the right thing to do; your application might be a good case.

OK, I'll try to help a bit.
First: I assume you have these data types:
data E1 = E1 String
data E2 = E2 Int
And you have a sensible operation on both that I'll call say:
say1 :: E1 -> String -> String
say1 (E1 s) msg = msg ++ s
say2 :: E2 -> String -> String
say2 (E2 i) msg = msg ++ show i
So what you can do without any type-classes or stuff is this:
type Messenger = String -> String
and instead of having a container with lots of E1s and E2s, use a container of Messengers:
sayHello :: [Messenger] -> [String]
sayHello = map ($ "Hello, ")
sayHello [say1 (E1 "World"), say2 (E2 42)]
> ["Hello, World","Hello, 42"]
I hope this helps you a bit - the trick is to move away from the objects and look at the operations instead.
So instead of pushing objects/data into a function that works with the object's data and behaviour, use a common "interface" to do your stuff.
If you give me a better example of classes and methods (for example two types that might indeed share some traits or behaviour - String and Int are really lacking in this regard) I will update my answer.

First of all, make sure you read and understand "Haskell Antipattern: Existential Typeclass". Your example code is more complex than it needs to be.
Basically, you're asking how to perform the equivalent of a downcast in Haskell—cast a value from a supertype to a subtype. This sort of operation can intrinsically fail, so the type is something like Element -> Maybe E1.
The first question to ask here is: do you really need to? There are two complementary alternatives to this. First: you can formulate your "supertype" in such a way that it only ever has a finite, fixed number of "subtypes." Then you implement your type just as a union:
data Element = E1 String | E2 Int
And every time you want to use an Element you pattern match and presto, you have the case-specific data:
processElement :: Element -> whatever
processElement (E1 str) = ...
processElement (E2 i) = ...
The downsides to this approach are that:
Your union type can only have a fixed set of subcases.
Every time you add a subcase you will have to modify all the existing operations to add an extra matching case for it.
The upsides are:
By enumerating all the subcases in your type, you can use the compiler to tell you when you've missed one.
Adding a new operation is easy, and doesn't require you to modify any existing code.
The second way you can go is to reformulate the type as an "interface". By this I mean your type is now going to be modeled as a record type, each of whose fields constitutes a "method":
data Element = Element { say :: String }
-- A "constructor" for your first subcase
makeE1 :: String -> Element
makeE1 str = Element str
-- A "constructor" for your second subcase
makeE2 :: Int -> Element
makeE2 i = Element (show i)
This has the upside that you now can have as many subcases as you want, and you can easily add them without modifying existing operations. It has these two downsides:
If you need to add new operations, you will have to add a "method" (field) to the Element type, and modify every existing function that constructs an Element.
Consumers of the Element type can never tell which subcase they're dealing with, or get information specific to that subcase. E.g., a consumer can't tell whether a particular Element was constructed with makeE2, much less extract the Int that such an Element encapsulates.
(Note that your example with existentials is equivalent to this "interface" approach, and shares the same advantages and limitations. It's just needlessly verbose.)
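For example, a small usage sketch of the "interface" approach (greetAll is a hypothetical name):
greetAll :: [Element] -> [String]
greetAll = map say

-- greetAll [makeE1 "World", makeE2 42]  ==  ["World", "42"]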
But if you really insist on having the equivalent of a downcast, there is a third alternative: use the Data.Dynamic module. A Dynamic value is an immutable container that holds a single value of any type that instantiates the Typeable class (which GHC can derive for you). Example:
import Data.Dynamic
import Data.Typeable

data E1 = E1 String deriving Typeable
data E2 = E2 Int deriving Typeable
newtype Element = Element Dynamic
makeE1 :: String -> Element
makeE1 str = Element (toDyn (E1 str))
makeE2 :: Int -> Element
makeE2 i = Element (toDyn (E2 i))
-- Cast an Element to E1
toE1 :: Element -> Maybe E1
toE1 (Element dyn) = fromDynamic dyn
-- Cast an Element to E2
toE2 :: Element -> Maybe E2
toE2 (Element dyn) = fromDynamic dyn
-- Cast an Element to whichever type the context expects
fromElement :: Typeable a => Element -> Maybe a
fromElement (Element dyn) = fromDynamic dyn
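For instance, a hypothetical usage sketch (demo is an invented name):
demo :: (Maybe E1, Maybe E2)
demo = (toE1 e, toE2 e)  -- evaluates to (Just (E1 "hello"), Nothing)
  where e = makeE1 "hello"
The cast succeeds only when the Dynamic actually holds the requested type.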
This is the closest solution to the OOP downcasting operation. The downside to this is that downcasts are inherently not type safe. Let's go back to the case where, some months later, you need to add an E3 subcase to your code. Well, the problem now is you have a lot of functions sprinkled throughout the code that are testing whether an Element is an E1 or an E2, which were written before E3 ever existed. How many of these functions will break when you add this third subcase? Good luck, because the compiler has no way of helping you!
Note that this three-alternative scenario I've described also exists in OOP, with these three alternatives:
The OOP counterpart to the union type is the Visitor pattern, which is meant to make it easy to add new operations to a type without having to modify its subclasses. (Well, relatively easy. The Visitor pattern is hella verbose.)
The OOP counterpart to the "interface" solution is to code 100% to an interface (or abstract class). This means not only that you use an interface—it also means that your client code never "peeks under the interface" to see what the actual implementation classes are; it relies entirely on the interface methods and their contracts.
The OOP counterpart to the Dynamic solution is to use downcasting. It has the same downsides as I explained above—somebody can come in and add a new subclass, and code that "peeks" at the runtime subtype may not be ready to handle this.
So to the broader question of how to change from OOP thinking to Haskell thinking, I think this comparison provides a good starting point. OOP and Haskell provide all three alternatives. OOP makes #3 very easy, but that basically gives you rope to hang yourself with; #2 is what many OOP gurus would recommend you do, and it can be achieved if you are disciplined; but #1 in OOP gets very verbose. Haskell makes #1 easiest; #2 is not much harder to implement, but requires more careful forethought ("am I providing the correct operations for all users of this type?"); #3 is the one that's a bit verbose and against the grain of the language.

How does return statement work in Haskell? [duplicate]

Consider these functions
f1 :: Maybe Int
f1 = return 1
f2 :: [Int]
f2 = return 1
Both have the same statement, return 1, but the results are different: f1 gives the value Just 1 and f2 gives the value [1].
It looks like Haskell invokes two different versions of return based on the return type. I'd like to know more about this kind of function invocation. Is there a name for this feature in programming languages?
This is a long meandering answer!
As you've probably seen from the comments and Thomas's excellent (but very technical) answer You've asked a very hard question. Well done!
Rather than try to explain the technical answer I've tried to give you a broad overview of what Haskell does behind the scenes without diving into technical detail. Hopefully it will help you to get a big picture view of what's going on.
return is an example of type inference.
Most modern languages have some notion of polymorphism. For example, var x = 1 + 1 will set x equal to 2. In a statically typed language 2 will usually be an int. If you say var y = 1.0 + 1.0 then y will be a float. The operator + (which is just a function with special syntax) is polymorphic: the language picks the version that fits the types of its operands.
Most imperative languages, especially object oriented languages, can only do type inference one way. Every variable has a fixed type. When you call a function it looks at the types of the argument and chooses a version of that function that fits the types (or complains if it can't find one).
When you assign the result of a function to a variable the variable already has a type and if it doesn't agree with the type of the return value you get an error.
So in an imperative language the "flow" of type deduction follows time in your program: deduce the type of a variable, do something with it, and deduce the type of the result. In a dynamically typed language (such as Python or JavaScript) the type of a variable is not assigned until the value of the variable is computed (which is why there don't seem to be types). In a statically typed language the types are worked out ahead of time (by the compiler), but the logic is the same. The compiler works out what the types of variables are going to be, but it does so by following the logic of the program in the same way as the program runs.
In Haskell the type inference also follows the logic of the program. Being Haskell, it does so in a very mathematically pure way (called System F). The language of types (that is, the rules by which types are deduced) is similar to Haskell itself.
Now remember Haskell is a lazy language. It doesn't work out the value of anything until it needs it. That's why it makes sense in Haskell to have infinite data structures. It never occurs to Haskell that a data structure is infinite because it doesn't bother to work it out until it needs to.
Now all that lazy magic happens at the type level too. In the same way that Haskell doesn't work out what the value of an expression is until it really needs to, Haskell doesn't work out what the type of an expression is until it really needs to.
Consider this function
func (x : y : rest) = (x,y) : func rest
func _ = []
If you ask Haskell for the type of this function it has a look at the definition, sees [] and : and deduces that it's working with lists. But it never needs to look at the types of x and y; it just knows that they have to be the same because they end up paired in the same list. So it deduces the type of the function as [a] -> [(a, a)], where a is a type that it hasn't bothered to work out yet.
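You can check this in ghci:
ghci> :type func
func :: [a] -> [(a, a)]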
So far no magic. But it's useful to notice the difference between this idea and how it would be done in an OO language. Haskell doesn't convert the arguments to Object, do its thing and then convert back. Haskell just hasn't been asked explicitly what the type of the list is. So it doesn't care.
Now try typing the following into ghci
maxBound - length ""
maxBound : "Hello"
Now what just happened!? maxBound must be a Char because I put it on the front of a string, and it must be an integer because I subtracted an Int from it (length "" is 0) and got a number. Plus the two values are clearly very different.
So what is the type of maxBound? Let's ask ghci!
:type maxBound
maxBound :: Bounded a => a
AAargh! What does that mean? Basically it means that it hasn't bothered to work out exactly what a is, but it has to be Bounded. If you type :info Bounded you get three useful lines
class Bounded a where
  minBound :: a
  maxBound :: a
and a lot of less useful lines
So if a is Bounded there are values minBound and maxBound of type a.
In fact, under the hood, a Bounded instance is just a value: a record with fields minBound and maxBound. Because it's a value, Haskell doesn't look at it until it really needs to.
So I appear to have meandered somewhere in the region of the answer to your question. Before we move on to return (which, you may have noticed from the comments, is a wonderfully complex beast) let's look at read.
ghci again
read "42" + 7
read "'H'" : "ello"
length (read "[1,2,3]" :: [Int])
and hopefully you won't be too surprised to find that there are definitions
read :: Read a => String -> a

class Read a where
  read :: String -> a  -- simplified; the real class has more methods
so Read a is just a record containing a single value, which is a function String -> a. It's very tempting to assume that there is one read function which looks at a string, works out what type is contained in the string, and returns that type. But it does the opposite: it completely ignores the string until its result is needed. When the value is needed, Haskell first works out what type it's expecting; once it's done that, it goes and gets the appropriate version of the read function and combines it with the string.
now consider something slightly more complex
readList :: Read a => [String] -> [a]
readList strs = map read strs
under the hood readList actually takes two arguments; in pseudocode, with the Read dictionary passed explicitly as a record of functions:
readList' :: ReadDict a -> [String] -> [a]
readList' (ReadDict { read = f }) strs = map f strs
Again, as Haskell is lazy, it only bothers looking at the arguments when it needs to find the return value; at that point it knows what a is, so the compiler can go and find the right version of Read. Until then it doesn't care.
Hopefully that's given you a bit of an idea of what's happening and why Haskell can "overload" on the return type. But it's important to remember it's not overloading in the conventional sense. Every function has only one definition. It's just that one of the arguments is a bag of functions. readList' doesn't ever know what types it's dealing with. It just knows it gets a function String -> a and some Strings; to do the application it just passes the arguments to map. map in turn doesn't even know it gets strings. When you get deeper into Haskell it becomes very important that functions don't know very much about the types they're dealing with.
Now let's look at return.
Remember how I said that the type system in Haskell was very similar to Haskell itself. Remember that in Haskell functions are just ordinary values.
Does this mean I can have a type that takes a type as an argument and returns another type? Of course it does!
You've seen some type functions Maybe takes a type a and returns another type which can either be Just a or Nothing. [] takes a type a and returns a list of as. Type functions in Haskell are usually containers. For example I could define a type function BinaryTree which stores a load of a's in a tree like structure. There are of course lots of much stranger ones.
So, if these type functions are similar to ordinary types I can have a typeclass that contains type functions. One such typeclass is Monad
class Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b
so here m is some type function. If I want to define Monad for m I need to define return and the scary-looking operator below it (which is called bind).
As others have pointed out the return is a really misleading name for a fairly boring function. The team that designed Haskell have since realised their mistake and they're genuinely sorry about it. return is just an ordinary function that takes an argument and returns a Monad with that type in it. (You never asked what a Monad actually is so I'm not going to tell you)
Let's define Monad for m = Maybe!
First I need to define return. What should return x be? Remember I'm only allowed to define the function once, so I can't look at x because I don't know what type it is. I could always return Nothing, but that seems a waste of a perfectly good function. Let's define return x = Just x because that's literally the only other thing I can do.
What about the scary bind thing? What can we say about x >>= f? Well, x is a Maybe a for some unknown type a, and f is a function that takes an a and returns a Maybe b. Somehow I need to combine these to get a Maybe b.
So I need to define Nothing >>= f. I can't call f because it needs an argument of type a and I don't have a value of type a; I don't even know what a is. I've only got one choice, which is to define
Nothing >>= f = Nothing
What about Just x >>= f? Well, I know x is of type a and f takes an a as an argument, so I can apply f to x. The result f x is already of type Maybe b, which is exactly what I need, so ...
Just x >>= f = f x
So I've got a Monad! what if m is List? well I can follow a similar sort of logic and define
return x = [x]
[] >>= f = []
(x : xs) >>= f = f x ++ (xs >>= f)
Hooray another Monad! It's a nice exercise to go through the steps and convince yourself that there's no other sensible way of defining this.
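You can check both instances in ghci:
ghci> Just 3 >>= \x -> Just (x + 1)
Just 4
ghci> [1, 2] >>= \x -> [x, x * 10]
[1,10,2,20]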
So what happens when I call return 1?
Nothing!
Haskell's lazy, remember. The thunk return 1 (technical term) just sits there until someone needs the value. As soon as Haskell needs the value it knows what type the value should be. In particular it can deduce that m is List. Now that it knows that, Haskell can find the instance of Monad for List. As soon as it does that it has access to the correct version of return.
So finally Haskell is ready to call return, which in this case returns [1]!
The return function is from the Monad class:
class Applicative m => Monad (m :: * -> *) where
  ...
  return :: a -> m a
So return takes any value of type a and results in a value of type m a. The monad, m, as you've observed is polymorphic using the Haskell type class Monad for ad hoc polymorphism.
At this point you probably realize return is not a good, intuitive name. It's not even a built-in function or a statement like in many other languages. In fact a better-named and identically-operating function exists: pure. In almost all cases return = pure.
That is, the function return is the same as the function pure (from the Applicative class) - I often think to myself "this monadic value is purely the underlying a" and I try to use pure instead of return if there isn't already a convention in the codebase.
You can use return (or pure) for any type that is an instance of Monad. This includes the Maybe monad, to get a value of type Maybe a:
instance Monad Maybe where
  ...
  return = pure -- which is from Applicative
  ...

instance Applicative Maybe where
  pure = Just
Or for the list monad to get a value of [a]:
instance Applicative [] where
  {-# INLINE pure #-}
  pure x = [x]
Or, as a more complex example, Aeson's parse monad to get a value of type Parser a:
instance Applicative Parser where
  pure a = Parser $ \_path _kf ks -> ks a
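To see the same pure (or return) produce different results depending on the expected type, try in ghci:
ghci> pure 1 :: Maybe Int
Just 1
ghci> pure 1 :: [Int]
[1]
ghci> pure 1 :: Either String Int
Right 1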

Redundancy regarding product types and tuples in Haskell

In Haskell you have product types and you have tuples.
You use tuples if you don't want to associate a dedicated type with the value, and you can use product types if you wish to do so.
However, I feel there is redundancy in the notation of product types:
data Foo = Foo (String, Int, Char)
data Bar = Bar String Int Char
Why are there both kinds of notation? Is there any case where you would prefer one over the other?
I guess you can't use record notation when using tuples, but that's just a convenience problem. Another thing might be the notion of order in tuples, as opposed to product types, but I think that's just due to the naming of the functions fst and snd.
#chi's answer is about the technical differences in terms of Haskell's evaluation model. I hope to give you some insight into the philosophy of this sort of typed programming.
In category theory we generally work with objects "up to isomorphism". Your Bar is of course isomorphic to (String, Int, Char), so from a categorical perspective they're the same thing.
-- Iso' and iso come from the lens package (Control.Lens)
bar_tuple :: Iso' Bar (String, Int, Char)
bar_tuple = iso to from
  where
    to (Bar s i c) = (s, i, c)
    from (s, i, c) = Bar s i c
In some sense tuples are a Platonic form of product type, in that they have no meaning beyond being a collection of disparate values. All the other product types can be mapped to and from a plain old tuple.
So why not use tuples everywhere, when all Haskell types ultimately boil down to a sum of products? It's about communication. As Martin Fowler says,
Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
Names are important! Writing down a custom product type like
data Customer = Customer { name :: String, address :: String }
imbues the type Customer with meaning to the person reading the code, unlike (String, String) which just means "two strings".
Custom types are particularly useful when you want to enforce invariants by hiding the representation of your data and using smart constructors:
newtype NonEmpty a = NonEmpty [a]
nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just (NonEmpty xs)
Now, if you don't export the NonEmpty constructor, you can force people to go through the nonEmpty smart constructor. If someone hands you a NonEmpty value you may safely assume that it has at least one element.
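A quick check (assuming a Show instance is derived for NonEmpty):
ghci> nonEmpty [1, 2, 3]
Just (NonEmpty [1,2,3])
ghci> nonEmpty ([] :: [Int])
Nothing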
You can of course represent Customer as a tuple under the hood and expose evocatively-named field accessors,
newtype Customer = Customer (String, String)

name, address :: Customer -> String
name (Customer (n, a)) = n
address (Customer (n, a)) = a
but this doesn't really buy you much, except that it's now cheaper to convert Customer to a tuple (if, say, you're writing performance-sensitive code that works with a tuple-oriented API).
If your code is intended to solve a particular problem - which of course is the whole point of writing code - it pays to not just solve the problem, but make it look like you've solved it too. Someone - maybe you in a couple of years - is going to have to read this code and understand it with no a priori knowledge of how it works. Custom types are a very important communication tool in this regard.
The type
data Foo = Foo (String, Int, Char)
represents a double-lifted tuple. Its values comprise
undefined
Foo undefined
Foo (undefined, undefined, undefined)
etc.
This is usually troublesome. Because of this, it's rare to see such definitions in actual code. We either have plain data types
data Foo = Foo String Int Char
or newtypes
newtype Foo = Foo (String, Int, Char)
The newtype can be just as inconvenient to use, but at least it
does not double-lift the tuple: undefined and Foo undefined are now equal values.
The newtype also provides zero-cost conversion between a plain tuple and Foo, in both directions.
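That zero-cost conversion can be spelled out with coerce from Data.Coerce (a sketch; toTuple and fromTuple are invented names):
import Data.Coerce (coerce)

toTuple :: Foo -> (String, Int, Char)
toTuple = coerce

fromTuple :: (String, Int, Char) -> Foo
fromTuple = coerce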
You can see such newtypes in use e.g. when the programmer needs a different instance for some type class, than the one already associated with the tuple. Or, perhaps, it is used in a "smart constructor" idiom.
I would not expect the pattern used in Foo to be frequent. There is a slight difference in what the constructor acts like: Foo :: (String, Int, Char) -> Foo as opposed to Bar :: String -> Int -> Char -> Bar. Then Foo undefined and Foo (undefined, ..., ...) are strictly speaking different things, whereas you miss one level of undefinedness in Bar.

Can I declare a NULL value in Haskell?

Just curious: it seems that when declaring a name, we always specify some valid value, like let a = 3. The question is, in imperative languages including C/Java there's always a "null" keyword. Does Haskell have a similar thing? When could a function object be null?
There is a “null” value that you can use for variables of any type. It's called ⊥ (pronounced “bottom”). We don't need a keyword to produce bottom values; actually ⊥ is the value of any computation which doesn't terminate. For instance,
bottom = let x = x in x -- or simply `bottom = bottom`
will infinitely loop. It's obviously not a good idea to do this deliberately, however you can use undefined as a “standard bottom value”. It's perhaps the closest thing Haskell has to Java's null keyword.
But you definitely shouldn't/can't use this for most of the applications where Java programmers would grab for null.
Since everything in Haskell is immutable, a value that's undefined will always stay undefined. It's not possible to use this as a “hold on a second, I'll define it later” indication†.
It's not possible to check whether a value is bottom or not. For rather deep theoretical reasons, in fact. So you can't use this for values that may or may not be defined.
And you know what? It's really good that Haskell doesn't allow this! In Java, you constantly need to be wary that values might be null. In Haskell, if a value is bottom then something is plain broken; it will never be part of intended behaviour / something you might need to check for. If for some value it's intended that it might not be defined, then you must always make this explicit by wrapping the type in a Maybe. By doing this, you make sure that anybody trying to use the value must first check whether it's there. Not possible to forget this and run into a null-reference exception at runtime!
And because Haskell is so good at handling variant types, checking the contents of a Maybe-wrapped value is really not too cumbersome. You can just do it explicitly with pattern matching,
quun :: Int -> String
quun i = case computationWhichMayFail i of
    Just j  -> show j
    Nothing -> "blearg, failed"

computationWhichMayFail :: Int -> Maybe Int
or you can use the fact that Maybe is a functor. Indeed it is an instance of almost every specific functor class: Functor, Applicative, Alternative, Foldable, Traversable, Monad, MonadPlus. It also lifts semigroups to monoids.
Dᴏɴ'ᴛ Pᴀɴɪᴄ now,
you don't need to know what the heck these things are. But when you've learned what they do, you will be able to write very concise code that automagically handles missing values always in the right way, with zero risk of missing a check.
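For instance, the Prelude's maybe function handles both cases in one line (quun' is a hypothetical rewrite of the function above):
quun' :: Int -> String
quun' i = maybe "blearg, failed" show (computationWhichMayFail i)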
†Because Haskell is lazy, you generally don't need to defer any calculations to be done later. The compiler will automatically see to it that the computation is done when it's necessary, and no sooner.
There is no null in Haskell. What you want is the Maybe monad.
data Maybe a
  = Just a
  | Nothing
Nothing refers to classic null and Just contains a value.
You can then pattern match against it:
foo Nothing = Nothing
foo (Just a) = Just (a * 10)
Or with case syntax:
let m = Just 10
in case m of
     Just v  -> print v
     Nothing -> putStrLn "Sorry, there's no value. :("
Or use the superior functionality provided by the typeclass instances for Functor, Applicative, Alternative, Monad, MonadPlus and Foldable.
This could then look like this:
foo :: Maybe Int -> Maybe Int -> Maybe Int
foo x y = do
  a <- x
  b <- y
  return $ a + b
You can even use the more general signature:
foo :: (Monad m, Num a) => m a -> m a -> m a
Which makes this function work for ANY data type that is capable of the functionality provided by Monad. So you can use foo with (Num a) => Maybe a, (Num a) => [a], (Num a) => Either e a and so on.
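For example, in ghci:
ghci> foo (Just 1) (Just 2)
Just 3
ghci> foo [1, 2] [10, 20]
[11,21,12,22]
ghci> foo (Right 1) (Right 2) :: Either String Int
Right 3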
Haskell does not have "null". This is a design feature. It completely prevents any possibility of your code crashing due to a null-pointer exception.
If you look at code written in an imperative language, 99% of the code expects stuff to never be null, and will malfunction catastrophically if you give it null. But then 1% of the code does expect nulls, and uses this feature to specify optional arguments or whatever. But you can't easily tell, by looking at the code, which parts are expecting nulls as legal arguments, and which parts aren't. Hopefully it's documented — but don't hold your breath!
In Haskell, there is no null. If that argument is declared as Customer, then there must be an actual, real Customer there. You can't just pass in a null (intentionally or by mistake). So the 99% of the code that is expecting a real Customer will always work.
But what about the other 1%? Well, for that we have Maybe. But it's an explicit thing; you have to explicitly say "this value is optional". And you have to explicitly check when you use it. You cannot "forget" to check; it won't compile.
So yes, there is no "null", but there is Maybe which is kinda similar, but safer.
Not in Haskell (or in many other FP languages). If you have some expression of some type T, its evaluation will give a value of type T, with the following exceptions:
infinite recursion may make the program "loop forever", failing to return anything
let f n = f (n+1) in f 0
runtime errors can abort the program early, e.g.:
division by zero, square root of negative, and other numerical errors
head [], fromJust Nothing, and other partial functions used on invalid inputs
explicit calls to undefined, error "message", or other exception-throwing primitives
Note that even though the above cases might be regarded as "special" values called "bottoms" (the name comes from domain theory), you cannot, in general, test against these values at runtime. So, these are not at all the same thing as Java's null. More precisely, you can't write things like
-- assume f :: Int -> Int
if (f 5) is a division-by-zero or infinite recursion
then 12
else 4
Some exceptional values can be caught in the IO monad, but forget about that -- exceptions in Haskell are not idiomatic, and roughly only used for IO errors.
If you want an exceptional value which can be tested at run-time, use the Maybe a type, as #bash0r already suggested. This type is similar to Scala's Option[A] or Java's not-so-much-used Optional<A>.
The value of having both a type T and a type Maybe T is being able to precisely identify which functions always succeed and which ones can fail. In Haskell the following is frowned upon, for instance:
-- Finds a value in a list. Returns -1 if not present.
findIndex :: Eq a => [a] -> a -> Int
Instead this is preferred:
-- Finds a value in a list. Returns Nothing if not present.
findIndex :: Eq a => [a] -> a -> Maybe Int
The result of the latter is less convenient than that of the former, since the result must be unwrapped at every call. This is good, since in this way each user of the function is prevented from simply "ignoring" the not-present case and writing buggy code.
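A minimal sketch of the Maybe-returning version (one possible implementation, matching the signature above):
findIndex :: Eq a => [a] -> a -> Maybe Int
findIndex xs x = go 0 xs
  where
    go _ [] = Nothing
    go i (y:ys)
      | y == x    = Just i
      | otherwise = go (i + 1) ys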

Haskell "dependent" fields of a record?

I've got the following record defined:
data Option = Option
  { a :: Maybe String
  , b :: Either String Int
  } deriving (Show)
Is there anyway for me to enforce that when a is Nothing, b must be a Left and when a is Just, b must be a Right? Maybe with phantom types, or something else? Or must I wrap the whole thing inside of an Either and make it Either String (String, Int) ?
You should just use two constructors for the two possible shapes:
data Option = NoA String | WithA String Int
Of course, you should give them better names, based on what they represent. Phantom types are definitely overkill here, and I would suggest avoiding Either — Left and Right are not very self-documenting constructor names.
If it makes sense to interpret both Either branches of the b field as representing the same data, then you should define a function that reflects this interpretation:
b :: Option -> MeaningOfB
b (NoA s) = ...
b (WithA t n) = ...
If you have fields that stay the same no matter what the choice, you should make a new data type with all of them, and include it in both constructors. If you make each constructor a record, you can give the common field the same name in every constructor, so that you can extract it from any Option value without having to pattern-match on it.
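For example, a hypothetical sketch (the field names are invented):
data Option
  = NoA   { label :: String, reason    :: String }
  | WithA { label :: String, theString :: String, theInt :: Int }

-- 'label' works on any Option without pattern-matching:
getLabel :: Option -> String
getLabel = label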
Basically, think about what it means for the string not to be present: what does it change about the other fields, and what stays the same? Whatever changes should go in the respective constructors; whatever stays the same should be factored out into its own type. (This is a good design principle in general!)
If you come from an OOP background, you can think about this in terms of reasoning with composition instead of inheritance — but try not to take the analogy too far.

overview, but very over in functional programming

What does a very general function look like in functional programming?
Somebody said "we don't have objects, but we have higher order functions". Do higher order functions replace objects?
While programming object-oriented apps, I try to go from a more general to a more detailed idea, lots of times. If I try to do that in functional programming, am I going to need lots of higher order functions?
This answer is oriented towards Haskell rather than Lisp because although Lisp has higher order functions, idiomatic Lisp can be and often is very object-oriented.
We'll also ignore inheritance (and ad-hoc polymorphism) which is commonly associated with object oriented programming, but is somewhat orthogonal.
In general, abstract data types "replace" objects, in the sense that generally you use an object to bundle together a bunch of related data in e.g. Java or Python, and you declare a data type to do such a thing in Haskell or ML.
However objects also bundle behavior with the data. So an object of a class has data, but also functions which can access and mutate that data. In a functional style, you'd simply declare the functions on your data type outside of that data type. Then encapsulation is provided by either modules or use of closures.
On the latter point -- closures and objects are duals, although it is not necessarily idiomatic to express them as such. There's some very old-school discussion of this at the portland patterns wiki: http://c2.com/cgi/wiki?ClosuresAndObjectsAreEquivalent.
Oh, and an example from oleg: http://okmij.org/ftp/Scheme/oop-in-fp.txt.
Ignoring typeclasses (which are essential to idiomatic Haskell), and focusing just on core functional programming, here's a sketch of a different approach to something that would be done with inheritance in an OO language. Function foo uses some object that implements interface A and some object that implements interface B to produce some Double. With higher order functions, you'd perhaps have a type signature of fooGen :: (a -> Double -> Double) -> (b -> String -> Double) -> a -> b -> Double.
That signature says that fooGen takes a function from some a and a Double to another Double, and a function of some b and a String to a Double, and then it takes an a and a b, and finally it returns a Double.
So now you can pass in the "interface" separately from the concrete values through partial application, by declaring, e.g., fooSpecialized = fooGen funcOnA funcOnB.
With typeclasses, you can abstract away the concrete passing in of the "interface implementation" (or, in more proper language, dictionary), and declare foo :: HasSomeFunc a, HasSomeOtherFunc b => a -> b -> Double. You can think of the stuff on the left hand side of the => as declaring, loosely, the interfaces that your concrete a and b types are required to implement.
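A sketch of what that might look like (the class and method names are taken from the signature above; the bodies are invented):
class HasSomeFunc a where
  someFunc :: a -> Double -> Double

class HasSomeOtherFunc b where
  someOtherFunc :: b -> String -> Double

foo :: (HasSomeFunc a, HasSomeOtherFunc b) => a -> b -> Double
foo a b = someFunc a (someOtherFunc b "hello")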
This is all a handwavy and partial answer of course to an exceedingly general question.
Answers first
Somebody said "we don't have objects, but we have higher order functions". Do higher order functions replace objects?
If you mean can higher order functions contain some hidden state, then yes. Functions defined inside of the other functions can capture some information from their scope, and, if returned to the outer world, will preserve this information. This is what closures are about.
If you mean can higher order functions contain mutable state, then no. In pure functional programming they are not stateful. They produce the same results on the same inputs. If you want to simulate how something changes, you do not overwrite a variable, but you define how to calculate its new value from the old one.
Of course, there are shortcuts, and even functional languages allow for writing in imperative style.
If I try to do that in functional programming, am I going to need lots of higher order functions?
You are going to use them a lot. Probably not even thinking that your functions are higher-order. And probably, enjoying them a lot. You'll just pass functions as values to other functions.
For example, map is a HOF. Its first argument is another function. What you would think of in an imperative language as a loop "for every element in a collection: apply some function, save the result", in the functional language will be "map a function over a collection and get a new collection of results". Folds are another example of HOFs. So most of the loops from an imperative language can be translated to the calls of a higher order functions in a functional language. This is to make clear how often you are likely to use them.
overview, but very over in functional programming
This is a good place to start: Functional Programming.
An example of encapsulating "state":
f = let x = 3
    in let withX y = x + y
       in withX
Now f is the same as withX, which is a function that "remembers" that x = 3. When we use f, we need to supply only one argument, y, and it will sum it with the "remembered" value of x (3).
This should print 3 and then [4, 5, 6]:
main = do
  print $ f 0
  print $ map f [1..3]
We do not pass 3 as an argument to f, it "remembers" 3 from the closure above,
and we can pass f itself as a parameter to map, which is a HOF in this case.
So functions can encapsulate state.
An example of "changing" a variable
As I said above, in functional programming, the state is not mutable. So if you want to, say, apply operation f to the value, save the result, and then apply operation g, in functional language you would express it with an intermediate variable, which contains the result of applying f, and then apply g to it to calculate the new value. Please note that you are not "overwriting" the original value x0:
applyTwo first second x0 =
  let x1 = first x0
  in second x1
But in fact it is possible to write it shorter, because it is just a simple composition of functions:
applyTwo' f g = g . f
Or you can generalize this approach, and write a function, which will apply any number of functions:
applyAll [] = id -- don't do anything if there are no functions to apply
applyAll (f:otherFs) = applyAll otherFs . f
Please note that applyTwo and applyAll are now higher order functions. Of course, they do not replace objects, but they allow you to avoid mutable state.
This is how they are used:
ghci> applyTwo (+1) (*10) 2
30
ghci> applyAll [(+1), (*10)] 2
30
It's all programming; the same patterns show up again and again. You might write something like this in your favorite OO language:
role Person {
    has 'firstname' => ( isa => 'Str' );
    has 'lastname'  => ( isa => 'Str' );
}

class Employee does Person {
    has 'manager' => ( isa => 'Employee' );
    has 'reports' => ( isa => '[Employee]' );
}
Whereas in Haskell, you'd write:
class Person a where
  firstname :: a -> String
  lastname  :: a -> String

data Employee = Employee
  { firstName :: String
  , lastName  :: String
  , manager   :: Employee
  , reports   :: [Employee]
  }

instance Person Employee where
  firstname = firstName
  lastname  = lastName
People worry too much about what's different rather than realizing that most things are the same.
