Why do I have to use newtype when my data type declaration only has one constructor? [duplicate]

Why do I have to use newtype when my data type declaration only has one constructor? [duplicate] - haskell

This question already has answers here:
Difference between `data` and `newtype` in Haskell
(2 answers)
Closed 8 years ago.
It seems that a newtype definition is just a data definition that obeys some restrictions (e.g., only one constructor), and that due to these restrictions the runtime system can handle newtypes more efficiently. And the handling of pattern matching for undefined values is slightly different.
But suppose Haskell would only knew data definitions, no newtypes: couldn't the compiler find out for itself whether a given data definition obeys these restrictions, and automatically treat it more efficiently?
I'm sure I'm missing out on something, there must be some deeper reason for this.

Both newtype and the single-constructor data introduce a single value constructor, but the value constructor introduced by newtype is strict and the value constructor introduced by data is lazy. So if you have
data D = D Int
newtype N = N Int
Then N undefined is equivalent to undefined and causes an error when evaluated. But D undefined is not equivalent to undefined, and it can be evaluated as long as you don't try to peek inside.
Couldn't the compiler handle this for itself.
No, not really—this is a case where as the programmer you get to decide whether the constructor is strict or lazy. To understand when and how to make constructors strict or lazy, you have to have a much better understanding of lazy evaluation than I do. I stick to the idea in the Report, namely that newtype is there for you to rename an existing type, like having several different incompatible kinds of measurements:
newtype Feet = Feet Double
newtype Cm = Cm Double
both behave exactly like Double at run time, but the compiler promises not to let you confuse them.

According to Learn You a Haskell:
Instead of the data keyword, the newtype keyword is used. Now why is
that? Well for one, newtype is faster. If you use the data keyword to
wrap a type, there's some overhead to all that wrapping and unwrapping
when your program is running. But if you use newtype, Haskell knows
that you're just using it to wrap an existing type into a new type
(hence the name), because you want it to be the same internally but
have a different type. With that in mind, Haskell can get rid of the
wrapping and unwrapping once it resolves which value is of what type.
So why not just use newtype all the time instead of data then? Well,
when you make a new type from an existing type by using the newtype
keyword, you can only have one value constructor and that value
constructor can only have one field. But with data, you can make data
types that have several value constructors and each constructor can
have zero or more fields:
data Profession = Fighter | Archer | Accountant
data Race = Human | Elf | Orc | Goblin
data PlayerCharacter = PlayerCharacter Race Profession
When using newtype, you're restricted to just one constructor with one
field.
Now consider the following type:
data CoolBool = CoolBool { getCoolBool :: Bool }
It's your run-of-the-mill algebraic data type that was defined with
the data keyword. It has one value constructor, which has one field
whose type is Bool. Let's make a function that pattern matches on a
CoolBool and returns the value "hello" regardless of whether the Bool
inside the CoolBool was True or False:
helloMe :: CoolBool -> String
helloMe (CoolBool _) = "hello"
Instead of applying this function to a normal CoolBool, let's throw it a curveball and apply it to undefined!
ghci> helloMe undefined
"*** Exception: Prelude.undefined
Yikes! An exception! Now why did this exception happen? Types defined
with the data keyword can have multiple value constructors (even
though CoolBool only has one). So in order to see if the value given
to our function conforms to the (CoolBool _) pattern, Haskell has to
evaluate the value just enough to see which value constructor was used
when we made the value. And when we try to evaluate an undefined
value, even a little, an exception is thrown.
Instead of using the data keyword for CoolBool, let's try using
newtype:
newtype CoolBool = CoolBool { getCoolBool :: Bool }
We don't have to
change our helloMe function, because the pattern matching syntax is
the same if you use newtype or data to define your type. Let's do the
same thing here and apply helloMe to an undefined value:
ghci> helloMe undefined
"hello"
It worked! Hmmm, why is that? Well, like we've said, when we use
newtype, Haskell can internally represent the values of the new type
in the same way as the original values. It doesn't have to add another
box around them, it just has to be aware of the values being of
different types. And because Haskell knows that types made with the
newtype keyword can only have one constructor, it doesn't have to
evaluate the value passed to the function to make sure that it
conforms to the (CoolBool _) pattern because newtype types can only
have one possible value constructor and one field!
This difference in behavior may seem trivial, but it's actually pretty
important because it helps us realize that even though types defined
with data and newtype behave similarly from the programmer's point of
view because they both have value constructors and fields, they are
actually two different mechanisms. Whereas data can be used to make
your own types from scratch, newtype is for making a completely new
type out of an existing type. Pattern matching on newtype values isn't
like taking something out of a box (like it is with data), it's more
about making a direct conversion from one type to another.
Here's another source. According to this Newtype article:
A newtype declaration creates a new type in much the same way as data.
The syntax and usage of newtypes is virtually identical to that of
data declarations - in fact, you can replace the newtype keyword with
data and it'll still compile, indeed there's even a good chance your
program will still work. The converse is not true, however - data can
only be replaced with newtype if the type has exactly one constructor
with exactly one field inside it.
Some Examples:
newtype Fd = Fd CInt
-- data Fd = Fd CInt would also be valid
-- newtypes can have deriving clauses just like normal types
newtype Identity a = Identity a
deriving (Eq, Ord, Read, Show)
-- record syntax is still allowed, but only for one field
newtype State s a = State { runState :: s -> (s, a) }
-- this is *not* allowed:
-- newtype Pair a b = Pair { pairFst :: a, pairSnd :: b }
-- but this is:
data Pair a b = Pair { pairFst :: a, pairSnd :: b }
-- and so is this:
newtype Pair' a b = Pair' (a, b)
Sounds pretty limited! So why does anyone use newtype?
The short version The restriction to one constructor with one field
means that the new type and the type of the field are in direct
correspondence:
State :: (s -> (a, s)) -> State s a
runState :: State s a -> (s -> (a, s))
or in mathematical terms they are isomorphic. This means that after
the type is checked at compile time, at run time the two types can be
treated essentially the same, without the overhead or indirection
normally associated with a data constructor. So if you want to declare
different type class instances for a particular type, or want to make
a type abstract, you can wrap it in a newtype and it'll be considered
distinct to the type-checker, but identical at runtime. You can then
use all sorts of deep trickery like phantom or recursive types without
worrying about GHC shuffling buckets of bytes for no reason.
See the article for the messy bits...

Simple version for folks obsessed with bullet lists (failed to find one, so have to write it by myself):
data - creates new algebraic type with value constructors
Can have several value constructors
Value constructors are lazy
Values can have several fields
Affects both compilation and runtime, have runtime overhead
Created type is a distinct new type
Can have its own type class instances
When pattern matching against value constructors, WILL be evaluated at least to weak head normal form (WHNF) *
Used to create new data type (example: Address { zip :: String, street :: String } )
newtype - creates new “decorating” type with value constructor
Can have only one value constructor
Value constructor is strict
Value can have only one field
Affects only compilation, no runtime overhead
Created type is a distinct new type
Can have its own type class instances
When pattern matching against value constructor, CAN be not evaluated at all *
Used to create higher level concept based on existing type with distinct set of supported operations or that is not interchangeable with original type (example: Meter, Cm, Feet is Double)
type - creates an alternative name (synonym) for a type (like typedef in C)
No value constructors
No fields
Affects only compilation, no runtime overhead
No new type is created (only a new name for existing type)
Can NOT have its own type class instances
When pattern matching against data constructor, behaves the same as original type
Used to create higher level concept based on existing type with the same set of supported operations (example: String is [Char])
[*] On pattern matching laziness:
data DataBox a = DataBox Int
newtype NewtypeBox a = NewtypeBox Int
dataMatcher :: DataBox -> String
dataMatcher (DataBox _) = "data"
newtypeMatcher :: NewtypeBox -> String
newtypeMatcher (NewtypeBox _) = "newtype"
ghci> dataMatcher undefined
"*** Exception: Prelude.undefined
ghci> newtypeMatcher undefined
“newtype"

Off the top of my head; data declarations use lazy evaluation in access and storage of their "members", whereas newtype does not. Newtype also strips away all previous type instances from its components, effectively hiding its implementation; whereas data leaves the implementation open.
I tend to use newtype's when avoiding boilerplate code in complex data types where I don't necessarily need access to the internals when using them. This speeds up both compilation and execution, and reduces code complexity where the new type is used.
When first reading about this I found this chapter of a Gentle Introduction to Haskell rather intuitive.

Related

What does a stand for in a data type declaration?

Normally when using type declarations we do:
function_name :: Type -> Type
However in an exercise I am trying to solve there is the following structure:
function_name :: Type a -> Type a
or explicitly as in the exercise
alphabet :: DFA a -> Alphabet a
alphabet = undefined
What does a stand for?

Short answer: it's a type variable.
At the computation level, the way we define functions is to use variables to refer to their arguments. Like this:
f x = x + 3
Here x is a variable, and its value will be chosen when the function is called. Haskell has a similar (but not identical...) mechanism in its type sublanguage. For example, you can write things like:
type F x = (x, Int, x)
type Endo a = a -> a -> a
Here again x is a variable in the first one (and a in the second), and its value will be chosen at use sites. One can also use this mechanism when defining new types. (The previous two examples just give new names to existing types, but the following does more.) One of the most basic nontrivial examples of this is the Maybe family of types:
data Maybe a = Nothing | Just a
The things on the right of the = are computation-level, so you can mostly ignore them for now, but on the left we are declaring a new family of types Maybe which accepts other types as an argument. For example, Maybe Int, Maybe (Bool, String), Maybe (Endo Char), and even passing in expressions that have variables like Maybe (x, Int, x) are all possible.
Syntactically, type constructors (things which are defined as part of the program text and that we expect the compiler to look up the definition for) start with an upper case letter and type variables (things which will be instantiated later and so don't currently have a concrete definition) start with lower case letters.
So, in the type signature you showed:
alphabet :: DFA a -> Alphabet a
I suspect there are actually two constructs new to you, not just one: first, the type variable a that you asked about, and second, the concept of type application, where we apply at the type level one "function-like" type to another. (Outside of this answer, people say "parameterized" instead of "function-like".)
...and, believe it or not, there is even a type system for types that makes sure you don't write things like these:
Int a -- Int is not parameterized, so shouldn't be applied to arguments
Int Char -- ditto
Maybe -> String -- Maybe is parameterized, so should be applied to
-- arguments, but isn't

Can Haskell type synonyms be used as type constructors?

I'm writing a benchmark to compare the performance of a number of Haskell collections, including STArray, on a given task. To eliminate repetition, I'm trying to write a set of functions that provide a uniform interface to these collections, so that I can implement the task as a polymorphic higher-order function. More specifically, the task is implemented in terms of a polymorphic monad, which is ST s for STArray, and Identity for collections like HashMap, that do not typically need to be manipulated within a monad.
Due to uniformity requirements, I can't use the Identity and HashMap types directly, as I need their kinds to match the kinds of ST and STArray. I thought that the simplest way to achieve this would be to define type synonyms with phantom parameters:
type Identity' s a = Identity a
type HashMap' s i e = HashMap i e
-- etc.
Unfortunately this doesn't work, because when I try to use these synonyms as type constructors in places where I use ST and STArray as type constructors, GHC gives errors like:
The type synonym ‘Identity'’ should have 2 arguments, but has been given none
I came across the -XLiberalTypeSynonyms GHC extension, and thought it would allow me to do this, as the documentation says:
You can apply a type synonym to a partially applied type synonym
and gives this example of doing so:
type Generic i o = forall x. i x -> o x
type Id x = x
foo :: Generic Id []
That example works in GHC 8.0.2 (with -XExistentialQuantification and -XRank2Types). But replacing Generic with a newtype or data declaration, as needed in my use case, does not work.
I.e. the following code leads to the same kind of error that I reported above:
newtype Generic i o = Generic (forall x. i x -> o x)
type Id x = x
foo :: Generic Id []
foo = Generic (\x -> [x])
Question
Is there some other extension that I need to enable to get this to work? If not, is there a good reason why this doesn't work, or is it just an oversight?
Workaround
I'm aware that I can work around this by defining Identity', etc. as fully-fledged types, e.g.:
newtype Identity' s a = Identity' a
newtype Collection collection s i e = Collection (collection i e)
-- etc.
This is not ideal though, as it means that I have to reimplement Identity's Functor, Applicative and Monad instances for Identity', and it means that I have to write additional wrapping and unwrapping code for the collections.

Clarification on Existential Types in Haskell

I am trying to understand Existential types in Haskell and came across a PDF http://www.ii.uni.wroc.pl/~dabi/courses/ZPF15/rlasocha/prezentacja.pdf
Please correct my below understandings that I have till now.
Existential Types not seem to be interested in the type they contain but pattern matching them say that there exists some type we don't know what type it is until & unless we use Typeable or Data.
We use them when we want to Hide types (ex: for Heterogeneous Lists) or we don't really know what the types at Compile Time.
GADT's provide the clear & better syntax to code using Existential Types by providing implicit forall's
My Doubts
In Page 20 of above PDF it is mentioned for below code that it is impossible for a Function to demand specific Buffer. Why is it so? When I am drafting a Function I exactly know what kind of buffer I gonna use eventhough I may not know what data I gonna put into that.
What's wrong in Having :: Worker MemoryBuffer Int If they really want to abstract over Buffer they can have a Sum type data Buffer = MemoryBuffer | NetBuffer | RandomBuffer and have a type like :: Worker Buffer Int
data Worker x = forall b. Buffer b => Worker {buffer :: b, input :: x}
data MemoryBuffer = MemoryBuffer
memoryWorker = Worker MemoryBuffer (1 :: Int)
memoryWorker :: Worker Int
As Haskell is a Full Type Erasure language like C then How does it know at Runtime which function to call. Is it something like we gonna maintain few information and pass in a Huge V-Table of Functions and at runtime it gonna figure out from V-Table? If it is so then what sort of Information it gonna store?

GADT's provide the clear & better syntax to code using Existential Types by providing implicit forall's
I think there's general agreement that the GADT syntax is better. I wouldn't say that it's because GADTs provide implicit foralls, but rather because the original syntax, enabled with the ExistentialQuantification extension, is potentially confusing/misleading. That syntax, of course, looks like:
data SomeType = forall a. SomeType a
or with a constraint:
data SomeShowableType = forall a. Show a => SomeShowableType a
and I think the consensus is that the use of the keyword forall here allows the type to be easily confused with the completely different type:
data AnyType = AnyType (forall a. a) -- need RankNTypes extension
A better syntax might have used a separate exists keyword, so you'd write:
data SomeType = SomeType (exists a. a) -- not valid GHC syntax
The GADT syntax, whether used with implicit or explicit forall, is more uniform across these types, and seems to be easier to understand. Even with an explicit forall, the following definition gets across the idea that you can take a value of any type a and put it inside a monomorphic SomeType':
data SomeType' where
SomeType' :: forall a. (a -> SomeType') -- parentheses optional
and it's easy to see and understand the difference between that type and:
data AnyType' where
AnyType' :: (forall a. a) -> AnyType'
Existential Types not seem to be interested in the type they contain but pattern matching them say that there exists some type we don't know what type it is until & unless we use Typeable or Data.
We use them when we want to Hide types (ex: for Heterogeneous Lists) or we don't really know what the types at Compile Time.
I guess these aren't too far off, though you don't have to use Typeable or Data to use existential types. I think it would be more accurate to say an existential type provides a well-typed "box" around an unspecified type. The box does "hide" the type in a sense, which allows you to make a heterogeneous list of such boxes, ignoring the types they contain. It turns out that an unconstrained existential, like SomeType' above is pretty useless, but a constrained type:
data SomeShowableType' where
SomeShowableType' :: forall a. (Show a) => a -> SomeShowableType'
allows you to pattern match to peek inside the "box" and make the type class facilities available:
showIt :: SomeShowableType' -> String
showIt (SomeShowableType' x) = show x
Note that this works for any type class, not just Typeable or Data.
With regard to your confusion about page 20 of the slide deck, the author is saying that it's impossible for a function that takes an existential Worker to demand a Worker having a particular Buffer instance. You can write a function to create a Worker using a particular type of Buffer, like MemoryBuffer:
class Buffer b where
output :: String -> b -> IO ()
data Worker x = forall b. Buffer b => Worker {buffer :: b, input :: x}
data MemoryBuffer = MemoryBuffer
instance Buffer MemoryBuffer
memoryWorker = Worker MemoryBuffer (1 :: Int)
memoryWorker :: Worker Int
but if you write a function that takes a Worker as argument, it can only use the general Buffer type class facilities (e.g., the function output):
doWork :: Worker Int -> IO ()
doWork (Worker b x) = output (show x) b
It can't try to demand that b be a particular type of buffer, even via pattern matching:
doWorkBroken :: Worker Int -> IO ()
doWorkBroken (Worker b x) = case b of
MemoryBuffer -> error "try this" -- type error
_ -> error "try that"
Finally, runtime information about existential types is made available through implicit "dictionary" arguments for the typeclasses that are involved. The Worker type above, in addtion to having fields for the buffer and input, also has an invisible implicit field that points to the Buffer dictionary (somewhat like v-table, though it's hardly huge, as it just contains a pointer to the appropriate output function).
Internally, the type class Buffer is represented as a data type with function fields, and instances are "dictionaries" of this type:
data Buffer' b = Buffer' { output' :: String -> b -> IO () }
dBuffer_MemoryBuffer :: Buffer' MemoryBuffer
dBuffer_MemoryBuffer = Buffer' { output' = undefined }
The existential type has a hidden field for this dictionary:
data Worker' x = forall b. Worker' { dBuffer :: Buffer' b, buffer' :: b, input' :: x }
and a function like doWork that operates on existential Worker' values is implemented as:
doWork' :: Worker' Int -> IO ()
doWork' (Worker' dBuf b x) = output' dBuf (show x) b
For a type class with only one function, the dictionary is actually optimized to a newtype, so in this example, the existential Worker type includes a hidden field that consists of a function pointer to the output function for the buffer, and that's the only runtime information needed by doWork.

In Page 20 of above PDF it is mentioned for below code that it is impossible for a Function to demand specific Buffer. Why is it so?
Because Worker, as defined, takes only one argument, the type of the "input" field (type variable x). E.g. Worker Int is a type. The type variable b, instead, is not a parameter of Worker, but is a sort of "local variable", so to speak. It can not be passed as in Worker Int String -- that would trigger a type error.
If we instead defined:
data Worker x b = Worker {buffer :: b, input :: x}
then Worker Int String would work, but the type is no longer existential -- we now always have to pass the buffer type as well.
As Haskell is a Full Type Erasure language like C then How does it know at Runtime which function to call. Is it something like we gonna maintain few information and pass in a Huge V-Table of Functions and at runtime it gonna figure out from V-Table? If it is so then what sort of Information it gonna store?
This is roughly correct. Briefly put, each time you apply constructor Worker, GHC infers the b type from the arguments of Worker, and then searches for an instance Buffer b. If that is found, GHC includes an additional pointer to the instance in the object. In its simplest form, this is not too different from the "pointer to vtable" which is added to each object in OOP when virtual functions are present.
In the general case, it can be much more complex, though. The compiler might use a different representation and add more pointers instead of a single one (say, directly adding the pointers to all the instance methods), if that speeds up code. Also, sometimes the compiler needs to use multiple instances to satisfy a constraint. E.g., if we need to store the instance for Eq [Int] ... then there is not one but two: one for Int and one for lists, and the two needs to be combined (at run time, barring optimizations).
It is hard to guess exactly what GHC does in each case: that depends on a ton of optimizations which might or might not trigger.
You could try googling for the "dictionary based" implementation of type classes to see more about what's going on. You can also ask GHC to print the internal optimized Core with -ddump-simpl and observe the dictionaries being constructed, stored, and passed around. I have to warn you: Core is rather low level, and can be hard to read at first.

Data declaration with no data constructor. Can it be instantiated? Why does it compile?

Reading one of my Haskell books, I came across the sentence:
Data declarations always create a new type constructor, but may or may not create new data constructors.
It sounded strange that one should be able to declare a data type with no data constructor, because it seems that then one could never instantiate the type. So I tried it out. The following data declaration compiles without error.
data B = String
How would one create an instance of this type? Is it possible? I cannot seem to find a way.
I thought maybe a data constructor with name matching the type constructor would be created automatically, but that does not appear to be the case, as shown by the error resulting from attempting to use B as a data constructor with the declaration in scope.
Prelude> data B = String deriving Show
Prelude> B
<interactive>:129:1: error: Data constructor not in scope: B
Why is this data declaration permitted to compile if the type can never be instantiated?
Is it permitted solely for some formal reason despite not having a known practical application?
I also wonder whether my book's statement about data types with no constructor might be referring to types declared via the type or newtype keywords instead of by data.
In the type case, type synonyms clearly do not use data
constructors, as illustrated by the following.
Prelude> type B = String
Prelude>
Type synonyms such as this can be instantiated by the constructors of
the type they are set to. But I am not convinced that this is what my
book is referring to as type synonyms do not seem to be declaring a new data type as much as
simply defining an new alias for an existing type.
In the newtype case, it appears that types without data
constructors cannot be created, as shown by the following error.
Prelude> newtype B = String
<interactive>:132:13: error:
• The constructor of a newtype must have exactly one field
but ‘String’ has none
• In the definition of data constructor ‘String’
In the newtype declaration for ‘B’
type and newtype do not appear to be what the book is referring to, which brings me back to my original question: why it is possible to declare a type using data with no data constructor?

How would one create an instance of this type?
The statement from your book is correct, but your example is not. data B = String defines a type constructor B and a data constructor String, both taking no arguments. Note that the String you define is in the value namespace, so is different from the usual String type constructor.
ghci> data B = String
ghci> x = String
ghci> :t x
x :: B
However, here is an example of a data definition without data constructors (so it cannot be instantiated).
ghci> data B
Now, I have a new type constructor B, but no data constructors to produce values of type B. In fact, such a data type is declared in the Haskell base: it is called Void:
ghci> import Data.Void
ghci> :i Void
data Void -- Defined in ‘Data.Void’
Why is this data declaration permitted to compile if the type can never be instantiated?
Being able to have uninhabited types turns out to be useful in a handful of places. The examples that I can think of right now are mostly passing in such a type as a type parameter to another type constructor. One more practical use case is in streaming libraries like conduit.
There is a ConduitM i o m r type constructor where: i is the type of the input stream elements, o the type of the output stream elements, m the monad in which actions are performed, r is the final result produced at the end.
Then, it defines a Sink as
type Sink i m r = ConduitM i Void m r
since a Sink should never output any values. Void is a compile time guarantee that Sink cannot output any (non-bottom) values.
Much like Identity, Void is mostly useful in conjunction with other abstractions.
... type synonyms clearly do not use data constructors
Yes, but they are not defining type constructors either. Synonyms are just some surface-level convenience renaming. Under the hood, nothing new is defined.
In the newtype case, it appears that types without data constructors cannot be created, as shown by the following error.
I suggest you look up what newtype is for. The whole point of newtype is to provide a zero-cost wrapper around an existing type. That means you have one and exactly one constructor taking one and exactly one argument (the wrapped value). At compile time, the wrapping and unwrapping operations become NOPs.

What is () in Haskell, exactly?

I'm reading Learn You a Haskell, and in the monad chapters, it seems to me that () is being treated as a sort of "null" for every type. When I check the type of () in GHCi, I get
>> :t ()
() :: ()
which is an extremely confusing statement. It seems that () is a type all to itself. I'm confused as to how it fits into the language, and how it seems to be able to stand for any type.

tl;dr () does not add a "null" value to every type, hell no; () is a "dull" value in a type of its own: ().
Let me step back from the question a moment and address a common source of confusion. A key thing to absorb when learning Haskell is the distinction between its expression language and its type language. You're probably aware that the two are kept separate. But that allows the same symbol to be used in both, and that is what is going on here. There are simple textual cues to tell you which language you're looking at. You don't need to parse the whole language to detect these cues.
The top level of a Haskell module lives, by default, in the expression language. You define functions by writing equations between expressions. But when you see foo :: bar in the expression language, it means that foo is an expression and bar is its type. So when you read () :: (), you're seeing a statement which relates the () in the expression language with the () in the type language. The two () symbols mean different things, because they are not in the same language. This replication often causes confusion for beginners, until the expression/type language separation installs itself in their subconscious, at which point it becomes helpfully mnemonic.
The keyword data introduces a new datatype declaration, involving a careful mixture of the expression and type languages, as it says first what the new type is, and secondly what its values are.
data TyCon tyvar ... tyvar = ValCon1 type ... type | ... | ValConn type ... type
In such a declaration, type constructor TyCon is being added to the type language and the ValCon value constructors are being added to the expression language (and its pattern sublanguage). In a data declaration, the things which stand in argument places for the ValCons tell you the types given to the arguments when that ValCon is used in expressions. For example,
data Tree a = Leaf | Node (Tree a) a (Tree a)
declares a type constructor Tree for binary tree types storing a elements at nodes, whose values are given by value constructors Leaf and Node. I like to colour type constructors (Tree) blue and value constructors (Leaf, Node) red. There should be no blue in expressions and (unless you're using advanced features) no red in types. The built-in type Bool could be declared,
data Bool = True | False
adding blue Bool to the type language, and red True and False to the expression language. Sadly, my markdown-fu is inadequate to the task of adding the colours to this post, so you'll just have to learn to add the colours in your head.
The "unit" type uses () as a special symbol, but it works as if declared
data () = () -- the left () is blue; the right () is red
meaning that a notionally blue () is a type constructor in the type language, but that a notionally red () is a value constructor in the expression language, and indeed () :: (). [ It is not the only example of such a pun. The types of larger tuples follow the same pattern: pair syntax is as if given by
data (a, b) = (a, b)
adding (,) to both type and expression languages. But I digress.]
So the type (), often pronounced "Unit", is a type containing one value worth speaking of: that value is also written () but in the expression language, and is sometimes pronounced "void". A type with only one value is not very interesting. A value of type () contributes zero bits of information: you already know what it must be. So, while there is nothing special about type () to indicate side effects, it often shows up as the value component in a monadic type. Monadic operations tend to have types which look like
val-in-type-1 -> ... -> val-in-type-n -> effect-monad val-out-type
where the return type is a type application: the (type) function tells you which effects are possible and the (type) argument tells you what sort of value is produced by the operation. For example
put :: s -> State s ()
which is read (because application associates to the left ["as we all did in the sixties", Roger Hindley]) as
put :: s -> (State s) ()
has one value input type s, the effect-monad State s, and the value output type (). When you see () as a value output type, that just means "this operation is used only for its effect; the value delivered is uninteresting". Similarly
putStr :: String -> IO ()
delivers a string to stdout but does not return anything exciting.
The () type is also useful as an element type for container-like structures, where it indicates that the data consists just of a shape, with no interesting payload. For example, if Tree is declared as above, then Tree () is the type of binary tree shapes, storing nothing of interest at nodes. Similarly [()] is the type of lists of dull elements, and if there is nothing of interest in a list's elements, then the only information it contributes is its length.
To sum up, () is a type. Its one value, (), happens to have the same name, but that's ok because the type and expression languages are separate. It's useful to have a type representing "no information" because, in context (e.g., of a monad or a container), it tells you that only the context is interesting.

The () type can be thought of as a zero-element tuple. It's a type that can only have one value, and thus it's used where you need to have a type, but you don't actually need to convey any information. Here's a couple of uses for this.
Monadic things like IO and State have a return value, as well as performing side-effects. Sometimes the only point of the operation is to perform a side-effect, like writing to the screen or storing some state. For writing to the screen, putStrLn must have type String -> IO ? -- IO always has to have some return type, but here there's nothing useful to return. So what type should we return? We could say Int, and always return 0, but that's misleading. So we return (), the type that has only one value (and thus no useful information), to indicate that there's nothing useful coming back.
It's sometimes useful to have a type which can have no useful values. Consider if you'd implemented a type Map k v which maps keys of type k to values of type v. Then you want to implement a Set, which is really similar to a map except that you don't need the value part, just the keys. In a language like Java you might use booleans as the dummy value type, but really you just want a type that has no useful values. So you could say type Set k = Map k ()
It should be noted that () is not particularly magic. If you want you can store it in a variable and do a pattern match on it (although there's not much point):
main = do
x <- putStrLn "Hello"
case x of
() -> putStrLn "The only value..."

It is called the Unit type, usually used to represent side effects. You can think of it vaguely as Void in Java. Read more here and here etc. What can be confusing is that () syntactically represents both the type and its only value literal. Also note that it is not similar to null in Java which means an undefined reference - () is just effectively a 0-sized tuple.

I really like to think of () by analogy with tuples.
(Int, Char) is the type of all pairs of an Int and a Char, so it's values are all possible values of Int crossed with all possible values of Char. (Int, Char, String) is similarly the type of all triples of an Int, a Char, and a String.
It's easy to see how to keep extending this pattern upwards, but what about downwards?
(Int) would be the "1-tuple" type, consisting of all possible values of Int. But that would be parsed by Haskell as just putting parentheses around Int, and thus being just the type Int. And values in this type would be (1), (2), (3), etc, which also would just get parsed as ordinary Int values in parentheses. But if you think about it, a "1-tuple" is exactly the same as just a single value, so there's no need to actually have them exist.
Going down one step further to zero-tuples gives us (), which should be all possible combinations of values in an empty list of types. Well, there's exactly one way to do that, which is to contain no other values, so there should be only one value in the type (). And by analogy with tuple value syntax, we can write that value as (), which certainly looks like a tuple containing no values.
That's exactly how it works. There is no magic, and this type () and its value () are in no way treated specially by the language.
() is not in fact being treated as "a null value for any type" in the monads examples in the LYAH book. Whenever the type () is used the only value which can be returned is (). So it's used as a type to explicitly say that there cannot be any other return value. And likewise where another type is supposed to be returned, you cannot return ().
The thing to keep in mind is that when a bunch of monadic computations are composed together with do blocks or operators like >>=, >>, etc, they'll be building a value of type m a for some monad m. That choice of m has to stay the same throughout the component parts (there's no way to compose a Maybe Int with an IO Int in that way), but the a can and very often is different at each stage.
So when someone sticks an IO () in the middle of an IO String computation, that's not using the () as a null in the String type, it's simply using an IO () on the way to building an IO String, the same way you could use an Int on the way to building a String.

Yet another angle:
() is the name of a set which contains a single element called ().
Its indeed slightly confusing that the name of the set and the
element in it happens to be the same in this case.
Remember: in Haskell a type is a set that has its possible values as elements in it.

The confusion comes from other programming languages:
"void" means in most imperative languages that there is no structure in memory storing a value. It seems inconsistent because "boolean" has 2 values instead of 2 bits, while "void" has no bits instead of no values, but there it is about what a function returns in a practical sense. To be exact: its single value consumes no bit of storage.
Let's ignore the value bottom (written _|_) for a moment...
() is called Unit, written like a null-tuple. It has only one value. And it is not called
Void, because Void has not even any value, thus could not be returned by any function.
Observe this: Bool has 2 values (True and False), () has one value (()), and Void has no value (it doesn't exist). They are like sets with two/one/no elements. The least memory they need to store their value is 1 bit / no bit / impossible, respectively. Which means that a function that returns a () may return with a result value (the obvious one) that may be useless to you. Void on the other hand would imply that that function will never return and never give you any result, because there would not exist any result.
If you want to give "that value" a name, that a function returns which never returns (yes, this sounds like crazytalk), then call it bottom ("_|_", written like a reversed T). It could represent an exception or infinity loop or deadlock or "just wait longer". (Some functions will only then return bottom, iff one of their parameters is bottom.)
When you create the cartesian product / a tuple of these types, you will observe the same behaviour:
(Bool,Bool,Bool,(),()) has 2·2·2·1·1=6 differnt values. (Bool,Bool,Bool,(),Void) is like the set {t,f}×{t,f}×{t,f}×{u}×{} which has 2·2·2·1·0=0 elements, unless you count _|_ as a value.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string