What does the "Just" syntax mean in Haskell? - haskell

I have scoured the internet for an actual explanation of what this keyword does. Every Haskell tutorial that I have looked at just starts using it randomly and never explains what it does (and I've looked at many).
Here's a basic piece of code from Real World Haskell that uses Just. I understand what the code does, but I don't understand what the purpose or function of Just is.
lend amount balance = let reserve    = 100
                          newBalance = balance - amount
                      in if balance < reserve
                         then Nothing
                         else Just newBalance
From what I have observed, it is related to Maybe typing, but that's pretty much all I have managed to learn.
A good explanation of what Just means would be very much appreciated.

It's actually just a normal data constructor that happens to be defined in the Prelude, which is the standard library that is imported automatically into every module.
What Maybe is, Structurally
The definition looks something like this:
data Maybe a = Just a
             | Nothing
That declaration defines a type, Maybe a, which is parameterized by a type variable a, which just means that you can use it with any type in place of a.
Constructing and Destructing
The type has two constructors, Just a and Nothing. When a type has multiple constructors, it means that a value of the type must have been constructed with just one of the possible constructors. For this type, a value was constructed via either Just or Nothing; there are no other (non-error) possibilities.
Since Nothing has no parameter type, when it's used as a constructor it names a constant value that is a member of type Maybe a for all types a. But the Just constructor does have a type parameter, which means that when used as a constructor it acts like a function from type a to Maybe a, i.e. it has the type a -> Maybe a.
So, the constructors of a type build a value of that type; the other side of things is when you would like to use that value, and that is where pattern matching comes into play. Unlike functions, constructors can be used in pattern binding expressions, and this is the way in which you can do case analysis of values that belong to types with more than one constructor.
In order to use a Maybe a value in a pattern match, you need to provide a pattern for each constructor, like so:
case maybeVal of
    Nothing   -> "There is nothing!"
    Just val  -> "There is a value, and it is " ++ (show val)
In that case expression, the first pattern would match if the value was Nothing, and the second would match if the value was constructed with Just. If the second one matches, it also binds the name val to the parameter that was passed to the Just constructor when the value you're matching against was constructed.
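As a minimal, self-contained sketch of the case analysis above (the names describe and wrapped are made up for illustration and are not part of the Prelude), both constructors are handled like this:

```haskell
-- Hypothetical helper: turns a Maybe value into a description by
-- matching on both constructors.
describe :: Show a => Maybe a -> String
describe maybeVal =
  case maybeVal of
    Nothing  -> "There is nothing!"
    Just val -> "There is a value, and it is " ++ show val

-- Just behaves like an ordinary function of type a -> Maybe a,
-- so it can be passed to map like any other function.
wrapped :: [Maybe Int]
wrapped = map Just [1, 2, 3]
```

Loading this in GHCi and evaluating describe (Just 5) or describe (Nothing :: Maybe Int) exercises each branch.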
What Maybe Means
Maybe you were already familiar with how this worked; there's not really any magic to Maybe values, it's just a normal Haskell Algebraic Data Type (ADT). But it's used quite a bit because it effectively "lifts" or extends a type, such as Integer from your example, into a new context in which it has an extra value (Nothing) that represents a lack of value! The type system then requires that you check for that extra value before it will let you get at the Integer that might be there. This prevents a remarkable number of bugs.
Many languages today handle this sort of "no-value" value via NULL references. Tony Hoare, an eminent computer scientist (he invented Quicksort and is a Turing Award winner), owns up to this as his "billion dollar mistake". The Maybe type is not the only way to fix this, but it has proven to be an effective way to do it.
Maybe as a Functor
The idea of transforming one type to another one such that operations on the old type can also be transformed to work on the new type is the concept behind the Haskell type class called Functor, which Maybe a has a useful instance of.
Functor provides a method called fmap, which maps functions that range over values from the base type (such as Integer) to functions that range over values from the lifted type (such as Maybe Integer). A function transformed with fmap to work on a Maybe value works like this:
case maybeVal of
    Nothing  -> Nothing         -- there is nothing, so just return Nothing
    Just val -> Just (f val)    -- there is a value, so apply the function to it
So if you have a Maybe Integer value m_x and an Integer -> Integer function f, you can do fmap f m_x to apply the function f directly to the Maybe Integer without worrying whether it's actually got a value or not. In fact, you could apply a whole chain of lifted Integer -> Integer functions to Maybe Integer values and only have to worry about explicitly checking for Nothing once when you're finished.
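A small sketch of that idea, assuming two illustrative functions (double and increment are invented here, as are process and report): the functions are lifted once with fmap, and the constructor is only inspected at the end.

```haskell
-- Two ordinary functions on Integer; neither knows anything about Maybe.
double :: Integer -> Integer
double = (* 2)

increment :: Integer -> Integer
increment = (+ 1)

-- Lift the composed function with fmap; a Nothing input is passed
-- through untouched.
process :: Maybe Integer -> Maybe Integer
process = fmap (increment . double)

-- Only here, at the very end, do we check which constructor we have.
report :: Maybe Integer -> String
report Nothing  = "no value"
report (Just n) = "result: " ++ show n
```

With these definitions, report (process (Just 20)) yields "result: 41" and report (process Nothing) yields "no value".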
Maybe as a Monad
I'm not sure how familiar you are with the concept of a Monad yet, but you have at least used IO a before, and the type signature IO a looks remarkably similar to Maybe a. Although IO is special in that it doesn't expose its constructors to you and can thus only be "run" by the Haskell runtime system, it's still also a Functor in addition to being a Monad. In fact, there's an important sense in which a Monad is just a special kind of Functor with some extra features, but this isn't the place to get into that.
Anyway, Monads like IO map types to new types that represent "computations that result in values" and you can lift functions into Monad types via a very fmap-like function called liftM that turns a regular function into a "computation that results in the value obtained by evaluating the function."
You have probably guessed (if you have read this far) that Maybe is also a Monad. It represents "computations that could fail to return a value". Just like with the fmap example, this lets you do a whole bunch of computations without having to explicitly check for errors after each step. And in fact, the way the Monad instance is constructed, a computation on Maybe values stops as soon as a Nothing is encountered, so it's kind of like an immediate abort or a valueless return in the middle of a computation.
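To make the "immediate abort" behaviour concrete, here is a hedged sketch using the Prelude's lookup on association lists. The tables managers and offices and the function grandBossOffice are hypothetical examples, not anything from a library.

```haskell
-- Hypothetical data: who reports to whom, and which office someone has.
managers :: [(String, String)]
managers = [("alice", "bob"), ("bob", "carol")]

offices :: [(String, Int)]
offices = [("carol", 42)]

-- Each lookup may fail. The do-block threads the Maybe through, and the
-- first Nothing ends the whole computation with Nothing.
grandBossOffice :: String -> Maybe Int
grandBossOffice name = do
  boss      <- lookup name managers
  grandBoss <- lookup boss managers
  lookup grandBoss offices
```

Here grandBossOffice "alice" is Just 42, while grandBossOffice "bob" is Nothing because carol has no manager in the table, and no explicit check was written between the steps.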
You Could Have Written Maybe
Like I said before, there is nothing inherent to the Maybe type that is baked into the language syntax or runtime system. If Haskell didn't provide it by default, you could provide all of its functionality yourself! In fact, you could write it again yourself anyway, with different names, and get the same functionality.
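As an illustration of that claim, the following sketch re-creates the type under invented names (Optional, Absent, Present and halve are all made up here) together with the instances discussed above. It is one possible way to write it, not the definition used in base.

```haskell
-- A hand-rolled copy of Maybe under different, made-up names.
data Optional a = Absent | Present a
  deriving (Show, Eq)

instance Functor Optional where
  fmap _ Absent      = Absent
  fmap f (Present x) = Present (f x)

instance Applicative Optional where
  pure = Present
  Absent    <*> _ = Absent
  Present f <*> o = fmap f o

instance Monad Optional where
  Absent    >>= _ = Absent
  Present x >>= f = f x

-- An example step that can fail: halving only works for even numbers.
halve :: Int -> Optional Int
halve n
  | even n    = Present (n `div` 2)
  | otherwise = Absent
```

Chaining Present 8 >>= halve >>= halve gives Present 2, while starting from Present 6 gives Absent after the second step, just as the Maybe version would.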
Hopefully you understand the Maybe type and its constructors now, but if there is still anything unclear, let me know!

Most of the current answers are highly technical explanations of how Just and friends work; I thought I might try my hand at explaining what it's for.
A lot of languages have a value like null that can be used instead of a real value, at least for some types. This has made a lot of people very angry and been widely regarded as a bad move. Still, it's sometimes useful to have a value like null to indicate the absence of a thing.
Haskell solves this problem by making you explicitly mark places where you can have a Nothing (its version of a null). Basically, if your function would normally return the type Foo, it instead should return the type Maybe Foo. If you want to indicate that there's no value, return Nothing. If you want to return a value bar, you should instead return Just bar.
So basically, if you can't have Nothing, you don't need Just. If you can have Nothing, you do need Just.
There's nothing magical about Maybe; it's built on the Haskell type system. That means you can use all the usual Haskell pattern matching tricks with it.

Given a type t, a value of the form Just x holds an existing value x of type t, whereas Nothing represents a failure to reach a value, or a case where having a value would be meaningless.
In your example, having a negative balance doesn't make sense, and so if such a thing would occur, it is replaced by Nothing.
For another example, this could be used in division, defining a division function that takes a and b, and returns Just (a / b) if b is nonzero, and Nothing otherwise. It's often used like this, as a convenient alternative to exceptions, or like your earlier example, to replace values that don't make sense.
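A minimal sketch of such a division function (safeDivide and showQuotient are names chosen for this example, not standard library functions):

```haskell
-- Returns Nothing for a zero divisor instead of producing Infinity or NaN.
safeDivide :: Double -> Double -> Maybe Double
safeDivide _ 0 = Nothing
safeDivide a b = Just (a / b)

-- A caller decides what a missing result should look like.
showQuotient :: Double -> Double -> String
showQuotient a b = maybe "undefined" show (safeDivide a b)
```

For instance, showQuotient 10 4 is "2.5" and showQuotient 1 0 is "undefined".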

A total function a->b can find a value of type b for every possible value of type a.
In Haskell not all functions are total. In this particular case the function lend is not total: it is not defined for the case when balance is less than reserve (although, to my taste, it would make more sense to not permit newBalance to be less than reserve; as is, you can borrow 101 from a balance of 100).
Other designs that deal with non-total functions:
throw exceptions upon checking input value does not fit the range
return a special value (primitive type): favourite choice is a negative value for integer functions that are meant to return Natural numbers (for example, String.indexOf - when a substring is not found, the returned index is commonly designed to be negative)
return a special value (pointer): NULL or some such
silently return without doing anything: for example, lend could be written to return old balance, if the condition for lending is not met
return a special value: Nothing (or Left wrapping some error description object)
These are necessary design limitations in languages that cannot enforce totality of functions (for example, Agda can, but that leads to other complications, like becoming Turing-incomplete).
The problem with returning a special value or throwing exceptions is that it is easy for the caller to omit handling of such a possibility by mistake.
The problem with silently discarding a failure is also obvious - you are limiting what the caller can do with the function. For example, if lend returned old balance, the caller has no way of knowing if balance has changed. It may or may not be a problem, depending on the intended purpose.
Haskell's solution forces the caller of a partial function to deal with a type like Maybe a or Either error a, because that is the function's return type.
This way lend as it is defined, is a function that doesn't always compute new balance - for some circumstances new balance is not defined. We signal this circumstance to the caller by either returning the special value Nothing, or by wrapping the new balance in Just. The caller now has freedom to choose: either handle the failure to lend in a special way, or ignore and use old balance - for example, maybe oldBalance id $ lend amount oldBalance.
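The two choices open to the caller can be sketched as follows. The lend function is the one from the question with a type signature added (Integer is an assumption); describeLoan and balanceAfter are hypothetical callers.

```haskell
-- The lend function from the question, with an assumed Integer type.
lend :: Integer -> Integer -> Maybe Integer
lend amount balance =
  let reserve    = 100
      newBalance = balance - amount
  in if balance < reserve
       then Nothing
       else Just newBalance

-- Choice 1: handle the failure to lend explicitly.
describeLoan :: Integer -> Integer -> String
describeLoan amount balance =
  case lend amount balance of
    Nothing -> "loan refused"
    Just b  -> "new balance: " ++ show b

-- Choice 2: ignore the failure and fall back to the old balance.
balanceAfter :: Integer -> Integer -> Integer
balanceAfter amount oldBalance = maybe oldBalance id $ lend amount oldBalance
```

Note that describeLoan 101 100 reports "new balance: -1", which is the borrowing quirk mentioned earlier, while balanceAfter 50 99 simply returns 99.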

In if (cond :: Bool) then (ifTrue :: a) else (ifFalse :: a), the ifTrue and ifFalse branches must have the same type.
So, when we write then Nothing, the else branch must also have a Maybe type:
if balance < reserve
  then (Nothing :: Maybe nb)          -- same type
  else (Just newBalance :: Maybe nb)  -- same type

Related

A new type for all a (normal value) and nothing in Haskell?

I want to
create a new type and value that indicates none like Nothing of Maybe.
create a new union type for all a (normal value) and none
create a function that accepts the new union type, then if the argument is the none does nothing.
Without using Maybe, how can I do this in Haskell?
In Typescript
const none = { undefined };
type None = typeof none;
type Option<T> = None | T;
type f = <T>(a:Option<T>) => any;
const f:f = a =>
  a === none
    ? undefined
    : console.log(a);
How to do this in Haskell? and just in case, please note here I'm not asking how to use Maybe Monad. Thanks.
If you have an actual programming problem you're trying to solve which requires you to represent the concept of "either a value of type T or nothing", then Haskell's answer to that is to use Maybe T.
Whatever the goal of your code, if in Typescript you would use your Option to meet that goal, in Haskell you can use Maybe to meet that goal. The code will be a little different, but that's unsurprising because they are different languages.
But if you are trying to directly use the concepts involved in your Option from Typescript, in Haskell, to implement a type that is exactly the same for its own sake (rather than as a tool to solve a problem), then you're out of luck. Haskell does not have the concept of an (undiscriminated) union type, nor the means to easily simulate one¹.
You could sort-of try with Typeable:
import Data.Typeable (Typeable, cast)

data None = None

f :: (Typeable a, Show a) => a -> IO ()
f x = case (cast x :: Maybe None) of
  Just None -> pure ()
  Nothing   -> print x
When we've got a Typeable constraint in play we can actually check if a value is of a certain type (and if so get a reference to it where the reference has the right type, so we can use it as that type). But this forces us to go via Maybe anyway, because cast needs a way to represent the value of a cast that didn't succeed!
It's also warty because now an actual argument of our None type can't be represented at all, whereas the Maybe option can represent Just Nothing just fine. It's true that you don't often need this "nested failure-or-absence" capability (though by no means never; think of a query that might not return a result: if you need to distinguish between the failure to run the query vs the query returning no result, that's quite naturally Maybe (Maybe Result)). But I prefer my facilities that handle any type to handle any type, with uniformity. You can't get caught out by odd corner cases if there aren't any corner cases, by design.
And you can't use Typeable more generally to actually declare a type that could be the union of two specific types; you have to accept a type variable a with a Typeable constraint, which means people can pass you any type, not just the two you were trying to put into a union. Since you can only write a finite number of casts, Typeable fundamentally can only be used to handle a set of particular types specially, or a code path for anything else (which might be to error out).
As a more general point, you need to avoid over-using Typeable in order to write code that gains all the benefits of Haskell's type system. Lots of Haskell code makes use of properties that arise from what you can't do with a polymorphic type, because you don't know which concrete type it is. All of those go out the window when Typeable is in play. For a very simple example, there's the classic argument that there's only one function of type a -> a (that doesn't bottom out); with no way to know what the type a is, the function's implementation can't create one out of nothing, so it can only return it unmodified. But a function typed Typeable a => a -> a could have any number of oddball rules encoded, like "if the argument is an Integer, add one to it". No matter what is hiding in the implementation, it won't be reflected in the type beyond the Typeable constraint, so you have to be familiar with the particulars of the implementation to work with this function, and the compiler won't spot it if you make a mistake. A clear example of this effect is that in my attempt to write f above, we end up with the type f :: (Typeable a, Show a) => a -> IO (); there is zero indication of what types it is expecting to work with (that it wants to handle "either None or anything else").
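The "oddball rule" can be sketched directly. The function name sneaky is invented for this illustration; cast is the real function from Data.Typeable.

```haskell
import Data.Typeable (Typeable, cast)

-- From the type alone this looks like it must be the identity function,
-- but the Typeable constraint lets it inspect the concrete type at runtime.
sneaky :: Typeable a => a -> a
sneaky x =
  case (cast x :: Maybe Integer) of
    Just n ->
      -- The argument is an Integer: add one, then cast back to a.
      case cast (n + 1) of
        Just y  -> y
        Nothing -> x  -- not expected to happen, since a is Integer here
    Nothing -> x      -- any other type is returned unchanged
```

So sneaky (1 :: Integer) is 2, but sneaky (1 :: Int) is still 1 and sneaky "hi" is still "hi", and nothing in the signature warns the caller about the difference.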
As a totally different track, you can do this of course:
data None = None
data Option a = Some a | None' None
Now you've "created a new type to represent nothing" and "create a new union type for all a (normal value) and none". But Haskell's ADTs are discriminated unions, meaning there will always be a constructor tag (both at runtime and in your source code) on each variant in the union.
So there's really no point in having the separate None type, since once you've seen the None' tag in an Option a value you already know everything there is to know about the value; looking inside the None value to see that it is None tells you nothing². So you might as well just use:
data Option a = Some a | None
Which is exactly the same as the built in Maybe, differing only in the names chosen.
Similarly you can obviously write a function with a case to handle when an Option is None, and a case to handle when there is something there (either of which can "do nothing", if you're returning a type where "doing something" makes any sense, i.e. some sort of action, like IO or State).
f :: Show a => Option a -> IO ()
f None = pure ()
f (Some a) = print a
The code is slightly different, even ignoring trivial syntax and naming issues; we had to reference the Some constructor and call print on the value inside it, rather than printing the argument to f directly³. And Haskell uses a typeclass to divide types into ones that can meaningfully be printed and ones that can't, and only allows you to call print on the former. But it's so close that I have no reservations saying "this is the equivalent Haskell to the Typescript you wrote".
And given that Maybe already exists you might as well use it. (Similarly, should you ever need it the built in type () is - apart from the name - identical to data None = None.) But again: that is what you do if you are trying to solve a problem that you would use Option for in Typescript; if instead your goal is to implement something exactly the same as Option, then you simply can't with the tools Haskell gives you.
¹ You can probably hack something truly horrible together using unsafe features (like unsafeCoerceing to and from Any). I do not know exactly how to go about making that usable and reliable, and doing so is an utter waste of time for any practical purpose; you would never use such code in a real program, it would just be an interesting exercise in how far you can hack the language implementation. So I'm not going to write an answer that addresses that angle.
² Well, technically it tells you that the computation that produced it terminated; it could have been bottom. But you can't do anything with that information since it's impossible to test whether it was bottom; if it is you'll just error out (or not terminate) as well.
³ Printing the whole argument to f would have also worked, it would have just printed e.g. Some "value", which I assume is not what you meant.

What’s the idiomatic type to represent multiple failure cases but only one success case?

The type Maybe a represents a computation that could fail, with the semantics that we don’t care about exactly how it failed. If the computation succeeds then any value of type a might be returned.
What about the inverse case, where the computation could fail for any number of reasons (and we want to preserve that information) but success doesn’t involve any information other than “yes, it succeeded”? I can think of two obvious ways to encode this kind of computation:
Maybe e, where Just e represents a failure and Nothing represents a success. This is so contrary to the usual use of Maybe that I would be reluctant to use it.
Either e (), where Left e represents a failure and Right () represents a success. This has the advantage of being explicit, but has the disadvantage of… being explicit. Writing () feels awkward, especially outside the context of a type signature.
Does Haskell have a more idiomatic way to represent “multiple failure cases but only one success case”?
Without seeing the actual code, it is difficult to understand what you mean by failure. If it's a pure function, then I don't see why using Maybe would be a problem. I never really see Nothing as a failure but just as what it is: Nothing. Depending on the context, I either return Nothing as well or use a default value and carry on. I understand that it can be seen as a failure, but that depends more on the point of view of the caller than on the function itself.
Now, you want to represent a computation which can fail but returns nothing. If it is a pure function, that doesn't make sense. Your function being pure, nothing has happened (no side effect) and you don't get a result. So in case of success, you actually computed nothing: that's not a success, that's nothing. On the other hand, if you fail, you've got a reason why it failed. That's no different from a simple check returning a Maybe.
For example, you might need to check that a domain is not in a blacklist. For that you do a lookup in a list: Nothing means it's fine, whereas finding the domain is, from your point of view, a failure, and you need to stop your computation. The same code can be used to check that your domain belongs to a whitelist. In that case Nothing is a failure: it just depends on the context.
Now, if you are running a monadic action (like saving a file or something), it makes sense to return nothing, but different failures can happen (disk full, path incorrect, etc). The standard signature for an IO action whose result we don't care about is IO (), so you can either go for IO (Either e ()) (everybody will understand it) or go for IO () and raise exceptions (if they are genuinely exceptional).
A short way to go about this would be to use Either e () along with the pattern synonym
{-# LANGUAGE PatternSynonyms #-}

pattern Success :: Either e () -- Explicit type signature not necessary
pattern Success = Right ()
You could also include some other things as well, if it improves readability, such as
type FailableWith e = Either e ()
pattern FailedWith :: e -> FailableWith e
pattern FailedWith x = Left x
Unlike Maybe, Either has the advantage of having all the existing machinery already in place: the Functor, Applicative, Monad, Foldable, Traversable, Semigroup and even Ord (Left x < Right y should always hold) instances will likely already behave exactly how you would want for error handling. Generally, for this particular situation, Maybe will tend to do the opposite of what you want (usually you want to continue on a success and stop after the first failure, which is the opposite of what most of the Maybe machinery will provide for this scenario).
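To illustrate the "continue on success, stop at the first failure" behaviour, here is a small sketch. The SignupError type and the check functions are hypothetical; when comes from Control.Monad.

```haskell
import Control.Monad (when)

-- Hypothetical error type for illustration.
data SignupError = EmptyName | TooYoung Int
  deriving (Show, Eq)

-- Each check succeeds with Right () or fails with a specific Left.
checkName :: String -> Either SignupError ()
checkName name = when (null name) (Left EmptyName)

checkAge :: Int -> Either SignupError ()
checkAge age = when (age < 18) (Left (TooYoung age))

-- The Monad instance carries on after Right () and stops at the first Left.
validate :: String -> Int -> Either SignupError ()
validate name age = do
  checkName name
  checkAge age
```

Here validate "" 10 is Left EmptyName (the age check never runs), validate "ann" 10 is Left (TooYoung 10), and validate "ann" 30 is Right ().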
It's not clear from the question how the computation might fail.
If it is something like a compiler, which might produce a lot of error messages (rather than halting on the first one) then you want something like:
type MyResult a = Either [Error] a
which either succeeds with a result or fails with a list of reasons.
On the other hand if you have a non-deterministic computation where each variation might succeed or fail then you want something more like:
type MyResult a = [Either Error a]
Then search the list of results. If you find a Right then return it, otherwise assemble the list of Lefts.
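One way to sketch that search uses partitionEithers from Data.Either. The Error synonym is a placeholder (the text above leaves Error undefined) and firstSuccess is a made-up name.

```haskell
import Data.Either (partitionEithers)

-- Placeholder error type; the real one would depend on the application.
type Error = String

-- Return the first success if there is one; otherwise collect every failure.
firstSuccess :: [Either Error a] -> Either [Error] a
firstSuccess results =
  case partitionEithers results of
    (_,    x : _) -> Right x
    (errs, [])    -> Left errs
```

For example, firstSuccess [Left "a", Right 1, Right 2] is Right 1, while firstSuccess [Left "a", Left "b"] is Left ["a", "b"].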

Why is there no runConst function in Haskell?

Is there a convention so that I know when to expect runX versus getX typeclass functions?
It's purely a matter of how the author preferred to think about what they're representing. And it's often more about the "abstract concept" being represented, not the actual data structure being used to represent it.
If you have some type X and think of an X value as a computation that could be run to get a value, then you'd have a runX function. If you think of it as more like a container, then you'd have a getX function (There are other possible interpretations that could lead to runX or getX or something else, these are just 2 commonly recurring ways of thinking about values).
Of course when we're using first class Haskell values to represent things (and functions are perfectly good values), a lot of the time you could interpret something as either a computation or a container reasonably well. Consider State for representing stateful computations; surely that has to be interpreted as a computation, right? We say runState :: State s a -> s -> (a, s) because we think of it as "running" the State s a, needing an s as additional input. But we could just as easily think of it as "getting" an s -> (a, s) out of the State s a - treating State more like a container.
So the choice between runX and getX isn't really meaningful in any profound sense, but it tells you how the author was thinking about X (and perhaps how they think you should think about it).
Const is so-named in analogy to the function const (which takes an argument to produce the "constant function" that takes another input, ignores it, and returns whatever the first input to const was). But it's thought of as operating at the type level; Const takes a type and generates a "type-level function" that ignores whatever type it is applied to and then is isomorphic to the first type Const was applied to. Isomorphic rather than equal because to create a new type that could have different instances, it needs to have a constructor. At the value level, in order to be an isomorphism you need to be able to get a Const a b from an a (that's the Const constructor), and get the a back out of a Const a b. Since "being isomorphic to a" is all the properties we need it to have there's no real need to think of it as doing anything other than being a simple container of a, so we have getConst.
Identity seems similarly obvious as "just a container" and we have runIdentity. But one of the main motivations for having Identity is to think of Identity a as being a "monadic computation" in the same way that State s a, Reader e a, etc values are. So to continue the analogy we think of Identity as a "do-nothing" computation we run, rather than a simple wrapper container that we get a value out of. It would be perfectly valid to think of Identity as a container (the simplest possible one), but that wasn't the interpretation the authors chose to focus on.
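A short sketch of the two accessors side by side, using only modules from base. The value names stored, computed and unwrapped are invented for this example.

```haskell
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Const a b stores an a and ignores b, so fmap cannot touch the stored value.
stored :: Const String Int
stored = fmap (+ 1) (Const "unchanged")

-- Identity a is the "do-nothing" computation, so fmap applies directly.
computed :: Identity Int
computed = fmap (+ 1) (Identity 41)

-- Both accessors just remove a wrapper; only the naming differs.
unwrapped :: (String, Int)
unwrapped = (getConst stored, runIdentity computed)
```

Evaluating unwrapped gives ("unchanged", 42): getConst and runIdentity do the same kind of job, and the different prefixes only reflect how each type is meant to be thought about.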

Haskell "Not a data constructor" [duplicate]

Thanks to some excellent answers here, I generally understand (clearly in a limited way) the purpose of Haskell's Maybe and that its definition is
data Maybe a = Nothing | Just a
however I'm not entirely clear exactly why Just is a part of this definition. As near as I can tell, this is where Just itself is defined, but the relevant documentation doesn't say much about it.
Am I correct in thinking that the primary benefit of using Just in the definition of Maybe, rather than simply
data Maybe a = Nothing | a
is that it allows for pattern matching with Just _ and for useful functionality like isJust and fromJust?
Why is Maybe defined in the former way rather than the latter?
Haskell's algebraic data types are tagged unions. By design, when you combine two different types into another type, they have to have constructors to disambiguate them.
Your definition does not fit with how algebraic data types work.
data Maybe a = Nothing | a
There's no "tag" for a here. How would we tell a Maybe a apart from a normal, unwrapped a in your case?
Maybe has a Just constructor because it has to have a constructor by design.
Other languages do have union types which could work like what you imagine, but they would not be a good fit for Haskell. They play out differently in practice and tend to be somewhat error-prone.
There are some strong design reasons for preferring tagged unions to normal union types. They play well with type inference. Unions in real code often have a tag anyhow¹. And, from the point of view of elegance, tagged unions are a natural fit to the language because they are the dual of products (ie tuples and records). If you're curious, I wrote about this in a blog post introducing and motivating algebraic data types.
footnotes
¹ I've played with union types in two places: TypeScript and C. TypeScript compiles to JavaScript which is dynamically typed, meaning it keeps track of the type of a value at runtime—basically a tag.
C doesn't but, in practice, something like 90% of the uses of union types either have a tag or effectively emulate struct subtyping. One of my professors actually did an empirical study on how unions are used in real C code, but I don't remember what paper it was in off-hand.
Another way to look at it (in addition to Tikhon's answer) is to consider another one of the basic Haskell types, Either, which is defined like this:
-- | A value that contains either an #a# (the 'Left' constructor) or
-- a #b# (the 'Right' constructor).
data Either a b = Left a | Right b
This allows you to have values like these:
example1, example2 :: Either String Int
example1 = Left "Hello!"
example2 = Right 42
...but also like this one:
example3, example4 :: Either String String
example3 = Left "Hello!"
example4 = Right "Hello!"
The type Either String String, the first time you encounter it, sounds like "either a String or a String," and you might therefore think that it's the same as just String. But it isn't, because Haskell unions are tagged unions, and therefore an Either String String records not just a String, but also which of the "tags" (data constructors; in this case Left and Right) was used to construct it. So even though both alternatives carry a String as their payload, you're able to tell how any one value was originally built. This is good because there are lots of cases where the alternatives are the same type but the constructors/tags impart extra meaning:
data ResultMessage = FailureMessage String | SuccessMessage String
Here the data constructors are FailureMessage and SuccessMessage, and you can guess from the names that even though the payload in both cases is a String, they would mean very different things!
So bringing it back to Maybe/Just, what's happening here is that Haskell just uniformly works like that: every alternative of a union type has a distinct data constructor that must always be used to construct and pattern match values of its type. Even if at first you might think it would be possible to guess it from context, it just doesn't do it.
There are other reasons, a bit more technical. First, the rules for lazy evaluation are defined in terms of data constructors. The short version: lazy evaluation means that if Haskell is forced to peek inside of a value of type Maybe a, it will try to do the bare minimum amount of work needed to figure out whether it looks like Nothing or like Just x—preferably it won't peek inside the x when it does this.
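That "bare minimum" can be observed with a payload that would crash if it were ever evaluated. In this sketch hasValue is a hand-written stand-in for isJust, and constructorOnly is a made-up name.

```haskell
import Data.Maybe (isJust)

-- Matching on Just _ only inspects the constructor, never the payload.
hasValue :: Maybe a -> Bool
hasValue (Just _) = True
hasValue Nothing  = False

-- The payload is undefined, but it is never forced, so this is simply True.
constructorOnly :: Bool
constructorOnly = isJust (Just (undefined :: Int))
```

Both constructorOnly and hasValue (Just undefined) evaluate to True without raising an error, because deciding between Nothing and Just does not require looking inside the Just.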
Second: the language needs to be able to distinguish types like Maybe a, Maybe (Maybe a) and Maybe (Maybe (Maybe a)). If you think about it, if we had a type definition that worked like you wrote:
data Maybe a = Nothing | a -- NOT VALID HASKELL
...and we wanted to make a value of type Maybe (Maybe a), you wouldn't be able to tell these two values apart:
example5, example6 :: Maybe (Maybe a)
example5 = Nothing
example6 = Just Nothing
This might seem a bit silly at first, but imagine you have a map whose values are "nullable":
-- Map of persons to their favorite number. If we know that some person
-- doesn't have a favorite number, we store `Nothing` as the value for
-- that person.
favoriteNumber :: Map Person (Maybe Int)
...and want to look up an entry:
Map.lookup :: Ord k => k -> Map k v -> Maybe v
So if we look up mary in the map we have:
Map.lookup mary favoriteNumber :: Maybe (Maybe Int)
And now the result Nothing means Mary's not in the map, while Just Nothing means Mary's in the map but she doesn't have a favorite number.
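The three outcomes can be sketched without any extra packages by swapping the Map for an association list and the Prelude's lookup (that substitution, the sample data and the name describeFavorite are assumptions made for this example):

```haskell
-- Association list standing in for the Map in the text, so that the
-- sketch only needs the Prelude. The entries are made up.
favoriteNumber :: [(String, Maybe Int)]
favoriteNumber = [("alice", Just 7), ("mary", Nothing)]

-- lookup adds the outer Maybe; the inner one comes from the stored value.
describeFavorite :: String -> String
describeFavorite person =
  case lookup person favoriteNumber of
    Nothing       -> person ++ " is not in the table"
    Just Nothing  -> person ++ " has no favorite number"
    Just (Just n) -> person ++ "'s favorite number is " ++ show n
```

Without the Just tag, the first two branches could not be told apart.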
Just is a constructor: a value x on its own has type a, whereas Just x has a different type, Maybe a.
Maybe a is designed so to have one more value than the type a. In type theory, sometimes it is written as 1 + a (up to iso), which makes that fact even more evident.
As an experiment, consider the type Maybe (Maybe Bool). Here we have 1 + 1 + 2 values, namely:
Nothing
Just Nothing
Just (Just False)
Just (Just True)
If we were allowed to define
data Maybe a = Nothing | a
we would lose the distinction between the cases Just Nothing and Nothing, since there is no longer a Just to tell them apart. Indeed, Maybe (Maybe a) would collapse into Maybe a. This would be an inconvenient special case.

Monad "unboxing"

My question came up while following the tutorial Functors, Applicatives, And Monads In Pictures and its JavaScript version.
When the text says that a functor unwraps a value from the context, I understand that a Just 5 -> 5 transformation is happening. As per What does the "Just" syntax mean in Haskell?, Just is "defined in scope" of the Maybe monad.
My question is what is so magical about the whole unwrapping thing? I mean, what is the problem of having some language rule which automatically unwraps the "scoped" variables? It looks to me that this action is merely a lookup in some kind of a table where the symbol Just 5 corresponds to the integer 5.
My question is inspired by the JavaScript version, where Just 5 is a prototype array instance. So unwrapping is, indeed, not rocket science at all.
Is this a "for-computation" type of reason or a "for-programmer" one? Why do we distinguish Just 5 from 5 on the programming language level?
First of all, I don't think you can understand Monads and the like without understanding a Haskell-like type system (i.e. without learning a language like Haskell). Yes, there are many tutorials that claim otherwise, but I read a lot of them before learning Haskell and I didn't get it. So my advice: if you want to understand Monads, learn at least some Haskell.
To your question, "Why do we distinguish Just 5 from 5 on the programming language level?": for type safety. In most languages other than Haskell, null, nil, or something similar is used to represent the absence of a value. This, however, often results in things like NullPointerExceptions, because you didn't anticipate that a value might not be there.
In Haskell there is no null. So if you have a value of type Int, or anything else, that value cannot be null. You are guaranteed that there is a value. Great! But sometimes you actually want or need to encode the absence of a value. In Haskell we use Maybe for that. So something of type Maybe Int can be either something like Just 5 or Nothing. This way it is explicit that the value may not be there, and you cannot accidentally forget that it might be Nothing, because you have to explicitly unwrap the value.
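A minimal sketch of what "explicitly unwrap" looks like in practice. The helpers safeDiv and divThenIncrement are invented for this example, and the fallback of 0 is an arbitrary choice:

```haskell
-- Hypothetical helper: division that makes possible failure visible in its type.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- To get at the Int we have to say what happens in both cases.
-- Writing `safeDiv x y + 1` directly is rejected by the type checker,
-- because Maybe Int is not a numeric type.
divThenIncrement :: Int -> Int -> Int
divThenIncrement x y = case safeDiv x y of
  Just q  -> q + 1
  Nothing -> 0  -- fallback chosen arbitrarily for this sketch
```

The point is not the particular fallback, but that the Nothing case cannot be silently skipped the way a null check can.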
This has nothing really to do with Monads, except that Maybe happens to implement the Monad type class (a type class is a bit like a Java interface, if you are familiar with Java). That is, Maybe is not primarily a Monad, but just happens to also be a Monad.
I think you're looking at this from the wrong direction. Monad is explicitly not about unwrapping. Monad is about composition.
It lets you combine (not necessarily apply) a function of type a -> m b with a value of type m a to get a value of type m b. I can understand where you might think the obvious way to do that is unwrapping the value of type m a into a value of type a. But very few Monad instances work that way. In fact, the only ones that can work that way are the ones that are equivalent to the Identity type. For nearly all instances of Monad, it's just not possible to unwrap a value.
Consider Maybe. Unwrapping a value of type Maybe a into a value of type a is impossible when the starting value is Nothing. Monadic composition has to do something more interesting than just unwrapping.
Consider []. Unwrapping a value of type [a] into a value of type a is impossible unless the input just happens to be a list of length 1. In every other case, monadic composition is doing something more interesting than unwrapping.
Consider IO. A value like getLine :: IO String doesn't contain a String value. It's plain impossible to unwrap, because it isn't wrapping something. Monadic composition of IO values doesn't unwrap anything. It combines IO values into more complex IO values.
I think it's worthwhile to adjust your perspective on what Monad means. If it were only an unwrapping interface, it would be pretty useless. It's more subtle, though. It's a composition interface.
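To illustrate the three cases above, here is a small sketch using >>= from the Prelude. The names halve, quarter, duplicated and echo are made up for the example:

```haskell
-- Hypothetical helper: halve even numbers, fail on odd ones.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- Maybe: composition stops at the first Nothing instead of unwrapping it.
quarter :: Int -> Maybe Int
quarter n = halve n >>= halve

-- List: every element contributes zero or more results to the output.
duplicated :: [Int]
duplicated = [1, 2, 3] >>= \x -> [x, x]

-- IO: two actions are combined into one bigger action. No String is
-- extracted here; the combined action only runs when it is executed.
echo :: IO ()
echo = getLine >>= putStrLn
```

In GHCi, quarter 8 gives Just 2, quarter 6 gives Nothing, and duplicated is [1,1,2,2,3,3]. In each case >>= builds a new value of the same shape rather than pulling a plain value out.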
A possible example is this: consider the Haskell type Maybe (Maybe Int). Its values can take one of the following forms:
Nothing
Just Nothing
Just (Just n) for some integer n
Without the Just wrapper we couldn't distinguish between the first two.
Indeed, the whole point of the optional type Maybe a is to add a new value (Nothing) to an existing type a. To ensure such Nothing is indeed a fresh value, we wrap the other values inside Just.
It also helps during type inference. When we see the function call f 'a' we can see that f is called at the type Char, and not at type Maybe Char or Maybe (Maybe Char). The typeclass system would allow f to have a different implementation in each of these cases (this is similar to "overloading" in some OOP languages).
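As a sketch of that last point, assuming a made-up class Describe that exists only for this illustration, the same function name can have a different implementation at Char and at Maybe Char:

```haskell
-- Hypothetical class, defined only for this example.
class Describe a where
  describe :: a -> String

-- Implementation used when the argument is a plain Char.
instance Describe Char where
  describe c = "a plain character: " ++ [c]

-- Implementation used when the argument is wrapped in Maybe
-- (and, recursively, in Maybe (Maybe ...)).
instance Describe a => Describe (Maybe a) where
  describe Nothing  = "nothing"
  describe (Just x) = "just " ++ describe x
```

With these instances, describe 'a' picks the Char instance and describe (Just 'a') picks the Maybe one. If 'a' could silently stand for Just 'a', the compiler would have no way to choose between them.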
My question is, what is so magical about the whole unwrapping thing?
There is nothing magical about it. You can use garden-variety pattern matching (here in the shape of a case expression) to define...
mapMaybe :: (a -> b) -> Maybe a -> Maybe b
mapMaybe f mx = case mx of
    Just x -> Just (f x)
    Nothing -> Nothing
... which is exactly the same as fmap for Maybe. The only thing the Functor class adds -- and it is a very useful thing, make no mistake -- is an extra level of abstraction that covers various structures that can be mapped over.
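For instance, here is the same fmap call applied to a few different Functor instances from the Prelude (the binding names are invented for the sketch):

```haskell
-- Mapping over Maybe: the function is applied inside Just...
overMaybe :: Maybe Int
overMaybe = fmap (+ 1) (Just 5)

-- ...and there is nothing to apply it to in Nothing.
overNothing :: Maybe Int
overNothing = fmap (+ 1) Nothing

-- Mapping over a list applies the function to every element.
overList :: [Int]
overList = fmap (+ 1) [1, 2, 3]

-- Mapping over Either applies the function to a Right value.
overEither :: Either String Int
overEither = fmap (+ 1) (Right 5)
```

These evaluate to Just 6, Nothing, [2,3,4] and Right 6 respectively. The call site looks identical each time; only the instance decides what "mapping" means.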
Why do we distinguish Just 5 from 5 on programming language level?
More meaningful than the distinction between Just 5 and 5 is the one between their types -- e.g. between Maybe Int and Int. If you have x :: Int, you can be certain x is an Int value you can work with. If you have mx :: Maybe Int, however, you have no such certainty, as the Int might be missing (i.e. mx might be Nothing), and the type system forces you to acknowledge and deal with this possibility.
See also: jpath's answer for further comments on the usefulness of Maybe (which isn't necessarily tied to classes such as Functor and Monad); Carl's answer for further comments on the usefulness of classes like Functor and Monad (beyond the Maybe example).
What "unwrap" means depends on the container. Maybe is just one example. "Unwrapping" means something completely different when the container is [] instead of Maybe.
The magical thing about the whole unwrapping business is the abstraction: in a Monad we have a notion of "unwrapping" which abstracts over the nature of the container; and then it starts to get "magical"...
You ask what Just means: Just is nothing but a data constructor in Haskell, defined via a data declaration like:
data Maybe a = Just a | Nothing
Just takes a value of type a and creates a value of type Maybe a. It's Haskell's way to distinguish values of type a from values of type Maybe a.
First of all, you should remove monads from your question; they have nothing to do with it. Treat that article as just one point of view on monads. Perhaps it does not suit you, or perhaps you do not yet know the type system well enough to understand monads in Haskell.
So your question can be rephrased as: why is there no implicit conversion Just 5 => 5? The answer is very simple. The value Just 5 has type Maybe Integer, so a value of that type could also be Nothing, and what should the compiler do in that case? Only the programmer can resolve this situation.
But there is a more uncomfortable question. There are types such as newtype Identity a = Identity a, which is just a wrapper around some value. So why is there no implicit conversion Identity a => a?
The simple answer is that an attempt to implement this would lead to a different type system, one that would lack many of the fine qualities of the current one. So this convenience is sacrificed for the benefit of other possibilities.
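For comparison, this is roughly what the explicit conversion looks like today. The sketch defines its own Identity with a record accessor named runIdentity (the accessor is an addition for the example, not part of the one-line definition quoted above):

```haskell
-- Same shape as the Identity type above, plus an accessor for unwrapping.
newtype Identity a = Identity { runIdentity :: a }

wrapped :: Identity Int
wrapped = Identity 5

-- The unwrapping step is written out; `wrapped + 1` on its own would not
-- type check, because Identity Int and Int are different types.
unwrapped :: Int
unwrapped = runIdentity wrapped + 1
```

The extra call is small, and in exchange the types Identity Int and Int never get confused with each other.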

Resources