I am writing a function to check if a tree if a BST. All I've tried is to print the tree in an in-order traversal to a list and then check if the list is increasing. However I am having this error:
Couldn't match expected type `a' against inferred type `[t]'
`a' is a rigid type variable bound by
the type signature for `checkList' at BST.hs:24:18
In the pattern: x : y : xs
In the pattern: [x : y : xs]
In the definition of `checkList':
checkList [x : y : xs] = x <= y && checkList (y : xs)
Here is what I have so far (only a checkList function).
checkList :: (Ord a) => [a] -> Bool
checkList [] = True
checkList [x] = True
checkList [x:y:xs] = x <= y && checkList (y:xs)
You want:
checkList :: (Ord a) => [a] -> Bool
checkList [] = True
checkList [x] = True
checkList (x:y:xs) = x <= y && checkList (y:xs)
When you tried to use [ ] in the final pattern, you were saying "match against a list that contains x:y:xs (also a list!) as its sole element". Which doesn't match the type [a].
A somewhat ugly one using foldl'
checkList :: Ord a => [a] -> Bool
checkList xs = fst $ foldl' (\(b,x1) x2 -> (b && x1 <= x2,x2)) (True,head xs) xs
Note: Using head xs is OK here because of lazy evaluation.
The usual way to do this is to make your tree foldable:
data BST a = Node (BST a) a (BST a) | Leaf
-- Use `deriving Foldable` or this instance
instance Foldable BST where
foldMap _ Leaf = mempty
foldMap f (Node l v r) =
foldMap f l <> (f v <> foldMap f r)
Then you can skip conversion to a list like this. This is similar to bmk's answer, but avoids head.
-- Is this increasing? If so, what is the maximum?
data Result a = Empty | NotInc | Inc a
finalInc :: Result a -> Bool
finalInc NotInc = False
finalInc _ = True
increasing :: (Foldable f, Ord a) => f a -> Bool
increasing = finalInc . foldl' go Empty where
go Empty y = Inc y
go NotInc _ = NotInc
go (Inc x) y
| x <= y = Inc y
| otherwise = NotInc
Warning! Warning!
The property this checks is weaker than the traditional binary search tree property, and weaker than the commonly accepted ways to weaken that property. In particular, you generally want to ensure, at least, that the root of each subtree is strictly greater than all elements of its left subtree, or that the root of each subtree is strictly less than all elements of its right subtree. These weak properties cannot be expressed in terms of the Foldable instance or conversion to a list; they must be checked directly. You can, however, use these techniques to verify the classical BST property by simply replacing <= with <.
A remark on space
All of the answers, including this one, have a somewhat unfortunate property: given a very left-heavy tree (e.g., Node (Node (...) 2 Leaf) 1 Leaf) they will use O(n) additional space to verify the search tree property. Is there some way to write this so it won't have any such bad cases? Unfortunately, the answer seems to be no. The classical BST property can be stated thus:
Each node must be greater than all elements of its left subtree and less than all elements of its right subtree.
The trouble is that "and". If we decide to check the left subtree first, we have to remember to check the right subtree afterwards, and vice versa.
Thus the only way to make verification efficient is to ensure that the tree is balanced.
Related
I am writing a small function in Haskell to check if a list is a palindrome by comparing it with it's reverse.
checkPalindrome :: [Eq a] -> Bool
checkPalindrome l = (l == reverse l)
where
reverse :: [a] -> [a]
reverse xs
| null xs = []
| otherwise = (last xs) : reverse newxs
where
before = (length xs) - 1
newxs = take before xs
I understand that I should use [Eq a] in the function definition because I use the equality operator later on, but I get this error when I compile:
Expected kind ‘*’, but ‘Eq a’ has kind ‘GHC.Prim.Constraint’
In the type signature for ‘checkPalindrome’:
checkPalindrome :: [Eq a] -> Bool
P.s Feel free to correct me if I am doing something wrong with my indentation, I'm very new to the language.
Unless Haskell adopted a new syntax, your type signature should be:
checkPalindrome :: Eq a => [a] -> Bool
Declare the constraint on the left hand side of a fat-arrow, then use it on the right hand side.
Unlike OO languages, Haskell makes a quite fundamental distinction between
Constraints – typeclasses like Eq.
Types – concrete types like Bool or lists of some type.
In OO languages, both of these would be represented by classes†, but a Haskell type class is completely different. You never have “values of class C”, only “types of class C”. (These concrete types may then contain values, but the classes don't.)
This distinction may seem pedantic, but it's actually very useful. What you wrote, [Eq a] -> Bool, would supposedly mean: each element of the list must be comparable... but comparable to what? You could have elements of different type in the list, how do you know that these elements are comparable to each other? In Haskell, that's no issue, because whenever the function is used you first settle on one type a. This type must be in the Eq class. The list then must have all elements from the same type a. This way you ensure that each element of the list is comparable to all of the others, not just, like, comparable to itself. Hence the signature
checkPalindrome :: Eq a => [a] -> Bool
This is the usual distinction on the syntax level: constraints must always‡ be written on the left of an => (implication arrow).
The constraints before the => are “implicit arguments”: you don't explicitly “pass Eq a to the function” when you call it, instead you just pass the stuff after the =>, i.e. in your example a list of some concrete type. The compiler will then look at the type and automatically look up its Eq typeclass instance (or raise a compile-time error if the type does not have such an instance). Hence,
GHCi, version 7.10.2: http://www.haskell.org/ghc/ :? for help
Prelude> let palin :: Eq a => [a] -> Bool; palin l = l==reverse l
Prelude> palin [1,2,3,2,1]
True
Prelude> palin [1,2,3,4,5]
False
Prelude> palin [sin, cos, tan]
<interactive>:5:1:
No instance for (Eq (a0 -> a0))
(maybe you haven't applied enough arguments to a function?)
arising from a use of ‘palin’
In the expression: palin [sin, cos, tan]
In an equation for ‘it’: it = palin [sin, cos, tan]
...because functions can't be equality-compared.
†Constraints may in OO also be interfaces / abstract base classes, which aren't “quite proper classes” but are still in many ways treated the same way as OO value-classes. Most modern OO languages now also support Haskell-style parametric polymorphism in addition to “element-wise”/covariant/existential polymorphism, but they require somewhat awkward extends trait-mechanisms because this was only implemented as an afterthought.
‡There are also functions which have “constraints in the arguments”, but that's a more advanced concept called rank-n polymorphism.
This is really an extended comment. Aside from your little type error, your function has another problem: it's extremely inefficient. The main problem is your definition of reverse.
reverse :: [a] -> [a]
reverse xs
| null xs = []
| otherwise = (last xs) : reverse newxs
where
before = (length xs) - 1
newxs = take before xs
last is O(n), where n is the length of the list. length is also O(n), where n is the length of the list. And take is O(k), where k is the length of the result. So your reverse will end up taking O(n^2) time. One fix is to just use the standard reverse function instead of writing your own. Another is to build up the result recursively, accumulating the result as you go:
reverse :: [a] -> [a]
reverse xs0 = go [] xs0
go acc [] = acc
go acc (x : xs) = go (x : acc) xs
This version is O(n).
There's another source of inefficiency in your implementation:
checkPalindrome l = (l == reverse l)
This isn't nearly as bad, but let's look at what it does. Suppose we have the string "abcdefedcba". Then we test whether "abcdefedcba" == "abcdefedcba". By the time we've checked half the list, we already know the answer. So we'd like to stop there! There are several ways to accomplish this. The simplest efficient one is probably to calculate the length of the list as part of the process of reversing it so we know how much we'll need to check:
reverseCount :: [a] -> (Int, [a])
reverseCount xs0 = go 0 [] xs0 where
go len acc [] = (len, acc)
go len acc (x : xs) = len `seq`
go (len + 1) (x : acc) xs
Don't worry about the len `seq` bit too much; that's just a bit of defensive programming to make sure laziness doesn't make things inefficient; it's probably not even necessary if optimizations are enabled. Now you can write a version of == that only looks at the first n elements of the lists:
eqTo :: Eq a => Int -> [a] -> [a] -> Bool
eqTo 0 _ _ = True
eqTo _ [] [] = True
eqTo n (x : xs) (y : ys) =
x == y && eqTo (n - 1) xs ys
eqTo _ _ _ = False
So now
isPalindrome xs = eqTo ((len + 1) `quot` 2) xs rev_xs
where
(len, rev_xs) = reverseCount xs
Here's another way, that's more efficient and arguably more elegant, but a bit tricky. We don't actually need to reverse the whole list; we only need to reverse half of it. This saves memory allocation. We can use a tortoise and hare trick:
splitReverse ::
[a] ->
( [a] -- the first half, reversed
, Maybe a -- the middle element
, [a] ) -- the second half, in order
splitReverse xs0 = go [] xs0 xs0 where
go front rear [] = (front, Nothing, rear)
go front (r : rs) [_] = (front, Just r, rs)
go front (r : rs) (_ : _ : xs) =
go (r : front) rs xs
Now
isPalindrome xs = front == rear
where
(front, _, rear) = splitReverse xs
Now for some numbers, using the test case
somePalindrome :: [Int]
somePalindrome = [1..10000] ++ [10000,9999..1]
Your original implementation takes 7.523s (2.316 mutator; 5.204 GC) and allocates 11 gigabytes to build the test list and check if it's a palindrome. My counting implementation takes less than 0.01s and allocates 2.3 megabytes. My tortoise and hare implementation takes less than 0.01s and allocates 1.7 megabytes.
Though disjoint exhausts all possible patterns in its guard conditions, Haskell gives me a PatternMatchFail error when running it.
disjoint :: (Ord a) => [a] -> [a] -> Bool
disjoint l#(x:xs) r#(y:ys)
| null l || null r = True
| x == y = False
| x > y = disjoint l ys -- reduce right list
| otherwise = disjoint xs r -- reduce left list
-- | Terminates when either list has been reduced to null, or when their head
-- elements are equal. Since lists are ordered, it only needs to compare head elements.
However, it has no problem if I write:
disjoint :: (Ord a) => [a] -> [a] -> Bool
disjoint [] _ = True
disjoint _ [] = True
disjoint l#(x:xs) r#(y:ys)
-- | null l || null r = True -- now redundant, but included for sake of continuity
| x == y = False
| x > y = disjoint l ys -- reduce right list
| otherwise = disjoint xs r -- reduce left list
Without those additional lines, I get a PatternMatchFail. If I am to infer what the issue for Haskell is in the first case, it is that if given a null list for an input argument, its expected arguments l#(x:xs) r#(y:ys) are already invoking a pattern-match, one that is non-exhaustive in the case of a null list, resulting in a PatternMatchFail, despite having a guard condition that checks for exactly the same condition. It just can't ever reach the guard condition, because it first needs to match on the "argument condition".
However, those additional two lines are a tad off-putting to me in their repetitiveness, and I was just wondering if there was a more succinct way of fixing this. More generally: if I were to be using three or more lists as arguments, I definitely wouldn't want to write out disjoint 3+ more times just to check for null conditions, so what might I do in cases like that? Thank you for your time.
Your explaination for why this gives a pattern match failure is correct. You can write the code the following way to avoid redundant lines:
disjoint :: (Ord a) => [a] -> [a] -> Bool
disjoint l#(x:xs) r#(y:ys)
| x == y = False
| x > y = disjoint l ys -- reduce right list
| otherwise = disjoint xs r -- reduce left list
disjoint _ _ = True -- catch all pattern, executed if either l or r is []
This is the solution I recommend. There is another solution, to make the pattern match more lazy (the pattern is then only checked if x/xs or y/ys is actually required):
disjoint :: (Ord a) => [a] -> [a] -> Bool
disjoint l# ~(x:xs) r# ~(y:ys) -- the ~ here says that this is an irrefutable pattern, which makes the match more lazy
| null l || null r = True -- x/y is not required, so pattern not checked
| x == y = False
| x > y = disjoint l ys -- reduce right list
| otherwise = disjoint xs r -- reduce left list
I don't recommend doing this though, since checking for null explicitly does not feel like idiomatic Haskell (also, irrefutable patterns are rarely used). The problem with the second approach is that you have to take care that you don't access y/ys / x/xs in the null cases, and the compiler won't help you. The first approach guarrantes that you can't access them in the null cases.
Another way to avoid the duplication is to take advantage of pattern match/guard fall through:
disjoint :: (Ord a) => [a] -> [a] -> Bool
disjoint l r
| null l || null r = True
-- If the guard above fails, then this pattern match is attempted:
disjoint l#(x:xs) r#(y:ys)
| x == y = False
| x > y = disjoint l ys -- reduce right list
| otherwise = disjoint xs r -- reduce left list
This is overkill here and personally I prefer the explicit pattern matching over null (the style of the first code block in bennofs answer is what I would go for), but this general technique can be handy in some situations.
If given a list of tuples representing ranges like this:
[(0,10),(10,100),(1000,5000)]
I'd like to merge the tuples that represent contiguous ranges, so the result is this:
[(0,100),(1000,5000)]
Any elegant solutions?
Here's mine
mergeRanges :: [(Int, Int)] -> [(Int, Int)]
mergeRanges xs = foldr f [] (sort xs)
where f new#(x,y) acc#((a,b):ys) =
if y == a
then (x,b):ys
else new:acc
f x acc = x:acc
EDIT: Ranges are non-overlapping
Unless this is a pattern that shows up more often in your program, I would just go for a direct recursion (untested code follows!):
mergeRanges (lo1,hi1) : (lo2,hi2) : rest)
| hi1 == lo2 = mergeRanges ((lo1,hi2) : rest)
-- or (lo1,hi2) : mergeRanges rest, to merge only adjacent ranges
mergeRanges (interval:rest) = interval : mergeRanges rest
mergeRanges [] = []
(where you could optimize a bit by using #-patterns at the cost of clutter).
But if you really want to, you could use the following helper function
merge :: (a -> a -> Maybe a) -> [a] -> [a]
merge f [] = []
merge f [x] = [x]
merge f (x:y:xs) = case f x y of
Nothing -> x : merge f (y:xs)
Just z -> merge (z:xs) -- or z : merge xs
and give as first argument
merge2Ranges (lo1, hi1) (lo2, hi2)
| hi1 == lo2 = Just (lo1, hi2)
| otherwise = Nothing
I doubt that merge is in a library somewhere, since it's pretty specific to the problem at hand.
Well, I think the best solutions in this space probably will involve specialized data structures that maintain the invariant in question. In Java-land, the Guava library has RangeSet, which does precisely this.
This is not a solution to your problem directly, but once I was playing around with this simple (too simple) implementation of "historical values" as a kind of binary search tree:
-- | A value that changes over time at discrete moments. #t# is the timeline type,
-- #a# is the value type.
data RangeMap t a = Leaf a
-- Invariant: all #t# values in the left branch must be less than
-- the one in the parent.
| Split t (RangeMap a) (RangeMap a)
valueAt :: RangeMap t a -> t -> a
valueAt _ (Leaf a) = a
valueAt t (Split t' before since)
| t < t' = get t before
| otherwise = get t since
The idea here is that Split t beforeT sinceT divides the timeline into two branches, one for values that held before t and a second for those that held since t.
So represented in terms of this type, your range set could be represented something like this:
example :: RangeMap Int Bool
example = Split 1000 (Split 100 (Split 0 (Leaf False) (Leaf False))
(Leaf False))
(Split 5000 (Leaf True) (Leaf False))
There are a few neat things about this, compared to the [(since, until, value)] representation that I've used in the past for similar applications:
The tree representation makes it impossible to have conflicting a values for the same time range. RangeMap is a true function from t to a.
The tree representation guarantees that some a is assigned to every t. Again, a RangeMap is a true function from t to a.
Since it's a tree and not a list, it supports log-time operations.
I did not go as far as working out a balanced tree representation for this or figuring out how to merge adjacent ranges with the same value, however...
I have this data definition for a tree:
data Tree = Leaf Int | Node Tree Int Tree
and I have to make a function, nSatisfy, to check how many items of the tree check some predicate.
Here's what I've done:
nSatisfy :: (Int->Bool) -> Tree -> Int
nSatisfy _ Leaf = 0
nSatisfy y (Node left x right)
|y x = 1 + nSatisfy y (Node left x right)
| otherwise = nSatisfy y (Node left x right)
Is this the right way to solve this problem?
In your nSatisfy function, you should add the number of nodes satisfying the condition in both subtrees with two recursive calls. The last two lines should be like this:
|x y=1+(nSatisfy y left)+(nSatisfy y right)
|otherwise=(nSatisfy y left)+(nSatisfy y right)
This way, it will call itself again on the same node but only on the subtrees.
Also, if a leaf contains an integer, as is implied in the data declaration, you should make it evaluate the condition for a leaf and return 1 if it is true, instead of always returning 0.
In addition to the main answer, I'd like to offer a slightly different way how to generalize your problem and solving it using existing libraries.
The operation you're seeking is common to many data structures - to go through all elements and perform some operation on them. Haskell defines Foldable type-class, which can be implemented by structures like yours.
First let's import some modules we'll need:
import Data.Foldable
import Data.Monoid
In order to use Foldable, we need to generalize the structure a bit, in particular parametrize its content:
data Tree a = Leaf a | Node (Tree a) a (Tree a)
In many cases this is a good idea as it separates the structure from its content and allows it to be easily reused.
Now let's define its Foldable instance. For tree-like structures it's easier to define it using foldMap, which maps each element into a monoid and then combines all values:
instance Foldable Tree where
foldMap f (Leaf x) = f x
foldMap f (Node lt x rt) = foldMap f lt <> f x <> foldMap f rt
This immediately gives us the whole library of functions in the Data.Foldable module, such as searching for an element, different kinds of folds, etc. While a function counting the number of values satisfying some predicate isn't defined there, we can easily define it for any Foldable. The idea is that we'll use the Sum:
nSatisfy :: (Foldable f) => (a -> Bool) -> f a -> Int
nSatisfy p = getSum . foldMap (\x -> Sum $ if p x then 1 else 0)
The idea behind this function is simple: Map each value to 1 if it satisfies the predicate, otherwise to 0. And then folding with the Sum monoid just adds all values up.
This submission to Programming Praxis gives an O(n) function that "undoes" a preorder traversal of a binary search tree, converting a list back into a tree. Supplying the missing data declaration:
data Tree a = Leaf | Branch {value::a, left::Tree a, right:: Tree a}
deriving (Eq, Show)
fromPreOrder :: Ord a => [a] -> Tree a
fromPreOrder [] = Leaf
fromPreOrder (a:as) = Branch a l (fromPreOrder bs)
where
(l,bs) = lessThan a as
lessThan n [] = (Leaf,[])
lessThan n all#(a:as)
| a >= n = (Leaf,all)
| otherwise = (Branch a l r,cs)
where (l,bs) = lessThan a as
(r,cs) = lessThan n bs
It's obvious that one constructor is added to the tree in each recursive step, which is key to its efficiency.
The only "problem" is that the list is threaded through the computation manually, which is not a terribly Haskellian way to do it and makes it a little harder to see that it is actually consumed element by element in a single-threaded manner.
I attempted to correct this using a state monad (prettified on Codepad):
import Control.Monad.State
data Tree a = Leaf
| Branch {root::a, left::Tree a, right::Tree a}
deriving (Eq,Show)
peek = State peek' where
peek' [] = (Nothing,[])
peek' a#(x:_) = (Just x,a)
pop = State pop' where
pop' [] = error "Tried to read past the end of the list"
pop' (_:xs) = ((),xs)
prebuild'::Ord a => State [a] (Tree a)
prebuild' = do
next <- peek
case next of
Nothing -> return Leaf
Just x -> do
pop
leftpart <- lessThan x
rightpart <- prebuild'
return (Branch x leftpart rightpart)
lessThan n = do
next <- peek
case next of
Nothing -> return Leaf
Just x -> do
if x < n
then do
pop
leftpart <- lessThan x
rightpart <- lessThan n
return (Branch x leftpart rightpart)
else
return Leaf
prebuild::Ord a => [a] -> Tree a
prebuild = evalState prebuild'
Unfortunately, this just looks obscenely messy, and doesn't seem any easier to reason about.
One thought I haven't been able to get anywhere with yet (in part because I don't have a deep enough understanding of the underlying concepts, quite likely): could I use a left fold over the list that builds a continuation that ultimately produces the tree? Would that be possible? Also, would it be anything short of insane?
Another thought was to write this as a tree unfold, but I don't think it's possible to do that efficiently; the list will end up being traversed too many times and the program will be O(n^2).
Edit
Taking things from another direction, I have the nagging suspicion that it might be possible to come up with an algorithm that starts by splitting up the list into increasing segments and decreasing segments, but I haven't yet found something concrete to do with that idea.
I think the problem you're having with State is that your primitives (push, pop, peek) are not the right ones. I think a better one would be something like available_, which checks if the front of the stack matches a particular condition, and executes something different in each case:
available_ p f m = do
s <- get
case s of
x:xs | p x -> put xs >> f x
_ -> m
Actually, in our use case, we can specialize a bit: we will always want to return a Leaf when the head of our stack doesn't satisfy the condition, and we'll always want to recurse when it does.
available p m = available_ p
(\x -> liftM2 (Branch x) (lessThan' x) m)
(return Leaf)
(You could also just write available to begin with and skip available_ entirely. In my first iteration, that is what I did.) Now writing fromPreOrder and lessThan are a snap, and also I think give some insight into their behavior. I'll name them with primes so we can double-check they do the right thing with QuickCheck.
fromPreOrder' = available (const True) fromPreOrder'
lessThan' n = available (<n) (lessThan' n)
And in ghci:
> quickCheck (\xs -> fromPreOrder (xs :: [Int]) == evalState fromPreOrder' xs)
+++ OK, passed 100 tests.
While I can't answer the question about continuation passing, I believe that the State monad based implementation can be written much more clearly. First, we can use notational convenience such as those from Control.Applicative to make it easier to read. Second, we can upgrade the effect stack to include Maybe in order to capture the notion of failure (a) from taking the head of an empty list and (b) from the (a >= n) comparison as an effect.
import Control.Monad.State
import Control.Applicative
The final code uses the backtracking-state monad transformer stack. This means that we wrap State around Maybe instead of Maybe around State. In some sense we can think of this as meaning that failure is the "primary" effect. In practice it means that if the algorithm fails there's no way to continue using potentially bad state and so it must backtrack to the last known good state.
type Preord a b = StateT [a] Maybe b
Since we keep taking the head of a list and want to capture that failure correctly, we'll use a "safe head" function (which is the natural destructor of a list anyway, despite not being in the base Haskell libraries)
-- Safe list destructor
uncons :: [a] -> Maybe (a, [a])
uncons [] = Nothing
uncons (a:as) = Just (a, as)
If we look at it cleverly we'll notice that this is already exactly the form of our monadic computation (StateT [a] Maybe b is isomorphic to [a] -> Maybe (b, [a])). We'll give it a more evocative name when lifted into the Monad.
-- Try to get the head or fail
getHead :: Preord a a
getHead = StateT uncons
A common feature of this algorithm is stopping local failures by providing a default value. I'll capture this in the certain combinator
-- Provides a default value for a failing computation
certain :: b -> Preord a b -> Preord a b
certain def p = p <|> return def
And now we can write the final algorithm very cleanly in our Preord monad.
fromPreOrder :: Ord a => Preord a (Tree a)
fromPreOrder = certain Leaf $ do
a <- getHead
Branch a <$> lessThan a <*> fromPreOrder
lessThan :: Ord a => a -> Preord a (Tree a)
lessThan n = certain Leaf $ do
a <- getHead
guard (a < n)
Branch a <$> lessThan a <*> lessThan n
Note that Applicative style helps to indicate that we're building the components of the Branch constructor using further effectful (state consuming) computations. The guard short-circuits lessThan when the pivot is already the least element in the pre-order traversal. We also explicitly see how both fromPreOrder and lessThan default out to Leaf when they cannot compute a better result.
(Also note that fromPreOrder and lessThan are nearly identical now, a commonality Daniel Wagner exploited in his own answer when writing available.)
We finally would want to hide all the monadic noise since, to an outside user, this is just a pure algorithm.
rebuildTree :: [a] -> Tree a
rebuildTree = fromMaybe Leaf . runStateT fromPreOrder
For a complete picture, here's the implementation of the algorithm using only the State monad. Note all the extra noise for handling failure! We've absorbed the entire popElse function into the effects of the backtracking state monad. We also lift the if up into the failure effect. Without that effect stack, our combinators are terrifically specific to the application instead of decomplected and useful elsewhere.
-- Try to take the head of the state list and return the default
-- if that's not possible.
popElse :: b -> (a -> State [a] b) -> State [a] b
popElse def go = do
x <- get
case x of
[] -> return def
(a:as) -> put as >> go a
push :: a -> State [a] ()
push a = modify (a:)
fromPreOrder :: Ord a => State [a] (Tree a)
fromPreOrder = popElse Leaf $ \a -> Branch a <$> lessThan a <*> fromPreOrder
lessThan :: Ord a => a -> State [a] (Tree a)
lessThan n =
popElse Leaf $ \a ->
if a >= n
then push a >> return Leaf
else Branch a <$> lessThan a <*> lessThan n
As you've said, the state monad doesn't really improve the situation, and I don't think it can be expected to, as it's both much too general in that it allows arbitrary access to the state, and annoying in that it enforces unnecessary sequencing.
At first glance, this looks quite like a foldr : we do one thing for the empty case, and in the (:) case we take the head off and make a recursive call based on the tail. However, as the recursive call isn't just using the tail directly, it isn't quite a foldr.
We could express it as a paramorphism but I don't think that really adds anything to the readability.
What I did notice is that the complicated recursion on the tail is all based on lessThan, which led me to the following idea for breaking down the algorithm:
lessThans [] = []
lessThans (a:as) = (a, l) : lessThans bs
where (l, bs) = lessThan a as
fromPreOrder2 :: Ord a => [a] -> Tree a
fromPreOrder2 = foldr (\(a, l) r -> Branch a l r) Leaf . lessThans
I'm sure lessThans could have a better name but I'm not quite sure what!
The foldr can also be expressed as foldr (uncurry Branch) Leaf but I'm not sure if that's an improvement.
EDIT: also, lessThans is an unfoldr, leading to this version:
fromPreOrder3 :: Ord a => [a] -> Tree a
fromPreOrder3 = foldr (uncurry Branch) Leaf . unfoldr lessThanList
lessThanList [] = Nothing
lessThanList (a:as) = Just ((a, l), bs)
where (l, bs) = lessThan a as