Slowdown by removing useless code (Project Euler 23) - haskell

I'm trying to optimize my old code from Project Euler #23 and noticed some strange slowdown while removing useless comparisons in a function for list merging.
My code:
import Data.List
import Debug.Trace
limit = 28123
-- sum of all integers from 1 to n
summe :: Int -> Int
summe n = div (n*(n+1)) 2
-- all divisors of x excluding itself
divisors :: Int -> [Int]
divisors x = l1 ++ [x `div` z | z <- l1, z*z /= x, z /= 1]
  where m  = floor $ sqrt $ fromIntegral x
        l1 = [y | y <- [1..m] , mod x y == 0]
-- list of all abundant numbers
liste :: [Int]
liste = [x | x <- [12..limit] , x < sum (divisors x)]
-- nested list with sums of abundant numbers
sumliste :: [[Int]]
sumliste = [[x+y | x <- takeWhile (<=y) liste, x + y <= limit] | y <- liste]
-- reduced list
rsl :: [[Int]] -> [Int]
rsl (hl:[]) = hl
rsl (hl:l) = mergelists hl (rsl l)
-- build a sorted union of two sorted lists
mergelists :: [Int] -> [Int] -> [Int]
mergelists [] [] = []
mergelists [] b = b
mergelists a [] = a
mergelists as@(a:at) bs@(b:bt)
--  | a == b = a : mergelists at bt
--  | a > b  = b : mergelists as bt
--  | a < b  = a : mergelists at bs
  | a == b = if a == hl1
               then trace "1" l1
               else a : l1
  | a > b  = if b == hl2
               then trace "2" l2
               else b : l2
  | a < b  = if a == hl3
               then trace "3" l3
               else a : l3
  where l1  = mergelists at bt
        hl1 = if null l1 then a + 1 else head l1
        l2  = mergelists as bt
        hl2 = head l2
        l3  = mergelists at bs
        hl3 = head l3
-- build the sum of target numbers by subtracting sum of nontarget numbers from all numbers
main = print $ (summe limit) - (sum $ rsl sumliste)
My problem is the function mergelists. The body of this function contains some useless if clauses (as can be seen by the missing trace output) and could be refactored to the three commented lines. The problem is that doing so increases the execution time from 3.4s to 5.8s, which I can't understand.
Why is the shorter code slower?

As Thomas M. DuBuisson suggested, the problem has to do with the lack of strictness. The following code is a slight modification of the code that you have commented out, which uses the $! operator to ensure that the mergelists call is evaluated before forming the list.
mergelists :: [Int] -> [Int] -> [Int]
mergelists [] [] = []
mergelists [] b = b
mergelists a [] = a
mergelists as@(a:at) bs@(b:bt)
  | a == b = (a :) $! mergelists at bt
  | a > b  = (b :) $! mergelists as bt
  | a < b  = (a :) $! mergelists at bs
The function $! ensures if the result of (_ :) $! mergelists _ _ is evaluated, then mergelists _ _ must be evaluated as well. Thanks to the recursion, this implies that if the result of mergelists is evaluated, then the entire list must be evaluated.
In the slow version,
mergelists as@(a:at) bs@(b:bt)
  | a == b = a : mergelists at bt
  | a > b  = b : mergelists as bt
  | a < b  = a : mergelists at bs
you can inspect the first element of the result without evaluating the remainder of the list. The call to mergelists in the tail of the list is stored as an unevaluated thunk. This has various implications:
This is good if you only need a small portion of the merged list, or if the inputs are infinitely long.
On the other hand, if the lists aren't that big to begin with and/or you need all the elements eventually, this adds extra overhead due to the presence of the thunk. It also means that the garbage collector doesn't get to free any of the inputs since the thunks will retain references to them.
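To make the trade-off concrete, here is a minimal sketch contrasting the two variants. The names mergeLazy and mergeStrict are made up for this illustration, and the last guard is written as otherwise. The lazy variant can be consumed piece by piece, even on infinite inputs, while the `$!` variant does the whole merge as soon as its first cell is demanded.

```haskell
-- Lazy merge: each recursive call stays an unevaluated thunk until demanded.
mergeLazy :: [Int] -> [Int] -> [Int]
mergeLazy [] bs = bs
mergeLazy as [] = as
mergeLazy as@(a:at) bs@(b:bt)
  | a == b    = a : mergeLazy at bt
  | a > b     = b : mergeLazy as bt
  | otherwise = a : mergeLazy at bs

-- Strict merge: ($!) evaluates the recursive call to weak head normal form
-- before the cons cell is built, so demanding the first cell runs the merge
-- until one of the inputs is exhausted.
mergeStrict :: [Int] -> [Int] -> [Int]
mergeStrict [] bs = bs
mergeStrict as [] = as
mergeStrict as@(a:at) bs@(b:bt)
  | a == b    = (a :) $! mergeStrict at bt
  | a > b     = (b :) $! mergeStrict as bt
  | otherwise = (a :) $! mergeStrict at bs

-- Only the lazy variant can be used on infinite inputs.
firstFive :: [Int]
firstFive = take 5 (mergeLazy [1,3..] [2,4..])

-- On finite inputs both variants produce the same list.
sameOnFinite :: Bool
sameOnFinite = mergeLazy [1,3,5] [2,3,6] == mergeStrict [1,3,5] [2,3,6]
```

Replacing mergeLazy with mergeStrict in firstFive would never return, because the merge of two infinite lists has no end to force.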
I don't understand exactly why it's slower for your particular problem though — perhaps someone more experienced can shed some light on this.
I've noticed that, at -O0, the "slow version" is actually the fastest of the three approaches, so I suspect that GHC was able to take advantage of the strictness and produce more optimized code.


Function containing head and tail functions throws empty list error

I'm trying to solve the first question in Advent of Code 2017, and came up with the following solution to calculate the needed value:
checkRepetition :: [Int] -> Bool
checkRepetition [] = False
checkRepetition (x:xs)
    | x == ( head xs ) = True
    | otherwise = False
test :: [Int] -> Int
test [] = 0
test [x] = 0
test xs
    | checkRepetition xs == True = ((head xs)*a) + (test (drop a xs))
    | otherwise = test (tail xs)
    where
        a = (go (tail xs)) + 1
        go :: [Int] -> Int
        go [] = 0
        go xs
            | checkRepetition xs == True = 1 + ( go (tail xs) )
            | otherwise = 0
However, when I give an input that contains repetitive numbers such as [1,3,3], it gives the error
*** Exception: Prelude.head: empty list
I have spent 1.5 hours on this and still can't figure out exactly where the error comes from. Every function used in test has a case for [], yet it still throws this error. What is the problem?
Note that, I have checked out this question, and in the given answer, it is advised not to use head and tail functions, but I have tested those function for various inputs, and they do not throw any error, so what exactly is the problem ?
I would appreciate any help or hint.
As was pointed out in the comments, the issue is here:
checkRepetition (x:xs)
    | x == ( head xs ) = True
xs is not guaranteed to be a non-empty list (a one-element list is x:[], so the pattern (x:xs) matches it with xs = []), and calling head on an empty list is a runtime error.
You can deal with this by changing your pattern to only match on a 2+ element list.
checkRepetition [] = False
checkRepetition [_] = False
checkRepetition (x1:x2:_) = x1 == x2
-- No need for the alternations on this function, by the way.
That said, your algorithm seems needlessly complex. All you have to do is check if the next value is equal, and if so then add the current value to the total. Assuming you can get your String -> [Int] on your own, consider something like:
filteredSum :: [Int] -> Int
filteredSum []  = 0  -- by definition, zero- and one-element lists
filteredSum [_] = 0  -- cannot produce a sum, so special case them here
filteredSum xss@(first:_) = go xss
  where
    -- handle all recursive cases
    go (x1:xs@(x2:_)) | x1 == x2   = x1 + go xs
                      | otherwise  = go xs
    -- base case
    go [x]            | x == first = x  -- handles last character wrapping
                      | otherwise  = 0  -- and if it doesn't wrap
    -- this should be unreachable
    go []             = 0
For what it's worth, I think it's better to work in the Maybe monad and operate over Maybe [Int] -> Maybe Int, but luckily that's easy since Maybe is a functor.
digitToMaybeInt :: Char -> Maybe Int
digitToMaybeInt '0' = Just 0
digitToMaybeInt '1' = Just 1
digitToMaybeInt '2' = Just 2
digitToMaybeInt '3' = Just 3
digitToMaybeInt '4' = Just 4
digitToMaybeInt '5' = Just 5
digitToMaybeInt '6' = Just 6
digitToMaybeInt '7' = Just 7
digitToMaybeInt '8' = Just 8
digitToMaybeInt '9' = Just 9
digitToMaybeInt _ = Nothing
maybeResult :: Maybe Int
maybeResult = fmap filteredSum . traverse digitToMaybeInt $ input
result :: Int
result = case maybeResult of
           Just x  -> x
           Nothing -> 0
-- this is equivalent to `maybe 0 id maybeResult`
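As a rough sketch of how traverse behaves in Maybe, the following uses isDigit and digitToInt from Data.Char instead of spelling out ten clauses. The names charToDigit, parseDigits and sumOrZero are invented for this example. The point is that one bad character turns the whole result into Nothing, and fromMaybe supplies the fallback at the very end.

```haskell
import Data.Char (digitToInt, isDigit)
import Data.Maybe (fromMaybe)

-- Convert one character, failing with Nothing on anything that is not 0-9.
charToDigit :: Char -> Maybe Int
charToDigit c
  | isDigit c = Just (digitToInt c)
  | otherwise = Nothing

-- traverse succeeds only if every single character converts.
parseDigits :: String -> Maybe [Int]
parseDigits = traverse charToDigit

-- fmap applies a pure function inside the Maybe;
-- fromMaybe supplies the fallback for the Nothing case.
sumOrZero :: String -> Int
sumOrZero = fromMaybe 0 . fmap sum . parseDigits
```

Here parseDigits "1122" gives Just [1,1,2,2], while parseDigits "12a" gives Nothing.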
Thank you for the link. I went there first to glean the purpose.
I assume the input will be a string. The helper function below constructs a numeric list to be used to sum if predicate is True, that is, the zipped values are equal, that is, each number compared to each successive number (the pair).
The helper function 'nl' invokes the primary function 'invcap' Inverse Captcha with a list of numbers.
The nl function is a list comprehension. The invcap function is a list comprehension. Perhaps the logic in this question is at fault. Overly complicated logic is more likely to introduce errors. Proofs are very much easier when logic is not cumbersome.
The primary function "invcap"
invcap l = sum [ x | (x,y) <- zip l $ (tail l) ++ [head l], x == y]
The helper function that converts a string to a list of digits and invokes invcap with a list of numeric digits.
nl cs = invcap [ read [t] :: Int | t <- cs]
Invocation examples
Prelude> nl "91212129"
9
Prelude> nl "1122"
3

Haskell list monad looping

I have a list comprehension that looks like this:
cross ps = [ p* pp * ppp | p <- ps, pp <- ps, ppp <- ps, p >= pp , pp >= ppp ]
How do I achieve this using monads without literally typing out the list names?
dim ps n = do
    p <- ps
    pp <- ps
    ppp <- ps
    p...p <- ps
    guard (p >= pp && pp >= ppp ... && p...p >=p....p)
    return (p*pp*ppp*p...p)
How can I do this without explicitly assigning values in order to use the list monad?
Here's how I'd do it
ascending :: Ord a => [a] -> Bool
ascending list = and $ zipWith (>=) (tail list) list
dim ps n = map product $ filter ascending allComb
    where allComb = replicateM n ps
The replicateM comes from Control.Monad and for the list monad it generates all combinations of n elements of the given list.
Then I filter out just the combinations that are in an ascending order and finally calculate the products of the lists that remained.
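A small sketch of what replicateM does in the list monad, and of the filter-then-product pipeline on a tiny input. The helper names selections, nonDecreasing and products are made up for this illustration.

```haskell
import Control.Monad (replicateM)

-- All length-n selections (with repetition, order matters) from the list.
selections :: Int -> [a] -> [[a]]
selections = replicateM

-- True when every element is <= its successor.
nonDecreasing :: Ord a => [a] -> Bool
nonDecreasing xs = and (zipWith (<=) xs (drop 1 xs))

-- Products of the non-decreasing selections only.
products :: Int -> [Int] -> [Int]
products n ps = map product (filter nonDecreasing (selections n ps))
```

For example, selections 2 "ab" is ["aa","ab","ba","bb"], and products 2 [2,3,5] is [4,6,10,9,15,25]: nine selections are generated and three of them are thrown away.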
Perhaps the easiest to understand solution is to “literally use a loop”:
dim ps n = do
    pz <- forM [1..n] $ \_i -> do
        p <- ps
        return p
    guard $ descending pz
    return $ product pz
But do {p <- ps; return p} is equivalent to simply ps, and for forM [1..n] $ \_i -> ps we have the shorthand replicateM n ps. So you get to chi's suggested solution. I'd say Luka Horvat's is actually a little better, though.
But then again, as chi remarked, you can make this a lot more efficient by not selecting all possible combinations and throwing the vast majority away, but rather only selecting the descending possibilities in the first place. For this I'd manually write a recursive function:
descendingChoices :: Ord a => Int -> [a] -> [[a]]
descendingChoices 1 ps = [[p] | p<-ps]   -- aka `pure<$>ps`
descendingChoices n ps = [ p : qs | qs <- descendingChoices (n-1) ps
                                  , p <- ps
                                  , all (<=p) qs
                         ]
A close translation could be:
dim :: (Num a, Ord a) => [a] -> Int -> [a]
dim ps n = do
    chosen <- replicateM n ps
    guard $ increasing chosen
    return $ product chosen
increasing :: Ord a => [a] -> Bool
increasing []        = True
increasing xs@(_:ys) = and $ zipWith (<=) xs ys
However, this could be improved by putting the guards earlier. I mean:
[ ... | p1<-xs, p2<-xs, p3<-xs, p1 <= p2, p2 <= p3 ]
is worse than
[ ... | p1<-xs, p2<-xs, p1 <= p2, p3<-xs, p2 <= p3 ]
since the latter avoids scanning the whole list for p3<-xs when p1 <= p2 does not hold, in which case nothing would be generated anyway.
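The two orderings can be checked against each other with a small sketch (the function names lateGuards and earlyGuards are invented here, and tuples are returned instead of products so the results are easy to compare). Both produce the same triples in the same order; only the amount of enumeration differs.

```haskell
-- Guards placed after all generators: every (p1, p2, p3) triple is enumerated
-- before any test runs.
lateGuards :: [Int] -> [(Int, Int, Int)]
lateGuards xs =
  [ (p1, p2, p3) | p1 <- xs, p2 <- xs, p3 <- xs, p1 <= p2, p2 <= p3 ]

-- Guard placed right after p2: when the test p1 <= p2 fails,
-- the generator p3 <- xs is never run for that pair.
earlyGuards :: [Int] -> [(Int, Int, Int)]
earlyGuards xs =
  [ (p1, p2, p3) | p1 <- xs, p2 <- xs, p1 <= p2, p3 <- xs, p2 <= p3 ]
```

For the input [1,2], both return [(1,1,1),(1,1,2),(1,2,2),(2,2,2)].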
So, let's try again, with a more primitive approach:
dim :: (Num a, Ord a) => [a] -> Int -> [a]
dim ps 0 = [1]
dim ps n = do
    x <- ps
    xs <- dim (filter (>=x) ps) (n-1)
    return (x * xs)
Now we discard impossible alternatives early, removing them from ps before the recursive call.
Given that your list of primes is in ascending order, you can avoid the guards entirely, by only generating each set of products once to begin with:
cross :: Int -> [a] -> [[a]]
cross 0 _ = [[]]
cross n [] = []
cross n all@(x:xs) = ((x:) <$> cross (n - 1) all) ++ cross n xs
dim :: Num a => Int -> [a] -> [a]
dim n xs = map product $ cross n xs
If the list of primes is not in ascending order, then the best option is to sort it and use an algorithm that assumes the list is sorted. amalloy gave one; however, you can make the function that generates k-combinations with repetitions (aka cross) considerably more efficient by using sharing (an example).
Another such algorithm is
dim :: (Num a, Ord a) => Int -> [a] -> [a]
dim 0 xs = [1]
dim n xs = [y * x | x <- xs, y <- dim (n - 1) (takeWhile (<= x) xs)]
Note the takeWhile instead of filter. This way you don't need to process the whole list of primes over and over again, instead you always process only those primes that you actually need.
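To see the difference side by side, here is a sketch with both variants specialised to Int. The names dimSorted and dimFilter are made up for this comparison, and dimSorted relies on the assumption that its input is sorted in ascending order.

```haskell
-- Assumes the input list is sorted in ascending order:
-- takeWhile stops at the first element greater than x.
dimSorted :: Int -> [Int] -> [Int]
dimSorted 0 _  = [1]
dimSorted n xs = [ y * x | x <- xs, y <- dimSorted (n - 1) (takeWhile (<= x) xs) ]

-- Same idea with filter: correct for any order, but it always walks
-- the whole list, even after the last useful element.
dimFilter :: Int -> [Int] -> [Int]
dimFilter 0 _  = [1]
dimFilter n xs = [ y * x | x <- xs, y <- dimFilter (n - 1) (filter (<= x) xs) ]
```

On a sorted input such as [2,3,5] both give [4,6,9,10,15,25] for n = 2. On an unsorted input only dimFilter is reliable, since takeWhile would cut the list short.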

Euler #4 with bigger domain

Consider the modified Euler problem #4 -- "Find the maximum palindromic number which is a product of two numbers between 100 and 9999."
rev :: Int -> Int
rev x = rev' x 0
rev' :: Int -> Int -> Int
rev' n r
    | n == 0 = r
    | otherwise = rev' (n `div` 10) (r * 10 + n `mod` 10)
pali :: Int -> Bool
pali x = x == rev x
main :: IO ()
main = print . maximum $ [ x*y | x <- nums, y <- nums, pali (x*y)]
    where
        nums = [9999,9998..100]
This Haskell solution, compiled with -O2 on ghc 7.4.1, takes about 18 seconds. The similar C solution takes 0.1 seconds. So Haskell is 180 times slower. What's wrong with my solution? I assumed Haskell would handle this type of problem pretty well.
Appendix - analogue C solution:
#include <stdio.h>
#define A   100
#define B   9999
int ispali(int n)
{
    int n0=n, k=0;
    while (n>0) {
        k = 10*k + n%10;
        n /= 10;
    }
    return n0 == k;
}
int main(void)
{
    int max = 0;
    for (int i=B; i>=A; i--)
        for (int j=B; j>=A; j--) {
            if (i*j > max && ispali(i*j))
                max = i*j;
        }
    printf("%d\n", max);
}
"The similar C solution"
That is a common misconception.
Lists are not loops!
And using lists to emulate loops has performance implications unless the compiler is able to eliminate the list from the code.
If you want to compare apples to apples, write the Haskell structure more or less equivalent to a loop, a tail recursive worker (with strict accumulator, though often the compiler is smart enough to figure out the strictness by itself).
Now let's take a more detailed look. For comparison, the C, compiled with gcc -O3, takes ~0.08 seconds here, the original Haskell, compiled with ghc -O2 takes ~20.3 seconds, with ghc -O2 -fllvm ~19.9 seconds. Pretty terrible.
One mistake in the original code is to use div and mod. The C code uses the equivalent of quot and rem, which map to the machine division instructions and are faster than div and mod. For positive arguments, the semantics are the same, so whenever you know that the arguments are always non-negative, never use div and mod.
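A quick sketch of the semantic difference, for anyone who wants to check it in GHCi (the binding names are made up for this example): the two pairs agree on non-negative operands and differ only in how they round when a sign is involved.

```haskell
-- For non-negative dividends and positive divisors the two pairs agree.
agreeOnNonNegative :: Bool
agreeOnNonNegative =
  and [ n `div` d == n `quot` d && n `mod` d == n `rem` d
      | n <- [0 .. 1000 :: Int], d <- [1 .. 20] ]

-- With a negative dividend they differ:
-- div rounds toward negative infinity, quot truncates toward zero.
negDivMod, negQuotRem :: (Int, Int)
negDivMod  = (-7) `divMod` 2    -- (-4, 1)
negQuotRem = (-7) `quotRem` 2   -- (-3, -1)
```

The extra sign handling is what makes div and mod more expensive than the plain machine division behind quot and rem.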
Changing that, the running time becomes ~15.4 seconds when compiling with the native code generator, and ~2.9 seconds when compiling with the LLVM backend.
The difference is due to the fact that even the machine division operations are quite slow, and LLVM replaces the division/remainder with a multiply-and-shift operation. Doing the same by hand for the native backend (actually, a slightly better replacement taking advantage of the fact that I know the arguments will always be non-negative) brings its time down to ~2.2 seconds.
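The answer does not show the hand-written replacement it used, so the following is only a sketch of the general idea. It assumes the well-known reciprocal constant for dividing an unsigned 32-bit value by 10, a 64-bit Int, and arguments in the range 0 <= n < 2^32 (the products here stay below 10^8). The names quot10 and rem10 are made up.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64)

-- Quotient by 10 without a division instruction.
-- 0xCCCCCCCD is ceil(2^35 / 10); multiplying by it and shifting right by 35
-- gives the exact quotient for every n with 0 <= n < 2^32.
quot10 :: Int -> Int
quot10 n = fromIntegral ((fromIntegral n * 0xCCCCCCCD :: Word64) `shiftR` 35)

-- Remainder recovered from the quotient with one multiply and one subtract.
rem10 :: Int -> Int
rem10 n = n - 10 * quot10 n
```

The intermediate product is computed in Word64 so that it cannot overflow for any 32-bit input.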
We're getting closer, but are still a far cry from the C.
That is due to the lists. The code still builds a list of palindromes (and traverses a list of Ints for the two factors).
Since lists cannot contain unboxed elements, that means there is a lot of boxing and unboxing going on in the code, that takes time.
So let us eliminate the lists, and take a look at the result of translating the C to Haskell:
module Main (main) where
a :: Int
a = 100
b :: Int
b = 9999
ispali :: Int -> Bool
ispali n = go n 0
  where
    go 0 acc = acc == n
    go m acc = go (m `quot` 10) (acc * 10 + (m `rem` 10))
maxpal :: Int
maxpal = go 0 b
  where
    go mx i
        | i < a     = mx
        | otherwise = go (inner mx b) (i-1)
          where
            inner m j
                | j < a     = m
                | p > m && ispali p = inner p (j-1)
                | otherwise = inner m (j-1)
                  where
                    p = i*j
main :: IO ()
main = print maxpal
The nested loop is translated to two nested worker functions, we use an accumulator to store the largest palindrome found so far. Compiled with ghc -O2, that runs in ~0.18 seconds, with ghc -O2 -fllvm it runs in ~0.14 seconds (yes, LLVM is better at optimising loops than the native code generator).
Still not quite there, but a factor of about 2 isn't too bad.
Some may find the following version, where the loop is abstracted out, more readable. The generated core is for all intents and purposes identical (modulo a switch of argument order), and the performance is of course the same:
module Main (main) where
a :: Int
a = 100
b :: Int
b = 9999
ispali :: Int -> Bool
ispali n = go n 0
  where
    go 0 acc = acc == n
    go m acc = go (m `quot` 10) (acc * 10 + (m `rem` 10))
downto :: Int -> Int -> a -> (a -> Int -> a) -> a
downto high low acc fun = go high acc
  where
    go i acc
        | i < low   = acc
        | otherwise = go (i-1) (fun acc i)
maxpal :: Int
maxpal = downto b a 0 $ \m i ->
            downto b a m $ \mx j ->
                let p = i*j
                in if mx < p && ispali p then p else mx
main :: IO ()
main = print maxpal
@axblount is at least partly right; the following modification makes the program run almost three times as fast as the original:
maxPalindrome = foldl f 0
  where f a x | x > a && pali x = x
              | otherwise = a
main :: IO ()
main = print . maxPalindrome $ [x * y | x <- nums, y <- nums]
  where nums = [9999,9998..100]
That still leaves a factor 60 slowdown, though.
This is more true to what the C code is doing:
maxpali :: [Int] -> Int
maxpali xs = go xs 0
  where
    go [] m     = m
    go (x:xs) m = if x > m && pali(x) then go xs x else go xs m
main :: IO()
main = print . maxpali $ [ x*y | x <- nums, y <- nums ]
  where nums = [9999,9998..100]
On my box this takes 2 seconds vs .5 for the C version.
Haskell may be storing that entire list [ x*y | x <- nums, y <- nums, pali (x*y)], whereas the C solution calculates the maximum on the fly. I'm not sure about this.
Also, the C solution only calls ispali if the product beats the previous maximum. I would bet the Haskell version tests every product for being a palindrome, regardless of whether x*y is a possible max.
It seems to me that you are having a branch prediction problem. In the C code, you have two nested loops and as soon as a palindrome is seen in the inner loop, the rest of the inner loop will be skipped very fast.
With this list of products fed in instead of the nested loops, I am not sure that ghc does any of this prediction.
Another way to write this is to use two folds, instead of one fold over the flattened list:
-- foldl g0 0 [x*y | x<-[b-1,b-2..a], y<-[b-1,b-2..a], pali(x*y)] (A)
-- foldl g1 0 [x*y | x<-[b-1,b-2..a], y<-[b-1,b-2..a]] (B)
-- foldl g2 0 [ [x*y | y<-[b-1,b-2..a]] | x<-[b-1,b-2..a]] (C)
maxpal b a = foldl f1 0 [b-1,b-2..a]   -- (D)
  where
    f1 m x = foldl f2 m [b-1,b-2..a]
      where
        f2 m y | p>m && pali p = p
               | otherwise     = m
          where p = x*y
main = print $ maxpal 10000 100
Seems to run much faster than (B) (as in larsmans's answer), too (only 3x - 4x slower than the following loops-based code). Fusing the foldl and enumFromThenTo definitions gets us the "functional loops" code (as in DanielFischer's answer):
maxpal_loops b a = f (b-1) 0           -- (E)
  where
    f x m | x < a     = m
          | otherwise = g (b-1) m
      where
        g y m | y < a         = f (x-1) m
              | p>m && pali p = g (y-1) p
              | otherwise     = g (y-1) m
          where p = x*y
The (C) variant is very suggestive of further algorithmic improvements (that's outside the scope of the original Q of course) that exploit the hidden order in the lists, destroyed by the flattening:
{- foldl g2 0 [ [x*y | y<-[b-1,b-2..a]] | x<-[b-1,b-2..a]] (C)
foldl g2 0 [ [x*y | y<-[x, x-1..a]] | x<-[b-1,b-2..a]] (C1)
foldl g0 0 [ safehead 0 . filter pali $
[x*y | y<-[x, x-1..a]] | x<-[b-1,b-2..a]] (C2)
fst $ until ... (\(m,s)-> (max m .
safehead 0 . filter pali . takeWhile (> m) $
head s, tail s))
(0,[ [x*y | y<-[x, x-1..a]] | x<-[b-1,b-2..a]]) (C3)
safehead 0 $ filter pali $ mergeAllDescending
[ [x*y | y<-[x, x-1..a]] | x<-[b-1,b-2..a]] (C4)
-}
(C3) can stop as soon as the head x*y in a sub-list is smaller than the currently found maximum. It is what short-cutting functional loops code could achieve, but not (C4), which is guaranteed to find the maximal palindromic number first. Plus, for list-based code its algorithmic nature is more visually apparent, IMO.

Comparing 3 output lists in haskell

I am doing another Project Euler problem and I need to find when the results of these 3 lists are equal (we are given 40755 as the first time they are equal; I need to find the next):
hexag n = [ n*(2*n-1) | n <- [40755..]]
penta n = [ n*(3*n-1)/2 | n <- [40755..]]
trian n = [ n*(n+1)/2 | n <- [40755..]]
I tried adding in the other lists as predicates of the first list, but that didn't work:
hexag n = [ n*(2*n-1) | n <- [40755..], penta n == n, trian n == n]
I am stuck as to where to go from here.
I tried graphing the function and even calculus but to no avail, so I must resort to a Haskell solution.
Your functions are weird. They get n and then ignore it?
You also have a confusion between function's inputs and outputs. The 40755th hexagonal number is 3321899295, not 40755.
If you really want a spoiler to the problem (but doesn't that miss the point?):
binarySearch :: Integral a => (a -> Bool) -> a -> a -> a
binarySearch func low high
  | low == high = low
  | func mid = search low mid
  | otherwise = search (mid + 1) high
  where
    search = binarySearch func
    mid = (low+high) `div` 2
infiniteBinarySearch :: Integral a => (a -> Bool) -> a
infiniteBinarySearch func =
  binarySearch func ((lim+1) `div` 2) lim
  where
    lim = head . filter func . lims $ 0
    lims x = x:lims (2*x+1)
inIncreasingSerie :: (Ord a, Integral i) => (i -> a) -> a -> Bool
inIncreasingSerie func val =
  val == func (infiniteBinarySearch ((>= val) . func))
figureNum :: Integer -> Integer -> Integer
figureNum shape index = (index*((shape-2)*index+4-shape)) `div` 2
main :: IO ()
main =
  print . head . filter r $ map (figureNum 6) [144..]
  where
    r x = inIncreasingSerie (figureNum 5) x && inIncreasingSerie (figureNum 3) x
Here's a simple, direct answer to exactly the question you gave:
*Main> take 1 $ filter (\(x,y,z) -> (x == y) && (y == z)) $ zip3 [1,2,3] [4,2,6] [8,2,9]
[(2,2,2)]
Of course, yairchu's answer might be more useful in actually solving the Euler question :)
There's at least a couple ways you can do this.
You could look at the first item, and compare the rest of the items to it:
Prelude> (\x -> all (== (head x)) $ tail x) [ [1,2,3], [1,2,3], [4,5,6] ]
False
Prelude> (\x -> all (== (head x)) $ tail x) [ [1,2,3], [1,2,3], [1,2,3] ]
True
Or you could make an explicitly recursive function similar to the previous:
-- test.hs
f [] = True
f (x:xs) = f' x xs where
    f' orig (y:ys) = if orig == y then f' orig ys else False
    f' _ [] = True
Prelude> :l test.hs
[1 of 1] Compiling Main ( test.hs, interpreted )
Ok, modules loaded: Main.
*Main> f [ [1,2,3], [1,2,3], [1,2,3] ]
True
*Main> f [ [1,2,3], [1,2,3], [4,5,6] ]
False
You could also do a takeWhile and compare the length of the returned list, but that would be neither efficient nor typically Haskell.
Oops, just saw that didn't answer your question at all. Marking this as CW in case anyone stumbles upon your question via Google.
The easiest way is to respecify your problem slightly
Rather than deal with three lists (note the removal of the superfluous n argument):
hexag = [ n*(2*n-1) | n <- [40755..]]
penta = [ n*(3*n-1)/2 | n <- [40755..]]
trian = [ n*(n+1)/2 | n <- [40755..]]
You could, for instance generate one list:
matches :: [Int]
matches = matches' 40755
matches' :: Int -> [Int]
matches' n
    | hex == pen && pen == tri = n : matches' (n + 1)
    | otherwise                = matches' (n + 1) where
        hex = n*(2*n-1)
        pen = n*(3*n-1) `div` 2
        tri = n*(n+1) `div` 2
Now, you could then try to optimize this for performance by noticing recurrences. For instance when computing the next match at (n + 1):
(n+1)*(n+2)/2 - n*(n+1)/2 = n + 1
so you could just add (n + 1) to the previous tri to obtain the new tri value.
Similar algebraic simplifications can be applied to the other two functions, and you can carry all of them in accumulating parameters to the function matches'.
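As a sketch of that incremental idea, each sequence can be built by repeatedly adding the difference between consecutive terms, here with scanl rather than explicit accumulating parameters. The list names are made up for this example; the differences follow from the formulas above: T(n+1) - T(n) = n + 1, P(n+1) - P(n) = 3n + 1, H(n+1) - H(n) = 4n + 1.

```haskell
-- Each list starts at the first figurate number and adds the
-- consecutive differences, so no multiplication or division is needed.
triangulars, pentagonals, hexagonals :: [Integer]
triangulars = scanl (+) 1 [2 ..]       -- differences 2, 3, 4, ...
pentagonals = scanl (+) 1 [4, 7 ..]    -- differences 4, 7, 10, ...
hexagonals  = scanl (+) 1 [5, 9 ..]    -- differences 5, 9, 13, ...
```

These are three independent increasing lists, so 40755 shows up at a different position in each of them (the 285th triangular, 165th pentagonal and 143rd hexagonal number).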
That said, there are more efficient ways to tackle this problem.

Detecting cyclic behaviour in Haskell

I am doing yet another projecteuler question in Haskell, where I must find if the sum of the factorials of each digit in a number is equal to the original number. If not repeat the process until the original number is reached. The next part is to find the number of starting numbers below 1 million that have 60 non-repeating units. I got this far:
prob74 = length [ x | x <- [1..999999], 60 == ((length $ chain74 x)-1)]
factorial n = product [1..n]
factC x = sum $ map factorial (decToList x)
chain74 x | x == 0 = []
          | x == 1 = [1]
          | x /= factC x = x : chain74 (factC x)
But what I don't know how to do is to get it to stop once the value for x has become cyclic. How would I go about stopping chain74 when it gets back to the original number?
When you walk through a list that might contain a cycle, your function needs to keep track of the elements already seen so that it can check for repetitions. Every new element is compared against the ones seen so far. If it has been seen before, the cycle is complete; if not, the next element is inspected.
So this calculates the length of the non-cyclic part of a list:
uniqlength :: (Eq a) => [a] -> Int
uniqlength l = uniqlength_ l []
   where uniqlength_ [] ls = length ls
         uniqlength_ (x:xs) ls
            | x `elem` ls = length ls
            | otherwise = uniqlength_ xs (x:ls)
(Performance might be better when using a set instead of a list, but I haven't tried that.)
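For what it's worth, a set-based variant could look like the sketch below. It uses Data.Set from the containers package, needs an Ord constraint instead of Eq, and the name uniqLengthSet is made up; I have not benchmarked it against the list version.

```haskell
import qualified Data.Set as Set

-- Length of the prefix before the first repeated element.
-- Seen elements are kept in a Set, so each membership test is
-- O(log n) instead of a linear scan.
uniqLengthSet :: Ord a => [a] -> Int
uniqLengthSet = go Set.empty
  where
    go seen [] = Set.size seen
    go seen (x:xs)
      | x `Set.member` seen = Set.size seen
      | otherwise           = go (Set.insert x seen) xs
```

Because it stops at the first repeat, it also terminates on an infinite cyclic list such as cycle [1,2,3].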
What about passing another argument (y for example) to the chain74 in the list comprehension.
Morning fail so EDIT:
[.. ((length $ chain74 x x False)-1)]
chain74 x y not_first | x == y && not_first = replace_with_stop_value_:-)
                      | x == 0 = []
                      | x == 1 = [1]
                      | x == 2 = [2]
                      | x /= factC x = x : chain74 (factC x) y True
I implemented a cycle-detection algorithm in Haskell on my blog. It should work for you, but there might be a more clever approach for this particular problem:
http://coder.bsimmons.name/blog/2009/04/cycle-detection/
Just change the return type from String to Bool.
EDIT: Here is a modified version of the algorithm I posted about:
cycling :: (Show a, Eq a) => Int -> [a] -> Bool
cycling k []     = False --not cycling
cycling k (a:as) = find 0 a 1 2 as
    where find _ _ c _ [] = False
          find i x c p (x':xs)
            | c >  k    = False  -- no cycles after k elements
            | x == x'   = True   -- found a cycle
            | c == p    = find c x' (c+1) (p*2) xs
            | otherwise = find i x (c+1) p xs
You can remove the 'k' if you know your list will either cycle or terminate soon.
EDIT2: You could change the following function to look something like:
prob74 = length [ x | x <- [1..999999], let chain = chain74 x, not$ cycling 999 chain, 60 == ((length chain)-1)]
Quite a fun problem. I've come up with a corecursive function that returns the list of the "factorial chains" for every number, stopping as soon as they would repeat themselves:
chains = [] : let f x = x : takeWhile (x /=) (chains !! factC x) in (map f [1..])
Giving:
take 4 chains == [[],[1],[2],[3,6,720,5043,151,122,5,120,4,24,26,722,5044,169,363601,1454]]
map head $ filter ((== 60) . length) (take 10000 chains)
is
[1479,1497,1749,1794,1947,1974,4079,4097,4179,4197,4709,4719,4790,4791,4907,4917
,4970,4971,7049,7094,7149,7194,7409,7419,7490,7491,7904,7914,7940,7941,9047,9074
,9147,9174,9407,9417,9470,9471,9704,9714,9740,9741]
It works by calculating the "factC" of its position in the list, then references that position in itself. This would generate an infinite list of infinite lists (using lazy evaluation), but using takeWhile the inner lists only continue until the element occurs again or the list ends (meaning a deeper element in the corecursion has repeated itself).
If you just want to remove cycles from a list you can use:
decycle :: Eq a => [a] -> [a]
decycle = dc []
    where
        dc _ [] = []
        dc xh (x : xs) = if elem x xh then [] else x : dc (x : xh) xs
decycle [1, 2, 3, 4, 5, 3, 2] == [1, 2, 3, 4, 5]
