I want to create a series of possible equations based on a general specification:
test = ["12", "34=", "56=", "78"]
Each string (e.g. "12") represents the possible characters at that location, in this case '1' or '2'.
So possible equations from test would be "13=7" or "1=68".
I know the examples I give are not balanced but that's because I'm deliberately giving a simplified short string.
(I also know that I could use 'sequence' to search all possibilities but I want to be more intelligent so I need a different approach explained below.)
What I want is to try fixing each of the equals in turn and then removing all other equals in the equation. So I want:
[["12","=","56","78"],["12","34","=","78"]]
I've written this nested list comprehension:
(it needs: {-# LANGUAGE ParallelListComp #-} )
fixEquals :: [String] -> [[String]]
fixEquals re
  = [
      [
        if index == outerIndex then equals else remain
        | equals <- map (filter (== '=')) re
        | remain <- map (filter (/= '=')) re
        | index <- [1..]
      ]
      | outerIndex <- [1..length re]
    ]
This produces:
[["","34","56","78"],["12","=","56","78"],["12","34","=","78"],["12","34","56",""]]
but I want to filter out any with empty lists within them. i.e. in this case, the first and last.
I can do:
countOfEmpty :: (Eq a) => [[a]] -> Int
countOfEmpty = length . filter (== [])
fixEqualsFiltered :: [String] -> [[String]]
fixEqualsFiltered re = filter (\x -> countOfEmpty x == 0) (fixEquals re)
so that "fixEqualsFiltered test" gives:
[["12","=","56","78"],["12","34","=","78"]]
which is what I want but it doesn’t seem elegant.
I can’t help thinking there’s another way to filter these out.
After all, it’s whenever "equals" is used in the if expression and is empty that we want to drop the list, so it seems a waste to build the list (e.g. ["","34","56","78"]) and then ditch it.
Any thoughts appreciated.
I don't know if this is any cleaner than your code, but it might be a bit more clear and maybe more efficient using a recursion:
fixEquals = init . f
f :: [String] -> [[String]]
f [] = [[]]
f (x:xs) | '=' `elem` x = ("=":removeEq xs) : map (removeEq [x] ++) (f xs)
         | otherwise    = map (x:) (f xs)
removeEq :: [String] -> [String]
removeEq = map (filter (/= '='))
The way it works is that, if there's an '=' in the current string, it splits the result in two: one branch fixes the '=' here and strips it from the rest, the other strips it here and recurses. If there's no '=', it just recurses. The init is needed because the last element returned has no '=' in any string.
Finally, I believe you can probably find a better data structure for what you need to achieve than a list of strings.
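To see why the init is needed, here is a minimal sketch that runs the recursion on the question's test list before init is applied. The functions f and removeEq are reproduced from the answer; the name withoutInit is only an illustrative assumption.

```haskell
-- Same recursion as in the answer, repeated so the sketch is self-contained.
removeEq :: [String] -> [String]
removeEq = map (filter (/= '='))

f :: [String] -> [[String]]
f [] = [[]]
f (x:xs)
  | '=' `elem` x = ("=" : removeEq xs) : map (removeEq [x] ++) (f xs)
  | otherwise    = map (x :) (f xs)

fixEquals :: [String] -> [[String]]
fixEquals = init . f

-- Hypothetical name: the raw result of f, before init drops the last element.
withoutInit :: [[String]]
withoutInit = f ["12", "34=", "56=", "78"]
-- [["12","=","56","78"],["12","34","=","78"],["12","34","56","78"]]
```

The final element is the branch in which no '=' was ever fixed, which is exactly the one init removes.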
Let
xs = [["","34","56","78"],["12","=","56","78"],["12","34","=","78"],["12","34","56",""]]
in
filter (not . any null) xs
will give
[["12","=","56","78"],["12","34","=","78"]]
If you want list comprehension then do
[x | x <- xs, and [not $ null y | y <- x]]
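If the aim is to avoid building the rejected lists in the first place, one option is to move the test into the outer generator, so that a position is only considered when its string actually contains an '='. This is a minimal sketch under that assumption; the name fixEqualsDirect is hypothetical and it uses plain zip rather than ParallelListComp.

```haskell
-- Only positions whose string contains '=' are ever expanded,
-- so no list with an empty string in the "=" slot is constructed.
fixEqualsDirect :: [String] -> [[String]]
fixEqualsDirect re =
  [ [ if i == j then "=" else filter (/= '=') s
    | (j, s) <- zip [0 :: Int ..] re
    ]
  | (i, t) <- zip [0 :: Int ..] re
  , '=' `elem` t
  ]
```

With the question's test list, fixEqualsDirect ["12", "34=", "56=", "78"] gives [["12","=","56","78"],["12","34","=","78"]].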
I think I'd probably do it this way. First, a preliminary that I've written so many times it's practically burned into my fingers by now:
zippers :: [a] -> [([a], a, [a])]
zippers = go [] where
    go _ [] = []
    go b (h:e) = (b,h,e):go (h:b) e
Probably running it once or twice in ghci will be a more clear explanation of what this does than any English writing I could do:
> zippers "abcd"
[("",'a',"bcd"),("a",'b',"cd"),("ba",'c',"d"),("cba",'d',"")]
In other words, it gives a way of selecting each element of a list in turn, giving the "leftovers" of what was before and after the selection point. Given that tool, here's our plan: we'll nondeterministically choose a String to serve as our equals sign, double-check that we've got an equals sign in the first place, and then clear out the equals from the others. So:
import Control.Monad (guard)
fixEquals ss = do
    (prefix, s, suffix) <- zippers ss
    guard ('=' `elem` s)
    return (reverse (deleteEquals prefix) ++ ["="] ++ deleteEquals suffix)
deleteEquals = map (filter ('='/=))
Let's try it:
> fixEquals ["12", "34=", "56=", "78"]
[["12","=","56","78"],["12","34","=","78"]]
Perfect! But this is just a stepping-stone to actually generating the equations, right? It turns out to be not that hard to go all the way in one step, skipping this intermediate. Let's do that:
equations ss = do
    (prefixes, s, suffixes) <- zippers ss
    guard ('=' `elem` s)
    prefix <- mapM (filter ('='/=)) (reverse prefixes)
    suffix <- mapM (filter ('='/=)) suffixes
    return (prefix ++ "=" ++ suffix)
And we can try it in ghci:
> equations ["12", "34=", "56=", "78"]
["1=57","1=58","1=67","1=68","2=57","2=58","2=67","2=68","13=7","13=8","14=7","14=8","23=7","23=8","24=7","24=8"]
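As a sanity check, the result can be compared against the brute-force approach the question mentions (enumerate everything with sequence, then keep the strings with exactly one '='). The sketch below repeats zippers and equations so it is self-contained; bruteForce and agree are hypothetical names introduced only for this comparison.

```haskell
import Control.Monad (guard)
import Data.List (sort)

zippers :: [a] -> [([a], a, [a])]
zippers = go []
  where
    go _ []    = []
    go b (h:e) = (b, h, e) : go (h : b) e

equations :: [String] -> [String]
equations ss = do
  (prefixes, s, suffixes) <- zippers ss
  guard ('=' `elem` s)
  prefix <- mapM (filter ('=' /=)) (reverse prefixes)
  suffix <- mapM (filter ('=' /=)) suffixes
  return (prefix ++ "=" ++ suffix)

-- Brute force: build every combination, keep those with exactly one '='.
bruteForce :: [String] -> [String]
bruteForce = filter ((== 1) . length . filter (== '=')) . sequence

-- The two approaches produce results in a different order, so compare sorted.
agree :: [String] -> Bool
agree ss = sort (equations ss) == sort (bruteForce ss)
```

For the test list, agree ["12", "34=", "56=", "78"] evaluates to True; only the order of the 16 results differs.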
The easiest way to achieve what you want is to create all the combinations and keep only the ones that are meaningful:
Prelude> test = ["12", "34=", "56=", "78"]
Prelude> sequence test
["1357","1358","1367","1368","13=7","13=8","1457","1458","1467","1468","14=7","14=8","1=57","1=58","1=67","1=68","1==7","1==8","2357","2358","2367","2368","23=7","23=8","2457","2458","2467","2468","24=7","24=8", ...]
Prelude> filter ((1==).length.filter('='==)) $ sequence test
["13=7","13=8","14=7","14=8","1=57","1=58","1=67","1=68","23=7","23=8","24=7","24=8","2=57","2=58","2=67","2=68"]
You pointed out the drawback: imagine we have the following list of strings: ["=", "=", "0123456789", "0123456789"]. We will generate 100 combinations and drop them all.
You can look at the combinations as a tree. For ["12", "34"], you have:
    /   \
   1     2
  / \   / \
 3   4 3   4
You can prune the tree: just ignore the subtrees when you have two = on the path.
Let's try to do it. First, a simple combinations function:
Prelude> :set +m
Prelude> let combinations :: [String] -> [String]
Prelude| combinations [] = [""]
Prelude| combinations (cs:ts) = [c:t | c<-cs, t<-combinations ts]
Prelude|
Prelude> combinations test
["1357","1358","1367","1368","13=7","13=8","1457","1458","1467","1468","14=7","14=8","1=57","1=58","1=67","1=68","1==7","1==8","2357","2358","2367","2368","23=7","23=8","2457","2458","2467","2468","24=7","24=8", ...]
Second, we need a variable to store the current number of = signs met:
if we find a second = sign, just drop the subtree
if we reach the end of a combination with no =, drop the combination
That is:
Prelude> let combinations' :: [String] -> Int -> [String]
Prelude| combinations' [] n= if n==1 then [""] else []
Prelude| combinations' (cs:ts) n = [c:t | c<-cs, let p = n+(fromEnum $ c=='='), p <= 1, t<-combinations' ts p]
Prelude|
Prelude> combinations' test 0
["13=7","13=8","14=7","14=8","1=57","1=58","1=67","1=68","23=7","23=8","24=7","24=8","2=57","2=58","2=67","2=68"]
We use p as the new number of = signs on the path: if p > 1, we drop the subtree.
If n is zero when we reach the end of the list, there is no = sign on the path, so we drop the combination.
You may use the variable n to store more information, e.g. the type of the last character (to avoid sequences such as +*).
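A minimal sketch of that last idea, assuming the operator characters are +, -, * and / (the question does not say which ones are allowed). The names isOp and combos are hypothetical; the state carried down the tree is the count of = signs plus a flag recording whether the previous character was an operator.

```haskell
-- Assumption: these four characters are the operators of interest.
isOp :: Char -> Bool
isOp c = c `elem` "+-*/"

combos :: [String] -> [String]
combos = go 0 False
  where
    go :: Int -> Bool -> [String] -> [String]
    -- End of the path: keep it only if exactly one '=' was seen.
    go n _ [] = [ "" | n == 1 ]
    go n prevOp (cs:ts) =
      [ c : t
      | c <- cs
      , let p = n + fromEnum (c == '=')
      , p <= 1                      -- prune: second '=' on this path
      , not (prevOp && isOp c)      -- prune: two operators in a row
      , t <- go p (isOp c) ts
      ]
```

For example, combos ["1", "+*", "+2", "=", "3"] gives ["1+2=3","1*2=3"]: the branches that would put + directly after + or * are cut before their subtrees are explored.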
I looked up a Haskell solution to Project Euler problem 8, but I don't quite understand it.
import Data.List
import Data.Char
euler_8 = do
   str <- readFile "number.txt"
   print . maximum . map product
         . foldr (zipWith (:)) (repeat [])
         . take 13 . tails . map (fromIntegral . digitToInt)
         . concat . lines $ str
Here is the link for the solution and here you can find the task.
Could anyone explain the solution to me step by step?
Reading the data
readFile reads the file "number.txt". If we put a small 16 digit number in a file called number.txt
7316
9698
8586
1254
Running
euler_8 = do
   str <- readFile "number.txt"
   print $ str
Results in
"7316\n9698\n8586\n1254"
This string has extra newline characters in it. To remove them, the author splits the string into lines.
euler_8 = do
   str <- readFile "number.txt"
   print . lines $ str
The result no longer has any '\n' characters, but is a list of strings.
["7316","9698","8586","1254"]
To turn this into a single string, the strings are concatenated together.
euler_8 = do
   str <- readFile "number.txt"
   print . concat . lines $ str
The concatenated string is a list of characters instead of a list of numbers
"7316969885861254"
Each character is converted into an Int by digitToInt, then converted into an Integer by fromIntegral. On 32-bit hardware, using a full-sized Integer is important since the product of 13 digits could be larger than 2^31-1. This conversion is mapped onto each item in the list.
euler_8 = do
   str <- readFile "number.txt"
   print . map (fromIntegral . digitToInt)
         . concat . lines $ str
The resulting list is full of Integers.
[7,3,1,6,9,6,9,8,8,5,8,6,1,2,5,4]
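The overflow concern can be checked directly. The sketch below uses Int32 from Data.Int as a stand-in for a 32-bit Int (an assumption for illustration; on many current platforms GHC's Int is 64 bits wide), and takes thirteen 9s as the largest possible window. All names are hypothetical.

```haskell
import Data.Int (Int32)

-- Largest possible product of 13 digits: 9^13, computed as an Integer.
worstCase :: Integer
worstCase = product (replicate 13 9)

-- Upper bound of a signed 32-bit integer, 2^31 - 1.
int32Max :: Integer
int32Max = fromIntegral (maxBound :: Int32)

-- True: the worst case does not fit in 32 bits.
overflows :: Bool
overflows = worstCase > int32Max

-- The same product in 32-bit arithmetic silently wraps around.
wrapped :: Int32
wrapped = product (replicate 13 9)
```

Here worstCase is 2541865828329, well above int32Max (2147483647), so wrapped does not equal the true product.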
Subsequences
The author's next goal is to find all of the 13-digit runs in this list of integers. tails returns all of the suffixes (final segments) of a list, one starting at each position and running to the end of the list, plus the empty list.
euler_8 = do
   str <- readFile "number.txt"
   print . tails
         . map (fromIntegral . digitToInt)
         . concat . lines $ str
This results in 17 lists for our 16 digit example. (I've added formatting)
[
  [7,3,1,6,9,6,9,8,8,5,8,6,1,2,5,4],
    [3,1,6,9,6,9,8,8,5,8,6,1,2,5,4],
      [1,6,9,6,9,8,8,5,8,6,1,2,5,4],
        [6,9,6,9,8,8,5,8,6,1,2,5,4],
          [9,6,9,8,8,5,8,6,1,2,5,4],
            [6,9,8,8,5,8,6,1,2,5,4],
              [9,8,8,5,8,6,1,2,5,4],
                [8,8,5,8,6,1,2,5,4],
                  [8,5,8,6,1,2,5,4],
                    [5,8,6,1,2,5,4],
                      [8,6,1,2,5,4],
                        [6,1,2,5,4],
                          [1,2,5,4],
                            [2,5,4],
                              [5,4],
                                [4],
                                  []
]
The author is going to pull a trick where we rearrange these lists to read off 13 digit long sub lists. If we look at these lists left-aligned instead of right-aligned we can see the sub sequences running down each column.
[
[7,3,1,6,9,6,9,8,8,5,8,6,1,2,5,4],
[3,1,6,9,6,9,8,8,5,8,6,1,2,5,4],
[1,6,9,6,9,8,8,5,8,6,1,2,5,4],
[6,9,6,9,8,8,5,8,6,1,2,5,4],
[9,6,9,8,8,5,8,6,1,2,5,4],
[6,9,8,8,5,8,6,1,2,5,4],
[9,8,8,5,8,6,1,2,5,4],
[8,8,5,8,6,1,2,5,4],
[8,5,8,6,1,2,5,4],
[5,8,6,1,2,5,4],
[8,6,1,2,5,4],
[6,1,2,5,4],
[1,2,5,4],
[2,5,4],
[5,4],
[4],
[]
]
We only want these columns to be 13 digits long, so we only want to take the first 13 rows.
[
[7,3,1,6,9,6,9,8,8,5,8,6,1,2,5,4],
[3,1,6,9,6,9,8,8,5,8,6,1,2,5,4],
[1,6,9,6,9,8,8,5,8,6,1,2,5,4],
[6,9,6,9,8,8,5,8,6,1,2,5,4],
[9,6,9,8,8,5,8,6,1,2,5,4],
[6,9,8,8,5,8,6,1,2,5,4],
[9,8,8,5,8,6,1,2,5,4],
[8,8,5,8,6,1,2,5,4],
[8,5,8,6,1,2,5,4],
[5,8,6,1,2,5,4],
[8,6,1,2,5,4],
[6,1,2,5,4],
[1,2,5,4]
]
foldr (zipWith (:)) (repeat []) transposes a list of lists (explaining how it works perhaps belongs in another question). It discards the parts of the rows that extend past the length of the shortest row.
euler_8 = do
   str <- readFile "number.txt"
   print . foldr (zipWith (:)) (repeat [])
         . take 13 . tails
         . map (fromIntegral . digitToInt)
         . concat . lines $ str
We are now reading the sub-sequences across the lists as usual
[
[7,3,1,6,9,6,9,8,8,5,8,6,1],
[3,1,6,9,6,9,8,8,5,8,6,1,2],
[1,6,9,6,9,8,8,5,8,6,1,2,5],
[6,9,6,9,8,8,5,8,6,1,2,5,4]
]
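The truncating transpose can be seen on a much smaller input. This is a minimal sketch; the names transposeTrunc and example are hypothetical and simply give the fold expression from the solution a label.

```haskell
-- The fold from the solution, named for illustration.
-- Starting from an infinite list of empty lists, each row (from the
-- last to the first) is consed element-wise onto the columns built so far.
-- zipWith stops at the shorter argument, so columns are cut to the
-- length of the shortest row.
transposeTrunc :: [[a]] -> [[a]]
transposeTrunc = foldr (zipWith (:)) (repeat [])

example :: [[Int]]
example = transposeTrunc [[1,2,3],[4,5],[6,7,8]]
-- [[1,4,6],[2,5,7]]
```

The third column (3 and 8) is dropped because the middle row has only two elements. Note that on an empty list of rows the fold returns repeat [] itself, which is infinite; that case does not arise here since take 13 . tails always yields at least one row.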
The problem
We find the product of each of the sub-sequences by mapping product on to them.
euler_8 = do
   str <- readFile "number.txt"
   print . map product
         . foldr (zipWith (:)) (repeat [])
         . take 13 . tails
         . map (fromIntegral . digitToInt)
         . concat . lines $ str
This reduces the lists to a single number each
[940584960,268738560,447897600,1791590400]
From which we must find the maximum.
euler_8 = do
   str <- readFile "number.txt"
   print . maximum . map product
         . foldr (zipWith (:)) (repeat [])
         . take 13 . tails
         . map (fromIntegral . digitToInt)
         . concat . lines $ str
The answer is
1791590400
If you're not familiar with the functions used, the first thing you should do is examine the types of each function. Since this is function composition, you apply from inside out (i.e. operations occur right to left, bottom to top when reading). We can walk through this line by line.
Starting from the last line, we'll first examine the types.
:t str
str :: String -- This is your input
:t lines
lines :: String -> [String] -- Turn a string into an array of strings splitting on new line
:t concat
concat :: [[a]] -> [a] -- Merge a list of lists into a single list (hint: type String = [Char])
Since type String = [Char] (so [String] is equivalent to [[Char]]), this line converts the multi-line number into a single list of digit characters. More precisely, it first splits the full string into a list of strings, one per line. It then merges all of these lines (now containing only digit characters) into a single list of characters (that is, a single String).
The next line takes this new String as input. Again, let's observe the types:
:t digitToInt
digitToInt :: Char -> Int -- Convert a digit char to an int
:t fromIntegral
fromIntegral :: (Num b, Integral a) => a -> b -- Convert integral to num type
:t map
map :: (a -> b) -> [a] -> [b] -- Perform a function on each element of the array
:t tails
tails :: [a] -> [[a]] -- Returns all final segments of input (see: http://hackage.haskell.org/package/base-4.8.0.0/docs/Data-List.html#v:tails)
:t take
take :: Int -> [a] -> [a] -- Return the first n values of the list
If we apply these operations to our current input string, the first thing that happens is that we map the composed function (fromIntegral . digitToInt) over each character in the string. This turns our string of digits into a list of numbers. EDIT: As pointed out below in the comments, the fromIntegral in this example is there to prevent overflow on 32-bit integer types. Now that we have converted our string into actual numeric types, we run tails on the result. Since (by the problem statement) the digits must be adjacent, we keep only the first 13 of these tails, so that each product ends up being over a group of 13 consecutive elements (all of which are non-negative, by virtue of being places of a larger number). How this works is difficult to understand without considering the next line.
So, let's do a quick experiment. After converting our string into numeric types and taking the tails, we have a big list of lists. It is actually kind of hard to picture what we have here. For the sake of understanding, the contents of the lists are not very important. What is important is their sizes. So let's take a look at an artificial example:
(map length . take 13 . tails) [1..1000]
[1000,999,998,997,996,995,994,993,992,991,990,989,988]
You can see that what we have here is a list of 13 elements. Each element is itself a list, with sizes descending from 1000 (i.e. the full dataset) down to 988. This is our input to the next line, which is arguably the most difficult, yet most important, line to understand. Why understanding this is important should become clear as we walk through the next line.
:t foldr
foldr :: (a -> b -> b) -> b -> [a] -> b -- Combine values into a single value
:t zipWith
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c] -- Generalization of zip
:t (:)
(:) :: a -> [a] -> [a] -- Cons operator. Add element to list
:t repeat
repeat :: a -> [a] -- Infinite list containing specified value
Remember how I mentioned we had a list of 13 elements before (of varying-sized lists)? This is important now. foldr walks over those 13 lists from right to left, starting from (repeat []), an infinite list of empty lists. At each step, zipWith (:) conses each element of the current list onto the front of the corresponding list built so far, and since zipWith stops at the end of its shorter argument, the result is cut to the length of the shortest list. This allows us to construct a list of lists containing our adjacent subsequences of length 13.
Finally, we get to the last line which is pretty easy. That said, we should still be mindful of our types
:t product
product :: Num a => [a] -> a -- Multiply all elements of a list together and return result
:t maximum
maximum :: Ord a => [a] -> a -- Return maximum element in the list
The first thing we do is map the product function over each subsequence. When this has completed we end up with a list of numeric types (hey, we finally don't have a list of lists anymore!). These values are the products of each subsequence. Finally, we apply the maximum function which returns only the largest element in the list.
EDIT: I found out later what the foldr expression was for. (See comments below my answer).
I think that this could be expressed in a different way: you can simply add a guard at the end of the list.
My verbose version of that solution would be:
import Data.List
import Data.Char
euler_8 = do
    let len = 13
    let str1 = "123456789\n123456789"
    -- Join lines
    let str2 = concat (lines str1)
    -- Transform the list of characters into a list of numbers
    let lst1 = map (fromIntegral . digitToInt) str2
    -- EDIT: Add a guard at the end of list
    let lst2 = lst1 ++ [-1]
    -- Get all tails of the list of digits
    let lst3 = tails lst2
    -- Get first 13 digits from each tail
    let lst4 = map (take len) lst3
    -- Get a list of products
    let prod = map product lst4
    -- Find max product
    let m = maximum prod
    print m
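As an alternative to the -1 guard, the short tails can be dropped explicitly by keeping only windows of exactly the requested length. This is a sketch of that variation, not the approach used above; windows and maxWindowProduct are hypothetical names.

```haskell
import Data.Char (digitToInt)
import Data.List (tails)

-- All runs of exactly n adjacent elements; shorter leftovers near the
-- end of the list are discarded instead of being neutralised by a guard value.
windows :: Int -> [a] -> [[a]]
windows n xs = [ w | t <- tails xs, let w = take n t, length w == n ]

-- Largest product of n adjacent digits in a (possibly multi-line) digit string.
-- Note: maximum fails on an empty list, so the input needs at least n digits.
maxWindowProduct :: Int -> String -> Integer
maxWindowProduct n =
  maximum . map product . windows n
          . map (fromIntegral . digitToInt)
          . concat . lines
```

For instance, windows 3 [1,2,3,4,5] gives [[1,2,3],[2,3,4],[3,4,5]], and maxWindowProduct 13 "7316\n9698\n8586\n1254" gives 1791590400, matching the 16-digit example worked through earlier in the thread.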