Clean syntax for conditionally folding a list in Haskell - haskell

I'm relatively new to Haskell, and in my searching I couldn't find an easy way to conditionally fold a list, i.e. when an element satisfies a condition (as in filter), fold that element by a function (as in foldr and foldl).
My workaround was to write the following helper function, then apply map to change the resulting list of pairs as my situation required.
-- This function returns tuples containing the elements which
-- satisfy `cond` folded right, adding 1 to the second value
-- in each pair. (`snd` pair starts at 0)
-- Condition takes a single value (similar to `filter`)
-- NOTE: list cannot end with token
foldrOn cond list =
    if (length list) > 0 then
        if cond (head list) then
            do
                let tmp = foldrOn cond (tail list)
                (fst (head tmp), snd (head tmp) + 1) : (tail tmp)
                -- fold token into char after it
        else
            (head list, 0) : (foldrOn cond (tail list))
            -- don't fold token
    else
        [] -- base case len list = 0
foldlOn cond list = ...
For example, the use-case would be something along the lines of wanting to remove the zeros in the following lists but remember how many were removed between each value.
-- the second value in each resultant pair represents the number of
-- zeroes preceding the corresponding first value in the original list.
foldrOn (== 0) [1,0,0,0,0,0,1,0,0,0,1] -- [(1,0),(1,5),(1,3)]
foldrOn (== 0) [1,0,0,12,0,13] -- [(1,0),(12,2),(13,1)]
Is there a better way to accomplish this?
Additionally, can this be done more optimally?

First of all,
foldrOn :: Num t => (a -> Bool) -> [a] -> [(a, t)]
-- foldrOn (== 0) [1,0,0,0,0,0,1,0,0,0,1] -- [(1,0),(1,5),(1,3)]
foldrOn p xs = foldr g [] xs
  where
    g x [] = [(x,0)]
    g x ((y,n):r)
      | p x = ((y,n+1):r)
    g x r = ((x,0):r)
This is the simplest, though it is recursive, i.e. will force the whole list to the end before starting returning its result.
To make it maximally lazy we'd have to use a lazy left fold. The skipping over the p-satisfying elements is still a recursive step, but at least the process will pause between each such span.
Lazy left fold is usually implemented as a foldr with additional argument being passed left to right along the list:
foldlOn :: Num t => (a -> Bool) -> [a] -> [(a, t)]
-- foldlOn (== 0) [1,0,0,0,0,0,1,0,0,0,1] -- [(1,0),(1,5),(1,3)]
foldlOn p xs = foldr g z xs 0
  where
    g x r i | p x       = r (i+1)
            | otherwise = (x,i) : r 0
    z _i = []
Or you could combine span/break and unfoldr to do the same.
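To make the span/unfoldr suggestion concrete, here is a minimal sketch. The name foldlOnUnfold is made up for illustration, and the count type is fixed to Int for simplicity. Unlike the question's version, it silently drops a trailing run of matching elements instead of raising an error.

```haskell
import Data.List (unfoldr)

-- Hypothetical name. Pairs each non-matching element with the number of
-- matching elements that immediately precede it.
foldlOnUnfold :: (a -> Bool) -> [a] -> [(a, Int)]
foldlOnUnfold p = unfoldr step
  where
    step xs = case span p xs of
      -- `skipped` is the run of matching elements, `y` the element after it
      (skipped, y : rest) -> Just ((y, length skipped), rest)
      -- nothing left, or only matching elements left: stop
      (_, [])             -> Nothing
```

Each call to step produces exactly one output pair, so the result is still generated lazily, one span at a time.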
You might find a way to use groupBy with some post-processing step:
GHCi> groupBy (\a b -> (==0) b) [1,0,0,0,0,0,1,0,0,0,1]
[[1,0,0,0,0,0],[1,0,0,0],[1]]
GHCi> groupBy (const (==0)) [1,2,0,0,1,0,1]
[[1],[2,0,0],[1,0],[1]]
Finishing this should not be a problem.
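One possible way to finish the groupBy approach is sketched below. The name countZerosBefore is hypothetical, and the sketch assumes the list starts with a non-zero element; trailing zeros are ignored rather than reported as an error.

```haskell
import Data.List (groupBy)

-- Hypothetical name. Assumes the input starts with a non-zero element.
countZerosBefore :: [Int] -> [(Int, Int)]
countZerosBefore xs = zip heads counts
  where
    -- each group is one value followed by the zeros after it
    groups = groupBy (const (== 0)) xs
    -- the leading value of every group
    heads  = [h | (h : _) <- groups]
    -- zeros *before* a value are the zeros at the end of the previous group
    counts = 0 : map (subtract 1 . length) groups
```

The shift by one position (the leading 0 in counts) is what turns "zeros following a value" into "zeros preceding the next value".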

You can always bring some builtin machinery. The Data.List library is quite powerful:
import Data.List(mapAccumL)
import Data.Maybe(catMaybes)
foldrOn cond = catMaybes . snd . mapAccumL combine 0 where
  combine a el =
    if cond el then (a + 1, Nothing)
    else (0, Just (el, a))
What's going on
Essentially, foldrOn cond is a composition of the following functions:
mapAccumL combine 0 which advances along the list modifying each element by information about the number of recently skipped entities (starting the count at 0 and resetting it whenever we find something that doesn't match the cond predicate).
snd which discards the final state from the mapAccumL's result
catMaybes which removes the Maybe layer and leaves only the "present" values.
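To see the three stages separately, here is a small sketch that fixes the predicate to (== 0) and the element type to Int, purely for illustration. The names stage1, stage2 and stage3 are made up.

```haskell
import Data.List (mapAccumL)
import Data.Maybe (catMaybes)

-- Same step function as in the answer, specialised to (== 0) on Int.
combine :: Int -> Int -> (Int, Maybe (Int, Int))
combine a el
  | el == 0   = (a + 1, Nothing)
  | otherwise = (0, Just (el, a))

-- Final accumulator plus one Maybe per input element.
stage1 :: (Int, [Maybe (Int, Int)])
stage1 = mapAccumL combine 0 [1, 0, 0, 12, 0, 13]
-- (0,[Just (1,0),Nothing,Nothing,Just (12,2),Nothing,Just (13,1)])

-- Drop the final accumulator.
stage2 :: [Maybe (Int, Int)]
stage2 = snd stage1

-- Keep only the Just values.
stage3 :: [(Int, Int)]
stage3 = catMaybes stage2
-- [(1,0),(12,2),(13,1)]
```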

Let's start by using pattern matching to make your own implementation more idiomatic, more obviously correct, and also (much) faster. We can also use guards in an idiomatic fashion rather than if/then/else; this is rather less important. There's also no reason to use do here, so we won't.
foldrOn _cond [] = []
foldrOn cond (hd : tl)
  | cond hd
  = case foldrOn cond tl of
      (x, y) : tl' -> (x, y + 1) : tl'
                      -- fold token into char after it
      [] -> error "String ended on token."
  | otherwise
  = (hd, 0) : foldrOn cond tl
    -- don't fold token
This is ... okay. But as Will Ness suggests, we don't actually gain anything by consing an "incomplete" element onto the result list. We can instead count up the cond-satisfying tokens until we reach the end of the block, and then produce a complete element. I think this makes the code a little easier to understand, and it should also run a little bit faster.
foldrOn cond = go 0
  where
    go count (hd : tl)
      | cond hd
      = go (count + 1) tl -- Don't produce anything; just bump the count
      | otherwise
      = (hd, count) : go 0 tl -- Produce the element and the count; reset the count to 0
    go count []
      | count == 0
      = []
      | otherwise
      = error "List ended on a token."
To actually run faster, you might need to tell the compiler explicitly that you really want to calculate the counts. You probably don't need to understand this part just yet, but it looks like this:
-- At the top of the file, add this line:
{-# LANGUAGE BangPatterns #-}
foldrOn cond = go 0
  where
    go !count (hd : tl)
      | cond hd
      = go (count + 1) tl -- Don't produce anything; just bump the count
      | otherwise
      = (hd, count) : go 0 tl -- Produce the element and the count; reset the count to 0
    go count []
      | count == 0
      = []
      | otherwise
      = error "List ended on a token."
This can be written as a fold in the manner Will Ness demonstrates.
Note: while it's possible to avoid the BangPatterns language extension, doing so is a bit annoying.
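For the curious, one way to get the same strictness without the extension is to force the count with seq before dispatching on the list. This is only a sketch: the names foldrOnSeq and go' are made up, and the count type is fixed to Int.

```haskell
-- Hypothetical name. Same behaviour as the BangPatterns version, but the
-- count is forced explicitly with `seq`.
foldrOnSeq :: (a -> Bool) -> [a] -> [(a, Int)]
foldrOnSeq cond = go 0
  where
    -- evaluate the count first, then look at the list
    go count xs = count `seq` go' count xs

    go' count (hd : tl)
      | cond hd   = go (count + 1) tl        -- bump the count, produce nothing
      | otherwise = (hd, count) : go 0 tl    -- produce the pair, reset the count
    go' count []
      | count == 0 = []
      | otherwise  = error "List ended on a token."
```

The extra wrapper function is the "annoying" part: every clause that should be strict needs the seq threaded through by hand.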

Related

Are recursive calls in my "permutations with repetition" code accumulated to clog the RAM?

A bit of background:
I am an amateur programmer, having picked up Haskell a few months ago, in my spare time, after a period of Mathematica programming (my first language). I am currently going through my second Haskell book, by Will Kurt, but I still have miles to go to call myself comfortable around Haskell code. Codeabbey has been my platform for experimentation and learning so far.
I have written a piece of code to generate permutations of a given number, that deals with possible duplicate numbers, so for 588 it will internally generate 588, 858 and 885.
However, because I want to scale to pretty big input numbers (think perhaps even a hundred digits long), I don't want to output the whole list and then perform calculations on it, instead every number that is generated is checked on the spot for a certain property and if it has it, well, we have a winner, the number is returned as output and there's no need to go through the rest of the humongous list. If sadly no desired number is found and we unsuccessfully go through all possible permutations, it outputs a "0".
I have also opted to make it a command line program to feed values to it via gnu parallel for faster work.
So here is the code
import System.Environment
import Data.List
toDigits :: Integer -> [Integer]
toDigits n = map (\n -> read [n]) (show n)
fromDigits :: Integral a => [a] -> Integer
fromDigits list = fromDigitsHelperFunction list 0
fromDigitsHelperFunction :: Integral a => [a] -> Integer -> Integer
fromDigitsHelperFunction [] acc = acc
fromDigitsHelperFunction (x:[]) acc = (fromIntegral x) + acc
fromDigitsHelperFunction digits@(x:xs) acc = fromDigitsHelperFunction xs (acc + ((fromIntegral x) * 10 ^((length digits) - 1 )))
testPermutationsWithRepetition :: ([Integer],Int,[Int],[(Int,Integer)]) -> [Integer]
testPermutationsWithRepetition (digits, index, rotationMap, registeredPositions)
| index == 0 && rotationMap !! index == 0 = [0,0,0] --finish state (no more recursion). Nothing more to do
| index == digitsLength - 1 && beautyCheck (fromDigits digits) = digits
| index == digitsLength - 1 = testPermutationsWithRepetition (digits, index-1, rotationMap, registeredPositions)
| not ((index,digits!!index) `elem` registeredPositions) = testPermutationsWithRepetition (digits, index+1, rotationMap, (index,digits!!index):registeredPositions)
| rotationMap!!index == 0 = testPermutationsWithRepetition (digits, index-1, restoredRotMap, restoredRegPositions)
| rotationMap!!index > 0 && (index,digits!!index) `elem` registeredPositions = testPermutationsWithRepetition (shiftLDigits, index, subtractRot, registeredPositions)
where digitsLength = length digits
shiftLDigits = (fst splitDigits) ++ (tail $ snd splitDigits) ++ [head $ snd splitDigits]
splitDigits = splitAt index digits
restoredRotMap = (fst splitRotMap) ++ [digitsLength - index] ++ (tail $ snd splitRotMap)
splitRotMap = splitAt index rotationMap
restoredRegPositions = filter (\pos -> fst pos < index) registeredPositions --clear everything below the parent index
subtractRot = (fst splitRotMap) ++ [(head $ snd splitRotMap) - 1] ++ (tail $ snd splitRotMap)
--Frontend function for testing permutations by inputting a single parameter (the number in digit form)
testPermsWithRep :: [Integer] -> [Integer]
testPermsWithRep digits = testPermutationsWithRepetition (digits, 0, [length $ digits, (length $ digits) -1 .. 1], [])
main :: IO ()
main = do
args <- getArgs
let number = read (head args) :: Integer
let checkResult = fromDigits $ testPermsWithRep $ toDigits number
print checkResult
It's really a sequential process with an index variable that points to a certain number in the digit list and performs a recursive call on that list based on my rules. The function tracks its progress through the digit list by recording the digits visited in each position so far, to avoid following already visited paths again, until it gets to the last digit (index == length - 1). If the number that we get there passes the beauty check, it exits with the number produced.
Now, in a Mathematica (or I guess any imperative language) I would probably implement this with a While loop and Cases for its checks, and by the logic of the program, however long it took to compute (generate the permutations and check them for validity) it would take a moderate amount of memory, just enough to hold the list of "registeredPositions" really (you could call it the record of visited digits in specific positions, so it's a variable list as we go deeper in index but gets cleaned up as we move back up). However in this case, the recursive calls stack up as it seems and the whole thing acts as a fork bomb for sufficiently large numbers (e.g 27777772222222222222222223333) and eventually crashes. Is this behaviour something that can be handled differently in Haskell or is there no way to avoid the recursion and memory hogging?
I really like Haskell because the programs make logical sense, but I would like to use it also for cases like this where performance (and resources) matters.
As a side note, my brother pointed to this Algorithm to print all permutations with repetition of numbers in C that is reasonably fast (only generates a list though) and most importantly has minimal memory footprint, although I can tell there's also recursion used in it. Other that that I'm clueless when it comes to C and I would like to stick to Haskell, if it can do what I want at the end of the day, that is.
Any help is welcome. Have a good day!
Edit:
Per Soleil's suggestion I update my post with additional info provided in the comments. Specifically:
After compiling with "ghc checking_program.hs" I run the program with "./checking_program 27777772222222222222222223333". On an i5 3470 with 4GB RAM it runs for about 10 minutes and exits with a segmentation fault. On my brother's 32GB machine, he let it run until it took up 20GB of RAM. No need to go further, I guess. My tests were on Ubuntu via Win10 WSL; his machine runs bare Linux.
testPermsWithRep is just a front end for testPermutationsWithRepetition, so that I can only provide the number and testPermsWithRep creates the initial parameters and calls testPermutationsWithRepetition with those. It outputs exactly what testPermutationsWithRepetition outputs, either a number (in digit form) that passes the test, or [0,0,0]. Now the test, the beautyCheck function is simply a test for single digit divisors of that number, that returns True or False. I didn't include it because it really is inconsequential. It could even be just a "bigger than x number" test.
As an example, calling "testPermsWithRep [2,6,7,3]" will call "testPermutationsWithRepetition ([2,6,7,3], 0, [4,3,2,1],[])" and whatever comes out of that function, testPermsWithRep will return that as well.
The performance issue with your program doesn't have anything to do with recursion. Rather, you seem to be running up against an accumulation of a partially evaluated, lazy data structure in your rotation map. Your program will run in constant memory if you use the deepseq package to fully force evaluation of the restoredRotMap:
-- Install the `deepseq` package and add this import
import Control.DeepSeq
-- And then change this one case
... | rotationMap!!index == 0 = restoredRotMap `deepseq`
        testPermutationsWithRepetition (digits, index-1, restoredRotMap, restoredRegPositions)
Compiled with ghc -O2 and using beautyCheck _ = False, this runs with a fixed resident memory usage of about 6 megs.
Some other performance targets:
You might want to replace most of your Integer types with Int, as this will be faster. I think you only need Integer for the input to toDigits and the output of fromDigits, and everything else can be Int, since it's all indexes and digits.
An even bigger win will be to replace your rotation map and registered positions with better data structures. If you find yourself splicing up lists with lots of listpart1 ++ [x] ++ listpart2 calls, there are going to be enormous performance costs to that, and the linear lookups with (!!) aren't helping either.
So I am not 100% sure of this and I am also not 100% sure I understand your code.
But as far as I understand you are generating permutations without duplicates and then you are checking for some predicate wanting whatever single number that fulfils it.
I think it should help to use as many of the prelude functions as possible because afaik then the compiler understands it can optimize recursion into a loop. As a rule of thumb I was taught to avoid explicit recursion as much as possible and instead use prelude functions like map, filter and fold. Mainly you avoid reinventing the wheel this way but there also should be a higher chance of the compiler optimizing things.
So to solve your problem try generating a list of all permutations, then filter it using filter and then just do take 1 if you want the result that is found first. Because of Haskell's lazy evaluation take 1 makes it so that we are interested only in the first x in (x:xs) that a filter would return. Therefore filter will keep dropping elements from the, again lazily evaluated, list of permutations and when it finds one it stops.
I found a permutation implementation on https://rosettacode.org/wiki/Permutations#Haskell
and used it to try this call:
take 1 $ filter ((> 67890123456789012345) . fromDigits) $ permutations' $ toDigits 12345678901234567890
it has been running for like 20 minutes now and RAM usage has stayed around 230 MB.
I hope that has answered/helped you at least in some way.
+ a bonus tip: you can simplify your fromDigits to this beautiful thing:
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl shiftAndAdd 0
where shiftAndAdd acc d = 10 * acc + fromIntegral d
EDIT:
I read some more of the comments and I see you care about ignoring duplicates but I am afraid you'll have to go smarter about that, since if I understand correctly your implementation still generates all the duplicates it only throws them away after checking if they are in a list (which has O(n) complexity). And when you only care about finding one permutation that fits your predicate you drop the not fitting ones anyway.
And people have already correctly pointed out that !! is generally also very bad.
Thanks to everyone for your helpful answers and comments.
@lordQuick permutations used with filter is still terrible, but that fromDigits code is a beauty, so I used it.
@k-a-buhr That's exactly what I did yesterday; also, per others' suggestions, I replaced all use of !! and ++. When I did the latter all memory problems disappeared. Wow! I mean I knew ++ is bad, I just didn't realise how bad! We're talking orders of magnitude bad! 3M of RAM vs several GB. Also, valid point about integers. I will try that.
Oh, also a very important thing. I replaced recursive calls with until. This is the approach I would have followed in Mathematica (a NestWhile function to be exact), and I'm glad I found it in Haskell. It seemed to make things a bit faster too.
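For anyone who hasn't met until before: it takes a stop condition, a step function and a start state, and keeps applying the step until the condition holds. Here is a toy sketch with a made-up state (a step counter and a running value); it has nothing to do with permutations and only shows the shape of the loop.

```haskell
-- Hypothetical example state: (number of steps taken, current value).
halveUntilSmall :: Integer -> (Int, Integer)
halveUntilSmall n = until done step (0, n)
  where
    -- stop condition, checked before each step
    done (_, v) = v < 10
    -- one iteration: count the step and halve the value
    step (steps, v) = (steps + 1, v `div` 2)

-- halveUntilSmall 100 goes (0,100) -> (1,50) -> (2,25) -> (3,12) -> (4,6)
```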
Anyway, the revised code, that solves my memory issues is here for anyone if interested.
{-compiled with "ghc -Rghc-timing -O2 checking_program_v3.hs"-}
import System.Environment
import Data.List
--A little help with triples
fstOfThree (a, _, _) = a
sndOfThree (_, b, _) = b
thrOfThree (_, _, c) = c
--And then some with quads
fstOfFour (a, _, _, _) = a
sndOfFour (_, b, _, _) = b
thrOfFour (_, _, c, _) = c
--This function is a single pass test for single digit factors
--It will be called as many times as needed by pryForSDFactors
trySingleDigitsFactors :: (Bool, Integer, [Integer]) -> (Bool, Integer, [Integer])
trySingleDigitsFactors (True, n, f) = (True, n, f)
trySingleDigitsFactors (b, n, []) = (b, n, [])
trySingleDigitsFactors (b, n, (f:fs))
| mod n f == 0 = (True, div n f, fs)
| otherwise = trySingleDigitsFactors (False, n, fs)
--This function will take a number and repeatedly divide by single digits till it gets to a single digit if possible
--Then it will return True
pryForSDFactors :: Integer -> Bool
pryForSDFactors n
| sndOfThree sdfTry < 10 = True
| fstOfThree sdfTry == True = pryForSDFactors $ sndOfThree sdfTry
| otherwise = False
where sdfTry = trySingleDigitsFactors (False, n, [7,5,3,2])
toDigits :: Integer -> [Integer]
toDigits n = map (\n -> read [n]) (show n)
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl shiftAndAdd 0
where shiftAndAdd acc d = 10 * acc + fromIntegral d
replaceElementAtPos :: a -> Int -> [a] -> [a]
replaceElementAtPos newElement pos [] = []
replaceElementAtPos newElement 0 (x:xs) = newElement:xs
replaceElementAtPos newElement pos (x:xs) = x : replaceElementAtPos newElement (pos-1) xs
checkPermutationsStep :: ([Integer],Int,[Int],[(Int,Integer)]) -> ([Integer],Int,[Int],[(Int,Integer)])
checkPermutationsStep (digits, index, rotationMap, registeredPositions)
| index == digitsLength - 1 = (digits, index-1, rotationMap, registeredPositions)
| not ((index, digitAtIndex) `elem` registeredPositions) = (digits, index+1, rotationMap, (index,digitAtIndex):registeredPositions)
| rotationAtIndex == 0 = (digits, index-1, restoredRotMap, restoredRegPositions)
| rotationAtIndex > 0 && (index, digitAtIndex) `elem` registeredPositions = (shiftLDigits, index, subtractRot, registeredPositions)
where digitsLength = length digits
digitAtIndex = head $ drop index digits
rotationAtIndex = head $ drop index rotationMap
--restoredRotMap = (fst splitRotMap) ++ [digitsLength - index] ++ (tail $ snd splitRotMap)
restoredRotMap = replaceElementAtPos (digitsLength - index) index rotationMap
--splitRotMap = splitAt index rotationMap
restoredRegPositions = filter (\pos -> fst pos < index) registeredPositions --clear everything below the parent index
shiftLDigits = (fst splitDigits) ++ (tail $ snd splitDigits) ++ [head $ snd splitDigits]
splitDigits = splitAt index digits
--subtractRot = (fst splitRotMap) ++ [(head $ snd splitRotMap) - 1] ++ (tail $ snd splitRotMap)
subtractRot = replaceElementAtPos (rotationDigitAtIndex - 1) index rotationMap
rotationDigitAtIndex = head $ drop index rotationMap
checkConditions :: ([Integer],Int,[Int],[(Int,Integer)]) -> Bool
checkConditions (digits, index, rotationMap, registeredPositions)
| (index == 0 && rotationAtIndex == 0) || ((index == (length digits) - 1) && pryForSDFactors (fromDigits digits)) = True
| otherwise = False
where rotationAtIndex = head $ drop index rotationMap
testPermsWithRep :: Integer -> Integer
testPermsWithRep n
| sndOfFour computationResult == 0 && (head . thrOfFour) computationResult == 0 = 0
| otherwise = (fromDigits . fstOfFour) computationResult
where computationResult = until checkConditions checkPermutationsStep (digitsOfn, 0 , [digitsLength, digitsLength -1 .. 1], [])
digitsOfn = toDigits n
digitsLength = length digitsOfn
main :: IO ()
main = do
args <- getArgs
let inputNumber = read (head args) :: Integer
let checkResult = testPermsWithRep inputNumber
print checkResult
Now, bear in mind that this code, as I've mentioned, checks for a condition of each generated permutation (single digit factors) on the spot, and moves on if False, but it's pretty easy to repurpose it for output list generation.
Sure it's now just inefficient in terms of big O complexity (scales terribly), and I was at first thinking of replacing lists with Data.Map because that's what I've learned so far (though not so comfortable with maps yet).
I've also read that there's a more efficient replacement for read since that's also called a lot for numbers-to-digits conversions.
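One candidate for that replacement is digitToInt from Data.Char, which converts a single digit character without going through the general read parser. A minimal sketch follows; the name toDigitsFast is made up, I have not benchmarked it, and it assumes a non-negative input.

```haskell
import Data.Char (digitToInt)

-- Hypothetical name. Assumes n >= 0: `show` of a negative number starts
-- with '-', which digitToInt rejects with an error.
toDigitsFast :: Integer -> [Integer]
toDigitsFast = map (fromIntegral . digitToInt) . show
```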
@lordQuick I don't know about HashMaps or vectors yet, but I'm still learning. Every little optimization will pay off in computation time because this is my first piece of "practical" code, not just Codeabbey credit.
Cheers!
Here is a solution using a more efficient, insertion-based algorithm to compute unique permutations:
import Data.List
permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insert) [[]]
  where insert y = foldr combine [[y]] . (zip <*> tail . tails)
          where combine (x, xs) xss = (y : x : xs) :
                  if y == x then [] else map (x :) xss
headDef :: a -> [a] -> a
headDef x [] = x
headDef x (h : t) = h
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl1' ((+) . (10 *)) . map fromIntegral
toDigits :: Integer -> [Int]
toDigits = map (read . pure) . show
firstValidPermutation :: (Integer -> Bool) -> Integer -> Integer
firstValidPermutation p =
  headDef 0 .
  filter p .
  map fromDigits .
  permutationsNub .
  toDigits
The basic idea is that, given the unique permutations of a list's tail, we can compute the unique permutations of the whole list by inserting its head into all of the tail's permutations, in every position that doesn't follow an occurrence of the head (to avoid creating duplicates). From my tests, permutationsNub seems to be faster than permutations from Data.List even when the input contains no repetitions. However, unlike that function, it consumes its input eagerly and thus cannot handle an infinite input. Exercise: Prove this algorithm's correctness.
to be continued

Every n-th element of a list in the form of a list

I went through a post for this problem but I do not understand it. Could someone please explain it?
Q: Find every n-th element of the list in the form of a list start from the n-th element itself.
everyNth :: Int -> [t] -> [t]
everyNth elt = map snd . filter (\(lst,y) -> (mod lst elt) == 0) . zip [1..]
Also, please explain how pattern matching can be used for this problem. That is using
[]->[]
It's easy to use pattern matching to 'select every nth element' for particular cases of n:
every2nd (first:second:rest) = second : every2nd rest
every2nd _ = []
-- >>> every2nd [1..12]
-- [2,4,6,8,10,12]
every3rd (first:second:third:rest) = third : every3rd rest
every3rd _ = []
-- >>> every3rd [1..13]
-- [3,6,9,12]
every4th (first:second:third:fourth:rest) = fourth : every4th rest
every4th _ = []
-- >>> every4th [1..12]
-- [4,8,12]
For the general case, though, we're out of luck, at least with that particular approach. Patterns like those above will need some definite length to be definite patterns. The composed function you mention starts from the thought that we do know how to find every nth member of [1..], namely if it's a multiple of n
multiple n m = m `mod` n == 0
-- >>> filter (multiple 3) [1..12]
-- [3,6,9,12]
So the solution you are trying to understand zips [1..] with the list
index xs = zip [1..] xs
-- >>> index [1..5]
-- [(1,1),(2,2),(3,3),(4,4),(5,5)]
-- >>> index "hello"
-- [(1,'h'),(2,'e'),(3,'l'),(4,'l'),(5,'o')]
Then it filters out just those pairs whose first element is a multiple of n
every_nth_with_index n xs = filter (\(m,a) -> multiple n m) (index xs)
-- >>> every_nth_with_index 3 [1..12]
-- [(3,3),(6,6),(9,9),(12,12)]
-- >>> every_nth_with_index 3 "stackoverflow.com"
-- [(3,'a'),(6,'o'),(9,'r'),(12,'o'),(15,'c')]
Then it gets rid of the ancillary construction, leaving us with just the second element of each pair:
every_nth n xs = map snd (every_nth_with_index n xs)
-- >>> every_nth 3 [1..12]
-- [3,6,9,12]
-- >>> every_nth 3 "stackoverflow.com"
-- "aoroc"
Retracing our steps we see that this is the same as
everyNth elt = map snd . filter (\(lst,y) -> (mod lst elt) == 0) . zip [1..]
The notorious fold fan strikes again.
everyNth n xs = foldr go (`seq` []) xs (n - 1) where
  go x r 0 = x : r (n - 1)
  go _ r k = r (k - 1)
This is very similar to chepner's approach but it integrates the dropping into the recursion. Rewritten without the fold, it's pure pattern matching:
everyNth n = go (n - 1) where
  go k [] = k `seq` []
  go 0 (x : xs) = x : go (n - 1) xs
  go k (_ : xs) = go (k - 1) xs
With a little cheating, you can define everyNth using pattern matching. Really, we're abstracting out the part that makes pattern matching difficult, as pointed out in Michael's answer.
everyNth n lst = e (shorten lst)
  where shorten = drop (n-1) -- here's the cheat
        e [] = []
        e (first:rest) = first : e (shorten rest)
If you have never seen Haskell before then this takes a bit of explaining.
everyNth :: Int -> [t] -> [t]
everyNth elt = map snd . filter (\(lst,y) -> (mod lst elt) == 0) . zip [1..]
First, note that the type has two arguments, but the definition has only one. This is because the value returned by everyNth is in fact another function. elt is the Int, and the expression in the second line creates a new function that does the job.
Second, note the "." operators. This is an operator that joins two functions together. It is defined like this:
(f . g) x = f (g x)
Here is an equivalent version of the definition with the second argument made explicit:
everyNth elt xs = map snd (filter (\(lst, y) -> (mod lst elt) == 0) (zip [1..] xs))
When you see a bunch of functions in a chain linked by "." operators you need to read it from right to left. In my second version pay attention to the bracket nesting. zip [1..] xs is the inner-most expression, so it gets evaluated first. It turns a list like ["foo", "bar"] into [(1, "foo"),(2, "bar")]. Then this is filtered to find entries where the number is a multiple of elt. Finally the map snd strips the numbers back out to return just the required entries.

How to shorten a Haskell implementation like this?

I have a function with a lot of guards that look like this:
function
| p `elem` [0,1,2,3,4,5,6] = [0,1,2,3,4,5,6]
| p `elem` [7,8,9,10,11,12,13] = [7,8,9,10,11,12,13]
| p `elem` [14,15,16,17,18,19,20] = [14,15,16,17,18,19,20]
| otherwise = []
I'm sure I can write this much shorter with Haskell. If not, then it's okay. I'm new to Haskell and I would love to become better at it by learning different approaches.
Perhaps using "map" may be a good start? But then, I'm not sure how to pass in those specific lists.
The values are not always contiguous.
What about simple bounds checks?
function p
  | p < 0 = []
  | p < 7 = [0..6]
  | p < 14 = [7..13]
  | p < 21 = [14..20]
  | otherwise = []
It will be faster and for some applications use less memory.
If you don't want to perform a bounds check (but an element check), you can still use the shortened list notation.
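As a sketch of what that could look like, using the lists from the question (the name functionElem and the Int type are only for illustration):

```haskell
-- Hypothetical name. Element checks written with range notation.
functionElem :: Int -> [Int]
functionElem p
  | p `elem` [0..6]   = [0..6]
  | p `elem` [7..13]  = [7..13]
  | p `elem` [14..20] = [14..20]
  | otherwise         = []
```

Lists that are not contiguous still have to be written out in full, or with a stepped range such as [1,3..9] when the gaps are regular.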
Alternatively, you could construct a helper function that iterates over the lists:
helper (x:xs) p | elem p x = x
                | otherwise = helper xs p
helper [] _ = []
function = helper [[0..6],[7..13],[14..20]]
Although this is actually longer, you can easily extend the function to use other lists. Note however that this function will be slower, since elem requires O(n) time whereas a bounds check takes O(1) time.
You can also, as is suggested in @jamshidh's answer, construct a Data.Map, which is a data structure that guarantees O(log n) lookup time:
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Maybe(fromMaybe)
helper2 :: Ord a => [[a]] -> a -> [a]
helper2 lst p = fromMaybe [] $ Map.lookup p (Map.fromList $ concatMap (\x -> zip x (repeat x)) lst)
function = helper2 [[0..6],[7..13],[14..20]]
In this last piece, (\x -> zip x (repeat x)) generates, for a list l, tuples containing an element e of the list and the entire list l. For example:
Prelude> (\x -> zip x (repeat x)) [0..6]
[(0,[0,1,2,3,4,5,6]),(1,[0,1,2,3,4,5,6]),(2,[0,1,2,3,4,5,6]),(3,[0,1,2,3,4,5,6]),(4,[0,1,2,3,4,5,6]),(5,[0,1,2,3,4,5,6]),(6,[0,1,2,3,4,5,6])]
This works as follows: x unifies with a list, for instance [0,1,2,3,4,5,6], now we apply a zip function on [0,1,2,3,4,5,6] and on the infinite list [[0,1,2,3,4,5,6],[0,1,2,3,4,5,6],[0,1,2,3,4,5,6],....]. zip generates tuples as long as both lists feed elements, so it takes the first element from [0,1,..,6] and the first from [[0,1,..,6],[0,1,..,6],[0,1,..,6],...] so the resulting tuple is (0,[0..6]), next it takes the second element 1 from the list, and the second item from the repeat function, thus (1,[0..6]). It keeps doing this -- although lazily -- until one of the lists is exhausted which is the case for the first list.
You can use the list monad here.
import Control.Monad (guard, join)
func p = join $ do x <- [[1,3,5], [2,4,6], [7,8,9]]
                   guard $ p `elem` x
                   return x
The list of lists contains the candidates you want to check against. The call to guard filters out the choices that don't succeed. As long as the candidate lists are disjoint, at most one will succeed. The do block evaluates to either [] or [x] for one of the choices of x, so join reduces [x] to x (and [] to []).
> func 1
[1,3,5]
> func 2
[2,4,6]
> func 7
[7,8,9]
> func 10
[]
As a list comprehension, it would look like
func p = join [x | x <-[[1,3,5],[2,4,6],[7,8,9]], p `elem` x]
First create the list of lists
lists = [[0,1,2,3,4,5,6], [7,8,9,10,11,12,13], [14,15,16,17,18,19,20]]
Then create a mapping from value to list
theMap = concat $ map (\x -> zip x (repeat x)) lists
This will give you what you need
> lookup 1 theMap
Just [0,1,2,3,4,5,6]
Note that the output is a Maybe, in the case you don't supply a value in any list.

Does Haskell have a takeUntil function?

Currently I am using
takeWhile (\x -> x /= 1 && x /= 89) l
to get the elements from a list up to either a 1 or 89. However, the result doesn't include these sentinel values. Does Haskell have a standard function that provides this variation on takeWhile that includes the sentinel in the result? My searches with Hoogle have been unfruitful so far.
Since you were asking about standard functions: no. There also isn't a package containing a takeWhileInclusive, but it's really simple to write:
takeWhileInclusive :: (a -> Bool) -> [a] -> [a]
takeWhileInclusive _ [] = []
takeWhileInclusive p (x:xs) = x : if p x then takeWhileInclusive p xs
                                         else []
The only thing you need to do is to take the value regardless whether the predicate returns True and only use the predicate as a continuation factor:
*Main> takeWhileInclusive (\x -> x /= 20) [10..]
[10,11,12,13,14,15,16,17,18,19,20]
Is span what you want?
(matching, rest) = span (\x -> x /= 1 && x /= 89) l
then look at the head of rest.
The shortest way I found to achieve that is using span and adding a function before it that takes the result of span and merges the first element of the resulting tuple with the head of the second element of the resulting tuple.
The whole expression would look something like this:
(\(f,s) -> f ++ [head s]) $ span (\x -> x /= 1 && x /= 89) [82..140]
The result of this expression is
[82,83,84,85,86,87,88,89]
The first element of the tuple returned by span is the list that takeWhile would return for those parameters, and the second element is the list with the remaining values, so we just add the head from the second list to our first list.
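Note that head s fails when the list contains no sentinel at all, because the second list is then empty. A small sketch that avoids this uses take 1 instead of head. The name takeUntil is made up here, and its predicate marks the sentinel values rather than the values to keep.

```haskell
-- Hypothetical name. `stop` returns True for the sentinel values.
takeUntil :: (a -> Bool) -> [a] -> [a]
takeUntil stop xs = before ++ take 1 rest
  where
    -- break stop == span (not . stop)
    (before, rest) = break stop xs

-- takeUntil (\x -> x == 1 || x == 89) [82..140]  gives [82..89]
-- takeUntil (== 5) [1,2,3]                       gives [1,2,3]
```

Because break and ++ are lazy, this also works on infinite lists as long as a sentinel eventually appears.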

Haskell - get nth element without "!!"

I need to get the nth element of a list but without using the !! operator. I am extremely new to haskell so I'd appreciate if you can answer in more detail and not just one line of code. This is what I'm trying at the moment:
nthel:: Int -> [Int] -> Int
nthel n xs = 0
let xsxs = take n xs
nthel n xs = last xsxs
But I get: parse error (possibly incorrect indentation)
There's a lot that's a bit off here,
nthel :: Int -> [Int] -> Int
is technically correct, really we want
nthel :: Int -> [a] -> a
So we can use this on lists of anything (Optional)
nthel n xs = 0
What you just said is "No matter what you give to nthel return 0". which is clearly wrong.
let xsxs = ...
This is just not legal haskell. let ... in ... is an expression, it can't be used toplevel.
From there I'm not really sure what that's supposed to do.
Maybe this will help put you on the right track
nthelem n [] = <???> -- error case, empty list
nthelem 0 xs = head xs
nthelem n xs = <???> -- recursive case
Try filling in the <???> with your best guess and I'm happy to help from there.
Alternatively you can use Haskell's "pattern matching" syntax. I explain how you can do this with lists here.
That changes our above to
nthelem n [] = <???> -- error case, empty list
nthelem 0 (x:xs) = x --bind x to the first element, xs to the rest of the list
nthelem n (x:xs) = <???> -- recursive case
Doing this is handy since it removes the need to use head and tail explicitly.
I think you meant this:
nthel n xs = last xsxs
  where xsxs = take n xs
... which you can simplify as:
nthel n xs = last (take n xs)
I think you should avoid using last whenever possible - lists are made to be used from the "front end", not from the back. What you want is to get rid of the first n elements, and then get the head of the remaining list (of course you get an error if the rest is empty). You can express this quite directly as:
nthel n xs = head (drop n xs)
Or shorter:
nthel n = head . drop n
Or slightly crazy:
nthel = (head .) . drop
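If you would rather not get an error when the index is out of range, a variant returning Maybe is a small step from here. This is just a sketch; the name nthelMaybe is made up, and listToMaybe comes from Data.Maybe.

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical name. Zero-based index; Nothing when out of range.
nthelMaybe :: Int -> [a] -> Maybe a
nthelMaybe n xs
  | n < 0     = Nothing                    -- drop treats negative n like 0
  | otherwise = listToMaybe (drop n xs)    -- safe replacement for head

-- nthelMaybe 2 "hello"  gives Just 'l'
-- nthelMaybe 5 "hello"  gives Nothing
```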
As you know, lists aren't naturally indexed, but that can be overcome with a common trick.
Try zip [0..] "hello" in GHCi. What about zip [0,1,2] "hello" or zip [0..10] "hello"?
Starting from this observation, we can easily obtain a way to index our list.
It is also a good illustration of laziness, which is a useful hint for your learning process.
Based on this, and using pattern matching, we can write an efficient algorithm:
Handle the boundary cases (empty list, negative index).
Replace the list with an indexed version using zip.
Call a helper function designed to process the indexed list recursively.
In the helper function the list can't be empty, so we can pattern match naively:
if the index is equal to n, we have a winner;
otherwise, if the rest of the list is empty, it's over;
otherwise, call the helper function on the rest of the list.
As an additional note, since our function can fail (empty list, ...), it is a good idea to wrap the result in the Maybe type.
Putting this all together, we end up with:
nth :: Int -> [a] -> Maybe a
nth n xs
  | null xs || n < 0 = Nothing
  | otherwise = helper n zs
  where
    zs = zip [0..] xs
    helper n ((i,c):zs)
      | i == n = Just c
      | null zs = Nothing
      | otherwise = helper n zs

Resources