Composing Haskell filters - haskell

I am converting the zxcvbn password strength algorithm to Haskell.
I have two functions that check for all characters being ASCII and that a brute force attack is possible:
filterAscii :: [String] -- ^terms to filter
    -> [String] -- ^filtered terms
filterAscii = filter $ all (\ chr -> ord chr < 128)
and
filterShort :: [String] -- ^terms to filter
    -> [String] -- ^filtered terms
filterShort terms = map fst $ filter long $ zip terms [1..]
    where long (term, index) = (26 ^ length term) > index
I composed these into a single function:
filtered :: [String] -- ^terms to filter
    -> [String] -- ^filtered terms
filtered = filterAscii . filterShort
I now need to compose these with a third filter that checks that the terms are not null:
filter (not . null) terms
It has occurred to me that I am creating a chain of filters and that it would make more sense to create a single function that takes a list of filter functions and composes them in the order given.
If I recall my reading correctly, this is a job for an applicative functor. Can I use applicatives for this?
I am not sure how to handle the filterShort function where I need to zip each item with its one-based index before filtering.

You can use the Endo wrapper from Data.Monoid to get a monoid instance that will allow you to use mconcat like so:
Prelude> :m + Data.Monoid
Prelude Data.Monoid> :t appEndo $ mconcat [Endo filterAscii, Endo filterShort]
appEndo $ mconcat [Endo filterAscii, Endo filterShort] :: [String] -> [String]
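A minimal sketch of how this could extend to the third filter, assuming the two filters from the question (here `isAscii` from Data.Char stands in for the `ord chr < 128` test, and `composeFilters` is a hypothetical helper name). Note that `Endo f <> Endo g` is `Endo (f . g)`, so the last filter in the list runs first:

```haskell
import Data.Char (isAscii)
import Data.Monoid (Endo (..))

filterAscii :: [String] -> [String]
filterAscii = filter (all isAscii)

filterShort :: [String] -> [String]
filterShort terms = map fst $ filter long $ zip terms [(1 :: Integer) ..]
  where
    long (term, index) = (26 ^ length term) > index

-- Compose list-to-list filters; the rightmost one is applied first,
-- exactly as with (.). An empty list yields the identity function.
composeFilters :: [[a] -> [a]] -> [a] -> [a]
composeFilters = appEndo . foldMap Endo

filtered :: [String] -> [String]
filtered = composeFilters [filterAscii, filterShort, filter (not . null)]
```

Because the null check runs first here, filterShort sees indices into the list with empty terms already removed; reorder the list if the indices should refer to the original positions.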

In other words, you want:
filters :: [a -> Bool] -> [a] -> [a]
filters fs = filter (\a -> and $ map ($ a) fs)
Note, though, that as far as I know GHC is likely to optimize a pipeline of filters anyway, so it may not be worth creating this function. Your filterShort will also be a problem, since it is not a plain predicate filter: whether a term is kept depends on its position in the list.
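One way to fit the position-dependent check into this scheme is to pair every term with its index once and let all predicates work on the pair. This is only a sketch: the names `Indexed`, `asciiOnly`, `bruteForceable`, `nonEmpty` and `filtersIndexed` are made up for illustration, and the index is always the one-based position in the original list, which differs from chaining the filters if an earlier filter removes terms.

```haskell
import Data.Char (isAscii)

-- A term paired with its one-based position in the original list.
type Indexed = (Integer, String)

asciiOnly, bruteForceable, nonEmpty :: Indexed -> Bool
asciiOnly (_, term) = all isAscii term
bruteForceable (index, term) = 26 ^ length term > index
nonEmpty (_, term) = not (null term)

-- Keep the terms that satisfy every predicate, in a single pass.
filtersIndexed :: [Indexed -> Bool] -> [String] -> [String]
filtersIndexed preds = map snd . filter (\t -> all ($ t) preds) . zip [1 ..]
```

For example, `filtersIndexed [asciiOnly, bruteForceable, nonEmpty] terms` applies all three checks to each term.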

Related

List of polymorphic functions in haskell?

Consider the code below:
t1 :: [Int] -> (Int,String)
t1 xs = (sum xs,show $ length xs)
t2 :: [Int] -> (Int,String)
t2 xs = (length xs, (\x -> '?') <$> xs)
t3 :: [Int] -> (Char,String)
t3 (x:xs) = ('Y',"1+" ++ (show $ length xs))
t3 [] = ('N',"empty")
These three functions have a type that only varies partially -- they are entirely usable without needing to know the type of the first component of the tuple they produce. This means that I can operate on them without needing to refer to that type:
fnListToStrs vs fs = (\x -> snd $ x vs) <$> fs
Loading these definitions into GHCi, all three of the functions work independently as an argument to fnListToStrs, and indeed I can pass in a list containing both t1 and t2 because they have the same type:
*Imprec> fnListToStrs [1,2] [t1,t2]
["2","??"]
*Imprec> fnListToStrs [1,2] [t3]
["1+1"]
But I can't pass all 3 at the same time, even though the divergence of types is actually irrelevant to the calculation performed:
*Imprec> fnListToStrs [1,2] [t1,t2,t3]
(rejected with a type error: t3 returns (Char,String), while t1 and t2 return (Int,String))
I have the feeling that making this work has something to do with either existential or impredicative types, but neither extension has worked for me when using the type declaration I expect fnListToStrs to be able to take, namely:
fnListToStrs :: [Int] -> [forall a.[Int]->(a,String)] -> [String]
Is there some other way to make this work?
Existential is correct, not impredicative. And Haskell doesn't have existentials, except through an explicit wrapper...
{-# LANGUAGE GADTs #-}
data SomeFstRes x z where
    SFR :: (x -> (y,z)) -> SomeFstRes x z
> fmap (\(SFR f) -> snd $ f [1,2]) [SFR t1, SFR t2, SFR t3]
["2","??","1+1"]
but, this really is a bit useless. Since you can't possibly do anything with the first result anyway, it's more sensible to just throw it away immediately and put the remaining function in a simple monomorphic list:
> fmap ($[1,2]) [snd . t1, snd . t2, snd . t3]
["2","??","1+1"]
Any way to put these functions into a list will require "wrapping" each of them in some fashion. The simplest wrapping is just
wrap :: (a -> (b, c)) -> a -> c
wrap f = snd . f
There are, indeed, other ways to wrap these (notably with existential types), but you've not given any information to suggest that any of those would be even slightly better in your application than this simplest version.
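Put together with the functions from the question, the simple wrapping could look like the sketch below. The monomorphic signature given to fnListToStrs is an assumption about what the asker needs, not something stated in the question.

```haskell
t1 :: [Int] -> (Int, String)
t1 xs = (sum xs, show (length xs))

t2 :: [Int] -> (Int, String)
t2 xs = (length xs, map (const '?') xs)

t3 :: [Int] -> (Char, String)
t3 (_ : xs) = ('Y', "1+" ++ show (length xs))
t3 [] = ('N', "empty")

-- Discard the first component so that every function gets the same type.
wrap :: (a -> (b, c)) -> a -> c
wrap f = snd . f

-- After wrapping, an ordinary homogeneous list is enough.
fnListToStrs :: [Int] -> [[Int] -> String] -> [String]
fnListToStrs vs = map ($ vs)
```

With these definitions, `fnListToStrs [1,2] [wrap t1, wrap t2, wrap t3]` type-checks without any extension.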
Here's an example where something more sophisticated might make sense. Suppose you have
data Blob a b = Blob [a -> b] [a]
Now imagine you want to make a list of values of type Blob a b that all have the same b type, but may have different a types. Actually applying each function to each argument could lead to a prohibitively large list of potential results, so it would make sense to write
data WrapBlob b where
    WrapBlob :: Blob a b -> WrapBlob b
Now you can make the list and postpone the decision of which function(s) to apply to which argument(s) without paying a prohibitive price.
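As a rough illustration of consuming such a list, here is a sketch in which `firstResult` and `blobs` are hypothetical names invented for the example. Pattern matching on WrapBlob brings the hidden type a into scope, so a function and an argument from the same blob can still be combined:

```haskell
{-# LANGUAGE GADTs #-}

data Blob a b = Blob [a -> b] [a]

data WrapBlob b where
  WrapBlob :: Blob a b -> WrapBlob b

-- Apply the first function to the first argument, if both exist.
-- Nothing is computed until a caller asks for a result.
firstResult :: WrapBlob b -> Maybe b
firstResult (WrapBlob (Blob (f : _) (x : _))) = Just (f x)
firstResult _ = Nothing

-- Blobs with different argument types but the same result type.
blobs :: [WrapBlob Int]
blobs =
  [ WrapBlob (Blob [length] ["abc"])
  , WrapBlob (Blob [(+ 1)] [41 :: Int])
  , WrapBlob (Blob [] ([] :: [Bool]))
  ]
```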

higher order functions in haskell

I have a list of functions and another list of "arguments". I want to build a new list by applying each function in the first list to each element of the second (apply :: Ord u => [v->u]->[v]->[u]).
For example,
apply [(^2),(^3),(^4),(2^)] [10] = [100,1000,1024,10000], or
apply [reverse,(++"ing"),reverse.(++"ing"),(++"ing").reverse] ["play","do"] = ["doing","gniod","gniyalp","od","oding","playing","yalp","yalping"].
How can I do this? I am just taking my first steps in Haskell.
Let us take your first list:
[(^2),(^3),(^4),(2^)]
Its type is xs :: Integral a => [a -> a]
Now you want to apply it to the list [10]. What you want is exactly the Applicative function <*>, whose type is Applicative f => f (a -> b) -> f a -> f b:
λ> import Control.Applicative
λ> let xs = [(^2),(^3),(^4),(2^)]
λ> xs <*> [10]
[100,1000,10000,1024]
You can work out the types to see how they fit together. Note that <*> returns the results in application order rather than sorted as in your expected output; sort the result if you need that order. I suggest reading LYAH (Learn You a Haskell) to further solidify the concepts.
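The expected outputs in the question are in ascending order, which would explain the Ord constraint in the requested signature. A sketch under that assumption, combining <*> with sort from Data.List:

```haskell
import Data.List (sort)

-- Apply every function to every argument, then sort the results.
apply :: Ord u => [v -> u] -> [v] -> [u]
apply fs xs = sort (fs <*> xs)
```

If the order of application matters more than the sorted order, dropping `sort` gives plain `fs <*> xs`.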

Cleanest way to apply a list of boolean functions to a list?

Consider this:
ruleset = [rule0, rule1, rule2, rule3, rule4, rule5]
where rule0, rule1, etc. are boolean functions that take one argument. What is the cleanest way to find if all elements of a particular list satisfy all the rules in the ruleset?
Obviously, a loop would work, but Haskell folks always seem to have clever one-liners for these types of problems.
The all function seems appropriate (eg. all (== check_one_element) ruleset) or nested maps. Also, map ($ anElement) ruleset is roughly what I want, but for all elements.
I'm a novice at Haskell and the many ways one could approach this problem are overwhelming.
If you require all the functions to be true for each argument, then it's just
and (ruleset <*> list)
(You'll need to import Control.Applicative to use <*>.)
Explanation:
When <*> is given a pair of lists, it applies each function from the list on the left to each argument from the list on the right, and gives back a list containing all the results.
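A small sketch of that behaviour; the names `example` and `allSatisfy` are made up for illustration. In current GHC versions <*> is exported by the Prelude, so no import is needed here:

```haskell
-- Each function on the left is applied to each argument on the right;
-- results are grouped by function: (+ 1) on 10 and 20, then (* 2) on 10 and 20.
example :: [Int]
example = [(+ 1), (* 2)] <*> [10, 20]

-- True only if every rule holds for every element.
allSatisfy :: [a -> Bool] -> [a] -> Bool
allSatisfy ruleset list = and (ruleset <*> list)
```

Since `and` is lazy, evaluation stops at the first rule/element pair that yields False.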
A one-liner:
import Control.Monad.Reader
-- sample data
rulesetL = [ (== 1), (>= 2), (<= 3) ]
list = [1..10]
result = and $ concatMap (sequence rulesetL) list
(The type we're working on here is Integer, but it could be anything else.)
Let me explain what's happening: rulesetL is of type [Integer -> Bool]. By realizing that (->) e is a monad, we can use
sequence :: Monad m => [m a] -> m [a]
which in our case will get specialized to type [Integer -> Bool] -> (Integer -> [Bool]). So
sequence rulesetL :: Integer -> [Bool]
will pass a value to all the rules in the list. Next, we use concatMap to apply this function to list and collect all results into a single list. Finally, calling
and :: [Bool] -> Bool
will check that all combinations returned True.
Edit: Check out dave4420's answer, it's nicer and more concise. My answer could help if you need to combine rules and apply them later to some lists. In particular
liftM and . sequence :: [a -> Bool] -> (a -> Bool)
combines several rules into one. You can also extend it to other similar combinators like using or etc. Realizing that rules are values of (->) a monad can give you other useful combinators, such as:
andRules = liftM2 (&&) :: (a -> Bool) -> (a -> Bool) -> (a -> Bool)
orRules = liftM2 (||) :: (a -> Bool) -> (a -> Bool) -> (a -> Bool)
notRule = liftM not :: (a -> Bool) -> (a -> Bool)
-- or just (not .)
etc. (don't forget to import Control.Monad.Reader).
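For reference, the same combinators can be sketched with base alone, since the Functor, Applicative and Monad instances for functions are provided by base in current GHC versions. The names `allRules` and `anyRules` are hypothetical; `andRules` follows the naming above but uses liftA2 instead of liftM2.

```haskell
import Control.Applicative (liftA2)

-- Combine a list of rules into one rule that requires all of them.
allRules :: [a -> Bool] -> a -> Bool
allRules = fmap and . sequence

-- Combine a list of rules into one rule that requires at least one.
anyRules :: [a -> Bool] -> a -> Bool
anyRules = fmap or . sequence

-- Pairwise conjunction of two rules.
andRules :: (a -> Bool) -> (a -> Bool) -> a -> Bool
andRules = liftA2 (&&)
```

A combined rule can then be reused on any list, for example `all (allRules ruleset) list`.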
An easier-to-understand version (without using Control.Applicative):
satisfyAll elems ruleset = and $ map (\x -> all ($ x) ruleset) elems
Personally, I like this way of writing the function, as the only combinator it uses explicitly is and:
allOkay ruleset items = and [rule item | rule <- ruleset, item <- items]

How to avoid spaceleak in multiple list traversals?

Is GHC intelligent enough to run multiple operations on lists in 'semi-parallel'?
Consider this (simplified) code:
findElements bigList = do
    let special = head . filter isSpecial $ bigList
    let others = filter isSpecialOrNormal $ bigList
    return (special, others)
(Monad due to original code)
I guess GHC will run the first list operation and will keep all elements in memory so that the second operation is able to work on them.
My problem is that I am running into a space leak when dealing with larger files, although I believe this should be able to run in constant space. Is there a way to achieve this?
Update 1
Having written it down like this the solution to this problem of course is to change the order of the two lines.
But my question remains: is GHC intelligent enough to figure out this semi-parallel processing when it is not done in a monad?
I don't think GHC is smart enough to merge these two traversals, or, as is usually the case, GHC could be smart enough, but there are cases where you don't want this behavior, so GHC doesn't do it.
Here's how I would do it, using monoids and foldMap.
import Data.Monoid
import Data.Foldable
First, here's how to write special with foldMap, using the First monoid.
specialF :: a -> First a
specialF a = First $ if isSpecial a then Just a else Nothing
special :: [a] -> a
special as = let (First (Just s)) = foldMap specialF as in s
And similar for specialOrNormal, using the list monoid.
specialOrNormalF :: a -> [a]
specialOrNormalF a = if isSpecialOrNormal a then [a] else []
specialOrNormal :: [a] -> [a]
specialOrNormal = foldMap specialOrNormalF
One neat thing about monoids is that a tuple of monoids is also a monoid, which makes merging these folds easy:
findElements :: [a] -> (a, [a])
findElements bigList =
    let (First (Just s), son) =
            foldMap (\a -> (specialF a, specialOrNormalF a)) bigList
    in (s, son)
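A self-contained sketch of the merged fold follows. The two predicates are placeholders, since the question does not define them, and the element type is fixed to Int for the example. Unlike the version above, it returns a Maybe instead of failing when no special element exists. Whether it actually avoids the leak depends on how the result is consumed.

```haskell
import Data.Monoid (First (..))

-- Hypothetical predicates standing in for the question's undefined ones.
isSpecial :: Int -> Bool
isSpecial = (> 100)

isSpecialOrNormal :: Int -> Bool
isSpecialOrNormal = (> 10)

-- One foldMap over the list builds both results, because a pair of
-- monoids is itself a monoid.
findElements :: [Int] -> (Maybe Int, [Int])
findElements bigList = (getFirst special, others)
  where
    (special, others) = foldMap classify bigList
    classify a =
      ( First (if isSpecial a then Just a else Nothing)
      , [a | isSpecialOrNormal a]
      )
```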
And if you like point-free code, you can write the whole thing like this:
-- also needs Control.Arrow (first, (&&&)), Control.Monad (mfilter) and Data.Maybe (fromJust)
findElements :: [a] -> (a, [a])
findElements =
    first (fromJust . getFirst) .
    foldMap
        ( First . mfilter isSpecial . return
          &&& mfilter isSpecialOrNormal . return
        )

Simple word count in haskell

This is my FIRST Haskell program! "wordCount" takes in a list of words and returns a list of tuples, pairing each case-insensitive word with its usage count. Any suggestions for improvement on either code readability or performance?
import List;
import Char;
uniqueCountIn ns xs = map (\x -> length (filter (==x) xs)) ns
nubl (xs) = nub (map (map toLower) xs) -- to lowercase
wordCount ws = zip ns (uniqueCountIn ns ws)
    where ns = nubl ws
Congrats on your first program!
For cleanliness: lose the semicolons. Use the new hierarchical module names instead (Data.List, Data.Char). Add type signatures. As you get more comfortable with function composition, eta contract your function definitions (remove rightmost arguments). e.g.
nubl :: [String] -> [String]
nubl = nub . map (map toLower)
If you want to be really rigorous, use explicit import lists:
import Data.List (nub)
import Data.Char (toLower)
For performance: use a Data.Map to store the associations instead of nub and filter. In particular, see fromListWith and toList. Using those functions you can simplify your implementation and improve performance at the same time.
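A minimal sketch of that suggestion, assuming the containers package (which ships with GHC) is available. It lowercases each word before counting, so the counts themselves are case-insensitive:

```haskell
import Data.Char (toLower)
import qualified Data.Map.Strict as Map

-- Count occurrences of each word, ignoring case.
-- fromListWith (+) adds up the 1s of equal keys; toList returns
-- the pairs in ascending key order.
wordCount :: [String] -> [(String, Int)]
wordCount ws = Map.toList (Map.fromListWith (+) [(map toLower w, 1) | w <- ws])
```

This takes O(n log n) comparisons instead of the quadratic behaviour of nub combined with filter.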
One of the ways to improve readability is to try to get used to the standard functions. Hoogle is one of the tools that sets Haskell apart from the rest of the world ;)
import Data.Char (toLower)
import Data.List (sort, group)
import Control.Arrow ((&&&))
wordCount :: String -> [(String, Int)]
wordCount = map (head &&& length) . group . sort . words . map toLower
EDIT: Explanation: think of it as a chain of mappings:
(map toLower) :: String -> String lowercases the entire text, for the purpose of case insensitivity
words :: String -> [String] splits a piece of text into words
sort :: Ord a => [a] -> [a] sorts
group :: Eq a => [a] -> [[a]] gathers adjacent identical elements into lists, for example, group [1,1,2,3,3] -> [[1,1],[2],[3,3]]
&&& :: (a -> b) -> (a -> c) -> (a -> (b, c)) applies two functions to the same piece of data, then returns the tuple of results. For example: (head &&& length) ["word","word","word"] -> ("word", 3) (actually &&& is a little more general, but the simplified explanation works for this example)
EDIT: Or actually, look for the "multiset" package on Hackage.
It is always good to ask more experienced developers for feedback. Nevertheless, you could use hlint to get feedback on some small-scale issues. It'll tell you about hierarchical imports, unnecessary parentheses, alternative higher-order functions, etc.
Regarding the function nubl: if you don't follow luqui's advice to remove the parameter altogether yet, I would at least remove the parentheses around xs on the left side of the equation.

Resources