I'm new to Haskell and trying to implement some genetic algorithms.
Currently I am failing to select the n best elements from a list of individuals (where each individual is itself a list).
An individual is created as follows:
ind1 :: [Int]
ind1 = [1, 1, 1, 1, 1, 1, 1]
ind2 :: [Int]
ind2 = [0, 0, 0, 0, 0, 0, 0]
The appropriate population consists of a list of those individuals:
pop :: [[Int]]
pop = [ind1, ind2]
What I want to achieve is to get the best n individuals of the population, where the "best" is determined by the sum of its elements, e.g.,
> sum ind1
7
> sum ind2
0
I started creating a function for creating tuples with individual and its quality:
f x = [(ind, sum ind) | ind <- x]
so at least I got something like this:
[([1, 1, 1, 1, 1, 1, 1], 7), ([0, 0, 0, 0, 0, 0, 0], 0)]
How do I get from here to the expected result? I do not even manage to get the "fst" of the tuple where "snd == max".
I started with recursive approaches as seen in different topics, but unfortunately without reasonable result.
Any suggestions, probably also where to read?
Thank you!
The best choice here is to use sortBy from Data.List:
sortBy :: (a -> a -> Ordering) -> [a] -> [a]
The sortBy function is higher-order, so it takes a function as one of its arguments. The function it needs is one that takes two elements and returns an Ordering value (LT, EQ or GT). You can write your own custom comparison function, but the Data.Ord module has comparing, which exists to help with writing these comparison functions:
comparing :: Ord b => (a -> b) -> (a -> a -> Ordering)
Hopefully you can see how comparing pairs with sortBy: you pass it a function that converts your type to a known comparable type, and you get back a function of the right type to pass to sortBy. So in practice you can do
import Data.List (sortBy)
import Data.Ord (comparing)
-- Some types to make things more readable
type Individual = [Int]
type Fitness = Int
-- Here's our fitness function (change as needed)
fitness :: Individual -> Fitness
fitness = sum
-- Redefining so it can be used with `map`
f :: Individual -> (Individual, Fitness)
f ind = (ind, fitness ind)
-- If you do want to see the fitness of the top n individuals
solution1 :: Int -> [Individual] -> [(Individual, Fitness)]
solution1 n inds = take n $ sortBy (flip $ comparing snd) $ map f inds
-- If you just want the top n individuals
solution2 :: Int -> [Individual] -> [Individual]
solution2 n inds = take n $ sortBy (flip $ comparing fitness) inds
The flip in the arguments to sortBy forces the sort to be descending instead of the default ascending, so the first n values returned from sortBy will be the n values with the highest fitness in descending order. If you wanted to try out different fitness functions then you could do something like
fittestBy :: (Individual -> Fitness) -> Int -> [Individual] -> [Individual]
fittestBy fit n = take n . sortBy (flip $ comparing fit)
Then you'd have
solution2 = fittestBy sum
But you could also have
solution3 = fittestBy product
if you wanted to change your fitness function to be the product rather than the sum.
Use sortBy and on.
> take 2 $ sortBy (flip compare `on` sum) [[1,2],[0,4],[1,1]]
[[0,4],[1,2]]
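As an alternative sketch, assuming a reasonably recent GHC base library (which provides Data.List.sortOn and Data.Ord.Down), the descending sort can also be written without flip. The name topN is made up for this example and does not come from the answers above.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Hypothetical helper: the n individuals with the highest sum, best first.
-- Wrapping the key in Down reverses its ordering, so sortOn sorts descending.
topN :: Int -> [[Int]] -> [[Int]]
topN n = take n . sortOn (Down . sum)
```

With this definition, topN 2 [[1,2],[0,4],[1,1]] evaluates to [[0,4],[1,2]], the same result as the sortBy version. sortOn also computes each sum only once, which may matter for an expensive fitness function.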
On Hackage I see that groupBy's implementation is this:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
  where (ys,zs) = span (eq x) xs
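which means that the predicate eq is tested between the first element of each group and every later element of that group (so it holds between any two elements of a group only if eq is an equivalence relation). Examples: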
> difference_eq_1 = ((==1).) . flip (-)
> first_isnt_newline = ((/= '\n').) . const
>
> Data.List.groupBy difference_eq_1 ([1..10] ++ [11,13..21])
[[1,2],[3,4],[5,6],[7,8],[9,10],[11],[13],[15],[17],[19],[21]]
>
> Data.List.groupBy first_isnt_newline "uno\ndue\ntre"
["uno\ndue\ntre"]
What if instead I want to group elements such that the predicate holds between any pair of consecutive elements, so that the above results would be as follows?
[[1,2,3,4,5,6,7,8,9,10,11],[13],[15],[17],[19],[21]]
["uno\n","due\n","tre"]
I wrote it myself, and it looks a bit ugly
groupBy' :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy' p = foldr step []
  where step elem [] = [[elem]]
        step elem gs'@((g'@(prev:g)):gs)
          | elem `p` prev = (elem:g'):gs
          | otherwise = [elem]:gs'
So I was wondering whether such a function already exists and I just haven't found it.
As regards the second usage, Data.List.groupBy first_isnt_newline, where the binary predicate basically ignores the second argument and applies a unary predicate to the first, I've just found that Data.List.HT.segmentAfter unary_predicate does the job, where unary_predicate is the negation of the unary predicate in which const's output is forwarded. In other words Data.List.groupBy ((/= '\n').) . const === Data.List.HT.segmentAfter (=='\n').
There is a groupBy package that does exactly that.
But here’s another way of implementing it:
Zip the list with its tail to test the predicate on adjacent elements
Generate a “group index” by scanning the result and incrementing the group whenever the predicate is false
Group by the index
Remove the indices
import Data.Function (on)
import Data.List (groupBy, scanl')
groupByAdjacent :: (a -> a -> Bool) -> [a] -> [[a]]
groupByAdjacent p xs
  = fmap (fmap fst)
  $ groupBy ((==) `on` snd)
  $ zip xs
  $ scanl' (\ g (a, b) -> if p a b then g else succ g) 0
  $ zip xs
  $ drop 1 xs
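For an input like [1, 2, 3, 10, 11, 20, 30] and a predicate that holds when adjacent elements differ by exactly 1 (such as difference_eq_1 from the question), the predicate will return [True, True, False, True, False, False] and the resulting group indices will be [0, 0, 0, 1, 1, 2, 3].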
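The intermediate values in that example can be reproduced with a small sketch. The helper names adjacentTests, groupIndices and stepsByOne are hypothetical, and the predicate is assumed to be "the right neighbour is exactly one larger".

```haskell
import Data.List (scanl')

-- Hypothetical helper: test the predicate on every pair of neighbours.
adjacentTests :: (a -> a -> Bool) -> [a] -> [Bool]
adjacentTests p xs = zipWith p xs (drop 1 xs)

-- Hypothetical helper: start at group 0 and bump the index at every failed test.
-- Note: this yields [0] even for an empty input; zipping with the input drops it.
groupIndices :: (a -> a -> Bool) -> [a] -> [Int]
groupIndices p = scanl' bump 0 . adjacentTests p
  where
    bump g ok = if ok then g else g + 1

-- Assumed predicate: the right neighbour is exactly one larger.
stepsByOne :: Int -> Int -> Bool
stepsByOne a b = b - a == 1
```

Here adjacentTests stepsByOne [1,2,3,10,11,20,30] gives [True,True,False,True,False,False] and groupIndices stepsByOne on the same list gives [0,0,0,1,1,2,3].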
The scan can also be written pointfree as scanr (bool succ id . uncurry p) 0, since the scan direction doesn’t matter (although the group indices will be reversed). The group index might be handy or just more readable to keep as an integer, but it could be a Bool instead, because the minimum size of a group is 1: the functional argument of the scan would be bool not id . uncurry p, which can be simplified to (==) . uncurry p. And several of these parts could be factored into reusable functions, like zipNext = zip <*> drop 1, but I’ve inlined them for simplicity’s sake.
Given a matrix like this
matrix_table =
[[ 0, 0, 0, 0]
,[ 0, 0, 0, 0]
,[ 0, 0, 0, 0]
,[ 0, 0, 0, 0]
]
and a list position_list = [2, 3, 2, 10]
output of a function
distribute_ones :: [[Int]] -> [Int] -> [[Int]]
distribute_ones matrix_table position_list
should look like this
[[ 0, 1, 0, 1] -- 2 '1's in the list
,[ 0, 1, 1, 1] -- 3 '1's in the list
,[ 0, 1, 0, 1] -- 2 '1's in the list
,[ 1, 1, 1, 1] -- Since 10 > 4, all '1's in the list
]
What I tried:
I generated list of lists, the base matrix with
replicate 4 (replicate 4 0)
then divided inner lists with chunksOf from Data.List.Split library to make cut-outs of 4 - (position_list !! nth).
Finally appending and concatenating with 1 like this
take 4 . concat . map (1 :)
Although I think it's not exactly the best approach.
Is there a better way of doing that?
For evenly distributing elements, I recommend Bjorklund's algorithm. Bjorklund's algorithm takes two sequences to merge, and repeatedly:
Merges as much of the prefix of the two as it can, taking one from each, then
recursively calls itself with the merged elements as one sequence and the leftovers from the longer input as the other sequence.
In code:
bjorklund :: [[a]] -> [[a]] -> [a]
bjorklund xs ys = case zipMerge xs ys of
    ([], leftovers) -> concat leftovers
    (merged, leftovers) -> bjorklund merged leftovers
zipMerge :: [[a]] -> [[a]] -> ([[a]], [[a]])
zipMerge [] ys = ([], ys)
zipMerge xs [] = ([], xs)
zipMerge (x:xs) (y:ys) = ((x++y):merged, leftovers) where
    ~(merged, leftovers) = zipMerge xs ys
Here are some examples in ghci:
> bjorklund (replicate 2 [1]) (replicate 2 [0])
[1,0,1,0]
> bjorklund (replicate 5 [1]) (replicate 8 [0])
[1,0,0,1,0,1,0,0,1,0,0,1,0]
If you like, you could write a small wrapper that takes just the arguments you care about.
ones len numOnes = bjorklund
    (replicate (len - numOnes) [0])
    (replicate (min len numOnes) [1])
In ghci:
> map (ones 4) [2,3,2,10]
[[0,1,0,1],[0,1,1,1],[0,1,0,1],[1,1,1,1]]
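To get the signature asked for in the question, the wrapper can be lifted to a whole matrix. This is only a sketch: distributeOnesB is a made-up name, it assumes the counts are non-negative, and it ignores the existing cell values, using only each row's length. bjorklund, zipMerge and ones are repeated from above so the block is self-contained.

```haskell
bjorklund :: [[a]] -> [[a]] -> [a]
bjorklund xs ys = case zipMerge xs ys of
  ([], leftovers)     -> concat leftovers
  (merged, leftovers) -> bjorklund merged leftovers

zipMerge :: [[a]] -> [[a]] -> ([[a]], [[a]])
zipMerge [] ys = ([], ys)
zipMerge xs [] = ([], xs)
zipMerge (x:xs) (y:ys) = ((x ++ y) : merged, leftovers)
  where
    (merged, leftovers) = zipMerge xs ys

ones :: Int -> Int -> [Int]
ones len numOnes = bjorklund
  (replicate (len - numOnes) [0])
  (replicate (min len numOnes) [1])

-- Hypothetical wrapper: one count per row, paired up with zipWith.
-- Only the length of each row is used; its contents are discarded.
distributeOnesB :: [[Int]] -> [Int] -> [[Int]]
distributeOnesB = zipWith (\row count -> ones (length row) count)
```

Applied to replicate 4 (replicate 4 0) and [2,3,2,10], this produces the matrix from the question.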
Here's an alternate algorithm to distribute itemCount items across rowLength cells within a single row. Initialize currentCount to 0. Then for each cell:
Add itemCount to currentCount.
If the new currentCount is less than rowLength, use the original value of the cell.
If the new currentCount is at least rowLength, subtract rowLength, and increment the value of the cell by one.
This algorithm produces the output you expect from the input you provide.
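Before introducing any monad, the steps above can be sketched with mapAccumL from Data.List, which threads an accumulator through a list. The name distributeRowAccum is hypothetical, and the sketch uses mod rather than a single subtraction so that counts larger than the row length keep the running count bounded.

```haskell
import Data.List (mapAccumL)

-- Hypothetical helper: distribute itemCount increments across one row.
-- The accumulator is the running count from the steps above.
distributeRowAccum :: Int -> [Int] -> [Int]
distributeRowAccum itemCount row = snd (mapAccumL step 0 row)
  where
    rowLength = length row
    step current cell
      | next >= rowLength = (next `mod` rowLength, cell + 1) -- emit: bump the cell
      | otherwise         = (next, cell)                     -- keep the cell as is
      where
        next = current + itemCount
```

For a row of four zeros, counts of 2, 3 and 10 give [0,1,0,1], [0,1,1,1] and [1,1,1,1] respectively.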
We can write the state required for this as a simple data structure:
data Distribution = Distribution { currentCount :: Int
                                 , itemCount :: Int
                                 , rowLength :: Int
                                 } deriving (Eq, Show)
At each step of the algorithm we need to know whether we're emitting an output (and incrementing the value), and what the next state value will be.
nextCount :: Distribution -> Int
nextCount d = currentCount d + itemCount d
willEmit :: Distribution -> Bool
willEmit d = (nextCount d) >= (rowLength d)
nextDistribution :: Distribution -> Distribution
nextDistribution d = d { currentCount = (nextCount d) `mod` (rowLength d) }
To keep this as running state, we can package it in the State monad (Control.Monad.State from the mtl package, which provides State, gets, modify and evalState). Then we can write the "for each cell" steps above as a single function:
distributeCell :: Int -> State Distribution Int
distributeCell x = do
  emit <- gets willEmit
  modify nextDistribution
  return $ if emit then x + 1 else x
To run this over a whole row, we can use the traverse function from the standard library. This takes some sort of "container" and a function that maps single values to monadic results, and creates a "container" of the results inside the same monad. Here the "container" type is [a] and the "monad" type is State Distribution a, so the specialized type signature of traverse is
traverse :: (Int -> State Distribution Int)
-> [Int]
-> State Distribution [Int]
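To see traverse in isolation, here is a small sketch that swaps State for Maybe, which needs nothing beyond base. The functions positive and allPositive are made up for illustration; the shape is the same as above, with Maybe standing in for State Distribution.

```haskell
-- A per-element step that may fail: accept only positive numbers.
positive :: Int -> Maybe Int
positive x
  | x > 0     = Just x
  | otherwise = Nothing

-- traverse specialised to lists and Maybe:
--   traverse :: (Int -> Maybe Int) -> [Int] -> Maybe [Int]
-- The effects of the individual steps are combined into one result.
allPositive :: [Int] -> Maybe [Int]
allPositive = traverse positive
```

allPositive [1,2,3] is Just [1,2,3], while allPositive [1,0,3] is Nothing because one step failed. In the State version, the combined effect is the state being threaded from cell to cell.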
We don't actually care about the final state, we just want the resulting [Int] out, which is what evalState does. This would produce:
distributeRow :: [Int] -> Int -> [Int]
distributeRow row count =
  evalState
    (traverse distributeCell row :: State Distribution [Int])
    (Distribution 0 count (length row))
Applying this to the whole matrix is a simple application of zipWith (given two lists and a function, call the function repeatedly with pairs of items from the two lists, returning the list of results):
distributeOnes :: [[Int]] -> [Int] -> [[Int]]
distributeOnes = zipWith distributeRow
This question is based on the 11th Advent of Code task. It is basically a more general version of the river crossing puzzle: you can go up and down floors while carrying one or two items at each step. The goal is to bring all items up to the 4th floor.
This is fairly straightforward to solve with an A* search but finding the neighboring states is somewhat annoying.
When solving the puzzle originally I just created masks for all items on the current floor and then used the list monad to generate the combinations - slow and awkward but it works. I figured that there would be an elegant solution using lenses, though.
An easy solution could use a function that returns all options of moving a single item from floor x to floor y. Is there a way to get all combinations of applying a function to a single element using lenses? i.e. f 1 2 [(1, 0), (1, 2)] = [[(2, 0), (1, 2)], [(1, 0), (2, 2)]]
For the sake of reference, this is the best I could come up with so far, slightly simplified:
import Control.Lens
import Data.List (sort)
import Data.Set (fromList, Set)
type GenFloor = Int
type ChipFloor = Int
type State = [(GenFloor, ChipFloor)]
neighborStates :: Int -> State -> Set State
neighborStates currentFloor state = finalize $ createStatesTowards =<< [pred, succ]
  where
    createStatesTowards direction = traverseOf (traverse . both) (moveTowards direction) state
    moveTowards direction i
      | i == currentFloor = [direction i, i]
      | otherwise = [i]
    finalize = fromList . map sort . filter valid
    valid = (&&) <$> validCarry <*> validFloors
    validCarry = (`elem` [1..2]) . carryCount
    carryCount = length . filter (uncurry (/=)) . zip state
    validFloors = allOf (traverse . each) (`elem` [1..4])
An easy solution could use a function that returns all options of moving a single item from floor x to floor y. Is there a way to get all combinations of applying a function to a single element using lenses? i.e. f 1 2 [(1, 0), (1, 2)] = [[(2, 0), (1, 2)], [(1, 0), (2, 2)]]
holesOf can do that. Quoting the relevant simplified signature from the documentation:
holesOf :: Traversal' s a -> s -> [Pretext' (->) a s]
Given a traversal, holesOf will generate a list of contexts focused on each element targeted by the traversal. peeks from Control.Comonad.Store can then be used to, from each context, modify the focused target and recreate the surrounding structure:
import Control.Lens
import Control.Comonad.Store
-- allMoves :: Int -> Int -> State -> [State]
allMoves :: (Traversable t, Eq a) => a -> a -> t (a, a) -> [t (a, a)]
allMoves src dst its = peeks (changeFloor src dst) <$> holesOf traverse its
  where
    -- changeFloor :: Int -> Int -> (Int, Int) -> (Int, Int)
    -- `both` targets both components of the pair, so they must share one type.
    changeFloor s d = over both (\x -> if x == s then d else x)
GHCi> allMoves 1 2 [(1,0),(1,2)]
[[(2,0),(1,2)],[(1,0),(2,2)]]
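For comparison, the same "apply a function to exactly one element at a time" idea can be sketched for plain lists without lens, using inits and tails from Data.List. The names eachSingle and moveItem are hypothetical; this is not how holesOf is implemented, only a list-specific stand-in.

```haskell
import Data.List (inits, tails)

-- Hypothetical helper: every way of applying f to exactly one element.
-- Each (before, x : after) pair splits the list around one focused element.
eachSingle :: (a -> a) -> [a] -> [[a]]
eachSingle f xs =
  [ before ++ f x : after
  | (before, x : after) <- zip (inits xs) (tails xs)
  ]

-- The same move as changeFloor, written without lenses.
moveItem :: Int -> Int -> (Int, Int) -> (Int, Int)
moveItem src dst (g, c) = (swap g, swap c)
  where
    swap x = if x == src then dst else x
```

eachSingle (moveItem 1 2) [(1,0),(1,2)] gives [[(2,0),(1,2)],[(1,0),(2,2)]], matching the GHCi output above. The holesOf version has the advantage of working for any Traversable, not just lists.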
Given a list (for instance [1,2,2,3,3,4,5,6]) how is it possible to group and count them according to bins/range? I would like to be able to specify a range, so that:
Say range=2, and using the previous list, would give me [1, 4, 2, 1], given that there's 1 0's or 1's, 4 2's or 3's, 2 4's or 5's and 1 6's or 7's.
Say range=4, and using the previous list, would give me [5, 3], given that there's 5 0's or 1's or 2's or 3's, 3 4's or 5's or 6's or 7's.
I have looked into group and groupBy but not found appropriate predicates, and also the histogram-fill library. The latter seems very nice to create bins, but I could not find out how to load data into those bins.
How can I achieve this?
My attempt on one of the suggestions below:
import Data.List
import Data.Function
quantize range n = n `div` range
main = print (groupBy ((==) `on` quantize 4) [1,2,3,4,2])
The output is [[1,2,3],[4],[2]] when it should have been [[1,2,2,3],[4]]. Both suggestions below work only on sorted lists, so sorting the input first fixes it:
main = print (groupBy ((==) `on` quantize 4) (sort [1,2,3,4,2]))
You can achieve this using the groupBy and div functions. Let's say we have a range N. Integer division (div) by N maps every run of N consecutive numbers that starts at a multiple of N to the same value. For example, with N=3: 0 div 3 = 0, 1 div 3 = 0, 2 div 3 = 0, 3 div 3 = 1, 4 div 3 = 1, 5 div 3 = 1, 6 div 3 = 2.
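One detail worth checking if negative values can occur: div rounds toward negative infinity, while quot rounds toward zero, so only div keeps the bins uniform across zero. A minimal sketch, with the hypothetical helper name binIndex:

```haskell
-- Hypothetical helper: the bin index of n for a given bin width.
-- div rounds toward negative infinity, so -1 falls into bin -1, not bin 0.
binIndex :: Int -> Int -> Int
binIndex width n = n `div` width
```

map (binIndex 3) [0..6] gives [0,0,0,1,1,1,2], and binIndex 3 (-1) is -1, whereas (-1) `quot` 3 would be 0 and merge the values from -2 to 2 into one oversized bin.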
Knowing this, we can look at groupBy :: (a -> a -> Bool) -> [a] -> [[a]] and use the function:
sameGroup :: Integral a => a -> a -> a -> Bool
sameGroup range a b = a `div` range == b `div` range
To write our own grouping function
groupings :: Integral a => a -> [a] -> [[a]]
groupings range = groupBy (sameGroup range)
Which should look something like groupings 2 [1, 2, 2, 3, 3, 4, 5, 6] == [[1], [2, 2, 3, 3], [4, 5], [6]]. Now we just have to count it to have the final function
groupAndCount :: Integral a => a -> [a] -> [Int]
groupAndCount range list = map length $ groupings range list
Which should mirror the wanted behavior.
You'll need to quantize in order to get the definitions of the bins.
-- `quantize range n` rounds n down to the nearest multiple of range
quantize :: Int -> Int -> Int
groupBy takes a "predicate" argument*, which identifies whether two items should be placed in the same bin. So:
groupBy (\n m -> quantize range n == quantize range m) :: [Int] -> [[Int]]
will group elements by whether they are in the same bin, without changing the elements. If range is 2, that will give you something like
[[1],[2,2,3,3],[4,5],[6]]
Then you just have to take the length of each sublist.
* There's a neat function called on which allows you to write the predicate more succinctly
groupBy ((==) `on` quantize range)
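Putting the pieces together, a complete version might look like the sketch below. The body of quantize is an assumption based on the comment above (the answer gives only its signature), binCounts is a made-up name, and the input is sorted first because groupBy only merges adjacent elements.

```haskell
import Data.Function (on)
import Data.List (groupBy, sort)

-- Assumed definition: round n down to the nearest multiple of range.
quantize :: Int -> Int -> Int
quantize range n = n - n `mod` range

-- Hypothetical helper: sort, group by bin, then count each group.
-- Bins that contain no elements do not appear in the result.
binCounts :: Int -> [Int] -> [Int]
binCounts range = map length . groupBy ((==) `on` quantize range) . sort
```

binCounts 2 [1,2,2,3,3,4,5,6] gives [1,4,2,1] and binCounts 4 on the same list gives [5,3], as requested in the question. If empty bins need to show up as zeros, a different approach (for example counting into a map keyed by bin) would be needed.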
I'm just learning Haskell and am kind of stuck.
I'd like to compare list elements and measure the difference between them and return the highest one.
Unfortunately, I do not know how to approach that problem.
Usually, I'd just iterate over the list and compare the neighbours, but that does not seem to be the way to go in Haskell.
I already tried using map but as I said I do not really know how you can solve that problem.
I'd be thankful for every kind of advice!
Best wishes
Edit: My idea is to first zip all pairs like this: pairs a = zip a (tail a). Then I'd like to get all differences (maybe with map?) and then just choose the highest one. I just can't handle the Haskell syntax.
I don't know what you mean by "measure the difference" between list elements, but if you want to calculate the "largest" element in a list, you'd use the built-in maximum function:
maximum :: Ord a => [a] -> a
This function takes a list of values that can be ordered, so all numbers, chars, and strings, among others.
If you want to get the difference between the maximum value and the minimum value, you can use the similar function minimum, then just subtract the two. Sure, there might be a slightly faster solution whereby you only traverse the list once, or you could sort the list then take the first and last elements, but for most cases doing diff xs = maximum xs - minimum xs is plenty fast enough and makes the most sense to someone else.
So what you want to do is compute the difference between successive elements, not the difference between the minimum and maximum of the whole list. You don't need to index directly, but rather use a handy function called zipWith. It takes a binary operation and two lists, and "zips" them together using that binary operation. So something like
zipWith (+) [1, 2, 3] [4, 5, 6] = [1 + 4, 2 + 5, 3 + 6] = [5, 7, 9]
It is rather handy because if one of the lists runs out early, it just stops there. So you could do something like
diff xs = zipWith (-) xs ???
But how do we offset the list by 1? Well, the easy (and safe) way is to use drop 1. You could use tail, but tail is partial: tail [] is an error (zipWith happens not to force it when xs is empty, but that is easy to get wrong elsewhere), whereas drop 1 [] is simply []:
diff xs = zipWith (-) xs $ drop 1 xs
So an example would be
diff [1, 2, 3, 4] = zipWith (-) [1, 2, 3, 4] $ drop 1 [1, 2, 3, 4]
                  = zipWith (-) [1, 2, 3, 4] [2, 3, 4]
                  = [1 - 2, 2 - 3, 3 - 4]
                  = [-1, -1, -1]
This function will return positive and negative values, and we're interested only in the magnitude, so we can then use the abs function:
maxDiff xs = ??? $ map abs $ diff xs
And then using the function I highlighted above:
maxDiff xs = maximum $ map abs $ diff xs
And you're done! If you want to be fancy, you could even write this in point-free notation as
maxDiff = maximum . map abs . diff
Now, this will in fact raise an error on an empty list because maximum [] throws an error, but I'll let you figure out a way to solve that.
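One possible way to handle that case, offered as a sketch rather than the intended answer, is to return a Maybe so that lists with fewer than two elements yield Nothing. The name maxDiffSafe is hypothetical.

```haskell
-- Hypothetical total variant of maxDiff: Nothing when there are no
-- neighbouring pairs, i.e. for empty and single-element lists.
maxDiffSafe :: (Num a, Ord a) => [a] -> Maybe a
maxDiffSafe xs = case map abs (zipWith (-) xs (drop 1 xs)) of
  []    -> Nothing
  diffs -> Just (maximum diffs)
```

For example, maxDiffSafe [1,5,2,10] is Just 8, and maxDiffSafe [3] is Nothing. Returning 0 for short lists would also work, but it cannot be told apart from a list of equal elements.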
As mentioned by bheklilr, maximum is the quick and easy solution.
If you want some of the background though, here's a bit. What we're trying to do is take a list of values and reduce it to a single value. This is known as a fold, and is possible with (among others) the foldl function, which has the signature foldl :: (a -> b -> a) -> a -> [b] -> a.
The (a -> b -> a) section of foldl is a function which takes two values and returns one of the first type. In our case, this should be our comparison function:
myMax :: Ord a => a -> a -> a
myMax x y | x > y = x
          | otherwise = y
(note that Ord a is required so that we can compare our values).
So, we can say
-- This doesn't work!
myMaximum :: Ord a => [a] -> a
myMaximum list = foldl myMax _ list
But what is _? It doesn't make sense to have a starting value for this function, so we turn instead to foldl1, which does not require a starting value (instead it uses the first element of the list as the starting value and folds over the rest). That makes our maximum function
myMaximum :: Ord a => [a] -> a
myMaximum list = foldl1 myMax list
or, in pointfree format,
myMaximum :: Ord a => [a] -> a
myMaximum = foldl1 myMax
If you look at the actual definition of maximum in Data.List, you'll see it uses this same method.
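Like maximum, foldl1 fails on an empty list. To make the role of the first element explicit, here is a sketch of a total variant that seeds foldl with the head and returns Nothing for an empty list. The name myMaximumSafe is made up; myMax is repeated so the block stands alone.

```haskell
myMax :: Ord a => a -> a -> a
myMax x y
  | x > y     = x
  | otherwise = y

-- Hypothetical total variant: the head of the list is the starting value,
-- the rest of the list is folded into it; an empty list gives Nothing.
myMaximumSafe :: Ord a => [a] -> Maybe a
myMaximumSafe []       = Nothing
myMaximumSafe (x : xs) = Just (foldl myMax x xs)
```

myMaximumSafe [3,1,4,1,5] is Just 5 and myMaximumSafe "" is Nothing.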
map maps a function over a list. It transforms each thing1 in a list to a thing2.
What you want is to find the biggest difference between two neighbours, which you can't do with map alone. I'll assume you're only looking at numbers for now, because that's just easier.
diffs :: (Num a) => [a] -> [a]
diffs [] = []
diffs [x] = []
diffs (x1:x2:xs) = abs (x1 - x2) : diffs (x2:xs)
mnd :: (Num a, Ord a) => [a] -> a
mnd [] = 0
mnd [x] = 0
mnd xs = maximum$diffs xs
So diffs takes each list item one at a time and gets the absolute difference between it and its neighbour, then puts that at the front of a list it builds as it goes along (the : operator puts an individual element at the front of a list).
mnd is just a wrapper around maximum$diffs xs that stops exceptions from being thrown for empty and single-element lists.