I have array of x,y coordinates and char elements. I need function to get list of n chars at specified coordinates toward direction. I wrote it in recurrent way:
import Data.Array
exampleArray = array ((1,1),(2,2)) [((1,1),'a'), ((1,2),'b'), ((2,1),'c'), ((2,2),'d')]
-- 1 2
-- +---+---+
-- 1 | a | b |
-- +---+---+
-- 2 | c | d |
-- +---+---+
f :: Array (Int, Int) Char -> Int -> (Int, Int) -> (Int, Int) -> [Char]
f _ 0 _ _ = []
f arr n (x,y) (dirX, dirY) = (arr ! (y,x)) : f arr (n-1) (x+dirX,y+dirY) (dirX, dirY)
-- ac
list = f exampleArray 2 (1,1) (0,1) -- get 2 chars from array at 1,1 toward E
-- cb
list2 = f exampleArray 2 (1,2) (1,(-1)) -- get 2 chars from array at 1,2 toward NE
It works, but it is too slow and I am wondering how to optimize it?
Array is slow, so your first step should be to port to the vector package. Use the unboxed variant, as the default "boxed" representation allocates heap storage for each value and stores a pointer to it. There is no such thing as a 2-d vector so you will have to write code to transform your coordinate pairs into locations in a 1-d vector. If that still isn't fast enough then access using the unsafeIndex function and its relatives, which don't do bounds checking (but do not call up that which you cannot put down).
See also this question.
Related
I'm programming my own matrix module for fun and practice (Time and space complexity does not matter).
Now I want to implement matrix multiplication and I am struggling with it. It probably the reason I am using Haskell and I haven't had much experience with it.
This is my data type:
data Matrix a =
M {
rows::Int,
cols::Int,
values::[a]
}
Which stores a 3x2 Matrix like this in array:
1 2
3 4
5 6
= [1,2,3,4,5,6]
I have a somewhat working transpose function
transpose::(Matrix a)->(Matrix a)
transpose (M rows cols values) = M cols rows (aux values 0 0 [])
where
aux::[a]->Int->Int->[a]->[a]
aux values row col transposed
| cols > col =
if rows > row then
aux values (row+1) col (transposed ++ [valueAtIndex (M rows cols values) (row,col)])
else aux values 0 (col+1) transposed
| otherwise = transposed
To index the elements in the array I am using this function
valueAtIndex::(Matrix a)->(Int, Int)->a
valueAtIndex (M rows cols values) (row, col)
| rows <= row || cols <= col = error "indices too large for given Matrix"
| otherwise = values !! (cols * row + col)
From my understanding, I have to get elements like this for m1: 2x3 and m2: 3x2
m1(0,0)*m2(0,0)+m1(0,1)*m2(0,1)+m1(0,2)*m2(0,2)
m1(0,0)*m2(1,0)+m1(0,1)*m2(1,1)+m1(0,2)*m2(1,2)
m1(1,0)*m2(0,0)+m1(1,1)*m2(0,1)+m1(1,2)*m2(0,2)
m1(1,0)*m2(1,0)+m1(1,1)*m2(1,1)+m1(1,2)*m2(2,2)
Now I need a function that takes two matrices, with rows m1 == cols m2 and then somehow recursively calculate the correct matrix.
multiplyMatrix::Num a=>(Matrix a)->(Matrix a)->(Matrix a)
First of all, I'm not really convinced that such linear list is a good idea. A list in Haskell is modelled as a linked list. So that means that typically accessing the k-th element, will run in O(k). So for an m×n-matrix that means it takes O(m n) in order to access the last element. By using a 2d linked list: a linked list that contains linked lists, we scale that down to O(m+n), which is typically faster. Yes there is some overhead since you use more "cons" data constructors, but the amount of traversing is typically lower. In case you really want fast access, you should use arrays, vectors, etc. But then there are other design decisions to make.
So I propose we model the matrix as:
data Matrix a = M {
rows :: Int,
cols :: Int,
values :: [[a]]
}
Now with this data constructor we can define a transpose as:
transpose' :: Matrix a -> Matrix a
transpose' (M r c as) = M c r (trans as)
where trans [] = []
trans xs = map head xs : trans (map tail xs)
(here we assume that the list of lists is always rectangular)
So now for the matrix multiplication. If A and B are two matrices, and C = A × B, then that basically means that ai, j is the dot product of the i-th row of A, and the j-th column of B. Or the i-th row of A, and the j-th row of BT (the transpose of B). We can thus define the dot product as:
dot_prod :: Num a => [a] -> [a] -> a
dot_prod xs ys = sum (zipWith (*) xs ys)
and now it is only a matter of iterating through the rows and columns, and placing the elements in the right list. Like:
mat_mul :: Num a => Matrix a -> Matrix a -> Matrix a
mat_mul (M r ca xss) m2 | ca /= ra = error "Invalid matrix shapes"
| otherwise = M r c (matmul xss)
where (M c rb yss) = transpose m2
matmul [] = []
matmul (xs:xss) = generaterow yss xs : matmul xss
generaterow [] _ = []
generaterow (ys:yss) xs = dot_prod xs ys : generaterow yss xs
I'm really struggling with Haskell atm.
It took me almost 6 hours to write a function that does what I want. Unfortunately I'm not satisfied with the look of it.
Could someone please give me any hints how to rewrite it?
get_connected_area :: Eq generic_type => [[generic_type]] -> (Int, Int) -> [(Int,Int)] -> generic_type -> [(Int,Int)]
get_connected_area habitat point area nullValue
| elem point area = area
| not ((fst point) >= 0) = area
| not ((snd point) >= 0) = area
| not ((fst point) < (length habitat)) = area
| not ((snd point) < (length (habitat!!0))) = area
| (((habitat!!(fst point))!!(snd point))) == nullValue = area
| otherwise =
let new_area = point : area
in
get_connected_area habitat (fst point+1, snd point) (
get_connected_area habitat (fst point-1, snd point) (
get_connected_area habitat (fst point, snd point+1) (
get_connected_area habitat (fst point, snd point-1) new_area nullValue
) nullValue
) nullValue
) nullValue
The function get's a [[generic_type]] (representing a landscape-map) and searches the fully connected area around a point that isn't equal to the given nullValue.
Eg.:
If the function gets called like this:
get_connected_area [[0,1,0],[1,1,1],[0,1,0],[1,0,0]] (1,1) [] 0
That literally means
0 1 0
1 1 1
0 1 0
1 0 0
Represents a map (like google maps). Start from the point (coordinates) (1,1) I want to get all coordinates of the elements that form a connected area with the given point.
The result therefore should be:
0 1 0
1 1 1
0 1 0
1 0 0
And the corresponting return value (list of coordinates of bold 1s):
[(2,1),(0,1),(1,2),(1,0),(1,1)]
One small change is that you can use pattern matching for the variable point. This means you can use (x, y) instead of point in the function declaration:
get_connected_area habitat (x, y) area nullValue = ...
Now everywhere you have fst point, just put x, and everywhere you have snd point, put y.
Another modification is to use more variables for subexpressions. This can help with the nested recursive calls. For example, make a variable for the inner-most nested call:
....
where foo = get_connected_area habitat (x, y-1) new_area nullValue
Now just put foo instead of the call. This technique can now be repeated for the "new" inner-most call. (Note that you should pick a more descriptive name than foo. Maybe down?)
Note that not (x >= y) is the same as x < y. Use this to simplify all of the conditions. Since these conditions test if a point is inside a bounding rectangle, most of this logic can be factored to a function isIn :: (Int, Int) -> (Int, Int) -> (Int, Int) -> Bool which will make get_connected_area more readable.
This would be my first quick pass through the function, and sort of the minimum that might pass a code review (just in terms of style):
getConnectedArea :: Eq a => [[a]] -> a -> (Int, Int) -> [(Int,Int)] -> [(Int,Int)]
getConnectedArea habitat nullValue = go where
go point#(x,y) area
| elem point area = area
| x < 0 = area
| y < 0 = area
| x >= length habitat = area
| y >= length (habitat!!0) = area
| ((habitat!!x)!!y) == nullValue = area
| otherwise =
foldr go (point : area)
[ (x+1, y), (x-1, y), (x, y+1), (x, y-1) ]
We bind habitat and nullValue once at the top level (clarifying what the recursive work is doing), remove indirection in the predicates, use camel-case (underdashes obscure where function application is happening), replace generic_type with a (using a noisy variable here actually has the opposite effect from the one you intended; I end up trying to figure out what special semantics you're trying to call out when the interesting thing is that the type doesn't matter (so long as it can be compared for equality)).
At this point there are lots of things we can do:
pretend we're writing real code and worry about asymptotics of treating lists as arrays (!!, and length) and sets (elem), and use proper array and set data structures instead
move your bounds checking (and possible null value checking) into a new lookup function (the goal being to have only a single ... = area clause if possible
consider improvements to the algorithm: can we avoid recursively checking the cell we just came from algorithmically? can we avoid passing area entirely (making our search nicely lazy/"productive")?
Here is my take:
import qualified Data.Set as Set
type Point = (Int, Int)
getConnectedArea :: (Point -> Bool) -> Point -> Set.Set Point
getConnectedArea habitat = \p -> worker p Set.empty
-- \p is to the right of = to keep it out of the scope of the where clause
where
worker p seen
| p `Set.member` seen = seen
| habitat p = foldr worker (Set.insert p seen) (neighbors p)
| otherwise = seen
neighbors (x,y) = [(x-1,y), (x+1,y), (x,y-1), (x,y+1)]
What I've done
foldr over the neighbors, as some commenters suggested.
Since the order of points doesn't matter, I use a Set instead of a list, so it's semantically a better fit and faster to boot.
Named some helpful intermediate abstractions such as Point and neighbors.
A better data structure for the habitat would also be good, since lists are linear time to access, maybe a 2D Data.Array—but as far as this function cares, all you need is an indexing function Point -> Bool (out of bounds and null value are treated the same), so I've replaced the data structure parameter with the indexing function itself (this is a common transformation in FP).
We can see that it would also be possible to abstract away the neighbors function and then we would arrive at a very general graph traversal method
traverseGraph :: (Ord a) => (a -> [a]) -> a -> Set.Set a
in terms of which you could write getConnectedArea. I recommend doing this for educational purposes—left as an exercise.
EDIT
Here's an example of how to call the function in terms of (almost) your old function:
import Control.Monad ((<=<))
-- A couple helpers for indexing lists.
index :: Int -> [a] -> Maybe a
index _ [] = Nothing
index 0 (x:_) = x
index n (_:xs) = index (n-1) xs
index2 :: (Int,Int) -> [[a]] -> Maybe a
index2 (x,y) = index x <=< index y
-- index2 uses Maybe's monadic structure, and I think it's quite pretty.
-- But if you're not ready for that, you might prefer
index2' (x,y) xss
| Just xs <- index y xss = index x xs
| otherwise = Nothing
getConnectedArea' :: (Eq a) => [[a]] -> Point -> a -> [a]
getConnectedArea' habitat point nullValue = Set.toList $ getConnectedArea nonnull point
where
nonnull :: Point -> Bool
nonnull p = case index2 p habitat of
Nothing -> False
Just x -> x /= nullValue
OK i will try to simplify your code. However there are already good answers and that's why i will tackle this with a slightly more conceptual approach.
I believe you could chose better data types. For instance Data.Matrix seems to provide an ideal data type in the place of your [[generic_type]] type. Also for coordinates i wouldn't chose a tuple type since tuple type is there to pack different types. It's functor and monad instances are not very helpful when it is chosen as a coordinate system. Yet since it seems Data.Matrix is just happy with tuples as coordinates i will keep them.
OK your rephrased code is as follows;
import Data.Matrix
gca :: Matrix Int -> (Int, Int) -> Int -> [(Int,Int)]
gca fld crd nil = let nbs = [id, subtract 1, (+1)] >>= \f -> [id, subtract 1, (+1)]
>>= \g -> return (f,g)
>>= \(f,g) -> return ((f . fst) crd, (g . snd) crd)
in filter (\(x,y) -> fld ! (x,y) /= nil) nbs
*Main> gca (fromLists [[0,1,0],[1,1,1],[0,1,0],[1,0,0]]) (2,2) 0
[(2,2),(2,1),(2,3),(1,2),(3,2)]
The first thing to note is, the Matrix data type is index 1 based. So we have our center point at (2,2).
The second is... we have a three element list of functions defined as [id, subtract 1, (+1)]. The contained functions are all Num a => a -> a type and i need them to define the surrounding pixels of the given coordinate including the given coordinate. So we have a line just like if we did;
[1,2,3] >>= \x -> [1,2,3] >>= \y -> return [x,y] would result [[1,1],[1,2],[1,3],[2,1],[2,2],[2,3],[3,1],[3,2],[3,3]] which, in our case, would yield a 2 combinations of all functions in the place of the numbers 1,2 and 3.
Which then we apply to our given coordinate one by one with a cascading instruction
>>= \[f,g] -> return ((f . fst) crd, (g . snd) crd)
which yields all neighboring coordinates.
Then its nothing more than filtering the neighboring filters by checking if they are not equal to the nil value within out matrix.
This is a homework assignment first off. We are given a newtype Matrix which is the professor's implementation of an abstract Matrix. My main issue is how do you create a list of type Matrix. The first function fillWith which I need to implement takes a tuple which is (number of rows, number of columns) of the matrix to create and the data to place at each index.
module Matrix (Matrix, fillWith, fromRule, numRows, numColumns,
at, mtranspose, mmap, add, mult)
where
-- newtype is like "data", but has some efficiency advantages
newtype Matrix a = Mat ((Int,Int),(Int,Int) -> a)
--fillWith :: (Int,Int) -> a -> (Matrix a)
--fillWith ix val = Mat ((,
--trying to create a row of type Matrix
row_matrix :: [a] -> Matrix a
row_matrix ls = Mat ((1, length ls), (\x y -> if x > 1 then undefined else ls !! (y-1)))
You are just missing some parens and a comma:
row_matrix ls = Mat ((1, length ls), (\(x,y) -> if x > 1 then undefined else ls !! (y-1)))
^-^-^--- add these
Mat is a tuple where:
the first element is itself a tuple, and
the second element is a function taking a tuple to a value of type a
You may always use a where or let clause to simplify the construction of values, e.g.:
row_matrix as = Mat (bounds, accessor)
where bounds = (1, length as)
accessor (i,j) = if i > 1 then undefined else as !! (j-1)
Using a where clause makes the code a lot more readable.
To implement fillWith I would follow the same recipe:
fillWith bounds val = Mat (bounds, accessor)
where accessor (i,j) = ???
I think it's obvious now what the ??? should be.
I have the following tuple representing a 2D matrix in Haskell
let a =[(1,2,3),(4,5,6),(7,8,9)]
How can I access each index individually? (e.g. a[1][1], a[0][1] etc.)
Is there a better way to interpret 2D arrays in haskell?
Here's an example of how to create and index an immutable 2D array using the standard Data.Array module:
Prelude> import Data.Array
Prelude Data.Array> let a = array ((0,0),(2,2)) [((i,j),3*i+j)| i <- [0..2], j <- [0..2]]
Prelude Data.Array> a ! (1,1)
4
More information can be found on the Haskell Wiki.
If you're going to be doing this a lot --- working with matrices, arrays, etc. --- then it's probably best to follow one of Mikhail's suggestions.
If you're simply curious about how to go about doing this though, it basically comes down to pattern matching. One thing you can do is use the !! function to get a zero-indexed element from a list (the row in this case) and then you would have to pattern match to get the specific element from the tuple.
For example, in the following code, getRow fetches the specific row using !!, and then getElem returns the particular tuple element, so that ultimately getElem a 1 1 == 5 for example. You would of course have to add some code to handle out-of-bounds indices:
getRow :: [(Integer, Integer, Integer)] -> Int -> (Integer, Integer, Integer)
getRow matrix row = matrix !! (row :: Int)
getElem :: [(Integer, Integer, Integer)] -> Int -> Int -> Integer
getElem matrix row column
| column == 0 = x
| column == 1 = y
| column == 2 = z
where (x, y, z) = getRow matrix row
I'm loading an RGB image from disk with JuicyPixels-repa. Unfortunately the Array representation of the image is Array F DIM3 Word8 where the inner dimension is the RGB pixels. That's a bit incompatibe with existing repa imageprocessing algorithms where an RGB image is Array U DIM2 (Word8, Word8, Word8).
I want to calculate the RGB histograms of the image, I'm searching a function with the Signature:
type Hist = Array U DIM1 Int
histogram:: Array F DIM3 Word8 -> (Hist, Hist, Hist)
how can I fold my 3d array to get a 1d array for each colorchannel?
Edit:
The main problem is not that I'm not able to convert from DIM3 to DIM2 for each channel (easy done with slicing). The problem is that I have to iterate the source image DIM2 or DIM3 and have to accumulate to an DIM1 array of a different Shape (Z:.256) and extent.
So I can't use repa's foldS as it reduces the dimension by one, but with the same extent.
I also experimented with traverse but it iterates over the extent of the destination image, providing a function to get pixels from the source image, that would lead to very inefficient code, counting the same pixels for each colorvalue.
A good way would be a simple folding over a Vector with the histogram type as accumulator, but unfortunately I have no U (unboxed) or V (vector) based array, from which I can efficiently get a Vector. I have an Array F (foreign pointer).
Ok, I found a few minutes. Below, I cover four solutions and have made the worst solutions (the middle two, involving O(n) data conversion) really easy for you.
Lets Acknowledge the Dumb Solution
It's reasonable to start with the obvious. You could use Data.List.foldl to traverse the rows and columns, building up your histograms from initial zero arrays (untested/partial code follows):
foldl (\(histR, histG, histB) (row,col) ->
let r = arr ! (Z:.row:.col:.0)
g = arr ! (Z:.row:.col:.1)
b = arr ! (Z:.row:.col:.2)
in (incElem r histR, incElem g histG, incElem b histB)
(zero,zero,zero)
[ (row,col) | row <- [0..nrRow-1], col <- [0..nrCol-1] ]
...
where (Z:.nrRow:.nrCol:._) = extent arr
I'm not sure how efficient this will be, but suspect that it will do too much bounds checking. Switching to unsafeIndex should do reasonably, assuming the delayed arrays, hist*, do well due to however you'd pick to implement incElem.
You Can Build the Array You Want
Using traverse you can actually convert JP-Repa style arrays into DIM2 arrays with tuples for elements:
main = do
let arr = R.fromFunction (Z:.a:.b:.c) (\(Z:.i:.j:.k) -> i+j-k)
a =4 :: Int
b = 4 :: Int
c= 4 :: Int
new = R.traverse arr
(\(Z:.r:.c:._) -> Z:.r:.c) -- the extent
(\l idx -> (l (idx:.0)
,l (idx:.1)
,l (idx :. 2)))
print (R.computeS new :: R.Array R.U DIM2 (Int,Int,Int))
Could you point me to the body of code you talked about that uses this format? It would be simple to patch JP-Repa to include a function of this type.
You can build the Unboxed Vector You Mentioned
You mentioned an easy solution is to fold over unboxed vectors, but lamented that JP-repa doesn't provide an unboxed array. Luckily, conversion is simple:
toUnboxed :: Img a -> VU.Vector Word8
toUnboxed = R.toUnboxed . R.computeUnboxedS . R.delay . imgData
We Could Patch Repa
This is really only a problem because Repa doesn't have what I consider a normal traverse function. Repa's traverse is more of an array construction that happens to provide an indexing function into another array. We want traverse in the form:
newTraverse :: Array r sh e -> a -> (a -> sh -> e -> a) -> a
but of coarse this is actually just a malformed fold. So lets rename it and reorder the arguments:
foldAllIdxS :: (sh -> a - > e -> a) -> a -> Array r sh e -> a
which contrasts nicely with the (preexisting) foldAllS operation:
foldAllS :: (a -> a -> a) -> a -> Array r sh a -> a
Notice how our new fold has two critical characteristics. The result type is not required to match the element type, so we could start with a tuple of Histograms. Second, our version of fold passes the index, which allows you to select which histogram in the tuple to update (if any).
You can lazily use the latest JuicyPixels-Repa
To acquire your preferred Repa array format, or to acquire an unboxed vector, you can just use the newly uploaded JuicyPixels-Repa-0.6.
someImg <- readImage path :: IO (Either String (Img RGBA))
let img = either (error "Blah") id someImg
uvec = toUnboxed img
tupleArr = collapseColorChannel img
Now you can fold over the vector or use the tuple array directly, as you originally desired.
I also took an ugly stab at fleshing out the first, horribly naive, solution:
histograms :: Img a -> (Histogram, Histogram, Histogram, Histogram)
histograms (Img arr) =
let (Z:.nrRow:.nrCol:._) = R.extent arr
zero = R.fromFunction (Z:.256) (\_ -> 0 :: Word8)
incElem idx x = RU.unsafeTraverse x id (\l i -> l i + if i==(Z:.fromIntegral idx) then 1 else 0)
in Prelude.foldl (\(hR, hG, hB, hA) (row,col) ->
let r = R.unsafeIndex arr (Z:.row:.col:.0)
g = R.unsafeIndex arr (Z:.row:.col:.1)
b = R.unsafeIndex arr (Z:.row:.col:.2)
a = R.unsafeIndex arr (Z:.row:.col:.3)
in (incElem r hR, incElem g hG, incElem b hB, incElem a hA))
(zero,zero,zero,zero)
[ (row,col) | row <- [0..nrRow-1], col <- [0..nrCol-1] ]
I'm too wary of the performance of this code (3 traversals per index... I must be tired) to throw it into JP-Repa, but if you find it works well then let me know.