Finding the number of elements in a matrix - haskell

I'm a Haskell newcomer, so cut me a bit of slack :P
I need to write a Haskell function that goes through a matrix and outputs a list of all matching elements to a given element (like using filter) and then matches the list against another to check if they are the same.
checkMatrix :: Matrix a -> a -> [a] -> Bool
I have tried variations of using filter, and using the !! operator and I can't figure it out. I don't really want to get the answer handed to me, just need some pointers for getting me on the right path
checkMatrix :: Matrix a -> a -> [a] -> Bool
checkMatrix matr a lst = case matr of
  x:xs | [] -> (i don't really know what to put for the base case)
       | filter (== True) (x:xs !! 0) -> checkMatrix xs a lst
That's all I've got; I'm really very lost as to what to do next.

tl;dr You want something to the effect of filter someCondition (toList matrix) == otherList, with minor details varying depending on your matrix type and your specific needs.
The Full Answer
I don't know what Matrix type you're using, but the approach is going to be similar for any reasonably defined matrix type.
For this answer, I'll assume you're using the Data.Matrix module from the package on Hackage called matrix.
You are right to think you should use filter. Thinking functionally, you want to eliminate some elements from the matrix and keep others, based on a condition. However, a matrix does not provide a natural way to perform filter on it, as the idea is not really well-defined. So, instead, we want to extract the elements from our matrix into a list first. The matrix package provides the following function, which does just that.
toList :: Matrix a -> [a]
Once you have a list representation, you can very easily use filter to get the elements that you want.
A few caveats and notes.
If the matrix package that you're using doesn't define toList itself, check if it defines a Foldable instance for the matrix type. If it does, then Data.Foldable has a general-purpose toList that works for all Foldable types.
Be careful with the ordering here. It's not entirely clear what order the elements should be put into the list in, since matrices are two-dimensional and lists are inherently one-dimensional. If the ordering matters for whatever you're doing, you might have to put some additional effort into guaranteeing the desired order. If it does not matter, consider using Data.Set or some other unordered collection instead of lists.
I don't see any constraints in your checkMatrix signature. Remember that comparing list elements requires an Eq a constraint, and if you want to use an unordered collection like Data.Set, that's going to require Ord a instead.
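To make the tl;dr concrete, here is a minimal sketch. It assumes a simple list-of-rows stand-in for the matrix type, since the real Data.Matrix isn't available here; its toList behaves analogously (row by row). The checkMatrix name and type come from the question.

```haskell
-- A minimal stand-in for a matrix type: a list of rows.
-- (Data.Matrix from the "matrix" package provides its own toList.)
newtype Matrix a = Matrix [[a]]

-- Flatten the matrix into a list, row by row.
toList :: Matrix a -> [a]
toList (Matrix rows) = concat rows

-- Keep every element equal to the target, then compare the result
-- against the expected list. Note the Eq constraint.
checkMatrix :: Eq a => Matrix a -> a -> [a] -> Bool
checkMatrix m target expected = filter (== target) (toList m) == expected
```

For example, checkMatrix (Matrix [[1,2],[2,3]]) 2 [2,2] evaluates to True.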

Related

Haskell: what are the implications of combining parameters into tuples, rather than using distinct parameters?

As a Haskell beginner, I'm curious about best practices. In particular, in the absence of other requirements, is it better to associate related function arguments using tuples, or keep them "naked"?
E.g.
vector :: Float -> Float -> Float -> Vector
vs.
vector :: (Float, Float, Float) -> Vector
The reason I ask is that sometimes aspects of a parameter (e.g. x coordinate in a 2D or 3D point or vector) are normally bound up with other parameters (e.g. the y & z coordinates). I can see how pattern-matching can be used in both cases, but I'm curious to know whether there are serious implications "down the track" to using tuples or distinct parameters.
When other parameters are involved, the use of tuples seems to make it clear that a certain set of parameters are associated with each other. But it also makes the code more verbose when functions take just the tuple as a parameter.
I would recommend, as a rule of thumb, to never put tuples in the arguments of a function signature.
Why? Well, if the point is to group stuff together, then tuples do a rather measly job at it. Sure, you could use nested tuples and type synonyms to explain what they mean, but all of that is brittle and much better and safer done with proper record types.
As you've identified, the x- and y-components of a vector usually come together. Well, not only that, in many a sense it is a good idea to keep the x- and y-components completely hidden from any interesting code. That's exactly what the Vector type should accomplish. (Which should probably be called Vector3 or ℝ³ instead.) And the only purpose of the vector function should be to assemble one of those from the components.
Well, if that's the only thing it does, then the three components are the only arguments, and there's no point grouping them together any further... that's basically just putting a single suitcase into another transport box. Better just use the right container right away as a single wrapper.
vector3 :: Float -> Float -> Float -> Vector3
An example of a tuple in a signature of a commonly used function is
randomR :: (Random a, RandomGen g) => (a,a) -> g -> (a,g)
Why is this a bad idea? Well, you're using a tuple to denote an interval... but also in the result to denote something completely different, a grouping of the obtained random value with the updated generator. The proper way to do this is to either have a type that properly expresses what it is
data Interval a = Interval {lowerBound, upperBound :: a}
randomR :: (Random a, RandomGen g) => Interval a -> g -> (a,g)
...or better, separate the concerns, i.e. that manual state-threading should be hidden in a suitable monad – such as RVar. At that point the range limits become the only arguments, thus you don't need to group them together anymore!
uniform :: Distribution Uniform a => a -> a -> RVar a
That doesn't mean you should never use tuples at all. For result values, the currying mechanism doesn't work as easily†, so if you have a function that gives back two results but there's not really any meaningful interpretation for what those two values represent together, well, give back a tuple.
Furthermore, if you're grouping together completely abstract types, you can't possibly have an interpretation for what they mean together. That's the reason why zip :: [a] -> [b] -> [(a,b)] gives a list of tuples.
†You can also have multi-result functions without tuples. For that, you need to use continuation-passing style: for example, splitAt :: Int -> [a] -> ([a],[a]) becomes splitAt' :: Int -> [a] -> ([a] -> [a] -> r) -> r.
There are no implications down the line. A function that can accept one argument first and then another one later is said to be curried. A function that accepts a tuple as an argument is said to be uncurried. You can convert between the two using curry and uncurry. Feel free to extend this definition to three parameters and define new functions curry3 f a b c = f (a, b, c) and uncurry3 f (a, b, c) = f a b c.
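The three-argument versions described above can be written out in full. The addT/add3 functions below are hypothetical examples added just to show the conversion in both directions:

```haskell
-- Three-argument analogues of the Prelude's curry and uncurry.
curry3 :: ((a, b, c) -> d) -> a -> b -> c -> d
curry3 f a b c = f (a, b, c)

uncurry3 :: (a -> b -> c -> d) -> (a, b, c) -> d
uncurry3 f (a, b, c) = f a b c

-- A tuple-taking function (hypothetical example)...
addT :: (Int, Int, Int) -> Int
addT (x, y, z) = x + y + z

-- ...and its curried form, recovered mechanically.
add3 :: Int -> Int -> Int -> Int
add3 = curry3 addT
```

Converting back with uncurry3 add3 gives a function equal to addT again, which is exactly why there are "no implications down the line".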
In this case, I would go for a named datatype for most uses. In fact, you already seem to have a Vector type. Making your constructor, vector, accept a triple seems like an excellent idea. That way, those who try to use it to construct a 2D vector will get the most helpful message from the type checker.

Is there significance in the order of Haskell function parameters?

I've been learning Haskell and I noticed that many of the built in functions accept parameters in an order counter intuitive to what I would expect. For example:
replicate :: Int -> a -> [a]
If I want to replicate 7 twice, I would write replicate 2 7. But when read out loud in English, the function call feels like it is saying "Replicate 2, 7 times". If I would have written the function myself, I would have swapped the first and second arguments so that replicate 7 2 would read "replicate 7, 2 times".
Some other examples appeared when I was going through 99 Haskell Problems. I had to write a function:
dropEvery :: [a] -> Int -> [a]
It takes a list as its first argument and an Int as its second. Intuitively, I would have written the header as dropEvery :: Int -> [a] -> [a] so that dropEvery 3 [1..100] would read as: "drop every third element in the list [1..100]. But in the question's example, it would look like: dropEvery [1..100] 3.
I've also seen this with other functions that I cannot find right now. Is it common to write functions in such a way due to a practical reason or is this all just in my head?
It's common practice in Haskell to order function parameters so that parameters which "configure" an operation come first, and the "main thing being operated on" comes last. This is often counterintuitive coming from other languages, since it tends to mean you end up passing the "least important" information first. It's especially jarring coming from OO, where the "main" argument is usually the object on which the method is being invoked, occurring so early in the call that it's out of the parameter list entirely!
There's a method to our madness though. The reason we do this is that partial application (through currying) is so easy and so widely used in Haskell. Say I have functions like foo :: Some -> Config -> Parameters -> DataStructure -> DataStructure and bar :: Different -> Config -> DataStructure -> DataStructure. When you're not used to higher-order thinking you just see these as things you call to transform a data structure. But you can also use either of them as a factory for "DataStructure transformers": functions of the type DataStructure -> DataStructure.
It's very likely that there are other operations that are configured by such DataStructure -> DataStructure functions; at the very least there's fmap for turning transformers of DataStructures into transformers of functors of DataStructures (lists, Maybes, IOs, etc).
We can take this a bit further sometimes too. Consider foo :: Some -> Config -> Parameters -> DataStructure -> DataStructure again. If I expect that callers of foo will often call it many times with the same Some and Config, but varying Parameters, then even-more-partial applications become useful.
Of course, even if the parameters are in the "wrong" order for my partial application I can still do it, using combinators like flip and/or creating wrapper functions/lambdas. But this results in a lot of "noise" in my code, meaning that a reader has to be able to puzzle out what is the "important" thing being done and what's just adapting interfaces.
So the basic theory is for a function writer to try to anticipate the usage patterns of the function, and list its arguments in order from "most stable" to "least stable". This isn't the only consideration of course, and often there are conflicting patterns and no clear "best" order.
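A tiny sketch of the "configure first, data last" ordering in action. scaleAll is a hypothetical function invented for this example, not from any library:

```haskell
-- Configuration parameter (the scale factor) first,
-- the data structure being operated on last.
scaleAll :: Int -> [Int] -> [Int]
scaleAll k = map (* k)

-- Because the list comes last, partial application gives us
-- a reusable [Int] -> [Int] transformer for free:
double :: [Int] -> [Int]
double = scaleAll 2

-- ...which then composes with fmap to transform functors of lists,
-- as described above:
doubledMaybe :: Maybe [Int]
doubledMaybe = fmap double (Just [1, 2, 3])
```

Had the list come first, every such use would need flip or a lambda.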
But "the order the parameters would be listed in an English sentence describing the function call" would not be something I would give much weight to in designing a function (and not in other languages either). Haskell code just does not read like English (nor does code in most other programming languages), and trying to make it closer in a few cases doesn't really help.
For your specific examples:
For replicate, it seems to me like the a parameter is the "main" argument, so I would put it last, as the standard library does. There's not a lot in it though; it doesn't seem very much more useful to choose the number of replications first and have an a -> [a] function than it would be to choose the replicated element first and have an Int -> [a] function.
dropEvery indeed seems to take its arguments in a wonky order, but not because we say in English "drop every Nth element in a list". Functions that take a data structure and return a "modified version of the same structure" should almost always take the data structure as their last argument, with the parameters that configure the "modification" coming first.
One of the reasons functions are written this way is because their curried forms turn out to be useful.
For example, consider the functions map and filter:
map :: (a -> b) -> [a] -> [b]
filter :: (a -> Bool) -> [a] -> [a]
If I wanted to keep the even numbers in a list and then divide them by 2, I could write:
myfunc :: [Int] -> [Int]
myfunc as = map (`div` 2) (filter even as)
which may also be written this way:
myfunc = map (`div` 2) . filter even
\___ 2 ____/ \___ 1 ___/
Envision this as a pipeline going from right to left:
first we keep the even numbers (step 1)
then we divide each number by 2 (step 2)
The . operator acts as a way of joining pipeline segments together, much like how the | operator works in the Unix shell.
This is all possible because the list argument for map and filter are the last parameters to those functions.
If you write your dropEvery with this signature:
dropEvery :: Int -> [a] -> [a]
then we can include it in one of these pipelines, e.g.:
myfunc2 = dropEvery 3 . map (`div` 2) . filter even
To add to the other answers, there's also often an incentive to make the last argument be the one whose construction is likely to be most complicated and/or to be a lambda abstraction. This way one can write
f some little bits $
big honking calculation
over several lines
rather than having the big calculation surrounded by parentheses and a few little arguments trailing off at the end.
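A small sketch of what that looks like in practice, using only Prelude functions:

```haskell
-- With the "big" argument last, ($) lets the calculation trail
-- off over several lines with no closing parentheses needed.
result :: Int
result = foldr (+) 0 $
  map (* 2) $
    filter even [1 .. 10]
```

Without this ordering, the final expression would have to be wrapped in parentheses spanning all three lines.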
If you wish to flip arguments, just use the flip function from the Prelude:
replicate' = flip replicate
> :t replicate'
replicate' :: a -> Int -> [a]

Why are `take` and `drop` defined for negative arguments?

The Prelude shows examples for take and drop with negative arguments:
take (-1) [1,2] == []
drop (-1) [1,2] == [1,2]
Why are these defined the way they are, when e.g. x !! (-1) does the "safer" thing and crashes? It seems like a hackish and very un-Haskell-like way to make these functions total, even when the argument doesn't make sense. Is there some greater design philosophy behind this that I'm not seeing? Is this behavior guaranteed by the standard, or is this just how GHC decided to implement it?
There would be mainly one good reason to make take partial: it could guarantee that the result list, if there is one, has always the requested number of elements.
Now, take already violates this in the other direction: when you try to take more elements than there are in the list, it simply takes as many as there are, i.e. fewer than requested. Perhaps not the most elegant thing to do, but in practice this tends to work out quite usefully.
The main invariant for take is combined with drop:
take n xs ++ drop n xs ≡ xs
and that holds true even if n is negative.
A good reason not to check the length of the list is that it makes the functions perform nicely on lazy infinite lists: for instance,
take hugeNum [1..] ++ 0 : drop hugeNum [1..]
will immediately give 1 as the first result element. This would not be possible if take and drop first had to check whether there are enough elements in the input.
I think it's a matter of design choice here.
The current definition ensures that the property
take x list ++ drop x list == list
holds for any x, including negative ones as well as those larger than length list.
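This property can be spot-checked by hand (a QuickCheck property would quantify over the inputs; the below is just a manual sketch over a fixed list):

```haskell
-- The invariant: take n xs ++ drop n xs always rebuilds xs,
-- for negative n, in-range n, and n beyond the list's length alike.
invariantHolds :: Int -> Bool
invariantHolds n = take n xs ++ drop n xs == xs
  where xs = [1 .. 5 :: Int]

allHold :: Bool
allHold = all invariantHolds [-2, 0, 3, 10]
```

Had take crashed on negative arguments, invariantHolds (-2) would throw instead of returning True.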
I can however see the value in a variant of take/drop which errors out: sometimes a crash is preferred to a wrong result.
x !! (-1) does the "safer" thing and crashes
Crashing is not safe. Making a function non-total destroys your ability
to reason about the behaviour of a function based on its type.
Let us imagine that take and drop did have "crash on negative" behaviour. Consider their type:
take, drop :: Int -> [a] -> [a]
One thing this type definitely doesn't tell you is that this function could crash! It's helpful to reason about code as though we were using a total language, even though we are not - an idea called fast and loose reasoning - but to be able to do that, you have to avoid using (and writing) non-total functions as much as possible.
What to do, then, about operations that might fail or have no result? Types are the answer! A truly safe variant of (!!) would have a type that models the failure case, like:
safeIndex :: [a] -> Int -> Maybe a
This is preferable to the type of (!!),
(!!) :: [a] -> Int -> a
which, by simple observation, can have no (total) inhabitants - you cannot "invent" an a if the list is empty!
Finally, let us return to take and drop. Although their type doesn't fully say what the behaviour is, coupled with their names (and ideally a few QuickCheck properties) we get a pretty good idea. As other responders have pointed out, this behaviour is appropriate in many cases. If you truly have a need to reject negative length inputs, you don't have to choose between non-totality (crashing) or the possibility of surprising behaviour (negative length accepted) - model the possible outcomes responsibly with types.
This type makes it clear that there is "no result" for some inputs:
takePos, dropPos :: Int -> [a] -> Maybe [a]
Better still, this type uses natural numbers; functions with this type cannot even be applied to a negative number!
takeNat, dropNat :: Nat -> [a] -> [a]
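A sketch of takeNat, using Natural from base's Numeric.Natural as a stand-in for the Nat above. Natural simply has no representation for negative values, so the questionable input cannot be expressed at all:

```haskell
import Numeric.Natural (Natural)

-- take restricted to natural-number counts: the "negative length"
-- case disappears because Natural cannot hold negative values.
takeNat :: Natural -> [a] -> [a]
takeNat 0 _        = []
takeNat _ []       = []
takeNat n (x : xs) = x : takeNat (n - 1) xs
```

The n - 1 is safe here: the first clause has already matched n == 0, so no Natural underflow can occur.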

Haskell Multidimensional Arrays with Compiler-enforced lengths

I've been trying out some Haskell because I was intrigued by the strong typing, and I'm confused about the best way to tackle this:
The Vector datatype defined in Data.Vector allows for multidimensional arrays by way of nested arrays. However, these are constructed from lists, and lists of varying lengths are considered the same datatype (unlike tuples of varying lengths).
How might I extend this datatype (or write an analogous one) that functions in the same way, except that vectors of different lengths are considered to be different datatypes, so any attempt to create a multidimensional array/matrix with rows of differing lengths (for example) would result in a compile-time error?
It seems that tuples manage this by way of writing out 63 different definitions (one for each valid length), but I would like to be able to handle vectors of arbitrary length, if possible.
I see two ways for doing this:
1) The "typed" way: Using dependent types. This is, to some extent, possible in Haskell with the recent GHC extension for DataKinds*. Even better, to use a language with a really advanced type system, like Agda.
2) The other way: encode your vectors like
data Vec a = Vec { values :: [a], len :: Int }
Then, export only
buildVec :: [a] -> Vec a
buildVec as = Vec as (length as)
and check for correct lengths in the other functions that use vectors of same length, e.g. ensure same-length vectors in a matrix function or in Vec additions. Or even better: provide another custom builder/ctor for matrices.
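The same run-time-checking idea extends to matrices. Below is a hedged sketch with a hypothetical smart constructor that rejects ragged rows (the Matrix and buildMatrix names are invented for this example):

```haskell
-- Keep the constructor private; export only buildMatrix.
newtype Matrix a = Matrix { rows :: [[a]] }

-- Accept the input only if every row has the same length.
buildMatrix :: [[a]] -> Maybe (Matrix a)
buildMatrix rs
  | sameLength = Just (Matrix rs)
  | otherwise  = Nothing
  where
    sameLength = case map length rs of
      []       -> True
      (n : ns) -> all (== n) ns
```

Callers then have to handle the Nothing case, so a ragged matrix can never leak into the rest of the program, even though the check happens at run time rather than compile time.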
*I just saw: exactly what you're wanting is the standard example for DataKinds.
This form of typing, where the type depends on the value, is often called dependently typed programming, and, as luck would have it, Wolfgang Jeltsch wrote a blog post about dependent types in Haskell using GADTs and TypeFamilies.
The gist of the blogpost is that if we have two types representing natural numbers:
data Zero
data Succ nat
one can build lists with type enforced lengths in the following way:
data List el len where
  Empty :: List el Zero
  Cons  :: el -> List el nat -> List el (Succ nat)
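With the DataKinds extension mentioned in the other answer, the two empty datatypes can be replaced by an ordinary Nat type whose constructors get promoted to the type level. A minimal sketch:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Peano naturals; DataKinds promotes Zero and Succ to the type level.
data Nat = Zero | Succ Nat

-- A list indexed by its length. Rows of differing lengths now have
-- different types, so a ragged matrix is a compile-time type error.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Zero a
  VCons :: a -> Vec n a -> Vec ('Succ n) a

-- A total head: only non-empty vectors type-check as arguments,
-- so there is no runtime "empty list" case to handle.
vhead :: Vec ('Succ n) a -> a
vhead (VCons x _) = x
```

Calling vhead VNil simply does not compile, which is exactly the compile-time enforcement the question asks for.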

Haskell set datatype/datastructure

What I want to do is to create a type Set in Haskell to represent a generic (polymorphic) set, e.g. {1,'x',"aasdf",Phi}.
First, I want to make clear that in my program I want to consider Phi (the empty set) as something that belongs to all sets.
Here is my code:
data Set a b = Phi | Cons a (Set a b)
  deriving (Show, Eq, Ord)

isMember Phi _ = True
isMember _ Phi = False
isMember x (Cons a b) = if x == a
                          then True
                          else isMember x b
I'm facing a couple of problems:
I want isMember type to be
isMember :: Eq a => a -> Set a b -> Bool
but according to my code it is
isMember :: Eq a => Set a b -> Set (Set a b) c -> Bool
If I have a set of different types, the == operator doesn't work correctly, so I need some help please :D
Regarding your type error, the problem looks like the first clause to me:
isMember Phi _ = True
This is an odd clause to write, because Phi is an entire set, not a set element. Just deleting it should give you a function of the type you expect.
Observe that your Set type never makes use of its second type argument, so it could be written instead as
data Set a = Phi | Cons a (Set a)
...and at that point you should just use [a], since it's isomorphic and has a huge entourage of functions already written for using and abusing them.
Finally, you ask to be able to put things of different types in. The short answer is that Haskell doesn't really swing that way. It's all about knowing exactly what kind of type a thing is at compile time, which isn't really compatible with what you're suggesting. There are actually some ways to do this; however, I strongly recommend getting much more familiar with Haskell's particular brand of type bondage before trying to take the bonds off.
A) Doing this is almost always not what you actually want.
B) There are a variety of ways to do this from embedding dynamic types (Dynamic) to using very complicated types (HList).
C) Here's a page describing some ways and issues: http://www.haskell.org/haskellwiki/Heterogenous_collections
D) If you're really going to do this, I'd suggest HList: http://homepages.cwi.nl/~ralf/HList/
E) But if you start to look at the documentation / HList paper and find yourself hopelessly confused, fall back to the dynamic solution (or better yet, rethink why you need this) and come back to HLists once you're significantly more comfortable with Haskell.
(Oh yes, and the existential solution described on that page is probably a terrible idea, since it almost never does anything particularly useful for you).
What you are trying to do is very difficult, as Haskell does not store any type information by default. Two modules that are very useful for such things are Data.Typeable and Data.Dynamic. They provide support for storing a monomorphic (!) type and support for dynamic monomorphic typing.
I have not attempted to code something like this previously, but I have some ideas to accomplish that:
Each element of your set is a triple (quadruple) of the following things:
A TypeRep of the stored data-type
The value itself, coerced into an Any.
A comparison function (you can only use monomorphic values; you somehow have to store the context)
Similarly, a function to show the values.
Your set actually has two dimensions: first a tree by the TypeRep, and then a list of values.
Whenever you insert a value, you coerce it into an Any and store all the required stuff together with it, as explained in (1) and put it in the right position as in (2).
When you want to find an element, you generate its TypeRep and find the subtree of the right type. Then you just compare each sub-element with the value you want to find.
Those are just some random thoughts. I guess it's actually much easier to use Dynamic.
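For completeness, here is a minimal sketch of the Dynamic approach: the "set" becomes a list of Dynamics, and membership testing recovers each stored value at the query's own monomorphic type. The heteroSet and isMemberDyn names are invented for this example:

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Typeable (Typeable)

-- A heterogeneous collection: each element is wrapped in a Dynamic,
-- which stores the value together with its TypeRep.
heteroSet :: [Dynamic]
heteroSet = [toDyn (1 :: Int), toDyn 'x', toDyn "aasdf"]

-- fromDynamic returns Just the value only when the stored type
-- matches the query's type, so Int 1 never equals Char 'x'.
isMemberDyn :: (Typeable a, Eq a) => a -> [Dynamic] -> Bool
isMemberDyn x = any (\d -> fromDynamic d == Just x)
```

Note the trade-off: the Eq comparison only happens between same-typed values, and everything else silently compares as a non-match, which matches the "set of different types" behaviour asked about in the question.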
