Matrix averages in Haskell

I have a matrix; let's use this one as an example:
[ [4.0, 2.0, 0.6],
[4.2, 2.1, 0.59],
[3.9, 2.0, 0.58],
[4.3, 2.1, 0.62],
[4.1, 2.2, 0.63] ]
Now, I take the average of each column, which results in this:
[4.10, 2.08, 0.604]
I already have all of this working.
The part I'm stuck on comes next: I'm looking for a way to subtract each column's average from the corresponding elements of the original matrix.
It should look like this:
[ [-0.1, -0.08, -0.004],
[0.1, 0.02, -0.014],
[-0.2, -0.08, -0.024],
[0.2, 0.02, 0.016],
[0.0, 0.12, 0.026] ]
It has to work for a matrix of arbitrary size.

Something like this, without validation or formatting:
colmean rs = let (a,c) = agg rs in map (/(fromIntegral c)) a
agg [r] = (r,1)
agg (r:rs) = let (a,c) = agg rs in (zipWith (+) a r, c+1)
minus = flip (zipWith (-))
demean x = map (minus $ colmean x) x
> demean media
[[-9.999999999999964e-2,-8.000000000000007e-2,-4.0000000000000036e-3],
[0.10000000000000053,2.0000000000000018e-2,-1.4000000000000012e-2],
[-0.19999999999999973,-8.000000000000007e-2,-2.400000000000002e-2],
[0.20000000000000018,2.0000000000000018e-2,1.6000000000000014e-2],
[0.0,0.1200000000000001,2.6000000000000023e-2]]
With this, the second dimension (the number of columns) can be infinite:
> map (take 10) $ demean [[1..], [2..]]
[[-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5],
[0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5]]

Here's how I would do it:
import Data.List (transpose)
mean l = sum l / fromIntegral (length l)
getColumnMeans = map mean . transpose
normalizeMatrix m = map (zipWith subtract columnMeans) m
  where
    columnMeans = getColumnMeans m
mean does exactly what its name suggests, and getColumnMeans is the function that you seem to have implemented yourself already.
normalizeMatrix is the function you are looking for. It takes a matrix, computes its column means, and then subtracts them from each row via map.
subtract is basically (-) with its arguments flipped. I use it whenever I want to map something like subtract 5, because it reads like regular English. So subtract 5 10 returns 5. Here, zipWith subtract columnMeans does this entry by entry for one row, and map does it for all rows. Hope this is useful.
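For reference, here is the same idea as a small self-contained sketch with explicit type signatures. The names columnMeans, centerColumns and example are only illustrative, and the code assumes a non-empty, rectangular matrix of Doubles (no validation is done):

```haskell
import Data.List (transpose)

-- Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- One mean per column; assumes a rectangular, non-empty matrix.
columnMeans :: [[Double]] -> [Double]
columnMeans = map mean . transpose

-- Subtract each column's mean from every entry of that column.
centerColumns :: [[Double]] -> [[Double]]
centerColumns m = map (zipWith subtract means) m
  where
    means = columnMeans m

-- The matrix from the question.
example :: [[Double]]
example =
  [ [4.0, 2.0, 0.6]
  , [4.2, 2.1, 0.59]
  , [3.9, 2.0, 0.58]
  , [4.3, 2.1, 0.62]
  , [4.1, 2.2, 0.63]
  ]
```

Calling centerColumns example should reproduce the matrix the question asks for, up to floating-point rounding. Because this version goes through transpose and length, it needs a finite number of rows, just like normalizeMatrix above.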

Related

Counting change in Haskell

I came across the following solution to the DP problem of counting change:
count' :: Int -> [Int] -> Int
count' cents coins = aux coins !! cents
  where aux = foldr addCoin (1:repeat 0)
          where addCoin c oldlist = newlist
                  where newlist = (take c oldlist) ++ zipWith (+) newlist (drop c oldlist)
It ran much faster than my naive top-down recursive solution, and I'm still trying to understand it.
I get that given a list of coins, aux computes every solution for the positive integers. Thus the solution for an amount is to index the list at that position.
I'm less clear on addCoin, though. It somehow uses the value of each coin to draw elements from the list of coins? I'm struggling to find an intuitive meaning for it.
The fold in aux also ties my brain up in knots. Why is 1:repeat 0 the initial value? What does it represent?

It's a direct translation of the imperative DP algorithm for the problem, which looks like this (in Python):
def count(cents, coins):
    solutions = [1] + [0]*cents # [1, 0, 0, 0, ... 0]
    for coin in coins:
        for i in range(coin, cents + 1):
            solutions[i] += solutions[i - coin]
    return solutions[cents]
In particular, addCoin coin solutions corresponds to
for i in range(coin, cents + 1):
    solutions[i] += solutions[i - coin]
except that addCoin returns a modified list instead of mutating the old one. As to the Haskell version, the result should have an unchanged section at the beginning until the coin-th element, and after that we must implement solutions[i] += solutions[i - coin].
We realize the unchanged part by take c oldlist and the modified part by zipWith (+) newlist (drop c oldlist). In the modified part we add together the i-th element of the old list and the (i - coin)-th element of the resulting list. The shifting of indices is implicit in the drop and take operations.
A simpler, classic example for this kind of shifting and recursive definition is the Fibonacci numbers:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
We would write this imperatively as
def fibs(limit):
    res = [0, 1] + [0]*(limit - 2)
    for i in range(2, limit):
        res[i] = res[i - 2] + res[i - 1]
    return res
Turning back to coin change, foldr addCoin (1:repeat 0) corresponds to the initialization of solutions and the for loop on the coins, with the change that the initial list is infinite instead of finite (because laziness lets us do that).
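To see the shifting concretely, it can help to pull addCoin out to the top level and look at the first few elements after each coin is added. This is only a sketch for experimenting: addCoin is copied from the question, ways is a hypothetical name for what the question calls aux, and the coins are assumed to be positive.

```haskell
-- addCoin from the question, lifted to the top level.
-- The first c entries are unchanged; entry i (for i >= c) is
-- oldlist !! i + newlist !! (i - c).
addCoin :: Int -> [Int] -> [Int]
addCoin c oldlist = newlist
  where
    newlist = take c oldlist ++ zipWith (+) newlist (drop c oldlist)

-- Same fold as aux in the question: start from "one way to make 0,
-- no way to make anything else" and add the coins one at a time.
ways :: [Int] -> [Int]
ways = foldr addCoin (1 : repeat 0)

-- take 11 (addCoin 5 (1 : repeat 0)) == [1,0,0,0,0,1,0,0,0,0,1]
-- take 11 (ways [1, 5])              == [1,1,1,1,1,2,2,2,2,2,3]
```

The first line of the trace says that with only a 5-cent coin you can make exactly 0, 5, 10, ... cents, each in one way. After adding the 1-cent coin, every amount is reachable, and ways [1, 5] !! 10 is 3 (ten 1s, one 5 plus five 1s, or two 5s).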

How to convert (0,0) to [0,0] in prolog?

I'm making a predicate distance/3 that calculates the distance between 2 points on a 2d plane. For example :
?- distance((0,0), (3,4), X).
X = 5
Yes
My predicate only works if (0,0) is the list [0,0]. Is there a way to make this conversion?

You can do this with a simple rule that unifies its left and right sides:
convert((A,B), [A,B]).

Although the others have answered, keep in mind that (a,b) in Prolog is actually not what you might think it is:
?- write_canonical((a,b)).
','(a,b)
true.
So this is the term ','/2. If you are working with pairs, you can do two things that are probably "prettier":
Keep them as a "pair", a-b:
?- write_canonical(a-b).
-(a,b)
true.
The advantage here is that pairs like this can be manipulated with a bunch of de-facto standard predicates, for example keysort, as well as library(pairs).
Or, if they are actually a data structure that is part of your program, you might as well make that explicit, as in coor(a, b) for example. A distance in two-dimensional space will then take two coor/2 terms:
distance(coor(X1, Y1), coor(X2, Y2), D) :-
    D is sqrt((X1-X2)^2 + (Y1-Y2)^2).
If you don't know how many dimensions you have, you can then indeed keep the coordinates of each point in a list. The message here is that lists are meant for things that can have 0 or more elements in them, while pairs, or other terms with arity 2, or any term with a known arity, are more explicit about the number of elements they have.

If you just have a simple pair, you can use the univ operator and simply say something like:
X = (a,b) ,
X =.. [_|Y] .
which produces
X = (a,b) .
Y = [a,b] .
This doesn't work if X is something like (a,b,c), producing as it does
X = (a,b,c) .
Y = [a,(b,c)] .
[probably not what you want].
The more general case is pretty simple:
csv2list( X , [X] ) :-      % We have a list of length 1
  var(X) .                  % - if X is unbound
csv2list( X , [X] ) :-      % We have a list of length 1
  nonvar(X) ,               % - if X is bound, and
  X \= (_,_) .              % - X is not a (_,_) term.
csv2list( Xs , [A|Ys] ) :-  % Otherwise (the general case),
  nonvar(Xs) ,              % - if Xs is bound, and
  Xs = (A,Bs) ,             % - Xs is a (_,_) term,
  csv2list(Bs,Ys)           % - recurse down, adding the first item to the result list.
  .                         % Easy!

Applying function to cartesian product of two unequal vectors

I am trying to avoid looping by using a documented apply function, but I have not been able to find any examples that suit my purpose. I have two vectors, x, which is (1 x p), and y, which is (1 x q), and would like to feed the Cartesian product of their parameters into a function. Here is a parsimonious example:
require(kernlab)
x = c("cranapple", "pear", "orange-aid", "mango", "kiwi",
"strawberry-kiwi", "fruit-punch", "pomegranate")
y = c("apple", "cranberry", "orange", "peach")
sk <- stringdot(type="boundrange", length = l, normalized=TRUE)
sk_map = function(x, y){return(sk(x, y))}
I realize I could use an apply function over one dimension and loop for the other, but I feel like there has to be a way to do it in one step... any ideas?

Is this what you had in mind:
sk <- stringdot(type="boundrange", length = 2, normalized=TRUE)
# Create data frame with every combination of x and y
dat = expand.grid(x=x,y=y)
# Apply sk by row
sk_map = apply(dat, 1, function(dat_row) sk(dat_row[1],dat_row[2]))

You can use the outer function for this if your function is vectorized, and you can use the Vectorize function to create a vectorized function if it is not.
outer(x,y,FUN=sk)
or
outer(x,y, FUN=Vectorize(sk))

Return the indices "ordering" of the list?

Is there some Haskell function that can take a list, say of doubles, like this:
[0.5, 0.6, 0.1, 0.7]
and return a list of integers, which represent indices of the items in order. In the above case it would be:
[2, 0, 1, 3]
NOTE: What I am trying to achieve is some function (let's call it consistent) that can compare two lists of doubles and tell the user whether the lists' relative orderings are consistent:
> consistent [1.0, 2.0, 3.0] [2.1, 3.5, 4.6]
True
> consistent [1.0, 2.0, 3.0] [3.0, 2.0, 1.0]
False

You can zip the list with [0..] to pair each item with its index. Then you can sort that list and use map to get the second element of every pair (the index).
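A minimal sketch of that approach, assuming ties should be broken by original position (which is what sorting the pairs gives you). The name ordering is just a placeholder, and consistent here is one possible reading of what the question asks for, not a library function:

```haskell
import Data.List (sort)

-- Indices of the elements in ascending order of value.
-- Pairs are sorted by value first, then by index, so equal values
-- keep their original relative order.
ordering :: Ord a => [a] -> [Int]
ordering xs = map snd (sort (zip xs [0 ..]))

-- Two lists are "consistent" if sorting them would rearrange
-- their positions in the same way.
consistent :: (Ord a, Ord b) => [a] -> [b] -> Bool
consistent xs ys = ordering xs == ordering ys

-- ordering [0.5, 0.6, 0.1, 0.7]               == [2,0,1,3]
-- consistent [1.0, 2.0, 3.0] [2.1, 3.5, 4.6]  == True
-- consistent [1.0, 2.0, 3.0] [3.0, 2.0, 1.0]  == False
```

Note that lists of different lengths always compare as inconsistent with this definition.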

I think you could do something like this too:
import Data.List (sort)
swap (a, b) = (b, a)
consistent a b = sort (zip a b) == map swap (sort (zip b a))
I've never coded seriously in Haskell, so the syntax is just something vaguely similar to what I remember after having read "Learn You a Haskell..." months ago.

Problem detecting cyclic numbers in Haskell

I am doing problem 61 at project Euler and came up with the following code (to test the case they give):
p3 n = n*(n+1) `div` 2
p4 n = n*n
p5 n = n*(3*n -1) `div` 2
p6 n = n*(2*n -1)
p7 n = n*(5*n -3) `div` 2
p8 n = n*(3*n -2)
x n = take 2 $ show n
x2 n = reverse $ take 2 $ reverse $ show n
pX p = dropWhile (< 999) $ takeWhile (< 10000) [p n|n<-[1..]]
isCyclic2 (a,b,c) = x2 b == x c && x2 c == x a && x2 a == x b
ns2 = [(a,b,c)|a <- pX p3 , b <- pX p4 , c <- pX p5 , isCyclic2 (a,b,c)]
But all ns2 does is return an empty list, even though isCyclic2 accepts the arguments given as the example in the problem statement; that series never shows up in the result. The problem must lie in the list comprehension ns2, but I can't see where. What have I done wrong?
Also, how can I make it so that the pX only gets the pX (n) up to the pX used in the previous pX?
PS: in case you thought I completely missed the problem, I will get my final solution with this:
isCyclic (a,b,c,d,e,f) = x2 a == x b && x2 b == x c && x2 c == x d && x2 d == x e && x2 e == x f && x2 f == x a
ns = [[a,b,c,d,e,f]|a <- pX p3 , b <- pX p4 , c <- pX p5 , d <- pX p6 , e <- pX p7 , f <- pX p8 ,isCyclic (a,b,c,d,e,f)]
answer = sum $ head ns

The order is important. The cyclic numbers in the question are 8128, 2882, 8281, and these are not P3/127, P4/91, P5/44 but P3/127, P5/44, P4/91.
Your code is only checking in the order 8128, 8281, 2882, which is not cyclic.
You would get the result if you check for
isCyclic2 (a,c,b)
in your list comprehension.
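Here is a cut-down, self-contained version of the question's code with that one change applied, in case you want to check it yourself. The definitions are copied from the question; ns2Fixed is a hypothetical name, the type signatures are added for clarity, and the lower bound is written as 1000 (the question uses 999, which selects the same numbers here):

```haskell
p3, p4, p5 :: Int -> Int
p3 n = n * (n + 1) `div` 2
p4 n = n * n
p5 n = n * (3 * n - 1) `div` 2

-- First two and last two digits, as strings.
x, x2 :: Int -> String
x n = take 2 $ show n
x2 n = reverse $ take 2 $ reverse $ show n

-- All four-digit values of the sequence p.
pX :: (Int -> Int) -> [Int]
pX p = dropWhile (< 1000) $ takeWhile (< 10000) [p n | n <- [1 ..]]

isCyclic2 :: (Int, Int, Int) -> Bool
isCyclic2 (a, b, c) = x2 b == x c && x2 c == x a && x2 a == x b

-- Same comprehension as ns2, but testing the order P3, P5, P4.
ns2Fixed :: [(Int, Int, Int)]
ns2Fixed =
  [ (a, b, c)
  | a <- pX p3, b <- pX p4, c <- pX p5
  , isCyclic2 (a, c, b)
  ]
```

With this ordering, the triple (8128, 8281, 2882) from the problem statement is among the results. For the six-number version you would have to try the orderings of the six sets as well, not just one fixed order.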

EDIT: Wrong Problem!
I assumed you were talking about the circular number problem, Sorry!
There is a more efficient way to do this with something like this:
take (2 * l x -1) . cycle $ show x
  where l = length . show
Try that and see where it gets you.

If I understand you right, you're no longer asking why your code doesn't work but how to make it faster. Finding an efficient way to solve the problems is actually the whole fun of Project Euler, so proceed with care and first try to think of ways to reduce your search space yourself. I suggest you let Haskell print out the three lists pX p3, pX p4, pX p5 and see how you'd go about looking for a cycle.
If you would proceed like your list comprehension, you'd start with the first element of each list, 1035, 1024, 1080. I'm pretty sure you would stop right after picking 1035 and 1024 and not test for cycles with any value from P5, let alone try all the permutations of the combinations involving these two numbers.
(I haven't actually worked on this problem yet, so this is how I would go about speeding it up. There may be some math wizardry out there that's even faster)
First, start looking at the numbers you get from pX. You can drop more than those. For example, P3 contains 6105 - there's no way you're going to find a number in the other sets starting with '05'. So you can also drop those numbers where the number modulo 100 is less than 10.
Then (for the case of 3 sets), we can sometimes see after drawing two numbers that there can't be any number in the last set that will give you a cycle, no matter how you permute them (e.g. 1035 from P3 and 3136 from P4 - there can't be a cycle here).
I'd probably try to build a chain by starting with the elements from one list, one by one, and for each element, find the elements from the remaining lists that are valid successors. For those that you've found, continue trying to find the next chain element from the remaining lists. When you've built a chain with one number from every list, you just have to check if the last two digits of the last number match the first two digits of the first number.
Note when looking for successors, you again don't have to traverse the entire lists. If you're looking for a successor to 3015 from P5, for example, you can stop when you hit a number that's 1600 or larger.
If that's too slow still, you could transform the lists other than the first one to maps where the map key is the first two digits and the associated values are lists of numbers that start with those digits. Saves you from going through the lists from the start again and again.
I hope this helps a bit.
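A rough sketch of the chain-building idea described above, for the three-set case. All names here (candidates, follows, removeAt, cyclicChains) are made up for illustration; it does not use the map optimisation, does not stop early when scanning for successors, and does not check that the numbers in a chain are distinct:

```haskell
p3, p4, p5 :: Int -> Int
p3 n = n * (n + 1) `div` 2
p4 n = n * n
p5 n = n * (3 * n - 1) `div` 2

-- Four-digit values of an increasing sequence p, dropping values whose
-- last two digits are below 10 (nothing four-digit can follow them).
candidates :: (Int -> Int) -> [Int]
candidates p =
  filter (\v -> v `mod` 100 >= 10) $
    takeWhile (< 10000) $ dropWhile (< 1000) $ map p [1 ..]

-- b may follow a if a's last two digits are b's first two digits.
follows :: Int -> Int -> Bool
follows a b = a `mod` 100 == b `div` 100

removeAt :: Int -> [a] -> [a]
removeAt i xs = take i xs ++ drop (i + 1) xs

-- Start from each element of the first list, then repeatedly pick a valid
-- successor from any list not used yet. When no lists remain, keep the
-- chain only if its last number leads back to the start.
cyclicChains :: [[Int]] -> [[Int]]
cyclicChains [] = []
cyclicChains (firstList : others) =
  [chain | start <- firstList, chain <- extend start start [start] others]
  where
    extend :: Int -> Int -> [Int] -> [[Int]] -> [[Int]]
    extend start lastN acc []
      | follows lastN start = [reverse acc]
      | otherwise = []
    extend start lastN acc remaining =
      [ chain
      | (i, list) <- zip [0 ..] remaining
      , next <- list
      , follows lastN next
      , chain <- extend start next (next : acc) (removeAt i remaining)
      ]
```

For example, cyclicChains (map candidates [p3, p4, p5]) includes [8128, 2882, 8281], the cycle from the problem statement. Because the successor filter runs before the recursion, most partial chains are abandoned after one or two picks instead of being combined with every element of the remaining lists.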

By the way, I sense some repetition in your code.
You can unite your p3, p4, p5, p6, p7 and p8 functions into a single function that takes the 3 from p3 (and so on) as a parameter.
To find the pattern, rewrite all of the functions in the form
pX n = ... `div` 2
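If you do that rewrite, one way the pattern can come out is the sketch below. The name polygonal and the argument order are my own choice; s is the number of sides (3 for triangle numbers, 4 for squares, and so on):

```haskell
-- n-th s-gonal number: ((s - 2) * n^2 - (s - 4) * n) / 2.
-- The product is always even, so `div` loses nothing.
polygonal :: Int -> Int -> Int
polygonal s n = n * ((s - 2) * n - (s - 4)) `div` 2

-- polygonal 3 n == n * (n + 1) `div` 2      (p3)
-- polygonal 4 n == n * n                    (p4)
-- polygonal 5 n == n * (3 * n - 1) `div` 2  (p5)
```

With this, pX (polygonal 6) replaces pX p6, and you can map over [3 .. 8] instead of naming each function.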
