Applying function to cartesian product of two unequal vectors - string

I am trying to avoid looping by using a documented apply function, but have not been able to find any examples that suit my purpose. I have two vectors, x which is (1 x p) and y which is (1 x q), and would like to feed the Cartesian product of their elements into a function. Here is a parsimonious example:
require(kernlab)
x = c("cranapple", "pear", "orange-aid", "mango", "kiwi",
"strawberry-kiwi", "fruit-punch", "pomegranate")
y = c("apple", "cranberry", "orange", "peach")
sk <- stringdot(type="boundrange", length = l, normalized=TRUE)
sk_map = function(x, y){return(sk(x, y))}
I realize I could use an apply function over one dimension and loop for the other, but I feel like there has to be a way to do it in one step... any ideas?

Is this what you had in mind:
sk <- stringdot(type="boundrange", length = 2, normalized=TRUE)
# Create data frame with every combination of x and y
dat = expand.grid(x=x,y=y)
# Apply sk by row
sk_map = apply(dat, 1, function(dat_row) sk(dat_row[1],dat_row[2]))

You can use the outer function for this if your function is vectorized, and you can use the Vectorize function to create a vectorized function if it is not.
outer(x,y,FUN=sk)
or
outer(x,y, FUN=Vectorize(sk))

subtracting every element of a list from a previous element in a dictionary

I have a dictionary which holds a 2D list (a list of lists). This 2D list contains the x and y coordinates [x,y] of a particle. Whenever the particle moves, its new coordinates are appended to this 2D list in the dictionary. I want to calculate the distance between every pair of consecutive locations and append the result to another list (it can just be a normal list, without a dictionary). What I want is something like the following:
dist1 = sqrt((x1 - x0)^2 + (y1 - y0)^2)
dist2 = sqrt((x2 - x1)^2 + (y2 - y1)^2)
.....
distN = sqrt((xN - x(N-1))^2 + (yN - y(N-1))^2)
but I am having issues in accessing elements of a list in a dictionary. I have a very long 2D list but you can use the below example to give me some suggestions.
c = {"coordinates":[[1,2],[3,4],[5,6],[7,8]]}
for k, dk in c.items():
    for x in dk:
        print(x[0], x[1])
I can access one element in the dk at a time in a loop but how to get the previous one? There should be a nice way of doing it but I just don't know.
Any help will be appreciated.
Using a for loop (probably not the most efficient solution):
import numpy as np
c = {"coordinates":[[1,2],[3,4],[5,6],[7,8]]}
coordinates = np.array(c['coordinates'])
distances = []
for i in range(1, len(coordinates)):
    distances.append(np.linalg.norm(coordinates[i-1] - coordinates[i]))
print(distances)
# [2.8284271247461903, 2.8284271247461903, 2.8284271247461903]
I also used numpy and its linalg.norm function to calculate the distance (see "How can the Euclidean distance be calculated with NumPy?"), but you could of course use your own function or calculation instead.
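The loop above can also be replaced by array operations. A minimal sketch, assuming the coordinates fit in a single NumPy array: np.diff along axis 0 gives the step between consecutive points, and np.linalg.norm with axis=1 gives the length of each step. The names steps and distances are only illustrative.

```python
import numpy as np

c = {"coordinates": [[1, 2], [3, 4], [5, 6], [7, 8]]}
coordinates = np.array(c["coordinates"], dtype=float)

# Difference between each point and the previous one, shape (N-1, 2)
steps = np.diff(coordinates, axis=0)

# Euclidean length of every step, shape (N-1,)
distances = np.linalg.norm(steps, axis=1)
print(distances)
```

For a very long coordinate list this avoids the Python-level loop entirely.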
I tried this and it also works:
import math
c = {"coordinates":[[1,2],[3,4],[15,6],[7,8]]}
l1 = []
for k, dk in c.items():
    for x in dk:
        l1.append(x)
print(l1)
# pair each point p0 with its successor p1
dist = [math.sqrt((p1[0]-p0[0])**2 + (p1[1]-p0[1])**2) for p0, p1 in zip(l1, l1[1:])]
As others suggested in this question, a better way to get l1 is to index the dictionary directly:
l1 = c["coordinates"]
dist = [math.sqrt((p1[0]-p0[0])**2 + (p1[1]-p0[1])**2) for p0, p1 in zip(l1, l1[1:])]

Multiply every element of matrix with a vector to obtain a matrix whose elements are vectors themselves

I need help in speeding up the following block of code:
import numpy as np
x = 100
pp = np.zeros((x, x))
M = np.ones((x,x))
arrayA = np.random.uniform(0,5,2000)
arrayB = np.random.uniform(0,5,2000)
for i in range(x):
    for j in range(x):
        y = np.multiply(arrayA, np.exp(-1j*(M[j,i])*arrayB))
        p = np.trapz(y, arrayB) # Numerical evaluation/integration of y
        pp[j,i] = abs(p**2)
Is there a function in numpy, or another method, to rewrite this piece of code so that the nested for-loops can be omitted? My idea would be a function that multiplies every element of M with the vector arrayB, giving a 100 x 100 matrix in which each element is itself a vector. Each of those vectors is then multiplied by arrayA with np.multiply(), again giving a 100 x 100 matrix of vectors. Finally, numerical integration of each vector with np.trapz() yields a 100 x 100 matrix in which each element is a scalar.
My problem though is that I lack knowledge of such functions which would perform this.
Thanks in advance for your help!
Edit:
Using broadcasting with
M = np.asarray(M)[..., None]
y = 1000*arrayA*np.exp(-1j*M*arrayB)
return np.trapz(y,B)
works and I can omit the for-loops. However, this is not faster; it is even a little slower in my case. This might be a memory issue.
y = np.multiply(arrayA, np.exp(-1j*(M[j,i])*arrayB))
can be written as
y = arrayA * np.exp(-1j*M[:,:,None]*arrayB)
producing a (x,x,2000) array.
But the next step may need adjustment. I'm not familiar with np.trapz.
np.trapz(y, arrayB)
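For what it's worth, np.trapz integrates along the last axis by default (axis=-1), so applied to the (x,x,2000) array it should return an (x,x) array directly. Below is a small sketch that checks the broadcast version against the loop, with sizes reduced to keep memory low. The non-constant M and the sorted arrayB are assumptions made only for the check, and the trapezoid lookup is there because NumPy 2.0 renamed np.trapz to np.trapezoid.

```python
import numpy as np

# np.trapz was renamed np.trapezoid in NumPy 2.0; use whichever exists
trapezoid = getattr(np, "trapezoid", None) or np.trapz

rng = np.random.default_rng(0)
x = 8                                      # reduced from 100 for the check
M = rng.uniform(0, 2, (x, x))              # hypothetical non-constant M
arrayA = rng.uniform(0, 5, 200)
arrayB = np.sort(rng.uniform(0, 5, 200))   # sorted integration grid (assumption)

# Reference: the original nested loops
pp_loop = np.zeros((x, x))
for i in range(x):
    for j in range(x):
        y = arrayA * np.exp(-1j * M[j, i] * arrayB)
        pp_loop[j, i] = abs(trapezoid(y, arrayB)) ** 2

# Broadcast: M[:, :, None] has shape (x, x, 1), so y has shape (x, x, 200)
y = arrayA * np.exp(-1j * M[:, :, None] * arrayB)
pp_vec = np.abs(trapezoid(y, arrayB)) ** 2   # integrates over the last axis

print(np.allclose(pp_loop, pp_vec))
```

The intermediate array holds x * x * len(arrayB) complex values, which is about 320 MB at x = 100 with 2000 samples. That may be why the broadcast version is not faster in practice.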

How to plot best fit line for values in a list less than an integer?

I'm trying to plot a best fit line for when values in my list x are less than x_c (in this case 20).
plt.scatter(x,tf)
x_c = (20)
filter1 = [a < x_c for a in x]
m, b = np.polyfit(x[filter1], tr[filter1], 1)
plt.plot(x[filter1], m*x[filter1]+ b)
When I do that I get this error: TypeError: list indices must be integers or slices, not list
I also tried it with
filter = [x < x_c]
and that also did not work
Python lists do not support boolean indexing. (But numpy arrays do!) You must select the items from your list that match your condition, and then use the new list for plotting:
good_x = [a for a in x if a < x_c]
....
plt.plot(good_x, ...)
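Since NumPy arrays do support boolean indexing, converting the lists first keeps the original filtering style. A minimal sketch with made-up data: the values of x and y here are hypothetical, and y stands in for the list of values being fitted in the question.

```python
import numpy as np

x = np.asarray([5, 10, 15, 25, 30], dtype=float)   # hypothetical data
y = 2.0 * x + 1.0                                   # hypothetical data
x_c = 20

mask = x < x_c                 # boolean array, True where x is below x_c
m, b = np.polyfit(x[mask], y[mask], 1)
fit_x = x[mask]
fit_y = m * fit_x + b          # points on the best fit line

# With matplotlib available: plt.scatter(x, y); plt.plot(fit_x, fit_y)
print(m, b)
```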

polyfit with multi-dimensional x coordinate

Suppose that I have a (400,10) array called x and a (400,10) array called y. Is it possible to do a polyfit of each row in y to the corresponding row in x without iteration? With a for loop it would be something like
import numpy as np
coe = np.zeros((400,3))
for i in np.arange(y.shape[0]):
    coe[i,:] = np.polyfit(x[i,:], y[i,:], 2)
Because the 400 rows in x are all different, I cannot just apply np.polyfit with the same x coordinate to a multi-dimensional array y.
Have you tried a comprehension?
coe = [tuple(np.polyfit(x[i,:], y[i,:], 2)) for i in range(400)]
The range(400) emits the values 0 to 399 into i
For each i, you compute the polyfit for x[i,:] vs y[i,:]. np.polyfit returns an array of the three coefficients, highest power first, which tuple() converts to a tuple
The resulting list-of-tuples is assigned to coe
At the innermost level, this is still an iteration. A comprehension avoids some of the interpreter overhead of an explicit for: loop, but the np.polyfit calls themselves dominate the runtime, so expect at most a modest speedup from doing it this way.
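If the loop itself is the concern, the fit can be written without Python-level iteration. A sketch, assuming a plain least-squares fit is what is wanted: build a stack of Vandermonde matrices, one per row, and solve all 400 problems at once with np.linalg.pinv, which accepts stacked matrices. The random data and the names V and coe_batch are illustrative, and this is a substitute technique rather than what np.polyfit does internally.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, (400, 10))     # hypothetical data
y = rng.uniform(0, 5, (400, 10))     # hypothetical data
deg = 2

# Stack of Vandermonde matrices, shape (400, 10, 3): columns x**2, x**1, x**0
V = x[:, :, None] ** np.arange(deg, -1, -1)

# Least-squares solution per row via the pseudo-inverse, shape (400, 3)
coe_batch = (np.linalg.pinv(V) @ y[:, :, None])[:, :, 0]

# Reference: row-by-row np.polyfit, used here only to check the result
coe_loop = np.array([np.polyfit(x[i], y[i], deg) for i in range(x.shape[0])])
print(np.allclose(coe_batch, coe_loop))
```

Whether this is actually faster than the loop depends on the array sizes, so it is worth timing both on the real data.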

counting results from a defined matrix

So I am very new to programming and Haskell is the first language that I'm learning. The problem I'm having is probably a very simple one but I simply can not find an answer, no matter how much I search.
So basically what I have is a 3x3-Matrix and each of the elements has a number from 1 to 3. This Matrix is predefined, now all I need to do is create a function which when I input 1, 2 or 3 tells me how many elements there are in this matrix with this value.
I've been trying around with different things but none of them appear to be allowed, for example I've defined 3 variables for each of the possible numbers and tried to define them by
value w =
let a=0
b=0
c=0
in
if matrix 1 1==1 then a=a+1 else if matrix 1 1==2 then b=b+1
etc. etc. for every combination and field.
Ignoring the wrong syntax, which I'm really struggling with, my biggest problem is the fact that I can't use a "=" with "if, then". Is there a way to bypass this or maybe a way to use "stored data" from previously defined functions?
I hope I made my question somewhat clear, as I said I've only been at programming for 2 days now and I just can't seem to find a way to make this work!
By default, Haskell doesn't use updateable variables. Instead, you typically make a new value, and pass it somewhere else (e.g., return it from a function, add it into a list, etc).
I would approach this in two steps: get a list of the elements from your matrix, then count the elements with each value.
-- get list of elements using list comprehension
elements = [matrix i j | i <- [1..3], j <- [1..3]]
-- define counting function
count (x,y,z) (1:tail) = count (x+1,y,z) tail
count (x,y,z) (2:tail) = count (x,y+1,z) tail
count (x,y,z) (3:tail) = count (x,y,z+1) tail
count scores [] = scores
-- use counting function
(a,b,c) = count (0,0,0) elements
There are better ways of accumulating scores, but this seems closest to what your question is looking for.
Per comments below, an example of a more idiomatic counting method, using foldl and an accumulation function addscore instead of the count function above:
-- define accumulation function
addscore (x,y,z) 1 = (x+1,y,z)
addscore (x,y,z) 2 = (x,y+1,z)
addscore (x,y,z) 3 = (x,y,z+1)
-- use accumulation function
(a,b,c) = foldl addscore (0,0,0) elements
