How to choose multiple columns from a sympy matrix? Broken indexing? - python-3.x

I'm trying to pick multiple columns from a sympy matrix. However, the indexing does not work as expected. The code
import sympy as sp
stdA = sp.Matrix(
[
[-2, 1, 1, 0],
[1, 1, 0, 1]
]
)
b = sp.Matrix(
[
[3],
[2]
]
)
B1 = stdA[:, [0, 1]]
B2 = stdA[:, [0, 2]]
B3 = stdA[:, [0, 3]]
B4 = stdA[:, [1, 2]]
B5 = stdA[:, [1, 3]]
B6 = stdA[:, [2, 3]]
print("std A =", stdA)
print("b =", b)
print("B1 =", B1)
print("B2 =", B2)
print("B3 =", B3)
print("B4 =", B4)
print("B5 =", B5)
print("B6 =", B6)
prints
See the issue with B3, and the matrices after it? It's supposed to read B3 = Matrix([[-2, 1], [0, 1]]). I thought slicing SymPy matrices produces copies of them, so stdA shouldn't be altered in place.
What is causing this erroneous behaviour, and how can I choose specific columns from a matrix with simple indexing?

You requested all rows and columns 0 and 3. That is what you got:
>>> B3
Matrix([
[-2, 0],
[ 1, 1]])
Matrix presents the contents as a list of rows, so the two selected columns, [-2, 1] and [0, 1], are printed as the rows [-2, 0] and [1, 1].
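A minimal sketch of the point above, assuming SymPy is installed and reusing stdA from the question. The names first_col, second_col and B3_alt are only illustrative. Reading the result column by column shows that the slice returned the requested columns, and extract() is one explicit alternative to the slice syntax:

```python
import sympy as sp

stdA = sp.Matrix([
    [-2, 1, 1, 0],
    [1, 1, 0, 1],
])

# All rows, columns 0 and 3; the result is a new matrix.
B3 = stdA[:, [0, 3]]

# Reading the result column by column recovers the requested columns.
first_col = list(B3.col(0))   # column 0 of stdA
second_col = list(B3.col(1))  # column 3 of stdA

# extract(rows, cols) selects the same entries with explicit index lists.
B3_alt = stdA.extract([0, 1], [0, 3])

print(first_col, second_col)
print(B3_alt == B3)
```

stdA itself is not modified by either call.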

Related

Array element shift

I have a question. Sorry if it's very simple, I'm new to this and have struggled for several hours to do this without success.
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
I am trying to divide the first element of a1 by the second element of a2, the second element of a1 by the third element of a2, the third element of a1 by the fourth element of a2, etc...it's a long list but this is a short form.
The new array or list should be something like this:
a3 = [1/2, 2/3, 3/4, 4/5, 5/6, 6/7, 7/8, 8/9, 9/10]
Here is my code:
a1_new = a1[:-1]
a2_new = a1[1:]
a3 = a1_new/a2_new
return a3
The answer is not correct.
What is a better way to do this?
In Excel 365
={1,2,3,4,5,6,7,8,9,10}/{2,3,4,5,6,7,8,9,10,11}
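In plain Python, one possible sketch of the shifted division described in the question pairs each element of a1 with the next element of a2 using zip. This assumes a1 and a2 are ordinary lists, which do not support the / operator, and that no divisor is zero:

```python
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Plain lists do not support `/`, so divide element by element.
# zip stops at the shorter input, so the last element of a1 stays unpaired.
a3 = [x / y for x, y in zip(a1, a2[1:])]

print(a3)
```

If the data were NumPy arrays instead of lists, the slicing approach from the question, a1[:-1] / a2[1:], would work directly.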

Check if all list values in dataframe column are the same [duplicate]

If the type of a column in dataframe is int, float or string, we can get its unique values with columnName.unique().
But what if the column holds lists, e.g. [1, 2, 3]?
How could I get the unique values of such a column?
I think you can convert the values to tuples, after which unique() works nicely:
df = pd.DataFrame({'col':[[1,1,2],[2,1,3,3],[1,1,2],[1,1,2]]})
print (df)
col
0 [1, 1, 2]
1 [2, 1, 3, 3]
2 [1, 1, 2]
3 [1, 1, 2]
print (df['col'].apply(tuple).unique())
[(1, 1, 2) (2, 1, 3, 3)]
L = [list(x) for x in df['col'].apply(tuple).unique()]
print (L)
[[1, 1, 2], [2, 1, 3, 3]]
You cannot apply unique() on a non-hashable type such as list. You need to convert to a hashable type to do that.
A better solution with a recent version of pandas is to use duplicated(), which avoids iterating over the values to convert them back to lists.
df[~df.col.apply(tuple).duplicated()]
That returns the rows holding the first occurrence of each distinct list, with the values still stored as lists.

Averaging n elements along 1st axis of 4D array with numpy

I have a 4D array containing daily time-series of gridded data for different years with shape (year, day, x-coordinate, y-coordinate). The actual shape of my array is (19, 133, 288, 620), so I have 19 years of data with 133 days per year over a 288 x 620 grid. I want to take the weekly average of each grid cell over the period of record. The shape of the weekly averaged array should be (19, 19, 288, 620), or (year, week, x-coordinate, y-coordinate). I would like to use numpy to achieve this.
Here I construct some dummy data to work with and an array of what the solution should be:
import numpy as np
a1 = np.arange(1, 10).reshape(3, 3)
a1days = np.repeat(a1[np.newaxis, ...], 7, axis=0)
b1 = np.arange(10, 19).reshape(3, 3)
b1days = np.repeat(b1[np.newaxis, ...], 7, axis=0)
c1year = np.concatenate((a1days, b1days), axis=0)
a2 = np.arange(19, 28).reshape(3, 3)
a2days = np.repeat(a2[np.newaxis, ...], 7, axis=0)
b2 = np.arange(29, 38).reshape(3, 3)
b2days = np.repeat(b2[np.newaxis, ...], 7, axis=0)
c2year = np.concatenate((a2days, b2days), axis=0)
dummy_data = np.concatenate((c1year, c2year), axis=0).reshape(2, 14, 3, 3)
solution = np.concatenate((a1, b1, a2, b2), axis=0).reshape(2, 2, 3, 3)
The shape of the dummy_data is (2, 14, 3, 3). Per the dummy data, I have two years of data, 14 days per year, over a 3 X 3 grid. I want to return the weekly average of the grid for both years, resulting in a solution with shape (2, 2, 3, 3).
You can reshape and take mean:
week_mean = dummy_data.reshape(2,-1,7,3,3).mean(axis=2)
# in your case .reshape(year, -1, 7, x_coord, y_coord)
# check:
(dummy_data.reshape(2,2,7,3,3).mean(axis=2) == solution).all()
# True
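The same reshape can be wrapped so the shape is read from the array rather than hard-coded. This is only a sketch: weekly_mean and days_per_week are hypothetical names, and it assumes the number of days divides evenly into weeks (as 133 = 19 * 7 does for the real data):

```python
import numpy as np

def weekly_mean(data, days_per_week=7):
    """Average blocks of consecutive days along axis 1 of a 4D array."""
    years, days, nx, ny = data.shape
    if days % days_per_week:
        raise ValueError("number of days must be a multiple of days_per_week")
    weeks = days // days_per_week
    # Split the day axis into (week, day-in-week), then average the days.
    return data.reshape(years, weeks, days_per_week, nx, ny).mean(axis=2)

rng = np.random.default_rng(0)
sample = rng.random((2, 14, 3, 4))
result = weekly_mean(sample)

# Cross-check the first week against a direct slice-and-mean.
expected_week0 = sample[:, 0:7].mean(axis=1)
print(result.shape, np.allclose(result[:, 0], expected_week0))
```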

Replace indicator values with actual values

I have a numpy array like this
array([[0, 0, 1],
[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
and an array with values
array([1, 2, 3, 4])
I would like to replace the ones in the first two-dimensional array with the corresponding values in the second array. Each row of the first array has exactly one 1, and there is only 1 replacement in the second array.
Result:
array([[0, 0, 1],
[2, 0, 0],
[0, 3, 0],
[0, 0, 4]])
I would like an elegant solution to achieve this, without loops and such.
Let's say a is the 2D data array and b the second 1D array.
An elegant solution would be -
a[a==1] = b
For performance, leveraging the fact that there's exactly one 1 per row, we could also use indexing -
a[np.arange(len(a)),a.argmax(1)] = b
Selectively assign per row
If we want to selectively mask and assign values per row, we could use one more level of masking. So, let's say we have the rows to be selected as -
select_rows = np.array([1,3])
Then, we could do -
rowmask = np.isin(np.arange(len(a)),select_rows)
So, the replacement with the first approach would be -
a[(a==1) & rowmask[:,None]] = b[rowmask]
And for the second one -
a[np.arange(len(a))[rowmask],a.argmax(1)[rowmask]] = b[rowmask]

python 3 double loop comprehension clarification

I'm curious about the double for-loop comprehension.
Comprehension:
multilist = [[row*col for col in range(colNum)] for row in range(rowNum)]
Normal double loop:
for row in range(rowNum):
    for col in range(colNum):
        multilist[row][col] = row*col
Both of the methods yield the same outcome. For instance, I insert 3 as my row and 5 as my col, they would produce
[[0, 0, 0, 0, 0], [0, 1, 2, 3, 4], [0, 2, 4, 6, 8]]
My question is why the col for-loop is placed as the outer loop in the comprehension instead of the row for-loop? I would welcome any explanation.
Thank you.
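In a nested list comprehension such as yours, the for clause of the outer comprehension (the rightmost one, over range(rowNum)) is executed first, and the inner bracketed comprehension is evaluated once for each row.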
multilist = [[row*col for col in range(colNum)] for row in range(rowNum)]
Therefore, the col for-loop is still the inner loop in the comprehension.
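To make the ordering concrete, here is a small sketch that builds the same table three ways, using the 3 and 5 from the question. The names nested, looped, current and flat are only illustrative:

```python
rowNum, colNum = 3, 5

# Nested comprehension: the inner list is rebuilt once per row.
nested = [[row * col for col in range(colNum)] for row in range(rowNum)]

# Equivalent explicit loops: row is the outer loop, col the inner one.
looped = []
for row in range(rowNum):
    current = []
    for col in range(colNum):
        current.append(row * col)
    looped.append(current)

# A single comprehension with two for clauses runs them left to right,
# so here row is written first and is again the outer loop.
flat = [row * col for row in range(rowNum) for col in range(colNum)]

print(nested == looped)
print(flat)
```

The explicit loop appends to fresh lists because assigning to multilist[row][col] only works once multilist already holds rowNum lists of length colNum.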
