Problem in numpy operation for a 2*2 list - python-3.x

I wrote some code that builds a 2D list:
row_num = int(input())
col_ = int(input())
arr2 = []
for i in range(row_num):
    arr2.append([])
    a = input()
    a = a.split(" ")
    for j in range(col_):
        arr2[i].append(a[j])
for j in range(2):
    arr2[j][-2] = float(arr2[j][-2]) - float(arr2[j][-1])
print(arr2)
At first I didn't convert the list into a NumPy array, and my output was:
2
2
2 9
2 9
[[-7.0, '9'], [-7.0, '9']]
But when I convert the list into a NumPy array and do the same operation:
import numpy as np
row_num = int(input())
col_ = int(input())
arr2 = []
for i in range(row_num):
    arr2.append([])
    a = input()
    a = a.split(" ")
    for j in range(col_):
        arr2[i].append(a[j])
arr2 = np.array(arr2) # here I am converting the list into a NumPy array
for j in range(2):
    arr2[j][-2] = float(arr2[j][-2]) - float(arr2[j][-1])
print(arr2)
I got a different output:
2
2
2 9
2 9
[['-' '9']
['-' '9']]
Why am I getting different answers?

The documentation for numpy.array() says, for its parameter dtype:
The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. ...
This is exactly what happened here. You were expecting the array to be able to hold both str and float values, as ordinary Python lists do; for that, the dtype would have to be object. Since you didn't specify the type, NumPy chose the smallest type that could hold your input, which is one-character strings:
>>> np.array(['2', '9'])
array(['2', '9'],
dtype='<U1')
Then, when you tried to put -7.0 into one slot of an array of one-character strings, NumPy converted it to the string '-7.0' and kept only the first character, '-'.
So specify the dtype you want for your array when you create it. If you're looking to gain some of the performance advantages of NumPy, you probably want to use a floating-point dtype and convert your strings into floats before you put them into the NumPy array. Or you could do the conversion with astype():
>>> np.array(['2', '9']).astype(float)
array([2., 9.])
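A minimal sketch of the difference, using the same "2 9" rows as the question. The variable names are illustrative and not taken from the original code.

```python
import numpy as np

rows = [["2", "9"], ["2", "9"]]

# Without a dtype, NumPy picks a fixed-width string type: one character per slot.
as_text = np.array(rows)
as_text[0, -2] = -7.0      # stored as the text of -7.0, cut down to one character
print(as_text[0, -2])      # -

# Converting to float first makes the subtraction behave as expected.
as_float = np.array(rows).astype(float)
as_float[:, -2] = as_float[:, -2] - as_float[:, -1]
print(as_float.tolist())   # [[-7.0, 9.0], [-7.0, 9.0]]
```

Using dtype=object would also keep the mixed str and float values, but it gives up most of NumPy's speed advantage, so converting to float is usually the better choice.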

Related

How to set the values of different channels in a numpy array to zero based on index values from another array without using loops?

I have a numpy array "arr" and an array of indices "ind":
import numpy as np
arr = np.random.randint(255, size=(100,64,64,16))
ind = np.random.randint(16, size=(100,2))
In arr, the last dimension represents channels while the first dimension represents the number of samples. In ind, each row holds two random channel indices for the corresponding sample in arr. For every sample, I want to set the entire 64 x 64 channel to 0 for each channel index listed in ind, without using any loop. How can this be achieved?
I have tried using:
arr[:,:,:,ind] = 0
I thought that the indices would be broadcast per sample, but instead the entire array becomes 0. Using loops is quite time consuming and inefficient. I also wanted to use np.where, but I am not sure what condition to use to access the indices of arrays.
I believe that you can use np.put_along_axis for this:
import numpy as np
arr = np.random.randint(255, size=(100,64,64,16))
ind = np.random.randint(16, size=(100,2))
np.put_along_axis(arr, ind[:, None, None, :], 0, axis=3)
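One way to gain confidence in this is to compare it with an explicit loop on a small array. The sketch below uses smaller shapes than the question, purely to keep the check fast, and also shows an equivalent form with plain advanced indexing.

```python
import numpy as np

rng = np.random.default_rng(0)
# Values start at 1 so that any zero in the result was written by the indexing.
arr = rng.integers(1, 255, size=(5, 4, 4, 16))
ind = rng.integers(16, size=(5, 2))

# Reference result from an explicit loop over samples.
expected = arr.copy()
for sample, channels in enumerate(ind):
    expected[sample, :, :, channels] = 0

# put_along_axis: indices of shape (5, 1, 1, 2) broadcast over the 4 x 4 grid.
result = arr.copy()
np.put_along_axis(result, ind[:, None, None, :], 0, axis=3)

# Alternative: pair each sample index with its own channel indices.
alt = arr.copy()
alt[np.arange(len(ind))[:, None], :, :, ind] = 0

print(np.array_equal(result, expected))   # True
print(np.array_equal(alt, expected))      # True
```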

Array entry used in function turns from nan to 0 numpy python

I made a simple function that produces a weighted average of several time series using supplied weights. It is designed to handle missing values (NaNs), which is why I am not using numpy's supplied average function.
However, when I feed it my array containing missing values, the array has its nan values replaced by 0s! I would have assumed that since I am changing the name of the array and it is not a global variable, this should not happen. I want my X array to retain its original form, including the nan value.
I am a relative novice using python (obviously).
Example:
X = np.array([[1, 2, 3], [1, 2, 3], [1, 2, np.nan]]) # 3 time series to be weighted together
weights = np.array([[1,1,1]]) # simple example with weights for each series as 1
def WeightedMeanNaN(Tseries, weights):
    ## calculates weighted mean
    N_Tseries = Tseries
    Weights = np.repeat(weights, len(N_Tseries), axis=0) # make a vector of weights matching size of time series
    loc = np.where(np.isnan(N_Tseries)) # get location of nans
    Weights[loc] = 0
    N_Tseries[loc] = 0
    Weights = Weights/Weights.sum(axis=1)[:,None] # normalize each row so that weights sum to 1
    WeightedAve = np.multiply(N_Tseries,Weights)
    WeightedAve = WeightedAve.sum(axis=1)
    return WeightedAve
WeightedMeanNaN(Tseries = X, weights = weights)
Out[161]: array([2. , 2. , 1.5])
In: X
Out:
array([[1., 2., 3.],
       [1., 2., 3.],
       [1., 2., 0.]]) # no longer nan!!
Where you call
loc = np.where(np.isnan(N_Tseries)) # get location of nans
Weights[loc] = 0
N_Tseries[loc] = 0
you replace all NaNs with zeros.
To reverse this you could iterate over the array and replace zeros with NaNs.
However, this would also turn regular zeros into NaNs.
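Rather than restoring NaNs afterwards, another option is to avoid writing to the input at all. The following is a minimal sketch, not the asker's code: the name weighted_mean_nan is made up for illustration, and it assumes the weights broadcast against the rows of the series.

```python
import numpy as np

def weighted_mean_nan(tseries, weights):
    """Row-wise weighted mean that ignores NaNs and leaves `tseries` untouched."""
    tseries = np.asarray(tseries, dtype=float)
    valid = ~np.isnan(tseries)
    # np.where builds new arrays, so the caller's data is never written to.
    w = np.where(valid, weights, 0.0)
    values = np.where(valid, tseries, 0.0)
    return (values * w).sum(axis=1) / w.sum(axis=1)

X = np.array([[1, 2, 3], [1, 2, 3], [1, 2, np.nan]])
weights = np.array([[1, 1, 1]])

print(weighted_mean_nan(X, weights).tolist())   # [2.0, 2.0, 1.5]
print(np.isnan(X[2, 2]))                        # True, X still holds its NaN
```

A row that contains only NaNs has a weight sum of zero, so this sketch returns nan for that row (with a runtime warning) instead of raising.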
So it turns out this mistake came from my being used to working in Matlab. Python passes a reference to the original object into the function, so N_Tseries = Tseries only gives the same array a second name. In contrast, Matlab creates copies that are discarded when the function ends.
I solved my problem by adding ".copy()" when assigning variables in the function, so that the first line in the function above becomes:
N_Tseries = Tseries.copy()
However, one thing that puzzles me is that some people have suggested that using Tseries[:] should also create a copy of Tseries rather than a reference to the original variable. This did not work for me though.
I found this answer useful:
Python function not supposed to change a global variable
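On the Tseries[:] point: slicing a Python list does produce a new list, but basic slicing of a NumPy array returns a view onto the same data, which would explain why it did not help here. A short sketch with made-up variable names:

```python
import numpy as np

original = np.array([1.0, 2.0, np.nan])

alias = original          # same object, new name
view = original[:]        # new array object, but it shares the same data buffer
copy = original.copy()    # independent data

print(alias is original)                   # True
print(np.shares_memory(view, original))    # True
print(np.shares_memory(copy, original))    # False

view[2] = 0.0             # writes through to `original`
copy[0] = 99.0            # does not affect `original`
print(original.tolist())  # [1.0, 2.0, 0.0]

# For a plain Python list, slicing does copy, which is where the advice comes from.
items = [1.0, 2.0, 3.0]
items_copy = items[:]
items_copy[0] = 0.0
print(items)              # [1.0, 2.0, 3.0]
```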

Delimit array with different strings

I have a text file that contains 3 columns of useful data that I would like to be able to extract in python using numpy. The file type is a *.nc and is NOT a netCDF4 filetype. It is a standard file output type for CNC machines. In my case it is sort of a CMM (coordinate measurement machine). The format goes something like this:
X0.8523542Y0.0000000Z0.5312869
The X,Y, and Z are the coordinate axes on the machine. My question is, can I delimit an array with multiple delimiters? In this case: "X","Y", and "Z".
You can use Pandas:
import pandas as pd
from io import StringIO
#Create a mock file
ncfile = StringIO("""X0.8523542Y0.0000000Z0.5312869
X0.7523542Y1.0000000Z0.5312869
X0.6523542Y2.0000000Z0.5312869
X0.5523542Y3.0000000Z0.5312869""")
df = pd.read_csv(ncfile,header=None)
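# Use regex with split to define delimiters as X, Y, Z.
df_out = df[0].str.split(r'X|Y|Z', expand=True)
# set_axis returns a new DataFrame by default; the inplace argument is not needed and was removed in pandas 2.0
df_out.set_axis(['index', 'X', 'Y', 'Z'], axis=1)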
Output:
index X Y Z
0 0.8523542 0.0000000 0.5312869
1 0.7523542 1.0000000 0.5312869
2 0.6523542 2.0000000 0.5312869
3 0.5523542 3.0000000 0.5312869
I ended up using the Pandas solution provided by Scott. For some reason I am not 100% clear on, I cannot simply convert the array from string to float with float(array). I created an array of equal size and iterated over the size of the array, converting each individual element to a float and saving it to the other array.
Thanks all
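A possible explanation for the conversion problem, offered as an assumption because the failing code is not shown: float() accepts only a single value, and splitting on the axis letters leaves an empty string in front of the "X", which cannot be converted. The sketch below uses re.split and NumPy instead of pandas to show the same effect; dropping the empty column and calling astype(float) converts everything at once without an explicit loop.

```python
import re
import numpy as np

lines = ["X0.8523542Y0.0000000Z0.5312869", "X0.7523542Y1.0000000Z0.5312869"]

# Splitting on the axis letters leaves an empty string before the "X".
parts = np.array([re.split(r"[XYZ]", line) for line in lines])
print(parts.shape)        # (2, 4)
print(repr(parts[0, 0]))  # an empty string

# Skip the empty first column, then convert every element in one call.
coords = parts[:, 1:].astype(float)
print(coords.tolist())    # [[0.8523542, 0.0, 0.5312869], [0.7523542, 1.0, 0.5312869]]
```

With a DataFrame, the same idea would be to select the X, Y and Z columns and call astype(float) on them.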
Using the filter function that I suggested in a comment:
String sample (standin for file):
In [1]: txt = '''X0.8523542Y0.0000000Z0.5312869
...: X0.8523542Y0.0000000Z0.5312869
...: X0.8523542Y0.0000000Z0.5312869
...: X0.8523542Y0.0000000Z0.5312869'''
Basic genfromtxt use - getting strings:
In [3]: np.genfromtxt(txt.splitlines(), dtype=None,encoding=None)
Out[3]:
array(['X0.8523542Y0.0000000Z0.5312869', 'X0.8523542Y0.0000000Z0.5312869',
'X0.8523542Y0.0000000Z0.5312869', 'X0.8523542Y0.0000000Z0.5312869'],
dtype='<U30')
This array of strings could be split in the same spirit as the pandas answer.
Define a function to replace the delimiter characters in a line:
In [6]: def foo(aline):
   ...:     return aline.replace('X','').replace('Y',',').replace('Z',',')
re could be used for a prettier split.
Test it:
In [7]: foo('X0.8523542Y0.0000000Z0.5312869')
Out[7]: '0.8523542,0.0000000,0.5312869'
Use it in genfromtxt:
In [9]: np.genfromtxt((foo(aline) for aline in txt.splitlines()), dtype=float,delimiter=',')
Out[9]:
array([[0.8523542, 0. , 0.5312869],
[0.8523542, 0. , 0.5312869],
[0.8523542, 0. , 0.5312869],
[0.8523542, 0. , 0.5312869]])
With a file instead, the generator would be something like:
(foo(aline) for aline in open(afile))
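As a sketch of the re-based variant mentioned above: the helper name parse_line and the pattern are illustrative, and the pattern assumes every line has exactly the form X<number>Y<number>Z<number>.

```python
import re
import numpy as np

# Hypothetical pattern: three numeric fields introduced by X, Y and Z.
LINE_PATTERN = re.compile(r"X([-+.\deE]+)Y([-+.\deE]+)Z([-+.\deE]+)")

def parse_line(line):
    """Return [x, y, z] as floats, or raise if the line has another layout."""
    match = LINE_PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unexpected line format: {line!r}")
    return [float(group) for group in match.groups()]

txt = """X0.8523542Y0.0000000Z0.5312869
X0.7523542Y1.0000000Z0.5312869"""

coords = np.array([parse_line(line) for line in txt.splitlines()])
print(coords.shape)   # (2, 3)
```

Compared with chained replace calls, this version fails loudly on lines that do not match the expected layout, which can be useful if the file also contains headers or other commands.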

Numpy append 3D vectors without flattening [duplicate]

This question already has answers here:
How do I add an extra column to a NumPy array?
(17 answers)
Closed 5 years ago.
I have the following array:
video_132.shape
Out[64]: (64, 3)
to which I would like to add a new vector of three values:
video_146[1][146][45]
such that
video_146[1][146][45].shape
Out[68]: (3,)
and
video_146[1][146][45]
Out[69]: array([217, 207, 198], dtype=uint8)
When I do the following
np.append(video_132,video_146[1][146][45])
I expect to get
video_132.shape
Out[64]: (65, 3) # originally (64,3)
However, I get:
Out[67]: (195,) # 64*3+3=195
It seems that it flattens the array.
How can I do the append while preserving the (N, 3) structure?
For visual simplicity let's rename video_132 --> a, and video_146[1][146][45] --> b. The particular values aren't important so let's say
In [82]: a = np.zeros((64, 3))
In [83]: b = np.ones((3,))
Then we can append b to a using:
In [84]: np.concatenate([a, b[None, :]]).shape
Out[84]: (65, 3)
Since np.concatenate returns a new array, reassign its return value to a to "append" b to a:
a = np.concatenate([a, b[None, :]])
Code for append:
def append(arr, values, axis=None):
    arr = asanyarray(arr)
    if axis is None:
        if arr.ndim != 1:
            arr = arr.ravel()
        values = ravel(values)
        axis = arr.ndim-1
    return concatenate((arr, values), axis=axis)
Note how arr is raveled if no axis is provided:
In [57]: np.append(np.ones((2,3)),2)
Out[57]: array([1., 1., 1., 1., 1., 1., 2.])
append is really aimed at simple cases like adding a scalar to a 1d array:
In [58]: np.append(np.arange(3),6)
Out[58]: array([0, 1, 2, 6])
Otherwise the behavior is hard to predict.
concatenate is the base operation (builtin) and takes a list of arrays, not just two. So we can collect many arrays (or lists) in one list and do one concatenate at the end of a loop. And since it doesn't tweak the dimensions beforehand, it forces us to do that ourselves.
So to add a shape (3,) array to a (64,3) one, we have to transform that (3,) into (1,3). append requires the same dimension adjustment as concatenate if we specify the axis.
In [68]: np.append(arr,b[None,:], axis=0).shape
Out[68]: (65, 3)
In [69]: np.concatenate([arr,b[None,:]], axis=0).shape
Out[69]: (65, 3)
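The "collect in a list, concatenate once" pattern described above might look like the following sketch. The five new rows are made-up stand-ins for vectors such as video_146[1][146][45].

```python
import numpy as np

a = np.zeros((64, 3))
new_rows = [np.ones(3) * k for k in range(5)]   # five illustrative (3,) vectors

# Promote each (3,) vector to (1, 3), gather everything in a list, then join once.
pieces = [a] + [row[None, :] for row in new_rows]
stacked = np.concatenate(pieces, axis=0)
print(stacked.shape)                      # (69, 3)

# np.vstack performs the same promotion of 1-D inputs itself.
print(np.vstack([a] + new_rows).shape)    # (69, 3)
```

Joining once at the end avoids re-copying the growing array on every iteration, which is what repeated np.append or np.concatenate calls inside a loop would do.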

python 3 round on one numpy float unexpectedly yielding float

I have an np.array startIdx originating from a list of tuples consisting of integer and float fields:
>>> startIdx, someInt, someFloat = np.array(resultList).T
>>> startIdx
array([0.0, 111.0, 333.0]) # 10 to a few 100 positive values of the order of 100 to 10000
>>> round(startIdx[2])
333.0 # oops
>>> help(round)
Round [...] returns an int when called with one argument, otherwise the same type as the number.
>>> round(np.pi)
3
>>> round(np.pi, 2) # the optional argument is the number of decimal digits
3.14
>>> round([0.0, 111.0, 333.0][2]) # to test whether it's specific for numpy arrays.
333
The float currently works (as index into numpy arrays) but yields a warning:
VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
I could avoid the conversion from tuples to arrays (and int to float) by collecting my results in a grossly oversized record array (with an int field ''startIdx'').
I could use something like int(. + 0.1), which is also ugly. Would int(round(.)) or even int(.) safely yield correct results?
In [70]: startIdx=np.array([0.0, 111.0, 333.0])
In [71]: startIdx
Out[71]: array([ 0., 111., 333.])
If you need an integer array, use astype:
In [72]: startIdx.astype(int)
Out[72]: array([ 0, 111, 333])
not round:
In [73]: np.round(startIdx)
Out[73]: array([ 0., 111., 333.])
np.array(resultList) produces a float dtype array because some values are float. arr=np.array(resultList, dtype='i,i,f') should produce a structured array with integer and float fields, assuming resultList is indeed a list of tuples.
startIdx = arr['f0']
should then be an integer dtype array.
I expect the memory use of the structured array to be the same as for the float one.
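A short sketch of both routes, using made-up tuples in place of resultList. On the int(.) question: floats represent whole numbers exactly up to 2**53, so for start indices of the order of 100 to 10000 that began life as integers, truncating with int or astype(int) gives back the original values.

```python
import numpy as np

# Illustrative data: (startIdx, someInt, someFloat) tuples.
result_list = [(0, 5, 0.25), (111, 6, 0.5), (333, 7, 0.75)]

# Plain conversion promotes every column to float.
as_float = np.array(result_list)
print(as_float.dtype.kind)              # f

# astype(int) truncates toward zero, which is exact for whole-number floats.
start_idx = as_float[:, 0].astype(int)
print(start_idx.tolist())               # [0, 111, 333]

# A structured array keeps the integer fields as integers from the start.
records = np.array(result_list, dtype="i,i,f")
print(records["f0"].tolist())           # [0, 111, 333]

# Either integer version can be used as an index without a warning.
data = np.arange(1000)
print(data[records["f0"][2]])           # 333
```

Note that "f" in the dtype string is a 32-bit float, so the float field is stored with less precision than in the default float64 array.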
