How do I normalize data loaded from a file? Here is what I have. The data looks like this:
65535, 3670, 65535, 3885, -0.73, 1
65535, 3962, 65535, 3556, -0.72, 1
The last value in each line is the target. I want to keep the same structure of the data, but with normalized values.
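import numpy as np
# assumed: preprocessing below is scikit-learn's preprocessing module
from sklearn import preprocessing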
dataset = np.loadtxt('infrared_data.txt', delimiter=',')
# select first 5 columns as the data
X = dataset[:, 0:5]
# is that correct? Should I normalize along 0 axis?
normalized_X = preprocessing.normalize(X, axis=0)
y = dataset[:, 5]
Now the question is: how do I correctly pack normalized_X and y back together, so that the result has this structure:
dataset = [[normalized_X[0], y[0]],[normalized_X[1], y[1]],...]
It sounds like you're asking for np.column_stack. For example, let's set up some dummy data:
import numpy as np
x = np.arange(25).reshape(5, 5)
y = np.arange(5) + 1000
Which gives us:
X:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
Y:
array([1000, 1001, 1002, 1003, 1004])
And we want:
new = np.column_stack([x, y])
Which gives us:
New:
array([[ 0, 1, 2, 3, 4, 1000],
[ 5, 6, 7, 8, 9, 1001],
[ 10, 11, 12, 13, 14, 1002],
[ 15, 16, 17, 18, 19, 1003],
[ 20, 21, 22, 23, 24, 1004]])
If you'd prefer less typing, you can also use:
In [4]: np.c_[x, y]
Out[4]:
array([[ 0, 1, 2, 3, 4, 1000],
[ 5, 6, 7, 8, 9, 1001],
[ 10, 11, 12, 13, 14, 1002],
[ 15, 16, 17, 18, 19, 1003],
[ 20, 21, 22, 23, 24, 1004]])
However, I'd discourage using np.c_ for anything other than interactive use, simply due to readability concerns.
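Putting the question and this answer together, a minimal sketch might look like the following. It assumes that scaling each column to unit L2 norm is the kind of normalization wanted, uses plain NumPy for that step, and hard-codes the two sample rows from the question instead of reading infrared_data.txt. The names col_norms and normalized_dataset are made up for illustration.

```python
import numpy as np

# Two sample rows from the question; the last column is the target.
dataset = np.array([
    [65535, 3670, 65535, 3885, -0.73, 1],
    [65535, 3962, 65535, 3556, -0.72, 1],
])

X = dataset[:, 0:5]
y = dataset[:, 5]

# Scale every column (axis=0) to unit L2 norm.
col_norms = np.linalg.norm(X, axis=0, keepdims=True)
normalized_X = X / col_norms

# Re-attach the untouched target as the last column.
normalized_dataset = np.column_stack([normalized_X, y])
print(normalized_dataset.shape)  # (2, 6)
```

Note that a column consisting only of zeros would have norm 0 and would need special handling before the division.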
Here are my matrices and the line of code:
d = np.array([[1,2,3],[6,7,8],[11,12,13],
[16,17,18]])
e = np.array([[ 4, 5],[ 9, 10],[14, 15],[19, 20]])
np.concatenate(d,e)
and this is the error that I get:
TypeError: only integer scalar arrays can be converted to a scalar index
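The problem is in the call np.concatenate(d,e): the arrays have to be passed together as one sequence, such as a tuple: np.concatenate((d,e)). As written, e is taken as the axis argument, which is what raises the TypeError. I tested it, and axis=1 is also required for it to work, because d and e have the same number of rows but different numbers of columns.
np.concatenate((d, e), axis=1)
is the solution.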
Since those arrays have different shapes, you should specify the axis to concatenate along, like the following:
1) np.concatenate((d,e), axis=1)
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20]])
or
2) np.concatenate((d,e), axis=None)
array([ 1, 2, 3, 6, 7, 8, 11, 12, 13, 16, 17, 18, 4, 5, 9, 10, 14,
15, 19, 20])
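As a small runnable sketch of the points above, using the arrays from the question: the arrays go into one tuple, axis=1 joins them side by side, and the default axis=0 fails because the column counts differ. The variable names side_by_side and same_result are just for illustration.

```python
import numpy as np

d = np.array([[1, 2, 3], [6, 7, 8], [11, 12, 13], [16, 17, 18]])  # shape (4, 3)
e = np.array([[4, 5], [9, 10], [14, 15], [19, 20]])               # shape (4, 2)

# Arrays go in ONE sequence; axis=1 joins them side by side (row counts match).
side_by_side = np.concatenate((d, e), axis=1)
print(side_by_side.shape)  # (4, 5)

# np.hstack gives the same column-wise join for 2-D arrays.
same_result = np.hstack((d, e))

# The default axis=0 would stack rows, which fails here: 3 columns vs 2.
try:
    np.concatenate((d, e))
except ValueError as err:
    print("axis=0 fails:", err)
```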
I have a 2d numpy array as such:
import numpy as np
a = np.arange(20).reshape((2,10))
# array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
I want to swap pairs of elements in each row. The desired output looks like this:
# array([[ 9, 0, 2, 1, 4, 3, 6, 5, 8, 7],
# [19, 10, 12, 11, 14, 13, 16, 15, 18, 17]])
I managed to find a solution in 1d:
a = np.arange(10)
# does the job for all pairs except the first
output = np.roll(np.flip(np.roll(a,-1).reshape((-1,2)),1).flatten(),2)
# first pair done manually
output[0] = a[-1]
output[1] = a[0]
Any ideas on a "numpy only" solution for the 2d case?
Since the first pair does not follow the usual pair swap, we can handle it separately. For the rest, it is relatively straightforward: reshape to split the last axis into pairs, then flip along that new axis. Hence, it would be -
In [42]: a # 2D input array
Out[42]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
In [43]: b2 = a[:,1:-1].reshape(a.shape[0],-1,2)[...,::-1].reshape(a.shape[0],-1)
In [44]: np.hstack((a[:,[-1,0]],b2))
Out[44]:
array([[ 9, 0, 2, 1, 4, 3, 6, 5, 8, 7],
[19, 10, 12, 11, 14, 13, 16, 15, 18, 17]])
Alternatively, stack and then reshape+flip-axis -
In [50]: a1 = np.hstack((a[:,[0,-1]],a[:,1:-1]))
In [51]: a1.reshape(a.shape[0],-1,2)[...,::-1].reshape(a.shape[0],-1)
Out[51]:
array([[ 9, 0, 2, 1, 4, 3, 6, 5, 8, 7],
[19, 10, 12, 11, 14, 13, 16, 15, 18, 17]])
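For reuse, the first approach could be wrapped in a small function. This is only a sketch: the name swap_pairs_wrapped is made up here, and it assumes a 2-D array with an even number of columns, as in the question.

```python
import numpy as np

def swap_pairs_wrapped(a):
    """Hypothetical helper: put the last and first columns in front,
    then swap each following pair of columns."""
    rows = a.shape[0]
    # Middle columns, grouped into pairs, with each pair reversed.
    inner = a[:, 1:-1].reshape(rows, -1, 2)[..., ::-1].reshape(rows, -1)
    # The wrapped-around pair (last, first) is handled separately.
    return np.hstack((a[:, [-1, 0]], inner))

a = np.arange(20).reshape((2, 10))
print(swap_pairs_wrapped(a))
```

The input array a is left unchanged, since np.hstack builds a new array.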
I am trying really hard to keep my function from messing with my global list b, because I want to reuse that list in another, similar function that uses sets. But even though the parameter has a different name (y), b and y still seem to be bound together...
Most of the print lines are for debugging only, as I did not understand what was happening.
What am I doing wrong?
import random
a = random.sample(range(1,25),8)
b = random.sample(range(1,25),11)
a.sort()
b.sort()
def list_rdup(x,y):
    print('Loop remove duplicates:')
    print('x:',x)
    print('y:',y)
    for i in x:
        y.append(i)
    y.sort()
    print('y modified:',y)
    c = []
    for i in y:
        if i in c:
            pass
        else:
            c.append(i)
    return c
print('a:',a)
print('b:',b)
print(list_rdup(a,b))
print('a:',a)
print('b:',b)
Output: we first see a and b in their original state. Then I run the function and
print a and b again, which shows that b was modified in the process.
a: [1, 6, 10, 11, 12, 13, 17, 22]
b: [1, 2, 3, 7, 13, 16, 17, 19, 20, 21, 24]
Loop remove duplicates:
x: [1, 6, 10, 11, 12, 13, 17, 22]
y: [1, 2, 3, 7, 13, 16, 17, 19, 20, 21, 24]
y modified: [1, 1, 2, 3, 6, 7, 10, 11, 12, 13, 13, 16, 17, 17, 19, 20, 21, 22, 24]
[1, 2, 3, 6, 7, 10, 11, 12, 13, 16, 17, 19, 20, 21, 22, 24]
a: [1, 6, 10, 11, 12, 13, 17, 22]
b: [1, 1, 2, 3, 6, 7, 10, 11, 12, 13, 13, 16, 17, 17, 19, 20, 21, 22, 24]
The call list_rdup(a,b) passes references to the lists a and b, which are bound to the names x and y inside the function. So any in-place change to x or y (such as append or sort) also changes a or b. If you do not want a and b to change, pass a copy instead, for example b_copy = b.copy().
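A short sketch of that point, using the lists from the question's output. Both function names are made up for illustration. The second one is only one possible rewrite: it builds new objects with sets and never touches its arguments.

```python
a = [1, 6, 10, 11, 12, 13, 17, 22]
b = [1, 2, 3, 7, 13, 16, 17, 19, 20, 21, 24]

def same_object(y):
    # Inside the function, y is just another name for the caller's list.
    return y is b

def list_rdup_pure(x, y):
    # Hypothetical variant: creates a new sorted list, x and y stay as they are.
    return sorted(set(x) | set(y))

print(same_object(b))   # True
merged = list_rdup_pure(a, b)
print(merged)
print(b)                # unchanged
```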
To avoid altering the original lists, pass copies of them (here made by slicing, a[:] and b[:]).
import random
a=[1, 6, 10, 11, 12, 13, 17, 22]
b=[1, 2, 3, 7, 13, 16, 17, 19, 20, 21, 24]
#a.sort()
#b.sort()
def list_rdup(x,y):
    print('Loop remove duplicates:')
    print('x:',x)
    print('y:',y)
    for i in x:
        y.append(i)
    y.sort()
    print('y modified:',y)
    c = []
    for i in y:
        if i in c:
            pass
        else:
            c.append(i)
    return c
print('a:',a)
print('b:',b)
print(list_rdup(a[:], b[:]))
print('a:',a)
print('b:',b)
a: [1, 6, 10, 11, 12, 13, 17, 22]
b: [1, 2, 3, 7, 13, 16, 17, 19, 20, 21, 24]
Loop remove duplicates:
x: [1, 6, 10, 11, 12, 13, 17, 22]
y: [1, 2, 3, 7, 13, 16, 17, 19, 20, 21, 24]
y modified: [1, 1, 2, 3, 6, 7, 10, 11, 12, 13, 13, 16, 17, 17, 19, 20, 21, 22, 24]
[1, 2, 3, 6, 7, 10, 11, 12, 13, 16, 17, 19, 20, 21, 22, 24]
a: [1, 6, 10, 11, 12, 13, 17, 22]
b: [1, 2, 3, 7, 13, 16, 17, 19, 20, 21, 24]
Also consider deep copying (copy.deepcopy) if the list contains mutable objects such as nested lists, since a[:] and list.copy() only make shallow copies.
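To illustrate the difference, here is a minimal sketch with a made-up nested list: a slice copy shares the inner lists with the original, while copy.deepcopy duplicates them as well.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested[:]            # new outer list, same inner lists
deep = copy.deepcopy(nested)   # inner lists are copied too

shallow[0].append(99)          # also visible through `nested`
deep[1].append(-1)             # only affects `deep`

print(nested)  # [[1, 2, 99], [3, 4]]
print(deep)    # [[1, 2], [3, 4, -1]]
```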
When does map modify an array in place? I know the preferred way to iterate over an array is with a list comprehension, but I'm preparing an algorithm for ipyparallel, which apparently uses the map function. Each row of my array is a set of model inputs, and I want to use map, ultimately in parallel, to run the model for each row. I'm using Python 3.4.5 and Numpy 1.11.1. I need these versions for compatibility with other packages.
This simple example creates a list and leaves the input array intact, as I expected.
grid = np.arange(25).reshape(5,5)
grid
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
def f(g):
    return g + 1
n = list(map(f, grid))
grid
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
But when the function modifies a slice of the input row, the array is modified in place. Can anyone explain this behavior?
def f(g):
    g[:2] = g[:2] + 1
    return g
n = list(map(f, grid))
grid
array([[ 1, 2, 2, 3, 4],
[ 6, 7, 7, 8, 9],
[11, 12, 12, 13, 14],
[16, 17, 17, 18, 19],
[21, 22, 22, 23, 24]])
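One way to read this behavior, offered as a hedged sketch rather than a full answer: iterating over a 2-D NumPy array yields rows that are views sharing memory with the array, so the slice assignment inside f writes through to grid, whereas g + 1 allocates a new array. The helper name f_copy is made up for illustration.

```python
import numpy as np

grid = np.arange(25).reshape(5, 5)

# Each row produced by iteration is a view into grid, not a copy.
row = next(iter(grid))
print(np.shares_memory(row, grid))  # True

def f_copy(g):
    g = g.copy()          # work on a private copy of the row
    g[:2] = g[:2] + 1
    return g

n = list(map(f_copy, grid))
print(grid[0])  # [0 1 2 3 4], still untouched
```

Whether the same sharing applies once the rows are sent to parallel workers is a separate question that this sketch does not cover.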
I am trying to assign the numbers 0 through 25 to the 26 items of a list, with no number repeated. I assume you would use an if/else statement, but this is what I have so far:
from random import randrange
def f():
    a=[0]*26
    for x in a:
        b=randrange(0,26)
        a[b]=randrange(0,26)
    return(a)
print(f())
Make a list of numbers 0..25 and shuffle it:
>>> import random
>>> a = list(range(26))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
22, 23, 24, 25]
>>> random.shuffle(a)
>>> a
[11, 3, 17, 0, 20, 13, 24, 21, 4, 12, 14, 1, 22, 18, 5, 8, 6, 10, 9, 25, 23, 19,
16, 7, 2, 15]
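If you would rather get the shuffled list in one step, random.sample is an alternative worth considering. A minimal sketch: it draws without replacement, so asking for all 26 values of range(26) gives each number exactly once in random order.

```python
import random

# random.sample draws without replacement, so no value can repeat.
a = random.sample(range(26), 26)

print(len(a), len(set(a)))  # 26 26
```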