I have a long script, but the key point is here:
result = confusion_matrix(y_test, ypred)
where y_test is
>>> y_test
ZFFYZTN 3
ZDDKDTY 0
ZTYKTYKD 0
ZYNDQNDK 1
ZYZQNKQN 3
..
ZYMDDTM 3
ZYLNYFLM 0
ZTNTKDY 0
ZYYLZNKM 3
ZYZMQTZT 0
Name: BT, Length: 91, dtype: object
and the values are
>>> y_test.values
array([3, 0, 0, 1, 3, 0, 0, 1, 0, 3, 1, 0, 3, 1, 0, 0, 3, 0, 3, 0, 0, 0,
1, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0, 2, 3, 3, 0, 0, 3, 3, 1, 1, 0, 2,
0, 0, 0, 3, 3, 3, 1, 0, 3, 3, 3, 2, 3, 3, 0, 1, 0, 3, 3, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0, 3, 2, 0, 0, 0, 3, 3, 3, 0,
0, 3, 0], dtype=object)
and ypred is
>>> ypred
array([3, 0, 0, 1, 3, 0, 0, 1, 0, 3, 1, 0, 3, 1, 0, 0, 3, 0, 3, 0, 0, 0,
1, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0, 2, 3, 3, 0, 0, 3, 3, 1, 1, 0, 2,
0, 0, 0, 3, 3, 3, 1, 0, 3, 3, 3, 2, 3, 3, 0, 1, 0, 3, 3, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0, 3, 2, 0, 0, 0, 3, 3, 3, 0,
0, 3, 0])
gives
raise ValueError("Classification metrics can't handle a mix of {0} "
ValueError: Classification metrics can't handle a mix of unknown and multiclass targets
The confusing part is that I don't see any unknown targets.
So I checked out "ValueError: Classification metrics can't handle a mix of unknown and binary targets", but the solution there doesn't apply in my case, because all values are integers.
I've also checked "Skitlearn MLPClassifier ValueError: Can't handle mix of multiclass and multilabel-indicator", but there aren't any encodings in my data.
What can I do to get the confusion matrix and avoid these errors?
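This error is due to mismatched types: y_test.values has dtype=object, which scikit-learn reports as an "unknown" target type, while ypred is a plain integer array (a "multiclass" target).
The solution is to pass the y_test values to confusion_matrix as a list: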
result = confusion_matrix(list(y_test.values), ypred)
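As a minimal sketch of the same idea, the object array can also be cast to an integer dtype before it is passed on (for a pandas Series, y_test.astype(int) would do the same). The short arrays below are made-up stand-ins for y_test.values and ypred, and type_of_target is used only to show how scikit-learn classifies each input:

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.utils.multiclass import type_of_target

# Made-up stand-ins for y_test.values (dtype=object) and ypred (integer dtype)
y_true_obj = np.array([3, 0, 0, 1, 2, 3], dtype=object)
y_pred = np.array([3, 0, 0, 1, 2, 3])

# Show how scikit-learn classifies each target before the cast
print(type_of_target(y_true_obj))
print(type_of_target(y_pred))

# Cast the object array to integers so both inputs have a numeric dtype
y_true_int = y_true_obj.astype(int)
result = confusion_matrix(y_true_int, y_pred)
print(result)
```

Casting keeps the data as a numpy array, which may be preferable to a list when the labels are reused elsewhere.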
I would like to know if there is a simple way to convert a simple list of 0s and 1s, for example:
[[1, 1, 0, 0, 0, 0, 1, 1],
[1, 0, 1, 1, 1, 1, 0, 1],
[0, 1, 0, 1, 1, 0, 1, 0],
[0, 1, 1, 1, 1, 1, 1, 0],
[0, 1, 0, 1, 1, 0, 1, 0],
[0, 1, 1, 0, 0, 1, 1, 0],
[1, 0, 1, 1, 1, 1, 0, 1],
[1, 1, 0, 0, 0, 0, 1, 1]]
into a black and white image. For the list above, it should give this image:
[image: smiley]
Thanks for your help!
You can simply use matplotlib (I named your input matrix X):
import matplotlib.pyplot as plt
im = plt.imshow(X, cmap='Greys')
plt.show()
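For completeness, here is a self-contained sketch that defines X from the question and writes the picture as PNG data instead of opening a window. The Agg backend, the in-memory buffer, and interpolation="nearest" are choices made for this example, not part of the original answer. With cmap='Greys', 0 is drawn white and 1 black; cmap='gray' gives the inverse.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np

X = np.array([[1, 1, 0, 0, 0, 0, 1, 1],
              [1, 0, 1, 1, 1, 1, 0, 1],
              [0, 1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 1, 1, 1, 0],
              [0, 1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 0, 0, 1, 1, 0],
              [1, 0, 1, 1, 1, 1, 0, 1],
              [1, 1, 0, 0, 0, 0, 1, 1]])

fig, ax = plt.subplots()
# 'nearest' keeps the cell edges sharp instead of blurring between pixels
im = ax.imshow(X, cmap="Greys", interpolation="nearest")
ax.axis("off")  # hide ticks and frame

buf = io.BytesIO()  # pass a filename such as "smiley.png" to write a file instead
fig.savefig(buf, format="png", bbox_inches="tight")
plt.close(fig)
```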
I've got a 3D numpy bit array and I need to pack the bits along the third axis, which is exactly what numpy.packbits does. Unfortunately it only packs to uint8, and I need wider words. Is there a similar way to pack to uint16 or uint32?
Depending on your machine's endianness it is either a matter of simple view casting or of byte swapping and then view casting:
>>> a = np.random.randint(0, 2, (4, 16))
>>> a
array([[1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
[0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1]])
>>> np.packbits(a.reshape(-1, 2, 8)[:, ::-1]).view(np.uint16)
array([53226, 23751, 25853, 64619], dtype=uint16)
# check:
>>> [bin(x + (1<<16))[-16:] for x in _]
['1100111111101010', '0101110011000111', '0110010011111101', '1111110001101011']
You may have to reshape in the end.
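The one-liner above swaps the bytes by hand, which suits a little-endian machine. A sketch that should not depend on the machine's byte order is to pack along the last axis and then reinterpret each byte pair as an explicitly big-endian integer. The helper name pack_bits_16 is made up for this example, and it assumes the last axis length is a multiple of 16:

```python
import numpy as np

def pack_bits_16(bits):
    """Pack groups of 16 bits along the last axis into uint16, most significant bit first."""
    bits = np.asarray(bits, dtype=np.uint8)
    if bits.shape[-1] % 16:
        raise ValueError("last axis length must be a multiple of 16")
    packed = np.packbits(bits, axis=-1)  # uint8, last axis shrinks by a factor of 8
    # Read each pair of bytes as one big-endian 16-bit integer, then convert
    # to the native uint16 dtype so the result behaves like a normal array.
    return packed.view(">u2").astype(np.uint16)

a = np.array([[1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0],
              [0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1]])
packed = pack_bits_16(a)
print(packed.ravel())  # same values as the first two entries above: 53226, 23751
```

The same pattern with ">u4" and groups of 32 bits would give uint32 values. For a 3D input of shape (m, n, 16) the result has shape (m, n, 1).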
I have the following labels
>>> lab
array([3, 0, 3 ,3, 1, 1, 2 ,2, 3, 0, 1,4])
I want to assign these labels to the rows of another numpy array, i.e.
>>> arr
array([[81, 1, 3, 87], # 3
[ 2, 0, 1, 0], # 0
[13, 6, 0, 0], # 3
[14, 0, 1, 30], # 3
[ 0, 0, 0, 0], # 1
[ 0, 0, 0, 0], # 1
[ 0, 0, 0, 0], # 2
[ 0, 0, 0, 0], # 2
[ 0, 0, 0, 0], # 3
[ 0, 0, 0, 0], # 0
[ 0, 0, 0, 0], # 1
[13, 2, 0, 11]]) # 4
and add all corresponding rows with same labels.
The output must be
[[108, 7, 4, 117], # 3
[ 2, 0, 1, 0], # 0
[ 0, 0, 0, 0], # 1
[ 0, 0, 0, 0], # 2
[ 13, 2, 0, 11]] # 4
You could use groupby from pandas (the top-level pd.groupby function has been removed from pandas, so call the method on the DataFrame):
import pandas as pd
parr = pd.DataFrame(arr, index=lab)
parr.groupby(parr.index).sum()
0 1 2 3
0 2 0 1 0
1 0 0 0 0
2 0 0 0 0
3 108 7 4 117
4 13 2 0 11
numpy doesn't have a group_by function like pandas, but its ufuncs have a reduceat method that performs fast reductions on groups of elements (here, rows). Its application in this case is a bit messy, though.
Start with our 2 arrays:
In [39]: arr
Out[39]:
array([[81, 1, 3, 87],
[ 2, 0, 1, 0],
[13, 6, 0, 0],
[14, 0, 1, 30],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[13, 2, 0, 11]])
In [40]: lbls
Out[40]: array([3, 0, 3, 3, 1, 1, 2, 2, 3, 0, 1, 4])
Find the indices that will sort lbls (and rows of arr) into contiguous blocks:
In [41]: I=np.argsort(lbls)
In [42]: I
Out[42]: array([ 1, 9, 4, 5, 10, 6, 7, 0, 2, 3, 8, 11], dtype=int32)
In [43]: s_lbls=lbls[I]
In [44]: s_lbls
Out[44]: array([0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 3, 4])
In [45]: s_arr=arr[I,:]
In [46]: s_arr
Out[46]:
array([[ 2, 0, 1, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[81, 1, 3, 87],
[13, 6, 0, 0],
[14, 0, 1, 30],
[ 0, 0, 0, 0],
[13, 2, 0, 11]])
Find the boundaries of these blocks, i.e. where s_lbls jumps:
In [47]: J=np.where(np.diff(s_lbls))
In [48]: J
Out[48]: (array([ 1, 4, 6, 10], dtype=int32),)
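np.diff marks the last row of each block, so shift these indices by one to get the first row of the next block, and prepend the start of the first block (see the reduceat docs):
In [49]: J1=[0]+(J[0]+1).tolist()
In [50]: J1
Out[50]: [0, 2, 5, 7, 11]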
Apply add.reduceat:
In [51]: np.add.reduceat(s_arr,J1,axis=0)
Out[51]:
array([[ 2, 0, 1, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[108, 7, 4, 117],
[ 13, 2, 0, 11]], dtype=int32)
These are your numbers, sorted by lbls (for 0,1,2,3,4).
With reduceat you could take other actions like maximum, product etc.
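The steps above can be collected into one small function. This is only a sketch: the name sum_rows_by_label is made up here, and a stable sort is used so that rows with equal labels keep their original order.

```python
import numpy as np

def sum_rows_by_label(arr, labels):
    """Sum the rows of arr that share a label; returns (sorted unique labels, sums)."""
    arr = np.asarray(arr)
    labels = np.asarray(labels)
    order = np.argsort(labels, kind="stable")  # bring equal labels together
    s_labels = labels[order]
    s_arr = arr[order]
    # A block starts at row 0 and right after every position where the label changes
    starts = np.r_[0, np.flatnonzero(np.diff(s_labels)) + 1]
    return s_labels[starts], np.add.reduceat(s_arr, starts, axis=0)

arr = np.array([[81, 1, 3, 87], [2, 0, 1, 0], [13, 6, 0, 0], [14, 0, 1, 30],
                [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0],
                [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [13, 2, 0, 11]])
lbls = np.array([3, 0, 3, 3, 1, 1, 2, 2, 3, 0, 1, 4])

group_labels, group_sums = sum_rows_by_label(arr, lbls)
print(group_labels)
print(group_sums)
```

Swapping np.add for np.maximum or np.multiply in the last line of the function gives the per-group maximum or product instead.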
I have the following labels
>>> lab
array([2, 2, 2, 2, 2, 3, 3, 0, 0, 0, 0, 1])
I want to assign these labels to the rows of another numpy array, i.e.
>>> arr
array([[81, 1, 3, 87], # 2
[ 2, 0, 1, 0], # 2
[13, 6, 0, 0], # 2
[14, 0, 1, 30], # 2
[ 0, 0, 0, 0], # 2
[ 0, 0, 0, 0], # 3
[ 0, 0, 0, 0], # 3
[ 0, 0, 0, 0], # 0
[ 0, 0, 0, 0], # 0
[ 0, 0, 0, 0], # 0
[ 0, 0, 0, 0], # 0
[13, 2, 0, 11]]) # 1
and add the elements of 0th group, 1st group, 2nd group, 3rd group?
If the labels of equal values are contiguous, as in your example, then you may use np.add.reduceat:
>>> lab
array([2, 2, 2, 2, 2, 3, 3, 0, 0, 0, 0, 1])
>>> idx = np.r_[0, 1 + np.where(lab[1:] != lab[:-1])[0]]
>>> np.add.reduceat(arr, idx)
array([[110, 7, 5, 117], # 2
[ 0, 0, 0, 0], # 3
[ 0, 0, 0, 0], # 0
[ 13, 2, 0, 11]]) # 1
If they are not contiguous, then use np.argsort to reorder the array and the labels so that equal labels are next to each other:
>>> i = np.argsort(lab)
>>> lab, arr = lab[i], arr[i, :] # aligns array and labels such that labels
>>> lab # are sorted and equal labels are contiguous
array([0, 0, 0, 0, 1, 2, 2, 2, 2, 2, 3, 3])
>>> idx = np.r_[0, 1 + np.where(lab[1:] != lab[:-1])[0]]
>>> np.add.reduceat(arr, idx)
array([[ 0, 0, 0, 0], # 0
[ 13, 2, 0, 11], # 1
[110, 7, 5, 117], # 2
[ 0, 0, 0, 0]]) # 3
Alternatively, use groupby from the pandas library:
>>> pd.DataFrame(arr).groupby(lab).sum().values
array([[ 0, 0, 0, 0],
[ 13, 2, 0, 11],
[110, 7, 5, 117],
[ 0, 0, 0, 0]])
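A further numpy-only option, not part of the answers above, avoids sorting the rows: np.unique with return_inverse maps each label to a group index, and np.add.at accumulates the rows into the matching output row. The helper name group_sum is hypothetical, and np.add.at may be slower than reduceat on large inputs.

```python
import numpy as np

def group_sum(arr, lab):
    """Sum rows of arr per label without reordering arr; returns (unique labels, sums)."""
    arr = np.asarray(arr)
    uniq, inv = np.unique(np.asarray(lab), return_inverse=True)
    out = np.zeros((uniq.size,) + arr.shape[1:], dtype=arr.dtype)
    # Unbuffered in-place add: rows that map to the same group index accumulate
    np.add.at(out, inv, arr)
    return uniq, out

arr = np.array([[81, 1, 3, 87], [2, 0, 1, 0], [13, 6, 0, 0], [14, 0, 1, 30],
                [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0],
                [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [13, 2, 0, 11]])
lab = np.array([2, 2, 2, 2, 2, 3, 3, 0, 0, 0, 0, 1])

uniq, sums = group_sum(arr, lab)
print(uniq)
print(sums)
```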