I want to update the value of a key in a dictionary. This is a snippet of a list that contains over 300 dictionaries:
chats = [
{'hour': 10, 'operator': 'john_doe', 'duration': [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59], 'date': '2019-09-09'},
{'hour': 10, 'operator': 'john_doe', 'duration': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 'date': '2019-09-09'},
{'hour': 10, 'operator': 'john_doe', 'duration': [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], 'date': '2019-09-09'},
{'hour': 11, 'operator': 'john_doe', 'duration': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], 'date': '2019-09-09'},
{'hour': 10, 'operator': 'joseph_doe', 'duration': [5, 6, 7, 8, 9], 'date': '2019-09-09'}
]
This is my script, which gives me an error. I loop over chat_list to check whether a matching dict is already in it, so that I can update its duration.
chat_list = list()
for chat in chats:
    hour = chat.get('hour')
    operator = chat.get("operator")
    if len(chat_list) == 0:
        chat_list.append(chat)
    else:
        found = False
        for i in chat_list:
            hour2 = chat.get('hour')
            operator2 = chat.get("operator")
            if (hour2 == hour) and (operator == operator2):
                found = True
                #concat both dictionary
                i['duration'] = i.get('duration') + chat.get("duration")
        if found == True:
            found = False
        else:
            chat_list.append(chat)
My expected output is
chat_list = [
{'hour': 10, 'operator': 'john_doe', 'duration': [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], 'date': '2019-09-09'},
{'hour': 11, 'operator': 'john_doe', 'duration': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], 'date': '2019-09-09'},
{'hour': 10, 'operator': 'joseph_doe', 'duration': [5, 6, 7, 8, 9], 'date': '2019-09-09'}
]
or
df = pd.DataFrame(chat_list)
df['duration'] = df['duration'].apply(lambda x: list(set(x)))
To be honest, I didn't test your algorithm. Instead I took it as a small challenge and wrote the following algorithm, which doesn't need to copy chats into a new list.
It finds the first occurrence of a "similar" chat (same hour and operator) and concatenates the duration arrays. Then it deletes the "duplicated" chat. Further explanation is in the code itself:
chats = [
{'hour': 10, 'operator': 'john_doe', 'duration': [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59], 'date': '2019-09-09'},
{'hour': 10, 'operator': 'john_doe', 'duration': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 'date': '2019-09-09'},
{'hour': 10, 'operator': 'john_doe', 'duration': [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], 'date': '2019-09-09'},
{'hour': 11, 'operator': 'john_doe', 'duration': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], 'date': '2019-09-09'},
{'hour': 10, 'operator': 'joseph_doe', 'duration': [5, 6, 7, 8, 9], 'date': '2019-09-09'}
]
index = 0
# visit every chat, including the last one
while index < len(chats):
    chat = chats[index]
    # find the first "similar" chat in the list (possibly this one)
    first_index = next(
        i for i, first_chat in enumerate(chats)
        if chat.get('hour') == first_chat.get('hour') and chat.get('operator') == first_chat.get('operator')
    )
    # if the first index found is not this one:
    # - concat `duration` arrays
    # - delete this (duplicated) chat
    if index != first_index:
        chats[first_index]['duration'] += chat['duration']
        del chats[index]
    # otherwise continue and increment the index
    else:
        index += 1
print(chats)
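As an alternative to the in-place loop above, the same merge can be sketched with a dictionary keyed on (hour, operator), so the list is not rescanned for every chat. This is only a sketch: the helper name merge_chats and the shortened sample data are made up for illustration, and it assumes that merged chats should keep the other fields (such as date) of the first occurrence.

```python
def merge_chats(chats):
    """Merge chats that share the same (hour, operator) pair."""
    merged = {}  # (hour, operator) -> merged chat; dicts preserve insertion order
    for chat in chats:
        key = (chat.get('hour'), chat.get('operator'))
        if key in merged:
            # already seen: append this chat's durations to the first occurrence
            merged[key]['duration'] += chat['duration']
        else:
            # copy the dict and its list so the input data is left untouched
            merged[key] = {**chat, 'duration': list(chat['duration'])}
    return list(merged.values())


# shortened, hypothetical sample in the same shape as the question's data
sample = [
    {'hour': 10, 'operator': 'john_doe', 'duration': [22, 23], 'date': '2019-09-09'},
    {'hour': 10, 'operator': 'john_doe', 'duration': [0, 1], 'date': '2019-09-09'},
    {'hour': 10, 'operator': 'john_doe', 'duration': [18, 19], 'date': '2019-09-09'},
    {'hour': 11, 'operator': 'john_doe', 'duration': [0, 1, 2], 'date': '2019-09-09'},
    {'hour': 10, 'operator': 'joseph_doe', 'duration': [5, 6], 'date': '2019-09-09'},
]
chat_list = merge_chats(sample)
print(chat_list)
```

Unlike the while loop, this version returns a new list and leaves the original chats unchanged, which may or may not be what you want.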
I have this 2-D tensor:
tmp = torch.tensor([[ 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5,
5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11,
11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15, 16, 16, 16, 17,
17, 17, 18, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 22, 23,
23, 23, 24, 24, 24, 25, 25, 25, 26, 26, 26, 27, 27, 27, 28, 28, 28, 29,
29, 29, 30, 30, 30, 31, 31, 31, 31],
[ 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5,
5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11,
11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15, 0, 16, 16, 17,
17, 17, 18, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 22, 23,
23, 23, 24, 24, 24, 25, 25, 25, 26, 26, 26, 27, 27, 27, 28, 28, 28, 29,
29, 29, 30, 30, 30, 31, 31, 31, 31]])
So there is 0 in the 50th column of row 2. When I apply torch.unique along
dim=1, I get:
a,c = torch.unique(tmp,dim=1,return_counts=True)
a
tensor([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]])
It can be seen that the second row of the output has two 0s and the first row has two 16s. Am I doing something wrong here, or is this suspicious?
It is because you specified dim=1. PyTorch is thus checking for unique columns, i.e. unique pairs (tmp[0,i], tmp[1,i]), which it does correctly. (0, 0), (1, 1) and (16, 0) are among the unique pairs it generated. In the output, the pair (a[0,i], a[1,i]) is distinct for every i.
If you want all the elements to be unique, just drop the dim argument: torch.unique(tmp).
If you need to maintain the two-list structure, the output cannot be stored as a single tensor because the sizes might not match. You can do something like output1 = torch.unique(tmp[0]) and output2 = torch.unique(tmp[1]).
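The three variants mentioned above can be compared on a small tensor. This is a minimal sketch; the five-column tensor is a made-up stand-in for the original tmp.

```python
import torch

# small stand-in for tmp: the first two columns are identical
tmp = torch.tensor([[0, 0, 1, 16, 16],
                    [0, 0, 0, 0, 16]])

# dim=1 treats each column as one item, so whole columns are deduplicated
cols, counts = torch.unique(tmp, dim=1, return_counts=True)
print(cols)
print(counts)

# without dim, the tensor is flattened and individual values are deduplicated
flat = torch.unique(tmp)
print(flat)

# per-row unique values; the lengths differ, so they stay separate tensors
output1 = torch.unique(tmp[0])
output2 = torch.unique(tmp[1])
print(output1, output2)
```

Here the column (16, 0) and the column (16, 16) both survive with dim=1, which is why a value can repeat within a single output row.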
Merge an element of a sublist into another sublist without duplicates
I am working on the vehicle routing problem. I have an initial best solution that I want to exploit further, and I need to get rid of duplicate nodes while merging sublists.
My problem: I want to merge the single-element sublist [8] into one of the sublists that has exactly 3 elements. Because two sublists have 3 elements, I get both [18, 22, 34, 8] and [35, 36, 37, 8], but I need element 8 added to only one of them, chosen randomly.
bestsolution= [[22, 15, 20, 2, 32, 30, 4, 17], [27, 8, 9, 14, 33, 21, 5, 13], [26, 28, 6, 31, 11], [18,22,34],[35,36,37],[8]]
for a in bestsolution:
    if len(a)==1:
        p=a
        del bestsolution[-1]
        for b in bestsolution:
            if len(b)==2:
                b.extend(p)
                print("p",b)
                print("bestsolution1-2",bestsolution)
            elif len(b)==3:
                b.extend(p)
                print("p",b)
                print("bestsolution1-3",bestsolution)
my results:
p [18, 22, 34, 8]
bestsolution1-3 [[22, 15, 20, 2, 32, 30, 4, 17], [27, 8, 9, 14, 33, 21, 5, 13], [26, 28, 6, 31, 11], [18, 22, 34, 8], [35, 36, 37]]
p [35, 36, 37, 8]
bestsolution1-3 [[22, 15, 20, 2, 32, 30, 4, 17], [27, 8, 9, 14, 33, 21, 5, 13], [26, 28, 6, 31, 11], [18, 22, 34, 8], [35, 36, 37, 8]]
On each run of the program I get both results at once, so how can I reject the second one?
'''
Desired output:
bestsolution1-3 [[22, 15, 20, 2, 32, 30, 4, 17], [27, 8, 9, 14, 33, 21, 5, 13], [26, 28, 6, 31, 11], [18, 22, 34], [35, 36, 37, 8]]
'''
Thank you
Ans/
The code will be:
'''
bestsolution= [[22, 15, 20, 2, 32, 30, 4, 17], [27, 8, 9, 14, 33, 21, 5, 13], [26, 28, 6, 31, 11], [18,22,34],[35,36,37],[8]]
for a in bestsolution:
    if len(a)==1:
        p=a
        del bestsolution[-1]
        for b in bestsolution:
            if len(b)==2:
                b.extend(p)
                print("p",b)
                print("bestsolution1-2",bestsolution)
                break
            elif len(b)==3:
                b.extend(p)
                print("p",b)
                print("bestsolution1-3",bestsolution)
                break
            elif len(b)==4:
                b.extend(p)
                print("p",b)
                print("bestsolution1-4",bestsolution)
                break
'''
Output expected:
p [18, 22, 34, 8]
bestsolution1-3 [[22, 15, 20, 2, 32, 30, 4, 17], [27, 8, 9, 14, 33, 21, 5, 13], [26, 28, 6, 31, 11], [18, 22, 34, 8], [35, 36, 37]]
'''
After many attempts I got the final result. I did not expect that adding the break keyword would solve it: break stops the inner loop after the first match, so the second sublist is never extended and we get only one solution.
If anyone has a comment, it will be a pleasure.
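Note that break always picks the first matching sublist, while the question asked for a random one. A minimal sketch of the random variant, assuming the standard random module is acceptable; the helper name merge_singletons and the target_len parameter are made up for illustration:

```python
import random


def merge_singletons(solution, target_len=3):
    """Append each single-element sublist to one randomly chosen sublist of target_len."""
    singles = [route for route in solution if len(route) == 1]
    # work on copies so the input solution is not modified
    routes = [list(route) for route in solution if len(route) != 1]
    for single in singles:
        candidates = [route for route in routes if len(route) == target_len]
        if not candidates:
            routes.append(list(single))  # nothing to merge into, keep it as is
            continue
        # exactly one candidate is extended, picked uniformly at random
        random.choice(candidates).extend(single)
    return routes


bestsolution = [[22, 15, 20, 2, 32, 30, 4, 17], [27, 8, 9, 14, 33, 21, 5, 13],
                [26, 28, 6, 31, 11], [18, 22, 34], [35, 36, 37], [8]]
merged = merge_singletons(bestsolution)
print(merged)
```

Each run extends either [18, 22, 34] or [35, 36, 37] with 8, never both.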
I have this nested list:
a = [[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 34, 35, 36],
[1, 2, 5, 9, 15, 20, 24, 26, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 34, 35, 36],
[1, 2, 4, 8, 14, 18, 23, 36],
[1, 2, 5, 9, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 22, 25, 28, 31, 32, 33, 34, 35, 36],
[1, 3, 7, 12, 17, 36],
[1, 2, 4, 8, 14, 19, 23, 36],
[1, 2, 5, 10, 15, 20, 24, 26, 30, 36],
[1, 3, 6, 11, 16, 22, 25, 29, 31, 32, 33, 34, 35, 36],
[1, 2, 5, 10, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 33, 34, 35,36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 35, 36]]
I need to find the longest sublist in the nested list, then compare it with the items of the nested list. If an item equals that sublist, the item should be removed from the nested list, and at the end the nested list should be printed without it.
I hope I understand your question correctly.
You want input to be:
a = [[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 34, 35, 36],
[1, 2, 5, 9, 15, 20, 24, 26, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 34, 35, 36],
[1, 2, 4, 8, 14, 18, 23, 36],
[1, 2, 5, 9, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 22, 25, 28, 31, 32, 33, 34, 35, 36],
[1, 3, 7, 12, 17, 36],
[1, 2, 4, 8, 14, 19, 23, 36],
[1, 2, 5, 10, 15, 20, 24, 26, 30, 36],
[1, 3, 6, 11, 16, 22, 25, 29, 31, 32, 33, 34, 35, 36],
[1, 2, 5, 10, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 33, 34, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 35, 36]]
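We are removing
[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 34, 35, 36]
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 34, 35, 36]
[1, 3, 6, 11, 16, 22, 25, 28, 31, 32, 33, 34, 35, 36]
and
[1, 3, 6, 11, 16, 22, 25, 29, 31, 32, 33, 34, 35, 36]
since they all share the maximum length (14).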
The output should be:
a = [[1, 2, 5, 9, 15, 20, 24, 26, 30, 36],
[1, 2, 4, 8, 14, 18, 23, 36],
[1, 2, 5, 9, 15, 20, 24, 27, 30, 36],
[1, 3, 7, 12, 17, 36],
[1, 2, 4, 8, 14, 19, 23, 36],
[1, 2, 5, 10, 15, 20, 24, 26, 30, 36],
[1, 2, 5, 10, 15, 20, 24, 27, 30, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 28, 31, 33, 34, 35, 36],
[1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 35, 36]]
with the previous lists removed.
Your question was not worded clearly, but I hope this is what you wanted. Here is the code:
# assume a is not empty
d = {} # list of the max length -> number of occurrences in 2d array
# find the length of the longest list
maxLen = len(a[0])
for l in a:
    if len(l) > maxLen:
        maxLen = len(l)
# add lists of the same max length and their count to the dictionary
for l in a:
    if len(l) == maxLen:
        #convert list to string because python does not support list being key of a dictionary
        l_string = str(l)
        if l_string in d:
            d[l_string] += 1
        else:
            d[l_string] = 1
# remove
for l_string in d:
    while d[l_string] > 0:
        # convert string back to list and remove
        a.remove(eval(l_string))
        d[l_string] -= 1
# test result if you want
for row in a:
    print(row)
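If the goal really is just to drop every sublist of maximum length, the same result can be sketched without the string round-trip through str() and eval(). The helper name remove_longest is made up for illustration, and the sample uses only five of the rows from the question:

```python
def remove_longest(rows):
    """Return a new list without the sublists that have the maximum length."""
    # assume rows is not empty, as in the code above
    max_len = max(len(row) for row in rows)
    return [row for row in rows if len(row) != max_len]


a = [[1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 34, 35, 36],
     [1, 2, 5, 9, 15, 20, 24, 26, 30, 36],
     [1, 3, 6, 11, 16, 21, 25, 29, 31, 32, 33, 34, 35, 36],
     [1, 3, 7, 12, 17, 36],
     [1, 3, 6, 11, 16, 21, 25, 28, 31, 32, 33, 35, 36]]
result = remove_longest(a)
for row in result:
    print(row)
```

This builds a new list instead of calling a.remove() repeatedly, so the original a stays intact.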
I'm trying to build my own speech recognition network. I understood how to pre-process audio. But I can't figure out the pre-processing of the text.
I have an alphabet:
alphabet = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12, 'm': 13, 'n': 14,'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26}
And I encode each letter of the sentence into a number (27 is a space):
array([list([27, 23, 8, 5, 14, 27, 8, 5, 27, 19, 16, 5, 1, 11, 19, 27, 9, 14, 27, 15, 21, 18, 27, 12, 1, 14, 7, 21, 1, 7, 5, 27, 9, 27, 3, 1, 14, 27, 9, 14, 20, 5, 18, 16, 18, 5, 20, 27, 23, 8, 1, 20, 27, 8, 5, 27, 8, 1, 19, 27, 19, 1, 9, 4, 27]),
list([27, 19, 15, 27, 14, 15, 23, 27, 9, 27, 6, 5, 1, 18, 27, 14, 15, 20, 8, 9, 14, 7, 27, 2, 5, 3, 1, 21, 19, 5, 27, 9, 20, 27, 23, 1, 19, 27, 20, 8, 15, 19, 5, 27, 15, 13, 5, 14, 19, 27, 20, 8, 1, 20, 27, 2, 18, 15, 21, 7, 8, 20, 27, 25, 15, 21, 27, 20, 15, 27, 13, 5, 27]),
list([27, 14, 9, 7, 8, 20, 27, 6, 5, 12, 12, 27, 1, 14, 4, 27, 1, 14, 27, 1, 19, 19, 15, 18, 20, 13, 5, 14, 20, 27, 15, 6, 27, 6, 9, 7, 8, 20, 9, 14, 7, 27, 13, 5, 14, 27, 1, 14, 4, 27, 13, 5, 18, 3, 8, 1, 14, 20, 19, 27, 5, 14, 20, 5, 18, 5, 4, 27, 1, 14, 4, 27, 5, 24, 9, 20, 5, 4, 27, 20, 8, 5, 27, 20, 5, 14, 20, 27]),
list([27, 9, 27, 8, 5, 1, 18, 4, 27, 1, 27, 6, 1, 9, 14, 20, 27, 13, 15, 22, 5, 13, 5, 14, 20, 27, 21, 14, 4, 5, 18, 27, 13, 25, 27, 6, 5, 5, 20, 27]),
list([27, 25, 15, 21, 27, 3, 1, 13, 5, 27, 19, 15, 27, 20, 8, 1, 20, 27, 25, 15, 21, 27, 3, 15, 21, 12, 4, 27, 12, 5, 1, 18, 14, 27, 1, 2, 15, 21, 20, 27, 25, 15, 21, 18, 27, 4, 18, 5, 1, 13, 19, 27, 19, 1, 9, 4, 27, 20, 8, 5, 27, 15, 12, 4, 27, 23, 15, 13, 1, 14, 27])],
dtype=object)
Here are 5 sentences.
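For reference, the encoding described above can be sketched as follows. The function names encode and decode are made up for illustration, and skipping characters outside the alphabet is an assumption, since the question does not say how punctuation is handled.

```python
import string

# letters map to 1..26 as in the alphabet above; 27 stands for the space
alphabet = {letter: i for i, letter in enumerate(string.ascii_lowercase, start=1)}
alphabet[' '] = 27


def encode(sentence):
    """Turn a sentence into a list of integer codes (unknown characters are skipped)."""
    return [alphabet[ch] for ch in sentence.lower() if ch in alphabet]


def decode(codes):
    """Inverse of encode: turn integer codes back into text."""
    inverse = {code: ch for ch, code in alphabet.items()}
    return ''.join(inverse[code] for code in codes)


codes = encode(' when he ')
print(codes)          # [27, 23, 8, 5, 14, 27, 8, 5, 27]
print(decode(codes))
```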
I create just one network layer and try to pass this data to it in order to get a number corresponding to each letter.
model = Sequential()
model.add(Dense(27, input_shape=(20,), activation='softmax'))
model.compile(loss='mean_squared_error',optimizer='Adam', metrics=['accuracy'])
for X, y in batch(X_train, y_train, 5):
    model.train_on_batch(X, y)
batch() just splits X_train and y_train into batches;
5 is the batch size.
But when I try to train the network I get an error:
Error when checking target: expected dense_25 to have shape (27,) but got array with shape (1,)
UPD:
I'm using MFCC for X
audio, sr = librosa.load(pathTrain+"\\"+str(file), mono=True, sr=None)
fileMFCC = librosa.feature.mfcc(audio)
mean_scale = np.mean(fileMFCC, axis=0)
std_scale = np.std(fileMFCC, axis=0)
fileMFCC = (fileMFCC - mean_scale[np.newaxis, :]) / std_scale[np.newaxis, :]
X is
[array([[-4.35889894, -4.35889894, -4.35455134, ..., -3.95851777,
-3.99308173, -4.05261022],
[ 0.22941573, 0.22941573, 0.31913073, ..., 1.87189324,
1.7987301 , 1.66804349],
[ 0.22941573, 0.22941573, 0.31165866, ..., -0.27962786,
-0.19009062, -0.13788484],
...,
[ 0.22941573, 0.22941573, 0.18657944, ..., 0.14699792,
0.12751924, 0.16724807],
[ 0.22941573, 0.22941573, 0.18478513, ..., 0.00674492,
-0.04570105, 0.01231168],
[ 0.22941573, 0.22941573, 0.18232521, ..., 0.2571599 ,
0.22477036, 0.09153304]])
etc.
import numpy as np
arr = np.array(range(60)).reshape(6,10)
arr
> array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
> [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
> [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
> [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
> [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
> [50, 51, 52, 53, 54, 55, 56, 57, 58, 59]])
What I need:
select_random_windows(arr, number_of_windows=3, window_size=3)
> array([[[ 1, 2, 3],
> [11, 12, 13],
> [21, 22, 23]],
>
> [[37, 38, 39],
> [47, 48, 49],
> [57, 58, 59]],
>
> [[31, 32, 33],
> [41, 42, 43],
> [51, 52, 53]]])
In this hypothetical case I'm selecting 3 windows of 3x3 within the main array (arr).
My actual array is a raster, and I basically need a bunch (in the thousands) of little 3x3 windows.
Any help or even a hint will be much appreciated.
I haven't found any practical solution yet, after many hours of searching.
THX!
We can leverage scikit-image's view_as_windows, which is built on np.lib.stride_tricks.as_strided, to get sliding windows. More info on use of as_strided based view_as_windows.
import numpy as np
from skimage.util.shape import view_as_windows
def select_random_windows(arr, number_of_windows, window_size):
    # Get sliding windows
    w = view_as_windows(arr,window_size)
    # Store shape info
    m,n = w.shape[:2]
    # Get random row, col indices for indexing into windows array
    lidx = np.random.choice(m*n,number_of_windows,replace=False)
    r,c = np.unravel_index(lidx,(m,n))
    # If duplicate windows are allowed, use replace=True or np.random.randint
    # Finally index into windows and return output
    return w[r,c]
Sample run -
In [209]: arr
Out[209]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59]])
In [210]: np.random.seed(0)
In [211]: select_random_windows(arr, number_of_windows=3, window_size=(2,4))
Out[211]:
array([[[41, 42, 43, 44],
[51, 52, 53, 54]],
[[26, 27, 28, 29],
[36, 37, 38, 39]],
[[22, 23, 24, 25],
[32, 33, 34, 35]]])
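If scikit-image is not available, a similar result can be sketched with NumPy alone, assuming NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view exists. The function name select_random_windows_np and the rng parameter are made up for illustration. Unlike view_as_windows, the window shape is passed as a tuple here, since a bare integer is only accepted for 1-D input.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def select_random_windows_np(arr, number_of_windows, window_size, rng=None):
    """Pick distinct random windows of shape window_size from a 2-D array."""
    rng = np.random.default_rng() if rng is None else rng
    # all sliding windows as a read-only view: shape (m, n, h, w) for a 2-D input
    w = sliding_window_view(arr, window_size)
    m, n = w.shape[:2]
    # draw distinct flat positions, then convert them to (row, col) indices
    lidx = rng.choice(m * n, size=number_of_windows, replace=False)
    r, c = np.unravel_index(lidx, (m, n))
    # advanced indexing copies the selected windows out of the view
    return w[r, c]


arr = np.arange(60).reshape(6, 10)
windows = select_random_windows_np(arr, 3, (3, 3), rng=np.random.default_rng(0))
print(windows.shape)  # (3, 3, 3)
print(windows)
```

Because the windows are only materialized at the final indexing step, asking for thousands of windows from a large raster should not require building every window in memory first.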
You can try numpy.random.choice(). It takes a 1-D array (or an int n, which is treated as np.arange(n)) and returns a single element or an ndarray of elements sampled from it. You can also pass the size of the output array you want.