I want to sort a list of tuples like the one below in descending order by the numbers:
data = [('ralph picked', ['nose', '4', 'apple', '30', 'winner', '3']),
('aaron popped', ['soda', '1', 'popcorn', '6', 'pill', '4',
'question', '29'])]
I would like to sort the nested list so that the outcome would look somewhat like:
data2 = [('ralph picked', ['apple', '30', 'nose', '4', 'winner', '3']),
('aaron popped', ['question', '29', 'popcorn', '6', 'pill', '4', 'soda', '1'])]
I am trying to use this code for this:
data2=[]
for k, v in data:
    data2 = ((k, sorted(zip(data[::2], data[1::2]), key=lambda x: int(x[1]), reverse=True) ))
[value for pair in data2 for value in pair]
print(data2)
But I keep getting the error message:
TypeError: int() argument must be a string or a number, not 'tuple'
I tried changing the int in key=lambda x: int(x[1]) to different things, but I kept getting the same message. I am very new to Python, and the syntax often trips me up. Any ideas on how to solve this? Thank you very much!
Rather than trying to do everything at once, let's give things names:
data = [('ralph picked', ['nose', '4', 'apple', '30', 'winner', '3']),
('aaron popped', ['soda', '1', 'popcorn', '6', 'pill', '4', 'question', '29'])]
data2 = []
for k, v in data:
    new_list = sorted(zip(v[::2], v[1::2]), key=lambda x: int(x[1]), reverse=True)
    flattened = [value for pair in new_list for value in pair]
    new_tuple = (k, flattened)
    data2.append(new_tuple)
produces
>>> print(data2)
[('ralph picked', ['apple', '30', 'nose', '4', 'winner', '3']),
('aaron popped', ['question', '29', 'popcorn', '6', 'pill', '4', 'soda', '1'])]
You need to distinguish between data and v -- you only want to sort v, and you need to store the result of the list comprehension, otherwise you're just building it and throwing it away.
When you're having trouble with the syntax, break everything apart into its pieces and print them to see what's going on. For example, you could decompose new_list into
words = v[::2]
numbers = v[1::2]
pairs = zip(words, numbers)
sorted_pairs = sorted(pairs, key=lambda x: int(x[1]), reverse=True)
and sorted_pairs is really what new_list is.
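The named steps above can also be collected into a small helper. This is only a sketch: the function name sort_pairs_desc is made up for illustration, and it assumes every list alternates word, number, word, number with numbers that int() can parse.

```python
def sort_pairs_desc(flat):
    """Sort a flat [word, number, word, number, ...] list by number, descending."""
    words = flat[::2]       # every element at an even index
    numbers = flat[1::2]    # every element at an odd index
    pairs = zip(words, numbers)
    sorted_pairs = sorted(pairs, key=lambda pair: int(pair[1]), reverse=True)
    # Flatten the sorted (word, number) tuples back into one list.
    return [value for pair in sorted_pairs for value in pair]


data = [('ralph picked', ['nose', '4', 'apple', '30', 'winner', '3']),
        ('aaron popped', ['soda', '1', 'popcorn', '6', 'pill', '4',
                          'question', '29'])]

# Only v is sorted; k is carried over unchanged.
data2 = [(k, sort_pairs_desc(v)) for k, v in data]
print(data2)
```

Printing words, numbers and sorted_pairs inside the helper is an easy way to inspect each intermediate step.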
I have been going over and over this code but I can't understand why I am getting a Key not found 1 error.
The code below may not be enough, but basically, I am comparing grades in a list excel_data_choice with those pulled from a spreadsheet video_grades. It should iterate through the list to determine that grade exists in the video_grades list and if not, then add it.
However, I keep getting the error. Here is my output log. I am sure I am doing something wrong, but just can't see it.
Output
Video grades are: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']
K is 0 and v is K
K is 1 and v is 1
Key not found 1
Code
excel_data_choice = ['K', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
try:
    for k,v in enumerate(excel_data_choice):
        print(f'K is {k} and v is {v}')
        if row[v] == v and str(row[v]) not in video_grades:
            print(f'row v = {row[v]}')
            video_grades.append(str(v))
            print(f'Video grades is now {video_grades}')
        elif row[v] == 'delete' and str(v) in video_grades:
            video_grades.remove(str(v))
    acf_dict['video_grades'] = video_grades
except KeyError as ke:
    print(f'Key not found {ke}')
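One possible cause, offered only as a guess because the question does not show how row is built: the log prints "Key not found 1" without quotes, so the failing lookup used the integer 1, while spreadsheet rows often have string keys such as '1'. The sketch below uses a hypothetical row dict to show the mismatch and one way to normalise the key before the lookup.

```python
# Hypothetical row: this dict is an assumption, not taken from the question.
# Its keys are all strings, as is common when a row is read from a spreadsheet.
row = {'K': 'K', '1': 1, '2': 'delete'}
excel_data_choice = ['K', 1, 2]

print(1 in row)     # the integer key is missing
print('1' in row)   # the string key is present

found = {}
for v in excel_data_choice:
    key = str(v)                 # normalise so that 1 and '1' name the same column
    found[key] = row.get(key)    # .get() returns None instead of raising KeyError
print(found)
```

If row really does use integer keys, the cause lies elsewhere, and printing row.keys() before the loop would show which keys are available.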
I have inherited this piece of code
dummy_data1 = {
    'id': ['1', '2', '3', '4', '5'],
    'Feature1': ['A', 'C', 'E', 'G', 'I'],
    'Feature2': ['Mouse', 'dog', 'house and parrot', '23', np.NaN],
    'dates': ['12/12/2020','12/12/2020','12/12/2020','12/12/2020','12/12/2020']}
df1 = pd.DataFrame(dummy_data1, columns = ['id', 'Feature1', 'Feature2', 'dates'])
df1 = df1.assign(
    Feature2=lambda df: df.Feature2.where(
        ~df.Feature2.str.isnumeric(),
        pd.to_numeric(df.Feature2, errors="coerce").astype("Int64"),
    )
)
print(df1)
I know that the problem is caused by the np.NaN value. What does the code do? My understanding is that it converts a string to an integer if the string is numeric. Also, how can I overcome this issue?
You can use pd.to_numeric() and then fill the NaNs with the original values:
df1['Feature2'] = pd.to_numeric(df1['Feature2'], errors="coerce").fillna(df1['Feature2'])
OR
keep the where() approach, but use fillna() inside the condition ~df1.Feature2.str.isnumeric() so that it contains no NaNs:
df1['Feature2'] = df1['Feature2'].where(
    ~df1.Feature2.str.isnumeric().fillna(True),
    pd.to_numeric(df1.Feature2, errors="coerce").astype("Int64")
)
l1 = [['1', 'apple', '1', '2', '1', '0', '0', '0'],
      ['1', 'cherry', '1', '1', '1', '0', '0', '0']]
l2 = [['1', 'cherry', '2', '1'],
['1', 'plums', '2', '15'],
['1', 'orange', '2', '15'],
['1', 'cherry', '2', '1'],
['1', 'cherry', '2', '1']]
output = []
for i in l1:
    for j in l2:
        if i[1] != j[1]:
            output.append(j)
            break
print(output)
Expected Output:
[['1', 'plums', '2', '15'], ['1', 'orange', '2', '15']]
How to stop iteration and find unique elements and get the sublist?
To find the elements in l2 that are not in l1, based on the fruit name:
l1 = [[1,'apple',3],[1,'cherry',4]]
l2 = [[1,'apple',3],[1,'plums',4],[1,'orange',3],[1,'apple',4]]
output = []
for e in l2:
    if e[1] not in [f[1] for f in l1]:  # search by matching fruit
        output.append(e)
print(output)
Output
[[1, 'plums', 4], [1, 'orange', 3]]
You can store all the unique fruit names from l1 in a new list, then check whether each item's name in l2 exists in that new list. Something like:
newlist = []
for item in l1:
    if item[1] not in newlist:
        newlist.append(item[1])  # store the name, not the whole sublist
output = []
for item in l2:
    if item[1] not in newlist:
        output.append(item)
print(output)
This is slightly inefficient but really straightforward to understand.
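If the lists grow large, the repeated list searches can be replaced with a set lookup. The following is a minimal sketch using the lists from the question; the name names_in_l1 is introduced here for illustration.

```python
l1 = [['1', 'apple', '1', '2', '1', '0', '0', '0'],
      ['1', 'cherry', '1', '1', '1', '0', '0', '0']]
l2 = [['1', 'cherry', '2', '1'],
      ['1', 'plums', '2', '15'],
      ['1', 'orange', '2', '15'],
      ['1', 'cherry', '2', '1'],
      ['1', 'cherry', '2', '1']]

# Collect the fruit names from l1 once; membership tests on a set are fast.
names_in_l1 = {item[1] for item in l1}

# Keep every sublist of l2 whose fruit name does not occur in l1.
output = [item for item in l2 if item[1] not in names_in_l1]
print(output)
```

No break is needed: each sublist of l2 is tested exactly once against all names in l1.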
I have following two lists:
list1 = ['17-Q2', '1.00', '17-Q3', '2.00', '17-Q4', '5.00', '18-Q1', '6.00']
list2 = ['17-Q2', '1', '17-Q3', '2', '17-Q4', '5', '18-Q1', '6']
I want a dictionary in the following way. Can I do that in Python?
result = [17-Q2: 1(1.00), 17-Q3: 2(2.00), 17-Q4: 5(5.00), 18-Q1: 6(6.00)]
Here's one approach to this:
result = {}
list1=['17-Q2', '1.00', '17-Q3', '2.00', '17-Q4', '5.00', '18-Q1', '6.00']
list2=['17-Q2', '1', '17-Q3', '2', '17-Q4', '5', '18-Q1', '6']
for i in range(0, len(list1)-1, 2):
    result[list1[i]] = list2[i + 1] + '(' + list1[i+1] + ')'
You can zip the two lists and then zip the resulting iterable with itself so that you can iterate through it in pairs to construct the desired dict:
i = zip(list1, list2)
result = {k: '%s(%s)' % (v2, v1) for (k, _), (v1, v2) in zip(i, i)}
result becomes:
{'17-Q2': '1(1.00)', '17-Q3': '2(2.00)', '17-Q4': '5(5.00)', '18-Q1': '6(6.00)'}
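The zip(i, i) step can look puzzling, so here is a short sketch of what it does. It assumes Python 3, where zip returns a one-shot iterator; because both arguments are the same iterator, each step of the outer zip consumes two consecutive items.

```python
list1 = ['17-Q2', '1.00', '17-Q3', '2.00', '17-Q4', '5.00', '18-Q1', '6.00']
list2 = ['17-Q2', '1', '17-Q3', '2', '17-Q4', '5', '18-Q1', '6']

i = zip(list1, list2)
# Each outer item holds two consecutive inner tuples:
# first the pair of keys, then the pair of values.
pairs = list(zip(i, i))
print(pairs[0])

# Unpack each item as (key, ignored duplicate key), (value from list1, value from list2).
result = {k: '%s(%s)' % (v2, v1) for (k, _), (v1, v2) in pairs}
print(result)
```

With a list instead of an iterator (for example, zip in Python 2), zip(i, i) would pair each element with itself, so the idiom depends on the iterator being shared.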
You can use zip and dict.
keys = ["a", "b", "c", "d"]
values = ["A", "B", "C"]
print(dict(zip(keys, values)))
# prints: {'a': 'A', 'b': 'B', 'c': 'C'}
This works because dict accepts any iterable of (key, value) tuples. The zip function builds such tuples from the lists:
zip returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables.
Notice that the resulting dictionary only has as many entries as the shorter of the two lists (keys or values); any unpaired elements are dropped.
If you wish to have a default value for unpaired elements, you can use itertools.zip_longest:
from itertools import zip_longest
keys = ["a", "b", "c", "d"]
values = ["A", "B", "C"]
print(dict(zip_longest(keys, values)))
# prints: {'a': 'A', 'b': 'B', 'c': 'C', 'd': None}
You can also use the zip_longest keyword parameter fillvalue to pair unmatched elements with something other than None.
On the other hand, if you have more values than keys, this doesn't make much sense: every unpaired value would be stored under the same default key (namely fillvalue), so each one would overwrite the previous one.
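Both points can be seen in a short sketch. The placeholder "?" is an arbitrary choice for illustration.

```python
from itertools import zip_longest

keys = ["a", "b", "c", "d"]
values = ["A", "B", "C"]

# More keys than values: the unpaired key gets the fillvalue.
with_default = dict(zip_longest(keys, values, fillvalue="?"))
print(with_default)   # {'a': 'A', 'b': 'B', 'c': 'C', 'd': '?'}

# More values than keys: every unpaired value lands under the same key (None),
# so only the last one survives.
more_values = dict(zip_longest(["a"], ["A", "B", "C"]))
print(more_values)    # {'a': 'A', None: 'C'}
```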
Assuming your input to be as follows:
list1 = ['17-Q2', '1.00', '17-Q3', '2.00', '17-Q4', '5.00', '18-Q1', '6.00']
list2 = ['17-Q2', '1', '17-Q3', '2', '17-Q4', '5', '18-Q1', '6']
and the desired output to be as follows:
result = {17-Q2: 1(1.00), 17-Q3: 2(2.00), 17-Q4: 5(5.00), 18-Q1: 6(6.00)}
The following code, with a single while loop, could help:
from collections import OrderedDict
final_dict = dict()
i = 0  # Initializing the counter
while i < len(list1):
    # Updating the final dict
    final_dict.update({list1[i]: str(list2[i+1]) + "(" + str(list1[i+1]) + ")"})
    i += 2  # Incrementing by two in order to land on the next key
# If you want a sorted dictionary
sorted_dict = OrderedDict(sorted(final_dict.items()))
print(sorted_dict)
I'm trying to build a prediction model using GaussianNB.
I have a csv file that looks like this:
(image: csv data)
My code is as follows:
encoded_df = pd.read_csv('path to file')
y = encoded_df.iloc[:,12]
X = encoded_df.iloc[:,0:12]
model = GaussianNB()
model.fit(X, y)
prediction_test_naive = ['427750', '426259', '2', '1610', '2', '1', '2', '1', '4', '1', '47', '2']
naive_predicted_class = model.predict(np.reshape(prediction_test_naive, [1, -1]))
print("predicted Casualty Severity: 1 = slight, 2 = serious, 3 = fatal: ", naive_predicted_class)
expected_bayes = y
predicted_bayes = model.predict(X)
classification_report_bayes = metrics.classification_report(expected_bayes, predicted_bayes)
print(classification_report_bayes)
When I run it, I get this TypeError:
TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('U32') dtype('U32') dtype('U32')
The error appears to be coming from line 7 in the example code above, but other than that I do not know.
I'm not really sure how to fix this. I have a decision tree that works, but I would like to use Bayes' theorem too.
The error is due to this line:
prediction_test_naive = ['427750', '426259', '2', '1610', '2', '1', '2', '1', '4', '1', '47', '2']
Here you are declaring a list of strings (by putting single quotes around the values), which is then used for prediction. But the model only accepts numerical values, so you need to convert them to numbers.
You can do this in either of the following ways:
1) Declare prediction_test_naive as numbers, like this (notice that the quotes have been removed):
prediction_test_naive = [427750, 426259, 2, 1610, 2, 1, 2, 1, 4, 1, 47, 2]
2) Convert prediction_test_naive to numbers using numpy.
After this line:
prediction_test_naive = ['427750', '426259', '2', '1610', '2', '1', '2', '1', '4', '1', '47', '2']
Do this:
prediction_test_naive = np.array(prediction_test_naive, dtype=float)
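The same conversion can be sketched in plain Python, without numpy. This is an alternative for illustration rather than part of the answer above; the names as_numbers and sample are made up here. scikit-learn's predict accepts a 2D array-like, so wrapping the converted values in an outer list gives one sample with twelve features.

```python
prediction_test_naive = ['427750', '426259', '2', '1610', '2', '1',
                         '2', '1', '4', '1', '47', '2']

# Convert each string to a float; float() raises ValueError on non-numeric text.
as_numbers = [float(value) for value in prediction_test_naive]

# One row (sample) containing twelve feature values.
sample = [as_numbers]
print(sample)
```

Passing sample to model.predict would then replace the np.reshape call from the question.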