I am not familiar with the unhashable error that I am receiving here. I have the following dataframe 'dfd' that I am isolating role descriptions on. From there, I split each word within the role descriptions out and consolidate the entire list together into a single list. From this list, I try and compare this to a list of stop words that will filter out the clutter.
This code fails at the if statement:
if w not in stop_words:
TypeError: unhashable type: 'list'
Could someone explain what the issue is? I feel like this should be straightforward.
dfd = dfd['Role Description']
mylist = []
for role in dfd:
    tokenized_word = word_tokenize(role)
    mylist.append(tokenized_word)
stop_words = set(stopwords.words("english"))
map(str, mylist)
print(mylist)
filtered_sent = []
for w in mylist:
    if w not in stop_words:
        filtered_sent.append(w)
print("Filtered Sentence:", filtered_sent)
Question solved. The list was not flattened, so I was passing a list of lists through the loop instead of each individual word.
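A minimal sketch of the flattening fix, assuming whitespace splitting in place of word_tokenize and a small hand-written stop word set (both are stand-ins so the example runs without NLTK; the sample role descriptions are made up):

```python
# Hypothetical stand-ins: the real code reads dfd['Role Description'] and uses
# nltk's word_tokenize and stopwords.words("english").
roles = ["Manage the sales team", "Lead the design of new products"]
stop_words = {"the", "of", "a", "an"}

mylist = []
for role in roles:
    tokens = role.lower().split()
    # extend() adds each token individually; append() would nest the whole list
    mylist.extend(tokens)

# Every element of mylist is now a string, so the set lookup works
filtered_sent = [w for w in mylist if w not in stop_words]
print("Filtered Sentence:", filtered_sent)
```

With append(), each element of mylist would be a list, and a list cannot be hashed for the set membership test, which is what raises the TypeError.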
I am trying to find the missing dictionary element by comparing two lists of dictionaries:
list1=[{"amount":"1000",'dept':'marketing',"desig":"mgr",'id':'101'},
{"amount":"1331",'dept':None,"desig":"accnt",'id':'102'},{"amount":"1351",'dept':'mark',"desig":"sales",'id':'103'}]
list2=[{"amount":"1500",'dept':None,"desig":"mgr2",'id':'101'},
{"amount":"1451",'dept':'IT',"desig":"accnt",'id':'102'}]
difference=[item for item in list1 if item["id"] not in list2]
But it's not giving the expected output for the missing item.
expected output:
missed={"amount":"1351",'dept':'mark',"desig":"sales",'id':'103'}
Your edited question requires matching against the 'id' key, not the entire dictionary object as before. Since the 'id' value is a hashable string, it is simplest to collect the ids in a set, which makes it efficient to test whether something is part of the collection. (Edit: Credit to @KellyBundy for suggesting a set comprehension.)
id_set = {item['id'] for item in list2}
difference = [item for item in list1 if item['id'] not in id_set]
difference here gives a list of all missing dictionaries. So to access all of them, you need to iterate over it.
for missed in difference:
    print(f'{missed=}')
and the output will be (it will print one line per dictionary)
missed={'amount': '1351', 'dept': 'mark', 'desig': 'sales', 'id': '103'}
If you know there will only be one missing item, or you only need the first element, you can use difference[0] to directly access the dictionary.
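To see why the original comprehension returned every item, here is a small sketch with shortened, made-up dictionaries: item['id'] is a string, while list2 contains dictionaries, so the membership test never matches.

```python
list1 = [{"id": "101"}, {"id": "102"}, {"id": "103"}]
list2 = [{"id": "101"}, {"id": "102"}]

# A string is never equal to a dict, so this is False even though
# a dictionary with id '101' is present in list2.
print("101" in list2)

# The original test therefore keeps every item of list1.
original = [item for item in list1 if item["id"] not in list2]

# Comparing ids against ids gives the intended result.
id_set = {item["id"] for item in list2}
difference = [item for item in list1 if item["id"] not in id_set]
print(difference)
```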
While running the following code snippet, I get the following error
'list' object has no attribute 'split'
for i in range(len(questions1)):
    # Question strings need to be separated into words
    # Each question needs a unique label
    questions_labeled.append(TaggedDocument(questions1[i].split(), df[df.index == i].qid1))
    questions_labeled.append(LabeledSentence(questions2[i].split(), df[df.index == i].qid2))
    if i % 10000 == 0:
        progress = i/len(questions1) * 100
        print("{}% complete".format(round(progress, 2)))
Because lists have no split() method; only string objects do.
The questions1 and questions2 objects seem to hold lists of strings (e.g., questions1 = [['this is a sample text', 'this is another one'], ['this is some other text'], ...]), and not just strings (e.g., questions1 = ['this is a sample text', 'this is another one', ...]). Hence the error (i.e., 'list' object has no attribute 'split'): you are trying to split a list instead of a string. One way to solve this is to create a flat list out of each list of lists before iterating over them. For example:
questions1 = [item for sublist in questions1 for item in sublist]
questions2 = [item for sublist in questions2 for item in sublist]
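As a quick check of the flattening approach, here is a sketch with made-up sample data (the nested shape of questions1 is an assumption, as noted above):

```python
# Hypothetical data with the assumed list-of-lists shape
questions1 = [["this is a sample text", "this is another one"],
              ["this is some other text"]]

# Flatten one level: every element is now a string
questions1 = [item for sublist in questions1 for item in sublist]

# split() now works, because each question is a str rather than a list
words = [q.split() for q in questions1]
print(words)
```

Note that flattening changes the number of entries, so the index i may no longer line up with the rows of df; whether that matters depends on how the data was built.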
How to check whether any of the values in a dictionary are of list type
I have a list whose items I need to check against the keys of the dictionary 'test_dict': if a key exists, I want to know whether any of the corresponding values are of 'list' type.
col_list = ['pat_cd','dsply_nm', 'opt_cd','dsply_val']
test_dict={'pat_cd':'123','opt_cd':['232','245'],'test':['123','1232']}
result=type(any(test_dict[i]) for i in col_list if i in test_dict) is list
print(result)
Output:
False
The output is 'False'. Ideally it should return 'True', since the value of 'opt_cd' is of list type.
Can anyone help to resolve this?
Thanks
This should work:
any(isinstance(test_dict.get(k, None), list) for k in col_list)
The issue with your code is how you've used the built-in functions type and any. Their meaning in Python is not the same as in English. any takes an iterable and returns True if at least one element is truthy, so any(test_dict[i]) returns True for each of the values here. Your code therefore builds a generator that yields True, and a generator is not a list, so the comparison always returns False.
Also, it is recommended to use isinstance(something, class_name) instead of type(something) is class_name.
Although @alex's answer works, since you're new to Python I would recommend a plain for loop instead. With the break, it stops at the first match, just as any() does, so it stays efficient even if your dictionary or list is large.
result = False
for k in col_list:
    if isinstance(test_dict.get(k, None), list):
        result = True
        break
This is almost equivalent to the one-liner, but more readable for a beginner:
result = any(isinstance(test_dict.get(k, None), list) for k in col_list)
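A small sketch of what the original expression evaluates to, using the question's data: the argument passed to type() is a generator expression, so the comparison with list can never be True. The name keys_with_lists is introduced here only for illustration.

```python
col_list = ['pat_cd', 'dsply_nm', 'opt_cd', 'dsply_val']
test_dict = {'pat_cd': '123', 'opt_cd': ['232', '245'], 'test': ['123', '1232']}

# The original expression wraps a generator expression in type()
gen = (any(test_dict[i]) for i in col_list if i in test_dict)
print(type(gen).__name__)        # generator
original = type(gen) is list     # always False

# Checking each value with isinstance gives the intended answer,
# and also tells you which keys hold lists
keys_with_lists = [k for k in col_list if isinstance(test_dict.get(k), list)]
result = bool(keys_with_lists)
print(result, keys_with_lists)
```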
I'm struggling with some NESTED LISTS.
Briefly, inside the list of lists I have some lists containing several values:
big_list = [[['strings', '632'], ['otherstrings', 'hey']],[['blabla', '924'], ['histring', 'hello']]]
From this nested list, I'd like to remove the sublist in which the string 'hello' appears.
I tried this:
for sub_line in big_list:
    if 'hello' in sub_line:
        big_list.remove(sub_line)
Now, if I print the new big_list outside the loop, I get the old list since I didn't assign the updated list to a new list. But if I assign to a new list like:
for sub_line in big_list:
    if 'hello' in sub_line:
        updated_list = big_list.remove(sub_line)
print(updated_list)
I get an AttributeError: 'NoneType' object has no attribute 'remove'.
So what's the problem with this?
I CANNOT use list indexing because my real list is huge and the target value is not always in the same place.
I've already checked other questions, but nothing is working.
Thank you all!
If you consistently have two levels of nesting (not what I would label DEEP), you could combine this answer from the dupe marking by @pault with list flattening:
biglist = [[['strings', '632'], ['otherstrings', 'hey']],[['blabla', '924'], ['histring', 'hello']]]
token = 'hello'
smalllist = [x for x in biglist if token not in [j for i in x for j in i]]
# smalllist
# Out[17]: [[['strings', '632'], ['otherstrings', 'hey']]]
The following works for me. You need to remove sub_line (not line) from the list.
big_list = [['strings', '632', 'otherstrings', 'hey'],['blabla', '924', 'histring', 'hello']]
print(big_list)
for sub_line in big_list:
    if 'hello' in sub_line:
        big_list.remove(sub_line)
print(big_list)
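One caveat, offered as a sketch rather than a correction of the answer above: removing items from a list while iterating over it can skip the element that follows each removal. Building a new list avoids this. The sample data below is made up so that two matching sublists sit next to each other.

```python
# Hypothetical data: two adjacent sublists contain 'hello'
big_list = [['a', 'hello'], ['b', 'hello'], ['c']]

# Removing during iteration shifts the remaining items left, so the loop
# never visits the second matching sublist.
mutated = [list(sub) for sub in big_list]
for sub_line in mutated:
    if 'hello' in sub_line:
        mutated.remove(sub_line)
print(mutated)

# Building a new list checks every sublist exactly once.
updated_list = [sub for sub in big_list if 'hello' not in sub]
print(updated_list)
```

The comprehension also gives you the new list directly, which is what the assignment to updated_list in the question was trying to achieve.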
for sublist in biglist:
    if 'hello' in sublist:
        updated_list = biglist.remove(sublist)
print(updated_list)
The output of the above code is None because remove() modifies the list in place and does not return a value, i.e., it returns None. So you should not assign the return value of remove() to a variable.
Doing so will cause problems whenever you later use updated_list.
I am a complete beginner learning Python to tackle some biology problems, and I came across lists and their various methods. Basically, when I print my variable, I get None in return.
Example, trying to print a sorted list assigned to a variable
list1=[1,3,4,2]
sorted=list1.sort()
print(sorted)
I receive None in return. Shouldn't this give me [1,2,3,4]?
However, when printing the original list variable (list1), it gives me the sorted list fine.
Because the sort() method sorts the list in place and always returns None. What you should do is:
list1=[1,3,4,2]
list1.sort()
print(list1)
Or
list1=[1,3,4,2]
list2 = sorted(list1)
print(list2)
You can sort lists in two ways. list.sort() sorts the list in place and returns None, while new_list = sorted(list) returns a new sorted list and leaves the original list unmodified.
So, you can do this:
list1=[1,3,4,2]
sorted_list=sorted(list1)
print(sorted_list)
Or you can do this:
list1=[1,3,4,2]
list1.sort()
print(list1)
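The two behaviours can be compared side by side in a short sketch. It also shows why naming a variable sorted is best avoided: the name then hides the built-in function. The function names demo and shadowing_demo are made up for this example.

```python
def demo():
    list1 = [1, 3, 4, 2]
    new_list = sorted(list1)   # returns a new list; list1 is unchanged here
    returned = list1.sort()    # sorts list1 in place and returns None
    return new_list, returned, list1

new_list, returned, list1 = demo()
print(new_list, returned, list1)

def shadowing_demo():
    sorted = [1, 2, 3]         # this local name now hides the built-in sorted()
    try:
        sorted([3, 1, 2])      # calling a list fails
    except TypeError as exc:
        return str(exc)

print(shadowing_demo())
```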