'list' object has no attribute 'split' in an NLP question - python-3.x

While running the following code snippet, I get the following error
'list' object has no attribute 'split'
for i in range(len(questions1)):
    # Question strings need to be separated into words
    # Each question needs a unique label
    questions_labeled.append(TaggedDocument(questions1[i].split(), df[df.index == i].qid1))
    questions_labeled.append(LabeledSentence(questions2[i].split(), df[df.index == i].qid2))
    if i % 10000 == 0:
        progress = i/len(questions1) * 100
        print("{}% complete".format(round(progress, 2)))

Because a list has no split() method; only string objects do.

The questions1 and questions2 objects seem to hold lists of strings (e.g., questions1 = [['this is a sample text', 'this is another one'], ['this is some other text'], ...]) rather than plain strings (e.g., questions1 = ['this is a sample text', 'this is another one', ...]). Hence the error ('list' object has no attribute 'split'): you are calling split() on a list instead of a string. One way to solve this is to flatten each list of lists before iterating over it. For example:
questions1 = [item for sublist in questions1 for item in sublist]
questions2 = [item for sublist in questions2 for item in sublist]
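A quick way to confirm this diagnosis is to check the type of one element before calling split(). The sketch below uses made-up sample data (the asker's real questions1 is not shown) and itertools.chain.from_iterable, which should be equivalent to the nested comprehension above:

```python
from itertools import chain

# Hypothetical sample data standing in for the asker's questions1.
questions1 = [["this is a sample text", "this is another one"],
              ["this is some other text"]]

# Each element is a list, which is why .split() fails on it.
print(type(questions1[0]))

# Flatten one level so that every element is a plain string.
flat_questions1 = list(chain.from_iterable(questions1))

# Now .split() works on every element.
tokens = [q.split() for q in flat_questions1]
print(tokens[0])
```

Note that flattening changes the length of the list, so the loop index i may no longer line up with df.index; whether that matters depends on how the data was built.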

Related

How to append values to set() to the end in python?

I have item = set() and need to append values to the end of it. However, set() seems to put each new value in the first position, pushing the values already present toward the end, as shown below.
item = {'Mango'}
item.add('Apple')
#Returns
{'Apple','Mango'}
#Expected output
{'Mango','Apple'}
I even tried item.update(['Apple']), but that doesn't work either.
It looks like you want a data structure that has an ordered sequence. You can do this with lists instead of a set, and use .append to add things to the end of the list:
item = ['Mango', 'Apple']
item.append('Pear')
#Output
['Mango', 'Apple', 'Pear']
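If you also need the uniqueness that a set gives you, one possible alternative (not part of the original answer) is to use the keys of a dict, which keep insertion order in Python 3.7 and later. A minimal sketch:

```python
# A dict preserves insertion order (guaranteed since Python 3.7),
# so its keys behave like an ordered collection of unique items.
items = dict.fromkeys(["Mango"])
items["Apple"] = None   # "append" a new value at the end
items["Mango"] = None   # duplicate: its position is unchanged

ordered = list(items)
print(ordered)  # ['Mango', 'Apple']
```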

Unhashable When Comparing Lists

I am not familiar with the unhashable error that I am receiving here. I have the following dataframe 'dfd' that I am isolating role descriptions on. From there, I split each word within the role descriptions out and consolidate the entire list together into a single list. From this list, I try and compare this to a list of stop words that will filter out the clutter.
This code fails at the if statement:
if w not in stop_words:
TypeError: unhashable type: 'list'
Could someone explain what the issue is? I feel like this should be straightforward.
dfd = dfd['Role Description']
mylist = []
for role in dfd:
    tokenized_word = word_tokenize(role)
    mylist.append(tokenized_word)
stop_words = set(stopwords.words("english"))
map(str, mylist)
print(mylist)
filtered_sent = []
for w in mylist:
    if w not in stop_words:
        filtered_sent.append(w)
print("Filtered Sentence:", filtered_sent)
Question solved. The list was not flattened, so I was checking whole lists (which are unhashable) against the set instead of checking each individual word.
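For anyone hitting the same error: a membership test against a set hashes the item being looked up, and lists cannot be hashed. A minimal sketch of the flattened version follows; the token lists and the stop-word set are small stand-ins for the output of word_tokenize and stopwords.words("english") in the question:

```python
# Stand-ins for the asker's data: in the question, the token lists come from
# nltk's word_tokenize and the stop words from stopwords.words("english").
tokenized_roles = [["manage", "the", "team"], ["write", "and", "review", "code"]]
stop_words = {"the", "and", "a", "of"}

# Flatten first so that each w is a string (hashable), not a list.
all_words = [word for tokens in tokenized_roles for word in tokens]

filtered_sent = [w for w in all_words if w not in stop_words]
print("Filtered Sentence:", filtered_sent)
```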

Remove sublist from DEEP nestedlist

I'm struggling with some NESTED LISTS.
Briefly, inside the list of lists I have some lists containing several values:
biglist = [[['strings', '632'], ['otherstrings', 'hey']],[['blabla', '924'], ['histring', 'hello']]]
From this nested list, I'd like to remove the sublist in which the string 'hello' appears.
I tried this:
for sub_line in big_list:
    if 'hello' in sub_line:
        big_list.remove(sub_line)
Now, if I print the new big_list outside the loop, I get the old list since I didn't assign the updated list to a new list. But if I assign to a new list like:
for sub_line in big_list:
    if 'hello' in sub_line:
        updated_list = big_list.remove(sub_line)
print(updated_list)
I get an AttributeError: 'NoneType' object has no attribute 'remove'.
So what's the problem with this?
I CANNOT use list indexing because my real list is huge and the target value is not always in the same place.
I've already checked other questions, but nothing is working.
Thank you all!
If you consistently have two levels of nesting (not what I would call DEEP), you could combine the answer from the duplicate flagged by @pault with list flattening:
biglist = [[['strings', '632'], ['otherstrings', 'hey']],[['blabla', '924'], ['histring', 'hello']]]
token = 'hello'
smalllist = [x for x in biglist if token not in [j for i in x for j in i]]
# smalllist
# Out[17]: [[['strings', '632'], ['otherstrings', 'hey']]]
The following works for me. You need to remove sub_line (not line) from the list.
big_list = [['strings', '632', 'otherstrings', 'hey'],['blabla', '924', 'histring', 'hello']]
print(big_list)
for sub_line in big_list:
    if 'hello' in sub_line:
        big_list.remove(sub_line)
print(big_list)
for sublist in biglist:
    if 'hello' in sublist:
        updated_list = biglist.remove(sublist)
print(updated_list)
The code above prints None because remove() modifies the list in place and does not return a value, i.e., it returns None. So you should not assign the return value of remove() to a variable.
I think that would cause problems whenever you later use updated_list.
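To illustrate the point about remove(), here is a small sketch that reuses the flat big_list from the answer above. It also shows one common alternative, building a new list with a comprehension, which sidesteps removing items from a list while iterating over it:

```python
big_list = [['strings', '632', 'otherstrings', 'hey'],
            ['blabla', '924', 'histring', 'hello']]

# list.remove() mutates the list in place and returns None.
scratch = list(big_list)  # shallow copy so big_list stays untouched
result = scratch.remove(scratch[1])
print(result)  # None

# Building a new list gives you a value you can actually assign.
updated_list = [sub_line for sub_line in big_list if 'hello' not in sub_line]
print(updated_list)
```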

How can I make my dictionary be able to be indexed by a function in python 3.x

I am trying to make a program that finds out how many integers in a list are not the integer that is represented the most in that list. To do that I have a command which creates a dictionary with every value in the list and the number of times it is represented in it. Next I try to create a new list with all items from the older list except the most represented value so I can count the length of the list. The problem is that I cannot access the most represented value in the dictionary as I get an error code.
import operator
import collections
a = [7, 155, 12, 155]
dictionary = collections.Counter(a).items()
b = []
for i in a:
    if a != dictionary[max(iter(dictionary), key=operator.itemgetter(1))[0]]:
        b.append(a)
I get this error code: TypeError: 'dict_items' object does not support indexing
The variable you called dictionary is not a dict but a dict_items.
>>> type(dictionary)
<class 'dict_items'>
>>> help(dict.items)
items(...)
D.items() -> a set-like object providing a view on D's items
and sets are iterable, not indexable:
for di in dictionary: print(di) # is ok
dictionary[0] # triggers the error you saw
Note that Counter has a rich API; Counter.most_common might do the trick.
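A minimal sketch of the most_common approach, using the list from the question (the variable names other than a and b are only illustrative):

```python
from collections import Counter

a = [7, 155, 12, 155]
counts = Counter(a)

# most_common(1) returns a list holding one (value, count) pair.
most_common_value, occurrences = counts.most_common(1)[0]

# Keep everything except the most frequent value.
b = [x for x in a if x != most_common_value]
print(most_common_value, len(b))  # 155 2
```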

Python add a single string to a list

I have the following problem:
I have a list of items in which the first word represents the type of something, e.g.:
Wall DXU76542
Table Uxitr
Wall rT4
Mobile Tr2
.
.
.
I would like to create a second list analogous to this one: for every row, extract the characters at the beginning, up to the first space character " ", and add them to the corresponding row of the second list. That way the second list contains only the types of the items. Here is part of the code in Python (the list "elements" is a flat list):
for Element in elements:
    list1.Add(Element.Name)
list2=[]
list2.append([])
for i in xrange(0,len(list1)):
    for j in xrange(0,len(list1[i])):
        l=0
        while not (list1[i][l]==" "):
            list2[i][j].append(list1[i][l])
            l = l+1
Output=list2
Does anybody have any idea why I get the following error?
AttributeError: 'str' object has no attribute 'append'
Thank you.
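The message itself says that .append was called on a string, and strings have no append method. If the goal is only to get the first word of each row, a character-by-character loop may not be needed. The sketch below assumes every entry is a string whose type is everything before the first space; the sample names are hypothetical, modelled on the rows shown above:

```python
# Hypothetical sample names, modelled on the rows shown in the question.
list1 = ["Wall DXU76542", "Table Uxitr", "Wall rT4", "Mobile Tr2"]

# split(" ", 1) cuts at the first space only; [0] is the part before it.
list2 = [name.split(" ", 1)[0] for name in list1]
print(list2)  # ['Wall', 'Table', 'Wall', 'Mobile']
```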
