Removing the whitespaces from a particular column in a dataframe [duplicate] - python-3.x

This question already has answers here:
Pandas - Strip white space
(6 answers)
Closed 3 years ago.
I have the following dataframe and I'd like to remove all the whitespace characters and make it lowercase:
df = pd.DataFrame({"col1":[1,2,3,4], "col2":["A","B ", "Cc","D"]})
I tried to do that via df[["col2"]].apply(lambda x: x.strip().lower()) but it raises an error:
AttributeError: ("'Series' object has no attribute 'strip'", 'occurred at index col2')

You need two chained calls on the .str accessor, with the result assigned back to the column:
df["col2"] = df["col2"].str.strip().str.lower()
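The original attempt fails because df[["col2"]] is a DataFrame, so apply hands the whole column to the lambda as a Series, and a Series has no strip method of its own. Note also that strip only trims leading and trailing whitespace. The sketch below shows both cases; the second Series (no_spaces) is a made-up example, not data from the question.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 4], "col2": ["A", "B ", "Cc", "D"]})

# .str methods return a new Series, so assign the result back to the column.
df["col2"] = df["col2"].str.strip().str.lower()

# Hypothetical variant: remove every whitespace character, including
# whitespace inside the string, with a regex replacement.
no_spaces = (
    pd.Series([" A b ", "C\tc"])
    .str.replace(r"\s+", "", regex=True)
    .str.lower()
)
```

If the values may contain inner spaces that should survive, the strip version is the safer choice.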


Write a function that returns the count of the unique answers to all of the questions in a dataset [duplicate]

This question already has answers here:
How to filter Pandas dataframe using 'in' and 'not in' like in SQL
(11 answers)
How to test if a string contains one of the substrings in a list, in pandas?
(4 answers)
Closed 1 year ago.
For example, after filtering the entire dataset to only questions containing the word "King", we could then find all of the unique answers to those questions.
I filtered by using the following code:
`def lower1(x):
    x.lower()
filter_dataset = lambda x:all(x) in jeopardy.Question.apply(lower1)
print(filter_dataset(['King','England']))`
The above code is printing True instead of printing the rows of jeopardy['Question'] with the keywords 'King' and 'England'.
That is the first problem.
Now I want to count the unique answers to the jeopardy['Question']
Here is the sample data frame
Now I want to create a function that does the count of the unique answers.
I wrote the following code:
`def unique_counts():
    print(jeopardy['Answer'].unique().value_counts())
unique_counts()`
Which is giving me the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'value_counts'
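For the filtering problem, use Series.str.contains with the keywords joined by | (this keeps the rows whose question contains either keyword):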
jeopardy[jeopardy['Question'].str.contains('|'.join(['King','England']))]
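For the second problem, the error arises because Series.unique() returns a NumPy array, which has no value_counts method. Calling value_counts() directly on the Series gives the count per distinct answer, and nunique() gives the number of distinct answers. The following is a minimal sketch; the small jeopardy frame and the helper name unique_answer_counts are hypothetical stand-ins, and case=False is an assumption about the desired matching.

```python
import re

import pandas as pd

# Hypothetical stand-in for the real jeopardy dataset.
jeopardy = pd.DataFrame({
    "Question": [
        "Which King signed Magna Carta?",
        "Capital of England?",
        "Largest planet?",
        "A king of England in 1066",
    ],
    "Answer": ["John", "London", "Jupiter", "Harold"],
})


def unique_answer_counts(df, keywords):
    """Count each distinct answer among questions matching any keyword."""
    # Escape keywords so regex metacharacters are matched literally.
    pattern = "|".join(re.escape(k) for k in keywords)
    matches = df[df["Question"].str.contains(pattern, case=False, regex=True)]
    # value_counts is a Series method, so call it before any .unique().
    return matches["Answer"].value_counts()


counts = unique_answer_counts(jeopardy, ["King", "England"])
n_unique = jeopardy["Answer"].nunique()  # number of distinct answers overall
```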

Add multiple value to a key in Python dictionary [duplicate]

This question already has answers here:
list to dictionary conversion with multiple values per key?
(7 answers)
Closed 2 years ago.
I am trying to add multiple values to a single key (if found) from a file in Python. I tried the code below, but I am getting this error: AttributeError: 'str' object has no attribute 'append'
file=open("Allwords",'r')
list=sorted(list(set([words.strip() for words in file])))
def sequence(word):
    return "".join(sorted(word))
dict= {}
for word in list:
    if sequence(word) in dict:
        dict[sequence(word)].append(word)
    else:
        dict[sequence(word)]=word
Thanks in advance for your help!
Insert the first element wrapped in a list, so that you can append subsequent items to it later on. You can do it as follows:
file=open("Allwords",'r')
list=sorted(list(set([words.strip() for words in file])))
def sequence(word):
    return "".join(sorted(word))
dict= {}
for word in list:
    if sequence(word) in dict:
        dict[sequence(word)].append(word)
    else:
        new_lst = [word] # Inserting the first element as a list, so we can later append to it
        dict[sequence(word)]=new_lst
Now you will be able to append to it properly. In your code, the first value stored under each key was a plain string, and strings have no append method. Storing a one-element list instead means every later word can be appended to it.
Hope this helps!
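An alternative worth considering is collections.defaultdict, which creates the empty list automatically the first time a key is seen and so removes the if/else branch. This is a sketch under assumptions: the helper name group_anagrams and the sample words are made up for illustration, and a plain list stands in for the Allwords file.

```python
from collections import defaultdict


def sequence(word):
    """Return the letters of word in sorted order, used as the grouping key."""
    return "".join(sorted(word))


def group_anagrams(words):
    """Map each sorted-letter key to the list of words sharing those letters."""
    groups = defaultdict(list)  # missing keys start out as an empty list
    for word in sorted(set(words)):
        groups[sequence(word)].append(word)
    return dict(groups)


# Hypothetical sample input in place of the lines read from the file.
groups = group_anagrams(["listen", "silent", "enlist", "google", "listen"])
```

Using names such as groups and words also avoids shadowing the built-in list and dict.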

Getting AttributeError: 'NoneType' object has no attribute 'append' when using list in the loop, which is declared outside loop [duplicate]

This question already has answers here:
Why do these list operations (methods: clear / extend / reverse / append / sort / remove) return None, rather than the resulting list?
(6 answers)
Closed 3 years ago.
When I run the code below, I get the following error:
AttributeError: 'NoneType' object has no attribute 'append'
Please help me out. I am new to Python.
x = list(map(int,raw_input("Enter a value: ").split()))
x1=[]
for i in range(0,len(x)):
    x1=x1.append(x[i])
print(x1)
list.append(x) modifies the list itself and returns None, so
x1=x1.append(x[i])
assigns None to x1 and in the next iteration you encounter the error. Just write
x1.append(x[i])
instead.
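The behaviour can be checked directly with a short sketch. The list x below is a hypothetical fixed input standing in for the values read with raw_input.

```python
x = [3, 1, 2]  # hypothetical input in place of the raw_input call

x1 = []
result = x1.append(x[0])  # append mutates x1 in place and returns None
# At this point result is None, while x1 holds the appended value.

# If the goal is simply a copy of x, no loop is needed.
x2 = list(x)
```

Assigning result back to x1 is exactly what replaces the list with None in the question's loop.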

How to delete an element by index from a numpy array in Python 3? [duplicate]

This question already has answers here:
How to remove specific elements in a numpy array
(13 answers)
Closed 3 years ago.
I want to delete an element from a numpy array by index.
I tried the following commands:
arr = np.linspace(-5,5,10)
del arr[0]
This throws an error saying that array elements cannot be deleted.
Using pop doesn't work either. What should I do?
Use np.delete, which returns a new array without the element at the given index:
arr = np.linspace(-5,5,10)
arr = np.delete(arr, 0)
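NumPy arrays have a fixed size, so np.delete never modifies its input; it builds a copy. The sketch below shows that, along with deleting several indices at once and the slicing and boolean-mask alternatives. The variable names are illustrative only.

```python
import numpy as np

arr = np.linspace(-5, 5, 10)

# np.delete returns a new array; arr itself keeps all 10 elements.
shorter = np.delete(arr, 0)

# A list of indices removes several elements in one call.
fewer = np.delete(arr, [0, 2])

# Dropping the first element can also be done with a slice (a view, not a copy).
sliced = arr[1:]

# A boolean mask is another option when the positions come from a condition.
mask = np.ones(arr.size, dtype=bool)
mask[0] = False
masked = arr[mask]
```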

Getting the match value of re.search python [duplicate]

This question already has answers here:
Python extract pattern matches
(10 answers)
Closed 3 years ago.
I'm pulling some data from the web using Python in a Jupyter notebook. I have downloaded the data, parsed it, and created the data frame. I need to extract a number from a string that I have in the data frame. I am using this regex to do it:
for note in df["person_notes"]:
print(re.search(r'\d+', note))
and the outcome is the following:
<_sre.SRE_Match object; span=(53, 55), match='89'>
How can I get just the matched number? For this line it would be 89. I tried converting the whole line with str() and then using replace(), but the span=(number, number) part is not the same on every line. Thank you in advance!
You can use the start() and end() methods on the returned match objects to get the correct positions within the string:
for note in df["person_notes"]:
    match = re.search(r'\d+', note)
    if match:
        print(note[match.start():match.end()])
    else:
        pass  # no match found ...
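The match object can also return the matched text itself through match.group(), which avoids slicing by start and end. A minimal sketch follows; the notes list is a hypothetical stand-in for the df["person_notes"] column.

```python
import re

# Hypothetical notes in place of the df["person_notes"] column.
notes = ["patient is 89 years old", "no digits here"]

numbers = []
for note in notes:
    match = re.search(r"\d+", note)
    # group() with no argument returns the whole matched substring;
    # re.search returns None when nothing matches, so guard against that.
    numbers.append(match.group() if match else None)
```

The extracted values are strings, so wrap them in int() if numeric values are needed.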
