Passing string variable value in Pandas dataframe - string

I have been trying to use variables to pass string values into a dataframe for various column operations, but the code is giving me wrong results. See the code below, which I am running in a Jupyter Notebook:
first_key = input("key 1: ")
second_key = input("key 2: ")
third_key = input("key 3: ")
These receive the values "Russia", "China", "Trump" for the operation in next cell as below:
tweets['{first_key}'] = tweets['text'].str.contains(r"^(?=.*\b{first_key}\b).*$", case=False) == True
tweets['{second_key}'] = tweets['text'].str.contains(r"^(?=.*\b'{second_key}'\b).*$", case=False) == True
tweets['{third_key}'] = tweets['text'].str.contains(r"^(?=.*\b'{third_key}'\b).*$", case=False) == True
But the results are wrong. Any idea how to get the correct results? A small snapshot of the results is like this.

I've tried cleaning up your code. You can leverage f-strings (using python-3.6+) with a tiny change to your code:
def contains(series, key):
    return series.str.contains(rf"^(?=.*\b{key}\b).*$", case=False)
If you're working with an older version of python, use str.format:
def contains(series, key):
    return series.str.contains(r"^(?=.*\b{}\b).*$".format(key), case=False)
Next, call this function inside a loop:
for key in (first_key, second_key, third_key):
    tweets[key] = contains(tweets['text'], key)
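To see why the original attempt misbehaved, here is a minimal sketch using only the standard re module (no dataframe). It compares the pattern without the f prefix against the interpolated one. The sample text and the use of re.escape are my own additions, not part of the answer above; re.escape is only there in case a key contains regex metacharacters.

```python
import re

key = "Russia"

# Without the f prefix the braces are literal text, so the pattern
# looks for the characters "{key}" and never mentions "Russia".
plain = r"^(?=.*\b{key}\b).*$"

# With rf"..." the variable is interpolated into the raw string.
# re.escape is an extra safeguard (assumption: keys may contain
# characters such as "." or "+").
interpolated = rf"^(?=.*\b{re.escape(key)}\b).*$"

text = "Talks between russia and China resumed"  # made-up sample text
plain_hit = re.search(plain, text, flags=re.IGNORECASE) is not None
interp_hit = re.search(interpolated, text, flags=re.IGNORECASE) is not None
print(plain_hit, interp_hit)
```

The same applies to the column name: tweets['{first_key}'] creates a column literally named {first_key}, whereas tweets[first_key] uses the value of the variable.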

Related

python function to pass the boolean operators and predict the survival mean with titanic dataset

I am trying to predict the survival chances of the passengers in the testing set. I was able to compute the values without using a function.
The dataset is loaded as follows:
df = pd.read_csv('https://raw.githubusercontent.com/Explore-AI/Public-Data/master/Data/regression_sprint/titanic_train_raw.csv')
df_clean = pd.read_csv('https://raw.githubusercontent.com/Explore-AI/Public-Data/master/Data/regression_sprint/titanic_train_clean_raw.csv')
The expected results are below if the values are passed via the function:
survival_likelihood(df_clean,"Pclass","==","3") == 0.24
survival_likelihood(df_clean,"Age","<","15") == 0.58
I was able to get the output without writing a function, as per the below image.
I have written the following function, but I am unable to get the desired results when the values are passed via the function:
def survival_likelihood(df_clean, column_name, boolean_operator, value):
    column_name = df_clean.columns
    value = df[column_name]
    boolean_operator = [">" or "<" or "=="]
    if column_name in df_clean.columns and df_clean[column_name].dtypes != object :
        s = round(df_clean[df[column_name][boolean_operator][value]].Survived.mean(),
    return s
I have tried the eval() method, which did not help either. Looking forward to a pointer or fix for this. Thanks in advance.
Regards,
Prakash
The one below works and achieves what I want:
def survival_likelihood(df_clean, column_name, boolean_operator, value):
    # value arrives as a string such as "15"; eval() turns it into a number
    if boolean_operator == '<':
        s = df_clean[df_clean[column_name] < eval(value)]['Survived'].mean()
    elif boolean_operator == '>':
        # the '>' branch must compare with '>', not '<'
        s = df_clean[df_clean[column_name] > eval(value)]['Survived'].mean()
    elif boolean_operator == '==':
        s = df_clean[df_clean[column_name] == eval(value)]['Survived'].mean()
    return s
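One way to avoid both the chain of if statements and eval() is to look the operator up in a dictionary built from the standard operator module. The sketch below is only an illustration. It runs on plain Python rows (a list of dicts with made-up values) so that it works without the dataset, and the names OPS and survival_likelihood_rows are hypothetical. With a DataFrame, the same lookup should give a boolean mask via compare(df_clean[column_name], threshold).

```python
import operator

# Map the operator strings from the question to comparison functions.
OPS = {"<": operator.lt, ">": operator.gt, "==": operator.eq}

def survival_likelihood_rows(rows, column_name, boolean_operator, value):
    """Mean of 'Survived' over rows where row[column_name] <op> value holds."""
    compare = OPS[boolean_operator]   # KeyError for an unsupported operator
    threshold = float(value)          # replaces eval(value) for numeric input
    picked = [r["Survived"] for r in rows if compare(r[column_name], threshold)]
    return round(sum(picked) / len(picked), 2) if picked else float("nan")

# Made-up rows, only to exercise the function (not the Titanic data).
rows = [
    {"Pclass": 3, "Age": 10, "Survived": 1},
    {"Pclass": 3, "Age": 40, "Survived": 0},
    {"Pclass": 1, "Age": 12, "Survived": 1},
    {"Pclass": 3, "Age": 30, "Survived": 0},
]
third_class = survival_likelihood_rows(rows, "Pclass", "==", "3")
children = survival_likelihood_rows(rows, "Age", "<", "15")
print(third_class, children)
```

Converting with float() assumes the value is always numeric, which matches the two calls in the question but would need adjusting for string columns.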

When a particular key is not present in a dictionary

From a Dictionary input_dict={'Name': 'Monty', 'Profession': 'Singer' }, get the value of a key Label which is not a part of the dictionary, in such a way that Python doesn't hit an error. If the key does not exist in the dictionary, Python should return NA.
Sample Input:
{'Name': 'Monty', 'Profession': 'Singer' }
Sample Output:
NA
The get() method is really useful here: it returns None (or a default you supply) for a missing key instead of raising an error.
You can use get method of the dictionary. This method never raises a KeyError.
input_dict.get('Label', 'NA')
The syntax of get() is:
dict.get(key, value)
get() Parameters
The get() method takes a maximum of two parameters:
key - key to be searched in the dictionary
value (optional) - value to be returned if the key is not found. The default value is None.
The get() method returns:
the value for the specified key if the key is in the dictionary.
None if the key is not found and value is not specified.
value if the key is not found and value is specified.
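The three return cases listed above can be checked directly on the sample dictionary from the question. This is a small sketch; the variable names are mine.

```python
input_dict = {'Name': 'Monty', 'Profession': 'Singer'}

present = input_dict.get('Name')           # key exists: its value is returned
missing = input_dict.get('Label')          # key absent, no default: None
fallback = input_dict.get('Label', 'NA')   # key absent, default supplied
print(present, missing, fallback)

# get() only reads; it does not insert the missing key.
unchanged = 'Label' not in input_dict
```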
import ast,sys
input_str = sys.stdin.read()
input_dict = ast.literal_eval(input_str)
answer=input_dict.get('Label', 'NA')
print(answer)
The final solution can use get():
import ast, sys
input_str = sys.stdin.read()
input_dict = ast.literal_eval(input_str)
answer = input_dict.get('Label', 'NA')
print(answer)
It works fine.
We use update() to set Label to "NA", so when we look up Label we get the value "NA". Note that this overwrites any existing Label value.
import ast,sys
input_str = sys.stdin.read()
input_dict = ast.literal_eval(input_str)
input_dict.update({'Label':'NA'})
answer=input_dict["Label"]
print(answer)
import ast,sys
input_str = sys.stdin.read()
input_dict = ast.literal_eval(input_str)
input_dict["Label"]="NA"
answer=input_dict["Label"]
# Type your answer here
print(answer)

Using loops to call recursive function

I am trying to create a recursive function that takes three parameters: the dictionary, the name of the original (this will be a key in the dict), and the name of the final (I am trying to determine whether it is possible to reach the final from the original).
My code functions well and enters the correct if statements (tested using print statements). However, instead of returning True or False, the function returns None every time.
I determined that this is because rather than calling my recursive function with "return" I only call the name of the function. However, if I include return in my code, the function only runs with the first value from the dictionary's key.
Any and all help on this would be appreciated.
def evolve(dictname, babyname, evolvedname):
    if babyname == evolvedname:
        return True
    elif babyname in dictname.keys():
        if dictname[babyname]:
            for i in dictname[babyname]:
                evolve(dictname,i,evolvedname)
        else:
            return False
    else:
        return False
Collect the results of all recursive calls, and return True if any of them is True.
Something like:
def evolve(dictname, babyname, evolvedname):
    if babyname == evolvedname:
        return True
    elif babyname in dictname.keys():
        if dictname[babyname]:
            results = [] #To collect results
            for i in dictname[babyname]:
                results.append(evolve(dictname,i,evolvedname))
            #Check if any of them is True
            for res in results:
                if res==True: return True
            return False #No true among childs
        else:
            return False
    else:
        return False
But I think this code can be simplified to just:
def evolve(dictname, babyname, evolvedname):
    if babyname == evolvedname:
        return True
    return any(evolve(dictname,i,evolvedname) for i in dictname.get(babyname,[]))
Lastly, although I don't know what you are trying to do, you might get infinite recursion if the dictionary contains a cycle: this is like doing DFS without marking any node as already explored (black) or currently being explored (gray).
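A minimal sketch of how that marking could look: the extra seen parameter is my addition (not part of the original function) and records every name already visited, so a cycle in the dictionary cannot cause endless recursion. The chains dictionary is made-up test data.

```python
def evolve(dictname, babyname, evolvedname, seen=None):
    # seen holds the names already explored (hypothetical extra parameter)
    seen = set() if seen is None else seen
    if babyname == evolvedname:
        return True
    if babyname in seen:
        return False          # already explored: do not walk the cycle again
    seen.add(babyname)
    return any(evolve(dictname, i, evolvedname, seen)
               for i in dictname.get(babyname, []))

# "a" and "b" point at each other, which would recurse forever without seen.
chains = {"a": ["b"], "b": ["a", "c"], "c": []}
reachable = evolve(chains, "a", "c")
unreachable = evolve(chains, "a", "z")
print(reachable, unreachable)
```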

Unable to copy list to set. 'float' object is not iterable

lst,score_set,final_lst = [],[],[]
if __name__ == '__main__':
    for _ in range(int(input())):
        name = input()
        score = float(input())
        score_set.append(score)
        lst.append(([name,score]))
    new_set = set()
    for i in range(0,len(score_set)):
        item = score_set[i]
        print (item)
        new_set.update(item)
I am trying to copy a list into a set to remove duplicates. If I remove the last line, the code runs fine. Could you please help?
update() expects an iterable and adds each of its elements, which is why passing a float fails. If you want to add a single value, use add() instead of update():
new_set.add(item)
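A short sketch of the difference, using made-up scores instead of input(): add() takes one element, update() takes an iterable, and set() can build the set from the list in one step.

```python
scores = [37.21, 41.0, 37.21, 39.0]   # made-up scores with one duplicate

new_set = set()
for item in scores:
    new_set.add(item)                 # add(): one element at a time

from_update = set()
from_update.update(scores)            # update(): takes a whole iterable

direct = set(scores)                  # shortest form: build the set directly

# Passing a single float to update() reproduces the error from the question.
try:
    set().update(37.21)
    error = None
except TypeError as exc:
    error = str(exc)
print(new_set == from_update == direct, error)
```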

Python3 Function return value

I believe I have something incorrect with my function, but I am still pretty new to Python and functions in general and I am missing what my issue is. In the code below I am sending no values into the function, but once the function reads the database I want it to pull the values from the database back into the main program. I placed a print statement inside the function to make sure it's pulling the values from the database, and everything works properly. I copied and pasted that same print line right after the function call in the main area, but I am getting a NameError: 'wins' is not defined. This leads me to believe I am not returning the values correctly?
#Python3
import pymysql
def read_db():
    db = pymysql.connect(host='**********',user='**********',password='**********',db='**********')
    cursor = db.cursor()
    sql = "SELECT * FROM loot WHERE id = 1"
    cursor.execute(sql)
    results = cursor.fetchall()
    for row in results:
        wins = row[2]
        skips = row[3]
        lumber1 = row[4]
        ore1 = row[5]
        mercury1 = row[6]
    db.commit()
    db.close()
    print ("Wins = ",wins," Skips = ",skips," Lumber = ",lumber1," Ore = ",ore1," Mercury = ",mercury1)
    return (wins, skips, lumber1, ore1, mercury1)
read_db()
print ("Wins = ",wins," Skips = ",skips," Lumber = ",lumber1," Ore = ",ore1," Mercury = ",mercury1)
Calling read_db() on its own discards the return values. You need to assign them to variables, one name per returned value:
wins, skips, lumber1, ore1, mercury1 = read_db()
The fact that you're returning a variable called wins does not mean a wins variable will become available in the scope you call read_db() from. You need to assign return values explicitly.
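The same behaviour can be seen without a database. In this sketch read_values is a hypothetical stand-in for read_db() that returns hard-coded numbers: the function returns a tuple, and the caller chooses the names when unpacking it.

```python
def read_values():
    # These names are local to the function and vanish when it returns.
    wins, skips, lumber1 = 12, 3, 40
    return wins, skips, lumber1

result = read_values()                        # the return value is one tuple
total_wins, total_skips, lumber = result      # caller picks the names
print("Wins =", total_wins, "Skips =", total_skips, "Lumber =", lumber)

# The number of names must match the number of returned values.
try:
    a, b = read_values()
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)
```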
