How can I apply a function that I created to each successive row of a column in python? - python-3.x

I have 4 rows in a column that I want to iterate over, applying a function to pairs of values.
The df is a single column and the values are, for instance:
a
b
c
d
I want to apply the function in this way:
function(a,b)
function(a,c)
function(a,d)
function(b,c)
function(c,d)
How can I do this in python?
I've tried using df.apply(lambda column: compare_score, axis=0)
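The lambda in the attempt above returns the function object compare_score without ever calling it, so no values are compared. A minimal sketch of one way to call a function on pairs of rows uses itertools.combinations. It assumes every unordered pair is wanted (the list above omits (b,d), which may be an oversight), that the column is named col, and that compare_score takes two values; the body given here is only a hypothetical stand-in.

```python
from itertools import combinations

import pandas as pd


def compare_score(x, y):
    # Hypothetical stand-in for the asker's function: just joins the two values.
    return f"{x}-{y}"


# Assumed layout: a single column named "col" holding the four values.
df = pd.DataFrame({"col": ["a", "b", "c", "d"]})

# combinations(..., 2) yields each unordered pair once, in row order.
results = [compare_score(x, y) for x, y in combinations(df["col"], 2)]
print(results)
```

If only some of the pairs are needed, the list of pairs can be filtered before calling the function.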

Related

Sum dictionary values stored in Data frame Columns

I have a data frame with a column that holds lists of dictionaries. I want to sum only the values and store the result in a new column.
Column 1                                   Desired Output
[{'Apple':3}, {'Mango':2}, {'Peach':4}]    9
[{'Blue':2}, {'Black':1}]                  3
df['Desired Output'] = [sum(x) for x in df['Column 1']]
df
Assuming your Column 1 column does indeed have dictionaries (and not strings that look like dictionaries), this should do the trick:
df['Desired Output'] = df["Column 1"].apply(lambda lst: sum(sum(d.values()) for d in lst))
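If the column actually holds strings that look like lists of dictionaries, one possible approach is to parse them first with ast.literal_eval. The sketch below assumes the strings are valid Python literals; sum_dict_values is a hypothetical helper name, and the sample data is rebuilt from the question.

```python
import ast

import pandas as pd

# Sample data from the question, stored as strings (the assumed problem case).
df = pd.DataFrame({
    "Column 1": [
        "[{'Apple': 3}, {'Mango': 2}, {'Peach': 4}]",
        "[{'Blue': 2}, {'Black': 1}]",
    ]
})


def sum_dict_values(cell):
    # Parse the cell if it is a string; otherwise assume it is already a list of dicts.
    lst = ast.literal_eval(cell) if isinstance(cell, str) else cell
    return sum(sum(d.values()) for d in lst)


df["Desired Output"] = df["Column 1"].apply(sum_dict_values)
print(df)
```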

Increase the values in a column values based on values in other column in pandas

I have my source data in the form of csv file as below:
id,col1,col2
123,11|22|33||||||,val1|val3|val2
456,99||77|||88|||||||||6|,val4|val5|val6|val7
I need to add a new column (fnlsrc) whose value depends on col1 and col2: the values of col2 are repeated as whole sets until the result has as many pipe-separated values as col1. For example, if col1 has 9 values and col2 has 3 values, fnlsrc must hold 9 values, i.e. 3 sets of col2 (val1|val3|val2|val1|val3|val2|val1|val3|val2). The expected output is:
id,col1,col2,fnlsrc
123,11|22|33||||||,val1|val3|val2,val1|val3|val2|val1|val3|val2|val1|val3|val2
456,99||77|||88|||||||||6|,val4|val5|val6|val7,val4|val5|val6|val7|val4|val5|val6|val7|val4|val5|val6|val7|val4|val5|val6|val7
I have tried the following code, but it adds only one set:
zipped = zip(df['col1'], df['col2'])
for s, t in zipped:
    count = int((s.count('|') + 1)/(t.count('|') + 1))
    for val in range(count):
        df['fnlsrc'] = t
As the new column is based on the other two, I would use pandas' apply() function. I defined a function that calculates the new column value from the other two columns, and then applied it to each row:
def new_value(x):
    # Find out number of values in both columns
    col1_numbers = x['col1'].count('|') + 1
    col2_numbers = x['col2'].count('|') + 1
    # Calculate how many times col2 should appear in the new column
    repetition = col1_numbers // col2_numbers
    # Create list of strings containing the values of the new column
    values = [x['col2']] * repetition
    # Join the list of strings with pipes
    return '|'.join(values)

# Apply the function on every row
df['fnlsrc'] = df.apply(new_value, axis=1)
df
Output:
id col1 col2 fnlsrc
0 123 11|22|33|||||| val1|val3|val2 val1|val3|val2|val1|val3|val2|val1|val3|val2
1 456 99||77|||88|||||||||6| val4|val5|val6|val7 val4|val5|val6|val7|val4|val5|val6|val7|val4|v...
Full output in your input format:
id,col1,col2,fnlsrc
123,11|22|33||||||,val1|val3|val2,val1|val3|val2|val1|val3|val2|val1|val3|val2
456,99||77|||88|||||||||6|,val4|val5|val6|val7,val4|val5|val6|val7|val4|val5|val6|val7|val4|val5|val6|val7|val4|val5|val6|val7

How to Make A New Column Which Contains A List That Contains Number Iteration Based on 2 Column in Pandas?

I have a dataframe like below
How can I add a new column that contains a list of the numbers from First to Final (inclusive) in pandas?
For example like this
I have solved my problem on my own.
Here is my code:
df['Final'] = df['Final'].fillna(df['First'])
def make_list(first, final):
    # Cast to int because fillna can leave Final as a float column
    return list(range(int(first), int(final) + 1))
df['Towers'] = df.apply(lambda x: make_list(x['First'], x['Final']), axis=1)
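Since the original data frame is not shown, here is a small self-contained sketch of the same idea on made-up data. The values are purely hypothetical; only the column names First, Final and Towers come from the code above. It uses zip instead of apply, which is one alternative among several.

```python
import pandas as pd

# Hypothetical sample data; a missing Final means the range is a single number.
df = pd.DataFrame({"First": [1, 5, 7], "Final": [3, None, 8]})

# Fill missing Final values with First, then cast back to int,
# because the NaN forces the column to float.
df["Final"] = df["Final"].fillna(df["First"]).astype(int)

# Build the inclusive range First..Final for each row.
df["Towers"] = [list(range(a, b + 1)) for a, b in zip(df["First"], df["Final"])]
print(df)
```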

How map() function works in python?

I want to apply the NumPy function average to a pandas DataFrame. Since I want to apply this function to the elements of each row, I tried to map it over the DataFrame. The code is as follows:
df = pd.DataFrame(np.random.rand(5,3),columns = ['Col1','Col2','Col3'])
df_averge_row = df.apply(np.average(weights=[[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5]]),axis=0)
Unfortunately, it is not working. Any Suggestion would be helpful
Since each row has 3 columns and you are applying the function row-wise (not column-wise) per your question, the weights argument can only have 3 elements (one per column in a given row, say [1,2,3]):
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(5,3),columns = ['Col1','Col2','Col3'])
weights = [1,2,3]
df_averge_row = df.apply(lambda x: np.average(x, weights=weights),axis=1)
df_averge_row
out:
0 0.618617
1 0.757778
2 0.551463
3 0.497654
4 0.755083
dtype: float64
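As a possible alternative to apply, np.average accepts an axis argument, so the weighted row averages can be computed in a single call on the underlying array. The sketch below assumes the same three columns and the example weights [1,2,3]; the seeded generator and the name row_avg are only there for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption, not in the original).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((5, 3)), columns=["Col1", "Col2", "Col3"])
weights = [1, 2, 3]

# axis=1 averages across the columns of each row; weights needs one entry per column.
row_avg = pd.Series(np.average(df.to_numpy(), axis=1, weights=weights), index=df.index)

# Same computation with apply, for comparison.
expected = df.apply(lambda row: np.average(row, weights=weights), axis=1)
print(np.allclose(row_avg, expected))
```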

How to select bunch of rows

I have a dataframe with multiple columns. I want to select each run of rows where column B has consecutive 1s and check whether any value of column A in that run equals 0.04. If so, I need to keep that run of rows and extract the start and end values of column A for it.
Here is my dataframe
Here is my desired output:
Label the consecutive groups with .diff().abs().cumsum().bfill(), then filter out the groups that do not meet the specific conditions (x['B'].eq(1).any() and x['A'].eq(0.04).any()).
Then group by the consecutive-group column again and use agg with first and last to extract the first and last values of A in each remaining group.
df['temp'] = df.B.diff().abs().cumsum().bfill()
df.groupby('temp').filter(lambda x: (x['B'].eq(1).any() and x['A'].eq(0.04).any()))\
.groupby('temp').agg({'A':['first','last']})
Out:
          A
      first  last
temp
3.0   344.0  39.9
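Because the original data frame is not shown, the following sketch runs the same steps on a small made-up frame. The values in A and B are hypothetical; only the column names and the method chain come from the answer above.

```python
import pandas as pd

# Hypothetical data: B has two runs of 1s, and only the first run contains A == 0.04.
df = pd.DataFrame({
    "A": [0.1, 0.2, 0.5, 0.04, 0.7, 0.3, 0.9, 0.8],
    "B": [0, 0, 1, 1, 1, 0, 1, 1],
})

# The label increases by 1 each time B changes, so every run of equal B values shares a label.
df["temp"] = df["B"].diff().abs().cumsum().bfill()

# Keep only the runs where B is 1 and at least one A equals 0.04.
kept = df.groupby("temp").filter(
    lambda g: bool(g["B"].eq(1).any() and g["A"].eq(0.04).any())
)

# First and last A value of each remaining run.
result = kept.groupby("temp").agg({"A": ["first", "last"]})
print(result)
```

Here the run with label 1.0 (rows 2 to 4) is the only one kept, so the result holds its first and last A values.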
