How to make a new column containing a list of numbers iterated between two columns in pandas?

I have a dataframe like the one below.
How can I add a new column that contains the list of numbers from First to Final (inclusive) in pandas?
For example, like this:

I have solved my problem on my own.
Here is my code:
df['Final'] = df['Final'].fillna(df['First'])
def make_list(first, final):
    # Inclusive range; int() guards against floats left over from NaN handling
    return list(range(int(first), int(final) + 1))
df['Towers'] = df.apply(lambda x: make_list(x['First'], x['Final']), axis=1)
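The approach above can be tried end to end with a minimal sketch. The sample values here are assumptions, since the original frame is not shown. A column that contains missing values is stored as float, so the sketch casts back to int before building the ranges.

```python
import pandas as pd

# Hypothetical sample data; the frame from the question is not shown.
df = pd.DataFrame({"First": [1, 5, 7], "Final": [3, None, 8]})

# A missing Final means a single-number range, so fall back to First.
# The cast is needed because NaN forces the column to float.
df["Final"] = df["Final"].fillna(df["First"]).astype(int)

# Build the inclusive list First..Final for every row.
df["Towers"] = [list(range(a, b + 1)) for a, b in zip(df["First"], df["Final"])]
print(df)
```

Zipping the two columns avoids a row-wise apply, which tends to be slower on large frames; the result is the same list per row.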

Related

Sum dictionary values stored in Data frame Columns

I have a data frame having dictionary like structure. I want to only sum the values and store into new column.
Column 1                                    Desired Output
[{'Apple':3}, {'Mango':2}, {'Peach':4}]     9
[{'Blue':2}, {'Black':1}]                   3
df['Desired Output'] = [sum(x) for x in df['Column 1']]
df
Assuming your Column 1 column does indeed have dictionaries (and not strings that look like dictionaries), this should do the trick:
df['Desired Output'] = df["Column 1"].apply(lambda lst: sum(sum(d.values()) for d in lst))
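A minimal sketch of the same idea on data shaped like the table in the question. The helper name sum_dict_values is hypothetical, and the ast.literal_eval branch is an assumption about how to handle cells that hold text rather than real lists of dictionaries.

```python
import ast
import pandas as pd

# Hypothetical data mirroring the question; the second cell is stored as text.
df = pd.DataFrame({
    "Column 1": [
        [{"Apple": 3}, {"Mango": 2}, {"Peach": 4}],
        "[{'Blue': 2}, {'Black': 1}]",
    ]
})

def sum_dict_values(cell):
    # Parse the cell first if it is a string that looks like a list of dicts.
    dicts = ast.literal_eval(cell) if isinstance(cell, str) else cell
    # Sum the values of every dictionary in the list.
    return sum(sum(d.values()) for d in dicts)

df["Desired Output"] = df["Column 1"].apply(sum_dict_values)
print(df["Desired Output"].tolist())  # prints [9, 3]
```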

How to get number of columns in a DataFrame row that are above threshold

I have a simple pandas DataFrame (Python 3.8) with 8 columns (simply labeled 0, 1, 2, etc.) and approx. 3500 rows. I want the subset of this DataFrame in which at least 2 columns in each row are above 1. I would prefer not to check each column individually, but to check all columns at once. I know I can use .any(1) to check all the columns, but I need at least 2 columns to meet the threshold, not just one. Any help would be appreciated. Sample code below:
import pandas as pd
df = pd.DataFrame({0: [1, 1, 1, 1, 100],
                   1: [1, 3, 1, 1, 1],
                   2: [1, 3, 1, 1, 4],
                   3: [1, 1, 1, 1, 1],
                   4: [3, 4, 1, 1, 5],
                   5: [1, 1, 1, 1, 1]})
Easiest way I can think to sort/filter later would be to create another column at the end df[9] that houses the count:
df[9] = df.apply(lambda x: x.count() if x > 2, axis=1)
This code doesn't work, but I feel like it's close?
df[(df>1).sum(axis=1)>=2]
Explanation:
(df>1).sum(axis=1) gives, for each row, the number of columns whose value is greater than 1.
Comparing that count with >=2 then keeps only the rows in which at least 2 columns meet the condition.
With axis=1, the x passed to the lambda is a Series holding one row, so it can be filtered with a boolean mask before counting:
df[9] = df.apply(lambda x: x[x > 2].count(), axis=1)
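As a sketch, the boolean-mask approach can be run on the sample frame from the question. The names above_threshold and subset are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({0: [1, 1, 1, 1, 100],
                   1: [1, 3, 1, 1, 1],
                   2: [1, 3, 1, 1, 4],
                   3: [1, 1, 1, 1, 1],
                   4: [3, 4, 1, 1, 5],
                   5: [1, 1, 1, 1, 1]})

# df > 1 is a frame of booleans; summing across axis=1 counts the True
# values in each row, i.e. how many columns exceed the threshold.
above_threshold = (df > 1).sum(axis=1)

# Keep only the rows with at least 2 columns above the threshold.
subset = df[above_threshold >= 2]
print(above_threshold.tolist())  # prints [1, 3, 0, 0, 3]
print(subset.index.tolist())     # prints [1, 4]
```

Computing the count first and filtering afterwards keeps the original columns untouched, which matters if the count is only needed for the filter.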

Find and Add Missing Column Values Based on Index Increment Python Pandas Dataframe

Good Afternoon!
I have a pandas dataframe with an index and a count.
dictionary = {1:5,2:10,4:3,5:2}
df = pd.DataFrame.from_dict(dictionary , orient = 'index' , columns = ['count'])
What I want to do is check that the index increments by 1 from df.index.min() to df.index.max(). If a value is missing, as 3 is in my case, I want to add it to the index with a count of 0.
The output will look like the below df2 but done in a programmatic fashion so I can use it on a much bigger dataframe.
RESULTS EXAMPLE DF:
dictionary2 = {1:5,2:10,3:0,4:3,5:2}
df2 = pd.DataFrame.from_dict(dictionary2 , orient = 'index' , columns = ['count'])
Thank you much!!!
Ensure the index is sorted:
df = df.sort_index()
Create an array that runs from the minimum index to the maximum index:
import numpy as np
complete_array = np.arange(df.index.min(), df.index.max() + 1)
Reindex, filling the missing rows with 0, and optionally change the dtype to a pandas nullable integer type:
df.reindex(complete_array, fill_value=0).astype("Int16")
   count
1      5
2     10
3      0
4      3
5      2
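The steps above can be put together in one runnable sketch. The helper name fill_index_gaps is hypothetical, and the frame is built directly from the values in the question.

```python
import numpy as np
import pandas as pd

def fill_index_gaps(frame, fill_value=0):
    """Return a copy whose integer index has no gaps between min and max."""
    frame = frame.sort_index()
    # Every integer from the smallest to the largest index label, inclusive.
    full_index = np.arange(frame.index.min(), frame.index.max() + 1)
    # Labels that were missing get a new row filled with fill_value.
    return frame.reindex(full_index, fill_value=fill_value)

# Same data as in the question: index 3 is missing.
df = pd.DataFrame({"count": [5, 10, 3, 2]}, index=[1, 2, 4, 5])
df2 = fill_index_gaps(df)
print(df2)
```

Passing fill_value to reindex keeps the column as an integer dtype, because no NaN is ever introduced.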

How does the map() function work in Python?

I want to apply the NumPy function average to a pandas DataFrame. Since I want to apply this function to each row of the DataFrame, I tried to map it over the rows. The code is as follows:
df = pd.DataFrame(np.random.rand(5,3),columns = ['Col1','Col2','Col3'])
df_averge_row = df.apply(np.average(weights=[[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5]]),axis=0)
Unfortunately, it is not working. Any suggestions would be helpful.
Since you have 3 columns in each row and are applying the function row-wise (not column-wise) per your question, the weights argument can only have 3 elements (one per column in a given row, say [1,2,3]):
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(5,3),columns = ['Col1','Col2','Col3'])
weights = [1, 2, 3]
df_averge_row = df.apply(lambda x: np.average(x, weights=weights),axis=1)
df_averge_row
out:
0 0.618617
1 0.757778
2 0.551463
3 0.497654
4 0.755083
dtype: float64
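A small sketch with fixed values instead of random ones makes the result easy to check by hand. It also shows a vectorized alternative, which is an addition here rather than part of the answer above: np.average accepts an axis argument, so the row-wise apply can be skipped.

```python
import numpy as np
import pandas as pd

# Fixed, made-up values so the averages can be verified manually.
df = pd.DataFrame([[1.0, 2.0, 3.0],
                   [4.0, 4.0, 4.0]], columns=["Col1", "Col2", "Col3"])
weights = [1, 2, 3]  # one weight per column

# Row-wise apply, as in the answer above.
row_avg_apply = df.apply(lambda row: np.average(row, weights=weights), axis=1)

# Vectorized alternative: average along axis=1 of the underlying array.
row_avg_vectorized = pd.Series(
    np.average(df.to_numpy(), axis=1, weights=weights), index=df.index
)

print(row_avg_apply)
print(row_avg_vectorized)
```

For the first row the weighted average is (1*1 + 2*2 + 3*3) / 6, and both approaches should agree on it.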

How can I apply a function that I created to each successive row of a column in python?

I have 4 rows in a column that I want to iterate over and apply a function to.
The df is a single column, and the values are, for instance:
a
b
c
d
I want to apply the function in this way:
function(a,b)
function(a,c)
function(a,d)
function(b,c)
function(c,d)
How can I do this in python?
I've tried using df.apply(lambda column: compare_score, axis=0)
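One possible sketch, assuming the goal is to call the function on every unordered pair of values in the column: itertools.combinations yields each pair once, in order. The column name value and the body of compare_score are placeholders, since neither is shown in the question. Note that this also produces the pair (b, d), which the list in the question does not mention.

```python
from itertools import combinations

import pandas as pd

# Hypothetical single-column frame; the column name is an assumption.
df = pd.DataFrame({"value": ["a", "b", "c", "d"]})

def compare_score(x, y):
    # Placeholder for the real comparison function from the question.
    return f"{x}-{y}"

# Call the function once for every unordered pair of rows.
results = {
    (x, y): compare_score(x, y)
    for x, y in combinations(df["value"], 2)
}

for pair, score in results.items():
    print(pair, score)
```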