How can I make a new column with dynamic values - excel

Id
1
2
3
4
2
3
3
3
Question
I want to create one new column called newid.
The output should look like this:
id newid
1 1
2 1
3 1
4 1
2 0
3 0
3 0
3 0
Please suggest how I can do this and which formula to use in Excel.
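The sample output suggests that newid is 1 the first time an id value appears and 0 for every repeat. That reading is an assumption drawn from the example, not something the question states. Under that assumption, the rule can be sketched in Python as follows (in Excel the same idea would be a running count such as =IF(COUNTIF(A$2:A2,A2)=1,1,0), assuming the ids start in A2):

```python
# Assumption: newid is 1 on the first occurrence of an id, 0 on repeats.
ids = [1, 2, 3, 4, 2, 3, 3, 3]

seen = set()   # id values encountered so far
newid = []
for value in ids:
    newid.append(0 if value in seen else 1)
    seen.add(value)

for value, flag in zip(ids, newid):
    print(value, flag)
```

The set keeps the check for "already seen" cheap, so the loop stays linear in the number of rows.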

Related

Cumulative count using grouping, sorting, and condition

I want a cumulative count of the zeros in column c, grouped by column a and sorted by b. For any other number the count resets to 1.
Here is a sample:
df = pd.DataFrame({'a':[1,1,1,1,2,2,2,2],
'b':[1,2,3,4,1,2,3,4],
'c':[10,0,0,5,1,0,1,0]}
)
I tried the following code. It works, but when zero appears more than once in a row, shift does not see the newly assigned value, so the line has to be run several times, depending on the length of the run of zeros:
df.loc[df.c == 0 ,'n'] = df.n.shift(1)+1
I also tried the following loop. It works on a small DataFrame, but on a large one it takes a long time and does not finish:
for ind in df.index:
    if df.loc[ind,'c'] == 0 :
        df.loc[ind,'new'] = df.loc[ind-1,'new']+1
    else :
        df.loc[ind,'new'] = 1
The desired result:
   a  b   c  n
0  1  1  10  1
1  1  2   0  2
2  1  3   0  3
3  1  4   5  1
4  2  1   1  1
5  2  2   0  2
6  2  3   1  1
7  2  4   0  2
Try using cumsum to create a group variable, then use groupby.cumcount to create the new column:
df.sort_values(['a', 'b'], inplace=True)
df['n'] = df['c'].groupby([df.a, df['c'].ne(0).cumsum()]).cumcount() + 1
df
   a  b   c  n
0  1  1  10  1
1  1  2   0  2
2  1  3   0  3
3  1  4   5  1
4  2  1   1  1
5  2  2   0  2
6  2  3   1  1
7  2  4   0  2
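To see why this works, it can help to look at the intermediate group variable. The sketch below rebuilds the sample frame and stores the run label in a helper column; the column name run is an illustrative choice, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 1, 2, 2, 2, 2],
                   'b': [1, 2, 3, 4, 1, 2, 3, 4],
                   'c': [10, 0, 0, 5, 1, 0, 1, 0]})

df = df.sort_values(['a', 'b'])

# Every non-zero value of c starts a new run; the zeros that follow it
# inherit the same label because the cumulative sum does not change.
df['run'] = df['c'].ne(0).cumsum()

# Number the rows inside each (a, run) pair, starting at 1.
df['n'] = df.groupby(['a', 'run']).cumcount() + 1

print(df)
```

Grouping by a as well as by the run label keeps a run of zeros from carrying over from one value of a to the next.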

Remove rows from Dataframe where row above or below has same value in a specific column

Starting Dataframe:
   A  B
0  1  1
1  1  2
2  2  3
3  3  4
4  3  5
5  1  6
6  1  7
7  1  8
8  2  9
Desired result, i.e. remove rows where the value in column A matches that of an adjacent row, keeping the first row of each run:
   A  B
0  1  1
2  2  3
3  3  4
5  1  6
8  2  9
You can use boolean indexing. The following condition is True where the value of A is NOT equal to the value of A in the previous row (shift() moves the column down by one):
new_df = df[df['A'].ne(df['A'].shift())]
   A  B
0  1  1
2  2  3
3  3  4
5  1  6
8  2  9
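A self-contained version of this filter, with the mask kept as a separate variable so it can be inspected, might look like the sketch below. The variable name starts_run is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 3, 3, 1, 1, 1, 2],
                   'B': [1, 2, 3, 4, 5, 6, 7, 8, 9]})

# shift() moves A down one row, so each value is compared with the row above.
# The first row is compared with NaN, which never equals anything, so it is kept.
starts_run = df['A'].ne(df['A'].shift())

new_df = df[starts_run]
print(new_df)
```

Only the first row of each run of equal values survives, which is why rows 5 and 8 are kept even though the values 1 and 2 already appeared earlier.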

Create a new pandas column with repeating a value according with another column

I have a table like this
   times      v2
0      4      10
1      2      20
2      0  30/n30
3      1      40
4      0       9
What I want is to change the values of v2 when times != 0; the change consists of appending "\n0" as many times as the times column says.
   times              v2
0      4  10\n0\n0\n0\n0
1      2        20\n0\n0
2      0          30\n30
3      1           40\n0
4      0               9
You can do
df.v2+=df.times.map(lambda x : x*"\n0")
df
Out[325]:
   times              v2
0      4  10\n0\n0\n0\n0
1      2        20\n0\n0
2      0          30/n30
3      1           40\n0
4      0               9
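For reference, here is a runnable sketch of the same idea. It assumes v2 holds strings and times holds non-negative integers; the row 2 value is copied exactly as it appears in the sample.

```python
import pandas as pd

# Assumption: v2 is a string column and times contains non-negative integers.
df = pd.DataFrame({'times': [4, 2, 0, 1, 0],
                   'v2': ['10', '20', '30/n30', '40', '9']})

# Multiplying a string by n repeats it n times; n == 0 gives an empty string,
# so rows with times == 0 are left unchanged.
suffix = df['times'].map(lambda n: n * '\n0')
df['v2'] = df['v2'] + suffix

print(df)
```

Because a count of zero produces an empty suffix, no separate condition on times != 0 is needed.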

Increment values in a column based on another column (Pandas)

I have a DataFrame containing three columns:
The incrementor
The incremented
Other
I would like to lengthen the DataFrame in a particular way. For each row, I want to add a number of rows that depends on the incrementor; in these added rows the incremented value goes up by one each time, while "other" is simply replicated.
I made a small example to make this clearer:
df = pd.DataFrame([[2,1,3], [5,20,0], ['a','b','c']]).transpose()
df.columns = ['incrementor', 'incremented', 'other']
df
  incrementor incremented other
0           2           5     a
1           1          20     b
2           3           0     c
The desired output is:
  incrementor incremented other
0           2           5     a
1           2           6     a
2           1          20     b
3           3           0     c
4           3           1     c
5           3           2     c
Is there a way to do this elegantly and efficiently with Pandas? Or is there no way to avoid looping?
First, repeat each row incrementor times using index.repeat and .loc:
In [1029]: dff = df.loc[df.index.repeat(df.incrementor.astype(int))]
Then, modify incremented with cumcount:
In [1030]: dff.assign(
               incremented=dff.incremented + dff.groupby(level=0).incremented.cumcount()
           ).reset_index(drop=True)
Out[1030]:
  incrementor incremented other
0           2           5     a
1           2           6     a
2           1          20     b
3           3           0     c
4           3           1     c
5           3           2     c
Details
In [1031]: dff
Out[1031]:
  incrementor incremented other
0           2           5     a
0           2           5     a
1           1          20     b
2           3           0     c
2           3           0     c
2           3           0     c
In [1032]: dff.groupby(level=0).incremented.cumcount()
Out[1032]:
0    0
0    1
1    0
2    0
2    1
2    2
dtype: int64
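Put together as a plain script, the two steps might look like the sketch below. It builds the frame with integer columns directly, so the astype(int) conversion is not needed here; the names dff, offset and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'incrementor': [2, 1, 3],
                   'incremented': [5, 20, 0],
                   'other': ['a', 'b', 'c']})

# Step 1: repeat each index label incrementor times and select those rows.
dff = df.loc[df.index.repeat(df['incrementor'])]

# Step 2: within each original row (same index label), count 0, 1, 2, ...
offset = dff.groupby(level=0).cumcount()

# Add the positional offset and renumber the rows.
result = dff.assign(
    incremented=dff['incremented'] + offset.to_numpy()
).reset_index(drop=True)

print(result)
```

Converting the offset with to_numpy() makes the addition purely positional, which sidesteps any question of how the duplicated index labels are aligned.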

Create a list of duplicate records that are in several columns

I have a data set that is spread across five columns. Sample of data:
Raw Data      End Results
A B C D E     A B C D E
1 2 2 1 6     1 2 2 1 6
0 3 3 0 6     0 3 3 0 6
1 2 2 1 6
0 3 3 0 6
1 2 2 1 6
0 3 3 0 6
1 2 2 1 6
0 3 3 0 6
1 2 2 1 6
0 3 3 0 6
1 2 2 1 6
0 3 3 0 6
1 2 2 1 6
0 3 3 0 6
The length of record varies from 10 to 40.
The data is to help me keep record of inventory and I wish to know which orders are popular.
Unfortunately I am still using Excel 2003.
Because I am not really sure what you have, this is deliberately simple:
In ColumnG Row1 put:
=A1&B1&C1&D1&E1
and copy down to suit. Select ColumnG and Paste Special, Values. Select ColumnG and sort. Insert in H1 and copy down to suit:
=COUNTIF(G$1:G1,G1)
A 1 should indicate the first ("unique") instance of each row of Raw Data, and the other numbers count the repetitions (up to 7 in your example, so one 'original' and six 'copies').
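The running COUNTIF can be mirrored outside Excel to check the logic. The sketch below is only an illustration in Python, using the 14 sample rows; it keys on the whole row as a tuple instead of a concatenated string, which is a deliberate substitution for the column G helper.

```python
from collections import Counter

# The 14 sample rows: two distinct records, each appearing seven times.
raw = [(1, 2, 2, 1, 6), (0, 3, 3, 0, 6)] * 7

seen = Counter()   # how many times each row has appeared so far
running = []       # equivalent of the value in column H for each row
for row in raw:
    seen[row] += 1
    running.append(seen[row])

# Rows whose running count is 1 are the first instance of each record.
unique_rows = [row for row, n in zip(raw, running) if n == 1]

print(unique_rows)
print(seen.most_common())   # records ordered by how often they occur
```

One caveat with joining cells without a separator is that different rows can produce the same text (for example 1 and 12 versus 11 and 2), so adding a delimiter between the cells in column G may be safer when values can have more than one digit.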
