I want a cumulative count of zeros in column c, grouped by column a and sorted by b. Any non-zero value resets the count to 1.
Here is a sample:
df = pd.DataFrame({'a':[1,1,1,1,2,2,2,2],
'b':[1,2,3,4,1,2,3,4],
'c':[10,0,0,5,1,0,1,0]}
)
I tried the following code. It works, but when zeros appear several times in a row, shift does not see the newly assigned values, so it has to be run repeatedly, depending on the length of the run of zeros:
df.loc[df.c == 0 ,'n'] = df.n.shift(1)+1
I also tried the following loop. It works on a small dataframe, but on large data it takes a long time and does not finish:
for ind in df.index:
    if df.loc[ind, 'c'] == 0:
        df.loc[ind, 'new'] = df.loc[ind-1, 'new'] + 1
    else:
        df.loc[ind, 'new'] = 1
The desired result:
a b c n
0 1 1 10 1
1 1 2 0 2
2 1 3 0 3
3 1 4 5 1
4 2 1 1 1
5 2 2 0 2
6 2 3 1 1
7 2 4 0 2
Try using cumsum to create a grouping variable, then use groupby.cumcount to create the new column:
df.sort_values(['a', 'b'], inplace=True)
df['n'] = df['c'].groupby([df.a, df['c'].ne(0).cumsum()]).cumcount() + 1
df
a b c n
0 1 1 10 1
1 1 2 0 2
2 1 3 0 3
3 1 4 5 1
4 2 1 1 1
5 2 2 0 2
6 2 3 1 1
7 2 4 0 2
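To see why this works, it can help to look at the intermediate grouping key. The sketch below rebuilds the sample frame from the question and splits the one-liner into two steps; the names block and n are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 1, 2, 2, 2, 2],
                   'b': [1, 2, 3, 4, 1, 2, 3, 4],
                   'c': [10, 0, 0, 5, 1, 0, 1, 0]})
df = df.sort_values(['a', 'b'])

# Each non-zero value in c starts a new block; the zeros that follow it
# share the same block id because the running sum does not change.
block = df['c'].ne(0).cumsum()

# Position of each row inside its (a, block) pair, counted from 1.
n = df.groupby(['a', block]).cumcount() + 1

print(pd.DataFrame({'c': df['c'], 'block': block, 'n': n}))
```

Grouping by a as well as block keeps a run of zeros at the start of one group from continuing the count of the previous group.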
Suppose I have a dataframe like this:
>>> df = pd.DataFrame({'id':[1,1,1,2,2,2,2,3,4],'value':[1,2,3,1,2,3,4,1,1]})
>>> df
id value
0 1 1
1 1 2
2 1 3
3 2 1
4 2 2
5 2 3
6 2 4
7 3 1
8 4 1
Now I want to keep all records from each id group except the last 3. That is, I want to drop the last 3 records of every group. How can I do this with pandas groupby? This is dummy data.
Use GroupBy.cumcount with ascending=False to number the rows of each group from the end, then compare with Series.gt to keep only rows whose counter is greater than 2. The counter starts at 0, so the last 3 rows of each group get the values 0, 1 and 2:
df = df[df.groupby('id').cumcount(ascending=False).gt(2)]
print (df)
id value
3 2 1
Details:
print (df.groupby('id').cumcount(ascending=False))
0 2
1 1
2 0
3 3
4 2
5 1
6 0
7 0
8 0
dtype: int64
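The same idea can be wrapped in a small helper so the number of dropped rows is a parameter. This is a minimal sketch; drop_last_n and from_end are hypothetical names, not part of pandas.

```python
import pandas as pd

def drop_last_n(frame, key, n):
    """Return frame without the last n rows of each group of `key`."""
    # Counting from the end: the last row of a group gets 0, the one before it 1, ...
    from_end = frame.groupby(key).cumcount(ascending=False)
    # Rows with from_end < n are the last n rows of their group, so keep the rest.
    return frame[from_end >= n]

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 2, 3, 4],
                   'value': [1, 2, 3, 1, 2, 3, 4, 1, 1]})
result = drop_last_n(df, 'id', 3)
print(result)
```

Groups with n or fewer rows disappear entirely, which matches the output shown above, where only one row of id 2 remains.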
I have a dataframe df:
df = pd.DataFrame({})
df['X'] = [3,8,11,6,7,8]
df['name'] = [1,1,1,2,2,2]
X name
0 3 1
1 8 1
2 11 1
3 6 2
4 7 2
5 8 2
For each group within 'name', I want to remove the group if the absolute difference between its first and last row is smaller than a specified value d_dif.
For example, when d_dif = 5, I want to get:
X name
0 3 1
1 8 1
2 11 1
If X is increasing within each group, the range (max minus min) equals the difference between the last and first rows, so you can use groupby().transform() with np.ptp:
threshold = 5
ranges = df.groupby('name')['X'].transform(np.ptp)
df[ranges > threshold]
If you only care about first and last, then transform just first and last:
threshold = 5
groups = df.groupby('name')['X']
ranges = groups.transform('last') - groups.transform('first')
df[ranges.abs() > threshold]
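For reference, here is a self-contained sketch of the first/last variant on the sample data from the question. The helper name keep_wide_groups is hypothetical. It uses >= so that a group whose difference is exactly d_dif is kept, on the assumption that "smaller than" in the question is strict; the snippets above use > instead.

```python
import pandas as pd

def keep_wide_groups(frame, key, col, d_dif):
    """Keep groups whose |last - first| in `col` is at least d_dif."""
    grouped = frame.groupby(key)[col]
    # transform broadcasts each group's first/last value back to every row,
    # so span is aligned with frame and can be used as a boolean mask.
    span = (grouped.transform('last') - grouped.transform('first')).abs()
    return frame[span >= d_dif]

df = pd.DataFrame({'X': [3, 8, 11, 6, 7, 8],
                   'name': [1, 1, 1, 2, 2, 2]})
result = keep_wide_groups(df, 'name', 'X', 5)
print(result)
```

With d_dif = 5, group 1 has a difference of 8 and is kept, while group 2 has a difference of 2 and is removed.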
Id
1
2
3
4
2
3
3
3
Question
I want to create one new column, newid.
The output should look like this:
id newid
1 1
2 1
3 1
4 1
2 0
3 0
3 0
3 0
Please suggest how I can do this and which formula to use in Excel.
I have an issue with OrderByDescending in Orchard CMS.
Example Data:
ID Name DomainId
1 First 2
2 Join 3
3 Peter 1
4 Abert 1
5 saha 2
With LINQ to SQL, this code:
IQueryable().OrderByDescending(r=> r.DomainId == 2)
returns the correct result:
ID Name DomainId
1 First 2
5 saha 2
2 Join 3
3 Peter 1
4 Abert 1
But with Orchard CMS, this code:
IContentQuery().OrderByDescending(r=> r.DomainId == 2)
returns an incorrect result. It orders the rows by DomainId from largest to smallest:
ID Name DomainId
2 Join 3
1 First 2
5 saha 2
3 Peter 1
4 Abert 1
Why is the result incorrect, and how can I fix it?
Please help me!
I want the rows with the given DomainId to be returned first.
Example: with DomainId = 2 => IContentQuery().OrderByDescending(r=> r.DomainId == 2)
ID Name DomainId
1 First 2
5 saha 2
2 Join 3
3 Peter 1
4 Abert 1
Example: with DomainId = 3 => IContentQuery().OrderByDescending(r=> r.DomainId == 3)
ID Name DomainId
2 Join 3
1 First 2
5 saha 2
3 Peter 1
4 Abert 1
Try:
.OrderByDescending(r=> r.DomainId == 2 ? 1 : 0)
IContentQuery().OrderByDescending(r=> r.DomainId) is the right approach, I think. Does the expression (r=> r.DomainId == 2) evaluate to the right result?
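The ordering the question asks for (matching rows first) can be illustrated outside of Orchard. The following is only a plain Python sketch of the sort-key idea behind the ternary suggestion above, not Orchard or LINQ code; Row and matches_first are hypothetical names.

```python
from collections import namedtuple

Row = namedtuple('Row', ['id', 'name', 'domain_id'])

rows = [Row(1, 'First', 2), Row(2, 'Join', 3), Row(3, 'Peter', 1),
        Row(4, 'Abert', 1), Row(5, 'saha', 2)]

def matches_first(items, domain_id):
    """Put rows with the given domain_id first."""
    # The key is 1 for matching rows and 0 otherwise, mirroring
    # `r.DomainId == 2 ? 1 : 0`. sorted() is stable even with reverse=True,
    # so rows with equal keys keep their input order.
    return sorted(items,
                  key=lambda r: 1 if r.domain_id == domain_id else 0,
                  reverse=True)

ordered = matches_first(rows, 2)
print([r.id for r in ordered])
```

In this sketch the non-matching rows keep their input order because the sort is stable. A database query gives no such guarantee unless a second ordering key is added.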