Excel - COUNTIF per category

I have this:
1   A        B      C
2   Country  Value  Valid
3   Sweden   10     0
4   Sweden   5      1
5   Sweden   1      1
6   Norway   5      1
7   Norway   5      1
8   Germany  12     1
9   Germany  2      1
10  Germany  3      1
11  Germany  1      0
I want to fill in B15 to D17 in the table below with the number of valid values (a 1 in column C) per country and value range:
    A        B       C       D
13           Value count
14           0 to 3  4 to 7  above 7
15  Sweden   1       1       0
16  Norway   0       2       0
17  Germany  3       0       1
I have tried IF combined with COUNTIF, but I can't figure it out.
What would the formula be for cell B15?

The formula you are looking for in B15 is:
=COUNTIFS($A$3:$A$11,$A15,$C$3:$C$11,1,$B$3:$B$11,"<4")
For the other columns, change the last criterion so that it bounds the value on both sides, for example $B$3:$B$11,">3",$B$3:$B$11,"<8" for 4 to 7, and $B$3:$B$11,">7" for above 7.
Note: Germany will be 2 in the first column, not 3, because Valid is 0 in the last row.
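To cross-check the COUNTIFS results outside Excel, here is a minimal pandas sketch. It assumes the sheet has been loaded into a DataFrame with the column names Country, Value and Valid; the bin edges and labels are illustrative choices that mirror the three value ranges, not something taken from the workbook.

```python
import pandas as pd

# Data from rows 3 to 11 of the sheet.
df = pd.DataFrame({
    "Country": ["Sweden"] * 3 + ["Norway"] * 2 + ["Germany"] * 4,
    "Value": [10, 5, 1, 5, 5, 12, 2, 3, 1],
    "Valid": [0, 1, 1, 1, 1, 1, 1, 1, 0],
})

# Keep only valid rows (the "$C$3:$C$11,1" criterion).
valid = df[df["Valid"] == 1]

# Assign each value to a range: (-1, 3], (3, 7], (7, inf).
ranges = pd.cut(
    valid["Value"],
    bins=[-1, 3, 7, float("inf")],
    labels=["0 to 3", "4 to 7", "above 7"],
)

# Count rows per country and value range.
counts = pd.crosstab(valid["Country"], ranges)
print(counts)
```

The Germany row should show 2 in the first range, in line with the note above.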

Related

Fill null and next value with average value

I work with customers' consumption data, and sometimes the consumption is missing for a month or more,
so the first consumption recorded after the gap needs to be spread across those months.
Example:
df = pd.DataFrame({'customerId':[1,1,1,1,1,1,1,2,2,2,2,2,2,2],
'month':['2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01','2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01'],
'consumption':[100,130,0,0,400,140,105,500,0,0,0,0,0,3300]})
bfill() returns the same value, not the mean (value / (number of nulls + 1)).
Desired values:
'c':[100,130,133,133,133,140,105,500,550,550,550,550,550,550]
You can try something like this:
import pandas as pd

df = pd.DataFrame({'customerId':[1,1,1,1,1,1,1,2,2,2,2,2,2,2],
                   'month':['2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01','2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01'],
                   'consumption':[100,130,0,0,400,140,105,500,0,0,0,0,0,3300]})
df['grp'] = df['consumption'].ne(0)[::-1].cumsum()
df['c'] = df.groupby(['customerId', 'grp'])['consumption'].transform('mean')
df
Output:
customerId month consumption grp c
0 1 2021-10-01 100 7 100.000000
1 1 2021-11-01 130 6 130.000000
2 1 2021-12-01 0 5 133.333333
3 1 2022-01-01 0 5 133.333333
4 1 2022-02-01 400 5 133.333333
5 1 2022-03-01 140 4 140.000000
6 1 2022-04-01 105 3 105.000000
7 2 2021-10-01 500 2 500.000000
8 2 2021-11-01 0 1 550.000000
9 2 2021-12-01 0 1 550.000000
10 2 2022-01-01 0 1 550.000000
11 2 2022-02-01 0 1 550.000000
12 2 2022-03-01 0 1 550.000000
13 2 2022-04-01 3300 1 550.000000
Details:
Create a group key by checking for non-zero values, then do a cumsum in reverse order
to group zeroes with the next non-zero value.
Group by that key and transform with mean to distribute that non-zero
value across the zeroes.
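To see the reversed cumsum in isolation, here is a small sketch on a single made-up Series (the numbers are illustrative, not from the question). The key is reversed back explicitly so that it lines up with the original row order.

```python
import pandas as pd

# Hypothetical consumption values: two missing months before the 300 reading.
s = pd.Series([80, 0, 0, 300, 50])

# True for real readings; the cumsum runs from the last row to the first,
# so each zero inherits the group number of the next non-zero value.
key = s.ne(0)[::-1].cumsum()[::-1]

# Spread each reading evenly over itself and the zeroes that precede it.
filled = s.groupby(key).transform("mean")

print(key.tolist())
print(filled.tolist())
```

Here the 300 reading is shared by three rows, so each of them gets 100.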

map data from one column to another

I have two DataFrames d1 and d2.
d1:
category value
0 a 4
1 b 9
2 c 14
3 d 19
4 e 24
5 f 29
d2:
one two
0 NaN a
1 NaN a
2 NaN c
3 NaN d
4 NaN e
5 NaN a
I want to map values from d1 into the 'one' column of d2, using the category marker from d1.
This should return:
one two
0 4 a
1 4 a
2 14 c
3 19 d
4 24 e
5 4 a
Try:
d2['one'] = d2['two'].map(d1.set_index('category')['value'])
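A self-contained sketch of this lookup, rebuilding the two frames from the question under the names d1 and d2. The intermediate name lookup is only there for readability.

```python
import pandas as pd

d1 = pd.DataFrame({"category": list("abcdef"),
                   "value": [4, 9, 14, 19, 24, 29]})
d2 = pd.DataFrame({"one": [float("nan")] * 6,
                   "two": ["a", "a", "c", "d", "e", "a"]})

# Series indexed by category, so that map() can look each marker up.
lookup = d1.set_index("category")["value"]

# Any marker in 'two' that is missing from d1 would become NaN.
d2["one"] = d2["two"].map(lookup)
print(d2)
```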

Create a column in pandas dataframes based on conditionals

I have a pandas dataframe as below:
import pandas as pd
import numpy as np
import datetime
# Initialise the data as a dict of lists.
data = {'month' :[2,3,4,5,6,7,2,3,6,5],
'flag': ["A","A","A","A","A","A","B","B","B","B"],
'month1' :[4,4,7,15,11,13,6,5,6,5],
'value' :[100,20,50,10,65,86,24,12,1000,200]
}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
df
month flag month1 value
0 2 A 4 100
1 3 A 4 20
2 4 A 7 50
3 5 A 15 10
4 6 A 11 65
5 7 A 13 86
6 2 B 6 24
7 3 B 5 12
8 6 B 6 1000
9 5 B 5 200
Now, within each flag group, I want to apply the following logic:
1) Create a column "final" and set it to 0.
2) For each row, if month1 <= max(month) of the group, find the row where month equals this row's month1 and add this row's value to that row's "final". For example:
Indexes 0 to 5 form one group (flag = 'A').
The max of the month column for group A is 7.
For the first row (month 2), month1 is 4, which is less than 7, so go to month 4 (the third row) and update its "final" to 100 (0, the current "final", plus 100, the value from the original row).
Repeat this step for each row in the group.
Expected output:
month flag month1 value Final
0 2 A 4 100 0
1 3 A 4 20 0
2 4 A 7 50 120
3 5 A 15 10 0
4 6 A 11 65 0
5 7 A 13 86 50
6 2 B 6 24 0
7 3 B 5 12 0
8 6 B 6 1000 1024
9 5 B 5 200 212
Define the following functions:
A function to be applied to each row (in the current group):
def fn(row, tbl):
    # Sum value over the rows of the group whose month1 equals this row's month.
    # The month1 <= max(month) check is implicit: a month1 above the maximum matches no row.
    return tbl[tbl.month1 == row.month].value.sum()
A function to be applied to each group:
def fnGrp(grp):
    return grp.apply(fn, axis=1, tbl=grp)
Then, to compute the final column, group df by flag, apply
fnGrp to each group and save the result in the final column:
df['final'] = df.groupby('flag').apply(fnGrp).reset_index(level=0, drop=True)
The result (df with added column) is:
month flag month1 value final
0 2 A 4 100 0
1 3 A 4 20 0
2 4 A 7 50 120
3 5 A 15 10 0
4 6 A 11 65 0
5 7 A 13 86 50
6 2 B 6 24 0
7 3 B 5 12 0
8 6 B 6 1000 1024
9 5 B 5 200 212
Alternatively, you can group by 'flag' and 'month1', sum 'value', then merge the result back onto df and fill the missing values with 0:
new_df = df.merge(df.groupby(['flag', 'month1'])[['value']].sum(),
                  left_on=['flag', 'month'], right_index=True,
                  how='left', suffixes=('', '_final'))\
           .fillna({'value_final': 0})
print(new_df)
month flag month1 value value_final
0 2 A 4 100 0.0
1 3 A 4 20 0.0
2 4 A 7 50 120.0
3 5 A 15 10 0.0
4 6 A 11 65 0.0
5 7 A 13 86 50.0
6 2 B 6 24 0.0
7 3 B 5 12 0.0
8 6 B 6 1000 1024.0
9 5 B 5 200 212.0
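The same lookup can also be written without a merge. The sketch below is a variant, not either answer's code: it sums value per (flag, month1) and then reindexes those sums by each row's (flag, month) pair, filling pairs with no match with 0. The names sums and keys are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "month":  [2, 3, 4, 5, 6, 7, 2, 3, 6, 5],
    "flag":   ["A"] * 6 + ["B"] * 4,
    "month1": [4, 4, 7, 15, 11, 13, 6, 5, 6, 5],
    "value":  [100, 20, 50, 10, 65, 86, 24, 12, 1000, 200],
})

# Total value sent to each target month, per flag.
sums = df.groupby(["flag", "month1"])["value"].sum()

# One (flag, month) key per row; rows that nobody points at get 0.
keys = pd.MultiIndex.from_frame(df[["flag", "month"]])
df["final"] = sums.reindex(keys, fill_value=0).to_numpy()
print(df)
```

Because the lookup requires month1 to equal an existing month in the group, the month1 <= max(month) condition from the question is satisfied automatically.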

How do I create a new column in pandas which is the sum of another column based on a condition?

I am trying to get the result column to be the sum of the value column for all rows in the data frame where the country is equal to the country in that row, and the date is on or before the date in that row.
Date Country Value Result
01/01/2019 France 10 10
03/01/2019 England 9 9
03/01/2019 Germany 7 7
22/01/2019 Italy 2 2
07/02/2019 Germany 10 17
17/02/2019 England 6 15
25/02/2019 England 5 20
07/03/2019 France 3 13
17/03/2019 England 3 23
27/03/2019 Germany 3 20
15/04/2019 France 6 19
04/05/2019 England 3 26
07/05/2019 Germany 5 25
21/05/2019 Italy 5 7
05/06/2019 Germany 8 33
21/06/2019 England 3 29
24/06/2019 England 7 36
14/07/2019 France 1 20
16/07/2019 England 5 41
30/07/2019 Germany 6 39
18/08/2019 France 6 26
04/09/2019 England 3 44
08/09/2019 Germany 9 48
15/09/2019 Italy 7 14
05/10/2019 Germany 2 50
I have tried the code below, but it sums up the entire column (each column is compared with itself, so both conditions are true for every row):
df['result'] = df.loc[(df['Country'] == df['Country']) & (df['Date'] >= df['Date']), 'Value'].sum()
As your dates are ordered, you could do:
df['Result'] = df.groupby('Country').Value.cumsum()
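If the rows are not already in date order, the running total needs a sort first. The sketch below uses four rows from the question, deliberately shuffled, and assumes the dates are day/month/year strings.

```python
import pandas as pd

# Four rows from the question, deliberately out of order.
df = pd.DataFrame({
    "Date": ["07/02/2019", "01/01/2019", "03/01/2019", "03/01/2019"],
    "Country": ["Germany", "France", "England", "Germany"],
    "Value": [10, 10, 9, 7],
})

# Parse as day/month/year so the sort is chronological, not alphabetical.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
df = df.sort_values("Date", kind="stable").reset_index(drop=True)

# Running total of Value within each country.
df["Result"] = df.groupby("Country")["Value"].cumsum()
print(df)
```

One caveat: if a country has two rows on the same date, a running total gives them different results, whereas "on or before" would give both the same sum.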

Excel formulas for range criteria date, arranged in columns

I want to write a formula for a large data chart. The criteria I have to select on are in both rows and columns.
I have attached the file with the manually computed totals.
|PRODUCT|01-feb|02-feb|03-feb|04-feb|05-feb|06-feb|07-feb|08-feb|09-feb|10-feb|11-feb|12-feb|
|PRODUCT 1|4|3|1|5|2|9|1|3|5|8|0|5|
|PRODUCT 3|2|5|7|4|4|8|3|5|7|4|4|8|
|PRODUCT 1|1|0|5|3|1|1|8|0|5|3|1|1|
|PRODUCT 2|5|4|6|6|0|7|4|4|6|6|0|7|
|PRODUCT 5|8|7|8|7|1|9|2|7|8|7|1|9|
|PRODUCT 4|4|2|9|3|5|1|7|2|9|3|5|1|
|PRODUCT 1|9|8|1|4|4|6|5|8|1|4|4|6|
|PRODUCT 2|6|4|4|7|2|8|6|4|4|7|2|8|
|PRODUCT 5|2|6|1|8|3|9|3|6|1|8|3|9|
|PRODUCT 3|3|9|5|1|7|4|7|9|5|1|7|4|
|PRODUCT 4|7|6|5|5|8|2|1|6|5|5|8|2|
The compact chart that I have to get:
|PRODUCT|04-feb|08-feb|12-feb|
|PRODUCT 1|44|48|43|
|PRODUCT 2|42|35|40|
|PRODUCT 3|36|47|40|
|PRODUCT 4|41|32|38|
|PRODUCT 5|47|40|46|
The formula that I think should work (SUMAR.SI.CONJUNTO is the Spanish name of SUMIFS):
=SUMAR.SI.CONJUNTO(C5:N15,B5:B15,H20,C4:N4,"=<"&J19)
because I want the new 04-feb column to show the total for the date range 01-feb to 04-feb from the first chart.
Please help me.
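The following might help you. The formula in the upper-left cell of the summary table (B15) is
{=SUM((($B$1:$M$1<=B$14)*($B$1:$M$1>=A$14)*$B$2:$M$13)*($A15=$A$2:$A$13))}
and it can be copied to the other cells. The 31. Jan in the summary table is used as a "helper cell", so that you don't have to alter the formula for the different cells. Note that both date bounds are inclusive, so each column after the first also counts the previous column's end date (the 08. Feb totals below include 04. Feb); change >=A$14 to >A$14 if the ranges must not overlap.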
Product 01. Feb 02. Feb 03. Feb 04. Feb 05. Feb 06. Feb 07. Feb 08. Feb 09. Feb 10. Feb 11. Feb 12. Feb
Product1 5 2 3 3 5 5 3 3 5 3 3 5
Product3 5 4 2 4 5 1 5 3 3 5 3 3
Product4 3 1 2 2 4 5 5 1 5 5 1 5
Product1 4 1 4 3 4 1 4 1 3 4 1 3
Product3 1 2 2 4 5 2 5 1 1 5 1 1
Product4 3 2 4 1 1 4 3 5 2 3 5 2
Product1 4 3 5 1 1 1 2 2 2 2 2 2
Product3 3 2 4 3 5 1 1 1 4 1 1 4
Product4 2 1 4 2 2 1 4 4 3 4 4 3
Product1 4 5 5 2 3 4 3 4 5 3 4 5
Product3 4 2 3 1 4 1 1 3 1 1 3 1
Product4 3 5 3 3 1 4 1 1 3 1 1 3
31. Jan 04. Feb 08. Feb 12. Feb
Product1 54 55 62
Product2 0 0 0
Product3 46 56 46
Product4 41 54 61
Product5 0 0 0
You can use SUMPRODUCT for this. B2:E12 is the range of data for Feb 1 through Feb 4, and O2 holds the criterion you are searching for; in my case O2 was equal to Product 1. For the Feb 8 column, just change B2:E12 to the range of data corresponding to Feb 5 to Feb 8.
=SUMPRODUCT(B2:E12*(A2:A12=O2))
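For checking the Excel totals outside the workbook, here is a pandas sketch of the same calculation. It uses the PRODUCT 1 and PRODUCT 2 rows from the question and assumes fixed four-day blocks ending on 04-feb, 08-feb and 12-feb; the names days, per_product and summary are illustrative.

```python
import pandas as pd

days = [f"{d:02d}-feb" for d in range(1, 13)]

# PRODUCT 1 and PRODUCT 2 rows from the question, in their original order.
rows = [
    ("PRODUCT 1", [4, 3, 1, 5, 2, 9, 1, 3, 5, 8, 0, 5]),
    ("PRODUCT 1", [1, 0, 5, 3, 1, 1, 8, 0, 5, 3, 1, 1]),
    ("PRODUCT 2", [5, 4, 6, 6, 0, 7, 4, 4, 6, 6, 0, 7]),
    ("PRODUCT 1", [9, 8, 1, 4, 4, 6, 5, 8, 1, 4, 4, 6]),
    ("PRODUCT 2", [6, 4, 4, 7, 2, 8, 6, 4, 4, 7, 2, 8]),
]
df = pd.DataFrame([vals for _, vals in rows], columns=days)
df.insert(0, "PRODUCT", [name for name, _ in rows])

# First add up the duplicate product rows, day by day.
per_product = df.groupby("PRODUCT")[days].sum()

# Then add up each block of four days, labelled by the block's last day.
summary = pd.DataFrame({
    days[end - 1]: per_product[days[end - 4:end]].sum(axis=1)
    for end in (4, 8, 12)
})
print(summary)
```

The two rows printed should match the PRODUCT 1 and PRODUCT 2 rows of the compact chart in the question.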
