How to set values based on a date range in Excel? - excel

I want to set values based on an arrival and departure date:
Idx    Arrive   Depart   01. Jan  02. Jan  03. Jan  04. Jan  05. Jan  ...
1      01. Jan  04. Jan  1        1        1
2      02. Jan  04. Jan           1        1
3      02. Jan  05. Jan           1        1        1
4      01. Jan  05. Jan  1        1        1        1
5      03. Jan  05. Jan                    1        1
...    ...      ...      ...      ...      ...      ...      ...      ...
Total                    2        4        5        3
For example, Idx 1:
Arrives on 01 January
Departs on 04 January
A total of 3 nights' accommodation is needed (a value of '1' in the columns for 01, 02 and 03 January). Note that a '1' isn't entered in the 04 January column, as this is the date of departure and no accommodation is required that night.
How can I achieve this in Excel?

Assuming that Arrive is in column A and the column headers (Arrive, Depart, 01. Jan) are on row 1, you want to put the following formula into cell C2:
=IF(AND(C$1>=$A2,C$1<$B2),1,"")
From there, you can copy the formula into the other cells. The formula assumes that the dates on the left and at the top are proper date values, i.e. Excel treats them as dates.
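For readers who want to check the logic outside Excel, the same half-open comparison (arrival <= night < departure) can be sketched in Python. This is only an illustration: the year 2024 and the names `stays`, `nights` and `occupancy_row` are assumptions, not part of the question.

```python
from datetime import date, timedelta

# Hypothetical sample data mirroring the question; the year is an assumption.
stays = [
    (date(2024, 1, 1), date(2024, 1, 4)),
    (date(2024, 1, 2), date(2024, 1, 4)),
    (date(2024, 1, 2), date(2024, 1, 5)),
    (date(2024, 1, 1), date(2024, 1, 5)),
    (date(2024, 1, 3), date(2024, 1, 5)),
]

# Column headers: 01 Jan to 05 Jan.
nights = [date(2024, 1, 1) + timedelta(days=i) for i in range(5)]


def occupancy_row(arrive, depart, nights):
    """Return 1 for every night from arrival up to, but excluding, departure."""
    return [1 if arrive <= night < depart else 0 for night in nights]


grid = [occupancy_row(arrive, depart, nights) for arrive, depart in stays]

# Column totals, like the "Total" row in the sheet.
totals = [sum(column) for column in zip(*grid)]

for row in grid:
    print(row)
print(totals)
```

The strict `<` on the departure date is what keeps the departure night empty, exactly as `C$1<$B2` does in the formula.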

Related

Excel function to dynamically SUM UP data based on matching rows and columns

I have a table with metrics shown as rows and month shown as columns.
Example is below:
Quarter  2022-01-01  2022-01-01  2022-01-01  2022-04-01  2022-04-01  2022-04-01  2022-07-01  2022-07-01  2022-07-01  2022-10-01  2022-10-01  2022-10-01
Month    2022-01-01  2022-02-01  2022-03-01  2022-04-01  2022-05-01  2022-06-01  2022-07-01  2022-08-01  2022-09-01  2022-10-01  2022-11-01  2022-12-01
Metrics  Jan 2022    Feb 2022    Mar 2022    Apr 2022    May 2022    Jun 2022    Jul 2022    Aug 2022    Sep 2022    Oct 2022    Nov 2022    Dec 2022
Revenue  1000        1000        1000        500         500         500         100         100         100         0           0           0
Cost     10          10          10          10          10          10          20          20          20          0           5           10
I want to have a dynamic summary table of quarterly data. I can use SUMIFS and look up the quarter using this function:
SUMIFS([Value row range],[Quarter range],[Quarter wanted])
However, I still have to manually select the correct value row range to sum. Is it possible to select the entire table and then pick the correct row by matching labels (the metric in this case)?
Insert Report Month: Dec-22
Last 3 quarter report
Metrics  Q2 2022  Q3 2022  Q4 2022
Revenue  1500     300      0
Cost     30       60       15
I'm aware of the INDEX and MATCH functions, but that combination only finds the first match and does not sum up all months in the same quarter.
Thanks for helping!
Excel 365 for Mac should have the BYCOL function.
Given:
Your data table is a Table named Metrics
Report_Month is a Named Range containing a "real date" in the final month of the last desired quarter.
The following formula will return your output and will adjust as you add columns to the data table.
A11: =Metrics[[#All],[Metrics]]
B11: =LET(x,EDATE(Report_Month,SEQUENCE(,3,-6,3)),TEXT(MONTH(x)/3,"\Q0 ") & YEAR(x))
B12: =BYCOL(XLOOKUP(TEXT(DATE(YEAR(Report_Month),MONTH(Report_Month)-9+SEQUENCE(3,,1,1)+SEQUENCE(,3,0,3),1),"mmm-yy"),Metrics[#Headers],INDEX(Metrics,XMATCH(A12,Metrics[Metrics]),0)),LAMBDA(arr,SUM(arr)))
Select B12 and fill down as far as needed.
Notes
DATE(YEAR(Report_Month),MONTH(Report_Month)-9+SEQUENCE(3,,1,1)+SEQUENCE(,3,0,3),1)
creates a matrix of the starting dates of the nine months ending with the report month, with each column holding the months of one quarter.
So for 12/1/2022 =>
4/1/2022  7/1/2022  10/1/2022
5/1/2022  8/1/2022  11/1/2022
6/1/2022  9/1/2022  12/1/2022
The TEXT function then formats these dates the same way as the column headers in the Metrics table.
XLOOKUP then returns the matching columns from the table into that matrix, and BYCOL lets us SUM each column, which corresponds to one quarter.
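The same idea (pick the row by its metric label, then sum the months of each wanted quarter) can be sketched in plain Python as a cross-check of the expected numbers. The names `quarter_of`, `last_quarters` and `quarterly_summary` are hypothetical helpers, and the data is the sample from the question.

```python
from collections import defaultdict
from datetime import date


def quarter_of(d):
    """Return (year, quarter number) for a date."""
    return (d.year, (d.month - 1) // 3 + 1)


def last_quarters(report_month, n=3):
    """Return the n quarters ending with the quarter of report_month, oldest first."""
    year, quarter = quarter_of(report_month)
    result = []
    for _ in range(n):
        result.append((year, quarter))
        quarter -= 1
        if quarter == 0:
            year, quarter = year - 1, 4
    return result[::-1]


def quarterly_summary(months, table, report_month, n=3):
    """Sum every metric row by quarter and keep only the last n quarters."""
    wanted = last_quarters(report_month, n)
    summary = {}
    for metric, values in table.items():
        totals = defaultdict(int)
        for month, value in zip(months, values):
            totals[quarter_of(month)] += value
        summary[metric] = [totals[q] for q in wanted]
    return wanted, summary


months = [date(2022, m, 1) for m in range(1, 13)]
table = {
    "Revenue": [1000, 1000, 1000, 500, 500, 500, 100, 100, 100, 0, 0, 0],
    "Cost": [10, 10, 10, 10, 10, 10, 20, 20, 20, 0, 5, 10],
}

wanted, summary = quarterly_summary(months, table, date(2022, 12, 1))
print(wanted)
print(summary)
```

With the sample data and a report month of December 2022, this reproduces the Q2 to Q4 figures from the summary table in the question.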

Excel: Dynamic Range Date used in other fields: Sumproduct

I am using a SUMPRODUCT formula to get the net sales of the first four months, then the second four months, then the third four months, up to one month before today. This is the formula that I used:
=IFERROR(SUMPRODUCT($B3:$Y3*(COLUMN($B3:$Y3)>=AGGREGATE(15,6,COLUMN($B3:$Y3)/($B3:$Y3<>0),1)+4*(COLUMNS(B3)-1))*(COLUMN($B3:$Y3)<AGGREGATE(15,6,COLUMN($B3:$Y3)/($B3:$Y3<>0),1)+4*(COLUMNS(B3)))*($B$1:$Y$1<EOMONTH(TODAY(),-1)+1)),0)
However, I need to capture the same range that I have for the net sales for other measures, like COGS in my example. I cannot use the formula above for the other measures, as they are sometimes zero within the range where Net Sales is not. I need to capture those zeros as well.
Example 1
Example 2
Net Sales
Jan  Feb  Mar  Apr  May  June  July  Aug  Sept  Oct  Nov  Dec
0    0    2    3    4    5     2     3    2     3    2    4
---> 1st period= 14 2nd period= 10
COGS (follows the same date range as Net Sales)
Jan  Feb  Mar  Apr  May  June  July  Aug  Sept  Oct  Nov  Dec
0    0    0    0    0    2     1     4    2     3    2    4
---> 1st period= 2 2nd period= 10
You can keep the entire range-check logic from the first formula and change just the value range, i.e. the first formula in my sample:
=IFERROR(SUMPRODUCT($A3:$L3*(COLUMN($A3:$L3)>=AGGREGATE(15,6,COLUMN($A3:$L3)/($A3:$L3<>0),1)+4*(COLUMN(A3)-1))*(COLUMN($A3:$L3)<AGGREGATE(15,6,COLUMN($A3:$L3)/($A3:$L3<>0),1)+4*(COLUMN(A3)))*($A$2:$L$2<EOMONTH(TODAY(),-1)+1)),0)
second formula for COGS:
=IFERROR(SUMPRODUCT($O3:$Z3*(COLUMN($A3:$L3)>=AGGREGATE(15,6,COLUMN($A3:$L3)/($A3:$L3<>0),1)+4*(COLUMN(A3)-1))*(COLUMN($A3:$L3)<AGGREGATE(15,6,COLUMN($A3:$L3)/($A3:$L3<>0),1)+4*(COLUMN(A3)))*($A$2:$L$2<EOMONTH(TODAY(),-1)+1)),0)
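The principle behind the two formulas (locate the window from the Net Sales row, then sum whichever row you need inside that window) can be sketched in Python. This is a simplified illustration, not a translation of the formula: the function name `period_sums` is hypothetical, and the cut-off at the end of last month (the `EOMONTH(TODAY(),-1)` part) is deliberately left out.

```python
def period_sums(reference, values, period_length=4):
    """Sum `values` in consecutive windows that start at the first
    non-zero entry of `reference` (here: the Net Sales row)."""
    start = next((i for i, v in enumerate(reference) if v != 0), None)
    if start is None:
        return []  # the reference row has no non-zero value at all
    return [
        sum(values[i:i + period_length])
        for i in range(start, len(values), period_length)
    ]


net_sales = [0, 0, 2, 3, 4, 5, 2, 3, 2, 3, 2, 4]
cogs = [0, 0, 0, 0, 0, 2, 1, 4, 2, 3, 2, 4]

# The windows are always derived from net_sales, so zeros in cogs are kept.
print(period_sums(net_sales, net_sales))
print(period_sums(net_sales, cogs))
```

Because the window start depends only on `reference`, leading zeros in the COGS row fall inside the first period instead of shifting it. Note that the last window may be shorter than four months when the row runs out.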

Counting the weekday number in excel

Is it possible to count the weekday number in excel?
Let's say
A       B
Aug 1
Aug 2   1
Aug 3   2
Aug 20  15
Aug 31  22
Sep 1   1
Sep 2   2
Sep 3   3
Sep 4
Sep 5
Sep 6   4
Aug 1 is a Sunday so it is blank, Aug 2 is a Monday and it's the first weekday of the month so it counts as number 1, Aug 3 as number 2, Aug 20 as number 15 all the way to Aug 31 which is number 22. Then it starts counting again the following month.
Can this be done without VBA?
Use NETWORKDAYS to return the number of workdays and wrap it in IF to return a blank on weekends:
=IF(WORKDAY(A1-1,1)=A1,NETWORKDAYS(EOMONTH(A1,-1)+1,A1),"")
You can also provide a range in the optional third argument of NETWORKDAYS and WORKDAY to supply a list of holidays that should not be counted.
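To sanity-check the expected numbers, the same count can be sketched in Python with only the standard library. The function name `weekday_number` is hypothetical, holidays are ignored, and the year 2021 is an assumption (it is a year in which Aug 1 falls on a Sunday, as in the question).

```python
from datetime import date, timedelta


def weekday_number(d):
    """Return the position of d among the Mon-Fri days of its month,
    or None when d falls on a weekend."""
    if d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return None
    first = d.replace(day=1)
    return sum(
        1
        for offset in range((d - first).days + 1)
        if (first + timedelta(days=offset)).weekday() < 5
    )


# The year is an assumption; the question does not state it.
for day in (date(2021, 8, 1), date(2021, 8, 2), date(2021, 8, 20),
            date(2021, 8, 31), date(2021, 9, 4), date(2021, 9, 6)):
    print(day, weekday_number(day))
```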

Handle ValueError while creating date in pd

I'm reading a CSV file with the columns p, day and month into a DataFrame. The goal is to create a date from the day, the month and the current year, and I run into this error for the 29th of February:
ValueError: cannot assemble the datetimes: day is out of range for month
When this error occurs, I would like to replace the day with the day before. How can I do that? Below are a few rows of my DataFrame; the datex column at the end is what I would like to get:
p day month year datex
0 p1 29 02 2021 28Feb-2021
1 p2 18 07 2021 18Jul-2021
2 p3 12 09 2021 12Sep-2021
Right now, my code for the date is only the below, so I have nan where the date doesn't exist.
df['datex'] = pd.to_datetime(df[['year', 'month', 'day']], errors='coerce')
You could start from your current code:
df['datex'] = pd.to_datetime(df[['year', 'month', 'day']], errors='coerce')
As expected, the invalid date becomes NaT:
p day year month datex
0 p1 29 2021 2 NaT
1 p2 18 2021 7 2021-07-18
2 p3 12 2021 9 2021-09-12
You could then handle these NaT rows as a special case:
df.loc[df.datex.isnull(), 'previous_day'] = df.day -1
p day year month datex previous_day
0 p1 29 2021 2 NaT 28.0
1 p2 18 2021 7 2021-07-18 NaN
2 p3 12 2021 9 2021-09-12 NaN
df.loc[df.datex.isnull(), 'datex'] = pd.to_datetime(df[['previous_day', 'year', 'month']].rename(columns={'previous_day': 'day'}))
p day year month datex previous_day
0 p1 29 2021 2 2021-02-28 28.0
1 p2 18 2021 7 2021-07-18 NaN
2 p3 12 2021 9 2021-09-12 NaN
You have to create a new day column if you want to keep day = 29 in the day column.
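A different technique from the one above, offered only as a sketch: clamp the day to the last valid day of its month before assembling the date, so no NaT is produced in the first place. It assumes the columns are named `year`, `month` and `day` as in the question. Note that it behaves slightly differently from "the day before": a day of 31 in February would also become 28.

```python
import pandas as pd

df = pd.DataFrame({
    "p": ["p1", "p2", "p3"],
    "day": [29, 18, 12],
    "month": [2, 7, 9],
    "year": [2021, 2021, 2021],
})

# The first day of any month always exists, so this never fails.
month_start = pd.to_datetime(df[["year", "month"]].assign(day=1))

# Number of days in each row's month (28 for February 2021).
last_day = month_start.dt.days_in_month

# Clamp the day to the month length, then assemble the final date.
clamped_day = df["day"].clip(upper=last_day)
df["datex"] = pd.to_datetime(df[["year", "month"]].assign(day=clamped_day))

print(df)
```

The original `day` column is left untouched, so the 29 stays visible next to the adjusted date.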

Find earliest date within daterange

I have the following market data:
data = pd.DataFrame({'year': [2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020],
'month': [10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11],
'day': [1,2,5,6,7,8,9,12,13,14,15,16,19,20,21,22,23,26,27,28,29,30,2,3,5,6,9,10,11,12,13,16,17,18,19,20,23,24,25,26,27,30]})
data['date'] = pd.to_datetime(data)
data['spot'] = [77.3438,78.192,78.1044,78.4357,78.0285,77.3507,76.78,77.13,77.0417,77.6525,78.0906,77.91,77.6602,77.3568,76.7243,76.5872,76.1374,76.4435,77.2906,79.2239,78.8993,79.5305,80.5313,79.3615,77.0156,77.4226,76.288,76.5648,77.1171,77.3568,77.374,76.1758,76.2325,76.0401,76.0529,76.1992,76.1648,75.474,75.551,75.7018,75.8639,76.3944]
data = data.set_index('date')
I'm trying to find the spot value for the first day of the month in the date column. I can find the first business day with below:
def get_month_beg(d):
    month_beg = (d.index + pd.offsets.BMonthEnd(0) - pd.offsets.MonthBegin(normalize=True))
    return month_beg
data['month_beg'] = get_month_beg(data)
However, due to data issues, sometimes the earliest date from my data does not match up with the first business day of the month.
We'll call the earliest spot value of each month the "strike", which is what I'm trying to find. So for October, the strike would be 77.3438 (10/1/20) and for November it would be 80.5313 (which is on 11/2/20, NOT 11/1/20).
I tried below, which only works if my data's earliest date matches up with the first business date of the month (eg it works in Oct, but not in Nov)
data['strike'] = data.month_beg.map(data.spot)
As you can see, I get NaN in Nov because the first business day in my data is 11/2 (spot rate 80.5313) not 11/1. Does anyone know how to find the earliest date within a date range (in this case the earliest date of each month)?
I was hoping the final df would look like the one below:
data = pd.DataFrame({'year': [2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020],
'month': [10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11],
'day': [1,2,5,6,7,8,9,12,13,14,15,16,19,20,21,22,23,26,27,28,29,30,2,3,5,6,9,10,11,12,13,16,17,18,19,20,23,24,25,26,27,30]})
data['date'] = pd.to_datetime(data)
data['spot'] = [77.3438,78.192,78.1044,78.4357,78.0285,77.3507,76.78,77.13,77.0417,77.6525,78.0906,77.91,77.6602,77.3568,76.7243,76.5872,76.1374,76.4435,77.2906,79.2239,78.8993,79.5305,80.5313,79.3615,77.0156,77.4226,76.288,76.5648,77.1171,77.3568,77.374,76.1758,76.2325,76.0401,76.0529,76.1992,76.1648,75.474,75.551,75.7018,75.8639,76.3944]
data['strike'] = [77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,77.3438,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313,80.5313]
data = data.set_index('date')
I believe we can take first() for every year and month combination and later join the result with the main data.
data2=data.groupby(['year','month']).first().reset_index()
#join data 2 with data based on month and year later on
year month day spot
0 2020 10 1 77.3438
1 2020 11 2 80.5313
Based on the question, my understanding is that you need the spot value from the earliest date of each month in the data. Correct me if I have misunderstood.
Strike = Spot value from the first available day of each month
To do this, we need to do the following:
Step 1: Get the year/month value from the date column. Alternatively, use the year and month columns you already have in the DataFrame.
Step 2: Group by year and month. That collects all the records for each year and month combination. From each group, take the first record, which is the earliest date of the month provided the rows are sorted by date. The earliest date can be the 1st, 2nd or 3rd of the month, depending on the data in the column.
Step 3: With transform on the groupby, pandas returns a result with the same length as the DataFrame, repeating each group's value on every row of that group. In this example there are only 2 months (Oct and Nov) but 42 rows, so transform returns 42 rows.
The code groupby(['year','month'])['date'].transform('first') will give the first available day of each month.
Use This:
data['dy'] = data.groupby(['year','month'])['date'].transform('first')
or:
data['dx'] = data.date.dt.to_period('M') #to get yyyy-mm value
Step 4: Using transform, we can also get the spot value directly and assign it to strike, which gives the desired result. Instead of returning the first day of the month, select the spot column so that it returns the spot value.
The code will be: groupby(['year','month'])['spot'].transform('first')
Use this:
data['strike'] = data.groupby(['year','month'])['spot'].transform('first')
or
data['strike'] = data.groupby('dx')['spot'].transform('first')
Putting all this together
The full code to get Strike Price using Spot Price from first day of month
import pandas as pd
data = pd.DataFrame({'year': [2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020],
'month': [10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11],
'day': [1,2,5,6,7,8,9,12,13,14,15,16,19,20,21,22,23,26,27,28,29,30,2,3,5,6,9,10,11,12,13,16,17,18,19,20,23,24,25,26,27,30]})
data['date'] = pd.to_datetime(data)
data['spot'] = [77.3438,78.192,78.1044,78.4357,78.0285,77.3507,76.78,77.13,77.0417,77.6525,78.0906,77.91,77.6602,77.3568,76.7243,76.5872,76.1374,76.4435,77.2906,79.2239,78.8993,79.5305,80.5313,79.3615,77.0156,77.4226,76.288,76.5648,77.1171,77.3568,77.374,76.1758,76.2325,76.0401,76.0529,76.1992,76.1648,75.474,75.551,75.7018,75.8639,76.3944]
# Use the spot price on the first available day of each month as the strike price
data['strike'] = data.groupby(['year','month'])['spot'].transform('first')
# Print the full DataFrame, including the new strike column
print(data)
The output of this will be:
year month day date spot strike
0 2020 10 1 2020-10-01 77.3438 77.3438
1 2020 10 2 2020-10-02 78.1920 77.3438
2 2020 10 5 2020-10-05 78.1044 77.3438
3 2020 10 6 2020-10-06 78.4357 77.3438
4 2020 10 7 2020-10-07 78.0285 77.3438
5 2020 10 8 2020-10-08 77.3507 77.3438
6 2020 10 9 2020-10-09 76.7800 77.3438
7 2020 10 12 2020-10-12 77.1300 77.3438
8 2020 10 13 2020-10-13 77.0417 77.3438
9 2020 10 14 2020-10-14 77.6525 77.3438
10 2020 10 15 2020-10-15 78.0906 77.3438
11 2020 10 16 2020-10-16 77.9100 77.3438
12 2020 10 19 2020-10-19 77.6602 77.3438
13 2020 10 20 2020-10-20 77.3568 77.3438
14 2020 10 21 2020-10-21 76.7243 77.3438
15 2020 10 22 2020-10-22 76.5872 77.3438
16 2020 10 23 2020-10-23 76.1374 77.3438
17 2020 10 26 2020-10-26 76.4435 77.3438
18 2020 10 27 2020-10-27 77.2906 77.3438
19 2020 10 28 2020-10-28 79.2239 77.3438
20 2020 10 29 2020-10-29 78.8993 77.3438
21 2020 10 30 2020-10-30 79.5305 77.3438
22 2020 11 2 2020-11-02 80.5313 80.5313
23 2020 11 3 2020-11-03 79.3615 80.5313
24 2020 11 5 2020-11-05 77.0156 80.5313
25 2020 11 6 2020-11-06 77.4226 80.5313
26 2020 11 9 2020-11-09 76.2880 80.5313
27 2020 11 10 2020-11-10 76.5648 80.5313
28 2020 11 11 2020-11-11 77.1171 80.5313
29 2020 11 12 2020-11-12 77.3568 80.5313
30 2020 11 13 2020-11-13 77.3740 80.5313
31 2020 11 16 2020-11-16 76.1758 80.5313
32 2020 11 17 2020-11-17 76.2325 80.5313
33 2020 11 18 2020-11-18 76.0401 80.5313
34 2020 11 19 2020-11-19 76.0529 80.5313
35 2020 11 20 2020-11-20 76.1992 80.5313
36 2020 11 23 2020-11-23 76.1648 80.5313
37 2020 11 24 2020-11-24 75.4740 80.5313
38 2020 11 25 2020-11-25 75.5510 80.5313
39 2020 11 26 2020-11-26 75.7018 80.5313
40 2020 11 27 2020-11-27 75.8639 80.5313
41 2020 11 30 2020-11-30 76.3944 80.5313
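Note that transform('first') returns the first row of each group in row order, so it yields the earliest date only when the rows are already sorted by date. If that is not guaranteed, a minimal sketch like the following sorts first and lets pandas align the result back on the index. The small unsorted frame and the names `sorted_data` and `month` are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical, deliberately unsorted sample using values from the question.
data = pd.DataFrame({
    "date": pd.to_datetime(["2020-10-02", "2020-10-01", "2020-11-03", "2020-11-02"]),
    "spot": [78.192, 77.3438, 79.3615, 80.5313],
})

# Sort a copy by date so that "first" really means "earliest".
sorted_data = data.sort_values("date")

# Year-month period of every row, used as the grouping key.
month = sorted_data["date"].dt.to_period("M")

# The assignment aligns on the index, so the original row order is preserved.
data["strike"] = sorted_data.groupby(month)["spot"].transform("first")

print(data)
```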
Previous Answer to get the first day of each month (within the column data)
One way to do it is to create a helper column holding each row's year-month period. Then use drop_duplicates() and retain only the first row for each period.
Key assumption:
The rows must be sorted by date, so that the first row of each month is its earliest date. drop_duplicates(keep='first') keeps the first row for every distinct month, including a month that has only one row.
That will give you the first day of each month.
import pandas as pd
data = pd.DataFrame({'year': [2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020],
'month': [10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11],
'day': [1,2,5,6,7,8,9,12,13,14,15,16,19,20,21,22,23,26,27,28,29,30,2,3,5,6,9,10,11,12,13,16,17,18,19,20,23,24,25,26,27,30]})
data['date'] = pd.to_datetime(data)
data['spot'] = [77.3438,78.192,78.1044,78.4357,78.0285,77.3507,76.78,77.13,77.0417,77.6525,78.0906,77.91,77.6602,77.3568,76.7243,76.5872,76.1374,76.4435,77.2906,79.2239,78.8993,79.5305,80.5313,79.3615,77.0156,77.4226,76.288,76.5648,77.1171,77.3568,77.374,76.1758,76.2325,76.0401,76.0529,76.1992,76.1648,75.474,75.551,75.7018,75.8639,76.3944]
#create a helper column holding the year-month period of each row
data['dx'] = data.date.dt.to_period('M')
#drop duplicates while retaining only the first row of each month
dx = data.drop_duplicates('dx',keep='first')
#This will give you the first row of each month
print (dx)
The output of this will be:
year month day date spot dx
0 2020 10 1 2020-10-01 77.3438 2020-10
22 2020 11 2 2020-11-02 80.5313 2020-11
Alternatively, group by the month and take the first record; this also works when a month has only one row.
data.groupby(['dx']).first()
This will give you:
year month day date spot
dx
2020-10 2020 10 1 2020-10-01 77.3438
2020-11 2020 11 2 2020-11-02 80.5313
data['strike']=data.groupby(['year','month'])['spot'].transform('first')
I think this achieves the result without creating any other DataFrame.
