I am trying to get data for the weekdays Sunday and Monday, but my code only gives me one day's data. I could only find an answer for a single weekday name, in a question asked by somebody else.
Below is the code:
import pandas as pd
df=pd.DataFrame({'CustomerID':[1,2,3,4,5,6,7,8,9,10],
'PurchaseDate':['2007-5-7','2007-6-7','2007-7-7','2007-8-7','2007-9-9','2007-10-7',
'2007-11-7','2007-12-7','2008-1-7','2008-2-7' ],
'OrderQuantity':[1,1,1,1,1,1,1,1,1,1]})
df['PurchaseDate']=pd.to_datetime(df.PurchaseDate)
df.dtypes
df.PurchaseDate.dt.weekday_name.value_counts()
df1=df[(df.PurchaseDate.dt.weekday_name==('Sunday' and 'Monday'))]
The result I got is as in the picture below:
How would I get data for Sunday and Monday?
Use Series.isin if you want the rows where weekday_name is Sunday OR Monday (a single date cannot be both Sunday and Monday):
df1=df[(df.PurchaseDate.dt.weekday_name.isin(['Sunday','Monday']))]
print (df1)
   CustomerID PurchaseDate  OrderQuantity
0           1   2007-05-07              1
4           5   2007-09-09              1
5           6   2007-10-07              1
8           9   2008-01-07              1
Verify:
print (df.PurchaseDate.dt.weekday_name)
0 Monday
1 Thursday
2 Saturday
3 Tuesday
4 Sunday
5 Sunday
6 Wednesday
7 Friday
8 Monday
9 Thursday
Name: PurchaseDate, dtype: object
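As a side note on why the original filter returned only one day: in Python, ('Sunday' and 'Monday') evaluates to 'Monday', so the comparison only ever tests for Monday. The sketch below shows this and repeats the isin filter with dt.day_name(), on the assumption that a recent pandas is installed (weekday_name was removed in pandas 1.0). The shortened sample data is for illustration only.

```python
import pandas as pd

# Shortened sample data (illustrative only)
df = pd.DataFrame({
    'CustomerID': [1, 2, 5, 6, 9],
    'PurchaseDate': ['2007-05-07', '2007-06-07', '2007-09-09',
                     '2007-10-07', '2008-01-07'],
})
df['PurchaseDate'] = pd.to_datetime(df['PurchaseDate'])

# `and` returns its last operand when both operands are truthy
combined = ('Sunday' and 'Monday')

day_names = df['PurchaseDate'].dt.day_name()

# What the original comparison effectively did: Monday only
only_monday = df[day_names == combined]

# What was intended: Sunday OR Monday
sunday_or_monday = df[day_names.isin(['Sunday', 'Monday'])]
print(sunday_or_monday)
```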
Related
I am building a table that sums all matches of a company found within a specific time period. I also need to exclude certain months when they are entered into a cell as mm/yy. Excluding one month works fine, but when I type 10/22, 11/22 it sums everything. Below is the formula I am using, where U$4 is the end of a month and the window reaches back from it by the tracking period, which is 90 days. Note that the raw data it reads from only goes to the end of November.
=IF([#[Company Name]]="","",SUM(IF(ISNUMBER(SEARCH([#[Company Name]],RawData[Description]))=TRUE,IF(RawData[Home]=XLOOKUP($D$1,HomeList[Home Code],HomeList[Home]),IF(RawData[Source]="Spend Money",IF(RawData[Date]<=U$4,IF(RawData[Date]>=U$4-[#[Tracking period (Days)]],1,0)))))))
With one month entered, the result is correct:
            |              |                        | 28/Feb   | 31/Mar   | 30/Apr   | 31/May   | 30/Jun   | 31/Jul   | 31/Aug   | 30/Sep   | 31/Oct   | 30/Nov   | 31/Dec   | 31/Jan
Exclude     | Company Name | Tracking period (Days) | Month 1  | Month 2  | Month 3  | Month 4  | Month 5  | Month 6  | Month 7  | Month 8  | Month 9  | Month 10 | Month 11 | Month 12
11/22       | CLH          | 90                     | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 1        | 2        | 2        | 1        | 0
With multiple months entered, the result is incorrect:
            |              |                        | 28/Feb   | 31/Mar   | 30/Apr   | 31/May   | 30/Jun   | 31/Jul   | 31/Aug   | 30/Sep   | 31/Oct   | 30/Nov   | 31/Dec   | 31/Jan
Exclude     | Company Name | Tracking period (Days) | Month 1  | Month 2  | Month 3  | Month 4  | Month 5  | Month 6  | Month 7  | Month 8  | Month 9  | Month 10 | Month 11 | Month 12
10/22,11/22 | CLH          | 90                     | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 2        | 3        | 8        | 6        | 5
Expected result with multiple months excluded (it has found one match for September, so that one is counted):
            |              |                        | 28/Feb   | 31/Mar   | 30/Apr   | 31/May   | 30/Jun   | 31/Jul   | 31/Aug   | 30/Sep   | 31/Oct   | 30/Nov   | 31/Dec   | 31/Jan
Exclude     | Company Name | Tracking period (Days) | Month 1  | Month 2  | Month 3  | Month 4  | Month 5  | Month 6  | Month 7  | Month 8  | Month 9  | Month 10 | Month 11 | Month 12
10/22,11/22 | CLH          | 90                     | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 1        | 1        | 1        | 0        | 0
I had to use MATCH together with TEXTSPLIT for it to work:
=IF([#[Company Name]]="","",SUM(IF(ISNUMBER(SEARCH([#[Company Name]],RawData[Description]))=TRUE,IF(RawData[Home]=XLOOKUP($D$1,HomeList[Home Code],HomeList[Home]),IF(RawData[Source]="Spend Money",IF(RawData[Date]<=S$4,IF(RawData[Date]>=S$4-[#[Tracking period (Days)]],IF(ISNUMBER(MATCH(RawData[Find Date],TEXTSPLIT([#Exclude],","),)),0,1))))))))
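The exclusion step can also be sketched outside Excel. The Python below is only an illustration of the idea behind MATCH with TEXTSPLIT: split the Exclude cell on commas and skip any row whose month key appears in the resulting list. The rows, the function name count_matches, and the assumption that Find Date holds mm/yy text are all made up for the example. Unlike TEXTSPLIT, the sketch also strips stray spaces around each entry.

```python
def count_matches(rows, exclude_cell):
    """Count rows whose 'find_date' (mm/yy text) is not listed in exclude_cell."""
    # Rough equivalent of TEXTSPLIT([#Exclude], ","), with spaces stripped
    excluded = {part.strip() for part in exclude_cell.split(',') if part.strip()}
    # Rough equivalent of IF(ISNUMBER(MATCH(...)), 0, 1) summed over the rows
    return sum(1 for row in rows if row['find_date'] not in excluded)


# Hypothetical rows standing in for RawData
rows = [
    {'description': 'CLH invoice', 'find_date': '09/22'},
    {'description': 'CLH invoice', 'find_date': '10/22'},
    {'description': 'CLH invoice', 'find_date': '11/22'},
    {'description': 'CLH invoice', 'find_date': '11/22'},
]

one_month = count_matches(rows, '11/22')
two_months = count_matches(rows, '10/22, 11/22')
print(one_month, two_months)
```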
I have time-series data and I want to get the week number counted from the initial date.
date
20180401
20180402
20180902
20190130
20190401
Things Tried
Code
df["date"]= pd.to_datetime(df.date,format='%Y%m%d')
df["week_no"]= df.date.dt.week
But the week number resets in 2019, so the 2019 dates end up with the same week numbers as 2018 dates.
Is there anything we can do about it?
You can use this function, which calculates the difference between two dates in weeks:
import numpy as np
import pandas as pd

def Wdiff(fromdate, todate):
    # Whole weeks elapsed between the two dates
    d = pd.to_datetime(todate) - pd.to_datetime(fromdate)
    return int(d / np.timedelta64(1, 'W'))
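For a whole column, the same idea can be applied without calling a function per row. This is a minimal sketch, assuming the dates from the question and that week 0 should start at the first date in the column; the column name week_no follows the question.

```python
import pandas as pd

df = pd.DataFrame({'date': ['20180401', '20180402', '20180902',
                            '20190130', '20190401']})
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d')

# Whole weeks elapsed since the first date; this never resets at a year boundary
df['week_no'] = (df['date'] - df['date'].iloc[0]).dt.days // 7
print(df)
```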
You can create a datetime object with the specified date, then retrieve the week number using the isocalendar method:
import datetime
myDate = datetime.date(2018, 4, 1)
week = myDate.isocalendar()[1]
print(week)
You could then calculate the total number of remaining weeks in 2018, then add the total number of weeks in each year in between, and finally add the week number of the current date.
For example, this code would print the number of weeks from the 1st of April 2018 to the 6th May 2020:
import datetime
myDate = datetime.date(2018, 4, 1)
currentDate = datetime.date(2020, 5, 6)
# Weeks remaining in the start year
weeks = (datetime.date(myDate.year, 12, 28).isocalendar()[1]
         - myDate.isocalendar()[1])
# Add the weeks of each full year strictly between the two dates
for i in range(myDate.year + 1, currentDate.year):
    weeks += datetime.date(i, 12, 28).isocalendar()[1]
# Add the week number of the end date
weeks += currentDate.isocalendar()[1]
print(weeks)
Note that because of the way isocalendar works, the 28th of December will always be in the last week of the given year.
The ISO year consists of 52 or 53 full weeks, where a week starts on a Monday and ends on a Sunday. The first week of an ISO year is the first (Gregorian) calendar week of a year containing a Thursday. This is called week number 1, and the ISO year of that Thursday is the same as its Gregorian year.
You can get more information about isocalendar here: https://docs.python.org/3/library/datetime.html
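An alternative that avoids summing year by year is to move both dates back to the Monday of their ISO week and divide the day difference by 7. This is only a sketch of that idea; the function name iso_week_distance is made up for the example, and it counts how many ISO week boundaries lie between the two dates.

```python
import datetime


def iso_week_distance(start, end):
    """Number of ISO weeks (Monday to Sunday) between the weeks of two dates."""
    # date.weekday() is 0 for Monday, so this gives the Monday of each week
    start_monday = start - datetime.timedelta(days=start.weekday())
    end_monday = end - datetime.timedelta(days=end.weekday())
    return (end_monday - start_monday).days // 7


weeks = iso_week_distance(datetime.date(2018, 4, 1), datetime.date(2020, 5, 6))
print(weeks)
```

Because both dates are aligned to a Monday first, the result does not depend on which ISO year a date near 1 January belongs to.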
To get the week number, but as a 2-digit string (with leading zero),
you can run:
df['week_no'] = df.date.dt.strftime('%W')
The result, for slightly extended source data is:
date week_no
0 2018-04-01 13
1 2018-04-02 14
2 2018-09-02 35
3 2018-12-30 52
4 2018-12-31 53
5 2019-01-01 00
6 2019-01-02 00
7 2019-01-03 00
8 2019-01-04 00
9 2019-01-05 00
10 2019-01-06 00
11 2019-01-07 01
12 2019-01-30 04
13 2019-04-01 13
Note that the last day of 2018 (a Monday) has week number 53, and the first days
of 2019 (up to 2019-01-06, a Sunday) have week number 00.
If you want this column as int, append .astype(int) to the above code.
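A short sketch of the integer variant, using three of the dates around the year boundary from the table above. The column name week_str is an assumption added for the example.

```python
import pandas as pd

df = pd.DataFrame({'date': pd.to_datetime(['2018-12-31', '2019-01-01', '2019-01-07'])})

# '%W' counts weeks starting on Monday; days before the first Monday are week 00
df['week_str'] = df['date'].dt.strftime('%W')
df['week_no'] = df['week_str'].astype(int)
print(df)
```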
I have a dataframe with a date column. The duration is 365 days starting from 02/11/2017 and ending at 01/11/2018.
Date
02/11/2017
03/11/2017
05/11/2017
.
.
01/11/2018
I want to add an adjacent column called Day_Of_Year as follows:
Date Day_Of_Year
02/11/2017 1
03/11/2017 2
05/11/2017 4
.
.
01/11/2018 365
I apologize if it's a very basic question, but unfortunately I haven't been able to start with this.
I could use datetime(), but that would return values such as 1 for 1 January, 2 for 2 January and so on, irrespective of the year. So that wouldn't work for me.
First convert the column with to_datetime, then subtract the start date, convert to days and add 1:
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
df['Day_Of_Year'] = df['Date'].sub(pd.Timestamp('2017-11-02')).dt.days + 1
print (df)
        Date  Day_Of_Year
0 2017-11-02            1
1 2017-11-03            2
2 2017-11-05            4
3 2018-11-01          365
Or subtract by first value of column:
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
df['Day_Of_Year'] = df['Date'].sub(df['Date'].iat[0]).dt.days + 1
print (df)
Date Day_Of_Year
0 2017-11-02 1
1 2017-11-03 2
2 2017-11-05 4
3 2018-11-01 365
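Both variants can be checked end to end on the four sample dates. The sketch below is a small variation that takes the earliest date with min() instead of the first row, which is an assumption made here so that the result does not depend on the rows being sorted.

```python
import pandas as pd

df = pd.DataFrame({'Date': ['02/11/2017', '03/11/2017', '05/11/2017', '01/11/2018']})
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')

# Day 1 is the earliest date in the column, wherever it sits
start = df['Date'].min()
df['Day_Of_Year'] = (df['Date'] - start).dt.days + 1
print(df)
```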
Using strftime with '%j' (the result is zero-based, and it only works while all dates fall in the same calendar year):
s=pd.to_datetime(df.Date,dayfirst=True).dt.strftime('%j').astype(int)
s-s.iloc[0]
Out[750]:
0 0
1 1
2 3
Name: Date, dtype: int32
#df['new']=s-s.iloc[0]
pandas has dayofyear. So put your column in the right format with pd.to_datetime and then apply Series.dt.dayofyear. Lastly, use some modulo arithmetic to express everything relative to your original start date:
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
df['day of year'] = df['Date'].dt.dayofyear - df['Date'].dt.dayofyear[0] + 1
df['day of year'] = df['day of year'] + 365*((365 - df['day of year']) // 365)
Output
Date day of year
0 2017-11-02 1
1 2017-11-03 2
2 2017-11-05 4
3 2018-11-01 365
But I'm doing essentially the same as Jezrael in more lines of code, so my vote goes to her/him
Date Status LastWorkingDate
7/3/2017 Day 0 7/3/2017
7/1/2017 Day 1 7/3/2017
7/2/2017 Day 1 7/3/2017
6/30/2017 Day 1 7/3/2017
6/29/2017 Day 2 7/3/2017
6/28/2017 Day 3 7/3/2017
6/27/2017 Day 4 7/3/2017
6/26/2017 Day 5 7/3/2017
6/25/2017 Day 6 7/3/2017
6/24/2017 Day 6 7/3/2017
6/23/2017 Day 6 7/3/2017
6/22/2017 More than Day 6 7/3/2017
7/4/2017 Day 0 7/4/2017
7/3/2017 Day 1 7/4/2017
7/2/2017 Day 2 7/4/2017
7/1/2017 Day 2 7/4/2017
6/30/2017 Day 2 7/4/2017
6/29/2017 Day 3 7/4/2017
6/28/2017 Day 4 7/4/2017
6/27/2017 Day 5 7/4/2017
6/26/2017 Day 6 7/4/2017
6/25/2017 More than Day 6 7/4/2017
I have tried using:
=IF(NETWORKDAYS(E21,G21)-1=0,"day 0",IF(NETWORKDAYS(E21,G21)-1=1,"Day 1",IF(NETWORKDAYS(E21,G21)-1=2,"Day 2",IF(NETWORKDAYS(E21,G21)-1=3,"Day 3",IF(NETWORKDAYS(E21,G21)-1=4,"Day 4",IF(NETWORKDAYS(E21,G21)-1=5,"Day 5","Greater than 5 Days"))))))
but I am not getting the desired output.
All I want is Day 0 to Day 5 based on the two date columns (Date and LastWorkingDate).
Day 0 = if today is Monday then LastWorkingDate will be Friday, and Friday, Sat and Sunday will become Day 0 and the previous week's Thursday will be Day 1 and so on
Day 1 = if today is Tuesday then LastWorkingDate will be Monday and Monday will become Day 0, Friday, Sat and Sunday will be Day 1 and so on
Day 2 = if today is Wednesday then LastWorkingDate will be Tuesday and Tuesday will become Day 0, Monday will be Day 1, Friday, Sat and Sunday will be Day 2 and so on
.
.
.
How about:
="Day "&(NETWORKDAYS(IF(WEEKDAY(A1,2)=7,A1-2,IF(WEEKDAY(A1,2)=6,A1-1,A1)),C1)-1)
Using your current layout for Last Working Day and Date.
The weekday functions are needed because otherwise the Saturday and Sunday would get the same value as Monday instead of Friday.
Of course you can wrap the whole thing in an IF-formula to make sure you display "Greater than 5 days" when the value is bigger than 5.
Output:
Date | Formula column | Last working day
--------------------------------------------
6/17/2017| Day 11 | 7/3/2017 'Weekend
6/18/2017| Day 11 | 7/3/2017 'Weekend
6/19/2017| Day 10 | 7/3/2017
6/20/2017| Day 9 | 7/3/2017
6/21/2017| Day 8 | 7/3/2017
6/22/2017| Day 7 | 7/3/2017
6/23/2017| Day 6 | 7/3/2017
6/24/2017| Day 6 | 7/3/2017 'Weekend
6/25/2017| Day 6 | 7/3/2017 'Weekend
6/26/2017| Day 5 | 7/3/2017
6/27/2017| Day 4 | 7/3/2017
6/28/2017| Day 3 | 7/3/2017
6/29/2017| Day 2 | 7/3/2017
6/30/2017| Day 1 | 7/3/2017
7/1/2017 | Day 1 | 7/3/2017 'Weekend
7/2/2017 | Day 1 | 7/3/2017 'Weekend
7/3/2017 | Day 0 | 7/3/2017
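For comparison, the same logic can be sketched in Python with NumPy's business-day helpers. This is an illustration rather than a translation of the Excel formula: the function name working_day_number is made up, weekends are assumed to be Saturday and Sunday with no holidays, and the last working day is assumed to be a working day itself.

```python
import numpy as np


def working_day_number(date, last_working_date):
    """Working days between date and last_working_date, weekends rolled back."""
    day = np.datetime64(date, 'D')
    last = np.datetime64(last_working_date, 'D')
    # Move a Saturday or Sunday back to the preceding Friday
    day = np.busday_offset(day, 0, roll='backward')
    # busday_count covers [day, last), which corresponds to
    # NETWORKDAYS(day, last) - 1 when last is a working day
    return int(np.busday_count(day, last))


for d in ['2017-07-03', '2017-07-01', '2017-06-23', '2017-06-17']:
    print(d, 'Day', working_day_number(d, '2017-07-03'))
```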
I am working on a scheduling sheet. I want to calculate the distance in weeks since a person was last scheduled on one of 3 different 'jobs'. I only want to look at the time since someone was last scheduled on a weekend, and individuals may be scheduled on weekdays intervening between the last weekends.
For example:
Date Day_of_week Task_a Task_b Task_c Distance_a Distance_b Distance_c
7/1/2015 Wednesday Ed Mary Amy 0 0 0
7/2/2015 Thursday Bill Judy Bob 0 0 0
7/3/2015 Friday Ed Mary Amy 0 0 0
7/4/2015 Saturday Ed Mary Amy 0 0 0
7/5/2015 Sunday Ed Mary Amy 0 0 0
7/6/2015 Monday Bill Mary Bob 0 0 0
7/7/2015 Tuesday Ed Judy Amy 0 0 0
7/8/2015 Wednesday Ed Amy Bob 0 0 0
7/9/2015 Thursday Bob Ed Judy 0 0 0
7/10/2015 Friday Ed Bob Judy 0 0 0
7/11/2015 Saturday Ed Bob Judy 7 0 0
7/12/2015 Sunday Ed Bob Amy 7 0 7
The table above gives the correct distances for each of the 3 tasks, with the column labels first, followed by one line of data per date.
For Distance A I have attempted:
{=IF(AND(B3="Saturday", (A3-MAX(IF($C$2:E2=C3, $A$2:A2, 0)))/7 <53), (A3-MAX(IF($C$2:E2=C3, $A$2:A2, 0)))/7, ".")}
which returns a value on each Saturday (as intended), but cannot scan past the most recent weekday assignment, giving falsely low values. I have attempted other IF(OR) & IF(AND) statements but the first failure generates a false value effectively ceasing the program.
Any assistance with the code, and with formatting the example as .csv or tab-separated values, would be appreciated.
OK, so just to recap: you need to calculate the distance in weeks from the current date to the date of the previous job scheduled at the weekend.
The conditions are:-
Current date must fall on a weekend
Previous date must also be on a weekend
Previous date must not be on same weekend as current date
Must be same person.
I've found it easier to use the WEEKDAY function to check if the dates fall on a weekend, so putting this all together I get:-
=IF(WEEKDAY($A3,16)>=3,"",
IF(MAX($A$2:$A2*(C$2:C2=C3)*(WEEKDAY($A$2:$A2,16)<3)*(($A3-$A$2:$A2)>1))=0,"",
INT(($A3-MAX($A$2:$A2*(C$2:C2=C3)*(WEEKDAY($A$2:$A2,16)<3)*(($A3-$A$2:$A2)>1))+1)/7)
)
)
to be entered as an array formula in row 3 using Ctrl Shift Enter
I think the general logic of your formula is OK, but OR and AND functions don't work very well in array formulas so I have replaced ANDs with multiplications.
At the moment this only checks for a person appearing in the same column so I will post another version if it needs to check across the three columns.
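The four conditions can also be written out step by step in Python, which may make the array formula easier to follow. This is only a sketch of the same logic for a single task column; the function name weeks_since_last_weekend is made up, and None stands in for the empty string the formula returns.

```python
import datetime as dt


def weeks_since_last_weekend(dates, names):
    """Per row: weeks since the same name was last scheduled on an earlier weekend."""
    result = []
    for i, (day, name) in enumerate(zip(dates, names)):
        if day.weekday() < 5:  # Monday to Friday: current date is not a weekend
            result.append(None)
            continue
        # Earlier weekend dates for the same person, excluding the same weekend
        previous = [
            d for d, n in zip(dates[:i], names[:i])
            if n == name and d.weekday() >= 5 and (day - d).days > 1
        ]
        if not previous:
            result.append(None)
        else:
            # Same arithmetic as INT((current - previous + 1) / 7)
            result.append(((day - max(previous)).days + 1) // 7)
    return result


# Task_a column from the question, 7/1/2015 to 7/12/2015
dates = [dt.date(2015, 7, d) for d in range(1, 13)]
task_a = ['Ed', 'Bill', 'Ed', 'Ed', 'Ed', 'Bill',
          'Ed', 'Ed', 'Bob', 'Ed', 'Ed', 'Ed']
distance_a = weeks_since_last_weekend(dates, task_a)
print(distance_a)
```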