I have a data set listing activities over a year (in logistics context).
I can see that the loads differ much more across days of the week (e.g. Wednesdays are super busy while Sundays are very empty), but a teammate argues that grouping by month is a better idea (e.g. Jan vs Dec).
I wonder if there is any appropriate statistical analysis to prove which grouping is more appropriate?
Thanks,
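One way to compare the two groupings is to measure how much of the variation in daily load each one explains (eta squared, the between-group sum of squares divided by the total sum of squares). The sketch below is only an illustration: the data are synthetic, and the names `eta_squared`, `weekday_load` and `loads` are made up for the example. A formal comparison would normally go on to an ANOVA F-test, and both factors can also be used together.

```python
from collections import defaultdict
from datetime import date, timedelta


def eta_squared(values, labels):
    """Share of total variance explained by a grouping (between SS / total SS)."""
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between_ss / total_ss


# Synthetic year of daily loads (assumption): a strong weekday pattern
# plus a small month-dependent drift.
days = [date(2019, 1, 1) + timedelta(days=i) for i in range(365)]
weekday_load = [80, 90, 140, 95, 85, 40, 10]  # Monday .. Sunday
loads = [weekday_load[d.weekday()] + d.month for d in days]

eta_weekday = eta_squared(loads, [d.weekday() for d in days])
eta_month = eta_squared(loads, [d.month for d in days])
print(f"weekday explains {eta_weekday:.1%}, month explains {eta_month:.1%}")
```

With real data, the grouping with the clearly larger share is the more informative one; if both shares are sizeable, keeping both groupings is probably the better choice.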
Related
I know this isn't necessarily programming but I have used this community many a time and you have always been able to provide guidance or an answer.
My business have asked me to calculate the Annual Leave for all of our staff for 2019 and update this. They have altered the way they want the AL to be calculated and so the previous calculator I built is now obsolete.
I have managed to make the calculation work for my full time staff as I am not having to take their FTE into consideration, even when they move up to a higher allocation of AL, based on their service with the company.
When it comes to part time staff, we have to also add in the bank holidays as they are entitled to them, and then take out the hours that they would be working on those bank holidays.
My issue is when the agent changes allocation halfway through the year.
This is the calculation for an agent that stays within their allocation for the whole year.
(Allocation+BankHolidays)x(FTE)x(TimeWithinYearSpentInAllocation)
So as an example the calculation would be:
(172.5+(8x7.5))x(15/37.5)x(365/365)=93
From that number we would then subtract however many hours they would be "working" on bank holidays.
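The full-year formula above can be written as a small function. This is a minimal sketch: the function and parameter names are invented for illustration, and it assumes a 7.5-hour day and a 37.5-hour full-time week, as in the example.

```python
def pro_rated_leave(allocation_hours, bank_holidays, hours_per_day,
                    contracted_hours, full_time_hours,
                    days_in_allocation, days_in_year=365):
    """(Allocation + BankHolidays) x FTE x TimeWithinYearSpentInAllocation."""
    entitlement = allocation_hours + bank_holidays * hours_per_day
    fte = contracted_hours / full_time_hours
    time_fraction = days_in_allocation / days_in_year
    return entitlement * fte * time_fraction


# The worked example: 172.5 h allocation, 8 bank holidays, 15 of 37.5 hours.
full_year = pro_rated_leave(172.5, 8, 7.5, 15, 37.5, 365)
print(round(full_year, 2))  # 93.0
```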
My issue is when they change allocation the below calculation doesn't work.
(172.5+(3x7.5))x(15/37.5)x(120/365)+(187.5+(5x7.5))x(15/37.5)x(245/365)= 88 (rounded)
Can anyone help me on a calculation that will help me get there?
Regards,
Jordan.
Your problem is with the bank holidays. Just remove the allocation part and compare the results:
((8*7.5))*(15/37.5)*(365/365) = 24
((3*7.5))*(15/37.5)*(120/365) ≈ 3
((5*7.5))*(15/37.5)*(245/365) ≈ 10
This doesn't make sense. I don't know the laws in your country, but where I am, if I work over public holidays I am entitled to the same number of hours of leave. Here you calculate that if you work 3 public holidays you are entitled to 3 hours of leave. Shouldn't the days*hours be added after the multiplication of the allocation?
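The comparison can be reproduced in a few lines. The sketch below first computes the bank-holiday part on its own, for the whole year and for the two split periods. It then tries one possible reading of the suggestion: pro-rate only the allocation by time and add all 8 bank holidays before applying FTE. That alternative is an assumption for illustration, not a statement of what any policy or law requires, and all names are invented.

```python
HOURS_PER_DAY = 7.5
FTE = 15 / 37.5
DAYS_IN_YEAR = 365


def bank_holiday_hours(holidays, days_in_period):
    """Bank-holiday part of the original formula, pro-rated by FTE and time."""
    return holidays * HOURS_PER_DAY * FTE * days_in_period / DAYS_IN_YEAR


whole_year = bank_holiday_hours(8, 365)   # 24 hours
first_part = bank_holiday_hours(3, 120)   # about 2.96 hours
second_part = bank_holiday_hours(5, 245)  # about 10.07 hours
split_total = first_part + second_part    # about 13.03 hours, not 24

# Assumed alternative: time-weight the two allocations only, then add the
# bank holidays for the whole year and apply FTE once.
allocation_part = (172.5 * 120 + 187.5 * 245) / DAYS_IN_YEAR
alternative_total = (allocation_part + 8 * HOURS_PER_DAY) * FTE  # about 97.03

print(round(split_total, 2), round(alternative_total, 2))
```

The gap between `split_total` and `whole_year` arises because the holidays are reduced twice: once by counting only those that fall in each period, and again by the time fraction.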
I'm currently looking at daily sales data of a specific brand over the course of the past year. My objective is to create a formula to roughly estimate the sales growth for future months.
My project isn't going very well, as the brand is very volatile in monthly sales, making it impossible to predict with a basic linear formula. I'm arriving at the conclusion that a single year's worth of sales isn't enough data, and I may have to resort to providing a specific formula depending on the month. Is there anything I haven't thought of?
Note: Recording of sales start on the 15th of every month
Your sales are showing seasonality. Consider using seasonal ARIMA models.
My problem concerns the day of the month; however, I can see that the same logic would apply to month number, hour number, or any other variable that ends on some value and then starts from 0 again.
It is defined as follows: I'm trying to calculate a day of month when a payment is made to use it for a forecast. So I have for example for one case:
1 May 2016
2 June 2016
30 June 2016
29 July 2016
6 September 2016
A simple average would give me 14th, and the median would give me 6th. But the result I'm looking for is more like the 1st.
I see I could do it somehow by calculating a geometric median, or Euclidean distances after placing the points on a circle etc., but I believe it can be approached in a much simpler way. I also see that solving this problem with standard means and averages could give more than one result.
But if we add an assumption that it should occur once in 30 days/a month? Wouldn't this assumption make the problem easier?
Please let me know if you solved a similar problem before or if you have any ideas
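For reference, the "points on a circle" idea mentioned above is short to write down as a circular mean. This is a rough sketch under the 30-day assumption from the question (real months vary in length, so it is only approximate); the function name and the `period` parameter are illustrative.

```python
import math


def circular_mean_day(days, period=30):
    """Circular mean of day-of-month values, treating the month as a circle."""
    angles = [2 * math.pi * d / period for d in days]
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    # atan2 gives the direction of the average point; wrap it into [0, 2*pi).
    mean_angle = math.atan2(mean_sin, mean_cos) % (2 * math.pi)
    return mean_angle * period / (2 * math.pi)


# Days of month from the example: 1 May, 2 June, 30 June, 29 July, 6 September.
typical_day = circular_mean_day([1, 2, 30, 29, 6])
print(round(typical_day, 1))  # falls between the 1st and the 2nd
```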
If the result you are "looking for is more like the 1st", then I would hazard a guess that you are really looking at a series of monthly payments (perhaps falling due on the first of each month or the first working day of each month) and you want some measure of the deviation between the due date and the actual date of payment.
If that is the case then simply calculate the difference in days between the due date and the actual date of payment for each monthly payment (following a consistent convention such as positive values denote late payment and negative values are early) and then apply your chosen measure (median, mean, etc) to the series of differences.
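A minimal sketch of that calculation, using the five payment dates from the question. The due dates are an assumption made for illustration (the first of the nearest month); the real due dates would come from the payment terms.

```python
from datetime import date
from statistics import mean, median

# Actual payment dates from the question.
payments = [date(2016, 5, 1), date(2016, 6, 2), date(2016, 6, 30),
            date(2016, 7, 29), date(2016, 9, 6)]

# Assumed due dates: the first of each month from May to September.
due_dates = [date(2016, 5, 1), date(2016, 6, 1), date(2016, 7, 1),
             date(2016, 8, 1), date(2016, 9, 1)]

# Positive = paid late, negative = paid early.
differences = [(paid - due).days for paid, due in zip(payments, due_dates)]

print(differences)          # [0, 1, -1, -3, 5]
print(median(differences))  # 0
print(mean(differences))    # 0.4
```

Measured this way, the typical payment lands on or very near the due date, which matches the "more like the 1st" intuition.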
I have a table that tracks the dispatching of personnel. The table has the employee name and the date the person went out and the date they returned.
The table has hundreds of entries from 1988 to current.
In Excel I track the cumulative count per day (of the year) of how many people have been sent out, and I also track the number of people out on any given day. The table lists the Month & Day in the first column (every day of the year, including leap days) and the years on the first row. There is data for every date (a zero is entered until the first person is sent out that year, then starts counting up as there are more dispatches, or in the case of the number of people out each day, it will show zero if no one is out that day or if there were, say 5 people out, it would show "5" for that day).
I then use the data in Excel to construct a graph that shows the number of dispatches on the y axis and the day of the year on the x axis (along with the current year's number, the average number and the max over the 27 year history). Currently I just track this manually (I just keep a running count of each and enter it in manually in Excel.)
I would like to build a query of my Access data that would return the same information that I could import into my Excel spreadsheet:
One query that would show the day & month in the first column and the years along the top row and for each day show a cumulative count for that year of how many people have been sent out.
Another query that has the day & month in the first column and the years along the top and a count of how many people were out for that particular day for that particular year.
There shouldn't be any gaps (every day has data, even if it is "0"). I would then import those queries into Excel to replace my manual tracking that I am doing now.
I know how to construct the Excel stuff (I have that running already), and how to import info from Access to Excel, what I need to know is how to construct these 2 Access queries.
Any help/ideas on how to construct those 2 queries would be greatly appreciated!
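To pin down what the two queries need to return, here is the counting logic as a short Python sketch. It is not Access SQL. The sample records and the `daily_counts` name are hypothetical, and it assumes a person still counts as "out" on the day they return.

```python
from datetime import date, timedelta


def daily_counts(dispatches, year):
    """Map every day of `year` to (cumulative dispatches, people out that day)."""
    counts = {}
    day = date(year, 1, 1)
    while day.year == year:
        # Dispatches that started this year, on or before this day.
        cumulative = sum(
            1 for out, back in dispatches if out.year == year and out <= day
        )
        # People whose out/return interval covers this day (return day included).
        people_out = sum(1 for out, back in dispatches if out <= day <= back)
        counts[day] = (cumulative, people_out)
        day += timedelta(days=1)
    return counts


# Hypothetical records: (date sent out, date returned).
sample = [
    (date(2015, 1, 10), date(2015, 1, 12)),
    (date(2015, 1, 11), date(2015, 1, 20)),
]
counts_2015 = daily_counts(sample, 2015)
print(counts_2015[date(2015, 1, 11)])  # (2, 2)
print(counts_2015[date(2015, 1, 13)])  # (2, 1)
```

Every day of the year gets a row, so days with no activity show zeros rather than gaps. In a query this usually means joining against a table that holds every calendar date.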
I'd recommend that you migrate this app to a web based solution that uses a real database - SQL Server or MySQL, not Access.
"Desk drawer software" is what I call homegrown apps that someone creates for themselves to perform some small task that eventually become integral to running a business and grow out of hand. Your truck factor is 1: if anything happened to you, no one would know how to do this function. The software may not be backed up or checked into a source code management system. There's no QA. There's no way to migrate new features to production: if you alter the app, then that is what you have.
I'd recommend a web app to mitigate all the risks I've described:
You have to deploy a web app to a server, which takes it off your desktop and puts it in a central place where anyone who's authorized can access it.
Separates database from display issues.
Makes you think about how to archive historical data. Partitioning by year makes sense.
Likely you'll put this in a source code management system like Subversion or Git.
We are using a custom list on Sharepoint where we require users to enter data with a date and time field. We have been facing huge issues in data validity when generating reports due to this field. Following are the kinds of mistakes:
Selecting AM instead of PM or vice versa. Changing to 24-hour format doesn't help much because users then select (as an example) 02:00 instead of 14:00 for 02:00 PM.
There are errors regarding formats of dates, hence some entries have dates from the future or the past.
As the reports are generated each week, the list needs to be populated by the end of the week. If the month changes during the week, people forget to change the month in the calendar, and the entries end up in the last week of the current month instead of the last week of the previous month.
Are there ways to configure the list (preferably without programming) so that:
A. Only working hours are available in the time-related dropdown.
B. Dates from the future are not allowed (or not available).
Any help would be appreciated.
As far as I know, you won't be able to satisfy these requirements with no custom code.
If you decide to go down the coding path, what you need to do is create a custom field type. Let me know if you need help with this.