Significance test between overlapping data in Excel - statistics

Below is the brand preference metric data for two overlapping time periods (Week 1-3 and Week 2-4). I want to do a significance test between these two overlapping groups; please provide the details of the method and formulas to use. We are doing this analysis in Excel.
Here is my data:
                     Roll Up Week 1-3    Roll Up Week 2-4
Count                697                 706
Brand Preference     43%                 40%
Thanks,
Tanuvi
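A common starting point for comparing two proportions is a two-proportion z-test. A minimal sketch in Excel, assuming a hypothetical layout with the counts (697 and 706) in B2 and C2 and the preference shares (43% and 40%) in B3 and C3, returning a two-sided p-value:
=LET(count1,B2, count2,C2, share1,B3, share2,C3,
pooled,(count1*share1+count2*share2)/(count1+count2),
se,SQRT(pooled*(1-pooled)*(1/count1+1/count2)),
z,(share1-share2)/se,
2*(1-NORM.S.DIST(ABS(z),TRUE)))
Note that the two roll-ups share Weeks 2-3, so the samples are not independent and the standard test's assumptions are only partially met; treat the resulting p-value as a rough guide.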

Related

Number of days for delivery and number of orders delivered in two separate columns. Is there a way to get summary statistics about orders?

I've had a bit of trouble explaining this, so please bear with me. I'm also very new to using Excel, so if there's a simple fix, I apologize in advance!
I have two columns, one listing the number of days, starting from 0 and increasing consecutively. The other column has the number of orders delivered. The two correspond to each other. For example, I've typed out how it would look below. It would mean that there were 100 orders delivered in 1 day, 150 orders delivered in 2 days, 800 orders delivered in 3 days, etc.
Is there a way to get summary statistics (mean, median, mode, upper and lower quartiles) for the number of days it took for the average order to get delivered? The only way I can think of to solve this is to manually punch in "1" 100 times, "2" 150 times, etc. into a new column and take the median, mean, and upper & lower quartiles from that, but that seems extremely inefficient. Would I use a pivot table for this? Thank you in advance!
I tried using the Data Analysis add-in and doing summary statistics that way, but it didn't work. It just gave me the mean, median, mode, and quartiles of each individual column. It would have given me 3 as the median number of days for delivery and 300 as the median number of orders.
Method 1
The mean is just
=SUMPRODUCT(A2:A6,B2:B6)/SUM(B2:B6)
The mode is the value with the highest frequency
=INDEX(A2:A6,MATCH(MAX(B2:B6),B2:B6,0))
The quartiles and median (or any other quantile, by varying the value of p) can be computed from first principles following this reference
=LET(p,0.25,
values,A2:A6,
freq,B2:B6,
N,SUM(freq),
h,(N+1)*p,
floorh,FLOOR(h,1),
ceilh,CEILING(h,1),
frac,h-floorh,
cusum,SCAN(0,SEQUENCE(ROWS(values)),LAMBDA(a,c,IF(c=1,0,a+INDEX(freq,c-1)))),
xlower,XLOOKUP(floorh-1,cusum,values,,-1),
xupper,XLOOKUP(ceilh-1,cusum,values,,-1),
xlower+(xupper-xlower)*frac)
Method 2
If you don't like doing it this way, you can always expand the data like this:
=AVERAGE(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1))
=MODE(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1))
=QUARTILE.EXC(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1),1)
=MEDIAN(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1))
and
=QUARTILE.EXC(XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1),3)
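To avoid repeating the long expansion inside every statistic, one option (assuming Excel 365 dynamic arrays, and picking E2 as an arbitrary helper cell) is to spill the expanded data once and reference the spill range. In E2:
=XLOOKUP(SEQUENCE(SUM(B2:B6),1,0),SCAN(0,SEQUENCE(ROWS(A2:A6)),LAMBDA(a,c,IF(c=1,0,INDEX(B2:B6,c-1)+a))),A2:A6,,-1)
Then, for example:
=MEDIAN(E2#)
=QUARTILE.EXC(E2#,1)
=QUARTILE.EXC(E2#,3)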

Dynamic conditional formatting in Excel

I'm trying to improve one of my sheets and implement some smart conditioning in it.
The sheet holds information about 3 KPIs regarding parts in a production process.
Now those KPIs are measured on a monthly basis.
If each of them is smaller than 1% for 6 consecutive months, the sample size reduces to a lower sampling tier, from 100% to 50%.
If such a trend is seen for the next 6 months, the sample size reduces again to a lower tier, from 50% to 25%.
There are four tiers in total, with the last tier being 10%.
Now, in case any of the KPIs is above 1%, the sample size goes back to the previous sampling tier. For example, say the KPIs were under 1% for 13 months and the sample size was reduced to 25%. In the 14th month, one KPI was 2%, so the sample size must increase to the previous threshold, which is 50%.
Currently, I'm using an Excel file that marks in the sheet a 0 next to each KPI that failed for a specific month and a 1 for the ones that passed. As the next step, those values are dropped into one cell, creating a string of 1s and 0s for each KPI. Then I need to manually analyse each string and establish the sample size.
Please keep in mind that every month new data is added.
Added a picture and sample data:
https://docs.google.com/spreadsheets/d/1g2tMkUHb_gn4EYegr0aPp3bqr5dwYusg/edit?usp=share_link&ouid=108643160822075860310&rtpof=true&sd=true
Does anyone have an idea how this could be automated?
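Not a full solution, but one building block could be a formula that counts how many consecutive passing months a KPI string ends with; the tier logic could then key off that streak. A minimal sketch, assuming Excel 365 and a hypothetical layout where the 1/0 string for one KPI sits in B2:
=LET(s,B2, n,LEN(s), pos,SEQUENCE(n),
lastFail,MAX((MID(s,pos,1)="0")*pos),
n-lastFail)
This returns the number of trailing 1s (consecutive passing months). It does not by itself implement the step-down/step-up tier rule, which also needs to know the previous tier, but it gives the consecutive-months count that rule depends on.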

Excel - Multifamily Value Add Pro Forma Algorithm

I'm putting together a monthly pro forma in Excel for a multifamily property where 10 units will be renovated and each unit renovation takes 45 days. After a unit is renovated, it will bring in an additional $100/mo. I managed to calculate the unit-months of vacancy per month via multiple rows of IFS statements, using the actual number of days for each renovation (I used 45 days in the calculation instead of having to round up to 2 months).
I'm trying to come up with an algorithm to calculate the cumulative rent premium for renovated units in each month using 45 days of unit down time rather than rounding the down time to 2 months.
The solution is easy if I round the downtime to 2 months: the row is summed from its beginning up to 2 cells before the current month, and we simply multiply by the $100 premium. But I cannot come up with a solution that uses the actual days (45), as I did for unit-months of vacancy.
I'm not even sure this can be done unless the pro forma is done daily. Is that correct, or is there an algorithm to calculate this on a monthly basis as it's laid out?
I'd appreciate any help. Thank you for your consideration.
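For what it's worth, one monthly-level approach is to treat the 45 days as 1.5 months of downtime and prorate the premium for the month in which a unit comes back online. A minimal sketch, assuming Excel 365 and a hypothetical layout where the renovation start month numbers for the 10 units are in B2:B11, the current month number is in D1, the monthly premium ($100) is in D2, and each renovation starts at the beginning of its start month:
=LET(start,B2:B11, m,D1, prem,D2,
onlineFraction,MAP(start,LAMBDA(s,MIN(1,MAX(0,m-s-0.5)))),
prem*SUM(onlineFraction))
Under that convention a unit earns nothing in its start month, half a premium in the following month (it comes back online mid-month), and a full premium from then on; copying the formula across the month columns gives the premium stream. The start-of-month convention is an assumption and may need adjusting to match how the vacancy rows were built.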

Assign specific dates to an employee-level dataset based on weighted averages in Excel

I have a dataset in Excel which shows headcount by employee level and which department each employee falls under (sales, ops, or support). I would like to send a survey to each employee once every 26 weeks (2 times a year), but I would also like to keep sending surveys every week, so that surveys continue to go out to a certain amount of the population, split between the sales, ops, and support departments based on their weight in the total population.
This way, I am sending surveys every week to a small part of my overall headcount but only repeating people every 26 weeks.
Can anyone please help on how to solve this in Excel with a formula?
From the attached sample data, how can I split the headcount to send surveys for 26 weeks straight but to a different population every week, without repeats? This different population should be split by each department's % of total headcount. Meaning, if I have 10 people every week and the % split is 40% sales, 30% operations, and 20% support, the survey should be sent to 4 sales, 3 operations, and 2 support people. Please note that the 10 people and the %s may vary every week because of new hires and resignations.
Thank you!
Sample Data
In the data sheet, create a helper column D, where you hand out numbers to each employee; label it MOD. Enter this formula in cell D2 and fill it down for each employee:
=MOD(ROWS(A$2:A2)-1,$H$2)+1
That way each employee is assigned a number from 1 to whatever is in cell H2, e.g. 26. Then put all employees with 1 on the contact list and you have the first batch, and you continue this way each week, reaching the employees with 26 in week 26. This way everyone gets the survey, but just once.
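To pull a given week's batch, one option (assuming Excel 365, and a hypothetical layout where the employee data sits in A2:C100, the MOD numbers in D2:D100, and the current week number in H3) is:
=FILTER(A2:C100,D2:D100=$H$3)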
Of course, the exact share of the individual departments cannot be achieved every time, as there are fewer employees in some. If you wanted to keep the shares, some employees of the smaller departments would get the survey more often.
If you want to get some randomness into the order, just mix the order of the MOD numbers, e.g. start with 7, continue with 23, etc.
I hope I got the question right; I am not sure about some parts.

Tests to Compare Sales Mix Percent between Periods

Background
I wish to compare menu sales mix ratios for two periods.
A menu is defined as a collection of products (e.g., a hamburger, a club sandwich, etc.).
A sales mix ratio is defined as a product's sales volume in units (e.g., 20 hamburgers) relative to the total number of menu units sold (e.g., 100 menu items were sold). In the hamburger example, the sales mix ratio for hamburgers is 20% (20 burgers / 100 menu items). This represents the share of total menu unit sales.
A period is defined as a time range used for comparative purposes (e.g., lunch versus dinner, Mondays versus Fridays, etc.).
I am not interested in overall changes in the volume (I don't care whether I sold 20 hamburgers in one period and 25 in another). I am only interested in changes in the distribution of the ratios (20% of my units sold were hamburgers in one period and 25% were hamburgers in another period).
Because the sales mix represents a share of the whole, the mean average for each period will be the same; the mean difference between the periods will always be 0%; and, the sum total for each set of data will always be 100%.
Objective:
Test whether the sales distribution (sales mix percentage of each menu item relative to other menu items) changed significantly from one period to another.
Null Hypothesis: the purchase patterns and preferences of customers in period A are the same as those for customers in period B.
Example of potential data input:
[Menu Item] [Period A] [Period B]
Hamburger 25% 28%
Cheeseburger 25% 20%
Salad 20% 25%
Club Sandwich 30% 27%
Question:
Do common methods exist to test whether the distribution of share-of-total is significantly different between two sets of data?
A paired t-test would have worked if I were measuring a change in the number of actual units sold, but not (I believe) for a change in the share of total units.
I've been searching online and a few text books for a while with no luck. I may be looking for the wrong terminology.
Any direction, be it search terms or (preferably) the actual names of appropriate tests, is appreciated.
Thanks,
Andrew
EDIT: I am considering a Pearson correlation test as a possible solution; setting aside the fact that each row of data is a separate menu item, the math shouldn't care. A perfect match (identical sales mix) would receive a coefficient of 1, and the greater the change, the lower the coefficient would be. One potential issue is that, unlike a regular correlation test, the changes may be amplified because any change to one number automatically impacts the others. Is this a viable solution? If so, is there a way to temper the amplification issue?
Consider using a Chi Squared Goodness-of-Fit test as a simple solution to this problem:
H0: the proportion of menu items for month B is the same as month A
Ha: at least one of the proportions of menu items for month B is different to month A
There is a nice tutorial here.
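In Excel, the test needs unit counts rather than percentages. A minimal sketch, assuming a hypothetical layout with period A's mix shares in B2:B5 and period B's observed unit counts in C2:C5: compute each item's expected count under H0 as its period A share times period B's total, then compare observed and expected.
In D2, copied down to D5:
=B2*SUM($C$2:$C$5)
p-value:
=CHISQ.TEST(C2:C5,D2:D5)
A small p-value (e.g. below 0.05) would suggest the sales mix shifted between the periods.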
