Excel Formula to calculate SD and SUMSQ based on 4 conditions. 2 of the conditions will determine from which column the data will be taken from - excel-formula

I am having problems building two formulas with 4 conditions to return the Sum of Squares and the Standard Deviation from a dataset:
First two conditions. Located in data validation cells outside of the data range:
KPI
Period of time
Second two conditions:
Campaign Name
Player Group = Target Vs Control
The basic idea is that a player will generate a value for a series of KPIs over different periods of time (7,14 and 28 days). Therefore the structure of the data sets will be defined as:
Columns:
Player ID
Campaign ID
Player Group: Target or Contol
Rest of the columns: Combinations of a KPI and a given period of time.
The objective is to calculate the SD and SUMSQ of a subset of data based on the following restrictions: the KPI, the period of time, the Campaign ID and the Group of the player. As the dataset is built, the conditions "KPI" and "Period" will determine the column from which the data is going to be taken and the conditions "Campaign" and "Player Group" will act as row filters.
I have tried the following... with little hope, since I don't even expect that an array formula can be nested in an "if" function:
=IF(AND(Test!$J$9="STAKES",Test!$M$9=7),STDEV.S(IF(DATA!$A$2:$A$9237=Test!$B$13,DATA!$AD$2:$AD$9237,0)))
Can someone come up with a solution, please?

It looks like you are getting close. The "0" is where your next "if statement" will go.
So if the first "AND" condition is false the next "if statement" in the position where "0" was will execute.
Instead of $J$9="STAKES" and $M$9=7 I changed it to $J$9="TACOS" and $M$9=9000.
=IF(AND(Test!$J$9="STAKES",Test!$M$9=7),STDEV.S(IF(DATA!$A$2:$A$9237=Test!$B$13,DATA!$AD$2:$AD$9237,IF(AND(Test!$J$9="TACOS",Test!$M$9=9000),STDEV.S(IF(DATA!$A$2:$A$9237=Test!$B$13,DATA!$AD$2:$AD$9237,0))))))

Related

EXCEL - Dual VLOOKUP and Interpolation

I have a table on Excel with data as the following:
Meaning, I have different JPH based on the %SMALL unit and the number of active stations.
I need to create a matrix like the following (with %SMALL on horizontal and STATIONS on vertical axes):
And the formula for each cell should:
Take the input of Stations (column "B")
Check, for that specific Stations number, the amount of data on the other table (like make a filter on STATIONS for the specific number)
Perform an VLOOKUP for checking the JPH based on the %SMALL value on row 2
Interpolate for the exact JPH value, if not found on table
For now, I was able to create the last part (the VLOOKUP and the interpolation), with the following:
=IFERROR(VLOOKUP(C2;'EARLY-STATIONS'!$F:$H;3;FALSE);AVERAGE(OFFSET(INDEX('EARLY-STATIONS'!$H:$H;MATCH(C2;'EARLY-STATIONS'!$F:$F;1));0;0;2;1)))
The problem I'm facing is than with this, the calculation is not checking the number of stations, so the Iteration is not accurate.
Unfortunately I cannot use VBA macros to solve this.
Any clue?
This is an attempt because more clarity is needed in terms of all possible scenarios to consider, based on different input data and how to understand the "extrapolation" process. This approach understands as extrapolation the average of two values (lower and greater), but the idea can be customized to any other way to calculate it. Per tags listed in the question I assume there is no Excel version constraint. This is O365 solution:
=LET(sm, A2:A10, st, B2:B10, jph, C2:C10, smx, F1:J1, sty, E2:E4, NULL, "",
GETLk, LAMBDA(x,y,mode, FILTER(jph, (st=y)
* (sm = INDEX(sm, XMATCH(x, sm, mode))), NULL)),
GET, LAMBDA(x,y, LET(f, FILTER(jph, (jph=GETLk(x,y, 1))
+ (jph=GETLk(x,y, -1)), NULL), IF(#f=NULL, NULL, AVERAGE(f)))),
HREDUCE, LAMBDA(yi, DROP(REDUCE("", smx, LAMBDA(ac,x,
HSTACK(ac, GET(x, yi)))),,1)),
DROP(REDUCE("", sty, LAMBDA(ac,y, VSTACK(ac, HREDUCE(y)))),1))
The above formula spills the entire result, I don't think for this case you can use a LOOKUP-like function.
Here is the output:
The highlighted cells where the average is calculated.
Explanation
The main idea is to use DROP/REDUCE/HSTACK/VSTACK pattern to generate the grid. Check my answer to the following question: how to transform a table in Excel from vertical to horizontal but with different length on how to apply it.
We use two user LAMBDA functions to abstract some calculations:
GETLk(x,y,mode), filters jph name based on %SMALL and Stations columns values, based on input values x (x-axis value from the grid), y (y-axis value form the grid) respectively. The third input argument mode, is for doing the approximate search in XMATCH (1-next largest, -1 next smallest). In case the value exist in the input table, XMATCH returns the same value in both cases.
GET(x,y) has the logic to find the value or if the value doesn't exist to calculate the average. It uses the previous LAMBDA function GETLk. We filter for jph values that match the input values (x,y), but we use an OR condition in the FILTER (+), to select both lower or greater values. If the value exist, returns just one value otherwise two values are returned by FILTER (f). Finally if f is not empty we return the average, otherwise the value we setup as NULL.
HREDUCE: Concatenate the result by columns for a given row of the grid. Check the referred question for more information about it.

Sort data contained in blocks in excel

I have a large amount of reference data in excel, which I am trying to manipulate in a variety of ways. I'm having some problems with the way it is structured and sorting into a more manageable format.
Problem number 1:
I have three columns. Column A contains first a date, and then a designator of high or low. Column B contains times, Column C contains heights.
I would like to sort the data by column B (easy enough) EXCEPT I would like the date headings in Column A preserved. It's almost as though I have 365 tables, each with between 3 and 5 pieces of data - I'm looking to sort the 3 - 5 pieces of data within each date only.
This is what I have currently:
There's no issue with me taking the data and manipulating it some other way first - this is ultimately around me being able to take a batch of data (5x different reference points, each for 365 days) and develop a process to sanitise it and get it displayed in time order, as well as being able to get it into a usable format for problem 2 (I need to adjust some other data points by the sorted data once I have it).
This is what I would like it to look like (I manually went through each of these blocks and sorted them):
It is possible to do it in Excel as follows in cell E2:
=LET(rng, A1:C11, set, FILTER(rng, (INDEX(rng,,1) <>"")),
dates, SCAN("", INDEX(set,,1), LAMBDA(acc, item, IF(ISNUMBER(item), item, acc))),
in, FILTER(HSTACK(dates, set), INDEX(set,,2)<>""), inDates, INDEX(in,,1),
out, REDUCE("", UNIQUE(inDates), LAMBDA(acc, date,
LET(sorted, VSTACK(date, DROP(SORT(FILTER(in, inDates = date),3),,1), {"","",""}),
VSTACK(acc, sorted)
))), IFERROR(DROP(DROP(out,1),-1),"")
)
Here is the output:
You can avoid the clean-up process except for removing the last row as follow:
=LET(rng, A1:C11, set, FILTER(rng, (INDEX(rng,,1) <>"")),
dates, SCAN("", INDEX(set,,1), LAMBDA(acc, item, IF(ISNUMBER(item), item, acc))),
in, FILTER(HSTACK(dates, set), INDEX(set,,2)<>""), inDates, INDEX(in,,1),
out, REDUCE("", UNIQUE(inDates), LAMBDA(acc, date,
LET(sorted, VSTACK(HSTACK(date,"",""), DROP(SORT(FILTER(in, inDates = date),3),,1),
{"","",""}), IF(MAX(LEN(acc))=0, sorted, VSTACK(acc, sorted))
))), DROP(out, -1)
)
Explanation
Basically is to carry out the manual steps but using excel functions. The name set, is the same as the input data (rng) but we removed the empty rows. The name dates, is a column with the same size as rng, repeating all the dates. The condition in the SCAN function to identify a new date is ISNUMBER because dates are stored in Excel as whole numbers. The name in has the data in the format we want for doing the sorting and filter by date removing the date header and adding as the first column the dates.
Now we use DROP/REDUCE/VSTACK pattern (check the answer to the question: how to transform a table in Excel from vertical to horizontal but with different length provided by David Leal) to append each sorted data for a given unique date. We add the date as the first row, then sorted data, and finally an empty row to separate each group of data. Finally, we do a clean-up via IFERROR/DROP to remove the #N/A values and the first and the last empty row.

How can I extract a total count or sum of agents who made their first sale in a specified month?

I am trying to extract some data out of a large table of data in Excel. The table consists of the month, the agent's name, and either a 1 if they made a sale or a 0 if they did not.
What I would like to do is plug in a Month value into one cell, then have it spit out a count of how many agents made their first sale that month.
Sample Data and Input Output area
I found success by creating a secondary table for processing a minif and matching to agent name, then countif in that table's data how many sales months matched the input month. However I would like to not have a secondary table and do this all in one go.
=IF(MINIFS(E2ERawData[Date Group],E2ERawData[Agent],'Processed Data'!B4,E2ERawData[E2E Participation],1)=0,"No Sales",MINIFS(E2ERawData[Date Group],E2ERawData[Agent],'Processed Data'!B4,E2ERawData[E2E Participation],1))
=COUNTIFS(ProcessedData[Month of First E2E Sale],H4)
Formula in column F is:
=MAX(0;COUNTIFS($A$2:$A$8;E3;$C$2:$C$8;1)-SUM(COUNTIFS($A$2:$A$8;"<"&E3;$C$2:$C$8;1;$B$2:$B$8;IF($A$2:$A$8=E3;$B$2:$B$8))))
This is how it works (we'll use 01/03/2022 as example)
COUNTIFS($A$2:$A$8;E3;$C$2:$C$8;1) This counts how many 1 there are for the proper month (in our example this part will return 2)
COUNTIFS($A$2:$A$8;"<"&E3;$C$2:$C$8;1;$B$2:$B$8;SI($A$2:$A$8=E3;$B$2:$B$8)) will count how many 1 you got in previous months of the same agents (in our example, it will return 1)
Result from step 2, because it's an array formula, we sum up using SUM() (in our example, this return 1)
We do result from step 1 minus result from step 3 (so we get 1)
Finally, everything is inside a MAX function to avoid negative results (February would return -1 because there were no sales at all and agent B did a sale on January, so it would return -1. To avoid this, we force Excel to show biggest value between 0 and our calculation)
NOTE: Because it's an array formula, depending on your Excel version maybe you must introduce pressing CTRL+ENTER+SHIFT
If one has got access to the newest functions:
=LET(X,UNIQUE(C3:C9),VSTACK({"Month","Total of First time sales"},HSTACK(X,BYROW(X,LAMBDA(a,SUM((C3:C9=a)*(MINIFS(C3:C9,D3:D9,D3:D9,E3:E9,1)=C3:C9)))))))

How to define an array of numbers with a formula

I have a project where I need to break people into 3 buckets with task lists that rotate quarterly (Phase A = task list 1, B = task list 2, C = task list 3). The goal here is to sort people into the buckets based on a departure date, with the ideal being that they would depart when they're in the C phase. I have a formula already set up that will tell me the number of quarters between the project start date and the person's departure date, so now I'm trying to figure out how to get Excel to tell me if a person's departure date falls within their bucket's C Phase.
I have this formula in a column called DEROSQtr:=ROUNDDOWN(DAYS360("1-Oct-2020",[#DEROS],FALSE)/90,0)
Now the easy way to approach this would be to build a static array and just see if that formula results in a value in the right array, where the numbers in the array define which quarter from Oct 2020 that the bucket's C Phase is going to be in:
ArrayA = {1;4;7;10;13;16} ArrayB = {2;5;8;11;14;17} ArrayC = {0;3;6;9;12;15}
The formula that pulls this all together is then:
=IF([#EFP]="A",IF(IFNA(MATCH([#DEROSQtr],ArrayA,0),-1)<>-1,TRUE,FALSE),IF([#EFP]="B",IF(IFNA(MATCH([#DEROSQtr],ArrayB,0),-1)<>-1,TRUE,FALSE),IF([#EFP]="C",IF(IFNA(MATCH([#DEROSQtr],ArrayC,0),-1)<>-1,TRUE,FALSE),"-")))
Now while this will work for as long as I build out the static array, I'm trying to figure out how to define each of these buckets with a formula that Excel can work with, i.e. bucket A hits phase C in 3n + 1 quarters where n is the number of cycles through all 3 phases, so ArrayA = 3n+1, ArrayB = 3n+2 and ArrayC = 3n. What I'm hunting for here is the best way to define each of the arrays as a formula.
After some additional digging and looking back at how to define each array, I came across the MOD() function in Excel. I was then able to rewrite the formula that does the checking as =IF([#EFP]="A",IF(MOD([#DEROSQtr]-1,3)=0,TRUE,FALSE),IF([#EFP]="B",IF(MOD([#DEROSQtr]-2,3)=0,TRUE,FALSE),IF([#EFP]="C",IF(MOD([#DEROSQtr],3)=0,TRUE,FALSE),"-"))), replacing ArrayA(3n+1) with MOD([#DEROSQtr]-1,3), ArrayB(3n+2) with MOD([#DEROSQtr]-2,3), and ArrayC(3n) with MOD([#DEROSQtr],3).
Since I do not have the data you are calculating your quarter, its difficult to give you exact answer. However, as I understand your have a column which has the formula to calculate the quarter say "Formula_Col"
Solution will be to add a new column and flag it based on the values in "Formula_Col".
If you can give some sample data I can provide exact answer.

Using tbl.Lookup to match just part of a column value

This question relates to the Schematiq add-in for Microsoft Excel.
Using =tbl.Lookup(table, columnsToSearch, valuesToFind, resultColumn, [defaultValue]) the values in the valuesToFind column have a consistent 3 characters to the left and then varying characters after (e.g. 908-123456 or 908-321654 - i.e. 908 is always consistent)
How can I tell the function to lookup the value based on the first 3 characters only? The expected answer should be the sum of the results of the above, i.e. 500 + 300 = 800
tbl.Lookup() works by looking for an exact match - this helps ensure it's fast but in this case it means you need an extra step to calculate a column of lookup values, something like this:
A2: =tbl.CalculateColumn(A1, "code", "x => LEFT(x, 3)", "startOfCode")
This will give you a new column that you can use for the columnsToSearch argument, however tbl.Lookup() also looks for just one match - it doesn't know how to combine values together if there is more than one matching row in the table, so I think you also need one more step to group your table by the first 3 chars of the code, like this:
A3: =tbl.Group(A2, "startOfCode", "amount")
Because tbl.Group() adds values together by default, this will give you a table with a row for each distinct value of startOfCode and the subtotal of amount for each of those values. Finally, you can do the lookup exactly as you requested, which for your input table will return 800:
A4: =tbl.Lookup(A3, "startOfCode", "908", "amount")

Resources