Sumproduct with multiple criteria on one range - excel

In a dataset I have the answers that participants gave to a survey. In one example the answers are numbered 1 to 5, with 1 being yes and 2 to 5 being variants of no.
Twenty or so similar questions were asked, and each participant belongs to one of 20 subgroups. The questions were categorized into 6 classes.
Normally a pivot table would be the best tool for such a dataset. However, the way the data is laid out doesn't work with a pivot table, and because of the sheer size of the dataset, remodelling it isn't efficient.
To extract the number of people in a certain subgroup who answered yes to questions in a certain class, I use the following function:
=SUMPRODUCT(--(Test!D$4:$CC$1824=1)*(Test!$C$4:$C$1824=$C3)*(Test!$D$3:$CC$3=D$2))
In which Test!D$4:$CC$1824 is the range where answers are given, and the other two are ranges for subgroup and classes respectively.
By using --(Test!D$4:$CC$1824=1) I convert all data to 0's except for where participants answered yes (cell value = 1).
Now I would like to do the same thing for where they answered no, so the value is either 2 or 3 or 4 or 5. The ideal way would be to append some OR logic into the first test, coming about something like this: --(Test!D$4:$CC$1824={2,3,4,5})
Of course this doesn't work, but is there a simple notation besides retyping the first part 4 times and adding the results together?

I'd say you could just use >1 instead of =1.
For a non-contiguous selection such as 1, 3 or 5, you probably need to add up one SUMPRODUCT per number.
Side note: the -- is not necessary here. It only converts TRUE and FALSE to 1 and 0, and multiplying the comparisons together already does that. It is needed only when there is a single bracketed comparison inside SUMPRODUCT.

The OR operation can be mimicked by adding all of the possibilities together.
=SUMPRODUCT(((Test!D$4:$CC$1824=2)+(Test!D$4:$CC$1824=3)+(Test!D$4:$CC$1824=4)+(Test!D$4:$CC$1824=5))*
(Test!$C$4:$C$1824=$C3)*(Test!$D$3:$CC$3=D$2))
If there is any possibility that two of the tests could be true at once (in this case there isn't), wrap the sum in the SIGN function to get only zero or one.
=SUMPRODUCT(SIGN((Test!D$4:$CC$1824=2)+(Test!D$4:$CC$1824=3)+(Test!D$4:$CC$1824=4)+(Test!D$4:$CC$1824=5))*
(Test!$C$4:$C$1824=$C3)*(Test!$D$3:$CC$3=D$2))
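The add-then-clamp idea can be checked outside Excel. The following is a minimal Python sketch, assuming a tiny made-up answer grid; the names `count_matching`, `answers`, `subgroups` and `classes` are illustrative and do not come from the workbook.

```python
def count_matching(answers, subgroups, classes, subgroup, cls, wanted):
    """Count cells whose value is in `wanted`, in rows of `subgroup` and columns of `cls`."""
    total = 0
    for row, row_group in zip(answers, subgroups):
        for value, col_class in zip(row, classes):
            # Adding the equality tests mimics OR; min(..., 1) plays the role of SIGN.
            or_hit = min(sum(value == w for w in wanted), 1)
            # Multiplying the 0/1 flags mimics AND, as in SUMPRODUCT.
            total += or_hit * (row_group == subgroup) * (col_class == cls)
    return total

# Hypothetical data: 3 participants (rows) x 4 questions (columns).
answers = [[1, 2, 5, 1],
           [3, 1, 4, 2],
           [2, 2, 1, 1]]
subgroups = ["A", "B", "A"]       # one subgroup per row
classes = ["X", "Y", "X", "Y"]    # one class per column

no_count = count_matching(answers, subgroups, classes, "A", "X", {2, 3, 4, 5})
yes_count = count_matching(answers, subgroups, classes, "A", "X", {1})
print(no_count, yes_count)
```

Because each cell holds a single value, at most one equality test can be true per cell, so the clamp changes nothing here. It only matters when the OR-ed conditions can overlap.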

Related

Permutations of two lists with each three different outcomes (32,000 combinations)

There are 50 Success Criteria (“requirements”) broken into two levels: Single-A (with 25 requirements) and Double-A (with 12 requirements). Using the philosophy of the distributive property, I need to create a sort of permutated list of all possible combinations from these two levels. The trouble I’m running into, though, there are various make ups of the levels themselves against one of three conformance levels.
A reviewer will go through each of the Success Criteria to fill out a VPAT. A VPAT will have the 50 Success Criteria listed out and my reviewer will look at the product, and based on the given success criteria, give it a result of “Does Not Support”, “Partially Supports”, or “Supports”. Each line can have only one result.
So, a completed review could look like this:
Requirement #   Level      Status
1               Single-A   SUPPORTS
2               Single-A   SUPPORTS
3               Single-A   PARTIALLY SUPPORTS
4               Single-A   DOES NOT SUPPORT
5               Single-A   DOES NOT SUPPORT
11              Double-A   DOES NOT SUPPORT
12              Double-A   PARTIALLY SUPPORTS
13              Double-A   SUPPORTS
14              Double-A   PARTIALLY SUPPORTS
The final tally, is what I’m trying to summarize (tab “Intended Result”). Every VPAT should output a pivoted result like this:
Single A Fail   Single A Partial   Single A Pass   Double A Fail   Double A Partial   Double A Pass
0               0                  25              0               0                  12
Here’s my problem. There are 351 permutations for the Single A list and 91 for the Double A list. I’ve already mapped those out manually in the given tabs. Now, I need to permutate both lists by their three dimensions (nearly 32,000 possibilities), but I can’t figure it out with the several dimensions. Here’s the “distributive property”: From the Single-A list, row #1, I need to line up with all 91 of the Double-A list. Then, do it again for line #2 to all 91 lines. And on and on. See example here.
Can anyone help me with a formula that might be able to accomplish this without me copying 91 lines 351 times and doing a bunch of “fill downs”? Attached, as well, is my workbook so far.
I've tried the various permutation formulas, but I'm getting the "number" of combinations that are possible, not the actual results. I'd like to list out every possible scenario for both lists.
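As a cross-check on the counts quoted above, here is a small Python sketch that lists the scenarios rather than just counting them. It assumes each list consists of every way to split 25 (or 12) requirements across the three statuses; the names `splits`, `single_a`, `double_a` and `combined` are made up for illustration.

```python
from itertools import product

def splits(n):
    """All (fail, partial, pass) tallies that add up to n requirements."""
    return [(f, p, n - f - p) for f in range(n + 1) for p in range(n + 1 - f)]

single_a = splits(25)   # assumed: 25 Single-A requirements
double_a = splits(12)   # assumed: 12 Double-A requirements

# Pair every Single-A tally with every Double-A tally (a cross join).
combined = [a + aa for a, aa in product(single_a, double_a)]

print(len(single_a), len(double_a), len(combined))
```

Under these assumptions the list lengths are 351, 91 and 351 × 91 = 31,941, and each entry of `combined` is one six-column row in the layout of the intended result.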
I don't know if it's helpful, but I think I solved it in Google Sheets with this formula:
=REDUCE({SINGLE_A!A1:D1,DOUBLE_A!A1:D1},SEQUENCE(COUNTA(SINGLE_A!A2:A)),LAMBDA(a,b,{a;
SCAN(,SEQUENCE(COUNTA(DOUBLE_A!A2:A)),LAMBDA(c,d,{INDEX(SINGLE_A!A2:D,b),INDEX(DOUBLE_A!A2:D,d)}))}))
The problem is that it can't be replicated in Excel... but if you only need the values, you can download the sheet as an .XLSX and it will keep the values without the formula.
As you can see, there are 31,941 combinations (351 × 91). Here is the link
Permutate Rows of Two Tables
Feel free to download the file from my Google Drive.
LAMBDA
=LAMBDA(Table1,Table2,LET(Data1,DROP(Table1,1),Data2,DROP(Table2,1),
rCount1,ROWS(Data1),cCount1,COLUMNS(Data1),rCount2,ROWS(Data2),cCount2,COLUMNS(Data2),rCount,rCount1*rCount2,
Result1,WRAPCOLS(INDEX(TOCOL(Data1,,1),ROUNDUP(SEQUENCE(rCount*cCount1,,1)/rCount2,0)),rCount),
Result2,WRAPROWS(INDEX(TOCOL(Data2),MOD(SEQUENCE(rCount*cCount2,,1)-1,rCount2*cCount2)+1),cCount2),
VSTACK(HSTACK(TAKE(Table1,1),TAKE(Table2,1)),HSTACK(Result1,Result2))))(A1:C4,E1:G5)
PermutRows Function (Formulas->Name Manager->New)
=LAMBDA(Table1,Table2,LET(Data1,DROP(Table1,1),Data2,DROP(Table2,1),
rCount1,ROWS(Data1),cCount1,COLUMNS(Data1),rCount2,ROWS(Data2),cCount2,COLUMNS(Data2),rCount,rCount1*rCount2,
Result1,WRAPCOLS(INDEX(TOCOL(Data1,,1),ROUNDUP(SEQUENCE(rCount*cCount1,,1)/rCount2,0)),rCount),
Result2,WRAPROWS(INDEX(TOCOL(Data2),MOD(SEQUENCE(rCount*cCount2,,1)-1,rCount2*cCount2)+1),cCount2),
VSTACK(HSTACK(TAKE(Table1,1),TAKE(Table2,1)),HSTACK(Result1,Result2))))
=PermutRows(A1:C4,E1:G5)
Using Your Tables
=PermutRows(Table2[[#All];[A_FAIL]:[A_PASS]];Table1[[#All];[AA_FAIL]:[AA_PASS]])
Helper Formulas
J2: =TOCOL(A2:C4,,1)
K2: =ROUNDUP(SEQUENCE(36,,1)/4,0)
L2: =INDEX(J2#,K2#)
M2: =WRAPCOLS(L2#,12)
Q2: =TOCOL(E2:G5,,0)
R2: =MOD(SEQUENCE(36,,1)-1,12)+1
S2: =INDEX(Q2#,R2#)
T2: =WRAPROWS(S2#,3)
X2: =HSTACK(M2#,T2#)
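The helper columns come down to two index sequences: one that repeats each row of the first table once per row of the second, and one that cycles through the rows of the second table. A row-level Python sketch of that arithmetic, with invented sample tables and a hypothetical function name `permut_rows`, might look like this:

```python
def permut_rows(table1, table2):
    """Cross join two tables (lists of rows, header row first) using index arithmetic."""
    header = table1[0] + table2[0]
    data1, data2 = table1[1:], table2[1:]
    r1, r2 = len(data1), len(data2)
    rows = []
    for k in range(r1 * r2):
        i = k // r2   # analogous to ROUNDUP(seq/rCount2,0): each table1 row repeats r2 times
        j = k % r2    # analogous to the MOD(seq-1,...)+1 step: table2 rows cycle
        rows.append(data1[i] + data2[j])
    return [header] + rows

# Hypothetical sample tables.
t1 = [["A", "B"], [1, 2], [3, 4]]
t2 = [["C"], ["x"], ["y"], ["z"]]
for row in permut_rows(t1, t2):
    print(row)
```

The Excel version indexes flattened cells (via TOCOL) rather than whole rows, so its sequences are longer, but the repeat-and-cycle pattern is the same.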

Criteria cutoffs for INDEX

I'm not even sure how to ask this so please excuse the roundabout manner forthcoming.
I have a list of tasks and would like to use =INDEX to create my array. However, there are multiple different versions of the task that could show up, and I would like to have all possible avenues covered when creating (only 4 differences).
The name of the range is TaskCode. I want to have it so I can return the first seven numbers, the period, and then only the digits directly after the period. So in case 1, I would want 0527011.3, in case 2 I would want 0527011.01, in case 3 I would want 0527011.23, and in case 4 I'd want 0527011.3.
I initially did =LEFT(TaskCode,10) but that will obviously not work in case 1 or 4. Basically I need to say cut off EITHER at the second period OR the first blank.
Thanks
=LEFT(A2,FIND("|",SUBSTITUTE(A2&".",".","|",2))-1)
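The formula appends a period so that a second period always exists, marks the second period, and keeps everything to its left. A rough Python equivalent is sketched below. The function name and the sample task codes are assumptions, since the four cases from the question are not shown, and like the formula it does not handle the "first blank" cutoff.

```python
def cut_at_second_period(code):
    """Return everything before the second period, or the whole string if there is none."""
    parts = code.split(".")
    # Keep the text before the first period and the text between the first and second.
    return ".".join(parts[:2])

# Hypothetical task codes.
for sample in ["0527011.3", "0527011.01.02", "0527011.23.4.5"]:
    print(sample, "->", cut_at_second_period(sample))
```

One difference: for a string with no period at all, this sketch returns the string unchanged, whereas the Excel formula would return an error because SUBSTITUTE finds no second period to mark.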

How do I count all the instances where a certain number is between multiple sets of numbers?

I would like to count the number of times a specific number lies between multiple ranges.
For instance,
Specific number: 2.5 (let's say this one is in AD1)
J3=14
K3=22
L3=0
M3=6
N3=6
O3=14
P3=2
Q3=8
I need to find how many times 2.5 is between:
J3&K3
L3&M3
N3&O3
P3&Q3
The reason I would like a formula for this is that I have many "specific numbers" that I need to test against the same ranges.
I know I can combine multiple CountIf, but the formula would be way too long.
I remember I can use Sum(CountIf("INSERTFORMULA")) but I think somehow using a combination of Sum(CountIf(Median())) will be simpler to read
SUM(Countif(MEDIAN($AD$1,J3,K3)=$AD$1,TRUE),MEDIAN($AD$1,L3,M3)=$AD$1,TRUE),MEDIAN($AD$1,N3,O3)=$AD$1,TRUE),MEDIAN($AD$1,P3,Q3)=$AD$1,TRUE))
Expected result: 2 (i.e. between L3&M3 and between P3&Q3)
Try: (Edited to correct typo)
=SUMPRODUCT(($AD$1>=INDEX(J3:Q3,1,N(IF(1,{1,3,5,7}))))*($AD$1<=INDEX(J3:Q3,1,N(IF(1,{2,4,6,8})))))
The N(IF(1,{array})) is a method of returning discontinuous elements of an array using the INDEX function.
Depending on whether you want to include/exclude the bounds of the ranges when you write between, you may want to remove the equal = sign from the comparisons.
Try:
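=SUMPRODUCT((J3:P3<=AD1)*(K3:Q3>=AD1))
Note that this compares every adjacent pair of cells, including K3&L3, M3&N3 and O3&P3, so it gives the right count only when those in-between pairs can never contain the number (as is the case for the sample data with 2.5).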
Split your formula into two parts:
First, calculate MEDIAN($AD$1,J3,K3) in J4 (for example), then copy that formula across the whole row (so K4 will be MEDIAN($AD$1,K3,L3), and so on).
Second, summarize row 4 with a formula such as SUM(A4:AA4).
It takes more space on the sheet, but it is simpler to create and check.

Define Status depending on Criteria

I have advanced Excel/Google Sheets skills. I have more of a conceptual question. I am happy with any solution (Excel or for Sheets, no difference for me).
I have a sheet where various coworkers have access and work with. It is used to define which product needs to go through which steps. Then when a part of a job is done, the status of the product is changed depending on criteria.
You can also think of it as projects and the status of a project.
The 3 examples show how the data is entered by the workers. Sometimes the "No" cells are empty, sometimes they contain a "No", and sometimes, for the same product, one criterion is empty while another has a "No".
If I use a nested IF formula, I would have to create 32 of them (I believe, since there are 5 criteria with 2 options each).
Obviously I can do that. I was wondering whether anyone has a better solution for me? Something more practical.
Thanks in advance!
Based on the data you've provided, it looks like your statuses depend on the number of Yes's in the input columns. You also don't show a status for zero Yes's, so I'll add an additional one for that.
Given that assumption, you can use a combination of the COUNTIF function (to count the Yes's) and the IFS function (to manage nested IFs better) to drastically reduce the size of your formula.
To make this cleaner, I suggest you add a hidden column containing: =COUNTIF([InputCriteria1to5Range],"Yes")
For the next formula assume the formula above is in B2. In your status column put the following:
=IFS(B2=5, Status1, B2=4, Status2, B2=3, Status3, B2=2, Status4, B2=1, Status5, B2=0, Status6)
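The count-then-look-up logic can be sketched in Python as follows. This assumes, as the answer does, that the status depends only on how many criteria are "Yes"; `STATUSES` and `status_for` are hypothetical names, and the status labels simply mirror the placeholders in the IFS formula.

```python
# Placeholder status labels, keyed by the number of "Yes" answers.
STATUSES = {5: "Status1", 4: "Status2", 3: "Status3",
            2: "Status4", 1: "Status5", 0: "Status6"}

def status_for(criteria):
    """Map a row of criteria cells to a status based on how many are 'Yes'."""
    # Like COUNTIF(range,"Yes"), the match is case-insensitive;
    # empty cells and "No" both simply fail to count.
    yes_count = sum(1 for c in criteria if isinstance(c, str) and c.lower() == "yes")
    return STATUSES[yes_count]

print(status_for(["Yes", "No", "", "Yes", "yes"]))
```

Treating an empty cell and "No" the same way is what collapses the 32 nested-IF branches into six cases.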
Solution: Thanks to all for your help. I ended up first creating ALL scenarios, which was actually the most complex part. See https://www.mrexcel.com/forum/excel-questions/654871-how-generate-all-possible-combinations-two-lists-without-macro.html (answer from "Tusharm"); I had to repeat this process 5 times to get all possible outcomes. In the end, there were 192 combinations.
Then, I assigned a status to each combination.
Finally, for each product/row, I created another column where I concatenated the different criteria so that the result looks exactly like my combinations above, and then used INDEX/MATCH to look up the concatenated criteria in my combinations.

Rank the top 5 entries in different criteria

I have a table that I want to find the top X people in each of the different groups.
Unique Names Number Group
a 30 1
b 4 2
c 19 3
d 40 2
e 1 1
f 9 2
g 15 3
I've ranked the top 5 people by number by using =index($A$2:$A$8,match(large($B$2:$B$8,1),$B$2:$B$8,0)). The 1 in the LARGE function I linked to a ranked range so that when I dragged down it changed up the number.
What I would like to do next is rank the top x number of people in each group. So top 3 in group 1.
I tried =index($A$2:$A$8,match("1"&large($B$2:$B$8,1),$C$2:$C$8&$B$2:$B$8,0)) but it didn't seem to work.
Thanks
EDIT: After looking at the answers below I have realised why they are not working for me. The actual data that I want to use the formula with has multiple entries of the same number. I have adjusted the example data to show this. The problem I have is that if there are duplicate numbers then it returns both of the names even if one is not in the group.
Unique Names Number Group
a 30 1
b 30 2
c 19 3
d 40 2
e 1 1
f 30 2
g 15 3
Proof of Concept
Use the following formula in the example above in cell F2 and copy down and to the right as needed.
=IFERROR(INDEX($A$2:$A$8,MATCH(AGGREGATE(14,6,($C$2:$C$8=F$1)*($B$2:$B$8),ROW($A2)-1),$B$2:$B$8,0)),"")
Provide the group numbers in the header row, or come up with a formula that advances and resets the group number as you copy down, based on the X number in your question.
Explanation:
The AGGREGATE function, unlike the LARGE function, can process arrays without needing CSE (Ctrl+Shift+Enter). As such, we can add criteria for what we want to use. In this case only one criterion was used, the group number. In the formula it is the following part:
($C$2:$C$8=F$1)
If there were multiple criteria we would use either a + operator as an OR or a * operator as an AND.
The 6 option in the aggregate function allows us to ignore errors. This is useful when trying to get the small. It is also useful for dealing with other information that may cause errors that do not need to be worried about.
As this is technically an array operation avoid using full column/row references as they can bog down your system.
The basics of what the overall formula is doing is building a list of the numbers that match the group you are interested in. After filtering your numbers, it determines which is the largest, second largest, etc., according to the row you have copied down to. It then determines which row the nth largest number occurs in through the MATCH function, and finally it returns the name corresponding to that row with the INDEX function.
Building on all the other great answers.
Because you have the possibilities of duplicate values in each group we need to do this with two formulas.
First we need to get the numbers in order. I used the Aggregate, but this could be done with the array LARGE(IF()) also:
=IFERROR(AGGREGATE(14,6,$B$2:$B$8/($C$2:$C$8=E$1),ROW(1:1)),"")
Then, using that number and its order as a reference, we can use a modified version of #ForwardEd's formula, with COUNTIF() to ensure we get the correct name in return.
=IFERROR(INDEX($A$2:$A$8,AGGREGATE(15,6,(ROW($B$2:$B$8)-ROW($B$2)+1)/(($C$2:$C$8=F$1)*($B$2:$B$8=E3)),COUNTIF(E$2:E2,E3)+1)),"")
This will count the number in the results returned and then bring in the correct name.
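The two-step approach (order the numbers within the group, then pick the k-th matching row for the k-th duplicate) can be sketched in Python. The function name `top_n_in_group` is made up, and the rows are the adjusted example data from the question.

```python
def top_n_in_group(rows, group, n):
    """Return up to n (name, number) pairs from `group`, largest number first."""
    members = [(name, number) for name, number, g in rows if g == group]
    # sorted() is stable, so tied numbers keep their original row order,
    # which mirrors picking the k-th matching row for the k-th duplicate.
    return sorted(members, key=lambda m: m[1], reverse=True)[:n]

# Adjusted example data: (name, number, group).
rows = [("a", 30, 1), ("b", 30, 2), ("c", 19, 3), ("d", 40, 2),
        ("e", 1, 1), ("f", 30, 2), ("g", 15, 3)]

print(top_n_in_group(rows, 2, 3))
```

Filtering by group before ranking is what keeps a name from another group (such as "a", which also has 30) out of the group 2 results.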
You could also solve this with array formulas - to filter a group whose name is stored in E1, your code
=INDEX($A$2:$A$8,MATCH(LARGE($B$2:$B$8,1),$B$2:$B$8,0))
would then be adapted to
=INDEX($A$2:$A$8,MATCH(LARGE(IF($C$2:$C$8<>E1,-1,$B$2:$B$8),1),$B$2:$B$8,0))
Note: After entering an array formula, you have to press CTRL+SHIFT+ENTER.
Thank you to everyone who offered help but for some reason none of your methods worked for me, which I am sure was to do with the quality of my data. I used an alternate method in the end which is slightly convoluted but seemed to work.
=IF($C2="1",RANK($B2,$B$2:$B$8,1)+ROW()/10000,-1)
Essentially using the rank function and adding a fraction to separate out duplicate values.

Resources