Transpose multiple occurrences - excel-formula

EDIT: I have revised the source data to remove the ambiguity of my last screenshots
I am trying to transpose spreadsheet data where there are many rows where the customer name may be duplicated but each row contains a different product.
For instance
revised original data source
to
revised proposed data format
I would like to do it with formulae if possible as I struggle with VB
Thank you for any help

I realise this is a huge answer, apologies but I wanted to be clear. If you need anything from me, drop me a comment and I'll help out.
Here's the output from my formula:
EDITED ANSWER - Named ranges used for ease of understanding:
These are just an example of a few of the named ranges I have used, you can reference the ranges directly or name them yourself (simplest way is to highlight the data then put the name in the drop down next to the formula bar [top left])
Be wary that as we will be using Array formulas for AccNum and AccType, you will not want to select the entire column and instead opt for either the exact data length or overshoot it by 100 or so. Large array formulas tend to slow down calculation and will calculate every cell individually regardless of it being empty.
First formula
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Account Number ",LEFT((COLUMN(A:A)+1)/2,1)),"")
This formula is identical to the one in the original answer apart from the adjusted heading title.
=IF(Condition,True,False) - There are so many uses for the IF logic, it is the best formula in Excel in my opinion. I have used IF with COUNTIF to check whether there are more than 0 cells that are more than BLANK (or ""). This is just a trick to avoid ISBLANK() or other blank identifiers, which get confused when a formula is present.
If the result is TRUE, I use CONCATENATE(Text1,Text2,etc.) to build a text string for the column header. ROW(1:1) or COLUMN(A:A) is commonly used to initiate an automatically increasing integer for formulas to use based on whether the count increase is required horizontally or vertically. I add 1 to this increasing integer and divide it by 2 so that the increase for each column is 0.5 (1 > 1.5 > 2 > 2.5) I then use LEFT formula to just take the first digit to the left of this decimal answer so the number increases only once every 2 columns.
If the result is FALSE then leave the cell blank ,""). Standard stuff here, no explanation needed.
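The pairing arithmetic can be checked outside Excel. The Python sketch below is only an illustration (the function names are made up for this example, and the number-to-text conversion is an approximation of Excel's). It mimics how LEFT((COLUMN(A:A)+1)/2,1) behaves as the formula is dragged across columns, next to the ROUNDDOWN((COLUMN(A:A)+1)/2,0) variant used in the third formula of this answer.

```python
def excel_number_text(value):
    """Approximate how Excel shows a number as text: 2.0 becomes '2', 1.5 stays '1.5'."""
    return str(int(value)) if value == int(value) else str(value)


def pair_index_left(column):
    """Mimic LEFT((COLUMN()+1)/2, 1): keep only the first character of the result."""
    return excel_number_text((column + 1) / 2)[0]


def pair_index_rounddown(column):
    """Mimic ROUNDDOWN((COLUMN()+1)/2, 0): keep the whole-number part of the result."""
    return (column + 1) // 2


# Columns 1..6 stand for the formula being dragged across six cells.
left_labels = [pair_index_left(c) for c in range(1, 7)]
round_labels = [pair_index_rounddown(c) for c in range(1, 7)]
print(left_labels)   # ['1', '1', '2', '2', '3', '3']
print(round_labels)  # [1, 1, 2, 2, 3, 3]
```

One thing this makes visible: LEFT keeps a single character, so at the tenth pair (column 19) the value 10 is cut down to "1", whereas ROUNDDOWN keeps counting. That only matters if a customer can have ten or more pairs of columns.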
Second Formula
=CONCATENATE(INDEX(Forename,MATCH(Sheet4!$A2,Reference,0)))
=CONCATENATE(INDEX(Surname,MATCH(Sheet4!$A2,Reference,0)))
CONCATENATE has only been used here to force blank cells to remain blank when pulled by INDEX. INDEX will read blank cells as values and therefore 0's whereas CONCATENATE will read them as text and therefore "".
INDEX(Range,Row,Column): This is a lookup formula that is much more advanced than VLOOKUP or HLOOKUP and not limited in the way that they are.
The range I have used is the expected output range - Forename or Surname
The row is then calculated using MATCH(Criteria,Range,Match Type). Match will look through a range and return the position as an integer where a match occurs. For this I have set the criteria to the unique reference number in column A for that row, the range to the named range Reference and the match type as 0 (1 Less than, 0 Exact Match, -1 Greater than).
I did not define a column number for INDEX as it defaults to the first column and I am only giving it one column of data to output from anyway.
Third Formula
Remember these need to be entered as an array (when in the formula bar hit Ctrl+Shift+Enter)
=IFERROR(INDEX(AccNum,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0))),"")
=IFERROR(INDEX(AccType,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))),"")
As you can see, one of these is used for AccNum and the other for AccType.
IFERROR(Value,Value_if_error): This has been used because we are not expecting the formula to always return something. When the formula cannot return something, or SMALL has run out of matches to go through, an error will occur (usually #VALUE! or #NUM!), so I use ,"") to force a blank result instead (again standard stuff).
I have already explained the INDEX formula above so let's just dive in to how I have worked out the rows that match what we are looking for:
SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))
The IF statement here is fairly self explanatory, but as we have entered it as an array formula, it will compare every cell in the named range Reference individually against Sheet4!$A2, the unique reference for that row. In your mock data this returns a result of: {FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} for the first entry (I included titles in the range, hence the initial FALSE). IF will do my row calculation* for every TRUE but leave the FALSEs as they are.
This leaves a result of {FALSE;2;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} that SMALL(array,k) will use. SMALL will only work on numeric values and will return the 'k'th smallest result. Again the column trick has been used but, to cover more ground, I used another method: ROUNDDOWN(Number,digits) as opposed to LEFT(). Digits here means decimal places, so I used 0 to round down to a whole integer for the same result. As this copies across the columns like so: 1, 1, 2, 2, 3, 3, the formulas will alternately (as AccNum and AccType alternate) grab the AccNum from the 1st matching row, then the AccType from that same row, before moving on to the 2nd matching row and so forth.
*(Row number of the match minus the first row number of the range, then plus 1. This is a fairly common, foolproof way to always get the correct row regardless of where the data starts; as your data starts on row 1 we could just use ROW(Reference), but I left it as is in case you had data in a different format.)
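For anyone who finds the nested formula hard to follow, here is the same logic as a small Python sketch. It is not a replacement for the formula, just a way to see the steps; the function name and the sample data are invented for illustration and are not taken from the screenshots.

```python
def kth_match(reference, values, key, k):
    """Return the value on the k-th row whose reference equals key, or "" if there is none."""
    # IF(Reference=key, ROW(Reference)-ROW(first)+1): 1-based positions of matching rows
    positions = [i + 1 for i, ref in enumerate(reference) if ref == key]
    # SMALL(positions, k) picks the k-th smallest; IFERROR(..., "") covers running out
    if k < 1 or k > len(positions):
        return ""
    position = sorted(positions)[k - 1]
    # INDEX(values, position)
    return values[position - 1]


# Hypothetical data; the first entry of each list plays the role of the title row.
reference = ["Ref", "C001", "C002", "C001", "C003", "C001"]
acc_num = ["AccNum", "111", "222", "333", "444", "555"]

row = [kth_match(reference, acc_num, "C001", k) for k in (1, 2, 3, 4)]
print(row)  # ['111', '333', '555', '']
```

The fourth lookup comes back blank because C001 only appears three times, which is the case IFERROR handles in the worksheet.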
ORIGINAL ANSWER - Same logic as above
Here's your solution in 3 parts
Part 1 is a trick for the auto completion of the titles so that they will hide when not used (in case you will just copy and paste values for the whole lot to speed up use again).
=IF(COUNTIF(C2:C11,">""")>0,CONCATENATE("Product ",LEFT((COLUMN(A:A)+1)/2,1)),"") in C
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Prod code ",LEFT((COLUMN(B:B)+1)/2,1)),"") in D
Highlight both of the cells and drag across to stagger the outputs "Product " and "Prod code "
Part 2 would be inputting the unique IDs to the new sheet, I would suggest copying your entire column A across to a new sheet and using DATA > REMOVE DUPLICATES > Continue with current selection to trim out the multiple occurrences of unique IDs.
In column B use =INDEX(Sheet2!$B$1:$B$7,MATCH(Sheet4!$A2,Sheet2!$A$1:$A$7,0)) to get the names pulled across.
Part 3, the INDEX
Once again, we are doing a staggered input here before copying the formula across the page to cover the entirety of the data.
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0)),1),"") in C
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0)),2),"") in D
The formulas of Part 3 will need to be entered as an array (when in the formula bar hit Ctrl+Shift+Enter). This will need to be done before copying the formulas across.
These formulas can now be dragged / copied in all directions and will feed off of the unique ID in column A.
My Answer is already rather long so I haven't gone on to break the formula down. If you have any trouble understanding how this works, let me know and I will be happy to write up a quick guide, breaking it down chunk by chunk for you.

Related

Multiply based on condition then sum results ( multiply if empty <> ) in google sheets

I need help in writing a formula in cell b7. The formula must look to the right and multiply the nonempty cells by the corresponding value in row 3, and I would like to sum up the results.
File link provided.
FILE LINK
ScreenShot
Please see my comment to your original post.
That said, I will try to explain how to approach this as I think you intend. (This solution will be a Google Sheets solution which will not work in Excel.)
The first thing you will need to do is to delete everything from Row 11 down: all of your examples and notes must be deleted for the following proposed formula to work correctly.
Once you have no superfluous data below your main chart, delete everything from B6:B (including the header "Total").
Then, place the following formula in cell B6:
={"TOTAL"; FILTER(MMULT(C7:G*1, TRANSPOSE(C$3:G$3*1)), A7:A<>"")}
This formula will return the header text "TOTAL" (which you can change within the formula itself if you like) followed by the calculation you want for each row where a name is listed in A7:A.
MMULT is a difficult function to explain, but it multiplies one matrix ("grid") of numbers by another matrix ("grid") and returns the sum of all products per row (or per column, depending on how you set it up), which is what you are trying to do.
MMULT must have every element of both matrices be a real number. To convert potential nulls to zeroes, you'll see *1 appended to each range (since null times 1 is zero).
This assumes that all data entered into C7:G and C3:G3 will always be either a number or null. If you enter text, you'll throw the formula into an error. If you think accidental text entries in those ranges are possible, use this version instead:
={"TOTAL"; FILTER(MMULT(IFERROR(C7:G*1, ROW(C7:G)*0), TRANSPOSE(IFERROR(C$3:G$3*1, COLUMN(C$3:G$3)*0))), A7:A<>"")}
The extra bits use IFERROR to exchange error-producing entries with zeroes, since MMULT must have every space in both matrices filled with a real number.
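If it helps to see what MMULT is doing here, the following Python sketch reproduces the per-row sum of products, with blanks and stray text counted as zero the way the IFERROR version does. The weights and rows are made-up stand-ins for C3:G3 and C7:G, and the FILTER on column A is left out.

```python
def to_number(cell):
    """Treat blanks and non-numeric entries as 0, like IFERROR(range*1, 0)."""
    try:
        return float(cell)
    except (TypeError, ValueError):
        return 0.0


def row_totals(rows, weights):
    """For each row, multiply every cell by its column weight and sum the products."""
    w = [to_number(x) for x in weights]
    return [sum(to_number(cell) * wi for cell, wi in zip(row, w)) for row in rows]


weights = [10, 20, 30, 40, 50]   # hypothetical stand-in for C3:G3
rows = [                         # hypothetical stand-in for C7:G
    [1, "", 2, "", ""],
    ["", "", "", "", ""],
    [1, 1, "x", 1, 1],           # "x" is an accidental text entry
]
totals = row_totals(rows, weights)
print(totals)  # [70.0, 0.0, 120.0]
```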

Excel: Flashfill Offset Horizontal + Vertical

So I'm not a fan of VBA and I recently learned that OFFSET can be used with COUNTA to flashfill a range for as far as the data extends, as long as you aim for a longer range than you have data. Now I want to be able to achieve this both for columns and rows at the same time, where the rows are averaged. Could this be done? I am banging my head against the wall to find some logic to do it, but can only manage to combine it in a way that multiplies the rows with the number of the column, which is not desired, of course.
I have posted a Minimal Reproducible Example in Excel Online:
https://onedrive.live.com/view.aspx?resid=63EC0594BD919535!1491&ithint=file%2cxlsx&authkey=!ALmV0VtFb7QZCvI
If you look at cells J9 and J11 you will see what I want to combine. I want to average the three rows from J11 down in J10, and spill/flashfill them to the right (like J9 and J11 already do automatically because of their formulas), for as many columns as there is data in the range A1:G4.
So I have raw data of numbers with titles in A1-G4, and by writing =OFFSET($A$1:$A$1,0,0,1,COUNTA($A$1:$EV$1)-1) in J9 I get all the titles of the columns filled from left to right, and by writing =OFFSET($A$1,1,0,COUNTA($A:$A)-1) in J11 I get the rows of the first column filled from top to bottom. They can also be combined, by writing OFFSET(Days,1,0,COUNTA($A:$A)-1,COUNTA(Days)), where "Days" is =OFFSET($A$1:$A$1,0,0,1,COUNTA($A$1:$EV$1)-1) (in a named range for readability) or OFFSET($A$1:$A$1,0,0,1,COUNTA($A$1:$EV$1)-1) without using a named range
As a thought, though I'm not sure how to implement it, maybe this could somehow be used in some form to get the column reference for the horizontal part in combination with =AVERAGE(OFFSET($A$1,1,0,COUNTA($A:$A)-1))
=MID(ADDRESS(ROW(),COLUMN()),2,SEARCH("$",ADDRESS(ROW(),COLUMN()),2)-2)
..found at https://superuser.com/questions/1259506/formula-to-return-just-the-column-letter-in-excel/1259507
Now, based on your explanation, here is the screenshot of my test:
Section A1:Exxx
I have converted that section into a Table, called «TblData», which has numerous advantages:
It expands automatically without any additional efforts/formula
We can identify Data by its Columns attributed automatically by the Table [#1], [#2],[#3], [#4], [#5]
Section J9:N9
As a replica of the table name, I have used the following formula to retrieve it:
=INDEX(TblData[#Headers],1,COLUMN(A1)) '<--- This is for J9
=INDEX(TblData[#Headers],1,COLUMN(E1)) '<--- This is for N9
Section J11:Nxx
As a replica of the Table Content, I have used the following formula to populate the content:
=INDEX(TblData,ROW($A1),MATCH(J$9,TblData[#Headers],0)) '<--- This is on J11
=INDEX(TblData,ROW($A3),MATCH(N$9,TblData[#Headers],0)) '<--- This is on N13
Section J10:N10
Now this is the interesting part of the Average, so here is the formula I used for it:
=AVERAGE(TblData[1]) '<--- This is on J10
=AVERAGE(TblData[5]) '<--- This is on N10
NB: (1) Instead of using the Content below J10:N10, I prefer to reuse the Table as it expands automatically as more rows are added.
(2) Unless it is really necessary, I feel it is a double work as well to replicate again A1:Exxx from J9:Nxxx, because you can use the Table for whatever you need, with less maintenance.
Kindly find attached the file as well after I updated those items:
File Link: https://drive.google.com/open?id=1wRbpUxg0XLpfGqdvMF4fNKXDrL7xPPWs
We can correspond more below for further info. Hoping this helps you stretch your competence further :)
Sorry, mate, I can't figure out what you want to calculate. If it makes sense to add J9+J11 then you could just concatenate the two formulas in J9 and J11 with a plus sign. After much deliberation I decided to assume that your question is not one of formula but of formula-writing - "referencing" for short. Therefore I prepared this answer for you, hoping that it will prove helpful.
Building on your named range Days I suggest you create a dynamic named range Data with this formula.
[Data] =OFFSET(Sheet1!$A$1,0,0,COUNTA(Sheet1!$A:$A),COUNTA(Sheet1!$1:$1))
The range thus defined is dynamic in both directions. However, bearing in mind that OFFSET is volatile (slows down your worksheet) you may like to keep its use limited to this one formula and perhaps start the range at A2, but I shall tempt you to break the rule. Now you can use the INDEX function to refer to the Data range.
= INDEX(Data, [Row number], [Column number])
defines a single cell. But by setting either column or row to zero you can define an entire column or row. =INDEX(Data,0,1) defines column 1 of the Data range, =INDEX(Data,1,0) defines its first row.
=INDEX(OFFSET(Data,1,0),0,1) defines the first column of a range moved down by one row from its original position. I recommend the alternative and start the Data range from A2 and perhaps declare another range for the first row if needed.
=AVERAGE(INDEX(Data,0,1)) would draw the same average you already have in your sheet, provided that Data was defined starting at A2. For fun's sake, =AVERAGE(INDEX(OFFSET(Data,1,0),0,1)) would do the same without the change in the range's definition.
=COLUMN() returns the number of the column this formula resides in. So, you could enter =COLUMN()-6 in column G, copy to the right and get a count starting from 1. (You can do the same vertically with the ROW() function.) Applied to your formula, =AVERAGE(INDEX(Data,0,COLUMN()-6)) would return the average from column 1 if entered in column G, and from columns 2, 3, 4, etc. as copied to the right.
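To make the "zero means the whole row or column" behaviour concrete, here is a rough Python sketch of INDEX applied to a small grid. The function and the numbers are invented for illustration; the grid stands for a Data range that starts at A2, i.e. without the title row.

```python
def index(data, row, col):
    """Mimic INDEX(Data, row, col) with 1-based positions; 0 selects a whole row or column."""
    if row == 0 and col == 0:
        return data
    if row == 0:
        return [r[col - 1] for r in data]   # entire column
    if col == 0:
        return data[row - 1]                # entire row
    return data[row - 1][col - 1]           # single cell


# Hypothetical values only.
data = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Same idea as copying =AVERAGE(INDEX(Data,0,COLUMN()-6)) to the right.
averages = [sum(index(data, 0, c)) / len(data) for c in range(1, 4)]
print(averages)  # [4.0, 5.0, 6.0]
```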
As I said, I don't understand enough of your request to bring this idea to a conclusion but I think that using the method described above will provide you with a tool to copy formulas into the table your sample has at its right. If you would elaborate on your requirement I might be able to assist more.

VLookUp with Nested SUMIFS

I have been trying to make a form for some of my team members who are not that computer literate, and I essentially want to make it click and go. I thought I could do it...but alas I am not as good with nesting functions as I thought I was.
I have this spreadsheet where I want to put data into the yellow cell. On the next sheet I have the below table. What I want to do is use a formula to fill H4 with the "Request Branch's" Account Number. Now, I have currently filled the cells with information. They, in fact, have drop down options - which are pulled from the Account List table. As a result the value in H4 will continually change based on the needs of the user - but must be within the confines of the Account List Table.
What I have tried is here and here. I keep getting a result of #VALUE! or #N/A. I can't figure out what I am doing wrong. I know that I need to nest the SUMIFS within VLookUp, but I am not sure why it won't work.
I'm providing you with two possible solutions.
1) The first one uses the SUMPRODUCT function. You may not have seen this kind of notation before.
When comparisons on ranges are multiplied by each other like so (B2:B8=G3)*(C2:C8=G4), each comparison produces an array of TRUE/FALSE values, and multiplying them turns those into ones and zeroes. If you highlighted this part of the formula and pressed F9 it would look like this: {0;0;0;0;1;0;0}. This is an array with a 1 where both criteria are met, so our Branch is "A" and our Carrier is "F". In the rest of the cases either or both are false, resulting in zeroes.
Now if you multiply this array by the range with account numbers, the only number remaining will be the one multiplied by 1, and so you have the answer. Keep in mind, however, that because you are multiplying, the function will fail if the account is not a number!
2) This is why we have a second method using =INDEX() and =MATCH() functions.
To overly simplify this - the INDEX function grabs contents from an array at a specified position (row and column), while the MATCH function gets the position of an item in an array.
The idea with using ranges as multiple criteria is the same as in the first example, however this time when we get our array of zeroes and ones {0;0;0;0;1;0;0} we use the match function to find at which position our criteria cross (as seen on the screenshot it's the 5th position, as it's in the 5th row of the entire column D, the match function searches the {0;0;0;0;1;0;0} array for a 1 and returns its position in the array) and so this is our ROW.
Knowing the position of the contents we searched for we use the INDEX function to grab the contents of the cell in that position so =INDEX(D2:D8,MATCH(1, INDEX((B2:B8=G3)*(C2:C8=G4),0),0)) is actually =INDEX(D2:D8, 5) meaning that the INDEX function grabs contents of the 5th row from the range D2:D8 which is cell D6.
The green boxes are just there to show the instance where both of our criteria are met (cross).
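Both methods can be sketched in a few lines of Python to show why they agree on numeric accounts and why only the second copes with text. The branch, carrier and account values below are invented so that the flags come out as {0;0;0;0;1;0;0}; they are not the values from the screenshot.

```python
def match_flags(branches, carriers, branch, carrier):
    """(B2:B8=G3)*(C2:C8=G4): 1 where both criteria are met, otherwise 0."""
    return [int(b == branch and c == carrier) for b, c in zip(branches, carriers)]


def lookup_sumproduct(flags, accounts):
    """Method 1: multiply flags by accounts and sum; only valid for numeric accounts."""
    return sum(f * a for f, a in zip(flags, accounts))


def lookup_index_match(flags, accounts):
    """Method 2: MATCH(1, flags, 0) gives the position, INDEX reads that position."""
    position = flags.index(1) + 1      # raises ValueError when nothing matches
    return accounts[position - 1]


# Hypothetical stand-ins for B2:B8, C2:C8 and D2:D8.
branches = ["A", "A", "B", "B", "A", "B", "C"]
carriers = ["D", "E", "D", "F", "F", "E", "F"]
numeric_accounts = [1001, 1002, 1003, 1004, 1005, 1006, 1007]
text_accounts = ["ACC-1", "ACC-2", "ACC-3", "ACC-4", "ACC-5", "ACC-6", "ACC-7"]

flags = match_flags(branches, carriers, "A", "F")
print(flags)                                       # [0, 0, 0, 0, 1, 0, 0]
print(lookup_sumproduct(flags, numeric_accounts))  # 1005
print(lookup_index_match(flags, text_accounts))    # ACC-5
```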
Please try this formula.
=INDEX(Table1[Account],SUMPRODUCT((Table1[Branch]=F$3)*(Table1[Carrier]=F$4)*ROW(Table1[Account]))-1)
Note that you may have to adjust the final -1. This is because the indexed table starts in row #2 and the SUMPRODUCT function returns a sheet row number. If your table would start in a different row the difference between the sheet row and table row would be larger. It must be adjusted here. Or you might work with sheet references (named ranges) and require no adjustment at all.

I want to use sumproduct with two different tables based on selection

I am working on a statistical model where we use sumproduct to generate forecast values by multiplying coefficients in one table with variables in another. Right now it is being done manually and that is taking time. I would like to automate it but I'm not able to figure this out.
We are using concatenate to identify different rows to use for vlookup. The variable columns are the same in number for both tables. I need to multiply each variable cell respectively in both tables and sum them, hence sumproduct.
this is what I am trying to do
Forecast model 1 sales for product A in phones in USA = sumproduct([variables by year from table 1 for USA for phones], [Variables for USA phone product A model 1 from table 2] )
I hope someone can help me.
Proof of Concept
You will need to update the references to suit your spreadsheet table locations.
In cell E21 use the following and copy right and down as required:
=SUMPRODUCT(INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0),INDEX($F$15:$H$18,MATCH($A21&$C21&$D21&MID(E$20,16,1),$A$15:$A$18,0),0))
This process was simplified because you had a unique ID tag on each of the previous two tables that could be built from the information in the third table. If you ever get into double digit forecast models the MID() function part of the formula will need to be modified. The 16 in the mid function refers to the character location of the number in the forecast model sales header name in Table 3. As such you either need to keep that header format exactly the same or modify the position of the number in the MID() function.
UPDATE 1
Explanation of Formulas
The following formulas were used in this solution:
SUMPRODUCT
INDEX
MATCH
MID
Concatenate
I will start with the assumption that you already understand sumproduct() as you were already using it before you ran into your problem. One thing to note about sumproduct is that it causes array-like calculation to occur on the portion within its brackets. In this case we fed it two ranges of equal size. The difficult part was more an issue of determining those ranges.
Using your ID columns as a lookup row we used the match() function to determine which row to use. For the first set of variables we used the following to determine which row to look in:
=MATCH($B21&$A21&$C21,$A$3:$A$12,0)
Match is made up of three arguments inside the brackets:
MATCH(what to look for, where to look, type of match)
What we need to look for in Table 1 is a concatenation of various cells in Table 3 that builds the ID used in Table 1. It could have been written using the full formula:
=CONCATENATE($B21,$A21,$C21)
but the short form using & was used instead:
=$B21&$A21&$C21
Once we had what to look for we needed the range of where to look and supplied the ID column from table 1:
$A$3:$A$12
This now leaves the third and final argument of what type of search to perform. An exact match seemed to be the most appropriate match to perform so the value of 0 was supplied. What match returns is the row within the supplied range. It is relative to the range supplied and not the actual row in the spreadsheet. If it cannot make a match it will return an error instead of a row number.
Now that we know what row we want, we can use this information with the INDEX() function. The INDEX() function is made up of 3 arguments as well with the third argument being optional depending on if a 1D or 2D range is being indexed:
INDEX(Range to work with, 2D Row or 1D Position reference, 2D Column reference)
In the case we are dealing with for the first table, the range to work with was your list of variables:
$G$3:$I$12
This is a 2D range. As such we need to tell INDEX() both what Row to look in as well as which Columns to look in. For the row to look in, we used the previously discussed MATCH() function. Since we want all columns and not just a specific column we use the value of 0. If Match returns an error, or if a number greater than the number of rows or columns selected is supplied, INDEX() will return an error. Based on the information discussed, the index function would look like:
=INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0)
You can try entering the above in a cell but it will give you an error. If you select three adjacent cells in the same row and use CONTROL+SHIFT+ENTER when entering the formula, Excel will add {} around the formula, making it an array formula, and it should show you the three variables being used.
The same process as described above can be used for determining the second range of variables from Table 2. The only difference here is that the forecast model number was not in a column of its own but instead in the header row surrounded by text. As such the MID() function needed to be used to go into the header row, bypass the surrounding text and pull the model number out so it could be used as part of the concatenation used for the "what to look for" in MATCH():
=MID(E$20,16,1)
The MID() function again works with three arguments:
MID(Text to look in, which character to start at, how many characters to pull)
So in this case we are looking at the header in E20. Note the lock $ on the row number so the formula is always looking in row 20 no matter how far down it gets copied. It is then going to the 16th character. In this case the character "1" and pulling 1 character. If the header had just been 1 and 2, there would be no need for the MID function and the cell (with proper lock) could have been used.
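The whole lookup can be summarised in a short Python sketch. Everything concrete in it is an assumption made for illustration: the ID strings, the order in which the parts are joined, the numbers, and the header text "Forecast model 1 sales" (chosen because its 16th character is the model number, which is what MID(E$20,16,1) relies on).

```python
def mid(text, start, length):
    """Mimic MID(text, start, length), where start is 1-based."""
    return text[start - 1:start - 1 + length]


def sumproduct(xs, ys):
    """Multiply the two rows cell by cell and add up the products."""
    return sum(x * y for x, y in zip(xs, ys))


# Hypothetical Table 1 (variables) and Table 2 (coefficients), keyed by their ID column.
variables = {"2020USAPhones": [2.0, 3.0, 4.0]}
coefficients = {
    "USAPhonesA1": [0.5, 1.0, 2.0],
    "USAPhonesA2": [1.0, 1.0, 1.0],
}


def forecast(year, country, category, product, header):
    model = mid(header, 16, 1)                                  # model number from the header
    row1 = variables[year + country + category]                 # INDEX/MATCH on Table 1
    row2 = coefficients[country + category + product + model]   # INDEX/MATCH on Table 2
    return sumproduct(row1, row2)


result_model_1 = forecast("2020", "USA", "Phones", "A", "Forecast model 1 sales")
result_model_2 = forecast("2020", "USA", "Phones", "A", "Forecast model 2 sales")
print(result_model_1, result_model_2)  # 12.0 9.0
```

A missing ID raises KeyError here, which corresponds to MATCH returning an error in the worksheet.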

Obtain group average for numerous groups that vary in size

I get the feeling this is close to impossible, but here goes. I have data structured like this:
Test Group 1
pass
fail
pass
fall
Test Group 2
pass
fail
Test Group 3
pass
fail
fail
pass
I want to be able to paste it into Excel and have Excel summarise the percentage of each test group. So it would end up looking like this:
Test Group 1 Percentage Pass: 50%
pass
fail
pass
fall
Test Group 2 Percentage Pass: 50%
pass
fail
Test Group 3 Percentage Pass: 50%
pass
fail
fail
pass
BUT as you can see the test group length is not set, and it may vary. I was hoping that I might be able to create a formula that follows this logic:
if( A<n> contains "Test")
count A<n> +1 until A<n> contains "Test"
It seems like I'm asking a lot from Excel formulas. I have spent some of today writing a small C# app that will split the configs into separate files so that I could copy and paste separate files into Excel. But it would be great to have fewer steps!
--- UPDATE ----
There were three very interesting approaches to this problem proving nothing is impossible :p
I wanted a solution that allowed me to copy and paste results in and see a load of percentages pop up, so I was always sort of looking for a formula solution. HOWEVER, please take the time to look at pnuts' and Jerry's answers as they reveal some useful features of Excel!
chuff's answer was the one I was looking for and it worked out of the box. For anyone who wants to delve deep into how it works and why, I broke the formula down into steps and filled in some help information. The key to these formulas is combining MATCH, OFFSET and the slightly more obvious SEARCH / FIND / LEFT (I used to use an IFERROR(FIND(...)) type approach; LEFT seems cleaner :) )
Do look at the documentation for these formulas but to see it all in action with some examples see the Google Spreadsheets I created detailing chuffs answer:
https://docs.google.com/spreadsheet/ccc?key=0AqODI11eAjtldDhDd2dBcFhpZW9SXzEybGtMUWMwM3c#gid=0
---P.S---
For the record I did actually create a C++ program to prettify my data and ouput it as .csv files. If I had had this info I wouldn't have bothered but am glad I tried both routes, it was a fun learning adventure.
A single, array-formula solution to this problem is possible. It requires only that a stop value "Test" be added at the bottom of your column of data.
Assuming your data are in the range A2:A18, here is the formula that would compute the average in cell B2:
=IF(LEFT(A1,4)="Test",
COUNTIF(OFFSET(A1,1,0,MATCH("Test*",$A2:$A$18,0)-1,1),"pass")
/ COUNTA(OFFSET(A1,1,0,MATCH("Test*",$A2:$A$18,0)-1,1)),
"")
The key parts of the formula are expressions that calculate the ranges for the two count functions, COUNTIF and COUNTA.
The COUNTIF function - which counts the number of passes in a group - takes two arguments, the range for the count and the condition to be met by cells counted.
I use the OFFSET function to provide the count range. OFFSET takes five arguments:
An anchor cell (if a range is provided, the function only uses the top left cell in the range).
The number of rows below (+) or above (-) the anchor cell that the range to be returned will begin.
The number of columns to the right (+) or left (-) of the anchor cell for the start of the return range.
The number of rows to return.
The number of columns to return.
For example, OFFSET(A1,5,2,4,1) will return the range C6:C9 (four rows high and one column wide, starting five rows below and two columns to the right of A1).
In the formula, the anchor cell is A1, the row offset is 1 and the column offset is 0, in other words, the start of the range to return is A2.
The number of rows to return is computed by the MATCH function. For the calculation of the average for the first test group, MATCH looks in the range A2:A18 for the row of the first cell that starts with the word 'Test'. A2 is the cell in the pass/fail column just below the cell that starts the test group.
The last cell in the column is A18, which has been added to the data and contains the word 'Test'. This prevents the formula from returning an error for the last test group. Otherwise, the match would return an error because it could not find another occurrence of Test beyond the start of the final group.
Also note that the anchor points for the match range, i.e., $A2:$A$18, leaves unanchored the row reference to the top of the range (the 2 in A2). That's so the range will adjust downward as the formula is copied down so as to find the next, not the previous, occurrence of 'Test'.
Finally, I decrease the number of return rows by 1 to exclude the 'Test' cell that MATCH found, which belongs to the next group.
The COUNTA function - which counts the total number of passes and fails in a group - uses the same OFFSET/MATCH expression to get the range for the group.
This is an array formula and so must be entered with the Control-Shift-Enter key combination. Copy the formula down to the bottom of the data (excluding the cell with the 'Test' stop value) to calculate the averages for each group.
For convenience, here is the formula without the breaks shown above. It can be directly pasted into the worksheet (remembering to confirm it with Control-Shift-Enter).
=IF(LEFT(A1,4)="Test",COUNTIF(OFFSET(A1,1,0,MATCH("Test*",$A2:$A$18,0)-1,1),"pass")/COUNTA(OFFSET(A1,1,0,MATCH("Test*",$A2:$A$18,0)-1,1)),"")
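To show the mechanics of the formula outside Excel, here is a rough Python equivalent applied to the data from the question (including its "fall" entry) with the "Test" stop value appended. The function name is made up, and returning a blank for an empty group is my own assumption; the worksheet formula would show #DIV/0! in that case.

```python
def pass_rate_at(cells, i):
    """Mimic the formula for the row at index i; cells must end with the stop value."""
    # IF(LEFT(A1,4)="Test", ..., ""): only header rows get a result
    if cells[i][:4].lower() != "test":
        return ""
    below = cells[i + 1:]
    # MATCH("Test*", below, 0): 1-based position of the next cell starting with Test
    next_test = next(pos for pos, cell in enumerate(below, start=1)
                     if cell.lower().startswith("test"))
    # OFFSET(A1,1,0,next_test-1,1): the rows between this header and the next one
    group = below[:next_test - 1]
    total = sum(1 for cell in group if cell != "")            # COUNTA
    passed = sum(1 for cell in group if cell.lower() == "pass")  # COUNTIF(...,"pass")
    return passed / total if total else ""


cells = [
    "Test Group 1", "pass", "fail", "pass", "fall",
    "Test Group 2", "pass", "fail",
    "Test Group 3", "pass", "fail", "fail", "pass",
    "Test",  # stop value added below the data
]

# Copy the formula down to every row except the stop value.
results = [pass_rate_at(cells, i) for i in range(len(cells) - 1)]
print(results[0], results[5], results[8])  # 0.5 0.5 0.5
```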
Basically Subtotal will meet the requirement, provided some fairly tedious layout adjustments are acceptable.
Preparation:
Assuming Test Group 1 is in A1, insert Row1 and ColumnA, with in A2 =IF(LEFT(B2,1)="T",B2,"") and in C2 =IFERROR(MATCH(B2,{"fail","pass"},0)-1,"") and both formulae copied down as required. (And change fall to fail in source!)
Subtotal:
Select A:C, Data > Outline – Subtotal, At each change in: (1) Test Group 1, Use function: Average, Add subtotal to: check (Column C), uncheck Summary below data, OK.
Tidy up:
1. Move A3:C3 to A1 and A2:C2 to A3.
2. Filter A:C and for ColumnB, Text Filters, Contains add Test, OK.
3. In C3 put ="Percentage Pass: "&C4*100&"%" and copy down to last row numbered in blue (should show #DIV/0!).
4. Highlight A:C for all the rows numbered in blue and embolden.
5. Hide ColumnA and, optionally, hide or delete Rows1:2.
6. Take off filter selection.
Hopefully with preparation as on the left the result would be similar to as shown on the right:
Example re comment to @Jerry's answer:
There are many ways to tackle this -- one possible easy solution is to add a column next to your pass/fail column that says (assuming column A):
=IF(A1="pass",100,0)
if you do that, then the average of those values would be equal to the percentage pass.
