Excel multiple running totals - excel-formula

I'm trying to make a simple formula for multiple running totals.
Basically, it's for recording transactions for different accounts, showing the running total of that particular account on each row. So it's not possible to use a simple SCAN+LAMBDA. One way to do it is to have a set of helper arrays somewhere, but here I'm using another way, with XLOOKUP.
=C2+XLOOKUP(D2,D$1:OFFSET(D2,-1,),E$1:OFFSET(E2,-1,),0,,-1)
Basically, it looks up the last balance of the corresponding account above the current row and adds the current transaction amount to it. It works by dragging the formula down to all the transaction rows.
Since the number of transactions is over 10 thousand, I was trying to minimize the file size by using a named function with LAMBDA.
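To make the lookup logic concrete, here is a minimal Python sketch of the same idea (not a spreadsheet formula). It assumes each row is an (amount, account) pair in sheet order; the function name and sample accounts are made up for illustration.

```python
def running_balances(transactions):
    """Return the per-account running balance for each row.

    transactions: list of (amount, account) tuples in row order.
    """
    last_balance = {}  # most recent balance seen for each account
    balances = []
    for amount, account in transactions:
        # Same idea as the XLOOKUP: take the last balance of this account
        # found above the current row (0 if none) and add the amount.
        balance = last_balance.get(account, 0) + amount
        last_balance[account] = balance
        balances.append(balance)
    return balances


rows = [(100, "cash"), (50, "bank"), (-30, "cash"), (25, "bank")]
print(running_balances(rows))  # [100, 50, 70, 75]
```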
Name: AddtoBalance
=LAMBDA(c,c+XLOOKUP(OFFSET(c,,1),Sheet1!E$1:OFFSET(c,-1,1),Sheet1!F$1:OFFSET(c,-1,2),0,,-1))
And changed cell E2 to
=AddtoBalance(C2)
and dragged it down to all transaction rows.
However, after saving and re-opening, the cells show errors. I have to go to Name Manager, click Edit, change nothing, and close it. Then the cells become fine again. It seems to me that when a workbook is re-opened, the formulas are not calculated sequentially from top to bottom. Is that right? Are there any options to change it?

I think you are going to hate me when you see this…
Put this in E2:
=SUMIF( B$2:B2, B2, C$2:C2 )
Then copy it down. Mind the dollar signs, they are important. You could place this in a named Lambda but the character count reduction is probably immaterial.
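As a rough illustration of what the expanding SUMIF range does, here is a small Python sketch. It assumes two parallel lists, one of accounts and one of amounts (the names are hypothetical). Each row re-scans everything from the first row down to itself, which is why the work grows with the square of the row count.

```python
def sumif_running(accounts, amounts):
    """For row i, sum the amounts in rows 0..i whose account matches row i."""
    totals = []
    for i, current in enumerate(accounts):
        # Mirrors SUMIF(B$2:B2, B2, C$2:C2): the range grows by one row
        # each time the formula is copied down.
        total = sum(
            amount
            for account, amount in zip(accounts[: i + 1], amounts[: i + 1])
            if account == current
        )
        totals.append(total)
    return totals


print(sumif_running(["cash", "bank", "cash", "bank"], [100, 50, -30, 25]))
# [100, 50, 70, 75]
```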

After rebooting and restarting Excel, I actually could not reproduce the error. It works fine with the Lambda function now.

Whole different approach, but how about:
=LET(f,ISNUMBER(C:C),
c,FILTER(C:C,f),
d,FILTER(D:D,f),
s,SEQUENCE(COUNTA(d)),
MMULT(
(d=TRANSPOSE(d))*(s>=TRANSPOSE(s)),
c))
It first creates an array of TRUEs and FALSEs marking which cells in column C:C contain a number. This is used to filter the values to be used from columns C and D.
A sequence as long as the count of filtered values is used to simulate each value's row number in that filtered range.
MMULT then sums, row by row, the values of column C for which column D equals the current row's value and the sequence number is smaller than or equal to the current one.
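The mask-and-multiply idea can be sketched in plain Python as follows. This is only an illustration of the logic, assuming the filtered columns are already available as two lists; the function name is made up.

```python
def mmult_running(accounts, amounts):
    """Running total per account via a 0/1 mask matrix times the amounts."""
    n = len(accounts)
    # mask[i][j] is 1 when row j has the same account as row i and is not
    # below it, i.e. (d = TRANSPOSE(d)) * (s >= TRANSPOSE(s)).
    mask = [
        [1 if accounts[i] == accounts[j] and i >= j else 0 for j in range(n)]
        for i in range(n)
    ]
    # Matrix-vector product: row i sums the amounts selected by mask[i].
    return [sum(mask[i][j] * amounts[j] for j in range(n)) for i in range(n)]


print(mmult_running(["cash", "bank", "cash", "bank"], [100, 50, -30, 25]))
# [100, 50, 70, 75]
```

Note that the mask has n x n entries, so memory use grows quickly with the number of transactions.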

Related

How to Mimic Excel Tables Equations / New Row Behavior in Google Sheets

For those of us used to Microsoft Excel who switch to using Google Sheets, there are many differences which need to be taken into account.
One of the nice features in Excel that I miss is tables. If you insert a table into your Excel Spreadsheet, it does a lot of automatic things for you. You can have a single formula for one column of your table, and not have to update it whenever you add new data - whether adding a table row, or adding a row in the middle of the table.
Sometimes (though I haven't figured out why it sometimes does and why it sometimes doesn't) even without tables, Excel will suggest a formula fill as you're entering new data into a row, making copying the formula as easy as pressing "Tab".
There is no functionality in Google Sheets that matches this exactly. When you have a lot of data to enter, having to copy the formulas every single time you add a row is very tedious and time consuming and further leaves open the possibility in making a mistake when transcribing the information and copying / pasting the formulas. Any single cell could have a mistake and you won't know until it causes a problem later, then troubleshooting it will also be time consuming and difficult.
There are various questions in StackOverflow, StackExchange, Google Support and other sites that tackle this issue, but none seem to have a good solution that works for everyone. A lot of people have written an Apps Script to do just this, or use Apps Script + HTML forms as well... but it seems like that shouldn't be necessary, it adds more time & setup, and ends up with a specific solution for that sheet and that sheet only.
So, how can you replicate this behavior in Google Sheets so you don't have to keep copying & pasting your formulas over and over again and save yourself time (and your company money) and make Google Sheets act more like Excel?
BACKGROUND
There is a Google Support Thread on Inserting new Rows which suggests the use of ARRAYFORMULA to do this job. It is not an exact replacement for Excel's functionality, but it can work in most applications. There are other functions that output arrays, such as SEQUENCE which can also be applied similar to these examples depending on the situation, but I'll focus on ARRAYFORMULA here as it's the most generic and MOST functions can be wrapped in it and otherwise behave as you'd expect.
Here is also a link to an ARRAYFORMULA & MMULT Example Provided by Google (Note that this link will make a copy of the sheet, not let you directly access the example). The first tab is all about matrix multiplication, the second and later tabs have examples using ARRAYFORMULA.
The examples above are pretty limited in scope, so let's expand on those. To illustrate, I will use a basic formula involving 4 columns as an example. Let's say we have data in columns A, B, and C, and we want to do a relatively simple formula between them. Let's assume row 1 is being used as a header row, and your data is from row 2 down, as most people would do. Let's make the formula simple, but a little interesting, by having column D equal to the PREVIOUS value of A plus the product of B and C. Let's also assume we currently have 12 rows of data, but we know we will have data we need to enter in the future. Most of that data will get entered at the end, but sometimes we may need to add data in the middle of the range.
You can follow along with my Publicly Posted Example Sheet Here if you want (this will also create a copy on your drive so you can make changes and follow along). Each example below corresponds to a tab in the Example Sheet.
EXAMPLE: FORMULA ON EVERY ROW
In its simplest form, the formula in D2 would be = A1 + B2 * C2. Except, of course, we know A1 is a text header and if we include that we'll get an error. It's also commonly understood that absolute references (with $) execute faster in Google Sheets, and we don't need relative references on the columns (but relative rows are necessary to fill), so let's modify cell D2 as follows:
=IF( ISNUMBER($A1), $A1, 0 ) + $B2 * $C2
Then fill down to cell D13 (this is already done in the example).
So now you have your current data... but what if you need to add data?
If you add data to row 14, in columns A, B, and C, you then also have to copy the formula to D14. Easy peazy for this example, but what if you have 30 columns, 5 of them with formulas and you add another 10 entries to the list every day? This becomes very tedious. You can avoid entering it for every row by filling down the number of rows you need today and save a little time, but it breaks your flow of data entry.
Even worse, what if the entries are in some sort of order (e.g. order of date data was captured) and you get old data that needs to be entered in the middle of the range? You can add at the end and copy, then sort.
Some sheets won't let you sort, or won't sort correctly if you have certain data, so you may need to insert in the middle... let's say between rows 8 and 9. If you did this in an Excel table, and used "insert row" it would automatically populate cell D9 with your formula.
But here, when you add this new row 9, not only is D9 blank and in need of the formula, but now the A column reference in cell D10 is pointing to A8 instead of A9 where it should! So you have to recopy / refill your equation to cell D10 as well - and this is easy to miss - you may not know to do it, or forget to do it, and now your formulas are broken.
... Now, to be honest, Excel didn't get this part right, either. Somehow, it properly fills D9 in with the correct formula but botches D10 with a reference to A8, but then continues with a correct reference to A10 in D11. Which is almost worse because since D9 was filled and all the other rows are correct, you may not realize you have a problem in D10...
This is basic spreadsheet use and is roughly the same behavior as using Excel WITHOUT Tables (except those instances where it decides to make suggestions for you) - so par for the course here if Excel didn't have the Table or suggestion ability.
Pros:
Simplest formula to implement
Works fine in fixed size sheets or sheets that don't change often
Tried and true, Will always work
Cons:
Have to copy formula to every new row you make
Very tedious for "living documents" that change often
If any formulas cross between rows, the pattern breaks when you insert a row in between and you have to copy your formula to the row below as well as your new one
With all the additional required repeated actions, it's very easy to make a mistake
Since the mistake could be in a single cell, finding the mistake after the fact can be difficult
EXAMPLE: CLOSED RANGE ARRAYFORMULA
Google support touts this as the best method. Indeed, if you want your formulas to update automatically when you add data in between and you want the least amount of computation time, then an ARRAYFORMULA with a limited (or "closed") range is the best solution.
To use ARRAYFORMULA, you put the formulas only in your top row of data (in this example, row 2). What makes this example closed range is that we will set it to cover exactly the data we have. So, the formula in D2 would be:
=ARRAYFORMULA( IF( ISNUMBER( $A$1:$A$12 ), $A$1:$A$12, 0 ) + $B$2:$B$13 * $C$2:$C$13 )
Here, we can (and I recommend) use all absolute references as the range we're using doesn't change as the cell row it's calculating changes. When you enter this formula, you will see it automatically populate D3 through D13 with the correct data as well.
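For anyone who finds the offset ranges hard to picture, here is a small Python sketch of what the closed-range formula computes. It assumes the three ranges are passed in as equal-length lists, with the "previous A" list starting at the header cell; all names are hypothetical, and the length check only mimics the idea that mismatched array sizes cannot be added.

```python
def is_number(value):
    """Rough stand-in for ISNUMBER (booleans are not counted as numbers)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def closed_range_column(a_prev, b, c):
    """Compute previous-row A (or 0 if not numeric) plus B * C, row by row.

    a_prev plays the role of $A$1:$A$12, b of $B$2:$B$13, c of $C$2:$C$13.
    """
    if not (len(a_prev) == len(b) == len(c)):
        raise ValueError("array arguments are of different size")
    return [(a if is_number(a) else 0) + x * y for a, x, y in zip(a_prev, b, c)]


a_prev = ["Header A", 1, 2, 3]  # starts one row higher than b and c
b = [10, 20, 30, 40]
c = [1, 2, 1, 2]
print(closed_range_column(a_prev, b, c))  # [10, 41, 32, 83]
```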
If we want to add another row in the middle, it's easy. Taking the previous example, if we add a row between rows 8 and 9, you will see the formula in D2 has changed all the last rows - 12 is now 13, and 13 is now 14. When you enter data into columns A, B, and C in the new row 9, it automatically calculates correctly in D9.
When you look at the data in rows in column D (except D2), however, it shows the number itself in the formula bar - so someone looking at this sheet unaware there is an ARRAYFORMULA in use has no indication that it's an ARRAYFORMULA, and overwriting ANY cell that was populated by ARRAYFORMULA will break the formula, give you an error in D2 and leave the rest of the values in the column blank. This is true for all methods using ARRAYFORMULA. So, for that reason, I recommend you make your column a protected range!
Alternate: You could name all of your ranges. For example, $A$1:$A$12 could be col_A_prev, $B$2:$B$13 could be col_B, and $C$2:$C$13 could be col_C. Which gives the formula:
=ARRAYFORMULA( IF( ISNUMBER( col_A_prev ), col_A_prev, 0 ) + col_B * col_C )
The behavior would be identical. When you add a row in between, the named ranges will automatically expand to include it. You could also use the same ranges for your column protection to ensure no data is written over.
Note: I do want to give kudos where it is due. Google Sheets handles named ranges WAY better than Excel. When you add or remove rows / columns inside your named range in Google it automatically expands the range - and Google actually allows you to use the named ranges as references in any of the settings (conditional formatting, protection, etc.). While you can enter a named range in Excel for some of these, it will convert it to R/C references which won't change even if your range changes later. If you want to add to the ends or you move rows / columns in your named range - well, they're both still terrible at that
However, if we want to add new data to the end, in row 14 or after, this arrayformula will not automatically update.
Even worse, if you add a row between rows 12 and 13, it breaks the formula - as the references to columns B and C will update, but the references to column A will not - because A only went to row 12. In row 14 you now get the error:
Array arguments to ADD are of different size.
Because you're trying to add an array with 12 elements to an array with 13 elements. Admittedly, this is only a problem if you're referencing other rows which isn't that common across all useful spreadsheets. However, there are many practical reasons to do so, like cumulative sums.
So, either you have to deal with updating your ARRAYFORMULA columns each time you add data to the end (which doesn't make it much better than just copying your formulas to each row) or, you could basically make the last two rows "dummy rows" that you don't care about and add protection to those rows so they can't be edited or a row added between them, with perhaps a note saying "To add new data, insert a row above this line" so other people using it know what they have to do.
Pros:
Relatively simple formula to implement
Fastest Execution time
Will automatically adjust formula to any rows added in the middle
Can manage your ranges as named ranges
Cons:
Have to change the formula if you add any new data to the bottom (which is where you usually add new data) -OR- you have to implement one or more blank rows included in range with protection & reminders to ensure no one adds data to the bottom
Data below ARRAYFORMULA looks like just number entries and could easily confuse people into thinking it's not a formula entry and overwrite it without thinking.
EXAMPLE: OPEN RANGE ARRAYFORMULA
If you're following along in the example sheet, the first thing you'll note is this sheet doesn't do the same thing. It is simply using the CURRENT value in column A, rather than the previous row. This is because you CAN'T reference a previous row with this method (see a couple paragraphs down for why). To compensate, I forced A, B, and C to 0 in the first row and added another row to the bottom.
This is similar to the closed range example in its application of ARRAYFORMULA. The difference here is that instead of having a fixed end to the ranges (rows 12 & 13 above), you leave the range open by using just the column letter at the end of the range, which references the last row of the column. So the equation in D2 now looks like this:
=ARRAYFORMULA( IF( ISNUMBER( $A$2:$A ), $A$2:$A, 0 ) + $B$2:$B * $C$2:$C )
The reason you can't reference a previous row's cell is that if we used $A$1:$A here, that array would always have one more element than either $B$2:$B or $C$2:$C and thus won't be able to add and will result in the error:
Result was not automatically expanded, please insert more rows (1).
Except inserting more rows won't work because the ranges will all expand by 1 also. Again, this is only a problem if you need to reference other rows which isn't common but is useful for things like cumulative sums.
When it comes to adding rows, though, this method is the best. Whether you are adding to the middle or the end of your data, it will automatically update the values in your ARRAYFORMULA columns.
Alternate: Same as with closed ranges, you could name all of your ranges. For example, $A$1:$A could be col_A_prev, $B$2:$B could be col_B, and $C$2:$C could be col_C. Which gives the same formula as with closed range:
=ARRAYFORMULA( IF( ISNUMBER( col_A_prev ), col_A_prev, 0 ) + col_B * col_C )
So if you're not referencing previous rows, or if you just add a top "dummy" row like I did in the example, it's all good... easy peazy lemon squeezy, right?
Yes, at least at first. The other problem here is that open ranges are computationally intense for Google Sheets algorithms. As you add more and more rows, especially if you have multiple open range ARRAYFORMULA columns, the sheet calculations get slower and slower and slower. The sheet I was working on that prompted this had 21 columns, 8 of which had ARRAYFORMULA formulas in row 2. At around 200 rows of data (not that much in the world of spreadsheets) it was taking MINUTES to calculate with each and every change I was making. That's simply not useable - I almost went back to copying the formula to each row. (It's possible using named ranges may improve the speed some - I didn't try it on that sheet)
So this solution doesn't really work for big (but not even that big) spreadsheets where you have lots of formulas.
Also, a more minor gripe - you'll notice in the example that every row on the spreadsheet was now populated in column D, even where no data was entered. That's annoying, but not a sheet killer by any means - and you could add an IF statement to the ARRAYFORMULA to just output "" whenever you have no data in one or more data columns.
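The blank-row guard mentioned above can be pictured with a short Python sketch. It is an illustration only, assuming every cell is either a number or an empty string standing in for a blank cell; the function name is made up.

```python
def open_range_column(a, b, c):
    """Compute A + B * C per row, leaving rows with any blank input empty."""
    results = []
    for x, y, z in zip(a, b, c):
        if x == "" or y == "" or z == "":
            # Equivalent of an IF(...) that outputs "" when data is missing.
            results.append("")
        else:
            results.append(x + y * z)
    return results


print(open_range_column([1, 2, ""], [3, 4, ""], [5, 6, ""]))  # [16, 26, '']
```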
Pros:
Relatively simple and straight forward formula to implement
"Works" with any number of rows
Automatically includes any rows that are added - on the end or in between
Can manage with named ranges
Cons:
Cannot reference data from previous rows
Extremely slow - computation time goes up with every added row (& every added column with an open reference)
Data below ARRAYFORMULA looks like just number entries and could easily confuse people into thinking it's not a formula entry and overwrite it without thinking.
EXAMPLE: HYBRID ARRAYFORMULA
Are you ready to give up on Google Sheets yet?
Well, there is one more option. It gets complicated and involved, but IMO works better in most situations than any of the above examples.
What I do here is add a cell with a formula for the number of rows in the sheet that have data in a certain column. Let's just say column A for this example. That formula looks like this:
= ARRAYFORMULA( MAX( IF( LEN($A:$A), ROW($A:$A), ) ) )
This, in and of itself, is an open ranged formula. It scans everything in column A and returns the last row that has SOMETHING in it. But it's one single formula in one cell reporting 1 value - no other cells get populated from it. It's relatively computationally intense for this one cell, but it's just one cell in the entire sheet.
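In code terms, the last-row formula is just "the largest row number whose cell is not empty". A minimal Python sketch, assuming the column is a list whose first element is row 1 and blanks are empty strings (returning 0 for an all-blank column is my assumption here):

```python
def last_used_row(column):
    """Return the 1-based row number of the last non-empty cell, or 0."""
    rows_with_data = [
        row
        for row, value in enumerate(column, start=1)
        if len(str(value)) > 0  # same test as LEN($A:$A)
    ]
    return max(rows_with_data, default=0)


print(last_used_row(["Header", 1, 2, "", 5, "", ""]))  # 5
```

Note that a gap in the middle (row 4 above) does not matter; only the last populated row counts.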
Then, to make sure that any changes you make (adding / removing rows or columns) do not affect any references to that cell, name it. In the example provided, this is named last_example_row.
I also strongly recommend that you add protection to last_example_row so it's not accidentally changed. Extra tip: you can actually set both sets of permissions: "Only You can edit" and "show a warning when editing" so even if you try to edit it accidentally it will give you the chance to cancel the edit.
Since it's not a piece of data you need visually, hiding it is also a good idea (I left it unhid in the example so you can easily see the formula)
Now, in order to use the value in last_example_row as part of our ranges, we have to use the INDIRECT function. We replace every open-ended instance in the previous example with a specific INDIRECT call.
For calls to the same row, for example, we replace with a pattern like this:
$B$2:$B is replaced with $B$2:INDIRECT( "$B$" & last_example_row )
so it ends on the last used row.
For calls to the previous row, we replace with a pattern like this:
$A$1:$A is replaced with $A$1:INDIRECT( "$A$" & ( last_example_row - 1 ) )
so it ends 1 row before the last used row.
So the final equation becomes this monstrosity:
=ARRAYFORMULA( IF( ISNUMBER( $A$1:INDIRECT( "$A$" & ( last_example_row - 1 ) ) ), $A$1:INDIRECT( "$A$" & ( last_example_row - 1 ) ), 0 ) + $B$2:INDIRECT( "$B$" & last_example_row ) * $C$2:INDIRECT( "$C$" & last_example_row ) )
So it's a closed range reference that points to a single open range calculation, and it works. Whether you add data in the middle or to the end, it automatically calculates your column for you - and it only populates rows where your data is also populated.
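The two replacement patterns are easy to get wrong by hand, so here is a small Python sketch that builds the equivalent closed-range strings from a last-row number. It is only meant to show that the "same row" and "previous row" ranges end up the same length; the helper names are hypothetical.

```python
def same_row_range(col, last_row, first_row=2):
    """Range for same-row references, e.g. $B$2:$B$13."""
    return f"${col}${first_row}:${col}${last_row}"


def prev_row_range(col, last_row, first_row=2):
    """Range shifted up one row for previous-row references, e.g. $A$1:$A$12."""
    return f"${col}${first_row - 1}:${col}${last_row - 1}"


def row_count(a1_range):
    """Number of rows covered by a single-column absolute A1 range string."""
    start, end = a1_range.split(":")
    return int(end.split("$")[2]) - int(start.split("$")[2]) + 1


last_example_row = 13  # value the helper cell would report
print(same_row_range("B", last_example_row))  # $B$2:$B$13
print(prev_row_range("A", last_example_row))  # $A$1:$A$12
print(row_count(same_row_range("B", 13)) == row_count(prev_row_range("A", 13)))  # True
```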
Since it only does the open range calculation ONCE, then uses that value in all remaining closed range calculations, this is much, much faster than the open range example above. It IS slower calculating than the first two examples, however - but I haven't yet hit the point in my real sheets where the delay has made it unusable (stay tuned as I add more data to my sheets over time). If anyone reading this has hit that point with this method, please let me know how many columns & rows you got to, including how many of the columns used an ARRAYFORMULA like this.
Unfortunately, however, since this method requires an INDIRECT call, you cannot use named ranges to accomplish this.
Pros:
Most flexible option
"Works" with any number of rows
Automatically includes any rows that are added - on the end or in between
Much Faster than completely open references
Cons:
Formulas are complex, hard-to-follow, and easy to make a mistake while entering
Slower than closed references - computation time still goes up with every added row and every added column with these "hybrid" references
Data below ARRAYFORMULA looks like just number entries and could easily confuse people into thinking it's not a formula entry and overwrite it without thinking.
Cannot manage with named ranges
Epilogue
Maybe (hopefully) someday Google will add a feature that will keep track of your formulas and execute them in a speedy way and this post will be obsolete. Until then, I hope this post helps someone out there.
Additional Note
Using any of the ARRAYFORMULA methods above can break sorting. If you add filters and then sort by A->Z or Z->A on a particular column, and row 2 is no longer row 2, then your ARRAYFORMULA gets moved to whatever row it gets sorted to - and then only applies from that row down. Rows above it will be blank in all your ARRAYFORMULA columns. This is very disappointing to me. One way around it (that I don't like) is you can make row 2 a "dummy" row where whatever columns you may sort by have values that will always make it the top row. That's a pretty ugly solution, though.
You can make it less "ugly" by hiding row 2. Then columns will sort fine and you won't see any of the dummy data ("dummy" data may not even be necessary as the hidden row shouldn't sort with the rest). The caveat here is if you share it with multiple users - they won't even see there is a formula being used, it looks like all manual entries - and if one gets overwritten, it will break the ARRAYFORMULA. So, I would recommend protecting the ARRAYFORMULA columns, as well.

Automate concatenation process

I am stuck on an Excel issue where I want to concatenate columns F through I. The logic: when the benchmark column is blank (A3, for example), the formula needs to keep concatenating columns F through I down the rows until there is a value in the benchmark column (A4, for example). This needs to happen automatically until the next value appears under the benchmark column. Currently I have to keep changing the concatenation range by hand to cover each block. I'd appreciate it if anyone can help me out.
The image below shows how I am doing it manually, which is very time consuming.
You can use the MATCH function (with a wildcard) to find the next non-blank row; and use that in an INDEX function to detect the range to concatenate.
Assuming your data starts in A3 and the lowest possible row is row 1000 (change the 1000's in the formula below if it might be much different):
J2: =IF(A2="","",CONCAT(INDEX(F2:$I$1000,1,0):INDEX(F2:$I$1000,IFERROR(MATCH("*",A3:$A$1000,0),1000-ROW()),0)))
Note: It is possible to also develop solutions using INDIRECT and/or OFFSET. Unfortunately, these functions are volatile, which means they recalculate anytime anything changes on your worksheet. If there are a number of formulas using these functions, worksheet performance will be impaired. INDEX and MATCH are non-volatile (except in ancient versions of Excel - pre-2003 or so)
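To show the intended grouping without the INDEX/MATCH plumbing, here is a Python sketch of the same logic. It assumes the benchmark column is a list of strings (empty string for blank) and the text columns are a list of rows; the function name and sample data are made up.

```python
def concat_blocks(keys, rows):
    """Concatenate each block of rows that starts at a non-blank key.

    keys: benchmark column values ("" means blank).
    rows: one list of cell texts per sheet row (columns F to I in the question).
    """
    results = []
    for i, key in enumerate(keys):
        if key == "":
            results.append("")  # continuation rows show nothing
            continue
        end = i + 1
        # Extend the block until the next non-blank key or the end of data.
        while end < len(keys) and keys[end] == "":
            end += 1
        # Join row by row, left to right, like CONCAT over a 2D range.
        results.append("".join(cell for row in rows[i:end] for cell in row))
    return results


keys = ["x", "", "", "y", ""]
rows = [["a", "b"], ["c", ""], ["d", "e"], ["f", "g"], ["h", ""]]
print(concat_blocks(keys, rows))  # ['abcde', '', '', 'fgh', '']
```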
The OFFSET function would come in handy here. One solution is the formula at the end of this answer, which works in my worksheet.
Cell Q6 just defines the number of rows downwards that the MATCH-function is checking for the next "HEADER1" value. If "HEADER1" is found, the MATCH-function returns how many rows down-1. If no "HEADER1"-value is found within that range, that value is then the number of rows used.
If the first column also has "HEADER2" and so on, you can add the MID-function to both references inside MATCH to limit which part of the string are to be searched for.
I tried to adjust the references properly to fit your sheet, but I may have missed something:
=IF(ISBLANK($B2),"",CONCAT(OFFSET($B2,0,0,IFNA(MATCH(MID($B2,1,6),MID(OFFSET($B2,1,0,$B$1),1,6),0),$B$1),4)))

Can these formulas be simplified? Why does INDIRECT function seem to not work inside an ISBLANK test within a MATCH formula?

Summary
I need an array formula that takes a row of data of certain length from Sheet1. For that row, in each column that is not blank, I need to grab the Sheet1 header value for that column and display that data in a continuous row on Sheet2 (without any spaces in between the row's cells).
Background
I have a table of data (employees and industry certifications with expiration date being the table's cell data) on sheet 1, with a row for each employee the spreadsheet is tracking. The certifications are the columns.
We are using this information to link to ID Badge Printer software (Bodno Silver), where we are limited to linking columns of data to a particular textbox.
The problem lies in the fact that not everyone has every certification. The rows are peppered with blanks separating the certifications that each employee does have. While setting up the required text boxes in the badge software template, that each link to a specific column, I quickly realized that since not everyone has every certification if we used the data how it was we would have a bunch of strange looking blanks in between the listed certifications rather than a continuous list.
What I did
My solution to this (which I'm open to a better one if anyone knows of one, other than "use better software"), was to create a new sheet and array formulas that no one would use except for me and the id printer software. This sheet would have a similar data table that took the rows of data interspersed with blank cells between expiration dates, and put the matching column headers for cells that had a date in them into a continuous row of the same maximum length (eliminating the blank cells).
Essentially, this would allow me to circumvent the restrictions of the badge software and each textbox would be MatchedCert1, MatchedCert2, MatchedCert3, etc. up to the original maximum number of certifications.
Pictures are probably better than my words at explaining what I am going for:
Sheet1 (source)
Sheet2 (result)
The array formulas
I worked on this one for a while. What I thought would be a simple INDEX, MATCH, ISBLANK formula (that I could create using the appropriate relative and absolute cell linking) and then expand to the whole sheet turned into a witch hunt and me praying for forgiveness for my sins to all that may be holy. Also a lot of googling.... I realized quickly that this one may not be so simple after all.
Finally, I arrived at the following two array formulas in order to correctly show what I was going for:
First Column of training section
{=IFERROR(INDEX(Sheet1!$E$2:$P3,1,MATCH(FALSE,ISBLANK(Sheet1!E3:Q3),0)),"")}
(easy enough, right? I thought so...)
I felt good about this until I tried to think through what would be required to get the formula to be universal so that I could use it on the entire table.
I feel dirty just putting the following in public, but here goes...
Second column through last column array formula
{=IFNA(INDEX(INDIRECT(ADDRESS(ROW($E$2),(MATCH(E3,Sheet1!$2:$2,0)+1),1,1, "Sheet1")&":"&ADDRESS(ROW(E3),COLUMN($Q3),1)),1,MATCH(FALSE, ISBLANK(INDEX(INDIRECT("Sheet1!"&ADDRESS(ROW(E3),(MATCH(E3,Sheet1!$2:$2,0)+1),1)&":"&ADDRESS(ROW(E3),COLUMN($Q3),1)),0,0)), 0)),"")}
(please don't call the police...)
[ninja edit] While this array formula works for 2nd result column through the final column, it doesn't work if there's not a blank column following the result range. The actual spreadsheet has 4 different groups of certifications that run horizontally, but I was able to just add a blank column in the corresponding data from the other sheet easily enough, so I just let it go. I'd give somebody a nickel for the answer to why that's the case here too [/edit]
Results
The first array formula, and INDEX MATCH using ISBLANK is rather straightforward.
The biggest question for me here, and the thing that drove me absolutely nuts for a couple of days, is why the second array formula requires the additional INDEX function nested inside of the ISBLANK function.
While taking the function apart and experimenting I realized that if I have any INDIRECT reference inside an ISBLANK function, which is itself inside of a MATCH function, the result of the match was ALWAYS 1:
{=MATCH(FALSE,ISBLANK(INDIRECT("$E3:$Q3")), 0)}
The above ALWAYS returns 1, whereas if I put the range in explicitly, the function would work just fine. That wasn't an option for me, since I needed to dynamically return the starting position for the match using the previous cell's address.
However, adding an INDEX function (with a column and row value of 0) to encapsulate the INDIRECT function provides the correct answer. I figured this out just by trial and error.
Questions
Can someone with more knowledge please let me know what is causing this behavior?
As a broader question, given I am limited to using formulas (no VBA), I would also like to know if I'm going about this in the wrong way or if there is a much simpler way of accomplishing this without this behemoth of a formula?
I know this sheet will probably require maintenance in a year - good luck future self!
Put this in E3, Copy over and down
=IFERROR(INDEX(Sheet1!$2:$2,AGGREGATE(15,6,COLUMN(INDEX($E:$P,MATCH($C3,Sheet1!$C:$C,0),0))/(INDEX(Sheet1!$E:$P,MATCH($C3,Sheet1!$C:$C,0),0)<>""),COLUMN(A:A))),"")
As to why your formula is not working, it is too convoluted to parse. One note: unless the sheet name is the variable, one should avoid INDIRECT as much as possible. INDEX can almost always be used in its place.
INDIRECT is a volatile function (ADDRESS is not on Microsoft's list of volatile functions, but here its result is only usable through INDIRECT). Volatile functions will re-calculate every time Excel re-calculates, leading to a lot of unnecessary computations.
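Stripped of the spreadsheet mechanics, the goal is "list the headers of the non-blank cells, left to right, then pad with blanks". A short Python sketch of that, assuming one list of headers and one list of cell values per employee (the certification names and dates below are invented for the example):

```python
def compact_headers(headers, row):
    """Return headers of non-blank cells packed to the left, padded with ""."""
    present = [header for header, value in zip(headers, row) if value != ""]
    # Pad so the result always has as many columns as the source table.
    return present + [""] * (len(headers) - len(present))


headers = ["CPR", "OSHA", "Forklift", "First Aid"]  # hypothetical certs
row = ["2024-01-01", "", "2025-06-30", ""]          # expiration dates
print(compact_headers(headers, row))  # ['CPR', 'Forklift', '', '']
```

The k-th result column simply shows the k-th entry of that list, which is what the AGGREGATE(15,6,...) call picks out with its k-th smallest column number.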
Not a solution but to answer why you are seeing this behavior:
EDIT: PREVIOUS EXPLANATION WAS JUST PLAIN WRONG
This confused me, so I did a bit of investigation:
I think that your problem is actually coming from the ISBLANK function because it is intended to be used with single values, and cannot handle ranges. Any BLANKs which are returned by functions are only converted to numeric values (0), when the BLANK is returned to (or displayed on) the sheet. If the function is returning to another function, the BLANK value seems to be preserved.
EDIT: ADDING A SOLUTION WITHOUT ARRAY FORMULAS
This is probably more complex than using an array formula... but I strongly dislike them, so do all I can to remove them.
Firstly, I would add an index to your positions in the results sheet:
=IF(F$7>COUNTIFS($F3:$L3,"<>"),
"",
IF(
MINIFS(
$F$7:$L$7,$F$7:$L$7,
">" & IFNA(INDEX($F$7:$L$7,MATCH(E9,$F$2:$L$2,0)),0),
$F3:$L3,
"<>"
)=0,
"",
INDEX(
$F$2:$L$2,
MATCH(
MINIFS(
$F$7:$L$7,$F$7:$L$7,
">" & IFNA(INDEX($F$7:$L$7,MATCH(E9,$F$2:$L$2,0)),0),
$F3:$L3,
"<>"
),
$F$7:$L$7,
0
)
)
)
)
Basically, the formula looks at the cert in the previous cell, and looks for the next, minimum index, greater than that.

Sumproduct or Countif on a 2D matrix

I'm working on data from a population of people with allergies. Each person has a unique ExceptionID, and each allergen has a unique AllergenID (451 in total).
I have a data table with 2 columns (ExceptionID and AllergenID), where each person's allergies are listed row by row. This means that the ExceptionID column has repeated values for people with multiple allergies, and the AllergenID column has repeated values for the different people who have that allergy.
I am trying to count how many times each pair of allergies occurs in this population (e.g. Allergen#107 & Allergen#108, Allergen#107 & Allergen#109, etc.). To keep it simple I've created a matrix of 451 rows x 451 columns representing every pair (each pair appears twice, because A/B and B/A are equivalent).
I somehow need to use the row name (AllergenID) to look up the ExceptionIDs in my data table, and count the cases where they match the ExceptionIDs for the column name (also an AllergenID). I have no problem using VLOOKUP or INDEX/MATCH, but I'm struggling to find the correct combination of a lookup with a SUMPRODUCT or COUNTIF formula.
Any help is greatly appreciated!
Mike
PS I'm using Excel 2016 if that changes anything.
-=UPDATE=-
So the methods suggested by Dirk and MacroMarc both worked, though I couldn't apply the latter to my full data set (17,000+ rows) because it was taking a long time.
I've since decided to turn this into a VBA macro because we now want to see the counts of triplets instead of pairs.
With the 2 columns you start with, it is as good as impossible... You would need to check, for every ExceptionID, whether it has both of 2 specific AllergenIDs. Better to use a helper table with ExceptionID as rows and AllergenID as columns (or the opposite, whatever you like). The helper table needs a formula like:
=COUNTIFS($A:$A,$D2,$B:$B,E$1)
This can then be auto-filled. (The ranges are from my example; change them to fit your data.)
With this helper matrix you can easily build your bigger matrix like this:
=COUNTIFS(E:E,1,INDEX($E:$G,,MATCH($I2,$E$1:$G$1,0)),1)
Again, you can auto-fill this formula, but you need to adjust it to fit your data.
Because the columns of both tables have the same ID2 (which would be your AllergenID), there is no need to look them up; E:E shifts automatically with the auto-fill.
The most important part of the formulas is the $ signs, which must not be messed up or the auto-fill will not work.
Picture of my self-made example (formulas are from the upper left cell in each table):
If you still have any questions, just ask :)
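For anyone who wants to sanity-check the numbers outside Excel, the helper-table idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `count_allergen_pairs` and the sample rows are made up, and it is not the formula approach itself. It first groups allergens per person (the helper table), then counts each unordered pair once per person (the pair matrix).

```python
from collections import defaultdict
from itertools import combinations


def count_allergen_pairs(rows):
    """Count how many people have each unordered pair of allergens.

    rows: iterable of (exception_id, allergen_id) tuples, one per table row.
    """
    # Step 1: the helper table. One set of allergens per person
    # (a set ignores duplicate rows for the same person/allergen).
    allergens_by_person = defaultdict(set)
    for exception_id, allergen_id in rows:
        allergens_by_person[exception_id].add(allergen_id)

    # Step 2: the pair matrix. Each person adds 1 to every pair
    # of allergens they have; pairs are stored as sorted tuples.
    pair_counts = defaultdict(int)
    for allergens in allergens_by_person.values():
        for pair in combinations(sorted(allergens), 2):
            pair_counts[pair] += 1
    return dict(pair_counts)


# Hypothetical sample data: (ExceptionID, AllergenID)
sample_rows = [
    (1, 107), (1, 108), (1, 109),
    (2, 107), (2, 108),
    (3, 108), (3, 109),
    (4, 107),
]
pair_counts = count_allergen_pairs(sample_rows)
print(pair_counts)
```

Changing the `2` in `combinations(..., 2)` to `3` would count triplets instead of pairs, which is what the update in the question asks about.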
It can be done straight from your original set-up with array formulas:
Please note that array formulas MUST be entered with Ctrl-Shift-Enter, before copying across and down:
In the example pic, I have NAMED the data ranges $A$2:$A$21 as 'People' and $B$2:$B$21 as 'Allergens' to make it a nicer set-up. You can see in the formula bar how that looks as a formula. However, you could use the standard references like this in your first matrix cell:
EDIT: silly me, the N function is not needed to turn the booleans into 1's and 0's, since multiplying booleans will do the trick. The formula below works...
=SUM(IF(MATCH($A$2:$A$21,$A$2:$A$21,0)=ROW($A$2:$A$21)-1, NOT(ISERROR(MATCH($A$2:$A$21&$E2,$A$2:$A$21&$B$2:$B$21,0)))*NOT(ISERROR(MATCH($A$2:$A$21&F$1, $A$2:$A$21&$B$2:$B$21,0))), 0))
Then copy from F2 across and down. The technique could perhaps be improved with SUMPRODUCT or similar; this is just a rough example.

Formula to follow addition of rows

New to VBA, please help. My apologies. I have not done a good job of making myself clear. Let me try one more time.
My sales reps enter every call into a call sheet. They call on 50-60 people a week; some they will see more than once a week, some only a couple of times a year. On this call sheet are 4 columns; date of call, customer, numerical date, and days since last call. This sheet may have hundreds of rows, many are duplicate customers called on a different date.
I have written code that will eliminate duplicates as needed (works fine). New calls are added using NextRow=_ (also works fine). $C$2 is set at TODAY().
The formula in column C is $C10=$A10 (column C is formatted as a number). Column D is the number of days since the last call: $C$2-$C10, etc. Simple and works fine.
The issue is: say I have 50 rows (all different customers) sorted ascending and a new customer is added, the key being new. I need the formulas in C and D to drop down one row automatically when the new customer is added. I can drag the formulas down ahead of time and everything will work until I sort; then my data is at the bottom of the sort, because all rows in column A without a date produce a 0 in both C and D. My finished product should be a range of different customers (no duplicates), with the customer who has not been called on for the longest at the top.
I hope this is a better explanation. Can I write code to ignore the 0's?
I am going to go a little out on a limb here and say maybe your formulas need refactoring...
For instance, if the aim is to calculate the days since the last call was made to a customer, a simple formula such as this would work:
=TODAY()-MAX(C:C)
This takes the largest (most recent) value in column C and subtracts it from today's date.
If you want to get the value in column D which corresponds to this entry, then VLOOKUP() is your friend. You would use it like this:
=VLOOKUP(MAX(C:C),C:D,2,FALSE)
Hope this helps.
Incidentally, the simplest way to handle your problem in VBA would be to create a Named Range. You can then replace the $C$2-$D11 with the name of the named range. The simplest way to do this would be:
Range(Range("C2"), Range("C2").End(xlDown)).Name = "NameOfYourRange"
This effectively takes cell C2, extends down to the last non-blank cell, and names that range NameOfYourRange.
Hope this helps :)
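To make the target result concrete (one row per customer, longest-neglected customer at the top, no stray zeros from blank rows), here is a rough Python sketch of the logic. It is not VBA and not the asker's code; the function name `days_since_last_call`, the customer names, and the dates are all hypothetical.

```python
from datetime import date


def days_since_last_call(calls, today):
    """Return [(customer, days_since_last_call)], longest gap first.

    calls: iterable of (call_date, customer) tuples; duplicates allowed.
    today: the reference date (the sheet keeps TODAY() in $C$2).
    """
    # Keep only the most recent call date per customer.
    last_call = {}
    for call_date, customer in calls:
        if customer not in last_call or call_date > last_call[customer]:
            last_call[customer] = call_date

    # Days elapsed since that call. Blank rows never enter the result,
    # so there are no spurious zeros to sort around.
    result = [(customer, (today - d).days) for customer, d in last_call.items()]
    result.sort(key=lambda item: item[1], reverse=True)
    return result


# Hypothetical call sheet rows: (date of call, customer)
calls = [
    (date(2024, 1, 10), "Acme"),
    (date(2024, 3, 1), "Birch"),
    (date(2024, 2, 20), "Acme"),
    (date(2023, 12, 5), "Cedar"),
]
ranking = days_since_last_call(calls, today=date(2024, 3, 11))
print(ranking)
```

The same two steps (latest date per customer, then sort by the gap) are what any formula or macro solution would need to reproduce.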

Resources