Excel LINEST with conditional array and multiple X variables - excel

I have a data set formatted as a table in Excel 2019. I would like to run a regression analysis, but only on the records that have 'X' in column E, which is named IncludeInRegression.
Known Y's are in column F (Price) and known X's are in columns B:D (L, W, Volume).
I have managed to make it work for one independent variable X (variable L in column B) and here is the array formula:
=LINEST(
INDEX(F:F;N(IF(1;MODE.MULT(IF(tblData[IncludeInRegression]={"X","X"};ROW(tblData[Price]))))));
INDEX(B:B;N(IF(1;MODE.MULT(IF(tblData[IncludeInRegression]={"X","X"};ROW(tblData[L]))))));
TRUE;FALSE)
However, I cannot make it work for 3 independent variables. I have tried the following array formula, but #VALUE! is returned:
=LINEST(
INDEX(F:F;N(IF(1;MODE.MULT(IF(tblData[IncludeInRegression]={"X","X"};ROW(tblData[Price]))))));
INDEX(B:D;N(IF(1;MODE.MULT(IF(tblData[IncludeInRegression]={"X","X"};ROW(tblData[[L]:[Volume]]))))));
TRUE;FALSE)
So it will be easier for you to visualize, I am attaching an image as well.

You need to include another array of the column numbers in the second INDEX so it returns all three columns:
=LINEST(
INDEX(F:F,N(IF(1,MODE.MULT(IF(tblData[includeinRegression]={"X","X"},ROW(tblData[Price])))))),
INDEX(B:D,N(IF(1,MODE.MULT(IF(tblData[includeinRegression]={"X","X"},ROW(tblData[[L]:[Volume]]))))),N(IF(1,{1,2,3}))),
TRUE,FALSE)
Depending on one's version, this may need to be array-entered: select four horizontal cells, put the formula in the formula bar, and use Ctrl-Shift-Enter instead of Enter when leaving edit mode.
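To check the numbers outside Excel, here is a minimal Python sketch of what the filtered LINEST computes: keep only the flagged rows, then fit an ordinary least-squares line with an intercept on all three X columns. The function names, the row layout and the sample data are my own assumptions for illustration, not taken from the workbook. Like LINEST, it returns the coefficients in reverse column order with the intercept last, which is why four cells are needed.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("singular system: X columns are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def filtered_linest(rows, flag="X"):
    """rows: (L, W, Volume, include_flag, price) tuples (hypothetical layout).

    Returns [m_Volume, m_W, m_L, intercept], the order LINEST uses.
    """
    kept = [r for r in rows if r[3] == flag]           # the IF(...="X") filter
    design = [[1.0, r[0], r[1], r[2]] for r in kept]   # intercept + 3 X columns
    y = [r[4] for r in kept]
    k = 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    intercept, m_l, m_w, m_v = solve_linear_system(xtx, xty)
    return [m_v, m_w, m_l, intercept]


# Made-up data: flagged rows follow price = 2*L + 3*W + 0.5*Volume + 10.
sample_rows = [
    (1, 2, 3, "X", 19.5),
    (2, 1, 5, "X", 19.5),
    (3, 4, 2, "X", 29.0),
    (4, 3, 7, "X", 30.5),
    (5, 5, 1, "X", 35.5),
    (9, 9, 9, "", 999.0),   # not flagged, so it is ignored
]
coeffs = filtered_linest(sample_rows)
```

The unflagged row is deliberately far off the line, so a wrong filter would show up immediately in the coefficients.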

Related

how to sum columns using column headers and bridge tables in excel

I have two sets of data in Excel: set 1 is the raw data, and set 2 is a bridge table. The desired output is also added. How should I write a formula for this?
set 1:
set 2:
output expected:
Here is a solution that assumes a variable number of headers and no specific pattern in the column names. I assumed no Excel version constraints, as per the tags listed in the question. In cell H1, put the following formula, which spills the entire result all at once:
=LET(in, A1:F5, lk, A8:B12, header, DROP(TAKE(in,1),,1), A, TAKE(lk,,1),
B, DROP(lk,,1), data, DROP(in,1,1), REDUCE(TAKE(in,,1), UNIQUE(B),
LAMBDA(ac,bb, LET(f, FILTER(A, B=bb),values, CHOOSECOLS(data,XMATCH(f, header)),
sum, MMULT(values, SEQUENCE(ROWS(f),,1,0)), HSTACK(ac, VSTACK(bb, sum))))))
Here is the output:
We use the LET function with only two input ranges, in and lk; all the other names are derived from them. That makes the formula easy to maintain and to adapt to your real scenario.
Using DROP and TAKE we extract each portion of the input ranges: header, data, A, B (columns from the second table). We use the REDUCE/HSTACK pattern to append one column of the result on each iteration. See my answer to the question "how to transform a table in Excel from vertical to horizontal but with different length" for more information.
We iterate over the unique values of B, and for each value (bb) we select the matching column A values (f). We use XMATCH to find the corresponding column indexes in header (which doesn't include the date column), and CHOOSECOLS to select those columns from data (values). Now we need to add the selected columns together row by row, and we use MMULT with a column of ones for that. The result is in the name sum. Finally, we use HSTACK to append the new column on each iteration, with the unique value from B as its header.
Note: Instead of the MMULT function, you can use the following array function; it is a matter of personal preference:
BYROW(values, LAMBDA(x, sum(x)))
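If it helps to see the REDUCE/HSTACK logic spelled out, here is a rough Python sketch of the same steps. The function name, the argument layout and the tiny sample data are assumptions of mine, since the original screenshots are not available.

```python
def sum_by_group(table, bridge):
    """table: list of rows, first row is the header, first column is the key.
    bridge: list of (column_name, group) pairs.
    Returns the key column followed by one summed column per group.
    """
    header = table[0][1:]                     # DROP(TAKE(in,1),,1)
    body = [row[1:] for row in table[1:]]     # DROP(in,1,1)

    groups = []                               # UNIQUE(B), first-appearance order
    for _, group in bridge:
        if group not in groups:
            groups.append(group)

    result = [[row[0]] for row in table]      # TAKE(in,,1): start with key column
    for group in groups:
        # FILTER(A, B=bb) followed by XMATCH(f, header)
        cols = [header.index(name) for name, g in bridge if g == group]
        result[0].append(group)               # group name becomes the header
        for out_row, values in zip(result[1:], body):
            # Row-wise sum of the selected columns (the MMULT / BYROW step)
            out_row.append(sum(values[c] for c in cols))
    return result


# Hypothetical data, only to show the shape of the input and output.
table = [
    ["Date", "a1", "a2", "b1"],
    ["d1", 1, 2, 3],
    ["d2", 4, 5, 6],
]
bridge = [("a1", "A"), ("a2", "A"), ("b1", "B")]
grouped = sum_by_group(table, bridge)
```

Each pass over a group appends exactly one column, which is the same accumulation that HSTACK performs inside REDUCE.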
You could try SUMIFS with the wild card character for each row. For example, for the first column, put the following formula and drag it down.
=SUMIFS($B2:$F2,$B$1:$F$1,"=A*")
Then do the same thing for the other columns, e.g. for column B:

Excel Index to look up multiple values

I have a small data set of 2 columns and several rows (columns A and B)
I want to return each instance of codeblk 3 in a formula that is elsewhere in my sheet (so a VLOOKUP is out, as it only shows the first instance). If it does not appear, a message saying it's not there should come up.
I have the formula partially working, but I can't see why it's not displaying the values.
My formula is as below:
This is an array formula:
{=IF(ISERROR(INDEX($A$55:$B$70,SMALL(IF($B$55:$B$70=3,ROW($B$55:$B$70)),ROW(1:1))-1,1)),"No value's produced",INDEX($A$2:$C$7,SMALL(IF($B$55:$B$70=3,ROW($B$55:$B$70)),ROW(1:1))-1,1))}
The result that shows up is only "No values produced" but it should reflect statement B, C and D in 3 separate cells (when changing ROW(1:1), ROW(2:2) etc)
{=SMALL(IF($B$56:$B$69=4,ROW($B$56:$B$69)),ROW(1:1))} - This produces the result 68 which is the correct row.
Any ideas?
Thanks,
This is an array formula - Validate the formula with Ctrl+Shift+Enter while still in the formula bar
=IFERROR(INDEX($A$55:$B$70,SMALL(IF($B$55:$B$70=3,ROW($B$55:$B$70)-54),ROW(1:1)),1),"No value's produced")
The issue you are facing is that your INDEX range starts its first row at $B$55, so you need to offset the row numbers in the array to reflect this. For example, the INDEX contains 16 rows, but if you had a match on the first row you would be asking for the 55th row of that INDEX(), which it can't fulfil.
EDIT
The offset was out of sync because your original formula included another -1 outside of the IF(). I also left an additional bracket in play (the formula above has now been edited).
The ROW() function will essentially translate $B$55:$B$70 into ROW(55:70), which produces the array {55;56;57;58;59;60;61;62;63;64;65;66;67;68;69;70}, so the offset is needed to translate those row numbers into the positions they represent in the indexed data of INDEX().
The IF() statement then produces an array of {FALSE;2;3;4;FALSE etc.
You can see these results by highlighting parts of the formula in the formula bar and hitting F9 to calculate.
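The same idea as a small Python sketch, in case the offset is easier to follow that way. The names and sample values are hypothetical; the point is that the positions are counted from the top of the range, not from the top of the sheet.

```python
def kth_match(labels, codes, target, k, not_found="No values produced"):
    """Return the label of the k-th row (1-based) whose code equals target."""
    # Positions relative to the range, like ROW($B$55:$B$70)-54 in the formula.
    positions = [i for i, code in enumerate(codes, start=1) if code == target]
    if k > len(positions):          # SMALL() would error here, IFERROR catches it
        return not_found
    return labels[positions[k - 1] - 1]   # INDEX(range, position, 1)


# Hypothetical stand-in for columns A and B of the range.
labels = ["statement A", "statement B", "statement C", "statement D", "statement E"]
codes = [1, 3, 3, 3, 4]
```

Calling kth_match(labels, codes, 3, 1), then with k = 2 and k = 3, plays the role of ROW(1:1), ROW(2:2) and ROW(3:3) in the dragged-down formula.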

Multiple search criteria within Excel Formula

I have a sheet with the following demo data (yes, the content is German; don't mind that).
I need a formula which will search for the criteria in A1 and B1 and return the respective value out of the matrix E3:M8.
For example
Search criteria is: X and 2 - Return value should be wert2
or
Search criteria is: Z and 1 - Return value should be wert7
I think I can somehow use an INDEX formula, but I'm not quite sure how to do so.
Hope you can help
It looks as if your data has merged cells. If you want to keep life simple, avoid merged cells.
Your data has the same 1,2,3 sequence in each section x, y and z. I assume that these values will always be the same. With the data laid out like in your screenshot, the formula you need is
=INDEX($E$3:$M$7,MATCH(B1,$E$3:$E$7,0),MATCH(A1,$E$3:$M$3,0)+2)
Because of the merged cells it is impossible to select the range in column E for the first MATCH function. It needs to be typed instead. You also need to adjust the second MATCH result for the column, since the x, y, and z are stored in the first of the three merged cells, but you want the value underneath the third of the merged cells. Avoid merged cells.
A better data layout would be this:
The Index/Match could be simplified to
=INDEX($E$3:$H$7,MATCH(B1,$E$3:$E$7,0),MATCH(A1,$E$3:$H$3,0))
Or you could use Vlookup
=VLOOKUP(B1,$E$3:$H$7,MATCH(A1,$E$3:$H$3,0),FALSE)
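For the un-merged layout, the two MATCH calls boil down to "find the column by its header, find the row by its first cell". A minimal Python sketch of that, with a table I made up from the two examples in the question (X and 2 gives wert2, Z and 1 gives wert7); the other cell values are assumptions:

```python
def two_way_lookup(table, row_key, col_key):
    """table[0] is the header row; the first cell of every other row is its key."""
    col = table[0].index(col_key)        # MATCH(A1, header row, 0)
    for row in table[1:]:
        if row[0] == row_key:            # MATCH(B1, first column, 0)
            return row[col]              # INDEX(range, row, col)
    raise KeyError(f"row key {row_key!r} not found")


# Hypothetical contents; only wert2 and wert7 are confirmed by the question.
table = [
    ["", "X", "Y", "Z"],
    [1, "wert1", "wert4", "wert7"],
    [2, "wert2", "wert5", "wert8"],
    [3, "wert3", "wert6", "wert9"],
]
```

With merged cells there is no single header cell per column, which is why the merged version needs the +2 adjustment and this one does not.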

How to select a VLOOKUP row based on multiple column values

We have 11 columns (Columns B through L) of codes that I need to select based on a VLOOKUP from another sheet. IF ANY of the column values are "HI" or "EXT", I need to keep the record, if ALL of the column values are "M" I can exclude it. Column A is my LOOKUP list.
Right now the best I can come up with is 11 nested =IF(VLOOKUP(...) statements to set an inclusion flag, but if there's a way to SUM a TRUE/FALSE flag based on equality to the value "M" across all 11 columns...I've not had success finding that.
Any ideas?
This can be solved in two steps:
For columns B-L, the formula needs to be your VLookup formula (which you didn't put here) and ="M" at the end of it, which will result in a binary true/false value.
Then, in column M, simply do a logical AND using the AND function across B-L for each row e.g. =AND(B1:L1)
Another option, if you wish to keep the display format the same, is to do an array formula.
Enter =IF(AND(B1:L1="M"), "EXCLUDE", "KEEP"), then press CTRL+SHIFT+ENTER and it will add curly braces to it, meaning it is evaluated as an array formula. The resulting formula in the cell will be {=IF(AND(B1:L1="M"), "EXCLUDE", "KEEP")}. AND(...) is TRUE only when all eleven values are "M", which is the case you want to exclude; any "HI" or "EXT" makes it FALSE, so the record is kept.
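As a sanity check of the rule in the question (exclude only when all eleven codes are "M", keep otherwise), here is the same test as a tiny Python sketch. The function name is hypothetical.

```python
def inclusion_flag(codes):
    """codes: the looked-up values for columns B through L of one record."""
    # Exclude only when every code is "M"; any other value keeps the record.
    return "EXCLUDE" if all(code == "M" for code in codes) else "KEEP"


all_m = ["M"] * 11
one_hi = ["M"] * 10 + ["HI"]
```

Testing for "all M" rather than for "any HI or EXT" means a single comparison covers both keep values.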

Need to find the closest greater number in a list filtered by a numeric value in another column

I need to create an Excel formula that would do the following operation:
Whenever a particular number appears in the Number column, the formula should compare the corresponding value in the Target column with the value in the Code column and flag, in Expected Output, the rows where Code is the nearest value greater than the one in the Target column.
What you describe could easily be done in the SQL database language using the GROUP BY statement.
You need a formula, and Excel does not directly support this because it is not an SQL database.
However, it can be done quite easily with formulas. There are several ways to achieve that; I can show you how I would do it. I have used the array formula MIN(IF(..., which works in a similar way to SELECT MIN(A) FROM range GROUP BY B.
I would create 2 helper columns (D and E), which could be placed on a separate sheet if necessary; see the picture below.
In column D, place the formula: =IF(B1-C1>0,B1-C1,65535) (65535 is a little hack standing in for a very high number)
In column E, place the formula: =MIN(IF(A$1:A$12=A1,D$1:D$12)) as an array formula (Ctrl+Shift+Enter); A$1:A$12 and D$1:D$12 are the ranges of your data.
In column F you get your Expected Output: =IF(E1=D1,1,0)
So in column E you get the minimum difference between C and B that is > 0, filtered by A.
When working with Excel data in this way, I recommend using named ranges instead of A$1:A$12 etc.
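The helper-column approach can be sketched in Python as follows. The function name and the sample rows are my own; the one deliberate difference from the formulas is that I use None instead of the 65535 sentinel, so a group with no positive difference at all gets no flag (with the sentinel, every row of such a group would compare equal and be flagged).

```python
def flag_nearest_greater(rows):
    """rows: list of (number, b, c) tuples; returns a 0/1 flag per row."""
    # Helper column D: positive difference b - c, or None when b is not greater.
    diffs = [b - c if b - c > 0 else None for _, b, c in rows]

    # Helper column E: smallest positive difference within each number group,
    # the MIN(IF(A$1:A$12=A1, D$1:D$12)) step.
    best = {}
    for (number, _, _), d in zip(rows, diffs):
        if d is not None and (number not in best or d < best[number]):
            best[number] = d

    # Column F: flag rows whose difference equals their group's minimum.
    return [
        1 if d is not None and d == best[number] else 0
        for (number, _, _), d in zip(rows, diffs)
    ]


# Hypothetical rows: (Number, column B value, column C value).
rows = [(1, 10, 7), (1, 9, 7), (1, 5, 7), (2, 20, 15), (2, 30, 15)]
flags = flag_nearest_greater(rows)
```

The dictionary keyed by number plays the role of the GROUP BY in the SQL analogy.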
