I'm trying to write a linear regression function that dynamically references columns, can handle #N/A values, and will function as additional rows are added over time. Here is a sample dataset:
Date Value 1 Value 2
1/2/1991 #N/A #N/A
2/4/2002 276.36 346.31
1/7/2003 252 350
1/21/2004 232 345.5
1/6/2005 257 368
2/1/2006 278.24 390.11
2/23/2007 #N/A 380.46
2/11/2008 326.34 383.04
2/12/2009 #N/A 399.9
2/17/2009 334.39 #N/A
1/29/2010 344.24 400.83
1/27/2011 342.88 404.52
2/7/2012 379 417.91
1/23/2013 #N/A 433.35
Here is the function that I've developed so far, based on this forum post. It calculates the linear regression for Value 1.
=TRANSPOSE(
LINEST(
N(
OFFSET(
INDIRECT("B2" & ":B" & COUNTA(B:B)),
SMALL(
IF(
ISNUMBER(
INDIRECT("A2:A" & COUNTA($A:$A)) *
INDIRECT("B2" & ":B" & COUNTA(B:B))),
ROW(INDIRECT("B2:B" & COUNTA(B:B))) - ROW(B2)),
ROW(INDIRECT("1:" & MIN(
COUNT(INDIRECT("A2:A" & COUNTA($A:$A))),
COUNT(INDIRECT("B2:B" & COUNTA(B:B))))))), 0, 1)),
N(
OFFSET(
INDIRECT("A2:A" & COUNTA($A:$A)),
SMALL(
IF(
ISNUMBER(
INDIRECT("A2:A" & COUNTA($A:$A)) *
INDIRECT("B2:B" & COUNTA(B:B))),
ROW(INDIRECT("B2:B" & COUNTA(B:B))) - ROW(B2)),
ROW(INDIRECT("1:" & MIN(
COUNT(INDIRECT("A2:A" & COUNTA($A:$A))),
COUNT(INDIRECT("B2:B" & COUNTA(B:B))))))), 0, 1)),
TRUE, FALSE))
With the way it is currently written, dragging my array to the right to solve for Value 2 requires some manual updating of the formula. Everything in quotes in the INDIRECT formulas must be manually changed from B to C. I have 40 columns of data, though, so I tried to make the formula entirely dynamic using ADDRESS, ROW, and COLUMN:
=TRANSPOSE(
LINEST(
N(
OFFSET(
INDIRECT(ADDRESS(2, COLUMN(B2)) & ":" & ADDRESS(COUNTA(B:B), COLUMN(B2))),
SMALL(
IF(
ISNUMBER(
INDIRECT("A2:A" & COUNTA($A:$A)) *
INDIRECT(ADDRESS(2, COLUMN(B2)) & ":" & ADDRESS(COUNTA(B:B), COLUMN(B2)))),
ROW(INDIRECT(ADDRESS(2, COLUMN(B2)) & ":" & ADDRESS(COUNTA(B:B), COLUMN(B2)))) - ROW(B2)),
ROW(INDIRECT("1:" & MIN(
COUNT(INDIRECT("A2:A" & COUNTA($A:$A))),
COUNT(INDIRECT(ADDRESS(2, COLUMN(B2)) & ":" & ADDRESS(COUNTA(B:B), COLUMN(B2)))))))), 0, 1)),
N(
OFFSET(
INDIRECT("A2:A" & COUNTA($A:$A)),
SMALL(
IF(
ISNUMBER(
INDIRECT("A2:A" & COUNTA($A:$A)) *
INDIRECT(ADDRESS(2, COLUMN(B2)) & ":" & ADDRESS(COUNTA(B:B), COLUMN(B2)))),
ROW(INDIRECT(ADDRESS(2, COLUMN(B2)) & ":" & ADDRESS(COUNTA(B:B), COLUMN(B2)))) - ROW(B2)),
ROW(INDIRECT("1:" & MIN(
COUNT(INDIRECT("A2:A" & COUNTA($A:$A))),
COUNT(INDIRECT(ADDRESS(2, COLUMN(B2)) & ":" & ADDRESS(COUNTA(B:B), COLUMN(B2)))))))), 0, 1)),
TRUE, FALSE))
This gives me #REF!. When I do a step-by-step evaluation of the formula, it looks like the issue comes when Excel evaluates COLUMN. It introduces braces to the formula, which propagate through the rest of the INDIRECT evaluation. Here is a quick comparison:
Original formula:
INDIRECT("B2:B15")
Dynamic formula:
INDIRECT({"$B$2:$B$15"})
This evaluates as #VALUE!, and at that point the rest of the formula is broken. Is there a way to force Excel to not use braces in this evaluation, or is there a better way of making this calculation?
Are you only trying to get the SLOPE from the linear regression? If so, you can just use the SLOPE function after converting the #N/A to blanks (using IFERROR in a formula). SLOPE will then just toss out the blanks. If you want the intercept as well, use the same formulas below and substitute INTERCEPT for SLOPE.
Picture of ranges
Formulas are array formulas (use CTRL+SHIFT+ENTER) and copied over. Given this arrangement, the simple (non-dynamic) formula would be:
=SLOPE(IFERROR(B2:B15,""),$A$2:$A$15)
If you want these to be dynamic, you can use INDEX and COUNTA to get the dynamic range.
=SLOPE(IFERROR(B2:INDEX(B:B,COUNTA(B:B)),""),$A$2:INDEX($A:$A,COUNTA($A:$A)))
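If you want to check the numbers outside Excel, here is a minimal Python sketch of what SLOPE and INTERCEPT work out once the #N/A rows are tossed. It assumes missing values are represented as None and that a pair is skipped when either side is missing; the function name and the sample numbers are made up for illustration.

```python
def slope_intercept(xs, ys):
    """Least-squares slope and intercept over pairs where both values are present."""
    # Keep only complete pairs, mirroring rows where neither cell is #N/A.
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete (x, y) pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: two of the five y values are missing.
xs = [1, 2, 3, 4, 5]
ys = [None, 4.0, 6.0, None, 10.0]
slope, intercept = slope_intercept(xs, ys)
```

The remaining points (2, 4), (3, 6) and (5, 10) all lie on y = 2x, so the slope comes out as 2 and the intercept as 0, up to floating-point rounding.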
Use a Table instead
Even better, you could define this data inside a Table and then use the headers to pull in the whole column. That formula would look nice and copy easily.
Still using an array formula here, but the only variable is the column heading, which is used to look up the column in Table1. This one is more resistant to blanks in the data, which would break the COUNTA approach used above.
=SLOPE(IFERROR(INDEX(Table1,,MATCH(M1,Table1[#Headers],0)),""),Table1[Date])
It appears you can use the following, shorter, non-volatile array formula:
=LINEST(INDEX(B:B,N(IF(1,MODE.MULT(IF(ISNUMBER(B2:B15),{1,1}*ROW(B2:B15)))))),INDEX($A:$A,N(IF(1,MODE.MULT(IF(ISNUMBER(B2:B15),{1,1}*ROW(B2:B15)))))))
B2:B15 can be dynamically defined if desired as per Jeeped's solution.
Regards
You are going to want to get rid of the use of the INDIRECT function as much as possible; certainly as it pertains to substituting column references for string equivalents. It seems that many can be replaced with a form of INDEX/MATCH function pairs.
=TRANSPOSE(
LINEST(
N(
OFFSET(B2:INDEX(B:B, MATCH(1E+99,$A:$A )),
SMALL(
IF(
ISNUMBER(
$A2:INDEX($A:$A, MATCH(1E+99,$A:$A )) *
B2:INDEX(B:B, MATCH(1E+99,$A:$A ))),
ROW(B2:INDEX(B:B, MATCH(1E+99,$A:$A ))) - ROW(B2)),
ROW(INDIRECT("1:" & MIN(
COUNT($A2:INDEX($A:$A, MATCH(1E+99,$A:$A ))),
COUNT(B2:INDEX(B:B, MATCH(1E+99,$A:$A ))))))), 0, 1)),
N(
OFFSET(
$A2:INDEX($A:$A, MATCH(1E+99,$A:$A )),
SMALL(
IF(
ISNUMBER(
$A2:INDEX($A:$A, MATCH(1E+99,$A:$A )) *
B2:INDEX(B:B, MATCH(1E+99,$A:$A ))),
ROW(B2:INDEX(B:B, MATCH(1E+99,$A:$A ))) - ROW(B2)),
ROW(INDIRECT("1:" & MIN(
COUNT($A2:INDEX($A:$A, MATCH(1E+99,$A:$A ))),
COUNT(B2:INDEX(B:B, MATCH(1E+99,$A:$A ))))))), 0, 1)),
TRUE, FALSE))
Fill right as necessary. Column A stays locked, while the column B cell range references shift to column C, D, etc.
Further function replacement could likely swap at least some of the OFFSET calls for an appropriate INDEX function, but the formula seems to work well as it is now.
In column A, starting with A1, I have a set of database column names which are Pascal case and without spaces. I'd like to use an Excel formula in column B to insert spaces before each capital letter or number. Ideally any consecutive capital letters or numbers would remain together. I've done this in the past with C#, but on this project, I can't even use VBA macros. Example output:
Can this, or something close, be achieved using only formulas?
This is pretty hard, but with Microsoft 365 it is doable with the given sample data:
Formula in B1:
=MAP(A1:A10,LAMBDA(v,TRIM(REDUCE(v,SEQUENCE(LEN(v),,LEN(v),-1),LAMBDA(a,b,LET(x,MAKEARRAY(26,3,LAMBDA(r,c,CHOOSE(c,CHAR(r+64),CHAR(r+96),r-0))),y,MID(a,b,1),z,MID(a,b+1,1),r,BYCOL(x,LAMBDA(c,SUM(EXACT(c,y)+EXACT(c,z)))),IF(MAX(r)=1,LEFT(a,b-1)&IF((CONCAT(r)="110")*(EXACT(UPPER(y),y))," "&y,y&" ")&RIGHT(a,LEN(a)-b),a)))))))
Maybe others have shorter solutions...
Just for fun, this uses a single Reduce but I have defined some auxiliary functions. I put them in a module called 'is' in the Advanced Formula Environment so their full names are Is.Upper, Is.Lower and Is.Digit:
Upper=lambda(c,if(c="",false,and(code(c)>64,code(c)<91)));
Digit=lambda(c,if(c="",false,and(code(c)>47,code(c)<58)));
Lower=lambda(c,if(c="",false,and(code(c)>96,code(c)<123)))
=REDUCE(LEFT(A1,1),SEQUENCE(1,LEN(A1)-1,2),LAMBDA(a,c,a&IF(OR(AND(is.Digit(MID(A1,c,1)),NOT(is.Digit(MID(A1,c-1,1)))),AND(is.Upper(MID(A1,c,1)),OR(NOT(is.Upper(MID(A1,c-1,1))),is.Lower(MID(A1,c+1,1)))))," ","")&MID(A1,c,1)))
This is how the main formula looks in the Advanced Formula Environment:
=REDUCE(
LEFT(A2, 1),
SEQUENCE(1, LEN(A2) - 1, 2),
LAMBDA(a, c,
a &
IF(
OR(
AND(
is.Digit(MID(A2, c, 1)),
NOT(is.Digit(MID(A2, c - 1, 1)))
),
AND(
is.Upper(MID(A2, c, 1)),
OR(
NOT(is.Upper(MID(A2, c - 1, 1))),
is.Lower(MID(A2, c + 1, 1))
)
)
),
" ",
""
) & MID(A2, c, 1)
)
)
Note - assumes string length>1.
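To make the rule in that REDUCE easier to follow, here is a rough Python sketch of the same logic: put a space before a digit that does not follow a digit, and before a capital that either does not follow a capital or is followed by a lowercase letter. The function names are my own, and like the formula it only considers ASCII letters and digits.

```python
def is_upper(c):
    return c != "" and "A" <= c <= "Z"


def is_lower(c):
    return c != "" and "a" <= c <= "z"


def is_digit(c):
    return c != "" and "0" <= c <= "9"


def split_pascal(s):
    """Insert spaces into a PascalCase string, keeping runs of capitals or digits together."""
    if not s:
        return s
    out = [s[0]]
    for i in range(1, len(s)):
        ch = s[i]
        prev = s[i - 1]
        nxt = s[i + 1] if i + 1 < len(s) else ""  # "" past the end, like MID
        starts_number = is_digit(ch) and not is_digit(prev)
        starts_word = is_upper(ch) and (not is_upper(prev) or is_lower(nxt))
        if starts_number or starts_word:
            out.append(" ")
        out.append(ch)
    return "".join(out)


# Hypothetical column names, not taken from the question.
examples = ["FirstName", "HTTPRequest", "Address2", "ABC123", "Line2Address"]
results = [split_pascal(name) for name in examples]
```

Unlike the formula, this version also copes with empty and single-character strings.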
I am using the column header titles as the comma separated content in another cell. I am using Excel 2016. I have a table named StudentCourse and for a better illustration please see the below example layout:
[Name]  [Math]    [Geo]  [Bio]  [Fees]   [Fixes]
Ram     Very Bad  Good   Good   Unpaid   Urgent: Math, Fees
Dam     Neutral   Good   Bad    Paid     Urgent: Math, Bio
Rik     Good      Good   Good   Paid     OK: Not Urgent
Nik     Good      Good   Good   Partial  Urgent: Fees
The values for the subject columns come from a drop-down menu with the options Good, Neutral, Bad and Very Bad. If Neutral, Bad or Very Bad is selected, the Fixes column is updated with the prefix Urgent: and the column header name (Math, Geo or Fees), depending on what needs to be fixed. If no fixes are needed, the Fixes column's value will be Ok: Not Urgent.
The Fees column follows the same concept: if the Partial (partial payment) or Unpaid drop-down option is selected for the Fees column, then Fees is added to the Fixes column. In short, the Fixes column makes it easy to sort by what needs special attention, because its value is built automatically from what was chosen in the other columns.
I should also mention that I am new to Excel.
Assuming that the table is located at [A1:E9] and there are no [BLANK] cells as confirmed by OP. Enter this formula in [F2] and copy it to [F3:F9].
Excel 2016
= IF( SUMPRODUCT( ($B2:$E2<>{"Good","Good","Good","Paid"})*1 )=0, "Ok: Not Urgent",
"Urgent: " & SUBSTITUTE(
IF( $B2<>"Good", ", " & $B$1, "" )
& IF( $C2<>"Good", ", " & $C$1, "" )
& IF( $D2<>"Good", ", " & $D$1, "" )
& IF( $E2<>"Paid", ", " & $E$1, "" ), ", ", "", 1 ) )
Excel 2019 (Formula Array)
= IF( SUMPRODUCT( ($B2:$E2<>{"Good","Good","Good","Paid"})*1 )=0, "Ok: Not Urgent",
"Urgent: " &
TEXTJOIN( ", ", TRUE, IF( ($B2:$E2<>{"Good","Good","Good","Paid"}), $B$1:$E$1, TEXT(,) ) ) )
The FormulaArray is entered holding down ctrl+shift+enter simultaneously, the formula would be wrapped within { and } if entered correctly.
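If it helps to see the logic outside a formula, here is a small Python sketch of what both versions do: compare each column with its acceptable value, collect the headers that fail, and join them. The dictionary layout and function name are just for illustration; the rows are the ones from the question.

```python
# Acceptable value per column, in the same order as the table headers.
ACCEPTABLE = {"Math": "Good", "Geo": "Good", "Bio": "Good", "Fees": "Paid"}


def fixes(row):
    """Build the Fixes text for one student row (a dict keyed by column header)."""
    flagged = [col for col, ok in ACCEPTABLE.items() if row[col] != ok]
    if not flagged:
        return "Ok: Not Urgent"
    return "Urgent: " + ", ".join(flagged)


students = {
    "Ram": {"Math": "Very Bad", "Geo": "Good", "Bio": "Good", "Fees": "Unpaid"},
    "Dam": {"Math": "Neutral", "Geo": "Good", "Bio": "Bad", "Fees": "Paid"},
    "Rik": {"Math": "Good", "Geo": "Good", "Bio": "Good", "Fees": "Paid"},
    "Nik": {"Math": "Good", "Geo": "Good", "Bio": "Good", "Fees": "Partial"},
}
report = {name: fixes(row) for name, row in students.items()}
```

The join step plays the role of TEXTJOIN in the Excel 2019 version; the Excel 2016 version gets the same result by concatenating ", " & header pieces and stripping the first separator with SUBSTITUTE.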
If you list the acceptable values in columns H:I (as in the example below), you could use:
=IF(TEXTJOIN(", ",1,IF(INDEX($I$1:$I$4,MATCH($B$1:$E$1,$H$1:$H$4,0))=B2:E2,"",$B$1:$E$1))="","OK: Not Urgent","Urgent: "&TEXTJOIN(", ",1,IF(INDEX($I$1:$I$4,MATCH($B$1:$E$1,$H$1:$H$4,0))=B2:E2,"",$B$1:$E$1)))
I have one list starting from B1 (=UNIQUE(A1:A8)), and another list starting from D1 (=UNIQUE(C1:C8)). Thus =B1# and =D1# in other cells both spill.
Now, I would like to know if we could find one formula to combine List B1# and List D1# (extract only unique values) by dynamic array functions, LAMBDA, LET, etc.
I don't want to move the position of the two lists. Does anyone have a good idea?
I may not be following what you want the shape to be, but here are two shapes:
Side-by-Side
=CHOOSE({1,2},B1#,D1#)
If you want it to take the original A and C columns as input and do all the work, then:
=CHOOSE({1,2},UNIQUE(FILTER(A:A,NOT(ISBLANK(A:A)))),UNIQUE(FILTER(C:C,NOT(ISBLANK(C:C)))))
or a LET version of the same which does not require retyping the inputs:
=LET( Ltrs, A:A,
Nmbrs, C:C,
CHOOSE( {1,2},
UNIQUE(FILTER(Ltrs,NOT(ISBLANK(Ltrs)))),
UNIQUE(FILTER(Nmbrs,NOT(ISBLANK(Nmbrs)))) ) )
End-on-End
=LET( uLtrs, B1#,
uNmbrs, D1#,
ltrCt, ROWS(uLtrs),
idx, SEQUENCE( ltrCt + ROWS(uNmbrs) ),
IF( idx <= ltrCt, uLtrs, INDEX( uNmbrs, idx-ltrCt ) ) )
As above, if you want it to take the original A and C columns as input and do all the work, then:
=LET( Ltrs, A:A,
Nmbrs, C:C,
uLtrs, UNIQUE(FILTER(Ltrs,NOT(ISBLANK(Ltrs)))),
uNmbrs, UNIQUE(FILTER(Nmbrs,NOT(ISBLANK(Nmbrs)))),
ltrCt, ROWS(uLtrs),
idx, SEQUENCE( ltrCt + ROWS(uNmbrs) ),
IF( idx <= ltrCt, uLtrs, INDEX( uNmbrs, idx-ltrCt ) ) )
Both spill the results.
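For anyone puzzling over the idx trick in the End-on-End version, here is a rough Python sketch of the same indexing: positions 1 to ltrCt read from the first list, and later positions read from the second list shifted by ltrCt. The last helper is an extra step that is not in the formulas above; it shows one way to drop values that appear in both lists, which is what the question asks for. All names and sample values are illustrative.

```python
def stack(u_ltrs, u_nmbrs):
    """Stack two lists end on end using a 1-based index, as the LET formula does."""
    ltr_ct = len(u_ltrs)
    total = ltr_ct + len(u_nmbrs)
    return [
        u_ltrs[i - 1] if i <= ltr_ct else u_nmbrs[i - ltr_ct - 1]
        for i in range(1, total + 1)
    ]


def unique_in_order(values):
    """Drop repeats while keeping the first occurrence of each value."""
    return list(dict.fromkeys(values))


# Hypothetical spilled lists standing in for B1# and D1#.
stacked = stack(["a", "b", "c"], ["b", "d"])
combined = unique_in_order(stacked)
```

In a worksheet the equivalent extra step would be wrapping the stacked result in UNIQUE.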
Place the following formula into cell F2 and drag it down to F14. This will give you a unique list of the values in both Column A and Column C.
=IF(IFERROR(IF(INDEX($A$1:$A$99999,MATCH(0,COUNTIF($F$1:F1,$A$1:$A$99999),0))=0,NA(),INDEX($A$1:$A$99999,MATCH(0,COUNTIF($F$1:F1,$A$1:$A$99999),0))),INDEX($C$1:$C$99999,MATCH(0,COUNTIF($F$1:F1,$C$1:$C$99999),0)))=0,NA(),IFERROR(IF(INDEX($A$1:$A$99999,MATCH(0,COUNTIF($F$1:F1,$A$1:$A$99999),0))=0,NA(),INDEX($A$1:$A$99999,MATCH(0,COUNTIF($F$1:F1,$A$1:$A$99999),0))),INDEX($C$1:$C$99999,MATCH(0,COUNTIF($F$1:F1,$C$1:$C$99999),0))))
Let me know if you need it to behave differently.
After much research I have found nothing without VBA to count a range of cells that have been affected by conditional formatting (specifically are turned "red").
I know there is no way to count the "red" cells, so I am going the route of creating a COUNTIF formula with the same criteria as the conditional formatting, but I'm having issues creating the criteria.
I thought it would be simple and to just add "CountIF($G:$G," before the below code. This data is also inside a table named "TT".
=AND(OR(AND(TODAY()-$F1>1095,TODAY()-$G1>1095),$G1=0,AND($F1=0,TODAY()-$G1>1095)),$A1>0)
The OR makes things slightly more complicated: you need to add the COUNTIFs, and then subtract the cases where both are true (to prevent double-counting). To demonstrate, if we want the rows where Column A = 0 or Column B = 0:
=COUNTIF(A:A, 0) + COUNTIF(B:B, 0) - COUNTIFS(A:A, 0, B:B, 0)
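A quick way to convince yourself that the subtraction is needed is to try it on a few rows. This Python sketch uses made-up (A, B) pairs and compares the add-then-subtract count with a direct count of rows where either value is 0.

```python
# Hypothetical rows of (column A, column B) values.
rows = [(0, 5), (3, 0), (0, 0), (2, 7)]

count_a = sum(1 for a, _ in rows if a == 0)               # like COUNTIF(A:A, 0)
count_b = sum(1 for _, b in rows if b == 0)               # like COUNTIF(B:B, 0)
count_both = sum(1 for a, b in rows if a == 0 and b == 0)  # like COUNTIFS(A:A, 0, B:B, 0)

via_formula = count_a + count_b - count_both
direct = sum(1 for a, b in rows if a == 0 or b == 0)
```

Without the subtraction the row (0, 0) would be counted twice, giving 4 instead of 3.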
Except, you seem to be doing this with 3 conditions, which makes it bigger (add individual, subtract where 2 match, then add where all 3 match) - but there's actually a trick here, which I'll get to later.
To make it easier, we can rewrite your conditions from format Value - A1 > Const to A1 < Value - Const. This means the COUNTIF would be Countif(A:A, "<" & Value - Const)
=AND(OR(AND($F1<TODAY()-1095,$G1<TODAY()-1095),$G1=0,AND($F1=0,$G1<TODAY()-1095)),$A1>0)
Now, let's split that out into our individual COUNTIFS. There's the outer AND, so $A1>0 is in all of them, then there's an OR with 3 conditions. This gives us:
COUNTIFS($A:$A,">0", $G:$G, "<" & Today()-1095, $F:$F, "<" & Today()-1095)
COUNTIFS($A:$A,">0", $G:$G, 0)
COUNTIFS($A:$A,">0", $G:$G, "<" & Today()-1095, $F:$F, 0)
Now, here's the trick I mentioned earlier: I don't know about you, but I can see some duplication going on here. For example, the first and the third? Column F is less than Today()-1095, OR Column F is 0. Except, day 1095 is the 30th December 1902 - so Today()-1095 will always be greater than 0. Today, for example, it will be 42576. This means when the third condition is True, the first condition will also always be true. So, we can ignore the third COUNTIF entirely!
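If you want to check the serial-number arithmetic, here is a small Python sketch. It assumes Excel's default 1900 date system and only handles serials above 60, where adding the serial to 30 December 1899 gives the right calendar date; the function name is my own.

```python
from datetime import date, timedelta


def excel_serial_to_date(serial):
    """Convert a 1900-date-system serial number (> 60) to a calendar date."""
    if serial <= 60:
        # Serials up to 60 are affected by Excel's fictitious 29 Feb 1900.
        raise ValueError("this sketch only handles serials above 60")
    return date(1899, 12, 30) + timedelta(days=serial)


day_1095 = excel_serial_to_date(1095)
```

Here day_1095 is 30 December 1902, which matches the date quoted above.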
Now, we can't do this with the first and second conditions, because if column F is greater than Today()-1095 the first condition will always be False, but the second condition will be True if Column G is 0.
So, using our example from earlier, we have the following:
=COUNTIFS($A:$A,">0", $G:$G, "<" & Today()-1095, $F:$F, "<" & Today()-1095)
+COUNTIFS($A:$A,">0", $G:$G, 0)
-COUNTIFS($A:$A,">0", $G:$G, 0, $G:$G, "<" & Today()-1095, $F:$F, "<" & Today()-1095)
But! Look at that last COUNTIFS. It has G:G = 0 AND G:G < Today()-1095. But, if Column G is 0, then it is also less than Today()-1095 (Disclaimer: On-or-after New Year's Eve 1902) So, we can simplify that:
-COUNTIFS($A:$A,">0", $G:$G, 0, $F:$F, "<" & Today()-1095)
Which means our entire equation is as follows:
=COUNTIFS($A:$A,">0", $G:$G, "<" & Today()-1095, $F:$F, "<" & Today()-1095)+COUNTIFS($A:$A,">0", $G:$G, 0)-COUNTIFS($A:$A,">0", $G:$G, 0, $F:$F, "<" & Today()-1095)
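As a sanity check, here is a Python sketch that compares the three-COUNTIFS total with a direct evaluation of the original AND/OR rule. The rows and the cutoff (standing in for Today()-1095) are invented, every cell is treated as a plain number, and blank cells are not modeled.

```python
def matches_rule(a, f, g, cutoff):
    """The conditional-formatting rule, rewritten with a cutoff in place of Today()-1095."""
    return a > 0 and ((f < cutoff and g < cutoff) or g == 0 or (f == 0 and g < cutoff))


def count_via_countifs(rows, cutoff):
    """The final formula: first COUNTIFS + second COUNTIFS - overlap COUNTIFS."""
    first = sum(1 for a, f, g in rows if a > 0 and g < cutoff and f < cutoff)
    second = sum(1 for a, f, g in rows if a > 0 and g == 0)
    overlap = sum(1 for a, f, g in rows if a > 0 and g == 0 and f < cutoff)
    return first + second - overlap


# Hypothetical (A, F, G) rows; the simplification relies on cutoff > 0.
cutoff = 1000
rows = [
    (1, 500, 500),    # both older than the cutoff
    (1, 500, 0),      # first and second conditions both true
    (1, 2000, 0),     # only G = 0
    (1, 0, 500),      # third condition, already covered by the first
    (1, 2000, 2000),  # nothing matches
    (0, 500, 500),    # fails the outer A > 0 test
    (1, 2000, 500),   # F too recent and G not 0
]
direct = sum(1 for a, f, g in rows if matches_rule(a, f, g, cutoff))
via_formula = count_via_countifs(rows, cutoff)
```

Both counts agree on this sample, including the row that satisfies two conditions at once.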
I figured out my own formula using the table headers and a combination of SUM( COUNTIFS( COUNTBLANK(. It's battle tested and works!
=SUM(COUNTIFS(TT[Fiscal Law 301 CBT],"<"&TODAY()-1095,TT[Fiscal Law In-Residence],"<"&TODAY()-1095),COUNTBLANK(TT[Fiscal Law 301 CBT]),COUNTIFS(TT[Fiscal Law In-Residence],"",TT[Fiscal Law 301 CBT],"<"&TODAY()-1095))
I have gone through the variations on these and each has a different solution depending on how the names are laid out in a cell. Let me make it clear. I have an Excel sheet containing the names of my colleagues from my college days. The names are in no particular format. The "Name" column has names like:
1) Dipak C. Chopra
2) Amar D Pathak
3) Lara Naik
4) Reshma Laxman Bhavsar
So as can be seen, some have simply a middle initial, some have it with a period and some have it missing while some have a full middle name. What I wish to do is to rewrite these names in another column by last name so that it turns out like:
1) Chopra Dipak C.
2) Pathak Amar D
3) Naik Lara
4) Bhavsar Reshma Laxman
I can do it, but I have to use formulae with variations depending on the name in the cell, e.g.
=TRIM(RIGHT(B2,LEN(B2)-FIND(" ",B2)) & " " & LEFT(B2,FIND(" ",B2))) for the 3rd entry
=TRIM(RIGHT(B4,LEN(B4)-FIND(" ",B4)-1) & " " & LEFT(B4,FIND(" ",B4)+1)) for the 2nd entry
=TRIM(RIGHT(B13,LEN(B13)-FIND(" ",B13,FIND(" ",B13)+1))&" "&LEFT(B13,FIND(" ",B13,FIND(" ",B13)+1))) for the 4th entry.
My question is how can I revise this formula to give me the desired result in all the cases mentioned above?
=IF(ISERROR(FIND(" ",B2)),B2,RIGHT(B2,LEN(B2)-FIND("~",SUBSTITUTE(B2," ","~",LEN(B2)-LEN(SUBSTITUTE(B2," ","")))))&" "&LEFT(B2,FIND("~",SUBSTITUTE(B2," ","~",LEN(B2)-LEN(SUBSTITUTE(B2," ",""))))-1))
It does a reverse string search to find the last space, then uses that position to cut off the last name and append the rest of the string after it. If there are no spaces in the string, it just returns the value in B2.
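The same idea is easier to see in a language with a built-in reverse search. This Python sketch is only an illustration of the logic, not a replacement for the formula; the function name is my own and the names are the ones from the question.

```python
def last_name_first(name):
    """Move everything after the last space to the front of the string."""
    pos = name.rfind(" ")  # position of the last space, -1 if there is none
    if pos == -1:
        return name
    return name[pos + 1:] + " " + name[:pos]


names = ["Dipak C. Chopra", "Amar D Pathak", "Lara Naik", "Reshma Laxman Bhavsar"]
reordered = [last_name_first(n) for n in names]
```

The formula has no rfind, so it counts the spaces, swaps the last one for "~" with SUBSTITUTE, and then uses FIND on that marker to get the same position.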
Beautified
=IF(
ISERROR(
FIND(
" ",
B2
)
),
B2,
RIGHT(
B2,
LEN(
B2
) -
FIND(
"~",
SUBSTITUTE(
B2,
" ",
"~",
LEN(
B2
) -
LEN(
SUBSTITUTE(
B2,
" ",
""
)
)
)
)
) & " " &
LEFT(
B2,
FIND(
"~",
SUBSTITUTE(
B2,
" ",
"~",
LEN(
B2
) -
LEN(
SUBSTITUTE(
B2,
" ",
""
)
)
)
) - 1
)
)