I'm a problem with one of my formula in excel. The formula is supposed to return the standard deviation of an range of cells if the value is not judged to be an outlier in another column.
My formula is STDAFV.S(HVIS(R2:R15<>"Outlier";P2:P15;""))
The problem is that it returns a wrong value. In the example below, the formula returns a wrong value of 0.010729, which should be 0.001652.
I'm well aware, that this is a array formula, thus I do press Ctrl+Shift+Enter. So this is not the issue!
Does anyone have a clue for what is wrong?
Col P | Col R
0.0215|
0.0207|
0.0233|
0.0213|
0.0187|
0.0917| Outlier
I get the same result as you when I am computing STDDEV across the 6 values of your OP including 7 empty cells as in your formula (P2:P15)
By doing so I add 7 times a value = 0 to the set which will affect the mean value and the divisor (N-1) in the STDDEV formula.
If I limit the range from row 2 to 7 ... the ones actually containing data, I get the result you're expecting.
Edit:
To circumvent the problem of empty cells without redefinig the range everytime you can make use of =COUNT() and =OFFSET() functions ...
Your values are placed in a growing list, but without empty cells between. So the number of cells (=rows) is the result of a =COUNT(B2:B1000)
Your effective data range now is "from B2 and [count result] cells down" ... i.e. =OFFSET(B2;0;0;COUNT(B2:B1000);1)
Likewise, your effective comment range now is "from C2 and [count result in B] cells down" ... i.e. =OFFSET(C2;0;0;COUNT(B2:B1000);1)
Combining all this ... with data in B2:B15 and remark in C2:C15, the array formula becomes
{ =STDEV.S(IF(OFFSET(C2;0;0;COUNT(B2:B15);1)<>"Outlier";OFFSET(B2;0;0;COUNT(B2:B15);1))) }
ta-taaaa!
Related
I have a spreadsheet that I track my hours. Each cell initially is populated with a formula, i.e. =IF(WORKDAY(B24-1,1,holidays2019)=B24,OFFSET(C24,-1,2),0)
and then as the month progresses I enter my actual time.
In the following excerpt all values through 5/10/2019 are entered.
The formula =SUMIF(C5:C19,NOT(ISFORMULA(C5:C19))) shows zero. I do not understand why this does not work.
I appreciate any help! Column B in my spreadsheet corresponds to the dates shown below and Column C to the time entries.
Expected Result: 48.9
=SUMPRODUCT(J6:J20,--NOT(ISFORMULA(J6:J20)))
The key to this solution is the -- in front of the NOT(). A boolean that is processed by a math operator gets converted to 1 or 0. --, +0, -0, *1, /1 would have all worked to do the conversion. So now you wind up with an array of values you may want to sum being multiplied by an array of 1 and 0 to indicate the ones you want. The 1 are manual entry and the 0 are your formulas entries.
Now SUMPRODUCT performs array like calculations. As a result avoid using full column/row references inside it or you will wind up with a lot of excess calculations. Adjust the ranges in the answer to suit your needs.
Here's the MSDN definition of the Criteria in =SUMIF
criteria Required. The criteria in the form of a number, expression,
a cell reference, text, or a function that defines which cells will be
added. For example, criteria can be expressed as 32, ">32", B5, "32",
"apples", or TODAY().
Important: Any text criteria or any criteria that includes logical or
mathematical symbols must be enclosed in double quotation marks (").
If the criteria is numeric, double quotation marks are not required.
So, the reason, why your SUMIF returns 0 is, because none of the cells match the criteria, as they return a number and meanwhile they expect FALSE
Another issue here being, that the ISFORMULA will return TRUE, even when a range contains a single formula while all the rest has none. So basically you need to drag the formula down for each cell individually and sum them up only when a value is TRUE
Starting from cell D1:
=ISFORMULA(B1)
And then you can simply sum them up with the formula you provided.
=SUMIF(D1:D16,TRUE,C1:C16)
Obviously, you can hide the column D to make it more aesthetically pleasing.
Your formula fails because the criteria you're matching against, is TRUE/FALSE. Obviously the values in C5:C19 don't contain any booleans, so the sum is 0.
To solve this, you can add the correct criteria in cell D5 and below: =ISFORMULA(C5)
Then use =SUMIF(D5:D19,FALSE,C5:C19) to sum the values in column C.
I have a list of 1s and 0s in excel row ranging from B2:K2, I want to calculate the current streak of 1's in cell M2,
example dataset where streak would be 4
1 0 1 0 1 1 1 1 0
Is there a simple way of doing this? I have tried research but not been able to find anything specific.
Any help would be much appreciated.
Here is a way of doing this with just one formula, no helper columns/rows needed:
The formula used translates to:
{=MAX(FREQUENCY(IF(B1:K1=1,COLUMN(B1:K1)),IF(B1:K1=1,0,COLUMN(B1:K1))))}
Note: It's an array formula and should be entered through CtrlShiftEnter
Assuming your data is layed out horizontally like the image below, the following two formulas should do it for you.
The first cell requires a different formula as the is no cell to the left to refer to. so a simple formula to check if the first cell is one or not is entered in B2.
=--(A1=1)
The part in the bracket will either be true or false. A quirk of excel is that if you send a true or false value through a math operation it will be converted to 1 for true and 0 for false. That is why you see the double - in front. could have also done *1, /1, +0,-0 at the end.
In B2 place the following formula and copy right as needed:
=(A2+1)*(B1=1)
Basically it adds 1 to the series, then check if the number in the sequence is 1 or 0. In the event its one, it keeps the value as it is TRUE sent through the math operator *. If it is false it set the sequence back to zero by multiplying False by the math operator *.
Alternate IF
Now the above while it works and may save a few characters is not necessarily intuitive for most. The go to option would be to use an IF function. The above formulas can be replaced with the following:
A3
=IF(A1=1,1,0)
B3 ->Copied right
=IF(B1=1,A3+1,0)
Longest streak
To get the longest streak, the highest value in your helper row is what you want. You can grab this with the following formula in an empty cell.
=MAX(2:2)
=MAX(A2,I2)
If you have no other numbers in your helper row, you can use the first formula which looks in the entire row. If there are other numbers due to calculations off to the left or right as an example, then you will want to restrict your range to you data as in the second formula.
I've put those values in cells B2 to B8.
In cell C3, I've put this formula:
=IF(AND(B3=1;B2=1);C2+1;1)
Dragging this downto C8, and then take the maximum of the C column.
I have a small data set of 2 columns and several rows (columns A and B)
I want to return each instance of codeblk 3 in a formula that is elsewhere in my sheet, (so a vlookup is out as it only shows the first instance) if it does not appear then a message to say its not there should come up.
I have the formula partially working but i cant see the reason why its not displaying the values.
My formula is as below:
This is an array
{=IF(ISERROR(INDEX($A$55:$B$70,SMALL(IF($B$55:$B$70=3,ROW($B$55:$B$70)),ROW(1:1))-1,1)),"No value's produced",INDEX($A$2:$C$7,SMALL(IF($B$55:$B$70=3,ROW($B$55:$B$70)),ROW(1:1))-1,1))}
The result that shows up is only "No values produced" but it should reflect statement B, C and D in 3 separate cells (when changing ROW(1:1), ROW(2:2) etc)
{=SMALL(IF($B$56:$B$69=4,ROW($B$56:$B$69)),ROW(1:1))} - This produces the result 68 which is the correct row.
Any ideas?
Thanks,
This is an array formula - Validate the formula with Ctrl+Shift+Enter while still in the formula bar
=IFERROR(INDEX($A$55:$B$70,SMALL(IF($B$55:$B$70=3,ROW($B$55:$B$70)-54),ROW(1:1)),1),"No value's produced")
The issue you are facing is that your index starts it's first row on $B$55, you need to offset the row numbers in the array to reflect this. For example, the INDEX contains 16 rows but if you had a match on the first row you are asking for the 55th row from that INDEX(), it just can't fulfil that.
EDIT
The offset was out of sync as your original formula included another -1 outside of the IF(), I also left an additional bracket in play (the formula above has now been edited)
The ROW() function will essentially translate $B$55:$B$70 into ROW(55:70) which will produce the array {55;56;57;58;59;60;61;62;63;64;65;66;67;68;69;70} so the offset is needed to translate those row numbers in to the position they represent in the indexed data of INDEX().
The other IF() statement then produces and array of {FALSE;2;3;4;FALSE etc.
You can see these results by highlighting parts of the formula in the formula bar and hitting F9 to calculate.
I've been trying to write a formula to summarise in one cell the presence/ absence of certain values in a different range of of cells in Excel
So in the one table and worksheet, I wrote
=IF(B1:F1=1,1,0) Formula 1
which is supposed to mean
If any of the values in cells B1:F1 are equal to 1, then note 1, otherwise not 0.
But somehow the syntax isn't working.
I've applied "" and ; and brackets right left and centre, but to no avail.
I'm pretty sure I done this before and it was pretty simple when I hit upon the right synstax, but the how and where fell through the colander which is my brain today :-?
Additionally I will want to ask the formula to apply another condition to the output cell which is
=if (A1 = value n or certain values, 1, 0) Formula2
Column A has numerically coded ordinal values 0-9, so an aexample of teh 1 conditions might be any of values 1, 2 or 9 in column, should produce a 1 in the result cell in which Formula 1 and 2 will be written.
So the result cell would contain somelike
=Formula1_or_Formula2_contain_certain_values, 1, 0) Formula 3
Obviously the systax of Formulas 2 and 3 is awol, but I write to demonostrate the formulae intended purposes.
The easiest way to make the first formula is like this:
=IF(SUM(B1:E1)>0,1,0)
No arrays, no problems :)
I have two different sheets with 300,000 data in Excel.
First sheet contains:
S2_Symbol Start_Pos End Position
STE 254857 267891
PRI 748578 758962
ILA 852741 963369
VIS 789456 796325
Second:
S1_Location
789460
852898
748678
My output should be like this:
S1_Location Symbol
789460 VIS
852898 ILA
748678 PRI
I have to find that S1_location falls in which S2_location and its corresponding Symbol. I have used INDEX formula in Excel but for each cell, I have to change the reference cell manually. I couldn't do it 300,000 data.
How can I do in an in Excel or should I use a script?
This solution assumes the following:
Start and End Positions for each S2 Symbol are unique (i.e. there is no intersection between the ranges allocated to each symbol)
Data in first sheet is located at A1:D17 (adjust ranges in formulas as needed)
Data in second sheet is locate at A1:B300010 (adjust ranges in formulas as needed)
The solution requires:
To add a working column in worksheet one. Enter this formula in D2 and copy till last record.
=ROWS($A$1:$A2)
Fig. 1
Then in second worksheet enter this formula at B2 and copy till last record.
=INDEX( Sheet1!$A$1:$A$17,
SUMIFS( Sheet1!$D$1:$D$17,
Sheet1!$B$1:$B$17, "<=" & $A2, Sheet1!$C$1:$C$17, ">=" & $A2 ) )
Fig. 2
It took aprox. less than 14 seconds to copy downwards and calculate the formulas in sheet 2.
As it can be seen in figures 1 and 2 none of the tables need to be sorted.
Assuming both sheets start in A1, and First sheet ColumnB is sorted ascending, in Second sheet B2 please try:
=INDEX(First!A:A,MATCH(A2,First!B:B))
copied down to suit. It relies on inexact matching.
Assuming we have a Sheet1 like this:
note, the Sheet1is sorted by Start_Pos, End_Pos in ascending order.
and a Sheet2 like this:
Then the formula in Sheet2!B2 downwards could be:
=INDEX(Sheet1!A:A,IF(MATCH(A2,Sheet1!B:B)>IFERROR(MATCH(A2-(10^-10),Sheet1!C:C),0),MATCH(A2,Sheet1!B:B),NA()))
See MATCH: https://support.office.com/en-us/article/MATCH-function-e8dffd45-c762-47d6-bf89-533f4a37673a
The idea is: MATCH without exact matching (without parameter match_type) gets the row of the largest value which is smaller or equal the search value. So in the Start_Pos column it will get the row from which we can get the S2_Symbol. But from the End_Pos column it should get one row beforehand if the value is not outside the given ranges.
There is only one exception. If the value is exact the value in the End_Pos column, then it will return the same row as in the Start_Pos column. Considering this exception, we can search in the End_Pos column with a little bit smaller value. Thanks to Tom Sharpe for his comment.
The formula in Sheet2!D2 downwards is:
{=INDEX(Sheet1!A:A,MIN(IF($A2>=Sheet1!$B$2:$B$300000,IF($A2<=Sheet1!$C$2:$C$300000,ROW(Sheet1!$A$2:$A$300000),2^20+1))))}
this is an array formula which is exactly formulated respecting the requirements. But this is very bad in performance for using in much many cells. But using this, the Sheet1 is not required to be sorted.
Benchmark test:
Have the following Sheet1:
Formulas:
A2:A300002: ="S"&(ROW(A1)-1)*10&"-"&(ROW(A1)-1)*10+7
B2:B300002: =(ROW(A1)-1)*10
C2:C300002: =B2+7
and the following Sheet2:
Formulas:
A2:A300002: =RANDBETWEEN(0,3000007)
B2:B300002: =INDEX(Sheet1!A:A,IF(MATCH(A2,Sheet1!B:B)>IFERROR(MATCH(A2-10^-9,Sheet1!C:C),0),MATCH(A2,Sheet1!B:B),NA()))
Note the -10^-9 instead of -10^-10 in previous version. This is because we have only 16 digits precision. In previous version this was maximum 6 digits integer part and then 10 digits decimal part. Now it is maximum 7 digits integer part and then 9 digits decimal part.
Calculation after pressing F9 in Sheet2 takes ca. 2 s. (Excel 2007, Windows 7, 4 core processor).
I would have gone for something like this which gives you the first match if there is one:-
=INDEX(First!A:A,MATCH(1,(First!B:B<=A2)*(First!C:C>=A2),0))
assuming keys and start and end values are in a sheet called First and lookup values start in A2.
Array formula which must be entered with CtrlShiftEnter
In response to the question from #pnuts about how long it will take, I have set up a similar benchmark with 300,000 rows in each sheet and it has reached 1% after 90 minutes, so it should take about 150 hours to reach 100% or roughly one week. This is to be expected as the number of computations required is (rows in sheet 1) X (rows in sheet 2)
300,000 X 300,000
but in fact because the multiplication applies to complete columns, I believe it is more correctly
300,000 X 1,048,576
i.e. > 300 billion.
A practical version which gives good response for smaller ranges is as follows:-
I define three named ranges Range1, Range2 and Range3
=First!$A$1:INDEX(First!$A:$A,MATCH("ZZZ",First!$A:$A))
=First!$B$1:INDEX(First!$B:$B,MATCH(9.9E+307,First!$B:$B))
=First!$C$1:INDEX(First!$C:$C,MATCH(9.9E+307,First!$C:$C))
and the modified formula is
=INDEX(Range1,MATCH(1,(Range2<=A2)*(Range3>=A2),0))
I was thinking of deleting this answer, but would rather it stood as a counter-example.