I'm trying to build a formula that calculates an average for a variable that meets a couple of conditions. The catch is that I want the average to be calculated only when the sum of investments for the specified year and types is within the low (J4) and high (K4) values.
In this example the expected result would be variable a = 15%. The conditions are: year 2015; types b and c (from the range H8:H10; the order of a, b, c may vary); and the sum of investments for those criteria between 400 and 600. Here type c does not meet the criteria because its sum of investments is 900, which is outside the specified range. Type b has investments equal to 500, so the average is calculated.
Any ideas how I can handle this? Thanks.
This is a first try at it, but the formula is still a huge monster. There might be a better, more elegant way. But, at least it works.
=(IF(AND(SUMIF($D$4:$D$21,$H$8,$E$4:$E$21)>=$J$4, SUMIF($D$4:$D$21,$H$8,$E$4:$E$21)<=$K$4), SUMIFS($F$4:$F$21,$B$4:$B$21,$H$4,$C$4:$C$21,$I$4,$D$4:$D$21,$H$8), 0) + IF(AND(SUMIF($D$4:$D$21,$H$9,$E$4:$E$21)>=$J$4, SUMIF($D$4:$D$21,$H$9,$E$4:$E$21)<=$K$4), SUMIFS($F$4:$F$21,$B$4:$B$21,$H$4,$C$4:$C$21,$I$4,$D$4:$D$21,$H$9), 0) + IF(AND(SUMIF($D$4:$D$21,$H$10,$E$4:$E$21)>=$J$4, SUMIF($D$4:$D$21,$H$10,$E$4:$E$21)<=$K$4), SUMIFS($F$4:$F$21,$B$4:$B$21,$H$4,$C$4:$C$21,$I$4,$D$4:$D$21,$H$10), 0)) / (COUNTIF($D$4:$D$21,$H$8)*--(IF(AND(SUMIF($D$4:$D$21,$H$8,$E$4:$E$21)>=$J$4, SUMIF($D$4:$D$21,$H$8,$E$4:$E$21)<=$K$4), SUMIFS($F$4:$F$21,$B$4:$B$21,$H$4,$C$4:$C$21,$I$4,$D$4:$D$21,$H$8), 0)<>0) + COUNTIF($D$4:$D$21,$H$9)*--(IF(AND(SUMIF($D$4:$D$21,$H$9,$E$4:$E$21)>=$J$4, SUMIF($D$4:$D$21,$H$9,$E$4:$E$21)<=$K$4), SUMIFS($F$4:$F$21,$B$4:$B$21,$H$4,$C$4:$C$21,$I$4,$D$4:$D$21,$H$9), 0)<>0) + COUNTIF($D$4:$D$21,$H$10)*--(IF(AND(SUMIF($D$4:$D$21,$H$10,$E$4:$E$21)>=$J$4, SUMIF($D$4:$D$21,$H$10,$E$4:$E$21)<=$K$4), SUMIFS($F$4:$F$21,$B$4:$B$21,$H$4,$C$4:$C$21,$I$4,$D$4:$D$21,$H$10), 0)<>0))
Here is a screenshot:
I have named the columns with ranges (type: data for column type D4:D21, invst: E4:E21, year:b4:b21, min:cell j4, max: cell k4, pcg:f4:f21) for ease of understanding.
Paste this in H8 as an array formula (for type a, with the year filter) and name the cell "suma":
=SUMIFS(invst,type,H8,year,2015)
Paste this in H9 as an array formula (for type b) and name it "sumb":
=SUMIFS(invst,type,H9)
Paste this in J8 (answer for type a) as an array formula:
=IFERROR(AVERAGE(IF(AND(suma>min,suma<max),IF(type=H8,pcg,""),"")),"")
Paste this in J9 (answer for type b) as an array formula:
=IFERROR(AVERAGE(IF(AND(sumb>min,sumb<max),IF(type=H9,pcg,""),"")),"")
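For readers who want to check the intended logic outside Excel, here is a minimal Python sketch of the rule described in the question: sum the investments per type for the chosen year, keep only the types whose sum lies between the low and high bounds, then average the percentage column over the rows of those types. The sample rows and the function name `conditional_average` are invented for illustration and are not the asker's data; the bounds are treated as inclusive, as in the first answer.

```python
from collections import defaultdict

# Hypothetical rows: (year, type, investment, pct). Not the asker's real data.
rows = [
    (2015, "b", 200, 0.10),
    (2015, "b", 300, 0.20),
    (2015, "c", 900, 0.30),
    (2014, "b", 100, 0.50),
]

def conditional_average(rows, year, types, low, high):
    """Average pct over rows whose type's investment sum (for `year`) is in [low, high]."""
    totals = defaultdict(float)
    for y, t, investment, _ in rows:
        if y == year and t in types:
            totals[t] += investment
    # Keep only the types whose summed investment falls inside the bounds.
    qualifying = {t for t, total in totals.items() if low <= total <= high}
    values = [pct for y, t, _, pct in rows if y == year and t in qualifying]
    return sum(values) / len(values) if values else None

result = conditional_average(rows, 2015, {"b", "c"}, 400, 600)
```

With these made-up rows, type b sums to 500 and is kept, type c sums to 900 and is dropped, so the average is taken over the two type-b rows only.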
Related
I want a SUMIFS formula such that when I drag it down a row, the sum range jumps across one column (in a different tab).
For simplicity, say in cell B2 I have:
=Sumif(array, criteria, 'Sheet 1'!A:A).
I would like to drag this formula down such that cell B3 has
=Sumif(array, criteria, 'Sheet 1'!B:B).
I would also need this to be robust against dragging across the columns. ie. if I were to then drag across to cell C3, I'd like it to be
=Sumif(array2, criteria2, 'Sheet1'!B:B).
It seems to come down to inputting a number to produce an array, such that when this gets dragged down the rows, the number increases by 1 and thus the column of the sum range increases by 1. But I just can't get it to work.
Any thoughts?
Refer to the screenshot(s) here:
M1: sumif/offset
=SUMIF(A:A,F$4,OFFSET($C:$C,0,ROWS(J$4:J4)-1,ROWS(C:C),1))
Notes: dragging J4 to the right updates the criteria range and criteria, but leaves the sum range ($C:$C) unchanged. Dragging down updates only the sum range (from 1 -> 2).
M2: sum/offset
=SUM(IFERROR((A:A=F$4)*(OFFSET($C:$C,0,ROWS(J$9:J9)-1,ROWS(A:A),1)),))
Notes: because of 'lazy' whole-column references (e.g. columns A, C), IFERROR is required within SUM to handle cells that do not contain valid values for this function; see the general notes section below.
M3: sum/index
=SUM(IFERROR(1*(A:A=F$4)*INDEX($C:$D,0,ROWS(J$15:J15)),))
General notes/caveats etc.
The ranges used in your example would conflict: B2 contains the initial SUMIF with its sum range immediately to the left (A:A), so dragging down would attempt to sum column B (i.e. a circular reference). I have updated the ranges to something more purposeful accordingly. I appreciate you only provided an illustrative example, so no sweat ☺
Various other methods exist, e.g. SUMPRODUCT, or substitute OFFSET in (1) with INDEX (2) for the same outcome. General principle: use a reference function (OFFSET/INDEX/direct reference, etc.) such that column offset = rows.
RE: 'robust': simply use dollar signs ($) to fix columns. This is done only for the sum range; the other ranges are assumed to be adjacent to one another given the info provided, and they can be modified in a similar fashion to the sum range (i.e. using INDEX/OFFSET etc.) as required.
This assumes the sum range (given as 'Sheet 1'!A:A in your question) is identical to the array / criteria range. I would advise using specific ranges to avoid intensive calculation delays: table functions, named ranges, or '#' spill references (if you have an Office 365 compatible version of Excel).
(A shared OneDrive file provides an example of sum/filter/offset (requires Office 365); it is not included above as three methods are already provided and only one was requested.)
Checks substantiate veracity/accuracy of these fns as req.
Ta
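The general principle in the notes above (column offset = number of rows dragged down) can be sketched in Python as follows. The data, the column letters and the helper name `shifted_sumif` are hypothetical and only illustrate the idea; they are not taken from the workbook in the answer.

```python
# Hypothetical data: one criteria column plus two candidate sum columns.
criteria_col = ["x", "y", "x", "y"]
sum_cols = {"C": [1, 2, 3, 4], "D": [10, 20, 30, 40]}
col_order = ["C", "D"]  # left-to-right order of the sum columns

def shifted_sumif(criterion, rows_down):
    """SUMIF whose sum column moves one column to the right per row dragged down."""
    col = sum_cols[col_order[rows_down]]
    return sum(v for key, v in zip(criteria_col, col) if key == criterion)

first_row = shifted_sumif("x", 0)   # sums column C
second_row = shifted_sumif("x", 1)  # sums column D
```

Here `rows_down` plays the role of `ROWS(J$4:J4)-1` in the OFFSET version: it is 0 in the first formula row and grows by 1 for each row below it.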
Disclaimer: New to VBA
I have an Excel sheet where I would like to build a VBA macro that does the following things.
My requirement is to fill a value in column Z based on a formula determined by the text in column M.
The formula sums values from columns A, B, C in different combinations (determined by the text in column M).
Loop this for each row starting from row 5 to row 10000.
Eg:
If cell M5 = Apple, then Z5 = A5+B5;
If cell M9 = Samsung, then Z9 = C9+A9
I know a nested IF formula does it easily but there are just too many conditions and I'm looking for an automatic and cleaner route.
Thanks!
Not VBA, but I think this will solve the issue:
The solution requires two parts:
a) A table (or named range) to hold the string value to match, and the two columns to be added. This table (BrandEq in my example) is essentially a map that indicates which cells are to be summed. Each column letter (A, B, C, ...) is followed by a '#' placeholder for the row number, which is substituted in later.
b) A custom formula (detailed below) in each cell (z, in your case) to return the sum of the appropriate columns.
[Please see the attached image]
Personally, I prefer tables, but either that or a named range is recommended because additional "brands" and combinations (Cell1, Cell2 in my example) can be added easily and the formula automatically updates, so there is no going back to the base formula.
The "z cell" formula is:
=INDIRECT(SUBSTITUTE(VLOOKUP(Sheet1!$D15,BrandEq,2,FALSE),"#",TEXT(ROW(),"#####"))) +INDIRECT(SUBSTITUTE(VLOOKUP(Sheet1!$D15,BrandEq,3,FALSE),"#",TEXT(ROW(),"#####")))
This may be easier to understand working from the 'inside-out':
VLOOKUP( from the 'DATA' table/range, take the current row's brand column and look it up in the 'BrandEq' table/range, returning the contents of the corresponding column. Note the 2 on the first line and the 3 on the second line; FALSE = exact match on the lookup value. )
with this value (e.g. a#) ->
SUBSTITUTE( replace the '#' placeholder with the current ROW() as a TEXT() value )
INDIRECT( return the value of the resulting cell reference from the data table. The formula is entered in 'zval' row 14 (my example) (e.g. 'a#' -> a11 = 100) )
This formula is called a second time, this time returning the 'Cell2' value; everything else is the same. (e.g. 'b#' -> b11 = 15)
The results are added together.
Row 11, for example, parses as follows:
INDIRECT(SUBSTITUTE(VLOOKUP(Sheet1!$D11,BrandEq,2,FALSE),"#",TEXT(ROW(),"#####")))
+
INDIRECT(SUBSTITUTE(VLOOKUP(Sheet1!$D11,BrandEq,3,FALSE),"#",TEXT(ROW(),"#####")))
INDIRECT(SUBSTITUTE(VLOOKUP("Apple",BrandEq,2,FALSE),"#",TEXT(11,"#####")))
+
INDIRECT(SUBSTITUTE(VLOOKUP("Apple",BrandEq,3,FALSE),"#",TEXT(11,"#####")))
INDIRECT(SUBSTITUTE("a#","#","11")) + INDIRECT(SUBSTITUTE("b#","#","11"))
INDIRECT("a11") + INDIRECT("b11")
100 + 15
115
If additional or different cells need to be referenced, they only need to be included in the 'BrandEq' table. Note: the cells being referenced do not need to be contiguous; they can be spread throughout the row.
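The same lookup-table idea, sketched in Python for comparison: a mapping from the text in column M to the two columns to add, applied row by row. The mapping contents follow the two examples in the question (Apple = A + B, Samsung = C + A); the dictionary name `BRAND_EQ`, the function `z_value` and the sample numbers are illustrative assumptions.

```python
# Hypothetical mapping, mirroring the BrandEq table: brand -> the two columns to add.
BRAND_EQ = {
    "Apple": ("A", "B"),
    "Samsung": ("C", "A"),
}

def z_value(row):
    """Return the sum of the two columns that the row's brand (column M) maps to."""
    first, second = BRAND_EQ[row["M"]]
    return row[first] + row[second]

# Made-up sample rows keyed by column letter.
rows = [
    {"A": 100, "B": 15, "C": 7, "M": "Apple"},
    {"A": 100, "B": 15, "C": 7, "M": "Samsung"},
]
z_values = [z_value(r) for r in rows]
```

As with the table in the answer, supporting a new brand only means adding one entry to the mapping; the per-row logic does not change.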
I have two different sheets with 300,000 rows of data in Excel.
First sheet contains:
S2_Symbol   Start_Pos   End Position
STE         254857      267891
PRI         748578      758962
ILA         852741      963369
VIS         789456      796325
Second:
S1_Location
789460
852898
748678
My output should be like this:
S1_Location   Symbol
789460        VIS
852898        ILA
748678        PRI
I have to find which S2 range (Start_Pos to End Position) each S1_Location falls in, and its corresponding Symbol. I have used an INDEX formula in Excel, but for each cell I have to change the reference cell manually, which I can't do for 300,000 rows.
How can I do this in Excel, or should I use a script?
This solution assumes the following:
Start and End Positions for each S2 Symbol are unique (i.e. there is no intersection between the ranges allocated to each symbol)
Data in first sheet is located at A1:D17 (adjust ranges in formulas as needed)
Data in second sheet is located at A1:B300010 (adjust ranges in formulas as needed)
The solution requires:
To add a working column in worksheet one. Enter this formula in D2 and copy till last record.
=ROWS($A$1:$A2)
Fig. 1
Then in second worksheet enter this formula at B2 and copy till last record.
=INDEX( Sheet1!$A$1:$A$17,
SUMIFS( Sheet1!$D$1:$D$17,
Sheet1!$B$1:$B$17, "<=" & $A2, Sheet1!$C$1:$C$17, ">=" & $A2 ) )
Fig. 2
It took less than 14 seconds to copy down and calculate the formulas in sheet 2.
As can be seen in figures 1 and 2, neither table needs to be sorted.
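A minimal Python sketch of why the SUMIFS trick works, using the four sample rows from the question: because the ranges do not overlap, at most one row satisfies both conditions, so summing the helper-column values of the matching rows yields exactly that row's position. Positions here are 1-based within the list rather than worksheet row numbers, and `symbol_for` is a hypothetical name.

```python
# Sample rows from the question, deliberately left unsorted.
table = [
    ("STE", 254857, 267891),
    ("PRI", 748578, 758962),
    ("ILA", 852741, 963369),
    ("VIS", 789456, 796325),
]

def symbol_for(location):
    """Mimic INDEX(..., SUMIFS(helper, start, "<=" & x, end, ">=" & x))."""
    # Sum the 1-based positions of all rows whose range contains the location.
    position = sum(
        i
        for i, (_, start, end) in enumerate(table, start=1)
        if start <= location <= end
    )
    # A sum of 0 means no range matched (assumption: return None in that case).
    return table[position - 1][0] if position else None

symbols = [symbol_for(x) for x in (789460, 852898, 748678)]
```

If two ranges did overlap, the sum would add two positions together and point at the wrong row, which is why the uniqueness assumption at the top of the answer matters.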
Assuming both sheets start in A1, and First sheet ColumnB is sorted ascending, in Second sheet B2 please try:
=INDEX(First!A:A,MATCH(A2,First!B:B))
copied down to suit. It relies on inexact matching.
Assuming we have a Sheet1 like this:
Note: Sheet1 is sorted by Start_Pos, End_Pos in ascending order.
and a Sheet2 like this:
Then the formula in Sheet2!B2 downwards could be:
=INDEX(Sheet1!A:A,IF(MATCH(A2,Sheet1!B:B)>IFERROR(MATCH(A2-(10^-10),Sheet1!C:C),0),MATCH(A2,Sheet1!B:B),NA()))
See MATCH: https://support.office.com/en-us/article/MATCH-function-e8dffd45-c762-47d6-bf89-533f4a37673a
The idea is: MATCH without exact matching (i.e. without the match_type parameter) returns the row of the largest value that is smaller than or equal to the search value. So in the Start_Pos column it returns the row from which we can get the S2_Symbol. In the End_Pos column, however, it should return the row before that one if the value lies inside one of the given ranges.
There is only one exception. If the value is exactly equal to a value in the End_Pos column, then MATCH returns the same row as in the Start_Pos column. To handle this exception, we search the End_Pos column with a slightly smaller value. Thanks to Tom Sharpe for his comment.
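The approximate-match idea corresponds to a binary search. The Python sketch below uses `bisect_right` on the sorted Start_Pos values to find the last start that is smaller than or equal to the search value. Note that it swaps in a direct comparison against that row's end value in place of the second MATCH on End_Pos; this is a simplification for illustration, not a translation of the formula. The data is the sample from the question, sorted by start.

```python
from bisect import bisect_right

# Sample rows from the question, sorted by Start_Pos (required for binary search).
rows = [
    ("STE", 254857, 267891),
    ("PRI", 748578, 758962),
    ("VIS", 789456, 796325),
    ("ILA", 852741, 963369),
]
starts = [start for _, start, _ in rows]

def lookup(location):
    """Return the symbol whose [start, end] range contains location, else None."""
    # Index of the largest start <= location, like MATCH with approximate matching.
    i = bisect_right(starts, location) - 1
    if i >= 0 and location <= rows[i][2]:
        return rows[i][0]
    return None  # below the first start, or in a gap between ranges

found = [lookup(x) for x in (789460, 852898, 748678)]
```

Each lookup costs a logarithmic number of comparisons instead of a scan over every row, which is the reason the sorted approach scales to 300,000 rows.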
The formula in Sheet2!D2 downwards is:
{=INDEX(Sheet1!A:A,MIN(IF($A2>=Sheet1!$B$2:$B$300000,IF($A2<=Sheet1!$C$2:$C$300000,ROW(Sheet1!$A$2:$A$300000),2^20+1))))}
This is an array formula that follows the requirements exactly, but it performs very poorly when used in many cells. With it, however, Sheet1 does not need to be sorted.
Benchmark test:
Have the following Sheet1:
Formulas:
A2:A300002: ="S"&(ROW(A1)-1)*10&"-"&(ROW(A1)-1)*10+7
B2:B300002: =(ROW(A1)-1)*10
C2:C300002: =B2+7
and the following Sheet2:
Formulas:
A2:A300002: =RANDBETWEEN(0,3000007)
B2:B300002: =INDEX(Sheet1!A:A,IF(MATCH(A2,Sheet1!B:B)>IFERROR(MATCH(A2-10^-9,Sheet1!C:C),0),MATCH(A2,Sheet1!B:B),NA()))
Note the -10^-9 instead of -10^-10 in previous version. This is because we have only 16 digits precision. In previous version this was maximum 6 digits integer part and then 10 digits decimal part. Now it is maximum 7 digits integer part and then 9 digits decimal part.
Calculation after pressing F9 in Sheet2 takes ca. 2 s. (Excel 2007, Windows 7, 4 core processor).
I would have gone for something like this which gives you the first match if there is one:-
=INDEX(First!A:A,MATCH(1,(First!B:B<=A2)*(First!C:C>=A2),0))
assuming keys and start and end values are in a sheet called First and lookup values start in A2.
Array formula which must be entered with Ctrl+Shift+Enter
In response to the question from #pnuts about how long it will take, I have set up a similar benchmark with 300,000 rows in each sheet and it has reached 1% after 90 minutes, so it should take about 150 hours to reach 100% or roughly one week. This is to be expected as the number of computations required is (rows in sheet 1) X (rows in sheet 2)
300,000 X 300,000
but in fact because the multiplication applies to complete columns, I believe it is more correctly
300,000 X 1,048,576
i.e. > 300 billion.
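The arithmetic behind these estimates can be checked quickly. This small Python snippet only reproduces the numbers quoted above (300,000 lookup rows, 1,048,576 rows in a full column, 1% progress after 90 minutes); the variable names are illustrative.

```python
lookup_rows = 300_000        # rows in sheet 2, one formula each
full_column_rows = 1_048_576  # rows in a whole-column reference such as First!B:B

# Each formula compares its value against every row of the full column.
comparisons = lookup_rows * full_column_rows

# 1% took 90 minutes, so 100% takes 100 times as long.
estimated_hours = 90 * 100 / 60
estimated_days = estimated_hours / 24
```

This gives a little over 314 billion comparisons and an estimate of 150 hours, i.e. a bit more than six days.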
A practical version which gives good response for smaller ranges is as follows:-
I define three named ranges Range1, Range2 and Range3
=First!$A$1:INDEX(First!$A:$A,MATCH("ZZZ",First!$A:$A))
=First!$B$1:INDEX(First!$B:$B,MATCH(9.9E+307,First!$B:$B))
=First!$C$1:INDEX(First!$C:$C,MATCH(9.9E+307,First!$C:$C))
and the modified formula is
=INDEX(Range1,MATCH(1,(Range2<=A2)*(Range3>=A2),0))
I was thinking of deleting this answer, but would rather it stood as a counter-example.
I have a problem with one of my formulas in Excel. The formula is supposed to return the standard deviation of a range of cells, leaving out any value that is flagged as an outlier in another column.
My formula is STDAFV.S(HVIS(R2:R15<>"Outlier";P2:P15;"")) (Danish function names; the English equivalent is STDEV.S(IF(R2:R15<>"Outlier";P2:P15;""))).
The problem is that it returns the wrong value. In the example below, the formula returns 0.010729, when it should return 0.001652.
I'm well aware that this is an array formula, so I do press Ctrl+Shift+Enter. That is not the issue!
Does anyone have a clue what is wrong?
Col P | Col R
0.0215|
0.0207|
0.0233|
0.0213|
0.0187|
0.0917| Outlier
I get the same result as you when I am computing STDDEV across the 6 values of your OP including 7 empty cells as in your formula (P2:P15)
By doing so I add 7 times a value = 0 to the set which will affect the mean value and the divisor (N-1) in the STDDEV formula.
If I limit the range from row 2 to 7 ... the ones actually containing data, I get the result you're expecting.
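The effect can be reproduced outside Excel. The Python sketch below computes the sample standard deviation of the five non-outlier values from the question, once on their own and once with the blank cells counted as zeros. It assumes the six listed values sit in P2:P7, so that 8 of the 14 cells in P2:P15 are blank; that count is an assumption, not something stated in the thread.

```python
from statistics import stdev

# The five non-outlier values listed in the question.
values = [0.0215, 0.0207, 0.0233, 0.0213, 0.0187]

# Assumption: 8 blank cells in P2:P15, each treated as 0 by the array formula.
blank_cells = 8

values_only = stdev(values)
with_blanks_as_zero = stdev(values + [0.0] * blank_cells)
```

The first result is about 0.00166 and the second about 0.0107, which is close to the two figures quoted in the question (the listed values are rounded, so the match is not exact).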
Edit:
To circumvent the problem of empty cells without redefining the range every time, you can make use of the =COUNT() and =OFFSET() functions ...
Your values are placed in a growing list, but without empty cells between. So the number of cells (=rows) is the result of a =COUNT(B2:B1000)
Your effective data range now is "from B2 and [count result] cells down" ... i.e. =OFFSET(B2;0;0;COUNT(B2:B1000);1)
Likewise, your effective comment range now is "from C2 and [count result in B] cells down" ... i.e. =OFFSET(C2;0;0;COUNT(B2:B1000);1)
Combining all this ... with data in B2:B15 and remark in C2:C15, the array formula becomes
{ =STDEV.S(IF(OFFSET(C2;0;0;COUNT(B2:B15);1)<>"Outlier";OFFSET(B2;0;0;COUNT(B2:B15);1))) }
ta-taaaa!
I have a table in Excel which looks like this:
A B C
Row 1: 2100-2200 2200-2300 2300-2400
I'm using a VLOOKUP formula. I want this formula to find a number e.g. 2152 in the table above.
Cell A1 is supposed to contain numbers from 2100 to 2200.
Is this possible to do in Excel?
I don't know exactly what you want to return, but this array formula will return the matching interval in A1:C1:
=INDEX($A$1:$C$1;MATCH(1;(E1>=VALUE(LEFT($A$1:$C$1;4)))*(E1<=VALUE(RIGHT($A$1:$C$1;4)));0))
The numeric value you're looking for goes in E1.
Don't forget to press Ctrl+Shift+Enter to enter the formula.
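As a cross-check of what the array formula does, here is a short Python sketch: split each "low-high" label into its two numbers and return the first label whose interval contains the search value. It splits on the hyphen rather than taking the first and last four characters, which is a small deviation from the formula; `find_range` is a hypothetical name.

```python
# The three labels from the question's row 1.
ranges = ["2100-2200", "2200-2300", "2300-2400"]

def find_range(value):
    """Return the first 'low-high' label whose interval contains value, else None."""
    for text in ranges:
        low, high = (int(part) for part in text.split("-"))
        if low <= value <= high:
            return text
    return None

match = find_range(2152)
```

Like MATCH(1;...;0), this returns the first hit, so a boundary value such as 2200 resolves to "2100-2200" rather than "2200-2300".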
Instead of providing the ranges, you need to provide only the lower bound. I.e. try this data:
A B C
Row 1: 2100 2200 2300
Because it is a horizontal setup, you need to use HLOOKUP (VLOOKUP checks the cells in the first column of a table, HLOOKUP the cells in the first row), and you need to leave the fourth parameter of HLOOKUP/VLOOKUP blank (or set it to TRUE, which is the same as leaving it blank). E.g. if your number 2152 is in cell A2, use this formula:
=HLOOKUP(A2,$A$1:$C$1,1)
and you'll get 2100.
If you want to have the full range returned, you should use the MATCH function instead:
=INDEX($A$1:$C$1,MATCH(A2,$A$1:$C$1))&"-"&INDEX($A$1:$C$1,MATCH(A2,$A$1:$C$1)+1)
This will return 2100-2200
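Approximate-match HLOOKUP on sorted lower bounds behaves like a binary search for the largest bound that does not exceed the search value. A minimal Python sketch using the standard `bisect` module, with the three bounds from this answer (`lower_bound_for` is a hypothetical name):

```python
from bisect import bisect_right

# Sorted lower bounds, as in row 1 of this answer.
lower_bounds = [2100, 2200, 2300]

def lower_bound_for(value):
    """Return the largest lower bound <= value, or None if value is below all bounds."""
    i = bisect_right(lower_bounds, value) - 1
    return lower_bounds[i] if i >= 0 else None

bucket = lower_bound_for(2152)
```

A value below the first bound has no match here (HLOOKUP would return #N/A), and anything at or above the last bound falls into the last bucket, so an upper limit has to be checked separately if one is needed.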
Use the lower bound of each of your ranges:
A B C D
Lower range 2100 2200 2300 2400
Ranking 1 2 3 4
Suppose you want to find the ranking corresponding to, say, 2350.
Formula: =HLOOKUP(2350,range,2,TRUE), where range is the range holding your values.
Your range includes both rows: 2100-2400 plus 1-4.
The value 2 in the formula indicates the row that you want the result returned from. In this case you want the ranking, where 2350 will return 3 (the largest lower bound not exceeding 2350 is 2300); change this number to 1 to return the matched lower bound instead.
sample formula:
= HLOOKUP(O65,J74:N75,2,TRUE)
As OP wants to apply HLookUp to numeric range strings ("2100-2200", "2200-2300", etc.), I suggest the following steps:
reconstruct the range contents to their starting values (2100, 2200, 2300), isolated by using Find to search up to the - delimiter: VALUE(LEFT(A1:C1,FIND("-",A1:C1)-1))
define a named search cell MySrch (example value of 2151 due to OP) and apply HLookUp on the above values to get the lower boundary (here: 2100),
Match the found lower boundary (e.g. 2100) plus character "-" and a * wild card to find the ordinal column position (here: 1st column),
return the found range text (e.g. 2100-2200) via Index(A1:C1,1,{column number}) referring to row 1 and the column position as result:
=INDEX(A1:C1,1,MATCH(HLOOKUP(MySrch,VALUE(LEFT(A1:C1,FIND("-",A1:C1)-1)),1)&"-*",A1:C1,0))
Note that it would be necessary to exclude search values outside the given range boundaries either by formula extension (e.g. If(MySrch>...,"?",{above formula}) or to add a last range defining a maximum limit.
If the number you are searching for is in E1 (i.e. E1 = 2152), then use one of these three options.
The first is the easiest and probably the best:
1) =HLOOKUP(E1 & "-" & E1,A1:C1,1,true)
or
2) =index(a1:c1,1, max(ARRAYFORMULA( if($E$1>VALUE(left(A1:C1,4)),column(A1:C1),0) )) )
or
3) =index(A1:c1,1, min(ARRAYFORMULA( if(VALUE(right(A1:C1,4))>=$E$1,column(A1:C1),99) )) )
This returns the range you want.
To get the column instead, remove the index(a1:c1,1, .... ) wrapper, leaving only the .... part, in 2) and 3), or use the following in place of 1):
=HLOOKUP(E1 & "-" & E1,{A1:C1 ; arrayformula(column(A1:C1))},2,true)
glad to help