I have this problem at work to populate the worksheet with the right case number.
Sheet 1: (Report)
SSN | Service Date
123456 | 10/01/2014
Sheet 2: (Data)
SSN | Case Number | Start Date | End Date
123456 | 0000000 | 01/01/2010 | 12/31/2012
123456 | 1111111 | 01/01/2013 | 05/31/2014
123456 | 2222222 | 06/01/2014 | 11/10/2015
How can I do a VLOOKUP based on the Service Date to be within the "range" of the Start and End Date of another sheet?
In this case I would like to lookup the SSN and return case number 2222222 because that is the case active for such date of service.
While searching online I found MATCH. I can match the first row whose SSN matches, but how do I move on to the next case when that row's dates do not fit?
=IF(E2>=INDEX('CASE NUMBERS'!A:F,MATCH(C2,'CASE NUMBERS'!A:A,0),4)&E2<=INDEX('CASE NUMBERS'!A:F,MATCH(C2,'CASE NUMBERS'!A:A,0),5),"YES","NO")
I am using Excel 2013 on Windows 7 at work.
You will need three conditions: a) is the Start Date on or before the Service Date, b) is the End Date on or after the Service Date, and c) do the SSNs match?
Use the newer AGGREGATE¹ function to force any non-matches into an error state while using its ignore-errors option (6) to discard those errors.
=INDEX(Sheet2!$B$2:$B$9999, AGGREGATE(15, 6, ROW($1:$9998)/((Sheet2!C$2:C$9999<=B2)*(Sheet2!D$2:D$9999>=B2)*(Sheet2!A$2:A$9999=A2)), 1))
For all intents and purposes, a worksheet formula treats FALSE as zero (0) and TRUE as one (1). Any number multiplied by zero is zero and any number multiplied by one is the same number. The AGGREGATE function retrieves the row position of the first match within Sheet2!B2:B9999. That row position will be a number somewhere within ROW(1:9998). Any row that does not match all three conditions contributes at least one zero to the product in the denominator, which makes the denominator zero. Anything divided by zero forces a #DIV/0! error, and AGGREGATE discards those from the result set. AGGREGATE's 15 option is SMALL, and the final 1 is the k ordinal for SMALL (the very smallest). So of all the rows that match all three conditions, AGGREGATE returns the lowest position to the INDEX function, which retrieves the value from Sheet2!B2:B9999.
Tighten the ranges up to a maximum of 5 rows and use the Evaluate Formula command to step through the formula and gain a better understanding.
It may be worthwhile to note that it is very easy to convert this formula to retrieve the second, third, etc. matches as well since it only requires sequencing the k ordinal up.
¹ The AGGREGATE function was introduced with Excel 2010. It is not available in earlier versions.
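To make the divide-by-zero trick concrete, here is a rough Python sketch of the same logic, not the Excel implementation itself. The function name `kth_matching_case` is made up for illustration, and the rows are the sample data from the question, with dates read as month/day/year.

```python
from datetime import date

# Sheet2 rows from the question: (SSN, case number, start date, end date).
DATA = [
    ("123456", "0000000", date(2010, 1, 1), date(2012, 12, 31)),
    ("123456", "1111111", date(2013, 1, 1), date(2014, 5, 31)),
    ("123456", "2222222", date(2014, 6, 1), date(2015, 11, 10)),
]


def kth_matching_case(ssn, service_date, k=1):
    """Mimic INDEX(..., AGGREGATE(15, 6, ROW/(conditions), k)).

    kth_matching_case is a hypothetical helper name, not an Excel function.
    """
    positions = []
    for pos, (row_ssn, _case, start, end) in enumerate(DATA, start=1):
        # TRUE/FALSE multiply as 1/0, so the product is 1 only if all three hold.
        denominator = (start <= service_date) * (end >= service_date) * (row_ssn == ssn)
        try:
            positions.append(pos / denominator)
        except ZeroDivisionError:
            # Plays the role of #DIV/0!, which AGGREGATE option 6 discards.
            continue
    positions.sort()  # SMALL(..., k) picks the k-th smallest surviving position
    if k > len(positions):
        return None
    return DATA[int(positions[k - 1]) - 1][1]
```

Calling `kth_matching_case("123456", date(2014, 10, 1))` picks the third row, and raising `k` walks through later matches in the same way that sequencing the k ordinal does in the formula.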
If SSN is in A1 of both sheets and your Case Numbers are numeric (other than 0000000) then you might try:
=SUMIFS(Sheet2!B:B,Sheet2!A:A,A2,Sheet2!C:C,"<="&B2,Sheet2!D:D,">="&B2)
SUMIFS is explained here (and elsewhere!).
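As a rough illustration of why this only works for numeric case numbers with a single matching row, here is a Python sketch of what the SUMIFS call adds up. The helper name `sumifs_case` and the overlapping fourth row in the usage note are assumptions made for the example, not part of the question's data.

```python
from datetime import date

# Sheet2 rows from the question, with case numbers stored as numbers.
ROWS = [
    (123456, 0, date(2010, 1, 1), date(2012, 12, 31)),
    (123456, 1111111, date(2013, 1, 1), date(2014, 5, 31)),
    (123456, 2222222, date(2014, 6, 1), date(2015, 11, 10)),
]


def sumifs_case(rows, ssn, service_date):
    """Add up the case numbers of every row meeting all three criteria.

    sumifs_case is a hypothetical name for this sketch.
    """
    return sum(
        case
        for row_ssn, case, start, end in rows
        if row_ssn == ssn and start <= service_date and end >= service_date
    )
```

With the question's data, `sumifs_case(ROWS, 123456, date(2014, 10, 1))` gives 2222222. If a second row with overlapping dates existed, the two case numbers would be added together, so the approach relies on the date ranges for one SSN never overlapping.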
This array-formula will always print the last match:
=INDEX(Sheet2!B:B,MAX((Sheet2!A:A=A2)*(Sheet2!C:C<=B2)*(Sheet2!D:D>=B2)*ROW(A:A)))
This is an array formula and must be confirmed with Ctrl+Shift+Enter.
It works if there are multiple solutions which fit the criteria
It also works with every kind of data you want to show (values/dates/strings)
! However, you should keep the ranges as short as possible (it's a huge calculation over entire columns).
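The MAX-based formula can be read as "multiply each row's position by 1 or 0, then keep the largest". A minimal Python sketch of that reading follows; `last_matching_case` is a made-up name, positions are counted within the list rather than as sheet row numbers, and the fourth row is a hypothetical overlapping case added only to show the "last match" behaviour.

```python
from datetime import date

ROWS = [
    ("123456", "0000000", date(2010, 1, 1), date(2012, 12, 31)),
    ("123456", "1111111", date(2013, 1, 1), date(2014, 5, 31)),
    ("123456", "2222222", date(2014, 6, 1), date(2015, 11, 10)),
    # Hypothetical extra row that overlaps the previous one.
    ("123456", "3333333", date(2014, 9, 1), date(2014, 12, 31)),
]


def last_matching_case(rows, ssn, service_date):
    """Return the case number of the last row that meets all three criteria."""
    best = max(
        # Each comparison is 1 or 0, so non-matching rows contribute 0.
        (row_ssn == ssn) * (start <= service_date) * (end >= service_date) * pos
        for pos, (row_ssn, _case, start, end) in enumerate(rows, start=1)
    )
    # In this sketch a result of 0 means no row matched.
    return rows[best - 1][1] if best else None
```

Here `last_matching_case(ROWS, "123456", date(2014, 10, 1))` returns the fourth row's case number because both the third and fourth rows match and the larger position wins.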
Related
I have the following table in Excel:
Name Col_A Col_B
Michael Some_value
Alex Some_value Some_value
Jennifer
I want to count in a single cell (without adding any helper columns) how many names have a value in at least one of the columns A or B. So in this case the result will be 2.
I tried to do it with COUNTIFS and COUNT (IF) but it seems to cover only one column at a time.
Using MMULT()
• Formula used in cell F4
=SUM(N(MMULT(--(D4:E6<>""),{1;1})>0))
So, we can use either -- or N() to convert the TRUE/FALSE results to numbers:
The double unary (also called a double negative) is an operation used
to coerce TRUE FALSE values to ones and zeros in more advanced
formulas, especially formulas that work with arrays.
The N() function converts non-number values to a number: dates to
serial numbers, TRUE to 1, and anything else to 0.
Note: Source for -- taken from exceljet.net
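For readers who find MMULT opaque, the following Python sketch walks through the same three steps: flag non-empty cells, sum the flags per row (which is what multiplying by the {1;1} column does), and count rows whose sum is above zero. The table is the one from the question; placing Michael's single value in Col_A is an assumption, since the original layout does not show which column it sits in.

```python
# (Name, Col_A, Col_B); an empty string stands for an empty cell.
TABLE = [
    ("Michael", "Some_value", ""),
    ("Alex", "Some_value", "Some_value"),
    ("Jennifer", "", ""),
]


def names_with_any_value(table):
    """Count rows with a value in at least one of the two data columns."""
    # --(range<>"") : turn each cell into 1 (non-empty) or 0 (empty).
    flags = [[int(cell != "") for cell in row[1:]] for row in table]
    # MMULT(flags, {1;1}) : a row-wise sum of the flags.
    row_sums = [sum(flag * 1 for flag in row_flags) for row_flags in flags]
    # SUM(N(row_sums > 0)) : count the rows with at least one value.
    return sum(int(total > 0) for total in row_sums)
```

`names_with_any_value(TABLE)` counts Michael and Alex but not Jennifer.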
I have some data in B1:B10 (values) and in C1:C10 (strings) that I want to average.
My values are (from row 1-10):
B | C
-----
1 | Approved
1 | Approved
1 | Approved
1 | Approved
| N/A
| N/A
| N/A
1 | Approved
1 | Approved
0 | Disapproved
When I enter the following formula in A1 to average the data in column B, I get a result (0.857143), no problem:
=AVERAGE(B1,B2,B3,B4,B5,B6,B7,B8,B9,B10)
When I instead enter the following formula in D1, I get a #VALUE! error instead, though from what I can tell, the logic is the same (replacing N/A's with blanks):
=AVERAGE(
IF(C1="Approved",1,IF(C1="Disapproved",0,IF(C1="N/A","",""))),
IF(C2="Approved",1,IF(C2="Disapproved",0,IF(C2="N/A","",""))),
IF(C3="Approved",1,IF(C3="Disapproved",0,IF(C3="N/A","",""))),
IF(C4="Approved",1,IF(C4="Disapproved",0,IF(C4="N/A","",""))),
IF(C5="Approved",1,IF(C5="Disapproved",0,IF(C5="N/A","",""))),
IF(C6="Approved",1,IF(C6="Disapproved",0,IF(C6="N/A","",""))),
IF(C7="Approved",1,IF(C7="Disapproved",0,IF(C7="N/A","",""))),
IF(C8="Approved",1,IF(C8="Disapproved",0,IF(C8="N/A","",""))),
IF(C9="Approved",1,IF(C9="Disapproved",0,IF(C9="N/A","",""))),
IF(C10="Approved",1,IF(C10="Disapproved",0,IF(C10="N/A","","")))
)
What gives, and what do I need to change in order to get 0.857143 as a result in the formula for the strings values in column C?
Also tried changing the "if true" and "if false" parts for N/A with VALUE("") and VALUE(0). With VALUE("") it still results in #VALUE! error, and with VALUE(0) it still counts the blank into the average, which is not desired as I only want an average on the 1's and 0's
Additional info: If I split up the formula for the strings to evaluate each one on a separate line, THEN pull an average on THAT range, it works fine.. Though, considering the data set I am working with, I would rather not add them all separately, as it clutters the work space enormously.
AVERAGE ignores text and empty cells when they sit inside a cell reference or range (your first example), but it returns #VALUE! when a text value that cannot be converted to a number, such as the "" returned by your IF, is passed to it directly as an argument (your second example). So try this instead:
=COUNTIF(C1:C10,"Approved")/SUM(COUNTIF(C1:C10,{"Approved","Disapproved"}))
This will leave N/A out of the equation.
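The arithmetic behind that formula can be checked with a short Python sketch. The list reproduces column C from the question, and `approval_rate` is a hypothetical helper name.

```python
# Column C from the question, rows 1 to 10.
STATUSES = (
    ["Approved"] * 4
    + ["N/A"] * 3
    + ["Approved"] * 2
    + ["Disapproved"]
)


def approval_rate(statuses):
    """Approved count divided by the count of Approved plus Disapproved."""
    approved = statuses.count("Approved")
    disapproved = statuses.count("Disapproved")
    # N/A entries never enter either count, so they do not affect the result.
    return approved / (approved + disapproved)
```

With six approvals and one disapproval this is 6/7, which rounds to the 0.857143 obtained from column B.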
I have looked for proper formula that would solve my problem but I couldn't find anything.
I have a table with multiple date ranges and I want to highlight all dates in my calendar between these ranges. I've tried to use formula AND
=AND(F5>=$A$6,F5<=$B$6)
however the formula highlights only dates between 1st range. I tried to put array ($A6:$A$9 and $B6:$B$9) but it doesn't work.
Column A Column B
row 6 | 05/01/2018 | 12/01/2018
row 7 | 03/04/2018 | 16/04/2018
row 8 | 06/05/2018 | 17/05/2018
row 9 | 01/11/2018 | 05/11/2018
My calendar starts in cell F5 and ends in AP16.
Regards,
Adrian
You need to wrap your AND's within an OR:
=OR(AND(F5>=$A$6,F5<=$B$6),AND(F5>=$A$7,F5<=$B$7), AND(...))
or, in a more compact but equivalent form:
=SUMPRODUCT((F5>=$A$6:$A$9)*(F5<=$B$6:$B$9))
or
=OR((F5>=$A$6:$A$9)*(F5<=$B$6:$B$9))
Each comparison returns an array of TRUE/FALSE values, which behave as 1's and 0's once multiplied. Multiplying the two arrays together is the equivalent of AND and yields a 1 only where both comparisons in the same position are TRUE. SUMPRODUCT then adds up the products, and OR checks whether any of them is nonzero; either way the result is truthy if the date falls within at least one range.
Although Excel 2016 will accept an OR in the conditional format formula, I seem to recall that some earlier versions will not, hence I have also supplied the equivalent SUMPRODUCT formula.
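The multiply-then-sum idea can be sketched in Python as follows. The ranges are the four from the question, read as day/month/year, and the function names are made up for this illustration.

```python
from datetime import date

# Rows 6 to 9 of columns A and B from the question (day/month/year).
RANGES = [
    (date(2018, 1, 5), date(2018, 1, 12)),
    (date(2018, 4, 3), date(2018, 4, 16)),
    (date(2018, 5, 6), date(2018, 5, 17)),
    (date(2018, 11, 1), date(2018, 11, 5)),
]


def ranges_containing(day):
    """SUMPRODUCT((day>=starts)*(day<=ends)): how many ranges include the day."""
    return sum((day >= start) * (day <= end) for start, end in RANGES)


def should_highlight(day):
    """A conditional-format rule fires when the count is nonzero."""
    return ranges_containing(day) > 0
```

For example, `should_highlight(date(2018, 4, 10))` is true because the day falls in the second range, while a day in February matches none.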
Or once again you can use countifs
=COUNTIFS($A$6:$A$10,"<="&F5,$B$6:$B$10,">="&F5)
Column A Column B
13-06-2013 10:50
13-06-2013 11:30
13-06-2013 12:40
14-06-2013 10:30
I need to find the values which are before a particular entry date and time.
For example, say I want to find the values in the example table above that are immediately prior to the values "13-06-2013" and "12:30".
Since 12:30 is not in column B, how do I find the values I am looking for? The answer should be 13-06-2013 and 11:30.
C7 =VLOOKUP(A7&B7,A1:C4,3,TRUE)
Here A1 = B1&C1
A B C
1 414380.451388888888889 13-06-2013 10:50
2 414380.479166666666667 13-06-2013 11:30
3 414380.527777777777778 13-06-2013 12:40
4 414390.4375 14-06-2013 10:30
5
6 Enter date Enter Time Returned Time
7 13-06-2013 12:30 11:30:00
Setting 'range_lookup' as 'True' adds the flexibility to return the closest approximate value if the exact value is not available.
I think you're looking for something like this. using index and match.
I didn't take into account the date for now. but this gives you an example.
You can compare date strings with operators like > or < etc. Concatenate your values in columns A & B, compare to the desired date/time string. In cell C1 put the following formula, and then drag down:
="13-06-2013 12:30"<A1&" "&B1
or more specifically, depending on which "12:30" you want (AM or PM), ="13-06-2013 12:30AM" or ="13-06-2013 12:30PM"
Your data in column B may default to AM unless otherwise specified/imported differently, so you may need to tweak the data or to account for this.
Here is another approach to answering your question that uses a combination of MATCH, INDEX, and array operations to provide a compact formula solution that does not rely on helper columns.
I'll assume that your two columns of dates and times are in cells A2:B5, and the two date and time values that you want to look up are in cells A9:A10. Then the following two formulas will return what you require, the latest date and time values in your data that are less than or equal to the date and time that you are looking up. (The dollar signs in the formulas are hard on the eyes, but they are important if you will need to copy the formulas to other locations; for clarity, I omit them in the discussion that follows.)
DATE: =INDEX($A$2:$B$5,MATCH(A9+A10,$A$2:$A$5+$B$2:$B$5,1),1) --> 13-06-2013
TIME: =INDEX($A$2:$B$5,MATCH(A9+A10,$A$2:$A$5+$B$2:$B$5,1),2) --> 11:30 AM
These are array formulas and need to be entered with the Control-Shift-Enter key combination. (Of course, only the bits starting with the equal (=) sign and ending with the last parenthesis need to be entered into the worksheet.)
Things to consider:
The formulas assume that your data are valid Excel date and time values. Excel date values are whole numbers that count days, with January 1, 1900 as day 1; Excel time values are decimal amounts between 0 and 1 that represent the fraction of 24 hours that a particular time represents. While your example data don't display AM or PM, I assume that their underlying values do have that information.
If your values are text (having been imported from another source, for instance), you should convert them to date/time values, if lucky, using only Excel's DATEVALUE and TIMEVALUE functions; if not so lucky, using some combination of Excel's string manipulation functions as well. (The values could be kept as strings, but you would almost certainly need to massage them so they would compare correctly "alphabetically" - much easier just to deal with Excel date/time values.)
If they are not already, your dates and times will need to be sorted from smallest to largest. (Your sample looks like they are sorted, and the formulas assume as much.)
How the formulas work
The basic idea behind the formulas is two-fold: first find the row in your data that holds the latest (largest) date and time that is still less than or equal to the date and time you are looking up. That row information can then be used to fetch the final result from each column of the data range (one for date and one for time).
Since both date and time figure in to what point in time is latest, the date and time components of both the value to be looked up and the values that will be searched must be combined somehow.
This can be achieved by simply adding the dates and times together. This does nothing more than what Excel does: an Excel date/time value has an integer part (the number of days since 1/1/1900) and a decimal part (the fraction of 24 hours that a particular time represents).
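To see the integer-plus-fraction structure, here is a small Python sketch that converts a timestamp to a serial number. The epoch of 30 December 1899 is the offset customarily used to reproduce Excel's 1900 date system for dates from March 1900 onward; it is an assumption of this sketch, and `excel_serial` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed offset that reproduces Excel's 1900 date system for modern dates.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial(moment):
    """Return whole days since the epoch plus the fraction of the day elapsed."""
    delta = moment - EXCEL_EPOCH
    return delta.days + delta.seconds / 86400
```

For 13-06-2013 10:50 the whole part is 41438 and the fractional part is 650 minutes out of 1440, about 0.4514, which agrees with the values shown in the helper column of the VLOOKUP answer.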
What is neat here is that the adding up of the dates and times - and the lookup of the particular date and time - can be done all at once, on the fly.
Take a look at the MATCH: The cells that contain the date and time to be looked up - A9 and A10 - are added together, and then this sum is matched against the sum of the date column (A2:A5) and the time column (B2:B5) - an operation that is possible because of Excel's array arithmetic capabilities. The match returns a value of 2, indicating correctly that the date and time that fill your requirements are in row 2 of the data table.
DATE/TIME MATCH: = MATCH( A9+A10, A2:A5 + B2:B5, 1 ) --> 2
The 1 that is the final argument to the MATCH function tells it to find the largest value that is less than or equal to the lookup value (which is why the data must be sorted in ascending order). It is the default value and is often omitted, or replaced with another value (for example, using a value of 0 will produce an exact match, if there is one).
(For readability, I've removed the dollar signs that are in the full formula; these anchor a range so that it remains the same even if the formula is copied to another location.)
Having figured out the row to look in, the rest of the formula is straightforward. The INDEX function returns the value in a data range that is at the intersection of a specified row and column. So, for the date in question, the formula reduces to:
DATE FETCH: = INDEX( A2:B5, 2, 1) --> 13-06-2013
In other words, INDEX is to return the value in the second row and first column of the data range A2:B5.
The formula for the time proceeds in exactly the same fashion, with the only difference that the value is returned from the second column of the data range.
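The two-step idea (find the position, then fetch from it) can be mirrored in Python with a binary search over combined date/time values. This is a sketch of the logic rather than of Excel's internals; the function names are made up, and the timestamps are the four rows from the question.

```python
from bisect import bisect_right
from datetime import datetime

# Date and time columns combined, sorted ascending as the formulas require.
STAMPS = [
    datetime(2013, 6, 13, 10, 50),
    datetime(2013, 6, 13, 11, 30),
    datetime(2013, 6, 13, 12, 40),
    datetime(2013, 6, 14, 10, 30),
]


def match_less_or_equal(lookup):
    """Like MATCH(lookup, stamps, 1): 1-based position of the latest stamp <= lookup."""
    position = bisect_right(STAMPS, lookup)
    # Position 0 means every stamp is later than the lookup value.
    return position or None


def latest_at_or_before(lookup):
    """Like INDEX(stamps, MATCH(...)): fetch the stamp at the matched position."""
    position = match_less_or_equal(lookup)
    return None if position is None else STAMPS[position - 1]
```

Looking up 13-06-2013 12:30 gives position 2 and therefore the 11:30 entry on the same date.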
I'm trying to calculate the conditional median of a chart that looks like this:
A | B
-------
x | 1
x | 1
x | 3
x |
y | 4
z | 5
I'm using MS Excel 2007. I am aware of the AVERAGEIF() function, but there is no equivalent for median. The main trick is that there are rows with no data, such as the 4th "x" above. In this case, I don't want this row considered at all in the calculations.
Googling has suggested the following, but Excel won't accept the formula format (maybe because it's 2007?)
=MEDIAN(IF((A:A="x")*(A:A<>"")), B:B)
Excel gives an error saying there is something wrong with my formula(something to do with the * in the condition) I had also tried the following, but it counts blank cells as 0's in the calculations:
=MEDIAN(IF(A:A = "x", B:B, "")
I am aware that those formulas return Excel "arrays", which means one must enter "Ctrl-shift-enter" to get it to work correctly.
How can I do a conditional evaluation and not consider blank cells?
Nested if statements.
=MEDIAN(IF(A:A = "x",IF(B:B<>"",B:B, ""),""))
Not much to explain - it checks if A is x. If it is, it checks if B is non-blank. Anything that matches both conditions gets calculated as part of the median.
Given the following data set:
A | B
------
x |
x |
x | 2
x | 3
x | 4
x | 5
The above formula returns 3.5, which is what I believe you wanted.
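The nested-IF filter corresponds to a two-condition filter followed by a median. A minimal Python sketch using the data set above, with `None` standing for a blank cell and `conditional_median` as a made-up name:

```python
from statistics import median

# Columns A and B from the example above; None marks a blank cell.
ROWS = [("x", None), ("x", None), ("x", 2), ("x", 3), ("x", 4), ("x", 5)]


def conditional_median(rows, key):
    """Median of column B over rows where A equals key and B is not blank."""
    values = [value for label, value in rows if label == key and value is not None]
    return median(values)
```

`conditional_median(ROWS, "x")` takes the median of 2, 3, 4 and 5, which is 3.5, so the blank rows are left out rather than counted as zeros.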
Use the Googled formula, but instead of hitting Enter after you type it into the formula bar, hit Ctrl+Shift+Enter. This places curly braces around the formula and makes Excel treat it as an array formula.
Be warned, if you edit it, you cannot hit Enter again or the formula will not be valid. If editing, you must do the same thing when done (Ctrl+Shift+Enter).
There is another way that does not involve an array formula and so does not require the Ctrl+Shift+Enter operation.
It uses the Aggregate() function offered in Excel 2010, 2011 and beyond. The method also works for min,max and various percentiles.
The Aggregate() function allows errors to be ignored, so the trick is to make all values that are not required cause errors. The easiest way to do the task set above is:
=Aggregate(16,6,(B:B)/((A:A = "x")*(B:B<>"")),0.5)
The first and last parameters set the scene to do a percentile 50%, which is a median, the second says ignore all errors (including #DIV/0!) and the third says select the B column data, and divide it by a number which is one for all non empty values that have an x in the A column, and a zero otherwise.
The zeros create a divide-by-zero error, and those results are ignored, because a/1=a and a/0=#DIV/0!
The technique works for quartiles (with an appropriate p value), all other percentiles of course, and for max and min using the large or small function with appropriate arguments.
This is a similar construct to the Sumproduct() tricks that are so popular, but which cannot be used on any quantiles or max min values as it produces zeros which look like numbers to these functions.
Bob Jordan
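The contrast drawn in the AGGREGATE answer above, between dividing by the condition and multiplying by it, can be sketched in Python. The rows are the question's table with `None` for the blank cell; both function names are hypothetical, and the division is emulated with an exception standing in for #DIV/0!.

```python
from statistics import median

# Columns A and B from the question; None marks the blank cell.
ROWS = [("x", 1), ("x", 1), ("x", 3), ("x", None), ("y", 4), ("z", 5)]


def median_dividing(rows, key):
    """Divide by the 1/0 condition and drop the rows that fail with an error."""
    kept = []
    for label, value in rows:
        numerator = 0 if value is None else value  # a blank cell acts as 0 in arithmetic
        denominator = (label == key) * (value is not None)
        try:
            kept.append(numerator / denominator)
        except ZeroDivisionError:
            pass  # stands in for a #DIV/0! that is ignored
    return median(kept)


def median_multiplying(rows, key):
    """Multiply by the 1/0 condition, which leaves zeros that still get counted."""
    masked = [(0 if value is None else value) * (label == key) for label, value in rows]
    return median(masked)
```

With this data the dividing version takes the median of 1, 1 and 3, while the multiplying version also sees three zeros and returns a lower figure, which is the distortion described for SUMPRODUCT-style masking.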
Perhaps to generalize it a little more, instead of this...
{=MEDIAN(IF(A:A="x",IF(B:B<>"",B:B)))}
... you could use the following:
{=QUARTILE.EXC(IF(A:A="x",IF(B:B<>"",B:B)),2)}
Note that the curly brackets refer to an array formula; you should not place the brackets in your formula but press CTRL+SHIFT+ENTER (or CMD+SHIFT+ENTER on macOS) when entering the formula
Then you could easily get the first and third quartile by altering the last number from 2 to 1 or 3 respectively. QUARTILE.EXC is what most commercial statistical software (e.g. Minitab) uses, by the way. The "regular" function is QUARTILE.INC, or for the older versions of Excel, just QUARTILE.
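For a feel of how the exclusive and inclusive variants differ, Python's `statistics.quantiles` offers `method="exclusive"` and `method="inclusive"`. As far as I can tell these use the same interpolation rules as QUARTILE.EXC and QUARTILE.INC respectively, but treat that correspondence as an assumption of this sketch. The values are the non-blank "x" entries from the earlier example data set.

```python
from statistics import quantiles

# Non-blank values for "x" from the earlier example data set.
VALUES = [2, 3, 4, 5]


def quartiles_exc(values):
    """First, second and third quartile using the exclusive method."""
    return quantiles(values, n=4, method="exclusive")


def quartiles_inc(values):
    """First, second and third quartile using the inclusive method."""
    return quantiles(values, n=4, method="inclusive")
```

Both methods agree on the middle value (the median), and differ only in how far out they place the first and third quartiles.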