Conditional vlookup in Excel - excel

I have a table where the first column is name, and second column is value. The value can either be a 0 or one of pass/fail. I also have time stamps so names can repeat. For example,
Column1 Column2 Column3
X 0 12AM
X Pass 3AM
I am trying to do a VLOOKUP to get the pass/fail status. Since I do not know whether the 0 will come before or after it (and I cannot control the sorting of the lookup table), I need a formula that conditionally picks the pass/fail value and skips the 0. Also, I can't look up on the time, as it changes every x minutes and is not available on the main table.
Any clues?
I know there are ways to pick the nth item from a vlookup; however, I can't figure out how to get this scenario as the pass/fail won't always be the second one, for example.

If I understand you correctly, I think you actually want to use =INDEX with =MATCH() as an array formula here.
I added to your fake data:
A B C
Panda 0 12AM
Panda pass 3AM
Panda 0 5AM
Koala fail 2AM
Koala 0 1PM
Koala 0 3PM
Polar 0 12AM
Polar pass 9AM
You'll get this:
E F
Panda pass
Koala fail
Polar pass
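In column F, where the results go, place this formula, then hold down [control] and [shift] and hit [enter]. It won't work if you just hit [enter]. The formula looks for the name combined with "pass" first and, if that is not found, for the name combined with "fail".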
=IFERROR(INDEX($B$2:$B$9,MATCH(E2&"pass",$A$2:$A$9&$B$2:$B$9,0)),INDEX($B$2:$B$9,MATCH(E2&"fail",$A$2:$A$9&$B$2:$B$9,0)))
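For readers who want to check the logic outside Excel, here is a minimal Python sketch of what the formula does, using the sample rows from this answer. The function name `lookup_status` is only illustrative and not part of the original answer.

```python
# Sample rows from the answer above: (name, status, time).
rows = [
    ("Panda", "0", "12AM"), ("Panda", "pass", "3AM"), ("Panda", "0", "5AM"),
    ("Koala", "fail", "2AM"), ("Koala", "0", "1PM"), ("Koala", "0", "3PM"),
    ("Polar", "0", "12AM"), ("Polar", "pass", "9AM"),
]


def lookup_status(name, rows):
    """Mirror the IFERROR(INDEX/MATCH) formula: try "pass" first, then "fail"."""
    for wanted in ("pass", "fail"):
        for row_name, status, _time in rows:
            # MATCH compares text case-insensitively, so lower-case both sides.
            if (row_name + status).lower() == (name + wanted).lower():
                return status
    return None  # the Excel formula would show #N/A here


results = {name: lookup_status(name, rows) for name in ("Panda", "Koala", "Polar")}
print(results)
```

As in the formula, the position of the 0 rows does not matter, because only rows whose status is pass or fail can match.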

Related

Is there a non-VBA way to calculate the average of the sum of two sets of columns?

I'm creating an excel spreadsheet to track when an item is received as well as when a response to the item having been received has been made (ie: my mail was delivered at 1:00pm (item received) but I didn't check the mail until 5:00pm (response to item having been received)).
I need to track both the date and time of the item being received and want to separate these in two separate columns. At the moment this translates to:
Column A: Date item received
Column B: Time item received
Column L: Date item was responded to having been received
Column M: Time item was responded to having been received
In essence I'm looking to run calculations on the response time between when the item is received and when it has been responded to (ie: average response time, number of responses in less than an hour, and even things like the number of responses that took between 2 and 3 hours where Bob was the person who responded).
The per-line pseudo code would look something like:
(Lr + Mr) - (Ar + Br) ' where L,M,A,B are the columns and 'r' is the row number.
An example, with the following data:
1. A B L M
2. 1/5/19 10:00 1/5/19 12:00
3. 1/5/19 21:00 1/6/19 1:00
4. 1/5/19 22:00 1/5/19 23:00
5. 1/6/19 3:00 1/6/19 4:00
The outcome for the average response time would be 2 hours (average(rows 2-5) = average(2, 4, 1, 1) = 2)
The number of items in each response-time range would be as follows:
(<=1 hour) = 2
(>1 & <=2) = 1
(>2 & <=3) = 0
(>3) = 1
I don't know (or can find) a function that will perform this and then let me use it within something like a countifs() or averageifs() function.
While I could do this (fairly easily) in VBA, the practical implementation of this spreadsheet limits me to standard Excel. I suspect that sumproduct() will be fundamental to make this work, but I feel that I need something like a sumsum() function (which doesn't exist) and I'm not familiar with sumproduct() to better understand what to even look for to set something like this up.
If you are not so familiar with SUMPRODUCT() or the like, I would suggest a helper column. With the response date and time in columns C and D (columns L and M in your sheet), the formula used is:
=((C2+D2)-(A2+B2))
You can do all kinds of calculations on this helper column. Note that the column is formatted hh:mm. However, if you want to look into SUMPRODUCT(), you could think about these:
Formula in H2:
=SUMPRODUCT(--(ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)<=1))
Formula in H3:
=SUMPRODUCT((ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)>1)*(ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)<=2))
Formula in H4:
=SUMPRODUCT((ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)>2)*(ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)<=3))
Formula in H5:
=SUMPRODUCT(--(ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)>3))
The helper column is the easiest approach. It gives you the time differences that you can then easily analyse however you want. Analysis without the helper column is possible, but the approach differs depending on what type of analysis you want to do.
For the example you provided, which is counting the number of time differences grouped into ranges, you would use the FREQUENCY function:
=FREQUENCY(C2:C5+D2:D5-A2:A5-B2:B5,F2:F4)
In F2:F4 (called the "bins"), enter the upper limit of each range you want to count. The Frequency function counts up to and including the first value, then counts from there up to and including the second value, and so on. Enter the bins as times, e.g. 1:00 for 1 hour.
Note that Frequency is both an array-entered and an array-returning function. This means you need to first select the range that will contain all output values (G2:G5 in this example), then enter the function, then press CTRL+SHIFT+ENTER.
Also note that Frequency returns an array that is one element larger than the number of bins specified. The extra element is the count of all values greater than the largest bin specified.
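The binning behaviour described above can be sketched in Python with the four example rows from the question. This is only an illustration of the arithmetic, not an Excel feature; the helper names `hours_between` and `frequency` are made up for this sketch, and the bins are assumed to be sorted in ascending order.

```python
from datetime import datetime

# (received, responded) pairs from the question's example rows 2-5.
rows = [
    ("1/5/19 10:00", "1/5/19 12:00"),
    ("1/5/19 21:00", "1/6/19 1:00"),
    ("1/5/19 22:00", "1/5/19 23:00"),
    ("1/6/19 3:00", "1/6/19 4:00"),
]
FMT = "%m/%d/%y %H:%M"


def hours_between(received, responded):
    """Response time in hours, i.e. (L + M) - (A + B) in the question's terms."""
    delta = datetime.strptime(responded, FMT) - datetime.strptime(received, FMT)
    return delta.total_seconds() / 3600


def frequency(values, bins):
    """Count values per bin: each bin is an inclusive upper limit, plus one overflow slot."""
    counts = [0] * (len(bins) + 1)
    for value in values:
        for i, upper in enumerate(bins):
            if value <= upper:
                counts[i] += 1
                break
        else:
            counts[-1] += 1  # larger than the largest bin
    return counts


durations = [hours_between(a, b) for a, b in rows]
average = sum(durations) / len(durations)
counts = frequency(durations, [1, 2, 3])
print(durations, average, counts)
```

The last element of `counts` is the extra overflow slot, which corresponds to the (>3) range.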

Grab minimum time in year

As an athlete I want to keep track of my progression in Excel.
I need a formula that looks for the fastest time run in a given season, that is, the lowest value in E for a given year. For 2017, for example, this is 13.32; for 2018 it is 12, and so on.
Can you help me further?
Instead of a formula you can use a PivotTable.
Put the Year in the Report Filter and the Time in Values. Then, in the value field settings, set the field to be summarized by Min.
Every time you change the year in the filter, the minimum value will show up.
=AGGREGATE(15,6,E3:E6/(B3:B6=2017),1)
15 tells AGGREGATE to use the SMALL function, which ranks the results in ascending order
6 tells AGGREGATE to ignore any errors, such as the ones you get when you divide by 0
E3:E6 is your time range
B3:B6 is your year as an integer
B3:B6=2017 is 1 when true and 0 when false (provided it goes through a math operation such as division), so rows from other years turn into divide-by-zero errors and are ignored
1 tells AGGREGATE to return the 1st (smallest) value in the sorted list of results
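The same filter-then-take-the-smallest idea, sketched in Python. The rows below are hypothetical: only the 2017 and 2018 minimums (13.32 and 12) come from the question, the other times are made up for illustration.

```python
# Hypothetical (year, time) rows; only the per-year minimums come from the question.
results = [(2017, 13.50), (2017, 13.32), (2018, 12.40), (2018, 12.00)]


def fastest_time(results, year):
    """Return the lowest time recorded in the given year, or None if there is none."""
    times = [time for row_year, time in results if row_year == year]
    return min(times) if times else None


print(fastest_time(results, 2017))
print(fastest_time(results, 2018))
```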

How to count issues with open status in Spotfire

I need to calculate the count of issue IDs with open status for each month.
I have the 3 columns below:
Issue_ID
Issue_Open_Date
Issue_Closed_Date
Issue_ID Issue_Open_Date Issue_Closed_Date Open_Issue_Count(required output)
IS_10 11/11/2014 1/5/2015 3
IS_11 11/12/2014 12/14/2014
IS_12 11/13/2014 11/15/2014
IS_13 11/14/2014 3/5/2015
IS_1 12/1/2014 12/15/2014 4
IS_2 12/2/2014 2/10/2015
IS_3 12/3/2014 1/15/2015
IS_4 1/1/2015 2/10/2015 4
IS_5 1/2/2015 3/11/2015
IS_6 1/3/2015 1/22/2015
IS_7 2/1/2015 3/5/2015 3
IS_8 2/2/2015 2/2/2015
IS_9 2/7/2015 2/28/2015
IS_14 3/1/2015 4/5/2015 1
Based on the above table, I need the count of open issues for each month.
For example, to count for December, it should check both the December and November records.
If an issue closes in the same month, it is not counted as open for that month.
Basically, for each month it should check that month's records as well as the earlier months' records.
The required output is below:
Nov - 3
Dec - 4
Jan - 4
Feb - 3
Mar - 1
So... I have a way but it's ugly. I'm sure there's a better way but I spent a while banging my head on this trying to make it work just within Spotfire without resorting to a python script looping through rows and making comparisons.
With nested aggregated case statements in a Cross Table I made it work. It's a pain in the butt because it's pretty manual (have to add each month) but it will look for things that have a close date after the month given and an open date that month or earlier.
<
Sum(Case
when ([Issue_Closed_Date]>Date(2014,11,30)) AND ([Issue_Open_Date]<Date(2014,12,1)) then 1 else 0 end) as [NOV14_OPEN] NEST
Sum(Case
when ([Issue_Closed_Date]>Date(2014,12,31)) AND ([Issue_Open_Date]<Date(2015,1,1)) then 1 else 0 end) as [DEC14_OPEN] NEST
Sum(Case
when ([Issue_Closed_Date]>Date(2015,1,31)) AND ([Issue_Open_Date]<Date(2015,2,1)) then 1 else 0 end) as [JAN15_OPEN] NEST
Sum(Case
when ([Issue_Closed_Date]>Date(2015,2,28)) AND ([Issue_Open_Date]<Date(2015,3,1)) then 1 else 0 end) as [FEB15_OPEN] NEST
Sum(Case
when ([Issue_Closed_Date]>Date(2015,3,31)) AND ([Issue_Open_Date]<Date(2015,4,1)) then 1 else 0 end) as [MAR15_OPEN]>
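To sanity-check the month-end logic against the sample table from the question, here is a minimal Python sketch. It assumes the columns hold plain dates (no time component); the names `issues`, `month_ends` and `open_at` are only illustrative.

```python
from datetime import date

# (Issue_Open_Date, Issue_Closed_Date) for the 14 sample issues in the question.
issues = [
    (date(2014, 11, 11), date(2015, 1, 5)),    # IS_10
    (date(2014, 11, 12), date(2014, 12, 14)),  # IS_11
    (date(2014, 11, 13), date(2014, 11, 15)),  # IS_12
    (date(2014, 11, 14), date(2015, 3, 5)),    # IS_13
    (date(2014, 12, 1), date(2014, 12, 15)),   # IS_1
    (date(2014, 12, 2), date(2015, 2, 10)),    # IS_2
    (date(2014, 12, 3), date(2015, 1, 15)),    # IS_3
    (date(2015, 1, 1), date(2015, 2, 10)),     # IS_4
    (date(2015, 1, 2), date(2015, 3, 11)),     # IS_5
    (date(2015, 1, 3), date(2015, 1, 22)),     # IS_6
    (date(2015, 2, 1), date(2015, 3, 5)),      # IS_7
    (date(2015, 2, 2), date(2015, 2, 2)),      # IS_8
    (date(2015, 2, 7), date(2015, 2, 28)),     # IS_9
    (date(2015, 3, 1), date(2015, 4, 5)),      # IS_14
]

month_ends = [
    date(2014, 11, 30), date(2014, 12, 31), date(2015, 1, 31),
    date(2015, 2, 28), date(2015, 3, 31),
]


def open_at(month_end, issues):
    """Count issues opened on or before month_end and closed after it."""
    return sum(1 for opened, closed in issues if opened <= month_end < closed)


open_counts = [open_at(month_end, issues) for month_end in month_ends]
print(open_counts)
```

The condition `opened <= month_end < closed` is the same test as each Case branch: a close date after the last day of the month and an open date before the first day of the next month.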
As far as doing it with python you could probably loop through the data and do the comparisons and save it as a data table. If I'm feeling ambitious this weekend I might give it a try out of personal curiosity. I'll post here if so.
I think what makes this difficult is that it's not very logical to add a column showing number of issues open at a point in time because the data doesn't show time; it's "one row per unique issue."
I don't know what your end result should be, but you might be better off unpivoting the table.
Unpivot the above data with the following settings:
pass through: [Issue_ID]
transform: [Issue_Open_Date], [Issue_Closed_Date]
optionally rename Category as "Action" and Value as "Action Date"
Now that each row represents one action, create a calculated column, named [Action Numeric] to match the Y-axis expression further down, that assigns a numeric value to the action with the following formula:
CASE [Action]
WHEN "Issue_Open_Date" THEN 1
WHEN "Issue_Closed_Date" THEN -1
END
Create a bar chart with [Action Date] along the X axis (I wouldn't drill further than month or week) and the following on the Y axis:
Sum([Action Numeric]) over (AllPrevious([Axis.X]))
You'll wind up with a running count of open issues over time. You can then do all sorts of fancy things with this data, such as showing a line chart with the rate at which cases open and close (you can even plot this on a combination chart together with the running count).
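The unpivot-and-running-sum idea can be sketched in Python on the question's sample data: each open counts +1, each close counts -1, and a cumulative sum over the months gives the number of open issues at the end of each month. This is only an illustration of the arithmetic, not Spotfire code, and the variable names are made up.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# (Issue_Open_Date, Issue_Closed_Date) for the 14 sample issues in the question.
issues = [
    (date(2014, 11, 11), date(2015, 1, 5)), (date(2014, 11, 12), date(2014, 12, 14)),
    (date(2014, 11, 13), date(2014, 11, 15)), (date(2014, 11, 14), date(2015, 3, 5)),
    (date(2014, 12, 1), date(2014, 12, 15)), (date(2014, 12, 2), date(2015, 2, 10)),
    (date(2014, 12, 3), date(2015, 1, 15)), (date(2015, 1, 1), date(2015, 2, 10)),
    (date(2015, 1, 2), date(2015, 3, 11)), (date(2015, 1, 3), date(2015, 1, 22)),
    (date(2015, 2, 1), date(2015, 3, 5)), (date(2015, 2, 2), date(2015, 2, 2)),
    (date(2015, 2, 7), date(2015, 2, 28)), (date(2015, 3, 1), date(2015, 4, 5)),
]

# "Unpivot": one +1 action per open date and one -1 action per close date,
# summed per (year, month).
net_change = defaultdict(int)
for opened, closed in issues:
    net_change[(opened.year, opened.month)] += 1
    net_change[(closed.year, closed.month)] -= 1

# Running total over all previous months, like Sum(...) over AllPrevious.
months = sorted(net_change)
running_open = dict(zip(months, accumulate(net_change[m] for m in months)))
print(running_open)
```

On this data the running totals for November through March match the required output in the question, and April drops to 0 once the last issue closes.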

Find the top n values in a range while keeping the sum of values in another range under x value

I'd like to accomplish the following task. There are three columns of data. Column A represents price, where the sum needs to be kept under $100,000. Column B represents a value. Column C represents a name tied to columns A & B.
Out of >100 rows of data, I need to find the highest 8 values in column B while keeping the sum of the prices in column A under $100,000. And then return the 8 names from column C.
Can this be accomplished?
EDIT:
I attempted the Solver solution w/ no luck. 200 rows looks to be the max w/ Solver, and that is what I'm using now. Here are the steps I've taken:
Create a column called rank RANK(B2,$B$2:$B$200) (used column D -- what is the purpose of this?)
Create a column called flag just put in zeroes (used column E)
Create 3 total cells total_price (=SUM(A2:A200)), total_value (=SUM(B2:B200)) and total_flag (=(E2:E200))
Use solver to minimize total_value (shouldn't this be maximize??)
Add constraints -Total_price<=100000 -Total_flag=8 -Flag cells are binary
Using Simplex LP, it simply changes the flags for the first 8 values. However, the total price for the first 8 values is >$100,000 ($140k). I've tried changing some options in the Solver Parameters as well as using different solving methods to no avail. I'd like to post an image of the parameter settings, but don't have enough "reputation".
EDIT #2:
The first 5 rows looks like this, price goes down to ~$6k at the bottom of the table.
Price Value Name Rank Flag
$22,538 42.81905675 Blow, Joe 1 0
$22,427 37.36240932 Doe, Jane 2 0
$17,158 34.12127693 Hall, Cliff 3 0
$16,625 33.97654031 Povich, John 4 0
$15,631 33.58212402 Cow, Holy 5 0
I'll give you the Solver solution as a starting point. It involves creating some extra columns and total cells. Note that Solver is limited in the number of cells it can handle, but it will work with 100.
Create a column called rank: =RANK(B2,$B$2:$B$100) (the Solver model itself does not use it, but it is handy for checking the result)
Create a column called flag and fill it with zeroes; these are the cells Solver will change
Create 3 total cells that only count flagged rows. Assuming the flag column is E: total_price (=SUMPRODUCT(A2:A100,E2:E100)), total_value (=SUMPRODUCT(B2:B100,E2:E100)) and total_flag (=SUM(E2:E100))
Use Solver to maximize total_value
Add constraints
-Total_price<=100000
-Total_flag=8
-Flag cells are binary
This will flag the rows you want, and you can then grab the names however you want.
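To make the objective and constraints concrete, here is a small brute-force sketch in Python using the five sample rows from the question. The budget of 40000 and the count of 2 are scaled-down assumptions chosen to fit five rows, and `best_selection` is a made-up name. Trying every combination is only practical for tiny inputs; for 100+ rows and 8 picks, Solver or an integer-programming tool is the realistic route.

```python
from itertools import combinations

# (price, value, name) for the five sample rows shown in the question.
rows = [
    (22538, 42.81905675, "Blow, Joe"),
    (22427, 37.36240932, "Doe, Jane"),
    (17158, 34.12127693, "Hall, Cliff"),
    (16625, 33.97654031, "Povich, John"),
    (15631, 33.58212402, "Cow, Holy"),
]


def best_selection(rows, count, budget):
    """Pick exactly `count` rows with the highest total value whose total price fits the budget."""
    best, best_value = None, None
    for combo in combinations(rows, count):
        total_price = sum(price for price, _value, _name in combo)
        if total_price > budget:
            continue  # violates the price constraint
        total_value = sum(value for _price, value, _name in combo)
        if best_value is None or total_value > best_value:
            best, best_value = combo, total_value
    return [name for _price, _value, name in best] if best else []


# Scaled-down assumption: 2 picks under a 40000 budget instead of 8 under 100000.
names = best_selection(rows, count=2, budget=40000)
print(names)
```

Note that the two highest-valued rows together cost 44965, which is over the assumed budget, so the best feasible pair is not simply the top two by value. This is the same effect the question describes with the first 8 rows.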

Is there a way to make Lookup always return the first encountered match in a column?

I have user retention data that looks like this:
signup_date days retention
2/24/13 0 1
2/23/13 0 1
2/23/13 1 0.4855
2/22/13 0 1
2/22/13 1 0.4727
2/22/13 2 0.3647
2/21/13 0 1
2/21/13 1 0.5135
2/21/13 2 0.3879
2/21/13 3 0.3463
2/20/13 0 1
2/20/13 1 0.5402
2/20/13 2 0.4166
2/20/13 3 0.3615
2/20/13 4 0.3203
2/19/13 0 1
2/19/13 1 0.5317
2/19/13 2 0.4348
2/19/13 3 0.366
2/19/13 4 0.3077
The second column ("days") represents days elapsed since the signup date and the retention is based on that day and the signup_date (since retention can change over time). I need to make projections going forward, and unfortunately for me (since I would prefer to do this programmatically), my boss wants them in Excel. So I'm trying to use the Lookup() function to find the most recent value in the retention column that would match the "days" elapsed from a certain signup date.
Anyway, that's all prelude to the question, which is that right now if I enter the formula:
=lookup(1,B:B,C:C)
where B:B is "days" and C:C is retention, it doesn't necessarily return the first (i.e. most recent) retention value in the data set. For example, in this case, I need the cell to be 0.4855, but the formula may give me 0.4727 (which is the second "days=1" row). Is there any way to configure it to do this or is there another function that will do what I need?
Instead of using LOOKUP, use the VLOOKUP function - it has another parameter that specifies that the data is not in order:
=VLOOKUP(1,$B:$C,2,0)
This will return the entry in the second column of your range B:C where "1" is found in the first column of that range. Do not forget the 0 at the end, as this tells Excel to search row by row for an exact match (vs. a binary search, as LOOKUP or omitting the parameter would do).
Alternative:
VLOOKUP is the simple formula and the default for this situation. An even better, because more flexible, way is to use INDEX/MATCH:
=INDEX($C:$C,MATCH(1,$B:$B,0))
This will do exactly the same, just that you have a bit more flexibility and don't need to include the "second column from the lookup data"...
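What "search row by row and stop at the first exact match" means can be sketched in Python with the first few rows of the question's data. The function name `first_match` is only illustrative.

```python
# First six data rows from the question: "days" and "retention" columns.
days = [0, 0, 1, 0, 1, 2]
retention = [1, 1, 0.4855, 1, 0.4727, 0.3647]


def first_match(key, keys, values):
    """Scan top to bottom and return the value next to the first exact match."""
    for candidate, value in zip(keys, values):
        if candidate == key:
            return value
    return None  # Excel would return #N/A here


print(first_match(1, days, retention))
```

Because the scan never assumes the keys are sorted, it always lands on the topmost (most recent) row for the given number of days.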
HTH!
