Is it possible to return a dynamic formula in VLOOKUP? - excel

I am looking for a possibility to use VLOOKUP in combination with a dynamic formula (or a different solution if possible).
A simplified problem is provided in the image below. Based on a category the number of outlets in a room is calculated. The number of outlets can either be a fixed amount or based on the total area of a room which is provided in a separate column.
What is the best method to apply this to a (much larger) sample?

If "based on total area" is always x outlets per unit of area, then enter it as a proportion, e.g. 0.1a instead of 1 per 10 m^2, and use a formula that checks for the trailing 'a' and responds accordingly, e.g.:
=IF(RIGHT(VLOOKUP(B2,$F$2:$G$3,2,FALSE),1)="a",CEILING.MATH(SUBSTITUTE(VLOOKUP(B2,$F$2:$G$3,2,FALSE),"a","")*C2),VLOOKUP(B2,$F$2:$G$3,2,FALSE))
(I used "Ceiling" to get the integer value. Replace with Floor or Round as needed).
From a formula-writing (and readability) point of view, it would probably be easier to add a separate "fixed/proportional" column that holds the flag and key off of that, but the result is the same.
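To make the logic of the formula easier to follow, here is a minimal Python sketch of the same rule. The category names and rule values in `RULES` are hypothetical stand-ins for the lookup range, and `Decimal` is used only so that the proportion is not distorted by binary floating point.

```python
import math
from decimal import Decimal

# Hypothetical lookup table: category -> rule text (stands in for $F$2:$G$3).
# A trailing "a" marks a per-unit-area proportion; anything else is a fixed count.
RULES = {"fixed": "4", "per_area": "0.1a"}

def outlets(category, area):
    rule = RULES[category]                 # exact-match lookup, like VLOOKUP(..., FALSE)
    if rule.endswith("a"):
        proportion = Decimal(rule[:-1])    # strip the marker, like SUBSTITUTE(..., "a", "")
        return math.ceil(proportion * Decimal(str(area)))  # round up, like CEILING.MATH
    return int(rule)

print(outlets("fixed", 25))
print(outlets("per_area", 25))
```

Swapping `math.ceil` for `math.floor` or `round` mirrors the Floor/Round alternatives mentioned above.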

Related

Deal with Ties when Using Index/Match

I'm currently pulling the top (5) number of numerical values from one sheet and inputting them into a different sheet. Each number is within its own column and there is a name matching that column, EX:
Ties are common in the data that I'm working with, which nearly defeats my formulas.
For getting the name:
=INDEX('Total Cases by Categories'!$B$18:$B$50, MATCH(LARGE('Total Cases by Categories'!$H$18:$H$50, A39),'Total Cases by Categories'!$H$18:$H$50, 0))
For getting the numerical value associated with the name:
=LARGE('Total Cases by Categories'!$H$18:$H, A39)
And so, when there are 2 people with the same numerical value associated within a category, then that person appears twice, I assume because of their position within the sheet.
So something like this happens:
So in the event of a tie, I would want to list both names that have the same amount of points instead of the first name that shows up with the duplicated value.
Any help would be appreciated!
Actually, LARGE will return both of the tied values. It's MATCH that can't look beyond the first occurrence. To the best of my knowledge there is no way around that (short of the difficult option of not using MATCH at all). Therefore the solution is to have no ties.
This is achieved with helper columns that contain no identical numbers. This can be achieved by adding an insignificant decimal. Since you are dealing with integers, adding 0.1 would be insignificant for your purposes but 13.1 is different from 13.2. If you need to extract the "real" number from this use INT(13.2).
Using the row number to generate an insignificant decimal is popular for this purpose. In row 1 ROW()/10 will return 0.1. But in row 10 ROW()/10 will return 1.0 which isn't an insignificant number anymore. Therefore you have to work with ROW()/100 or an even larger divisor, depending upon how many rows you have. Try ROW()/10^6 - any decimal will do the tie-breaking job.
You may not like that using ROW() will list tied participants in the order in which they appear in the worksheet. The differentiating decimals can be created by any other means that doesn't create ties in itself.
Normally, the helper columns with the decimals added will be hidden. They contain a formula like =D23 + (ROW()/10000) which manages itself. You can then use that column for the MATCH function to list all participants in the order of LARGE using the helper column or the original. Just make sure that MATCH refers to the helper column.
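The helper-column idea can be sketched in Python as follows. The names, scores and row numbers in `ROWS` are made up for illustration, and the divisor of 10^6 is just one of the choices discussed above.

```python
# Hypothetical data: (worksheet row, name, integer score). Ann and Cy are tied.
ROWS = [(18, "Ann", 13), (19, "Bob", 15), (20, "Cy", 13), (21, "Dee", 9)]

def top_n(rows, n, divisor=10**6):
    # Helper column: score plus an insignificant row-based decimal, so no two values tie.
    helper = [score + row / divisor for row, _, score in rows]
    names = [name for _, name, _ in rows]
    result = []
    for k in range(1, n + 1):
        kth = sorted(helper, reverse=True)[k - 1]  # like LARGE(helper, k)
        pos = helper.index(kth)                    # like MATCH(kth, helper, 0)
        result.append((names[pos], int(kth)))      # INDEX for the name, INT for the real score
    return result

print(top_n(ROWS, 3))
```

Because the decimal grows with the row number, the tied participant in the lower row is listed first; any other tie-free decimal would change that order.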

How do I count all the instances where a certain number is between multiple sets of numbers?

I would like to count the number of times a specific number lies between multiple ranges.
For instance,
Specific number: 2.5 (let's say this one is in AD1)
J3=14
K3=22
L3=0
M3=6
N3=6
O3=14
P3=2
Q3=8
I need to find how many times 2.5 is between:
J3&K3
L3&M3
N3&O3
P3&Q3
The reason I would like a formula for this is that I have many "specific numbers" that I need to test against the same set of ranges.
I know I can combine multiple CountIf, but the formula would be way too long.
I remember I can use Sum(CountIf("INSERTFORMULA")) but I think somehow using a combination of Sum(CountIf(Median())) will be simpler to read
SUM(Countif(MEDIAN($AD$1,J3,K3)=$AD$1,TRUE),MEDIAN($AD$1,L3,M3)=$AD$1,TRUE),MEDIAN($AD$1,N3,O3)=$AD$1,TRUE),MEDIAN($AD$1,P3,Q3)=$AD$1,TRUE))
Expected result: 2 (i.e. between L3&M3 and between P3&Q3)
Try: (Edited to correct typo)
=SUMPRODUCT(($AD$1>=INDEX(J3:Q3,1,N(IF(1,{1,3,5,7}))))*($AD$1<=INDEX(J3:Q3,1,N(IF(1,{2,4,6,8})))))
The N(IF(1,{array})) is a method of returning discontinuous elements of an array using the INDEX function.
Depending on whether you want to include/exclude the bounds of the ranges when you write between, you may want to remove the equal = sign from the comparisons.
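As a cross-check on the expected result, here is a small Python sketch of the same pairwise test using the values from the question. The function names are hypothetical; the second function mirrors the MEDIAN idea from the question (a number lies between two bounds exactly when it is the median of the three).

```python
from statistics import median

# J3:Q3 from the question, read as (lower, upper) pairs: J&K, L&M, N&O, P&Q.
BOUNDS = [14, 22, 0, 6, 6, 14, 2, 8]

def count_between(x, bounds, inclusive=True):
    pairs = zip(bounds[0::2], bounds[1::2])   # odd positions are lower bounds, even are upper
    if inclusive:
        return sum(1 for lo, hi in pairs if lo <= x <= hi)
    return sum(1 for lo, hi in pairs if lo < x < hi)

def count_between_median(x, bounds):
    # Inclusive only: x is between lo and hi when it is the middle of the three values.
    pairs = zip(bounds[0::2], bounds[1::2])
    return sum(1 for lo, hi in pairs if median([x, lo, hi]) == x)

print(count_between(2.5, BOUNDS))
print(count_between_median(2.5, BOUNDS))
```

The `inclusive` flag corresponds to keeping or dropping the equal signs in the comparisons.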
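Try:
=SUMPRODUCT((J3:P3<=AD1)*(K3:Q3>=AD1)*(MOD(COLUMN(J3:P3)-COLUMN(J3),2)=0))
The MOD term keeps only the pairs that start in J, L, N and P; without it the formula would also test the overlapping pairs K3&L3, M3&N3 and O3&P3.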
Split your formula into two parts:
First, calculate MEDIAN($AD$1,J3,K3) in a helper cell such as J4, then copy that formula across the whole row (so K4 will be MEDIAN($AD$1,K3,L3), and so on).
Second, summarize row 4 with a formula such as SUM(A4:AA4).
This takes more space on the sheet, but it is simpler to create and check.

Ranking with subsets

I'm trying to rank values and have managed to work out how to sort ties. My data looks at the total number of entries, ranks based on that and if there is a tie it looks to the next column of values to sort them out. However, I have two classes (East and West I've called them) of data within my dataset and want to rank them both separately (but stick to the rules above). So, if I had seven entries, 3 of them West and 4 of the East, I want West to have ranking 1,2,3 based on all the values that lie in that subset and East would have ranking 1,2,3,4. Can you explain what your formula is doing so I can understand how to apply your answer better in the future.
Effectively I'm asking what formula I need to achieve my result.
Cheers
Paul
There are a few related ways to do this, most involving SUMPRODUCT. If you don't like the solution below and would like to research other ways/explanations, try searching for "rankif".
The function looks at the Class and Value columns and, for every row, returns TRUE (1) if that row's Class matches the current Class AND its Value is larger than the current Value, and FALSE (0) otherwise. The SUM adds up all these 1s, and the 1+ makes the ranking start at 1 rather than 0 (the largest value in a class has no larger values to count). Remember to enter it as an array formula using Ctrl+Shift+Enter before dragging down.
I used the array formula and SUM above to explain, but the following also works and might even be faster since it's not an array formula. It's the same idea, except we hijack SUMPRODUCT's ability to spit out a single value from an array.
=1+SUMPRODUCT(($A$2:$A$8=A2)*($B$2:$B$8>B2))
EDIT
To extend the rank-if, you could add more subsets to rank by multiplying more conditions:
You can also easily add tiebreakers by adding another SUMPRODUCT to treat the ties as an additional subset:
The first SUMPRODUCT is the 'base rank', while the second SUMPRODUCT is tiebreaker #1.
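The rank-if idea with one tiebreaker can be sketched in Python as below. This is only an illustration of the counting logic, not the formulas from the answer; the sample rows and the third "tiebreak" column are hypothetical.

```python
# Hypothetical rows: (class, value, tiebreak value).
DATA = [
    ("West", 10, 3), ("East", 8, 1), ("West", 10, 5), ("East", 12, 2),
    ("East", 8, 4), ("West", 7, 0), ("East", 5, 9),
]

def rank_within_class(rows):
    ranks = []
    for cls, value, tie in rows:
        # Base rank: rows in the same class with a strictly larger value.
        base = sum(1 for c, v, _ in rows if c == cls and v > value)
        # Tiebreaker: rows in the same class with an equal value but a larger tiebreak.
        breaker = sum(1 for c, v, t in rows if c == cls and v == value and t > tie)
        ranks.append(1 + base + breaker)   # the 1+ makes the best row rank 1, not 0
    return ranks

print(rank_within_class(DATA))
```

Each extra condition inside a `sum` plays the role of one more multiplied term in SUMPRODUCT.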

Match 2 columns based on a % difference within the value

I am looking for a method to match two excel tables.
I basically have two systems where the values do not match exactly; only some of the IDs do. The values in system 2 are usually 10-20% different from system 1.
Here is what the sheet looks like:
I tried to use vlookup on the IDs and then going hand-by-hand through the values if they match, by using the filter with the ID. However, this takes extremely long and is very cumbersome.
Any recommendation how to match these two tables, much more easily?
I really appreciate your replies!
If you look at a formula for G3 you would be involving D3:E3 and A:B (where A10:B10 are the matching values).
When someone states that they are looking for a percentage, it is helpful to know "a percentage of what...?". You receive a different result if the calculation is ABS(12 - 15)/15 instead of ABS(12 - 15)/12. One may be within tolerance and the other may not.
In any event, the formula for G3 would be something like,
=ABS(E3-VLOOKUP(D3,A:B, 2, FALSE))/E3
... or,
=ABS(E3-VLOOKUP(D3,A:B, 2, FALSE))/VLOOKUP(D3,A:B, 2, FALSE)
That produces a result of 0.25 (25%) or 0.20 (20%) depending on how you calculate the percentage. You could wrap that in an IF statement for a YES/NO text result or use a custom number format like [Color3][>0.2]\NO;;[Color10]\Y\E\S;# which will show a red NO for values greater than 20% and a green YES for values between 0 and 20%. Negative values do not have to be accounted for as the ABS removes them from consideration.
I've only reproduced a minimum of your sample data for demonstration purposes but perhaps you can get an idea on how to proceed from that.
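The effect of the choice of denominator can be seen in a short Python sketch. The IDs and values in the two dictionaries are invented sample data, and `pct_diff` is a hypothetical helper rather than anything from the sheet.

```python
# Hypothetical sample data keyed by ID (system 1 plays the role of columns A:B).
SYSTEM_1 = {"ID-1": 12.0, "ID-2": 100.0}
SYSTEM_2 = {"ID-1": 15.0, "ID-2": 104.0, "ID-3": 7.0}

def pct_diff(id_, relative_to="system2"):
    # Return None when the ID is missing from either system (a failed lookup).
    if id_ not in SYSTEM_1 or id_ not in SYSTEM_2:
        return None
    v1, v2 = SYSTEM_1[id_], SYSTEM_2[id_]
    base = v2 if relative_to == "system2" else v1   # "a percentage of what?"
    return abs(v2 - v1) / base

def within_tolerance(id_, tolerance, relative_to="system2"):
    diff = pct_diff(id_, relative_to)
    return diff is not None and diff <= tolerance

print(pct_diff("ID-1", "system2"))
print(pct_diff("ID-1", "system1"))
```

With a tolerance of, say, 0.22, the same pair of values passes when measured against system 2 and fails when measured against system 1.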

nested excel functions with conditional logic

Just getting started in Excel and I was working with a database extract where I need to count values only if items in another column are unique.
So- below is my starting point:
=SUMPRODUCT(COUNTIF(C3:C94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"}))
What I'd like to figure out is the syntax to do something like this:
=sumproduct (Countif range1 criteria..., where range2 criteria="is unique value")
Am I getting this right? The syntax is a bit confusing, and I'm not sure I've chosen the right functions for the task.
I just had to solve this same problem a week ago.
This method works even when you can't always sort on the grouping column (J in your case). If you can keep the data sorted, @MikeD's solution will scale better.
Firstly, do you know the FREQUENCY trick for counting unique numbers? FREQUENCY is designed to create histograms. It takes two arrays, 'data' and 'bins'. It sorts 'bins', then creates an output array that's one longer than 'bins'. Then it takes each value in 'data' and determines which bin it belongs in, incrementing the output array accordingly. It returns the array. Here's the important part: If a value appears in 'bins' more than once, any 'data' value meant for that bin goes in the first occurrence. The trick is to use the same array for both 'data' and 'bins'. Think it through, and you'll see that there's one non-zero value in the output for each unique number in the input. Note that it only counts numbers.
In short, I use this:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to count unique numeric values in <array>
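To see why the trick works, here is a rough Python emulation of the binning behaviour described above. It is a sketch based on that description, not a reimplementation of Excel's FREQUENCY; in particular, dropping non-numeric entries before binning is an assumption made for simplicity.

```python
def _is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def frequency(data, bins):
    # Assumption: non-numeric entries are dropped from both arrays before binning.
    data = [v for v in data if _is_number(v)]
    bins = [v for v in bins if _is_number(v)]
    out = [0] * (len(bins) + 1)            # one slot longer than 'bins'
    # Visit bins in ascending order; the stable sort keeps the first duplicate first.
    order = sorted(range(len(bins)), key=lambda i: bins[i])
    for x in data:
        for i in order:
            if x <= bins[i]:
                out[i] += 1                # lands in the first occurrence of its bin
                break
        else:
            out[-1] += 1                   # larger than every bin
    return out

def count_unique_numbers(values):
    # Same array as data and bins; like SUM(SIGN(FREQUENCY(array, array))).
    return sum(1 for c in frequency(values, values) if c > 0)

print(frequency([3, 1, 3, 2, 1], [3, 1, 3, 2, 1]))
print(count_unique_numbers([3, 1, 3, 2, 1]))
```

Only the first occurrence of each number ends up with a non-zero count, so counting the non-zero slots counts the unique numbers.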
From this, we just need to construct arrays containing numbers where appropriate and text elsewhere.
In the example below, I'm counting unique days when the color is red and the fruit is citrus:
This is my conditional array, returning 1 or true for the rows I'm interested in:
($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0))
Note that this requires ctrl-shift-enter to be used as an array formula.
Since the value I'm grouping by for uniqueness is text (as is yours), I need to convert it to numeric. I use:
MATCH($C$2:$C$10,$C$2:$C$10,0)
Note that this also requires ctrl-shift-enter
So, this is the array of numeric values within which I'm looking for uniqueness:
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
Now I plug that into my uniqueness counter:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to get:
=SUM(SIGN(FREQUENCY(
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),""),
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
)))
Again, this must be entered as an array formula using ctrl-shift-enter. Replacing SUM with SUMPRODUCT will not cut it.
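The construction above can be mirrored step by step in Python. The rows below are hypothetical sample data (color, fruit, day); a set stands in for the FREQUENCY-based uniqueness counter, which is a swapped-in technique rather than what Excel does internally.

```python
# Hypothetical rows standing in for columns A, B and C: (color, fruit, day).
ROWS = [
    ("red", "orange", "Mon"), ("red", "lemon", "Mon"), ("green", "lime", "Tue"),
    ("red", "apple", "Wed"), ("red", "lime", "Thu"), ("red", "orange", "Thu"),
    ("red", "grapefruit", "Fri"),
]
CITRUS = {"orange", "grapefruit", "lemon", "lime"}

def count_unique_days(rows):
    days = [day for _, _, day in rows]
    keys = []
    for color, fruit, day in rows:
        condition = color == "red" and fruit in CITRUS   # the conditional array
        first_pos = days.index(day) + 1                  # like MATCH(C, C, 0): text -> number
        keys.append(first_pos if condition else "")      # number where wanted, text elsewhere
    # Count distinct numeric keys; the "" placeholders are ignored.
    return len({k for k in keys if isinstance(k, int)})

print(count_unique_days(ROWS))
```

Rows that fail the condition contribute an empty string, which is why they never show up in the count.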
In your example, you'd use something like:
=SUM(SIGN(FREQUENCY(
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),""),
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),"")
)))
I'll note, though, that scaling might be a problem on data sets as large as yours. I tested it on larger data sets, and it was fairly fast on the order of 10k rows, but really slow on the order of 100k rows, such as yours. The internal arrays are plenty fast, but the FREQUENCY function slows down. I'm not sure, but I'd guess it's between O(n log n) and O(n^2) depending on how the sort is implemented.
Maybe this doesn't matter - none of this is volatile, so it'll just need to calculate once upon refreshing the data. If the column data is changing, though, this could be painful.
Assuming the source data is sorted by the key value [A], start by determining the occurrence count of the key column
B2: =IF(A2=A1;B1+1;1)
Next determine a group sum
C2: =SUMIF($A$2:$A$9;A2;$B$2:$B$9)
A key is unique if its group sum is exactly 1
D2: =(C2=1)
To count records that match a certain criterion AND are unique, include column D in a formula such as =IF(AND(D2;[yourcondition]);1;0) and sum that column
Another option is to assume a key is unique within a sorted list if it is unequal to both its predecessor and successor, so you could find the unique records like
E2: =AND(A2<>A1;A2<>A3)
G2: =IF(AND(E2;F2="this");1;0)
E and G can of course be combined into one single formula (not sure though if that helps ...)
G2(2): =IF(AND(AND(A2<>A1;A2<>A3);F2="this");1;0)
resolving unnecessarily nested AND's:
G2(3): =IF(AND(A2<>A1;A2<>A3;F2="this");1;0)
all formulas in row 2 should be copied down to the end of the list
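The predecessor/successor test can be sketched in Python like this. The keys and flags are invented sample data, the keys are assumed to be sorted, and `None` stands in for the cells just above and below the list.

```python
# Hypothetical sorted keys (column A) and a second column to test (column F).
KEYS = ["a", "a", "b", "c", "c", "d"]
FLAGS = ["this", "x", "this", "this", "this", "x"]

def count_unique_matching(keys, flags, wanted="this"):
    count = 0
    for i, key in enumerate(keys):
        prev_key = keys[i - 1] if i > 0 else None
        next_key = keys[i + 1] if i + 1 < len(keys) else None
        is_unique = key != prev_key and key != next_key   # like AND(A2<>A1;A2<>A3)
        if is_unique and flags[i] == wanted:               # like AND(E2;F2="this")
            count += 1
    return count

print(count_unique_matching(KEYS, FLAGS))
```

Note that "unique" here means the key occurs exactly once in the list, so a key that appears twice is not counted at all.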

Resources