Replace Excel vlookup with a dynamic formula - excel

I have created a schedule where I am trying to look up the value of rent for a period of time based on a given date (located in cell B1).
For example, I have the following data set:
Rent Change Date is the date when the rent increases for a specified tenant
Amount is the amount it increases to on the specified Rent Change Date
The schedule off to the right is the monthly rent schedule as dictated by the Rent Change Date
I am currently using a VLOOKUP to identify the range for each tenant using TRUE (or approximate match) to find the rent for the current month (as dictated by the date in B1).
Sample (located in cell G5):
=VLOOKUP(G4, C5:D10, 2, TRUE).
For each tenant, I then reset the table_array range. This works well for a small data set, but I have been searching for a way to set the range automatically. Is there an efficient way to get all of the Rent Change Dates by tenant? Maybe an Excel array formula?

So I'm still not 100% sure I understand what you're looking for, but here is what I came up with.
Using this formula in cell G5 (committing it with Ctrl + Shift + Enter as it's an array formula), and then copying the formula to all other cells in the G5:R7 range, you should get what you're looking for.
Array Formula:
=IF(AND(G$4>=MIN(IF((--($A$5:$A$19=$F5))=0,2,1)*($C$5:$C$19)),G$4<=EOMONTH(MAX(($A$5:$A$19=$F5)*($C$5:$C$19)),11)),INDEX($D$5:$D$19,MATCH($F5&G$4,$A$5:$A$19&$C$5:$C$19,1),1),0)
Example Results:
Result when a tenant is commencing during the year
Result when a tenant is occupied for the whole year
Result when a tenant is expiring during the year
Explanation:
There's a lot going on with this formula, and it's possible there's a simpler way, but I'll do my best to explain. After all, learning isn't just about getting an answer, but also an explanation.
Essentially what you're looking at is a large logical IF function (with three arguments: logical_test, [value_if_true], [value_if_false]). To understand its arguments better, let's break them down:
logical_test
AND(G$4>=MIN(IF((--($A$5:$A$19=$F5))=0,2,1)*($C$5:$C$19)),G$4<=EOMONTH(MAX(($A$5:$A$19=$F5)*($C$5:$C$19)),11))
I started with the AND function because two conditions must both hold. First, the given date (G$4) must be on or after the first Rent Change Date for the Tenant (found using the MIN function). Second, the same date must be on or before the end of the month (EOMONTH function) 11 months after the last Rent Change Date for the Tenant (found using the MAX function).
The embedded IF function within the MIN function is simply there to ensure the correct minimum date is returned: it doubles the dates on rows that belong to other tenants, so they can never be the minimum. Removing this logical test multiplies those rows by 0 instead, which causes the minimum to return 0, which is incorrect for our needs.
I used the EOMONTH function due to the assumption that when the rent changes - including the final rent change - it lasts for a year. Failing to add this piece to the formula would end the rent the same month as the Rent Change Date.
If both statements within the AND function return TRUE, the logical_test then proceeds to the [value_if_true] argument.
[value_if_true]
INDEX($D$5:$D$19,MATCH($F5&G$4,$A$5:$A$19&$C$5:$C$19,1),1)
INDEX(array, row_num, [column_num])
MATCH(lookup_value, lookup_array, [match_type])
COMBINED:
INDEX(array, MATCH(lookup_value, lookup_array, [match_type]), [column_num])
Using the INDEX and MATCH functions together lets us look up the data from any column without the constraints VLOOKUP places on us. In this case, our array argument - $D$5:$D$19 - spans the entire area of our Amount column. This is where we want to return our value from, so it's important that we cover the full area.
In the place of row_num we use the MATCH function. In our table spanning F4:S8 (which includes the column and row names) we are given both the tenant name ($F5) as well as a reference date (G$4). Combining these with an ampersand (&), we now have a concatenated lookup_value for MATCH. By concatenating these two together, we are increasing the likelihood that our lookup_value is unique and will return us the required information.
If two tenants happen to have the same Rent Change Dates but different names OR if two tenants happen to have the same name but different Rent Change Dates, we will get a unique match; in the rare instance that two tenants share the same name and Rent Change Dates, one tenant will need to have something different to stand out, such as a unit number in the name.
Now that we have our lookup_value for MATCH, we need to supply a lookup_array. Given that our lookup_value is a combination of the tenant name ($F5) and a reference date (G$4), we set our lookup_array for MATCH to be a concatenation of $A$5:$A$19 (spans the entire area of our Tenant column) and $C$5:$C$19 (spans the entire area of our Rent Change Date column), joined together with an ampersand (&).
The last argument our MATCH function needs is the [match_type]. For this formula I chose 1 - Less than (Finds the largest value that is less than or equal to lookup_value. The values in the lookup_array argument must be placed in ascending order) due to the fact that we are looking for a date that is either the date itself or one that is less than it as well as the fact that our dates are in ascending order for each Tenant. If we instead looked for an exact match ([match_type] set to 0 - Exact match), we would receive a lot of errors since the Rent Change Dates increase annually, not monthly (like our reference dates in G4:R4). Similarly, looking for a greater than match ([match_type] set to -1 - Greater than) also returns a lot of errors, primarily because the dates aren't in the required descending order.
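To make the behaviour of [match_type] 1 a bit more concrete, here is a minimal Python sketch of the same idea. It is only an illustration: match_approx is a hypothetical helper, the dates are made up, and it assumes the list is sorted ascending, just as MATCH requires.

```python
from bisect import bisect_right
from datetime import date

def match_approx(lookup_value, lookup_array):
    """Mimic MATCH(lookup_value, lookup_array, 1).

    Returns the 1-based position of the largest item that is less than or
    equal to lookup_value, or None where Excel would return #N/A.
    Assumes lookup_array is sorted in ascending order.
    """
    pos = bisect_right(lookup_array, lookup_value)
    return pos if pos > 0 else None

# Hypothetical annual Rent Change Dates for a single tenant
change_dates = [date(2018, 2, 1), date(2019, 2, 1), date(2020, 2, 1)]

print(match_approx(date(2018, 6, 1), change_dates))  # 1 (still on the first rent)
print(match_approx(date(2019, 2, 1), change_dates))  # 2 (exact hit on the second date)
print(match_approx(date(2018, 1, 1), change_dates))  # None (before the first date)
```

A monthly reference date almost never equals an annual Rent Change Date, which is why an exact match would mostly fail while the "largest value less than or equal to" rule finds the rent currently in force.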
Closing out of the MATCH function with a ), we return to the [column_num] of the INDEX function from earlier. While this argument is optional with a one-column array, I entered a 1 for clarity. All this means is that once the MATCH function determines which row to grab, I want to get the intersection of [row_num] (which is 1 in the case of G5 using the presented formula along with 2/1/2018 as the date in B1) and [column_num] (1) from the array (the Amount column, $D$5:$D$19). Row 1, Column 1 from the array is $150.00, which is exactly the expected result.
The last piece of our formula is to finish off the IF statement we started with, entering a value for the [value_if_false] argument.
[value_if_false]
0
In the case of this IF function, we simply entered 0 for the [value_if_false] argument. I chose 0 because if the tenant hasn't yet commenced or has expired, we want to reflect a total rent of $0.00 for a given month.
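If it helps to see the three IF arguments working together outside of Excel, the logic can be sketched in Python roughly as follows. This is a sketch under assumptions, not a translation of the array formula: eomonth and rent_for_month are hypothetical helpers, and apart from the $150.00 starting rent the sample rows are invented.

```python
import calendar
from datetime import date

def eomonth(start, months):
    """Mimic EOMONTH: last day of the month `months` months after `start`."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    return date(year, month, calendar.monthrange(year, month)[1])

def rent_for_month(tenant, month_start, rows):
    """rows holds (tenant, rent_change_date, amount) tuples."""
    changes = sorted((d, amt) for t, d, amt in rows if t == tenant)
    if not changes:
        return 0
    first_change, last_change = changes[0][0], changes[-1][0]
    # logical_test: on or after the first change, and no later than the end
    # of the month 11 months after the last change
    if not (first_change <= month_start <= eomonth(last_change, 11)):
        return 0  # value_if_false: not yet commenced, or expired
    # value_if_true: amount of the latest change on or before month_start
    return [amt for d, amt in changes if d <= month_start][-1]

# Hypothetical data (only the 150.00 figure comes from the example above)
rows = [
    ("Tenant A", date(2018, 2, 1), 150.0),
    ("Tenant A", date(2019, 2, 1), 160.0),
    ("Tenant B", date(2018, 6, 1), 200.0),
]

print(rent_for_month("Tenant A", date(2018, 2, 1), rows))  # 150.0
print(rent_for_month("Tenant A", date(2020, 1, 1), rows))  # 160.0
print(rent_for_month("Tenant A", date(2020, 2, 1), rows))  # 0
print(rent_for_month("Tenant B", date(2018, 3, 1), rows))  # 0
```

Filtering by tenant first is what removes the need to reset the lookup range by hand for each tenant.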
Hopefully this all makes sense and is what you're looking for.

If you create a pivot table based on your data table it can automatically put all the dates across the top row (and you can group by quarter, months, years, days if you so desire).
If you want to search for months, but only have rent rates for years in your source data you may need to use sumproduct. (A pivot table will only include dates where they exist in the source data that I'm aware of, it won't add all dates covered by a range unless explicitly included in that range)
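It would look something like the formula below (note that every range is fully anchored with $ so the ranges stay the same size when the formula is copied):
=SUMPRODUCT($C$5:$C$300,--($A$5:$A$300=$A2),--($B$5:$B$300>B$1),--($B$5:$B$300<C$1))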
Assumption here is the grid showing data across the top starts from cell a1, and the source data starts from a5.
Because SUMPRODUCT is effectively an array formula (you don't need to use Ctrl + Shift + Enter, it just behaves the same way as an array formula), it's generally not recommended to apply it to a full column for performance reasons.
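For anyone unsure what the -- conditions are doing inside SUMPRODUCT, here is a rough Python equivalent. It is a sketch only: sumproduct_between is a hypothetical name and the rows are invented. Each comparison yields True/False, which behaves as 1/0 when multiplied, so only rows meeting every condition contribute to the total.

```python
from datetime import date

def sumproduct_between(rows, tenant, start, end):
    """rows holds (tenant, date, amount) tuples.

    Sums the amounts on rows where the tenant matches and the date lies
    strictly between start and end, mirroring the > and < tests above.
    """
    return sum(
        amount * (t == tenant) * (d > start) * (d < end)
        for t, d, amount in rows
    )

# Hypothetical source data
rows = [
    ("A", date(2018, 2, 1), 150.0),
    ("A", date(2019, 2, 1), 160.0),
    ("B", date(2018, 2, 15), 200.0),
]

print(sumproduct_between(rows, "A", date(2018, 1, 31), date(2018, 3, 1)))  # 150.0
```

Every row is evaluated on every call, which is the same reason the Excel version gets slow when pointed at an entire column.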

Related

Find the weighted average of two rows based on another containing dates

I have a table in which one row contains dates, another row contains AHT (Avg Handle Time) and the third row contains the number of calls handled.
I have a situation where I need to find the weighted average for each week in another table. I am able to find the simple average for each week, but I am not able to get the weighted average for each week.
Thanks
This should work in most versions of Excel
=SUMPRODUCT(INDEX($3:$3,MATCH("Week "&A9,$1:$1,0)):INDEX($3:$3,MATCH("Week "&A9,$1:$1,0)+6),
INDEX($5:$5,MATCH("Week "&A9,$1:$1,0)):INDEX($5:$5,MATCH("Week "&A9,$1:$1,0)+6))
/SUM(INDEX($3:$3,MATCH("Week "&A9,$1:$1,0)):INDEX($3:$3,MATCH("Week "&A9,$1:$1,0)+6))
May need to be array-entered pre Excel 365.
Notes
The weighted mean formula is
Weighted mean = Σwx/Σw
where w are the weights and x are the values. So in this case, from OP's comment, the third row is the weights (number of calls) and the last row is the values (AHT).
The easiest way to get Σwx is to use Sumproduct, and to get Σw is just to use Sum. So the basic formula for (say) week 40 would be simply
=SUMPRODUCT(A3:G3,A5:G5)/SUM(A3:G3)
However, I reasoned that it would be inconvenient to re-write this formula for each different week, so I have used MATCH to find the starting column of each week from row 1, then INDEX to find the corresponding position in either row 3 or 5 (let's call it startpos), then INDEX again to find the position six places to the right of startpos (let's call it endpos). The required range to be placed in each part of the short formula above is therefore startpos:endpos (I can use this notation because startpos and endpos, the values returned from the INDEX function, are both references).
If Excel 365 is available, this can all be expressed much more succinctly and clearly using LET to assign variable names to each part of the formula.
=LET(startCol,MATCH("Week "&A9,$1:$1,0),
startWeight,INDEX($3:$3,startCol),
endWeight,INDEX($3:$3,startCol+6),
startValue,INDEX($5:$5,startCol),
endValue,INDEX($5:$5,startCol+6),
weightRange,startWeight:endWeight,
valueRange,startValue:endValue,
SUMPRODUCT(weightRange,valueRange)/SUM(weightRange))
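The same steps (find the starting column, take a seven-column window, then compute Σwx/Σw) can be sketched in Python as below. This is an illustration under assumptions: weighted_mean_for_week is a hypothetical helper, and the header layout and numbers are invented to resemble the three rows described in the question.

```python
def weighted_mean_for_week(headers, weights, values, week_label, width=7):
    """Weighted mean over the `width` columns starting at the week header.

    headers.index() plays the role of MATCH(..., 0): it raises ValueError
    if the label is missing, much as MATCH would return #N/A.
    """
    start = headers.index(week_label)
    w = weights[start:start + width]
    x = values[start:start + width]
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

# Hypothetical layout: the week label sits above the first day of each week
headers = ["Week 40"] + [""] * 6 + ["Week 41"] + [""] * 6
calls = [10, 20, 30, 40, 0, 0, 0, 5, 5, 0, 0, 0, 0, 0]         # weights
aht = [300, 200, 100, 250, 0, 0, 0, 100, 200, 0, 0, 0, 0, 0]   # values

print(weighted_mean_for_week(headers, calls, aht, "Week 40"))  # 200.0
print(weighted_mean_for_week(headers, calls, aht, "Week 41"))  # 150.0
```

Days with zero calls drop out naturally because their weight is zero, which is the behaviour you want from a call-weighted average.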

Invoice total in row n, payment in row n+k, using dates

I have a table of payments to calculate interest. The column where the payment is applied takes its values from the column where the invoice totals are listed, however, the payment is applied k days after the period ends.
I had partial success doing it using this formula:
IFERROR(INDEX($F$12:$F$25,MATCH(D12,$G$12:$G$25,1)),0)
Column G is a helper column with the dates of payment (basically period end + k). However, because it only accounts for the period end, with monthly and semimonthly periods the nearest lower payment date sometimes fell in the same period, so I MUST also account for the period start to prevent this. I was helped with an array formula like this:
=IFERROR(INDEX(F:F,SMALL(IF(F$12:F12>0,ROW(F$12:F12)),COUNT(1/((C$12:C12-C$12>C$7)+(D$12:D12-C$12>C$7))))),0)
And it works well and it does not require a helper column. But since it's an array formula, and this table is not for my usage, that's not suitable.
I would like to know if I can do this without an array formula, using only built-in Excel 2013 functions.
Edit:
This formula does it:
=SUMPRODUCT(($D$12:$D$25+$C$7>=C12)*($D$12:$D$25+$C$7<=D12)*($F$12:$F$25))
But if there are blanks in column D that result from a formula, it returns an error. So the following formula is more stable:
=SUMIFS($F$12:$F$25,$D$12:$D$25,">="&C12-$C$7,$D$12:$D$25,"<="&D12-$C$7)
This one effectively places the payment rows within the range of dates it belongs to.
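A rough Python sketch of what the SUMIFS version does, in case it helps anyone adapting it. The names and data are hypothetical, and a blank in column D is modelled as None, which is simply skipped rather than raising an error.

```python
from datetime import date, timedelta

def payments_in_period(invoices, period_start, period_end, delay_days):
    """invoices holds (invoice_period_end, total) tuples.

    Sums the totals whose payment date (period end + delay) falls inside
    the current period. Rows with a blank period end are ignored.
    """
    delay = timedelta(days=delay_days)
    return sum(
        total
        for end, total in invoices
        if end is not None and period_start <= end + delay <= period_end
    )

# Hypothetical semimonthly invoices with a 10-day payment delay
invoices = [
    (date(2023, 1, 15), 100.0),  # paid on 25 Jan
    (date(2023, 1, 31), 200.0),  # paid on 10 Feb
    (None, 50.0),                # blank period end
]

print(payments_in_period(invoices, date(2023, 1, 16), date(2023, 1, 31), 10))  # 100.0
print(payments_in_period(invoices, date(2023, 2, 1), date(2023, 2, 15), 10))   # 200.0
```

Checking both the start and the end of the period is what stops a payment from landing in the period it was invoiced in.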
At least you can replace SMALL with AGGREGATE like this:
=IFERROR(INDEX(F:F;AGGREGATE(15;6;(ROW(F$12:F12)/(F$12:F12>0));COUNT(1/((C$12:C12-C$12>C$7)+(D$12:D12-C$12>C$7)))));0)
AGGREGATE(15;6;;) is the same as SMALL, but it ignores errors. This lets you switch out the IF(F$12:F12>0,ROW(F$12:F12)) for the quotient (ROW(F$12:F12)/(F$12:F12>0)). Every row-number divided by a FALSE produces an error which gets ignored by AGGREGATE.
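The division trick can be illustrated in Python like this. It is only a sketch: kth_smallest_ignoring_errors is a hypothetical helper and the rows and amounts are invented. Dividing by False fails in the same way Excel produces #DIV/0!, and the failed rows are dropped before picking the k-th smallest.

```python
def kth_smallest_ignoring_errors(row_numbers, flags, k):
    """Mimic AGGREGATE(15, 6, rows/flags, k).

    Each row number is divided by its flag; a False flag triggers a
    division by zero, and that row is skipped like an ignored error.
    """
    quotients = []
    for row, flag in zip(row_numbers, flags):
        try:
            quotients.append(row / flag)
        except ZeroDivisionError:
            continue
    return sorted(quotients)[k - 1]

# Hypothetical sheet rows 12..16 and the amounts in column F
row_numbers = [12, 13, 14, 15, 16]
amounts = [100, 0, 250, 0, 75]
flags = [a > 0 for a in amounts]

print(kth_smallest_ignoring_errors(row_numbers, flags, 2))  # 14.0
```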
As for the COUNT part, I cannot say what it is doing, as my results don't look like yours (I copied your formula).
I guess it works as an offset.
Can you test my formula and tell me if it works without CSE? Otherwise we need to find a replacement for the COUNT part. I currently can't test it, as I am on Office 365 and don't know if I can activate the old CSE functionality.

I want to use sumproduct with two different tables based on selection

I am working on a statistical model where we use sumproduct to generate forecast values by multiplying coefficients in one table with variables in another. Right now it is being done manually and that is taking time. I would like to automate it but I'm not able to figure this out.
We are using concatenate to identify different rows to use for vlookup. The variable columns are the same in number for both tables. I need to multiply each variable cell respectively in both tables and sum them, hence sumproduct.
This is what I am trying to do:
Forecast model 1 sales for product A in phones in USA = sumproduct([variables by year from table 1 for USA for phones], [Variables for USA phone product A model 1 from table 2] )
I hope someone can help me.
Proof of Concept
You will need to update the references to suit your spreadsheet table locations.
In cell E21 use the following and copy right and down as required:
=SUMPRODUCT(INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0),INDEX($F$15:$H$18,MATCH($A21&$C21&$D21&MID(E$20,16,1),$A$15:$A$18,0),0))
This process was simplified because you had a unique ID tag on each of the previous two tables that could be built from the information in the third table. If you ever get into double digit forecast models the MID() function part of the formula will need to be modified. The 16 in the mid function refers to the character location of the number in the forecast model sales header name in Table 3. As such you either need to keep that header format exactly the same or modify the position of the number in the MID() function.
UPDATE 1
Explanation of Formulas
The following formulas were used in this solution:
SUMPRODUCT
INDEX
MATCH
MID
Concatenate
I will start with the assumption that you already understand SUMPRODUCT(), as you were already using it before you ran into your problem. One thing to note about SUMPRODUCT is that it causes array-like calculation to occur on the portion within its brackets. In this case we fed it two ranges of equal size. The difficult part was more an issue of determining those ranges.
Using your ID columns as a lookup row we used the match() function to determine which row to use. For the first set of variables we used the following to determine which row to look in:
=MATCH($B21&$A21&$C21,$A$3:$A$12,0)
Match is made up of three arguments inside the brackets:
MATCH(what to look for, where to look, type of match)
What we need to look for is a concatenation of various cells in Table 3 that builds the ID used in Table 1. It could have been written using the full formula:
=CONCATENATE($B21,$A21,$C21)
but the short form using & was used instead:
=$B21&$A21&$C21
Once we had what to look for we needed the range of where to look and supplied the ID column from table 1:
$A$3:$A$12
This now leaves the third and final argument of what type of search to perform. An exact match seemed to be the most appropriate match to perform so the value of 0 was supplied. What match returns is the row within the supplied range. It is relative to the range supplied and not the actual row in the spreadsheet. If it cannot make a match it will return an error instead of a row number.
Now that we know what row we want, we can use this information with the INDEX() function. The INDEX() function is made up of 3 arguments as well with the third argument being optional depending on if a 1D or 2D range is being indexed:
INDEX(Range to work with, 2D Row or 1D Position reference, 2D Column reference)
In the case we are dealing with for the first table, the range to work with was your list of variables:
$G$3:$I$12
This is a 2D range. As such we need to tell INDEX() both what Row to look in as well as which Columns to look in. For the row to look in, we used the previously discussed MATCH() function. Since we want all columns and not just a specific column we use the value of 0. If Match returns an error, or if a number greater than the number of rows or columns selected is supplied, INDEX() will return an error. Based on the information discussed, the index function would look like:
=INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0)
You can try entering the above in a single cell, but it will give you an error. If you select three adjacent cells in the same row and use CONTROL+SHIFT+ENTER when entering the formula, Excel will add {} around the formula, making it an array formula, and it should show you the three variables being used.
The same process as described above can be used for determining the second range of variables from Table 2. The only difference here is that the forecast model number was not in a column of its own but instead in the header row surrounded by text. As such the MID() function needed to be used to go into the header row, bypass the surrounding text and pull the model number out so it could be used as part of the concatenation that builds the "what to look for" argument in MATCH():
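As a rough illustration of how MATCH, INDEX and SUMPRODUCT fit together here, the lookup can be sketched in Python. Everything in it is hypothetical (the helper names, the ID strings and the numbers); the real IDs depend on how your tables build them.

```python
def match_exact(key, keys):
    """Mimic MATCH(key, keys, 0): 1-based position relative to the range.
    Raises ValueError when there is no match, much like #N/A in Excel."""
    return keys.index(key) + 1

def index_row(table, row_num):
    """Mimic INDEX(table, row_num, 0): return the whole row."""
    return table[row_num - 1]

def sumproduct(xs, ys):
    """Multiply two equally sized rows element by element and sum."""
    if len(xs) != len(ys):
        raise ValueError("ranges must be the same size")
    return sum(x * y for x, y in zip(xs, ys))

# Hypothetical Table 1: ID column plus three variable columns
table1_ids = ["2019USAPhones", "2020USAPhones"]
table1_vars = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]

# Hypothetical Table 2: ID column plus three coefficient columns
table2_ids = ["USAPhonesA1", "USAPhonesA2"]
table2_coefs = [[0.5, 1.0, 2.0], [1.0, 1.0, 1.0]]

# Build each lookup key by concatenation, as the & operator does
year, country, category, product, model = "2020", "USA", "Phones", "A", "1"
variables = index_row(table1_vars, match_exact(year + country + category, table1_ids))
coefs = index_row(table2_coefs, match_exact(country + category + product + model, table2_ids))

print(sumproduct(variables, coefs))  # 12.0
```

The position returned by match_exact is relative to the list it searched, not to the sheet, which is the same point made above about MATCH.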
=MID(E$20,16,1)
The MID() function works, again, with three arguments:
MID(Text to look in, which character to start at, how many characters to pull)
So in this case we are looking at the header in E20. Note the lock $ on the row number so the formula is always looking in row 20 no matter how far down it gets copied. It then goes to the 16th character, in this case the character "1", and pulls 1 character. If the header had just been 1 and 2, there would be no need for the MID function and the cell reference (with proper lock) could have been used directly.
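A small Python sketch of the MID() step, to show why double-digit model numbers would need a change. The header text "Forecast model 1 sales" is an assumption about what E20 contains (it happens to put the number at character 16), and mid is a hypothetical helper.

```python
def mid(text, start_num, num_chars):
    """Mimic MID(text, start_num, num_chars) with a 1-based start."""
    return text[start_num - 1:start_num - 1 + num_chars]

# Assumed header text for E20; the real header may differ
header = "Forecast model 1 sales"
print(mid(header, 16, 1))  # 1

# With a double-digit model number, pulling one character is not enough
long_header = "Forecast model 12 sales"
print(mid(long_header, 16, 1))  # 1
print(mid(long_header, 16, 2))  # 12
```

Because the start position is hard-coded, any change to the wording before the number shifts the result as well, so the header format has to stay fixed.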

Counting the number of occurrences of a value dynamically

Not sure if I've worded the question correctly... but, I have a spreadsheet that imports data across with a 'transaction date' and on day 1 there may be 15 transactions, day 2 there may be 30 etc.
Now I already have a formula that is counting how MANY are imported each day
=SUMPRODUCT((MONTH('Further Evidence'!$A$2:$A$5000)=MONTH(DATEVALUE(Configuration!H2&" 1")))*('Further Evidence'!$A$2:$A$5000<>""))
That shows how many have come in that particular month, what I need to work out now is what the highest intake was during that month (and if possible, which day it was).
Rather than list 365 days of the year and doing a countif in every cell next to them, is there an intuitive way to only count values that exist in the list?
It will be simple for one of you, but I can't quite figure it out or what to google :)
edit -
=MAX(FREQUENCY('New Appeals'!A2:A5000,MONTH('New Appeals'!A2:A5000)))
This works for the whole list of dates, but how can I make it check months specifically, or pinpoint the specific day?
To find the max value within a given month you can use an array formula like below
I've used a sample range of rows 36 to 48. I've assumed that date is in column I and that transactions is in column J
=MAX(IF(TEXT($I$36:$I$48, "mmm")="jan", $J$36:$J$48, ""))
(To enter an array formula you have to press ctrl + shift + enter when you are in the cell)
This is restricting the MAX function to the month of jan.
You can then find the day associated with this max value by using another array formula that combines MATCH and INDEX. The MATCH first looks for the max value (assumed here to be the result of the formula above, held in K34) within the range of cells associated with the given month and returns its position; the third argument of 0 forces an exact match, since the counts are not sorted. This position is then used in the INDEX to return the date
=INDEX($I$36:$I$48, MATCH(K34, IF(TEXT($I$36:$I$48, "mmm")="jan", $J$36:$J$48, ""), 0))
Please note that if you have two days within a month with the same max then it will just bring back the first one
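For comparison, the whole task (count the transactions per day without listing every day of the year, then find the busiest day of a month) can be sketched in Python as follows. It is a sketch with made-up dates, and busiest_day is a hypothetical helper; only dates that actually appear in the list are counted.

```python
from collections import Counter
from datetime import date

def busiest_day(transaction_dates, month):
    """Return (day, count) for the highest daily intake in the given month,
    or None if the month has no transactions. On a tie, the day that
    appears first in the list wins, like the exact-match MATCH above."""
    counts = Counter(d for d in transaction_dates if d.month == month)
    if not counts:
        return None
    peak = max(counts.values())
    first_day = next(d for d, n in counts.items() if n == peak)
    return first_day, peak

# Hypothetical imported transaction dates
transaction_dates = (
    [date(2023, 1, 3)] * 2
    + [date(2023, 1, 5)] * 4
    + [date(2023, 1, 9)] * 4
    + [date(2023, 2, 1)] * 6
)

print(busiest_day(transaction_dates, 1))  # (datetime.date(2023, 1, 5), 4)
print(busiest_day(transaction_dates, 3))  # None
```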
Hope this helps

excel count/sum stop count/sum match?

I have tried to see if this question has been asked before, but I can't seem to find an answer.
I have a column of cells (>3000 rows), with either a value of 1 or 0 (call this column A). The value will depend on the value in column B (which will contain either a value or nothing). The values in column B come from a SUMIFS function that sums column C based on the months in column D.
The values in B are paid out on the first business day of the next month. So, the SUMIFS function will calculate the dates that match the last month. This works well in theory, however, not every first business day is the first day of the month. This leads the SUMIFS function to not include everything in the correct month, and allows for some discrepancy, which, when you are dealing with people's money is not great. Further, this discrepancy is compounded across multiple periods (in some cases, there are over 100 periods, and a discrepancy of $1 in period 1 amounts to nearly $1000 in period 100)
What I am wondering is:
Is there any way that I can tell the SUMIFS function (column B) to stop when the value in column A is 0? This would tell the SUM function start the summing from the current value in column B and continue the function to the cell below the preceding value in column B.
I've seen suggestions that the MATCH function may work, but I can't see how to do this with either COUNT or SUM.
For security reasons, this solution needs to be entered into the cell, and can't be VBA. Also, it can't be too large, as it will need to be replicated across 200 worksheets in the workbook (not my file originally, and I would have done it differently, but that is another story). There is no problem entering another column or two if that is required.
Any help is gratefully appreciated.
EDIT:
Unfortunately, I can't post an image of the screenshot. I've included a similar screenshot (columns are not the same layout, but hopefully it gives the idea) here:
Rates calculations
The SUMIFS formula is (for B2)
=SUMIFS(C2:C35,D2:D35,D2-1,A2:A35,1)
This works fine if I want all the values in the month, irrelevant of when the payment was made.
However, what I need the formula to do is:
SUM (C2:C35,D2:D35,D2-1, but stop when the first 0 is encountered in A2:A35)
Thanks
The INDEX function can return a valid cell reference to use as the end of each range, with a MATCH function finding the first exact match on 0 in column A so the sum stops there.
This formula is a bit of a guess as there was no sample data to reference but I believe I have understood your parameters.
=SUMIFS(C2:INDEX(C2:C35, MATCH(0, A2:A35, 0)), D2:INDEX(D2:D35, MATCH(0, A2:A35, 0)), D2-1)
This seems to be something that will stand-alone and not be filled down so I have left the cell addresses relative as per your sample(s).
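To show the "stop at the first 0" idea outside of Excel, here is a minimal Python sketch. The function name and the sample columns are hypothetical. As in the formula, the row containing the first 0 is still included in the summed range.

```python
def sum_until_first_zero(flags, amounts, months, target_month):
    """Sum amounts whose month equals target_month, looking only at rows
    from the top down to (and including) the first 0 in flags.

    flags.index(0) plays the role of MATCH(0, A2:A35, 0); it raises
    ValueError if there is no 0, where MATCH would return #N/A.
    """
    stop = flags.index(0)
    return sum(
        amount
        for amount, month in zip(amounts[:stop + 1], months[:stop + 1])
        if month == target_month
    )

# Hypothetical columns A, C and D
flags = [1, 1, 1, 0, 1, 1]
amounts = [10, 20, 30, 40, 50, 60]
months = [5, 5, 6, 5, 5, 5]

print(sum_until_first_zero(flags, amounts, months, 5))  # 70
```

Without the cut-off, the same criteria would also pick up the last two rows and return 180.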
