VBA function for Upside/Downside Capture - excel

Apologies for my ignorance, I'm brand new to VBA - I'm sure this is a simple problem...
I'm trying to write a function for upside/downside capture in VBA. This is the problem:
There are two columns. One has fund performance in % (which I've labelled 'returns'). The other has index performance in % (labelled 'index'). Both have the same number of rows. I need both to be arguments to the function.
For the UpsideCapture function, for every number in the index column that is > 0, I want to find the corresponding number in the returns column (which will be on the same row). Once I have those numbers I can compound them.
I've tried using Offset, assuming the returns column is 15 columns to the left of the index column, but it doesn't return anything, and I don't really want to rely on it always being 15 columns apart (it's arbitrary).
Many thanks!
One of my rubbish attempts is below. Any help is much appreciated. It's really just a case of finding the correct corresponding row based on the value in the index column...
Function UpsideCapture(returns As Range, index As Range) As Variant
Dim n As Integer
Dim m As Integer
Dim i As Integer
n = returns.Rows.Count
m = index.Rows.Count
For i = 1 To m
If index(i) > 0 Then
Upsidecap = ((1 + Upsidecap) * (1 + Offset(returns(i), -15))) - 1
End If
Next
UpsideCapture = Upsidecap
End Function
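A minimal sketch of one way this could be done, not a tested answer from the thread: index both ranges by position instead of using Offset, so it no longer matters how far apart the columns are. The name UpsideCaptureSketch is made up here, and the code assumes both ranges are single columns of numeric cells that hold returns as fractions (a cell formatted as 5% stores 0.05).

```vba
Function UpsideCaptureSketch(returns As Range, index As Range) As Variant
    Dim i As Long
    Dim compounded As Double

    ' Assumption: the two ranges line up cell for cell.
    If returns.Cells.Count <> index.Cells.Count Then
        UpsideCaptureSketch = CVErr(xlErrRef)
        Exit Function
    End If

    compounded = 1
    For i = 1 To index.Cells.Count
        ' The i-th cell of each range sits on the same row,
        ' so no Offset is needed.
        If index.Cells(i).Value > 0 Then
            compounded = compounded * (1 + returns.Cells(i).Value)
        End If
    Next i

    ' Compounded fund return over the periods where the index was up.
    UpsideCaptureSketch = compounded - 1
End Function
```

In a worksheet it would be called as something like =UpsideCaptureSketch(A2:A61,P2:P61). This only compounds the fund returns for the up periods, as described in the question; dividing by the index's own compounded return over the same periods would be a further step.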

Related

New to VBA in Excel. Trying to sum an incremented function

So what I am trying to do is take the difference between two values (x) and (y) to get (n). I then want to run (x) through a formula (n) times, incrementing (x) each time. Then I want to output the sum of all of those results into a cell. I can't figure out how to do it neatly within one cell like normal, so I've turned to VBA for once.
Here is what I have so far:
Sub Math()
'
'Math
'
Dim i As Integer
i = 0
Do While i < ((E42) - (d42))
cell(h42).Value = ((((d42) + i) ^ 2) * 100) / 3
End Sub
What I'm stuck on is how to get the result of each loop and sum them all together. I expect to have an i value that can range anywhere from 1-100. The only way I can think of that would definitely work is messy: I would have a large number of cells in a column set aside to calculate each of the iterations individually, then sum all of those together.
Alternatively, if there's a way to write a function that can calculate the sum(n) of ((x+n)^2)*100/3 then that would be much nicer, but I can't think of how it would be written.
Here is how you can make a function (which can be used directly in worksheet formulas) to form a sum:
Function eval_sum(n As Long, x As Double) As Double
Dim s As Double, i As Long
For i = 0 To n - 1
s = s + (x + i) ^ 2
Next i
eval_sum = s * 100 / 3
End Function
This function evaluates:
100/3 * (x^2 + (x+1)^2 + (x+2)^2 + ... + (x+(n-1))^2)
It wasn't completely clear if this is what you were trying to do. If not, you can tweak the code to fit your needs.
Small point: I used Long rather than Integer. Forget that Integer exists. It is really legacy from the days of limited memory and 16-bit computers. In modern Excel an Integer is stored with 32 bits anyway. You are just risking overflow for no good reason.
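If a macro that writes the result into a cell is still wanted, the same accumulation could be sketched as a Sub. This is only an illustration under assumptions: the cell addresses D42, E42 and H42 are taken from the question, the sheet is assumed to be the active one, and the name WriteSumToCell is made up.

```vba
Sub WriteSumToCell()
    Dim x As Double, total As Double
    Dim n As Long, i As Long

    With ActiveSheet
        x = .Range("D42").Value
        ' Number of iterations = difference between the two cells.
        n = CLng(.Range("E42").Value - x)

        ' Keep a running total instead of overwriting the cell on each pass.
        For i = 0 To n - 1
            total = total + ((x + i) ^ 2) * 100 / 3
        Next i

        .Range("H42").Value = total
    End With
End Sub
```

The worksheet-function route avoids the macro entirely: with the function from the answer, =eval_sum(E42-D42,D42) in H42 should give the same number.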

How can I lookup data from one column, when the value I'm referencing changes columns?

I want to do an INDEX-MATCH-like lookup between two documents, except my MATCH's index array doesn't stay in one column.
In Vague-English: I want a value from a known column that matches another value that may be found in any column.
Refer to the image below. Let's call everything to the left of the bold vertical line on column H doc1, and the right side will be doc2.
Doc2 has a column "Find This", which will be the INDEX's array. It is compared with "ID1" from doc1 (Note that the values in "Find This" will not be in the same order as column ID1, but it's easier to understand this way).
The "[Result]" column in doc2 will be the value from doc1's "Want This" column from the row that matches "FIND THIS" ...However, sometimes the value from "FIND THIS" is not in the "ID1" column, and is instead in "ID2","ID3", etc.
So, I'm trying to generate Col K from Col J. This would be like pressing Ctrl+F and searching for a value in Col J, then taking the value from Col D in that row and copying it to Col K.
I made identical values from a column the same color in the other doc to make it easier to visualize where they are coming from.
Note also that in column F of doc1, the same value from doc2's "Find This" can be found after some other text.
Also note that the column headers are only there as examples, the ID columns are not actually numbered.
I would simply hard-code the correct column to search from, but I'm not in control of doc1, and I'm worried that future versions may have new "ID" columns, with others being removed.
I'd prefer this to be a solution in the form of a formula, but VB will do.
To generate column K based on the given values of column J, you could use the following:
=INDEX(doc1!$D$2:$D$14,SUMPRODUCT((doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14))-1)
Copy that formula down as far as you need to go.
It basically returns only the row where a match for the column J value is found. We then find that row in the index of your D range to get your value in K.
UPDATE:
If you are working with non-unique entries in column J, that is, the value on its own can be found in multiple rows and columns, consider using the following to return the last row where the J value is found:
=INDEX(doc1!$D$2:$D$14,AGGREGATE(14,6,(doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14),1)-1)
UPDATE 2:
And to return the first row where what you are looking in column J is found use:
=INDEX($D$2:$D$14,AGGREGATE(15,6,1/($B$2:$H$14=J2)*ROW($B$2:$H$14)-1,1))
Thanks to Scott Craner for the hint on the minimum formula.
To determine if you have UNIQUE data from column J in your range B2:H14 you can enter this array formula. To enter an array formula you need to press CTRL+SHIFT+ENTER at the same time, not just ENTER. You will know you have done it right when you see {} around your formula in the formula bar. You cannot add the {} manually.
=IF(MAX(COUNTIF($B$2:$H$14,J2:J14))>1,"DUPLICATES","UNIQUE")
UPDATE 3
AGGREGATE - A relatively new function to me, but it goes back to Excel 2010. AGGREGATE is 19 functions rolled into 1. It would be nice if they all worked the same way, but they do not. I think it is the functions numbered 14 and up that will perform the same way as an array formula (or a CSE formula if you prefer). The nice thing is you do not need to use CSE when entering or editing them. SUMPRODUCT is another example of a regular formula that performs array formula calculations.
The meat of this explanation, I believe, is what is happening inside the AGGREGATE brackets. If you click on the link you will get an idea of what the first two arguments are. The first defines which function you are using, and the second tells AGGREGATE how to deal with errors, hidden rows, and some other nested functions. That is the relatively easy part. What I believe you want to know is what is happening with this:
(doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14)
For illustrative purposes, let's reduce this formula to something a little smaller in scale that does the same thing. I'll avoid starting in A1, as that can make life a little easier when counting since it's the 1st row and first column. By placing the example range away from it you can potentially see some more special considerations.
What I want to know is which row of column B each of the items listed in column C occurs in.
  | B        | C
3 | DOG      | PLATYPUS
4 | CAT      | DOG
5 | PLATYPUS |
The full formula for our mini example would be:
=INDEX($B$3:$B$5,AGGREGATE(14,6,($B$3:$B$5=C2)*ROW($B$3:$B$5),1)-2)
And we are going to look at the following part as an array:
{=($B$3:$B$5=C2)*ROW($B$3:$B$5)}
So the first set of brackets is going to be a Boolean array, as you noted. Every cell that is TRUE will stay TRUE until it's forced into a math calculation. When that happens, TRUE becomes 1 and FALSE becomes 0. If that formula was entered as a CSE formula and placed in D2, it would break down as follows:
FALSE X 3
FALSE X 4
TRUE X 5
The 3, 4 and 5 come from ROW() returning the row number that it is dealing with at the time of the array math operation. Little trick: we could have used ROW(1:3). We just need to make sure the size of the array matches! This is not matrix math, it's just straight-across multiplication. And since the Boolean is now experiencing a math operation, we are now looking at:
0 X 3 = 0
0 X 4 = 0
1 X 5 = 5
So the array of {0,0,5} gets handed back to AGGREGATE for more processing. The important thing to note here is that it contains ONLY 0 and the individual row numbers where we had a match. In the first AGGREGATE formula, function 14 was chosen, which is the LARGE function. We also told it to ignore errors, which in this particular case does not matter. After the array, there was a ,1) to finish off the AGGREGATE function. The 1 tells AGGREGATE that we want the 1st largest number, i.e. the first value when the array is sorted from largest to smallest. If that number was 2 it would be the 2nd largest number, and so on. So the last row, or the only row, that something is found on is returned. In our small example that is 5.
But wait, that 5 was buried inside another function called INDEX. In our small example that INDEX formula would be:
=INDEX($B$3:$B$5,AGGREGATE(...)-2)
Well, we know that the range is only 3 rows long, so asking for the 5th row would have Excel smacking you upside the head with an error because your index number is out of range. So in comes the header row correction of -1 in the original formula, or -2 for the small example, and what we really see for the small example is:
=INDEX($B$3:$B$5,5-2)
=INDEX($B$3:$B$5,3)
And here is a weird bit of info: that last one does not become PLATYPUS... it becomes the cell reference =B5, which pulls PLATYPUS. But that little nuance is a story for another time.
Now, in the comments Scott essentially told me to invert to force an error in order to get the first row. This is an important step for AGGREGATE, and it had me running in circles for a while. So the full equation for the first row option in our mini example is
=INDEX($B$3:$B$5,AGGREGATE(15,6,1/($B$3:$B$5=C2)*ROW($B$3:$B$5),1)-2)
And what Scott Craner was actually suggesting, which skips one math step, is:
=INDEX($B$3:$B$5,AGGREGATE(15,6,ROW($B$3:$B$5)/($B$3:$B$5=C2),1)-2)
However, since I only realized this after writing all this up, the explanation will continue with the first of these two equations.
So the important thing to note here is the change from function 14 to function 15, which is SMALL. Think of it as finding the minimum. And this time that 6 plays a huge role, along with the 1/. So our array in the middle this time equates to:
1/FALSE X 3
1/FALSE X 4
1/TRUE X 5
Which then becomes:
1/0 X 3
1/0 X 4
1/1 X 5
Which then has Excel slapping you upside the head again because you are trying to divide by 0:
#DIV/0! X 3
#DIV/0! X 4
1/1 X 5
But you were smart and you protected yourself from that slap upside the head when you told AGGREGATE to ignore errors by using 6 as the second argument! Therefore what is above becomes:
{5}
Since we are performing a SMALL, and we passed ,1) as the closing part of the AGGREGATE, we have essentially said give me the minimum row number or the 1st smallest number of the resulting array when sorted in ascending order.
The rest plays out the same as it did for the LARGE AGGREGATE method. The pitfall I fell into originally is I did not use the 1/ to force an error. As a result, every time I tried getting the SMALL of the array I was getting 0 from all the false results.
SUMPRODUCT works in a very similar fashion, but only works when the result array in the middle contains just 1 non-zero answer. The reason is that the last step of the SUMPRODUCT function is to add all the individual elements of the resulting array. So if you only have 1 non-zero element, you get that non-zero number. If you had two rows that matched, for instance 12 and 31, then the SUMPRODUCT method would return 43, which is not any of the row numbers you wanted, whereas AGGREGATE LARGE would have told you 31 and AGGREGATE SMALL would have told you 12.
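To make that last comparison concrete, here is a small VBA sketch that does by hand what the three approaches do to a list of matching row numbers. The rows 12 and 31 are the hypothetical matches from the paragraph above, and the Sub name is made up for illustration.

```vba
Sub CompareRowPicks()
    Dim rowsFound As Variant
    Dim i As Long
    Dim total As Long, firstRow As Long, lastRow As Long

    rowsFound = Array(12, 31)   ' hypothetical matching rows

    firstRow = rowsFound(LBound(rowsFound))
    lastRow = firstRow
    For i = LBound(rowsFound) To UBound(rowsFound)
        total = total + rowsFound(i)                            ' what SUMPRODUCT does
        If rowsFound(i) < firstRow Then firstRow = rowsFound(i) ' AGGREGATE 15 (SMALL)
        If rowsFound(i) > lastRow Then lastRow = rowsFound(i)   ' AGGREGATE 14 (LARGE)
    Next i

    Debug.Print "Sum of rows:", total
    Debug.Print "First row:", firstRow
    Debug.Print "Last row:", lastRow
End Sub
```

Only the first and last values are usable as row numbers; the sum is meaningful only when there is exactly one match.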
Something like this maybe, starting in K2 and copied down:
=IFERROR(INDEX(D:D,MAX(IFERROR(MATCH(J2,B:B,0),-1),IFERROR(MATCH(J2,E:E,0),-1),IFERROR(MATCH(J2,G:G,0),-1),IFERROR(MATCH(J2,H:H,0),-1))),"")
If you want to keep the positions of the columns for the Match variable, consider creating generic range names for each column you want to check, like "Col1", "Col2", "Col3". Create a few more range names than you think you will need and reference them to =$B:$B, =$E:$E etc. Plug all range names into Match functions inside the Max() statement as above.
When columns are added or removed from the table, adjust the range name definitions to the columns you want to check.
For example, if you set up the formula with five Matches inside the Max(), and the table changes so you only want to check three columns, point three of the range names to the same column. The Max() will only return one result and one lookup, even if the same column is matched several times.
I came up with a vba solution if I understood correctly:
Sub DisplayActiveRange()
    Dim sheetToSearch As Worksheet
    Set sheetToSearch = Sheet2
    Dim sheetToOutput As Worksheet
    Set sheetToOutput = Sheet1
    Dim search As Range
    Dim searchCol As String
    searchCol = "J"
    Dim outputCol As String
    outputCol = "K"
    Dim valueCol As String
    valueCol = "D"
    Dim r As Range
    Dim currentRow As Long
    Dim maxRow As Long
    ' Last row of the used range on the output sheet
    maxRow = sheetToOutput.UsedRange.Row + sheetToOutput.UsedRange.Rows.Count - 1
    For currentRow = 1 To maxRow
        Set search = sheetToOutput.Range(searchCol & currentRow)
        If search.Value <> "" Then
            For Each r In sheetToSearch.UsedRange
                If r.Value <> "" Then
                    If r.Value = search.Value Then
                        ' Take the value from the row where the match was found
                        sheetToOutput.Range(outputCol & currentRow).Value = sheetToSearch.Range(valueCol & r.Row).Value
                        Exit For
                    End If
                End If
            Next r
        End If
    Next currentRow
End Sub
There might be better ways of doing it, but this will give you what you want. We assume headers in both the "source" and "destination" sheets. You will need to adapt the "Const" declarations according to how your sheets are named. Press Alt+F11 in Excel to bring up the VBA window, copy and paste this code into "ThisWorkbook" under the "VBAProject" group, then select "Run" from the menu:
Option Explicit
Private Const sourceSheet = "Source"
Private Const destSheet = "Destination"
Public Sub FindColumns()
Dim rowCount As Long
Dim foundValue As String
Sheets(destSheet).Select
rowCount = 1 'Assume a header row
Do While Range("J" & rowCount + 1).value <> ""
rowCount = rowCount + 1
foundValue = FncFindText(Range("J" & rowCount).value)
Sheets(destSheet).Select
Range("K" & rowCount).value = foundValue
Loop
End Sub
Private Function FncFindText(value As String) As String
Dim rowLoop As Long
Dim colLoop As Integer
Dim found As Boolean
Dim pos As Long
Sheets(sourceSheet).Select
rowLoop = 1
colLoop = 0
Do While Range(alphaCon(colLoop + 1) & rowLoop + 1).value <> "" And found = False
rowLoop = rowLoop + 1
Do While Range(alphaCon(colLoop + 1) & rowLoop).value <> "" And found = False
colLoop = colLoop + 1
pos = InStr(Range(alphaCon(colLoop) & rowLoop).value, value)
If pos > 0 Then
FncFindText = Mid(Range(alphaCon(colLoop) & rowLoop).value, pos, Len(value))
found = True
End If
Loop
colLoop = 0
Loop
End Function
Private Function alphaCon(aNumber As Integer) As String
Dim letterArray As String
Dim iterations As Integer
letterArray = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
If aNumber <= 26 Then
alphaCon = (Mid$(letterArray, aNumber, 1))
Else
If aNumber Mod 26 = 0 Then
iterations = Int(aNumber / 26)
alphaCon = (Mid$(letterArray, iterations - 1, 1)) & (Mid$(letterArray, 26, 1))
Else
'we deliberately round down using 'Int' as anything with decimal places is not a full iteration.
iterations = Int(aNumber / 26)
alphaCon = (Mid$(letterArray, iterations, 1)) & (Mid$(letterArray, (aNumber - (26 * iterations)), 1))
End If
End If
End Function

Get maximum value of columns in Excel with macro

First of all I have no idea of writing macros in excel, but now I have to write a code for a friend. So here we go.
In my Excel sheet I have a table which holds some producers as columns and the 12 months of the year as rows. Each intersecting cell holds the amount of products produced by that producer during that month. Now I need to find the maximum and minimum values of produced goods within each month and output the producers of those goods.
I found a code for a similar problem, but I don't understand it clearly and it has errors.
Here is the code:
Sub my()
Dim Rng As Range, Dn As Range, Mx As Double, Col As String
Set Rng = Range(Range("A1"), Range("A6").End(xlUp))
ReDim ray(1 To Rng.Count)
For Each Dn In Rng
Mx = Application.Max(Dn)
Select Case Mx
Case Is = Dn.Offset(, 0): Col = "A"
Case Is = Dn.Offset(, 1): Col = "B"
Case Is = Dn.Offset(, 2): Col = "C"
Case Is = Dn.Offset(, 3): Col = "D"
End Select
ray(Dn.Row - 1) = Col
Next Dn
Sheets("Sheet2").Range("A2").Resize(Rng.Count) = Application.Transpose(ray)
End Sub
I get the following error:
Run-time error '9': Subscript out of range.
So my question is, what does this error mean and what do I need to change in this code to work?
EDIT1:
OK, now the error is gone. But where do I get the results?
EDIT2
I know this line is responsible for inserting the results in the specified place, but I can't see them after execution. What's wrong with that?
The error means the index you are trying to access is outside the bounds the array was defined with. For example, an array with 10 positions numbered 0-9 would throw that error if you tried to access array(10) or array(-1).
VBA arrays are 0-based by default, but ray is declared here with ReDim ray(1 To Rng.Count), so its first valid index is 1. Since Rng starts at A1, Dn.Row - 1 is 0 on the first pass, which is out of range.
Possibly change
ray(Dn.Row - 1) = Col
to
If Dn.Row - 1 >= 1 Then ' use >= 0 instead if the array were zero-based
ray(Dn.Row - 1) = Col
End If
You don't need VBA (a macro) to do this. It can be done using a worksheet formula.
E.g.
If your producers are P1,P2,P3,P4 and your sheet looks like this:-
A B C D E F
+-------------------------------------------
1 | Month P1 P2 P3 P4 Top Producer
2 | Jan 5 4 3 2
3 | Feb 2 3 5 1
4 | Mar 6 4 4 3
...
...
The following formula placed in cells F2,F3,F4,... will pick out the top producer in each month.
=INDEX($B$1:$E$1,MATCH(MAX(B2:E2),B2:E2,0))
Generally it's better to try and use built in Excel functionality where possible. Resort to VBA only if you really need to. Even if you were to use the top producer/month data for some other operation which is only possible in VBA, at least the top producer/month data derivation is done for you by the worksheet, which will simplify the VBA required for the whole process.
Transposing a range can also be done using a worksheet formula by using the TRANSPOSE() function.
BTW - I'm not sure what you want to do if two producers have the same output value. In the VBA example in your question, the logic seems to be:- if two producers are joint top in a month, pick the first one encountered. The formula I've given above should replicate this logic.
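If a macro is wanted anyway, the same MAX/MATCH idea could be sketched in VBA roughly as follows. It assumes the layout from the example above (months in column A, producers P1 to P4 in B:E, headers in row 1) on the active sheet; writing the bottom producer to column G and the name TopAndBottomProducers are choices made up for this sketch.

```vba
Sub TopAndBottomProducers()
    Dim ws As Worksheet
    Dim hdr As Range, data As Range
    Dim r As Long, lastRow As Long
    Dim pos As Variant

    Set ws = ActiveSheet
    Set hdr = ws.Range("B1:E1")                       ' producer names (assumed layout)
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 2 To lastRow
        Set data = ws.Range("B" & r & ":E" & r)

        ' Position of the first cell equal to the row maximum
        pos = Application.Match(Application.Max(data), data, 0)
        If Not IsError(pos) Then ws.Cells(r, "F").Value = hdr.Cells(1, pos).Value

        ' Same idea for the minimum, written to column G
        pos = Application.Match(Application.Min(data), data, 0)
        If Not IsError(pos) Then ws.Cells(r, "G").Value = hdr.Cells(1, pos).Value
    Next r
End Sub
```

As with the formula, ties go to the first producer encountered, because MATCH with a match type of 0 returns the first exact match.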
I have used these functions quite extensively and they are very reliable and fast:
Public Function CountRows(ByRef r As Range) As Long
CountRows = r.Worksheet.Range(r, r.End(xlDown)).Rows.Count
End Function
Public Function CountColumns(ByRef r As Range) As Long
CountColumns = r.Worksheet.Range(r.End(xlToRight), r).Columns.Count
End Function
Give it a reference (e.g. "A2") and it will return the number of filled cells down, or to the right, until an empty cell is found.
To select multiple cells I usually do something like
Set r = Range("A2")
N = CountRows(r)
Set r = r.Resize(N,1)

Excel VBA - Referring between ranges

Here's my problem:
I have two ranges, r_products and r_ptypes which are from two different sheets, but of same length i.e.
Set r_products = Worksheets("Products").Range("A2:A999")
Set r_ptypes = Worksheets("SKUs").Range("B2:B999")
I'm searching for something in r_products and I've to select the value at the same position in r_ptypes. The result of Find method is being stored in cellfound. Now, consider the following data:
Sheet: Products
A B C D
1 Product
2 S1
3 P1
4 P2
5 S2
6 S3
Sheet: SKUs
A B C D
1 SKU
2 S1-RP003
3 P1-BQ900
4 P2-HE300
5 S2-NB280
6 S3-JN934
Now, when I search for S1, cellfound.Row gives me value 2, which is, as I understand, 2nd row in the total worksheet, but is actually 1st row in the range(A2:A999).
When I use this cellfound.Row value to refer to r_ptypes.Cells(cellfound.Row), it is taken as an index value and returns B3 (P1-BQ900) instead of what I want, i.e. B2 (S1-RP003).
My question is: how do I find out the index number of cellfound? If that's not possible, how can I use the row number to extract data from r_ptypes?
Dante's solution above works fine. Also, I managed to get the index value using built in excel function Match instead of using Find method of a range. Listing it here for reference.
indexval = Application.WorksheetFunction.Match("searchvalue", r_products, 0)
Using the above, I'm now able to refer the rows in r_ptypes
skuvalue = r_ptypes.Rows(indexval).Value
This happens because .Row always returns the absolute row number on the sheet, not the offset (i.e. index) within the range.
So, just do a subtraction to deal with it.
For your example,
r_ptypes.Cells(cellfound.Row - r_ptypes.Cells(1).Row + 1)
or a little bit neater (?)
With r_ptypes
.Cells(cellfound.Row - .Cells(1).Row + 1)
End With
That is, take the row difference between cellfound and the first cell of the range, then add 1 because Excel counts cells from 1.
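Putting the Find call and the subtraction together, a rough end-to-end sketch could look like the one below. The sheet names and ranges are the ones from the question; the search value "S1", the Find arguments chosen and the name LookupSkuSketch are assumptions for illustration.

```vba
Sub LookupSkuSketch()
    Dim r_products As Range, r_ptypes As Range
    Dim cellfound As Range
    Dim idx As Long

    Set r_products = Worksheets("Products").Range("A2:A999")
    Set r_ptypes = Worksheets("SKUs").Range("B2:B999")

    Set cellfound = r_products.Find(What:="S1", LookIn:=xlValues, LookAt:=xlWhole)
    If cellfound Is Nothing Then
        MsgBox "Value not found"
        Exit Sub
    End If

    ' Convert the absolute sheet row into a 1-based position inside r_products.
    idx = cellfound.Row - r_products.Row + 1

    ' The same position in r_ptypes is the matching SKU.
    Debug.Print r_ptypes.Cells(idx).Value
End Sub
```

Subtracting the first row of r_products (rather than of r_ptypes) keeps this working even if the two ranges were to start on different rows, as long as they have the same length.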

SumProduct over sets of cells (not contiguous)

I have a total data set that is for 4 different groupings. One of the values is the average time, the other is count. For the Total I have to multiply these and then divide by the total of the count. Currently I use:
=SUM(D32*D2,D94*D64,D156*D126,D218*D188)/SUM(D32,D94,D156,D218)
I would rather use a SumProduct if I can to make it more readable. I tried to do:
=SUMPRODUCT((D2,D64,D126,D188),(D32,D94,D156,D218))/SUM(D32,D94,D156,D218)
But as you can tell by my posting here, that did not work. Is there a way to do SumProduct like I want?
I agree with the comment "It might be possible with masterful excel-fu, but even if it can be done, it's not likely to be more readable than your original solution"
A possible solution is to embed the CHOOSE() function within your SUMPRODUCT (this trick actually is pretty handy for vlookups, finding conditional maximums, etc.).
Example:
Let's say your data has eight observations and is in two columns (columns A and B), but you don't want to include some observations (exclude the observations in rows 4 and 5). Then the SUMPRODUCT formula looks like this...
=SUMPRODUCT(CHOOSE({1,2},A1:A3,A6:A8),CHOOSE({1,2},B1:B3,B6:B8))
I actually thought of this on the fly, so I don't know the limitations and as you can see it is not that pretty.
Hope this helps! :)
It might be possible with masterful excel-fu, but even if it can be done, it's not likely to be more readable than your original solution. The problem is that even after 20+ years, Excel still borks discontinuous ranges. Naming them won't work, array formulas won't work and as you see with SUMPRODUCT, they don't generally work in tuple-wise array functions. Your best bet here is to come up with a custom function.
UPDATE
Your question got me thinking about how to handle discontinuous ranges. It's not something I've had to deal with much in the past. I didn't have the time to give a better answer when you asked the question, but now that I've got a few minutes, I've whipped up a custom function that will do what you want:
Function gvSUMPRODUCT(ParamArray rng() As Variant) As Variant
    Dim sumProd As Double
    Dim values() As Double
    Dim n As Long, half As Long, i As Long
    Dim r As Variant, c As Range

    ' Flatten every cell of every argument into one array, in the order given.
    For Each r In rng
        For Each c In r.Cells
            ReDim Preserve values(0 To n)
            values(n) = c.Value
            n = n + 1
        Next c
    Next r

    ' An odd number of cells (or none) cannot be split into two equal halves.
    If n = 0 Or n Mod 2 = 1 Then
        gvSUMPRODUCT = CVErr(xlErrValue)
        Exit Function
    End If

    ' Multiply the i-th cell of the first half by the i-th cell of the second half.
    half = n \ 2
    For i = 0 To half - 1
        sumProd = sumProd + values(i) * values(i + half)
    Next i
    gvSUMPRODUCT = sumProd
End Function
Some notes:
Excel enumerates the cells of a range row by row (left to right, then down), so if you have a continuous range where the data is organized by column, you have to pass the columns as separate ranges: gvSUMPRODUCT(A1:A10,B1:B10) and not gvSUMPRODUCT(A1:B10).
The function works by pairwise multiplying the first half of cells with the second and then summing those products: gvSUMPRODUCT(A1,C3,L2,B2,G5,F4) = A1*B2 + C3*G5 + L2*F4. I.e. order matters.
You could extend the function to include n-wise multiplication by doing something like gvNSUMPRODUCT(n,ranges).
If there is an odd number of cells (not ranges), it returns the #VALUE! error.
Note that sumproduct(a, b) = sumproduct(a1, b1) + sumproduct(a2, b2), where range a is split into ranges a1 and a2 (and similarly for b).
It might be helpful to create an intermediate table that summarizes the data that you are using to calculate the sum product. That would also make the calculation easier to follow.
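The splitting identity noted above can also be wrapped in a small user-defined function that calls the built-in SUMPRODUCT once per contiguous pair and adds up the results. This is a sketch under assumptions: the name SumProductPairs is made up, and the arguments are expected to alternate as a1, b1, a2, b2, ... with each pair having matching dimensions.

```vba
Function SumProductPairs(ParamArray areas() As Variant) As Variant
    Dim i As Long
    Dim total As Double

    ' The arguments must come in pairs: a1, b1, a2, b2, ...
    If (UBound(areas) - LBound(areas) + 1) Mod 2 <> 0 Then
        SumProductPairs = CVErr(xlErrValue)
        Exit Function
    End If

    For i = LBound(areas) To UBound(areas) Step 2
        ' Each pair is contiguous, so the built-in function can handle it.
        total = total + Application.WorksheetFunction.SumProduct(areas(i), areas(i + 1))
    Next i

    SumProductPairs = total
End Function
```

For the data in the question this would be used as something like =SumProductPairs(D2,D32,D64,D94,D126,D156,D188,D218)/SUM(D32,D94,D156,D218), which is arguably no shorter than the original SUM formula but does keep each average next to its count.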

Resources