Way to overcome Excel Vlookup function limit of 256 characters - excel

I have an Excel array with multiple values. Some are shorter than 256 characters and some are longer.
When I do a VLOOKUP using a sample string, I get results when it matches rows with fewer than 256 characters. For rows longer than 256 characters, it returns '#N/A'.
Is there a way, using VLOOKUP or some other built-in Excel function, to overcome this limit?

If you are using VLOOKUP like this
=VLOOKUP(A2,D2:Z10,3,FALSE)
i.e. looking up A2 in D2:D10 and returning a result from F2:F10 then try this formula instead
=INDEX(F2:F10,MATCH(TRUE,INDEX(D2:D10=A2,0),0))
change ranges as required
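For readers who want to see the logic outside Excel, here is a minimal Python sketch of what the INDEX/MATCH formula does: compare every key with the lookup value to build a TRUE/FALSE list, take the position of the first TRUE, and return the result at that position. The function name and the sample data are hypothetical, and Python only stands in for the worksheet here.

```python
def index_match_true(lookup_value, keys, results):
    """Mimic =INDEX(results, MATCH(TRUE, INDEX(keys=lookup_value, 0), 0))."""
    # Excel's "=" comparison ignores case, so compare case-insensitively here.
    flags = [str(k).lower() == str(lookup_value).lower() for k in keys]
    if True not in flags:
        return "#N/A"  # MATCH reports #N/A when nothing matches
    return results[flags.index(True)]  # the first TRUE wins, like MATCH(..., 0)


long_key = "x" * 300  # hypothetical key longer than the limit discussed above
keys = ["short", long_key, "other"]
results = ["r1", "r2", "r3"]
print(index_match_true(long_key, keys, results))  # r2
```

The comparison is done on whole values, so the length of a key plays no part in whether it matches.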
Edit:
I mocked up a sample here - values in A2:A10 are the same as G2:G10 but in a different order. The length of each of those values is shown in column B, the VLOOKUP in column C fails on col A values > 255 chars but the INDEX/MATCH formula in col D works in all cases
https://www.dropbox.com/s/fe0sb6bkl3phqdr/vlookup.xls

I had the same problem, so I wrote this primitive custom lookup. It doesn't care about the length of your cells' values.
Function betterSearch(searchCell As Range, A As Range, B As Range) As Variant
    Dim i As Long
    betterSearch = "Not found"
    ' Walk A by position so the match maps onto B even when the ranges do not start in row 1
    For i = 1 To A.Cells.Count
        If A.Cells(i).Value = searchCell.Value Then
            betterSearch = B.Cells(i).Value
            Exit Function
        End If
    Next i
End Function
PS Can't help but wonder why the original VLOOKUP written by professionals is implemented in this particular case more poorly than this 10-lined func?

This is a drop-in replacement for MATCH() and, unlike betterSearch above, is optimised VBA code.
Public Function Match2(search As String, lookupArray As Range, Optional match_type As Integer = 0) As Long
    Application.Volatile
    Dim vArray As Variant
    Dim i As Long
    ' Read the range into an array once; this assumes lookupArray has more than one cell
    vArray = lookupArray.Value
    For i = 1 To UBound(vArray, 1)
        If match_type = 0 Then
            If search = vArray(i, 1) Then
                Match2 = i
                Exit Function
            End If
        ElseIf match_type = -1 Then
            If search <= vArray(i, 1) Then
                Match2 = i
                Exit Function
            End If
        Else
            If search >= vArray(i, 1) Then
                Match2 = i
                Exit Function
            End If
        End If
    Next i
    ' Falls through with Match2 = 0 when nothing matches
End Function
Usage:
=INDEX(rangeA, Match2(LookupValue, LookupRange, 0))
The answer above asked:
Can't help but wonder why the original VLOOKUP written by
professionals is implemented in this particular case more poorly
than this 10-lined func?
Optimisation and performance. If you limit the number of characters to 255, the comparison requires only 2 operations on the CPU, whereas comparing variable-length strings takes many more steps, because you have to repeatedly compare across 255-char widths. Programming languages like VBA obscure this a lot because all of the sub-operations are taken care of for you.
For example, to compare the 2 strings "Hello" and "abc" at a fixed length of 5, we simply do the following operation on the CPU:
0100100001100101011011000110110001101111 //Hello
- 0110000101100010011000110000000000000000 //abc
= -0001100011111100111101101001001110010001 //-107323233169
Now you can simply ask whether the result is < 0, > 0, = 0 or even approximately 0. This can be done in 2 CPU operations. If cells are variable length (and formulae also), then you would first have to use the CPU to pad out the end of the value with 0s to get the strings to the same length before you can do the operations.

XLOOKUP does not have this limitation. I was able to look up values longer than 500 characters with it.

Related

Summing the digits in Excel cells (long and short strings)

I'm working on research related to frequencies.
I want to sum all the digits in each cell and reduce them to a single digit.
Some cells have 2 digits, others have 13 or more, like these:
24.0542653897891
25.4846064424057
27
28.6055035477009
I tried several formulas to do that. The best ones gave me a two-digit number, which I couldn't sum again to get a single-digit result.
For example, these formulas:
=SUMPRODUCT(MID(SUBSTITUTE(B5,".",""),ROW(INDIRECT("1:"&LEN(B5)-1)),1)+0)
=SUMPRODUCT(1*MID(C5,ROW(INDIRECT("1:"&LEN(C5))),1))
Any suggestions?
Thank you in advance.
EDIT
Based on your explanation and your comments, it seems that what you want is called the digital root of all the digits (excluding the decimal point). In other words, repeatedly summing the digits until you get to a single digit.
That can be calculated by a simpler formula than adding up the digits.
=1+(SUBSTITUTE(B5,".","")-1)-(INT((SUBSTITUTE(B5,".","")-1)/9)*9)
For long numbers, we can split the number in half and process each half. eg:
=1+MOD(1+MOD(LEFT(SUBSTITUTE(B5,".",""),INT(LEN(SUBSTITUTE(B5,".",""))/2))-1,9)+1+MOD(RIGHT(SUBSTITUTE(B5,".",""),LEN(SUBSTITUTE(B5,".",""))-INT(LEN(SUBSTITUTE(B5,".",""))/2))-1,9)-1,9)
However, the numbers should be stored as TEXT. When numbers are stored as numbers, what we see may not necessarily be what is stored there, and what the formula (as well as the UDF) will process.
The long formula version will correct all the errors on your worksheet EXCEPT for B104. B104 appears to have the value 5226.9332653096000 but Excel is really storing the value 5226.9333265309688. Because of Excel's precision limitations, this will get processed as 5226.93332653097. Hence there will be a disagreement.
Another method that should work would be to round all of the results in your column B to 15 digits (eg: . Combining that with using the long formula version should result in agreement for all the values you show.
Explanation
if a number is divisible by 9, its digital root will be 9, otherwise, the digital root will be n MOD 9
The general formula would be: =1+Mod(n-1,9)
In your case, since we are dealing with numbers larger than can be calculated using the MOD function, we need to both remove the dot, and also use the equivalent of mod which is n-(int(n/9)*9)
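To see that the shortcut agrees with repeated summing, here is a minimal Python sketch (Python rather than VBA, purely so it can be run outside Excel). Both function names are hypothetical. The inputs are handled as text, as recommended here, and the sample values are the ones from the question.

```python
DIGITS = "0123456789"


def digital_root_by_summing(text):
    """Sum the digits repeatedly until a single digit remains."""
    total = sum(int(ch) for ch in text if ch in DIGITS)
    while total > 9:
        total = sum(int(ch) for ch in str(total))
    return total


def digital_root_by_formula(text):
    """Apply 1 + MOD(n - 1, 9) to the digits read as one integer."""
    n = int("".join(ch for ch in text if ch in DIGITS))
    return 0 if n == 0 else 1 + (n - 1) % 9


for sample in ["24.0542653897891", "25.4846064424057", "27", "28.6055035477009"]:
    # Both routes should land on the same single digit.
    assert digital_root_by_summing(sample) == digital_root_by_formula(sample)
    print(sample, digital_root_by_formula(sample))
```

Python integers have arbitrary precision, so this sketch does not hit the 15-digit limit that applies to the worksheet formula.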
Notes:
this will work best with numbers stored as text. Since Excel may display and/or convert large numbers, or numbers with many decimal places, differently than expected, working with text strings of digits is the most stable method.
this method will not work reliably with numbers > 15 digits.
If you have numbers > 15 digits, then I suggest a VBA User Defined Function:
Option Explicit
Function digitalRoot(num As String) As Long
Dim S As String, Sum As Long, I As Long
S = num
Do While Len(S) > 1
Sum = 0
For I = 1 To Len(S)
Sum = Sum + Val(Mid(S, I, 1))
Next I
S = Trim(Str(Sum))
Loop
digitalRoot = CLng(S)
End Function
You could use a formula like:
=SUMPRODUCT(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s"))
You might need an extra SUBSTITUTE for changing . to , if that's your decimal delimiter:
=SUMPRODUCT(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(A1,".",",")," ","</s><s>")&"</s></t>","//s"))
However, maybe a UDF as others proposed is also a possibility for you. Though, something tells me I might have misinterpreted your question...
I hope you are looking for something like the following UDF.
Function SumToOneDigit(myNumber)
Dim temp: temp = 0
CalcLoop:
For i = 1 To Len(myNumber)
If IsNumeric(Mid(myNumber, i, 1)) Then temp = temp + Mid(myNumber, i, 1)
Next
If Len(temp) > 1 Then
myNumber = temp
temp = 0
GoTo CalcLoop
End If
SumToOneDigit = temp
End Function
UDFs (user-defined functions) are functions written in VBA (Visual Basic for Applications).
When you cannot do a calculation with the built-in Excel functions, like the ones in your question, you can write a UDF in a VBA module in Excel. Add a module in the VBA editor by right-clicking the workbook, and paste the code above into that module. Remember that this code lives in this workbook only, so if you want to use the UDF in another file you will have to add a module to that file and paste the code there as well. If you use such a UDF frequently, it is better to make an add-in out of it.
In addition to using "Text to Columns" as a one-off conversion, this is relatively easy to do in VBA, by creating a user function that accepts the data as a string, splits it into an array separated by spaces, and then loops the elements to add them up.
Add the following VBA code to a new module:
Function fSumData(strData As String) As Double
On Error GoTo E_Handle
Dim aData() As String
Dim lngLoop1 As Long
aData = Split(strData, " ")
For lngLoop1 = LBound(aData) To UBound(aData)
fSumData = fSumData + CDbl(aData(lngLoop1))
Next lngLoop1
fExit:
On Error Resume Next
Exit Function
E_Handle:
MsgBox Err.Description & vbCrLf & vbCrLf & "fSumData", vbOKOnly + vbCritical, "Error: " & Err.Number
Resume fExit
End Function
Then enter this into a cell in the Excel worksheet:
=fSumData(A1)
The UDF below will return the sum of all numbers in a cell passed to it as an argument.
Function SumCell(Cell As Range) As Double
Dim Fun As Double ' function return value
Dim Sp() As String ' helper array
Dim i As Integer ' index to helper array
Sp = Split(Cell.Cells(1).Value)
For i = 0 To UBound(Sp)
Fun = Fun + Val(Sp(i))
Next i
SumCell = Fun
End Function
Install the function in a standard code module, created with a name like Module1. Call it from the worksheet with syntax like =SumCell(A2) where A2 is the cell that contains the numbers to be summed up. Copy down as you would a built-in function.

Summing cells that have formula and string concatenated together

I have a column with formula as follows:
=(2+3*6+8) & "KB"
Basically, each cell is a formula and string concatenated (using &). I want to add all these cells up. I tried the following things:
a) =SUM(B2:B21) gives me a sum of 0.
b) Using =B2+B3... gives me a #VALUE error.
c) I tried something like this also - didn't work, gives a sum of 0: =SUM(IF(ISNUMBER(FIND("KB",$C$2:$C$14)),VALUE(LEFT($C$2:$C$14,FIND("KB",$C$2:$C$14)-1)),0))
Make your own SUM function in VBA. Try this:
=StripTextAndSum(A2:A4) - returns 60
=StripTextAndAverage(A2:A4) - returns 20
This method keeps the left most decimal number and tosses away the rest.
NOTE: You can tweak this to fit your needs. One way would be to retain the text so you can return it in the sum....like 150MB (i am assuming KB means kilobyte). Let me know if you like that idea and I'll make it.
EDIT: As pointed out by @DirkReichel, this has been made a little more efficient by using IsNumeric instead, but I have retained all the other functions too. IsLetter is a useful function too.
Public Function StripTextAndSum(myRange As Range)
Dim r As Range
Dim n As Double
n = 0
For Each r In myRange.Cells
n = n + ParseNumberFromString(r.Text)
Next r
StripTextAndSum = n
End Function
Public Function StripTextAndAverage(myRange As Range)
Dim n As Double
n = StripTextAndSum(myRange)
StripTextAndAverage = n / (myRange.Cells.Count * 1#)
End Function
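Public Function ParseNumberFromString(s As String) As Double
    Dim n As Integer
    n = GetLastNumberIndex(s)
    ' Returns 0 when the text has no leading number, instead of raising a type mismatch
    If n > 0 Then ParseNumberFromString = CDbl(Left(s, n))
End Function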
Public Function GetFirstLetterIndex(s As String) As Integer
Dim i As Integer
For i = 1 To Len(s) Step 1
If IsLetter(Mid(s, i, 1)) = True Then
GetFirstLetterIndex = i
Exit Function
End If
Next i
End Function
Public Function GetLastNumberIndex(s As String) As Integer
Dim i As Integer
For i = Len(s) To 1 Step -1
If IsNumeric(Left(s, i)) = True Then
GetLastNumberIndex = i
Exit Function
End If
Next i
End Function
Function IsLetter(s As String) As Boolean
Dim i As Integer
For i = 1 To Len(s)
If LCase(Mid(s, i, 1)) <> UCase(Mid(s, i, 1)) = True Then
IsLetter = True
Else
IsLetter = False
Exit For
End If
Next
End Function
I'd normally just move the KB to the following column and left-justify it.
That way, it still looks identical but the first column only has real numbers that you can manipulate mathematically to your heart's content.
Or, assuming they're all in kilobytes (which seems to be a requirement if you just want to add the numeric bits), don't put KB in the data area at all.
Instead change the heading from, for example, Used memory to Used memory (KB).
Do you really want to populate your beautiful spreadsheets with butt-ugly monstrosities like the following? :-)
=SUM(IF(ISNUMBER(FIND("KB",$C$2:$C$14)),VALUE(LEFT($C$2:$C$14,FIND("KB",$C$2:$C$14)-1)),0))
If you need to keep your column as-is, you could always use an array formula to get the sum:
=sum(value(left(b2:b21,len(b2:b21)-2)))
You will need to enter this as an array formula (press Ctrl+Shift+Enter to submit it)
Basically this is taking the leftmost chunk of a cell (all but the last two characters, which we know are 'KB'), using value() to convert it into a numeric, and sum() to add it up. Entering it as an array formula just lets us do this to each cell in the list b2:b21 in one swoop.
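The same idea (drop the last two characters, convert what is left to a number, add everything up) can be sketched in Python. This is only an illustration of the array formula's logic, not something that runs in Excel. The function name and the sample cells are hypothetical.

```python
def leading_number(text, suffix="KB"):
    """Drop a trailing unit suffix and convert what is left to a number."""
    if not text.endswith(suffix):
        raise ValueError(f"expected a value ending in {suffix!r}: {text!r}")
    # Same role as VALUE(LEFT(cell, LEN(cell) - 2)) in the formula.
    return float(text[: -len(suffix)])


cells = ["28KB", "10KB", "22KB"]  # hypothetical cell contents
print(sum(leading_number(c) for c in cells))  # 60.0
```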
As @paxdiablo mentioned, though, it might be best to restructure so that you don't have to deal with your values as text in the first place. My approach would be to enter the values and add the "KB" via formatting. You can use a custom format such as 0.00 "KB" so the cell only holds, say, the value 17, but it displays as "17.00 KB".

How can I lookup data from one column, when the value I'm referencing changes columns?

I want to do an INDEX-MATCH-like lookup between two documents, except my MATCH's index array doesn't stay in one column.
In Vague-English: I want a value from a known column that matches another value that may be found in any column.
Refer to the image below. Let's call everything to the left of the bold vertical line on column H doc1, and the right side will be doc2.
Doc2 has a column "Find This", which will be the INDEX's array. It is compared with "ID1" from doc1 (note that the values in "Find This" will not be in the same order as column ID1, but it's easier to understand this way).
The "[Result]" column in doc2 will be the value from doc1's "Want This" column from the row that matches "FIND THIS" ...However, sometimes the value from "FIND THIS" is not in the "ID1" column, and is instead in "ID2","ID3", etc.
So, I'm trying to generate Col K from Col J. This would be like pressing Ctrl+F and searching for a value in Col J, then taking the value from Col D in that row and copying it to Col K.
I made identical values from a column the same color in the other doc to make it easier to visualize where they are coming from.
Note also that in column F of doc1, the same value from doc2's "Find This" can be found after some other text.
Also note that the column headers are only there as examples, the ID columns are not actually numbered.
I would simply hard-code the correct column to search, but I'm not in control of doc1, and I'm worried that future versions may have new "ID" columns, with others being removed.
I'd prefer this to be a solution in the form of a formula, but VB will do.
To generate column K based on given values of column J then you could use the following:
=INDEX(doc1!$D$2:$D$14,SUMPRODUCT((doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14))-1)
Copy that formula down as far as you need to go.
It basically returns the row where a match for the column J value is found. We then look up that row in the index of your D range to get your value for K.
UPDATE:
If you are working with non-unique entries in column J, that is, the value on its own can be found in multiple rows and columns, consider using the following to return the last row where the J value is found:
=INDEX(doc1!$D$2:$D$14,AGGREGATE(14,6,(doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14),1)-1)
UPDATE 2:
And to return the first row where what you are looking in column J is found use:
=INDEX($D$2:$D$14,AGGREGATE(15,6,1/($B$2:$H$14=J2)*ROW($B$2:$H$14)-1,1))
Thanks to Scott Craner for the hint on the minimum formula.
To determine whether the data from column J is UNIQUE within your range B2:H14, you can enter this array formula. To enter an array formula you need to press CTRL+SHIFT+ENTER at the same time, not just ENTER. You will know you have done it right when you see {} around your formula in the formula bar. You cannot add the {} manually.
=IF(MAX(COUNTIF($B$2:$H$14,J2:J14))>1,"DUPLICATES","UNIQUE")
UPDATE 3
AGGREGATE is a relatively new function to me, but it goes back to Excel 2010. AGGREGATE is 19 functions rolled into one. It would be nice if they all worked the same way, but they do not. I think it is the functions numbered 14 and up that behave like an array formula (or a CSE formula, if you prefer). The nice thing is that you do not need to use CSE when entering or editing them. SUMPRODUCT is another example of a regular formula that performs array formula calculations.
The meat of this explanation, I believe, is what is happening inside the AGGREGATE brackets. If you click on the link you will get an idea of what the first two arguments are. The first defines which function you are using, and the second tells AGGREGATE how to deal with errors, hidden rows, and some other nested functions. That is the relatively easy part. What I believe you want to know is what is happening with this:
(doc1!$B$2:$H$14=J2)*ROW(doc1!$B$2:$H$14)
For illustrative purposes, let's reduce this formula to something smaller in scale that does the same thing. I'll avoid starting in A1, since that can make counting artificially easy (it is the first row and the first column). Placing the example range elsewhere shows some of the special considerations you may run into.
What I want to know is which row of column B each of the items listed in column C occurs in.
  | B        | C
3 | DOG      | PLATYPUS
4 | CAT      | DOG
5 | PLATYPUS |
The full formula for our mini example would be:
=INDEX($B$3:$B$5,AGGREGATE(14,6,($B$3:$B$5=C2)*ROW($B$3:$B$5),1)-2)
And we are going to look at the following part as an array:
{=($B$3:$B$5=C2)*ROW($B$3:$B$5)}
So the first set of brackets is going to be a Boolean array, as you noted. Every cell that is TRUE will stay TRUE until it is forced into a math calculation. When that happens, TRUE becomes 1 and FALSE becomes 0. If that formula were entered as a CSE formula and placed in D2, it would break down as follows:
FALSE X 3
FALSE X 4
TRUE X 5
The 3, 4 and 5 come from ROW() returning the row number it is dealing with at that point in the array math operation. A little trick: we could have used ROW(1:3) instead; you just need to make sure the size of the array matches. This is not matrix math; it is just straight-across multiplication. And since the Boolean is now experiencing a math operation, we are now looking at:
0 X 3 = 0
0 X 4 = 0
1 X 5 = 5
So the array {0,0,5} gets handed back to AGGREGATE for more processing. The important thing to note here is that it contains ONLY 0 and the individual row numbers where we had a match. In the first AGGREGATE formula, function 14 was chosen, which is the LARGE function. We also told it to ignore errors, which in this particular case does not matter. After the array, there is a ,1) to finish off the AGGREGATE function. The 1 tells AGGREGATE that we want the 1st largest number, i.e. the first value when the array is sorted from largest to smallest. If that number were 2, it would be the 2nd largest number, and so on. So the last row, or the only row, that something is found on is returned. In our small example that is 5.
But wait that 5 was buried inside another function called Index. and in our small example that INDEX formula would be:
=INDEX($B$3:$B$5,AGGREGATE(...)-2)
Well we know that the range is only 3 rows long, so asking for the 5th row, would have excel smacking you up side the head with an error because your index number is out of range. So in comes the header row correction of -1 in the original formula or -2 for the small example and what we really see for the small example is:
=INDEX($B$3:$B$5,5-2)
=INDEX($B$3:$B$5,3)
and here is a weird bit of info, That last one does not become PLATYPUS...it becomes the cell reference to =B5 which pulls PLATYPUS. But that little nuance is a story for another time.
Now in the comments Scott essentially told me to invert for the error to get the first row. And this is important step for the aggregate and it had me running in circles for awhile. So the full equation for the first row option in our mini example is
=INDEX($B$3:$B$5,AGGREGATE(15,6,1/($B$3:$B$5=C2)*ROW($B$3:$B$5),1)-2)
And what Scott Craner was actually suggesting which Skips one math step is:
=INDEX($B$3:$B$5,AGGREGATE(15,6,ROW($B$3:$B$5)/($B$3:$B$5=C2),1)-2)
However since I only realized this after writing this all up the explanation will continue with the first of these two equations
So the important thing to note here is the change from function 14 to function 15, which is SMALL. Think of it as finding the minimum. This time that 6 plays a huge role, along with the 1/. So our array in the middle this time equates to:
1/FALSE X 3
1/FALSE X 4
1/TRUE X 5
Which then becomes:
1/0 X 3
1/0 X 4
1/1 X 5
Which then has excel slapping you up side the head again because you are trying to divide by 0:
#div/0 X 3
#div/0 X 4
1/1 X 5
But you were smart and you protected yourself from that slap upside the head when you told AGGREGATE to ignore error when you used 6 as the second argument/reference! Therefore what is above becomes:
{5}
Since we are performing a SMALL, and we passed ,1) as the closing part of the AGGREGATE, we have essentially said give me the minimum row number or the 1st smallest number of the resulting array when sorted in ascending order.
The rest plays out the same as it did for the LARGE AGGREGATE method. The pitfall I fell into originally is I did not use the 1/ to force an error. As a result, every time I tried getting the SMALL of the array I was getting 0 from all the false results.
SUMPRODUCT works in a very similar fashion, but it only works when the result array in the middle contains a single non-zero answer. The reason is that the last step of the SUMPRODUCT function is to add all the individual elements of the resulting array. So if you only have one non-zero element, you get that non-zero number. If you had two rows that matched, for instance 12 and 31, then the SUMPRODUCT method would return 43, which is not any of the row numbers you wanted, whereas AGGREGATE with LARGE would have told you 31 and AGGREGATE with SMALL would have told you 12.
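The three behaviours described above can be modelled in a few lines of Python. This is only a sketch of the array logic, not of how Excel evaluates it. All function names are hypothetical, and the grid is made-up data with matches in worksheet rows 12 and 31, as in the example just given.

```python
def masked_rows(grid, target, first_row):
    """Model (range=target)*ROW(range): the row number where a cell matches, else 0."""
    return [
        (first_row + r) * (cell == target)
        for r, row in enumerate(grid)
        for cell in row
    ]


def last_match_row(masked):
    """Model AGGREGATE(14, 6, ..., 1): the largest entry (0 when nothing matched)."""
    return max(masked)


def first_match_row(masked):
    """Model AGGREGATE(15, 6, 1/(...)*ROW(...), 1): zeros turn into errors and are skipped."""
    hits = [row for row in masked if row != 0]
    return min(hits) if hits else None


def sumproduct_row(masked):
    """Model SUMPRODUCT: every entry is added, so two matches give a meaningless sum."""
    return sum(masked)


# Hypothetical 30-row, 2-column block that starts at worksheet row 2.
grid = [["", ""] for _ in range(30)]
grid[10][0] = "PLATYPUS"  # worksheet row 12
grid[29][1] = "PLATYPUS"  # worksheet row 31
masked = masked_rows(grid, "PLATYPUS", first_row=2)
print(last_match_row(masked), first_match_row(masked), sumproduct_row(masked))  # 31 12 43
```

Filtering out the zeros before taking the minimum plays the role of the 1/ trick: without it, the smallest entry would always be 0.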
Something like this maybe, starting in K2 and copied down:
=IFERROR(INDEX(D:D,MAX(IFERROR(MATCH(J2,B:B,0),-1),IFERROR(MATCH(J2,E:E,0),-1),IFERROR(MATCH(J2,G:G,0),-1),IFERROR(MATCH(J2,H:H,0),-1))),"")
If you want to keep the positions of the columns for the Match variable, consider creating generic range names for each column you want to check, like "Col1", "Col2", "Col3". Create a few more range names than you think you will need and reference them to =$B:$B, =$E:$E etc. Plug all range names into Match functions inside the Max() statement as above.
When columns are added or removed from the table, adjust the range name definitions to the columns you want to check.
For example, if you set up the formula with five Matches inside the Max(), and the table changes so you only want to check three columns, point three of the range names to the same column. The Max() will only return one result and one lookup, even if the same column is matched several times.
I came up with a vba solution if I understood correctly:
Sub DisplayActiveRange()
    Dim sheetToSearch As Worksheet
    Set sheetToSearch = Sheet2
    Dim sheetToOutput As Worksheet
    Set sheetToOutput = Sheet1
    Dim search As Range
    Dim searchCol As String
    searchCol = "J"
    Dim outputCol As String
    outputCol = "K"
    Dim valueCol As String
    valueCol = "D"
    Dim r As Range
    Dim currentRow As Long
    Dim maxRow As Long
    maxRow = sheetToOutput.UsedRange.Rows.Count
    For currentRow = 1 To maxRow
        Set search = sheetToOutput.Range(searchCol & currentRow)
        If search.Value <> "" Then
            ' Scan every used cell of the source sheet for the lookup value
            For Each r In sheetToSearch.UsedRange
                If r.Value = search.Value Then
                    ' Copy the value column from the row where the match was found
                    sheetToOutput.Range(outputCol & currentRow).Value = sheetToSearch.Range(valueCol & r.Row).Value
                    Exit For
                End If
            Next r
        End If
    Next currentRow
End Sub
There might be better ways of doing it, but this will give you what you want. We assume headers in both the "source" and "destination" sheets. You will need to adapt the Const declarations to match how your sheets are named. Press Alt+F11 in Excel to bring up the VBA window, copy and paste this code into "ThisWorkbook" under the "VBAProject" group, then select "Run" from the menu:
Option Explicit
Private Const sourceSheet = "Source"
Private Const destSheet = "Destination"
Public Sub FindColumns()
Dim rowCount As Long
Dim foundValue As String
Sheets(destSheet).Select
rowCount = 1 'Assume a header row
Do While Range("J" & rowCount + 1).value <> ""
rowCount = rowCount + 1
foundValue = FncFindText(Range("J" & rowCount).value)
Sheets(destSheet).Select
Range("K" & rowCount).value = foundValue
Loop
End Sub
Private Function FncFindText(value As String) As String
Dim rowLoop As Long
Dim colLoop As Integer
Dim found As Boolean
Dim pos As Long
Sheets(sourceSheet).Select
rowLoop = 1
colLoop = 0
Do While Range(alphaCon(colLoop + 1) & rowLoop + 1).value <> "" And found = False
rowLoop = rowLoop + 1
Do While Range(alphaCon(colLoop + 1) & rowLoop).value <> "" And found = False
colLoop = colLoop + 1
pos = InStr(Range(alphaCon(colLoop) & rowLoop).value, value)
If pos > 0 Then
FncFindText = Mid(Range(alphaCon(colLoop) & rowLoop).value, pos, Len(value))
found = True
End If
Loop
colLoop = 0
Loop
End Function
Private Function alphaCon(aNumber As Integer) As String
Dim letterArray As String
Dim iterations As Integer
letterArray = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
If aNumber <= 26 Then
alphaCon = (Mid$(letterArray, aNumber, 1))
Else
If aNumber Mod 26 = 0 Then
iterations = Int(aNumber / 26)
alphaCon = (Mid$(letterArray, iterations - 1, 1)) & (Mid$(letterArray, 26, 1))
Else
'we deliberately round down using 'Int' as anything with decimal places is not a full iteration.
iterations = Int(aNumber / 26)
alphaCon = (Mid$(letterArray, iterations, 1)) & (Mid$(letterArray, (aNumber - (26 * iterations)), 1))
End If
End If
End Function

Deleting variable number of leading characters from a variable-length string

If I have G4ED7883666 and I want the output to be 7883666, and I have to apply this to a range of cells that are not all the same length, how do I do it?
The only thing the cells have in common is that I have to delete everything before the number that follows the last letter.
This formula finds the last number in a string, that is, all digits to the right of the last alpha character in the string.
=RIGHT(A1,MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0)-1)
Note that this is an array formula and must be entered with the Control-Shift-Enter keyboard combination.
How the formula works
Let's assume that the target string is fairly simple: "G4E78"
Working outward from the middle of the formula, the first thing to do is create an array with the elements 1 through 25. (Although this might seem to limit the formula to strings with no more than 25 characters, it actually limits the size of the number that may be extracted: at most 24 digits, because one of the 25 elements has to hold the non-digit that MATCH looks for.)
ROW($1:$25) = {1;2;3;4;5;6;7; etc.}
Subtracting this array from (1 + the length of the target string) produces a new array whose elements count down from the length of the string. The first five elements correspond to the positions of the characters of the string, in reverse order.
LEN(A1)+1-ROW($1:$25) = {5;4;3;2;1;0;-1;-2;-3;-4; etc.}
The MID function then creates a new array that reverses the order of the characters of the string.
For example, the first element of the new array is the result of MID(A1, 5, 1), the second of MID(A1, 4, 1) and so on. The #VALUE! errors reflect the fact that MID cannot evaluate 0 or negative values as the position of a string, e.g., MID(A1,0,1) = #VALUE!.
MID(A1,LEN(A1)+1-ROW($1:$25),1) = {"8";"7";"E";"4";"G";#VALUE!;#VALUE!; etc.}
Multiplying the elements of the array by 1 converts the digit characters to numbers and turns the letter elements of that array into #VALUE! errors as well.
1*MID(A1,LEN(A1)+1-ROW($1:$25),1) = {8;7;#VALUE!;4;#VALUE!;#VALUE!;#VALUE!; etc.}
And the IFERROR function turns the #VALUES into 99, which is just an arbitrary number greater than the value of a single digit.
IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99) = {8;7;99;4;99;99;99; etc.}
Matching on the 99 gives the position of the first non-digit character counting from the right end of the string. In this case, "E" is the first non-digit in the reversed string "87E4G", at position 3. This is equivalent to saying that the number we are looking for at the end of the string, plus the "E", is 3 characters long.
MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0) = 3
So, for the final step, we take 3 - 1 (for the "E") characters from the right of the string.
RIGHT(A1,MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0)-1) = "78"
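The same right-to-left scan can be written out step by step. The following is a minimal Python sketch of the idea behind the array formula, not a translation of it. The function name is hypothetical.

```python
def trailing_digits(text):
    """Return the run of digits at the end of text (empty if it ends in a non-digit)."""
    count = 0
    for ch in reversed(text):  # walk the string from the right, like the reversed MID array
        if ch not in "0123456789":
            break  # first non-digit from the right: the role played by the 99 marker
        count += 1
    return text[len(text) - count:]  # the role played by RIGHT(text, count)


print(trailing_digits("G4E78"))        # 78
print(trailing_digits("G4ED7883666"))  # 7883666
```

Unlike the formula, this sketch places no limit on the number of digits, because it does not rely on a fixed-size array.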
One more submission for you to consider. This VBA function returns the rightmost digits, i.e. everything after the last non-numeric character:
Public Function GetRightNumbers(str As String) As String
    Dim i As Integer
    ' Stop at 1: Mid(str, 0, 1) would raise a run-time error
    For i = Len(str) To 1 Step -1
        If Not IsNumeric(Mid(str, i, 1)) Then
            Exit For
        End If
    Next i
    GetRightNumbers = Mid(str, i + 1)
End Function
You can write some VBA to format the data (just starting at the end and working back until you hit a non-number.)
Or, if you're happy to install an add-in like Excelicious, you can use regular expressions to format the text via a formula. An expression like [0-9]+$ would return all the numbers at the end of a string, IIRC.
NOTE: This uses the regex pattern in James Snell's answer, so please upvote his answer if you find this useful.
Your best bet is to use a regular expression. You need to set a reference to VBScript Regular Expressions for this to work. Tools --> References...
Now you can use regex in your VBA.
This will find the numbers at the end of each cell. I am placing the result next to the original so that you can verify it is working the way you want. You can modify it to replace the cell as soon as you feel comfortable with it. The code works regardless of the length of the string you are evaluating, and will skip the cell if it doesn't find a match.
Sub GetTrailingNumbers()
Dim ws As Worksheet
Dim rng As Range
Dim cell As Range
Dim result As Object, results As Object
Dim regEx As New VBScript_RegExp_55.RegExp
Set ws = ThisWorkbook.Sheets("Sheet1")
' range is hard-coded here, but you can define
' it programatically based on the shape of your data
Set rng = ws.Range("A1:A3")
' pattern from James Snell's answer
regEx.Pattern = "[0-9]+$"
For Each cell In rng
If regEx.Test(cell.Value) Then
Set results = regEx.Execute(cell.Value)
For Each result In results
cell.Offset(, 1).Value = result.Value
Next result
End If
Next cell
End Sub
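To try the pattern without setting up the VBA reference, here is a minimal Python sketch that uses the same regular expression with Python's standard re module. It only shows what the pattern matches; the function name and sample strings are hypothetical.

```python
import re

# Same pattern as in the VBA macro: one or more digits anchored at the end.
TRAILING_NUMBER = re.compile(r"[0-9]+$")


def trailing_number(text):
    """Return the digits at the end of text, or None when there are none."""
    match = TRAILING_NUMBER.search(text)
    return match.group(0) if match else None


for value in ["G4ED7883666", "AB12CD345", "NODIGITS"]:
    print(value, trailing_number(value))
```

Returning None mirrors the macro skipping a cell when the test finds no match.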
Takes the first 4 digits from the right of num:
num1=Right(num,4)
Takes the first 5 digits from the left of num:
num1=Left(num,5)
First takes the first ten digits from the left then takes the first four digits from the right:
num1=Right(Left(num, 10),4)
In your case:
num = "G4ED7883666"
num1=Right(num,7)

SumProduct over sets of cells (not contiguous)

I have a total data set that is for 4 different groupings. One of the values is the average time, the other is count. For the Total I have to multiply these and then divide by the total of the count. Currently I use:
=SUM(D32*D2,D94*D64,D156*D126,D218*D188)/SUM(D32,D94,D156,D218)
I would rather use a SumProduct if I can to make it more readable. I tried to do:
=SUMPRODUCT((D2,D64,D126,D188),(D32,D94,D156,D218))/SUM(D32,D94,D156,D218)
But as you can tell by my posting here, that did not work. Is there a way to do SumProduct like I want?
I agree with the comment "It might be possible with masterful excel-fu, but even if it can be done, it's not likely to be more readable than your original solution"
A possible solution is to embed the CHOOSE() function within your SUMPRODUCT (this trick actually is pretty handy for vlookups, finding conditional maximums, etc.).
Example:
Let's say your data has eight observations in two columns (columns A and B), but you don't want to include some of them (exclude the observations in rows 4 and 5). Then the SUMPRODUCT formula looks like this...
=SUMPRODUCT(CHOOSE({1,2},A1:A3,A6:A8),CHOOSE({1,2},B1:B3,B6:B8))
I actually thought of this on the fly, so I don't know the limitations and as you can see it is not that pretty.
Hope this helps! :)
It might be possible with masterful excel-fu, but even if it can be done, it's not likely to be more readable than your original solution. The problem is that even after 20+ years, Excel still borks discontinuous ranges. Naming them won't work, array formulas won't work and as you see with SUMPRODUCT, they don't generally work in tuple-wise array functions. Your best bet here is to come up with a custom function.
UPDATE
Your question got me thinking about how to handle discontinuous ranges. It's not something I've had to deal with much in the past. I didn't have the time to give a better answer when you asked the question, but now that I've got a few minutes, I've whipped up a custom function that will do what you want:
Function gvSUMPRODUCT(ParamArray rng() As Variant)
Dim sumProd As Double
Dim valuesIndex As Integer
Dim values() As Double
For Each r In rng()
For Each c In r.Cells
On Error GoTo VBAIsSuchAPainInTheAssSometimes
valuesIndex = UBound(values) + 1
On Error GoTo 0
ReDim Preserve values(valuesIndex)
values(valuesIndex) = c.Value
Next c
Next r
If valuesIndex Mod 2 = 1 Then
For i = 0 To (valuesIndex - 1) / 2
sumProd = sumProd + values(i) * values(i + (valuesIndex + 1) / 2)
Next i
gvSUMPRODUCT = sumProd
Exit Function
Else
gvSUMPRODUCT = CVErr(xlErrValue)
Exit Function
End If
VBAIsSuchAPainInTheAssSometimes:
valuesIndex = 0
Resume Next
End Function
Some notes:
Excel enumerates ranges by column then row so if you have a continuous range where the data is organized by column, you have to select separate ranges: gvSUMPRODUCT(A1:A10,B1:B10) and not gvSUMPRODUCT(A1:B10).
The function works by pairwise multiplying the first half of cells with the second and then summing those products: gvSUMPRODUCT(A1,C3,L2,B2,G5,F4) = A1*B2 + C3*G5 + L2*F4. I.e. order matters.
You could extend the function to include n-wise multiplication by doing something like gvNSUMPRODUCT(n,ranges).
If there are an odd number of cells (not ranges), it returns the #VALUE error.
Note that sumproduct(a, b) = sumproduct(a1, b1) + sumproduct(a2, b2) where range a is split into ranges a1 and a2 (and similar for b)
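That identity, and the weighted average the question is after, can be checked with a minimal Python sketch. The function names are hypothetical, and the numbers are made-up stand-ins for the average-time cells (D2, D64, D126, D188) and the count cells (D32, D94, D156, D218).

```python
def sumproduct(xs, ys):
    """Multiply the two sequences pairwise and add up the products."""
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    return sum(x * y for x, y in zip(xs, ys))


def weighted_average(averages, counts):
    """Average of the averages, weighted by their counts."""
    return sumproduct(averages, counts) / sum(counts)


averages = [4.0, 6.0, 5.0, 10.0]  # hypothetical average times
counts = [10, 20, 30, 40]         # hypothetical counts

whole = sumproduct(averages, counts)
split = sumproduct(averages[:2], counts[:2]) + sumproduct(averages[2:], counts[2:])
print(whole, split, weighted_average(averages, counts))  # 710.0 710.0 7.1
```

Because the sum of products splits cleanly, a discontinuous selection can be handled as several ordinary SUMPRODUCT calls added together.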
It might be helpful to create an intermediate table that summarizes the data that you are using to calculate the sum product. That would also make the calculation easier to follow.

Resources