VBA/Excel - Matching Column values with pre-defined sets

I have a pre-defined set of values/codes that, in combination, form a message. I also have an incoming file whose codes I have to analyze and compare against the predefined sets. The problem is that the incoming file doesn't exactly match my library:
In my library I have a set of 4 columns, and each row has its own unique meaning. I want to get that meaning as a string to proceed with further calculations. The problem is that the incoming file is not consistent: it doesn't follow the exact sequence used in the pre-defined file.
I need to compare them regardless of sequence. Once a row is matched, I want to grab the corresponding Meaning from the Library and continue working with the Case in my file.
Any ideas how to implement it in VBA?

Offhand, you could create a VBA function that would return a string representing the unique values.
If there is some character that will never be in the values, you could delimit the values with that character (such as _):
Function GetUniqueValuesString(rng As Range) As String
Dim rngValues() As Variant
rngValues = rng.Value
' The following line requires a reference to Microsoft Scripting Runtime
' (via Tools -> References...)
Dim dict As New Scripting.Dictionary
'The function only parses the first row of the range
Dim i As Integer
For i = 1 To UBound(rngValues, 2)
dict(rngValues(1, i)) = 1 ' 1 here is a dummy value
Next
GetUniqueValuesString = Join(dict.Keys, "_")
End Function
(If every value is always the same number of characters, you could simply join them, without any delimiter.)
Using this function against a horizontal cell range:
=GetUniqueValuesString(A4:D4)
should return a string with only the unique values:
BGAA_TGHJ_WETY
If you apply this function to both the library rows and the file rows, you should be able to match on the function's returned value.
Note that the function has a few limitations that might need to be resolved:
It assumes the order of the values will be the same. In other words, BGAA, TGHJ, WETY will resolve to a different string than BGAA, WETY, TGHJ.
The function only joins the first row; other rows are ignored.
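One way to lift the ordering limitation is to sort the unique values before joining them, so that the same set of codes always yields the same key. The following is only a sketch under some assumptions: the name GetSortedUniqueValuesString is hypothetical, blank cells are skipped (the function above does not do this), and the cells are assumed to hold no error values.

```vba
' Hypothetical variant (not from the answer above): builds a key that ignores
' the order of the values by sorting the unique values before joining them.
Function GetSortedUniqueValuesString(rng As Range) As String
    ' Late binding, so no reference to Microsoft Scripting Runtime is needed
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")

    Dim cell As Range
    For Each cell In rng.Rows(1).Cells      ' first row only, as in the function above
        ' Assumption: blank cells carry no meaning and are skipped
        If Len(cell.Value) > 0 Then dict(CStr(cell.Value)) = 1
    Next cell

    If dict.Count = 0 Then Exit Function    ' nothing to join; returns ""

    Dim keys As Variant
    keys = dict.Keys

    ' Simple in-place sort; adequate for a handful of values per row
    Dim i As Long, j As Long, tmp As String
    For i = LBound(keys) To UBound(keys) - 1
        For j = i + 1 To UBound(keys)
            If StrComp(keys(i), keys(j), vbBinaryCompare) > 0 Then
                tmp = keys(i): keys(i) = keys(j): keys(j) = tmp
            End If
        Next j
    Next i

    GetSortedUniqueValuesString = Join(keys, "_")
End Function
```

With this variant, BGAA, TGHJ, WETY and BGAA, WETY, TGHJ should both resolve to BGAA_TGHJ_WETY, so the key can be computed on both the library rows and the file rows and then matched.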

Related

Searching for Specific Column Headers in Excel File - Runtime Error 91

I am attempting to write some excel vba code that will process the content of certain columns of data. Given the worksheet has some level of dynamic change (columns added and removed from time to time), I want my code to "find" the specific columns by their header names, and ultimately return the column number. My File has roughly 50 columns.
The problem is this: My code works just fine to find many of the columns (headers) I am interested in returning the column index, but some of the columns "while clearly existing", will return Nothing and thus, throws the runtime 91 error.
I can say without a doubt that when I execute the .Find, the columns DO exist (like the Comments column). I can change the failing header search to a different header name, passing it to the function in the code, and some columns are found just fine while others cause the runtime error. I have checked the "failing" headers for special characters, blanks, LFs, etc. No luck. I even tried re-ordering the 4 lines that call the FindColHdrNum function. Again, no luck.
Was hoping fresh eyes may provide answer. Simplified code is below which is triggered by a button on main excel worksheet. I have not worked with functions much in VBA, and even where the function does not generate the Runtime Error, it is not returning the column value, but this is a secondary problem I can work on once I get the find code not blowing up (returning 0).
Sub Button119_Click()
Dim L4RankCol As Integer
Dim DecomDriverCol As Integer
Dim SupTermImpactYrCol As Integer
Dim Comments As Integer
Dim L3RankCol As Integer
L4RankCol = FindColHdrNum("L4 Rank") '<-- This works
DecomDriverCol = FindColHdrNum("Decom Driver") '<-- This works
SupTermImpactYrCol = FindColHdrNum("Support Termination Impact Yr") '<-- This works
Comments = FindColHdrNum("Comments") '<-- This does not work
End Sub
Function FindColHdrNum(strHdr As String) As Integer
Dim rngAddress As Range
Set rngAddress = Range("Headers").Find(strHdr)
FindColumnHdrNum = rngAddress.Column '<--runtime error is caused by Nothing returned
End Function
Issue turns out to be a spurious line feed that was embedded in the header. It was strange as I kept re-typing it, but of course, I would always start at the "first letter" of the "comment" header, when in fact, the character preceded that. Thanks to all, for the help!
The name of your function is FindColHdrNum but you wrote this into the function:
FindColumnHdrNum = rngAddress.Column
Instead of:
FindColHdrNum = rngAddress.Column
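Separately from the naming mistake, Range.Find returns Nothing when no cell matches, and reading .Column from Nothing is what raises runtime error 91. A minimal sketch of a guarded version follows, assuming that returning 0 for "not found" is acceptable to the caller. LookAt:=xlPart is chosen here so that a header with a stray leading character (such as an embedded line feed) is still found; it can also match longer headers that merely contain the search text.

```vba
Function FindColHdrNum(strHdr As String) As Long
    Dim rngFound As Range
    ' "Headers" is the named range used in the question.
    ' Find reuses the LookIn/LookAt settings from the previous search unless
    ' they are passed explicitly, so they are spelled out here.
    Set rngFound = Range("Headers").Find(What:=strHdr, LookIn:=xlValues, _
                                         LookAt:=xlPart, MatchCase:=False)
    If rngFound Is Nothing Then
        FindColHdrNum = 0                   ' assumption: 0 signals "header not found"
    Else
        FindColHdrNum = rngFound.Column
    End If
End Function
```

The caller can then test for 0 before using the result instead of letting the error surface.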

How to check if a substring is in any of the entries in rows of an excel sheet?

So I basically have a table with multiple columns.
So for each of the ids (1, 2, 3), I would like to check in which column there is sub-string of * (as you see sometimes it's in B and sometimes in C). Then I would like to extract the whole string that contains * and is associated with the given ID.
Suppose that my actual table contains over 10 columns - but the idea remains the same.
In other words the entries that I am looking for that contain a specific substring are scatted all throughout the 10 different columns.
Use HLOOKUP
=HLOOKUP("*~**",B1:C1,1,FALSE)
Since the asterisk (*) is a wildcard, we need to precede it with a tilde (~) to tell HLOOKUP to look for the literal character.
The outer asterisks are wildcards that let HLOOKUP match cells containing the asterisk anywhere in their text.
Try the following User Defined Function:
Public Function FindTheStar(rng As Range) As String
Dim r As Range, v As String
FindTheStar = ""
For Each r In rng
v = r.Text
If InStr(v, "*") > 0 Then
FindTheStar = v
Exit Function
End If
Next r
End Function
It returns the text of the first cell in the range that contains an asterisk, or an empty string if none does.

Excel find remaining columns efficiently

I have a script (thanks to SO for the help with that one!) to allow a user to select a number of discontinuous columns and insert their indexes into an array. What I need to be able to do now is efficiently select the remaining columns i.e. the ones that the user didn't select into another array to perform a separate action on these columns.
For example, the user selects columns A,C,F,G and these indexes are put into the array Usr_col(). The remaining columns (B,D,E) need to be stored in the array rem_col()
All I can think of right now is to test every used column's index against the array of user-selected columns and, if it is not contained in that array, insert it into a new array. Something like this:
' Walk from column 1 to the last used column in row 1
For i = 1 To ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column
    If IsInArray(i, Usr_col()) = False Then
        rem_col(n) = i
        n = n + 1
    End If
Next
I am just looking for a more efficient solution to this.
I agree with @ScottHoltzman that this site wouldn't normally be the arena to make working code more efficient. However, this question puts a different slant on your previous one, as the most obvious solution would be to assign column numbers to one or other of your arrays in one loop.
The code below gives you a skeleton example. You'd need to check the user's selection for proper columns. Also, it isn't great form to redimension an array within the loop, but if the user selects adjacent columns then you'd need to acquire area count and column count to get the array size. I'll leave that to you if rediming within the loop jars with you:
Dim targetCols As Range, allCols As Range
Dim selColNum() As Long, unselColNum() As Long
Dim selIndex As Long, unselIndex As Long
Set targetCols = Application.InputBox("Select your columns", Type:=8)
For Each allCols In Sheet1.UsedRange.Columns
If Intersect(allCols, targetCols) Is Nothing Then
ReDim Preserve unselColNum(unselIndex)
unselColNum(unselIndex) = allCols.Column
unselIndex = unselIndex + 1
Else
ReDim Preserve selColNum(selIndex)
selColNum(selIndex) = allCols.Column
selIndex = selIndex + 1
End If
Next

Look up and return values from 2 columns all matches

I have List A and B in excel and would like to compare ALL the items in List A with ALL the records in List B and if they match or partial match return the value of B in 3rd column. Hopefully demonstrated in the attached.
The easiest way to achieve it is to use VBA. Please find below example function which you can use in the same way as Excel functions:
Public Function findArea(item As String, areaRng As Range) As String
Dim i As Long
Dim ARR_area() As Variant
ARR_area = areaRng.Value2
For i = LBound(ARR_area) To UBound(ARR_area)
If (item Like "*" & ARR_area(i, 1) & "*") Then
findArea = ARR_area(i, 1)
GoTo endFunc
End If
Next i
endFunc:
End Function
Where:
item - the item you want to check against the areas
areaRng - the range of areas you want to check against.
See usage example:
To achieve this result without VBA, you would need to reshape the table into a pivot-style view, with items in rows and areas in columns, and check the match for each combination in the value cells. Nevertheless, in this particular example I would recommend using VBA.
Hope it helped.

Excel 2007 - Generate unique ID based on text?

I have a sheet with a list of names in Column B and an ID column in A. I was wondering if there is some kind of formula that can take the value in column B of that row and generate some kind of ID based on the text? Each name is also unique and is never repeated in any way.
It would be best if I didn't have to use VBA really. But if I have to, so be it.
Solution Without VBA.
The logic is based on the first 8 characters plus the number of characters in the cell.
= CODE(cell) returns the code number of the first letter
= CODE(MID(cell,2,1)) returns the code number of the second letter
= IFERROR(CODE(MID(cell,9,1)),0) returns 0 if the 9th character does not exist
= LEN(cell) returns the number of characters in the cell
Concatenate the first 8 codes and append the length at the end.
If 8 characters are not enough, add further codes for the following characters in the string.
Final function:
=CODE(B2)&IFERROR(CODE(MID(B2,2,1)),0)&IFERROR(CODE(MID(B2,3,1)),0)&IFERROR(CODE(MID(B2,4,1)),0)&IFERROR(CODE(MID(B2,5,1)),0)&IFERROR(CODE(MID(B2,6,1)),0)&IFERROR(CODE(MID(B2,7,1)),0)&IFERROR(CODE(MID(B2,8,1)),0)&LEN(B2)
Sorry, I didn't find a formula-only solution. This thread might help (it tries to calculate the points in a Scrabble game), but I didn't find a way to be sure the generated hash would be unique.
Still, here is my solution, based on a UDF (User-Defined Function):
Put the code in a module:
Public Function genId(ByVal sName As String) As Long
'Function to create a unique hash by summing the ascii value of each character of a given string
Dim sLetter As String
Dim i As Integer
For i = 1 To Len(sName)
genId = Asc(Mid(sName, i, 1)) * i + genId
Next i
End Function
And call it in your worksheet like a formula:
=genId(A1)
[EDIT] Added the * i to take into account the order. It works on my unit tests
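As the answer itself concedes, a position-weighted sum is not guaranteed to be unique. A small check illustrates this; WeightedSum and ShowGenIdCollision are hypothetical names, and WeightedSum simply repeats the weighting used in genId so that the sketch stands on its own.

```vba
' Same weighting as genId above, repeated here so the sketch is self-contained
Private Function WeightedSum(ByVal s As String) As Long
    Dim i As Long
    For i = 1 To Len(s)
        WeightedSum = WeightedSum + Asc(Mid$(s, i, 1)) * i
    Next i
End Function

Sub ShowGenIdCollision()
    ' Two different strings that produce the same value
    Debug.Print WeightedSum("ab")   ' 97*1 + 98*2 = 293
    Debug.Print WeightedSum("ca")   ' 99*1 + 97*2 = 293
End Sub
```

It is therefore worth checking the generated IDs for duplicates before relying on them.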
May be OTT for your needs, but you can use a call to CoCreateGuid to get a real GUID
Private Declare Function CoCreateGuid Lib "ole32" (ID As Any) As Long
Function GUID() As String
Dim ID(0 To 15) As Byte
Dim i As Long
If CoCreateGuid(ID(0)) = 0 Then
For i = 0 To 15
GUID = GUID & Right$("0" & Hex$(ID(i)), 2) ' pad each byte to two hex digits
Next
Else
GUID = "Error while creating GUID!"
End If
End Function
Test using
Sub testGUID()
MsgBox GUID
End Sub
How to best implement depends on your needs. One way would be to write a macro to get a GUID populate a column where names exist. (note, using it as a udf as is is no good, since it will return a new GUID when recalculated)
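The macro approach could look roughly like the sketch below. It relies on the GUID function above being in the same project, and it assumes the layout from the question (names in column B, IDs in column A) with a header in row 1. The name FillMissingIds is hypothetical. IDs already present are left untouched, so they stay stable across runs.

```vba
Sub FillMissingIds()
    Dim ws As Worksheet
    Set ws = ActiveSheet                    ' assumption: the list is on the active sheet

    Dim lastRow As Long, r As Long
    lastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row

    For r = 2 To lastRow                    ' assumption: row 1 holds the headers
        ' Only fill rows that have a name but no ID yet
        If Len(ws.Cells(r, "B").Value) > 0 And Len(ws.Cells(r, "A").Value) = 0 Then
            ws.Cells(r, "A").Value = GUID() ' GUID is the function defined above
        End If
    Next r
End Sub
```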
EDIT
See this answer for creating a SHA1 hash of a string
Do you just want an incrementing numeric id column to sit next to your values? If so, and if your values will always be unique, you can very easily do this with formulae.
If your values were in column B, starting in B2 underneath your headers for example, in A2 you would type the formula "=IF(B2="","",1+MAX(A$1:A1))". You can copy and paste that down as far as your data extends, and it will increment a numeric identifier for each row in column B which isn't blank.
If you need to do anything more complicated, like identify and re-identify repeating values, or make identifiers 'freeze' once they're populated, let me know. Currently, when you clear or add values in your list, the identifiers will shift up and down, so you need to be careful if your data changes.
Unique identifier based on the number of specific characters in text. I used an identifier based on vowels and numbers.
=LEN($J$14)-LEN(SUBSTITUTE($J$14;"a";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"e";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"i";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"j";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"o";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"u";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"y";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"1";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"2";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"3";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"4";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"5";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"6";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"7";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"8";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"9";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"0";""))
You say you are confident that there are no duplicate values in your words. To push it further, are you confident that the first 8 characters in any word would be unique?
If so, you can use the formula below. It takes each character's ASCII code minus 40 [after LOWER, this puts digits between 8 and 17 and letters between 57 and 82] and multiplies it by 10 ^ [twice that character's zero-based position among the 8 characters]. Basically it takes each character code [-40] and concatenates it onto the next as a two-digit block.
EDIT Note that this code no longer requires that at least 8 characters exist in your word to prevent an error, as the actual word to be coded has 8 "0"'s appended to it.
=TEXT(SUM((CODE(MID(LOWER(RIGHT(REPT("0",8)&A3,8)),{1,2,3,4,5,6,7,8},1))-40)*10^{0,2,4,6,8,10,12,14}),"#")
Note that as this uses the ASCII values of the characters, the ID # could be used to identify the name directly - this does not really create anonymity, it just turns 8 unique characters into a unique number. It is obfuscated with the -40, but not really 'safe' in that sense. The -40 is just to get normal letters and numbers in the 2 digit range, so that multiplying by 10^0,2,4 etc. will create a 2 digit unique add-on to the created code.
EDIT FOR ALTERNATIVE METHOD
I had previously attempted to do this so that it would look at each letter of the alphabet, count the number of times it appears in the word, and then multiply that by 10^[that letter's position in the alphabet]. The problem with doing this (see comment below for formula) is that it required a number of 10^26-1, which is beyond Excel's floating point precision. However, I have a modified version of that method:
By limiting the number of allowed characters in the alphabet, we can get the max total size possible to 10^15-1, which Excel can properly calculate. The formula looks like this:
=RIGHT(REPT("0",15)&TEXT(SUM(LEN(A3)*10^{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}-LEN(SUBSTITUTE(A3,MID(Alphabet,{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15},1),""))*10^{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}),"#"),15)
[The RIGHT(REPT("0",15)&... portion of the formula is meant to keep all codes the same number of characters]
Note that here, Alphabet is a named string which holds the characters: "abcdehilmnorstu". For example, using the above formula, the word "asdf" counts the instances of a, s, and d, but not 'f' which isn't in my contracted alphabet. The code of "asdf" would be:
001000000001001
This only works with the following assumptions:
The letters not listed (nor numbers / special characters) are not required to make each name unique. For example, asdf & asd would have the same code in the above method.
And,
The order of the letters is not required to make each name unique. For example, asd & dsa would have the same code in the above method.

Resources