VBA: How to find the values after a "#" symbol in a string

I am trying to set the letters after a # symbol to a variable.
For example, x = #BAL
I want to set y = BAL
Or x = #NE
I want y = NE
I am using VBA.

Split() in my opinion is the easiest way to do it:
Dim myStr As String
myStr = "#BAL"
If InStr(myStr, "#") > 0 Then '<-- Check that the string contains "#" so Split(...)(1) cannot throw an error
MsgBox Split(myStr, "#")(1)
End If
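As wisely pointed out by Scott Craner, you should guard against a string that does not contain the delimiter, because Split(myStr, "#")(1) raises a "Subscript out of range" error in that case. In a comment he avoids the problem by taking the last element instead, which returns the whole string unchanged when there is no #: y = Split(x, "#")(UBound(Split(x, "#"))). Another way you can do it is to test with InStr() first: If InStr(x, "#") > 0 Then...
The (1) takes the text after the first instance of the character you are looking for (up to the next instance, if there is one). If you were to have used (0), then this would have taken everything before the first #.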
Similar but different example:
Dim myStr As String
myStr = "#BAL#TEST"
MsgBox Split(myStr, "#")(2)
The message box would have returned TEST because you used (2), which is the text after the second instance of your # character.
Then you can even split them into an array:
Dim myStr As String, splitArr() As String
myStr = "#BAL#TEST"
splitArr = Split(myStr, "#") '<-- don't append an index this time
MsgBox SplitArr(1) '< -- This would return "BAL"
MsgBox SplitArr(2) '< -- This would return "TEST"
If you are looking for additional reading, here is more from the MSDN:
Split Function
Description: Returns a zero-based, one-dimensional array containing a specified number of substrings.
Syntax: Split(expression[, delimiter[, limit[, compare]]])
The Split function syntax has these named arguments:
expression
Required. String expression containing substrings and delimiters. If expression is a zero-length string(""), Split returns an empty array, that is, an array with no elements and no data.
delimiter
Optional. String character used to identify substring limits. If omitted, the space character (" ") is assumed to be the delimiter. If delimiter is a zero-length string, a single-element array containing the entire expression string is returned.
limit
Optional. Number of substrings to be returned; -1 indicates that all substrings are returned.
compare
Optional. Numeric value indicating the kind of comparison to use when evaluating substrings. See Settings section for values.
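To make the optional limit and compare arguments a bit more concrete, here is a minimal sketch. The procedure name DemoSplitArguments and the input strings are made up for illustration, and the comments assume the module does not use Option Compare Text.

```vba
Sub DemoSplitArguments()
    Dim parts() As String

    'limit = 2: stop after two substrings; the rest of the string stays in the last one
    parts = Split("#BAL#TEST", "#", 2)
    Debug.Print UBound(parts)   'prints 1 (two elements: index 0 and 1)
    Debug.Print parts(1)        'prints BAL#TEST

    'compare = vbTextCompare: the delimiter is matched case-insensitively
    parts = Split("BALxTESTXEND", "x", -1, vbTextCompare)
    Debug.Print parts(2)        'prints END
End Sub
```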

You can do the following to get the substring after the # symbol.
x = "#BAL"
y = Right(x,len(x)-InStr(x,"#"))
Where x can be any string, with characters before or after the # symbol.
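One thing to be aware of: when x contains no # at all, InStr returns 0 and the Right() expression above hands back the whole string. If you would rather get an empty string in that case, something along these lines might do. It is only a sketch, and TextAfterHash is a hypothetical helper name, not a built-in function.

```vba
'Hypothetical helper: returns the text after the first "#", or "" when there is none
Function TextAfterHash(ByVal s As String) As String
    Dim pos As Long
    pos = InStr(s, "#")
    If pos > 0 Then
        'Mid$ without a length argument returns everything from pos + 1 to the end
        TextAfterHash = Mid$(s, pos + 1)
    End If
End Function

Sub DemoTextAfterHash()
    Debug.Print TextAfterHash("#BAL")       'prints BAL
    Debug.Print TextAfterHash("x = #NE")    'prints NE
    Debug.Print TextAfterHash("no marker")  'prints an empty line
End Sub
```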

Related

How can I split a string in an array of strings such that each string is either a predefined array of strings, a variable or a string in Matlab?

I have two predefined arrays, say:
first = ["alpha" "beta"];
second = ["{" "}"];
and I want to create a function which receives a string and splits the string in different string arrays(or cell). Each array(or cell) should contain either a single member of one of the two arrays, a variable that is not a member of the arrays (without including the blank space) or a string. Ex:
Input string:
"alpha{ beta} new {} new2 "This is a string" }"
Output string:
"alpha" "{" "beta" "new" "{" "}" "new2" "This is a string" "}"
Hope it is clear!
Bests,
Stergios
I tried this:
S = "alpha{ beta} new new2 {} new3}";
T = ["alpha","beta", "{","}"];
[X,Y] = strsplit(S,[T " "], 'CollapseDelimiters',false);
X = strtrim(X); % you forgot to mention, that you also want to remove whitespace
X(2,:) = [Y,""];
X(strlength(X)==0) = []
but S does not accept strings of strings and if I use '' every word will be in a different cell!

Comparing two strings with wildcard conditions

Working on a VBA macro that compares string values, and have hit a wall at a point.
Here is some context on what I'm trying to accomplish:
Compare two strings, If String 2 is CONTAINED anywhere in String 1, I return a match. For the most part, I've used the builtin "instr" function in instances where the String 2 is contained in String 1 without any wildcards involved.
The trouble I'm having is that I must treat spaces or " " in String 2 as a wildcard.
Ex:
String 1 = Red Green Blue
String 2 = Red Blue
This should still return a valid match, since the " " in String 2 is being treated as a wildcard, and any number of characters can be between "Red" and "Blue"
What I did was use the "split" function with " " as a delimiter to split String 2 in instances where a space(" ") is involved, and run an instr function on the resulting array to check if each element of the array is contained in String 1. In the example above:
String 2 would be split into a String array (let's call it splitString) with the following elements:
splitString = (Red, Blue)
Using the logic above:
splitString(0) is contained in String 1
splitString(1) is contained in String 1
Therefore, String 2 is contained in String 1, and a match is returned. The match condition I am using relies on the UBound value of splitString (details in the code snippet below)
The issue I am having is that I need to only return a match where the initial string order of String 2 is maintained. Ex:
If:
String 1 = Red Green Blue
String 2 = Blue Red
This is not a valid match since even though when we split String 2, we find the resulting array elements are "contained" in String 1, the order of String 2 is not being respected.
Here is a rough draft of the logic I've coded:
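splitString = Split(String2, " ")
x = -1
'Count how many elements of splitString are found somewhere in String1
For y = LBound(splitString) To UBound(splitString)
    splitStringCompare = InStr(1, String1, splitString(y), vbTextCompare)
    If splitStringCompare > 0 Then
        x = x + 1
    End If
Next y
'Every element was found when the counter reaches the upper bound
If x = UBound(splitString) Then
    Debug.Print "Match"
Else
    Debug.Print "No Match"
End If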
Any help or nudge in the right direction would be much appreciated. Thanks!!
You can use Regular Expressions object to test this and verify match. For demonstration purpose, I have created a UDF as below.
Public Function MatchWildCard(ByVal strMatchWith As String, ByVal strMatchString As String) As Boolean
    '\\ ByVal keeps the caller's variables unchanged when the pattern is rewritten below
    '\\ Each space becomes ".+" (one or more of any character)
    strMatchString = Replace(Trim(strMatchString), " ", ".+")
    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = False
        .IgnoreCase = True '\\ Change to False if match is case sensitive
        .Pattern = strMatchString
        MatchWildCard = .Test(strMatchWith)
    End With
End Function
It can then be used in a worksheet formula, for example =MatchWildCard(A1,B1), where A1 holds String 1 and B1 holds String 2.
Note: I am a basic user of RegExp, so there may be a better way of handling this, and you should test it on a large sample to validate.
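If you would rather stay with Split and InStr, as in the question's draft, the order can be enforced by starting each search where the previous word ended. The following is only a sketch of that idea: MatchInOrder and its parameter names are hypothetical, and unlike the ".+" pattern it also accepts words that sit directly next to each other (for example "RedBlue").

```vba
'Hypothetical alternative to the RegExp UDF: checks the words of wildcardText in order
Function MatchInOrder(ByVal searchIn As String, ByVal wildcardText As String) As Boolean
    Dim words() As String, i As Long
    Dim pos As Long, startAt As Long

    words = Split(Trim$(wildcardText), " ")
    startAt = 1
    For i = LBound(words) To UBound(words)
        If Len(words(i)) > 0 Then            'skip empty items caused by double spaces
            pos = InStr(startAt, searchIn, words(i), vbTextCompare)
            If pos = 0 Then Exit Function    'word not found after the previous one
            startAt = pos + Len(words(i))    'continue searching after this word
        End If
    Next i
    MatchInOrder = True
End Function
```

With the examples from the question, MatchInOrder("Red Green Blue", "Red Blue") returns True and MatchInOrder("Red Green Blue", "Blue Red") returns False.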

How to extract the first instance of digits in a cell with a specified length in VBA?

I have the following Text sample:
Ins-Si_079_GM_SOC_US_VI SI_SOC_FY1920_US_FY19/20_A2554_Si Resp_2_May
I want to get the number 079, so what I need is the first instance of digits of length 3. There are certain times the 3 digits are at the end, but they are usually found between the first 2 underscores. I only want the digits with length three (079) and not 19, 1920, or 2554, which are different lengths.
Sometimes it can look like this with no underscore:
1920 O-B CLI 353 Tar Traf
Or like this with the 3 digit number at the end:
Ins-Si_GM_SOC_US_VI SI_SOC_FY1920_US_FY19/20_A2554_Si Resp_2_079
There are also times where what I need is 2 digits, but when it's 2 digits it's always at the end like this:
FY1920-Or-OLV-B-45
How would I get what I need in all cases?
You can split the listed items and check for 3 digits via Like:
Function Get3Digits(s As String) As String
Dim tmp, elem
tmp = Split(Replace(Replace(s, "-", " "), "_", " "), " ")
For Each elem In tmp
If elem Like "###" Then Get3Digits = elem: Exit Function
Next
If Get3Digits = vbNullString Then Get3Digits = IIf(Right(s, 2) Like "##", Right(s, 2), "")
End Function
Edited due to comment:
I would execute a 2-digit search when there are no 3-digit numbers before the end part and the last 2 characters are digits. If 3 digits are found at the end then get 3, but if not then get 2. There are times when the last part is a number but only a single digit. I would only want to get the last part if it has 2 or 3 digits. The - would not be relevant to the 2 digits. If nothing desired is found then it would return " ".
If VBA is not a must you could try:
=TEXT(INDEX(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"_"," "),"-"," ")," ","</s><s>")&"</s></t>","//s[.*0=0][string-length()=3 or (position()=last() and string-length()=2)]"),1),"000")
It worked for your sample data.
Edit: Some explanation.
SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"_"," "),"-"," ")," ","</s><s>") - The key part: it transforms all three potential delimiters (hyphen, underscore and space) into a valid XML end tag followed by a start tag.
The above concatenated using ampersand into a valid XML construct (adding a parent node <t>).
FILTERXML can be used to now 'split' the string into an array.
//s[.*0=0][string-length()=3 or (position()=last() and string-length()=2)] - The 2nd parameter of FILTERXML, which should be valid XPATH syntax. It reads:
//s 'Select all <s> nodes that meet the following conditions:
[.*0=0] 'Check that the <s> node times zero returns zero (to check that the node is numeric).
[string-length()=3 or (position()=last() and string-length()=2)] 'Check that the node is 3 characters long, OR that it is the last node and only 2 characters long.
INDEX(.....,1) - I mentioned in the comments that this is usually not needed, but since Excel O365 might spill the returned array, we may as well implement it to prevent spill errors for those who use the newest Excel version. It simply retrieves the very first element of whatever array FILTERXML returns.
TEXT(....,"000") - Excel will try to drop the leading zeros of a numeric value, so we use TEXT() to turn it into a string value of three digits (a two-digit result such as 45 is padded the same way, to 045).
Now, if no element can be found, this will return an error; however, a simple IFERROR could fix this.
Try this function, please:
Function ExtractThreeDigitsNumber(x As String) As String
    Dim El As Variant, arr As Variant, strFound As String
    'Treat "_", "-" and " " all as delimiters, since one string can mix them
    arr = Split(Replace(Replace(x, "_", " "), "-", " "), " ")
    For Each El In arr
        'Like "###" matches exactly three digits (IsNumeric would also accept e.g. "1E2")
        If El Like "###" Then strFound = El: Exit For
    Next
    If strFound = "" Then
        'Fall back to a two-digit number at the very end of the string
        If Right(x, 2) Like "##" Then ExtractThreeDigitsNumber = Right(x, 2)
    Else
        ExtractThreeDigitsNumber = strFound
    End If
End Function
It can be called in this way:
Sub testExtractThreDig()
Dim x As String
x = "Ins-Si_079_GM_SOC_US_VI SI_SOC_FY1920_US_FY19/20_A2554_Si Resp_2_May"
Debug.Print ExtractThreeDigitsNumber(x)
End Sub

How to break a text block up so that it will display only One Word on each line

I am importing longer form text into a Unity program. I need one word of the longer text to be displayed on each line...
Thanks
The problem with working with large blocks of text in Word is that operations like Find and Replace can only be performed with Find text strings of 255 characters or less without causing an error. Once you import your text and assign it to a string variable, you can use Len() to determine the length of the string and then use Left() Mid() and Right() to breakup the larger string into shorter chunks of 250 characters each. Here's some code I wrote for just a find and replace situation:
'Declare the chunk variables once; repeating Dim inside several Case branches does not compile
Dim stFound1 As String, stFound2 As String, stFound3 As String
With Selection.Find
    y = Len(Selection.Text)
    Select Case y
        Case Is <= 250
            x = 1
            .Text = stFound
            .Execute Replace:=wdReplaceAll
        Case Is <= 500
            x = 2
            z = Len(stFound) - 250
            stFound1 = Left(stFound, 250)
            stFound2 = Right(stFound, z)
        Case Is <= 750
            x = 3
            stFound1 = Left(stFound, 250)
            stFound2 = Mid(stFound, 251, 250) 'characters 251-500, so character 500 is not lost
            stFound3 = Right(stFound, Len(stFound) - 500)
    End Select
End With
I then used a For Next loop to run a Find and Replace on each string.
In your situation, it's going to be important to not break up the strings in the middle of a word. To do this you can use the InStr() function to find the position of spaces within your string and then break up the text according to where the spaces are. I wouldn't try using the Split() function on the raw text as depending on the size of the string you could run into a Subscript Out of Range error.
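As a rough sketch of breaking the text at spaces: the helper below is hypothetical (ChunkAtSpaces is not a built-in), it uses InStrRev rather than InStr so it can look backwards from the size limit for the last space, and it assumes maxLen is at least 1.

```vba
'Hypothetical helper: splits s into pieces of at most maxLen characters, breaking at spaces
'Assumes maxLen >= 1
Function ChunkAtSpaces(ByVal s As String, ByVal maxLen As Long) As Collection
    Dim chunks As New Collection
    Dim cutAt As Long

    Do While Len(s) > maxLen
        'Last space at or before position maxLen + 1, so the piece never exceeds maxLen
        cutAt = InStrRev(s, " ", maxLen + 1)
        If cutAt <= 1 Then cutAt = maxLen + 1   'no usable space: hard cut after maxLen characters
        chunks.Add Left$(s, cutAt - 1)
        s = LTrim$(Mid$(s, cutAt))              'drop the space(s) at the cut point
    Loop
    If Len(s) > 0 Then chunks.Add s
    Set ChunkAtSpaces = chunks
End Function
```

For example, ChunkAtSpaces("one two three four", 8) gives three items: "one two", "three" and "four". For the Find and Replace case above, maxLen would be 250.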
Once the text is chunked down into usable pieces, use the Split() function to send each word to an array and then run the following code to put each word on its own line or paragraph:
Dim stTxt As String
Dim stWord As String
Dim stArr() As String
Dim x As Long
stTxt = "your text here" 'Placeholder: one of your text strings
stArr() = Split(stTxt)
For x = LBound(stArr()) To UBound(stArr())
    'vbCr starts a new paragraph; "^p" only has that meaning in Find and Replace
    stWord = stArr(x) & vbCr
    Selection.TypeText stWord
Next
After a little more research, I determined that the 255 character limit to text strings only affects some functions, not all. So I took a 17,335 character (including spaces) Word document and ran Split() on it to create an Array. There were no errors and the resulting array had a UBound of 2690.
So the next question is what kind of text is being imported into Word and what size is it. Is it just a list of words separated by spaces, or another delimiter? Does it contain any punctuation? If it's just a list of words separated by spaces or another delimiter such as a comma or semicolon, the Split() function will sort the words into an Array, at least up to 17,000 characters. More testing would be required for a larger text block. If the text contains punctuation, you would have to process the text to remove the unwanted punctuation which can be done with a Wildcard Find and Replace as long as the Find string is <= 255 characters. But if all you have are words and spaces or some other delimiter, using Split() to separate each word into an array element would work and then just run code as in the second half of my previous example:
For x = LBound(stArr()) To UBound(stArr())
    'vbCr starts a new paragraph; "^p" only has that meaning in Find and Replace
    stWord = stArr(x) & vbCr
    Selection.TypeText stWord
Next

Octave - return the position of the first occurrence of a string in a cell array

Is there a function in Octave that returns the position of the first occurrence of a string in a cell array?
I found findstr but this returns a vector, which I do not want. I want what index does but it only works for strings.
If there is no such function, are there any tips on how to go about it?
As findstr is being deprecated, a combination of find and strcmpi may prove useful. strcmpi compares strings by ignoring the case of the letters which may be useful for your purposes. If this is not what you want, use the function without the trailing i, so strcmp. The input into strcmpi or strcmp are the string to search for str and for your case the additional input parameter is a cell array A of strings to search in. The output of strcmpi or strcmp will give you a vector of logical values where each location k tells you whether the string k in the cell array A matched with str. You would then use find to find all locations of where the string matched, but you can further restrain it by specifying the maximum number of locations n as well as where to constrain your search - specifically if you want to look at the first or last n locations where the string matched.
If the desired string is in str and your cell array is stored in A, simply do:
index = find(strcmpi(str, A), 1, 'first');
To reiterate, find will find all locations where the string matched, while the second and third parameters tell you to only return the first index of the result. Specifically, this will return the first occurrence of the desired searched string, or the empty array if it can't be found.
Example Run
octave:8> A = {'hello', 'hello', 'how', 'how', 'are', 'you'};
octave:9> str = 'hello';
octave:10> index = find(strcmpi(str, A), 1, 'first')
index = 1
octave:11> str = 'goodbye';
octave:12> index = find(strcmpi(str, A), 1, 'first')
index = [](1x0)

Resources