Using the Excel PROPER function with an exception

Essentially I have multiple strings within my Excel Spreadsheet that are structured the following way:
JOHN-MD-HOPKINS
REC-PW-RESIN
I would like to apply the PROPER function but leave the part of the string between the dashes (-) unchanged.
The end result should look like the following:
John-MD-Hopkins
Rec-PW-Resin
Is there an Excel formula that is capable of doing this?

You may need to create your own VBA function for this. It checks whether the data contains exactly two hyphens; if so, it converts the first and last words to proper case without touching the middle word, and otherwise it converts the whole string to proper case.
Paste the following into a module within Excel:
Function fProperCase(strData As String) As String
Dim aData() As String
aData() = Split(strData, "-")
If UBound(aData) - LBound(aData) = 2 Then ' has two hyphens in the original data
fProperCase = StrConv(aData(LBound(aData)), vbProperCase) & "-" & aData(LBound(aData) + 1) & "-" & StrConv(aData(UBound(aData)), vbProperCase)
Else ' just do a normal string conversion to proper case
fProperCase = StrConv(strData, vbProperCase)
End If
End Function
Then, in your worksheet, you can use this just as you would any built-in formula, so if "JOHN-MD-HOPKINS" is in cell A1, you would use this as a formula in another cell:
=fProperCase(A1)
Which would display John-MD-Hopkins as required.
EDITED CODE
Since the requirement is to leave the second word untouched, this modified VBA function, which "walks" the array, should work instead:
Function fProperCase2(strData As String) As String
Dim aData() As String
Dim lngLoop1 As Long
aData() = Split(strData, "-")
For lngLoop1 = LBound(aData) To UBound(aData)
If (lngLoop1 = LBound(aData) + 1) And (lngLoop1 <> UBound(aData)) Then
aData(lngLoop1) = aData(lngLoop1) ' no-op: the middle word is left unchanged
Else
aData(lngLoop1) = StrConv(aData(lngLoop1), vbProperCase)
End If
Next lngLoop1
fProperCase2 = Join(aData, "-")
End Function
It basically looks to see if the array element being dealt with is the second (lngLoop1=LBound(aData)+1) and also not the last (lngLoop1<>UBound(aData)).
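A quick way to check the behaviour described above is a small routine run from the VBA editor, with the output appearing in the Immediate window. This is only a sketch: the procedure name test_fProperCase2 is made up for illustration, and it assumes fProperCase2 from above is available in the same VBA project.

```vba
' Hypothetical test routine (the name is illustrative); assumes fProperCase2
' from above is available in the same VBA project.
Public Sub test_fProperCase2()
    ' Three parts: the middle part is left as typed
    Debug.Print fProperCase2("JOHN-MD-HOPKINS"), "Should be John-MD-Hopkins"
    ' Two parts: the second part is also the last one, so it is converted
    Debug.Print fProperCase2("JOHN-HOPKINS"), "Should be John-Hopkins"
    ' No hyphen: the whole string is converted
    Debug.Print fProperCase2("JOHN"), "Should be John"
End Sub
```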

Related

Differentiate between "1" and "11"

I have 20 cases. For every row in my sheet, I have a cell that assigns related case numbers to it. A row could have multiple case numbers assigned to it in that cell (Example: 1,2,11,12)
I am writing code to copy all the rows that have case number 1 assigned to them and paste them someplace else,
then go to case number 2 and repeat the same.
This is what I am using:
For CaseNumbers = 1 To 20
For i = Row1 To RowLast
If InStr(1, Range(CaseNoCell & i).Value, CaseNumbers & ",") Then
' COPY AND PASTE CODE
End If
Next
Next
The problem I am facing is, the code considers case number 11 as case number 1 too (since it has the digit 1).
This is the first time I am writing VBA code and I have no background in this.
Can someone please advise on a better way of doing this? Should I assign a checklist to each row instead?
All I want to do is find all the rows that have Case number 1 assigned, copy them.. then find all the rows that have Case 2 assigned, copy them.. and so on.
Please help.
You can use a function to do the test:
Public Function isCaseNumberIncluded(ByVal caseToCheck As Long, ByVal caseNumbers As String) As Boolean
'add , to make all values distinct
caseNumbers = "," & caseNumbers & ","
Dim strCaseToCheck As String
strCaseToCheck = "," & caseToCheck & ","
If InStr(1, caseNumbers, strCaseToCheck) > 0 Then
isCaseNumberIncluded = True
End If
End Function
You would call this function within your main code like this:
Dim caseNumber As Long 'I removed the s - as this could be misleading in my eyes
For caseNumber = 1 To 20
For i = Row1 To RowLast
If isCaseNumberIncluded(caseNumber, Range(CaseNoCell & i).Value) Then
' COPY AND PASTE CODE
End If
Next
Next
Using a separate function to run the test has two advantages:
your code gets more readable, i.e. you know from reading the function's name what the result should be, without reading the whole code to see how it is done :-)
you can probably re-use this code in another place
Or you can test the function first:
Public Sub test_isCaseNumberIncluded()
Debug.Print isCaseNumberIncluded(1, "1,2,11,12"), "Should be true"
Debug.Print isCaseNumberIncluded(1, "2,11,12"), "Should be false"
Debug.Print isCaseNumberIncluded(11, "1,2,11,12"), "Should be true"
Debug.Print isCaseNumberIncluded(11, "1,2,12"), "Should be false"
End Sub
Well, you are working with this piece of code:
If InStr(1, Range(CaseNoCell & i).Value, CaseNumbers & ",") Then
This checks against 1,, 12,, ..., but obviously it won't cover the last entry so that's something you'll need to add. And you have the problem that 11, gets treated as 1,.
In a similar way you can use this piece of code:
If InStr(1, Range(CaseNoCell & i).Value, "," & CaseNumbers & ",") Then
This checks against ,1,, ,12,, ... so it will solve your error, but obviously it won't cover the last and the first entry so that's something you'll need to add.
This is something that should be encapsulated in a function rather than being done inline. The method provided in VBA for tokenising a string is 'Split'.
You could write a function that checks the tokens one by one, or one that compiles a collection of the tokens and then uses a built-in method of the collection to determine whether the specified token is present.
In this specific case I've chosen to use the collection method. The specific object for the collection is the ArrayList (but a Scripting.Dictionary is also possible). The function contains checks for zero-length strings and allows the separator to be specified if it isn't a comma.
Option Explicit
Function FindToken(ByVal ipToken As String, ByVal ipTokenList As String, Optional ByVal ipSeparator As String = ",") As Boolean
' Guard against ipSeparator being vbnullstring
Dim mySeparator As String
mySeparator = IIf(VBA.Len(ipSeparator) = 0, ",", ipSeparator)
'Raise an error if ipToken or ipTokenList are empty strings
If VBA.Len(ipToken) = 0 Or VBA.Len(ipTokenList) = 0 Then
Err.Raise 17, "Empty string error"
End If
'Convert the token list to tokens
Dim myTokens As Variant
myTokens = VBA.Split(ipTokenList, mySeparator)
' Put the tokens in an ArrayList so we can use the contains method
' no point in doing early binding as ArrayList doesn't provide IntelliSense
Dim myAL As Object
Set myAL = CreateObject("System.Collections.ArrayList")
Dim myItem As Variant
For Each myItem In myTokens
' Trim just in case there are spaces
myAL.Add VBA.Trim(myItem)
Next
'Finally test if the Token exists in the token list
FindToken = myAL.contains(VBA.Trim(ipToken))
End Function
This means that your code
If InStr(1, Range(CaseNoCell & i).Value, CaseNumbers & ",") Then
can now be rewritten as
If FindToken(CStr(CaseNumbers), Range(CaseNoCell & CStr(i)).Value) Then
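The other option mentioned above, checking the tokens one by one, could be sketched roughly as follows. It needs no ArrayList (and therefore no .NET dependency). The function name HasToken and its parameter names are made up for illustration and are not part of the answer above.

```vba
' Hypothetical helper (names are illustrative): returns True if token is one
' of the separator-delimited entries in tokenList, comparing whole entries
' so that "1" does not match "11".
Public Function HasToken(ByVal token As String, ByVal tokenList As String, _
                         Optional ByVal separator As String = ",") As Boolean
    Dim part As Variant
    For Each part In Split(tokenList, separator)
        ' Trim both sides in case the list was typed as "1, 2, 11"
        If Trim$(CStr(part)) = Trim$(token) Then
            HasToken = True
            Exit Function
        End If
    Next part
    ' Falls through with the default return value False
End Function
```

With this sketch, HasToken("1", "2,11,12") is False and HasToken("11", "1,2,11,12") is True; an empty list simply yields False instead of raising an error.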
Identify Criteria Rows
Option Explicit
Sub Test()
Const WordSeparator As String = ","
Dim ws As Worksheet: Set ws = ActiveSheet ' improve!
Dim CaseNumber As Long
Dim i As Long
Dim cValue As Variant
Dim cString() As String
For CaseNumber = 1 To 20
For i = Row1 To RowLast
cValue = CStr(ws.Range(CaseNoCell & i).Value)
If Len(cValue) > 0 Then
cString = Split(cValue, WordSeparator)
If IsNumeric(Application.Match( _
CStr(CaseNumber), cString, 0)) Then
' CopyAndPasteCode CaseNumber
Debug.Print "Case " & CaseNumber & ": " & "Row " & i
End If
End If
Next i
Next CaseNumber
End Sub

InStr will not find dots in some cases

I have strings that consist of leading dots followed by a number (for example "..2" or "....4"). I want to delete all leading dots and convert the string into a Long variable.
So I have written a function that finds leading dots in strings and deletes them. For some reason, the function works for a string like "..2" but will not work for "...3". The InStr function will not find "." in "...3".
The strings are read out from a column in a worksheet. They are not formatted in any weird way, I have tried just typing them in manually in a new worksheet without any changes to the default formatting settings, same results.
So I have tried several things. I believe there must be some error involving character encodings, but I cannot figure out how to solve this problem.
I have tried using a recursive function using InStr to delete the dots and then tried the split function with "." as the delimiter to test my assumption. Split has the same problem, works for "..2" but will not work for "...3".
When I debug print the strings that I read out, "...3" seems to be formatted differently than "..2" or ".1". I do not know why.
here you can see the difference in the formatting
Sub Gruppieren()
'read out strings first
'then try to delete the dots
Dim strArr() As String
Dim lngArr() As Long
Dim lLastRow As Long
Dim i As Long
lLastRow = getFirstEmptyRow("A", Tabelle1.Index)
ReDim strArr(1 To lLastRow)
ReDim lngArr(1 To lLastRow)
For i = 1 To UBound(strArr)
strArr(i) = Worksheets(1).Cells(i, 1).Value
Debug.Print strArr(i)
strArr(i) = clearLeadingDots(strArr(i))
'strArr(i) = splitMeIfYouCan(strArr(i))
If IsNumeric(strArr(i)) = True Then
lngArr(i) = CLng((strArr(i)))
Debug.Print lngArr(i)
End If
Next i
End Sub
'The functions:
Function clearLeadingDots(myText As String) As String
Dim i As Long
i = InStr(myText, ".")
If i <> 0 Then
myText = Right(CStr(myText), Len(myText) - i)
clearLeadingDots = clearLeadingDots(CStr(myText))
Else
clearLeadingDots = CStr(myText)
Exit Function
End If
End Function
Function splitMeIfYouCan(myText As String) As String
Dim myArr() As String
Dim i As Long
myArr = Split(myText, ".")
splitMeIfYouCan = myArr(UBound(myArr))
End Function
Edit: The answer was that three dots were automatically converted into an ellipsis character; searching for and eliminating Chr(133) did the job.
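A minimal sketch of that fix, for illustration only: the function name StripLeadingDots is made up, and the code assumes the offending character is the single-character ellipsis (Unicode U+2026), which is what Chr(133) returns on a Windows system with a Western code page. ChrW$(&H2026) names the same character independently of the code page.

```vba
' Hypothetical helper (name is illustrative): removes leading "." characters
' and leading ellipsis characters (U+2026) from rawText.
Public Function StripLeadingDots(ByVal rawText As String) As String
    Dim pos As Long
    pos = 1
    Do While pos <= Len(rawText)
        Select Case Mid$(rawText, pos, 1)
            Case ".", ChrW$(&H2026)
                pos = pos + 1       ' skip a dot or an ellipsis
            Case Else
                Exit Do             ' first character that is neither
        End Select
    Loop
    ' Mid$ returns "" when pos is past the end of the string
    StripLeadingDots = Mid$(rawText, pos)
End Function
```

In the loop of Gruppieren, the call clearLeadingDots(strArr(i)) could then be swapped for StripLeadingDots(strArr(i)); the IsNumeric check that follows stays as it is.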

Pass Multiple Values and Get Selected Results In A Cell

I am new to Excel VBA and am trying to use it for a scenario in an Excel workbook. I am trying to search a cell for multiple values and have the matches highlighted. Say I have these ids, 1001 and 1002: those values should be searched for and highlighted in the specific cell. I am not sure how I can use List<> in VBA, but in C# I can do the following:
List<string> aLst = new List<string>();
aLst.Add("1001");
aLst.Add("1002");
So with the list, I can iterate the ids and get the results matched. I was looking into the following VBA code and it gets one value as parameter. Then checks the matched one:
Function SingleCellExtract(Lookupvalue As String, LookupRange As Range, ColumnNumber As Integer)
Dim i As Long
Dim Result As String
For i = 1 To LookupRange.Columns(1).Cells.Count
If LookupRange.Cells(i, 1) = Lookupvalue Then
Result = Result & " " & LookupRange.Cells(i, ColumnNumber) & ","
End If
Next i
SingleCellExtract = Left(Result, Len(Result) - 1)
End Function
Here is the link that I am following: Excel VBA
So is there any way to use something like List<> and have the required values searched for and highlighted in the Excel sheet?
Sample:
Id - Name
1001 John
1002 Jack
Copy this function into a new module.
You can then call it from another function or procedure, or from an Excel formula in a cell such as =MultiCellExtract(A2:A3;A2:B3;2) (use commas instead of semicolons if your locale's list separator is a comma).
' LookupValuesRange is an Excel Range of cells
Public Function MultiCellExtract(LookupValuesRange As Range, LookupRange As Range, ColumnNumber As Integer) As String
Dim cellValue As Range
Dim i As Long
Dim Result As String
For Each cellValue In LookupValuesRange
For i = 1 To LookupRange.Columns(1).Cells.Count
If LookupRange.Cells(i, 1) = cellValue.Value Then
Result = Result & " " & LookupRange.Cells(i, ColumnNumber) & ","
End If
Next i
Next cellValue
If Len(Result) > 0 Then MultiCellExtract = Left(Result, Len(Result) - 1) ' guard: Result is empty when nothing matched
End Function
Let me know if it helps or we can adjust it.
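On the List<> part of the question: VBA has no generics, but the built-in Collection object covers the add-and-iterate usage shown in the C# snippet. The following is a minimal sketch, not part of the answer above; the procedure name ListLikeCollectionDemo is made up for illustration.

```vba
' Hypothetical demo (name is illustrative): a Collection used like the
' List<string> from the question.
Public Sub ListLikeCollectionDemo()
    Dim ids As Collection
    Set ids = New Collection
    ids.Add "1001"
    ids.Add "1002"

    ' Iterate the ids, as one would with foreach in C#
    Dim id As Variant
    For Each id In ids
        Debug.Print id
    Next id
    Debug.Print ids.Count & " ids in the list"
End Sub
```

Since MultiCellExtract takes its lookup values as a Range, a Collection like this is only needed when the ids come from code rather than from worksheet cells.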

Get value between multiple parenthesis with excel/airtable formula

I'm trying to get all the content between multiple pairs of parentheses, delimited by commas. So for example
A1 contains
thisfile.jpg (/path/to/file.jpg), thisfile2.jpg (/path/to/file2.jpg)
and B1 should look like
/path/to/file.jpg, /path/to/file2.jpg
If it's just one entry I can get what I need with this:
MID(A1,FIND("(",A1)+1,FIND(")",A1)-FIND("(",A1)-1)
But that only returns the first one; I need it for each pair of parentheses. The number of parentheses in each row will vary.
I am sure there are better solutions out there with formulas only. Yet, I cannot help you there. But the following UDF is surely also a feasible solution. Just copy this code into an empty module:
Option Explicit
Public Function GetPaths(strTMP As String) As String
Dim i As Long
Dim varArray As Variant
varArray = Split(strTMP, "(")
For i = LBound(varArray) To UBound(varArray)
If InStr(1, varArray(i), ")") > 0 Then
GetPaths = GetPaths & ", " & Mid(varArray(i), 1, InStr(1, varArray(i), ")") - 1)
End If
Next i
GetPaths = Mid(GetPaths, 3)
End Function
Afterwards, you can use this formula in column B as follows: =GetPaths(A1).

How to extract text within a string of text

I have a simple problem that I'm hoping to resolve without using VBA but if that's the only way it can be solved, so be it.
I have a file with multiple rows (all one column). Each row has data that looks something like this:
1 7.82E-13 >gi|297848936|ref|XP_00| 4-hydroxide gi|297338191|gb|23343|randomrandom
2 5.09E-09 >gi|168010496|ref|xp_00| 2-pyruvate
etc...
What I want is some way to extract the string of numbers that begin with "gi|" and end with a "|". For some rows this might mean as many as 5 gi numbers, for others it'll just be one.
What I would hope the output would look like would be something like:
297848936,297338191
168010496
etc...
Here is a very flexible VBA answer using the regex object. The function extracts every single sub-group match it finds (the parts of the pattern inside parentheses), separated by whatever string you want (the default is ", "). You can find info on regular expressions here: http://www.regular-expressions.info/
You would call it like this, assuming that first string is in A1:
=RegexExtract(A1,"gi[|](\d+)[|]")
Since this looks for all occurrences of "gi|" followed by a series of digits and then another "|", for the first line in your question this would give you this result:
297848936, 297338191
Just run this down the column and you're all done!
Function RegexExtract(ByVal text As String, _
ByVal extract_what As String, _
Optional separator As String = ", ") As String
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
Dim i As Long, j As Long
Dim result As String
RE.pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)
For i = 0 To allMatches.count - 1
For j = 0 To allMatches.Item(i).submatches.count - 1
result = result & (separator & allMatches.Item(i).submatches.Item(j))
Next
Next
If Len(result) <> 0 Then
result = Right$(result, Len(result) - Len(separator))
End If
RegexExtract = result
End Function
Here it is (assuming data is in column A)
=VALUE(LEFT(RIGHT(A1,LEN(A1) - FIND("gi|",A1) - 2),
FIND("|",RIGHT(A1,LEN(A1) - FIND("gi|",A1) - 2)) -1 ))
Not the nicest formula, but it will work to extract the number.
I just noticed that you have two values per row, with the output separated by commas. You will need to check whether there is a second match, a third match, etc. to make it work for multiple numbers per cell.
In reference to your exact sample (assuming 2 values maximum per cell) the following code will work:
=IF(ISNUMBER(FIND("gi|",$A1,FIND("gi|", $A1)+1)),CONCATENATE(LEFT(RIGHT($A1,LEN($A1)
- FIND("gi|",$A1) - 2),FIND("|",RIGHT($A1,LEN($A1) - FIND("gi|",$A1) - 2)) -1 ),
", ",LEFT(RIGHT($A1,LEN($A1) - FIND("gi|",$A1,FIND("gi|", $A1)+1)
- 2),FIND("|",RIGHT($A1,LEN($A1) - FIND("gi|",$A1,FIND("gi|", $A1)+1) - 2))
-1 )),LEFT(RIGHT($A1,LEN($A1) - FIND("gi|",$A1) - 2),
FIND("|",RIGHT($A1,LEN($A1) - FIND("gi|",$A1) - 2)) -1 ))
How's that for ugly? A VBA solution may be better for you, but I'll leave this here for you.
To go up to 5 numbers, well, study the pattern and recurse manually in the formula. It will get long!
I'd probably split the data first on the | delimiter using the convert text to columns wizard.
In Excel 2007 that is on the Data tab, Data Tools group and then choose Text to Columns. Specify Other: and | as the delimiter.
From the sample data you posted it looks like after you do this the numbers will all be in the same columns so you could then just delete the columns you don't want.
As the others presented solutions without VBA, I'll present one that does use it. Whether to use it or not is your call.
I just saw that @Issun presented a solution with regex, very nice! Either way, I will present a 'modest' solution to the question, using only 'plain' VBA.
Option Explicit
Option Base 0
Sub findGi()
Dim oCell As Excel.Range
Set oCell = Sheets(1).Range("A1")
'Loops through every row until empty cell
While Not oCell.Value = ""
oCell.Offset(0, 1).Value2 = GetGi(oCell.Value)
Set oCell = oCell.Offset(1, 0)
Wend
End Sub
Private Function GetGi(ByVal sValue As String) As String
Dim sResult As String
Dim vArray As Variant
Dim vItem As Variant
Dim iCount As Integer
vArray = Split(sValue, "|")
iCount = 0
'Loops through the array...
For Each vItem In vArray
'Searches for the 'Gi' factor...
If vItem Like "*gi" And UBound(vArray) > iCount + 1 Then
'Concatenates the results...
sResult = sResult & vArray(iCount + 1) & ","
End If
iCount = iCount + 1
Next vItem
'And removes trail comma
If Len(sResult) > 0 Then
sResult = Left(sResult, Len(sResult) - 1)
End If
GetGi = sResult
End Function
Open your Excel file in Google Sheets and use a regular expression with REGEXEXTRACT.
Sample Usage
=REGEXEXTRACT("My favorite number is 241, but my friend's is 17", "\d+")
Tip: REGEXEXTRACT will return 241 in this example because it returns the first matching case.
In your case
=REGEXEXTRACT(A1,"gi[|](\d+)[|]")
