How to split the content of a cell in Excel

I am wondering how to take out the date part from the content and split both the code and date to separate columns. I will show you guys an example
Column A
Orient / 21/Dec / 30-12-2020
TechSol/8 / 1-1-2021
Orient / 12/Jan / 1-10-2021
AE-003 / 13-1-2021
I want to get the results like this:
B column          | C column
Orient / 21/Dec   | 30-12-2020
TechSol/8         | 1-1-2021
Orient / 12/OCT   | 1-10-2021
AE-003            | 13-1-2021
The combined cell always has the form Code / Date; that is, the code is always separated from the date by <space> slash <space>. I am unable to figure out a way to separate them. When I use Text to Columns with / as the delimiter, the split goes wrong because slashes are also present in the code. The fixed-width option doesn't work for me either, as the entries all have different widths. Using a =RIGHT formula is not working for me because the date does not have a fixed length: for example, 10 October is written as dd-mm-yyyy, but a single-digit month or day is written as d-m-yyyy.
I hope you all understood my issue. I need a formula to split these into different columns.

Please, try the next function:
Function SplitTEXT(x As String) As Variant
Dim arr, sec As String
arr = Split(x, "/ "): sec = arr(UBound(arr)) 'split and memorize the last array element (date)
arr(UBound(arr)) = "###$" & arr(UBound(arr)) 'add an unusual string to the last array element
'so that it can be filtered out easily in the next line
'Create an array from the joined remaining elements and the memorized last element (date):
SplitTEXT = Array(Join(Filter(arr, arr(UBound(arr)), False), "/ "), sec)
End Function
It can be tested for all your example strings in the next way:
Sub testSplitTEXT()
Dim x As String, arr
x = "Orient / 21/Dec / 30-12-2020"
'x = "TechSol/8 / 1-1-2021"
'x = "Orient / 12/Jan / 1-10-2021"
'x = "AE-003 / 13-1-2021"
arr = SplitTEXT(x)
Debug.Print arr(0), arr(1)
Range("B1:C1").value = arr
End Sub
To test the other example strings, uncomment the corresponding x = ... line.
Or use the following procedure to iterate over the values in column A and write the split results to columns B:C, as you requested:
Sub testSplitTIteration()
Dim i As Long, sh As Worksheet, lastR As Long
Set sh = ActiveSheet
lastR = sh.Range("A" & sh.rows.count).End(xlUp).row
For i = 2 To lastR
sh.Range("B" & i & ":C" & i).value = SplitTEXT(sh.Range("A" & i).value)
Next
End Sub

Given the examples you show:
Col B: Return up to the last / in the string
Col C: Return all after the last <space> in the string
B1: =LEFT(A1,FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))))-1)
C1: =TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99))

Split by the Last Occurrence
Option Explicit
Sub splitByLastOccurrence()
Const sName As String = "Sheet1"
Const sFirst As String = "A1"
Const dName As String = "Sheet1"
Const dFirst As String = "B1"
Const Delimiter As String = " / "
Dim wb As Workbook: Set wb = ThisWorkbook ' workbook containing this code
' Attempt to define (one-column) Source Range.
Dim rg As Range
Dim isRangeDefined As Boolean
With wb.Worksheets(sName).Range(sFirst)
Set rg = .Resize(.Worksheet.Rows.Count - .Row + 1) _
.Find("*", , xlFormulas, , , xlPrevious)
If Not rg Is Nothing Then
Set rg = .Resize(rg.Row - .Row + 1)
isRangeDefined = True
End If
End With
If isRangeDefined Then
' Write (one-column) Source Range to (one-column) Data Array.
Dim rCount As Long: rCount = rg.Rows.Count
Dim Data As Variant
If rCount = 1 Then
ReDim Data(1 To 1, 1 To 1): Data(1, 1) = rg.Value
Else
Data = rg.Value
End If
ReDim Preserve Data(1 To rCount, 1 To 2) ' increase by one column
Dim i As Long
Dim Pos As Long
Dim cString As String
' Write result to (two-column) Data Array.
For i = 1 To rCount
If Not IsError(Data(i, 1)) Then
cString = Data(i, 1)
Pos = InStrRev(cString, Delimiter)
If Pos > 0 Then
Data(i, 1) = Left(cString, Pos - 1)
Data(i, 2) = Right(cString, _
Len(cString) - Pos - Len(Delimiter) + 1)
End If
End If
Next i
' Write values from (two-column) Data Array
' to (two-column) Destination Range.
With wb.Worksheets(dName).Range(dFirst).Resize(, 2)
.Resize(rCount).Value = Data
.Resize(.Worksheet.Rows.Count - .Row - rCount + 1) _
.Offset(i - 1).ClearContents
End With
'Else
' No range.
End If
End Sub

Tiny variant using ReDim
For the sake of the art, I demonstrate a tiny variant of @FaneDuru's valid answer (it can be called the same way).
This approach needs the following steps:
split the string passed as argument thus resulting in an array a with up to three elements,
remember the last element (identified via the Ubound() function) and assign it to b,
redimension array a via ReDim Preserve thus removing the last element (instead of a negative filtering),
return a function result as array comprising the joined elements of array a as well as the remembered element b.
Function SplitText(s As String) As Variant
'[0]split string
Dim a, b, ubnd As Long
a = Split(s, "/ "): ubnd = UBound(a)
b = a(ubnd)
'[1]redimension array a
ReDim Preserve a(IIf(ubnd = 1, 0, 1))
'[2]return results
SplitText = Array(Join(a, "/ "), b) 'join with the same "/ " delimiter used in Split
End Function

I have found the answer to my problem. All I needed was a reverse search to find the last / so that I could extract the variable-length date, and then to substitute that date out of the first cell's text to remove it.
=IF(ISERROR(FIND(" / ",A1)),A1,RIGHT(A1,LEN(A1)-FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))))
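The same reverse search can also be wrapped in a small user-defined function. The following is only a sketch: the name SplitAtLast and its parameters are made up for illustration, and it assumes the " / " delimiter described in the question.

```vba
' Hypothetical helper (not from the answers above):
' part = 1 returns the text before the last delimiter (the code),
' any other value returns the text after it (the date).
Function SplitAtLast(ByVal s As String, ByVal part As Long, _
                     Optional ByVal delim As String = " / ") As String
    Dim pos As Long
    pos = InStrRev(s, delim)            ' position of the last delimiter, 0 if absent
    If pos = 0 Then
        ' no delimiter: treat the whole text as the code, leave the date empty
        If part = 1 Then SplitAtLast = s
    ElseIf part = 1 Then
        SplitAtLast = Left$(s, pos - 1)
    Else
        SplitAtLast = Mid$(s, pos + Len(delim))
    End If
End Function
```

Placed in a standard module, it could be called from the sheet as =SplitAtLast(A1,1) in column B and =SplitAtLast(A1,2) in column C.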

Related

Get Outer Bounding Range of Union with Multiple Areas

Looked high and low, and I haven't found anyone who has talked about this:
I have 2 or more ranges that have been "Unioned" in VBA (so rngUnion.Areas.Count >= 2) and the area ranges are partially contiguous (e.g. rngUnion.Areas(1).address = "A1:Y75", rngUnion.Areas(2).address = "A76:U123", etc.).
What is the simple/efficient way to get the outer bounding range object of the combine areas within rngUnion? I have code below that does this but it seems super kludgy and dumb - I am sure that there is a better way.
Note: I am assuming that there could be other used cells around these areas that are not with the union so I am extremely hesitant to use .CurrentRegion, .UsedRange, or .End(xlUp).Row type methods that are all being suggested for working with ranges.
Sub SomeObfuscatedMethodForGettingAUnionOfPartiallyContiguousAreas()
Dim rng1 As Range: Set rng1 = Range("A1:Y75")
Dim rng2 As Range: Set rng2 = Range("A76:U123")
Dim rngUnion As Range, rngComplete As Range
Set rngUnion = Union(rng1, rng2)
Set rngComplete = GetOuterBoundingRange(rngUnion)
Debug.Print rngComplete.Address 'prints "A1:Y123"
End Sub
Function GetOuterBoundingRange(rngUnion As Range) As Range
Dim minRow As Long: minRow = 2147483647
Dim minCol As Long: minCol = 2147483647
Dim maxRow As Long: maxRow = 0
Dim maxCol As Long: maxCol = 0
Dim minRowTemp As Long
Dim minColTemp As Long
Dim maxRowTemp As Long
Dim maxColTemp As Long
Dim area As Range
For Each area In rngUnion.Areas
minRowTemp = area.Row
maxRowTemp = minRowTemp + area.Rows.Count - 1
minColTemp = area.Column
maxColTemp = minColTemp + area.Columns.Count - 1
If minRowTemp < minRow Then minRow = minRowTemp
If minColTemp < minCol Then minCol = minColTemp
If maxRowTemp > maxRow Then maxRow = maxRowTemp
If maxColTemp > maxCol Then maxCol = maxColTemp
Next area
With rngUnion.parent
Set GetOuterBoundingRange = .Range(.Cells(minRow, minCol), .Cells(maxRow, maxCol))
End With
End Function
As far as I know, there is no built-in function to do so. I don't think your function is that clumsy; in any case you will need to loop over all areas and find the min and max row and column.
My attempt is a little bit shorter: it collects the numbers into arrays and uses the Min and Max functions, but basically it does the same.
Function getR(r As Range) As Range
ReDim minRow(1 To r.Areas.Count) As Long
ReDim maxRow(1 To r.Areas.Count) As Long
ReDim minCol(1 To r.Areas.Count) As Long
ReDim maxCol(1 To r.Areas.Count) As Long
Dim i As Long
For i = 1 To r.Areas.Count
minRow(i) = r.Areas(i).Row
maxRow(i) = r.Areas(i).Row + r.Areas(i).Rows.Count
minCol(i) = r.Areas(i).Column
maxCol(i) = r.Areas(i).Column + r.Areas(i).Columns.Count
Next
With r.Parent
Set getR = .Range(.Cells(WorksheetFunction.Min(minRow), WorksheetFunction.Min(minCol)), _
.Cells(WorksheetFunction.Max(maxRow) - 1, WorksheetFunction.Max(maxCol) - 1))
End With
End Function
This function uses the Application.Range property (Excel) to create the Range Around the Union Range.
Function UnionRange_ƒRangeAround_Set(rUnion As Range) As Range
Dim rOutput As Range, b As Byte
With rUnion
Set rOutput = .Areas(1)
For b = 2 To .Areas.Count
Set rOutput = Range(rOutput, .Areas(b))
Next
End With
Set UnionRange_ƒRangeAround_Set = rOutput
End Function
Since I brought it up, here is a solution which uses regular expressions. Note that for it to work you need to set a reference to "Microsoft VBScript Regular Expressions 5.5". I pulled all the numbers out of the R1C1 address and used the fact that row numbers and column numbers alternate, so it would fail if the range in question involved row-only or column-only references (e.g., R3:R4 would break it).
Function getOuterBoundingRange(rngUnion As Range) As Range
Dim regEx As New RegExp
Dim m As Match, oMat As MatchCollection
Dim rowsArr() As Variant
Dim colsArr() As Variant
With regEx
.Global = True
.Pattern = "\d+"
End With
Set oMat = regEx.Execute(rngUnion.Address(, , xlR1C1))
ReDim rowsArr(0 To oMat.Count / 2 - 1)
ReDim colsArr(0 To oMat.Count / 2 - 1)
Dim i As Long: i = 0
For Each m In oMat
If (i / 2) = Int(i / 2) Then
rowsArr(i / 2) = CLng(m.Value)
Else
colsArr(Int(i / 2)) = CLng(m.Value)
End If
i = i + 1
Next m
With rngUnion.Parent
Set getOuterBoundingRange = .Range(.Cells(WorksheetFunction.Min(rowsArr), WorksheetFunction.Min(colsArr)), _
.Cells(WorksheetFunction.Max(rowsArr), WorksheetFunction.Max(colsArr)))
End With
End Function
Alternative via tricky FilterXML() - //Late Edit as of 2021-11-14
Instead of looping through all areas cell by cell or applying regEx,
I demonstrate how to resolve OP's question alternatively via FilterXML().
I extended @Professor Pantsless's clever idea of using the R1C1 address of a range Union,
but parsed the address into two parts: the first with entire row indices, and the second with entire column indices.
This allows a minimum/maximum filtering without loops, executed by XPath expressions via FilterXML() (~> see help function getBoundaries).
Function getR(r As Range) As Range
'a) get Boundaries
Dim rc() As Long: rc = getBoundaries(r)
'b) get entire range
With r.Parent
Set getR = .Range(.Cells(rc(1), rc(2)), _
.Cells(rc(3), rc(4)))
End With
End Function
Help function getBoundaries()
Includes the main logic using FilterXML() in three steps:
a) define XPath expressions to find minimal/maximal row/column indices.
b) build a well-formed XML content string by tokenizing the Union range address (R1C1 mode makes it possible to get numeric values) - uses a further help function getContent().
c) apply FilterXML() to the well-formed XML content with the XPath expressions, returning the results as a 4-element array with the outer range boundaries.
Function getBoundaries(r As Range) As Long()
'Purp.: return boundaries of range union
'Site: https://stackoverflow.com/questions/69572123/get-outer-bounding-range-of-union-with-multiple-areas
'Date: 2021-10-15
'Auth: [T.M](https://stackoverflow.com/users/6460297/t-m)
'a) define XPath patterns
Const min As String = "//i[not(../i < .)][1]"
Const max As String = "//i[not(../i > .)][1]"
'b)get wellformed xml content (rows|columns)
Dim content As String
'c1)get Row boundaries
content = getContent(r, True) ' help function getContent()
Dim tmp(1 To 4) As Long
tmp(1) = Application.FilterXML(content, min)
tmp(3) = Application.FilterXML(content, max)
'c2)get Column boundaries
content = getContent(r, False) ' help function getContent()
tmp(2) = Application.FilterXML(content, min)
tmp(4) = Application.FilterXML(content, max)
'd) return boundaries array
getBoundaries = tmp
End Function
Help function getContent() (called by above function in section b))
Function getContent(r As Range, ExtractRows As Boolean) As String
'Purp.: get wellformed XML content string
'Meth.: tokenize R1C1-range address into html-like tags
Dim tmp As String
If ExtractRows Then ' extract row numbers
tmp = r.EntireRow.Address(ReferenceStyle:=xlR1C1)
getContent = "<rc><i>" & Replace(Replace(Replace(tmp, "R", ""), ",", ":"), ":", "</i><i>") & "</i></rc>"
Else ' extract column numbers
tmp = r.EntireColumn.Address(ReferenceStyle:=xlR1C1)
getContent = "<rc><i>" & Replace(Replace(Replace(tmp, "C", ""), ",", ":"), ":", "</i><i>") & "</i></rc>"
End If
End Function
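To see what the helper functions produce, a quick check along the following lines may help. This is only a sketch: the procedure name DemoGetBoundaries is made up, the two ranges are the ones from the question, and it assumes that getR, getBoundaries and getContent from this answer sit in the same module (and that no other function named getR is present there).

```vba
Sub DemoGetBoundaries()
    ' Example union taken from the question; both areas are on the active sheet
    Dim rngUnion As Range
    Set rngUnion = Union(ActiveSheet.Range("A1:Y75"), ActiveSheet.Range("A76:U123"))

    ' XML strings handed to FilterXML(): one <i> node per row/column index
    Debug.Print getContent(rngUnion, True)    ' row indices
    Debug.Print getContent(rngUnion, False)   ' column indices

    ' Address of the outer bounding range built from the four boundaries
    Debug.Print getR(rngUnion).Address
End Sub
```

The output appears in the Immediate window (CTRL+G). Application.FilterXML relies on the FILTERXML worksheet function, so the approach is limited to Excel versions that provide it.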
Further links
I recommend reading @JvdV's excellent & nearly encyclopaedic post Extract substrings from string using FilterXML().

VBA: Copy values from 2 or more columns into one column with a corresponding row

I am quite new to VBA and would like to do the following but not sure how:
First, I delimit data separated by commas. (this one is fine as I can do it with the record macro option as well)
The problem is that I would also like to transpose the information from each row to a column, with the rows being one under the other. However, I would also like to add the corresponding row to the data before it was delimited. Here is an example:
Example
Unpivot 'Comma' Separated
Sample Data
Country;Fruits
France;Apple,Oranges
Germany;Oranges,Bananas
UK;Grapes,Lemons
Sweden;Mandarines,Strawberries,Bananas, Apples
Create (OP)
=IF(A1="","",A1&";"&B1)
Copy (CTRL+C) the range and paste (CTRL+V) it into NotePad.
Select All and Copy (CTRL+A, CTRL+C)
Paste here (CTRL+V).
Select the whole text and either click the code sample icon ({}) or use CTRL+K.
Use (User)
Select the text and copy (CTRL+C),
Right-click the first cell (A1) and Paste (Match Destination Formatting),
Data > TextToColumns,
Next,
Delimiter: check Semicolon
Finish
Program
Usage/Features
Caution: If you use the same worksheets and the same first cell addresses, you will be overwriting.
Note that there is no Undo.
Adjust the values of the four constants.
Open the Immediate window (CTRL+G) to see the range addresses at the various stages.
Application.Trim will cover any redundant spaces like the one in Swedish apples.
The Code
Option Explicit
Sub unpivotCommaSeparated()
Const sName As String = "Sheet1"
Const sFirst As String = "A1"
Const dName As String = "Sheet1"
Const dFirst As String = "D1"
Dim wb As Workbook: Set wb = ThisWorkbook ' workbook containing this code
Dim sData As Variant
Dim rg As Range
Dim isDataInArray As Boolean
With wb.Worksheets(sName).Range(sFirst)
Debug.Print "Source First Cell: " & .Address(0, 0)
Set rg = .Resize(.Worksheet.Rows.Count - .Row + 1) _
.Find("*", , xlFormulas, , , xlPrevious)
If Not rg Is Nothing Then
Debug.Print "Source Last Cell in First Column: " & rg.Address(0, 0)
Set rg = .Resize(rg.Row - .Row + 1, 2)
Debug.Print "Source Range: " & rg.Address(0, 0)
sData = rg.Value
isDataInArray = True
End If
End With
If isDataInArray Then
Dim srCount As Long: srCount = UBound(sData, 1)
Dim cCount As Long: cCount = UBound(sData, 2)
ReDim Preserve sData(1 To srCount, 1 To cCount + 1)
Dim drCount As Long: drCount = 1
Dim i As Long
For i = 2 To srCount
sData(i, 2) = Split(sData(i, 2), ",")
sData(i, 3) = UBound(sData(i, 2))
drCount = drCount + sData(i, 3) + 1
Next i
Dim dData As Variant: ReDim dData(1 To drCount, 1 To cCount)
Dim j As Long
For j = 1 To cCount
dData(1, j) = sData(1, j)
Next j
Dim k As Long: k = 1
For i = 2 To srCount
For j = 0 To sData(i, 3)
k = k + 1
dData(k, 1) = sData(i, 1)
dData(k, 2) = Application.Trim(sData(i, 2)(j))
Next j
Next i
With wb.Worksheets(dName).Range(dFirst).Resize(, cCount)
Debug.Print "Destination First Row Range: " & .Address(0, 0)
Set rg = .Resize(k)
Debug.Print "Destination Range: " & rg.Address(0, 0)
rg.Value = dData
Set rg = .Resize(.Worksheet.Rows.Count - .Row - k + 1).Offset(k)
Debug.Print "Clear Range: " & rg.Address(0, 0)
rg.ClearContents
End With
End If
End Sub
SOLVED!! (2 STEPS)
(A) QUICK-START GUIDE
Google-sheets (screenshots below) here (perma-link, self-same content / functions included below in any case!). Key for labels 1 & 2:
Label 1 (e.g. 'country' in original Q - font colour = blue in screenshots below)
Label 2 (e.g. 'fruit' in original Q - font colour = green in screenshots below)
(B) STEPS 1-2
(preface: may as well be entitled 'two-function' soln...)
STEP 1: FILTERXML as dynamic array applied to Label 1 in first instance (ref: 8 ways to split text by delimiter - J. MacDougall). Note: this is one of 2 methods that shall be considered for Step 1.
FilterXML screenshot
FilterXML function in the context of above screenshot:
=FILTERXML("<t><s>"&SUBSTITUTE(ARRAYTOTEXT(D6:D10),",","</s><s>")&"</s></t>","//s")
STEP 2. INDEX-ARRAY (map Label 1 values to corresponding Label 2 values after applying FILTERXML step)
Utilises new Excel 'array' feature which only requires populating first cell - array produced this way will then 'fill down' into adjacent cells as req.
Note: 'old' array functionality can still be used / adopted by typing function into first cell, then pressing 'ctrl'+'shift'+'enter'.
Doing so may restrict ability to use hash references for arrays as I've done here
Index can often be seen with match component / function. In this case, match lookup value utilises '*' wildcards (this feature makes match functions exceptionally versatile)
Index-Array screenshot
Index-Array function in the context of above screenshot:
=INDEX($C$6:$C$10,MATCH("*"&D16#&"*",$D$6:$D$10,0))
(C) OTHER VIABLE TECHNIQUES
Viable in context of other priorities, budget, etc.
i) VB Code
Effectively based upon self-same formulation already provided - hence the 'sub' name... :)
Modify/adjust as you deem fit/as req.
Sub Boring_Split_Code():
ActiveCell.Formula2R1C1 = _
"=FILTERXML(""<t><s>""&SUBSTITUTE(ARRAYTOTEXT(R[-12]C:R[-8]C),"","",""</s><s>"")&""</s></t>"",""//s"")"
Range("C22").Select
'Application.CutCopyMode = False
ActiveCell.Formula2R1C1 = _
"=INDEX(R10C3:R14C3,MATCH(""*""&RC[1]#&""*"",R10C4:R14C4,0))"
End Sub
ii) VB 'mechanical-unpivot' method
Ref: O. Cronquist
Imagine this is what you were after when Q first posted
However, only features after my 'boring' sub above given I strongly recommend using a more elegant solution (e.g. FilterXML / Index-Array)
Devil's advocate: this approach may still prove preferable depending on use-case / objective
Caveats abound - may require 'tweaking' or post-execution manipulation to address 'blank' cells
Recommend using the UNIQUE() Excel function to address this if/as req. - see here for further detail re: the 'unique' function
Sub Unpivot()
'
' Unpivot Macro
' Creates pivot flat file source format from table with rows and columns
Dim rng As Range
Dim Ws As Worksheet
On Error Resume Next
Set rng = Application.InputBox(Prompt:="Select a range to normalize data" _
, Title:="Select a range", Default:=ActiveCell.address, Type:=8)
On Error GoTo 0
If rng Is Nothing Then
Else
Application.ScreenUpdating = False
Set Ws = Sheets.Add
i = 0
For r = 1 To rng.Rows.Count - 1
For c = 1 To rng.Columns.Count - 1
Ws.Range("A1").Offset(i, 0) = rng.Offset(0, c).Value
Ws.Range("A1").Offset(i, 1) = rng.Offset(r, 0).Value
Ws.Range("A1").Offset(i, 2) = rng.Offset(r, c).Value
i = i + 1
Next c
Next r
Ws.Range("A:C").EntireColumn.AutoFit
Application.ScreenUpdating = True
End If
End Sub
iii) Popular Mid / Match variants...
Less elegant alternate to 'funky' FilterXML method
(index-array method/equivalent still required)
Mid-Match (Step 1, Method 2) screenshot
Mid-Match functions
In the context of recent above screenshot:
= SUBSTITUTE(ARRAYTOTEXT(D6:D10)," ","")
Primes/prepares raw data for mid/substitute/search applications...
= MID(E23,1,SEARCH(",",E23)-1)
Initiates recursive substitute method, defined as follows (data rows 2+):
= MID( SUBSTITUTE(E$23, CONCAT(E$24:E24 & "," ), "" ), 1, IFERROR( SEARCH( ",", SUBSTITUTE(E$23, CONCAT( E$24:E24 & "," ),"" ) ) - 1, LEN( $E$23 ) ) )

VBA Code to Concatenate strings from column if first integers, or first and third integers, in another column match

Alright, this is a very specific question. I have an excel macro written that takes a web URL, delimits it, transposes it, and then adds adjacent columns that describe the information in the originally transposed columns. Now, I need to add something to my macro that will loop through and check if the first character of one cell matches one of the first 4 characters of another cell. If it does, I need to concatenate strings from the descriptive columns to new cells. I'll illustrate this below:
3,435,201,0.5,%22type%25202%2520diabetes%22,0 | Node | type 2 diabetes
4,165,97,0.5,%22diet%22,0 | Node | diet
5,149,248,0.5,%22lack%2520of%2520exercise%22,2 | Node | lack of exercise
6,289,329,0.5,%22genetics%22,3 | Node | genetics
7,300,71,0.5,%22blood%2520pressure%2520%22,5 | Node | blood pressure
7,3,-7,1,0 | Arrow | +
4,3,-21,1,0 | Arrow | +
5,3,-22,1,0 | Arrow | +
6,3,-34,1,0 | Arrow | +
,7%5D | Tail |
I added color to make the concept of the problem more easily visualized. In row one of the first column, we see a red 3 that corresponds to 'type 2 diabetes'. In the fifth row of the first column, we see a blue 7 that corresponds to 'blood pressure'. These are both node objects, as the adjacent column signifies. In the sixth cell of the first column we see a blue 7 and a red 3. This indicates that an arrow (also signified by adjacent column) is connecting blood pressure to diabetes. In the next column over, we see an orange plus sign, which indicates this is a positive relationship.
The goal is to populate the next column over with "blood pressure + type 2 diabetes", as I demonstrated in the image. So, I need some code to check the first characters in each node cell, and then compare them to the first 4 characters of each arrow cell. When an arrow that matches two of the nodes is found, I need the code to populate the row next to the + signs with a concatenated string made up of the names of the nodes pertaining to that arrow, with the + sign between them (it could also be a minus sign, but one isn't present in this example). Any pointers? I can't wrap my head around this.
Here is the code of my current macro:
Sub Delimit_Transpose()
Cells.Replace What:="],[", Replacement:="#", LookAt:=xlPart, SearchOrder _
:=xlByRows, MatchCase:=False, SearchFormat:=False, ReplaceFormat:=False
ActiveCell.FormulaR1C1 = "=RIGHT(R[-1]C,LEN(R[-1]C)-36)"
Dim i As Long, strTxt As String
Dim startP As Range
Dim xRg As Range, yRg As Range
On Error Resume Next
Set xRg = Application.InputBox _
(Prompt:="Range Selection...", _
Title:="Delimit Transpose", Type:=8)
i = 1
Application.ScreenUpdating = False
For Each yRg In xRg
If i = 1 Then
strTxt = yRg.Text
i = 2
Else
strTxt = strTxt & "," & yRg.Text
End If
Next
Application.ScreenUpdating = True
Set startP = Application.InputBox _
(Prompt:="Paste Range...", _
Title:="Delimit Transpose", Type:=8)
ary = Split(strTxt, "#")
i = 1
Application.ScreenUpdating = False
For Each a In ary
startP(i, 1).Value = Replace(Replace(a, "[", ""), "]", "")
i = i + 1
Next a
i = 1
For Each a In ary
If Len(a) > 13 Then
startP.Offset(i - 1, 1).Value = "Node"
ElseIf Len(a) < 13 And Len(a) > 6 Then
startP.Offset(i - 1, 1).Value = "Arrow"
Else
startP.Offset(i - 1, 1).Value = "Tail"
End If
i = i + 1
Next a
Dim openPos As Integer
Dim closePos As Integer
Dim midBit As String
i = 1
n = 5
For Each a In ary
openPos = InStr(a, ",%22")
On Error Resume Next
closePos = InStr(a, "%22,")
On Error Resume Next
midBit = Mid(a, openPos + 1, closePos - openPos - 1)
On Error Resume Next
If openPos <> 0 And Len(midBit) > 0 Then
startP.Offset(i - 1, 2).Value = Replace(Replace(midBit, "%22", ""), "%2520", " ")
ElseIf Len(a) < 13 And InStr(a, "-") = 4 Then
startP.Offset(i - 1, 2).Value = "'-"
ElseIf Len(a) < 7 Then
startP.Offset(i - 1, 2).Value = " "
Else
startP.Offset(i - 1, 2).Value = "+"
End If
i = i + 1
n = n + 1
Next a
Application.ScreenUpdating = True
End Sub
This is my approach.
There's room for a lot of improvements, but is a rough code that should get you started.
Read the code's comments and adapt it to fit your needs.
EDIT: I updated the code to match the sample worksheet you uploaded, to build the first column range dynamically, and to validate whether commas appear in the first column cell so that no error is raised.
As I said in the comments, it's easier to debug if you call one procedure from the other, instead of merging them.
Code:
Option Explicit
Public Sub StoreConcatenate()
' Basic error handling
On Error GoTo CleanFail
' Define general parameters
Dim targetSheetName As String
targetSheetName = "Test space" ' Sheet holding the data
Dim firstColumnLetter As String
firstColumnLetter = "C" ' First column holding the numbers
Dim firstColumnStartRow As Long
firstColumnStartRow = 7
' With these three parameters we'll build the range address holding the first column dynamically
' Set reference to worksheet
Dim targetSheet As Worksheet
Set targetSheet = ThisWorkbook.Worksheets(targetSheetName)
' Find last row in column (Modify on what column)
Dim firstColumnlastRow As Long
firstColumnlastRow = targetSheet.Cells(targetSheet.Rows.Count, firstColumnLetter).End(xlUp).Row
' Build range of first column dynamically
Dim firstColumnRange As Range
Set firstColumnRange = targetSheet.Range(firstColumnLetter & firstColumnStartRow & ":" & firstColumnLetter & firstColumnlastRow)
' Loop through first column range cells
Dim valueCell As Range
For Each valueCell In firstColumnRange
' Check if cell contains "," in the second position in string
If InStr(valueCell.Value, ",") = 2 Then
' Store first digit of cell before ","
Dim firstDigit As Integer
firstDigit = Split(valueCell.Value, ",")(0)
' Check if cell contains "," in the fourth position in string
If InStr(3, valueCell.Value, ",") = 4 Then
' Store second digit of cell after ","
Dim secondDigit As Integer
secondDigit = Split(valueCell.Value, ",")(1)
End If
' Store second column type
Dim secondColumnType As String
secondColumnType = valueCell.Offset(, 1).Value
' Store third column value
Dim thirdColumnValue As String
thirdColumnValue = valueCell.Offset(, 2).Value
' Store nodes values (first digit and second column type)
Select Case secondColumnType
Case "Node"
Dim nodeValues() As Variant
Dim nodeCounter As Long
ReDim Preserve nodeValues(nodeCounter)
nodeValues(nodeCounter) = Array(firstDigit, thirdColumnValue)
nodeCounter = nodeCounter + 1
Case "Arrow"
Dim matchedNodeFirstValue As String
Dim matchedNodeSecondValue As String
matchedNodeFirstValue = IsInArrayReturnItem(firstDigit, nodeValues)(1)
matchedNodeSecondValue = IsInArrayReturnItem(secondDigit, nodeValues)(1)
If matchedNodeFirstValue <> vbNullString And matchedNodeSecondValue <> vbNullString Then
valueCell.Offset(, 3).Value = matchedNodeFirstValue & Space(1) & thirdColumnValue & Space(1) & matchedNodeSecondValue
End If
End Select
End If
Next valueCell
CleanExit:
Exit Sub
CleanFail:
Debug.Print "Something went wrong: " & Err.Description
Resume CleanExit
End Sub
' Credits: https://stackoverflow.com/a/38268261/1521579
Public Function IsInArrayReturnItem(stringToBeFound As Integer, arr As Variant) As Variant
Dim i
For i = LBound(arr) To UBound(arr)
If arr(i)(0) = stringToBeFound Then
IsInArrayReturnItem = arr(i)
Exit Function
End If
Next i
IsInArrayReturnItem = Array(vbNullString, vbNullString)
End Function
Let me know if it works
It appears that you are concatenating the lookups based on the
first and second integers,
where the second column = "Arrow"
If that is the case, I suggest:
Read the data table into a VBA array for faster processing.
I am assuming your data is ordered as you show it, with all the Node entries at the start.
If that is not the case, then loop twice: once to find the Nodes, and a second time to concatenate the Arrow data.
Read the diagnoses into a dictionary for fast lookup.
If column 2 = "Arrow", then concatenate the lookups of the first and second integers.
Write back the data.
Note: As written, this will overwrite the original table, destroying any formulas that might be there. If needed, you could easily modify it to only overwrite the necessary area.
Note 2: Be sure to set a reference (under Tools/References) to Microsoft Scripting Runtime, or change the Dictionary declaration to late binding.
Regular Module
'set reference to Microsoft Scripting Runtime
Option Explicit
Sub Dx()
Dim WS As Worksheet
Dim rngData As Range, c As Range, vData As Variant
Dim dDx As Dictionary
Dim I As Long, sKey As String, dxKeys As Variant
'Get the data range
Set WS = ThisWorkbook.Worksheets("sheet1")
With WS
'assume table starts in A1 and is three columns wide
Set rngData = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=3)
'read into variant array for faster processing
vData = rngData
End With
'create dictionary for dx lookups
Set dDx = New Dictionary
For I = 2 To UBound(vData, 1)
Select Case vData(I, 2)
Case "Node"
sKey = Split(vData(I, 1), ",")(0) 'first comma-separated number
If dDx.Exists(sKey) Then
MsgBox "duplicate diagnostic key. Please correct the data"
Exit Sub
End If
dDx.Add Key:=sKey, Item:=vData(I, 3)
Case "Arrow"
dxKeys = Split(vData(I, 1), ",")
vData(I, 3) = dDx(dxKeys(0)) & " + " & dDx(dxKeys(1))
End Select
Next I
'reWrite the table
Application.ScreenUpdating = False
rngData = vData
End Sub
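If you prefer not to set the reference, the late-binding alternative mentioned in Note 2 could look roughly like this. It is a minimal sketch: the procedure name and the two sample entries are only for illustration, with the keys and names taken from the question's data.

```vba
Sub DxLateBoundSketch()
    ' Late binding: declare As Object and create the dictionary at run time,
    ' so no reference to Microsoft Scripting Runtime is needed
    Dim dDx As Object
    Set dDx = CreateObject("Scripting.Dictionary")

    ' Sample node entries (key = first number, item = description)
    dDx.Add "3", "type 2 diabetes"
    dDx.Add "7", "blood pressure"

    ' Look up both ends of the arrow "7,3,..." before concatenating
    If dDx.Exists("7") And dDx.Exists("3") Then
        Debug.Print dDx("7") & " + " & dDx("3")
    End If
End Sub
```

In the Dx procedure above, only the declaration (Dim dDx As Object) and the creation line (Set dDx = CreateObject("Scripting.Dictionary")) would need to change; positional arguments in the Add call are the safer choice with late binding.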

I need this matching method to skip over blank cells and not include them as a matched value

This code works almost perfectly. The problem is it includes blank cells in its "matched" results. What do I need to change to make this code ignore blank cells? Below I will include an example of what is going on.
Sub MarkMatches()
Const TopLeftCell As String = "A2" ' change to match where your data are
Dim Rng As Range ' data range
Dim FirstRow As Long, FirstClm As Long
Dim Data As Variant ' original data (2-D)
Dim Arr As Variant ' data rearranged (1-D)
Dim Tmp As Variant ' working variable
Dim R As Long, R1 As Long ' row counters
Dim C As Long ' column counter
Dim Count() As String ' match counter
With Range(TopLeftCell)
FirstRow = .Row
FirstClm = .Column
End With
C = Cells(FirstRow, Columns.Count).End(xlToLeft).Column
Set Rng = Range(Cells(FirstRow, FirstClm), _
Cells(Rows.Count, FirstClm).End(xlUp).Offset(0, C - FirstClm))
Data = Rng.Value
ReDim Arr(1 To UBound(Data))
For R = 1 To UBound(Data)
ReDim Tmp(1 To UBound(Data, 2))
For C = 1 To UBound(Data, 2)
Tmp(C) = Data(R, C)
Next C
Arr(R) = Tmp
Next R
ReDim Count(1 To UBound(Arr))
For R = 1 To UBound(Arr) - 1
For R1 = R + 1 To UBound(Arr)
Tmp = 0
For C = 1 To UBound(Arr(R))
If Not IsError(Application.Match(Arr(R)(C), Arr(R1), 0)) Then
Tmp = Tmp + 1
End If
Next C
If Tmp > 0 Then ' change to suit
Tmp = Format(Tmp, "(0)") & ", "
Count(R) = Count(R) & CStr(R1 + FirstRow - 1) & Tmp
Count(R1) = Count(R1) & CStr(R + FirstRow - 1) & Tmp
End If
Next R1
Next R
For R = 1 To UBound(Count)
If Len(Count(R)) Then Count(R) = Left(Count(R), Len(Count(R)) - 2)
Next R
' set the output column here (2 columns right of the last data column)
' to avoid including this column in the evaluation
' it must be blank before a re-run
Set Rng = Rng.Resize(, 1).Offset(0, UBound(Data, 2) + 1)
Rng.Value = Application.Transpose(Count)
End Sub
Thank you @Variatus for the code and help so far!
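For reference, the smallest change to the code above would probably be to test each value before it is passed to Match. The sketch below is only an illustration: the helper name CountNonBlankMatches is made up, and it assumes blank cells arrive in the arrays as Empty or as zero-length strings.

```vba
' Hypothetical helper: counts the non-blank values of rowA that also occur in rowB.
' rowA and rowB are 1-D Variant arrays such as Arr(R) and Arr(R1) in MarkMatches.
Function CountNonBlankMatches(rowA As Variant, rowB As Variant) As Long
    Dim C As Long
    For C = LBound(rowA) To UBound(rowA)
        If Not IsError(rowA(C)) Then            ' skip error values such as #N/A
            If Len(rowA(C)) > 0 Then            ' skip blanks (Empty or "")
                If Not IsError(Application.Match(rowA(C), rowB, 0)) Then
                    CountNonBlankMatches = CountNonBlankMatches + 1
                End If
            End If
        End If
    Next C
End Function
```

Inside MarkMatches, the inner For C loop could then be replaced by Tmp = CountNonBlankMatches(Arr(R), Arr(R1)).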
I tried to work with your original code, but honestly I became very confused. My example below will illustrate some practices that could help you (and those who may review your code later, including yourself!). So here's a list of comments:
Always use Option Explicit. Your code may already have this, but I'm listing it here for completeness' sake.
Create variable names that describe what data it holds. Your code does a little of this, but some of the variable names are difficult to fit into the logic flow. My idea in coding is always to try and write self-documenting code. That way, it's nearly always clear what the code is trying to accomplish. Then I'll use comment for code blocks where it might be a bit less clear. (Don't fall into the trap of prefixing variable names with a "type" or something; it's ultimately not worth it.)
A clear description of the problem always helps. This is true not only to get help on SO, but also for yourself. My final comment to your post above, asking about the problem description really simplified everything. This includes describing what you want your output to show.
As per the problem description, you need to identify each unique item and keep track of which rows that item appears on, so you can create a report later. A Dictionary is a perfect tool for this. Read up on how to use a Dictionary, but you should be able to follow what this block of code is doing (even without all the previous declarations):
For Each cell In dataArea.Cells
If Not IsEmpty(cell) Then
If items.Exists(cell.Value) Then
'--- add this row to the list
rowList = items(cell.Value) & "," & cell.Row
items(cell.Value) = rowList
Else
'--- first time adding this value
items.Add cell.Value, cell.Row
End If
End If
Next cell
It's easy to see how the logic of this code follows the description of the problem. After that, it's just a matter of running through each row in the data area and checking each value on that row to see if duplicates exist on any other row. The full example solution is below for you to study and adjust to fit your situation.
Option Explicit
Sub IdentifyMatches()
Dim ws As Worksheet
Set ws = Sheet1
Dim dataArea As Range
Set dataArea = ws.Range("A1:F6")
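'--- early binding: requires a reference to "Microsoft Scripting Runtime"
' (VBA editor: Tools > References)
Dim items As Dictionary
Set items = New Dictionary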
'--- build the data set of all unique items, and make a note
' of which row the item appears.
' KEY = cell value
' VALUE = CSV list of row numbers
Dim rowList As String
Dim cell As Range
For Each cell In dataArea.Cells
If Not IsEmpty(cell) Then
If items.Exists(cell.Value) Then
'--- add this row to the list
rowList = items(cell.Value) & "," & cell.Row
items(cell.Value) = rowList
Else
'--- first time adding this value
items.Add cell.Value, cell.Row
End If
End If
Next cell
'--- now work through the data, row by row and make the report
Dim report As String
Dim duplicateCount As Variant
ReDim duplicateCount(1 To dataArea.Rows.Count)
Dim dataRow As Range
For Each dataRow In dataArea.Rows
Erase duplicateCount
ReDim duplicateCount(1 To dataArea.Rows.Count)
Dim rowNumber As Variant
For Each cell In dataRow.Cells
If items.Exists(cell.Value) Then
rowList = items(cell.Value)
Dim rowNumbers As Variant
rowNumbers = Split(rowList, ",")
For Each rowNumber In rowNumbers
If rowNumber <> cell.Row Then
duplicateCount(rowNumber) = duplicateCount(rowNumber) + 1
End If
Next rowNumber
End If
Next cell
report = vbNullString
For rowNumber = 1 To UBound(duplicateCount)
If duplicateCount(rowNumber) > 0 Then
report = report & rowNumber & "(" & duplicateCount(rowNumber) & ")" & ", "
End If
Next rowNumber
'--- display the report in the next column at the end of the data area
If Len(report) > 0 Then
report = Left$(report, Len(report) - 2) 'removes the trailing comma and space
dataRow.Cells(1, dataRow.Columns.Count + 1).Value = report
End If
Next dataRow
End Sub
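If setting the Microsoft Scripting Runtime reference is not an option, the same lookup can probably be built with a late-bound dictionary. The sketch below is only an illustration: BuildRowIndex is a hypothetical helper name, not part of the solution above. It assumes the data has already been read into a 2-D Variant array (as Range.Value returns for a multi-cell range), so the row numbers it records are array rows rather than worksheet rows.

```vba
'--- Hypothetical helper (an assumption, not from the answer above):
'    maps each non-empty value in a 2-D array to a comma-separated
'    list of the array rows it appears on.
Function BuildRowIndex(ByVal data As Variant) As Object
    Dim items As Object
    Set items = CreateObject("Scripting.Dictionary")   'late bound, no reference needed

    Dim r As Long, c As Long
    For r = LBound(data, 1) To UBound(data, 1)
        For c = LBound(data, 2) To UBound(data, 2)
            If Not IsEmpty(data(r, c)) Then
                If items.Exists(data(r, c)) Then
                    '--- value seen before: append this row to its list
                    items(data(r, c)) = items(data(r, c)) & "," & r
                Else
                    '--- first time this value is seen
                    items.Add data(r, c), CStr(r)
                End If
            End If
        Next c
    Next r

    Set BuildRowIndex = items
End Function
```

It could be called with something like Set idx = BuildRowIndex(Sheet1.Range("A1:F6").Value). Reading the range into an array first also avoids touching each cell individually, which tends to matter on larger data areas.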

Excel VBA - Find a set of characters and set as string

I have a set of descriptions that contain ID numbers arranged into a column. For example:
Column A
This is a description with the ID number in it ID12345.
This is a description ID66666 with the ID number in it.
This is ID99999 a description with the ID number in it.
The ID numbers are always in the format "IDXXXXX". I'd like to somehow trim away all the text in each of these cells and leave just that ID number.
My thoughts:
Can this be somehow done by finding a string like "ID?????" and setting that to a variable, then replacing the contents of the cell with that variable? Or by erasing all characters in a cell except for "ID?????"?
Any help would be appreciated, thanks.
This code I wrote for you iterates through all items in Column A. It splits the words in each cell into an array. If a word is 7 or 8 characters long, it could potentially be an IDxxxxx (8 allows for one trailing character, such as a period). The code then checks whether the word starts with "ID" followed by 5 numeric characters. If it does, the contents of the cell are replaced with just the ID, dropping all the remaining text.
Sub ReplaceContentWithIDs()
Dim ws As Worksheet
Set ws = Sheets("Sheet1") ' or your sheet name
Dim rng As Range
Dim i&, lr&, j&
Dim arr
Dim str$
lr = ws.Range("A" & Rows.Count).End(xlUp).Row
' starting from 1 - if you have headers change to 2
For i = 1 To lr
Set rng = ws.Range("A" & i)
arr = Split(CStr(rng.Value), " ")
For j = LBound(arr) To UBound(arr)
str = arr(j)
If (Len(str) = 7) Or (Len(str) = 8) Then
If (StrComp(Left(str, 2), "ID", vbTextCompare) = 0) And _
IsNumeric(Right(Left(str, 7), 5)) Then
' found it
If Len(str) = 8 Then
rng.Value = Left(str, 7)
ElseIf Len(str) = 7 Then
rng.Value = str
End If
End If
End If
Next j
Set rng = Nothing
Next i
End Sub
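Since the question asks about matching a pattern like "ID?????" directly, a regular expression is another option. The following is a minimal sketch, assuming the ID is always "ID" followed by five digits and starts at a word boundary. ExtractId is a hypothetical function name, and the sketch relies on the VBScript.RegExp object being available (Windows).

```vba
'--- Hypothetical helper (an assumption, not part of the code above):
'    returns the first "ID" + 5 digits found in txt,
'    or an empty string when there is no match.
Function ExtractId(ByVal txt As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   'late bound, no reference needed
    re.Pattern = "\bID\d{5}"                   'word boundary, "ID", five digits
    re.IgnoreCase = True

    If re.Test(txt) Then
        ExtractId = re.Execute(txt)(0).Value   'first match only
    Else
        ExtractId = vbNullString
    End If
End Function
```

Placed in a standard module, this could also be used from the worksheet as =ExtractId(A1), which leaves the original descriptions in Column A untouched.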
I took this as a challenge to my intellect, and given that it is the end of the day, after seeing the formulas by Aladdin and pgc01 on Mr Excel forums I did a little work and came up with this CSE (Array formula):
=IF(ISNUMBER(LOOKUP(9.99999999999999E+307,SEARCH({"ID0","ID1","ID2","ID3","ID4","ID5","ID6","ID7","ID8","ID9"},A1))),MID(A1,LOOKUP(9.99999999999999E+307,SEARCH({"ID0","ID1","ID2","ID3","ID4","ID5","ID6","ID7","ID8","ID9"},A1)),7),"")
I also had some luck with this CSE Array formula:
=IF(ISNUMBER(SEARCH("ID"&{1,2,3,4,5,6,7,8,9},$A$1)),MID(A$1,SEARCH("ID"&{1,2,3,4,5,6,7,8,9},$A$1),7))

Resources