line splitting using patterns - excel

My situation is this:
I have a workbook in which I take the addresses from column L and paste them into column Y of another workbook.
The addresses are like this:
Analipseos, Katerinis, 60500
Ermioni, Ermioni, 21051, Alkistis, Alkistis, 21052
Agia, Agia, 40003, skiathos, skiathos 37002
I want to split each line at every comma.
Is there a way to split off just the first three fields? For example, Agia, Agia, 40003, skiathos, skiathos 37002 would be split into Agia, Agia, 40003, and the rest (skiathos, skiathos 37002) would be pasted elsewhere, for example in column B.
Note: I don't want the commas to be pasted anywhere.
Could anyone help?
I tried this code in my loop:
Sub tst()
    Application.DisplayAlerts = False
    Range("A2", Range("A" & Rows.Count).End(xlUp)).TextToColumns Range("B2"), xlDelimited, , , , , True
    Application.DisplayAlerts = True
End Sub
On a separate sheet it works just fine, but inside my loop it doesn't work properly.

Since you want to split on the 3rd comma, you can go about it in a slightly different way:
Simply replace the 3rd comma by a character you know won't appear in the text otherwise, for example, a | character.
Excel makes this very easy to do using the SUBSTITUTE function.
So, for example, suppose you have the text Agia, Agia, 40003, skiathos, skiathos 37002 in cell L1. You can replace the 3rd comma with a | character using the following formula:
=SUBSTITUTE(L1,",","|",3)
The substituted text becomes: Agia, Agia, 40003| skiathos, skiathos 37002
Now, you can:
Replace all the other commas if you no longer want them.
Use Data > Text to Columns and split on the new | character, which appears only where you want the split.
I hope that does the trick!
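Outside Excel, the same "split at the 3rd comma" idea can be sketched in Python. This is only an illustration under assumptions: the function name `split_after_nth` is hypothetical, and it returns the fields as lists so that no commas are carried along, as the question asks.

```python
def split_after_nth(text, sep=",", n=3):
    """Split text into (first n fields, remaining fields), dropping the separators."""
    # Break on every separator and trim the surrounding spaces of each field
    fields = [part.strip() for part in text.split(sep)]
    # Fewer than n+1 fields simply leaves the second list empty
    return fields[:n], fields[n:]


if __name__ == "__main__":
    head, rest = split_after_nth("Agia, Agia, 40003, skiathos, skiathos 37002")
    print(head)  # the fields that stay in the original column
    print(rest)  # the fields to paste elsewhere, e.g. column B
```

A row with only three fields, such as Analipseos, Katerinis, 60500, comes back with an empty second list, so nothing needs to be pasted elsewhere for it.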

Related

Extract substrings from irregular text in Excel cell

I am trying to solve this problem -
Suppose I have text like this in a single column in Excel:
#22-atr$$1 AM**01-May-2015&&
$21-atr#10-Jan-2007*6 PM&
&&56-atr#11 PM$$8-Jan-2016*
**4 PM#68-atr#21-Mar-2022&&
and I want to write functions that split it into separate columns.
Can someone help me do that please?
The only thing I have managed so far is extracting the month using =MID(A1,FIND("-",A1)+1,3)
One option for formulae would be using new functions, currently available in the BETA-channel for insiders:
Formula in B1:
=LET(A,TEXTSPLIT(A1,{"#","$","&","*","#"},,1),B,SORTBY(A,IFERROR(MATCH(RIGHT(A),{"r","M"},0),3)),C,HSTACK(TAKE(B,,2),TEXTSPLIT(TEXT(--INDEX(B,3),"YYYY-Mmm-D"),"-")),IFERROR(--C,C))
The idea is to:
Use LET() throughout to store variables;
TEXTSPLIT() the value in column A using all available delimiters into columns and skip empty values in the resulting array;
Then SORTBY() the three resulting elements on their rightmost character using MATCH(), so the element ending in "r" comes first and the one ending in "M" second. IFERROR() catches the date string, which matches neither, and sorts it last;
We can then HSTACK() the first two columns with the result of splitting the third element, after formatting it as YYYY-Mmm-D;
Finally, a double unary converts each element of the resulting array to a number where possible. Where that fails, IFERROR() keeps the original content from the previous variable.
Notes:
I formatted column C to hold time-value in AM/PM.
I changed the text to use Dutch month names so that Excel would recognize the dates for demonstration purposes. It should work the same with English names.
For fun, a UDF using regular expressions:
Public Function GetPart(inp As String, prt As Long) As Variant
    Dim Pat As String
    ' Pick the capture pattern for the requested part
    Select Case prt
        Case 0
            Pat = "(\d+-atr)"
        Case 1
            Pat = "(\d+\s*[AP]M)"
        Case 2
            Pat = "-(\d{4})"
        Case 3
            Pat = "-(\w+)-"
        Case 4
            Pat = "(\d+)-\w+-"
        Case Else
            Pat = ""
    End Select
    With CreateObject("vbscript.regexp")
        ' Lazy .*? so the capture group keeps all leading digits (e.g. "10", not "0")
        .Pattern = ".*?" & Pat & ".*"
        GetPart = .Replace(inp, "$1")
    End With
End Function
Invoke through =GetPart(A1,0). Choices are 0-4, in the order of your column headers.
You can achieve what you wish by applying a few simple transformations.
Replace each of #, $, * and & with one common character that is guaranteed not to appear in the data sections (e.g. #).
Replace every run of two or more # characters with a single #.
Trim the # from the start and end of the string.
Split the string into an array using # as the split character (VBA.Split).
Use For Each to loop over the array.
In the loop, have a set of three tests:
Test 1 tests the string for the occurrence of "-atr".
Test 2 tests the string for the occurrence of "-XXX-", where XXX is a three-letter month. You then split the date at the - to give an array with Day/Month/Year.
Test 3 tests whether the string contains " AM" or " PM".
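The steps above can be sketched in Python to show how the pieces fit together. This is a rough illustration rather than the VBA the answer has in mind; the function name `parse_record` and the dictionary keys are assumptions made for the example.

```python
import re


def parse_record(text):
    """Normalize the delimiters, split, then classify each piece with three tests."""
    # Steps 1-2: collapse every run of #, $, * or & into a single '#'
    normalized = re.sub(r"[#$*&]+", "#", text)
    # Steps 3-4: trim the leading/trailing '#' and split on it
    pieces = normalized.strip("#").split("#")
    result = {}
    for piece in pieces:
        if "-atr" in piece:
            # Test 1: the "-atr" code
            result["code"] = piece
        elif re.fullmatch(r"\d{1,2}-[A-Za-z]{3}-\d{4}", piece):
            # Test 2: a date such as 01-May-2015, split into day/month/year
            day, month, year = piece.split("-")
            result.update(day=int(day), month=month, year=int(year))
        elif piece.endswith((" AM", " PM")):
            # Test 3: a time such as "1 AM"
            result["time"] = piece
    return result


if __name__ == "__main__":
    for row in ["#22-atr$$1 AM**01-May-2015&&", "**4 PM#68-atr#21-Mar-2022&&"]:
        print(parse_record(row))
```

Because each piece is classified by its content, the order in which the code, time and date appear in the cell does not matter.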

Excel - reverse text to columns

I am struggling to find a good way of doing this.
I have a file where some lines are hundreds of words long (comma separated) and some are only a few words long.
So performing a text to columns produces hundreds of columns, most of which are blank.
I have done my edits on the second column and now need to join everything back so that each line can be read in a text file and all words are comma separated once more.
Is there a formula that will know how many columns are in each line that are not blank and bring them all back into the first cell with comma separation?
I should add, I only have Excel 2010
I'd be happy to try a good powershell script solution if possible
Many thanks,
K
For example:
54325,354354,786756,6543,73644,23323,544,7233,64537,654,56,3456,754,876666,78,788
122,433
655,766
1233,7374,65436,65444,6577,85488,56767,8585876,6755,544445,67,67783,2233,466636
I use text to columns so I can work on contents of column 2, but then need it back in this format once done.
If I simply save as csv and open as text file, there are commas for every blank cell in each column
Given a text file like the one you describe, you could simply split the lines manually:
Get-Content .\input.txt |ForEach-Object {
    # split line into individual values
    $values = $_ -split ','
    # modify as needed (only rows that actually have a second column)
    if($values.Length -ge 2){
        $values[1] = "Make changes to column 2 here"
    }
    # stitch line back together
    $values -join ','
} |Set-Content .\output.txt
Not sure why you need to strip off the trailing commas from the short rows, as they don't produce blank columns, only empty cells. The columns have content from another row. But here is VBA code to do what you request, in case that is a possibility for your usage:
Assumptions:
The starting point is your worksheet, with each element of the CSV data you show above in a separate cell.
None of your words includes a comma, linefeed, or other character that requires enclosing the word in quotes.
This VBA code:
Option Explicit
Sub createCSV()
    Dim v, x, S As String, I As Long, RE As Object
    v = ThisWorkbook.Worksheets("Sheet2").Cells(1, 1).CurrentRegion
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Pattern = ",*$"
        .MultiLine = True
        .Global = True
    End With
    ReDim x(1 To UBound(v, 1), 1 To 1)
    With WorksheetFunction
        For I = 1 To UBound(v, 1)
            'Replace trailing commas with nothing
            x(I, 1) = RE.Replace(Join(.Index(v, I, 0), ","), "")
        Next I
    End With
    Open "D:\Users\Ron\Desktop\NewFile.csv" For Output As #1
    For I = 1 To UBound(x)
        Print #1, x(I, 1)
    Next I
    Close #1
End Sub
I used a regular expression to remove the trailing commas.
Results (in Notepad++):
54325,354354,786756,6543,73644,23323,544,7233,64537,654,56,3456,754,876666,78,788
122,433
655,766
1233,7374,65436,65444,6577,85488,56767,8585876,6755,544445,67,67783,2233,466636
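The core of "reverse text to columns" is dropping the blank cells at the end of each row before joining. A minimal Python sketch of just that step follows, assuming the row has already been read into a list of cell values; the name `join_row` is hypothetical.

```python
def join_row(cells, sep=","):
    """Join one row of cells, ignoring the blank cells at the end of the row."""
    # Treat missing cells as empty strings and everything else as text
    values = ["" if cell is None else str(cell) for cell in cells]
    # Drop trailing blanks only; blanks in the middle keep their position
    while values and values[-1] == "":
        values.pop()
    return sep.join(values)


if __name__ == "__main__":
    print(join_row(["122", "433", "", "", ""]))
    print(join_row([655, 766, None, None]))
```

Only trailing blanks are removed, so an empty cell between two filled ones still produces its comma and the remaining values stay in their original columns.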

How to split by multiple delimiters in vba excel

I need to split some data on multiple delimiters (//, /, -). I use one cell (A3) as the data-entry cell, and I need multiple delimiters so that users have several options.
I also need to know whether the split results can be rearranged, for example moving any result that contains *.com or *.net to a particular column.
I tried some code to split the data, but it only works with one delimiter.
Make the delimiters the same: replace one with the other first, then split once.
Say we have a string that we want to parse by both / and $. Here is an example:
Sub multiparse()
    Dim s As String, s2 As String
    Dim arr As Variant
    s = "poiuy/tyuiop$7654$lkiop/"
    ' Turn every "/" into "$" so a single Split handles both delimiters
    s2 = Replace(s, "/", "$")
    arr = Split(s2, "$")
End Sub
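For comparison, here is a small Python sketch of splitting on several delimiters at once, using the //, / and - delimiters from the question. The function names `split_multi` and `route_domains` are hypothetical, and the `.com`/`.net` routing is only one possible reading of the rearranging requirement.

```python
import re


def split_multi(text, delimiters=("//", "/", "-")):
    """Split text on any of the delimiters, dropping empty pieces."""
    # Try longer delimiters first so "//" is not consumed as two separate "/"
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = "|".join(re.escape(d) for d in ordered)
    return [piece for piece in re.split(pattern, text) if piece]


def route_domains(pieces, suffixes=(".com", ".net")):
    """Separate pieces ending in one of the suffixes from all the others."""
    domains = [p for p in pieces if p.endswith(suffixes)]
    others = [p for p in pieces if not p.endswith(suffixes)]
    return domains, others


if __name__ == "__main__":
    parts = split_multi("example.com//docs/page-1")
    print(parts)
    print(route_domains(parts))
```

This differs from the Replace-then-Split approach above only in mechanics: one regular expression matches all delimiters, instead of first rewriting them to a single character.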

How to remove duplicates in a string

I have a file containing 38,000 records, and each row has two or more ';' characters at the end. Is there a formula to remove the repeated trailing ';' in Excel, or any other tool that can do it?
To remove repeated characters (semicolons in this case):
1. Hit CTRL+H.
2. Find what: ;; (two semicolons)
3. Replace with: ; (one semicolon)
4. Click Replace All.
5. When it finishes, repeat step 4 until no more matches are found.
Now the document will have no more than one semicolon in a row.
Remove repeated characters using a VBA function:
The following function does the same thing using VBA, and for any character you choose:
Function removeDoubleChars(txt As String, doubleChar As String) As String
    'removes all multiple-consecutive [doubleChar] within [txt]
    Do
        txt = Replace(txt, doubleChar & doubleChar, doubleChar)
    Loop While InStr(txt, doubleChar & doubleChar) > 0
    removeDoubleChars = txt
End Function
You would use this like Range("A1") = removeDoubleChars(Range("A1"), ";") to remove consecutive semicolons from cell A1.
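Since the question mentions "any other tool", here is a hedged Python sketch of the same idea. `collapse_runs` mirrors the repeated find-and-replace in a single pass, and `strip_trailing` is an extra variant for the case where the trailing semicolons should disappear entirely; both names are hypothetical.

```python
import re


def collapse_runs(text, ch=";"):
    """Replace every run of two or more consecutive ch with a single ch."""
    # (?:...){2,} matches the character repeated at least twice in a row
    return re.sub("(?:" + re.escape(ch) + "){2,}", ch, text)


def strip_trailing(text, ch=";"):
    """Remove every occurrence of ch from the end of the text."""
    return text.rstrip(ch)


if __name__ == "__main__":
    row = "a;b;c;;;"
    print(collapse_runs(row))
    print(strip_trailing(row))
```

The regular expression handles a run of any length in one substitution, so no loop is needed.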

Remove the substring after a character in the last position in Excel

I have an Excel sheet which contains data similar to:
Addresses
xyz,abc,olk
opn,opk,prt
we-ylj,tyf,uyfas
oiui,ytfy,tydry - We also work in bla,bla,bla
ytfyt,tyfyt,ghfyt
i-hgsd,gsdf-hgd,sdgh,- We also work in xxx,yy,zzz
ytsfgh,gfasdg,tydsfyt
I want to remove the substring that follows the character "-", but only when it is in the last position.
The result should look like this:
xyz,abc,olk
opn,opk,prt
we-ylj,tyf,uyfas
oiui,ytfy,tydry
ytfyt,tyfyt,ghfyt
i-hgsd,gsdf-hgd,sdgh
ytsfgh,gfasdg,tydsfyt
I tried the SUBSTITUTE function, but I was unable to replace the data because the last substring after the "-" differs from row to row.
Going by your specifications, I would use two columns just so it's not a very long formula:
In B1:
=IFERROR(FIND(CHAR(1),SUBSTITUTE(A1,"-",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"-",""))))-1,LEN(A1))
This gets the position of the last - or the full text length.
Then in C1:
=LEFT(A1,IF(FIND(",",A1)<B1,B1,LEN(A1)))
This checks if there's a , before the last -. If there is no ,, then the full text is taken.
EDIT: I only now noticed your edited comment. If it's just everything after - We, then I would use this:
=TRIM(LEFT(A1,IFERROR(FIND("- We",A1)-2,LEN(A1))))
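The last formula can be approximated in Python as follows. This is a sketch, not a translation: the name `drop_suffix` is hypothetical, and instead of the fixed -2 offset it strips any spaces or commas left just before the "- We" marker.

```python
def drop_suffix(text, marker="- We"):
    """Cut the text at the first occurrence of marker, if there is one."""
    idx = text.find(marker)
    if idx == -1:
        # No marker: keep the whole text, like IFERROR(..., LEN(A1))
        return text.strip()
    # Keep what precedes the marker, minus a dangling space or comma
    return text[:idx].rstrip(" ,")


if __name__ == "__main__":
    rows = [
        "we-ylj,tyf,uyfas",
        "oiui,ytfy,tydry - We also work in bla,bla,bla",
        "i-hgsd,gsdf-hgd,sdgh,- We also work in xxx,yy,zzz",
    ]
    for row in rows:
        print(drop_suffix(row))
```

Hyphens inside the address itself (as in we-ylj) are left alone, because only the full "- We" marker triggers the cut.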
