Excel replace characters in string before and after 'x'

Hello, I have a column of strings (product names).
They are formatted as Name LengthxWidth, for example Green box 20x30. I need to swap the 20 and the 30 in this example so that I get Green box 30x20. Any ideas how I can achieve this?
Thanks

Here are both a formula solution and a VBA solution using Regular Expressions:
Formula
=LEFT(A1,FIND(TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)),A1)-1)&
MID(TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)),SEARCH("x",TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)))+1,99)&
"x"&
LEFT(TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)),SEARCH("x",TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)))-1)
UDF
Option Explicit
Function RevWL(S As String) As String
    Dim RE As Object
    Const sPat As String = "(\d+.?\d*)x(\d+.?\d*)"
    'If L or W might start with a decimal point, and not a digit,
    'then change sPat to: (\d*.?\d+)x(\d*.?\d+)
    'Note: the unescaped . matches any single character; write \. to allow only a decimal point
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .IgnoreCase = True
        .Pattern = sPat
        RevWL = .Replace(S, "$2x$1")
    End With
End Function
Here is an example of the kinds of data I tested with:
The formula works by taking the last space-separated substring, which should be LxW, swapping the portions before and after the x, and then concatenating everything back together.
The regex pattern captures the two numbers (integers or decimals, so long as they start with a digit, although that could be changed if needed), and the replacement string reverses them.
Here is a more detailed explanation of the regex (and the replacement string):
(\d+.?\d*)x(\d+.?\d*)
Options: Case insensitive; ^$ don’t match at line breaks
Match the regex below and capture its match into backreference number 1 (\d+.?\d*)
    Match a single character that is a “digit” \d+
        Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
    Match any single character that is NOT a line break character .?
        Between zero and one times, as many times as possible, giving back as needed (greedy) ?
    Match a single character that is a “digit” \d*
        Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the character “x” literally x
Match the regex below and capture its match into backreference number 2 (\d+.?\d*)
    Match a single character that is a “digit” \d+
        Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
    Match any single character that is NOT a line break character .?
        Between zero and one times, as many times as possible, giving back as needed (greedy) ?
    Match a single character that is a “digit” \d*
        Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
$2x$1
    Insert the text that was last matched by capturing group number 2 $2
    Insert the character “x” literally x
    Insert the text that was last matched by capturing group number 1 $1
Created with RegexBuddy
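To check the swap logic outside Excel, here is a minimal Python sketch of the same replacement. It is not part of the original answer: Python's re module stands in for VBScript.RegExp, the name swap_dimensions is made up for illustration, the replacement is written \2x\1 instead of $2x$1, and the decimal point is escaped (\.) so that it only matches a literal dot.

```python
import re

# Assumption: same idea as the VBA pattern, but with the dot escaped
# so that it matches only a decimal point, not any character.
PATTERN = re.compile(r"(\d+\.?\d*)x(\d+\.?\d*)", re.IGNORECASE)


def swap_dimensions(text: str) -> str:
    """Swap the numbers on either side of the 'x' (hypothetical helper name)."""
    # \2 and \1 are Python's spelling of the $2 and $1 used in the VBA code.
    return PATTERN.sub(r"\2x\1", text)


print(swap_dimensions("Green box 20x30"))
```

Because the pattern is applied globally, every LxW pair in the string is swapped, just as with .Global = True in the UDF.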

Here is a VBA solution that will work for you:
Option Explicit
Function Switch(r As Range) As String
    Dim measurement As String
    Dim firstPart As String
    Dim secondPart As String
    measurement = Right(r, Len(r) - InStrRev(r, " "))
    secondPart = Right(measurement, Len(measurement) - InStr(1, measurement, "x"))
    firstPart = Left(measurement, InStr(1, measurement, "x") - 1)
    Switch = Left(r, InStrRev(r, " ") - 1) & " " & secondPart & "x" & firstPart
End Function
You can paste this in a regular module in the VBE (Visual Basic Editor) and use it as a regular function/formula. If your value is in cell A1 then type =Switch(A1) in cell B1. Hope it helps!
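The same steps (take everything after the last space, split it at the first x, and reassemble) can be sketched in Python for a quick sanity check. This is an illustration only, not the answer's code; the function name switch_last_token is hypothetical.

```python
def switch_last_token(text: str) -> str:
    """Swap the two parts of the last space-separated token around 'x'.

    Assumption: the text contains at least one space and the last token
    contains an 'x'; otherwise a ValueError is raised.
    """
    # Everything before the last space is the name, the rest is "LxW".
    name, measurement = text.rsplit(" ", 1)
    # Split at the first 'x', like InStr(1, measurement, "x") in the VBA.
    first_part, second_part = measurement.split("x", 1)
    return f"{name} {second_part}x{first_part}"


print(switch_last_token("Green box 20x30"))
```

Unlike the regex approach, this one does not care whether the two parts are numbers; it only relies on the position of the last space and the first x after it.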

VBA really is the easier route, but if you want to use formulas only, you can use a few helper columns to split the text and then concatenate the pieces.
Here is a small example:
Of course, the helper cells B1-4 are optional. They are only there to make things more readable; you can also use a single formula:
=CONCATENATE(LEFT(A1, SEARCH(" ",A1,1)-1)," ",RIGHT(RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)),LEN(RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)))-SEARCH("x",RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)),1)),"x",LEFT(RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)), SEARCH("x",RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)),1)-1))
If your names contain several spaces, use this formula instead, which searches for the last space in the text:
=CONCATENATE(LEFT(A1, SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))-1)," ",RIGHT(RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))),LEN(RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))))-SEARCH("x",RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))),1)),"x",LEFT(RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))), SEARCH("x",RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))),1)-1))
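The part of that formula that locates the last space is easy to lose in the nesting: it counts the spaces, replaces only the last one with the marker ^^, and then searches for the marker. A small Python sketch of just that step, with a made-up function name and the assumption that the text contains at least one space:

```python
def last_space_position(text: str) -> int:
    """Return the 1-based position of the last space, the way the formula finds it.

    Assumption: text contains at least one space and no "^^" of its own.
    """
    # LEN(A1) - LEN(SUBSTITUTE(A1, " ", "")) counts the spaces.
    space_count = len(text) - len(text.replace(" ", ""))
    # SUBSTITUTE(A1, " ", "^^", n) replaces only the n-th (here: last) space.
    parts = text.split(" ")
    marked = " ".join(parts[:space_count]) + "^^" + parts[space_count]
    # SEARCH("^^", ...) returns a 1-based position.
    return marked.index("^^") + 1


print(last_space_position("Green box 20x30"))
```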

Related

Splitting very large string separated with comma and i need to split 50 items only per row

I have a very long string in the 1st row. It contains many comma-separated items, like this:
12345,54322,44444,222222222,444444,121,333,44444,........
I need to split this so that each row holds at most 50 items. Say there are 700 comma-separated items: I want to keep the first 50 in the 1st row, the next 50 in the 2nd row, and so on.
I tried the code below, which does split at 50, but I am not sure it will keep working going forward, so I need help with this:
OutData = Split(InpData, ",")(50)
MsgBox OutData

You can do this in many more ways, but one would be to replace every nth comma. For example through Regular Expressions:
Sub Test()
    Dim s As String: s = "1,2,3,4,5,6,7,8,9,10,11"
    Dim n As Long: n = 2
    Dim arr() As String
    With CreateObject("vbscript.regexp")
        .Global = True
        .Pattern = "([^,]*(?:,[^,]*){" & n - 1 & "}),"
        arr = Split(.Replace(s, "$1|"), "|")
    End With
End Sub
The pattern used means:
( - Open 1st capture group;
[^,]* - Match 0+ (Greedy) characters other than comma;
(?: - Open a nested non-capture group;
,[^,]* - Match a comma and again 0+ characters other than comma;
){1} - Close the non-capture group and match n-1 times (1 time in the given example);
), - Close the capture group and match a literal comma.
Replace every match with the content of the 1st capture group followed by a character you know does not occur in the full string (here |), so that you can split on that character.
You can then do whatever you like with the resulting array; you will probably want to transpose it onto the worksheet.
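For experimenting with the pattern outside VBA, here is a minimal Python sketch of the same "replace every nth comma, then split" idea. The function name split_every_n is hypothetical, Python's \1 replaces VBA's $1, and the sketch assumes the | character never occurs in the data.

```python
import re


def split_every_n(csv_text: str, n: int) -> list:
    """Group comma-separated items into chunks of n items each.

    Assumption: '|' does not occur in csv_text, so it is safe as a separator.
    """
    # One item, then n-1 further ",item" groups, then the comma to replace.
    pattern = r"([^,]*(?:,[^,]*){%d})," % (n - 1)
    return re.sub(pattern, r"\1|", csv_text).split("|")


rows = split_every_n(",".join(str(i) for i in range(700)), 50)
print(len(rows))
```

Each element of the result is itself a comma-separated string, which corresponds to one row in the question's layout.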

Trying to extract a string of text pattern from the beginning and the end of a cell in Excel

I have the following data, and this is what I would like to see in the Result column:
Data: PN 65011:2020text text text PN 65011:2020
Result: PN 65011:2020, PN 65011:2020
Data: PN 45014-1:2017text text text text PN 65014-1:2017 PN 8726-1:2017/P11:2020
Result: PN 45014-1:2017, PN 65014-1:2017, PN 8726-1:2017/P11:2020
Data: PN 6534:2020text text text text
Result: PN 6534:2020
Data: PN 65014-1:2017text text text text PN 65014-1:2017/PC1:2013
Result: PN 65014-1:2017,PN 65014-1:2017/PC1:2013
Data: PN ESO 67345:2019text text text PN 65018-1:2019/PC2:2020
Result: PN ESO 67345:2019, PN 65018-1:2019/PC2:2020
Data: PN ESO/EOC 5320:2013text text text PN ESO 27380:2019 PN 65015-1:2020/PC:2021
Result: PN ESO/EOC 5320:2013, PN ESO 27380:2019, PN 65015-1:2020/PC:2021
I have used ="PN "&TEXTJOIN(", PN ",1,IF(ISNUMBER(SEARCH("/",TRIM(MID(SUBSTITUTE(A2,"PN ",REPT(" ",LEN(A2))),(ROW(INDIRECT("1:"&LEN(A2))))*LEN(A2)-(LEN(A2)-1),LEN(A2))))),TRIM(MID(SUBSTITUTE(A2,"PN ",REPT(" ",LEN(A2))),(ROW(INDIRECT("1:"&LEN(A2))))*LEN(A2)-(LEN(A2)-1),LEN(A2))),LEFT(TRIM(MID(SUBSTITUTE(A2,"PN ",REPT(" ",LEN(A2))),(ROW(INDIRECT("1:"&LEN(A2))))*LEN(A2)-(LEN(A2)-1),LEN(A2))),MIN(IFERROR(FIND({" "},LOWER(TRIM(MID(SUBSTITUTE(A2,"PN ",REPT(" ",LEN(A2))),(ROW(INDIRECT("1:"&LEN(A2))))*LEN(A2)-(LEN(A2)-1),LEN(A2))))),""))-1)))
And I almost get what I would like to see, except for the last row (PN ESO/EOC 5320:2013), where I don't get the numbers: each reference stops at PN ESO. Like this:
Data: PN ESO/EOC 5320:2013text text PN ESO 27380:2019 text PN 65015-1:2020/PC:2021
Result: PN ESO/EOC, PN ESO
Any ideas on how I can get the entire reference?
Thank you very much in advance.

Here is an example on how you could approach this using Excel O365
Formula in B2:
=TEXTJOIN(", ",,LET(X,FILTERXML("<t><s>"&SUBSTITUTE(A2,"PN ","</s><s>PN ")&"</s></t>","//s[position() > 1]"),Y,LEFT(X,FIND("|",SUBSTITUTE(X,":","|",LEN(X)-LEN(SUBSTITUTE(X,":",""))))+4),Y))
The idea here is to first SUBSTITUTE() all instances of "PN " so that the string becomes valid XML that can be queried with XPath. Then we use FILTERXML() to return all values as an array, obviously still with the concatenated "text text text". Therefore I used LET() to load the array into a variable and apply some string manipulation to all elements.
First I substituted the last occurrence of the colon in each string with a pipe symbol, which we then FIND() to get its position. With those positions we can extract the proper substrings using LEFT(), taking four more characters for the year after the colon. Finally, TEXTJOIN() joins the resulting array back together.
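The two steps of that formula (split before every "PN ", then cut each piece four characters after its last colon) can be sketched in Python to see what each piece looks like. This is only an illustration under the same assumption the formula makes, namely that every reference ends in a colon followed by a four-digit year; the function name is hypothetical.

```python
def extract_refs(cell: str) -> str:
    """Mimic the SUBSTITUTE/FILTERXML/LEFT steps on one cell value."""
    # Split before every "PN " and drop the first piece (position() > 1).
    pieces = ["PN " + piece for piece in cell.split("PN ")[1:]]
    # Keep each piece up to its last colon plus the 4 characters after it.
    trimmed = [piece[: piece.rfind(":") + 5] for piece in pieces]
    return ", ".join(trimmed)


print(extract_refs("PN 65011:2020text text text PN 65011:2020"))
```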

If you can accept a VBA solution, regular expressions are well suited to this kind of problem, provided your data all looks like the examples you show.
The regex looks for substrings that:
    start with PN;
    continue until a colon followed by one or more digits;
    optionally, if a / follows, continue through the next colon-plus-digits group.
To enter this User Defined Function (UDF), press Alt+F11 to open the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like =extrPN(cell_Ref) in some cell.
Option Explicit
Function extrPN(S As String) As String
    Dim RE As Object, MC As Object, M As Object
    Const sPat As String = "PN[^:]+:\d+(?:/[^:]+:\d+)?"
    Dim sTemp As String
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = sPat
        .IgnoreCase = False
        If .Test(S) = True Then
            Set MC = .Execute(S)
            For Each M In MC
                sTemp = sTemp & ", " & M
            Next M
            extrPN = Mid(sTemp, 3)
        Else
            extrPN = "no match"
        End If
    End With
End Function
Explanation of Regex
The breakdown below describes a slightly different form of the pattern: it uses .*? where the code uses [^:]+, and it lists case-insensitive matching whereas the code sets IgnoreCase to False. On the sample data both forms pick up the same substrings.
extract PN
PN.*?:\d+(?:/[^:]+:\d+)?
Options: Case insensitive; ^$ match at line breaks
Match the character string “PN” literally PN
Match any single character that is NOT a line break character .*?
    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) *?
Match the colon character :
Match a single character that is a “digit” \d+
    Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
Match the regular expression below (?:/[^:]+:\d+)?
    Between zero and one times, as many times as possible, giving back as needed (greedy) ?
    Match the character “/” literally /
    Match any character that is NOT the colon character [^:]+
        Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
    Match the colon character :
    Match a single character that is a “digit” \d+
        Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
Created with RegexBuddy
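The pattern from the UDF can be tried against the sample rows with a few lines of Python. This sketch is for testing the regex only and is not a replacement for the VBA: re.findall stands in for the Execute loop, and the name extr_pn simply mirrors the UDF's name.

```python
import re

# Same pattern as sPat in the UDF, case-sensitive as in the VBA code.
PN_PATTERN = re.compile(r"PN[^:]+:\d+(?:/[^:]+:\d+)?")


def extr_pn(text: str) -> str:
    """Join all PN references found in text, or return "no match"."""
    # The pattern has no capturing groups, so findall returns whole matches.
    matches = PN_PATTERN.findall(text)
    return ", ".join(matches) if matches else "no match"


print(extr_pn("PN ESO/EOC 5320:2013text text text PN ESO 27380:2019 PN 65015-1:2020/PC:2021"))
```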

Get numbers between chars in VB.NET

This is my first question here.
I have a string like this, for example:
some text [low=123 medium=456 high=789]
I want to read all the numbers and show each one in a label (or something similar), like this:
label1.Text = low
label2.Text = medium
label3.Text = high

You could use Regex for this:
' Requires: Imports System.Text.RegularExpressions
Dim RegexObj As New Regex("low=(?<low>\d+)\s+medium=(?<medium>\d+)\s+high=(?<high>\d+)")
' Run the match once and read each named group from the result
Dim m As Match = RegexObj.Match(theString)
If m.Success Then
    label1.Text = m.Groups("low").Value
    label2.Text = m.Groups("medium").Value
    label3.Text = m.Groups("high").Value
End If
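For readers who want to try the named-group pattern without a VB.NET project, here is a rough Python equivalent. It is an illustration only: Python spells named groups (?P<name>...) rather than (?<name>...), and the function name read_levels is made up.

```python
import re

# Same structure as the VB.NET pattern; only the named-group syntax differs.
LEVELS = re.compile(r"low=(?P<low>\d+)\s+medium=(?P<medium>\d+)\s+high=(?P<high>\d+)")


def read_levels(text: str):
    """Return the three captured numbers as a dict of strings, or None if absent."""
    match = LEVELS.search(text)
    return match.groupdict() if match else None


print(read_levels("some text [low=123 medium=456 high=789]"))
```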
Regex details
"low=" ' Match the characters “low=” literally
"(?<low>" ' Match the regular expression below and capture its match into backreference with name “low”
"\d" ' Match a single digit 0..9
"+" ' Between one and unlimited times, as many times as possible, giving back as needed (greedy)
")"
"\s" ' Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
"+" ' Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"medium=" ' Match the characters “medium=” literally
"(?<medium>" ' Match the regular expression below and capture its match into backreference with name “medium”
"\d" ' Match a single digit 0..9
"+" ' Between one and unlimited times, as many times as possible, giving back as needed (greedy)
")"
"\s" ' Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
"+" ' Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"high=" ' Match the characters “high=” literally
"(?<high>" ' Match the regular expression below and capture its match into backreference with name “high”
"\d" ' Match a single digit 0..9
"+" ' Between one and unlimited times, as many times as possible, giving back as needed (greedy)
")"

Cut the first and last character (the brackets) from your text.
Split the result with a space as separator, and then split each element of the resulting array with "=" as separator.
Then you have your result.
If the order is not fixed, you have to check whether the first element of each pair is low, medium, or high.
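The split-based approach can be sketched like this in Python. It is a sketch under assumptions: the bracketed part is located with index and rindex rather than by trimming the first and last character (the sample string has leading text before the bracket), and the function name is hypothetical.

```python
def parse_pairs(text: str) -> dict:
    """Parse "name=value" pairs found between the square brackets.

    Assumption: text contains one "[...]" section with space-separated pairs.
    """
    # Keep only what sits between the brackets.
    inner = text[text.index("[") + 1 : text.rindex("]")]
    # Split on whitespace first, then split each piece on "=".
    return dict(item.split("=", 1) for item in inner.split())


values = parse_pairs("some text [low=123 medium=456 high=789]")
print(values["low"], values["medium"], values["high"])
```

Looking the values up by name makes the result independent of the order in which the pairs appear.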

How would I remove all leading alphabetical characters?

I am interested in removing leading alphabetical (alpha) characters from cells which appear in a column. I only wish to remove the leading alpha characters (including UPPER and LOWER case): if alpha characters appear after a number they should be kept. Some cells in the column might not have leading alpha characters.
Here is an example of what I have:
36173
PIL51014
4UNV22001
ZEB54010
BICMPAFG11BK
BICMPF11
Notice how there are not always the same number of leading alpha characters. I cannot simply use a Left or Right function in Excel, because the number of characters I wish to keep and remove varies.
A correct output for what I am looking for would look like:
36173
51014
4UNV22001
54010
11BK
11
Notice how the second to last row preserved the characters "BK", and the 3rd row preserved "UNV". I cannot simply remove all alpha characters.
I am a beginner with visual basic and was not able to figure out how to use excel functions to address my issue. How would I do this?

Here is an Excel formula that will "strip off the leading alpha characters". Actually, it looks for the first numeric character and returns everything from that character onward:
=MID(A1,MIN(FIND({0;1;2;3;4;5;6;7;8;9},A1&"0123456789")),99)
The 99 at the end needs to be at least as large as the longest string you might be processing. 99 usually works.
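The trick in that formula is appending "0123456789" so that every digit is guaranteed to be found, then taking the smallest position. A short Python sketch of the same logic, with a made-up function name, makes it easy to check against the sample data:

```python
def strip_leading_non_digits(s: str) -> str:
    """Return s from its first digit onward (empty string if it has no digit)."""
    # Appending all ten digits guarantees every search succeeds,
    # just like A1 & "0123456789" in the formula.
    padded = s + "0123456789"
    # MIN(FIND({0;1;...;9}, ...)): the earliest position of any digit.
    first_digit = min(padded.find(d) for d in "0123456789")
    return s[first_digit:]


for value in ["36173", "PIL51014", "4UNV22001", "ZEB54010", "BICMPAFG11BK", "BICMPF11"]:
    print(value, "->", strip_leading_non_digits(value))
```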

Here's a formula based solution complete with test results:
=MID(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789"),255),100)
Change the 100 at the end if any string may be longer than 100 characters. Also the 255 is not needed, but it won't hurt.

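This short UDF should strip off leading alphabetic characters.
Function noLeadAlpha(ByVal str As String) As String
    'ByVal so the caller's variable is not modified; the Len check avoids Asc("") failing on an empty cell
    If Len(str) > 0 And Not IsNumeric(str) Then
        Do While Asc(str) < 48 Or Asc(str) > 57
            str = Mid(str, 2)
            If Not CBool(Len(str)) Then Exit Do
        Loop
    End If
    noLeadAlpha = str
End Function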

Kudos, Jeeped, you beat me to it.
But here is an alternative anyway:
Function RemoveAlpha(aString As String) As String
    Dim i As Long
    For i = 1 To Len(aString)
        Select Case Mid(aString, i, 1)
            Case "0" To "9"
                RemoveAlpha = Right(aString, Len(aString) - i + 1): Exit For
        End Select
    Next i
End Function

Deleting variable number of leading characters from a variable-length string

If I have G4ED7883666, I want the output to be 7883666.
I have to apply this to a range of cells that are not all the same length. The only thing they have in common is that I need to delete everything before the trailing number, that is, everything up to and including the last letter.

This formula finds the last number in a string, that is, all digits to the right of the last alpha character in the string.
=RIGHT(A1,MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0)-1)
Note that this is an array formula and must be entered with the Control-Shift-Enter keyboard combination.
How the formula works
Let's assume that the target string is fairly simple: "G4E78"
Working outward from the middle of the formula, the first thing to do is create an array with the elements 1 through 25. (Although this might seem to limit the formula to strings with no more than 25 characters, it actually places a limit of 25 digits on the size of the number that may be extracted by the formula.)
ROW($1:$25) = {1;2;3;4;5;6;7; etc.}
Subtracting this array from (1 + the length of the target string) produces a new array, the elements of which count down from the length of the string. The first five elements correspond to the positions of the characters of the string, in reverse order!
LEN(A1)+1-ROW($1:$25) = {5;4;3;2;1;0;-1;-2;-3;-4; etc.}
The MID function then creates a new array that reverses the order of the characters of the string.
For example, the first element of the new array is the result of MID(A1, 5, 1), the second of MID(A1, 4, 1) and so on. The #VALUE! errors reflect the fact that MID cannot evaluate 0 or negative values as the position of a string, e.g., MID(A1,0,1) = #VALUE!.
MID(A1,LEN(A1)+1-ROW($1:$25),1) = {"8";"7";"E";"4";"G";#VALUE!;#VALUE!; etc.}
Multiplying the elements of the array by 1 turns the digit characters into numbers and the non-digit characters into #VALUE! errors as well.
1*MID(A1,LEN(A1)+1-ROW($1:$25),1) = {8;7;#VALUE!;4;#VALUE!;#VALUE!;#VALUE!; etc.}
And the IFERROR function turns the #VALUE! errors into 99, which is just an arbitrary number greater than the value of a single digit.
IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99) = {8;7;99;4;99;99;99; etc.}
Matching on the 99 gives the position of the first non-digit character counting from the right end of the string. In this case, "E" is the first non-digit in the reversed string "87E4G", at position 3. This is equivalent to saying that the number we are looking for at the end of the string, plus the "E", is 3 characters long.
MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0) = 3
So, for the final step, we take 3 - 1 (for the "E") characters from the right of the string.
RIGHT(A1,MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0)-1) = "78"
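The walk-through above can be replayed step by step in Python. The sketch below mirrors the array formula one stage at a time rather than being an idiomatic way to do this in Python; the function name and the max_digits parameter (standing in for the 25 in ROW($1:$25)) are assumptions made for illustration.

```python
def trailing_number(s: str, max_digits: int = 25) -> str:
    """Return the run of digits at the end of s, following the formula's steps."""
    values = []
    for k in range(1, max_digits + 1):       # ROW($1:$25)
        pos = len(s) + 1 - k                 # 1-based position, counted from the right
        ch = s[pos - 1] if pos >= 1 else ""  # MID fails for positions below 1
        # 1*MID(...) gives a number for a digit; IFERROR turns the rest into 99.
        values.append(int(ch) if "0" <= ch <= "9" else 99)
    # MATCH(99, ..., 0): 1-based index of the first non-digit from the right.
    # Like the formula, this fails if all max_digits characters are digits.
    first_non_digit = values.index(99) + 1
    count = first_non_digit - 1
    return s[len(s) - count:]                # RIGHT(A1, count)


print(trailing_number("G4E78"))
```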

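One more submission for you to consider. This VBA function returns the trailing digits, that is, everything after the last non-numeric character:
Public Function GetRightNumbers(str As String) As String
    Dim i As Long
    'Walk backwards from the last character; stop at 1 because Mid(str, 0, 1) raises an error
    For i = Len(str) To 1 Step -1
        If Not IsNumeric(Mid(str, i, 1)) Then
            Exit For
        End If
    Next i
    'i is now the position of the last non-numeric character (0 if there is none)
    GetRightNumbers = Mid(str, i + 1)
End Function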

You can write some VBA to format the data (start at the end and work backwards until you hit a non-digit).
Or, if you're happy to install an add-in like Excelicious, you can use regular expressions to format the text via a formula. An expression like [0-9]+$ would return all the digits at the end of a string, IIRC.

NOTE: This uses the regex pattern in James Snell's answer, so please upvote his answer if you find this useful.
Your best bet is to use a regular expression. For this to work you need to set a reference to Microsoft VBScript Regular Expressions 5.5 (Tools --> References...).
Now you can use regex in your VBA.
This will find the numbers at the end of each cell. I am placing the result next to the original so that you can verify it is working the way you want. You can modify it to replace the cell as soon as you feel comfortable with it. The code works regardless of the length of the string you are evaluating, and will skip the cell if it doesn't find a match.
Sub GetTrailingNumbers()
    Dim ws As Worksheet
    Dim rng As Range
    Dim cell As Range
    Dim result As Object, results As Object
    Dim regEx As New VBScript_RegExp_55.RegExp
    Set ws = ThisWorkbook.Sheets("Sheet1")
    ' range is hard-coded here, but you can define
    ' it programmatically based on the shape of your data
    Set rng = ws.Range("A1:A3")
    ' pattern from James Snell's answer
    regEx.Pattern = "[0-9]+$"
    For Each cell In rng
        If regEx.Test(cell.Value) Then
            Set results = regEx.Execute(cell.Value)
            For Each result In results
                cell.Offset(, 1).Value = result.Value
            Next result
        End If
    Next cell
End Sub
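To try the [0-9]+$ pattern on a few values before touching the worksheet, a Python sketch like the following can help. It only checks the pattern, not the worksheet loop, and the function name is made up.

```python
import re


def trailing_digits(value: str):
    """Return the digits at the very end of value, or None if it does not end in a digit."""
    match = re.search(r"[0-9]+$", value)
    return match.group(0) if match else None


for value in ["G4ED7883666", "AB12CD", "X9"]:
    print(value, "->", trailing_digits(value))
```

A value that does not end in a digit yields None here, which corresponds to the cells the VBA loop skips.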

Takes the last 4 characters of num:
num1 = Right(num, 4)
Takes the first 5 characters of num:
num1 = Left(num, 5)
Takes the first ten characters from the left, then the last four of those:
num1 = Right(Left(num, 10), 4)
In your case:
num = "G4ED7883666"
num1 = Right(num, 7)
