Excel find and replace function correct formula - excel

I wish to use the find and replace function in excel to remove example sentences from cells similar to this:
text <br>〔「text」text,「text」text〕<br>(1)text「sentence―sentence/sentence」<br>(2)text「sentence―sentence」
Sentences are in between 「」brackets and will include a ― and / character somewhere inside the brackets.
I have tried 「*―*/*」, but this deletes everything to the right of the 〔.
Is there any way to target and delete these specific sentence brackets, with the find and replace tool?
Desired outcome:
text <br>〔「text」text,「text」text〕<br>(1)text<br>(2)text「sentence―sentence」

Quite a long formula but in Excel O365 you could use:
=SUBSTITUTE(CONCAT(FILTERXML("<t><s>"&SUBSTITUTE(CONCAT(IF(MID(A1,SEQUENCE(LEN(A1)),1)="「","</s><s>「",IF(MID(A1,SEQUENCE(LEN(A1)),1)="」","」</s><s>",MID(A1,SEQUENCE(LEN(A1)),1)))),"<br>","|$|")&"</s></t>","//s[not(contains(., '「') and contains(., '―') and contains(., '/') and contains(., '」'))][node()]")),"|$|","<br>")
As long as you have access to CONCAT you could also do this in Excel 2019 but you'll have to swap SEQUENCE(LEN(A1)) for ROW(A$1:INDEX(A:A,LEN(A1)))

This formula won't work in many cases, but if the string has matching rules as in your example, then try this:
=SUBSTITUTE(C5,"「" & INDEX(TRIM(MID(SUBSTITUTE(","&SUBSTITUTE(C5,"」","「"),"「",REPT(" ",99)),(ROW(A1:INDEX(A1:A100,LEN(C5)-LEN(SUBSTITUTE(C5,"」",""))))*2-1)*99,99)),MATCH("*―*/*",TRIM(MID(SUBSTITUTE(","&SUBSTITUTE(C5,"」","「"),"「",REPT(" ",99)),(ROW(A1:INDEX(A1:A100,LEN(C5)-LEN(SUBSTITUTE(C5,"」",""))))*2-1)*99,99)),0)) & "」","")
How it works:
Split the string at the "「" and "」" characters into an array.
Use MATCH("*―*/*",array,0) to find the position of the matching string. Note that MATCH returns only the first match; if the cell contains several matching strings, you can replace MATCH("*―*/*",..) with SEARCH("*―*/*",..) and use it in an extra column to get each matching string.
Use INDEX(array,MATCH("*―*/*",..)) to get the string that needs replacing (result).
Remove that result from the original string with =SUBSTITUTE(txt,result,"").
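If it helps to see those four steps outside of a worksheet formula, here is a rough Python sketch of the same idea. It is not a translation of the formula itself: the function name remove_first_match is made up for illustration, and Python's fnmatch wildcard is assumed to behave like MATCH's wildcard for this pattern.

```python
import re
from fnmatch import fnmatchcase

# Wildcard from the formula; \u2015 is the horizontal bar "―" used in the question.
WILDCARD = "*\u2015*/*"

def remove_first_match(text: str) -> str:
    # Step 1: collect the text inside every 「...」 pair.
    inner = re.findall(r"「([^」]*)」", text)
    # Steps 2-3: like INDEX/MATCH, keep only the first entry that fits the wildcard.
    hit = next((s for s in inner if fnmatchcase(s, WILDCARD)), None)
    if hit is None:
        return text
    # Step 4: like SUBSTITUTE, remove every occurrence of that exact bracketed text.
    return text.replace("「" + hit + "」", "")
```

As with the formula, only the first matching bracket is targeted, so a cell with several different example sentences would need the function applied repeatedly.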

Or,
In B1 enter formula :
=SUBSTITUTE(A1,"「"&TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND("」",A1,FIND("/",A1))),"「",REPT(" ",99)),99)),"")

You did not tag [VBA], but if you are not averse, you could write a User Defined Function that would do what you want using Regular Expressions.
To enter this User Defined Function (UDF), alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like =replStr(A1) in some cell.
Option Explicit

Function replStr(str As String) As String
    Dim RE As Object
    ' 「, then non-」 text that contains both "/" and "―" (checked by the two lookaheads), then 」
    Const sPat As String = "\u300C(?:(?=[^\u300D]*\u002F)(?=[^\u300D]*\u2015)[^\u300D]*)\u300D"

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = sPat
        replStr = .Replace(str, "")
    End With
End Function
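The pattern can also be checked outside Excel. Below is a minimal Python sketch, assuming Python's re module handles the two lookaheads the same way the VBScript engine does; remove_example_sentences is just an illustrative name.

```python
import re

# Same pattern as sPat in the UDF: an opening 「 (U+300C), then a run of
# non-」 characters that must contain both "/" and "―" (U+2015), then 」 (U+300D).
PATTERN = re.compile(
    "\u300C(?:(?=[^\u300D]*\u002F)(?=[^\u300D]*\u2015)[^\u300D]*)\u300D"
)

def remove_example_sentences(text: str) -> str:
    # Every bracket that passes both lookaheads is deleted; the rest is untouched.
    return PATTERN.sub("", text)

sample = "(1)text\u300Csentence\u2015sentence/sentence\u300D<br>(2)text\u300Csentence\u2015sentence\u300D"
print(remove_example_sentences(sample))
```

Because each lookahead is limited to [^」]*, a "/" or "―" that appears after the closing bracket cannot make an earlier bracket match, which is what the plain 「*―*/*」 wildcard gets wrong.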

Related

Is it possible to find and delete anything between two specified characters in an excel csv cell?

I have a CSV file where the image links for one product are all in one cell. I want to remove the text from ? to ,
I wrote this formula:
=MID(A2,1,FIND("?",A2)-1)&MID(A2,FIND(",",A2),LEN(A2))
But it is applied only to the first image link.
This is what I have:
/images/image1.jpg?1200x800=new, /images/image2.jpg?1200x800=new,/images/image3.jpg?1200x800=new, /images/image5.jpg?1200x800=new
Result I need:
/images/image1.jpg,/images/image2.jpg,/images/image3.jpg,/images/image5.jpg
If your data is in A1:
=TEXTJOIN(",",,LET(x,TRIM(TEXTSPLIT(A1,,",")), y, LEFT(x,FIND("?",x)-1),y))
If you have Excel 2016 or earlier, which lack both the TEXTJOIN function as well as dynamic arrays, I suggest using a VBA routine to produce your desired output.
I used a regex match method to extract each segment, then joined them together. You could use a regex replace method, but since your original data has zero or one spaces after each comma, that would be the case in your result string also, so not as much under your control.
To enter this User Defined Function (UDF), alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like =Images(cell_ref) in some cell.
Option Explicit

Function Images(S As String) As String
    Dim RE As Object, MC As Object, M As Object
    Dim AL As Object

    Set RE = CreateObject("vbscript.regexp")
    ' Create the list up front so Join still works when nothing matches
    Set AL = CreateObject("System.Collections.ArrayList")
    With RE
        ' Capture the text before each "?" that contains no "?", comma or space
        .Pattern = "([^\?, ]*)\?"
        .MultiLine = True
        .Global = True
        If .Test(S) Then
            Set MC = .Execute(S)
            For Each M In MC
                AL.Add M.SubMatches(0)
            Next M
        End If
    End With
    ' Join with a bare comma to match the requested result
    Images = Join(AL.ToArray, ",")
End Function
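For anyone who wants to try the match-then-join idea without opening the VBA editor, here is a small Python sketch of the same approach. It assumes the same pattern carries over unchanged to Python's re module, and it joins with a bare comma to match the requested result; the function name images simply mirrors the UDF.

```python
import re

def images(s: str) -> str:
    # Capture the run of characters before each "?" that has no "?", comma or space.
    parts = re.findall(r"([^?, ]*)\?", s)
    # An input with no "?" gives an empty list, so the result is an empty string.
    return ",".join(parts)

raw = ("/images/image1.jpg?1200x800=new, /images/image2.jpg?1200x800=new,"
       "/images/image3.jpg?1200x800=new, /images/image5.jpg?1200x800=new")
print(images(raw))
```

Extracting the pieces and joining them yourself means the separator is whatever you choose, regardless of how many spaces followed each comma in the source cell.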

Replace non-printable characters with " (Inch sign) VBA Excel

I need to replace non-printable characters with " (Inch sign).
I tried the Excel CLEAN function and other UDFs, but they only remove the characters rather than replace them.
Note: the non-printable characters are highlighted in blue in the photo above, and their position within the cells is random.
This is a sample string file: Link
The expected correct output should be 12"x14" LPG . OUTLET OCT-SEP# process
Grateful in advance for useful comments and answers.
As per my comment, you can try:
=SUBSTITUTE(A1,CHAR(25)&CHAR(25),CHAR(34))
Or the VBA pseudo-code:
[A1] = [A1].Replace(Chr(25) & Chr(25), Chr(34))
Where [A1] is the obvious placeholder for the range-object you would want to use with proper and absolute referencing.
With ms365 newest functions, we could also use:
=TEXTJOIN(CHAR(34),,TEXTSPLIT(A1,CHAR(25)))
You can use Regular Expressions within a UDF to create a flexible method to replace "bad" characters, when you don't know exactly what they are.
In the UDF below, I show two pattern options, but others are possible.
One replaces all characters with a character code >127;
the other replaces all characters with a character code >255.
Option Explicit

Function ReplaceBadChars(str As String, replWith As String) As String
    Dim RE As Object
    Set RE = CreateObject("Vbscript.Regexp")
    With RE
        .Pattern = "[\u0080-\uFFFF]" 'to replace all characters with code >127 or
        '.Pattern = "[\u0100-\uFFFF]" 'to replace all characters with code >255
        .Global = True
        ReplaceBadChars = .Replace(str, replWith)
    End With
End Function
On the worksheet you can use, for example:
=ReplaceBadChars(A1,"""")
Or you could use it in a macro if you wanted to process a column of data without adding an extra column.
Note: I am uncertain whether there is an efficiency difference when using a smaller negated character class (e.g. [^\x00-\x7F]) instead of the character class I showed in the code. If execution seems slow as written, I'd try this change.
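The "replace anything above code 127" idea can be tried out quickly in Python. This is only a sketch: it uses a negated ASCII class, the sample string with U+2033 marks is invented for the test (the real data may contain different characters), and Python strings are not UTF-16 like VBA strings, so characters beyond U+FFFF are also replaced here.

```python
import re

def replace_bad_chars(text: str, repl: str) -> str:
    # [^\x00-\x7F] matches any character whose code is above 127.
    # A function is passed as the replacement so that repl is inserted literally.
    return re.sub(r"[^\x00-\x7F]", lambda m: repl, text)

# Hypothetical input: the inch marks arrived as U+2033 (double prime).
print(replace_bad_chars("12\u2033x14\u2033 LPG", '"'))
```

If the offending characters turn out to be control characters below 128, such as CHAR(25), this class will not touch them and the SUBSTITUTE approach is the one to use.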
You can try this:
Cells.Replace What:="[The character to replace]", Replacement:=""""

How do I extract a series of numbers along with a single letter followed by another series of numbers?

The problem that I'm facing is that I have an entire column that has text separated by _ that contains pixel size that I want to be able to extract but currently can't. For example:
A
Example_Number_320x50_fifty_five
Example_Number_One_300x250_hundred
Example_Number_two_fifty_728x49
I have tried using the SUBSTITUTE function to grab the numbers, which works but only grabs the digits, when I need something like 320x50; instead I'm getting 0, as I'm not sure exactly how to extract something like this. If the data were consistent I could easily use LEFT or RIGHT formulas to grab it, but as you can see the data varies.
The result that I'm looking for is something along the lines of:
A | B
Example_Number_320x50_fifty_five | 320x50
Example_Number_One_300x250_hundred | 300x250
Example_Number_two_fifty_728x49 | 728x49
Any help would be much appreciated! If any further clarification is needed please let me know and I'll try to explain as best as I can!
-Maykid
I would probably use a Regular Expressions UDF to accomplish this.
First, open up the VBE by pressing Alt + F11.
Right-Click on VBAProject > Insert > Module
Then you can paste the following code in your module:
Option Explicit

Public Function getPixelDim(RawTextValue As String) As String
    With CreateObject("VBScript.RegExp")
        .Pattern = "\d+x\d+"
        ' Return the first match; the result stays empty when there is none
        If .Test(RawTextValue) Then
            getPixelDim = .Execute(RawTextValue)(0)
        End If
    End With
End Function
Back to your worksheet, you would use the following formula:
=getPixelDim(A1)
Looking at the pattern \d+x\d+, an escaped d (\d) refers to any digit, a + means one or more of \d, and the x is just a literal letter x. This is the pattern you want to capture as your function's return value.
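To see the pattern at work on the three sample rows without setting up the UDF, here is a short Python sketch. The function name get_pixel_dim mirrors the UDF, and returning an empty string when nothing matches is an assumption that copies the VBA behaviour.

```python
import re

def get_pixel_dim(raw_text: str) -> str:
    # One or more digits, a literal "x", then one or more digits.
    match = re.search(r"\d+x\d+", raw_text)
    # Like the UDF, return an empty string when the text has no such pattern.
    return match.group(0) if match else ""

for row in ("Example_Number_320x50_fifty_five",
            "Example_Number_One_300x250_hundred",
            "Example_Number_two_fifty_728x49"):
    print(row, "|", get_pixel_dim(row))
```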
Gosh, K Davis was just so fast! Here's an alternate method with similar concept.
Create a module and create a user defined function like so.
Public Function GetPixels(mycell As Range) As String
    Dim Splitter As Variant
    Dim ReturnValue As String
    Dim i As Long

    ' Split on underscores and return the first piece that starts with a digit
    Splitter = Split(mycell.Text, "_")
    For i = 0 To UBound(Splitter)
        If IsNumeric(Mid(Splitter(i), 1, 1)) Then
            ReturnValue = Splitter(i)
            Exit For
        End If
    Next
    GetPixels = ReturnValue
End Function
In your excel sheet, type in B1 the formula =GetPixels(A1) and you will get 320x50.
How do you create a user defined function?
Developer tab
Use this URL to add Developer tab if you don't have it: https://www.addintools.com/documents/excel/how-to-add-developer-tab.html
Click on the highlighted areas to get to Visual Basic for Applications (VBA) window.
Create module
Click Insert > Module and then type in the code.
Use the user defined function
Note how the user defined function is called.

Remove text appearing between two characters - multiple instances - Excel

In Microsoft Excel file, I have a text in rows that appears like this:
1. Rc8 {[%emt 0:00:05]} Rxc8 {[%emt 0:00:01]} 2. Rxc8 {[%emt 0:00:01]} Qxc8 {} 3. Qe7# 1-0
I need to remove any text appearing within the flower brackets { and }, including the brackets themselves.
In the above example, there are three instances of such flower brackets. But some rows might have more than that.
I tried =MID(LEFT(A2,FIND("}",A2)-1),FIND("{",A2)+1,LEN(A2))
This outputs: [%emt 0:00:05]. As you can see, this is only the very first instance of text between those flower brackets.
And if we use this within SUBSTITUTE like this: =SUBSTITUTE(A2,MID(LEFT(A2,FIND("}",A2)),FIND("{",A2),LEN(A2)),"")
I get an output like this:
1. Rc8 Rxc8 {[%emt 0:00:01]} 2. Rxc8 {[%emt 0:00:01]} Qxc8 {} 3. Qe7# 1-0
If you have noticed, only one instance is removed. How do I make it work for all instances? thanks.
Highlight everything.
Go to Replace.
Enter {*} in the "Find what" box.
Leave "Replace with" blank.
This should replace all flower brackets and anything in between them.
It is not that easy without VBA, but there is still a way.
Either (as suggested by yu_ominae) just use a formula like this and auto-fill it:
=IFERROR(SUBSTITUTE(A2,MID(LEFT(A2,FIND("}",A2)),FIND("{",A2),LEN(A2)),""),A2)
Another way would be iterative calculations (go to options -> formulas -> check the "enable iterative calculations" button)
To do it in one cell, you need one helper cell (C1 in my example); then use a formula like this in B2 and auto-fill down:
=IF($C$1,A2,IFERROR(SUBSTITUTE(B2,MID(LEFT(B2,FIND("}",B2)),FIND("{",B2),LEN(B2)),""),B2))
Put "1" in C1 and all formulas in B:B will show the values of A:A. Now go to C1 and hit the del-key several times (you will see the "{}"-parts disappearing) till all looks like you want it.
EDIT: To do it via VBA but without regex you can simply put this into a module:
Public Function DELBRC(ByVal str As String) As String
    ' Keep cutting out the first "{...}" span while a "{" with a later "}" exists
    While InStr(str, "{") > 0 And InStr(str, "}") > InStr(str, "{")
        str = Left(str, InStr(str, "{") - 1) & Mid(str, InStr(str, "}") + 1)
    Wend
    DELBRC = Trim(str)
End Function
and then in the worksheet directly use:
=DELBRC(A2)
If you still have any questions, just ask ;)
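For reference, the loop in DELBRC can be written out step by step in Python. This is a sketch of the same logic rather than anything Excel runs; delbrc is only an illustrative name.

```python
def delbrc(s: str) -> str:
    while True:
        start = s.find("{")   # position of the first "{", or -1
        end = s.find("}")     # position of the first "}", or -1
        # Stop when there is no "{" or no "}" after it, as the VBA While test does.
        if start == -1 or end <= start:
            break
        # Cut out everything from "{" through "}" inclusive.
        s = s[:start] + s[end + 1:]
    return s.strip()

print(delbrc("1. Rc8 {[%emt 0:00:05]} Rxc8 {} 2. Qe7# 1-0"))
```

Only the braces and their contents are cut, so the spaces on both sides of each removed span remain and the result contains double spaces.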
Try a user defined function. In VBA, add a reference to "Microsoft VBScript Regular Expressions 5.5" (Tools -> References). Then add this code in a module.
Function RemoveTags(ByVal Value As String) As String
    Dim rx As New RegExp
    rx.Global = True
    ' Optional leading space, then the shortest "{...}" span
    rx.Pattern = " ?{.*?}"
    RemoveTags = Trim(rx.Replace(Value, ""))
End Function
On the worksheet in the cell enter: =RemoveTags(A1) or whatever the address is where you want to remove text.
If you want to test it in VBA:
Sub test()
    Dim a As String
    a = "Rc8 {[%emt 0:00:05]} Rxc8 {[%emt 0:00:01]}"
    Debug.Print RemoveTags(a)
End Sub
Outputs "Rc8 Rxc8"
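The same pattern can be tried on the full row from the question in Python. This sketch assumes the non-greedy .*? behaves the same in Python's re module; the braces are escaped here, which Python does not strictly need but which makes the intent clearer.

```python
import re

def remove_tags(value: str) -> str:
    # " ?" takes the space before each brace with it, so no double spaces are left;
    # ".*?" is non-greedy, so each match stops at the nearest closing brace.
    return re.sub(r" ?\{.*?\}", "", value).strip()

row = "1. Rc8 {[%emt 0:00:05]} Rxc8 {[%emt 0:00:01]} 2. Rxc8 {[%emt 0:00:01]} Qxc8 {} 3. Qe7# 1-0"
print(remove_tags(row))
```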

Check if cell is only a-z excel

I would like to be sure that all my cells contain only letters (A-Z/a-z). I want to be sure there aren't any symbols, numbers or anything else. Any tips?
For example I have this "Š".
As a VBA function, the following should work:
Option Compare Binary

Function LettersOnly(S As String) As Boolean
    ' True only when S is non-empty and has no character outside A-Z / a-z
    LettersOnly = Not S Like "*[!A-Za-z]*" And S <> ""
End Function
In using the function, S can be either an actual string, or a reference to the cell of concern.
EDIT: Also, you want to be certain you have not set Option Compare Text in your code. The default is Option Compare Binary which is what you want for this type of comparison. I have added that to the code for completeness.
Open the VBA editor (Alt+F11) and create a new module.
Add a reference to "Microsoft VBScript Regular Expressions 5.5" (Tools -> References).
In your new module, create a new function like this:
Function IsAToZOnly(inputStr As String) As Boolean
    Dim pattern As String: pattern = "^[A-Za-z]*$"
    Dim regEx As New RegExp
    regEx.pattern = pattern
    IsAToZOnly = regEx.Test(inputStr)
End Function
Use the new function in your worksheet:
=IsAToZOnly(A1)
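A quick way to check edge cases such as "Š" or an empty cell is to run the same test in Python. This is a sketch with an illustrative function name; it uses + rather than *, so an empty string is rejected, which matches the Like-based function above but not the "^[A-Za-z]*$" pattern.

```python
import re

def letters_only(s: str) -> bool:
    # fullmatch requires the whole string to consist of one or more ASCII letters.
    return re.fullmatch(r"[A-Za-z]+", s) is not None

for value in ("Hello", "Š", "abc1", ""):
    print(repr(value), letters_only(value))
```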
