Script or macro that deletes certain characters in a column - excel

I need to write (or find) a script that when run, deletes everything in a column except what is between two quotation marks. It should look something like this:
Before: DEFAULT "443562765560"
After: 443562765560
So basically it deleted everything after and before the quotation marks, just leaving what was inside.
Can someone point me in the right direction?

You could use the Split function:
MyArray = Split(TextToSplit, Delim)
This returns a zero based array: MyArray(0) is everything to the left of the first Delim, MyArray(1) is everything between the first and second Delim's, etc. It's a little tricky when Delim is double quotes --- I like to use Delim=Chr(34).
To extract everything between the first two double quotes you could do something like:
c.Formula = Split(c.Value, Chr(34))(1)
where c is a cell. The (1) on the end extracts element 1 from the 0-based array returned by Split.
Hope that helps.

Related

Excel Formula To Replicate Text To Column Functionality

I would like a formula in excel that does what Text To Columns does.
For example the following string in A1
" text with a comma, stays in one column",," keep starting blank text",1,2,3,"123"
Would be split into multiple cells like this...
The following LET Function allows you to split the text into columns based on the splitter character (in this instance a comma).
It ignores commas that are between quotes (the Delim argument - which has double quotes in it).
It does this by ensuring there is an even number of quotes before the splitter character.
=LET(
NOTES,"Splits a string but also checks to see if the splitter is inside a delimiter. So will ignore a comma inside quotes.",
RawString,$A1,
Splitter,",",Note2,"This is the character to split the string by",
Delim,"""",Note4,"This is the text delimiter it looks odd but it's just a double quote - change to "" if you don't want text delimitation",
IgnoreBlanks,FALSE,
CleanTextDelims,TRUE,
TrimBlanks,FALSE,
SplitString,Splitter&RawString&Splitter,Note3,"Add the splitter to the start and the end to help create the array of split positions",
StringLength,LEN(SplitString),
Seq,SEQUENCE(1,StringLength),Note5,"Get a sequence from 1 to the length of the split string",
Note6,"The below does the bulk of the work. It works out if we are at an odd or even point in terms of count of text delimiters up to the point in the sequence we are processing.",
Note7,"if we are at an even point and we have a delimiter then make a note of the sequence otherwise put a blank.",
PosArray,IF(Seq=StringLength,Seq,IF(MOD(LEN(LEFT(SplitString,Seq))-LEN(SUBSTITUTE(LEFT(SplitString,Seq),Delim,"")),2)=0,IF(MID(SplitString,Seq,1)=Splitter,Seq,""),"")),
PosArrayClean,FILTER(PosArray,PosArray<>""),Note8,"Clean blanks",
StartArray,FILTER(PosArrayClean,PosArrayClean<>StringLength),
EndArray,FILTER(PosArrayClean,PosArrayClean<>1),
StringArray,MID(SplitString,StartArray+1,EndArray-StartArray-1),
StringArrayB,IF(IgnoreBlanks,FILTER(StringArray,StringArray<>""),StringArray),
StringArrayC,IF(CleanTextDelims,IF(LEFT(StringArrayB,1)=Delim,MID(StringArrayB,2,IF(RIGHT(StringArrayB,1)=Delim,LEN(StringArrayB)-2,LEN(StringArrayB))),StringArrayB),StringArrayB),
IFERROR(IF(TrimBlanks,TRIM(StringArrayC),StringArrayC),"")
)
Breaking down each step in the LET formula:
Supply the raw string (from cell A1 in this case)
Set the splitter character - in this case a comma
Set the text delimiter - in this case double quotes (looks odd because it has to be as double double quotes - Delim,"""" )
IgnoreBlanks is an option to exclude blank cells in the output
CleanTextDelims will clean the TextDelimiter (Double quotes) from the start and end of the resultant string
Create a SplitString variable with the split character at the front and back.
Get the length of the string for ease of use
Get a sequence from 1 to the length of the string.
Get an array of the position of characters that are splitters with an even number of Text Delimiters to the left of that position in the string the posArray (splitter position array).
Clean the blanks to get the posArrayClean
Create a start and end array (start array ignores the last and end array ignores the first item in the PosArrayClean)
Get the array of strings/cells to output.
If the IgnoreBlanks is used then igore blank cells
If the CleanTextDelims option is set then strip off the Text Delim (double quotes) from the start and end of the resultant string.
If the TrimBlanks option is set then trim blank spaces off the start and end of the resulting strings.
Hopefully the notes explain clearly how this works and make it easy to modify.
If you want create a named Lambda to use you can use the following code to paste into the formula of a named range called SplitStringDelim (you can name it what you like of course). NB You can't have the line separators in this and I stripped the notes out of it.
=LAMBDA(StringRaw,SplitChar,DelimChar,IgnoreBlank,CleanTextDelim,TrimBlank, LET( RawString,StringRaw, Splitter,SplitChar, Delim,DelimChar, IgnoreBlanks,IgnoreBlank, CleanTextDelims,CleanTextDelim, TrimBlanks,TrimBlank, SplitString,Splitter&RawString&Splitter, StringLength,LEN(SplitString), Seq,SEQUENCE(1,StringLength), PosArray,IF(Seq=StringLength,Seq,IF(MOD(LEN(LEFT(SplitString,Seq))-LEN(SUBSTITUTE(LEFT(SplitString,Seq),Delim,"")),2)=0,IF(MID(SplitString,Seq,1)=Splitter,Seq,""),"")), PosArrayClean,FILTER(PosArray,PosArray<>""),Note8,"Clean blanks", StartArray,FILTER(PosArrayClean,PosArrayClean<>StringLength), EndArray,FILTER(PosArrayClean,PosArrayClean<>1), StringArray,MID(SplitString,StartArray+1,EndArray-StartArray-1), StringArrayB,IF(IgnoreBlanks,FILTER(StringArray,StringArray<>""),StringArray), StringArrayC,IF(CleanTextDelims,IF(LEFT(StringArrayB,1)=Delim,MID(StringArrayB,2,IF(RIGHT(StringArrayB,1)=Delim,LEN(StringArrayB)-2,LEN(StringArrayB))),StringArrayB),StringArrayB), IFERROR(IF(TrimBlanks,TRIM(StringArrayC),StringArrayC),"")))

Remove text from excel cell before first occurance of special character [duplicate]

Is there an efficient way to identify the last character/string match in a string using base functions? I.e. not the last character/string of the string, but the position of a character/string's last occurrence in a string. Search and find both work left-to-right so I can't think how to apply without lengthy recursive algorithm. And this solution now seems obsolete.
I think I get what you mean. Let's say for example you want the right-most \ in the following string (which is stored in cell A1):
Drive:\Folder\SubFolder\Filename.ext
To get the position of the last \, you would use this formula:
=FIND("#",SUBSTITUTE(A1,"\","#",(LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))/LEN("\")))
That tells us the right-most \ is at character 24. It does this by looking for "#" and substituting the very last "\" with an "#". It determines the last one by using
(len(string)-len(substitute(string, substring, "")))\len(substring)
In this scenario, the substring is simply "\" which has a length of 1, so you could leave off the division at the end and just use:
=FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))
Now we can use that to get the folder path:
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))))
Here's the folder path without the trailing \
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))-1)
And to get just the filename:
=MID(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))+1,LEN(A1))
However, here is an alternate version of getting everything to the right of the last instance of a specific character. So using our same example, this would also return the file name:
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
How about creating a custom function and using that in your formula? VBA has a built-in function, InStrRev, that does exactly what you're looking for.
Put this in a new module:
Function RSearch(str As String, find As String)
RSearch = InStrRev(str, find)
End Function
And your function will look like this (assuming the original string is in B1):
=LEFT(B1,RSearch(B1,"\"))
New Answer | 31-3-2022:
With even newer functions come even shorter answers. At time of writing in BETA, but probably widely available in the near future, we can use TEXTBEFORE():
=LEN(TEXTBEFORE(A2,B2,-1))+1
The trick here is that the 3rd parameter tells the function to retrieve the last occurence of the substring we give in the 2nd parameter. At time of writing this function is still case-sensitive by default which could be handeld by the optional 4th parameter.
Original Answer | 17-6-2020:
With newer versions of excel come new functions and thus new methods. Though it's replicable in older versions (yet I have not seen it before), when one has Excel O365 one can use:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),1)="Y"))
This can also be used to retrieve the last position of (overlapping) substrings:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),2)="YY"))
| Value | Pattern | Formula | Position |
|--------|---------|------------------------------------------------|----------|
| XYYZ | Y | =MATCH(2,1/(MID(A2,SEQUENCE(LEN(A2)),1)="Y")) | 3 |
| XYYYZ | YY | =MATCH(2,1/(MID(A3,SEQUENCE(LEN(A3)),2)="YY")) | 3 |
| XYYYYZ | YY | =MATCH(2,1/(MID(A4,SEQUENCE(LEN(A4)),2)="YY")) | 4 |
Whilst this both allows us to no longer use an arbitrary replacement character and it allows overlapping patterns, the "downside" is the useage of an array.
Note: You can force the same behaviour in older Excel versions through either
=MATCH(2,1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"))
Entered through CtrlShiftEnter, or using an inline INDEX to get rid of implicit intersection:
=MATCH(2,INDEX(1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"),))
tigeravatar and Jean-François Corbett suggested to use this formula to generate the string right of the last occurrence of the "\" character
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
If the character used as separator is space, " ", then the formula has to be changed to:
=SUBSTITUTE(RIGHT(SUBSTITUTE(A1," ",REPT("{",LEN(A1))),LEN(A1)),"{","")
No need to mention, the "{" character can be replaced with any character that would not "normally" occur in the text to process.
Just came up with this solution, no VBA needed;
Find the last occurance of "_" in my example;
=IFERROR(FIND(CHAR(1);SUBSTITUTE(A1;"_";CHAR(1);LEN(A1)-LEN(SUBSTITUTE(A1;"_";"")));0)
Explained inside out;
SUBSTITUTE(A1;"_";"") => replace "_" by spaces
LEN( *above* ) => count the chars
LEN(A1)- *above* => indicates amount of chars replaced (= occurrences of "_")
SUBSTITUTE(A1;"_";CHAR(1); *above* ) => replace the Nth occurence of "_" by CHAR(1) (Nth = amount of chars replaced = the last one)
FIND(CHAR(1); *above* ) => Find the CHAR(1), being the last (replaced) occurance of "_" in our case
IFERROR( *above* ;"0") => in case no chars were found, return "0"
Hope this was helpful.
You could use this function I created to find the last instance of a string within a string.
Sure the accepted Excel formula works, but it's much too difficult to read and use. At some point you have to break out into smaller chunks so it's maintainable. My function below is readable, but that's irrelevant because you call it in a formula using named parameters. This makes using it simple.
Public Function FindLastCharOccurence(fromText As String, searchChar As String) As Integer
Dim lastOccur As Integer
lastOccur = -1
Dim i As Integer
i = 0
For i = Len(fromText) To 1 Step -1
If Mid(fromText, i, 1) = searchChar Then
lastOccur = i
Exit For
End If
Next i
FindLastCharOccurence = lastOccur
End Function
I use it like this:
=RIGHT(A2, LEN(A2) - FindLastCharOccurence(A2, "\"))
Considering a part of a Comment made by #SSilk my end goal has really been to get everything to the right of that last occurence an alternative approach with a very simple formula is to copy a column (say A) of strings and on the copy (say ColumnB) apply Find and Replace. For instance taking the example: Drive:\Folder\SubFolder\Filename.ext
This returns what remains (here Filename.ext) after the last instance of whatever character is chosen (here \) which is sometimes the objective anyway and facilitates finding the position of the last such character with a short formula such as:
=FIND(B1,A1)-1
I'm a little late to the party, but maybe this could help. The link in the question had a similar formula, but mine uses the IF() statement to get rid of errors.
If you're not afraid of Ctrl+Shift+Enter, you can do pretty well with an array formula.
String (in cell A1):
"one.two.three.four"
Formula:
{=MAX(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)))} use Ctrl+Shift+Enter
Result: 14
First,
ROW($1:$99)
returns an array of integers from 1 to 99: {1,2,3,4,...,98,99}.
Next,
MID(A1,ROW($1:$99),1)
returns an array of 1-length strings found in the target string, then returns blank strings after the length of the target string is reached: {"o","n","e",".",..."u","r","","",""...}
Next,
IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99))
compares each item in the array to the string "." and returns either the index of the character in the string or FALSE: {FALSE,FALSE,FALSE,4,FALSE,FALSE,FALSE,8,FALSE,FALSE,FALSE,FALSE,FALSE,14,FALSE,FALSE.....}
Last,
=MAX(IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99)))
returns the maximum value of the array: 14
Advantages of this formula is that it is short, relatively easy to understand, and doesn't require any unique characters.
Disadvantages are the required use of Ctrl+Shift+Enter and the limitation on string length. This can be worked around with a variation shown below, but that variation uses the OFFSET() function which is a volatile (read: slow) function.
Not sure what the speed of this formula is vs. others.
Variations:
=MAX((MID(A1,ROW(OFFSET($A$1,,,LEN(A1))),1)=".")*ROW(OFFSET($A$1,,,LEN(A1)))) works the same way, but you don't have to worry about the length of the string
=SMALL(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd occurrence of the match
=LARGE(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd-to-last occurrence of the match
=MAX(IF(MID(I16,ROW($1:$99),2)=".t",ROW($1:$99))) matches a 2-character string **Make sure you change the last argument of the MID() function to the number of characters in the string you wish to match!
In newer versions of Excel (2013 and up) flash fill might be a simple and quick solution see: Using Flash Fill in Excel
.
For a string in A1 and substring in B1, use:
=XMATCH(B1,MID(A1,SEQUENCE(LEN(A1)),LEN(B1)),,-1)
Working from inside out, MID(A1,SEQUENCE(LEN(A1)),LEN(B1)) splits string A1 into a dynamic array of substrings, each the length of B1. To find the position of the last occurrence of substring B1, we use XMATCH with its Search_mode argument set to -1.
A simple way to do that in VBA is:
YourText = "c:\excel\text.txt"
xString = Mid(YourText, 2 + Len(YourText) - InStr(StrReverse(YourText), "\" ))
Very late to the party, but A simple solution is using VBA to create a custom function.
Add the function to VBA in the WorkBook, Worksheet, or a VBA Module
Function LastSegment(S, C)
LastSegment = Right(S, Len(S) - InStrRev(S, C))
End Function
Then the cell formula
=lastsegment(B1,"/")
in a cell and the string to be searched in cell B1 will populate the cell with the text trailing the last "/" from cell B1.
No length limit, no obscure formulas.
Only downside I can think is the need for a macro-enabled workbook.
Any user VBA Function can be called this way to return a value to a cell formula, including as a parameter to a builtin Excel function.
If you are going to use the function heavily you'll want to check for the case when the character is not in the string, then string is blank, etc.
If you're only looking for the position of the last instance of character "~" then
=len(substitute(String,"~",""))+1
I'm sure there is version that will work with the last instance of a string but I have to get back to work.
Cell A1 = find/the/position/of/the last slash
simple way to do it is reverse the text and then find the first slash as normal. Now you can get the length of the full text minus this number.
Like so:
=LEN(A1)-FIND("/",REVERSETEXT(A1),1)+1
This returns 21, the position of the last /

How would I remove all leading alphabetical characters?

I am interested in removing leading alphabetical (alpha) characters from cells which appear in a column. I only wish to remove the leading alpha characters (including UPPER and LOWER case): if alpha characters appear after a number they should be kept. Some cells in the column might not have leading alpha characters.
Here is an example of what I have:
36173
PIL51014
4UNV22001
ZEB54010
BICMPAFG11BK
BICMPF11
Notice how there are not always the same number of leading alpha characters. I cannot simply use a Left or Right function in Excel, because the number of characters I wish to keep and remove varies.
A correct output for what I am looking for would look like:
36173
51014
4UNV22001
54010
11BK
11
Notice how the second to last row preserved the characters "BK", and the 3rd row preserved "UNV". I cannot simply remove all alpha characters.
I am a beginner with visual basic and was not able to figure out how to use excel functions to address my issue. How would I do this?
Here is an Excel formula that will "strip off the leading alpha characters" Actually, it looks for the first numeric character, and returns everything after that:
=MID(A1,MIN(FIND({0;1;2;3;4;5;6;7;8;9},A1&"0123456789")),99)
The 99 at the end needs to be some value longer than the longest string you might be processing. 99 usually works.
Here's a formula based solution complete with test results:
=MID(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789"),255),100)
Change the 100 at the end if any string may be longer than 100 characters. Also the 255 is not needed, but it won't hurt.
This short UDF should strip off leading alphabetic characters.
Function noLeadAlpha(str As String)
If Not IsNumeric(str) Then
Do While Asc(str) < 48 Or Asc(str) > 57
str = Mid(str, 2)
If Not CBool(Len(str)) Then Exit Do
Loop
End If
noLeadAlpha = str
End Function
        
Koodos Jeeped, you beat me to it.
But here is an alternative anyway:
Function RemoveAlpha(aString As String) As String
For i = 1 To Len(aString)
Select Case Mid(aString, i, 1)
Case "0" To "9"
RemoveAlpha = Right(aString, Len(aString) - i + 1): Exit For
End Select
Next i
End Function

Excel copying part of a string between characters

Is there a way of returning part of a string between certain characters in excel? For example my string looks like this:
`switchrefid` = {switchrefid: }
I need to cut the part of the string between the ' (apostrophes) so it just returns switchrefid
I'm sure there must be a formula for this i just cant think of the one to use.
Thanks in advance.
As long as the ``` characters occur exactly twice in your data, you can do:
=LEFT(RIGHT(A1, LEN(A1)-FIND("`", A1)), FIND("`",RIGHT(A1, LEN(A1)-FIND("`", A1)))-1)
Although it is pretty horrible!
(Edit: this assumes your data is in A1, of course.)
Just two more options, if the word always starts at the second Character and ends just before the last you could simply use :
=MID(A1,2,LEN(A1)-2) ' Minus 2 for the 2 ticks
And the second option would be to substitute the tick with nothing like so:
=SUBSTITUTE(A1,"`","")
With the substitute is also supports a number of substitutes. So if you had `switchrefid`` for some reason and only want to get rid of 2 of the three ticks you could use:
=SUBSTITUTE(A1,"`","",2)
and this would return switchrefid`
although it would not work for ''switchrefid' as it would STILL return switchrefid' because it only removes the 1st 2 instances of the text to remove

Excel formula contains error

I have an error in this excel formula and I can't just figure it out:
=LEFT(B3,FIND(",",B3&",")-1)&","&RIGHT(B3,LEN(B3)-FIND("&",B3&"&")),RIGHT(B3,LEN(B3)-SEARCH("#",SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ",""))))&", "&SUBSTITUTE(RIGHT(B3,LEN(B3)-FIND("&",B3&"&")-1),RIGHT(B3,LEN(B3)-SEARCH("#",SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ",""))))),""))
It may seem like a big formula, but all it's intended to do is if no ampersand is in a cell, return an empty cell, if no comma but ampersand exists, then return this, for example:
KNUD J & MARIA L HOSTRUP
into this:
HOSTRUP,MARIA L
Otherwise, there is no ampersand but there is a comma so we just return: LEFT(A1,FIND("&",A1,1)-1).
Seems basic, but formula has been giving me error message and doesn't point to the problem.
Your error is here:
=LEFT(B3,FIND(",",B3&",")-1)&","&RIGHT(B3,LEN(B3)-FIND("&",B3&"&")),
At this point, the comma doesn't apply to anthing, because the right operator has matching parens
As far as what you want? Let's break that up into what you actually asked for:
if no ampersand in a cell, return empty cell,
B4=Find("&", B3&"&")
B5=IF(B4>LEN(B3),"",B6)
if no comma but ampersand exists
B6=IF(FIND(",", B3&",")>LEN(B3),B8,B7)
then turn this, for example:
KNUD J & MARIA L HOSTRUP
into this:
HOSTRUP,MARIA L
I'm presuming you mean to put the last whole word? Let's mark the last whole word:
B9=SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ","")))
B10=RIGHT(B7,LEN(B9)-FIND("#",B9))
And the stuff between the ampersand and the last word
B11=TRIM(MID(B9,B4 + 1, LEN(B9)-FIND("#",B9)-1))
Then calculating it is easy
B7=B10&","&B11
Otherwise, there is no ampersand but there is a comma so we just return:
LEFT(A1,FIND("&",A1,1)-1).
Well, if you want that, let's just put that in B8
B8=LEFT(A1,FIND("&",A1,1)-1)
(But I think you actually mean B3 instead of A1)
B8=LEFT(B3,FIND("&",B3,1)-1)
And there you have it (B5 contains the information you're looking for) It took a few cells, but it's easier to debug this way. If you want to collapse it, you can (but doing so is more code, because we can reduce duplication by referencing a previously calculated cell on more than one occasion).
Summary:
B3=<Some Name with & or ,>
B4=FIND("&", B3&"&")
B5=IF(B4>LEN(B3),"",B6)
B6=IF(FIND(",", B3&",")>LEN(B3),B7,B8)
B7=B10&","&B11
B8=LEFT(B3,FIND("&",B3,1)-1)
B9=SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ","")))
B10=RIGHT(B9,LEN(B9)-FIND("#",B9))
B11=TRIM(MID(B9,B4 + 1, LEN(B9)-FIND("#",B9)-1))
When I put in "KNUD J & MARIA L HOSTRUP", I get "HOSTRUP,MARIA" in B5.

Resources