Excel Formula To Replicate Text To Column Functionality - string

I would like a formula in Excel that does what Text To Columns does.
For example the following string in A1
" text with a comma, stays in one column",," keep starting blank text",1,2,3,"123"
Would be split into multiple cells like this...

The following LET Function allows you to split the text into columns based on the splitter character (in this instance a comma).
It ignores commas that are between quotes (the Delim argument - which has double quotes in it).
It does this by ensuring there is an even number of quotes before the splitter character.
=LET(
NOTES,"Splits a string but also checks to see if the splitter is inside a delimiter. So will ignore a comma inside quotes.",
RawString,$A1,
Splitter,",",Note2,"This is the character to split the string by",
Delim,"""",Note4,"This is the text delimiter it looks odd but it's just a double quote - change to "" if you don't want text delimitation",
IgnoreBlanks,FALSE,
CleanTextDelims,TRUE,
TrimBlanks,FALSE,
SplitString,Splitter&RawString&Splitter,Note3,"Add the splitter to the start and the end to help create the array of split positions",
StringLength,LEN(SplitString),
Seq,SEQUENCE(1,StringLength),Note5,"Get a sequence from 1 to the length of the split string",
Note6,"The below does the bulk of the work. It works out if we are at an odd or even point in terms of count of text delimiters up to the point in the sequence we are processing.",
Note7,"if we are at an even point and we have a delimiter then make a note of the sequence otherwise put a blank.",
PosArray,IF(Seq=StringLength,Seq,IF(MOD(LEN(LEFT(SplitString,Seq))-LEN(SUBSTITUTE(LEFT(SplitString,Seq),Delim,"")),2)=0,IF(MID(SplitString,Seq,1)=Splitter,Seq,""),"")),
PosArrayClean,FILTER(PosArray,PosArray<>""),Note8,"Clean blanks",
StartArray,FILTER(PosArrayClean,PosArrayClean<>StringLength),
EndArray,FILTER(PosArrayClean,PosArrayClean<>1),
StringArray,MID(SplitString,StartArray+1,EndArray-StartArray-1),
StringArrayB,IF(IgnoreBlanks,FILTER(StringArray,StringArray<>""),StringArray),
StringArrayC,IF(CleanTextDelims,IF(LEFT(StringArrayB,1)=Delim,MID(StringArrayB,2,IF(RIGHT(StringArrayB,1)=Delim,LEN(StringArrayB)-2,LEN(StringArrayB))),StringArrayB),StringArrayB),
IFERROR(IF(TrimBlanks,TRIM(StringArrayC),StringArrayC),"")
)
Breaking down each step in the LET formula:
Supply the raw string (from cell A1 in this case)
Set the splitter character - in this case a comma
Set the text delimiter - in this case a double quote (it looks odd because a double quote inside a string literal has to be doubled - Delim,"""" )
IgnoreBlanks is an option to exclude blank cells in the output
CleanTextDelims will clean the TextDelimiter (Double quotes) from the start and end of the resultant string
Create a SplitString variable with the split character at the front and back.
Get the length of the string for ease of use
Get a sequence from 1 to the length of the string.
Get an array of the positions of the splitter characters that have an even number of text delimiters to their left in the string: the PosArray (splitter position array).
Clean the blanks to get the PosArrayClean
Create a start and end array (start array ignores the last and end array ignores the first item in the PosArrayClean)
Get the array of strings/cells to output.
If the IgnoreBlanks option is set then ignore blank cells.
If the CleanTextDelims option is set then strip off the Text Delim (double quotes) from the start and end of the resultant string.
If the TrimBlanks option is set then trim blank spaces off the start and end of the resulting strings.
Hopefully the notes explain clearly how this works and make it easy to modify.
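For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of the same idea: a splitter only counts when the number of text delimiters seen so far is even. The function and parameter names are hypothetical (chosen to mirror the LET variables), it assumes a single-character splitter and delimiter, and str.strip only approximates Excel's TRIM, which also collapses repeated inner spaces.

```python
def split_string_delim(raw, splitter=",", delim='"',
                       ignore_blanks=False, clean_text_delims=True,
                       trim_blanks=False):
    """Split raw at splitter, ignoring splitters inside delim pairs."""
    fields = []
    start = 0
    delim_count = 0  # text delimiters seen so far
    for pos, char in enumerate(raw):
        if char == delim:
            delim_count += 1
        elif char == splitter and delim_count % 2 == 0:
            # Even count: we are outside quotes, so this splitter is real.
            fields.append(raw[start:pos])
            start = pos + 1
    fields.append(raw[start:])  # text after the last real splitter

    if ignore_blanks:
        fields = [f for f in fields if f != ""]
    if clean_text_delims:
        cleaned = []
        for f in fields:
            if delim and f.startswith(delim):
                # Drop the leading delimiter, and the trailing one if present.
                f = f[1:-1] if f.endswith(delim) and len(f) > 1 else f[1:]
            cleaned.append(f)
        fields = cleaned
    if trim_blanks:
        fields = [f.strip() for f in fields]  # approximation of TRIM
    return fields


example = '" text with a comma, stays in one column",," keep starting blank text",1,2,3,"123"'
print(split_string_delim(example))
```

With the defaults shown, the quoted comma stays inside the first field and the empty second field is kept, as in the LET version with IgnoreBlanks set to FALSE.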
If you want to create a named Lambda, you can paste the following code into the formula of a named range called SplitStringDelim (you can name it whatever you like, of course). NB: You can't have the line separators in this, and I stripped the notes out of it.
=LAMBDA(StringRaw,SplitChar,DelimChar,IgnoreBlank,CleanTextDelim,TrimBlank, LET( RawString,StringRaw, Splitter,SplitChar, Delim,DelimChar, IgnoreBlanks,IgnoreBlank, CleanTextDelims,CleanTextDelim, TrimBlanks,TrimBlank, SplitString,Splitter&RawString&Splitter, StringLength,LEN(SplitString), Seq,SEQUENCE(1,StringLength), PosArray,IF(Seq=StringLength,Seq,IF(MOD(LEN(LEFT(SplitString,Seq))-LEN(SUBSTITUTE(LEFT(SplitString,Seq),Delim,"")),2)=0,IF(MID(SplitString,Seq,1)=Splitter,Seq,""),"")), PosArrayClean,FILTER(PosArray,PosArray<>""), StartArray,FILTER(PosArrayClean,PosArrayClean<>StringLength), EndArray,FILTER(PosArrayClean,PosArrayClean<>1), StringArray,MID(SplitString,StartArray+1,EndArray-StartArray-1), StringArrayB,IF(IgnoreBlanks,FILTER(StringArray,StringArray<>""),StringArray), StringArrayC,IF(CleanTextDelims,IF(LEFT(StringArrayB,1)=Delim,MID(StringArrayB,2,IF(RIGHT(StringArrayB,1)=Delim,LEN(StringArrayB)-2,LEN(StringArrayB))),StringArrayB),StringArrayB), IFERROR(IF(TrimBlanks,TRIM(StringArrayC),StringArrayC),"")))

Related

MS Excel Formula assistance

I have a cell I need to split into 2 cells.
Data Sample: Note: All Cells are formatted as TEXT
"3851v61_18.005_ Have the anchors for all suspended scaffolding system suspension lines and separate vertical lifelines been verified? "
Data Sample 2: Parent_ID
Steps:
Need to check to see if the cell value starts with number.
Also, if it contains a special character ("_"), it may have more than one.
Display cell #1 = just the ID number containing the underscore(s).
Display cell #2 - Just the text right of the underscore. However, if the original cell only starts with Alpha characters then display the actual value. ie. Parent_Id
Strip off any erroneous underscores left hanging.
Expected results:
Cell #:
"3851v61_18.005" (ID Number portion of the Text)
"Have the anchors for all suspended scaffolding system suspension lines and separate vertical lifelines been verified?"
This is what I have so far: (If it does not start with a number, then return the value of the cell, else continue with the equation)
=IF(NUMBERVALUE(LEFT(C321,1))>=1,IFERROR(LEFT(C321, FIND("_",C321)-1), C321),FALSE)
=IFERROR(RIGHT(C321,LEN(C321)-FIND("_",C321)), C321)
If there is more than one underscore, Cell 1 needs to include everything up to the last underscore and strip off the text after it. Likewise, Cell 2 should display only the text after the last underscore.
Thank you for any assistance offered.
I think I understand but am not 100% sure.
Try something like the below to get the full string (if it starts with something that isn't a number) or the string up to the last underscore (if it does start with a number):
=IF(NOT(ISNUMBER(NUMBERVALUE(LEFT($D1,1)))), $D1,
LEFT($D1, FIND("!!!", SUBSTITUTE($D1, "_", "!!!",
LEN($D1)-LEN(SUBSTITUTE($D1, "_", ""))))-1))
Then in a similar fashion try something like the below to get the full string (if it starts with something that isn't a number) or the string to right of the last underscore (if it does start with a number):
=IF(NOT(ISNUMBER(NUMBERVALUE(LEFT($D1,1)))), $D1,
RIGHT($D1, LEN($D1)-FIND("!!!", SUBSTITUTE($D1, "_", "!!!",
LEN($D1)-LEN(SUBSTITUTE($D1, "_", ""))))))
For example:
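The rule behind the two formulas can also be cross-checked with a short Python sketch. The function name is hypothetical. Two things here go beyond the formulas and are assumptions on my part: a value that starts with a digit but contains no underscore is returned unchanged, and the text part is stripped of surrounding spaces to match the expected result in the question.

```python
def split_id_and_text(value):
    """Return (cell_1, cell_2) for one input cell."""
    starts_with_digit = value[:1] in tuple("0123456789")
    if not starts_with_digit:
        # Alpha-led values such as Parent_ID are shown as-is in both cells.
        return value, value
    # rpartition splits at the LAST underscore, like the SUBSTITUTE/FIND trick.
    head, underscore, tail = value.rpartition("_")
    if not underscore:
        return value, value  # assumption: no underscore, keep the value
    return head, tail.strip()  # strip() is an extra tidy step


sample = "3851v61_18.005_ Have the anchors been verified? "
print(split_id_and_text(sample))
print(split_id_and_text("Parent_ID"))
```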

Format in Python

I have a list of values as follows:
no column
1. 111-222-11
2. 112-333-12
3. 113-444-13
I want to format the value from 111-222-11 to 111-222-011 and format the other values similarly. Here is my code snippet in Python 3, which I am trying to use for that:
‘{:03}-{:06}-{:03}.format(column)
I hope that you can help.
Assuming that column is a variable that can be assigned string values 111-222-11, 112-333-12, 113-444-13 and so on, which you want to change to 111-222-011, 112-333-012, 113-444-013 and so on, it appears that you tried to use a combination of slice notation and format method to achieve this.
Slice notation
Slice notation, when applied to a string, treats it as a list-like object consisting of characters. The positional index of a character from the beginning of the string starts from zero. The positional index of a character from the end of the string starts with -1. The first colon : separates the beginning and the end of a slice. The end of the slice is not included into it, unlike its beginning. You indicate slices as you would indicate indexes of items in a list by using square brackets:
'111-222-11'[0:8]
would return
'111-222-'
Usually, the indexes of the first and the last characters of the string are skipped and implied by the colon.
Knowing the exact position where you need to add a leading zero before the last two digits of a string assigned to column, you could do it just with slice notation:
column[:8] + '0' + column[-2:]
format method
The format method is a string formatting method. So, you want to use single quotes or double quotes around your strings to indicate them when applying that method to them:
'your output string here'.format('your input string here')
The numbers in the curly brackets are not slices. They are placeholders, where the strings, which are passed to the format method, are inserted. So, combining slices and format method, you could add a leading zero before the last two digits of a column string like this:
'{0}0{1}'.format(column[:8], column[-2:])
Making more slices is not necessary because there is only one place where you want to insert a character.
split method
An alternative to slicing would be using split method to split the string by a delimiter. The split method returns a list of strings. You need to prefix it with * operator to unpack the arguments from the list before passing them to the format method. Otherwise, the whole list will be passed to the first placeholder.
'{0}-{1}-0{2}'.format(*column.split('-'))
It splits the string into a list treating - as the separator and puts each item into a new string, which adds 0 character before the last one.
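Putting the three approaches side by side, here is a small runnable sketch. The last variant, using str.zfill, is not part of the answer above; it is an optional generalization that pads the final part to three digits, so a part that already has three digits would be left unchanged.

```python
column = '111-222-11'

# The three approaches described above.
by_slicing = column[:8] + '0' + column[-2:]
by_format = '{0}0{1}'.format(column[:8], column[-2:])
by_split = '{0}-{1}-0{2}'.format(*column.split('-'))

# Optional generalization (an addition, not from the answer):
# pad the last part with leading zeros up to a width of three.
first, second, last = column.split('-')
by_zfill = '-'.join([first, second, last.zfill(3)])

print(by_slicing, by_format, by_split, by_zfill)
# 111-222-011 111-222-011 111-222-011 111-222-011
```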

Remove next substring from character on last position in Excel

I have Excel sheet which contains data similar to
Addresses
xyz,abc,olk
opn,opk,prt
we-ylj,tyf,uyfas
oiui,ytfy,tydry - We also work in bla,bla,bla
ytfyt,tyfyt,ghfyt
i-hgsd,gsdf-hgd,sdgh,- We also work in xxx,yy,zzz
ytsfgh,gfasdg,tydsfyt
I want to remove the substring that follows the character "-", but only when that "-" is the last one in the string.
Result should be like
xyz,abc,olk
opn,opk,prt
we-ylj,tyf,uyfas
oiui,ytfy,tydry
ytfyt,tyfyt,ghfyt
i-hgsd,gsdf-hgd,sdgh
ytsfgh,gfasdg,tydsfyt
I tried the SUBSTITUTE function but was unable to replace the data, because the last substring after "-" is different in each row.
Going by your specifications, I would use two columns just so it's not a very long formula:
In B1:
=IFERROR(FIND(CHAR(1),SUBSTITUTE(A1,"-",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"-",""))))-1,LEN(A1))
This gets the position just before the last -, or the full text length if there is no -.
Then in C1:
=LEFT(A1,IF(FIND(",",A1)<B1,B1,LEN(A1)))
This checks whether the first , comes before the last -. If it does not, the full text is taken.
EDIT: I only now noticed your edited comment. If it's just everything after - We, then I would use this:
=TRIM(LEFT(A1,IFERROR(FIND("- We",A1)-2,LEN(A1))))
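To see what the two approaches do to the sample rows, here is a rough Python sketch of both. The function names are hypothetical. Leaving rows without a comma unchanged is an assumption of mine, and str.strip only approximates TRIM.

```python
def cut_at_last_dash(address):
    """Mirror of the B1/C1 pair: cut before the last '-' if a ',' comes first."""
    last_dash = address.rfind("-")
    first_comma = address.find(",")
    if last_dash == -1 or first_comma == -1:
        return address  # assumption: leave rows without '-' or ',' unchanged
    # Same comparison as FIND(",",A1)<B1, shifted to 0-based indexes.
    if first_comma + 1 < last_dash:
        return address[:last_dash]
    return address


def cut_at_we_suffix(address, marker="- We"):
    """Mirror of the TRIM/LEFT/FIND formula from the edit."""
    index = address.find(marker)
    if index == -1:
        return address.strip()
    # Drop the marker and the one character before it, then trim.
    return address[:max(index - 1, 0)].strip()


row = "oiui,ytfy,tydry - We also work in bla,bla,bla"
print(repr(cut_at_last_dash(row)))
print(repr(cut_at_we_suffix(row)))
```

Note that the first approach keeps whatever sits directly before the last - (a trailing space or comma), while the second also removes that one character.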

Remove all text and characters except some

I have here some text strings
"16cg-301 -request","16cg-3368 - for review","16cg-3684 - for process"
What I would like to do is remove all the text and characters except the numbers, the letters "cg", and the - that is within the reference code.
If the string you want to extract is always before the first space in the full string then you can use SEARCH and LEFT to extract your reference code:
=LEFT(A1,SEARCH(" ",A1)-1)
This formula would take 16cg-3368 from 16cg-3368 - for review.
I suggest using something like suggested here
How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
With a replace regex similar to this
[^\dcg-]+
or a match regex like this
^([0-9cg -]+).*
Otherwise, you could also work with a strange formula similar to this:
=CONCATENATE(IF(NOT(ISERROR(SEARCH(MID(A2;1;1);"01234567890cg-")>0));MID(A2;1;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;2;1);"01234567890cg-")>0));MID(A2;2;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;3;1);"01234567890cg-")>0));MID(A2;3;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;4;1);"01234567890cg-")>0));MID(A2;4;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;5;1);"01234567890cg-")>0));MID(A2;5;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;6;1);"01234567890cg-")>0));MID(A2;6;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;7;1);"01234567890cg-")>0));MID(A2;7;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;8;1);"01234567890cg-")>0));MID(A2;8;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;9;1);"01234567890cg-")>0));MID(A2;9;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;10;1);"01234567890cg-")>0));MID(A2;10;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;11;1);"01234567890cg-")>0));MID(A2;11;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;12;1);"01234567890cg-")>0));MID(A2;12;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;13;1);"01234567890cg-")>0));MID(A2;13;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;14;1);"01234567890cg-")>0));MID(A2;14;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;15;1);"01234567890cg-")>0));MID(A2;15;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;16;1);"01234567890cg-")>0));MID(A2;16;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;17;1);"01234567890cg-")>0));MID(A2;17;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;18;1);"01234567890cg-")>0));MID(A2;18;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;19;1);"01234567890cg-")>0));MID(A2;19;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;20;1);"01234567890cg-")>0));MID(A2;20;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;21;1);"01234567890cg-")>0));MID(A2;21;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;22;1);"01234567890cg-")>0));MID(A2;22;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;23;1);"01234567890cg-")>0));MID(A2;23;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;24;1);"01234567890cg-")>0));MID(A2;24;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;25;1);"01234567890cg-")>0));MID(A2;25;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;26;1);"01234567890cg-")>0));MID(A2;26;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;27;1);"01234567890cg-")>0));MID(A2;27;1);"");IF(NOT(
ISERROR(SEARCH(MID(A2;28;1);"01234567890cg-")>0));MID(A2;28;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;29;1);"01234567890cg-")>0));MID(A2;29;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;30;1);"01234567890cg-")>0));MID(A2;30;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;31;1);"01234567890cg-")>0));MID(A2;31;1);"");IF(NOT(ISERROR(SEARCH(MID(A2;32;1);"01234567890cg-")>0));MID(A2;32;1);""))
This currently only works for strings of fewer than 33 characters.
The problem here is that you will get unexpected behavior like this:
123cg-123 - Process => 123cg-123-c
After rereading, I think you should try a different approach from the one described in the question ;-)
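The difference between "remove everything that isn't allowed" and "take the leading run of allowed characters" is easy to see in a short Python sketch. The function names are hypothetical and the patterns are my own variants of the regexes above, not the exact ones an Excel regex engine would need.

```python
import re


def strip_other_chars(text):
    """Remove every character that is not a digit, 'c', 'g' or '-'."""
    return re.sub(r"[^0-9cg-]+", "", text)


def leading_code(text):
    """Take only the leading run of digits, 'c', 'g' and '-'."""
    match = re.match(r"[0-9cg-]+", text)
    return match.group(0) if match else ""


for sample in ["16cg-301 -request", "16cg-3368 - for review", "123cg-123 - Process"]:
    print(sample, "=>", strip_other_chars(sample), "|", leading_code(sample))
```

The strip-everything variant keeps the second "-" and any stray "c" or "g" from the description, which is the unexpected behavior mentioned above; matching only the leading run avoids that.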
If you want to return everything up to and including the last digit, then try:
=LEFT(A1,LOOKUP(2,1/ISNUMBER(-MID(A1,seq,1)),seq))
seq is a named formula: Formulas ► Define Name
Name: seq
Refers to: =ROW(INDEX($1:$65535,1,1):INDEX($1:$65535,255,1))
seq returns an array of sequential numbers from 1 to 255.
MID(A1,seq,1)
returns an array consisting of the individual characters in the string in A1. The leading minus sign converts the digits from strings to numbers.
1/ISNUMBER(...) then gives 1 for each digit and an error for everything else, so the LOOKUP function, searching for 2, will return the position of the last digit.
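The same "up to and including the last digit" rule, as a minimal Python sketch for comparison. The function name is hypothetical, and returning an empty string when there is no digit at all is an assumption, since the formula does not give a usable result in that case.

```python
def up_to_last_digit(text):
    """Return text up to and including its last digit."""
    # Positions of all digit characters, like the array built from MID(A1,seq,1).
    digit_positions = [i for i, ch in enumerate(text) if ch in "0123456789"]
    if not digit_positions:
        return ""  # assumption: nothing to return when there is no digit
    # The last position plays the role of the LOOKUP result.
    return text[:digit_positions[-1] + 1]


print(up_to_last_digit("16cg-3368 - for review"))
```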

Script or macro that deletes certain characters in a column

I need to write (or find) a script that when run, deletes everything in a column except what is between two quotation marks. It should look something like this:
Before: DEFAULT "443562765560"
After: 443562765560
So basically it deleted everything after and before the quotation marks, just leaving what was inside.
Can someone point me in the right direction?
You could use the Split function:
MyArray = Split(TextToSplit, Delim)
This returns a zero based array: MyArray(0) is everything to the left of the first Delim, MyArray(1) is everything between the first and second Delim's, etc. It's a little tricky when Delim is double quotes --- I like to use Delim=Chr(34).
To extract everything between the first two double quotes you could do something like:
c.Formula = Split(c.Value, Chr(34))(1)
where c is a cell. The (1) on the end extracts element 1 from the 0-based array returned by Split.
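If you want to try the idea before writing the macro, the same split-and-take-element-1 logic looks like this in Python. This is only a sketch: the function name is hypothetical, and returning the text unchanged when it has fewer than two quotes is an assumption.

```python
def between_first_quotes(text, delim='"'):
    """Return what sits between the first two delim characters."""
    parts = text.split(delim)  # zero-based list, like the array from Split
    if len(parts) < 3:
        return text  # assumption: fewer than two quotes, leave as-is
    return parts[1]  # element 1 is the text between the first two quotes


print(between_first_quotes('DEFAULT "443562765560"'))
```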
Hope that helps.
