Separate words with commas in Excel 2010 - excel

I'm trying to use a formula in Excel to separate a bunch of words in a cell with a comma. If there are more than 5 words in the cell, I just want to get the first 5 words. To get the first five words in a cell and separate them by a comma I use this:
=SUBSTITUTE(LEFT(A1,FIND("^",SUBSTITUTE(A1," ","^",5))-1), " ", ", ")
This works fine. The problem is the hard-coded 5: if a cell contains fewer than 5 words, I get an error. I tried to substitute the 5 with this:
LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1
So my function becomes this:
=SUBSTITUTE(LEFT(A1,FIND("^",SUBSTITUTE(A1," ","^",LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1))-1), " ", ", ")
But this doesn't work, it gives me an error. Any idea how I can do this please?
Also I would like to ignore the first word if its first character is "-" (without the quotes) and just start from the second word. So in other words, I want something like this:
I love my life very much should return I, love, my, life, very
- I love my life very much should return I, love, my, life, very (the "-" is ignored)
I love my should return I, love, my
Thanks in advance for any help

Here's a somewhat different approach. Aside from the "less than 5" issue, it also deals with the "5 words with no space at the end" issue:
=LEFT(A1,FIND("^",SUBSTITUTE(A1 & "^"," ","^",5))-1)
EDIT 1: I just noticed the part about the leading "- ". My addition isn't very elegant, but it deals with it, and also TRIMS any trailing spaces:
=TRIM(LEFT(IF(LEFT(A1,2)="- ",MID(A1,3,999),A1),FIND("^",SUBSTITUTE(IF(LEFT(A1,2)="- ",MID(A1,3,999),A1) & "^"," ","^",5))-1))
EDIT 2: Oh yeah, commas:
=SUBSTITUTE(TRIM(LEFT(IF(LEFT(A1,2)="- ",MID(A1,3,999),A1),FIND("^",SUBSTITUTE(IF(LEFT(A1,2)="- ",MID(A1,3,999),A1) & "^"," ","^",5))-1))," ",",")

Try this:
=TRIM(LEFT(SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"-"," "))," ",","),",",REPT(" ",99),5),99))

This will work even if there is not a space after the dash or if there are extra spaces in the text. Often I find that input is not very clean.
=SUBSTITUTE(LEFT(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"-","",1)),
" ","*",5),IFERROR(FIND("*",SUBSTITUTE(TRIM(SUBSTITUTE(A1,"-","",1)),
" ","*",5))-1,999))," ",",")
Edit: After commenting on István's, I made mine flawless too.
=SUBSTITUTE(LEFT(SUBSTITUTE(TRIM(SUBSTITUTE(LEFT(TRIM(A1),1),"-"," ",1)
&MID(TRIM(A1),2,999))," ","*",5),IFERROR(FIND("*",SUBSTITUTE(
TRIM(SUBSTITUTE(LEFT(TRIM(A1),1),"-","",1)&MID(TRIM(A1),2,999))," ","*",5))-1,999))," ",",")
But I think his is more elegant.

Try this:
=SUBSTITUTE(LEFT(SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", "),", ","|",MIN(LEN(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", "))-LEN(SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", ")," ","")),5)),FIND("|",SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", "),", ","|",MIN(LEN(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", "))-LEN(SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", ")," ","")),5)))-1),",,",",")
The formula works by taking the following steps:
Remove any leading dash-space
Trim any leading or trailing spaces
Insert comma-spaces in place of spaces and add a trailing comma-space
Calculate the lesser of 5 and the number of words in the string
Put in "|" in place of either the fifth comma-space or the trailing comma-space if the string is less than five words
Determine the position of the "|"
Strip off the "|" and all characters to the right of it
Remove any doubled commas due to any single embedded commas in the initial string
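For readers who find the steps easier to follow outside a worksheet formula, here is a minimal Python sketch of the same logic. The function name `first_words` and its `limit` parameter are made up for illustration; it is not a translation of the formula character by character, only of the steps listed above.

```python
def first_words(text, limit=5):
    """Return up to `limit` words from `text`, joined with ", "."""
    cleaned = text.strip()
    # Drop a single leading dash, with or without a following space.
    if cleaned.startswith("-"):
        cleaned = cleaned[1:]
    # split() with no argument collapses runs of whitespace, much like TRIM.
    # Trailing commas on a word are stripped so that ", " is not doubled.
    words = [w.rstrip(",") for w in cleaned.split() if w.rstrip(",")]
    # Slicing never fails when there are fewer than `limit` words.
    return ", ".join(words[:limit])


print(first_words("I love my life very much"))    # I, love, my, life, very
print(first_words("- I love my life very much"))  # I, love, my, life, very
print(first_words("I love my"))                   # I, love, my
```

The slice `words[:limit]` plays the role of "the lesser of 5 and the number of words", which is why no error handling is needed for short inputs.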
If you are willing to consider a VBA solution, this complex expression can be replaced by a user-defined function:
Function words5(InputString As String) As String
    Dim wordArray As Variant
    ' Remove the first "-", trim, and put the words into an array
    wordArray = Split(Trim(Replace(InputString, _
        "-", "", , 1)), " ")
    ' Drop all but the first 5 words
    ReDim Preserve wordArray(LBound(wordArray) To _
        WorksheetFunction.Min(UBound(wordArray), 5 - 1))
    ' Rejoin the words with the ", " separator
    words5 = Replace(Join(wordArray, ", "), ",,", ",")
End Function
One advantage of this code is its maintainability compared to the worksheet formula, which is impossible to understand or safely alter without access to the original building blocks that were combined into the single expression.
The code would have to be installed in the workbook in which it is used or in either the standard Personal.xlsb workbook or an addin workbook.
To use the function, copy and paste it into a standard module, which can be inserted into a workbook via the VBA editor. You can open the editor with the Visual Basic button on the Developer tab of the ribbon.

Figured I'd throw my hat in the ring also. I think this formula should cover the bases:
=SUBSTITUTE(TRIM(LEFT(SUBSTITUTE(TRIM(SUBSTITUTE(A1&" ","- ",""))," ",REPT(" ",99)),99*5))," ",",")

Related

Changing multiple row values within a single column, followed by a filtering process

My goal here is to filter for two service codes (there will be hundreds in a single column). In this case I need "4" and "CO4"; note that this is the capital letter O, not the number zero.
Issues/Goals:
Some "4" and "CO4" values have a trailing space, like "CO4 ". This varies, as some may not have the space. Humans... am I right? lol
Filtering an additional column called 'Is Void' for False values, together with the above two service codes. This is where I believe my issue is.
2a) This is because I lose a lot of data, about 1700 rows, with the code I will show in a bit.
Sample Data base:
My code (everything is already imported and the database is open):
dfRTR = dfRTR[["Zone Name", "End Time", "Is Void", "Ticket Notes", "Service Code", "Unit Count", "Disposal Monitor Name", "Addr No","Addr St", "Ticket Number"]] #Columns I will use
dfRTR.replace("CO4 ","CO4") #Takes out (space) in CO4
dfRTR.replace("4 ", "4") #Takes out (space) in 4
filt = dfRTR[(dfRTR['Is Void'] == False) & (dfRTR["Service Code"].isin(["CO4 ", "4"]))] #my problem child.
If I take this code out I have all my data, but with it I get only about 700-800 rows, where there are supposed to be around 1500-2000k rows in the column "Is Void".
I have only been coding for about two months; how to replace two values at once in the same column is another topic. I am trying to automate my whole audit, which can take from 4 hours to 2-3 days depending on the project. Any advice is greatly appreciated.
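One likely reason for the missing rows is that the filter compares against the exact strings "CO4 " and "4", so "CO4" without the space and "4 " with it never match. Note also that `DataFrame.replace` returns a new object instead of changing `dfRTR` in place, so an unassigned call has no effect. The sketch below shows the normalize-then-filter idea in plain Python rather than pandas, so it runs without any dependencies; the rows are invented sample data, and `filter_rows` is a hypothetical helper name.

```python
# Invented sample rows; only the column names come from the question.
rows = [
    {"Ticket Number": 1, "Is Void": False, "Service Code": "CO4 "},
    {"Ticket Number": 2, "Is Void": False, "Service Code": "4"},
    {"Ticket Number": 3, "Is Void": False, "Service Code": " 4 "},
    {"Ticket Number": 4, "Is Void": True, "Service Code": "CO4"},
    {"Ticket Number": 5, "Is Void": False, "Service Code": "C12"},
]

WANTED = {"CO4", "4"}


def filter_rows(rows, wanted=WANTED):
    """Keep non-void rows whose stripped service code is in `wanted`."""
    kept = []
    for row in rows:
        # str() also covers codes that were read as numbers, e.g. 4.
        code = str(row["Service Code"]).strip()
        if row["Is Void"] is not True and code in wanted:
            # Store the cleaned code so later steps see one spelling.
            kept.append({**row, "Service Code": code})
    return kept


kept = filter_rows(rows)
print([r["Ticket Number"] for r in kept])  # [1, 2, 3]
```

In pandas the same normalization would typically be a `.str.strip()` on the "Service Code" column, assigned back to the column before calling `isin`.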
EDIT:
So, if I manually make my Excel file all text and then run this code:
dfRTR['Service Code'] = dfRTR['Service Code'].str.replace(r'\W', "")
filt = dfRTR[(dfRTR['Is Void'] != True) & dfRTR["Service Code"].isin(["CO4","4"])]
filt.to_excel('VecTest.xlsx')
Then I get all the filtered data I need. The only downside is that my dates will have text formatting. I will try to automate converting just one column, in this case 'Service Code', to text and then run it again.
Edit part 2: Making the above file a CSV makes the filtering process way easier. The problem is converting it back to Excel. Excel has an issue with a simple formula like =A1=B1: even if both cells A1 and B1 have a value of 1, they will not match. CSV strips away all the extra "formatting", but the Excel VBA code I use to format the file makes the cells give this warning, even though the data being pulled is in CSV format.
VBA code makes the values appear with the excel warning:
"The number in this cell is formatted as a text or preceded with an apostrophe"
Conclusion:
I would need to do all my checks using Python before using CSV formatting.
I figured it out. My solution is a function but it does not need to be:
def getRTRVecCols(dfRTR):
    # this is majority of special chars that may be in columns cells
    spec_chars = ["!", '"', "#", "%", "&", "'", "(", ")",
                  "*", "+", ",", "-", ".", "/", ":", ";", "<",
                  "=", ">", "?", "#", "[", "\\", "]", "^", "_",
                  "`", "{", "|", "}", "~", "–"]
    # this forloop will remove all those special chars for a specific column
    for char in spec_chars:
        dfRTR["Service Code"] = dfRTR["Service Code"].str.replace(char, ' ')
    #with the above we may get white spaces. This is how to remove those
    dfRTR["Service Code"] = dfRTR["Service Code"].str.split().str.join(" ")
    # take in multiple values for in one column to filter for (specific for me)
    codes = ["CO4", "4"]
    #filtering our data set with multiple conditions. 2 conditions is in the same single column and another condition in a differnt column
    dfRTR = dfRTR[(dfRTR["Is Void"] != True) & (dfRTR["Service Code"].isin(codes))]
    #saves to a excel file
    dfRTR.to_excel('RTR Vec Cols.xlsx')
#my function call
getRTRVecCols(dfRTR)
Now I do get this warning, which I am still too much of a noob in Python to understand. It is next on my list to fix, but it works perfectly for now:
FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.
dfRTR["Service Code"] = dfRTR["Service Code"].str.replace(char, ' ')
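The warning is about whether the first argument of `str.replace` is read as a regular expression or as literal text. The difference matters for characters such as "." or "^", which have a special meaning in a regex. Below is a small sketch using only the standard library (`str.replace` and `re.sub`, not pandas) to show the two behaviours; the sample value "C.O4" is invented.

```python
import re

code = "C.O4"

# Literal replacement: only the "." character itself is replaced.
literal = code.replace(".", " ")

# Regex replacement: "." is a wildcard, so every character is replaced.
as_regex = re.sub(".", " ", code)

# Escaping the pattern makes the regex behave like the literal version.
escaped = re.sub(re.escape("."), " ", code)

print(repr(literal))   # 'C O4'
print(repr(as_regex))  # '    '
print(repr(escaped))   # 'C O4'

# Collapsing the leftover whitespace, as the split/join step does.
print(" ".join("  C  O4 ".split()))  # C O4
```

In pandas, passing `regex=False` explicitly to `Series.str.replace` asks for the literal behaviour, which should make the intent clear and silence the warning.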

Custom number (price) format independent of localization

I am wondering whether it is possible to have a custom number format, using an Excel formula, that does not depend on the localization of the Excel application (EU/US).
For example I have value 1291660.
Using the formula =TEXT(A1;"# ##0,00") I get 1 291 660,00 as the output. The target is to always get 1.291.660,00. Can any Excel professional give some advice?
I have tried =TEXT(A1;"#.##0,00") - This didn't work
I think VBA is the only solution to this. I have found my old question about the same topic, but it seems that solution provided is not working for some reason?
Ultimate 1000 separator using VBA
Function CustomFormat(InputValue As Double) As String
Dim sThousandsSep As String
Dim sDecimalSep As String
Dim sFormat As String
sThousandsSep = Application.International(xlThousandsSeparator)
sDecimalSep = Application.International(xlDecimalSeparator)
' Up to 6 decimal places
sFormat = "#" & sThousandsSep & "###" & sDecimalSep & "######"
CustomFormat = Format(InputValue, sFormat)
If (Right$(CustomFormat, 1) = sDecimalSep) Then
CustomFormat = Left$(CustomFormat, Len(CustomFormat) - 1)
End If
' Replace the thousands separator with a space
' or any other character
CustomFormat = Replace(CustomFormat, sThousandsSep, " ")
End Function
By replacing CustomFormat = Replace(CustomFormat, sThousandsSep, " ") with CustomFormat = Replace(CustomFormat, sThousandsSep, ".") output is .1 291 660
You may use:
=SUBSTITUTE(SUBSTITUTE(FIXED(A1,2,0),",","."),".",",",INT((LEN(A1)-1)/3)+1)
The way it works is that on an EU system FIXED() will return 1.291.660,00, but on a US system it should return 1,291,660.00. To create the same output string, we can SUBSTITUTE() all commas with dots. A second SUBSTITUTE() then replaces only the last dot with a comma. To find the right index I used INT((LEN(A1)-1)/3)+1, which is the number of thousands separators plus one and works for integers like 1291660. If you happen to have decimal values, you can change this to:
=SUBSTITUTE(SUBSTITUTE(FIXED(A1,2,0),",","."),".",",",INT((LEN(INT(A1))-1)/3)+1)
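The two-step substitution can be checked outside Excel with a short Python sketch. `substitute_nth` is a hypothetical helper that imitates SUBSTITUTE with its optional instance number, and `eu_format` is an invented name. Unlike the worksheet formula, the sketch finds the last dot by simply counting dots, since the decimal point is always the last one when two decimals are shown.

```python
def substitute_nth(text, old, new, n):
    """Replace only the n-th (1-based) occurrence of `old` in `text`."""
    pos = -1
    for _ in range(n):
        pos = text.find(old, pos + 1)
        if pos == -1:
            return text  # fewer than n occurrences: leave text unchanged
    return text[:pos] + new + text[pos + len(old):]


def eu_format(value):
    """Format a number as 1.291.660,00 regardless of the system locale."""
    us = f"{value:,.2f}"           # always "1,291,660.00" style in Python
    dotted = us.replace(",", ".")  # "1.291.660.00"
    last_dot = dotted.count(".")   # the decimal point is the last dot
    return substitute_nth(dotted, ".", ",", last_dot)


print(eu_format(1291660))  # 1.291.660,00
print(eu_format(129166))   # 129.166,00
print(eu_format(12.5))     # 12,50
```

Values with exactly three or six digits before the decimal point are worth testing in the worksheet too, because that is where an off-by-one in the instance number would show up.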
EDIT:
The above should always return the desired format, but it's a string. To return the numeric value in any further calculations, you can use NUMBERVALUE():
=NUMBERVALUE(C1,",",".")
Go to the Excel File tab, click Options, and then set the following options as desired:
Uncheck "Use system separators" and define your own.
You don't need VBA for this. You can use SUBSTITUTE to replace the default separator characters, and you can detect what these are by cutting them out from the formatted string of a known number. I use the ASCII 1 (SOH) character to avoid replacing twice (e.g. replacing the thousands separator from " " to "." and then replacing the decimal separator from "." to "," would cause the thousands separators to appear as ","):
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TEXT(1234567.89,"# ##0.000"),MID(TEXT("# ##0",1000),2,1),CHAR(1)&" "),MID(TEXT("0.0",0.1),2,1),CHAR(1)&","),CHAR(1)&" ","."),CHAR(1)&",",",")
This will output "1.234.567,890".
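The placeholder idea is easy to see in a few lines of Python. This is only a sketch of the technique (swap through a character that cannot occur in the text), not a translation of the formula; `swap_separators` is an invented name, and the input is assumed to be a US-style string with "," for thousands and "." for decimals.

```python
def swap_separators(us_text, thousands=".", decimal=","):
    """Swap separators in a US-formatted number string via a placeholder."""
    placeholder = "\x01"  # SOH, assumed never to appear in a number string
    return (
        us_text.replace(",", placeholder)   # park the thousands separators
        .replace(".", decimal)              # now only the decimal point is "."
        .replace(placeholder, thousands)    # restore thousands separators
    )


formatted = swap_separators(f"{1234567.89:,.3f}")
print(formatted)  # 1.234.567,890
```

Without the placeholder, replacing "," with "." first and then "." with "," would turn every separator into a comma.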
This output will appear as a string (you cannot add numbers to it, and it is left-aligned by default), and you cannot change this behavior if you don't use Excel's local settings for separators.
BTW, using " " for thousands separator and either "." or "," for decimals is the clearest way of displaying numbers.

Excel Escape 2 ""

& CHAR(10) & REPT(" ", 20)& "5. Change object BusinessCardRequest(FirstName=$CreatedFor/FirstName;LastName=$CreatedFor/Surname;EmailAddress=$CreatedFor/Email;MobileNumber=$CreatedFor/Mobile;PositionTitle=if $PositionFirst != empty then $PositionFirst/Title else "" "";Brand=if $PositionFirst != empty then getCaption($PositionFirst/Brand) else "" "")"
I have the above code in Excel and I want to escape the quotes so that two empty strings are shown as the value in the If condition. But it gives an error. I tried using three quotes, like """ """, and that is also not working. But when I remove one it works.
How do I correctly escape 2 " " in Excel?
The correct way to escape a " in Excel is to add one more " before it.
By escaping a " you are asking Excel to treat it as literal text.
For example:
"The ""fox"" jumped over the lazy dog"
would evaluate as
The "fox" jumped over the lazy dog
In answer to your example below:
But it gives an error. I tried using three quotes, like """ """, and that is also not working. But when I remove one it works.
You don't have to encase the problematic character in escape characters, you only need to immediately precede the problematic character with the escape character.
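The doubling rule is mechanical, so it can be expressed as a tiny helper. The Python sketch below (the name `excel_string_literal` is invented) builds the text you would type inside a formula for any given string: wrap it in quotes and double every quote inside.

```python
def excel_string_literal(text):
    """Return `text` written as an Excel formula string literal."""
    # Each embedded quote is doubled; the outer pair delimits the literal.
    return '"' + text.replace('"', '""') + '"'


print(excel_string_literal('The "fox" jumped over the lazy dog'))
# "The ""fox"" jumped over the lazy dog"
print(excel_string_literal(""))
# ""
```

Counting quotes by hand is error-prone, which is why an odd number such as three in a row usually signals a mistake unless one of them opens or closes the literal.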

Issues with Range.FormulaLocal and textstring in formula

When I finally figured out that I can use my own language with Range.FormulaLocal instead of Range.Formula, I was very excited!
The possibilities are endless. But I am encountering a problem with text strings in formulas.
This code works just fine:
Range("I5").FormulaLocal = "=ALS(A5=1;H5;0)"
But these lines of code are not working:
Range("I5").FormulaLocal = "=ALS(A5="x";H5;0)"
Range("I6").FormulaLocal = "=ALS.FOUT(VERT.ZOEKEN(A2;'betaaltermijnen.xlsx'!tabel;3;ONWAAR);"")
Could somebody help me?
You're accidentally ending your strings early...
First line:
If you have a variable x which you want to include in the string, then use &
Range("I5").FormulaLocal = "=ALS(A5=" & x & ";H5;0)"
If instead you're trying to include the literal string "x", then you must put an additional quotation mark before each in-string quotation mark. The extra quotation mark acts as an escape character.
Range("I5").FormulaLocal = "=ALS(A5=""x"";H5;0)"
This way, when VBA sees "" inside a string, it treats it as a single literal quotation mark rather than the end of the string.
By the same reasoning, your second line becomes
Range("I6").FormulaLocal = _
    "=ALS.FOUT(VERT.ZOEKEN(A2;'betaaltermijnen.xlsx'!tabel;3;ONWAAR);"""")"
Where I've used the _ underscore to continue the line without it getting too long, because the last 6 characters are the important bit!
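To check what formula text a VBA string literal actually produces, it can help to decode it by hand or with a small script. The Python sketch below is a hypothetical helper (`vba_literal_value` is not part of any library) that strips the outer quotes and collapses each doubled quote into one, which is how VBA reads such a literal.

```python
def vba_literal_value(literal):
    """Return the text that a double-quoted VBA string literal stands for."""
    if len(literal) < 2 or not (literal.startswith('"') and literal.endswith('"')):
        raise ValueError("expected a double-quoted VBA string literal")
    # Inside the literal, every "" pair stands for one quotation mark.
    return literal[1:-1].replace('""', '"')


print(vba_literal_value('"=ALS(A5=""x"";H5;0)"'))
# =ALS(A5="x";H5;0)
print(vba_literal_value('"=ALS.FOUT(A2;"""")"'))
# =ALS.FOUT(A2;"")
```

The second call uses a shortened, made-up formula to show that four quotes in a row inside the literal become the empty string "" in the resulting formula.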

Split, escaping certain splits

I have a cell that contains multiple questions and answers and is organised like a CSV. So to get all these questions and answers separated a simple split using the comma as the delimiter should separate this easily.
Unfortunately, there are some values that use the comma as the decimal separator. Is there a way to escape the split for those occurrences?
Fortunately, my data can be split using ", " as separator, but if this wouldn't be the case, would there still be a solution besides manually replacing the decimal delimiter from a comma to a dot?
Example:
"Price: 0,09,Quantity: 12,Sold: Yes"
Using Split("Price: 0,09,Quantity: 12,Sold: Yes",",") would yield:
Price: 0
09
Quantity: 12
Sold: Yes
One possibility, given this test data, is to loop through the array after splitting, and whenever there's no : in the string, add this entry to the previous one.
The function that does this might look like this:
Public Function CleanUpSeparator(celldata As String) As String()
    Dim ret() As String
    Dim tmp() As String
    Dim i As Integer, j As Integer
    tmp = Split(celldata, ",")
    For i = 0 To UBound(tmp)
        ' Skip i = 0: the first entry has no previous entry to merge into
        If i > 0 And InStr(1, tmp(i), ":") < 1 Then
            ' Put this value on the previous line, and restore the comma
            tmp(i - 1) = tmp(i - 1) & "," & tmp(i)
            tmp(i) = ""
        End If
    Next i
    ' Copy the non-empty entries into the result array
    j = 0
    ReDim ret(j)
    For i = 0 To UBound(tmp)
        If tmp(i) <> "" Then
            ret(j) = tmp(i)
            j = j + 1
            ReDim Preserve ret(j)
        End If
    Next i
    ' Drop the unused last slot
    ReDim Preserve ret(j - 1)
    CleanUpSeparator = ret
End Function
Note that there's room for improvement, for instance by making the separator characters : and , into parameters.
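The same merge-after-split idea, with the separator and marker as parameters, might look like this in Python. It is a sketch under the same assumption as the VBA version, namely that every real field contains a ":" and the stray fragments do not; `split_fields` is an invented name.

```python
def split_fields(cell, sep=",", marker=":"):
    """Split on `sep`, then glue marker-less fragments back onto the previous field."""
    fields = []
    for piece in cell.split(sep):
        if marker in piece or not fields:
            # A real field, or the very first piece (nothing to merge into).
            fields.append(piece)
        else:
            # A fragment such as "09": restore the separator and merge.
            fields[-1] += sep + piece
    return fields


print(split_fields("Price: 0,09,Quantity: 12,Sold: Yes"))
# ['Price: 0,09', 'Quantity: 12', 'Sold: Yes']
```

Because fragments are appended to the last kept field rather than to the previous array slot, several fragments in a row (for example "Price: 1,234,56") are merged back into one field as well.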
I spent the last 24 hours or so puzzling over what I THINK is a completely analogous problem, so I'll share my solution here. Forgive me if I'm wrong about the applicability of my solution to this question. :-)
My Problem: I have a SharePoint list in which teachers (I'm an elementary school technology specialist) enter end-of-year award certificates for me to print. Teachers can enter multiple students' names for a given award, separating each name using a comma. I have a VBA macro in Access that turns each name into a separate record for mail merging. Okay, I lied. That was more of a story. HERE'S the problem: How can teachers add a student name like Hank Williams, Jr. (note the comma) without having the comma cause "Jr." to be interpreted as a separate student in my macro?
The full contents of the (SharePoint exported to Excel) field "Students" are stored within the macro in a variable called strStudentsBeforeSplit, and this string is eventually split with this statement:
strStudents = Split(strStudentsBeforeSplit, ",", -1, vbTextCompare)
So there's the problem, really. The Split function is using a comma as a separator, but poor student Hank Williams, Jr. has a comma in his name. What to do?
I spent a long time trying to figure out how to escape the comma. If this is possible, I never figured it out.
Lots of forum posts suggested using a different character as the separator. That's okay, I guess, but here's the solution I came up with:
Replace only the special commas preceding "Jr" with a different, uncommon character BEFORE the Split function runs.
Swap back to the commas after Split runs.
That's really the end of my post, but here are the lines from my macro that accomplish step 1. This may or may not be of interest because it really just deals with the minutiae of making the swap. Note that the code handles several different (mostly wrong) ways my teachers might type the "Jr" part of the name.
'Dealing with the comma before Jr. This will handle ", Jr." and ", Jr" and " Jr." and " Jr".
'Replaces the comma with ~ because commas are used to separate fields in Split function below.
'Will swap ~ back to comma later in UpdateQ_Comma_for_Jr query.
strStudentsBeforeSplit = Replace(strStudentsBeforeSplit, "Jr", "~ Jr.") 'Every Jr gets this treatment regardless of what else is around it.
'Note that because of previous Replace functions a few lines prior, the space between the comma and Jr will have been removed. This adds it back.
strStudentsBeforeSplit = Replace(strStudentsBeforeSplit, ",~ Jr", "~ Jr") 'If teacher had added a comma, strip it.
strStudentsBeforeSplit = Replace(strStudentsBeforeSplit, " ~ Jr", "~ Jr") 'In cases when teacher added Jr but no comma, remove the (now extra)...
'...space that was before Jr.
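For comparison, here is how the two steps (protect the special comma, split, then swap back) could be sketched in Python. This is not the Access macro; `split_names` and the regular expression are illustrative assumptions, and "~" is assumed never to occur in a student's name.

```python
import re

PLACEHOLDER = "~"  # assumed not to appear in any name


def split_names(raw):
    """Split a comma-separated name list while keeping ', Jr.' attached."""
    # Step 1: normalize ", Jr.", ", Jr", " Jr." and " Jr" to "~ Jr.".
    protected = re.sub(r"\s*,?\s*\bJr\b\.?", PLACEHOLDER + " Jr.", raw)
    # Split on the remaining commas, which are real separators.
    names = [name.strip() for name in protected.split(",")]
    # Step 2: swap the placeholder back to a comma.
    return [name.replace(PLACEHOLDER, ",") for name in names if name]


print(split_names("Hank Williams, Jr., Ann Lee"))
# ['Hank Williams, Jr.', 'Ann Lee']
print(split_names("Hank Williams Jr, Ann Lee"))
# ['Hank Williams, Jr.', 'Ann Lee']
```

The `\b` word boundaries keep the pattern from firing on names that merely start with "Jr", and the same approach would extend to other suffixes if the pattern were widened.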
