Turn Excel line break into <br> - excel

A client sent me a huge list of product names and descriptions. The Description cells have text wrap enabled and contain many line breaks. I need to import this into a MySQL database, which I do through Navicat Premium.
The problem is that the description cell is used as the HTML description of each product page.
Is there a way to replace Excel's line breaks with <br>, either in the Excel file itself or with a PHP function?

A little bit of ASCII coding will go a long way.
Set up the Find/Replace dialog (Ctrl+H). In the Find field, hold down the Alt key and type 010 on the numeric keypad. (This enters a line feed character, so you can search for it.) In the Replace field, put your <br>.

Or use a VBA function to replace the line breaks in a string.
Insert a module and paste this:
Function LineFeedReplace(ByVal str As String) As String
    Dim strReplace As String
    strReplace = "<br>"
    ' Replace CR+LF pairs first so each becomes one <br>, then any remaining lone CR or LF
    LineFeedReplace = Replace(Replace(Replace(str, vbCrLf, strReplace), vbCr, strReplace), vbLf, strReplace)
End Function
If cell A1 contains a string with line breaks, then =LineFeedReplace(A1) returns the string with every line break replaced by <br>.

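First, make sure you account for both CR and LF, which tend to come together. Their character codes are 13 and 10 (0013 and 0010 when typed with Alt on the numeric keypad), so you need a formula that handles both. I used =SUBSTITUTE(A3,CHAR(13),"<br>") successfully to convert a cell of long text in Excel, replacing the invisible breaks with the 'br' tag. Since you can't tell exactly which kind of line break you have, you can also try it with code 10: =SUBSTITUTE(A3,CHAR(10),"<br>"). If a cell might contain both kinds, nesting the calls should cover every case without doubling the tag: =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A3,CHAR(13)&CHAR(10),"<br>"),CHAR(13),"<br>"),CHAR(10),"<br>")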
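Since the question also allows for doing this outside Excel, here is a minimal sketch of the same replacement in a script. It uses Python rather than the PHP mentioned in the question, purely as an illustration; the function name breaks_to_br is made up for this example and it assumes the descriptions have already been read into strings.

```python
import re

# Match a Windows CR+LF pair first, then a lone CR or a lone LF,
# so that one CR+LF pair yields a single tag rather than two.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def breaks_to_br(text: str, tag: str = "<br>") -> str:
    """Return text with every line break replaced by tag (hypothetical helper)."""
    # A function is used as the replacement so the tag is inserted literally.
    return _LINE_BREAK.sub(lambda _match: tag, text)
```

Trying the CR+LF alternative before the single characters is the same ordering concern as in the VBA and formula versions: replacing CR and LF independently would turn each Windows line break into two tags.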

Replace non-printable characters with " (Inch sign) VBA Excel

I need to replace non-printable characters with " (Inch sign).
I tried Excel's CLEAN function and other UDFs, but they just remove the characters instead of replacing them.
Note: the non-printable characters are highlighted in blue in the photo above, and their position within the cells is random.
This is a sample string file: Link
The expected correct output should be 12"x14" LPG . OUTLET OCT-SEP# process
Thanks in advance for any useful comments and answers.
As per my comment, you can try:
=SUBSTITUTE(A1,CHAR(25)&CHAR(25),CHAR(34))
Or in VBA (Range.Replace changes the cell in place, so its return value should not be assigned back):
[A1].Replace What:=Chr(25) & Chr(25), Replacement:=Chr(34), LookAt:=xlPart
Here [A1] is just a placeholder for the Range object you would actually use, with proper and absolute referencing.
With the newest Microsoft 365 functions, we could also use:
=TEXTJOIN(CHAR(34),,TEXTSPLIT(A1,CHAR(25)))
You can use regular expressions within a UDF to create a flexible method for replacing "bad" characters when you don't know exactly what they are.
In the UDF below, I show two pattern options, but others are possible:
one replaces all characters with a character code > 127;
the other replaces all characters with a character code > 255.
Option Explicit
Function ReplaceBadChars(str As String, replWith As String) As String
    Dim RE As Object
    Set RE = CreateObject("Vbscript.Regexp")
    With RE
        .Pattern = "[\u0080-\uFFFF]" 'to replace all characters with code >127 or
        '.Pattern = "[\u0100-\uFFFF]" 'to replace all characters with code >255
        .Global = True
        ReplaceBadChars = .Replace(str, replWith)
    End With
End Function
On the worksheet you can use, for example:
=ReplaceBadChars(A1,"""")
Or you could use it in a macro if you wanted to process a column of data without adding an extra column.
Note: I am uncertain whether there might be an efficiency difference when using a smaller negated character class (e.g. [^\x00-\x7F]) instead of the character class I showed in the code. But if, as written, execution seems slow, I'd try this change.
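For comparison, the same two character classes can be tried outside Excel. This is only a sketch in Python, not part of the UDF above; replace_bad_chars and the two pattern names are hypothetical, and it uses the negated form of each class.

```python
import re

# Negated classes equivalent to the two UDF patterns:
# everything above code 127, or everything above code 255.
ABOVE_127 = re.compile(r"[^\x00-\x7F]")
ABOVE_255 = re.compile(r"[^\x00-\xFF]")


def replace_bad_chars(text: str, repl: str = '"', pattern=ABOVE_127) -> str:
    """Replace every character matched by pattern with repl (hypothetical helper)."""
    # A function is used as the replacement so repl is inserted literally.
    return pattern.sub(lambda _match: repl, text)
```

Note that neither pattern touches control characters below code 32, so if the offending character really is CHAR(25), as the SUBSTITUTE answer assumes, a class such as [\x00-\x1F] would be needed instead.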
You can try this:
Cells.Replace What:="[The character to replace]", Replacement:=""""

I have text in separate lines inside same box, How to make it all in one line

Some of my text is on different lines inside the same cell. I want it on a single line. How do I bring it onto a single line?
Example:
first cell contains:
Hi Ram, I want to go to movie today.
Are you willing to join?
If yes, let me know early.
Expected output:
Hi Ram, I want to go to movie today.Are you willing to join?If yes, let me know early.
A new line in cell A1, caused by Alt+Enter for example, can be removed using a formula such as:
=SUBSTITUTE(A1,CHAR(10)," ")
Where A1 is the cell containing the text to be changed. You can enter the formula above in a different cell of course.
The parameter " " indicates 1 space to replace the line break. You could use any other character.
Another type of line break is CHAR(13). You can remove CHAR(13) using the same function again:
=SUBSTITUTE(SUBSTITUTE(A1, CHAR(13)," "), CHAR(10), " ")
In case you had some spaces already before the new-line character, you need to wrap the above formula in a TRIM function like so:
=TRIM(SUBSTITUTE(A1,CHAR(10)," "))
OR
=TRIM(SUBSTITUTE(SUBSTITUTE(A1,CHAR(13)," "),CHAR(10)," "))
Always make a copy of your file before you apply formulas that could change the data.
Note-1:
CHAR(13) is officially called "carriage return" and CHAR(10) is called "line feed".
CHAR(10) returns a line break on Windows, and CHAR(13) returns a line break on the Mac. This answer is for Windows. You can't see the character itself, but you can see its effect.
Note-2:
As @kojow7 answered, text wrapping can cause the text to appear on more than one line depending on the cell width and the text length. This answer does not resolve that case.
Related discussion can be found here: Remove line breaks from cell.
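If the data is being cleaned in a script instead of on the worksheet, the nested SUBSTITUTE plus TRIM formula above can be mirrored roughly as follows. This is a sketch in Python under the assumption that the cell text is already available as a string; single_line is a made-up name.

```python
import re


def single_line(text: str, sep: str = " ") -> str:
    """Flatten a multi-line cell value onto one line (hypothetical helper)."""
    # Like the two nested SUBSTITUTE calls: swap CR and LF for the separator.
    flat = text.replace("\r", sep).replace("\n", sep)
    # Like TRIM: collapse runs of spaces and strip spaces from both ends.
    return re.sub(r" {2,}", " ", flat).strip(" ")
```

As with the formula, a CR+LF pair first becomes two separators, and the TRIM-like step then collapses them into one space.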
Two things you may need to fix here: 1) line breaks and 2) text wrapping.
To fix line breaks:
1. Select the cells that need to be changed.
2. Press CTRL+H to open Search and Replace.
3. In the Find box, type CTRL+J to insert the line break character (it may look like nothing was inserted in the field, but it does insert a line break).
4. Decide whether to replace the line breaks with a space or with nothing.
5. Press Replace All.
To turn off text wrapping:
1. Select the cells that need to be changed.
2. Go to the Home tab.
3. In the Alignment group, check whether the Wrap Text button is clicked.
4. If it is, click on it again to deselect it.
Depending on your situation, you may need to fix either one or both of these.
Depending on your document, it might contain line feeds, carriage returns, or both.
Alexander Frolov (https://www.ablebits.com/office-addins-blog/2013/12/03/remove-carriage-returns-excel/) has written a very good blog post about different techniques for finding and removing line breaks in an Excel file. We will use the "macro way" of doing that, as it is the one that works on both Windows and Mac. The search-and-replace method that is also offered works on Windows but not on Mac.
Add the macro below to your document (slightly modified from the original).
Change the value of "ReplaceWith" from " " (space) to whatever you want each line break to be replaced with.
E.g. ReplaceWith = "-" will result in "Line1-Line2-Line3".
Run the macro (Extras > Macro) while all cells are selected.
Sub RemoveCarriageReturns()
    Dim ReplaceWith As String, LinefeedChar As String
    Dim recordRange As Range
    ReplaceWith = " "        ' text that replaces each line break
    LinefeedChar = Chr(10)   ' line feed; use Chr(13) for carriage returns
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    For Each recordRange In ActiveSheet.UsedRange
        If 0 < InStr(recordRange, LinefeedChar) Then
            recordRange = Replace(recordRange, LinefeedChar, ReplaceWith)
        End If
    Next
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
End Sub
If your separate lines are not gone by now, change "LinefeedChar" from "Chr(10)" to "Chr(13)" and run it again.

VBA Special characters U+2264 and U+2265

I have a frustrating problem. I have a string containing characters that are not in this list (check link). My string represents a SQL query.
This is an example of what my string can contain: INSERT INTO test (description) VALUES ('≤ ≥ >= <=')
When I check the database, the row is inserted successfully, but the characters "≤" and "≥" are replaced with "=" character.
In the database the string in description column looks like "= = >= <=".
For most characters I can get a character code. I googled for the character codes of those two symbols, but I didn't find any. My goal is to check whether my string contains these two characters, and then replace them with "<=" and ">=" respectively.
===Later Edit===
I have tried to check every character in a For loop:
tmp = Mid$(str, i, 1)
tmp has the value "=" when my loop reaches the "≤" character, so it seems Excel cannot read "≤" in a VB string; when I then check the character code, I get the code for "=" (Chr(61)).
Are you able to figure out what the character codes for "≤" and "≥" are in your database character set? If so, maybe try replacing both characters in your query string with ChrW(character_code).
I have just tested something along the lines of what you are trying to do, using Excel as my database, and it looks to work fine.
Edit: assuming you are still stuck and looking for assistance here, could you confirm which database you are working with, and the data type of the "description" field you are inserting your string into?
Edit2: I am not familiar with SQL Server, but isn't your "description" field set up with a certain data type? If so, what is it, and does it support Unicode characters? nvarchar and nchar seem to be examples of SQL Server data types that support Unicode.
It sounds like you may also want to try and add an "N" prefix to the value in your query string - see
Do I have use the prefix N in the "insert into" statement for unicode? &
how to insert unicode text to SQL Server from query window
Edit3: varchar won't store Unicode characters properly; see What is the difference between varchar and nvarchar?. Can you switch to nvarchar? As mentioned above, you may also want to prefix the values in your query string with 'N' for full effect.
Edit4: I can't speak much more about SQL Server, but what you are looking at here is how VBA displays the character, not how it actually stores it in memory, which is what matters. The VBA editor won't display "≤" properly since it doesn't support the Unicode character set. However, VBA may, and does, store the binary representation correctly.
As evidence of this, just paste the character back into another cell in Excel from VBA and you will retrieve the original character, or look at the binary representation in VBA:
Sub test()
    Dim s As String
    Dim B() As Byte
    Dim i As Long
    '8804 (U+2264) is the Unicode code point of the "≤" character
    s = ChrW(8804)
    'Assign memory representation of s to byte array B
    B = s
    'This loop prints "100" and "34", respectively the low and high bytes of s coding in memory
    'representing binary value 0010 0010 0110 0100 ie 8804
    For i = LBound(B) To UBound(B)
        Debug.Print B(i)
    Next i
    'This prints "=" because VBA can not render character code 8804 properly
    Debug.Print s
End Sub
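The byte values in the comments above can be cross-checked outside VBA, and the replacement the question asks for (turning "≤" into "<=" and "≥" into ">=") can be sketched the same way. The snippet below is an illustration in Python, not something from the thread; the helper names are hypothetical.

```python
# Map the two Unicode comparison signs to their ASCII spellings.
# 0x2264 (8804) is "≤" and 0x2265 (8805) is "≥".
ASCII_COMPARISONS = {0x2264: "<=", 0x2265: ">="}


def to_ascii_comparisons(sql: str) -> str:
    """Replace the two Unicode comparison signs in sql (hypothetical helper)."""
    return sql.translate(ASCII_COMPARISONS)


def utf16_bytes(ch: str) -> list:
    """Bytes of ch in UTF-16, low byte first, like the VBA byte array."""
    return list(ch.encode("utf-16-le"))
```

In VBA itself, the equivalent replacement would presumably use ChrW(8804) and ChrW(8805) as the search strings, since the characters cannot be typed directly into the editor.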
If I copy your text INSERT INTO test (description) VALUES ('≤ ≥ >= <=') and paste it into the VBA editor, it becomes INSERT INTO test (description) VALUES ('= = >= <=').
If I paste that text into a Excel cell or an Access table's text field, it pastes "correctly".
This seems to be a matter of character code supported, and I suggest you have a look at this SO question.
But where in your program does that string come from, since it cannot be typed in VBA?
Edit: I just gave it a try with the code below, and it works like a charm for transferring your exotic characters from the worksheet to a table!
Sub test1()
    Dim db As Object, rs As Object, cn As Object
    Set cn = CreateObject("DAO.DBEngine.120")
    Set db = cn.OpenDatabase("P:\Database1.accdb")
    Set rs = db.OpenRecordset("table1")
    With rs
        .AddNew
        .Fields(0) = Range("d5").Value
        .Update
    End With
End Sub

Separate words with commas in Excel 2010

I'm trying to use a formula in Excel to separate a bunch of words in a cell with a comma. If there are more than 5 words in the cell, I just want to get the first 5 words. To get the first five words in a cell and separate them by a comma I use this:
=SUBSTITUTE(LEFT(A1,FIND("^",SUBSTITUTE(A1," ","^",5))-1), " ", ", ")
This works fine. The problem is that, because of the number 5 here, I get an error if a cell contains fewer than 5 words. I tried to substitute the 5 with this:
LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1
So my function becomes this:
=SUBSTITUTE(LEFT(A1,FIND("^",SUBSTITUTE(A1," ","^",LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1))-1), " ", ", ")
But this doesn't work, it gives me an error. Any idea how I can do this please?
Also I would like to ignore the first word if its first character is "-" (without the quotes) and just start from the second word. So in other words, I want something like this:
I love my life very much should return I, love, my, life, very
- I love my life very much should return I, love, my, life, very (the "-" is ignored)
I love my should return I, love, my
Thanks in advance for any help
Here's a somewhat different approach. Aside from the "less than 5" issue, it also deals with the "5 words with no space at the end" issue:
=LEFT(A1,FIND("^",SUBSTITUTE(A1 & "^"," ","^",5))-1)
EDIT 1: I just noticed the part about the leading "- ". My addition isn't very elegant, but it deals with it, and also TRIMS any trailing spaces:
=TRIM(LEFT(IF(LEFT(A1,2)="- ",MID(A1,3,999),A1),FIND("^",SUBSTITUTE(IF(LEFT(A1,2)="- ",MID(A1,3,999),A1) & "^"," ","^",5))-1))
EDIT 2: Oh yeah, commas:
=SUBSTITUTE(TRIM(LEFT(IF(LEFT(A1,2)="- ",MID(A1,3,999),A1),FIND("^",SUBSTITUTE(IF(LEFT(A1,2)="- ",MID(A1,3,999),A1) & "^"," ","^",5))-1))," ",",")
Try this:
=TRIM(LEFT(SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"-"," "))," ",","),",",REPT(" ",99),5),99))
This will work even if there is not a space after the dash or if there are extra spaces in the text. Often I find that input is not very clean.
=SUBSTITUTE(LEFT(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"-","",1)),
" ","*",5),IFERROR(FIND("*",SUBSTITUTE(TRIM(SUBSTITUTE(A1,"-","",1)),
" ","*",5))-1,999))," ",",")
Edit: After commenting on István's, I made mine flawless too.
=SUBSTITUTE(LEFT(SUBSTITUTE(TRIM(SUBSTITUTE(LEFT(TRIM(A1),1),"-"," ",1)
&MID(TRIM(A1),2,999))," ","*",5),IFERROR(FIND("*",SUBSTITUTE(
TRIM(SUBSTITUTE(LEFT(TRIM(A1),1),"-","",1)&MID(TRIM(A1),2,999))," ","*",5))-1,999))," ",",")
But I think his is more elegant.
Try this:
=SUBSTITUTE(LEFT(SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", "),", ","|",MIN(LEN(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", "))-LEN(SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", ")," ","")),5)),FIND("|",SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", "),", ","|",MIN(LEN(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", "))-LEN(SUBSTITUTE(SUBSTITUTE(TRIM(SUBSTITUTE(A1,"- ","",1))&" "," ",", ")," ","")),5)))-1),",,",",")
The formula works by taking the following steps:
1. Remove any leading dash-space
2. Trim any leading or trailing spaces
3. Insert comma-spaces in place of spaces and add a trailing comma-space
4. Calculate the lesser of 5 and the number of words in the string
5. Put in "|" in place of either the fifth comma-space or the trailing comma-space if the string is less than five words
6. Determine the position of the "|"
7. Strip off the "|" and all characters to the right of it
8. Remove any doubled commas due to any single embedded commas in the initial string
If you are willing to consider a VBA solution, this complex expression can be replaced by a user-defined function:
Function words5(InputString As String) As String
    Dim wordArray As Variant
    'remove the first "-" and put the words into an array
    wordArray = Split(Trim(Replace(InputString, "-", "", , 1)), " ")
    'drop all but the first 5 words
    ReDim Preserve wordArray(LBound(wordArray) To _
        WorksheetFunction.Min(UBound(wordArray), 5 - 1))
    'rejoin the words with the ", " separator
    words5 = Replace(Join(wordArray, ", "), ",,", ",")
End Function
The advantage of this code is its maintainability compared to the worksheet formula, which is impossible to understand or safely alter without access to the original building blocks that were combined into the single expression.
The code would have to be installed in the workbook in which it is used, or in either the standard Personal.xlsb workbook or an add-in workbook.
To use the function, copy and paste it into a standard module, which can be inserted into a workbook via the VBA editor. You can open the editor with the Visual Basic button on the Developer tab of the ribbon.
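For anyone preparing the data outside Excel, the same logic (drop a leading dash, keep at most five words, join with comma-space) can be sketched in a few lines. This is an illustrative Python version, not a translation checked against every edge case of the formula or the UDF; first_words is a made-up name.

```python
def first_words(text: str, limit: int = 5, sep: str = ", ") -> str:
    """Join the first `limit` words of text with sep (hypothetical helper)."""
    cleaned = text.strip()
    # Ignore a leading dash, as in the question's second example.
    if cleaned.startswith("-"):
        cleaned = cleaned[1:]
    # split() with no argument also copes with repeated spaces.
    words = cleaned.split()
    # Like the last step of the formula: undo doubled commas caused by
    # commas that were already embedded in the input.
    return sep.join(words[:limit]).replace(",,", ",")
```

With the three inputs from the question this returns "I, love, my, life, very" for the first two and "I, love, my" for the third.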
Figured I'd throw my hat in the ring also. I think this formula should cover the bases:
=SUBSTITUTE(TRIM(LEFT(SUBSTITUTE(TRIM(SUBSTITUTE(A1&" ","- ",""))," ",REPT(" ",99)),99*5))," ",",")

Converting html special character in excel

Could anyone please suggest a function/formula used in worksheet to convert html special character to HTML entity, thanks
E.g.
&trade; to ™
&reg; to ®
The answer to this question depends on which of two cases applies:
Do you only need to convert a certain set of these special chars?
Do you need to convert all supported chars?
Answer 1:
Public Function ConvertHTMLTag(data As String) As String
    data = Replace(data, "&trade;", "™")
    data = Replace(data, "&reg;", "®")
    ConvertHTMLTag = data
End Function
Answer 2:
Repeat for all chars in http://www.webmonkey.com/2010/02/special_characters/
To make this a bit easier, I would put this list into an Excel sheet in two columns: one for the special tag and the other for its evaluated char.
Write a formula in a 3rd column to create your code for you...
="data = Replace(data, "&CHAR(34)&A1&CHAR(34)&", "&CHAR(34)&B1&CHAR(34)&")"
Once you have created your VBA code in Excel, a simple copy and paste into the function above will do the trick.
Use this function to decode an HTML special character (entity) in a string:
Function HTMLToCharCodes(ByVal s As String) As String
    'Requires a reference to "Microsoft XML, v6.0" (Tools > References)
    With New MSXML2.DOMDocument60
        .LoadXML "<p>" & s & "</p>"
        HTMLToCharCodes = .SelectSingleNode("p").nodeTypedValue
    End With
End Function
Input: &amp;, return: &
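If the conversion can happen outside the workbook, both directions are available in Python's standard library. This is a sketch for comparison, not part of the answers above; decode_entities and encode_entities are hypothetical names, and the encoder only covers non-ASCII characters that have an HTML 4 entity name.

```python
import html
from html.entities import codepoint2name


def decode_entities(text: str) -> str:
    """Turn entities such as &trade; or &amp; into their characters."""
    return html.unescape(text)


def encode_entities(text: str) -> str:
    """Turn non-ASCII characters that have a named entity into that entity."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            out.append(f"&{name};")
        else:
            # ASCII characters and characters without a name pass through.
            out.append(ch)
    return "".join(out)
```

ASCII characters such as "&" and "<" are deliberately left alone by encode_entities, so text that already contains markup is not escaped a second time.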
