This question already has answers here:
How do I display a messagebox with unicode characters in VBA?
(2 answers)
Unicode string literals in VBA
(3 answers)
Closed 2 years ago.
Has anyone a workaround to show Arabic in a message box. I even changed the font to Arial (Arabic) in VBA setting and I cant write Arabic in the code but when it runs its shows garbage words in the message box.
Below is a sample code even though it's showing as CaCaCa below but in my VBA it's a proper Arabic word.
Private Sub cmdDelete_Click()
Dim iRow As Long
If Selected_List = 0 Then
Font.Name = "Arial Unicode MS"
MsgBox "ÇáÇãÇä No row is selected. ", vbOKOnly + vbInformation, "Delete"
Exit Sub
End If
An approach based on #Tarik 's link to MsgBox with Unicode characters:
VBA7 Declaration of API function MessageBoxW()
Option Explicit ' Declaration head of code module
Private Declare PtrSafe Function MessageBoxW Lib "User32" ( _
ByVal hWnd As LongPtr, _
ByVal lpText As LongPtr, _
ByVal lpCaption As LongPtr, _
ByVal uType As Long) _
As Long
Help function MsgBoxW()
'Site: https://stackoverflow.com/questions/55210315/how-do-i-display-a-messagebox-with-unicode-characters-in-vba
Function MsgBoxW( _
Prompt As String, _
Optional Buttons As VbMsgBoxStyle = vbOKOnly + vbInformation, _
Optional Title As String = " Delete") _
As VbMsgBoxResult
Title = WorksheetFunction.Unichar(&H1F4BC) & Title
MsgBoxW = MessageBoxW(Application.hWnd, StrPtr(Prompt), StrPtr(Title), Buttons)
End Function
Note that Access would need a Application.hWndAccessApp argument to get the corresponding window handle.
Example call
Sub ExampleCall()
Dim s As String
s = WorksheetFunction.Unichar(&H2776) & " " & getArabic() & vbNewLine & vbNewLine & _
WorksheetFunction.Unichar(&H2777) & " No Rows selected. "
MsgBoxW s
End Sub
Hardcoding test function getArabic()
As I don't know the Arabic language, the following function only tries to simulate a correct phrase I got via a translation site by joining single unicode values of a hardcoded array to a string like e.g. ; so I beg your pardon for any mistranslation :-)
There are numerous sites where you can get the hexadecimal or decimal code values immediately.
It would be possible as well to insert your original string into an Excel sheet cell and analyze the corresponding character values one by one (e.g. via formula =UNICODE(MID($A2,1,1)) etc.)
Function getArabic()
'Note: uses decimal values here (e.g. decimal 1604 equals hexadecimal &H644)
Dim arr: arr = Array(1604, 1605, 32, 1610, 1578, 1605, 32, 1578, 1581, 1583, 1610, 1583, 32, 1589, 1601, 46)
Dim a, s As String
For Each a In arr
s = s & WorksheetFunction.Unichar(a)
Next a
getArabic = s
End Function
The computer you are using should have the default code page to Windows 1256. That way, it interprets any extended ASCII character (above 127) as Arabic. The alternative (preferred option) is to ensure you use UTF-8.
See How do I display a messagebox with unicode characters in VBA?
Related
Using Excel VBA: Is it possibile to get the text contained in the tooltip which shows the argument list of a sub or function?
The Application.MacroOptions method knows the argument "ArgumentDescriptions" but it is possibly only set. Is there any way to read this info?
"Get the tooltip text contained showing the argument list of a sub or function ... The Application.MacroOptions Method has (the) argument ArgumentDescriptions but it Is possibile(!) only set. Is there any way to read this info?"
► Afaik there is no built-in way.
Possible workaround
As you "need this info in VBA code for a function/sub created in other module or class.",
you might want to analyze your code modules by referencing the
"Microsoft Visual Basic for Applications Extensibility 5.3" library in the VB Editor's menu.
Caveats:
Security: Requires to trust access to the VBA project object model.
Rights: If not only for your personal use, consider that other corporate users may
not have enough rights to turn that feature on.
Self reflection: Mirrors the currently compiled/saved code only, so it might not reflect the latest code when the searched procedure body line has been changed.
Line breaks: The following approach assumes that the entire procedure info is coded in one line -
not regarding closing line breaks via "_";
it should be easy to extend the .Lines result in these cases by your own (e.g. benefitting from the count argument or by additional loops through the next lines).
The following code doesn't intend neither to cover or to optimize all possibilities,
but to direct you to a solution keeping it short & simple.
Function GetSyntax()
Function GetSyntax(wb As Workbook, Optional ByVal srchProcName As String = "GetCookie") As String
'Purp: Show name & arguments of a given procedure
'1) escape a locked project
If wb.VBProject.Protection = vbext_pp_locked Then Exit Function ' escape locked projects
'2) loop through all modules
Dim component As VBIDE.VBComponent
For Each component In wb.VBProject.VBComponents
' Debug.Print "***"; component.Name, component.Type
Dim found As Boolean
'3) loop through procedures (as well as Let/Set/Get properties)
Dim pk As Long ' proc kind enumeration
For pk = vbext_pk_Proc To vbext_pk_Get
'a) get the essential body line of the search procedure
Dim lin As String
lin = getLine(component.CodeModule, srchProcName, pk)
'b) found non-empty code line?
found = Len(lin) <> 0
If found And pk = 0 Then GetArgs = lin: Exit For
'c) get proc info(s) - in case of Let/Set/Get properties
Dim Delim As String
GetSyntax = GetSyntax & IIf(found, Delim & lin, "")
Delim = vbNewLine ' don't change line order
Next pk
'If found Then Exit For ' if unique proc names only
Next component
End Function
Help function getLine()
Function getLine(module As VBIDE.CodeModule, ByVal srchProcName As String, ByVal pk As Long) As String
'a) define procedure kind
Dim ProcKind As VBIDE.vbext_ProcKind
ProcKind = pk
'b) get effective row number of proc/prop body line
On Error Resume Next
Dim effectiveRow As Long
effectiveRow = module.ProcBodyLine(srchProcName, ProcKind) ' find effective row of search procedure
'c) provide for non-findings or return function result (Case 0)
Select Case Err.Number
Case 0 ' Found
Dim lin As String
'Syntax: obj.Lines (startline, count) As String
lin = Trim(module.Lines(effectiveRow, 1))
getLine = lin
Case 35 ' Not found
Err.Clear: On Error GoTo 0
Case Else
Debug.Print "** " & " Error " & Err.Number & " " & Err.Description: Err.Clear: On Error GoTo 0
End Select
End Function
Possible Test call
Dim procList, proc
procList = Split("getCookie,foo,myNewFunction", ",")
For Each proc In procList
MsgBox GetSyntax(ThisWorkbook, proc), vbInformation, proc
Next
I have created a custom page size for excel under "windows print server properties". following the steps given here https://www.win2pdf.com/doc/pdf-custom-paper-windows-10.html
Now I have all the custom sizes including default sizes shown in the excel page size list. next, I have pulled all sizes to excel Combobox dropdown with the following codes.
so my requirement is to change excel sheet size with Combobox selection change.
Please Note: we cannot change size with case method or predefine it unless sizes are fixed, as sizes will be dynamic in Combobox, once the user creates a new size it ll get added to Combobox.
Thanks
'worksheet code
ActiveSheet.ComboBox1.List = GetPaperSizes
'module code
Option Explicit
Private Const DC_PAPERNAMES = &H10
Private Declare Function DeviceCapabilities _
Lib "winspool.drv" _
Alias "DeviceCapabilitiesA" _
(ByVal lpDeviceName As String, _
ByVal lpPort As String, _
ByVal iIndex As Long, _
ByRef lpOutput As Any, _
ByRef lpDevMode As Any) _
As Long
Private Declare Function StrLen _
Lib "kernel32.dll" _
Alias "lstrlenA" _
(ByVal lpString As String) _
As Long
Function GetPaperSizes() As Variant
Dim AllNames As String
Dim I As Long
Dim Msg As String
Dim PD As Variant
Dim Ret As Long
Dim papersizes() As Byte
Dim PaperSize As String
'Retrieve the number of available paper names
PD = Split(Application.ActivePrinter, " on ")'<<<== change "on" with its local Language translation from english
Ret = DeviceCapabilities(PD(0), PD(1), DC_PAPERNAMES, ByVal 0&, ByVal 0&)
'resize the array
ReDim papersizes(0 To Ret * 64) As Byte
'retrieve all the available paper names
Call DeviceCapabilities(PD(0), PD(1), DC_PAPERNAMES, papersizes(0), ByVal 0&)
'convert the retrieved byte array to an ANSI string
AllNames = StrConv(papersizes, vbUnicode)
'loop through the string and search for the names of the papers
For I = 1 To Len(AllNames) Step 64
PaperSize = Mid(AllNames, I, 64)
PaperSize = Left(PaperSize, StrLen(PaperSize))
If PaperSize <> vbNullString Then Msg = Msg & PaperSize & vbCrLf
Next I
GetPaperSizes = Split(Left(Msg, Len(Msg) - 2), vbCrLf)
End Function
can I expect any help on this? I am not sure this bump is right or not.
New in Stack Overflow, I'm building a macro in Excel with VBA. Basically I have a file with multiple tabs with information in tables, these tables have text in it and some words from that text are bold and repeat in each tab (lets say owner and process). I do display of this information in a textbox located in a form according to the line of the table they pick before on a listbox, the text displays correctly but it ignore the text formatting(bold and italic). Is there a way to display the text formatting in the textbox as it is in the table?
Hope I've have made myself clear enough.
Typical example for a Shape textbox (not ActiveX):
Sub BoxMaker()
ActiveSheet.Shapes.AddTextbox(msoTextOrientationHorizontal, 217.5, 51#, _
482.25, 278.25).Select
Selection.Name = "SPLASH"
Selection.Characters.Text = "Please Wait for Macro"
With Selection.Characters(Start:=1, Length:=21).Font
.Name = "Arial"
.FontStyle = "Regular"
.Size = 36
.Strikethrough = False
.Superscript = False
.Subscript = False
.OutlineFont = False
.Shadow = False
.Underline = xlUnderlineStyleNone
.ColorIndex = xlAutomatic
End With
Selection.Characters(Start:=8, Length:=4).Font.Bold = True
Selection.HorizontalAlignment = xlCenter
Selection.VerticalAlignment = xlCenter
End Sub
You can format text in a textbox similar to text in a cell.
Possible work around in UserForm using a Webbrowser control
Due to your comment: "...actually I'm using MSForms.TextBox"
A possible work around would be to create a simple HTML file using the b tag (e.g. blabla within normal textbold text ...) for bold text and load it into a Webbrowser control (needs reference to Microsoft Internet Controls) e.g. via WebBrowser1.Navigate ThisWorkbook.Path & "\topic.htm".
As HTML commonly uses utf-8 encoding I demonstrate an approach using the system function WideCharToMultiByte (API call).
Example call using helper functions (API calls)
' declare and assign simple html string
Dim htmlstring as String
htmlstring = "<html><body><div>Normal text <b>bold text</b> further text</div></body></html>"
' write html file via helper procedure
writeUtf8 htmlstring, ThisWorkbook.Path & "\topic.htm"
' load html file into WebBrowswer control
Me.WebBrowser1.Navigate ThisWorkbook.Path & "\topic.htm"
Helper procedures
I suggest to write a separate code module for these helper functions:
Option Explicit ' declaration head of separate code module
' Declare API Function
Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
ByVal CodePage As Long, _
ByVal dwFlags As Long, _
ByVal lpWideCharStr As Long, _
ByVal cchWideChar As Long, _
ByVal lpMultiByteStr As Long, _
ByVal cbMultiByte As Long, _
ByVal lpDefaultChar As Long, _
ByVal lpUsedDefaultChar As Long) As Long
Public Sub writeUtf8(ByVal s As String, ByVal datei As String)
' Purpose: save HTML String in utf-8 mode
' Note: called by writeUtf8 htmlstring, thisworkbook.path & "\topic.htm"
Dim file As Integer
Dim B() As Byte
file = FreeFile
Open datei For Binary Access Write Lock Read Write As #file
getUtf8 s, B
Put #file, , B
Close #file
End Sub
Private Sub getUtf8(ByRef s As String, ByRef B() As Byte)
' Note: called by above helper function; uses API call (see declaration head)
Const CP_UTF8 As Long = 65001
Dim len_s As Long
Dim ptr_s As Long
Dim size As Long
Erase B
len_s = Len(s)
If len_s = 0 Then _
err.Raise 30030, , "Len(WideChars) = 0"
ptr_s = StrPtr(s)
size = WideCharToMultiByte(CP_UTF8, 0, ptr_s, len_s, 0, 0, 0, 0)
If size = 0 Then _
err.Raise 30030, , "WideCharToMultiByte() = 0"
ReDim B(0 To size - 1)
If WideCharToMultiByte(CP_UTF8, 0, ptr_s, len_s, VarPtr(B(0)), size, 0, 0) = 0 Then _
err.Raise 30030, , "WideCharToMultiByte(" & Format$(size) & ") = 0"
End Sub
I am developing an Excel add-in with a CHM help file. The CHM has topics that I am trying to reach from Excel's "Insert Function" dialog. This is confirmed when I invoke HH.EXE as follows:
HH.EXE -mapid 1234 "mk:#MSITSTORE:<path-to-my-chm-file>"
I registered all of my UDFs with the Application.MacroOptions( ) function, passing the appropriate parameters (Macro, Category, HelpContextID and HelpFile).
When I click the "Help on this function" link, HH.EXE is invoked with the correct path and file name of my CHM. However, there is no -mapid parameter used for the invocation of HH.EXE. Consequently, when my CHM file is loaded, HH does not go to the desired topic.
Does anyone know or have a guess as to why Excel may be omitting this parameter?
Thank you!
As I understand you want something like shown in the snapshot. Context-sensitive help is complex.
I add sample code for buttons and code for calling the HTMLHelp (CHM) API.
Office 2007 is installed on my machine only.
This is done by a HTMLHelp API call (code snippet (1)):
'******************************************************************************
'----- Modul - definition for HTMLHelp
'----- (c) Ulrich Kulle, http://www.help-info.de/en/Visual_Basic_Applications/vba.htm
'----- 2002-08-26 Version 0.2.xxx
'******************************************************************************
'----- Portions of this code courtesy of David Liske.
Declare Function IsWindow Lib "user32.dll" (ByVal hwnd As Long) As Long
Private Declare Function SendMessage Lib "user32" _
Alias "SendMessageA" (ByVal hwnd As Long, ByVal wMsg As Long, _
ByVal wParam As Long, lParam As Any) As Long
Private Declare Function HtmlHelp Lib "hhctrl.ocx" Alias "HtmlHelpA" _
(ByVal hwndCaller As Long, ByVal pszFile As String, _
ByVal uCommand As Long, ByVal dwData As Long) As Long
Declare Function HTMLHelpTopic Lib "hhctrl.ocx" Alias "HtmlHelpA" _
(ByVal hwndCaller As Long, ByVal pszFile As String, _
ByVal uCommand As Long, ByVal dwData As String) As Long
Private Declare Function HtmlHelpSearch Lib "hhctrl.ocx" Alias "HtmlHelpA" _
(ByVal hwndCaller As Long, ByVal pszFile As String, _
ByVal uCommand As Long, dwData As HH_FTS_QUERY) As Long
'--- to keep the handle of the HH windows when calling help by API --------
Public HHwinHwnd As Long
'--- some constants used by the API ---------------------------------------
Public Const HH_DISPLAY_TOPIC = &H0 ' select last opened tab, [display a specified topic]
Public Const HH_DISPLAY_TOC = &H1 ' select contents tab, [display a specified topic]
Public Const HH_DISPLAY_INDEX = &H2 ' select index tab and searches for a keyword
Public Const HH_DISPLAY_SEARCH = &H3 ' select search tab and perform a search
Public Const HH_HELP_CONTEXT = &HF ' display mapped numeric value in dwData
Public Const HH_CLOSE_ALL = &H12
Public Type HH_FTS_QUERY ' UDT for accessing the Search tab
cbStruct As Long ' Sizeof structure in bytes.
fUniCodeStrings As Long ' TRUE if all strings are unicode.
pszSearchQuery As String ' String containing the search query.
iProximity As Long ' Word proximity.
fStemmedSearch As Long ' TRUE for StemmedSearch only.
fTitleOnly As Long ' TRUE for Title search only.
fExecute As Long ' TRUE to initiate the search.
pszWindow As String ' Window to display in
End Type
Public Function HFile(ByVal i_HFile As Integer) As String
'----- Set the string variable to include the application path of helpfile
Select Case i_HFile
Case 1
HFile = ThisWorkbook.Path & "\CHM-example.chm"
Case 2
'----- Place other Help file paths in successive case statements
HFile = ThisWorkbook.Path & "\CHM-example.chm"
End Select
End Function
Public Sub ShowContents(ByVal intHelpFile As Integer)
HHwinHwnd = HtmlHelp(hwnd, HFile(intHelpFile), HH_DISPLAY_TOC, 0)
End Sub
Public Sub ShowIndex(ByVal intHelpFile As Integer)
HHwinHwnd = HtmlHelp(hwnd, HFile(intHelpFile), HH_DISPLAY_INDEX, 0)
End Sub
Public Sub ShowTopic(ByVal intHelpFile As Integer, strTopic As String)
HHwinHwnd = HTMLHelpTopic(hwnd, HFile(intHelpFile), HH_DISPLAY_TOPIC, strTopic)
End Sub
Public Sub ShowTopicID(ByVal intHelpFile As Integer, IdTopic As Long)
HHwinHwnd = HtmlHelp(hwnd, HFile(intHelpFile), HH_HELP_CONTEXT, IdTopic)
End Sub
Public Sub CloseHelp(ByVal hwnd As Long)
Const WM_CLOSE = &H10
If IsWindow(hwnd) Then
SendMessage hwnd, WM_CLOSE, 0, 0
End If
End Sub
'------------------------------------------------------------------------------
'----- display the search tab
'----- bug: start searching with a string dosn't work
'------------------------------------------------------------------------------
Public Sub ShowSearch(ByVal intHelpFile As Integer)
Dim searchIt As HH_FTS_QUERY
With searchIt
.cbStruct = Len(searchIt)
.fUniCodeStrings = 1&
.pszSearchQuery = "foobar"
.iProximity = 0&
.fStemmedSearch = 0&
.fTitleOnly = 1&
.fExecute = 1&
.pszWindow = ""
End With
Call HtmlHelpSearch(0&, HFile(intHelpFile), HH_DISPLAY_SEARCH, searchIt)
End Sub
And some sample code (2) added:
Sub AddUDFToCategory()
'------------------------------------------------------------------------------
' insert after Description line [optional]: Category:=2, _ => Date & Time
'------------------------------------------------------------------------------
' If the UDF's are in an Addin (.xla) it's better to qualify the function name
' like this:
' Macro:=ThisWorkbook.Name & "!" & "DayName"
'------------------------------------------------------------------------------
' see also Excel help for Application.MacroOptions
'------------------------------------------------------------------------------
application.MacroOptions _
Macro:="TestMacro", _
Description:="This function gives back the 'Hello world' message!", _
Category:=2, _
HelpFile:=ThisWorkbook.Path & "\CHM-example.chm", _
HelpContextID:=10000
application.MacroOptions _
Macro:="DayName", _
Description:="A Function That Gives the Name of the Day", _
Category:=2, _
HelpFile:=ThisWorkbook.Path & "\CHM-example.chm", _
HelpContextID:=20000
End Sub
Function TestMacro()
'----------------------------------------------------------------
' Display a message box with a help button linked to a help topic
'----------------------------------------------------------------
MsgBox "The 'Hello World' message for testing this function!.", _
Buttons:=vbOKOnly + vbMsgBoxHelpButton, _
HelpFile:=ThisWorkbook.Path & "\CHM-example.chm", _
Context:=20010
End Function
Function DayName(InputDate As Date)
'---------------------------------------------
'--- A Function That Gives the Name of the Day
'--- http://www.fontstuff.com/vba/vbatut01.htm
'---------------------------------------------
Dim DayNumber As Integer
DayNumber = Weekday(InputDate, vbSunday)
Select Case DayNumber
Case 1
DayName = "Sunday"
Case 2
DayName = "Monday"
Case 3
DayName = "Tuesday"
Case 4
DayName = "Wednesday"
Case 5
DayName = "Thursday"
Case 6
DayName = "Friday"
Case 7
DayName = "Saturday"
End Select
End Function
For a working example go to Online Help and Visual Basic for Applications
please. Search for VBA - Download and the download link Download Visual Basic for Applications example project (EXCEL).
download the ZIP file from the link above to a temp directory, right click the saved ZIP file first, click Properties and click Unblock
unzip to a temp directory.
as a test open CHM file first by double click
double click CHM_VBA_example.xls
first have a look to security warnings ( Excel) and set to Activate
follow the instructions from the Excel example worksheet.
For further information read Using the VBA Excel Example File too.
I have approx. 12000 cells in excel containing RTF (including formatting tags). I need to parse them to get to the unformatted text.
This is the example of one of the cells with text:
{\rtf1\ansi\deflang1060\ftnbj\uc1
{\fonttbl{\f0 \froman \fcharset0 Times New Roman;}{\f1 \fswiss \fcharset238
Arial;}}
{\colortbl ;\red255\green255\blue255 ;\red0\green0\blue0 ;}
{\stylesheet{\fs24\cf2\cb1 Normal;}{\cs1\cf2\cb1 Default Paragraph Font;}}
\paperw11908\paperh16833\margl1800\margr1800\margt1440\margb1440\headery720\footery720
\deftab720\formshade\aendnotes\aftnnrlc\pgbrdrhead\pgbrdrfoot
\sectd\pgwsxn11908\pghsxn16833\marglsxn1800\margrsxn1800\margtsxn1440\margbsxn1440
\headery720\footery720\sbkpage\pgncont\pgndec
\plain\plain\f1\fs24\pard TPR 0160 000\par IPR 0160 000\par OB-R-02-28\par}
And all I really need is this:
TPR 0160 000
IPR 0160 000
OB-R-02-28
The problem with simple looping over the cells and removing unnecessary formatting is, that not everything in those 12000 cells is as straightforward as this is. So I would need to manually inspect many different versions and write several variations; and still at the end there would be a lot of manual work to do.
But if I copy the contents of one cell to empty text document and save it as RTF, then open it with MS Word, it instantly parses the text and I get exactly what I want. Unfortunately it's extremely inconvenient to do so for a 12000 cells.
So I was thinking about VBA macro, to move cell contents to Word, force parsing and then copy the result back to the originating cell. Unfortunately I'm not really sure how to do it.
Does anybody has any idea? Or a different approach? I will be really grateful for a solution or a push in the right direction.
TNX!
If you did want to go down the route of using Word to parse the text, this function should help you out. As the comments suggest, you'll need a reference to the MS Word Object Library.
Function ParseRTF(strRTF As String) As String
Dim wdDoc As Word.Document 'Ref: Microsoft Word 11.0 Object Library'
Dim f As Integer 'Variable to store the file I/O number'
'File path for a temporary .rtf file'
Const strFileTemp = "C:\TempFile_ParseRTF.rtf"
'Obtain the next valid file I/O number'
f = FreeFile
'Open the temp file and save the RTF string in it'
Open strFileTemp For Output As #f
Print #f, strRTF
Close #f
'Open the .rtf file as a Word.Document'
Set wdDoc = GetObject(strFileTemp)
'Read the now parsed text from the Word.Document'
ParseRTF = wdDoc.Range.Text
'Delete the temporary .rtf file'
Kill strFileTemp
'Close the Word connection'
wdDoc.Close False
Set wdDoc = Nothing
End Function
You could call it for each of your 12,000 cells using something similar to this:
Sub ParseAllRange()
Dim rngCell As Range
Dim strRTF As String
For Each rngCell In Range("A1:A12000")
'Parse the cell contents'
strRTF = ParseRTF(CStr(rngCell))
'Output to the cell one column over'
rngCell.Offset(0, 1) = strRTF
Next
End Sub
The ParseRTF function takes about a second to run (on my machine at least), so for 12,000 cells this will work out at about three and a half hours.
Having thought about this problem over the weekend, I was sure there was a better (quicker) solution for this.
I remembered the RTF capabilities of the clipboard, and realised that a class could be created that would copy RTF data to the clipboard, paste to a word doc, and output the resulting plain text. The benefit of this solution is that the word doc object would not have to be opened and closed for each rtf string; it could be opened before the loop and closed after.
Below is the code to achieve this. It is a Class module named clsRTFParser.
Private Declare Function GlobalAlloc Lib "kernel32" _
(ByVal wFlags&, ByVal dwBytes As Long) As Long
Private Declare Function GlobalLock Lib "kernel32" _
(ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32" _
(ByVal hMem As Long) As Long
Private Declare Function lstrcpy Lib "kernel32" _
(ByVal lpString1 As Any, ByVal lpString2 As Any) As Long
Private Declare Function OpenClipboard Lib "user32" _
(ByVal Hwnd As Long) As Long
Private Declare Function EmptyClipboard Lib "user32" () As Long
Private Declare Function RegisterClipboardFormat Lib "user32" Alias _
"RegisterClipboardFormatA" (ByVal lpString As String) As Long
Private Declare Function SetClipboardData Lib "user32" _
(ByVal wFormat As Long, ByVal hMem As Long) As Long
Private Declare Function CloseClipboard Lib "user32" () As Long
'---'
Dim wdDoc As Word.Document 'Ref: Microsoft Word 11.0 Object Library'
Private Sub Class_Initialize()
Set wdDoc = New Word.Document
End Sub
Private Sub Class_Terminate()
wdDoc.Close False
Set wdDoc = Nothing
End Sub
'---'
Private Function CopyRTF(strCopyString As String) As Boolean
Dim hGlobalMemory As Long
Dim lpGlobalMemory As Long
Dim hClipMemory As Long
Dim lngFormatRTF As Long
'Allocate and copy string to memory'
hGlobalMemory = GlobalAlloc(&H42, Len(strCopyString) + 1)
lpGlobalMemory = GlobalLock(hGlobalMemory)
lpGlobalMemory = lstrcpy(lpGlobalMemory, strCopyString)
'Unlock the memory and then copy to the clipboard'
If GlobalUnlock(hGlobalMemory) = 0 Then
If OpenClipboard(0&) <> 0 Then
Call EmptyClipboard
'Save the data as Rich Text Format'
lngFormatRTF = RegisterClipboardFormat("Rich Text Format")
hClipMemory = SetClipboardData(lngFormatRTF, hGlobalMemory)
CopyRTF = CBool(CloseClipboard)
End If
End If
End Function
'---'
Private Function PasteRTF() As String
Dim strOutput As String
'Paste the clipboard data to the wdDoc and read the plain text result'
wdDoc.Range.Paste
strOutput = wdDoc.Range.Text
'Get rid of the new lines at the beginning and end of the document'
strOutput = Left(strOutput, Len(strOutput) - 2)
strOutput = Right(strOutput, Len(strOutput) - 2)
PasteRTF = strOutput
End Function
'---'
Public Function ParseRTF(strRTF As String) As String
If CopyRTF(strRTF) Then
ParseRTF = PasteRTF
Else
ParseRTF = "Error in copying to clipboard"
End If
End Function
You could call it for each of your 12,000 cells using something similar to this:
Sub CopyParseAllRange()
Dim rngCell As Range
Dim strRTF As String
'Create new instance of clsRTFParser'
Dim RTFParser As clsRTFParser
Set RTFParser = New clsRTFParser
For Each rngCell In Range("A1:A12000")
'Parse the cell contents'
strRTF = RTFParser.ParseRTF(CStr(rngCell))
'Output to the cell one column over'
rngCell.Offset(0, 1) = strRTF
Next
End Sub
I have simulated this using example RTF strings on my machine. For 12,000 cells it took two and a half minutes, a much more reasonable time frame!
You can try to parse every cell with regular expression and leave only the content you need.
Every RTF control code start with "\" and ends with space, without any additional space between. "{}" are use for grouping. If your text won't contain any, you can just remove them (the same for ";"). So now you stay with your text and some unnecessary words as "Arial", "Normal" etc. You can build the dictionary to remove them also. After some tweaking, you will stay with only the text you need.
Look at http://www.regular-expressions.info/ for more information and great tool to write RegExp's (RegexBuddy - unfortunately it isn't free, but it's worth the money. AFAIR there is also trial).
UPDATE: Of course, I don't encourage you to do it manually for every cell. Just iterate through active range:
Refer this thread:
SO: About iterating through cells in VBA
Personally, I'll give a try to this idea:
Sub Iterate()
For Each Cell in ActiveSheet.UsedRange.Cells
'Do something
Next
End Sub
And how to use RegExp's in VBA (Excel)?
Refer:
Regex functions in Excel
and
Regex in VBA
Basically you've to use VBScript.RegExp object through COM.
Some of the solutions here require a reference to the MS Word Object Library. Playing with the cards I am dealt, I found a solution that does not rely on it. It strips RTF tags, and other fluff like font tables and stylesheets, all in VBA. It might be helpful to you. I ran it across your data, and other than the whitespace, I get the same output as what you expected.
Here is the code.
First, something to check if a string is alphanumeric or not. Give it a string that's one character long. This function is used to work out delimitation here and there.
Public Function Alphanumeric(Character As String) As Boolean
If InStr("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-", Character) Then
Alphanumeric = True
Else
Alphanumeric = False
End If
End Function
Next up is to remove an entire group. I use this to remove font tables and other rubbish.
Public Function RemoveGroup(RTFString As String, GroupName As String) As String
Dim I As Integer
Dim J As Integer
Dim Count As Integer
I = InStr(RTFString, "{\" & GroupName)
' If the group was not found in the RTF string, then just return that string unchanged.
If I = 0 Then
RemoveGroup = RTFString
Exit Function
End If
' Otherwise, we will need to scan along, from the start of the group, until we find the end of the group.
' The group is delimited by { and }. Groups may be nested, so we need to count up if we encounter { and
' down if we encounter }. When that count reaches zero, then the end of the group has been found.
J = I
Do
If Mid(RTFString, J, 1) = "{" Then Count = Count + 1
If Mid(RTFString, J, 1) = "}" Then Count = Count - 1
J = J + 1
Loop While Count > 0
RemoveGroup = Replace(RTFString, Mid(RTFString, I, J - I), "")
End Function
Okay, and this function removes any tags.
Public Function RemoveTags(RTFString As String) As String
Dim L As Long
Dim R As Long
L = 1
' Search to the end of the string.
While L < Len(RTFString)
' Append anything that's not a tag to the return value.
While Mid(RTFString, L, 1) <> "\" And L < Len(RTFString)
RemoveTags = RemoveTags & Mid(RTFString, L, 1)
L = L + 1
Wend
'Search to the end of the tag.
R = L + 1
While Alphanumeric(Mid(RTFString, R, 1)) And R < Len(RTFString)
R = R + 1
Wend
L = R
Wend
End Function
We can remove curly braces in the obvious way:
Public Function RemoveBraces(RTFString As String) As String
RemoveBraces = Replace(RTFString, "{", "")
RemoveBraces = Replace(RemoveBraces, "}", "")
End Function
Once you have the functions above copy-pasted into your module, you can create a function that uses them to strip away any stuff you don't need or want. The following works perfectly in my case.
Public Function RemoveTheFluff(RTFString As String) As String
RemoveTheFluff = Replace(RTFString, vbCrLf, "")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "fonttbl")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "colortbl")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "stylesheet")
RemoveTheFluff = RemoveTags(RemoveBraces(RemoveTheFluff))
End Function
I hope this helps. I wouldn't use it in a word processor or anything, but it might do for scraping data if that's what you're doing.
Your post made it sound as if each RTF document was stored in a single Excell cell. If so, then
Solution using .Net Framework RichTextBox control
will convert the RTF in each cell to plain text in 2 lines of code (after a little system configuration to get the right .tlb file to allow reference to the .Net Framework). Put the cell value in rtfsample and
Set miracle = New System_Windows_Forms.RichTextBox
With miracle
.RTF = rtfText
PlainText = .TEXT
End With