INTRO
With this code it is possible to display mathematical equation within a WORD document:
Sub genEQ()
Dim objRange As Range
Dim objEq As OMath
Dim AC As OMathAutoCorrectEntry
Application.OMathAutoCorrect.UseOutsideOMath = True
Set objRange = Selection.Range
objRange.Text = "Celsius = \sqrt(x+y) + sin(5/9 \times(Fahrenheit – 23 (\delta)^2))"
For Each AC In Application.OMathAutoCorrect.Entries
With objRange
If InStr(.Text, AC.Name) > 0 Then
.Text = Replace(.Text, AC.Name, AC.Value)
End If
End With
Next AC
Set objRange = Selection.OMaths.Add(objRange)
Set objEq = objRange.OMaths(1)
objEq.BuildUp
End Sub
An with this code I am able to display UNICODE characters within EXCEL message boxes without displaying "?" or "random characters":
Private Declare PtrSafe Function MessageBoxW Lib "User32" (ByVal hWnd As LongPtr, ByVal lpText As LongPtr, ByVal lpCaption As LongPtr, ByVal uType As Long) As Long
Public Function MsgBoxW(Prompt As String, Optional Buttons As VbMsgBoxStyle = vbOKOnly, Optional Title As String = "Microsoft Excel") As VbMsgBoxResult
MsgBoxW = MessageBoxW(Application.hWnd, StrPtr(Prompt), StrPtr(Title), Buttons)
End Function
QUESTION
Is there now a way to combine those and display a complete equation within a message box in EXCEL?
Further how do I refer to MS WORD with the above code snippet to be run within EXCEL?
Is there a way to display a fraction like in the formula of the first code without making a string with a " / "-symbol?
As stated by #GSerg, you need to go through an intermediary picture and use a userform rather than a message box.
The following code converts the text to a formula and goes via Publisher to save the picture then loads it into a pre-existing UserForm UserForm1 with image placeholder Image1. I have increased font size to get better resolution for the picture but this can be set to other values.
Updated to work with autocorrect formulae
Sub DisplayFormulae()
' Requires reference: Microsoft Word x.x Object Library
' Requires reference: Microsoft Publisher x.x Object Library
Dim sFormula As String: sFormula = "Celsius = \sqrt(x+y) + sin(5/9 \times(Fahrenheit – 23 (\delta)^2))"
Dim SaveName As String: SaveName = Environ("TEMP") & "\formula.jpg"
Dim AC As Word.OMathAutoCorrectEntry
Dim WordDoc As New Word.Document
With WordDoc
.Range.Text = sFormula
.Range.Font.Size = 18
For Each AC In .Parent.OMathAutoCorrect.Entries
With .Range
If InStr(.Text, AC.Name) > 0 Then
.Text = Replace(.Text, AC.Name, AC.Value)
End If
End With
Next AC
.OMaths.Add(.Range).OMaths(1).BuildUp
.OMaths(1).Range.Copy
.Close SaveChanges:=wdDoNotSaveChanges
End With
Dim PubDoc As New Publisher.Document
PubDoc.Pages(1).Shapes.Paste
PubDoc.Pages(1).Shapes(1).SaveAsPicture _
PbResolution:=pbPictureResolutionCommercialPrint_300dpi, _
Filename:=SaveName
PubDoc.Close
UserForm1.Controls("Image1").Picture = LoadPicture(SaveName)
UserForm1.Show
Kill SaveName
End Sub
Related
I have a SAP GUI script running every day in VBA. In the script I am exporting some data from SAP to several different Excel files, and these are saved to a network drive. In the first macro, I export data. In the second I copy the data to the same workbook as the script is in.
Some days I get a runtime error
Subscript out of range
on Set ws2 = Workbooks("FEBA_EXPORT_" & today2 & ".XLSX").Worksheets("Sheet1").
It looks like the Excel file is not recognized as open. I manually close the file, and reopen it and then the script will run.
I tried to insert the below code in front of the Set ws2 line that is giving an error, and this code is always giving the massage that the file is open.
Dim Ret
Ret = IsWorkBookOpen(filepath & "FEBA_EXPORT_" & today2 & ".XLSX")
If Ret = True Then
MsgBox "File is open"
Else
MsgBox "File is Closed"
End If
This is the function:
Function IsWorkBookOpen(FileName As String)
Dim ff As Long, ErrNo As Long
On Error Resume Next
ff = FreeFile()
Open FileName For Input Lock Read As #ff
Close ff
ErrNo = Err
On Error GoTo 0
Select Case ErrNo
Case 0: IsWorkBookOpen = False
Case 70: IsWorkBookOpen = True
Case Else: Error ErrNo
End Select
End Function
This is the relevant part of the code:
Sub CopyExportedFEBA_ExtractFEBRE()
Dim SapGuiAuto As Object
Dim SAPApp As Object
Dim SAPCon As Object
Dim session As Object
Set SapGuiAuto = GetObject("SAPGUI")
Set SAPApp = SapGuiAuto.GetScriptingEngine
Set SAPCon = SAPApp.Children(0)
Set session = SAPCon.Children.ElementAt(0) ' <--- Assumes you are using the first session open. '
Dim ws0, ws1, ws2, ws6, ws7 As Worksheet
Set ws0 = Workbooks("FEB_BSPROC.xlsm").Worksheets("INPUT")
Set ws1 = Workbooks("FEB_BSPROC.xlsm").Worksheets("FEB_BSPROC")
Set ws6 = Workbooks("FEB_BSPROC.xlsm").Worksheets("FBL3N_1989")
Dim today2, filepath As String
today2 = ws0.Range("E2")
filepath = ws0.Range("A7")
' Check if if FEBA_EXPORT wb is open
' This is giving the message that the file is open
Dim Ret
Ret = IsWorkBookOpen(filepath & "FEBA_EXPORT_" & today2 & ".XLSX")
If Ret = True Then
MsgBox "File is open"
Else
MsgBox "File is Closed"
End If
' This is giving runtime error 9 Subscript out of range
' If manually close the Excel and the reopen, then it will always work after this
Set ws2 = Workbooks("FEBA_EXPORT_" & today2 & ".XLSX").Worksheets("Sheet1")
'This is never giving any errors
Set ws7 = Workbooks("1989_" & today2 & ".XLSX").Worksheets("Sheet1")
The filepath varaiable is the full filepath to the network drive. So this is not the issue. Also I have another Excel file that is opened in the same way, and that one is never giving any errors.
The today2 variable is also correct.
I thought that it would work if I could close the ws2 workbook with VBA and then reopen it. So I tried to close it without setting it to a variable, but then I got the same error.
With SAP GUI scripting when you export anything to an Excel file, the file will open automatically after it has been saved. I am wondering if this could be the issue? I only have problems with this one Excel file, and not with any of several others that are saved and opened in the same way.
As I said in my above comment, the workbook may be open in a new session, different from the one where the code runs. Please, use the next function to identify if it is a matter of different Excel session:
Function sameExSession(wbFullName As String, Optional boolClose As Boolean) As Boolean
Dim sessEx As Object, wb As Object
Set sessEx = GetObject(wbFullName).Application
If sessEx.hwnd = Application.hwnd Then
sameExSession = True
Else
sameExSession = False
If boolClose Then
sessEx.Workbooks(Right(wbFullName, Len(wbFullName) - InStrRev(wbFullName, "\"))).Close False
sessEx.Quit: Set sessEx = Nothing
End If
End If
End Function
It identify the session where the workbook is open, then compare its handle with the active session one and if not the same, close the workbook (if calling the function with second parameter as True), quit the session and returns False. If only checking, call the function with the second parameter being False (the workbook will not be closed, and session will still remain).
It can be used in the next way:
Sub testSameExSession()
Dim wbFullName As String, wbSAP As Workbook
wbFullName = filepath & "FEBA_EXPORT_" & today2 & ".XLSX"
If sameExSession(wbFullName, True) Then
Debug.Print "The same session"
Set wbSAP = Workbooks("FEBA_EXPORT_" & today2 & ".XLSX")
Else
Debug.Print "Different session..."
Set wbSAP = Workbooks.Open(wbFullName)
End If
Debug.Print wbSAP.Name
'use the set workbook to do what you need...
End Sub
When you have the described problem, please use the above way to test if it is a matter of different sessions.
If so, is easy to input this part in your existing code, I think. If the workbook will be open in a different session, no need to manually close it (and reopen), the above function does it...
In case someone is still facing this issue, I found a way to wait for the excel files downloaded from SAP and its app instance to open, then close them and let you work with the files without troubles. You can set a timeout too.
If files are downloaded and opened in an already open instance of excel, it will just close the files and not the whole instance.
You can use it as follow:
Sub Test()
Call Close_SAP_Excel("Test.xlsx", "Test2.xlsx")
End Sub
xCloseExcelFromSAP
#If VBA7 Then
Private Declare PtrSafe Function AccessibleObjectFromWindow Lib "oleacc" ( _
ByVal hwnd As LongPtr, ByVal dwId As Long, riid As Any, ppvObject As Object) As Long
Private Declare PtrSafe Function FindWindowExA Lib "user32" ( _
ByVal hwndParent As LongPtr, ByVal hwndChildAfter As LongPtr, _
ByVal lpszClass As String, ByVal lpszWindow As String) As LongPtr
#Else
Private Declare Function AccessibleObjectFromWindow Lib "oleacc" ( _
ByVal hwnd As Long, ByVal dwId As Long, riid As Any, ppvObject As Object) As Long
Private Declare Function FindWindowExA Lib "user32" ( _
ByVal hwndParent As Long, ByVal hwndChildAfter As Long, _
ByVal lpszClass As String, ByVal lpszWindow As String) As Long
#End If
Sub Close_SAP_Excel(ParamArray FileNames())
'Procedure to close files downloaded from SAP and at the same time close the Excel application instance that will be open with them.
Dim ExcelAppSAP As Variant
Dim ExcelFile As Variant
Dim FinishedLoop As Boolean, TimeoutReached As Boolean, FileClosed As Boolean
Dim ReTry As Long
Dim i As Long, x As Long
Set ExcelAppSAP = Nothing
ReTry = 100000 'Used as Timeout 100000 = ~10 seconds
i = 1
'The following loop is executed until excel file is closed.
'Inside of this, there is a For Loop for each Excel Instance and inside of that is another loop
'for each excel inside the instance. If name matches, it is closed.
Do While Not FinishedLoop
If i > ReTry Then
TimeoutReached = True
Exit Do
End If
For Each ExcelFile In GetExcelInstances() 'Function to Get Excel Open Instances
For Each xls In ExcelFile.Workbooks
For x = LBound(FileNames()) To UBound(FileNames())
If xls.Name = FileNames(x) Then
Set ExcelAppSAP = ExcelFile 'Set Instance opened by SAP to variable
'Here add actions if needed. Reference to workbook as xls e.g.: xls.Sheets(1).Range("A1").Copy
xls.Close SaveChanges:=False
FileClosed = True
End If
Next x
Next
Next
If FileClosed Then
FinishedLoop = True
End If
i = i + 1
Loop
ThisWorkbook.Activate
If Not TimeoutReached Then
If FileClosed Then
On Error Resume Next
If ExcelAppSAP.Workbooks.Count = 0 Then
ExcelAppSAP.Quit
End If
Else
MsgBox "Excel application instance from SAP was not closed correctly. Please close it manually or try again.", , "Error"
End If
Else
MsgBox "Max timeout reached", , "Error"
End If
End Sub
Public Function GetExcelInstances() As Collection
Dim guid&(0 To 3), acc As Object, hwnd, hwnd2, hwnd3
guid(0) = &H20400
guid(1) = &H0
guid(2) = &HC0
guid(3) = &H46000000
Set GetExcelInstances = New Collection
Do
hwnd = FindWindowExA(0, hwnd, "XLMAIN", vbNullString)
If hwnd = 0 Then Exit Do
hwnd2 = FindWindowExA(hwnd, 0, "XLDESK", vbNullString)
hwnd3 = FindWindowExA(hwnd2, 0, "EXCEL7", vbNullString)
If AccessibleObjectFromWindow(hwnd3, &HFFFFFFF0, guid(0), acc) = 0 Then
GetExcelInstances.Add acc.Application
End If
Loop
End Function
I have faced a strange issue(Maybe a bug) at Data Validation
When i use below code for data validation, Excel is creating a dropdown but add string as one line. But when i write the same string from Excel data validation menu, Excel is seperating the dropdown with multiple items which i want.
For example, lets say string is "A;B;C"
When i do it by VBA, dropdown shows "A;B;C" as 1 line but when i click data validation menu and write manually "A;B;C" , Excel is creating 3 lines of dropdown with "A" , "B" , "C"
It is totally strange behavior. You may see the code as below. I add video link to explain better.
https://streamable.com/a75kud
Public arrAddress As String
Sub DynamicDataVal()
Dim rng As Range
Dim cll As Range
Dim dValicationCount As Long
Dim un As String
Dim DValidationList As Range
Dim DValidationListString As String
Dim seper As String
Dim col As New Collection, a
Dim colIt As Variant
Dim arr() As Variant
un = "Sayin " & Environ("UserName")
On Error Resume Next
Set rng = Application.InputBox("Lutfen Veri Alanini Seciniz", un, ActiveCell.Address, , , , , 8)
If rng Is Nothing Then Exit Sub
Set cll = ActiveCell
dValicationCount = cll.SpecialCells(xlCellTypeSameValidation).Count
If dValicationCount = 0 Then
arr = rng.Offset(1, 0).Resize(rng.Resize(, 1).Cells.Count - 1, 1).Value
arrAddress = rng.Address(External:=True)
For Each a In arr
col.Add a, a
Next a
seper = ListSeperatorMod.GetListSeparator
For Each colIt In col
DValidationListString = DValidationListString & seper & colIt
Next colIt
DValidationListString = Right(DValidationListString, Len(DValidationListString) - 1)
On Error GoTo 0
With cll.Validation
.Delete
.Add Type:=xlValidateList, AlertStyle:=xlValidAlertStop, _
Formula1:=DValidationListString
End With
Else
If rng.Validation.Type <> 3 Then
Exit Sub
Else
'Will be done
End If
End If
On Error GoTo 0
End Sub
Private Declare Function GetLocaleInfo Lib "kernel32" Alias "GetLocaleInfoA" _
(ByVal Locale As Long, ByVal LCType As Long, ByVal lpLCData As String, ByVal cchData As Long) As Long
Private Declare Function GetUserDefaultLCID% Lib "kernel32" ()
Private Const LOCALE_SLIST = &HC
Public Function GetListSeparator() As String
Dim ListSeparator As String
Dim iRetVal1 As Long
Dim iRetVal2 As Long
Dim lpLCDataVar As String
Dim Position As Integer
Dim Locale As Long
Locale = GetUserDefaultLCID()
iRetVal1 = GetLocaleInfo(Locale, LOCALE_SLIST, lpLCDataVar, 0)
ListSeparator = String$(iRetVal1, 0)
iRetVal2 = GetLocaleInfo(Locale, LOCALE_SLIST, ListSeparator, iRetVal1)
Position = InStr(ListSeparator, Chr$(0))
If Position > 0 Then
ListSeparator = Left$(ListSeparator, Position - 1)
End If
GetListSeparator = ListSeparator
End Function
It's by design and it's the the same behavior you'd see if using VBA to enter a formula in a cell: you always use the "US" list separator , in VBA, and not (eg) the locale-specific separator such as ;.
This differs from entering a formula via the user interface, where you're always using the locale-specific separator.
So in VBA you might use:
Range("A1").Formula = "=MAX(A1, B2)"
and if your locale separator is ; then the formula shows up on the sheet as:
=MAX(A1; B2)
This alows the same VBA to function across different locales without modifications
I have several instances of Excel running, maybe up to 4 instances. One of these (let's call in instance A) usually does not have an open workbook. It is used by one of the other instances (let's call it instance B) to open a workbook in instance A, add, change data, then save and close that workbook, thus the instance where these changes are being made (instance A) returns to a state where there are no open workbooks. I do it this way because it is much faster than having the workbook with the code (instance B) open a workbook, perform these tasks and then close the workbook.
My problem is this: From time to time, for debugging purposes mostly, it is desirable to make instance A visible, but what I'm finding is that an instance without an open workbook cannot be made visible, or at least this is what I'm concluding. I am using Excel 2016, 64 bit. My code to do this is:
Private Sub cmdMakeSelectionVisible_Click()
Dim strng As String
Dim lCol As Long, lRow As Long
Dim oXLApp As Excel.Application
Dim bFoundInstance As Boolean
Dim wb_Actress As Workbook
With Me.lstXL '<--| refer to your listbox: change "ListBox1" with your actual listbox name
For lRow = 0 To .ListCount - 1 '<--| loop through listbox rows
If .Selected(lRow) Then '<--| if current row selected
For lCol = 0 To .ColumnCount - 1 '<--| loop through listbox columns
strng = strng & .List(lRow, lCol) & " | " '<--| build your output string
If lCol = 1 Then
MsgBox .List(lRow, lCol)
bFoundInstance = GetReferenceToXLApp(.List(lRow, lCol), oXLApp)
MsgBox oXLApp.Caption
Set wb_Actress = oXLApp.Workbooks.Open("T:\-1996\Dummy Performer's Book.xlsm")
oXLApp.Visible = True
wb_Actress.Close
End If
Next lCol
MsgBox "you selected" & vbCrLf & Left(strng, (Len(strng) - 1)) '<--| show output string (after removing its last character ("|"))
Exit For '<-_| exit loop
End If
Next lRow
End With
End Sub
Inside the For Loops there is an If statement and in that If statement if the instance in question opens a workbook, then the code works. If the instance does not open a workbook and it contains no children, it does not work. If at that point I test to see if oXApp.Visible is true, it is true, but the instance remains hidden.
The question is, is there any way around this without opening a workbook, make the instance visible, then closing the workbook? I still consider myself a novice when it comes to VBA. There may be a totally different approach other than using the handle to the desired instance.
Thanks for looking and helping.
Edit:
The code for GetReferenceToXLApp is:
Public Function GetReferenceToXLApp(hWndXL As Long, oXLApp As Object) As Boolean
Dim hWinDesk As Long
Dim hWin7 As Long
Dim obj As Object
Dim iID As GUID
'// Rather than explaining, go read
'// http://msdn.microsoft.com/en-us/library/windows/desktop/ms687262(v=vs.85).aspx
Call IIDFromString(StrPtr(IID_IDispatch), iID)
'// We have the XL App (Class name XLMAIN)
'// This window has a child called 'XLDESK' (which I presume to mean 'XL desktop')
'// XLDesk is the container for all XL child windows....
hWinDesk = FindWindowEx(hWndXL, 0&, "XLDESK", vbNullString)
'// EXCEL7 is the class name for a Workbook window (and probably others, as well)
'// This is used to check there is actually a workbook open in this instance.
hWin7 = FindWindowEx(hWinDesk, 0&, "EXCEL7", vbNullString)
'// Deep API... read up on it if interested.
'// http://msdn.microsoft.com/en-us/library/windows/desktop/dd317978(v=vs.85).aspx
If AccessibleObjectFromWindow(hWin7, OBJID_NATIVEOM, iID, obj) = RETURN_OK Then
Set oXLApp = obj.Application
GetReferenceToXLApp = True
End If
End Function
And
Private Declare PtrSafe Function IIDFromString Lib "ole32" _
(ByVal lpsz As LongPtr, ByRef lpiid As GUID) As Long
And
Private Declare PtrSafe Function FindWindowEx Lib "User32" Alias "FindWindowExA" _
(ByVal hWnd1 As LongPtr, ByVal hWnd2 As LongPtr, ByVal lpsz1 As String, _
ByVal lpsz2 As String) As Long
Seems to work...
Dim oApp As Excel.Application
Sub TT()
Set oApp = New Excel.Application
'oApp.Workbooks.Add
oApp.Visible = True
Debug.Print oApp.HWnd, Application.HWnd
Debug.Print oApp.Caption, Application.Caption
Application.Wait Now + TimeSerial(0, 0, 2)
AppActivate Application.Caption
Application.Wait Now + TimeSerial(0, 0, 2)
AppActivate oApp.Caption
End Sub
New in Stack Overflow, I'm building a macro in Excel with VBA. Basically I have a file with multiple tabs with information in tables, these tables have text in it and some words from that text are bold and repeat in each tab (lets say owner and process). I do display of this information in a textbox located in a form according to the line of the table they pick before on a listbox, the text displays correctly but it ignore the text formatting(bold and italic). Is there a way to display the text formatting in the textbox as it is in the table?
Hope I've have made myself clear enough.
Typical example for a Shape textbox (not ActiveX):
Sub BoxMaker()
ActiveSheet.Shapes.AddTextbox(msoTextOrientationHorizontal, 217.5, 51#, _
482.25, 278.25).Select
Selection.Name = "SPLASH"
Selection.Characters.Text = "Please Wait for Macro"
With Selection.Characters(Start:=1, Length:=21).Font
.Name = "Arial"
.FontStyle = "Regular"
.Size = 36
.Strikethrough = False
.Superscript = False
.Subscript = False
.OutlineFont = False
.Shadow = False
.Underline = xlUnderlineStyleNone
.ColorIndex = xlAutomatic
End With
Selection.Characters(Start:=8, Length:=4).Font.Bold = True
Selection.HorizontalAlignment = xlCenter
Selection.VerticalAlignment = xlCenter
End Sub
You can format text in a textbox similar to text in a cell.
Possible work around in UserForm using a Webbrowser control
Due to your comment: "...actually I'm using MSForms.TextBox"
A possible work around would be to create a simple HTML file using the b tag (e.g. blabla within normal textbold text ...) for bold text and load it into a Webbrowser control (needs reference to Microsoft Internet Controls) e.g. via WebBrowser1.Navigate ThisWorkbook.Path & "\topic.htm".
As HTML commonly uses utf-8 encoding I demonstrate an approach using the system function WideCharToMultiByte (API call).
Example call using helper functions (API calls)
' declare and assign simple html string
Dim htmlstring as String
htmlstring = "<html><body><div>Normal text <b>bold text</b> further text</div></body></html>"
' write html file via helper procedure
writeUtf8 htmlstring, ThisWorkbook.Path & "\topic.htm"
' load html file into WebBrowswer control
Me.WebBrowser1.Navigate ThisWorkbook.Path & "\topic.htm"
Helper procedures
I suggest to write a separate code module for these helper functions:
Option Explicit ' declaration head of separate code module
' Declare API Function
Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
ByVal CodePage As Long, _
ByVal dwFlags As Long, _
ByVal lpWideCharStr As Long, _
ByVal cchWideChar As Long, _
ByVal lpMultiByteStr As Long, _
ByVal cbMultiByte As Long, _
ByVal lpDefaultChar As Long, _
ByVal lpUsedDefaultChar As Long) As Long
Public Sub writeUtf8(ByVal s As String, ByVal datei As String)
' Purpose: save HTML String in utf-8 mode
' Note: called by writeUtf8 htmlstring, thisworkbook.path & "\topic.htm"
Dim file As Integer
Dim B() As Byte
file = FreeFile
Open datei For Binary Access Write Lock Read Write As #file
getUtf8 s, B
Put #file, , B
Close #file
End Sub
Private Sub getUtf8(ByRef s As String, ByRef B() As Byte)
' Note: called by above helper function; uses API call (see declaration head)
Const CP_UTF8 As Long = 65001
Dim len_s As Long
Dim ptr_s As Long
Dim size As Long
Erase B
len_s = Len(s)
If len_s = 0 Then _
err.Raise 30030, , "Len(WideChars) = 0"
ptr_s = StrPtr(s)
size = WideCharToMultiByte(CP_UTF8, 0, ptr_s, len_s, 0, 0, 0, 0)
If size = 0 Then _
err.Raise 30030, , "WideCharToMultiByte() = 0"
ReDim B(0 To size - 1)
If WideCharToMultiByte(CP_UTF8, 0, ptr_s, len_s, VarPtr(B(0)), size, 0, 0) = 0 Then _
err.Raise 30030, , "WideCharToMultiByte(" & Format$(size) & ") = 0"
End Sub
I have approx. 12000 cells in excel containing RTF (including formatting tags). I need to parse them to get to the unformatted text.
This is the example of one of the cells with text:
{\rtf1\ansi\deflang1060\ftnbj\uc1
{\fonttbl{\f0 \froman \fcharset0 Times New Roman;}{\f1 \fswiss \fcharset238
Arial;}}
{\colortbl ;\red255\green255\blue255 ;\red0\green0\blue0 ;}
{\stylesheet{\fs24\cf2\cb1 Normal;}{\cs1\cf2\cb1 Default Paragraph Font;}}
\paperw11908\paperh16833\margl1800\margr1800\margt1440\margb1440\headery720\footery720
\deftab720\formshade\aendnotes\aftnnrlc\pgbrdrhead\pgbrdrfoot
\sectd\pgwsxn11908\pghsxn16833\marglsxn1800\margrsxn1800\margtsxn1440\margbsxn1440
\headery720\footery720\sbkpage\pgncont\pgndec
\plain\plain\f1\fs24\pard TPR 0160 000\par IPR 0160 000\par OB-R-02-28\par}
And all I really need is this:
TPR 0160 000
IPR 0160 000
OB-R-02-28
The problem with simple looping over the cells and removing unnecessary formatting is, that not everything in those 12000 cells is as straightforward as this is. So I would need to manually inspect many different versions and write several variations; and still at the end there would be a lot of manual work to do.
But if I copy the contents of one cell to empty text document and save it as RTF, then open it with MS Word, it instantly parses the text and I get exactly what I want. Unfortunately it's extremely inconvenient to do so for a 12000 cells.
So I was thinking about VBA macro, to move cell contents to Word, force parsing and then copy the result back to the originating cell. Unfortunately I'm not really sure how to do it.
Does anybody has any idea? Or a different approach? I will be really grateful for a solution or a push in the right direction.
TNX!
If you did want to go down the route of using Word to parse the text, this function should help you out. As the comments suggest, you'll need a reference to the MS Word Object Library.
Function ParseRTF(strRTF As String) As String
Dim wdDoc As Word.Document 'Ref: Microsoft Word 11.0 Object Library'
Dim f As Integer 'Variable to store the file I/O number'
'File path for a temporary .rtf file'
Const strFileTemp = "C:\TempFile_ParseRTF.rtf"
'Obtain the next valid file I/O number'
f = FreeFile
'Open the temp file and save the RTF string in it'
Open strFileTemp For Output As #f
Print #f, strRTF
Close #f
'Open the .rtf file as a Word.Document'
Set wdDoc = GetObject(strFileTemp)
'Read the now parsed text from the Word.Document'
ParseRTF = wdDoc.Range.Text
'Delete the temporary .rtf file'
Kill strFileTemp
'Close the Word connection'
wdDoc.Close False
Set wdDoc = Nothing
End Function
You could call it for each of your 12,000 cells using something similar to this:
Sub ParseAllRange()
Dim rngCell As Range
Dim strRTF As String
For Each rngCell In Range("A1:A12000")
'Parse the cell contents'
strRTF = ParseRTF(CStr(rngCell))
'Output to the cell one column over'
rngCell.Offset(0, 1) = strRTF
Next
End Sub
The ParseRTF function takes about a second to run (on my machine at least), so for 12,000 cells this will work out at about three and a half hours.
Having thought about this problem over the weekend, I was sure there was a better (quicker) solution for this.
I remembered the RTF capabilities of the clipboard, and realised that a class could be created that would copy RTF data to the clipboard, paste to a word doc, and output the resulting plain text. The benefit of this solution is that the word doc object would not have to be opened and closed for each rtf string; it could be opened before the loop and closed after.
Below is the code to achieve this. It is a Class module named clsRTFParser.
Private Declare Function GlobalAlloc Lib "kernel32" _
(ByVal wFlags&, ByVal dwBytes As Long) As Long
Private Declare Function GlobalLock Lib "kernel32" _
(ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32" _
(ByVal hMem As Long) As Long
Private Declare Function lstrcpy Lib "kernel32" _
(ByVal lpString1 As Any, ByVal lpString2 As Any) As Long
Private Declare Function OpenClipboard Lib "user32" _
(ByVal Hwnd As Long) As Long
Private Declare Function EmptyClipboard Lib "user32" () As Long
Private Declare Function RegisterClipboardFormat Lib "user32" Alias _
"RegisterClipboardFormatA" (ByVal lpString As String) As Long
Private Declare Function SetClipboardData Lib "user32" _
(ByVal wFormat As Long, ByVal hMem As Long) As Long
Private Declare Function CloseClipboard Lib "user32" () As Long
'---'
Dim wdDoc As Word.Document 'Ref: Microsoft Word 11.0 Object Library'
Private Sub Class_Initialize()
Set wdDoc = New Word.Document
End Sub
Private Sub Class_Terminate()
wdDoc.Close False
Set wdDoc = Nothing
End Sub
'---'
Private Function CopyRTF(strCopyString As String) As Boolean
Dim hGlobalMemory As Long
Dim lpGlobalMemory As Long
Dim hClipMemory As Long
Dim lngFormatRTF As Long
'Allocate and copy string to memory'
hGlobalMemory = GlobalAlloc(&H42, Len(strCopyString) + 1)
lpGlobalMemory = GlobalLock(hGlobalMemory)
lpGlobalMemory = lstrcpy(lpGlobalMemory, strCopyString)
'Unlock the memory and then copy to the clipboard'
If GlobalUnlock(hGlobalMemory) = 0 Then
If OpenClipboard(0&) <> 0 Then
Call EmptyClipboard
'Save the data as Rich Text Format'
lngFormatRTF = RegisterClipboardFormat("Rich Text Format")
hClipMemory = SetClipboardData(lngFormatRTF, hGlobalMemory)
CopyRTF = CBool(CloseClipboard)
End If
End If
End Function
'---'
Private Function PasteRTF() As String
Dim strOutput As String
'Paste the clipboard data to the wdDoc and read the plain text result'
wdDoc.Range.Paste
strOutput = wdDoc.Range.Text
'Get rid of the new lines at the beginning and end of the document'
strOutput = Left(strOutput, Len(strOutput) - 2)
strOutput = Right(strOutput, Len(strOutput) - 2)
PasteRTF = strOutput
End Function
'---'
Public Function ParseRTF(strRTF As String) As String
If CopyRTF(strRTF) Then
ParseRTF = PasteRTF
Else
ParseRTF = "Error in copying to clipboard"
End If
End Function
You could call it for each of your 12,000 cells using something similar to this:
Sub CopyParseAllRange()
Dim rngCell As Range
Dim strRTF As String
'Create new instance of clsRTFParser'
Dim RTFParser As clsRTFParser
Set RTFParser = New clsRTFParser
For Each rngCell In Range("A1:A12000")
'Parse the cell contents'
strRTF = RTFParser.ParseRTF(CStr(rngCell))
'Output to the cell one column over'
rngCell.Offset(0, 1) = strRTF
Next
End Sub
I have simulated this using example RTF strings on my machine. For 12,000 cells it took two and a half minutes, a much more reasonable time frame!
You can try to parse every cell with regular expression and leave only the content you need.
Every RTF control code start with "\" and ends with space, without any additional space between. "{}" are use for grouping. If your text won't contain any, you can just remove them (the same for ";"). So now you stay with your text and some unnecessary words as "Arial", "Normal" etc. You can build the dictionary to remove them also. After some tweaking, you will stay with only the text you need.
Look at http://www.regular-expressions.info/ for more information and great tool to write RegExp's (RegexBuddy - unfortunately it isn't free, but it's worth the money. AFAIR there is also trial).
UPDATE: Of course, I don't encourage you to do it manually for every cell. Just iterate through active range:
Refer this thread:
SO: About iterating through cells in VBA
Personally, I'll give a try to this idea:
Sub Iterate()
For Each Cell in ActiveSheet.UsedRange.Cells
'Do something
Next
End Sub
And how to use RegExp's in VBA (Excel)?
Refer:
Regex functions in Excel
and
Regex in VBA
Basically you've to use VBScript.RegExp object through COM.
Some of the solutions here require a reference to the MS Word Object Library. Playing with the cards I am dealt, I found a solution that does not rely on it. It strips RTF tags, and other fluff like font tables and stylesheets, all in VBA. It might be helpful to you. I ran it across your data, and other than the whitespace, I get the same output as what you expected.
Here is the code.
First, something to check if a string is alphanumeric or not. Give it a string that's one character long. This function is used to work out delimitation here and there.
Public Function Alphanumeric(Character As String) As Boolean
If InStr("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-", Character) Then
Alphanumeric = True
Else
Alphanumeric = False
End If
End Function
Next up is to remove an entire group. I use this to remove font tables and other rubbish.
Public Function RemoveGroup(RTFString As String, GroupName As String) As String
Dim I As Integer
Dim J As Integer
Dim Count As Integer
I = InStr(RTFString, "{\" & GroupName)
' If the group was not found in the RTF string, then just return that string unchanged.
If I = 0 Then
RemoveGroup = RTFString
Exit Function
End If
' Otherwise, we will need to scan along, from the start of the group, until we find the end of the group.
' The group is delimited by { and }. Groups may be nested, so we need to count up if we encounter { and
' down if we encounter }. When that count reaches zero, then the end of the group has been found.
J = I
Do
If Mid(RTFString, J, 1) = "{" Then Count = Count + 1
If Mid(RTFString, J, 1) = "}" Then Count = Count - 1
J = J + 1
Loop While Count > 0
RemoveGroup = Replace(RTFString, Mid(RTFString, I, J - I), "")
End Function
Okay, and this function removes any tags.
Public Function RemoveTags(RTFString As String) As String
Dim L As Long
Dim R As Long
L = 1
' Search to the end of the string.
While L < Len(RTFString)
' Append anything that's not a tag to the return value.
While Mid(RTFString, L, 1) <> "\" And L < Len(RTFString)
RemoveTags = RemoveTags & Mid(RTFString, L, 1)
L = L + 1
Wend
'Search to the end of the tag.
R = L + 1
While Alphanumeric(Mid(RTFString, R, 1)) And R < Len(RTFString)
R = R + 1
Wend
L = R
Wend
End Function
We can remove curly braces in the obvious way:
Public Function RemoveBraces(RTFString As String) As String
RemoveBraces = Replace(RTFString, "{", "")
RemoveBraces = Replace(RemoveBraces, "}", "")
End Function
Once you have the functions above copy-pasted into your module, you can create a function that uses them to strip away any stuff you don't need or want. The following works perfectly in my case.
Public Function RemoveTheFluff(RTFString As String) As String
RemoveTheFluff = Replace(RTFString, vbCrLf, "")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "fonttbl")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "colortbl")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "stylesheet")
RemoveTheFluff = RemoveTags(RemoveBraces(RemoveTheFluff))
End Function
I hope this helps. I wouldn't use it in a word processor or anything, but it might do for scraping data if that's what you're doing.
Your post made it sound as if each RTF document was stored in a single Excell cell. If so, then
Solution using .Net Framework RichTextBox control
will convert the RTF in each cell to plain text in 2 lines of code (after a little system configuration to get the right .tlb file to allow reference to the .Net Framework). Put the cell value in rtfsample and
Set miracle = New System_Windows_Forms.RichTextBox
With miracle
.RTF = rtfText
PlainText = .TEXT
End With