I understand "Environ" can identify who opens the file, but I do not know how to write the code for it.
I found one answer that emails via Outlook when a file is opened, but ideally it would be logged with the person's name and time stamped in a hidden tab in the worksheet or some other file. Since the user will not be making edits to the file and/or saving it I don't know if that is an option.
Here's some code you can use. Open VBE (Alt+F11) double click on the "ThisWorkbook" over in the Project window for your spreadsheet and then paste this in.
Public Declare Function GetUserName Lib "advapi32.dll" _
Alias "GetUserNameA" (ByVal lpBuffer As String, nSize As Long) As Long
Private Sub Workbook_Open()
'When the worksheet opens, this will write the computer username
' and the date and time to a worksheet of your choice
' just change that "YourHiddenSheetNameHere" to the name of your
' hidden tab
Dim lastRow As Integer
Dim hiddenSheet As Worksheet
Set hiddenSheet = Sheets("YourHiddenSheetNAmeHere")
lastRow = hiddenSheet.Range("A999999").End(xlUp).Row
hiddenSheet.Cells(lastRow, 1).Value = ReturnUserName
hiddenSheet.Cells(lastRow, 1).Value = Now()
End Function
Function ReturnUserName() As String
' returns the NT Domain User Name
Dim rString As String * 255, sLen As Long, tString As String
tString = ""
On Error Resume Next
sLen = GetUserName(rString, 255)
sLen = InStr(1, rString, Chr(0))
If sLen > 0 Then
tString = Left(rString, sLen - 1)
Else
tString = rString
End If
On Error GoTo 0
ReturnUserName = UCase(Trim(tString))
End Function
This will fire every time someone opens the workbook saving the username they used to log into the computer as well as the date and time. Saving to whichever tab you stick in there. You'll need to save the workbook with .xlsm instead of .xlsx since it will be a macro-enabled book.
Related
I have a SAP GUI script running every day in VBA. In the script I am exporting some data from SAP to several different Excel files, and these are saved to a network drive. In the first macro, I export data. In the second I copy the data to the same workbook as the script is in.
Some days I get a runtime error
Subscript out of range
on Set ws2 = Workbooks("FEBA_EXPORT_" & today2 & ".XLSX").Worksheets("Sheet1").
It looks like the Excel file is not recognized as open. I manually close the file, and reopen it and then the script will run.
I tried to insert the below code in front of the Set ws2 line that is giving an error, and this code is always giving the massage that the file is open.
Dim Ret
Ret = IsWorkBookOpen(filepath & "FEBA_EXPORT_" & today2 & ".XLSX")
If Ret = True Then
MsgBox "File is open"
Else
MsgBox "File is Closed"
End If
This is the function:
Function IsWorkBookOpen(FileName As String)
Dim ff As Long, ErrNo As Long
On Error Resume Next
ff = FreeFile()
Open FileName For Input Lock Read As #ff
Close ff
ErrNo = Err
On Error GoTo 0
Select Case ErrNo
Case 0: IsWorkBookOpen = False
Case 70: IsWorkBookOpen = True
Case Else: Error ErrNo
End Select
End Function
This is the relevant part of the code:
Sub CopyExportedFEBA_ExtractFEBRE()
Dim SapGuiAuto As Object
Dim SAPApp As Object
Dim SAPCon As Object
Dim session As Object
Set SapGuiAuto = GetObject("SAPGUI")
Set SAPApp = SapGuiAuto.GetScriptingEngine
Set SAPCon = SAPApp.Children(0)
Set session = SAPCon.Children.ElementAt(0) ' <--- Assumes you are using the first session open. '
Dim ws0, ws1, ws2, ws6, ws7 As Worksheet
Set ws0 = Workbooks("FEB_BSPROC.xlsm").Worksheets("INPUT")
Set ws1 = Workbooks("FEB_BSPROC.xlsm").Worksheets("FEB_BSPROC")
Set ws6 = Workbooks("FEB_BSPROC.xlsm").Worksheets("FBL3N_1989")
Dim today2, filepath As String
today2 = ws0.Range("E2")
filepath = ws0.Range("A7")
' Check if if FEBA_EXPORT wb is open
' This is giving the message that the file is open
Dim Ret
Ret = IsWorkBookOpen(filepath & "FEBA_EXPORT_" & today2 & ".XLSX")
If Ret = True Then
MsgBox "File is open"
Else
MsgBox "File is Closed"
End If
' This is giving runtime error 9 Subscript out of range
' If manually close the Excel and the reopen, then it will always work after this
Set ws2 = Workbooks("FEBA_EXPORT_" & today2 & ".XLSX").Worksheets("Sheet1")
'This is never giving any errors
Set ws7 = Workbooks("1989_" & today2 & ".XLSX").Worksheets("Sheet1")
The filepath varaiable is the full filepath to the network drive. So this is not the issue. Also I have another Excel file that is opened in the same way, and that one is never giving any errors.
The today2 variable is also correct.
I thought that it would work if I could close the ws2 workbook with VBA and then reopen it. So I tried to close it without setting it to a variable, but then I got the same error.
With SAP GUI scripting when you export anything to an Excel file, the file will open automatically after it has been saved. I am wondering if this could be the issue? I only have problems with this one Excel file, and not with any of several others that are saved and opened in the same way.
As I said in my above comment, the workbook may be open in a new session, different from the one where the code runs. Please, use the next function to identify if it is a matter of different Excel session:
Function sameExSession(wbFullName As String, Optional boolClose As Boolean) As Boolean
Dim sessEx As Object, wb As Object
Set sessEx = GetObject(wbFullName).Application
If sessEx.hwnd = Application.hwnd Then
sameExSession = True
Else
sameExSession = False
If boolClose Then
sessEx.Workbooks(Right(wbFullName, Len(wbFullName) - InStrRev(wbFullName, "\"))).Close False
sessEx.Quit: Set sessEx = Nothing
End If
End If
End Function
It identify the session where the workbook is open, then compare its handle with the active session one and if not the same, close the workbook (if calling the function with second parameter as True), quit the session and returns False. If only checking, call the function with the second parameter being False (the workbook will not be closed, and session will still remain).
It can be used in the next way:
Sub testSameExSession()
Dim wbFullName As String, wbSAP As Workbook
wbFullName = filepath & "FEBA_EXPORT_" & today2 & ".XLSX"
If sameExSession(wbFullName, True) Then
Debug.Print "The same session"
Set wbSAP = Workbooks("FEBA_EXPORT_" & today2 & ".XLSX")
Else
Debug.Print "Different session..."
Set wbSAP = Workbooks.Open(wbFullName)
End If
Debug.Print wbSAP.Name
'use the set workbook to do what you need...
End Sub
When you have the described problem, please use the above way to test if it is a matter of different sessions.
If so, is easy to input this part in your existing code, I think. If the workbook will be open in a different session, no need to manually close it (and reopen), the above function does it...
In case someone is still facing this issue, I found a way to wait for the excel files downloaded from SAP and its app instance to open, then close them and let you work with the files without troubles. You can set a timeout too.
If files are downloaded and opened in an already open instance of excel, it will just close the files and not the whole instance.
You can use it as follow:
Sub Test()
Call Close_SAP_Excel("Test.xlsx", "Test2.xlsx")
End Sub
xCloseExcelFromSAP
#If VBA7 Then
Private Declare PtrSafe Function AccessibleObjectFromWindow Lib "oleacc" ( _
ByVal hwnd As LongPtr, ByVal dwId As Long, riid As Any, ppvObject As Object) As Long
Private Declare PtrSafe Function FindWindowExA Lib "user32" ( _
ByVal hwndParent As LongPtr, ByVal hwndChildAfter As LongPtr, _
ByVal lpszClass As String, ByVal lpszWindow As String) As LongPtr
#Else
Private Declare Function AccessibleObjectFromWindow Lib "oleacc" ( _
ByVal hwnd As Long, ByVal dwId As Long, riid As Any, ppvObject As Object) As Long
Private Declare Function FindWindowExA Lib "user32" ( _
ByVal hwndParent As Long, ByVal hwndChildAfter As Long, _
ByVal lpszClass As String, ByVal lpszWindow As String) As Long
#End If
Sub Close_SAP_Excel(ParamArray FileNames())
'Procedure to close files downloaded from SAP and at the same time close the Excel application instance that will be open with them.
Dim ExcelAppSAP As Variant
Dim ExcelFile As Variant
Dim FinishedLoop As Boolean, TimeoutReached As Boolean, FileClosed As Boolean
Dim ReTry As Long
Dim i As Long, x As Long
Set ExcelAppSAP = Nothing
ReTry = 100000 'Used as Timeout 100000 = ~10 seconds
i = 1
'The following loop is executed until excel file is closed.
'Inside of this, there is a For Loop for each Excel Instance and inside of that is another loop
'for each excel inside the instance. If name matches, it is closed.
Do While Not FinishedLoop
If i > ReTry Then
TimeoutReached = True
Exit Do
End If
For Each ExcelFile In GetExcelInstances() 'Function to Get Excel Open Instances
For Each xls In ExcelFile.Workbooks
For x = LBound(FileNames()) To UBound(FileNames())
If xls.Name = FileNames(x) Then
Set ExcelAppSAP = ExcelFile 'Set Instance opened by SAP to variable
'Here add actions if needed. Reference to workbook as xls e.g.: xls.Sheets(1).Range("A1").Copy
xls.Close SaveChanges:=False
FileClosed = True
End If
Next x
Next
Next
If FileClosed Then
FinishedLoop = True
End If
i = i + 1
Loop
ThisWorkbook.Activate
If Not TimeoutReached Then
If FileClosed Then
On Error Resume Next
If ExcelAppSAP.Workbooks.Count = 0 Then
ExcelAppSAP.Quit
End If
Else
MsgBox "Excel application instance from SAP was not closed correctly. Please close it manually or try again.", , "Error"
End If
Else
MsgBox "Max timeout reached", , "Error"
End If
End Sub
Public Function GetExcelInstances() As Collection
Dim guid&(0 To 3), acc As Object, hwnd, hwnd2, hwnd3
guid(0) = &H20400
guid(1) = &H0
guid(2) = &HC0
guid(3) = &H46000000
Set GetExcelInstances = New Collection
Do
hwnd = FindWindowExA(0, hwnd, "XLMAIN", vbNullString)
If hwnd = 0 Then Exit Do
hwnd2 = FindWindowExA(hwnd, 0, "XLDESK", vbNullString)
hwnd3 = FindWindowExA(hwnd2, 0, "EXCEL7", vbNullString)
If AccessibleObjectFromWindow(hwnd3, &HFFFFFFF0, guid(0), acc) = 0 Then
GetExcelInstances.Add acc.Application
End If
Loop
End Function
Trying to get an older VB.NET application working again. One feature builds a text string composed of text delimited by Tab/Return characters, then creates (via interop) an Excel Workbook, adds a Worksheet, and (desired) paste the text string into the worksheet.
Here is the code:
Private Function AddNewWorksheetToWorkbook(
ByVal theWorkbook As Workbook,
ByVal worksheetName As String,
ByVal textToPaste As String
) As Microsoft.Office.Interop.Excel.Worksheet
Dim newWorksheet As Microsoft.Office.Interop.Excel.Worksheet
newWorksheet = theWorkbook.Worksheets.Add()
newWorksheet.Name = worksheetName
theWorkbook.Save()
newWorksheet.Activate() 'All works fine, file saved, worksheet named and Active as desired
Dim app As Microsoft.Office.Interop.Excel.Application
app = newWorksheet.Application
If app.ActiveSheet.Name = newWorksheet.Name Then 'Just a test to make sure ActiveSheet is the one desired -- it is
Clipboard.SetText(textToPaste) 'Clipboard has text delimited by vbTab and vbReturn (a "plain" text table)
newWorksheet.Range("A1").Select() 'Cell "A1" is properly selected
newWorksheet.Paste() 'BOOM! Get System.Runtime.InteropServices.COMException: 'Microsoft Excel cannot paste the data.'
End If
theWorkbook.Save()
Return newWorksheet
End Function
As noted in the comments, all goes well until the Worksheet.Paste() method call.
I have tried variations on Paste() as well as PasteSpecial(), etc. No joy.
Keep getting System.Runtime.InteropServices.COMException: 'Microsoft Excel cannot paste the data.'
I am able to (manually, not through interop) click "Paste" in Excel and it works just fine.
I would be grateful for any insights from the stackoverflow community!
So, here is what I ended up doing to solve (actually avoid and solve) the problem I was facing. Here is how I altered the existing function.
Private Function AddNewWorksheetToWorkbook(
ByVal theWorkbook As Workbook,
ByVal worksheetName As String,
ByVal textToPaste As String
) As Microsoft.Office.Interop.Excel.Worksheet
Dim newWorksheet As Microsoft.Office.Interop.Excel.Worksheet
newWorksheet = theWorkbook.Worksheets.Add()
newWorksheet.Name = worksheetName
theWorkbook.Save()
newWorksheet.Activate() 'All works fine, file saved, worksheet named and Active as desired
Dim app As Microsoft.Office.Interop.Excel.Application
app = newWorksheet.Application
If app.ActiveSheet.Name = newWorksheet.Name Then
Dim rowCount As Integer = 0
Dim colCount As Integer = 0
Dim values(,) As String = ExtractTwoDimDataSet(pasteText, rowCount, colCount)
Dim oRange As Range
oRange = newWorksheet.Range(newWorksheet.Cells(1, 1), newWorksheet.Cells(rowCount, colCount))
oRange.Value = values
End If
theWorkbook.Save()
Return newWorksheet
End Function
The change, of course, is to not use the Clipboard at all (which users might appreciate) and assign the "two-dimensional" text array to a Cell range on the Worksheet. The function (yes, I know, ugly with return values and ByRef parameters) is as follows:
Private Shared Function ExtractTwoDimDataSet(tabAndCrLfDelimitedText As String, ByRef rowCount As Integer, ByRef colCount As Integer) As String(,)
rowCount = 0
colCount = 0
Dim rows() As String
Dim columns() As String
rows = Split(tabAndCrLfDelimitedText, vbCrLf)
rowCount = rows.Length
For Each line As String In rows
columns = Split(line, vbTab)
If columns.Length > colCount Then
colCount = columns.Length
End If
Next
Dim values(rowCount, colCount) As String
rows = Split(tabAndCrLfDelimitedText, vbCrLf)
Dim r As Integer = 0
For Each line As String In rows
columns = Split(line, vbTab)
Dim c As Integer = 0
For Each cell As String In columns
values(r, c) = cell
c = c + 1
Next
r = r + 1
Next
Return values
End Function
The end result does what it needs to do and the function above is fairly reusable but I marked it Private as it is not general-purpose, and depends on the vbCrLf and vbTab delimiters.
This is clearly in the spirit of advice from #Mary ...
Thanks for the views and suggestions from stackoverflow folks!
I have several instances of Excel running, maybe up to 4 instances. One of these (let's call in instance A) usually does not have an open workbook. It is used by one of the other instances (let's call it instance B) to open a workbook in instance A, add, change data, then save and close that workbook, thus the instance where these changes are being made (instance A) returns to a state where there are no open workbooks. I do it this way because it is much faster than having the workbook with the code (instance B) open a workbook, perform these tasks and then close the workbook.
My problem is this: From time to time, for debugging purposes mostly, it is desirable to make instance A visible, but what I'm finding is that an instance without an open workbook cannot be made visible, or at least this is what I'm concluding. I am using Excel 2016, 64 bit. My code to do this is:
Private Sub cmdMakeSelectionVisible_Click()
Dim strng As String
Dim lCol As Long, lRow As Long
Dim oXLApp As Excel.Application
Dim bFoundInstance As Boolean
Dim wb_Actress As Workbook
With Me.lstXL '<--| refer to your listbox: change "ListBox1" with your actual listbox name
For lRow = 0 To .ListCount - 1 '<--| loop through listbox rows
If .Selected(lRow) Then '<--| if current row selected
For lCol = 0 To .ColumnCount - 1 '<--| loop through listbox columns
strng = strng & .List(lRow, lCol) & " | " '<--| build your output string
If lCol = 1 Then
MsgBox .List(lRow, lCol)
bFoundInstance = GetReferenceToXLApp(.List(lRow, lCol), oXLApp)
MsgBox oXLApp.Caption
Set wb_Actress = oXLApp.Workbooks.Open("T:\-1996\Dummy Performer's Book.xlsm")
oXLApp.Visible = True
wb_Actress.Close
End If
Next lCol
MsgBox "you selected" & vbCrLf & Left(strng, (Len(strng) - 1)) '<--| show output string (after removing its last character ("|"))
Exit For '<-_| exit loop
End If
Next lRow
End With
End Sub
Inside the For Loops there is an If statement and in that If statement if the instance in question opens a workbook, then the code works. If the instance does not open a workbook and it contains no children, it does not work. If at that point I test to see if oXApp.Visible is true, it is true, but the instance remains hidden.
The question is, is there any way around this without opening a workbook, make the instance visible, then closing the workbook? I still consider myself a novice when it comes to VBA. There may be a totally different approach other than using the handle to the desired instance.
Thanks for looking and helping.
Edit:
The code for GetReferenceToXLApp is:
Public Function GetReferenceToXLApp(hWndXL As Long, oXLApp As Object) As Boolean
Dim hWinDesk As Long
Dim hWin7 As Long
Dim obj As Object
Dim iID As GUID
'// Rather than explaining, go read
'// http://msdn.microsoft.com/en-us/library/windows/desktop/ms687262(v=vs.85).aspx
Call IIDFromString(StrPtr(IID_IDispatch), iID)
'// We have the XL App (Class name XLMAIN)
'// This window has a child called 'XLDESK' (which I presume to mean 'XL desktop')
'// XLDesk is the container for all XL child windows....
hWinDesk = FindWindowEx(hWndXL, 0&, "XLDESK", vbNullString)
'// EXCEL7 is the class name for a Workbook window (and probably others, as well)
'// This is used to check there is actually a workbook open in this instance.
hWin7 = FindWindowEx(hWinDesk, 0&, "EXCEL7", vbNullString)
'// Deep API... read up on it if interested.
'// http://msdn.microsoft.com/en-us/library/windows/desktop/dd317978(v=vs.85).aspx
If AccessibleObjectFromWindow(hWin7, OBJID_NATIVEOM, iID, obj) = RETURN_OK Then
Set oXLApp = obj.Application
GetReferenceToXLApp = True
End If
End Function
And
Private Declare PtrSafe Function IIDFromString Lib "ole32" _
(ByVal lpsz As LongPtr, ByRef lpiid As GUID) As Long
And
Private Declare PtrSafe Function FindWindowEx Lib "User32" Alias "FindWindowExA" _
(ByVal hWnd1 As LongPtr, ByVal hWnd2 As LongPtr, ByVal lpsz1 As String, _
ByVal lpsz2 As String) As Long
Seems to work...
Dim oApp As Excel.Application
Sub TT()
Set oApp = New Excel.Application
'oApp.Workbooks.Add
oApp.Visible = True
Debug.Print oApp.HWnd, Application.HWnd
Debug.Print oApp.Caption, Application.Caption
Application.Wait Now + TimeSerial(0, 0, 2)
AppActivate Application.Caption
Application.Wait Now + TimeSerial(0, 0, 2)
AppActivate oApp.Caption
End Sub
First up, I admit I know almost nothing about writing macro's in excel. I managed to create the macro below from piecing together other macros from various other posts.
The macro below works, but I need it to do one more thing and I cannot figure it out.
Basically, what the macro does is allow the user to select a folder location and then go through a column and pick up the link contained within each row of that column and then save the file on the other end of that link to the folder path selected with a defined naming format.
The only bit that I cannot figure out, is when I apply a filter to the excel form, it will still grab all the files regardless of whether they are visible or not.
The current macro:
*Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias _
"URLDownloadToFileA" (ByVal pCaller As Long, ByVal szURL As String, ByVal _
szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
Sub Button1_Click()
Dim intranetLink As String
Dim mainBook As Workbook
Dim Counter As Integer
Dim saveDialog As FileDialog
Dim savePath As String
Dim filename As String
Counter = 4
Set saveDialog = Application.FileDialog(msoFileDialogFolderPicker)
With saveDialog
.Title = "Select a Folder" 'sticks a title on the dialog so the user kind of knows what they're supposed to be doing
.AllowMultiSelect = False 'prevents the user from selecting more than one item out the dialog.
.InitialFileName = strPath '
If .Show <> -1 Then GoTo FolderBombed 'if the user does something funky or cancels, abort the rest of the macro.
savePath = .SelectedItems(1) 'get the file path to the selected folder
End With
For Each vCell In Range("J4:J" & Cells(Rows.Count, "J").End(xlUp).Row)
intranetLink = vCell.Text
filename = Cells(Counter, 6)
filename = "c:\Path\" + filename
URLDownloadToFile 0, intranetLink, filename, 0, 0
Counter = Counter + 1
Next vCell
FolderBombed:
MsgBox ("Completed")
End Sub*
The line I need to modify is the following one:
For Each vCell In Range("J4:J" & Cells(Rows.Count, "J").End(xlUp).Row)
I have tried changing it to something like this:
For Each vCell In Range("J4:J" & Cells(Rows.Count, "J").CurrentRegion.SpecialCells(xlVisible).End(xlUp).Row).
but all this does is select a file that is filtered out of the range.
Any help in getting this right would be greatly appreciated.
Chris.
Use this - loop through the cells and check each one:
For Each vCell In Range("J4:J" & Cells(Rows.Count, "J").End(xlUp).Row)
If not vCell.EntireRow.Hidden Then
[...]
I have approx. 12000 cells in excel containing RTF (including formatting tags). I need to parse them to get to the unformatted text.
This is the example of one of the cells with text:
{\rtf1\ansi\deflang1060\ftnbj\uc1
{\fonttbl{\f0 \froman \fcharset0 Times New Roman;}{\f1 \fswiss \fcharset238
Arial;}}
{\colortbl ;\red255\green255\blue255 ;\red0\green0\blue0 ;}
{\stylesheet{\fs24\cf2\cb1 Normal;}{\cs1\cf2\cb1 Default Paragraph Font;}}
\paperw11908\paperh16833\margl1800\margr1800\margt1440\margb1440\headery720\footery720
\deftab720\formshade\aendnotes\aftnnrlc\pgbrdrhead\pgbrdrfoot
\sectd\pgwsxn11908\pghsxn16833\marglsxn1800\margrsxn1800\margtsxn1440\margbsxn1440
\headery720\footery720\sbkpage\pgncont\pgndec
\plain\plain\f1\fs24\pard TPR 0160 000\par IPR 0160 000\par OB-R-02-28\par}
And all I really need is this:
TPR 0160 000
IPR 0160 000
OB-R-02-28
The problem with simple looping over the cells and removing unnecessary formatting is, that not everything in those 12000 cells is as straightforward as this is. So I would need to manually inspect many different versions and write several variations; and still at the end there would be a lot of manual work to do.
But if I copy the contents of one cell to empty text document and save it as RTF, then open it with MS Word, it instantly parses the text and I get exactly what I want. Unfortunately it's extremely inconvenient to do so for a 12000 cells.
So I was thinking about VBA macro, to move cell contents to Word, force parsing and then copy the result back to the originating cell. Unfortunately I'm not really sure how to do it.
Does anybody has any idea? Or a different approach? I will be really grateful for a solution or a push in the right direction.
TNX!
If you did want to go down the route of using Word to parse the text, this function should help you out. As the comments suggest, you'll need a reference to the MS Word Object Library.
Function ParseRTF(strRTF As String) As String
Dim wdDoc As Word.Document 'Ref: Microsoft Word 11.0 Object Library'
Dim f As Integer 'Variable to store the file I/O number'
'File path for a temporary .rtf file'
Const strFileTemp = "C:\TempFile_ParseRTF.rtf"
'Obtain the next valid file I/O number'
f = FreeFile
'Open the temp file and save the RTF string in it'
Open strFileTemp For Output As #f
Print #f, strRTF
Close #f
'Open the .rtf file as a Word.Document'
Set wdDoc = GetObject(strFileTemp)
'Read the now parsed text from the Word.Document'
ParseRTF = wdDoc.Range.Text
'Delete the temporary .rtf file'
Kill strFileTemp
'Close the Word connection'
wdDoc.Close False
Set wdDoc = Nothing
End Function
You could call it for each of your 12,000 cells using something similar to this:
Sub ParseAllRange()
Dim rngCell As Range
Dim strRTF As String
For Each rngCell In Range("A1:A12000")
'Parse the cell contents'
strRTF = ParseRTF(CStr(rngCell))
'Output to the cell one column over'
rngCell.Offset(0, 1) = strRTF
Next
End Sub
The ParseRTF function takes about a second to run (on my machine at least), so for 12,000 cells this will work out at about three and a half hours.
Having thought about this problem over the weekend, I was sure there was a better (quicker) solution for this.
I remembered the RTF capabilities of the clipboard, and realised that a class could be created that would copy RTF data to the clipboard, paste to a word doc, and output the resulting plain text. The benefit of this solution is that the word doc object would not have to be opened and closed for each rtf string; it could be opened before the loop and closed after.
Below is the code to achieve this. It is a Class module named clsRTFParser.
Private Declare Function GlobalAlloc Lib "kernel32" _
(ByVal wFlags&, ByVal dwBytes As Long) As Long
Private Declare Function GlobalLock Lib "kernel32" _
(ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32" _
(ByVal hMem As Long) As Long
Private Declare Function lstrcpy Lib "kernel32" _
(ByVal lpString1 As Any, ByVal lpString2 As Any) As Long
Private Declare Function OpenClipboard Lib "user32" _
(ByVal Hwnd As Long) As Long
Private Declare Function EmptyClipboard Lib "user32" () As Long
Private Declare Function RegisterClipboardFormat Lib "user32" Alias _
"RegisterClipboardFormatA" (ByVal lpString As String) As Long
Private Declare Function SetClipboardData Lib "user32" _
(ByVal wFormat As Long, ByVal hMem As Long) As Long
Private Declare Function CloseClipboard Lib "user32" () As Long
'---'
Dim wdDoc As Word.Document 'Ref: Microsoft Word 11.0 Object Library'
Private Sub Class_Initialize()
Set wdDoc = New Word.Document
End Sub
Private Sub Class_Terminate()
wdDoc.Close False
Set wdDoc = Nothing
End Sub
'---'
Private Function CopyRTF(strCopyString As String) As Boolean
Dim hGlobalMemory As Long
Dim lpGlobalMemory As Long
Dim hClipMemory As Long
Dim lngFormatRTF As Long
'Allocate and copy string to memory'
hGlobalMemory = GlobalAlloc(&H42, Len(strCopyString) + 1)
lpGlobalMemory = GlobalLock(hGlobalMemory)
lpGlobalMemory = lstrcpy(lpGlobalMemory, strCopyString)
'Unlock the memory and then copy to the clipboard'
If GlobalUnlock(hGlobalMemory) = 0 Then
If OpenClipboard(0&) <> 0 Then
Call EmptyClipboard
'Save the data as Rich Text Format'
lngFormatRTF = RegisterClipboardFormat("Rich Text Format")
hClipMemory = SetClipboardData(lngFormatRTF, hGlobalMemory)
CopyRTF = CBool(CloseClipboard)
End If
End If
End Function
'---'
Private Function PasteRTF() As String
Dim strOutput As String
'Paste the clipboard data to the wdDoc and read the plain text result'
wdDoc.Range.Paste
strOutput = wdDoc.Range.Text
'Get rid of the new lines at the beginning and end of the document'
strOutput = Left(strOutput, Len(strOutput) - 2)
strOutput = Right(strOutput, Len(strOutput) - 2)
PasteRTF = strOutput
End Function
'---'
Public Function ParseRTF(strRTF As String) As String
If CopyRTF(strRTF) Then
ParseRTF = PasteRTF
Else
ParseRTF = "Error in copying to clipboard"
End If
End Function
You could call it for each of your 12,000 cells using something similar to this:
Sub CopyParseAllRange()
Dim rngCell As Range
Dim strRTF As String
'Create new instance of clsRTFParser'
Dim RTFParser As clsRTFParser
Set RTFParser = New clsRTFParser
For Each rngCell In Range("A1:A12000")
'Parse the cell contents'
strRTF = RTFParser.ParseRTF(CStr(rngCell))
'Output to the cell one column over'
rngCell.Offset(0, 1) = strRTF
Next
End Sub
I have simulated this using example RTF strings on my machine. For 12,000 cells it took two and a half minutes, a much more reasonable time frame!
You can try to parse every cell with regular expression and leave only the content you need.
Every RTF control code start with "\" and ends with space, without any additional space between. "{}" are use for grouping. If your text won't contain any, you can just remove them (the same for ";"). So now you stay with your text and some unnecessary words as "Arial", "Normal" etc. You can build the dictionary to remove them also. After some tweaking, you will stay with only the text you need.
Look at http://www.regular-expressions.info/ for more information and great tool to write RegExp's (RegexBuddy - unfortunately it isn't free, but it's worth the money. AFAIR there is also trial).
UPDATE: Of course, I don't encourage you to do it manually for every cell. Just iterate through active range:
Refer this thread:
SO: About iterating through cells in VBA
Personally, I'll give a try to this idea:
Sub Iterate()
For Each Cell in ActiveSheet.UsedRange.Cells
'Do something
Next
End Sub
And how to use RegExp's in VBA (Excel)?
Refer:
Regex functions in Excel
and
Regex in VBA
Basically you've to use VBScript.RegExp object through COM.
Some of the solutions here require a reference to the MS Word Object Library. Playing with the cards I am dealt, I found a solution that does not rely on it. It strips RTF tags, and other fluff like font tables and stylesheets, all in VBA. It might be helpful to you. I ran it across your data, and other than the whitespace, I get the same output as what you expected.
Here is the code.
First, something to check if a string is alphanumeric or not. Give it a string that's one character long. This function is used to work out delimitation here and there.
Public Function Alphanumeric(Character As String) As Boolean
If InStr("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-", Character) Then
Alphanumeric = True
Else
Alphanumeric = False
End If
End Function
Next up is to remove an entire group. I use this to remove font tables and other rubbish.
Public Function RemoveGroup(RTFString As String, GroupName As String) As String
Dim I As Integer
Dim J As Integer
Dim Count As Integer
I = InStr(RTFString, "{\" & GroupName)
' If the group was not found in the RTF string, then just return that string unchanged.
If I = 0 Then
RemoveGroup = RTFString
Exit Function
End If
' Otherwise, we will need to scan along, from the start of the group, until we find the end of the group.
' The group is delimited by { and }. Groups may be nested, so we need to count up if we encounter { and
' down if we encounter }. When that count reaches zero, then the end of the group has been found.
J = I
Do
If Mid(RTFString, J, 1) = "{" Then Count = Count + 1
If Mid(RTFString, J, 1) = "}" Then Count = Count - 1
J = J + 1
Loop While Count > 0
RemoveGroup = Replace(RTFString, Mid(RTFString, I, J - I), "")
End Function
Okay, and this function removes any tags.
Public Function RemoveTags(RTFString As String) As String
Dim L As Long
Dim R As Long
L = 1
' Search to the end of the string.
While L < Len(RTFString)
' Append anything that's not a tag to the return value.
While Mid(RTFString, L, 1) <> "\" And L < Len(RTFString)
RemoveTags = RemoveTags & Mid(RTFString, L, 1)
L = L + 1
Wend
'Search to the end of the tag.
R = L + 1
While Alphanumeric(Mid(RTFString, R, 1)) And R < Len(RTFString)
R = R + 1
Wend
L = R
Wend
End Function
We can remove curly braces in the obvious way:
Public Function RemoveBraces(RTFString As String) As String
RemoveBraces = Replace(RTFString, "{", "")
RemoveBraces = Replace(RemoveBraces, "}", "")
End Function
Once you have the functions above copy-pasted into your module, you can create a function that uses them to strip away any stuff you don't need or want. The following works perfectly in my case.
Public Function RemoveTheFluff(RTFString As String) As String
RemoveTheFluff = Replace(RTFString, vbCrLf, "")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "fonttbl")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "colortbl")
RemoveTheFluff = RemoveGroup(RemoveTheFluff, "stylesheet")
RemoveTheFluff = RemoveTags(RemoveBraces(RemoveTheFluff))
End Function
I hope this helps. I wouldn't use it in a word processor or anything, but it might do for scraping data if that's what you're doing.
Your post made it sound as if each RTF document was stored in a single Excell cell. If so, then
Solution using .Net Framework RichTextBox control
will convert the RTF in each cell to plain text in 2 lines of code (after a little system configuration to get the right .tlb file to allow reference to the .Net Framework). Put the cell value in rtfsample and
Set miracle = New System_Windows_Forms.RichTextBox
With miracle
.RTF = rtfText
PlainText = .TEXT
End With