Excel VBA - Most Efficient Way to Traverse Outlook Folders

What is the most efficient way to traverse Outlook.Folders in Excel VBA? I'm somewhat new to Excel VBA and am presently using recursion to traverse folders.
The below code works but is there anything better than what I coded?
Private Function RecursiveEmailItems( _
Optional SubjectContains As String = Empty, _
Optional FolderType As OlDefaultFolders = olFolderInbox, _
Optional StartFolderNm As String = "", _
Optional Folder As Outlook.Folder = Nothing, _
Optional FolderDepth As Long = 1)
Dim Filter As String ' Stores outlook filter
Dim Emails As Outlook.Items ' Stores list of outlook emails
Dim Email As Outlook.MailItem ' Stores an email item
Dim oTest As Object ' Used to test email/folder item
If SubjectContains <> "" Then Filter = "#SQL=urn:schemas:httpmail:subject ci_phrasematch '" _
& SubjectContains & "'"
'-- Check to see if need to initialize folder - First Run --
If Folder Is Nothing Then
If StartFolderNm = "" Then Set Folder = _
Outlook.Application.GetNamespace("MAPI").GetDefaultFolder(FolderType) _
Else Set Folder = _
Outlook.Application.GetNamespace("MAPI").GetDefaultFolder(FolderType).Folders(StartFolderNm)
End If
'-- Get emails from the folder, use filter if not empty
If Filter = "" Then Set Emails = Folder.Items Else Set Emails = Folder.Items.Restrict(Filter)
'-- Process all emails found --
If Emails.Count > 0 Then Debug.Print Indent(FolderDepth, " ") & "-" & Folder.Name
For Each oTest In Emails
If TypeName(oTest) = "MailItem" Then
Set Email = oTest
Debug.Print Indent(FolderDepth, " ") & " |" & Email.Subject
End If
Next oTest
'-- Process all subfolders --
For Each oTest In Folder.Folders
If TypeName(oTest) = "MAPIFolder" Then
Call RecursiveEmailItems(SubjectContains, FolderType, StartFolderNm, oTest, FolderDepth + 1)
End If
Next oTest
End Function
Private Function GetEmailStatus()
'Dim Filter As String: Initialize: Filter = "Timesheet " & Format(EndDt, "mm/dd/yy")
Dim Filter As String: Initialize: Filter = "Timesheet 06"
Call RecursiveEmailItems("Timesheet 06/", olFolderInbox, "Timesheet")
End Function
Private Function Indent(Count As Long, Char As String) As String
Dim idx As Long
For idx = 1 To Count
Indent = Indent + Char
Next
End Function
Thanks in advance for any help!

Running the Restrict method for each folder is not really a good idea. Instead, you may consider using the AdvancedSearch method of the Application class. The key benefits of using the AdvancedSearch method in Outlook are:
The search is performed in another thread. You don’t need to run another thread manually since the AdvancedSearch method runs it automatically in the background.
Possibility to search for any item types: mail, appointment, calendar, notes etc. in any location, i.e. beyond the scope of a certain folder. The Restrict and Find/FindNext methods can be applied to a particular Items collection (see the Items property of the Folder class in Outlook).
Full support for DASL queries (custom properties can be used for searching too). You can read more about this in the Filtering article in MSDN. To improve the search performance, Instant Search keywords can be used if Instant Search is enabled for the store (see the IsInstantSearchEnabled property of the Store class).
You can stop the search process at any moment using the Stop method of the Search class.
Read more about this method in the Advanced search in Outlook programmatically: C#, VB.NET article.
Public m_SearchComplete As Boolean
Private Sub Application_AdvancedSearchComplete(ByVal SearchObject As Search)
If SearchObject.Tag = "MySearch" Then
m_SearchComplete = True
End If
End Sub
Sub TestSearchForMultipleFolders()
Dim Scope As String
Dim Filter As String
Dim MySearch As Outlook.Search
Dim MyTable As Outlook.Table
Dim nextRow As Outlook.Row
m_SearchComplete = False
'Establish scope for multiple folders
Scope = "'" & Application.Session.GetDefaultFolder( _
olFolderInbox).FolderPath _
& "','" & Application.Session.GetDefaultFolder( _
olFolderSentMail).FolderPath & "'"
'Establish filter
If Application.Session.DefaultStore.IsInstantSearchEnabled Then
Filter = Chr(34) & "urn:schemas:httpmail:subject" _
& Chr(34) & " ci_phrasematch 'Office'"
Else
Filter = Chr(34) & "urn:schemas:httpmail:subject" _
& Chr(34) & " like '%Office%'"
End If
Set MySearch = Application.AdvancedSearch( _
Scope, Filter, True, "MySearch")
While m_SearchComplete <> True
DoEvents
Wend
Set MyTable = MySearch.GetTable
Do Until MyTable.EndOfTable
Set nextRow = MyTable.GetNextRow()
Debug.Print nextRow("Subject")
Loop
End Sub
To specify multiple folders for the Scope parameter, use a comma character between each folder path and enclose each folder path in single quotes. For default folders such as Inbox or Sent Items, you can use the simple folder name instead of the full folder path.
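The scope format described above can be sketched as a small helper. This is only an illustration: BuildSearchScope is a hypothetical name, not part of the Outlook object model, and it assumes the folder paths themselves contain no single quotes.

```vba
' Hypothetical helper: builds an AdvancedSearch scope string from folder paths.
' Each path is wrapped in single quotes and the paths are separated by commas.
Public Function BuildSearchScope(ByVal FolderPaths As Variant) As String
    Dim i As Long
    Dim result As String

    For i = LBound(FolderPaths) To UBound(FolderPaths)
        If Len(result) > 0 Then result = result & ","
        result = result & "'" & FolderPaths(i) & "'"
    Next i

    BuildSearchScope = result
End Function
```

For example, BuildSearchScope(Array("Inbox", "Sent Items")) returns 'Inbox','Sent Items', which can be passed as the Scope argument.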

That looks fine to me. Is there a particular area you want to improve?
The only line I would remove is the If TypeName(oTest) = "MAPIFolder" Then line. All folders in OOM are MAPIFolder, so the check is superfluous, but it won't do much performance wise either way.
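If the recursion itself is the concern, the same walk can be written with an explicit queue. The following is a minimal sketch, not a measured improvement: ListFoldersIteratively is a hypothetical name, and the folder is declared As Object so the code does not depend on an Outlook reference.

```vba
' Sketch: walks a folder tree without recursion, using a Collection as a queue.
' StartFolder is any Outlook folder object (late bound here as Object).
Public Sub ListFoldersIteratively(ByVal StartFolder As Object)
    Dim pending As Collection
    Dim current As Object
    Dim child As Object

    Set pending = New Collection
    pending.Add StartFolder

    Do While pending.Count > 0
        ' Take the oldest entry first (breadth-first order)
        Set current = pending(1)
        pending.Remove 1

        Debug.Print current.FolderPath

        ' Queue the subfolders so they are visited on a later pass
        For Each child In current.Folders
            pending.Add child
        Next child
    Loop
End Sub
```

The per-folder work (for example the Restrict call from the question) would go where Debug.Print is.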

Related

Create a VBA version of dictionaries with 2 values per key

I am trying to make my Excel macro dynamic. The macro essentially looks at only 2 columns, one which contains the name and the other the numeric part. I have my macro working perfectly; the only problem is that the columns are hard coded. In my code, I hard coded the name in column 2 and the numeric part in column 3. However, that is not the case in real life. The name and numeric data could appear in columns 1 and 5, for example. I've been manually rearranging the data in the columns so that it fits what is hard coded. However, I want to make this process dynamic and less manual work for the user.
There are 5 different versions of spreadsheets this macro will be used on, and in each spreadsheet the name and number columns are different. I am looking to make a user form box of some sort, where the user selects "Vendor XYZ", and since Vendor XYZ always sends their data sheets the same way, I know that Vendor XYZ's name column is 2 and number column is 4. So I was thinking that the dictionary would be something in the form of {Vendor XYZ: 2,4} (where the first number is the name column and the second number is the numeric column number... I know the syntax is wrong).
I think my work around this would be to hard code the different vendors and then use if statements ( I haven't tried it yet)
I will have a user input/dropdown box of 5 different vendors. Then something like
If userinput = "A" Then
namecol = 2: numcol = 1
ElseIf userinput = "B" Then
namecol = 3: numcol = 4
End If
I don't know if that would even work. The problem with that is that the number of vendors is small now, but will be scaling up and I can't do that if we have 100 or 1000 vendors.
Any ideas?
Depending on how your initial dataset is retrieved, you can use something like this:
Public Function GetHeaderIndices(ByVal InputData As Variant) As Scripting.Dictionary
If IsEmpty(InputData) Then Exit Function
Dim HeaderIndices As Scripting.Dictionary
Set HeaderIndices = New Scripting.Dictionary
HeaderIndices.CompareMode = TextCompare
Dim i As Long
For i = LBound(InputData, 2) To UBound(InputData, 2)
If Not HeaderIndices.Exists(Trim(InputData(LBound(InputData, 1), i))) Then _
HeaderIndices.Add Trim(InputData(LBound(InputData, 1), i)), i
Next
Set GetHeaderIndices = HeaderIndices
End Function
This Function takes an array as an input and gives the user a dictionary with the indices of the headers from the input.
If you are smart (and I say this because too many users just don't use tables) you will have your data in a table, and you will have named that table. If you did, you could do something like this:
Sub DoSomething()
Dim MyData as Variant
MyData = ThisWorkbook.Worksheets("MyDataSheet").ListObjects("MyTableName").Range.Value
End Sub
So, if your data looked like this:
Foo Baz Bar
1 Car Apple
3 Van Orange
2 Truck Banana
The function would give you a dictionary like:
Keys Items
Foo 1
Baz 2
Bar 3
Then your subroutines could do something like this:
Sub DoEverything()
Dim MyData as Variant
MyData = ThisWorkbook.Worksheets("MyDataSheet").ListObjects("MyTableName").Range.Value
DoSomething MyData
End Sub
Sub DoSomething(ByRef MyData as Variant)
Dim HeaderIndices as Scripting.Dictionary
Set HeaderIndices = GetHeaderIndices(MyData)
Dim i as Long
' Loop through all the rows after the header row.
For i = LBound(MyData, 1) + 1 to Ubound(MyData, 1)
If MyData(i, HeaderIndices("Baz")) = "Truck" Then
Debug.Print MyData(i, HeaderIndices("Foo"))
Debug.Print MyData(i, HeaderIndices("Baz"))
Debug.Print MyData(i, HeaderIndices("Bar"))
End If
Next
End Sub
This does require a reference to the Microsoft Scripting Runtime library, so if you don't want to add a reference you will need to change every As Scripting.Dictionary to As Object and every New Scripting.Dictionary to CreateObject("Scripting.Dictionary"). The TextCompare constant also comes from that library, so replace it with vbTextCompare in that case.
Alternatively, I use the following code module to take care of adding references programmatically for all my users:
Public Sub PrepareReferences()
If CheckForAccess Then
RemoveBrokenReferences
AddReferencebyGUID "{420B2830-E718-11CF-893D-00A0C9054228}"
End If
End Sub
Public Sub AddReferencebyGUID(ByVal ReferenceGUID As String)
Dim Reference As Variant
Dim i As Long
' Set to continue in case of error
On Error Resume Next
' Add the reference
ThisWorkbook.VBProject.References.AddFromGuid _
GUID:=ReferenceGUID, Major:=1, Minor:=0
' If an error was encountered, inform the user
Select Case Err.Number
Case 32813
' Reference already in use. No action necessary
Case 0
' Reference added without issue
Case Else
' An unknown error was encountered, so alert the user
MsgBox "A problem was encountered trying to" & vbNewLine _
& "add or remove a reference in this file" & vbNewLine & "Please check the " _
& "references in your VBA project!", vbCritical + vbOKOnly, "Error!"
End Select
On Error GoTo 0
End Sub
Private Sub RemoveBrokenReferences()
' Reference is a Variant here since it requires an external reference.
' It isn't possible to ensure that the external reference is checked when this process runs.
Dim Reference As Variant
Dim i As Long
For i = ThisWorkbook.VBProject.References.Count To 1 Step -1
Set Reference = ThisWorkbook.VBProject.References.Item(i)
If Reference.IsBroken Then
ThisWorkbook.VBProject.References.Remove Reference
End If
Next i
End Sub
Public Function CheckForAccess() As Boolean
' Checks to ensure access to the Object Model is set
Dim VBP As Variant
If Val(Application.Version) >= 10 Then
On Error Resume Next
Set VBP = ThisWorkbook.VBProject
If Err.Number <> 0 Then
MsgBox "Please pay attention to this message." _
& vbCrLf & vbCrLf & "Your security settings do not allow this procedure to run." _
& vbCrLf & vbCrLf & "To change your security setting:" _
& vbCrLf & vbCrLf & " 1. Select File - Options - Trust Center - Trust Center Settings - Macro Settings." & vbCrLf _
& " 2. Place a checkmark next to 'Trust access to the VBA project object model.'" _
& vbCrLf & "Once you have completed this process, please save and reopen the workbook." _
& vbCrLf & "Please reach out for assistance with this process.", _
vbCritical
CheckForAccess = False
Err.Clear
Exit Function
End If
End If
CheckForAccess = True
End Function
And I have the following command in each Workbook_Open event (less than ideal, but the only good solution I have so far):
Private Sub Workbook_Open()
PrepareReferences
End Sub
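For the "{Vendor XYZ: 2,4}" idea from the question, one option is a dictionary whose items are small arrays. The sketch below is only an illustration: GetVendorColumns, DemoVendorLookup, "Vendor ABC" and its column numbers are hypothetical. It uses late binding, so no reference is needed.

```vba
' Hypothetical lookup: vendor name -> Array(nameColumn, numberColumn)
Public Function GetVendorColumns() As Object
    Dim vendors As Object
    Set vendors = CreateObject("Scripting.Dictionary")
    vendors.CompareMode = vbTextCompare   ' must be set while the dictionary is empty

    vendors.Add "Vendor XYZ", Array(2, 4)
    vendors.Add "Vendor ABC", Array(1, 5)  ' made-up second vendor

    Set GetVendorColumns = vendors
End Function

Public Sub DemoVendorLookup()
    Dim vendors As Object
    Dim cols As Variant

    Set vendors = GetVendorColumns()
    If vendors.Exists("vendor xyz") Then
        cols = vendors("vendor xyz")
        ' cols(0) is the name column, cols(1) is the number column
        Debug.Print "Name column: " & cols(0) & ", number column: " & cols(1)
    End If
End Sub
```

With hundreds of vendors, the Add calls could be replaced by a loop over a three-column settings sheet (vendor, name column, number column), so that new vendors need no code change.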

Need dictionary/loop assistance

I have some work to complete where I have 9 tabs of data (some of which contain thousands of lines of data). Each tab contains (amongst others) a policy number, a credit and/or a debit number.
Every policy number will have a match somewhere in the tabs containing an equal credit or debit, e.g.
tab 1 will have Policy number 123 and a credit of £100 and
tab 5 will also have policy number 123 with a debit of £100.
What I'm looking to do is, look through each policy number on every tab and find where the opposite amount is located adding the location address to each policy number.
I'm certainly not looking for anyone to create the coding for me, but what I am looking for is advice. I've looked at using loops but feel this may take a very long time to process. I've also looked at Dictionaries but am relatively new to these so am not very confident.
Is what I'm looking for even possible? And if so any ideas where to start or pointers? Any advice is greatly appreciated. Thanks!
Usage Example
@Matt555, you can test the created XML file with the following code to get the names of the sheets containing policy "123" with a debit of 100. I tested the code assuming the titles in row 1 of your sheets include "policy" and "debit".
@Peh, you are right, XML DOM methods aren't used too often within VBA. The advantage of using XML in this context is great flexibility in searching via XPath as well as performance over huge files. I prefer it even to arrays or dictionaries when filtering unique values. It is possible to return the found item number in node lists without looping through the whole data set ...
Option Explicit
Sub testPolicy()
Dim policy
Dim debit As Double
policy = "123"
debit = 100
MsgBox "Policy " & policy & " found in " & vbNewLine & _
findSheetName(policy, debit), vbInformation, "Policy " & policy & " in Tabs"
' You can easily split this to an array and analyze the results
End Sub
Function findSheetName(ByVal policy, Optional ByVal debit) As String
' Purpose: Finds Sheet Names where policy AND/OR debit is found
' Note: Assuming the titles in row 1 of your sheets include "policy" and "debit"
' You can declare xDoc also after Option Explicit to make it public
Dim xDoc As Object
Dim xNd As Object ' MSXML.IXMLDOMNode
Dim xNdList As Object ' MSXML.IXMLDOMNodeList
Dim s As String
' XPath expression
Dim xPth As String
If IsMissing(debit) Then
xPth = "//row[policy=""" & policy & """]"
Else
xPth = "//row[policy=""" & policy & """][debit=""" & debit & """]"
End If
' XML to memory
Set xDoc = CreateObject("MSXML2.Domdocument.6.0")
' allow XPath
xDoc.setProperty "SelectionLanguage", "XPath"
xDoc.validateOnParse = False
' ========
' LOAD XML
' ========
xDoc.Load ThisWorkbook.Path & "\" & "output.xml"
' Loop thru NodeList
Set xNdList = xDoc.DocumentElement.SelectNodes(xPth)
Debug.Print xPth, xNdList.Length
For Each xNd In xNdList
s = s & xNd.ParentNode.NodeName & "|"
Next xNd
Set xDoc = Nothing
findSheetName = s
End Function
You could
a) create an XML file looping through all sheets,
b) open it via load method and
c) perform a simple XPath search (I can give some examples later)
I modified a recent answer (cf. excel-vba-xml-parsing-performance)
to do step "a)" using late binding, thus
a) avoiding a reference to MS XML version 6 (msxml6.dll) and
b) getting data over all sheets. XML allows structured search via XPath over nodes in a logical structure comparable to HTML. The root node in this example is called data, the following nodes are named after the sheets, and the subsequent nodes get their names from row 1 of each sheet.
An XML file is a simple text file, which you can open in a text editor. Above all, you can use VBA XMLDOM methods to analyze or search through the items (nodes). I will give you examples relating to your question, but give me some time. => see answer "Usage Example", where I explain some advantages of XML, too (@Peh).
Please pay attention to the added notes, too.
Option Explicit
Sub xmlExportSheets()
' Purpose: XML Export over all sheets in workbook
' cf. Site: excel-vba-xml-parsing-performance, https://stackoverflow.com/questions/40986395/excel-vba-xml-parsing-performance/40987237#40987237
' Note: pretty printed raw output with line breaks and indentation using an embedded XSLT stylesheet
On Error GoTo ErrHandle
' A. Declarations
' 1 DECLARE XML DOC OBJECT '
' a) Early Binding: VBA REFERENCE MSXML, v6.0 necessary'
' Dim doc As New MSXML2.DOMDocument60, xslDoc As New MSXML2.DOMDocument60, newDoc As New MSXML2.DOMDocument60
' Dim root As IXMLDOMElement, dataNode As IXMLDOMElement, datesNode As IXMLDOMElement, namesNode As IXMLDOMElement
' b) Late Binding XML Files:
Dim doc As Object
Dim xslDoc As Object
Dim newDoc As Object
' c) Late Binding XML Nodes:
Dim root As Object
Dim sh As Object ' xml node containing Sheet Name
Dim dataNode As Object
Dim datesNode As Object
Dim namesnode As Object
' 2 DECLARE other variables
Dim i As Long
Dim j As Long
Dim tmpValue As Variant
Dim tit As String
Dim ws As Worksheet
' B. XML Docs to Memory
Set doc = CreateObject("MSXML2.Domdocument.6.0")
Set xslDoc = CreateObject("MSXML2.Domdocument.6.0")
Set newDoc = CreateObject("MSXML2.Domdocument.6.0")
' C. Set DocumentElement (= root node)'
Set root = doc.createElement("data")
' D. Create Root Node
doc.appendChild root
' ===========================
' ITERATE THROUGH Sheets
' ===========================
For Each ws In ThisWorkbook.Sheets
Set sh = doc.createElement(ws.Name) '
root.appendChild sh
' ===========================
' ITERATE THROUGH ROWS ' A2:NNn
' ===========================
For i = 2 To ws.UsedRange.Rows.Count ' Sheets(1)
' DATA ROW NODE '
Set dataNode = doc.createElement("row") '
sh.appendChild dataNode
' TABLES NODE (orig.: DATES NODE) '
Set datesNode = doc.createElement(ws.Cells(1, 1)) ' Dates
datesNode.Text = ws.Range("A" & i)
dataNode.appendChild datesNode
' NAMES NODE '
For j = 1 To ws.UsedRange.Columns.Count - 1 ' = 12
tit = ws.Cells(1, j + 1)
tmpValue = ws.Cells(i, j + 1)
Set namesnode = doc.createElement(tit)
namesnode.Text = tmpValue
dataNode.appendChild namesnode
Next j
Next i
Next ws
' =============================
' PRETTY PRINT RAW OUTPUT (XSL)
' =============================
xslDoc.LoadXML "<?xml version=" & Chr(34) & "1.0" & Chr(34) & "?>" _
& "<xsl:stylesheet version=" & Chr(34) & "1.0" & Chr(34) _
& " xmlns:xsl=" & Chr(34) & "http://www.w3.org/1999/XSL/Transform" & Chr(34) & ">" _
& "<xsl:strip-space elements=" & Chr(34) & "*" & Chr(34) & " />" _
& "<xsl:output method=" & Chr(34) & "xml" & Chr(34) & " indent=" & Chr(34) & "yes" & Chr(34) & "" _
& " encoding=" & Chr(34) & "UTF-8" & Chr(34) & "/>" _
& " <xsl:template match=" & Chr(34) & "node() | @*" & Chr(34) & ">" _
& " <xsl:copy>" _
& " <xsl:apply-templates select=" & Chr(34) & "node() | @*" & Chr(34) & " />" _
& " </xsl:copy>" _
& " </xsl:template>" _
& "</xsl:stylesheet>"
' XSLT (Transformation)
xslDoc.async = False
doc.transformNodeToObject xslDoc, newDoc
' =================
' Save the XML File
' =================
newDoc.Save ThisWorkbook.Path & "\Output.xml"
MsgBox "Successfully exported Excel data to " & ThisWorkbook.Path & "\Output.XML!", vbInformation
' Regular End of procedure
Exit Sub
ErrHandle:
MsgBox Err.Number & " - " & Err.Description, vbCritical
Exit Sub
End Sub
Note
Sheet names have to be without spaces
Added Note (important hint):
XML nodes use the titles in the first row of every sheet. As the modified procedure gets the title names via UsedRange, it's important not to have any empty cells in row 1 for this example.
Additional remark
I don't know the reason why my prompt answer (marked as "a") was downgraded by someone. I would find it helpful to argue this :-)
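Since the question also asks whether a Dictionary would work, here is a rough sketch of that route for comparison. Everything about the layout is an assumption made for illustration: headers in row 1, policy number in column A, credit in column B, debit in column C, both amounts stored as positive numbers, and column D free for the result. The procedure and function names are hypothetical.

```vba
' Sketch: index every debit by "policy|amount", then look up each credit.
Public Sub MatchCreditsToDebits()
    Dim debits As Object
    Dim ws As Worksheet
    Dim data As Variant
    Dim r As Long
    Dim key As String

    Set debits = CreateObject("Scripting.Dictionary")

    ' Pass 1: remember where each (policy, debit amount) pair is located
    For Each ws In ThisWorkbook.Worksheets
        data = ws.Range("A1").CurrentRegion.Value
        If IsUsable(data) Then
            For r = 2 To UBound(data, 1)
                If Not IsEmpty(data(r, 3)) Then
                    key = data(r, 1) & "|" & data(r, 3)
                    If Not debits.Exists(key) Then
                        debits.Add key, ws.Name & "!" & ws.Cells(r, 3).Address(False, False)
                    End If
                End If
            Next r
        End If
    Next ws

    ' Pass 2: for each credit, write the location of the matching debit to column D
    For Each ws In ThisWorkbook.Worksheets
        data = ws.Range("A1").CurrentRegion.Value
        If IsUsable(data) Then
            For r = 2 To UBound(data, 1)
                If Not IsEmpty(data(r, 2)) Then
                    key = data(r, 1) & "|" & data(r, 2)
                    If debits.Exists(key) Then ws.Cells(r, 4).Value = debits(key)
                End If
            Next r
        End If
    Next ws
End Sub

' True when the sheet data is a 2-D array with at least three columns
Private Function IsUsable(ByVal data As Variant) As Boolean
    If IsArray(data) Then IsUsable = (UBound(data, 2) >= 3)
End Function
```

Each sheet is read into an array once, so the cost grows with the total number of rows rather than with rows times sheets. Only the first debit is kept when a policy has several identical amounts; handling duplicates would need a list per key.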

ALM Defects list failing with 'User defined type not defined'

I wanted to extract list of defects using a filter criteria. I tried the VBA code from OTA here, but compile fails on the following declarations with User defined type not defined:
Dim BugFact As BugFactory
Dim BugFilter As TDFilter
Dim bugList As List
Dim theBug As Bug
Note: I do not have administrative privileges on ALM.
The full VBA code:
Sub BugFilter()
Dim BugFact As BugFactory
Dim BugFilter As TDFilter
Dim bugList As List
Dim theBug As Bug
Dim i%, msg$
' Get the bug factory filter.
'tdc is the global TDConnection object.
Set BugFact = tdc.BugFactory
Set BugFilter = BugFact.Filter
' Set the filter values.
BugFilter.Filter("BG_STATUS") = "Closed"
BugFilter.order("BG_PRIORITY") = 1
MsgBox BugFilter.Text
'Create a list of defects from the filter
' and show a few of them.
Set bugList = BugFilter.NewList
msg = "Number of defects = " & bugList.Count & Chr(13)
For Each theBug In bugList
msg = msg & theBug.ID & ", " & theBug.Summary & ", " _
& theBug.Status & ", " & theBug.Priority & Chr(13)
i = i + 1
If i > 10 Then Exit For
Next
MsgBox msg
End Sub
You need to add a reference to the OTA COM Type library (see here); otherwise your program will not know about the OTA types such as BugFactory and TDFilter.
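If adding the reference is not possible, late binding is a common workaround for this compile error. The sketch below reuses only the members that appear in the question's code and declares everything As Object. It assumes the OTA client is still installed and that tdc is an already connected TDConnection object passed in by the caller. ListClosedDefectsLateBound is a hypothetical name.

```vba
' Sketch: same logic as the question's code, but late bound so it compiles
' without a reference to the OTA COM type library.
Sub ListClosedDefectsLateBound(ByVal tdc As Object)
    Dim bugFact As Object
    Dim bugFlt As Object
    Dim bugList As Object
    Dim theBug As Object
    Dim i As Long
    Dim msg As String

    Set bugFact = tdc.BugFactory
    Set bugFlt = bugFact.Filter

    ' Set the filter values
    bugFlt.Filter("BG_STATUS") = "Closed"
    bugFlt.Order("BG_PRIORITY") = 1

    ' Create a list of defects from the filter and show a few of them
    Set bugList = bugFlt.NewList
    msg = "Number of defects = " & bugList.Count & vbCr
    For Each theBug In bugList
        msg = msg & theBug.ID & ", " & theBug.Summary & ", " _
            & theBug.Status & ", " & theBug.Priority & vbCr
        i = i + 1
        If i > 10 Then Exit For
    Next theBug
    MsgBox msg
End Sub
```

The trade-off is that the editor can no longer offer IntelliSense or catch misspelled member names at compile time.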

VBA to import a sub-portion of a huge CSV file into Excel 2010

I have a CSV file that has approximately 600 fields and 100k rows. I would like to import only selected fields, and only those rows where a set of fields matches certain criteria, into an existing Excel worksheet tab.
I attempted to use MS Query within Excel, but it stops at 255 columns. I can import the whole file in Excel 2010 (250m), but it is a memory hog, and by the time I remove the unneeded fields and rows it locks up my computer.
I would like to kick the import process off with an Excel VBA macro. I have all the front-end code for file selection, etc., but need some assistance with the part of the VBA that reads the text, queries it, and converts the result to Excel.
Any assistance would be greatly appreciated.
Thanks
Tom
For that many records you would be better off importing the .csv into Microsoft Access, indexing some fields, writing a query that contains only what you want, and then exporting to Excel from the query.
If you really need an Excel-only solution, do the following:
Open up the VBA editor. Navigate to Tools -> References. Select the most recent ActiveX Data Objects Library. (ADO for short). On my XP machine running Excel 2003, it's version 2.8.
Create a module if you don't have one already. Or create one anyway to contain the code at the bottom of this post.
In any blank worksheet paste the following values starting at cell A1:
A1: SELECT      B1: Field1, Field2
A2: FROM        B2: C:\Path\To\file.csv
A3: WHERE       B3: Field1 = 'foo'
A4: ORDER BY    B4: Field2
(The labels in column A are for reference only. The values in column B are the important bits, since they are what the code reads.)
Amend the input fields as appropriate for your filename and query requirements, then run the getCsv() subroutine. It will put the results in a QueryTable object starting at cell C6.
I personally hate QueryTables but the .CopyFromRecordset method I prefer to use with ADO doesn't give you field names. I left the code for that method in, commented out, so you can investigate that way. If you use it, you can get rid of the call to deleteQueryTables() because it's a really ugly hack, it deletes whole columns which you may not like, etc.
Happy coding.
Option Explicit
Function ExtractFileName(filespec) As String
' Returns a filename from a filespec
Dim x As Variant
x = Split(filespec, Application.PathSeparator)
ExtractFileName = x(UBound(x))
End Function
Function ExtractPathName(filespec) As String
' Returns the path from a filespec
Dim x As Variant
x = Split(filespec, Application.PathSeparator)
ReDim Preserve x(0 To UBound(x) - 1)
ExtractPathName = Join(x, Application.PathSeparator) & Application.PathSeparator
End Function
Sub getCsv()
Dim cnCsv As New ADODB.Connection
Dim rsCsv As New ADODB.Recordset
Dim strFileName As String
Dim strSelect As String
Dim strWhere As String
Dim strOrderBy As String
Dim strSql As String
Dim qtData As QueryTable
strSelect = ActiveSheet.Range("B1").Value
strFileName = ActiveSheet.Range("B2").Value
strWhere = ActiveSheet.Range("B3").Value
strOrderBy = ActiveSheet.Range("B4").Value
strSql = "SELECT " & strSelect
strSql = strSql & vbCrLf & "FROM " & ExtractFileName(strFileName)
If strWhere <> "" Then strSql = strSql & vbCrLf & "WHERE " & strWhere
If strOrderBy <> "" Then strSql = strSql & vbCrLf & "ORDER BY " & strOrderBy
With cnCsv
.Provider = "Microsoft.Jet.OLEDB.4.0"
.ConnectionString = "Data Source=" & ExtractPathName(strFileName) & ";" & _
"Extended Properties=""text;HDR=yes;FMT=Delimited(,)"";Persist Security Info=False"
.Open
End With
rsCsv.Open strSql, cnCsv, adOpenForwardOnly, adLockReadOnly, adCmdText
'ActiveSheet.Range("C6").CopyFromRecordset rsCsv
Call deleteQueryTables
Set qtData = ActiveSheet.QueryTables.Add(rsCsv, ActiveSheet.Range("C6"))
qtData.Refresh
rsCsv.Close
Set rsCsv = Nothing
cnCsv.Close
Set cnCsv = Nothing
End Sub
Sub deleteQueryTables()
On Error Resume Next
With Application
.ScreenUpdating = False
.Calculation = xlCalculationManual
End With
Dim qt As QueryTable
Dim qtName As String
Dim nName As Name
For Each qt In ActiveSheet.QueryTables
qtName = qt.Name
qt.Delete
For Each nName In Names
If InStr(1, nName.Name, qtName) > 0 Then
Range(nName.Name).EntireColumn.Delete
nName.Delete
End If
Next nName
Next qt
With Application
.ScreenUpdating = True
.Calculation = xlCalculationAutomatic
End With
End Sub
You can parse your input file, extracting the lines that conform to your criteria. The following code uses the Split function on each line of the CSV file to separate the fields and then checks whether the line matches the required criteria. If all the criteria match, the selected fields are saved in a new CSV file, and you can then just open the smaller file. You will need to set a reference to the Microsoft Scripting Runtime in the VBA editor for this to work.
This method should use little memory, as it processes one line at a time. I tested it on data of 600 fields and 100,000 lines, and it took about 45 seconds to process the file with no noticeable increase in RAM usage in Windows Task Manager. It is CPU intensive, and the time taken will increase with the complexity of the data, the conditions, and the number of fields copied.
If you prefer to write directly to an existing sheet, this can be easily achieved, but you would have to remove any old data there first.
Sub Extract()
Dim fileHandleInput As Scripting.TextStream
Dim fileHandleExtract As Scripting.TextStream
Dim fsoObject As Scripting.FileSystemObject
Dim sPath As String
Dim sFilenameExtract As String
Dim sFilenameInput As String
Dim myVariant As Variant
Dim bParse As Boolean 'To check if the line should be written
sFilenameExtract = "Exctract1.CSV"
sFilenameInput = "Input.CSV"
Set fsoObject = New FileSystemObject
sPath = ThisWorkbook.Path & "\"
'Check if this works ie overwrites existing file
If fsoObject.FileExists(sPath & sFilenameExtract) Then
Set fileHandleExtract = fsoObject.OpenTextFile(sPath & sFilenameExtract, ForWriting)
Else
Set fileHandleExtract = fsoObject.CreateTextFile((sPath & sFilenameExtract), True)
End If
Set fileHandleInput = fsoObject.OpenTextFile(sPath & sFilenameInput, ForReading)
'extracting headers for selected fields in this case the 1st, 2nd and 124th fields
myVariant = Split(fileHandleInput.ReadLine, ",")
fileHandleExtract.WriteLine (myVariant(0) & "," & _
myVariant(1) & "," & _
myVariant(123))
'Parse each line (row) of the inputfile
Do While Not fileHandleInput.AtEndOfStream
myVariant = Split(fileHandleInput.ReadLine, ",")
'Set bParse initially to true
bParse = True
'Check if the first element is greater than 123
If Not myVariant(0) > 123 Then bParse = False
'Check if second element is one of allowed values
'Trim used to remove pesky leading or trailing spaces when checking
Select Case Trim(myVariant(1))
Case "Red", "Yellow", "Green", "Blue", "Black"
'Do nothing as value found
Case Else
bParse = False 'As wasn't a value in the condition
End Select
'If the conditions were met by the line then write specific fields to extract file
If bParse Then
fileHandleExtract.WriteLine (myVariant(0) & "," & _
myVariant(1) & "," & _
myVariant(123))
End If
Loop
'close files and cleanup
fileHandleExtract.Close
fileHandleInput.Close
Set fileHandleExtract = Nothing
Set fileHandleInput = Nothing
Set fsoObject = Nothing
End Sub
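For the variant that writes straight to a worksheet, the field selection can be factored into a helper. This is a sketch under the same assumption as the code above, namely that no field contains an embedded comma. PickFields is a hypothetical name.

```vba
' Hypothetical helper: returns the chosen (zero-based) fields of one CSV line
' as a zero-based one-dimensional array.
Public Function PickFields(ByVal TextLine As String, ByVal FieldIndexes As Variant) As Variant
    Dim parts() As String
    Dim result() As Variant
    Dim i As Long

    parts = Split(TextLine, ",")
    ReDim result(0 To UBound(FieldIndexes) - LBound(FieldIndexes))

    For i = LBound(FieldIndexes) To UBound(FieldIndexes)
        result(i - LBound(FieldIndexes)) = parts(FieldIndexes(i))
    Next i

    PickFields = result
End Function
```

Inside the Do While loop, a matching line could then be written with something like ws.Cells(nextRow, 1).Resize(1, 3).Value = PickFields(textLine, Array(0, 1, 123)), where ws, nextRow and textLine are variables you would have to declare and maintain yourself. Writing one row at a time is slow for large extracts, so collecting rows in a 2-D array and writing it once is usually preferable.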

Generate Word Documents (in Excel VBA) from a series of Document Templates

Hey all. I'll try to make this brief and simple. :)
I have
40 or so boilerplate word documents with a series of fields (Name, address, etc) that need to be filled in. This is historically done manually, but it's repetitive and cumbersome.
A workbook where a user has filled a huge set of information about an individual.
I need
A way to programmatically (from Excel VBA) open up these boilerplate documents, edit in the value of fields from various named ranges in the workbook, and save the filled in templates to a local folder.
If I were using VBA to programmatically edit particular values in a set of spreadsheets, I would edit all those spreadsheets to contain a set of named ranges which could be used during the auto-fill process, but I'm not aware of any 'named field' feature in a Word document.
How could I edit the documents, and create a VBA routine, so that I can open each document, look for a set of fields which might need to be filled in, and substitute a value?
For instance, something that works like:
for each document in set_of_templates
if document.FieldExists("Name") then document.Field("Name").value = strName
if document.FieldExists("Address") then document.Field("Address").value = strAddress
...
document.saveAs( thisWorkbook.Path & "\GeneratedDocs\" & document.Name )
next document
Things I've considered:
Mail merge - but this is insufficient because it requires opening each document manually and structuring the workbook as a data source, I kind of want the opposite. The templates are the data source and the workbook is iterating through them. Also, mail merge is for creating many identical documents using a table of different data. I have many documents all using the same data.
Using placeholder text such as "#NAME#" and opening each document for a search and replace. This is the solution I would resort to if nothing more elegant is proposed.
It's been a long time since I asked this question, and my solution has undergone more and more refinement. I've had to deal with all sorts of special cases, such as values that come directly from the workbook, sections that need to be specially generated based on lists, and the need to do replacements in headers and footers.
As it turns out, it did not suffice to use bookmarks, as it was possible for users to later edit documents to change, add, and remove placeholder values from the documents. The solution was in fact to use keywords such as this:
This is just a page from a sample document which uses some of the possible values that can get automatically inserted into a document. Over 50 documents exist with completely different structures and layouts, and using different parameters. The only common knowledge shared by the word documents and the excel spreadsheet is a knowledge of what these placeholder values are meant to represent. In excel, this is stored in a list of document generation keywords, which contain the keyword, followed by a reference to the range that actually contains this value:
These were the key two ingredients required. Now with some clever code, all I had to do was iterate over each document to be generated, and then iterate over the range of all known keywords, and do a search and replace for each keyword in each document.
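Before the actual routines, the core idea of "iterate over the keywords and replace each one" can be sketched as follows. This is not the code used in this answer, only a minimal late-bound illustration. ReplaceKeywords is a hypothetical name, and Keywords is assumed to be a Scripting.Dictionary mapping each placeholder to its replacement text.

```vba
' Sketch: replaces every keyword in the main body of an open Word document.
' oDoc is a Word.Document and Keywords a Scripting.Dictionary (both late bound).
Public Sub ReplaceKeywords(ByVal oDoc As Object, ByVal Keywords As Object)
    Const wdFindContinue As Long = 1
    Const wdReplaceAll As Long = 2
    Dim k As Variant

    For Each k In Keywords.Keys
        With oDoc.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Execute FindText:=CStr(k), ReplaceWith:=CStr(Keywords(k)), _
                     MatchCase:=True, Wrap:=wdFindContinue, Replace:=wdReplaceAll
        End With
    Next k
End Sub
```

Two limitations to keep in mind: oDoc.Content covers only the main text, so headers and footers need a separate pass over the document's other story ranges, and ReplaceWith accepts only short strings (about 255 characters), so longer values need a different insertion technique.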
First, I have the wrapper method, which takes care of maintaining an instance of Microsoft Word, iterating over all documents selected for generation, numbering the documents, and doing the user interface stuff (like handling errors, displaying the folder to the user, etc.).
' Purpose: Iterates over and generates all documents in the list of forms to generate
' Improves speed by creating a persistent Word application used for all generated documents
Public Sub GeneratePolicy()
Dim oWrd As New Word.Application
Dim srcPath As String
Dim cel As Range
If ERROR_HANDLING Then On Error GoTo errmsg
If Forms.Cells(2, FormsToGenerateCol) = vbNullString Then _
Err.Raise 1, , "There are no forms selected for document generation."
'Get the path of the document repository where the forms will be found.
srcPath = FindConstant("Document Repository")
'Each form generated will be numbered sequentially by calling a static counter function. This resets it.
GetNextEndorsementNumber reset:=True
'Iterate over each form, calling a function to replace the keywords and save a copy to the output folder
For Each cel In Forms.Range(Forms.Cells(2, FormsToGenerateCol), Forms.Cells(1, FormsToGenerateCol).End(xlDown))
RunReplacements cel.value, CreateDocGenPath(cel.Offset(0, 1).value), oWrd
Next cel
On Error Resume Next 'Cleanup is best-effort; Word is closed once, by oWrd.Quit False below
'Display the folder containing the generated documents
Call Shell("explorer.exe " & CreateDocGenPath, vbNormalFocus)
oWrd.Quit False
Application.StatusBar = False
If MsgBox("Policy generation complete. The reserving information will now be recorded.", vbOKCancel, _
"Policy Generated. OK to store reserving info?") = vbOK Then Push_Reserving_Requirements
Exit Sub
errmsg:
MsgBox Err.Description, , "Error generating Policy Documents"
End Sub
That routine calls RunReplacements which takes care of opening the document, prepping the environment for a fast replacement, updating links once done, handling errors, etc:
' Purpose: Opens up a document and replaces all instances of special keywords with their respective values.
' Creates an instance of Word if an existing one is not passed as a parameter.
' Saves a document to the target path once the template has been filled in.
'
' Replacements are done using two helper functions, one for doing simple keyword replacements,
' and one for the more complex replacements like conditional statements and schedules.
Private Sub RunReplacements(ByVal DocumentPath As String, ByVal SaveAsPath As String, _
Optional ByRef oWrd As Word.Application = Nothing)
Dim oDoc As Word.Document
Dim oWrdGiven As Boolean
If oWrd Is Nothing Then Set oWrd = New Word.Application Else oWrdGiven = True
If ERROR_HANDLING Then On Error GoTo docGenError
oWrd.Visible = False
oWrd.DisplayAlerts = wdAlertsNone
Application.StatusBar = "Opening " & Mid(DocumentPath, InStrRev(DocumentPath, "\") + 1)
Set oDoc = oWrd.Documents.Open(Filename:=DocumentPath, Visible:=False)
RunAdvancedReplacements oDoc
RunSimpleReplacements oDoc
UpdateLinks oDoc 'Routine which will update calculated statements in Word (like current date)
Application.StatusBar = "Saving " & Mid(DocumentPath, InStrRev(DocumentPath, "\") + 1)
oDoc.SaveAs SaveAsPath
GoTo Finally
docGenError:
MsgBox "An unknown error occurred while generating document: " & DocumentPath & vbNewLine _
& vbNewLine & Err.Description, vbCritical, "Document Generation"
Finally:
If Not oDoc Is Nothing Then oDoc.Close False: Set oDoc = Nothing
If Not oWrdGiven Then oWrd.Quit False
End Sub
That routine then invokes RunSimpleReplacements and RunAdvancedReplacements. In the former, we iterate over the set of Document Generation Keywords and call WordDocReplace if the document contains our keyword. Note that it's much faster to Find a bunch of words and discover that they don't exist than to call replace indiscriminately, so we always check whether a keyword exists before attempting to replace it.
' Purpose: While short, this routine does most of the work with the help of the generation keywords
' range on the lists sheet. It loops through every simple keyword that might appear in a document
' and calls a function to have it replaced with the corresponding data from pricing.
Private Sub RunSimpleReplacements(ByRef oDoc As Word.Document)
Dim DocGenKeys As Range, valueSrc As Range
Dim value As String
Dim i As Integer
Set DocGenKeys = Lists.Range("DocumentGenerationKeywords")
For i = 1 To DocGenKeys.Rows.Count
If WordDocContains(oDoc, "#" & DocGenKeys.Cells(i, 1).Text & "#") Then
'Find the text that we will be replacing the placeholder keyword with
Set valueSrc = Range(Mid(DocGenKeys.Cells(i, 2).Formula, 2))
If valueSrc.MergeCells Then value = valueSrc.MergeArea.Cells(1, 1).Text Else value = valueSrc.Text
'Perform the replacement
WordDocReplace oDoc, "#" & DocGenKeys.Cells(i, 1).Text & "#", value
End If
Next i
End Sub
This is the function used to detect whether a keyword exists in the document:
' Purpose: Function called for each replacement to first determine as quickly as possible whether
' the document contains the keyword, and thus whether replacement actions must be taken.
Public Function WordDocContains(ByRef oDoc As Word.Document, ByVal searchFor As String) As Boolean
Application.StatusBar = "Checking for keyword: " & searchFor
WordDocContains = False
Dim storyRange As Word.Range
For Each storyRange In oDoc.StoryRanges
With storyRange.Find
.Text = searchFor
WordDocContains = WordDocContains Or .Execute
End With
If WordDocContains Then Exit For
Next
End Function
And this is where the rubber meets the road - the code that executes the replacement. This routine got more complicated as I encountered difficulties. Here are the lessons you will only learn from experience:
You can set the replacement text directly, or you can use the clipboard. I found out the hard way that if you do a VBA replace in Word using a string longer than 255 characters, the text gets truncated if you place it in Find.Replacement.Text. You can, however, use "^c" as your replacement text, and Word will take the replacement directly from the clipboard. This is the workaround I ended up using.
Simply calling replace will miss keywords in some text areas like headers and footers. Because of this, you need to iterate over the document's StoryRanges and run the search and replace on each one to ensure that you catch every instance of the word you want to replace.
If you're setting Replacement.Text directly, you need to replace Excel line breaks (vbNewLine and Chr(10)) with a simple vbCr for them to appear properly in Word. Otherwise, any line breaks in replacement text that comes from an Excel cell will show up as strange symbols in Word. If you use the clipboard method, however, you do not need to do this, as the line breaks are converted automatically when the text is put on the clipboard.
That explains everything. Comments should be pretty clear too. Here's the golden routine that executes the magic:
' Purpose: This function actually performs replacements using the Microsoft Word API
Public Sub WordDocReplace(ByRef oDoc As Word.Document, ByVal replaceMe As String, ByVal replaceWith As String)
Dim clipBoard As New MSForms.DataObject
Dim storyRange As Word.Range
Dim tooLong As Boolean
Application.StatusBar = "Replacing instances of keyword: " & replaceMe
'We want to use regular search and replace if we can. It's faster and preserves the formatting that
'the keyword being replaced held (like bold). If the string is longer than 255 chars though, the
'standard replace method doesn't work, and so we must use the clipboard method (^c special character),
'which does not preserve formatting. This is alright for schedules though, which are always plain text.
If Len(replaceWith) > 255 Then tooLong = True
If tooLong Then
clipBoard.SetText IIf(replaceWith = vbNullString, "", replaceWith)
clipBoard.PutInClipboard
Else
'Convert excel in-cell line breaks to word line breaks. (Not necessary if using clipboard)
replaceWith = Replace(replaceWith, vbNewLine, vbCr)
replaceWith = Replace(replaceWith, Chr(10), vbCr)
End If
'Replacement must be done on multiple 'StoryRanges'. Unfortunately, simply calling replace will miss
'keywords in some text areas like headers and footers.
For Each storyRange In oDoc.StoryRanges
Do
With storyRange.Find
.MatchWildcards = True
.Text = replaceMe
.Replacement.Text = IIf(tooLong, "^c", replaceWith)
.Wrap = wdFindContinue
.Execute Replace:=wdReplaceAll
End With
On Error Resume Next
Set storyRange = storyRange.NextStoryRange
On Error GoTo 0
Loop While Not storyRange Is Nothing
Next
If tooLong Then clipBoard.SetText ""
If tooLong Then clipBoard.PutInClipboard
End Sub
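If taking over the user's clipboard is undesirable, one possible alternative is to loop over each match and assign the found range's Text property directly, which is not subject to the 255-character limit of Find.Replacement.Text. This is not the routine used above, only a sketch of a different technique. The names ReplaceLongText and ReplaceInRange are hypothetical, and the sketch assumes the same reference to the Microsoft Word object library as the code above.

```vba
' Hypothetical alternative to the clipboard approach: replace each match by
' assigning Range.Text, which accepts strings longer than 255 characters.
Public Sub ReplaceLongText(ByRef oDoc As Word.Document, ByVal replaceMe As String, _
                           ByVal replaceWith As String)
    Dim storyRange As Word.Range
    Dim story As Word.Range

    'Word paragraphs end in vbCr, so normalise Excel-style line breaks first
    replaceWith = Replace(replaceWith, vbNewLine, vbCr)
    replaceWith = Replace(replaceWith, Chr(10), vbCr)

    'Visit every story (body, headers, footers, ...) and any stories linked to it
    For Each storyRange In oDoc.StoryRanges
        Set story = storyRange
        Do While Not story Is Nothing
            ReplaceInRange story, replaceMe, replaceWith
            Set story = story.NextStoryRange
        Loop
    Next storyRange
End Sub

' Replaces every occurrence of replaceMe within a single story.
Private Sub ReplaceInRange(ByVal story As Word.Range, ByVal replaceMe As String, _
                           ByVal replaceWith As String)
    Dim hit As Word.Range
    Set hit = story.Duplicate   'Work on a copy so the caller's range is untouched
    With hit.Find
        .ClearFormatting
        .Text = replaceMe
        .Forward = True
        .Wrap = wdFindStop      'Stop at the end of the story instead of wrapping
        .MatchWildcards = False
        Do While .Execute
            hit.Text = replaceWith        'hit now spans the match just found
            hit.Collapse wdCollapseEnd    'Continue searching after the inserted text
        Loop
    End With
End Sub
```

It would be called the same way as WordDocReplace, for example `ReplaceLongText oDoc, "#VESSELSCHEDULE#", GenerateVesselSchedule()`. Looping over matches is likely slower than a single wdReplaceAll, so it probably only makes sense for the long-string case.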
When the dust settles, we're left with a beautiful version of the initial document with production values in place of those hash-marked keywords. I'd love to show an example, but of course every filled-in document contains proprietary information.
The only thing left to mention, I guess, is the RunAdvancedReplacements section. It does something extremely similar and ends up calling the same WordDocReplace function, but what's special about the keywords used here is that they don't link to a single cell in the original workbook; they are generated in the code-behind from lists in the workbook. So for instance, one of the advanced replacements would look like this:
'Generate the schedule of vessels
If WordDocContains(oDoc, "#VESSELSCHEDULE#") Then _
WordDocReplace oDoc, "#VESSELSCHEDULE#", GenerateVesselSchedule()
And then there will be a corresponding routine which puts together a string containing all the vessel information as configured by the user:
' Purpose: Generates the list of vessels from the "Vessels" sheet based on the user's configuration
' in the booking tab. The user has the option to generate one or both of Owned Vessels
' and Chartered Vessels, as well as what fields to display. Uses a helper function.
Public Function GenerateVesselSchedule() As String
Dim value As String
Application.StatusBar = "Generating Schedule of Vessels."
If Booking.Range("ListVessels").value = "Yes" Then
Dim VesselCount As Long
If Booking.Range("ListVessels").Offset(1).value = "Yes" Then _
value = value & GenerateVesselScheduleHelper("Vessels", VesselCount)
If Booking.Range("ListVessels").Offset(1).value = "Yes" And _
Booking.Range("ListVessels").Offset(2).value = "Yes" Then _
value = value & "(Chartered Vessels)" & vbNewLine
If Booking.Range("ListVessels").Offset(2).value = "Yes" Then _
value = value & GenerateVesselScheduleHelper("CharteredVessels", VesselCount)
If Len(value) > 2 Then value = Left(value, Len(value) - 2) 'Remove the trailing line break
Else
value = Booking.Range("VesselSchedAlternateText").Text 'Store in value; the function result is assigned from value below
End If
GenerateVesselSchedule = value
End Function
' Purpose: Helper function for the Vessel Schedule generation routine. Generates either the Owned or
' Chartered vessels based on the schedule parameter passed. The list is numbered and contains
' the information selected by the user on the Booking sheet.
' SENSITIVE: Note that this routine is sensitive to the layout of the Vessel Schedule tab and the
' parameters on the Configure Quotes tab. If either changes, it should be revisited.
Public Function GenerateVesselScheduleHelper(ByVal schedule As String, ByRef VesselCount As Long) As String
Dim value As String, nextline As String
Dim numInfo As Long, iRow As Long, iCol As Long
Dim Inclusions() As Boolean, Columns() As Long
'Gather info about vessel info to display in the schedule
With Booking.Range("VesselInfoToInclude")
numInfo = Booking.Range(.Cells(1, 1), .End(xlToRight)).Columns.Count - 1
ReDim Inclusions(1 To numInfo)
ReDim Columns(1 To numInfo)
On Error Resume Next 'Some columns won't be identified
For iCol = 1 To numInfo
Inclusions(iCol) = .Offset(0, iCol) = "Yes"
Columns(iCol) = sumSchedVessels.Range(schedule).Cells(1).EntireRow.Find(.Offset(-1, iCol)).Column
Next iCol
On Error GoTo 0
End With
'Build the schedule
With sumSchedVessels.Range(schedule)
For iRow = .row + 1 To .row + .Rows.Count - 1
If Len(sumSchedVessels.Cells(iRow, Columns(1)).value) > 0 Then
VesselCount = VesselCount + 1
value = value & VesselCount & "." & vbTab
nextline = vbNullString
'Add each property that was included to the description string
If Inclusions(1) Then nextline = nextline & sumSchedVessels.Cells(iRow, Columns(1)) & vbTab
If Inclusions(2) Then nextline = nextline & "Built: " & sumSchedVessels.Cells(iRow, Columns(2)) & vbTab
If Inclusions(3) Then nextline = nextline & "Length: " & _
Format(sumSchedVessels.Cells(iRow, Columns(3)), "#'") & vbTab
If Inclusions(4) Then nextline = nextline & "" & sumSchedVessels.Cells(iRow, Columns(4)) & vbTab
If Inclusions(5) Then nextline = nextline & "Hull Value: " & _
Format(sumSchedVessels.Cells(iRow, Columns(5)), "$#,##0") & vbTab
If Inclusions(6) Then nextline = nextline & "IV: " & _
Format(sumSchedVessels.Cells(iRow, Columns(6)), "$#,##0") & vbTab
If Inclusions(7) Then nextline = nextline & "TIV: " & _
Format(sumSchedVessels.Cells(iRow, Columns(7)), "$#,##0") & vbTab
If Inclusions(8) And schedule = "CharteredVessels" Then _
nextline = nextline & "Deductible: " & Format(bmCharterers.Range(schedule).Cells( _
iRow - .row, 9), "$#,##0") & vbTab
If Len(nextline) > 0 Then nextline = Left(nextline, Len(nextline) - 1) 'Remove the trailing tab
'If more than 4 properties were included insert a new line after the 4th one
Dim tabloc As Long: tabloc = 0
Dim counter As Long: counter = 0
Do
tabloc = tabloc + 1
tabloc = InStr(tabloc, nextline, vbTab)
If tabloc > 0 Then counter = counter + 1
Loop While tabloc > 0 And counter < 4
If counter = 4 Then nextline = Left(nextline, tabloc - 1) & vbNewLine & Mid(nextline, tabloc)
value = value & nextline & vbNewLine
End If
Next iRow
End With
GenerateVesselScheduleHelper = value
End Function
The resulting string can be used just like the contents of any Excel cell and passed to the replacement function, which will use the clipboard method if it exceeds 255 characters.
So this template:
Plus this spreadsheet data:
Becomes this document:
I sincerely hope that this helps someone out some day. It was definitely a huge undertaking and a complex wheel to have to re-invent. The application is huge, with over 50,000 lines of VBA code, so if I've referenced a crucial method in my code somewhere that someone needs, please leave a comment and I'll add it in here.
http://www.computorcompanion.com/LPMArticle.asp?ID=224 describes the use of Word bookmarks.
A section of text in a document can be bookmarked, and given a variable name. Using VBA, this variable can be accessed and the content in the document can be replaced with alternate content. This is a solution to having placeholders such as Name and Address in the document.
Furthermore, using bookmarks, documents can be modified to reference bookmarked text. If a name appears several times throughout a document, the first instance can be bookmarked, and additional instances can reference the bookmark. Now when the first instance is programmatically changed, all other instances of the variable throughout the document are also automatically changed.
Now all that's needed is to update all the documents by bookmarking the placeholder text and using a consistent naming convention throughout the documents, then iterate through each document, replacing the bookmark if it exists:
document.Bookmarks("myBookmark").Range.Text = "Inserted Text"
I can probably handle variables that don't appear in a given document by using On Error Resume Next before attempting each replacement.
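Rather than suppressing errors, another option is to test for the bookmark first with Bookmarks.Exists. The following is a minimal sketch, not tested against the real documents; the helper name SetBookmarkText is hypothetical. Note that assigning Range.Text removes the bookmark itself, so the sketch re-adds it over the new text in case it is needed again.

```vba
' Hypothetical helper: fills a bookmark only if the document defines it.
' Returns True when the bookmark was found and its text replaced.
Public Function SetBookmarkText(ByRef oDoc As Word.Document, ByVal bookmarkName As String, _
                                ByVal newText As String) As Boolean
    Dim rng As Word.Range

    'Skip documents that do not use this placeholder
    If Not oDoc.Bookmarks.Exists(bookmarkName) Then Exit Function

    Set rng = oDoc.Bookmarks(bookmarkName).Range
    rng.Text = newText

    'Assigning Range.Text deletes the bookmark, so recreate it over the new text
    oDoc.Bookmarks.Add bookmarkName, rng
    SetBookmarkText = True
End Function
```

A call such as `SetBookmarkText oDoc, "myBookmark", "Inserted Text"` then does nothing, and returns False, for documents that lack the bookmark.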
Thanks to Doug Glancy for mentioning the existence of bookmarks in his comment. I had no knowledge of them beforehand. I will keep this topic posted on whether this solution suffices.
You might consider an XML based approach.
Word has a feature called Custom XML data-binding, or data-bound content controls. A content control is essentially a point in the document which can contain content. A "data-bound" content control gets its content from an XML document you include in the docx zip file. An XPath expression is used to say which bit of XML. So all you need to do is include your XML file, and Word will do the rest.
Excel has ways to get data out of it as XML, so the whole solution should work nicely.
There is plenty of information on content control data-binding on MSDN (some of which has been referenced in earlier SO questions) so I won't bother including them here.
But you do need a way of setting up the bindings. You can either use the Content Control Toolkit, or if you want to do it from within Word, my OpenDoPE add-in.
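For a small number of bindings, they can also be created from VBA. The following is only a rough sketch: the function name AddBoundControl, the XML structure, and the XPath are made up for illustration, and it assumes references to both the Microsoft Word and Microsoft Office object libraries.

```vba
' Sketch: bind a plain-text content control to a node in a custom XML part.
' The XML content and the XPath below are invented for illustration only.
Public Function AddBoundControl(ByRef oDoc As Word.Document) As Word.ContentControl
    Dim part As Office.CustomXMLPart
    Dim cc As Word.ContentControl

    'Store the data inside the document as a custom XML part
    Set part = oDoc.CustomXMLParts.Add("<policy><insured>Example Insured</insured></policy>")

    'Insert a plain-text content control at the very start of the document
    Set cc = oDoc.ContentControls.Add(wdContentControlText, oDoc.Range(0, 0))

    'Point the control at the <insured> node; Word then displays that node's text
    cc.XMLMapping.SetMapping "/policy/insured", , part

    Set AddBoundControl = cc
End Function
```

Once a control is mapped, updating the node in the XML part should update the text shown in the document, which is what makes this approach attractive for templates.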
Having done a similar task I found that inserting values into tables was much quicker than searching for named tags - the data can then be inserted like this:
With oDoc.Tables(5)
For i = 0 To Data.InvoiceDictionary.Count - 1
If i > 0 Then
oDoc.Tables(5).rows.Add
End If
Set invoice = Data.InvoiceDictionary.Items(i)
.Cell(i + 2, 1).Range.Text = invoice.InvoiceCCNumber
.Cell(i + 2, 2).Range.Text = invoice.InvoiceDate
.Cell(i + 2, 3).Range.Text = invoice.TransactionType
.Cell(i + 2, 4).Range.Text = invoice.Description
.Cell(i + 2, 5).Range.Text = invoice.SumOfValue
Next i
.Cell(i + 1, 4).Range.Text = "Total:"
End With
In this case row 1 of the table held the headers, row 2 was empty, and there were no further rows, so Rows.Add is only called once there is more than one row of data to insert. The tables can be very detailed, and by hiding the table and cell borders they can be made to look like ordinary text. Tables are numbered sequentially following the document flow (i.e. Doc.Tables(1) is the first table).
