How to extract headings from Word files to Excel? - excel

I have hundreds of Word files (docx) which each have various headings, defined as Heading 1, Heading 2, Heading 3, etc. Each of these files has a table of contents which correspond to the headings.
I want to extract each heading from each of these files into an Excel workbook to build a database.
My first attempt was to extract the headings from a single Word document into an Excel workbook. I found code online to extract headings from Word to Outlook, and also separate code to extract headings from Word to a new Word file.
I haven't been able to adapt either of these.
How do I extract headings from a single Word file to Excel? I will then try to work out further steps.
Word to Outlook
Sub CopyHeadingsIntoOutlookMail()
Dim objOutlookApp, objMail As Object
Dim objMailDocument As Word.Document
Dim objMailRange As Word.Range
Dim varHeadings As Variant
Dim i As Long
Dim strText As String
Dim nLongDiff As Integer
'Create a new Outlook email
Set objOutlookApp = CreateObject("Outlook.Application")
Set objMail = objOutlookApp.CreateItem(olMailItem)
objMail.Display
Set objMailDocument = objMail.GetInspector.WordEditor
Set objMailRange = objMailDocument.Range(0, 0)
'Get the headings of the current Word document
varHeadings = ActiveDocument.GetCrossReferenceItems(wdRefTypeHeading)
For i = LBound(varHeadings) To UBound(varHeadings)
strText = Trim(varHeadings(i))
'Get the heading level
nLongDiff = Len(RTrim$(CStr(varHeadings(i)))) - Len(Trim(CStr(varHeadings(i))))
nHeadingLevel = (nLongDiff / 2) + 1
'Insert the heading into the Outlook mail
With objMailRange
.InsertAfter strText & vbNewLine
.Style = "Heading " & nHeadingLevel
.Collapse wdCollapseEnd
End With
Next i
End Sub
Word to Word
Public Sub CreateOutline()
Dim docOutline As Word.Document
Dim docSource As Word.Document
Dim rng As Word.Range
Dim astrHeadings As Variant
Dim strText As String
Dim intLevel As Integer
Dim intItem As Integer
Set docSource = ActiveDocument
Set docOutline = Documents.Add
' Content returns only the
' main body of the document, not
' the headers and footer.
Set rng = docOutline.Content
astrHeadings = _
docSource.GetCrossReferenceItems(wdRefTypeHeading)
For intItem = LBound(astrHeadings) To UBound(astrHeadings)
' Get the text and the level.
strText = Trim$(astrHeadings(intItem))
intLevel = GetLevel(CStr(astrHeadings(intItem)))
' Add the text to the document.
rng.InsertAfter strText & vbNewLine
' Set the style of the selected range and
' then collapse the range for the next entry.
rng.Style = "Heading " & intLevel
rng.Collapse wdCollapseEnd
Next intItem
End Sub
Private Function GetLevel(strItem As String) As Integer
' Return the heading level of a header from the
' array returned by Word.
' The number of leading spaces indicates the
' outline level (2 spaces per level: H1 has
' 0 spaces, H2 has 2 spaces, H3 has 4 spaces.
Dim strTemp As String
Dim strOriginal As String
Dim intDiff As Integer
' Get rid of all trailing spaces.
strOriginal = RTrim$(strItem)
' Trim leading spaces, and then compare with
' the original.
strTemp = LTrim$(strOriginal)
' Subtract to find the number of
' leading spaces in the original string.
intDiff = Len(strOriginal) - Len(strTemp)
GetLevel = (intDiff / 2) + 1
End Function

Try the following Excel macro. When you run it, simply select the folder to process.
Sub GetTOCHeadings()
'Note: this code requires a reference to the Word object model.
'See under the VBE's Tools|References.
Application.ScreenUpdating = False
Dim wdDoc As Word.Document, wdRng As Word.Range, wdPara As Word.Paragraph
Dim strFolder As String, strFile As String
Dim WkSht As Worksheet, i As Long, j As Long
strFolder = GetFolder
If strFolder = "" Then Exit Sub
Dim wdApp As New Word.Application
Set WkSht = ActiveSheet
i = WkSht.Cells(WkSht.Rows.Count, 1).End(xlUp).Row
wdApp.WordBasic.DisableAutoMacros
wdApp.DisplayAlerts = wdAlertsNone
strFile = Dir(strFolder & "\*.doc", vbNormal)
While strFile <> ""
i = i + 1
Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
With wdDoc
j = 1: WkSht.Cells(i, j) = strFile
If .TablesOfContents.Count > 0 Then
With .TablesOfContents(1)
.IncludePageNumbers = False
.Update
Set wdRng = .Range
End With
With wdRng
.Fields(1).Unlink
For Each wdPara In .Paragraphs
j = j + 1
WkSht.Cells(i, j).Value = Replace(wdPara.Range.Text, vbTab, " ")
Next
End With
End If
.Close SaveChanges:=False
End With
strFile = Dir()
Wend
wdApp.DisplayAlerts = wdAlertsAll
wdApp.Quit
Set wdDoc = Nothing: Set wdApp = Nothing: Set WkSht = Nothing
Application.ScreenUpdating = True
End Sub
Function GetFolder() As String
Dim oFolder As Object
GetFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function

Related

Open Multiple WORD FILES based on a list, perform tasks , save and close

I'd like to open a bunch of word files, from a list of file names in my excel workbook, activate the opened word files, perform a text replacement, and save the changes.
I can't make the liaison between Excel VBA and Word files.
Dim wdApp As Object, wddoc As Object
'Declare a string variable for Word document
Dim strDocName As String
On Error Resume Next
Set wdApp = CreateObject("Word.Application")
wdApp.Visible = True
strDocName = xDPathStr \ "FILENAME.docx" '(xDPathStr is the path I have defined earlier with all the word files)'
'Open Word File
wdApp.Activate
Set wddoc = wdApp.Documents(strDocName)
If wddoc Is Nothing Then Set wddoc = wdApp.Documents.Open(strDocName)
wddoc.Activate
wrdApp.Visible = True
Modify Word Files From a List in Excel
It is assumed that the file names are in column A. It will open each file and replace all occurrences of one string with another.
The focus here is on how to reference (open) Word, open files, modify them (not so much), close them with saving changes, and finally close Word only if it was initially closed.
Option Explicit
Sub VisitWord()
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Late Binding (not recommended at this stage)
' When you get familiar with how it works, switch to Late Binding:
' Dim wdApp As Object
' Dim WordWasClosed As Boolean
' On Error Resume Next ' see if Word is open
' Set wdApp = GetObject(, "Word.Application") ' attempt to create a reference to it
' On Error GoTo 0
' If wdApp Is Nothing Then ' Word is not open
' WordWasClosed = True
' Set wdApp = CreateObject("Word.Application") ' open and create a reference to it
' End If
' wdApp.Visible = True ' default is false; outcomment when done testing
' Dim wdDoc As Object
' ' etc.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Early Binding
' For this to work, in Excel, you need to create a reference to
' Tools > References > Microsoft Word 16.0 Object Library
' Use this to have the Word intellisense work for you.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Const WordFolderPath As String = "C:\Test\"
Const FINDSTRING As String = "Old String"
Const REPLACESTRING As String = "New String"
Dim wb As Workbook: Set wb = ThisWorkbook
Dim ws As Worksheet: Set ws = wb.Worksheets("Sheet1")
Dim rg As Range
Set rg = ws.Range("A2", ws.Cells(ws.Rows.Count, "A").End(xlUp))
Dim wdApp As Word.Application
Dim WordWasClosed As Boolean
On Error Resume Next ' see if Word is open
Set wdApp = Word.Application ' attempt to create a reference to it
On Error GoTo 0
If wdApp Is Nothing Then ' Word is not open
WordWasClosed = True
Set wdApp = New Word.Application ' open and create a reference to it
End If
wdApp.Visible = True ' default is false; outcomment when done testing
Dim cell As Range
Dim wdDoc As Word.Document
Dim WordFileName As String
Dim WordFilePath As String
For Each cell In rg.Cells
WordFileName = CStr(cell.Value)
If Len(WordFileName) > 0 Then
WordFilePath = WordFolderPath & WordFileName
If Len(Dir(WordFilePath)) > 0 Then ' file exists
Set wdDoc = wdApp.Documents.Open(WordFilePath)
' Here you do the damage...
wdDoc.Content.Find.Execute _
FindText:=FINDSTRING, _
ReplaceWith:=REPLACESTRING, _
Format:=True, _
Replace:=wdReplaceAll
wdDoc.Close SaveChanges:=True
End If
End If
Next cell
If WordWasClosed Then wdApp.Quit
End Sub
So this is the code i've come up with so far:
Dim wdApp As Object, wddoc As Object
'Declare a string variable for Word document
Dim strDocName As String
On Error Resume Next
Set wdApp = CreateObject("Word.Application")
wdApp.Visible = True
strDocName = xDPathStr \ "FILENAME.docx" '(xDPathStr is the path I have defined earlier with all the word files)'
'Open Word File
wdApp.Activate
Set wddoc = wdApp.Documents(strDocName)
If wddoc Is Nothing Then Set wddoc = wdApp.Documents.Open(strDocName)
wddoc.Activate
wrdApp.Visible = True
Here is some sample code that I created recently to loop through cells in Excel, which are paths to Word files. Each Word file is opened, scanned for a table (in Word), and copy/paste to Excel. See if you can start with this. Post back if you have additional questions.
Sub LoopThroughAllWordFiles()
Dim wdApp As New Word.Application, wdDoc As Word.Document
Dim oTbl As Word.Table
Dim strFolder As String, strFile As String
Dim WkSht As Worksheet
Dim cnt As Long
Dim tableCount As Long
Dim lrow As Long
Dim lastrow As Long
Dim file As String
Dim rng As Range, cell As Range
Dim objDoc As Object
Dim objWord As Object
Dim StartTime As Double
Dim SecondsElapsed As Double
StartTime = Timer
Set rng = Worksheets("Files").Range("A1:A200")
Worksheets("Word_Tables").Select
filecounter = 1
cnt = 1
Set objWord = CreateObject("Word.Application")
obj.Word.Visible = False
For Each cell In rng.SpecialCells(xlCellTypeVisible)
MyStr = Right(cell, 5)
If MyStr = ".docx" Then
mylength = Len(cell)
pos = InStrRev(cell, "\")
strFolder = Left(cell, pos)
strFile = Right(cell, mylength - pos)
Worksheets("Word_Files").Select
Set objWord = wdApp.Documents.Open(Filename:=strFolder & strFile, AddToRecentFiles:=False, ReadOnly:=True, Visible:=False)
Set sht = ThisWorkbook.Worksheets("Word_Files")
lastrow = Worksheets("Word_Files").UsedRange.Rows.Count + 1
totTbl = objDoc.Tables.Count
Debug.Print totTbl
For Each oTbl In objDoc.Tables
strCellText = oTbl.cell(1, 1).Range.Text
strCellText = LCase(strCellText)
Debug.Print strCellText
If strCellText Like "*data input*" Then
Worksheets("Word_Files").Range("A" & lastrow) = strFolder & strFile
On Error Resume Next
If cnt = 1 Then
lastrow = lastrow
Else
lastrow = ActiveSheet.UsedRange.Rows.Count
End If
oTbl.Range.Copy
Range("B" & lastrow).Select
sht.Paste
cnt = cnt + 1
End If
Next oTbl
End If
filecounter = filecounter + 1
Debug.Print filecounter
objWord.Close
Next cell
objDoc.Quit
Set objDoc = Nothing
objWord.Quit
Set objWord = Nothing
SecondsElapsed = Round(Timer - StartTime, 2)
SecondsFinal = SecondsElapsed / 60
MsgBox ("Code ran in " & SecondsFinal & "minutes.")
End Sub

How to paste values on every page of word document from Excel VBA?

I have a long list of word-paths and start- and endtags in Excel. I need to open the word document using the path specified in Excel, and paste a start-tag on the beginning of every page, and an end-tag on every end of a page. Every document has three pages.
I'm struggling with Excel VBA and cant seem to get it to work. Can anyone help me?
I need my code to run through the list, opening the file, copy the starttag on the beginning of each page, and the end tag on the end of each page, save and close the document and go on to the next document.
My excel structure
Until now, I managed to open my excel document
Sub startword()
Set WordApp = CreateObject("word.Application")
Path = Range("B2").Value & Range("F5").Value
WordApp.Documents.Open Path
WordApp.Visible = True
End Sub
And I was able to copy and paste values to a NEW document.
Sub copyrange()
'declare word vars
Dim WrdApp As Word.Application
Dim WrdDoc As Word.Document
'Path = Range("B2").Value & Range("F5").Value
'declare excel vars
Dim ExcRng As Range
'create new word instance
Set WrdApp = New Word.Application
WrdApp.Visible = True
WrdApp.Activate
Set WrdDoc = WrdApp.Documents.Add
'create reference to range i want to copy
Set ExcRng = ActiveSheet.Range("B2:E6")
'copy the range and wait for a bit
ExcRng.Copy
Application.Wait Now() + #12:00:01 AM#
'paste the object in word
WrdDoc.Paragraphs(1).Range.PasteExcelTable LinkedToExcel:=True, WordFormatting:=True, RTF:=False
WrdApp.Selection.GoTo What:=wdGoToPage, Which:=wdGoToNext
WrdDoc.Paragraphs(1).Range.PasteSpecial Link:=True, DataType:=wdPasteOLEObject
'clear clipboard
Application.CutCopyMode = False
End Sub
The range is totally random
PART TWO OF THE QUESTION
I'm struggling with the next piece of my code. I need to extract the contents between the first start and end tag (with the tag included) and move them to doc 1, same with page 2 to doc2, page 3 to doc 3. So I'll get three documents. doc1 with all the first pages of my documents, doc 2 with all the 2nd pages etc. I've made an attempt to find/select the code, but it selects the first and the last page, not the first one.
This is my current code for opening the word docs one by one:
Sub SelectRangeBetween()
Dim ws As Worksheet
Set ws = ThisWorkbook.Sheets("Sheet1") 'Change to the correct sheetname
Dim wrdApp As Word.Application
Dim WrdDoc As Word.Document
Set wrdApp = New Word.Application '
wrdApp.Visible = True 'set to false for higher speed
Const StarttagColumn = "C" 'Edit this for the column of the starttag.
Const EndtagColumn = "D" 'Edit this for the column of the endtag.
Const FilelocationColumn = "E" 'Edit this for the column of the Filelocation.
Const startRow As Long = 5 'This is the first row of tags and filenames
'Const endRow As Long = 140 'uncomment if you want a fixed amount of rows (for ranges with empty cells)
Dim endRow As Long 'comment out if const-endrow is used
endRow = ws.Range("B" & Rows.Count).End(xlUp).Row 'comment out if const-endrow is used
Dim i As Long
For i = startRow To endRow
Dim wrdPath As String
wrdPath = ws.Cells(i, FilelocationColumn).Value2 '
If wrdPath <> vbNullString Then '
If Dir(wrdPath) <> vbNullString Then '
Dim startTag As String '
Dim endTag As String '
startTag = ws.Cells(i, StarttagColumn).Value2 '
endTag = ws.Cells(i, EndtagColumn).Value2 '
Set WrdDoc = wrdApp.Documents.Open(wrdPath) '
With wrdApp
'.Documents.Add
' .Visible = True
' Types the text
'.Selection.HomeKey Unit:=wdStory
'.Selection.TypeText Text:="Hello and Goodbye"
' The Real script
'Dim StartWord As String, EndWord As String
'StartWord = "Hello"
'EndWord = "Goodbye"
With .ActiveDocument.Content.Duplicate
.Find.Execute FindText:=startTag & "*" & endTag, MatchWildcards:=False
.MoveStart wdCharacter, Len(StardWord)
.MoveEnd wdCharacter, -Len(EndWord)
.Select ' Or whatever you want to do
End With
End With
With WrdDoc
.Close
End With
End If
End If
Next i
End Sub
Try this version, I suggest you try with a small batch of documents first as the document will be saved immediately after pasting the tag. (comment out the lines if you do not want to save and/or close):
Option Explicit
Private Sub PasteTagsToDocument()
Dim ws As Worksheet
Set ws = ThisWorkbook.Sheets("Sheet1") 'Change to the correct name
Const startRow As Long = 5
Dim endRow As Long
endRow = ws.Range("B" & Rows.Count).End(xlUp).Row
Dim wrdApp As Word.Application
Dim wrdDoc As Word.Document
Set wrdApp = New Word.Application
wrdApp.Visible = True
Dim i As Long
For i = startRow To endRow
Dim wrdPath As String
wrdPath = ws.Cells(i, 2).Value2
If wrdPath <> vbNullString Then
If Dir(wrdPath) <> vbNullString Then
Dim startTag As String
Dim endTag As String
startTag = ws.Cells(i, 3).Value2
endTag = ws.Cells(i, 4).Value2
Set wrdDoc = wrdApp.Documents.Open(wrdPath)
With wrdDoc
.Range(0, 0).InsertBefore startTag & vbNewLine
.GoTo(wdGoToPage, wdGoToAbsolute, 2).InsertBefore endTag & vbNewLine & startTag & vbNewLine
.GoTo(wdGoToPage, wdGoToAbsolute, 3).InsertBefore endTag & vbNewLine & startTag & vbNewLine
.Range.Paragraphs.Last.Range.InsertAfter vbNewLine & endTag
.Save 'Comment out if you do not want to save
.Close 'Comment out if you do not want to close the document
End With
Else
If MsgBox("File don't exist. " & vbNewLine & wrdPath & vbNewLine & "Click Ok to Continue or Cancel to stop the macro.", vbOKCancel) = vbCancel Then Exit For
End If
End If
Next i
Set ws = Nothing
Set wrdDoc = Nothing
wrdApp.Quit
Set wrdApp = Nothing
MsgBox "Complete!"
End Sub

Update file names when the file names contain specific string referenced from an Excel table

I have created a fully functional outlook macro, that downloads Outlook attachments to OneDrive specified folder.
So the macro would update the file name with the email domain and month/year
e.g.
from original attachment name "Invoice_GBR_Z-GRX_2019_07.pdf"
it becomes "comfone.com_08-2019___Invoice_GBR_Z-GRX_2019_07.pdf" after executing the macro.
However, I would like the macro to also have the ability to compare against a static Excel table called Table.xls on my desktop (2 columns where column A contain the email domain name, and column B containing its respective company code), wherein if the Excel cell contains "comfone.com", then its corresponding company code say 0001 would then be appended to the file name
So the file name gets updated to
"0001_comfone.com_08-2019___Invoice_GBR_Z-GRX_2019_07.pdf"
I'm struggling quite a fair bit not knowing how to reference to an Excel table from my Outlook vba.
Public Sub SaveAttachments()
Dim objOL As Outlook.Application
Dim objMsg As Outlook.MailItem 'Object
Dim objAttachments As Outlook.Attachments
Dim objSelection As Outlook.Selection
Dim i As Long
Dim lngCount As Long
Dim strFile As String
Dim strFolderpath As String
Dim strDeletedFiles As String
Dim saveName As String
Dim userName As String
Dim sndrEmailAdd As String
Dim sndrEmailRight As String
Dim sndrEmailPreDot As String
' Get the path to your OneDrive folder.
userName = CreateObject("WScript.Network").userName
Debug.Print userName
'strFolderpath = "C:\Users\" & VBA.Environ$("USERNAME") & "\OneDrive - SAP
SE"
strFolderpath = "C:\Users\" & userName & "\OneDrive - SAP SE"
On Error Resume Next
' Instantiate an Outlook Application object.
Set objOL = CreateObject("Outlook.Application")
' Get the collection of selected objects.
Set objSelection = objOL.ActiveExplorer.Selection
' Set the Test folder, change "Test" to any folder name in your OneDrive
strFolderpath = strFolderpath & "\Downloaded Invoices\"
' Check each selected item for attachments. If attachments exist,
' save them to the strFolderPath folder and strip them from the item.
For Each objMsg In objSelection
' This code only strips attachments from mail items.
' If objMsg.class=olMail Then
' Get the Attachments collection of the item.
Set objAttachments = objMsg.Attachments
lngCount = objAttachments.Count
strDeletedFiles = ""
If lngCount > 0 Then
' We need to use a count down loop for removing items
' from a collection. Otherwise, the loop counter gets
' confused and only every other item is removed.
For i = lngCount To 1 Step -1
'Extract text, after # and before dot, from the email address.
sndrEmailAdd = objMsg.SenderEmailAddress
Debug.Print sndrEmailAdd
'Debug.Print " position of # sign: " & InStr(sndrEmailAdd, "#")
'Debug.Print " number of characters right of # sign: " &
Len(sndrEmailAdd) - InStr(sndrEmailAdd, "#")
'sndrEmailRight = Right(sndrEmailAdd, Len(sndrEmailAdd) -
InStr(sndrEmailAdd, "#"))
sndrEmailRight = Right(sndrEmailAdd, Len(sndrEmailAdd) -
InStr(sndrEmailAdd, "#"))
Debug.Print " text after # sign: " & sndrEmailRight
Debug.Print " position of the (first) . period in the remaining text:
" & InStr(sndrEmailRight, ".")
'sndrEmailPreDot = Left(sndrEmailRight, InStr(sndrEmailRight, ".") -
1)
' Save attachment before deleting from item.
' Get the file name.
strFile = sndrEmailRight & "_" & Format(DateAdd("m", -1,
objMsg.ReceivedTime), "mm-yyyy") & "___" &
objAttachments.item(i).FileName
' Combine with the path to the Temp folder.
saveName = strFolderpath & strFile
' Save the attachment as a file.
objAttachments.item(i).SaveAsFile saveName
' Delete the attachment.
'objAttachments.item(i).Delete
Next i
objMsg.Save
End If
Next
ExitSub:
Set objAttachments = Nothing
Set objMsg = Nothing
Set objSelection = Nothing
Set objOL = Nothing
End Sub
See example on how to call Excel from Outlook
https://stackoverflow.com/a/41801050/4539709
I have updated your code, make sure to Reference to Microsoft Excel xx.x Object Library and update your one-drive folder path
Your code example
Option Explicit
Public Sub SaveAttachments()
Dim Atmts As Outlook.Attachments
Dim i As Long
Dim lngCount As Long
Dim strFile As String
Dim strDeletedFiles As String
Dim saveName As String
Dim userName As String
Dim sndrEmailAdd As String
Dim sndrEmailRight As String
Dim sndrEmailPreDot As String
Dim strFolderpath As String
strFolderpath = "C:\Temp\"
' Get the collection of selected objects.
Dim objSelection As Outlook.Selection
Set objSelection = Application.ActiveExplorer.Selection
Dim xlApp As Excel.Application
Set xlApp = New Excel.Application
Dim Book As Workbook
Set Book = xlApp.Workbooks.Open("C:\Temp\Book1.xlsx")
Dim xlStarted As Boolean
xlStarted = True
Dim Sht As Excel.Worksheet
Set Sht = Book.Sheets("Sheet1")
Dim Rng As Range
' Check each selected item for attachments. If attachments exist,
' save them to the strFolderPath folder and strip them from the item.
Dim objMsg As Outlook.MailItem 'Object
For Each objMsg In objSelection
' This code only strips attachments from mail items.
' If objMsg.class=olMail Then
' Get the Attachments collection of the item.
Set Atmts = objMsg.Attachments
lngCount = Atmts.Count
Debug.Print lngCount
strDeletedFiles = ""
If lngCount > 0 Then
' We need to use a count down loop for removing items
' from a collection. Otherwise, the loop counter gets
' confused and only every other item is removed.
For i = lngCount To 1 Step -1
'Extract text, after # and before dot, from the email address.
sndrEmailAdd = objMsg.SenderEmailAddress
Debug.Print sndrEmailAdd
sndrEmailRight = Right(sndrEmailAdd, Len( _
sndrEmailAdd) - InStr( _
sndrEmailAdd, "#"))
Debug.Print sndrEmailRight
sndrEmailPreDot = Left(sndrEmailRight, _
InStr(sndrEmailRight, ".") - 1)
Debug.Print sndrEmailPreDot
For Each Rng In Sht.Range("A1", Sht.Range("A100").End(xlUp))
If (Rng.Value) = sndrEmailRight Then
sndrEmailPreDot = Rng.Offset(0, 1).Value & "_" & sndrEmailPreDot
Debug.Print sndrEmailPreDot
End If
Next
' Save attachment before deleting from item.
' Get the file name.
strFile = sndrEmailPreDot & "_" & _
Format(DateAdd("m", -1, objMsg.ReceivedTime _
), "mm-yyyy") & "___" & Atmts.Item(i).FileName
Debug.Print strFile
' Combine with the path to the Temp folder.
saveName = strFolderpath & strFile
Debug.Print saveName
' Save the attachment as a file.
Atmts.Item(i).SaveAsFile saveName
' Delete the attachment.
'Atmts.item(i).Delete
Next i
objMsg.Save
End If
Next
' Close & SaveChanges
Book.Close SaveChanges:=True
If xlStarted Then
xlApp.Quit
End If
Set xlApp = Nothing
Set Book = Nothing
Set Sht = Nothing
Set Rng = Nothing
Set Atmts = Nothing
Set objMsg = Nothing
Set objSelection = Nothing
End Sub

Selecting specific controls when extracting from word to excel

Is it possible to select which specific form controls are extracted from word to excel?
I have a macro at the moment that works fine and extracts all the form controls into excel, onto one single row. The thing is, I need to break down the controls into 3 different sections. Each having its own sheet/tab. The form controls are text and drop down lists.
For example: Say the form has 9 questions.
1st worksheet/tab, macro will pull questions
1.
2.
3.
2nd worksheet/tab, macro will pull questions (I don't mind a separate macro)
4.
5.
6.
3rd worksheet/tab macro will pull questions(I don't mind a separate macro)
7.
8.
9.
Current macro that runs great, but brings in every single control:
Sub GetFormData()
'Note: this code requires a reference to the Word object model
Application.ScreenUpdating = False
Dim wdApp As New Word.Application, wdDoc As Word.Document, CCtrl As Word.ContentControl
Dim strFolder As String, strFile As String, WkSht As Worksheet, i As Long, j As Long
strFolder = GetFolder
If strFolder = "" Then Exit Sub
Set WkSht = ActiveSheet
i = WkSht.Cells(WkSht.Rows.Count, 1).End(xlUp).Row
strFile = Dir(strFolder & "\*.docx", vbNormal)
While strFile <> ""
i = i + 1
Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
With wdDoc
j = 0
For Each CCtrl In .ContentControls
With CCtrl
Select Case .Type
Case Is = wdContentControlCheckBox
j = j + 1
WkSht.Cells(i, j).Value = .Checked
Case wdContentControlDate, wdContentControlDropdownList, wdContentControlRichText, wdContentControlText
j = j + 1
WkSht.Cells(i, j).Value = .Range.Text
Case Else
End Select
End With
Next
.Close SaveChanges:=False
End With
strFile = Dir()
Wend
wdApp.Quit
Set wdDoc = Nothing: Set wdApp = Nothing: Set WkSht = Nothing
Application.ScreenUpdating = True
End Sub
Function GetFolder() As String
Dim oFolder As Object
GetFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function
Example of what it looks like. The questions are repeated so dont mind them:
Here 's an outline to approach what you want. Basically it's all in the set up. My solution assumes that each control in your Word document has the Title field set and defined to a unique value.
My suggestion is to isolate similarly coded logic into separate functions. As an example, the SaveControlData and IsInArray.
Option Explicit
Sub example()
Dim thisSheet As Worksheet
Dim thatSheet As Worksheet
Dim theOtherSheet As Worksheet
Set thisSheet = ThisWorkbook.Sheets("Sheet1")
Set thatSheet = ThisWorkbook.Sheets("Sheet2")
Set theOtherSheet = ThisWorkbook.Sheets("Sheet3")
'--- map the control (by Title) to each worksheet
Dim thisTitles As Variant
Dim thatTitles As Variant
Dim theOtherTitles As Variant
thisTitles = Split("MyCheckbox,MyTextbox", ",")
thatTitles = Split("MyDatebox", ",")
theOtherTitles = Split("MyCheckbox,MyDatebox", ",")
Dim wdApp As Word.Application
Set wdApp = New Word.Application
Dim wdDoc As Word.Document
Set wdDoc = wdApp.Documents.Open("C:\Temp\Test text.docx")
'--- determine the starting point for data on each worksheet
Dim thisCell As Range
Dim thatCell As Range
Dim theOtherCell As Range
Set thisCell = thisSheet.Range("A1") 'calculate last row?
Set thatCell = thatSheet.Range("A1")
Set theOtherCell = theOtherSheet.Range("A1")
Dim CCtrl As Word.ContentControl
With wdDoc
For Each CCtrl In .ContentControls
'--- arranging the If statements like this means you could
' technically copy the same control value to different
' worksheets
If IsInArray(thisTitles, CCtrl.Title) Then
SaveControlData thisCell, CCtrl
thisCell.Offset(0, 1).value = CCtrl.Title
Set thisCell = thisCell.Offset(1, 0)
End If
If IsInArray(thatTitles, CCtrl.Title) Then
SaveControlData thatCell, CCtrl
thatCell.Offset(0, 1).value = CCtrl.Title
Set thatCell = thatCell.Offset(1, 0)
End If
If IsInArray(theOtherTitles, CCtrl.Title) Then
SaveControlData theOtherCell, CCtrl
theOtherCell.Offset(0, 1).value = CCtrl.Title
Set theOtherCell = theOtherCell.Offset(1, 0)
End If
Next CCtrl
End With
wdDoc.Close SaveChanges:=False
wdApp.Quit
End Sub
Private Function IsInArray(ByRef wordList As Variant, ByVal thisWord As String) As Boolean
IsInArray = False
Dim i As Long
For i = LBound(wordList, 1) To UBound(wordList, 1)
If wordList(i) = thisWord Then
IsInArray = True
Exit Function
End If
Next i
End Function
Private Sub SaveControlData(ByRef cell As Range, ByRef CCtrl As Variant)
With CCtrl
Select Case .Type
Case Is = wdContentControlCheckBox
cell.value = .Checked
Case wdContentControlDate, _
wdContentControlDropdownList, _
wdContentControlRichText, _
wdContentControlText
cell.value = .Range.Text
Case Else
End Select
End With
End Sub

Copy data from several Word documents to one Excel workbook using Word VBA

I have about 100 Word documents and from each I want to copy data and paste it all in one Excel workbook.
I came up with this code which opens one Word document, copies data, pastes it to Excel and closes the Word document:
Sub WordDataToExcel()
Dim myObj
Dim myWB
Dim mySh
Dim txt As String, Lgth As Long, Strt As Long
Dim i As Long
Dim oRng As Range
Dim Tgt As String
Dim TgtFile As String
Dim arr()
Dim ArrSize As Long
Dim ArrIncrement As Long
ArrIncrement = 1000
ArrSize = ArrIncrement
ReDim arr(ArrSize)
Dim wrdDoc As Object
Documents.Open ("D:\ekr5_i.doc")
TgtFile = "result.xlsx"
Tgt = "D:\" & TgtFile
'finds the text string of Lgth lenght
txt = "thetext"
Lgth = 85
Strt = Len(txt)
'Return data to array
With Selection
.HomeKey unit:=wdStory
With .Find
.ClearFormatting
.Forward = True
.Text = txt
.Execute
While .Found
i = i + 1
Set oRng = ActiveDocument.Range _
(Start:=Selection.Range.Start + Strt, _
End:=Selection.Range.End + Lgth)
arr(i) = oRng.Text
oRng.Start = oRng.End
.Execute
If i = ArrSize - 20 Then
ArrSize = ArrSize + ArrIncrement
ReDim Preserve arr(ArrSize)
End If
Wend
End With
End With
ReDim Preserve arr(i)
'Set target and write data
Set myObj = CreateObject("Excel.Application")
Set myWB = myObj.Workbooks.Open(Tgt)
Set mySh = myWB.Sheets(1)
With mySh
.Range(.Cells(1, 1), .Cells(i, 1)) = myObj.Transpose(arr)
End With
'Tidy up
myWB.Close True
myObj.Quit
ActiveDocument.Close SaveChanges:=wdDoNotSaveChanges
Set mySh = Nothing
Set myWB = Nothing
Set myObj = Nothing
End Sub
I need to loop through all the documents in the folder.
I have implemented the same with Excel workbooks, but I don't know how for Word documents.
Here is the code for Excel workbooks:
Sub combine_into_one()
Dim FSO As Object: Set FSO = CreateObject("Scripting.FileSystemObject")
Dim strPath$, Pivot$, sUserName$, sFolderName$, sSourceName$, x&
Dim oFldialog As FileDialog
Dim oFolder
Application.DisplayAlerts = False
Application.ScreenUpdating = False
Set oFldialog = Application.FileDialog(msoFileDialogFolderPicker)
With oFldialog
If .Show = -1 Then
.Title = "Select a Folder"
.AllowMultiSelect = False
.InitialFileName = strPath
sFolderName = .SelectedItems(1)
End If
End With
Set oFolder = FSO.GetFolder(sFolderName)
Workbooks.Add: Pivot = ActiveWorkbook.Name 'Destination workbook
For Each oFile In oFolder.Files
Workbooks(Pivot).Activate
x = Workbooks(Pivot).Sheets(1).Cells.SpecialCells(xlCellTypeLastCell).Row + 1
Workbooks.Open Filename:=oFile: sSourceName = ActiveWorkbook.Name
Workbooks(sSourceName).Activate
Workbooks(sSourceName).Sheets(1).[A80:Q94].copy
Workbooks(Pivot).Activate
Workbooks(Pivot).Sheets(1).Cells(x + 1, 1).PasteSpecial xlPasteAll
Workbooks(sSourceName).Close False
Next
Application.CutCopyMode = False
Application.DisplayAlerts = True
Application.ScreenUpdating = True
End Sub
There are so, so, so many things you can do between Excel & Word. I'm not sure I totally understand your question. The script below may help you; it has definitely served me well over time. If you need something different, please describe your issue more, to better clarify the issue you are facing.
Sub OpenAndReadWordDoc()
Rows("2:1000000").Select
Range(Selection, Selection.End(xlDown)).Select
Selection.ClearContents
Range("A1").Select
' assumes that the previous procedure has been executed
Dim oWordApp As Word.Application
Dim oWordDoc As Word.Document
Dim blnStart As Boolean
Dim r As Long
Dim sFolder As String
Dim strFilePattern As String
Dim strFileName As String
Dim sFileName As String
Dim ws As Worksheet
Dim c As Long
Dim n As Long
'~~> Establish an Word application object
On Error Resume Next
Set oWordApp = GetObject(, "Word.Application")
If Err Then
Set oWordApp = CreateObject("Word.Application")
' We started Word for this macro
blnStart = True
End If
On Error GoTo ErrHandler
Set ws = ActiveSheet
r = 1 ' startrow for the copied text from the Word document
' Last column
n = ws.Range("A1").End(xlToRight).Column
sFolder = "C:\Users\Excel\Desktop\Coding\Microsoft Excel\PWC\Resumes\"
'~~> This is the extension you want to go in for
strFilePattern = "*.doc*"
'~~> Loop through the folder to get the word files
strFileName = Dir(sFolder & strFilePattern)
Do Until strFileName = ""
sFileName = sFolder & strFileName
'~~> Open the word doc
Set oWordDoc = oWordApp.Documents.Open(sFileName)
' Increase row number
r = r + 1
' Enter file name in column A
ws.Cells(r, 1).Value = sFileName
ActiveCell.Offset(1, 0).Select
ActiveSheet.Hyperlinks.Add Anchor:=Sheets("Sheet1").Range("A" & r), Address:=sFileName, _
SubAddress:="A" & r, TextToDisplay:=sFileName
' Loop through the columns
For c = 2 To n
If oWordDoc.Content.Find.Execute(FindText:=Trim(ws.Cells(1, c).Value), _
MatchWholeWord:=True, MatchCase:=False) Then
' If text found, enter Yes in column number c
ws.Cells(r, c).Value = "Yes"
End If
Next c
oWordDoc.Close SaveChanges:=False
'~~> Find next file
strFileName = Dir
Loop
ExitHandler:
On Error Resume Next
' close the Word application
Set oWordDoc = Nothing
If blnStart Then
' We started Word, so we close it
oWordApp.Quit
End If
Set oWordApp = Nothing
Exit Sub
ErrHandler:
MsgBox Err.Description, vbExclamation
Resume ExitHandler
End Sub
Function GetDirectory(path)
GetDirectory = Left(path, InStrRev(path, "\"))
End Function
In this scenario, whatever you put in the headers of B1:K1 (or more to the right) is searched for, each word document in a folder is opened, scanned, and if the string in B1:K1 is found, an 'x' is placed in the same x-y coordinate.
Again, if this doesn't help, please describe your issue better, and I'll post back with alternative solutions. Thanks!!

Resources