Excel VBA to get Word table data - excel

I found VBA code and modified it but it didn't work.
Sub ImportWordTable()
Dim wdDoc As Object
Dim wdFileName As Variant
Dim i As Integer
Dim j As Integer
Dim TableNo As Integer 'table number in Word
Dim iRow As Long 'row index in Excel
Dim iCol As Integer 'column index in Excel
wdFileName = Application.GetOpenFilename()
j = 1
If wdFileName = False Then Exit Sub '(user cancelled import file browser)
Set wdDoc = GetObject(wdFileName) 'open Word file
With wdDoc
For i = 1 To .Tables.Count
With .Tables(i)
'copy cell contents from Word table cells to Excel cells
For iRow = 2 To .Rows.Count
For iCol = 1 To .Columns.Count
Cells(j, iCol) = WorksheetFunction.Clean(.cell(iRow, iCol).Range.Text)
Next iCol
j = j + 1
Next iRow
End With
Next i 'Next Table
End With 'End of the document
Set wdDoc = Nothing
End Sub
I have a Word document, with 750 pages, with a table on each page. I want to import the contents of the tables into an Excel file (with the exception of the first line of each table as that is the column names).
It throws an automation error (as shown in the picture): "Run-time error '-2147418105 (80010007)'".

From your screenshot it appears you got through row 10 and started row 11 before the error occurred. Without seeing the full document or knowing the table schema I would try one or more of the following approaches:
Try setting a breakpoint before Word crashes, then step through to see whether specific data or some other issue is the culprit. Set it at the row loop (For iRow = 2 To .Rows.Count), or add If iRow = 10 Then Stop inside the loop, and step into each line from there (Debug > Step Into, F8).
If the table ends near 10 rows, try closing wdDoc (wdDoc.Close False, then Set wdDoc = Nothing), reopening it, and continuing the import right where you left off. This sounds impractical, but if it works around the bug you can adjust some values and narrow down the root cause.
Run this VBA on a computer that has lots of free RAM. MS Word may be loading the entire 750-page document into memory and then hitting an out-of-memory condition, pagefile thrashing, or some other OS resource limit that ends in an application crash.
Try splitting the 750-page document into 8 files with no more than 100 pages per file, then loop through each. If that works, look into computer resource limits or Word document size limitations.
FYI: I've seen some issues with Word stability at ~1000 pages but I had lots of screenshots and other formatting goodies in that example (and I found solution that kept it split into 27 separate reference detail-level sections with an overall <50-page top-level document loosely referring to them.)
A Year late and short of a known fix, but this debugging checklist might help you or others who stumble upon this thread :-). I'm working with very similar routine (and just started searching for how to pull statistics for each table :-)). Cheers and good luck!
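The first items of the checklist can be sketched roughly as follows. This is only a sketch under assumptions, not a known fix: the name ImportWordTablesWithLogging is made up, it starts its own hidden Word instance with CreateObject (the original used GetObject on the file), and it writes to whatever sheet is active. As far as I know, 80010007 corresponds to RPC_E_SERVER_DIED, meaning the Word process went away during the call, so the useful part is that the handler records which table, row and column was being read at that moment.

```vba
Sub ImportWordTablesWithLogging()
    Dim wdApp As Object, wdDoc As Object, tbl As Object
    Dim wdFileName As Variant
    Dim t As Long, r As Long, c As Long, outRow As Long

    wdFileName = Application.GetOpenFilename("Word files (*.doc;*.docx), *.doc;*.docx")
    If VarType(wdFileName) = vbBoolean Then Exit Sub    'user cancelled

    Set wdApp = CreateObject("Word.Application")        'assumption: separate hidden instance
    Set wdDoc = wdApp.Documents.Open(wdFileName, , True) 'third argument is ReadOnly
    outRow = 1

    On Error GoTo Failed
    For t = 1 To wdDoc.Tables.Count
        Set tbl = wdDoc.Tables(t)
        For r = 2 To tbl.Rows.Count                     'row 1 holds the column names
            For c = 1 To tbl.Columns.Count
                ActiveSheet.Cells(outRow, c).Value = _
                    WorksheetFunction.Clean(tbl.Cell(r, c).Range.Text)
            Next c
            outRow = outRow + 1
        Next r
        If t Mod 50 = 0 Then DoEvents                   'let Excel and Word breathe
    Next t

CleanUp:
    On Error Resume Next                                'Word may already be gone
    wdDoc.Close False
    wdApp.Quit
    Exit Sub

Failed:
    'Shows exactly where the import stopped (View > Immediate Window)
    Debug.Print "Stopped at table " & t & ", row " & r & ", column " & c & _
                ": " & Err.Number & " " & Err.Description
    Resume CleanUp
End Sub
```

If the same table/row/column shows up on every run, the data in that cell (a merged cell, for example) is the first suspect; if the position moves around, resource limits look more likely.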

Related

How to select a specific type of tracked change in a word table and copy it to excel?

I have a unique situation I am trying to find a way to implement:
I am working with Word documents that are simple tables: all of the information in each document is in a table. Some are hundreds of pages long and are revised regularly. What I am trying to do, from a macro in Excel, is open the Word document, scan it, and copy over to Excel only those rows from the table that are insertions.
I have managed to cobble together from various sources the code that will open the word doc and copy over ANY track changes... But for the life of me I cannot see or find a way to limit it to insertions, I'm hoping that someone may have some ideas...
Here is the code I am using now that works to bring over all tracked changes in the proper columns:
'declare variables
Dim ws As Worksheet
Dim WordFilename As Variant
Dim Filter As String
Dim WordDoc As Object
Dim tbNo As Long
Dim RowOutputNo As Long
Dim RowNo As Long
Dim ColNo As Integer
Dim tbBegin As Integer
Set ws = Worksheets("Analysis")
Filter = "Word File New (*.docx), *.docx," & _
"Word File Old (*.doc), *.doc,"
'clear all of the content in the worksheet where the tables from the Word document are to be imported
ws.Cells.ClearContents
'if you only want to clear a specific range, replace .Cells with the range that you want to clear
'displays a Browser that allows you to select the Word document that contains the table(s) to be imported into Excel
WordFilename = Application.GetOpenFilename(Filter, , "Select Word file")
If WordFilename = False Then Exit Sub
'open the selected Word document
Set WordDoc = GetObject(WordFilename)
With WordDoc
tbNo = WordDoc.Tables.Count
If tbNo = 0 Then
MsgBox "This document contains no tables"
End If
'nominate which row to begin inserting the data from. In this example we are inserting the data from row 1
RowOutputNo = 1
'go through each of the tables in the Word document and insert the data from each of the cells into Excel
For tbBegin = 1 To tbNo
With .Tables(tbBegin)
For RowNo = 1 To .rows.Count
For ColNo = 1 To .Columns.Count
'-----This code works to only select revisions ----------------
'-----Next step - make it only select insertions -
' OR - let it mark what kind of revision it is-----
Set rng = .Cell(RowNo, ColNo).Range
'don't include the "end of cell" marker in the checked range
'rng.MoveEnd wdCharacter, -1
numRevs = rng.Revisions.Count
If numRevs > 0 Then
ws.Cells(RowOutputNo, ColNo) = Application.WorksheetFunction.Clean(.Cell(RowNo, ColNo).Range.Text)
End If
Next ColNo
RowOutputNo = RowOutputNo + 1
Next RowNo
End With
RowOutputNo = RowOutputNo
Next tbBegin
End With
End Sub
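For example, check the Type of each revision in the cell; a value of 1 (wdRevisionInsert) means it is an insertion:
If numRevs > 0 Then
For revIdx = 1 To numRevs
If rng.Revisions(revIdx).Type = 1 Then '1 = wdRevisionInsert
'it's an insert
End If
Next revIdx
End If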
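The same check can be wrapped in a small helper so the import loop stays readable. This is a sketch, not tested against your documents: HasInsertion is a hypothetical name, and the constant is spelled out because the question's code is late bound (GetObject, As Object), so Word's own constants are not available in Excel.

```vba
'Returns True if at least one tracked change in the given Word range is an insertion.
Function HasInsertion(ByVal rng As Object) As Boolean
    Const wdRevisionInsert As Long = 1   'Word's value, declared locally for late binding
    Dim rev As Object

    For Each rev In rng.Revisions
        If rev.Type = wdRevisionInsert Then
            HasInsertion = True
            Exit Function
        End If
    Next rev
End Function
```

In the question's loop, the test If numRevs > 0 Then would then become If HasInsertion(.Cell(RowNo, ColNo).Range) Then, with the rest left as it is.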

Enable editing vba

I am working in Excel 2016. I currently created a macro that will loop through all open workbooks and grab the data in them if they start with the word "report". The issue I am trying to solve now is how to enable editing. If the users enable editing after downloading all reports to be combined there is no issue with the macro. They run into issues with the macro not grabbing the data if they missed that button.
While they are not working with that many workbooks, I am trying to make it easier for them. The code that I have posted will do the first 3 workbooks and then continue looping through the remaining 5 but will not "Enable Edit".
Sub EnableEdit()
Dim bk As Workbook
Dim w As Long, wCount As Long
wCount = Application.ProtectedViewWindows.Count
Set wsh = ThisWorkbook.Worksheets("Data")
On Error Resume Next
If wCount > 0 Then
For w = 1 To wCount
Application.ProtectedViewWindows(w).Activate
Application.ProtectedViewWindows(w).Edit
If Left(ActiveWorkbook.Name, 6) = "report" Then
ActiveWorkbook.Worksheets(1).Range("A1:Z1").Copy _
Destination:=wsh.Range("A1")
nrow = wsh.Cells(Rows.Count, 1).End(xlUp).Row + 1
ActiveWorkbook.Worksheets(1).Range("A2:Z500").Copy _
Destination:=wsh.Range("A" & nrow)
ActiveWorkbook.Close
End If
Next w
End If
On Error GoTo 0
End Sub
Application.ProtectedViewWindows appears to be a collection of all the protected-view windows. As soon as you execute the .Edit method on one of those protected-view windows it is no longer in protected-view mode and is therefore removed from the collection.
This means that when you Edit the first member of the collection (when w is 1), what was the second member now becomes the first member, what was the third member now becomes the second, etc. And then on the next iteration of your loop (when w is 2) your code is therefore looking at the original third member, having completely ignored looking at the original second member.
The easiest way to fix the issue is to loop through the collection in reverse order, so that removing a member never shifts the members you have not visited yet, i.e. use:
For w = wCount To 1 Step -1
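Put together with the rest of the question's routine, the reverse loop might look like the sketch below. The name EnableEditAndCollect is made up, and it relies on ProtectedViewWindow.Edit returning the opened Workbook, which avoids depending on ActiveWorkbook. I also left out On Error Resume Next so that any failure is visible while testing.

```vba
Sub EnableEditAndCollect()
    Dim wsh As Worksheet
    Dim bk As Workbook
    Dim w As Long, nrow As Long

    Set wsh = ThisWorkbook.Worksheets("Data")

    'Walk the collection backwards: Edit removes item w and leaves items 1..w-1 untouched
    For w = Application.ProtectedViewWindows.Count To 1 Step -1
        Set bk = Application.ProtectedViewWindows(w).Edit   'Edit returns the Workbook

        If Left$(bk.Name, 6) = "report" Then
            bk.Worksheets(1).Range("A1:Z1").Copy Destination:=wsh.Range("A1")
            nrow = wsh.Cells(wsh.Rows.Count, 1).End(xlUp).Row + 1
            bk.Worksheets(1).Range("A2:Z500").Copy Destination:=wsh.Range("A" & nrow)
            bk.Close SaveChanges:=False
        End If
    Next w
End Sub
```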

Automate Word document into excel

I have to transfer hundreds of word documents into excel. The documents are client records from way back, and they look like this:
Date: dd/mm/yy
Employee name: Name
Several lines of text in one paragraph.
There will be dozens of entries on the page like this, formatted into one column. I need the end result to be one column for each category. The major challenge with this is the text descriptions. The way they are formatted in Word, there doesn't seem to be a simple way to copy and paste them into Excel without the result being one cell for each line of the paragraph. Any method I've tried that replaces hard returns with soft returns takes all the text on the entire page and puts it into one cell, which makes it worse. I have hundreds of files to transfer, so I don't have the time to copy and paste each note into one cell. The paragraphs are also not uniform in length, although there is typically an empty line before and after each note. I'm sure there must be a VB solution for this, but I'm not knowledgeable enough to figure it out. Any help would be appreciated.
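You can set bookmarks in Word and import all of the bookmarks into ranges in Excel. The code below uses early binding (Word.Application, Word.Document), so it needs a reference to the Microsoft Word Object Library.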
Sub TryThis()
Dim oWord As Word.Application
Dim oDoc As Word.document
Dim vBkMarks As Variant
Dim vRecord
Dim rRecord As Range
Dim nFields As Long
Dim i As Long
vBkMarks = Array("Bookmark1", "Bookmark2", "Bookmark3") 'etc...
ReDim vRecord(LBound(vBkMarks) To UBound(vBkMarks))
nFields = UBound(vBkMarks) - LBound(vBkMarks) + 1
On Error Resume Next
Set oWord = GetObject(, "Word.Application")
On Error GoTo 0
If oWord Is Nothing Then _
Set oWord = CreateObject("Word.Application")
Set oDoc = oWord.ActiveDocument
For i = LBound(vBkMarks) To UBound(vBkMarks)
vRecord(i) = oDoc.Bookmarks(vBkMarks(i)).Range.Text
Next i
With Sheets("DataTable")
.Cells(.Rows.Count, 1).End(xlUp).Offset(1, 0).Resize( _
1, nFields).Value = vRecord
End With
End Sub
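If adding bookmarks to hundreds of files is too much work, the text itself could be split into records instead. The following is only a sketch based on the layout described in the question: it assumes every entry starts with a line beginning "Date:", followed by a line beginning "Employee name:", and that everything up to the next "Date:" line belongs to the note. SplitRecords and the three-column layout are my own choices, not part of the answer above.

```vba
'Splits the document text into rows of Date / Employee name / Note.
Sub SplitRecords(ByVal docText As String, ByVal ws As Worksheet)
    Dim lines() As String
    Dim s As String
    Dim i As Long, outRow As Long

    ws.Columns(1).NumberFormat = "@"                 'keep dd/mm/yy as text, no date guessing
    docText = Replace(docText, Chr(11), vbCr)        'treat soft returns like paragraph marks
    lines = Split(docText, vbCr)                     'Word separates paragraphs with vbCr
    outRow = 1                                       'row 1 is left free for headings

    For i = LBound(lines) To UBound(lines)
        s = Trim$(lines(i))
        If LCase$(Left$(s, 5)) = "date:" Then
            outRow = outRow + 1                      'a new record starts here
            ws.Cells(outRow, 1).Value = Trim$(Mid$(s, 6))
        ElseIf LCase$(Left$(s, 14)) = "employee name:" Then
            ws.Cells(outRow, 2).Value = Trim$(Mid$(s, 15))
        ElseIf Len(s) > 0 And outRow > 1 Then
            'every other non-empty line is appended to the note of the current record
            If Len(ws.Cells(outRow, 3).Value) > 0 Then
                ws.Cells(outRow, 3).Value = ws.Cells(outRow, 3).Value & " " & s
            Else
                ws.Cells(outRow, 3).Value = s
            End If
        End If
    Next i
End Sub
```

From the routine above it could be called as SplitRecords oDoc.Content.Text, Sheets("DataTable"). Each call starts writing at row 2, so for many files the starting row would have to be passed in or looked up.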

Run-time error 13 Type Mismatch, Transferring data from Access to Excel

I have a button in Access (2003) that transfers data to Excel (also 2003). It opens the Excel workbook, then cycles through the Access subforms and transfers data.
To give more information on how this works: Excel has a range called "Tables" which contains the names of the Access subforms ("Main", "Demographics", "History", etc.). Excel also has a range for each of the names in that first range. For example, the range "Demographics" contains a series of field names ("FirstName", "LastName", etc.). So the first loop moves through the subforms, and the nested loop moves through the field names. Each field then passes its value over to Excel. Excel also has ranges named "Demographics_Anchor", "History_Anchor", etc., each of which is the first cell in the column next to the corresponding range. That is, the range Demographics lists FirstName, LastName, and so on, and the data goes in the column to its right: the first item in the range is FirstName, so its value goes into "Demographics_Anchor"; LastName then goes to Demographics_Anchor offset by 1, i.e. one cell down from the anchor.
Dim ThisForm As Form
Dim CForm As Object
Dim CTab As TabControl
Dim CControl As Control
Dim CurrentTab As Variant
Dim CControlName As Variant
Dim CControlValue As String
Dim Code As Control
Dim counter1 As Integer
Dim appExcel As Object
Dim Anchor As Object
Dim PageRange As Object
Dim ControlNameRange As Object
strpath = "C:\blah\blah\filename.xlsm"
Set appExcel = CreateObject("Excel.Application")
appExcel.Workbooks.Open Filename:=strpath, UpdateLinks:=1, ReadOnly:=True
Set wbk = appExcel.ActiveWorkbook
Set PageRange = appExcel.Range("Tables")
'set Access environment
Set ThisForm = Forms("frmHome")
Set CTab = ThisForm.Controls("Subforms")
'export the data from Access Forms to Excel
For Each CurrentTab In PageRange
If CurrentTab = "Main" Then
Set CForm = ThisForm
Else
CTab.Pages(CurrentTab).SetFocus
Set CForm = ThisForm.Controls(CurrentTab & " Subform").Form
End If
Set ControlNameRange = appExcel.Range(CurrentTab)
Set Anchor = appExcel.Range(CurrentTab & "_Anchor")
counter1 = 0
For Each CControlName In ControlNameRange
Set CControl = CForm.Controls(CControlName)
CControl.SetFocus
Anchor.Offset(RowOffset:=counter1).Value = CControl.Value
counter1 = counter1 + 1
Next CControlName
Next CurrentTab
I hope this explains what is going on in the code. I just can't figure out why this keeps bombing out with a type mismatch (error 13).
The data DOES transfer. It goes through the entire code and every piece of data correctly gets transferred over. It bombs out at the end as if it goes through the code 1 last time when it shouldn't. I did confirm that every range is correct and doesn't contain any null values. The code bombs out on this line: Set CControl = CForm.Controls(CControlName) which is towards the bottom of the second loop.
Please help! I've spent weeks working with this code and had no luck. This exact code works in every other database I've worked with.
You are reading the name of the control (CControlName) from your Excel range and then using it to look up a control on the Access form with Set CControl = CForm.Controls(CControlName). Given that, the most likely explanation is that the name in CControlName doesn't actually exist on the Access form (perhaps a typo?).
In the VBA IDE, go under the Tools Menu, select Options and then select the General tab. Under the Error Trapping section, select the "Break on All Errors" option and click "OK" to set the preference. Run your code again; when an error is encountered VBA will stop processing on the line that caused the error. Check the value of CControlName and make sure it actually exists on the Access form.
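The existence check can also be done in code. A minimal sketch, assuming a helper I have called ControlExists (not part of the question's code); the form is taken ByVal As Object because CForm is declared As Object in the question.

```vba
'Returns True if the form has a control with the given name.
Function ControlExists(ByVal frm As Object, ByVal ctlName As String) As Boolean
    Dim ctl As Object

    On Error Resume Next
    Set ctl = frm.Controls(ctlName)
    ControlExists = (Err.Number = 0)     'read Err before the next On Error resets it
    On Error GoTo 0
End Function

'Possible use inside the inner loop, before the Set CControl line:
'    Debug.Print "[" & CControlName.Value & "]"
'    If Not ControlExists(CForm, CStr(CControlName.Value)) Then Stop
```

Printing the value between brackets also makes an empty cell or stray spaces at the end of the Excel range easy to spot.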

Copy email to the clipboard with Outlook VBA

How do I copy an email to the clipboard and then paste it into excel with the tables intact?
I am using Outlook 2007 and I want to do the equivalent of
"Click on email > Select All > Copy > Switch to Excel > Select Cell > Paste".
I have the Excel Object Model pretty well figured out, but have no experience in Outlook other than the following code.
Dim mapi As NameSpace
Dim msg As Outlook.MailItem
Set mapi = Outlook.Application.GetNamespace("MAPI")
Set msg = mapi.Folders.Item(1).Folders.Item("Posteingang").Folders.Item(1).Folders.Item(7).Items.Item(526)
I must admit I use this in Outlook 2003, but please see if it works in 2007 as well.
You can use the MSForms.DataObject to exchange data with the clipboard. In Outlook VBA, add a reference to the "Microsoft Forms 2.0 Object Library" and try this code (you can of course attach the Sub to a button, etc.):
Sub Test()
Dim M As MailItem, Buf As MSForms.DataObject
Set M = ActiveExplorer().Selection.Item(1)
Set Buf = New MSForms.DataObject
Buf.SetText M.HTMLBody
Buf.PutInClipboard
End Sub
After that, switch to Excel and press Ctrl-V - there we go!
If you also want to find the currently running Excel Application and automate even this, let me know.
There's always a valid HTMLBody, even when the mail was sent in Plain Text or RTF, and Excel will display all text attributes conveyed within HTMLBody incl. columns, colors, fonts, hyperlinks, indents etc. However, embedded images are not copied.
This code demonstrates the essentials, but doesn't check whether the selected item really is a MailItem. Making it work for calendar entries, contacts, etc. as well would require more coding.
It's enough if you have selected the mail in the list view, you don't even need to open it.
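The missing selection check could look something like this. It is a sketch along the lines of the Test sub above, with a made-up name (CopySelectedMailAsHtml); it only handles mail items and simply refuses anything else.

```vba
Sub CopySelectedMailAsHtml()
    Dim item As Object
    Dim buf As MSForms.DataObject

    'Nothing to do without an open explorer window and a selection
    If ActiveExplorer Is Nothing Then Exit Sub
    If ActiveExplorer.Selection.Count = 0 Then Exit Sub

    Set item = ActiveExplorer.Selection.Item(1)
    If Not TypeOf item Is Outlook.MailItem Then
        MsgBox "Please select an e-mail first."
        Exit Sub
    End If

    Set buf = New MSForms.DataObject
    buf.SetText item.HTMLBody
    buf.PutInClipboard
End Sub
```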
I finally picked it up again and completely automated it. Here are the basics of what I did to automate it.
Dim appExcel As Excel.Application
Dim Buf As MSForms.DataObject
Dim Shape As Excel.Shape
Dim mitm As MailItem
Dim itm As Object
Dim rws As Excel.Worksheet
'code to open excel
Set appExcel = VBA.GetObject(, "Excel.Application")
Set Buf = New MSForms.DataObject 'must exist before SetText is called
'...
'code to loop through emails here
Set mitm = itm
body = Replace(mitm.HTMLBody, "http://example.com/images/logo.jpg", "")
Call Buf.SetText(body)
Call Buf.PutInClipboard
Call rws.Cells(i, 1).PasteSpecial
For Each Shape In rws.Shapes
Shape.Delete 'this deletes the empty shapes
Next Shape
'next itm
I removed the logo urls to save time, and when you're dealing with 300 emails, that translates into at least ten minutes saved.
I got the code I needed from a TechRepublic article, and then changed it to suit my needs. Many thanks to the accepted answerer of this question for the clipboard code.
OK, so I will have to make certain assumptions because there is information missing from your question.
Firstly, you didn't say what mail format the message is in. HTML would be the easiest; the process will be different for RTF, and it is not possible in plain text.
Since you are referring to tables, I will assume they are HTML tables and the mail format is HTML.
Also, it is not clear from your question whether you want the table content pasted separately (one Excel cell per table cell) and the rest of the email's body text pasted into one cell or several.
Finally, you haven't really said whether you want the VBA running from Outlook or Excel (not that important, but it affects which intrinsic objects are available).
Anyway code sample:
Outlook code to access the htmlbody prop
Dim mapi As Namespace
Set mapi = Application.Session
Dim msg As MailItem
Set msg = mapi.Folders.Item(1).Folders.Item("Posteingang").Folders.Item(1).Folders.Item(7).Items.Item(526)
Dim strHTML as String
strHTML = msg.HTMLBody
' There is no object model collection for html tables within the htmlbody (which is a string of html) you will need to parse the html and collect the tables before inserting into Excel.
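One way to do that parsing without writing an HTML parser by hand is to load the string into a late-bound MSHTML document and walk its table elements. A rough sketch, assuming the "htmlfile" object is available on the machine; HtmlTablesToSheet, ws and firstRow are names I made up. Nested tables would be written twice, once as text of the outer cell and once on their own.

```vba
'Writes every <table> found in an HTML string to a worksheet, one cell per <td>/<th>.
Sub HtmlTablesToSheet(ByVal strHTML As String, ByVal ws As Object, ByVal firstRow As Long)
    Dim doc As Object, tbl As Object, tr As Object, td As Object
    Dim r As Long, c As Long

    Set doc = CreateObject("htmlfile")       'late-bound MSHTML document
    doc.body.innerHTML = strHTML

    r = firstRow
    For Each tbl In doc.getElementsByTagName("table")
        For Each tr In tbl.Rows
            c = 1
            For Each td In tr.Cells
                ws.Cells(r, c).Value = td.innerText
                c = c + 1
            Next td
            r = r + 1
        Next tr
    Next tbl
End Sub
```

With the variables from the snippet above it would be called as HtmlTablesToSheet strHTML, someWorksheet, 1, where someWorksheet is an Excel worksheet object you already hold.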
After a while again, I found another way. MailItem.Body is plain text, and has a tab character between table cells. So I used that. Here is the gist of what I did:
Sub Import()
Dim itms As Outlook.Items
Dim itm As Object
Dim i As Long, j As Long
Dim body As String
Dim mitm As Outlook.MailItem
For Each itm In itms
Set mitm = itm
ParseReports (mitm.body) 'uses the global var k
Next itm
End Sub
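Sub ParseReports(text As String)
'hws (the target worksheet) and k (the next output row) are assumed to be module-level variables
Dim table(1 To 1000, 1 To 11) As String 'I'm not expecting to see a thousand rows!
Dim drow(1 To 11) As String
Dim Row As Variant, Col As Variant
Dim i As Long, j As Long, n As Long
i = 1 'the arrays are 1-based
For Each Row In VBA.Split(text, vbCrLf)
j = 1
For Each Col In VBA.Split(Row, vbTab)
table(i, j) = Col
j = j + 1
Next Col
i = i + 1
Next Row
n = i - 1 'number of lines actually read
For i = 1 To n
For j = 1 To 11
drow(j) = table(i, j)
Next j
hws.Range(hws.Cells(k, 1), hws.Cells(k, 11)) = drow
k = k + 1
Next i
End Sub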
Average: 77 emails processed per second. I do some minor processing and extracting.
