Exporting pdf to excel - excel

In Adobe DC Pro, you have the option to export your pdf to excel 2003 spreadsheet by first going to "Export PDF", next by choosing "XML Spreadsheet 2003" and last by choosing the setting "Create Worksheet for each page."
I can't find any code that does this in excel vba.
My question is, how do I make excel complete this option through VBA? Just saving it as SaveAS FileFormat:=XlFileFormat.xlXMLSpreadsheet only makes it a xml spreadsheet, not "creating the worksheet for each page" that Adobe Pro makes.
Thanks

You can do it using the Adobe Acrobat 10.0 Type Library reference, that comes with Adobe Acrobat Pro.
Here is an example:
This code will open a PDF file and assign each PDF page to individual worksheets.
Option Explicit
Sub PDF_To_Excel()
Dim PDF_ As Acrobat.AcroPDDoc
Dim Hilight_Text As Acrobat.AcroHiliteList
Dim PDF_Page As Acrobat.AcroPDPage
Dim Page_Text As Acrobat.AcroPDTextSelect
Dim ws As Worksheet
Dim Count_Page As Long
Dim i As Long, j As Long, k As Long
Dim PDF_Text_Str As String
Dim Hold_Txt As Variant
Set PDF_ = New Acrobat.AcroPDDoc
Set Hilight_Text = New Acrobat.AcroHiliteList
Hilight_Text.Add 0, 32767
With PDF_
.Open "C:\ED5049PX2.pdf"
Count_Page = .GetNumPages
For i = 1 To Count_Page
PDF_Text_Str = ""
Set PDF_Page = .AcquirePage(i - 1)
Set Page_Text = PDF_Page.CreateWordHilite(Hilight_Text)
If Not Page_Text Is Nothing Then
With Page_Text
For j = 0 To .GetNumText - 1
PDF_Text_Str = PDF_Text_Str & .GetText(j)
Next j
End With
End If
Set ws = Worksheets.Add(, Worksheets(Sheets.Count))
With ws
.Name = "Page-" & i
If PDF_Text_Str <> "" Then
Hold_Txt = Split(PDF_Text_Str, vbCrLf)
For k = 0 To UBound(Hold_Txt)
PDF_Text_Str = CStr(Hold_Txt(k))
If Left(PDF_Text_Str, 1) = "=" Then PDF_Text_Str = "'" & PDF_Text_Str
.Cells(k + 1, 1).Value = PDF_Text_Str
Next k
Else
.Cells(1, 1).Value = "No text found in page " & i
End If
.Cells.Select
.Cells.EntireColumn.AutoFit
End With
Next
.Close
End With
MsgBox ("Done")
End Sub

I'm still have a problem with Page_Text is Nothing. Even though I have a pdf open, it seems like it doesn't read the Page_Text. Here is the code:
.Open "C:\User\test.pdf"
Count_Page=.GetNumPages
For I=1 to Count_Page
PDF_Text_Str=""
Set PDF_Page=.AcquirePage(i-1)
Set Page_Text=PDF_Page.CreateWordHilite(Hilight_Text)
If Not Page_Text is Nothing then
With Page_Text
For j=0 to .GetNumText -1
PDF_Text_Str=PDF_Text_Str & .GetText(j)
Next j
End With
End if
Thanks

Related

Alternative to Send Keys - copy and pasting Excel data to Notepad

Essentially I am trying use VBA to copy data from Excel and paste values in a new notepad file. I was able to create a successful macro using sendkeys, but am trying to avoid this I know it is not the most reliable. I have tried using createobject to create and write to temp files but no luck. I am getting a "permission denied" error. Once the Notepad file is generated, I do not want to save it down, just leave it open for the user to review the data and they can save as they need. Any suggestions on alternatives? Code I have been trying is below.
Application.ScreenUpdating = False
Dim formulasheet As Worksheet
Dim copysheet As Worksheet
Dim num As Integer
Dim valuecolumn As Range, cell As Object
Dim copycolumn As Range
Dim i As Range
Dim strTempFile As String
Dim strData As String
num = 0
Set formulasheet = ActiveWorkbook.Sheets("Template")
Set copysheet = ActiveWorkbook.Sheets("Copy")
Set valuecolumn = formulasheet.Range("B:B")
Set copycolumn = formulasheet.Range("A:A")
formulasheet.Visible = xlSheetVisible
copysheet.Visible = xlSheetVisible
copysheet.Cells.Clear
formulasheet.Select
For Each i In valuecolumn
If i.Value > 0 Then
i.Offset(0, -1).Copy
copysheet.Select
copysheet.Range("A1").End(xlUp).Offset(num, 0).PasteSpecial Paste:=xlPasteValues
num = num + 1
End If
Next i
copysheet.Cells.WrapText = False
If copysheet.Range("A1") = "" Then
MsgBox "No transaction amounts, please review."
copysheet.Visible = xlSheetHidden
formulasheet.Visible = xlSheetHidden
Exit Sub
Else
copysheet.Select
strData = copysheet.Range("A:A").SpecialCells(xlCellTypeConstants).Copy
strTempFile = "C:\temp.txt"
With CreateObject("Scripting.FileSystemObject")
.CreateTextFile(strTempFile, False).Write strData
End With
Shell "cmd /c ""notepad.exe """ & strTempFile & """", vbHide
End If
SendKeys "{NUMLOCK}", True
copysheet.Visible = xlSheetHidden
formulasheet.Visible = xlSheetHidden

Exiting Word Table after pasting from Excel

I'm trying to copy information from an excel sheet to a new word document. Currently everything copies correctly on the first loop, but pastes into the previously pasted table in the next loop. I've tried every variation of ways to exit the table I can find through searching and none seem to fix the issue. Hoping someone can help.
Sub createWord()
Dim objWord
Dim objDoc
Dim heading As New DataObject
Dim fileName As String
Dim tableRange As Word.Range
Set objWord = CreateObject("Word.Application")
Set objDoc = objWord.Documents.Add()
fileName = ActiveWorkbook.Name
fileName = Left$(fileName, InStrRev(fileName, ".") - 1) & " Data.doc"
'objDoc.SaveAs fileName:=ThisWorkbook.Path & "\" & fileName
objWord.Visible = True
For i = 4 To Application.Sheets.Count
Dim k As Integer
k = ((i - 4) * 4) + 1
heading.SetText ThisWorkbook.Worksheets(i).Cells(1, 4).Value
heading.PutInClipboard
objDoc.Paragraphs.Add
objDoc.Paragraphs(k).Range.Paste
k = k + 1
Call copyGraphAuto(i)
objDoc.Paragraphs.Add
objDoc.Paragraphs(k).Range.Paste
k = k + 1
heading.SetText ThisWorkbook.Worksheets(i).Cells(24, 5).Value
heading.PutInClipboard
objDoc.Paragraphs.Add
objDoc.Paragraphs(k).Range.Paste
k = k + 1
Call copyTableAuto(i)
objDoc.Paragraphs.Add
objDoc.Paragraphs(k).Range.Paste
Set tableRange = objDoc.Tables(k - 3).Range
tableRange.Collapse Direction:=wdCollapseEnd
'Exit For
Next i
End Sub
Sub copyTableAuto(Optional ByVal sheetNumber As Integer)
Dim ppmCount As Integer
If sheetNumber = 0 Then sheetNumber = ThisWorkbook.ActiveSheet.Index
ppmCount = Worksheets(sheetNumber).Range("M4:M9").Cells.SpecialCells(xlCellTypeConstants).Count
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Worksheets(sheetNumber).Range("E29:E" & CStr(ppmCount + 28)).Merge
Worksheets(sheetNumber).Range("E25:I" & CStr(ppmCount + 28)).Copy
End Sub
Thanks
The issue is caused by your attempt to maintain an index of the paragraphs in the document.
As you are adding data to the document consecutively it would be better, and simpler, to use Word's own index and work with:
objDoc.Paragraphs.Last.Range

Automatically Transfer from Excel to a Tab Delimited List in Word

I have a table in excel which has the data I would like to transfer to a word document. Based on which column the values are in I am trying to put the data into a different tabbed order (Ex: List Level 1 is initial list, List Level 2 is pressing tab once in list).
I am trying to do this by recognizing a cell on a previous sheet and the code I have so far works to get the word document open but in order to actually bring in the data I can't seem to figure it out.
My current code is show below (I have the word document "Template.docx" in the same folder:
Private Sub CreateList()
Dim WRD As Object, DOC As Object
On Error Resume Next
Set WRD = CreateObject("Word.Application")
If Err.Number <> 0 Then
Set WRD = CreateObject("Word.Application")
End If
On Error GoTo 0
Set DOC = WRD.Documents.Open(ThisWorkbook.Path &
"\Template.docx", ReadOnly:=True)
WRD.Visible = True
If Sheet1.Range("A1").Value = "Package 1" Then
With DOC
' INSERT DATA FROM EXCEL INTO A TAB DELIMITED LIST
End With
End If
Set WRD = Nothing
Set DOC = Nothing
End Sub
You refer to a tab-delimited list in Word, but your pic depicts something that would ordinarily be dealt with as paragraph headings in Word.
Assuming you really want headings and that your Word document employs Word's Heading Styles with multi-level list-numbering correctly, you could use something like:
Sub CreateList()
'Note: A reference to the Word library must be set, via Tools|References
Dim wdApp As New Word.Application, wdDoc As Word.Document
Dim xlSht As Worksheet, sPath As String, LRow As Long, LCol As Long, r As Long, c As Long
sPath = ActiveWorkbook.Path: Set xlSht = ActiveSheet
With xlSht.Cells.SpecialCells(xlCellTypeLastCell)
LRow = .Row: LCol = .Column: If LCol > 9 Then LCol = 9
End With
With wdApp
.Visible = False
Set wdDoc = .Documents.Open(Filename:=sPath & "\Template.docx", AddToRecentFiles:=False, ReadOnly:=True, Visible:=True)
With wdDoc
For r = 2 To LRow
For c = 1 To LCol
If xlSht.Cells(r, c).Value <> "-" Then
.Characters.Last.InsertBefore xlSht.Cells(r, c).Value & vbCr & vbCr
.Characters.Last.Previous.Previous.Style = "Heading " & c
End If
Next
Next
End With
.Visible = True
End With
Set wdDoc = Nothing: Set wdApp = Nothing: Set xlSht = Nothing
End Sub
If you're wedded to using list-level numbering, you could replace the:
If xlSht.Cells(r, c).Value <> "-" Then
...
End If
code block with something like:
If xlSht.Cells(r, c).Value <> "-" Then
.Characters.Last.InsertBefore xlSht.Cells(r, c).Value & vbCr & vbCr
With .Paragraphs(.Paragraphs.Count - 2).Range.ListFormat
.ApplyListTemplateWithLevel ListTemplate:=ListGalleries(wdOutlineNumberGallery).ListTemplates(2), _
ApplyTo:=wdListApplyToWholeList, DefaultListBehavior:=wdWord9ListBehavior
.ListLevelNumber = c
End With
End If
and insert:
For c = 1 To LCol ' or 9 for all possible levels
.ListTemplates(2).ListLevels(c).TextPosition = InchesToPoints(c * 0.5 - 0.5)
.ListTemplates(2).ListLevels(c).ResetOnHigher = True
Next
after the existing final 'Next'.
If the above doesn't provide the list numbering format you want, you will need to choose the appropriate ListGallery (from wdBulletGallery, wdNumberGallery, or wdOutlineNumberGallery) and the and ListTemplate number.

Looping Through PDF Files

I have a working VBA script that pulls specific form fields from a specified PDF file into a spreadsheet. However I have several hundred PDFs that I need to do this for, so I'd like to loop through files in a directory and perform the same action.
Conveniently I have an old VBA script that loops through Word files in a directory and imports the content of each just how I'd like.
I hardly know VBA but I've adapted scripts in several language including VBA to meet my needs. I thought this would take 10 minutes but its taken several hours. Can somebody please look at my script below and tell me where I'm going wrong? I assume it has something to do with the Word and Acrobat libraries having different requirements, but even my loop isn't displaying the test message.
PS I have Acrobat Pro installed.
My Script (Non-Working)
Private Sub CommandButton1_Click()
Dim f As String: f = "C:\temp\ocopy"
Dim s As String: s = Dir(f & "*.pdf")
Dim AcroApp As Acrobat.CAcroApp
Dim theForm As Acrobat.CAcroPDDoc
Dim jso As Object
Dim text1, text2 As String
Dim col As Integer: col = 1
Do Until s = ""
Set AcroApp = CreateObject("AcroExch.App")
Set theForm = CreateObject("AcroExch.PDDoc")
theForm.Open (f & s)
Set jso = theForm.GetJSObject
text1 = jso.getField("Name of serviceRow1").Value
text2 = jso.getField("Who are the key contacts?").Value
MsgBox text1
MsgBox "text1"
Sheet1.Cells(col, 1).Value = text1
Sheet1.Cells(col, 2).Value = text2
col = col + 1: s = Dir
theForm.Close
AcroApp.Exit
Set AcroApp = Nothing
Set theForm = Nothing
Loop
End Sub
Word Script - Works at Looping and Importing
Sub fromWordDocsToMultiCols()
Dim f As String: f = "C:\temp\Test\"
Dim s As String: s = Dir(f & "*.docx")
Dim wdApp As New Word.Application, wdDoc As Word.Document
Dim col As Integer: col = 1
On Error GoTo errHandler
Do Until s = ""
Set wdDoc = wdApp.Documents.Open(f & s)
wdDoc.Range.Copy
Sheet1.Cells(1, col).Value = s
Sheet1.Cells(2, col).PasteSpecial xlPasteValues
wdDoc.Close False: col = col + 1: s = Dir
Loop
errHandler:
If Err.Number <> 0 Then MsgBox Err.Description
If Not wdApp Is Nothing Then wdApp.Quit False
End Sub
Acrobat Script - Works as Importing One-by-One
Private Sub CommandButton1_Click()
Dim AcroApp As Acrobat.CAcroApp
Dim theForm As Acrobat.CAcroPDDoc
Dim jso As Object
Dim text1, text2 As String
Set AcroApp = CreateObject("AcroExch.App")
Set theForm = CreateObject("AcroExch.PDDoc")
theForm.Open ("C:\temp\ocopy\Minerals asset management.pdf")
Set jso = theForm.GetJSObject
' get the information from the form fiels Text1 and Text2
text1 = jso.getField("Name of serviceRow1").Value
text2 = jso.getField("Who are the key contacts within the team for this service? Please provide one contact per region").Value
Sheet1.Cells(1, 1).Value = text1
Sheet1.Cells(1, 2).Value = text2
theForm.Close
AcroApp.Exit
Set AcroApp = Nothing
Set theForm = Nothing
End Sub
Many thanks in advance.

How to extract email message and fill spreadsheet with values

I have a macro that reads the unread messages in my inbox and extracts the data from the message with a delimiter of ":" . In the loop I want to be able to load the new excel spreadsheet with the values from the message.
I am able to select the first cell and save the data but it is getting over written. each time in the loop I want the data to go to the next cell in the column that is empty instead of overwriting the same cell.
Here is my code so far...
Public Sub Application_NewMail()
Dim newbk As Workbook
Set newbk = Workbooks.Add
newbk.SaveAs "C:\Users\RickG\Desktop\test2.xlsx" 'other parameters can be set here if required
' perform operations on newbk
newbk.Close savechanges:=True
Dim ns As Outlook.NameSpace
Dim InBoxFolder As MAPIFolder
Dim InBoxItem As Object 'MailItem
Dim Contents As String, Delimiter As String
Dim Prop, Result
Dim i As Long, j As Long, k As Long
'Setup an array with all properties that can be found in the mail
Prop = Array("Name", "Email", "Phone", "Customer Type", _
"Message")
'The delimiter after the property
Delimiter = ":"
Set ns = Session.Application.GetNamespace("MAPI")
'Access the inbox folder
Set InBoxFolder = ns.GetDefaultFolder(olFolderInbox)
Dim xlApp As Excel.Application
Dim xlWB As Excel.Workbook
Dim ws As Worksheet
Set xlApp = New Excel.Application
With xlApp
.Visible = False
Set xlWB = .Workbooks.Open("C:\Users\RickG\Desktop\test2.xlsx", , False)
Set ws = .Worksheets("Sheet1")
End With
Dim LR As Long
For Each InBoxItem In InBoxFolder.Items
'Only process mails
If Not TypeOf InBoxItem Is MailItem Then GoTo SkipItem
'Skip wrong subjects
If InStr(1, InBoxItem.Subject, "FW: New Lead - Consumer - Help with Medical Bills", vbTextCompare) = 0 Then GoTo SkipItem
'Already processed?
If Not InBoxItem.UnRead Then GoTo SkipItem
'Mark as read
InBoxItem.UnRead = False
'Get the body
Contents = InBoxItem.Body
'Create space for the result
ReDim Result(LBound(Prop) To UBound(Prop)) As String
'Search each property
i = 1
For k = LBound(Prop) To UBound(Prop)
'Find the property (after the last position)
i = InStr(i, Contents, Prop(k), vbTextCompare)
If i = 0 Then GoTo NextProp
'Find the delimiter after the property
i = InStr(i, Contents, Delimiter)
If i = 0 Then GoTo NextProp
'Find the end of this line
j = InStr(i, Contents, vbCr)
If j = 0 Then GoTo NextProp
'Store the related part
Result(k) = Trim$(Mid$(Contents, i + Len(Delimiter), j - i - Len(Delimiter)))
'for every row, find the first blank cell and select it
'MsgBox Result(k)
LR = Range("A" & Rows.Count).End(xlUp).Row
Range("A" & LR).Value = Result(k)
'Update the position
i = j
NextProp:
Next
xlApp.DisplayAlerts = False
xlWB.SaveAs ("C:\Users\RickG\Desktop\test2.xlsx")
xlWB.Close
xlApp.Quit
If MsgBox(Join(Result, vbCrLf), vbOKCancel, "Auto Check In") = vbCancel Then Exit Sub
SkipItem:
Next
End Sub
You're not tracking your loop correctly. If you change
Range("A" & LR).Value = Result(k)
to
Range("A" & LR + 1).Value = Result(k)
in your
For k = LBound(Prop) To UBound(Prop)
loop, that should correct your issue.
EDIT: Sorry, findwindow. I didn't see the comment thread below the question. I just saw that the question had no answer yet.

Resources