I have multiple files in a folder and each of them have different password (which I have with me). Every week, I need to open the file using the password and save it without password (for an automation to pick the files).
I am using the below code which does the job. But it's slow and takes a few mins even if it's less than 10 files. Anything wrong with the code?
Please help!
Sub blah()
Set objFSO = CreateObject("Scripting.FileSystemObject")
mFolder = "folder path"
Set mainfolder = objFSO.GetFolder(mFolder)
Dim xValue As String
Dim OutValue As String
Dim PW As String
For Each fil In mainfolder.Files
PW = ""
xValue = ""
OutValue = ""
xValue = fil.Name
For xIndex = 1 To VBA.Len(xValue)
If Not VBA.IsNumeric(VBA.Mid(xValue, xIndex, 1)) Then
OutValue = OutValue & VBA.Mid(xValue, xIndex, 1)
End If
Next
TextOnly = Split(OutValue, ".")
If LCase(Left(TextOnly(UBound(TextOnly)), 4)) = "xlsx" Then
PW = Replace(OutValue, ".xlsx", "taxipw")
ElseIf LCase(Left(TextOnly(UBound(TextOnly)), 3)) = "xls" Then
PW = Replace(OutValue, ".xls", "taxipw")
End If
Set myWB = Workbooks.Open(Filename:=fil, Password:=PW)
myWB.Password = ""
myWB.Close True
Next fil
End Sub
Related
I am trying to select a folder and then assign that path to my file Variant. However, the code stops after the folder is selected and does not go to the next step. What can be wrong? The next step would be 'If selected_folder <> "" Then' but it just stops and debugger goes back to Sub.
Sub sheetCompare2()
Application.ScreenUpdating = False
Dim i As Integer
Dim WS_Count As Integer
Dim mDirs As String
Dim path As String
Dim OutFile As Variant, SrcFile As Variant
Dim file As Variant
Dim wb As Workbook
Dim datevar As Variant
Dim datevar2 As Variant
Dim selected_folder As String
Set wb = ThisWorkbook
WS_Count = ActiveWorkbook.Worksheets.Count
OutFile = ActiveWorkbook.Name
With Application.FileDialog(msoFileDialogFolderPicker)
.Title = "Select a Folder"
If Application.FileDialog(msoFileDialogFolderPicker).Show = -1 Then
selected_folder = .SelectedItems(1)
End If
End With
If selected_folder <> "" Then
file = Dir(selected_folder)
End If
While (file <> "")
path = selected_folder + file
Workbooks.Open (path)
SrcFile = ActiveWorkbook.Name
datevar = Right(file, 9)
datevar2 = Left(datevar, 4)
....'and so on
End Sub
Please, try changing this part of the code:
If selected_folder <> "" Then
file = Dir(selected_folder)
End If
While (file <> "")
path = mDirs + file
Workbooks.Open (path)
SrcFile = ActiveWorkbook.Name
datevar = Right(file, 9)
datevar2 = Left(datevar, 4)
'....'and so on
with:
If selected_folder <> "" Then
File = Dir(selected_folder & "\" & "*.xls*")
End If
Do While File <> ""
path = selected_folder & "\" & File
Workbooks.Open (path)
SrcFile = ActiveWorkbook.Name
datevar = Right(File, 9)
datevar2 = left(datevar, 4)
File = Dir
'...
Loop
In order to make Dir return a file name, you must set an extension. At least *.* for iterate between all files. But, wanting to open the files in Excel, it is good to use "*.xls*", as the above code does.
Then, the path of the file to be open should be built as above.
Your code does not show how you redefine File in order to make the loop working. You maybe have File = Dir before the loop end. If not, I added...
I'm reading an XML file in VBA and then using that to extract some information to create another Output XML file. I only need to generate amn output file when the fields in the initial XML file are filled in.
Currently when it gets to an empty tag it fails. I've tried some of the empty array solutions I've found on here but they don't seem to work. Is my code that bad??
Private Sub gDialer()
For i = 7 To 30000
If Sheets("Latest Report").Cells(i, "AL").Value = "Active - G to sign" Then
If Sheets("Latest Report").Cells(i, "AJ").Value <> "" Then
GenerateReadXML "Z:\AReports\A\Reports\readXML\gRead" & i & ".xml", Sheets("Latest Report").Cells(i, "AJ").Value
sendXML "readXML\gRead" & i & ".xml", "gRead" & i & "response", "Read"
Set oXMLFile = CreateObject("Microsoft.XMLDOM")
XMLFileName = "Z:\AReports\A\Reports\fActivities\results\gRead" & i & "response.xml"
'MsgBox XMLFileName
oXMLFile.Load (XMLFileName)
Set Numnode = oXMLFile.SelectNodes("/result/data29/text()")
Set titleNode = oXMLFile.SelectNodes("/result/data25/text()")
Set firstNode = oXMLFile.SelectNodes("/result/data26/text()")
Set lastNode = oXMLFile.SelectNodes("/result/data27/text()")
On Error GoTo didntfindit
Dim numStr As String
Dim titleStr As String
Dim firstStr As String
Dim lastStr As String
'### here's where it fails ###
numStr = Numnode(0).NodeValue
titleStr = titleNode(0).NodeValue
firstStr = firstNode(0).NodeValue
lastStr = lastNode(0).NodeValue
GenerateDialerXML "Z:\AReports\A\Reports\dialerXML\dialer" & i & ".xml", Numnode(0).NodeValue, titleNode(0).NodeValue, firstNode(0).NodeValue, lastNode(0).NodeValue, Sheets("Latest Report").Cells(i, "AJ").Value, 87
sendDialerXML "dialer" & i & ".xml", sendHandshake, "gDResponse"
didntfindit:
End If
End If
Next i
End Sub
When it fails, it's because the XML is like this:
<data24>something</data24>
</data25>
</data26>
</data27>
<data28>something else</data28>
</data29>
It works fine when there are values, but I can't figure out the next step.
Thanks in advance!
I've fixed it now, here's what I came up with, hope this helps someone one day:
oXMLFile.Load (XMLFileName)
Set numNode = oXMLFile.getElementsByTagName("data29")
Set titleNode = oXMLFile.getElementsByTagName("data25")
Set firstNode = oXMLFile.getElementsByTagName("data26")
Set lastNode = oXMLFile.getElementsByTagName("data27")
On Error GoTo didntfindit
Dim numStr As String
Dim titleStr As String
Dim firstStr As String
Dim lastStr As String
numStr = numNode(0).Text
titleStr = titleNode(0).Text
firstStr = firstNode(0).Text
lastStr = lastNode(0).Text
I've written a macro that downloads zip files containing CSVs from a website. The downloading and unzipping is going perfectly, however when I try to loop through the CSVs searching for the occurrence of a specific string, the macro simply quits after opening about a thousand. There is no error message, it simply stops working, leaving the last CSV it was working on open.
Here is my code:
Sub OpenSearch()
Dim ROW, j As Integer
Workbooks.Open Filename:=FileNameFolder & FileListCSV(i)
For j = 1 To 7
ROW = 3
Do Until IsEmpty(Cells(ROW, 6))
If Cells(ROW, 6) = WantedID(j, 1) Then
MsgBox "WE HAVE A MATCH!"
End If
ROW = ROW + 1
Loop
Next j
Workbooks(FileListCSV(i)).Close False
Kill FileNameFolder & FileListCSV(i)
End Sub
I did not include the main module that calls the this sub and downloads and unzips the files, because on its own, that works perfectly. It only stops working when the sub I copied here is being called.
The Filename comes from a public variable defined in the main module, WantedID contains the strings I need to find in the CSVs.
I've tried to put Application.Wait in the first line, but it did not solve the problem. Also how far the macro gets is completely random. It never stops after the same number of CSVs opened and closed.
UPDATE: Here is the code (parent sub) for the downloading and unzipping. I did not come up with this on my own, but copied it from an online source I cannot recall:
Public FileListCSV(1 To 288) As String
Public i As Integer
Public FileNameFolder As Variant
Public WantedID As Variant
Sub DownloadandUnpackFile()
Dim myURL As String
Dim YearNUM As Integer
Dim MonthNUM As Integer
Dim StarMonth, EndMonth As Integer
Dim DayNUM As Integer
Dim YearSTR As String
Dim MonthSTR As String
Dim DaySTR As String
Dim FixURL As String
Dim TargetFileName As String
Dim FSO As Object
Dim oApp As Object
Dim Fname As Variant
Dim DefPath As String
Dim strDate As String
Dim StrFile As String
Dim FileList(1 To 288) As String
Application.ScreenUpdating = False
FixURL = "http://www.nemweb.com.au/REPORTS/ARCHIVE/Dispatch_SCADA PUBLIC_DISPATCHSCADA_"
WantedID = Range(Cells(2, 1), Cells(8, 1))
YearNUM = 2016
StarMonth = 12
EndMonth = 12
For YearNUM = 2015 To 2016
For MonthNUM = StarMonth To EndMonth
For DayNUM = 1 To 31
YearSTR = CStr(YearNUM)
If MonthNUM < 10 Then
MonthSTR = "0" & CStr(MonthNUM)
Else:
MonthSTR = CStr(MonthNUM)
End If
If DayNUM < 10 Then
DaySTR = "0" & CStr(DayNUM)
Else:
DaySTR = CStr(DayNUM)
End If
myURL = FixURL & YearSTR & MonthSTR & DaySTR & ".zip"
Cells(1, 1) = myURL
Dim WinHttpReq As Object
Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
WinHttpReq.Open "GET", myURL, False
WinHttpReq.Send
myURL = WinHttpReq.ResponseBody
If WinHttpReq.Status = 200 Then
Set oStream = CreateObject("ADODB.Stream")
oStream.Open
oStream.Type = 1
oStream.Write WinHttpReq.ResponseBody
TargetFileName = "C:\Users\istvan.szabo\Documents \Basslink\AEMO RAW DATA\RAWRAW\" & YearSTR & MonthSTR & DaySTR & ".zip"
oStream.SaveToFile (TargetFileName)
oStream.Close
End If
'try unzippin'
Fname = TargetFileName
If Fname = False Then
'Do nothing
Else
'Root folder for the new folder.
'You can also use DefPath = "C:\Users\Ron\test\"
DefPath = Application.DefaultFilePath
If Right(DefPath, 1) <> "\" Then
DefPath = DefPath & "\"
End If
FileNameFolder = "C:\Users\istvan.szabo\Documents\Basslink\AEMO RAW DATA\RAWRAW\" & YearSTR & MonthSTR & DaySTR
'Make the normal folder in DefPath
MkDir FileNameFolder
'Extract the files into the newly created folder
Set oApp = CreateObject("Shell.Application")
oApp.Namespace(FileNameFolder).CopyHere oApp.Namespace(Fname).items
On Error Resume Next
Set FSO = CreateObject("scripting.filesystemobject")
FSO.deletefolder Environ("Temp") & "\Temporary Directory*", True
i = 1
StrFile = Dir(FileNameFolder & "\")
Do While Len(StrFile) > 0
FileList(i) = FileNameFolder & "\" & StrFile
FileListCSV(i) = Left(StrFile, Len(StrFile) - 3) & "csv"
StrFile = Dir
i = i + 1
Loop
'unzip the unzipped
For i = 1 To 288
Fname = FileList(i)
If Fname = False Then
'Do nothing
Else:
DefPath = Application.DefaultFilePath
If Right(DefPath, 1) <> "\" Then
DefPath = DefPath & "\"
End If
FileNameFolder = "C:\Users\istvan.szabo\Documents\Basslink\AEMO RAW DATA\"
Set oApp = CreateObject("Shell.Application")
oApp.Namespace(FileNameFolder).CopyHere oApp.Namespace(Fname).items
On Error Resume Next
Set FSO = CreateObject("scripting.filesystemobject")
FSO.deletefolder Environ("Temp") & "\Temporary Directory*", True
Call OpenSearch
End If
Next i
End If
Next DayNUM
Next MonthNUM
StarMonth = 1
EndMonth = 5
Next YearNUM
Application.ScreenUpdating = True
End Sub
You could check the file without opening it. That would save you time and resources. Here is a quick draw of the code I would use:
Sub OpenSearch()
Dim ROW, j As Integer
Dim fileID
Dim buf As String
Dim tmp As Variant
Open FileNameFolder & FileListCSV(i) For Input As #1
For j = 1 To 7
ROW = 3
Do Until EOF(1)
Line Input #1, buf
'Remove double quotes
buf = Replace(buf, """", "")
'Split line to a array
tmp = Split(buf, ",")
'5 is the 6th column in excel tmp index starts with 0
fileID = tmp(5)
If fileID = WantedID(j, 1) Then
MsgBox "WE HAVE A MATCH!"
End If
ROW = ROW + 1
Loop
Next j
Close #1
Kill FileNameFolder & FileListCSV(i)
End Sub
EDIT: Also try to add a resource cleanup code, for example: Set WinHttpReq = Nothing, Set oStream = Nothing etc.
In line with other advice in the comments: -
You should close of resources when you are done with them using Set WinHttpReq = Nothing for example. This can avoid memory problems that are similar to the issue you are seeing.
It is also advisable to remove On Error Resume Next. This is hiding errors and you may well be missing results that you need. It would also allow for more information during errors.
I took your two code blocks and wrote them into one that I believe will be stable during running and make it to the end, Run this and let us know if it did resolve the issue. I did it this way as there was a lot of small changes that went towards what I suspect will be more stable and make it to the end.
Sub DownloadandUnpackFile()
Dim FSO As New FileSystemObject
Dim DteDate As Date
Dim Fl As File
Dim Fl_Root As File
Dim Fldr As Folder
Dim Fldr_Root As Folder
Dim LngCounter As Long
Dim LngCounter2 As Long
Dim oApp As Object
Dim oStream As Object
Dim oWinHttpReq As Object
Dim RngIDs As Range
Dim StrURL As String
Dim StrRootURL As String
Dim VntFile As Variant
Dim VntFolder As Variant
Dim VntRootFile As Variant
Dim VntRootFolder As Variant
Dim WkBk As Workbook
Dim WkSht As Worksheet
'This will speed up processing, but you might not see progress while it is working
Application.ScreenUpdating = False
'Set variables
StrRootURL = "http://www.nemweb.com.au/REPORTS/ARCHIVE/Dispatch_SCADA/PUBLIC_DISPATCHSCADA_"
'You should be a little more explicit here for clarity, refernce a worksheet
'E.g. StrID = ThisWorkbook.Worksheets("Sheet1").Range(Cells(2, 1), Cells(8, 1))
Set RngIDs = Range(Cells(2, 1), Cells(8, 1))
Set oWinHttpReq = CreateObject("Microsoft.XMLHTTP")
Set oApp = CreateObject("Shell.Application")
'Loop from 21/Feb/2015 to today
For DteDate = CDate("21/Feb/2015") To Date
StrURL = StrRootURL & Format(DteDate, "YYYYMMDD") & ".zip"
Debug.Print StrURL
oWinHttpReq.Open "GET", StrURL, False
oWinHttpReq.Send
StrURL = oWinHttpReq.ResponseBody
If oWinHttpReq.Status = 200 Then
Set oStream = CreateObject("ADODB.Stream")
oStream.Open
oStream.Type = 1
oStream.Write oWinHttpReq.ResponseBody
VntRootFile = Environ("UserProfile") & "\Documents\Basslink\AEMO RAW DATA\RAWRAW\" & Format(DteDate, "YYYYMMDD") & ".zip"
oStream.SaveToFile VntRootFile
oStream.Close
Set oStream = Nothing
VntRootFolder = Environ("UserProfile") & "\Documents\Basslink\AEMO RAW DATA\RAWRAW\" & Format(DteDate, "YYYYMMDD") & "\"
FSO.CreateFolder VntRootFolder
oApp.Namespace(VntRootFolder).CopyHere oApp.Namespace(VntRootFile).Items
Set Fldr_Root = FSO.GetFolder(VntRootFolder)
'Unzip the zipped zips
For Each Fl_Root In Fldr_Root.Files
If Right(LCase(Fl_Root.Name), 4) = ".zip" Then
VntFolder = Fl_Root.ParentFolder & "\" & Left(Fl_Root.Name, Len(Fl_Root.Name) - 4) & "\"
FSO.CreateFolder VntFolder
VntFile = Fl_Root.Path
oApp.Namespace(VntFolder).CopyHere oApp.Namespace(VntFile).Items
Set Fldr = FSO.GetFolder(VntFolder)
For Each Fl In Fldr.Files
If Right(LCase(Fl.Name), 4) = ".csv" Then
Set WkBk = Application.Workbooks.Open(Fl.Path)
Set WkSht = WkBk.Worksheets(1)
For LngCounter = 1 To RngIDs.Rows.Count
LngCounter2 = 1
Do Until WkSht.Cells(LngCounter2, 6) = ""
If WkSht.Cells(LngCounter2, 6) = RngIDs.Cells(LngCounter, 1) Then
Debug.Print "FOUND: " & Fl.Name & ": " & WkSht.Cells(LngCounter2, 6).Address
End If
LngCounter2 = LngCounter2 + 1
Loop
Next
Set WkSht = Nothing
WkBk.Close 0
Set WkBk = Nothing
End If
DoEvents
Next
Set Fldr = Nothing
End If
Next
Fldr_Root.Delete True
Set Fldr_Root = Nothing
FSO.DeleteFile VntRootFile, True
End If
DoEvents
Next
Set oApp = Nothing
Set oWinHttpReq = Nothing
Set RngIDs = Nothing
Application.ScreenUpdating = True
End Sub
Changes I have made: -
I used early binding to FileSystemObject simply to make it easier
to write up. You will need the 'Windows Scripting Runtime' reference
added (Tools > References > tick 'Windows Scripting Runtime')
I iterated through dates as a single loop rather then three loops of
strings working as a date
I set IDs to be a range and note a variant
I opened references once, reuse them (i.e. oApp), and then close
them
I added DoEvents to give time back to the computer to run anything it
may need to, this maintains a health system.
I used Debug.Print to add information to the immediate window instead
of msgbox, but you should really list the finds out in a separate
worksheet, (debug.print has a size limit so you may end up only
seeing X number of results as others are truncated off.
I have the following code, which looks at each cell in column A of my spreadsheet, searches for the text it finds there in the specified PDF and then extracts the page where it finds the text as a PDF, naming it with the value in the cell of the spreadsheet. The code works but is rather slow, I may need to search for as many as 200 words in a PDF which could be as long as 600 pages. Is there a way to make the code faster? Currently it loops through each cell searches through each page looping through each word until it finds the word in the cell.
Sub test_with_PDF()
Dim objApp As Object
Dim objPDDoc As Object
Dim objjso As Object
Dim wordsCount As Long
Dim page As Long
Dim i As Long
Dim strData As String
Dim strFileName As String
Dim lastrow As Long, c As Range
Dim PageNos As Integer
Dim newPDF As Acrobat.CAcroPDDoc
Dim NewName As String
Dim Folder As String
lastrow = Sheets("Sheet1").Cells(Rows.Count, "A").End(xlUp).Row
strFileName = selectFile()
Folder = GetFolder()
Set objApp = CreateObject("AcroExch.App")
Set objPDDoc = CreateObject("AcroExch.PDDoc")
'AD.1 open file, if =false file is damage
If objPDDoc.Open(strFileName) Then
Set objjso = objPDDoc.GetJSObject
PageNos = 0
For Each c In Sheets("Sheet1").Range("A2:A" & lastrow)
For page = 0 To objPDDoc.GetNumPages - 1
wordsCount = objjso.GetPageNumWords(page)
For i = 0 To wordsCount
If InStr(1, c.Value, ", ") = 0 Then
If objjso.getPageNthWord(page, i) = c.Value Then
PageNos = PageNos + 1
If FileExist(Folder & "\" & c.Offset(0, 4) & ".pdf") Then
Set newPDF = CreateObject("AcroExch.pdDoc")
NewName = Folder & "\" & c.Offset(0, 4) & ".pdf"
newPDF.Open (NewName)
newPDF.InsertPages lngPages, objPDDoc, page, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Exit For
Else
Set newPDF = CreateObject("AcroExch.PDDoc")
newPDF.Create
NewName = Folder & "\" & c.Offset(0, 4) & ".pdf"
newPDF.InsertPages -1, objPDDoc, page, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Exit For
End If
End If
Else
If objjso.getPageNthWord(page, i) = c.Offset(0, 1).Value Then
If objjso.getPageNthWord(page, i + 1) = c.Offset(0, 2).Value Then
PageNos = PageNos + 1
If FileExist(Folder & "\" & c.Offset(0, 4) & ".pdf") Then
Set newPDF = CreateObject("AcroExch.pdDoc")
NewName = Folder & "\" & c.Offset(0, 4) & ".pdf"
newPDF.Open (NewName)
newPDF.InsertPages lngPages, objPDDoc, page, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Exit For
Else
Set newPDF = CreateObject("AcroExch.PDDoc")
newPDF.Create
NewName = Folder & "\" & c.Offset(0, 4) & ".pdf"
newPDF.InsertPages -1, objPDDoc, page, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Exit For
End If
Exit For
End If
End If
End If
Next i
Next page
c.Offset(0, 3).Value = PageNos
PageNos = 0
Next c
MsgBox "Done"
Else
MsgBox "error!"
End If
End Sub
Function FileExist(path As String) As Boolean
If Dir(path) <> vbNullString Then FileExist = True
End Function
Function selectFile()
Dim fd As FileDialog, fileName As String
On Error GoTo ErrorHandler
Set fd = Application.FileDialog(msoFileDialogFilePicker)
fd.AllowMultiSelect = False
If fd.Show = True Then
If fd.SelectedItems(1) <> vbNullString Then
fileName = fd.SelectedItems(1)
End If
Else
'Exit code if no file is selected
End
End If
'Return Selected FileName
selectFile = fileName
Set fd = Nothing
Exit Function
ErrorHandler:
Set fd = Nothing
MsgBox "Error " & Err & ": " & Error(Err)
End Function
Function GetFolder() As String
Dim fldr As FileDialog
Dim sItem As String
Set fldr = Application.FileDialog(msoFileDialogFolderPicker)
With fldr
.Title = "Select the Folder where you want you new PDFs to go"
.AllowMultiSelect = False
.InitialFileName = Application.DefaultFilePath
If .Show <> -1 Then GoTo NextCode
sItem = .SelectedItems(1)
End With
NextCode:
GetFolder = sItem
Set fldr = Nothing
End Function
Many thanks in advance.
Loops are definitely excellent for some things, but can tie down processing with these higher queries. Recently, a colleague and I were doing a similar task (not pdf-related though), and we had much success with using a range.find method instead of a loop executing instr on each cell.
Some points of interest:
-To mimic the “loop cells” functionality when using the .find method, we ended our range statement with .cells, as seen below:
activesheet.usedrange.cells.find( )
Where the desired string goes within the ( ).
-The return value: “A Range object that represents the first cell where that information is found.”
Once the .find method returns a range, a subsequent subroutine can extract the page number and document name.
-If you need to find the nth instance of an occurrence, “You can use the FindNext andFindPrevious methods to repeat the search.” (Microsoft)
Microsoft overview of range.find:
https://msdn.microsoft.com/en-us/vba/excel-vba/articles/range-find-method-excel
So with this approach, the user can use a loop based on a count of cells in your list to execute the .find method for each string.
Downside is (I assume) that this must be done on text within the excel application; also, I’ve not tested it to determine if the string has to inhabit the cell by itself (I don’t think this is a concern).
‘===================
Another suggestion that might be beneficial is to first bulk-rip all text from the .pdf with as little looping as possible (direct actions at the document object level). Then your find/return approach can be applied to the bulk text.
I did a similar activity when creating study notes from a professor’s PowerPoints; I grabbed all the text into a .txt file, then returned every sentence containing the instance of a list of strings.
‘=====================
A few caveats: I admit that I have not executed parsing at the sheer size of your project, so my suggestions might not be advantageous in practice.
Also, I have not done much work parsing .pdf documents, as I try to opt for anything that is .txt/excel app first, and engage it instead.
Good luck in your endeavors; I hope I was able to at least provide food for thought!
Sorry to post a quick, incomplete answer, but I think I can point you in a good direction.
Instead of making the system look up the two terms hundreds of billions of times, then make hundreds of billions of comparisons, put your search terms into an array, and the text of each page into a long string.Then it only has to do one look up and 200 comparisons per page.
'Dim your Clipboard functions
Public Declare PtrSafe Function OpenClipboard Lib "user32" (ByVal hwnd As Long) As Long
Public Declare PtrSafe Function EmptyClipboard Lib "user32" () As Long
Public Declare PtrSafe Function CloseClipboard Lib "user32" () As Long
'...
Dim objData As New MSForms.DataObject
Dim arrSearch() As String
Dim strTxt As String
'...
'Create array of search terms
For i = 2 To lastrow
arrSearch(i - 2) = Sheets("Sheet1").Cells(1, i)
Next i
For page = 0 To objPDDoc.GetNumPages - 1
'[Move each page into a new document. You already have that code]
'Clear clipboard
OpenClipboard (0&)
EmptyClipboard
CloseClipboard
'Copy page to clipboard
objApp.MenuItemExecute ("SelectAll")
objApp.MenuItemExecute ("Copy")
'You can also do this with the JavaScript object: objjso.ExecMenuItem("Item Name")
'You may have to insert a waiting function like sleep() here to wait for the action to complete
'Put data from clipboard into a string.
objData.GetFromClipboard
strTxt = objData.GetText 'Now you can search the entire content of the page at once, within memory
'Compare each element of the array to the string
For i = LBound(arrSearch) To UBound(arrSearch)
If InStr(1, strTxt, arrSearch(i)) > 0 Then
'[You found a match. Your code here]
End If
Next i
Next page
This is still cumbersome because you have to open each page in a new document. If there is a good way to determine which page you're on purely by text (such as the page number at the bottom of page a, followed immediately by the header at the top of page b) then you might look at copying the entire text of the document into one string, then using the clues from the text to decide which page to extract once you find a match. That would be a lot faster I believe.
Sub BatchRenameCS()
Dim objApp As Object
Dim objPDDoc As Object
Dim objjso As Object
Dim newPDF As Acrobat.CAcroPDDoc
Dim lastrow2 As Long
Dim strFileName As String
Dim Folder As String
Dim Page As Long
Dim Cell As Long
Dim PDFCharacterCount() As Long
Dim CharacterCount As Long
Dim i As Integer
Dim c As Integer
Dim x As Integer
Dim strSource As String
Dim strResult As String
Dim PDFCharacters As String
Dim PDFCharacters2 As String
Dim PDFPasteData() As String
Dim PasteDataPage As Integer
Dim LastRow As Long
Dim NewName As String
Dim NewNamePageNum As Integer
Dim Check()
Sheets("Sheet1").Range("C:D").ClearContents
strFileName = selectFile()
Folder = GetFolder()
'create array with pdf word count
Set objApp = CreateObject("AcroExch.App")
Set objPDDoc = CreateObject("AcroExch.PDDoc")
'AD.1 open file, if =false file is damage
If objPDDoc.Open(strFileName) Then
Set objjso = objPDDoc.GetJSObject
ReDim PDFCharacterCount(1 To objPDDoc.GetNumPages) As Long
For Page = 1 To objPDDoc.GetNumPages
PDFCharacters = ""
PDFCharacters2 = ""
For c = 0 To objjso.GetPageNumWords(Page - 1)
PDFCharacters = PDFCharacters & objjso.getPageNthWord(Page - 1, c)
Next c
For i = 1 To Len(PDFCharacters)
Select Case Asc(Mid(PDFCharacters, i, 1))
Case 48 To 57, 65 To 90, 97 To 122:
PDFCharacters2 = PDFCharacters2 & (Mid(PDFCharacters, i, 1))
Case Else
PDFCharacters2 = PDFCharacters2 & ""
End Select
Next
PDFCharacterCount(Page) = Len(PDFCharacters2)
Next Page
lastrow2 = Sheets("Sheet2").Cells(Rows.Count, "A").End(xlUp).Row
Page = 1
ReDim PDFPasteData(1 To objPDDoc.GetNumPages) As String
For Cell = 1 To lastrow2
strResult = ""
strSource = Sheets("Sheet2").Cells(Cell, 1).Text
PDFPasteData(Page) = PDFPasteData(Page) & " " & strSource
For i = 1 To Len(strSource)
Select Case Asc(Mid(strSource, i, 1))
Case 48 To 57, 65 To 90, 97 To 122:
strResult = strResult & (Mid(strSource, i, 1))
Case Else
strResult = strResult & ""
End Select
Next
CharacterCount = CharacterCount + Len(strResult)
If CharacterCount = PDFCharacterCount(Page) Then
CharacterCount = 0
Page = Page + 1
End If
Next Cell
ReDim Check(2, objPDDoc.GetNumPages)
LastRow = Sheets("Sheet1").Cells(Rows.Count, "A").End(xlUp).Row
For Each LookUpCell In Worksheets("Sheet1").Range("A2:A" & LastRow)
For PasteDataPage = 1 To objPDDoc.GetNumPages
If InStr(PDFPasteData(PasteDataPage), LookUpCell.Value) Then
Check(1, PasteDataPage) = Check(1, PasteDataPage) + 1
Check(2, PasteDataPage) = Check(2, PasteDataPage) & LookUpCell.Offset(0, 1).Value & Chr(10)
If FileExist(Folder & "\" & LookUpCell.Offset(0, 1) & ".pdf") Then
Set newPDF = CreateObject("AcroExch.pdDoc")
NewName = Folder & "\" & LookUpCell.Offset(0, 1) & ".pdf"
newPDF.Open (NewName)
newPDF.InsertPages newPDF.GetNumPages - 1, objPDDoc, PasteDataPage - 1, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Else
Set newPDF = CreateObject("AcroExch.PDDoc")
newPDF.Create
NewName = Folder & "\" & LookUpCell.Offset(0, 1) & ".pdf"
newPDF.InsertPages -1, objPDDoc, PasteDataPage - 1, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
End If
End If
Next PasteDataPage
Next LookUpCell
x = 1
For PasteDataPage = 1 To objPDDoc.GetNumPages
If Check(1, PasteDataPage) <> 1 Then
Sheets("Sheet1").Cells(x, 3) = PasteDataPage
Sheets("Sheet1").Cells(x, 4) = Check(2, PasteDataPage)
x = x + 1
End If
Next PasteDataPage
End If
MsgBox "Done"
End Sub
Function FileExist(path As String) As Boolean
If Dir(path) <> vbNullString Then FileExist = True
End Function
Function selectFile()
Dim fd As FileDialog, fileName As String
On Error GoTo ErrorHandler
Set fd = Application.FileDialog(msoFileDialogFilePicker)
fd.AllowMultiSelect = False
If fd.Show = True Then
If fd.SelectedItems(1) <> vbNullString Then
fileName = fd.SelectedItems(1)
End If
Else
'Exit code if no file is selected
End
End If
'Return Selected FileName
selectFile = fileName
Set fd = Nothing
Exit Function
ErrorHandler:
Set fd = Nothing
MsgBox "Error " & Err & ": " & Error(Err)
End Function
Function GetFolder() As String
Dim fldr As FileDialog
Dim sItem As String
Set fldr = Application.FileDialog(msoFileDialogFolderPicker)
With fldr
.Title = "Select the Folder where you want you new PDFs to go"
.AllowMultiSelect = False
.InitialFileName = Application.DefaultFilePath
If .Show <> -1 Then GoTo NextCode
sItem = .SelectedItems(1)
End With
NextCode:
GetFolder = sItem
Set fldr = Nothing
End Function
I have made a script that allows the user to open a file using the shell browser and once selected they are prompted to enter the interval in which they want to split the .CSV file down into smaller files.
The problem that arises is that once I select the file using the browser I get an unspecified error with the code 80004005 it appears on line 15 character 1 and I have no idea how to solve this.
Any help would be much appreciated!
Option Explicit
Const ForReading = 1
Const ForWriting = 2
Dim objFSO, objInputFile, objOutputFile
Dim intLine, intFile
Dim strHeaders
Dim strInputFile, strOutputPrefix, strLine
Dim MyDate
Dim shell
Dim file
Set shell = CreateObject("Shell.Application")
Set file = shell.BrowseForFolder(0, "Choose a file:", &H4000)
BrowseForFile = file.self.Path
strInputFile = BrowseForFile
strOutputPrefix = objFSO.GetBaseName(strInputFile) & DatePart("yyyy", Now) & "-" & DatePart("m", Now) & "-" & DatePart("d", Now)
userSplit = InputBox("Enter when you want to split")
intFile = 1
intLine = 0
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objInputFile = objFSO.OpenTextFile(strInputFile, ForReading)
If (objInputFile.AtEndOfStream = True) Then
' The file is empty
WScript.Quit 1
End If
strHeaders = objInputFile.ReadLine
Do While (objInputFile.AtEndOfStream = False)
strLine = objInputFile.ReadLine
If (intLine <= 0) Then
Set objOutputFile = objFSO.CreateTextFile(strOutputPrefix & "_" & intFile & ".csv", True)
objOutputFile.WriteLine strHeaders
intLine = 1
End If
objOutputFile.WriteLine strLine
If (intLine >= userSplit) Then
objOutputFile.Close
Set objOutputFile = Nothing
intFile = intFile + 1
intLine = 0
Else
intLine = intLine + 1
End If
Loop
Your BrowseForFolder call misses the third/option parameter; so change
Set file = shell.BrowseForFolder(0, "Choose a file:", &H4000)
to
Set file = shell.BrowseForFolder(0, "Choose a file:", 0, &H4000)
The above (kept for context for comments) is all wrong
The docs state clearly:
Creates a dialog box that enables the user to select a folder and then
returns the selected folder's Folder object.
so you can display the files, but not pick/return them.