I have an excel workbook with column [A] having numerous filenames. But only the name of the file, so without any extension. I would like to loop through each cell and check whether a file in a specific folder exist matching this string in the cell.
someone did not manage to find anything on Google.
thanks
This should do it.
Const dirPath As String = "C:\whateveryourPathIs\"
Sub RunIt()
Dim Rcell As Range
For Each Rcell In Intersect(Range("A:A"), ActiveSheet.UsedRange).Cells
If Not IsEmpty(Rcell) Then
If CheckIfFileExists(dirPath, Rcell.Value) Then
'whatever you want to happen when it finds a match
Debug.Print Rcell.Value & " was found"
End If
End If
Next Rcell
End Sub
Private Function CheckIfFileExists(srchDIR As String, MatchMember As String) As Boolean
Dim file As Variant, nameOfFile As String
If Right(srchDIR, 1) <> "\" Then srchDIR = srchDIR & "\"
file = Dir(srchDIR)
While (file <> "")
nameOfFile = Left(file, InStrRev(file, ".", -1, vbTextCompare) - 1)
If UCase(nameOfFile) = UCase(MatchMember) Then
CheckIfFileExists = True
Exit Function
End If
file = Dir
Wend
End Function
#PGCodeRider Thanks, I modified your code to my needs and it works now. anyone who is interested ("or also did not find anything on google") below the code:
Const dirPath As String = "C:\folderpath\..."
Sub RunIt()
Dim Rcell As Range
For Each Rcell In Intersect(Range("A:A"), ActiveSheet.UsedRange).Cells
If Not IsEmpty(Rcell) Then
If CheckIfFileExists(dirPath, Rcell.Value) Then
'whatever you want to happen when it finds a match
Debug.Print Rcell.Value & " was found"
End If
End If
Next Rcell
End Sub
Private Function CheckIfFileExists(srchDIR As String, MatchMember As String) As Boolean
Dim file As Variant
Dim length As Long
If Right(srchDIR, 1) <> "\" Then srchDIR = srchDIR & "\"
file = Dir(srchDIR)
While (file <> "")
length = Len(file)
For i = 1 To length
If Right(file, 1) <> "." Then
file = Left(file, length - 1)
length = Len(file)
Else
Exit For
End If
Next i
file = Left(file, length - 1)
'If InStr(1, file, MatchMember, vbTextCompare) > 0 Then
If file = MatchMember Then
CheckIfFileExists = True
Exit Function
End If
file = Dir
Wend
End Function
Related
I need help understanding why file name extraction from string only works every other time.
I've tried both right and mid. I've used before in other code with no problem. I'm useing msgbox for debugging.
Final outcome should be adding multiple file names to the bottom of a table
Private Sub ButtonAdd_Click()
Dim fd As FileDialog
Dim fName As String ' full path file name
Dim nextRow As Long
Dim filename As String ' extracted file name only
Set fd = Application.FileDialog(msoFileDialogFilePicker)
fd.Title = "Please select file to add"
fd.InitialFileName = ThisWorkbook.FullName
fd.AllowMultiSelect = True
fchosen = fd.Show
If fchosen = -1 Then
For i = 1 To fd.SelectedItems.Count
fName = fd.SelectedItems(i)
'filename = Right(fName, Len(fName) - InStrRev(filename, "\"))
filename = Mid(fName, InStrRev(filename, "\") + 1)
MsgBox (filename)
nextRow = Range("a" & Rows.Count).End(xlUp).row + 1
'Range("a" & nextRow) = filename
Next i
End If
End Sub
As Josh writes in the comments, you are using the wrong variable filename instead of fname. Another example of how nesting commands makes it difficult to find an error and how naming of variables matters.
Split the lines into two pieces and rename fname to something like fullFilename:
Dim fullFilename as string, filename as string, p as long
fullFilename = fd.SelectedItems(i)
p = InStrRev(fullFilename, "\")
if p > 0 then
filename = mid(fullFilename, p+1)
else
filename = fullFilename
End If
Now you can easily distinguish between the variable holding the full path and the one that has only the file name. And even if you mix something up, you can easily find the problem using the debugger
Take your pick
Get File Name From File Path
Option Explicit
Sub Sample()
MsgBox GetFilenameFromPath("C:\Temp\Myfile.Txt")
End Sub
Private Function GetFilenameFromPath(ByVal strPath As String) As String
If Right$(strPath, 1) <> "\" And Len(strPath) > 0 Then
GetFilenameFromPath = GetFilenameFromPath( _
Left$(strPath, Len(strPath) - 1)) + Right$(strPath, 1)
End If
End Function
Get File Name Without Extension
Option Explicit
Sub Sample()
Dim fName As String
fName = "C:\Temp\Myfile.Txt"
MsgBox GetFilenameFromPath(Left(fName, (InStrRev(fName, ".", -1, vbTextCompare) - 1)))
End Sub
Private Function GetFilenameFromPath(ByVal strPath As String) As String
If Right$(strPath, 1) <> "\" And Len(strPath) > 0 Then
GetFilenameFromPath = GetFilenameFromPath( _
Left$(strPath, Len(strPath) - 1)) + Right$(strPath, 1)
End If
End Function
I am working on a project in excel, where I am renaming multiple files.
Fow now I am using this code
Sub RenameFiles()
Dim xDir As String
Dim xFile As String
Dim xRow As Long
With Application.FileDialog(msoFileDialogFolderPicker)
.AllowMultiSelect = False
If .Show = -1 Then
xDir = .SelectedItems(1)
xFile = Dir(xDir & Application.PathSeparator & "*")
Do Until xFile = ""
xRow = 0
On Error Resume Next
xRow = Application.Match(xFile, Range("A:A"), 0)
If xRow > 0 Then
Name xDir & Application.PathSeparator & xFile As _
xDir & Application.PathSeparator & Cells(xRow, "G").Value
End If
xFile = Dir
Loop
End If
End With
End Sub
which lets me change the names of the files in one specific folder, but I would like to be able to pick the main folder containing subfolders and it would change all the names corresponding with names I have made in my excel sheet.
I’m sure you are aware that renaming files if go wrong can have very serious, sometimes even catastrophic consequences, with that been said I hope that all necessary step to avoid any of those problems have been taken.
Data and Code:
It seems that columns A and G contain the "old" and "new" names of the files (excluding the path), and that’s the reason of asking the user for the path and the possibility of running the renaming of the files for subfolders as well.
The code posted compares every file in the folders (and subfolder as expected) against the list of files in the data, which could be time consuming.
Also, I’ll would suggest to have a track of what files have been renamed, so in case of any error, this allows to easily track back and undo what could have be an error.
Solution Proposed
The solution proposed below uses the FileSystemObject object which provides a robust access to the machine file system, you can interact with it in two manners: Early and Late Binding (Visual Basic). These procedures use late binding, to use early binding see How do I use FileSystemObject in VBA?
Folders_ƒGet_From_User: A function that ask the user to select the folder and to process or not subfolders. It returns a list of the subfolder selected (names only), excluding folders with no files.
Files_Get_Array: Creates and array with all the Filenames to be processed (Old & New)
Files_ƒRename: This function renames all files found in any of the folders from the list obtained from point 1. These procedure instead of validating every file present in the subfolders against the list, check if the files in the list Exist in any folder, and if so passes to the function File_ƒRename_Apply that does the renaming and returns the result, allowing the creation of the “Audit Track” array. It returns an array with the results of all the files names in the list in all the folders list ( from point 1 and 2) respectively.
File_Rename_Post_Records: Creates a worksheet named FileRename(Track) (if not present) to post the Audit Track of the results of the Files_ƒRename function.
All of them are called from the procedure: Files_Rename
Let me know of any questions you might have regarding the resources used.
Option Explicit
Private Const mk_Wsh As String = "FileRename(Track)"
Private Const mk_MsgTtl As String = "Files Rename"
Private mo_Fso As Object
…
Sub Files_Rename()
Dim aFolders() As String, aFiles As Variant
Dim aRenamed As Variant
Set mo_Fso = CreateObject("Scripting.FileSystemObject")
If Not (Folders_ƒGet_From_User(aFolders)) Then Exit Sub
Call Files_Get_Array(aFiles)
If Not (Files_ƒRename(aRenamed, aFolders, aFiles)) Then
Call MsgBox("None file was renamed", vbInformation, mk_MsgTtl)
Exit Sub
End If
Call File_Rename_Post_Records(aFiles, aRenamed)
Call MsgBox("Files were renamed" & String(2, vbLf) _
& vbTab & "see details in sheet [" & mk_Wsh & "]", vbInformation, mk_MsgTtl)
End Sub
…
Private Function Folders_ƒGet_From_User(aFolders As Variant) As Boolean
Dim aFdrs As Variant
Dim oFdr As Object, sFolder As String, blSubFdrs As Boolean
Erase aFolders
With Application.FileDialog(msoFileDialogFolderPicker)
.AllowMultiSelect = False
If .Show <> -1 Then Exit Function
sFolder = .SelectedItems(1)
End With
If MsgBox("Do you want to include subfolders?", _
vbQuestion + vbYesNo + vbDefaultButton2, _
mk_MsgTtl) = vbYes Then blSubFdrs = True
Set oFdr = mo_Fso.GetFolder(sFolder)
Select Case blSubFdrs
Case False
If oFdr.Files.Count > 0 Then
aFdrs = aFdrs & "|" & oFdr.Path
Else
MsgBox "No files found in folder:" & String(2, vbLf) & _
vbTab & sFolder & String(2, vbLf) & _
vbTab & "Process is being terminated.", _
vbInformation, mk_MsgTtl
Exit Function
End If
Case Else
Call SubFolders_Get_Array(aFdrs, oFdr)
If aFdrs = vbNullString Then
MsgBox "No files found in folder & subfolders:" & String(2, vbLf) & _
vbTab & sFolder & String(2, vbLf) & _
vbTab & "Process is being terminated.", _
vbInformation, mk_MsgTtl
Exit Function
End If
End Select
Rem String To Array
aFdrs = Mid(aFdrs, 2)
aFdrs = Split(aFdrs, "|")
aFolders = aFdrs
Folders_ƒGet_From_User = True
End Function
…
Private Sub SubFolders_Get_Array(aFdrs As Variant, oFdr As Object)
Dim oSfd As Object
With oFdr
If .Files.Count > 0 Then aFdrs = aFdrs & "|" & .Path
For Each oSfd In .SubFolders
Call SubFolders_Get_Array(aFdrs, oSfd)
Next: End With
End Sub
…
Private Sub Files_Get_Array(aFiles As Variant)
Dim lRow As Long
With ThisWorkbook.Sheets("DATA") 'change as required
lRow = .Rows.Count
If Len(.Cells(lRow, 1).Value) = 0 Then lRow = .Cells(lRow, 1).End(xlUp).Row
aFiles = .Cells(2, 1).Resize(-1 + lRow, 7).Value
End With
End Sub
…
Private Function Files_ƒRename(aRenamed As Variant, aFolders As Variant, aFiles As Variant) As Boolean
Dim vRcd As Variant: vRcd = Array("Filename.Old", "Filename.New")
Dim blRenamed As Boolean
Dim oDtn As Object, aRcd() As String, lRow As Long, bFdr As Byte
Dim sNameOld As String, sNameNew As String
Dim sFilename As String, sResult As String
aRenamed = vbNullString
Set oDtn = CreateObject("Scripting.Dictionary")
vRcd = Join(vRcd, "|") & "|" & Join(aFolders, "|")
vRcd = Split(vRcd, "|")
oDtn.Add 0, vRcd
With mo_Fso
For lRow = 1 To UBound(aFiles)
sNameOld = aFiles(lRow, 1)
sNameNew = aFiles(lRow, 7)
vRcd = sNameOld & "|" & sNameNew
For bFdr = 0 To UBound(aFolders)
sResult = Chr(39)
sFilename = .BuildPath(aFolders(bFdr), sNameOld)
If .FileExists(sFilename) Then
If File_ƒRename_Apply(sResult, sNameNew, sFilename) Then blRenamed = True
End If
vRcd = vRcd & "|" & sResult
Next
vRcd = Mid(vRcd, 2)
vRcd = Split(vRcd, "|")
oDtn.Add lRow, vRcd
Next: End With
If Not (blRenamed) Then Exit Function
aRenamed = oDtn.Items
aRenamed = WorksheetFunction.Index(aRenamed, 0, 0)
Files_ƒRename = True
End Function
…
Private Function File_ƒRename_Apply(sResult As String, sNameNew As String, sFileOld As String) As Boolean
With mo_Fso.GetFile(sFileOld)
sResult = .ParentFolder
On Error Resume Next
.Name = sNameNew
If Err.Number <> 0 Then
sResult = "¡Err: " & Err.Number & " - " & Err.Description
Exit Function
End If
On Error GoTo 0
End With
File_ƒRename_Apply = True
End Function
…
Private Sub File_Rename_Post_Records(aFiles As Variant, aRenamed As Variant)
Const kLob As String = "lo.Audit"
Dim blWshNew As Boolean
Dim Wsh As Worksheet, Lob As ListObject, lRow As Long
Rem Worksheet Set\Add
With ThisWorkbook
On Error Resume Next
Set Wsh = .Sheets(mk_Wsh)
On Error GoTo 0
If Wsh Is Nothing Then
.Worksheets.Add After:=.Sheets(.Sheets.Count)
Set Wsh = .Sheets(.Sheets.Count)
blWshNew = True
End If: End With
Rem Set ListObject
With Wsh
.Name = mk_Wsh
.Activate
Application.GoTo .Cells(1), 1
Select Case blWshNew
Case False
Set Lob = .ListObjects(kLob)
lRow = 1 + Lob.ListRows.Count
Case Else
With .Cells(2, 2).Resize(1, 4)
.Value = Array("TimeStamp", "Filename.Old", "Filename.New", "Folder.01")
Set Lob = .Worksheet.ListObjects.Add(xlSrcRange, .Resize(2), , xlYes)
Lob.Name = "lo.Audit"
lRow = 1
End With: End Select: End With
Rem Post Data
With Lob.DataBodyRange.Cells(lRow, 1).Resize(UBound(aRenamed), 1)
.Value = Format(Now, "YYYYMMDD_HHMMSS")
.Offset(0, 1).Resize(, UBound(aRenamed, 2)).Value = aRenamed
.CurrentRegion.Columns.AutoFit
End With
End Sub
Renaming Files (Subfolders)
Not nearly enough tested.
You better create a copy of the folder where it should run to avoid losing files.
It will write all files in the folder and its subfolders to a dictionary object whose keys (file paths) will be checked against the file paths in column A. If matched, the files will be renamed to the name in column G with the same file path.
It checks each new file path only against the file paths in the dictionary before renaming.
It will fail if a file name is not valid.
Copy the complete code to a standard module, e.g. Module1.
Adjust the values in the constants section of the first procedure.
Run only the first procedure, the rest is being called by it.
The Code
Option Explicit
Sub renameFiles()
' Define constants.
Const wsName As String = "Sheet1"
Const FirstRow As Long = 2
Dim Cols As Variant
Cols = Array("A", "G")
Dim wb As Workbook
Set wb = ThisWorkbook
' Define worksheet.
Dim ws As Worksheet
Set ws = wb.Worksheets(wsName)
' Define Lookup Column Range.
Dim rng As Range
Set rng = defineColumnRange(ws, Cols(LBound(Cols)), FirstRow)
' Write values from Column Ranges to jagged Column Ranges Array.
Dim ColumnRanges As Variant
ColumnRanges = getColumnRanges(rng, Cols)
' Pick a folder.
Dim FolderPath As String
FolderPath = pickFolder
' Define a Dictionary object.
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
' Write the paths and the names of the files in the folder
' and its subfolders to the Dictionary.
Set dict = getFilesDictionary(FolderPath)
' Rename files.
Dim RenamesCount As Long
RenamesCount = renameColRngDict(ColumnRanges, dict)
' Inform user.
If RenamesCount > 0 Then
MsgBox "Renamed " & RenamesCount & " file(s).", vbInformation, "Success"
Else
MsgBox "No files renamed.", vbExclamation, "No Renames"
End If
End Sub
Function defineColumnRange(Sheet As Worksheet, _
ColumnIndex As Variant, _
FirstRowNumber As Long) _
As Range
Dim rng As Range
Set rng = Sheet.Cells(FirstRowNumber, ColumnIndex) _
.Resize(Sheet.Rows.Count - FirstRowNumber + 1)
Dim cel As Range
Set cel = rng.Find(What:="*", _
LookIn:=xlFormulas, _
SearchDirection:=xlPrevious)
If Not cel Is Nothing Then
Set defineColumnRange = rng.Resize(cel.Row - FirstRowNumber + 1)
End If
End Function
Function getColumnRanges(ColumnRange As Range, _
BuildColumns As Variant) _
As Variant
Dim Data As Variant
ReDim Data(LBound(BuildColumns) To UBound(BuildColumns))
Dim j As Long
With ColumnRange.Columns(1)
For j = LBound(BuildColumns) To UBound(BuildColumns)
If .Rows.Count > 1 Then
Data(j) = .Offset(, .Worksheet.Columns(BuildColumns(j)) _
.Column - .Column).Value
Else
Dim OneCell As Variant
ReDim OneCell(1 To 1, 1 To 1)
Data(j) = OneCell
Data(1, 1) = .Offset(, .Worksheet.Columns(BuildColumns(j)) _
.Column - .Column).Value
End If
Next j
End With
getColumnRanges = Data
End Function
Function pickFolder() _
As String
With Application.FileDialog(msoFileDialogFolderPicker)
.AllowMultiSelect = False
If .Show = -1 Then
pickFolder = .SelectedItems(1)
End If
End With
End Function
' This cannot run without the 'listFiles' procedure.
Function getFilesDictionary(ByVal FolderPath As String) _
As Object
Dim dict As Object ' ByRef
Set dict = CreateObject("Scripting.Dictionary")
With CreateObject("Scripting.FileSystemObject")
listFiles dict, .GetFolder(FolderPath)
End With
Set getFilesDictionary = dict
End Function
' This is being called only by 'getFileDictionary'
Sub listFiles(ByRef Dictionary As Object, _
fsoFolder As Object)
Dim fsoSubFolder As Object
Dim fsoFile As Object
For Each fsoFile In fsoFolder.Files
Dictionary(fsoFile.Path) = Empty 'fsoFile.Name
Next fsoFile
For Each fsoSubFolder In fsoFolder.SubFolders
listFiles Dictionary, fsoSubFolder
Next
End Sub
' Breaking the rules:
' A Sub written as a function to return the number of renamed files.
Function renameColRngDict(ColumnRanges As Variant, _
Dictionary As Object) _
As Long
Dim Key As Variant
Dim CurrentIndex As Variant
Dim NewFilePath As String
For Each Key In Dictionary.Keys
Debug.Print Key
CurrentIndex = Application.Match(Key, _
ColumnRanges(LBound(ColumnRanges)), 0)
If Not IsError(CurrentIndex) Then
NewFilePath = Left(Key, InStrRev(Key, Application.PathSeparator)) _
& ColumnRanges(UBound(ColumnRanges))(CurrentIndex, 1)
If IsError(Application.Match(NewFilePath, Dictionary.Keys, 0)) Then
renameColRngDict = renameColRngDict + 1
Name Key As NewFilePath
End If
End If
Next Key
End Function
I have the following code, which looks at each cell in column A of my spreadsheet, searches for the text it finds there in the specified PDF and then extracts the page where it finds the text as a PDF, naming it with the value in the cell of the spreadsheet. The code works but is rather slow, I may need to search for as many as 200 words in a PDF which could be as long as 600 pages. Is there a way to make the code faster? Currently it loops through each cell searches through each page looping through each word until it finds the word in the cell.
Sub test_with_PDF()
Dim objApp As Object
Dim objPDDoc As Object
Dim objjso As Object
Dim wordsCount As Long
Dim page As Long
Dim i As Long
Dim strData As String
Dim strFileName As String
Dim lastrow As Long, c As Range
Dim PageNos As Integer
Dim newPDF As Acrobat.CAcroPDDoc
Dim NewName As String
Dim Folder As String
lastrow = Sheets("Sheet1").Cells(Rows.Count, "A").End(xlUp).Row
strFileName = selectFile()
Folder = GetFolder()
Set objApp = CreateObject("AcroExch.App")
Set objPDDoc = CreateObject("AcroExch.PDDoc")
'AD.1 open file, if =false file is damage
If objPDDoc.Open(strFileName) Then
Set objjso = objPDDoc.GetJSObject
PageNos = 0
For Each c In Sheets("Sheet1").Range("A2:A" & lastrow)
For page = 0 To objPDDoc.GetNumPages - 1
wordsCount = objjso.GetPageNumWords(page)
For i = 0 To wordsCount
If InStr(1, c.Value, ", ") = 0 Then
If objjso.getPageNthWord(page, i) = c.Value Then
PageNos = PageNos + 1
If FileExist(Folder & "\" & c.Offset(0, 4) & ".pdf") Then
Set newPDF = CreateObject("AcroExch.pdDoc")
NewName = Folder & "\" & c.Offset(0, 4) & ".pdf"
newPDF.Open (NewName)
newPDF.InsertPages lngPages, objPDDoc, page, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Exit For
Else
Set newPDF = CreateObject("AcroExch.PDDoc")
newPDF.Create
NewName = Folder & "\" & c.Offset(0, 4) & ".pdf"
newPDF.InsertPages -1, objPDDoc, page, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Exit For
End If
End If
Else
If objjso.getPageNthWord(page, i) = c.Offset(0, 1).Value Then
If objjso.getPageNthWord(page, i + 1) = c.Offset(0, 2).Value Then
PageNos = PageNos + 1
If FileExist(Folder & "\" & c.Offset(0, 4) & ".pdf") Then
Set newPDF = CreateObject("AcroExch.pdDoc")
NewName = Folder & "\" & c.Offset(0, 4) & ".pdf"
newPDF.Open (NewName)
newPDF.InsertPages lngPages, objPDDoc, page, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Exit For
Else
Set newPDF = CreateObject("AcroExch.PDDoc")
newPDF.Create
NewName = Folder & "\" & c.Offset(0, 4) & ".pdf"
newPDF.InsertPages -1, objPDDoc, page, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Exit For
End If
Exit For
End If
End If
End If
Next i
Next page
c.Offset(0, 3).Value = PageNos
PageNos = 0
Next c
MsgBox "Done"
Else
MsgBox "error!"
End If
End Sub
Function FileExist(path As String) As Boolean
If Dir(path) <> vbNullString Then FileExist = True
End Function
Function selectFile()
Dim fd As FileDialog, fileName As String
On Error GoTo ErrorHandler
Set fd = Application.FileDialog(msoFileDialogFilePicker)
fd.AllowMultiSelect = False
If fd.Show = True Then
If fd.SelectedItems(1) <> vbNullString Then
fileName = fd.SelectedItems(1)
End If
Else
'Exit code if no file is selected
End
End If
'Return Selected FileName
selectFile = fileName
Set fd = Nothing
Exit Function
ErrorHandler:
Set fd = Nothing
MsgBox "Error " & Err & ": " & Error(Err)
End Function
Function GetFolder() As String
Dim fldr As FileDialog
Dim sItem As String
Set fldr = Application.FileDialog(msoFileDialogFolderPicker)
With fldr
.Title = "Select the Folder where you want you new PDFs to go"
.AllowMultiSelect = False
.InitialFileName = Application.DefaultFilePath
If .Show <> -1 Then GoTo NextCode
sItem = .SelectedItems(1)
End With
NextCode:
GetFolder = sItem
Set fldr = Nothing
End Function
Many thanks in advance.
Loops are definitely excellent for some things, but can tie down processing with these higher queries. Recently, a colleague and I were doing a similar task (not pdf-related though), and we had much success with using a range.find method instead of a loop executing instr on each cell.
Some points of interest:
-To mimic the “loop cells” functionality when using the .find method, we ended our range statement with .cells, as seen below:
activesheet.usedrange.cells.find( )
Where the desired string goes within the ( ).
-The return value: “A Range object that represents the first cell where that information is found.”
Once the .find method returns a range, a subsequent subroutine can extract the page number and document name.
-If you need to find the nth instance of an occurrence, “You can use the FindNext andFindPrevious methods to repeat the search.” (Microsoft)
Microsoft overview of range.find:
https://msdn.microsoft.com/en-us/vba/excel-vba/articles/range-find-method-excel
So with this approach, the user can use a loop based on a count of cells in your list to execute the .find method for each string.
Downside is (I assume) that this must be done on text within the excel application; also, I’ve not tested it to determine if the string has to inhabit the cell by itself (I don’t think this is a concern).
‘===================
Another suggestion that might be beneficial is to first bulk-rip all text from the .pdf with as little looping as possible (direct actions at the document object level). Then your find/return approach can be applied to the bulk text.
I did a similar activity when creating study notes from a professor’s PowerPoints; I grabbed all the text into a .txt file, then returned every sentence containing the instance of a list of strings.
‘=====================
A few caveats: I admit that I have not executed parsing at the sheer size of your project, so my suggestions might not be advantageous in practice.
Also, I have not done much work parsing .pdf documents, as I try to opt for anything that is .txt/excel app first, and engage it instead.
Good luck in your endeavors; I hope I was able to at least provide food for thought!
Sorry to post a quick, incomplete answer, but I think I can point you in a good direction.
Instead of making the system look up the two terms hundreds of billions of times, then make hundreds of billions of comparisons, put your search terms into an array, and the text of each page into a long string.Then it only has to do one look up and 200 comparisons per page.
'Dim your Clipboard functions
Public Declare PtrSafe Function OpenClipboard Lib "user32" (ByVal hwnd As Long) As Long
Public Declare PtrSafe Function EmptyClipboard Lib "user32" () As Long
Public Declare PtrSafe Function CloseClipboard Lib "user32" () As Long
'...
Dim objData As New MSForms.DataObject
Dim arrSearch() As String
Dim strTxt As String
'...
'Create array of search terms
For i = 2 To lastrow
arrSearch(i - 2) = Sheets("Sheet1").Cells(1, i)
Next i
For page = 0 To objPDDoc.GetNumPages - 1
'[Move each page into a new document. You already have that code]
'Clear clipboard
OpenClipboard (0&)
EmptyClipboard
CloseClipboard
'Copy page to clipboard
objApp.MenuItemExecute ("SelectAll")
objApp.MenuItemExecute ("Copy")
'You can also do this with the JavaScript object: objjso.ExecMenuItem("Item Name")
'You may have to insert a waiting function like sleep() here to wait for the action to complete
'Put data from clipboard into a string.
objData.GetFromClipboard
strTxt = objData.GetText 'Now you can search the entire content of the page at once, within memory
'Compare each element of the array to the string
For i = LBound(arrSearch) To UBound(arrSearch)
If InStr(1, strTxt, arrSearch(i)) > 0 Then
'[You found a match. Your code here]
End If
Next i
Next page
This is still cumbersome because you have to open each page in a new document. If there is a good way to determine which page you're on purely by text (such as the page number at the bottom of page a, followed immediately by the header at the top of page b) then you might look at copying the entire text of the document into one string, then using the clues from the text to decide which page to extract once you find a match. That would be a lot faster I believe.
Sub BatchRenameCS()
Dim objApp As Object
Dim objPDDoc As Object
Dim objjso As Object
Dim newPDF As Acrobat.CAcroPDDoc
Dim lastrow2 As Long
Dim strFileName As String
Dim Folder As String
Dim Page As Long
Dim Cell As Long
Dim PDFCharacterCount() As Long
Dim CharacterCount As Long
Dim i As Integer
Dim c As Integer
Dim x As Integer
Dim strSource As String
Dim strResult As String
Dim PDFCharacters As String
Dim PDFCharacters2 As String
Dim PDFPasteData() As String
Dim PasteDataPage As Integer
Dim LastRow As Long
Dim NewName As String
Dim NewNamePageNum As Integer
Dim Check()
Sheets("Sheet1").Range("C:D").ClearContents
strFileName = selectFile()
Folder = GetFolder()
'create array with pdf word count
Set objApp = CreateObject("AcroExch.App")
Set objPDDoc = CreateObject("AcroExch.PDDoc")
'AD.1 open file, if =false file is damage
If objPDDoc.Open(strFileName) Then
Set objjso = objPDDoc.GetJSObject
ReDim PDFCharacterCount(1 To objPDDoc.GetNumPages) As Long
For Page = 1 To objPDDoc.GetNumPages
PDFCharacters = ""
PDFCharacters2 = ""
For c = 0 To objjso.GetPageNumWords(Page - 1)
PDFCharacters = PDFCharacters & objjso.getPageNthWord(Page - 1, c)
Next c
For i = 1 To Len(PDFCharacters)
Select Case Asc(Mid(PDFCharacters, i, 1))
Case 48 To 57, 65 To 90, 97 To 122:
PDFCharacters2 = PDFCharacters2 & (Mid(PDFCharacters, i, 1))
Case Else
PDFCharacters2 = PDFCharacters2 & ""
End Select
Next
PDFCharacterCount(Page) = Len(PDFCharacters2)
Next Page
lastrow2 = Sheets("Sheet2").Cells(Rows.Count, "A").End(xlUp).Row
Page = 1
ReDim PDFPasteData(1 To objPDDoc.GetNumPages) As String
For Cell = 1 To lastrow2
strResult = ""
strSource = Sheets("Sheet2").Cells(Cell, 1).Text
PDFPasteData(Page) = PDFPasteData(Page) & " " & strSource
For i = 1 To Len(strSource)
Select Case Asc(Mid(strSource, i, 1))
Case 48 To 57, 65 To 90, 97 To 122:
strResult = strResult & (Mid(strSource, i, 1))
Case Else
strResult = strResult & ""
End Select
Next
CharacterCount = CharacterCount + Len(strResult)
If CharacterCount = PDFCharacterCount(Page) Then
CharacterCount = 0
Page = Page + 1
End If
Next Cell
ReDim Check(2, objPDDoc.GetNumPages)
LastRow = Sheets("Sheet1").Cells(Rows.Count, "A").End(xlUp).Row
For Each LookUpCell In Worksheets("Sheet1").Range("A2:A" & LastRow)
For PasteDataPage = 1 To objPDDoc.GetNumPages
If InStr(PDFPasteData(PasteDataPage), LookUpCell.Value) Then
Check(1, PasteDataPage) = Check(1, PasteDataPage) + 1
Check(2, PasteDataPage) = Check(2, PasteDataPage) & LookUpCell.Offset(0, 1).Value & Chr(10)
If FileExist(Folder & "\" & LookUpCell.Offset(0, 1) & ".pdf") Then
Set newPDF = CreateObject("AcroExch.pdDoc")
NewName = Folder & "\" & LookUpCell.Offset(0, 1) & ".pdf"
newPDF.Open (NewName)
newPDF.InsertPages newPDF.GetNumPages - 1, objPDDoc, PasteDataPage - 1, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
Else
Set newPDF = CreateObject("AcroExch.PDDoc")
newPDF.Create
NewName = Folder & "\" & LookUpCell.Offset(0, 1) & ".pdf"
newPDF.InsertPages -1, objPDDoc, PasteDataPage - 1, 1, 0
newPDF.Save 1, NewName
newPDF.Close
Set newPDF = Nothing
End If
End If
Next PasteDataPage
Next LookUpCell
x = 1
For PasteDataPage = 1 To objPDDoc.GetNumPages
If Check(1, PasteDataPage) <> 1 Then
Sheets("Sheet1").Cells(x, 3) = PasteDataPage
Sheets("Sheet1").Cells(x, 4) = Check(2, PasteDataPage)
x = x + 1
End If
Next PasteDataPage
End If
MsgBox "Done"
End Sub
Function FileExist(path As String) As Boolean
If Dir(path) <> vbNullString Then FileExist = True
End Function
Function selectFile()
Dim fd As FileDialog, fileName As String
On Error GoTo ErrorHandler
Set fd = Application.FileDialog(msoFileDialogFilePicker)
fd.AllowMultiSelect = False
If fd.Show = True Then
If fd.SelectedItems(1) <> vbNullString Then
fileName = fd.SelectedItems(1)
End If
Else
'Exit code if no file is selected
End
End If
'Return Selected FileName
selectFile = fileName
Set fd = Nothing
Exit Function
ErrorHandler:
Set fd = Nothing
MsgBox "Error " & Err & ": " & Error(Err)
End Function
Function GetFolder() As String
Dim fldr As FileDialog
Dim sItem As String
Set fldr = Application.FileDialog(msoFileDialogFolderPicker)
With fldr
.Title = "Select the Folder where you want you new PDFs to go"
.AllowMultiSelect = False
.InitialFileName = Application.DefaultFilePath
If .Show <> -1 Then GoTo NextCode
sItem = .SelectedItems(1)
End With
NextCode:
GetFolder = sItem
Set fldr = Nothing
End Function
I need this macro to automatically grab the data from column A, find the data into the path given and replace it with column B. It is working but I need it to work just for once and goes on forward automatically..
Can anyone help me in this..
Sub UnkownFunctionName()
Dim myfolder
Dim Fnd As String, Rplc As String
Fnd = Application.InputBox(prompt:="Find string:", Title:="Rename files and folders", Type:=2)
Rplc = Application.InputBox(prompt:="Replace with:", Title:="Rename files and folders", Type:=2)
With Application.FileDialog(msoFileDialogFolderPicker)
.Show
myfolder = .SelectedItems(1) & "\"
End With
Call Recursive(myfolder, Fnd, Rplc)
End Sub
Sub Recursive(FolderPath As Variant, Fnd As String, Rplc As String)
Dim Value As String, Folders() As String, Fname As String, Fext As String, Mtxt As String
Dim x As Integer
Dim Folder As Variant, a As Long
ReDim Folders(0)
If Right(FolderPath, 2) = "\\" Then Exit Sub
Value = Dir(FolderPath, &H1F)
Do Until Value = ""
If Value = "." Or Value = ".." Then
Else
If GetAttr(FolderPath & Value) = 16 Or GetAttr(FolderPath & Value) = 48 Then
On Error Resume Next
Mtxt = "Rename folder " & Value & " to " & WorksheetFunction.Substitute(Value, Fnd, Rplc) & "?"
x = MsgBox(Mtxt, vbYesNoCancel)
If x = vbCancel Then Exit Sub
If x = vbYes Then
Name FolderPath & Value As FolderPath & WorksheetFunction.Substitute(Value, Fnd, Rplc)
End If
Value = WorksheetFunction.Substitute(Value, Fnd, Rplc)
If Err <> 0 Then
MsgBox "Error"
Exit Sub
End If
On Error GoTo 0
Folders(UBound(Folders)) = Value
ReDim Preserve Folders(UBound(Folders) + 1)
Else
On Error Resume Next
Fext = Split(Value, ".")(UBound(Split(Value, ".")))
Fname = Left(Value, Len(Value) - Len(Split(Value, ".")(UBound(Split(Value, ".")))) - 1)
Fname = WorksheetFunction.Substitute(Fname, Fnd, Rplc)
If Value <> (Fname & "." & Fext) Then
Mtxt = "Rename file " & Value & " to " & Fname & "." & Fext & "?"
x = MsgBox(Mtxt, vbYesNoCancel)
If x = vbCancel Then Exit Sub
If x = vbYes Then
Name FolderPath & Value As FolderPath & Fname & "."& Fext
End If
End If
If Err <> 0 Then
MsgBox "Error"
Exit Sub
End If
On Error GoTo 0
End If
End If
Value = Dir
Loop
For Each Folder In Folders
Call Recursive(FolderPath & Folder & "\", Fnd, Rplc)
Next
End Sub
If this accomplishes what you want, why not put a pause of some kind after the loop that accomplishes your goal completes. For instance-
...
End If
If MsgBox("Continue?", vbYesNo, "Confirm") = vbNo Then Exit Sub
...
I'm having a hard time linking what the code does to what your question suggests. It seems that the code renames files and folders. Can you explain a bit more about your goal?
I would like to make folder and file by reading the following paths
/project/tags/folder2/command.txt
/project/branches/folder1/folder1.1/Notes.docx
and construct folders and files under drive D:\ likes this
project
tags
folder2
command.txt
branches
folder1
folder1.1
Notes.docx
.Then use this physical structure to type tree view with hyperlinks(Please assume I mark * for the names that words have hyperlinks) at last files and folders in excel sheet using vba macro.See
project
|_tags
| |_folder2*
| |_command.txt*
|_branches
| |_folder1
| |_folder1.1*
| |_Notes.docx*
So please help something for vba noob.
I think that should do the trick.
This macro will take folder path from cell A1 and list recursively its contents and subfolder contents with hyperlinks.
Update: fixed, now it's working. :)
Public Position As Integer
Public Indent As Integer
Sub ListFileTree()
Position = 0
Indent = 0
Call RecurseFolderList(Range("A1").Value)
End Sub
Private Sub ClearFormatting(Rng As Range)
Rng.Formula = Rng.Value2
Rng.Font.ColorIndex = xlAutomatic
Rng.Font.Underline = xlUnderlineStyleNone
End Sub
Function GetFilenameFromPath(ByVal strPath As String) As String
If Right$(strPath, 1) <> "\" And Len(strPath) > 0 Then
GetFilenameFromPath = GetFilenameFromPath(Left$(strPath, Len(strPath) - 1)) + Right$(strPath, 1)
End If
End Function
Function RecurseFolderList(FolderName As String) As Boolean
On Error Resume Next
Dim FSO, NextFolder, FolderArray, FileArray, NextFile
Dim OriginalRange As Range
Dim RemoveHyperlink As Boolean
Set FSO = CreateObject("Scripting.FileSystemObject")
If Err.Number > 0 Then
RecurseFolderList = False
Exit Function
End If
On Error GoTo 0
If FSO.FolderExists(FolderName) Then
Set NextFolder = FSO.GetFolder(FolderName)
Set FolderArray = NextFolder.SubFolders
Set FileArray = NextFolder.Files
RemoveHyperlink = False
Set OriginalRange = Range("A2").Offset(Position - 1, Indent)
Indent = Indent + 1
For Each NextFolder In FolderArray
Range("A2").Offset(Position, Indent).Formula = "=HYPERLINK(""" & NextFile & """,""" & UCase(GetFilenameFromPath(NextFolder)) & """)"
Position = Position + 1
RecurseFolderList (NextFolder)
RemoveHyperlink = True
Next
For Each NextFile In FileArray
Range("A2").Offset(Position, Indent).Formula = "=HYPERLINK(""" & NextFile & """,""" & GetFilenameFromPath(NextFile) & """)"
Position = Position + 1
RemoveHyperlink = False
DoEvents
Next
If RemoveHyperlink Then
Call ClearFormatting(OriginalRange)
End If
Set NextFolder = Nothing
Set FolderArray = Nothing
Set FileArray = Nothing
Set NextFile = Nothing
Else
RecurseFolderList = False
End If
Set FSO = Nothing
Indent = Indent - 1
End Function