Extract file names from a File Explorer search into Excel - excel

This has been bugging me for a while: I feel I have a few pieces of the puzzle but I can't put them all together.
My goal is to search all PDFs in a given location for a keyword or phrase within the content of the files (not the filename), and then use the search results to populate an Excel spreadsheet.
Before we start: I know this is easy to do with the Acrobat Pro API, but my company is not going to pay for licences for everyone just so that this one macro will work.
The Windows File Explorer search accepts Advanced Query Syntax and will search inside the contents of files, assuming the correct IFilters are enabled. E.g. if you have a Word document called doc1.docx whose text reads "blahblahblah" and you search for "blah", doc1.docx will appear in the results.
As far as I know, this cannot be achieved using the FileSystemObject, but it would be really useful if someone could confirm either way.
I have some simple code that opens an Explorer window and searches for a string within the contents of all files in the given location. Once the search has completed, I have an Explorer window listing all the required files. How do I take this list and populate an Excel sheet with the filenames of these files?
Dim eSearch As String
eSearch = "explorer " & Chr$(34) & "search-ms://query=System.Generic.String:" & [search term here] & "&crumb=location:" & [Directory Here] & Chr$(34)
Call Shell (eSearch)

Assuming the location is indexed you can access the catalog directly with ADO (add a reference to Microsoft ActiveX Data Objects 2.x):
Dim cn As New ADODB.Connection
Dim rs As New ADODB.Recordset
Dim sql As String
cn.Open "Provider=Search.CollatorDSO;Extended Properties='Application=Windows'"
sql = "SELECT System.ItemNameDisplay, System.ItemPathDisplay FROM SystemIndex WHERE SCOPE='file:C:\look\here' AND System.Kind <> 'folder' AND CONTAINS(System.FileName, '""*.PDF""') AND CONTAINS ('""find this text""')"
rs.Open sql, cn, adOpenForwardOnly, adLockReadOnly
If Not rs.EOF Then
    Do While Not rs.EOF
        Debug.Print "File: "; rs.Collect(0)
        Debug.Print "Path: "; rs.Collect(1)
        rs.MoveNext
    Loop
End If
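If the folder and the search phrase come from user input, it may help to assemble the query text in one place. Below is a minimal sketch in Python of building the same statement. The function name and its parameters are made up for illustration, and it only produces the SQL string; actually running it still needs the Search.CollatorDSO provider on Windows.

```python
def build_search_sql(folder: str, phrase: str, extension: str = "pdf") -> str:
    """Build a Windows Search SQL string shaped like the query in the answer above."""

    def esc(text: str) -> str:
        # Double quotes delimit the phrase inside CONTAINS, so drop them;
        # single quotes delimit SQL literals, so double them.
        return text.replace('"', "").replace("'", "''")

    return (
        "SELECT System.ItemNameDisplay, System.ItemPathDisplay "
        "FROM SystemIndex "
        f"WHERE SCOPE='file:{esc(folder)}' "
        "AND System.Kind <> 'folder' "
        f"AND CONTAINS(System.FileName, '\"*.{esc(extension)}\"') "
        f"AND CONTAINS('\"{esc(phrase)}\"')"
    )


if __name__ == "__main__":
    print(build_search_sql(r"C:\look\here", "find this text"))
```

The same string concatenation can be done in VBA; the point is only that the quoting of the folder and phrase is handled in a single helper.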

Try using the next function, please:
Function GetFilteredFiles(foldPath As String) As Collection
'If using a reference to `Microsoft Internet Controls (ShDocVW.dll)_____________________
'uncomment the next 2 lines and comment the following three (without any reference part)
'Dim ExpWin As SHDocVw.ShellWindows, CurrWin As SHDocVw.InternetExplorer
'Set ExpWin = New SHDocVw.ShellWindows
'_______________________________________________________________________________________
'Without any reference:_____________________________________
Dim ExpWin As Object, CurrWin As Object, objshell As Object
Set objshell = CreateObject("Shell.Application")
Set ExpWin = objshell.Windows
'___________________________________________________________
Dim Result As New Collection, oFolderItems As Object, i As Long
Dim CurrSelFile As String
For Each CurrWin In ExpWin
    If Not CurrWin.Document Is Nothing Then
        If Not CurrWin.Document.FocusedItem Is Nothing Then
            If Left(CurrWin.Document.FocusedItem.Path, _
                    InStrRev(CurrWin.Document.FocusedItem.Path, "\")) = foldPath Then
                Set oFolderItems = CurrWin.Document.Folder.Items
                'FolderItems is zero-based, so the last valid index is Count - 1
                For i = 0 To oFolderItems.Count - 1
                    Result.Add oFolderItems.Item(CLng(i)).Name
                Next i
            End If
        End If
    End If
Next CurrWin
Set GetFilteredFiles = Result
End Function
As written, the function works without any reference.
The function must be called after you have executed the search query in your existing code. For testing, it can be called like this:
Sub testGetFilteredFiles()
    Dim C As Collection, El As Variant
    Set C = GetFilteredFiles("C:\Teste VBA Excel\") 'use the folder path you searched in
    For Each El In C
        Debug.Print El
    Next
End Sub
The above solution iterates over all open Explorer windows and returns what is visible there (after filtering) for the folder you initially searched.
You can test it manually by searching for something in a specific folder and then calling the function with that folder path as the argument (including the trailing backslash).

I've forgotten everything I ever knew about VBA, but recently stumbled across an easy way to execute Explorer searches using the Shell.Application COM object. My code is PowerShell, but the COM objects & methods are what's critical. Surely someone here can translate.
This has what I think are several advantages:
The query text is identical to what you would type in the Search Bar in Explorer, e.g. 'Ext:pdf Content:compressor'
It's easily launched from code and results are easily extracted with code, but SearchResults window is available for visual inspection/review.
With looping & pauses, you can execute a series of searches in the same window.
I think this ability has been sitting there forever, but the MS documentation of the Document object & FilterView method makes no mention of how they apply to File Explorer.
I hope others find this useful.
$FolderToSearch = 'c:\Path\To\Folder'
$SearchBoxText = 'ext:pdf Content:compressor'
$Shell = New-Object -ComObject shell.application
### Get handles of currently open Explorer Windows
$CurrentWindows = ( $Shell.Windows() | Where FullName -match 'explorer.exe$' ).HWND
$WinCount = $Shell.Windows().Count
$Shell.Open( $FolderToSearch )
Do { Sleep -m 50 } Until ( $Shell.Windows().Count -gt $WinCount )
$WindowToSearch = ( $Shell.Windows() | Where FullName -match 'explorer.exe$' ) | Where { $_.HWND -notIn $CurrentWindows }
$WindowToSearch.Document.FilterView( $SearchBoxText )
Do { Sleep -m 50 } Until ( $WindowToSearch.ReadyState -eq 4 )
### Fully-qualified name:
$FoundFiles = ( $WindowToSearch.Document.Folder.Items() ).Path
### or just the filename:
$FoundFiles = ( $WindowToSearch.Document.Folder.Items() ).Name
### $FoundFiles is an array of strings containing the names.
### The Excel portion I leave to you! :D
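For the Excel side, one low-tech option is to write the found paths to a CSV file, which Excel opens directly. A rough sketch in Python, assuming you already have the list of full paths from the search; the function name and the two column headers are hypothetical:

```python
import csv
import io
import ntpath  # Windows path rules, usable on any OS


def write_results_csv(paths, out):
    """Write one row per found file: file name, then full path."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["File", "Path"])
    for path in paths:
        writer.writerow([ntpath.basename(path), path])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_results_csv([r"C:\Path\To\Folder\compressor.pdf"], buffer)
    print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, pass a file opened with `open(name, "w", newline="")`.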

Related

Is there a resolution in VBA for the error Run-time error '429' when trying to connect to a website?

I am trying to write a VBA macro that downloads a Google Sheet into Excel so I can compile stock market data daily. I would simply use Power Query for this, but I am doing this on my personal laptop, which is a Mac and does not support Power Query. I am relatively new to coding, so I have been relying on online instructions. The instructions include this line:
Set objWebCon = CreateObject("MSXML2.XMLHTTP.3.0")
When run, this line raises the following error:
Run-time error '429':
ActiveX component can't create object
I think the issue is that the instructions were written for Windows. Every solution I've found is specific to Windows.
Does anybody here know if I can change the "MSXML2.XMLHTTP.3.0" part of my code to make it work on a Mac? I'm not sure this is what needs to be done, but any guidance would be much appreciated.
My full code is below, but feel free to ignore it if it's not relevant. Thank you!!
Sub DownloadGoogleSheets()
Dim ShtUrl As String, Location As String, FileName As String
Dim objWebCon, objWrit As Object
'Sheet Url
ShtUrl = "https://docs.google.com/spreadsheets/d/1wpA_epxtlz96sxETqKttJwsy9Aubb15H8xslcSQ20T0/export?format=csv&id=1wpA_epxtlz96sxETqKttJwsy9Aubb15H8xslcSQ20T0&gid=1319327791"
'Location
Location = ThisWorkbook.Path & "\" '/Users/[myName]/Desktop/Stock Analysis/n"
'FileName
FileName = "GoogleSheet.csv"
'Connection to Website
Set objWebCon = CreateObject("MSXML2.XMLHTTP.3.0")
'Writer
Set objWrit = CreateObject("ADODB.Stream")
'Connecting to the Website
objWebCon.Open "Get", ShtUrl, False
objWebCon.Send (ShtUrl)
'Once page is fully loaded
If objWebCon.Status = 200 Then
'Write the text of the sheet
objWrit.Open
objWrit.Type = 1
objWrit.Write objWebCon.ResponseBody
objWrit.Position = 0
objWrit.SaveToFile Location & FileName
objWrit.Close
End If
Set objWebCon = Nothing
Set objWrit = Nothing
End Sub

Remove strange characters from .mdb field names

I have a large number of .mdb files (as in Microsoft Access db files). The first field (or column) is supposed to be named, say, MyField1. However, the files are corrupted so that the actual field name is \ufeffMyField1; in other words, 0xFEFF is prepended to the actual field name.
I'm trying to copy the field in question from \ufeffMyField1 to NewField using the pyodbc-command
cursor.execute("UPDATE MyTable SET NewField=" + colname + ";")
where colname is the erroneous field name (assume that NewField already exists).
The value of colname is fetched with pyodbc using something like
rows = cur.columns(table='MyTable')
for row in rows:
    if "MyField1" in row.column_name:
        colname = row.column_name
Executing the UPDATE... command yields a driver error that the MaxLocksPerFile Ms Access parameter is too low, as described here https://support.microsoft.com/en-us/help/815281/-file-sharing-lock-count-exceeded-error-message-during-large-transacti.
However, I've increased the MaxLocksPerFile parameter by several orders of magnitude while opening only a single file in the program, so it seems that is not the actual problem.
Note that I can open the file in MS Access and rename the field in the GUI without any problem. I have not found a way to see in the GUI that the field is incorrectly named, presumably because the extra bytes don't match common encodings.
Finally this leads me to my question: how can I pass raw bytes as pyodbc-commands?
Alternatively please suggest in comments if you have other ways to solve the real-world problem of removing the extra characters? (as in clean out all non-ASCII characters from the field names)
If you want to use DAO, then it is pretty simple. You can modify the following code to loop through all the databases in a folder.
Sub Rename_First_Field()
    Dim dbs As DAO.Database
    Dim tdf As DAO.TableDef
    Dim fld As DAO.Field
    Set dbs = OpenDatabase("C:\....\SomeDB.mdb")
    For Each tdf In dbs.TableDefs ' Spin through all tables in this database
        If Left$(tdf.Name, 4) <> "MSys" Then ' Skip the Access system tables
            Set fld = tdf.Fields(0) ' Grab the first field
            Debug.Print tdf.Name & vbTab & "|" & vbTab & fld.Name
            fld.Name = "MyField1" ' Rename to 'MyField1'
        End If
    Next tdf ' Move to next table
    dbs.Close
    Set fld = Nothing
    Set tdf = Nothing
    Set dbs = Nothing
End Sub
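On the Python side, the name clean-up itself (dropping the BOM and any other non-ASCII characters) can be sketched as below. The helper names are hypothetical. Wrapping the original name in square brackets is the usual Access way to refer to odd field names, but whether that gets the UPDATE past the driver error in the question is an assumption, not something tested against these files.

```python
def clean_field_name(name: str) -> str:
    """Drop the BOM and any other non-ASCII characters from a field name."""
    return name.encode("ascii", errors="ignore").decode("ascii")


def quote_identifier(name: str) -> str:
    """Wrap a field name in square brackets for use in Access SQL."""
    if "]" in name:
        raise ValueError("field names containing ']' are not handled here")
    return f"[{name}]"


if __name__ == "__main__":
    colname = "\ufeffMyField1"  # as reported by cursor.columns()
    print(clean_field_name(colname))
    print("UPDATE MyTable SET NewField = " + quote_identifier(colname) + ";")
```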

Determine Process ID with VBA

Situation - I have a macro where I need to send keystrokes to two Firefox windows in order. Unfortunately both windows have the same title. To handle this, I activate the window, send my keystrokes, use F6 to load the URL of the second window, send the keystrokes there, and then use F6 again to return to the original page.
The issue is that loading the webpages is unreliable. Page load speeds vary so much that using a wait command is not consistent or reliable to ensure the keystroke makes it to the second window.
Question -
I've read a scattering of posts mentioning that AppActivate will work with process IDs. Since each window would have its own PID, that would be an ideal way to handle two windows with the same title. I am unable to find information on how to determine the PID of each window with a given name.
You can use something like the following. You'll have to tinker about with the different info available in the Win32_Process class to figure out which window is which. It's also important to keep in mind that one window could mean many processes.
Public Sub getPID()
    Dim objServices As Object, objProcessSet As Object, Process As Object
    Set objServices = GetObject("winmgmts:\\.\root\CIMV2")
    Set objProcessSet = objServices.ExecQuery("SELECT ProcessID, name FROM Win32_Process WHERE name = ""firefox.exe""", , 48)
    'you may find more than one processid depending on your search/program
    For Each Process In objProcessSet
        Debug.Print Process.ProcessID, Process.Name
    Next
    Set objProcessSet = Nothing
End Sub
Since you'll probably want to explore the options with WMI a bit, you may want to add a Tools>>References to the Microsoft WMI library so you don't have to deal with Dim bla as Object. Then you can add breakpoints and see what's going on in the Locals pane.
After adding the reference:
Public Sub getDetailsByAppName()
    Dim objProcessSet As WbemScripting.SWbemObjectSet
    Dim objProcess As WbemScripting.SWbemObject
    Dim objServices As WbemScripting.SWbemServices
    Dim objLocator As WbemScripting.SWbemLocator
    Dim Field As WbemScripting.SWbemProperty
    Dim RecordCount As Long
    'set up wmi for local computer querying
    Set objLocator = New WbemScripting.SWbemLocator
    Set objServices = objLocator.ConnectServer(".") 'local
    'Get all the gory details for a name of a running application
    Set objProcessSet = objServices.ExecQuery("SELECT * FROM Win32_Process WHERE name = ""firefox.exe""", , 48)
    RecordCount = 1
    'Loop through each process returned
    For Each objProcess In objProcessSet
        'Loop through each property/field
        For Each Field In objProcess.Properties_
            Debug.Print RecordCount, Field.Name, Field.Value
        Next
        RecordCount = RecordCount + 1
    Next
    Set objProcessSet = Nothing
    Set objServices = Nothing
    Set objLocator = Nothing
End Sub
That will print out every property of every process found for the name 'firefox.exe'.

After a WMI search in VBScript, can I create my search filter BEFORE my "For Each" statement?

I've created an alternative search utility to the Windows search utility with VBScript using a WQL search, but, as it turns out, it's quite slow. I would like to speed it up and I think I can do it, but I would need to place my search filter AFTER my WQL search and BEFORE my For Each statement. Is this even possible?
I've already tested by filtering in the WQL search, but it's about 40% faster if I filter after the WQL search. I've also tested with and without iFlags, but they tend to slow the search quite a bit, even though MS seems to believe otherwise.
Since the user can search by filename, creation date, last modified date and/or file size, if the filter is after the For Each statement then the script has to create the search filter each time it enumerates a file. I'd like to create the filter once in the hope of shaving some time off the search.
This will probably make more sense when you take a look at the snippet of code I've posted. Note that the sub subCreateSearchString will have calls to other search options and functions (e.g. convert from UTC to local time, format file sizes, etc.)
Dim strSearchName, strComputer, objSWbemServices, objFile, colFiles
Dim strFileName, strReturnedFileName, strQueryDriveAndPath
strSearchName = "test" 'Text being searched for - change as needed
strQueryDriveAndPath = "PATH = '\\Drop_RW\\' AND DRIVE = 'D:'" 'Path and drive in which to search - change as needed
strComputer = "."
Set objSWbemServices = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
Set colFiles = objSWbemServices.ExecQuery("Select * from CIM_DataFile WHERE " & "" & strQueryDriveAndPath & "")
'* I'd like to place the call to "subCreateSearchString" here
On Error Resume Next
For Each objFile in colFiles
strReturnedFileName = objFile.Name
subCreateSearchString ' Search filter - it works when placed here
If strSearchForString Then
MsgBox "File matches:" & vbCrLf & strReturnedFileName
Else
MsgBox "File DOES NOT match" & vbCrLf & strReturnedFileName
End If
Next
Sub subCreateSearchString
'* Set Filename Variable for search:
strFileName = InStr(LCase(strReturnedFileName), LCase(strSearchName))
strSearchForString = strFileName
End Sub
Since you depend on the names of the files you're iterating over in the For Each loop: no, not possible.
I'd strongly recommend making some adjustments, though.
Use a Function rather than a Sub if you want to return something from a subroutine.
Avoid using global variables. They have a nasty tendency of introducing undesired side effects and also make debugging your code a pain in the rear. Pass values into your subroutines via parameters, and return values as actual return values.
The returned value is an integer (or Null), but you use it like a boolean and named your variables (and sub) as if it were a string. Don't do that. Name your functions/procedures after what they're doing, and name your variables after what they contain. And if you want to use a boolean value make your function actually return a boolean value.
Avoid Hungarian Notation. It's pointless code-bloat the way most people use it. Even more if your naming doesn't even match the actual type.
Do not use global On Error Resume Next. Ever. It simply makes your code fail silently without telling you anything about what actually went wrong. Keep error handling as local as possible. Enable it only for single commands or short code blocks, and only if there is no other way to avoid/handle the error.
Function IsInFilename(searchName, fileName)
    IsInFilename = InStr(LCase(fileName), LCase(searchName)) > 0
End Function
For Each objFile in colFiles
    If IsInFilename(strSearchName, objFile.Name) Then
        MsgBox "..."
    Else
        MsgBox "..."
    End If
Next
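The test itself still has to run once per file, but the preparation of the criteria (lower-casing the search text, deciding which checks are active) can be done once, before the loop. A sketch of that idea in Python rather than VBScript; the function name, the dictionary keys and the sample files are all made up for illustration:

```python
def make_file_filter(name_contains=None, min_size=None, max_size=None):
    """Build a predicate once; only the criteria that were supplied are checked."""
    checks = []
    if name_contains is not None:
        needle = name_contains.lower()  # lower-cased once, not per file
        checks.append(lambda f: needle in f["name"].lower())
    if min_size is not None:
        checks.append(lambda f: f["size"] >= min_size)
    if max_size is not None:
        checks.append(lambda f: f["size"] <= max_size)
    return lambda f: all(check(f) for check in checks)


if __name__ == "__main__":
    files = [
        {"name": r"D:\Drop_RW\test1.txt", "size": 500},
        {"name": r"D:\Drop_RW\other.txt", "size": 500},
    ]
    keep = make_file_filter(name_contains="test", min_size=100)  # built once
    for f in files:  # applied per file
        print(f["name"], "matches" if keep(f) else "does not match")
```

Whether this saves measurable time compared with the WQL query itself is not something the sketch can show; it only moves the setup work out of the loop.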

MS Access & Excel: Turning a query with dynamic parameters into something useful

I'm stuck on the problem below because I don't use Access or Excel much and have only basic programming knowledge. So here's the deal:
I just made a fairly simple database in MS Access (2007) with a nice query to retrieve data, depending on which parameters you pass. In Excel (2007), I have this big 'template' which basically has parameters for the query. These parameters change per column & per row!
Perhaps superfluously, e.g.
column A contains paramA (10 different options)
column B contains paramB (8 different options)
column C contains paramC (2 different options)
What I'd like to do is to fill this template with dynamic data from Access, minding the continuously changing parameters.
e.g.
column D contains Query (ParamA, ParamB, ParamC)
I think the best way to go is to write an (inline?) function that retrieves results from the query, passing in the parameters based on the relative cell position. This function would then be used like a normal built-in Excel function (like SUM()).
I just don't know how to call/execute an MS Access query from inside an Excel macro function.
Could someone help me with it? Thank you very much in advance!
A few notes.
Dim cn As Object
Dim rs As Object
''See: http://www.connectionstrings.com/access
strFile = "C:\Docs\AccessDB.mdb"
strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
& ";User Id=admin;Password=;"
Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
cn.Open strCon
strSQL = "SELECT SomeField, OtherField FROM SomeTable " _
& "WHERE SomeText='" & Range("A1") & "'"
rs.Open strSQL, cn
s = rs.GetString
MsgBox s
'' Or, instead of GetString (which reads the recordset through to the end):
Sheets("Sheet2").Cells(2, 1).CopyFromRecordset rs
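With three parameters per cell, building the WHERE clause by string concatenation gets fragile (a stray apostrophe in a cell breaks the SQL). The usual alternative is to bind the values as parameters. The sketch below shows only that pattern, in Python, with sqlite3 standing in for the Access database; the table, column and function names are invented for illustration.

```python
import sqlite3


def run_query(conn, param_a, param_b, param_c):
    """Return the first matching Result value, or None if no row matches."""
    sql = (
        "SELECT Result FROM SomeTable "
        "WHERE ParamA = ? AND ParamB = ? AND ParamC = ?"
    )
    row = conn.execute(sql, (param_a, param_b, param_c)).fetchone()
    return row[0] if row is not None else None


# Demo data in an in-memory database (a stand-in for the Access file).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SomeTable (ParamA TEXT, ParamB TEXT, ParamC TEXT, Result REAL)"
)
conn.executemany(
    "INSERT INTO SomeTable VALUES (?, ?, ?, ?)",
    [("a1", "b1", "c1", 10.5), ("a1", "b2", "c1", 20.0)],
)

if __name__ == "__main__":
    print(run_query(conn, "a1", "b2", "c1"))
```

In VBA the equivalent would be an ADODB.Command with parameters, wrapped in a function that takes the three cell values as arguments.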
To add to Remou's answer also see
Modules: Sample Excel Automation - cell by cell which is slow and
Modules: Transferring Records to Excel with Automation
Late binding means you can safely remove the reference and only get an error when the app executes the lines of code in question, rather than erroring out at startup and locking users out of the app entirely, or failing on a call to Mid, Left or Trim.
This is also very useful when you don't know which version of the external application will be on the target system, or if your organization is in the middle of moving from one version to another.
For more information including additional text and some detailed links see the "Late Binding in Microsoft Access" page

Resources