Open link retrieved from IE in google chrome - excel

Since I am not able to save the files in internet explorer, I`m trying to copy the links to google chrome because it downloades automatically. Only problem is that the links are not working or I am not using the proper references.
If I turn on the msgbox it will give me the exact url that I need but It doesn't work when I enter it in the shell line. What am I doing wrong here? Discard the code below downloadfiles, this was a first attempt to store all the links and save them all at once (but I can`t even save 1 file)
Dim allHREFs As New Collection
Dim TableName As String
For iRow = 0 To 1
TableName = "docTypeForm:documentTbl:" & iRow & ":j_idt250"
On Error GoTo DownloadFiles
Set allLinks = obJIE.Document.getElementById(TableName).getElementsByTagName("a")
For Each link In allLinks
allHREFs.Add link.href
testing = link.href
'MsgBox testing
Shell ("C:\Program Files (x86)\Google\Chrome\Application\Chrome.exe" & " testing")
Next link
Next iRow
DownloadFiles:
'For j = 1 To allHREFs.Count
'obJIE.Navigate Url + allHREFs(j)
'Application.Wait (Now + TimeValue("0:00:04"))
'Next

Related

URL search in same tab using google chorme with VBA

I want to screen a lot of websites quickly using VBA. I have to have a quick look at them myself, so I basically just need my code to search for a new URL every other second in the same tab.
The existing VBA code I use is:
Sub test
End = Range("A2:C2").End(xlDown).Row
Dim chromepath As String
chromepath = """chrome.exe"""
Dim Weburl As String
Dim I As Integer
For I = 2 To end
Application.Wait (Now + TimeValue("0:00:02"))
Weburl = Cells(I, 1).Value & """"
Shell (chromepath & " -url " & Cells(I, 1))
end sub
The issue I am having is that it opens a lot of tabs, instead of searching in the same tab every other second. Has anyone found a solution for this?

VBA unable to read files after closing and reopen excel

Assuming I have 3 files all in same folder (item.xlsx, master.xlsx, transfer.xlsm) Main purpose is to transfer data from item to master.
I do all the codes inside transfer.xlsm and allow users to input file name and column mappings. I have been doing few hours for the codes and have tested several times and is working perfectly fine. With a click of a button is able to read data from item.xlsx and copy over to master.xlsx according to the column mapping.
However problem arise when i close all 3 files and reopen again. I open up all 3 file, when i click on the button on transfer,xlsm it show file not found which is the error handling i did. I did tried creating a new folder on my desktop and create a brand new transfer.xlsm inside, i copy the item and master file over and my code into my new button. It actually able to work but when i close and reopen in that new folder is not working,
Basically is working fine when i am working on it, when i close it totally and reopen it is unable to detect the 2 files.
cell values are entered inside transfer.xlsm according to user input
source = Cells(5, 2)
sourceSheet = Cells(6, 2)
sourceSheetRow = Cells(7, 2) - 1
destination = Cells(8, 2)
destinationSheet = Cells(9, 2)
destinationSheetRow = Cells(10, 2) - 1
source = source + ".xlsx"
destination = destination + ".xlsx"
rows = Cells(11, 2)
If FileExists(source) = False Or FileExists(destination) = False Then
MsgBox "File not found, please double check file name and make sure is in the same folder"
Exit Sub
End If
For i = 1 To rows
...
Next i
Function FileExists(FilePath As String) As Boolean
Dim TestStr As String
TestStr = ""
On Error Resume Next
TestStr = Dir(FilePath)
On Error GoTo 0
If TestStr = "" Then
FileExists = False
Else
FileExists = True
End If
End Function
I created this transfer.xlsm so that i can send it to people if they want to copy chunks of data from one excel to another instead of copy paste row by row. Hope someone can give me some guidance
Based on the information you provided, I am assuming that the information provided by the users is just the file name without the extension: item or master, and not the full file path C:\SampleFolder\item.xlsx or C:\SampleFolder\master.xlsx. Additionally, I am assuming when you run this code all three files must be in the same folder.
If this is the case, try using ThisWorkbook.Path, you can apply this to your Source and Destination paths to ensure that the appropriate file path is being used.
Dim sPath as String
sPath = ThisWorkbook.Path + "\"
source = Cells(5, 2)
sourceSheet = Cells(6, 2)
sourceSheetRow = Cells(7, 2) - 1
destination = Cells(8, 2)
destinationSheet = Cells(9, 2)
destinationSheetRow = Cells(10, 2) - 1
source = source + ".xlsx"
destination = destination + ".xlsx"
If FileExists(sPath + source) = False Or FileExists(sPath + destination) = False Then
MsgBox "File not found, please double check file name and make sure is in the same folder"
Exit Sub
End If
...

Using function to open and update values in external workbooks, but returning source errors

I've been using a function from another StackOverflow question (I'm SO sorry I can't find the original answer!) to help go through a number of cells in Column L that contains a formula that spits our a hyperlinked filepath. It is meant to open each one (workbook), update the values, then save and close the workbook before opening the next one. See below.
Sub List_UpdateAndSave()
Dim lr As Long
Dim i As Integer
Dim WBSsource As Workbook
Dim FileNames As Variant
Dim msg As String
' Update the individual credit models
With ThisWorkbook.Sheets("List")
lr = .Cells(.Rows.Count, "L").End(xlUp).Row
FileNames = .Range("L2:L" & lr).Value
End With
For i = LBound(FileNames, 1) To UBound(FileNames, 1)
On Error Resume Next
If FileNames(i, 1) Like "*.xls*" Then
Set WBSsource = Workbooks.Open(FileNames(i, 1), _
ReadOnly:=False, _
Password:="", _
UpdateLinks:=3)
If Err = 0 Then
With WBSsource
'do stuff here
.Save
.Close True
End With
Else
msg = msg & FileNames(i, 1) & Chr(10)
On Error GoTo 0
End If
End If
Set WBSsource = Nothing
Next i
If Len(msg) > 0 Then
MsgBox "The Following Files Could Not Be Opened" & _
Chr(10) & msg, 48, "Error"
End If
End Sub
The problem now is I am using this to work on a Network drive, and as a result it cause pathing issues with the Connections/Edit Links part. Each of the files are stored on S:\... which as a result of using the Hyperlink formula, won't be able to find the source data. See below the example image of a file that as been opened through a hyperlink cell from my original workbook. When I go to update the Edit Links section of it, it shows these errors.
If I open that lettered drive in Windows Explorer and find the file, it works with no problems. Open, Update Values > Save > Close, it says unknown...
(but if I click Update values here they update correctly.)
If opened using a Hyperlink formula in a cell (Also directing to S:\..) it says it contains links that cannot be updated. I choose to edit links and they're all "Error: Source not found". The location on them also starts off with \\\corp\... and not S:\.
Anyway to fix this? Apologies for the long winded question.
I'm adding this as an answer as it contains code and is a bit long for a comment.
I'm not sure if it's what you're after though.
The code will take the mapped drive and return the network drive, or visa-versa for Excel files. DriveMap is the variable containing the final string - you may want to adapt into a function.
Sub UpdatePath()
Dim oFSO As Object
Dim oDrv As Object
Dim FileName As String
Dim DriveMap As String
Set oFSO = CreateObject("Scripting.FileSystemObject")
FileName = Range("A1")
If InStr(oFSO.GetExtensionName(FileName), "xls") > 0 Then
For Each oDrv In oFSO.drives
If oDrv.sharename <> "" Then
'Changes \\corp\.... to S:\
If InStr(FileName, oDrv.sharename) = 1 Then
DriveMap = Replace(FileName, oDrv.sharename, oDrv.Path)
End If
'Changes S:\ to \\corp\....
' If InStr(FileName, oDrv.Path) = 1 Then
' DriveMap = Replace(FileName, oDrv.Path, oDrv.sharename)
' End If
End If
Next oDrv
End If
End Sub

Extracting a row from a CSV file quickly in Excel VBA

I have about 5000 .csv files and I want to search for one row in each file and extract it. I have pasted the key part of code below, which works, but as I have to open and close each .csv file, the process is slow for 5000 files. Is there any way to read a csv file without opening it? I had considered writing a small script to convert each csv file to Excel first? Thx.
SP_File_Name = Dir(DN_Path & "*.*")
Count = 1
Set START_CELL_RANGE = TARGET_SP_SHEET.Range("B3")
Set TICKER_CODE_RANGE = TARGET_SP_SHEET.Range("B1")
While (SP_File_Name <> "")
SP_Full_Path = DN_Path & SP_File_Name
Workbooks.OpenText Filename:=SP_Full_Path, DataType:=xlDelimited, comma:=True, Local:=True
Set INPUT_WORKBOOK = ActiveWorkbook
Set INPUT_SHEET = INPUT_WORKBOOK.Worksheets(1)
INPUT_SHEET.Range("$A$1").Select
Set INPUT_RANGE = ActiveCell.CurrentRegion
Set INPUT_FIRST_MATCH_RANGE = INPUT_RANGE.Find(TICKER_CODE_RANGE)
If INPUT_FIRST_MATCH_RANGE Is Nothing Then
GoTo NOT_FOUND
End If
START_CELL = START_CELL_RANGE.Address
TARGET_SP_SHEET.Range(START_CELL_RANGE.Address, START_CELL_RANGE.Offset(0, 6).Address).Value = INPUT_SHEET.Range(INPUT_FIRST_MATCH_RANGE.Address, INPUT_FIRST_MATCH_RANGE.Offset(0, 7).Address).Value
' write diagnostics
Sheet5.Range("K" & Count + 4).Value = START_CELL
Sheet5.Range("L" & Count + 4).Value = "$A$1"
Sheet5.Range("M" & Count + 4).Value = INPUT_FIRST_MATCH_RANGE.Address
Sheet5.Range("N" & Count + 4).Value = INPUT_FIRST_MATCH_RANGE.Offset(0, 7).Address
NOT_FOUND:
Set START_CELL_RANGE = START_CELL_RANGE.Offset(1, 0)
Workbooks(SP_File_Name).Close SaveChanges:=False
SP_File_Name = Dir
Count = Count + 1
Wend
To call a cmd command from VBA, I have used WshShell. For early binding I set a reference to the Windows Script Host Object Model
One problem with the Shell function is that it runs asynchronously. By using the WshShell Run method, you can have it wait until finished before executing subsequent commands.
Sample code might look as follows:
Option Explicit
Sub foo()
Dim WSH As WshShell
Dim lErrCode As Long
Set WSH = New WshShell
lErrCode = WSH.Run("cmd /c findstr /C:""Power"" ""C:\Users\Ron\filelist.txt"" > ""C:\Users\Ron\Results2.txt""", 1, True)
If lErrCode <> 0 Then
MsgBox "Error Code: " & lErrCode
Stop
End If
Set WSH = Nothing
Call Shell
End Sub
With regard to your command that you showed in your comment, I would ensure that VBA is interpreting the string correctly for the cmd prompt. Looking at your code line, I would wonder whether you are missing a space between the search string and the file path.
I don't think you can read the contents of a file without opening it. Why not just merge all 5000 files into 1 single file and read that into Excel. Certainly that will be much faster. Use the Command Window, point it to the folder that contains all 5000 files, and enter this:
copy *.csv merge.csv
See the link below for an example.
http://analystcave.com/merge-csv-files-or-txt-files-in-a-folder/

Extract Data from PDF and Add to Worksheet

I am trying to extract the data from a PDF document into a worksheet. The PDFs show and text can be manually copied and pasted into the Excel document.
I am currently doing this through SendKeys and it is not working. I get an error when I try to paste the data from the PDF document. Why is my paste not working? If I paste after the macro has stopped running it pastes as normal.
Dim myPath As String, myExt As String
Dim ws As Worksheet
Dim openPDF As Object
'Dim pasteData As MSForms.DataObject
Dim fCell As Range
'Set pasteData = New MSForms.DataObject
Set ws = Sheets("DATA")
If ws.Cells(ws.Rows.Count, "A").End(xlUp).Row > 1 Then Range("A3:A" & ws.Cells(ws.Rows.Count, "A").End(xlUp).Row).ClearContents
myExt = "\*.pdf"
'When Scan Receipts Button Pressed Scan the selected folder/s for receipts
For Each fCell In Range(ws.Cells(1, 1), ws.Cells(1, ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column))
myPath = Dir(fCell.Value & myExt)
Do While myPath <> ""
myPath = fCell.Value & "\" & myPath
Set openPDF = CreateObject("Shell.Application")
openPDF.Open (myPath)
Application.Wait Now + TimeValue("00:00:2")
SendKeys "^a"
Application.Wait Now + TimeValue("00:00:2")
SendKeys "^c"
'Application.Wait Now + TimeValue("00:00:2")
ws.Select
ActiveSheet.Paste
'pasteData.GetFromClipboard
'ws.Cells(3, 1) = pasteData.GetText
Exit Sub
myPath = Dir
Loop
Next fCell
You can open the PDF file and extract its contents using the Adobe library (which I believe you can download from Adobe as part of the SDK, but it comes with certain versions of Acrobat as well)
Make sure to add the Library to your references too (On my machine it is the Adobe Acrobat 10.0 Type Library, but not sure if that is the newest version)
Even with the Adobe library it is not trivial (you'll need to add your own error-trapping etc):
Function getTextFromPDF(ByVal strFilename As String) As String
Dim objAVDoc As New AcroAVDoc
Dim objPDDoc As New AcroPDDoc
Dim objPage As AcroPDPage
Dim objSelection As AcroPDTextSelect
Dim objHighlight As AcroHiliteList
Dim pageNum As Long
Dim strText As String
strText = ""
If (objAvDoc.Open(strFilename, "") Then
Set objPDDoc = objAVDoc.GetPDDoc
For pageNum = 0 To objPDDoc.GetNumPages() - 1
Set objPage = objPDDoc.AcquirePage(pageNum)
Set objHighlight = New AcroHiliteList
objHighlight.Add 0, 10000 ' Adjust this up if it's not getting all the text on the page
Set objSelection = objPage.CreatePageHilite(objHighlight)
If Not objSelection Is Nothing Then
For tCount = 0 To objSelection.GetNumText - 1
strText = strText & objSelection.GetText(tCount)
Next tCount
End If
Next pageNum
objAVDoc.Close 1
End If
getTextFromPDF = strText
End Function
What this does is essentially the same thing you are trying to do - only using Adobe's own library. It's going through the PDF one page at a time, highlighting all of the text on the page, then dropping it (one text element at a time) into a string.
Keep in mind what you get from this could be full of all kinds of non-printing characters (line feeds, newlines, etc) that could even end up in the middle of what look like contiguous blocks of text, so you may need additional code to clean it up before you can use it.
Hope that helps!
I know this is an old issue but I just had to do this for a project at work, and I am very surprised that nobody has thought of this solution yet:
Just open the .pdf with Microsoft word.
The code is a lot easier to work with when you are trying to extract data from a .docx because it opens in Microsoft Word. Excel and Word play well together because they are both Microsoft programs. In my case, the file of question had to be a .pdf file. Here's the solution I came up with:
Choose the default program to open .pdf files to be Microsoft Word
The first time you open a .pdf file with word, a dialogue box pops up claiming word will need to convert the .pdf into a .docx file. Click the check box in the bottom left stating "do not show this message again" and then click OK.
Create a macro that extracts data from a .docx file. I used MikeD's Code as a resource for this.
Tinker around with the MoveDown, MoveRight, and Find.Execute methods to fit the need of your task.
Yes you could just convert the .pdf file to a .docx file but this is a much simpler solution in my opinion.
Over time, I have found that extracting text from PDFs in a structured format is tough business. However if you are looking for an easy solution, you might want to consider XPDF tool pdftotext.
Pseudocode to extract the text would include:
Using SHELL VBA statement to extract the text from PDF to a temporary file using XPDF
Using sequential file read statements to read the temporary file contents into a string
Pasting the string into Excel
Simplified example below:
Sub ReadIntoExcel(PDFName As String)
'Convert PDF to text
Shell "C:\Utils\pdftotext.exe -layout " & PDFName & " tempfile.txt"
'Read in the text file and write to Excel
Dim TextLine as String
Dim RowNumber as Integer
Dim F1 as Integer
RowNumber = 1
F1 = Freefile()
Open "tempfile.txt" for Input as #F1
While Not EOF(#F1)
Line Input #F1, TextLine
ThisWorkbook.WorkSheets(1).Cells(RowNumber, 1).Value = TextLine
RowNumber = RowNumber + 1
Wend
Close #F1
End Sub
Since I do not prefer to rely on external libraries and/or other programs, I have extended your solution so that it works.
The actual change here is using the GetFromClipboard function instead of Paste which is mainly used to paste a range of cells.
Of course, the downside is that the user must not change focus or intervene during the whole process.
Dim pathPDF As String, textPDF As String
Dim openPDF As Object
Dim objPDF As MsForms.DataObject
pathPDF = "C:\some\path\data.pdf"
Set openPDF = CreateObject("Shell.Application")
openPDF.Open (pathPDF)
'TIME TO WAIT BEFORE/AFTER COPY AND PASTE SENDKEYS
Application.Wait Now + TimeValue("00:00:2")
SendKeys "^a"
Application.Wait Now + TimeValue("00:00:2")
SendKeys "^c"
Application.Wait Now + TimeValue("00:00:1")
AppActivate ActiveWorkbook.Windows(1).Caption
objPDF.GetFromClipboard
textPDF = objPDF.GetText(1)
MsgBox textPDF
If you're interested see my project in github.
Copying and pasting by user interactions emulation could be not reliable (for example, popup appears and it switches the focus). You may be interested in trying the commercial ByteScout PDF Extractor SDK that is specifically designed to extract data from PDF and it works from VBA. It is also capable of extracting data from invoices and tables as CSV using VB code.
Here is the VBA code for Excel to extract text from given locations and save them into cells in the Sheet1:
Private Sub CommandButton1_Click()
' Create TextExtractor object
' Set extractor = CreateObject("Bytescout.PDFExtractor.TextExtractor")
Dim extractor As New Bytescout_PDFExtractor.TextExtractor
extractor.RegistrationName = "demo"
extractor.RegistrationKey = "demo"
' Load sample PDF document
extractor.LoadDocumentFromFile ("c:\sample1.pdf")
' Get page count
pageCount = extractor.GetPageCount()
Dim wb As Workbook
Dim ws As Worksheet
Dim TxtRng As Range
Set wb = ActiveWorkbook
Set ws = wb.Sheets("Sheet1")
For i = 0 To pageCount - 1
RectLeft = 10
RectTop = 10
RectWidth = 100
RectHeight = 100
' check the same text is extracted from returned coordinates
extractor.SetExtractionArea RectLeft, RectTop, RectWidth, RectHeight
' extract text from given area
extractedText = extractor.GetTextFromPage(i)
' insert rows
' Rows(1).Insert shift:=xlShiftDown
' write cell value
Set TxtRng = ws.Range("A" & CStr(i + 2))
TxtRng.Value = extractedText
Next
Set extractor = Nothing
End Sub
Disclosure: I am related to ByteScout
Using Bytescout PDF Extractor SDK is a good option. It is cheap and gives plenty of PDF related functionality. One of the answers above points to the dead page Bytescout on GitHub. I am providing a relevant working sample to extract table from PDF. You may use it to export in any format.
Set extractor = CreateObject("Bytescout.PDFExtractor.StructuredExtractor")
extractor.RegistrationName = "demo"
extractor.RegistrationKey = "demo"
' Load sample PDF document
extractor.LoadDocumentFromFile "../../sample3.pdf"
For ipage = 0 To extractor.GetPageCount() - 1
' starting extraction from page #"
extractor.PrepareStructure ipage
rowCount = extractor.GetRowCount(ipage)
For row = 0 To rowCount - 1
columnCount = extractor.GetColumnCount(ipage, row)
For col = 0 To columnCount-1
WScript.Echo "Cell at page #" +CStr(ipage) + ", row=" & CStr(row) & ", column=" & _
CStr(col) & vbCRLF & extractor.GetCellValue(ipage, row, col)
Next
Next
Next
Many more samples available here: https://github.com/bytescout/pdf-extractor-sdk-samples
To improve the solution of Slinky Sloth I had to add this beforere get from clipboard :
Set objPDF = New MSForms.DataObject
Sadly it didn't worked for a pdf of 10 pages.
This doesn't seem to work with the Adobe Type library. As soon as it gets to Open, I get a 429 error. Acrobat works fine though...

Resources