Extract href link from source code using VBA - excel

Below is the source code that I get after browsing to a website:
<item><a href="/search/Listing/45678489?source=results" id="mk:0:mk" class="details">
I just want to copy the link /search/Listing/45678489?source=results into Excel, and I want to know how to click it.
class="details" is the same for all the href links that I want to copy, while the id keeps incrementing: mk:1:mk, mk:2:mk and so on.

So, on each page you can gather the current set of links in a list, but looking at your example above you will need to prepend the protocol and domain to each URL before writing it out to Excel. I wouldn't try clicking the written-out links (hyperlinks, presumably): that is inefficient and will spawn lots of IE instances that you must remember to close manually.
On any given page, grab the list of links and generate a full URL in each case:
Dim nodes As Object, i As Long
Set nodes = ie.document.querySelectorAll(".details[id^='mk:']")
With ActiveSheet
    For i = 0 To nodes.Length - 1
        .Cells(i + 1, 1) = "protocol + domain...." & nodes.item(i).href
    Next
End With
Then later, rather than clicking, read those URLs into an array, loop over the array, and either issue XMLHTTP requests if possible or .Navigate with IE to the current URL in the array.
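The read-back step could look roughly like the sketch below. It assumes the full URLs were written to column A of the active sheet starting at row 1; the procedure name FetchStoredUrls is made up for illustration, and a synchronous MSXML2.XMLHTTP request stands in for whatever request method suits the site. An unreachable or malformed URL will raise a run-time error here, so real code would want error handling.

```vba
Option Explicit

Public Sub FetchStoredUrls()
    Dim ws As Worksheet
    Dim lastRow As Long, i As Long
    Dim urls As Variant
    Dim http As Object

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row

    ' Range.Value returns a scalar for a single cell, so build the array by hand in that case
    If lastRow = 1 Then
        ReDim urls(1 To 1, 1 To 1)
        urls(1, 1) = ws.Cells(1, 1).Value
    Else
        urls = ws.Range("A1:A" & lastRow).Value
    End If

    Set http = CreateObject("MSXML2.XMLHTTP")

    For i = LBound(urls, 1) To UBound(urls, 1)
        If Len(CStr(urls(i, 1))) > 0 Then
            ' Synchronous GET: execution pauses until the response arrives
            http.Open "GET", CStr(urls(i, 1)), False
            http.send
            If http.Status = 200 Then
                Debug.Print urls(i, 1), Len(http.responseText) & " characters"
            Else
                Debug.Print urls(i, 1), "HTTP " & http.Status
            End If
        End If
    Next i
End Sub
```

Reading the whole range into an array once avoids touching the worksheet on every iteration, and no browser window is opened at all.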

Related

Using user input in VBA code and extracting data into workbook

I am exploring web scraping to try to improve efficiency when inputting data. Unfortunately, the website I wish to extract data from is not in tabular format, so I wish to use VBA to manipulate the website into the desired result.
I'm not very familiar with coding or VBA, but so far I have got VBA to open a website and search for a provided value. In this case, the CAS number 67-64-1 refers to acetone on the website.
The code for this is:
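Sub BrowseToSite()
    ' Early binding: requires a reference to Microsoft Internet Controls
    Dim IE As New SHDocVw.InternetExplorer
    IE.Visible = True
    IE.Navigate "https://apps.who.int/food-additives-contaminants-jecfa-database/Search.aspx#"
    ' Wait for the page to finish loading; DoEvents keeps Excel responsive
    Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
    ' Fill in the search box, then submit the search
    IE.Document.forms("form1").elements("ctl00$ContentPlaceHolder1$txtSearch").Value = "67-64-1"
    IE.Document.forms("form1").elements("ctl00$ContentPlaceHolder1$btnSearch").Click
End Sub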
Ultimately I wish to create a list of CAS numbers in an Excel sheet that this code can loop through, returning either the found phrase (in this case "No safety concern at current levels of intake when used as a flavouring agent") or simply "Not Found". Sometimes the search returns multiple results; for the time being I just wish to take the first result.
This raises 2 problems I'm not sure how to solve:
1. How can I modify my code to loop through the values in a worksheet column instead of having to give each one explicitly?
2. I'm unsure how to pull the data into the adjacent column.
Below is an image of the desired output. Column A is inputted by the user and hopefully column B is created by VBA code.
Any help would be appreciated.
What you need is a step-by-step process for web scraping.
I highly recommend you get familiar with SeleniumBasic for VBA: https://florentbr.github.io/SeleniumBasic/
You need to loop over your Excel rows, using Range() or Cells(i, 1) to read each row.
You check the number of search results by looking at the size of the returned collection.
Save as many results as you wish on the same row of the sheet using Cells(i, k), where k runs up to the number of returned search results.
Unfortunately the website did not load for me, so I can't help you further.
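The row loop described above might be sketched as follows. It assumes the CAS numbers sit in column A of the active sheet and that the result belongs in column B. LookupFirstResult is a hypothetical placeholder, not part of any library: its body would hold the actual site lookup (IE or SeleniumBasic) and return the first result's text, or an empty string when nothing is found.

```vba
Option Explicit

' Hypothetical placeholder: replace the body with the real website lookup.
' It should return the first result's text, or an empty string if none is found.
Private Function LookupFirstResult(ByVal casNumber As String) As String
    LookupFirstResult = vbNullString
End Function

Public Sub FillResultsFromColumnA()
    Dim ws As Worksheet
    Dim lastRow As Long, i As Long
    Dim casNumber As String, found As String

    Set ws = ActiveSheet
    ' Last used row in column A
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row

    For i = 1 To lastRow
        casNumber = Trim$(CStr(ws.Cells(i, 1).Value))
        If Len(casNumber) > 0 Then
            found = LookupFirstResult(casNumber)
            If Len(found) = 0 Then found = "Not Found"
            ' Write the outcome into the adjacent column B
            ws.Cells(i, 2).Value = found
        End If
    Next i
End Sub
```

Keeping the lookup in its own function means the loop stays the same whichever scraping method ends up being used.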

posting to facebook using VBA

I have a large Excel file with links and posts that I manually post to a Facebook page I run.
I've been learning more VBA recently and wanted to try to use it to automate things.
I've researched lots of web scraping videos and the like and can't find a solution.
I basically want VBA to open my page and post a line of text that includes a link. I've found how to click the search text box and type into it, but not the post box, as it is not an input box. When I inspect the element in Chrome after typing in "test", it shows:
test
I tried using getElementById and getElementsByClassName and changing the value, but when I inspect the element there is no "value" there, so the object won't allow it.
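Sub posttofb()
    Dim ie As Object
    Set ie = CreateObject("internetexplorer.application")
    ie.navigate "www.facebook.com/kimberskitchenfitness"
    ie.Visible = True
    ' READYSTATE_COMPLETE is only defined when the Microsoft Internet Controls
    ' reference is set, so with late binding compare against its value, 4
    Do While ie.Busy Or ie.ReadyState <> 4
        DoEvents
    Loop
    ' The post box is not an <input>, so it has no .Value; this attempt sets the
    ' text of the first element with that class instead (untested assumption)
    ie.Document.getElementsByClassName("_5rp7")(0).innerText = "test!"
End Sub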
I want it to click in the box. I'm going to declare a String variable so that "test" can be replaced with whatever is in the Excel cell I want to post.

Excel VBA Navigate Through Same Window or Tab

I'm new to Excel VBA programming, so for some this may be a very easy question, but I need to learn it.
I have a piece of code like this:
Sub Calculate()
    Dim i As Integer
    For i = 1 To 1000
        Dim Test As Object
        Dim IE As Object
        Set IE = CreateObject("InternetExplorer.Application")
        IE.Visible = False
        IE.navigate "https://www.somewebsite.com/" & Cells(i, 2).Value
        Do
            DoEvents
        Loop Until IE.readyState = READYSTATE_COMPLETE
        'some code here
        .
        .
        .
        .
    Next i
End Sub
This code gets some data from the same website using different URL endings. Each ending is written in a range (for example (A1:A1001)).
My problem is that a new hidden window is opened for each link to get some data from the web page, and this takes too much time.
I want each link to be opened in one window, or in one window with different tabs (I don't know which takes longer). Either solution is okay, but I would prefer navigating to each link in the same window.
I would like your help.
Thanks in advance.
It's not 100% clear what you're after, but from the sound of it you need IE automation (among other things) pulling websites into tabs within the same window.
This may be helpful:
IE.Navigate2 "www.somewebsite.com", CLng(2048) ' 2048 = navOpenInNewTab
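For the preferred option of navigating every link in the same window, one possible sketch is to create the browser once, outside the loop, and reuse it. The procedure name CalculateInOneWindow is made up for illustration; the URL pattern and the Cells(i, 2) source are taken from the question. The literal 4 is the value of READYSTATE_COMPLETE, used so the code also works with late binding.

```vba
Option Explicit

Public Sub CalculateInOneWindow()
    Dim IE As Object
    Dim i As Long

    ' Create a single hidden browser instance before the loop
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = False

    For i = 1 To 1000
        ' Reuse the same window for every URL ending in column B
        IE.navigate "https://www.somewebsite.com/" & Cells(i, 2).Value

        ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
        Do
            DoEvents
        Loop Until IE.readyState = 4 And Not IE.Busy

        ' some code here: read the data from IE.document
    Next i

    ' Close the one browser instance when finished
    IE.Quit
    Set IE = Nothing
End Sub
```

This avoids the cost of starting a new Internet Explorer process on each of the 1000 iterations, and leaves no hidden windows behind.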

Searching for DATA (VBA+Web)

I'm trying to create a macro that searches a URL for specific data taken from an Excel sheet; when the data is found, it copies the values, pastes them into the Excel file, and moves on to the next item.
I'm fairly new to web scraping with Excel VBA, so can anyone enlighten me and help me carry on?
I searched a lot here on Stack Overflow but I didn't understand much.
I will explain below, with images, what I want to do, and I will show my code, which is next to nothing :(
First URL to use to access to MP :http://XXXX-XXXXX.eu.airbus.XXXX:XXXXX/XXXX/consultation/preSearchMP.do?clearBackList=true&CMH_NO_STORING_fromMenu=true
Then take a value from the Excel file, put it in that field to run the search, and press Enter.
In the last picture, it will loop over the MOD entries, copy each value, and paste it into the Excel file.
Can anyone please guide me with something?
This is my first code:
Dim str As String: str = ThisWorkbook.Sheets("Feuil1").Range("A1").Value
Dim myURL As String
' The & symbol concatenates strings. The _ symbol is for line continuation.
myURL = "http://Confidential.eu.airbus.Confidential:Confidential/Confidential/" _
    & "consultation/preViewMP.do?" & str
Once you have your URL, the next stage is to send it. To send it you need something to act as a connector, such as Internet Explorer, or a background connector without the browser, such as the MSXML2 XMLHTTP or ServerXMLHTTP objects. THIS is a good guide to hopefully get you started.

How can I scrape worded information from a website?

I am new to VBA and HTML coding in general. I apologise if I don't understand basic terms or use them incorrectly. I was looking to create and run a macro in Excel for work that would make my job a lot easier. Essentially, I need to grab a bunch of information off a real estate website. This includes address, list price, listing agency, auction date (if any), etc. I have spent the last 4 hours reading all about web scraping and I understand the process; I just don't know how to code it. From what I have read, I need to write code to automatically open the website, forcibly wait until it has loaded, then retrieve information by tag, name or id. Is this correct? How can I go about this? What resources should I look to use?
TL;DR How to web scrape text from a webpage of search results (noob instructions).
I will not tell you all the details; you have to find them on your own. Some web pages are complicated, some are easy. Others are impossible, especially if the text is displayed not in HTML but in some other form, such as a picture or Flash.
It is, however, quite simple to extract data from HTML web pages in Excel. First of all, you want to automate it, so click 'Record macro' on the 'Developer' ribbon. This way, you will have all the reproducible steps recorded, and then you can have a look at the macro and adjust some steps to suit your needs. I can't, however, teach you here how to program VBA.
While your macro is being recorded, click 'From web' on the 'Data' ribbon. This will open a new web query. Then enter the address of the web page you want to read and try to select (with the little arrow or check mark) as narrow an area around the data you are interested in as possible. You can also explore some of the fine-tuning options in this wizard dialog.
When you are done, click 'Import' and you will have the contents of the web page in some form. If you are lucky, the data you are interested in will always be in the same cells. Then you can read the cells and store the values somewhere (possibly using another macro). If the data are not in the same cells every time you refresh the query, then you are out of luck and have to use some complicated formulas or macros to find them.
Next stop the macro which you are recording and review the code which was recorded. Try to experiment and play around with it until you discover what you actually need. Then it is up to you, how you want to automate it. The options are many...
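For orientation, a macro recorded from the legacy web query wizard usually boils down to a QueryTable created with a "URL;" connection string, roughly as in the sketch below. The address https://www.example.com/ and the procedure name ImportWebTableSketch are placeholders, and newer Excel versions that route 'From web' through Power Query will record different code, so treat this as an illustration rather than what your recording will necessarily contain.

```vba
Option Explicit

Public Sub ImportWebTableSketch()
    Dim qt As QueryTable

    ' The "URL;" prefix marks a web query; the address is a placeholder
    Set qt = ActiveSheet.QueryTables.Add( _
        Connection:="URL;https://www.example.com/", _
        Destination:=ActiveSheet.Range("A1"))

    With qt
        .WebSelectionType = xlAllTables       ' import every table found on the page
        .WebFormatting = xlWebFormattingNone  ' plain values, no web formatting
        .RefreshStyle = xlOverwriteCells      ' overwrite the previous import on refresh
        .Refresh BackgroundQuery:=False       ' wait here until the data has arrived
    End With
End Sub
```

After the refresh, the imported values are ordinary cells starting at A1, so a second macro can read them with Cells or Range as described above.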
Otherwise Excel is maybe not the best tool. If I wanted to load HTML page and extract data from it, I would use some scripting e.g. Python which has much better tools than Excel and VBA. There are also tools for converting HTML to XHTML and then extracting data from it like from well-formed XML.
Below is a very basic example illustrating some of the concepts of web scraping. Other reading you should do is on how to use the other element selectors, such as getElementById, getElementsByClassName and getElementsByName.
Here's some code to get you started.
Public Sub ExampleWebScraper()
    Dim Browser As Object: Set Browser = CreateObject("InternetExplorer.Application")
    Dim Elements As Object 'Will hold all the elements in a collection
    Dim Element As Object 'Our iterator that will show us the properties
    'Open a page and wait for it to load
    With Browser
        .Visible = True
        .Navigate "www.google.com"
        'Wait for the page to load
        While .busy Or .readystate <> 4
            Application.Wait (Now() + TimeValue("00:00:01"))
        Wend
        'Enumerate all Elements on the page
        'It will store these elements into a collection which we can
        'iterate over. The * is the key for ALL, here you can specify
        'any tagName and it will limit your search to just those.
        'E.g. the most common is likely input
        Set Elements = .document.getElementsByTagName("*") ' All elements
        'Iterate through all elements, and print out some properties
        For Each Element In Elements
            On Error Resume Next ' This is needed as not all elements have the properties below
            ' if you try and return a property that doesn't exist for that element
            ' you will receive an error
            'The following information will be output to the 'Immediate Window'
            'If you don't see this window, Press Ctrl+G, and it will pop up. That's where this info will display
            Debug.Print "The Inner Text is: " & Element.InnerText
            Debug.Print "The Value is: " & Element.Value
            Debug.Print "The Name is: " & Element.Name
            Debug.Print "The ID is: " & Element.ID
            Debug.Print "The ClassName is: " & Element.className ' the DOM property is className, not Class
        Next Element
    End With
    'Clean up, free memory
    Set Browser = Nothing
    Set Elements = Nothing
    Set Element = Nothing
End Sub
