Soundcloud song names - excel

I'm new to VBA web scraping and to HTML structure in general, and I'm having trouble getting the song names from this page: "https://soundcloud.com/maraudamusic/tracks"
This is what I have tried so far, but I can't seem to pull the <li> song names from the HTML document (I barely know what I'm talking about, so forgive me).
Option Explicit
Sub SCScrape()
Dim ie As InternetExplorer
Dim html As HTMLDocument
Dim site As String
Dim artist As String
artist = "maraudamusic"
site = "https://soundcloud.com/" & artist & "/tracks"
Set ie = New InternetExplorer
ie.Visible = True
ie.navigate site
Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set html = New HTMLDocument
Dim ul As IHTMLElementCollection
'Dim ul As Variant
'Dim li As IHTMLListElement
Dim els As Variant
Dim el As Variant
'Dim li As Variant
'Set ul = html.getElementsByClassName("lazyLoadingList__list sc-list-nostyle sc-clearfix")
'Set ul = html.getElementsByTagName("ul")
Set els = html.getElementsByClassName("soundTitle_usernameTitleContainer")
Debug.Print
For Each el In els
Debug.Print el.innerHTML
Next el
Debug.Print "Hello"
End Sub

The main issue is that you are trying to read elements from a blank document.
Set html = New HTMLDocument instantiates html as a new, empty HTMLDocument object that has no connection to the page loaded in IE.
Instead, use Set html = ie.Document to read the contents of the document currently loaded in the IE window.
Also, because SoundCloud is a JavaScript/AJAX-driven application, you may need to wait a couple of seconds for the AJAX calls to finish. This may not be necessary, but I find it sometimes helps.
Here's a working example. Note that it queries the class soundTitle__title and prints innerText rather than innerHTML.
Option Explicit
Sub SCScrape()
Dim ie As InternetExplorer
Dim html As HTMLDocument
Dim site As String
Dim artist As String
artist = "maraudamusic"
site = "https://soundcloud.com/" & artist & "/tracks"
Set ie = New InternetExplorer
ie.Visible = True
ie.navigate site
Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Application.Wait (Now + TimeValue("0:00:03")) ' wait for ajax...
Set html = ie.Document 'get the document from IE
Dim els As Variant
Dim el As Variant
Set els = html.getElementsByClassName("soundTitle__title")
For Each el In els
Debug.Print el.innerText
Next el
Debug.Print "Hello"
End Sub
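A fixed three-second pause can be too short on a slow connection and wasteful on a fast one. As a rough sketch (not part of the original answer), the pause could be replaced with a polling loop that returns as soon as matching elements show up. WaitForClass and its timeoutSeconds parameter are made-up names for illustration, and getElementsByClassName is only available when IE renders the page in IE9 standards mode or later.

```vba
' Hypothetical helper: polls the live IE document until at least one element
' with the given class name exists, or until timeoutSeconds have elapsed.
' Returns True if the elements appeared in time, False on timeout.
Function WaitForClass(ByVal ie As Object, ByVal className As String, _
                      Optional ByVal timeoutSeconds As Long = 10) As Boolean
    Dim deadline As Date
    deadline = DateAdd("s", timeoutSeconds, Now)
    Do
        DoEvents ' keep Excel responsive while waiting
        If ie.readyState = 4 And Not ie.Busy Then ' 4 = READYSTATE_COMPLETE
            If ie.document.getElementsByClassName(className).Length > 0 Then
                WaitForClass = True
                Exit Function
            End If
        End If
    Loop While Now < deadline
End Function
```

In the example above, the Application.Wait line could then become: If Not WaitForClass(ie, "soundTitle__title") Then Exit Sub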

Related

VBA Excel click on href and fill in form

I'm trying to get some data from a website using VBA.
The data I want is from this site: https://www.uitvoeringarbeidsvoorwaardenwetgeving.nl/mozard/!suite16.scherm1168?mSel=145576
What I want the code to do is click on the purple bar with the pencil on it so that the filter screen appears, and then fill in a specific time frame in the filters.
When this is done, I want to get the data that appears.
I'm able to navigate to the site and click on the purple bar so that the filter screen appears, but I can't fill in the dates.
This is the code I have so far:
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLInput As MSHTML.IHTMLElement
Dim HTMLAs As MSHTML.IHTMLElementCollection
Dim HTMLA As MSHTML.IHTMLElement
Dim pastDate As MSHTML.IHTMLElement
Dim futuredate As MSHTML.IHTMLElement
IE.Visible = True
IE.Navigate "https://www.uitvoeringarbeidsvoorwaardenwetgeving.nl/mozard/!suite16.scherm1168?mGmr=66"
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.Document
Set HTMLAs = HTMLDoc.getElementsByTagName("a")
For Each HTMLA In HTMLAs
'Debug.Print HTMLA.className, HTMLA.getAttribute("href"), HTMLA.getAttribute("rel"), HTMLA.innerText
If HTMLA.getAttribute("href") = "https://www.uitvoeringarbeidsvoorwaardenwetgeving.nl/mozard/!suite16.scherm1168?mGmr=66#editmodal" Then
HTMLA.Click
Exit For
End If
Next HTMLA
Do While IE.ReadyState <> 4 Or IE.Busy:
DoEvents: Loop
Set HTMLInput = HTMLDoc.getElementById("frm_FKMT_B931_542_823883_dva_id1")
HTMLInput.Value = "01-01-2020" 'THIS GIVES AN ERROR?
The last line of code gives an error and I don't understand why.
This is the HTML from the website for the input whose value I want to change:
<input name="FKMT_B931_542_823883_dva" class="datumveld form-control" id="frm_FKMT_B931_542_823883_dva_id1" type="text" pattern="(0[1-9]|1[0-9]|2[0-9]|3[01]).(0[1-9]|1[012]).[0-9]{4}">
Thanks, and sorry for the inconvenience or for a poorly asked question. If there is anything else you need to know, please feel free to ask!
Thank you!!
This is an example that fills the first date field. The IDs do not seem to be very stable, so the example locates the field by its class name instead.
Beware: there is a pattern that the entered dates must match:
pattern="(0[1-9]|1[0-9]|2[0-9]|3[01]).(0[1-9]|1[012]).[0-9]{4}"
There are some HTML events attached to the field. I don't know whether it is necessary to trigger them to make the dialog really work.
Have you checked whether the page works in IE?
Sub OpenAndFillForm()
Dim browser As Object
Dim url As String
Dim nodeToClick As Object
Dim nodeForm As Object
Dim nodeFirstDate As Object
url = "https://www.uitvoeringarbeidsvoorwaardenwetgeving.nl/mozard/!suite16.scherm1168?mGmr=66"
Set browser = CreateObject("internetexplorer.application")
browser.Visible = True
browser.navigate url
Do Until browser.readyState = 4: DoEvents: Loop
Set nodeToClick = browser.document.getElementByID("tabel2").getElementsByTagName("a")(0)
nodeToClick.Click
Application.Wait Now + TimeValue("00:00:02")
Set nodeForm = browser.document.getElementByID("tabel12")
Set nodeFirstDate = nodeForm.getElementsByClassName("datumveld")(0)
nodeFirstDate.Value = "31-12-2019"
End Sub
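Because the field carries a pattern attribute, it may be worth checking a date string against that pattern before writing it to the form. The sketch below does this with the VBScript.RegExp object. MatchesDatePattern is a hypothetical helper name, and the ^ and $ anchors are added here because HTML applies a pattern attribute to the whole value.

```vba
' Hypothetical helper: True if dateText matches the pattern attribute
' of the date inputs on the page (dd, any one separator, mm, any one separator, yyyy).
Function MatchesDatePattern(ByVal dateText As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    ' HTML matches a pattern against the entire value, hence the anchors.
    re.Pattern = "^(0[1-9]|1[0-9]|2[0-9]|3[01]).(0[1-9]|1[012]).[0-9]{4}$"
    MatchesDatePattern = re.Test(dateText)
End Function
```

MatchesDatePattern("31-12-2019") returns True, while MatchesDatePattern("1-1-2020") returns False because the day and month must each have two digits.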

VBA: Go to IE site and tick by element

I'm new to navigating with IE. I'm trying to go to https://www.thetrainline.com/ and tick the Return button using getElementsByID("return").FirstChild.Click. However, all this does is bring up the site; it doesn't tick the Return option.
Any ideas?
Public Function GetHTML() As String
On Error GoTo ErrorEscape
Dim URL As String
Dim IE As Object: Set IE = CreateObject("InternetExplorer.Application")
Dim HTML As Object
Dim myElement As Object
'Define URL
URL = "https://www.thetrainline.com/"
IE.navigate URL
Application.Wait (Now + TimeValue("0:00:03"))
MsgBox ("IE Loaded")
Set myElement = IE.getElementsByID("return").FirstChild.Click
Exit Function
ErrorEscape:
Set IE = Nothing
Set HTML = Nothing
End Function
P.S. I want many users to be able to use this without having to download anything, which is why I am not using Selenium.
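I modified your code to click on the 'return' radio button. The original called getElementsByID on the IE object; the method is getElementById, and it belongs to IE.document.
Modified code: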
Sub demo()
Dim URL As String
Dim IE As Object: Set IE = CreateObject("InternetExplorer.Application")
Dim HTML As Object
Dim myElement As Object
'Define URL
URL = "https://www.thetrainline.com/"
IE.Visible = True
IE.navigate URL
Application.Wait (Now + TimeValue("0:00:03"))
'MsgBox ("IE Loaded")
IE.document.getElementById("return").Click
ErrorEscape:
Set IE = Nothing
Set HTML = Nothing
End Sub

When the search button is clicked using vba the text entered in search box is not seen by web page

I have written VBA code that enters a manufacturer part number in the search box of the website below and clicks the search icon. It is able to enter the part number and click the icon, but when the search icon is clicked, the text entered in the text box is not picked up; the page searches with empty data.
'HTML Part for search icon
<em class="fa fa-search" aria-hidden="true" style="color: gray;"></em>
For almost a month I have tried various approaches, including ones mentioned on Stack Overflow such as createEvent("keyboardevent"), but nothing worked.
' VBA code
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Set IE = New InternetExplorer
IE.Visible = True
IE.navigate "https://ecat.aptiv.com"
Do While IE.readyState < READYSTATE_COMPLETE
Loop
Dim idoc As MSHTML.HTMLDocument
Set idoc = IE.document
idoc.getElementById("searchUserInput").Value = "33188785"
Dim doc_ele As MSHTML.IHTMLElement
Dim doc_eles As MSHTML.IHTMLElementCollection
Set doc_eles = idoc.getElementsByTagName("a")
For Each doc_ele In doc_eles
If doc_ele.getAttribute("ng-click") = "SearchButtonClick(1)" Then
doc_ele.Click
Exit Sub
Else
End If
Next doc_ele
End Sub
The page issues an XHR request to retrieve the search results. You can find it in the browser's network tab after clicking submit. This means that, in this case, you can avoid the expense of a browser and issue the XHR request yourself. The response is JSON, so you need a JSON parser to handle the results.
I would use jsonconverter.bas to parse the JSON. After installing the code from that link in a standard module called JsonConverter, go to VBE > Tools > References and add a reference to Microsoft Scripting Runtime.
I dimension an array to hold the results. I determine the number of rows from the number of items in the JSON collection returned, and the number of columns from the size of the first item's dictionary. I loop over the JSON collection and, in an inner loop, over the keys of each dictionary, populating the array. I write the array out in one go at the end, which is less I/O-expensive.
Option Explicit
Public Sub GetInfo()
Dim json As Object, ws As Worksheet, headers()
Dim item As Object, key As Variant, results(), r As Long, c As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://ecat.aptiv.com/json/eCatalogSearch/SearchProducts?filter=All&options=&pageSize=10&search=33188785", False
.send
Set json = JsonConverter.ParseJson(.responseText)("Products")
End With
headers = json.item(1).keys
ReDim results(1 To json.Count, 1 To UBound(headers) + 1)
For Each item In json
r = r + 1: c = 1
For Each key In item.keys
results(r, c) = item(key)
c = c + 1
Next
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
End Sub
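To search for other part numbers, the query string in the request above can be built from a variable. This is a small sketch, assuming Excel 2013 or later for WorksheetFunction.EncodeURL. BuildSearchUrl is a hypothetical helper, and the endpoint and parameters are copied from the request above.

```vba
' Hypothetical helper: builds the search URL used above for any part number.
Function BuildSearchUrl(ByVal partNumber As String, _
                        Optional ByVal pageSize As Long = 10) As String
    Const BASE_URL As String = _
        "https://ecat.aptiv.com/json/eCatalogSearch/SearchProducts"
    ' EncodeURL (Excel 2013+) escapes characters that are not URL-safe.
    BuildSearchUrl = BASE_URL & "?filter=All&options=&pageSize=" & pageSize & _
                     "&search=" & Application.WorksheetFunction.EncodeURL(partNumber)
End Function
```

The request line would then read: .Open "GET", BuildSearchUrl("33188785"), False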
You can do this instead:
txt = "33188785"
IE.navigate "https://ecat.aptiv.com/feature?search=" & txt
This will take you straight to the Search Result.
Code:
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Dim txt As String
Set IE = New InternetExplorer
txt = "33188785"
IE.Visible = True
IE.navigate "https://ecat.aptiv.com/feature?search=" & txt
Do While IE.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
End Sub
This will be faster, as you only have to load one page.
I am not sure why this happens, but it seems that the text box is not activated when text is added to it programmatically; it is activated only when we click inside it.
I got the solution to the above problem from MrExcel.com; the link to that post is below.
https://www.mrexcel.com/forum/excel-questions/1105434-vba-ie-automation-issue-angularjs-input-text-post5317832.html#post5317832
In this case I need to enter the search string character by character, with SendKeys and input events inside the loop. Below is the working VBA code.
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Set IE = New InternetExplorer
IE.Visible = True
IE.navigate "https://ecat.aptiv.com"
Do While IE.readyState < READYSTATE_COMPLETE
Loop
Dim idoc As MSHTML.HTMLDocument
Set idoc = IE.document
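idoc.getElementById("searchUserInput").Focus ' focus is a method, not a property
idoc.getElementById("searchUserInput").Select
Dim sFieldInput As String, s As Long
Const LoopSeconds As Long = 1 ' pause between polls; undefined in the original
sFieldInput = "33188785"
' type the search string one character at a time
For s = 1 To Len(sFieldInput)
Application.SendKeys Mid(sFieldInput, s, 1)
While IE.readyState < 4 Or IE.Busy
Application.Wait DateAdd("s", LoopSeconds, Now)
Wend
Next s
idoc.getElementById("searchUserInput").Blur ' remove focus from the input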
Dim doc_ele As MSHTML.IHTMLElement
Dim doc_eles As MSHTML.IHTMLElementCollection
Set doc_eles = idoc.getElementsByTagName("a")
For Each doc_ele In doc_eles
If doc_ele.getAttribute("ng-click") = "SearchButtonClick(1)" Then
doc_ele.Click
Exit Sub
Else
End If
Next doc_ele
End Sub
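SendKeys depends on the IE window keeping keyboard focus, which makes it fragile. A commonly tried alternative for AngularJS inputs is to set the value and then dispatch an input event on the element so that the framework notices the change. The sketch below shows that idea. FireEvent is a hypothetical helper, it requires IE9 standards mode or later for createEvent and dispatchEvent, and whether it is sufficient for this particular site is untested.

```vba
' Hypothetical helper: dispatches a DOM event (e.g. "input" or "change")
' on an element of a document loaded in IE9 standards mode or later.
' doc and element are passed late-bound (As Object) on purpose.
Sub FireEvent(ByVal doc As Object, ByVal element As Object, _
              ByVal eventName As String)
    Dim evt As Object
    Set evt = doc.createEvent("HTMLEvents")
    evt.initEvent eventName, True, False ' bubbles = True, cancelable = False
    element.dispatchEvent evt
End Sub
```

It would be called after assigning the value, for example: FireEvent idoc, idoc.getElementById("searchUserInput"), "input"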

Scrape website (Excel vba) with xml http request after cookie has been set

I would like to scrape a website (extract a product price) from a single page with an XML HTTP request. Before this script runs, however, I need to select the correct store (saved in a browser cookie, or included in the request in some other way if possible), since prices differ between stores.
I have working code, but it takes a very long time to run, so I assume there must be a faster and cleaner way. :) I also needed to make the application wait for the website at each step.
My current VBA code:
runs an IE request to open the website and, in multiple clicks, selects the desired store and saves it in a cookie (as a site user would)
then requests the product page with another IE request and extracts the data. I found out that I can't use the XML HTTP request here because it doesn't send the cookie value for the correct store, so the correct price is not displayed.
The price I'm after (in the example below) is €1,39 instead of €1,48 (which is shown when no cookie value is used and no store is selected).
The value is saved in the cookie "www.jumbo.com/cookie/HomeStore". Its content holds the store tag, which is known up front and could be hardcoded in a request if possible.
Selecting the correct store (and saving it in a browser cookie)
Sub SetStore()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLSearchbox As MSHTML.IHTMLElement
Dim HTMLSearchboxes As MSHTML.IHTMLElementCollection
Dim HTMLButton As MSHTML.IHTMLElement
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim HTMLSearchButton As MSHTML.IHTMLElement
Dim HTMLSearchButtons As MSHTML.IHTMLElementCollection
Dim HTMLStoreID As MSHTML.IHTMLElement
Dim HTMLStoreIDs As MSHTML.IHTMLElementCollection
Dim HTMLSaveStore As MSHTML.IHTMLElement
Dim HTMLSaveStores As MSHTML.IHTMLElementCollection
'set on False to hide IE screen
IE.Visible = True
'navigate to url with limited content
IE.navigate "https://www.jumbo.com/content/algemene-voorwaarden/"
Do While IE.readyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.document
Set HTMLButtons = HTMLDoc.getElementsByTagName("button")
For Each HTMLButton In HTMLButtons
If HTMLButton.getAttribute("data-jum-action") = "openHomeStoreFinder" Then
HTMLButton.Click
Exit For
End If
Next HTMLButton
Application.Wait Now + #12:00:02 AM#
Set HTMLSearchboxes = HTMLDoc.getElementsByTagName("input")
For Each HTMLSearchbox In HTMLSearchboxes
If HTMLSearchbox.getAttribute("id") = "searchTerm__DkKYx4XylsAAAFJktpb2Guy" Then
'input field store name/location to show search results
HTMLSearchbox.Value = "Oosterhout"
Application.Wait Now + #12:00:03 AM#
HTMLSearchbox.Click
Exit For
End If
Next HTMLSearchbox
Set HTMLSearchButtons = HTMLDoc.getElementsByTagName("button")
For Each HTMLSearchButton In HTMLSearchButtons
If HTMLSearchButton.getAttribute("data-jum-filter") = "search" Then
HTMLSearchButton.Click
Exit For
End If
Next HTMLSearchButton
Application.Wait Now + #12:00:05 AM#
Set HTMLStoreIDs = HTMLDoc.getElementsByTagName("li")
For Each HTMLStoreID In HTMLStoreIDs
'oosterhout = YC8KYx4XB88AAAFIDcIYwKxJ
'nieuwegein = 84IKYx4XziUAAAFInSYYwKrH
'vaassen = JYYKYx4XC1oAAAFItvcYwKxJ
'brielle = OG8KYx4XP4wAAAFIlsEYwKxK
If HTMLStoreID.getAttribute("data-jum-store-id") = "YC8KYx4XB88AAAFIDcIYwKxJ" Then
HTMLStoreID.Click
Application.Wait Now + #12:00:03 AM#
Exit For
End If
Next HTMLStoreID
Set HTMLSaveStores = HTMLDoc.getElementsByTagName("button")
For Each HTMLSaveStore In HTMLSaveStores
If HTMLSaveStore.getAttribute("data-jum-action") = "saveHomeStore" Then
HTMLSaveStore.Click
Exit For
End If
Next HTMLSaveStore
'IE.Quit
End Sub
Extracting data from product page (IE HTTP request, working with cookie store value)
Sub GetJumboPriceIE()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim JumInputs As MSHTML.IHTMLElementCollection
Dim JumInput As MSHTML.IHTMLElement
Dim JumPrice As MSHTML.IHTMLElement
Dim JumboPrice As Double
Dim Price_In_Cents_Tag As String
Dim SKU_tag As String, SKU_url As String
SKU_tag = "173140KST"
SKU_url = "https://www.jumbo.com/lu-bastogne-koeken-original-260g/173140KST/"
IE.Visible = False
IE.navigate SKU_url
Do While IE.readyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.document
IE.Quit
Set JumInputs = HTMLDoc.getElementsByTagName("input")
Price_In_Cents_Tag = "PriceInCents_" & SKU_tag
Set JumPrice = HTMLDoc.getElementById(Price_In_Cents_Tag)
JumboPrice = JumPrice.getAttribute("value") / 100
Debug.Print JumboPrice
End Sub
The code above works (the price of 1,39 is printed), but I would like to use XML HTTP request code like the one below, with the correct store.
Extracting data from product page (using XML HTTP request), but cookie value is not used
Sub GetJumboPriceXML()
Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim JumInputs As MSHTML.IHTMLElementCollection
Dim JumInput As MSHTML.IHTMLElement
Dim JumPrice As MSHTML.IHTMLElement
Dim JumboPrice As Double
Dim Price_In_Cents_Tag As String
Dim SKU_tag As String, SKU_url As String
SKU_tag = "173140KST"
SKU_url = "https://www.jumbo.com/lu-bastogne-koeken-original-260g/173140KST/"
XMLReq.Open "GET", SKU_url, False
XMLReq.send
If XMLReq.Status <> 200 Then
MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
Exit Sub
End If
HTMLDoc.body.innerHTML = XMLReq.responseText
Set JumInputs = HTMLDoc.getElementsByTagName("input")
Price_In_Cents_Tag = "PriceInCents_" & SKU_tag
Set JumPrice = HTMLDoc.getElementById(Price_In_Cents_Tag)
JumboPrice = JumPrice.getAttribute("value") / 100
Debug.Print JumboPrice
End Sub
This code does not use the correct store and outputs a price I'm not after (1,48 is printed).
To summarize:
When no store is selected (no cookie set) the following URL now gives the price of €1,48.
I would like the VBA script to set the store to “Jumbo Oosterhout Nieuwe Bouwlingstraat” and then scrape a predefined list of product URLs and extract the prices (the URL above gives €1,39).
Then it should set the store to a different local store, “Jumbo Brielle Thoelaverweg”, and scrape the identical list of product URLs. The above URL gives €1,48.
You can select a different store by clicking on the location pin icon at the top right of the page.
Thanks a lot for your help
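One direction that might avoid driving IE at all is to send the store cookie explicitly with the request. The sketch below uses WinHttp.WinHttpRequest.5.1, which allows a Cookie header to be set by hand. It assumes, without having been verified against the site, that the cookie is named HomeStore, that its value is the store ID listed in SetStore, and that the server returns store-specific prices when it receives it. GetPageWithStore is a hypothetical helper name.

```vba
' Hypothetical helper: requests a page while sending a hand-built store cookie.
' ASSUMPTION: the site reads the selected store from a cookie named "HomeStore".
' Returns the response body, or an empty string if the status is not 200.
Function GetPageWithStore(ByVal pageUrl As String, _
                          ByVal storeId As String) As String
    Dim req As Object
    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    req.Open "GET", pageUrl, False
    req.setRequestHeader "Cookie", "HomeStore=" & storeId
    req.send
    If req.Status = 200 Then GetPageWithStore = req.responseText
End Function
```

The returned text could then be assigned to HTMLDoc.body.innerHTML as in GetJumboPriceXML, for example with GetPageWithStore(SKU_url, "YC8KYx4XB88AAAFIDcIYwKxJ") for the Oosterhout store.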

Excel VBA : How to copy href from HTML by referencing the class name

<span class="export excel">Excel </span>|
How can I copy the href in this HTML via Excel VBA?
This is my code, but it doesn't work.
Set Export = ie.Document.all("export excel")
URL = Export.href
ie.Navigate URL
You should step through this code in debug mode. It is more for instructional use than for production, but it will do what you want. It finds the <span> by tag and class name and then reads the href of its parent element. I used Google as a test site.
Sub Test()
Dim Browser As SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim Link As String, Target As Object
Link = "http://www.google.com"
' start browser
Set Browser = New SHDocVw.InternetExplorer
Browser.Visible = True
Browser.Navigate Link
' wait until the page has finished loading
Do While Browser.Busy Or Browser.ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set HTMLDoc = Browser.Document
Set Target = GetElementByTagAndClassName(HTMLDoc, "SPAN", "gbtb2")
If Not (Target Is Nothing) Then
' test here if parent really is a <a>
Debug.Print Target.parentElement.href
' ta-taaaa!!!
End If
End Sub
' get element by tag and attribute value
Function GetElementByTagAndClassName(Doc As MSHTML.HTMLDocument, ByVal Tag As String, ByVal Match As String) As MSHTML.IHTMLElement
Dim ECol As MSHTML.IHTMLElementCollection
Dim IFld As MSHTML.IHTMLElement
Set GetElementByTagAndClassName = Nothing
Set ECol = Doc.getElementsByTagName(Tag)
For Each IFld In ECol
' Debug.Print IFld.className
If IFld.className = Match Then
Set GetElementByTagAndClassName = IFld
Exit Function
End If
Next
End Function
