Accessing DOM elements via Automation scripts

Below was a case that I was working on that I solved with a launch in context, but I’m still wondering if there is a specific Maximo library that can help me access the DOM via automation script.
I’m looking for functionality similar to how Selenium works with their #browser tag. This tag lets a script reference elements in the current browser.
I'm looking to use an automation script, either JavaScript or Python syntax, in order to manipulate the DOM of the current application a user is on. What I'm trying to do is set the description tag of a current element, given the element ID, and I'm having trouble finding a package that can reference DOM elements.
I did some research on the psdi.common.context.UIContext package, but this was a pretty shallow package and did not provide the functionality I was looking for.

Related

Pulling Data from HTML Tables

So I understand how to pull data from a single web link when the data is in tables. I cannot find a single tutorial anywhere on the web about how to do the same with div elements, and no one talks about it at all. Can someone please give me an example or something? Either Excel or Google Spreadsheets.
I'm trying to teach myself how to do this using the website https://newworldstatus.com/regions/us-east for a small project I want to do.
Thank you in advance.

This is not a comprehensive answer, just intended to show you how some very basic concepts work. I'll cover VBA first and Sheets second, but let me preface all of this by saying that while your test URL seems simple enough, you will not be able to do any of this for that specific URL. They are either actively trying to stop scraping or they just have it set up in a way that happens to make scraping difficult. If you make a web request directly to that URL, you will get back the JS code that handles the data load-in and not the data itself, so any parsing you try will fail: what you see in the rendered page is not what comes back on the initial page request. The initial response contains very little HTML of its own.
You would need to either try to read through the code and figure out what they're doing, or do some tinkering in the javascript console, and probably some fairly high-level tinkering. So for a first project, or just to learn some basics, I think I would pick a different test case.
First, in VBA. It's both complicated and not all that complicated at the same time. If you know how web technologies work in general, independent of any particular language, then it all works pretty much the same way in VBA. First, you'll need to make a web request. You can do that with the winHTTP library or the msXML library. I usually use winHTTP, but unless what you're doing is complex, either one is fine.
WEB REQUEST:
You'll need to instantiate a request object. You can do that by either adding a reference to the library (tools->references-> and pick the library out of the list) or you can use late binding. I prefer to add the reference, because you get intellisense that way. Here are both:
Dim req As New WinHttp.WinHttpRequest
or
Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
Then you open the request. I'm going to assume this is a straight GET. POST requests get a little more complicated:
req.Open "GET", url, True
If you have the reference added and created req with Dim, then you'll get intellisense: as you type, the arguments pop up, and you can use them to look things up in the documentation if you have questions. True here sends the request asynchronously, which I would do. If you don't, it will block the interface. This is the Open method, which you can find in the documentation:
https://learn.microsoft.com/en-us/windows/win32/winhttp/iwinhttprequest-interface
Then use
req.send
req.WaitForResponse
source = req.responseText
to send the request. WaitForResponse is needed only if you send the request asynchronously. The last part is to get the responseText into a variable.
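For comparison, the same open/send/read sequence can be sketched outside VBA. This is a minimal Python illustration using only the standard library, not a replacement for the WinHTTP code above. fetch_source is a hypothetical helper name, and the data: URL merely stands in for a real page so that the example runs without network access.

```python
from urllib.request import urlopen


def fetch_source(url, timeout=10):
    """Send a plain GET request and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        # Fall back to UTF-8 when the response does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# A data: URL stands in for a real page so this runs offline.
source = fetch_source("data:text/html,<p>hello</p>")
print(source)
```

As with the VBA version, this only returns what the server sends on the initial request. Nothing that JavaScript would add later is included.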
PARSING:
Then you'll need to do some stuff with the MSHTML library, so add a reference to that. You can also late bind, but I would not, because it will be very helpful to you to have the prompts in intellisense.
First, set up a document
https://learn.microsoft.com/en-us/dotnet/api/mshtml.htmldocument?view=powershellsdk-1.1.0
and write the source you just fetched to it:
Dim doc as new MSHTML.HTMLdocument
doc.write source
Now you have a document object you can manipulate. The trick is to get a reference to the element you want. There are two methods that will return an element:
getElementById
querySelector
If you are lucky, the element you are looking for will have a unique ID and you can just get it. If not so lucky, you can use a selector that identifies it uniquely. In either case, you will set up an IHTMLElement to return to:
Dim el as MSHTML.IHTMLElement
set el = doc.getElementById("uniqueID") 'whatever the unique ID is
Once you have that, you can use the methods and properties of the element to return information about it:
https://learn.microsoft.com/en-us/dotnet/api/mshtml.ihtmlelement?view=powershellsdk-1.1.0
There are more specific interfaces, like
https://developer.mozilla.org/en-US/docs/Web/API/HTMLAnchorElement
You can use the generic IHTMLElement, but sometimes there are advantages to using a specific element type, for instance, the properties that are available to it.
Sometimes you will have to set up an IHTMLElementCollection:
https://learn.microsoft.com/en-us/previous-versions/windows/internet-explorer/ie-developer/platform-apis/aa703928(v=vs.85)
and iterate it to find the specific element you are looking for. There are four methods that return collections:
getElementsByName
getElementsByTagName
getElementsByClassName
querySelectorAll
getElementsByClassName is sometimes problematic, so forewarned is forearmed.
If you need to do that, set up an IHTMLElementCollection and return the list to that:
Dim els As MSHTML.IHTMLElementCollection
Set els = doc.getElementsByTagName("tagName") 'for instance "a" for anchors, "div" for divs
That is about it. There is obviously more to it, but a comprehensive answer would be very long. This is mostly intended to point you in the right direction and give you more stuff to google.
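The lookups described above (by ID, by tag name) exist in most HTML libraries, not just MSHTML. As a language-neutral illustration, here is a minimal Python sketch built on the standard library's html.parser. The ElementCollector class and both helper functions are hypothetical names written for this example, and the sample markup is made up.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not stay "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementCollector(HTMLParser):
    """Record every element with its attributes and directly contained text."""

    def __init__(self):
        super().__init__()
        self.elements = []  # all elements, in document order
        self._open = []     # stack of elements that are still open

    def handle_starttag(self, tag, attrs):
        element = {"tag": tag, "attrs": dict(attrs), "text": ""}
        self.elements.append(element)
        if tag not in VOID_TAGS:
            self._open.append(element)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; tolerate sloppy markup.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i]["tag"] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        if self._open:
            self._open[-1]["text"] += data


def get_element_by_id(elements, element_id):
    """Return the first element with the given id, or None."""
    return next((e for e in elements if e["attrs"].get("id") == element_id), None)


def get_elements_by_tag_name(elements, tag):
    """Return every element with the given tag name."""
    return [e for e in elements if e["tag"] == tag.lower()]


source = '<div id="links"><a href="/a">First</a><br><a href="/b">Second</a></div>'
collector = ElementCollector()
collector.feed(source)
collector.close()

container = get_element_by_id(collector.elements, "links")
anchors = get_elements_by_tag_name(collector.elements, "a")
print(container["tag"], [(a["attrs"]["href"], a["text"]) for a in anchors])
```

The single-element lookup returns one item or nothing, while the tag lookup always returns a list that you iterate, which mirrors the IHTMLElement versus IHTMLElementCollection split in VBA.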
I will say that you should test out some of these methods in the browser first. They exist in many languages, and all major browsers have developer tools. For Chrome, for instance, press Ctrl+Shift+I to bring up the dev tools, and then in the console window type something like:
document.getElementById("uniqueID")
and you should get the node. Or try
document.getElementsByClassName("test") // where test is the name of the class, without a leading dot
document.querySelectorAll("div") // where you pass a valid CSS selector
and you will get the node list.
It will be quicker to experiment there than to try to set it up and debug in VBA. Once you have a good handle on how it works, try to transfer that knowledge to a VBA implementation.
Here is a basic overview of .querySelector to get you started on understanding how those work, although they can get very complicated. In fact, querySelector is my go to method for finding elements.
https://www.w3schools.com/jsref/met_document_queryselector.asp
Now, Google Sheets:
You don't really want to use IMPORTHTML, even though that seems counterintuitive. That function (AFAIK) only supports tables and lists, and it's index based, too, which means you give it a number n and it returns the nth table or list in the page. That means that if they ever change the layout, or the layouts are dynamic in any way, then you won't be able to rely on an index to accurately identify what you want. Also, as you noted, people don't really use tables much anymore, and when they say list I'm pretty sure they mean <ol> and <ul> elements, which is also not going to be that useful to you. Here are the docs:
https://support.google.com/docs/table/25273?hl=en&visit_id=637732757707317357-1855795725&rd=2
But you can use IMPORTXML. Even though it says XML, you can still use it to parse HTML (for reasons and with limitations that are out of scope for this answer). IMPORTXML takes a URL and an xpath selector. In this way it's similar to the document.querySelector and querySelectorAll methods. Here is some information on xpath in tutorial form from w3schools.
https://www.w3schools.com/xml/xpath_intro.asp
And if you want to test selectors in Chrome you can use $x("selector") in the javascript console in the dev tools. I believe Firefox also supports this, but I am not sure if other browsers do. If not, you can use document.evaluate:
https://developer.mozilla.org/en-US/docs/Web/API/Document/evaluate
Even though you can't actually use this in sheets against the URL you've given, let's take a look at a couple of xpath selectors in that context. Hit Ctrl+Shift+I to bring up the dev tools (hopefully you are using Chrome), and then go to the elements tab. If you don't have the javascript console showing in the bottom pane, hit Esc.
Use the arrow icon in the top left of the dev tools to inspect elements, and click on the first row in the table
so that you can see the structure of the elements, and figure out how to parse out what you want from it. You'll notice that the highlighted cell is contained in a div with a role of "row" and an attribute of row-id. I think that's where I would start. So an xpath to that container would look something like this:
//div[@row-id=1]
where we are fetching all elements (//) that match div and have an attribute (@) of row-id = 1.
If you want to get the children of that container, you just add another level to the path
//div[@row-id=1]/div
where we want to get all children (/) that are divs.
And I notice that they all have a col-id attribute, so if you wanted to fetch the "set" information you'd just specify divs that have an attribute of col-id = 'set':
//div[@row-id=1]/div[@col-id='set']
and to get the text out of that:
//div[@row-id=1]/div[@col-id='set']/text()[1]
since it looks like the second node is the one that has the team name in it. Again, you can see how this WOULD work in the dev tools, but you won't actually be able to use this for your URL.
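To try XPath attribute tests (written with @) away from the live page, here is a small Python sketch using xml.etree.ElementTree, which supports a limited XPath subset. The markup is a made-up, well-formed stand-in for the grid described above, so the real page's structure may differ. ElementTree has no text() step, so the text is read from the element's .text property instead, and attribute values must be quoted.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup imitating the row-id / col-id layout described above.
SAMPLE = """
<div role="grid">
  <div role="row" row-id="0"><div col-id="name">Alpha</div><div col-id="set">Low</div></div>
  <div role="row" row-id="1"><div col-id="name">Beta</div><div col-id="set">High</div></div>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

# The container row: any descendant div whose row-id attribute is "1".
row = root.find(".//div[@row-id='1']")

# All div children of that row.
cells = root.findall(".//div[@row-id='1']/div")

# Only the child whose col-id attribute is "set".
set_cell = root.find(".//div[@row-id='1']/div[@col-id='set']")

print(row.get("role"), len(cells), set_cell.text)
```

Each extra path segment narrows the result in the same way as in the selectors above: first the row, then its children, then one child picked by attribute.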
I'm not going to spend a lot of time here. As already stated, you won't be able to use this method on your specific URL. If you can figure out the actual URL that your URL wraps around, then perhaps. Also, since there's only one argument, the selector, then there's not much more to expound on. If you needed something more complex, like the ability to iterate over a set of matching nodes, you could probably do it in Scripts, but I would probably just switch to Excel if it started getting that complicated. The only exception would be if the data was JSON formatted, in which case Scripts will be able to handle that better than VBA, although I would probably switch to a different language entirely in that case.
Since your URL is probably not good for testing, I'm going to point you to this tutorial from Geckoboard, which has a few different examples from sites like Wikipedia and Pinterest.
https://www.geckoboard.com/blog/use-google-sheets-importxml-function-to-display-data/
So google around, experiment, and let me know if you need any help. And this was all off the top of my head, so let me know if any of this stuff throws errors so I can edit the answer.
Also, be aware that Excel is not always the right tool for dealing with this. Very often, while the page might have the elements you are looking for, they will be loaded in with JSON and both php and javascript can natively handle JSON objects, while VBA doesn't. If the data is JSON formatted, it is much easier to parse it out of that than trying to parse it out of the DOM structure (DOM = document object model, another thing to google). Also, in many cases, if the data is loaded in with AJAX, it won't be returned with your winHTTP call, because that doesn't execute any javascript that might be in the page.
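To illustrate why JSON is easier to work with than the DOM, here is a minimal Python sketch. The payload and its field names are invented for the example. A real site's response would have its own shape, which you would find in the network tab of the dev tools.

```python
import json

# Hypothetical JSON body, as a page might load it in the background.
payload = (
    '{"servers": ['
    '{"name": "Alpha", "status": "online"}, '
    '{"name": "Beta", "status": "offline"}'
    ']}'
)

data = json.loads(payload)

# Plain dictionary and list access replaces any element lookups.
online = [server["name"] for server in data["servers"] if server["status"] == "online"]
print(online)
```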
Further, in many cases you will need to set headers or cookies in the winHTTP call to get the data (calls without the right settings might return an error or a redirect). That is also not addressed in my answer, although you can set headers and cookies in winHTTP. You would need to sniff the calls, either with Fiddler or a similar tool or with the network tab in dev tools, to find out the right combination of information to pass with your request.
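As a rough sketch of what attaching headers and a cookie to a request looks like, here is a Python example with the standard library. build_request is a hypothetical helper, and the header values, cookie and URL are placeholders. The real values would come from sniffing the browser's own request. The request is only built here, not sent.

```python
from urllib.request import Request


def build_request(url, cookie=None):
    """Build a GET request carrying browser-like headers and an optional cookie."""
    headers = {
        "User-Agent": "Mozilla/5.0",  # placeholder value
        "Accept": "text/html",
    }
    if cookie:
        headers["Cookie"] = cookie
    return Request(url, headers=headers, method="GET")


# Placeholder URL and cookie, for illustration only.
req = build_request("https://example.com/data", cookie="session=abc123")
print(req.get_method(), req.header_items())
```

Passing the resulting object to urllib.request.urlopen would send it with those headers attached.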

How to access client-side javascript variable from external page?

Let's say, when we open link https://example.com and that page generates javascript variable, called xyz, so we can access it from browser's Inspect console:
console.log(xyz);
However, how can we get & read that variable from node.js ? (the external link needs to be rendered like in browser, to get javascript values out of it).

There are different approaches to getting the variable, depending on how it is created.
If the variable is written directly in the page source, you can simply parse it out, using a regex for example.
If the variable is only evaluated in the JS runtime, you need to emulate the browser environment using something like PhantomJS, which is quite heavy.
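For the first case, where the value is written literally into the page source, a sketch of the regex approach might look like this. It is shown in Python for brevity, and the same pattern can be used with a RegExp in Node.js. The page source is a made-up stand-in for the fetched HTML, extract_variable is a hypothetical helper, and the sketch assumes the value is a JSON-compatible literal that fits on one line.

```python
import json
import re

# Stand-in for the HTML fetched from the external page.
page_source = """
<html><body>
<script>
  var xyz = {"user": "demo", "count": 3};
</script>
</body></html>
"""


def extract_variable(source, name):
    """Return the parsed value of a var/let/const assignment, or None."""
    pattern = r"\b(?:var|let|const)\s+" + re.escape(name) + r"\s*=\s*(.+?);\s*$"
    match = re.search(pattern, source, flags=re.MULTILINE)
    if match is None:
        return None
    # Only works when the literal is valid JSON (double quotes, no functions).
    return json.loads(match.group(1))


xyz = extract_variable(page_source, "xyz")
print(xyz)
```

This breaks down as soon as the value is computed at runtime, which is the second case above.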

One possible way seems to be to use a tool like Nightwatch. However, in the past it depended on Selenium, which is a bit heavy.

How to load CSS from library when using 'require'

I’m building an electron app. In it, I have a webview with a preload script. Inside said script, I’d like to use sweetalert.
I installed sweetalert with npm install --save sweetalert. Inside my script I load it with require('sweetalert') and call it with swal("Hello world!");. I now notice it doesn’t look right, as the alert is missing its required CSS file. But I’m loading it with require('sweetalert'), which is great since sweetalert can just remain in its directory inside node_modules and I don’t have to care for it, but its CSS is an integral part of it, and is not getting pulled the same way.
Now, what is the recommended way of solving this? Keep in mind I’m inside a javascript file and would like to remain that way. Do I really have to go get the CSS file and inject it in some way? And how would I do it correctly, since it is inside node_modules? After testing it, it seems like it can’t be done in this particular case due to Content Security Policy.
Either way, that seems so clunky in comparison to the require statement, it’d seem weird for a simpler solution to not be available.

You'll have to include it like you normally would in a browser, for example in index.html. Copy it out of the module folder into your css folder, if you have one, and link it with the link tag. Where exactly to put it depends on whether you're using plain Electron or a boilerplate template with a gulp/grunt workflow, but that's it really: Electron is just a browser that's running your JS/HTML, so it's the exact same process. require only loads the JS module, not the styles.
If you want to include it dynamically, you can use the same techniques as in a regular browser (e.g. document.write or document.createElement).

I'm not familiar with sweetalert, but hopefully this helps.
Your syntax for require should be something similar to this.
var sweetalert = require('sweetalert')
You should then be able to access methods on the sweetalert object using the following syntax.
sweetalert.someMethod()
Remember requiring just returns a javascript object. Those objects usually have methods that will allow certain functionality. If you want to add sweetalert to your page, you will either need to inject it within the html, or the javascript within the sweetalert module will need to dynamically create html where the css is included. I hope that clarifies some things and helps you get a better sense of some of the inner workings.

Sharepoint html element naming conventions, by Sharepoint Controls

A few times I've attempted to customize a SP2007 page using css, html, or javascript in Sharepoint Designer; however, in Sharepoint Designer I am not able to get direct access to the desired elements since they are generated by a Sharepoint Control (such as a web part or dataview) and appear only AFTER the page is rendered in the browser. I use IE's F12 tools to track down the element I wish to change. Then I can see an identifier such as name or id that I can use in my javascript or css.
Example 1: SP2007 generates "name=ctl00$PlaceHolderMain$g_ba9196a9_2842_4607_b048_9a443cb4def5$ff2_1$ctl00$ctl00$BooleanField" for an input text box. I use that name to manipulate the text box as I desire.
Example 2: SP2007 generates "id=zz6_menu" for the "Welcome" text which I use to get the users full name.
So far this has worked out fine. Am I tempting fate?
Can someone refer me to a reference that discusses how these names and other Sharepoint Control element identifiers are generated?
Are they stable? Can I count on them staying the same, provided the application I develop with my version of SP isn't updated to a later version of SP? And even in that case, I'm thinking I can simply update to the identifiers created by the newer version of SP.
Is this a good practice? Any other comments?
All responses are welcomed.
Thanks.

SharePoint is based on ASP.NET, and that's why the IDs are automatically generated.
cf this article.
You should not use them to identify elements on css or js.
Do not write code that references controls using the value of the
generated UniqueID property. You can treat the UniqueID property as a
handle (for example, by passing it to a process), but you should not
rely on it having a specific structure.
In my opinion, the best way is to rely on the css classes because they are not automatically generated and should not change a lot.
Anyway, if you upgrade to SP2010 or 2013, a lot of your modifications won't work anymore because the structure and css changed...

How to validate HTML Reports using Watir

We are trying to validate the HTML reports generated by our application. We have planned the below approach to do this
Capture that data related to the report from application
Generate the report
Identify the report’s elements and compare the data captured from application against the data/elements in report.
We started by identifying the elements of the report and found that, using the browser's developer tools, we are able to get some of the object properties, but the object ID is missing from these properties.
Can anyone please let us know the possibility of capturing the report elements and comparing them with the application data.

Frankly, if you are just trying to parse HTML, and not trying to drive a browser, you'd probably be better off using something like Nokogiri, or another library aimed specifically at parsing HTML. See https://www.ruby-toolbox.com/categories/html_parsing for a selection of such tools.
Watir is "Web Application Testing In Ruby". It is designed to drive web browsers in order to test websites and web apps. Validating that portions of the HTML are as expected is a part of that, but not the core functionality of Watir.
For what you are trying to do, if I understand you right, you could use watir to do that, in somewhat the same way you can use excel for word processing and page layout, which is to say it can be done, but may not be the most preferred tool or easiest way to go about it.
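As a rough illustration of the parse-and-compare approach, here is a minimal sketch. It is written in Python with the standard library rather than Nokogiri, the report markup and the captured values are invented, and it assumes the report presents its data in table cells. Because it reads cells by position, it does not depend on the elements having IDs.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def report_rows(html):
    """Return the report's data rows as lists of trimmed cell text."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    # Header rows contain only <th> cells and end up empty, so drop them.
    return [[cell.strip() for cell in row] for row in parser.rows if row]


# Hypothetical report output and the data captured from the application.
report_html = (
    "<table><tr><th>Item</th><th>Qty</th></tr>"
    "<tr><td>Widget</td><td> 4 </td></tr>"
    "<tr><td>Gadget</td><td>7</td></tr></table>"
)
captured = [["Widget", "4"], ["Gadget", "7"]]

actual = report_rows(report_html)
mismatches = [(exp, found) for exp, found in zip(captured, actual) if exp != found]
print(actual == captured, mismatches)
```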
