How to find elements that do not include a certain class name with selenium and python - python-3.x

I want to find all the elements that contain a certain class name, but skip the ones that also contain another class name beside the one that I am searching for.
I have the element <div class="examplenameA"> and the element <div class="examplenameA examplenameB">
At the moment I am doing this to work around my problem:
items = driver.find_elements_by_class_name('examplenameA')
for item in items:
    cname = item.get_attribute('class')
    if 'examplenameB' in cname:
        pass
    else:
        ...  # rest of code
I only want the elements that have the class name examplenameA, and I want to skip the ones that also contain examplenameB.

To find all the elements whose class attribute contains examplenameA while leaving out the ones whose class attribute also contains examplenameB, you can use either of the following:
css_selector:
items = driver.find_elements_by_css_selector("div.examplenameA:not(.examplenameB)")
xpath:
items = driver.find_elements_by_xpath("//div[contains(@class, 'examplenameA') and not(contains(@class, 'examplenameB'))]")
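One caveat with contains(@class, ...) is that it is a substring test, so it would also match a class such as examplenameAB. The sketch below is not part of the original answer; it models the whole-token behaviour of the CSS selector in plain Python and builds an XPath that uses the common concat/normalize-space idiom to get the same behaviour. The helper names (has_wanted_without, xpath_with_class_without) and the third sample class are made up for illustration.

```python
def has_wanted_without(class_attr, wanted, unwanted):
    """True if `wanted` is one of the class names and `unwanted` is not."""
    tokens = class_attr.split()  # class names are separated by whitespace
    return wanted in tokens and unwanted not in tokens


def xpath_with_class_without(tag, wanted, unwanted):
    """Build an XPath that tests whole class names instead of substrings."""
    def token_test(name):
        # Pad both sides with spaces so only a complete class name can match.
        return f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"
    return f"//{tag}[{token_test(wanted)} and not({token_test(unwanted)})]"


# Hypothetical class attribute values, as returned by get_attribute('class').
samples = ["examplenameA", "examplenameA examplenameB", "examplenameAB"]
kept = [s for s in samples if has_wanted_without(s, "examplenameA", "examplenameB")]
print(kept)
print(xpath_with_class_without("div", "examplenameA", "examplenameB"))
```

The generated string can be passed to find_elements_by_xpath in the same way as the expression above.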

You can use xpath in this case. As per your example, you need to use something like driver.find_elements_by_xpath("//div[@class='examplenameA']"). This will give you only the elements whose class is exactly examplenameA.
XPath works like this: //tagname[@attribute='value']
Here class is treated as an attribute, and XPath tries to match the exact given value, in this case examplenameA, so <div class="examplenameA examplenameB"> will be ignored.
The find_elements_by_class_name method, by contrast, matches any element that has examplenameA among its classes, so <div class="examplenameA examplenameB"> will also be matched.
Hope this helps

Related

filtering out elements found with beautiful soup based on a key word in any attribute

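Here is an example of a URL.
import re
import requests
from bs4 import BeautifulSoup

url = 'https://rapaxray.com'
headers = {'User-Agent': 'Mozilla/5.0'}  # assumed; the original post does not show its headers
# logo
html_content = requests.get(url, headers=headers).text
soup = BeautifulSoup(html_content, "lxml")
images_found = soup.findAll('img', {'src' : re.compile(r'(jpe?g)|(png)|(svg)$')})
images_found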
First I'm narrowing down the list of elements to the img tags whose src contains jpg, png or svg. In this case I only get 3 elements. Then I would like to filter those elements to show only the ones that have the keyword 'logo' in ANY attribute.
The element I'm looking for in this example looks like this:
<img alt="Radiology Associates, P.A." class="attachment-full size-full astra-logo-svg" loading="lazy" src="https://rapaxray.com/wp-content/uploads/2019/09/RAPA100.svg"/>
I want to pick this element out of all the elements, based on the condition that it has the keyword 'logo' in ANY of its attributes.
The challenge is that:
I have thousands of URLs, and the keyword logo could be in a different attribute for each URL.
Logic like if 'logo' in ANY(attribute for attribute in list_of_possible_attributes_that_this_element_has) doesn't work the way list comprehensions do, because I couldn't find a way to access every possible attribute without using its specific name.
Checking all the specific names is also problematic, because a particular attribute could exist in one element but not in another, which throws an error.
The case above is extra challenging because the attribute value is a list, so we would need to flatten it to be able to check whether the keyword is in it.
For most of the URLs the element I'm looking for is not returned as the top one like in this example, so choosing the first result is not an option.
Is there a way of filtering elements based on a keyword in ANY of their attributes (without prior knowledge of what the attribute is called)?
If I understood you correctly, you could use a filter function similar to this answer to search for all img tags in which any attribute's value contains val:
def my_filter(tag, val):
    types = ['.jpg','.jpeg','.svg','.png']
    if tag is not None and tag.name == "img" and tag.has_attr("src"):
        if all(y not in tag['src'] for y in types):
            return False
        for key in tag.attrs.keys():
            if isinstance(tag[key], list):
                if any(val in entry for entry in tag[key]):
                    return True
            else:
                if val in tag[key]:
                    return True
    return False

res = soup.find_all(lambda tag: my_filter(tag, "logo"))
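The attribute check at the heart of that filter can be looked at on its own. The sketch below is only an illustration: it works on plain dicts shaped like the tag.attrs mapping BeautifulSoup exposes (string values, with multi-valued attributes such as class stored as lists), so it runs without any network access. The function name any_attr_contains and the second sample dict are hypothetical.

```python
def any_attr_contains(attrs, keyword):
    """Return True if `keyword` occurs in any attribute value.

    String values are checked directly; list values (for example the
    class attribute) are checked entry by entry.
    """
    for value in attrs.values():
        parts = value if isinstance(value, list) else [value]
        if any(keyword in part for part in parts):
            return True
    return False


# Attributes of the element quoted in the question.
logo_img = {
    "alt": "Radiology Associates, P.A.",
    "class": ["attachment-full", "size-full", "astra-logo-svg"],
    "loading": "lazy",
    "src": "https://rapaxray.com/wp-content/uploads/2019/09/RAPA100.svg",
}
# A made-up image without the keyword, for contrast.
other_img = {"src": "https://example.com/banner.png", "alt": "banner"}

print(any_attr_contains(logo_img, "logo"))
print(any_attr_contains(other_img, "logo"))
```

Because the loop iterates over whatever attributes are present, it never has to name an attribute in advance, which avoids the missing-attribute errors described in the question.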

LitElement equivalent of React "key" concept

First render looks like this:
<example name="One"></example>
<example name="Two"></example>
<example name="Three"></example>
Next render looks like this:
<example name="Four"></example>
<example name="Three"></example>
LitElement will remove the last element and update the first two with new properties.
How do I change this so that LitElement removes all elements except name="Three" and creates a new element with name="Four" in the first position?
Using React, this would be accomplished by giving them a key property. I want to achieve the same result using LitElement.
<example key="1" name="One"></example>
<example key="2" name="Two"></example>
<example key="3" name="Three"></example>
For this you want to use the lit-html repeat directive. From the docs:
The repeat directive performs efficient updates of lists based on user-supplied keys:
repeat(items, keyFunction, itemTemplate)
Where:
items is an Array or iterable.
keyFunction is a function that takes a single item as an argument and returns a guaranteed unique key for that item.
itemTemplate is a template function that takes the item and its current index as arguments, and returns a TemplateResult.
For example:
const employeeList = (employees) => html`
  <ul>
    ${repeat(employees, (employee) => employee.id, (employee, index) => html`
      <li>${index}: ${employee.familyName}, ${employee.givenName}</li>
    `)}
  </ul>
`;
If you re-sort the employees array, the repeat directive reorders the existing DOM nodes.
To use repeat you'll need to import it from lit-html:
import {repeat} from 'lit-html/directives/repeat';
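To make the difference between the two update strategies concrete, here is a small conceptual model in Python. It is not lit-html's actual algorithm; it only tallies which elements would be updated, created or removed when nodes are matched by position versus by key, using the names from the question as keys. Both function names are made up for this sketch.

```python
def positional_update(old, new):
    """Match nodes by index: node i is updated in place to show new[i]."""
    reused = min(len(old), len(new))
    return {
        "updated": [(old[i], new[i]) for i in range(reused) if old[i] != new[i]],
        "created": new[reused:],
        "removed": old[reused:],
    }


def keyed_update(old, new):
    """Match nodes by key: a node survives only if its key is still present."""
    old_keys, new_keys = set(old), set(new)
    return {
        "kept": [key for key in new if key in old_keys],
        "created": [key for key in new if key not in old_keys],
        "removed": [key for key in old if key not in new_keys],
    }


old = ["One", "Two", "Three"]
new = ["Four", "Three"]
print(positional_update(old, new))
print(keyed_update(old, new))
```

In the positional model the first two elements are rewritten and the last one is dropped, which is the behaviour described in the question. In the keyed model only "Three" survives and "Four" is created fresh, which is the behaviour asked for.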

ie getelementsbyID with same ID

I have a script that works with Internet Explorer (IE) and I need to loop over the select fields. That in itself is no problem, but the 4 elements have the same ID (on the same page).
How do I let it loop through the 4 fields?
Can I make them more specific?
The code I use is the following:
ie.document.getElementByID("DownloadImage").Click
The HTML is the following:
field 1
<a id="DownloadButton" href="javascript:__doPostBack('ctl00$ctl00$MainContent$MainContent$ctl00$declaratiebestandView$RetourInformatieGrid$ctl03$DownloadButton','')">CZ_Specificatie_150005697.pdf</a>
field 2
<a id="DownloadButton" href="javascript:__doPostBack('ctl00$ctl00$MainContent$MainContent$ctl00$declaratiebestandView$RetourInformatieGrid$ctl03$DownloadButton','')">CZ_Specificatie_150005697.pdf</a><input name="ctl00$ctl00$MainContent$MainContent$ctl00$declaratiebestandView$RetourInformatieGrid$ctl03$DownloadImage" class="inlineButton" id="DownloadImage" type="image" src="../images/download.png" text="CZ_Specificatie_150005697.pdf">
Then it opens the download screen, and then my code continues (and works :) )
You can loop them by using querySelectorAll to gather all the elements with an id attribute whose values match what you are after. You can distinguish between them by index. This method will allow you to gather them even though the ids are repeating. However, the HTML you have shared downloads the same document so a loop doesn't seem necessary.
Dim nodeList As Object, i As Long
Set nodeList = ie.document.querySelectorAll("[id=DownloadButton]")
For i = 0 To nodeList.Length - 1
    nodeList.item(i).Click
Next
That loops over all of the matching elements and clicks each one.
Selecting by index is specific, but if you familiarize yourself with CSS selectors there are a vast number of ways to specify an element.
The id in HTML must be unique. If it is not unique, the HTML is not valid and should be fixed.
HTML4:
http://www.w3.org/TR/html4/struct/global.html
Section 7.5.2:
id = name [CS]
This attribute assigns a name to an element. This name must be unique in a document.
HTML5:
http://www.w3.org/TR/html5/dom.html#the-id-attribute
The id attribute specifies its element's unique identifier (ID). The value must be unique amongst all the IDs in the element's home subtree and must contain at least one character. The value must not contain any space characters.
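If you want to check a saved copy of the page for repeated ids before automating it, a rough sketch using only the Python standard library could look like this. The class and function names are made up for illustration, and the sample markup is a shortened stand-in for the HTML in the question.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the value of every id attribute seen in the document."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def duplicate_ids(html):
    """Return a dict of ids that occur more than once, with their counts."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    return {id_: n for id_, n in counts.items() if n > 1}


# Shortened, hypothetical version of the markup from the question.
page = """
<a id="DownloadButton" href="#">first.pdf</a>
<a id="DownloadButton" href="#">second.pdf</a>
<input id="DownloadImage" type="image" src="download.png">
"""
print(duplicate_ids(page))
```

Any id reported here is one that getElementById cannot address reliably, which is why the attribute selector with an index is needed in the loop above.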

How do I find all elements which have a css class in coded ui?

I need to find all Html controls which have a given css class.
var htmlControl = new HtmlControl(document);
htmlControl.SearchProperties[HtmlControl.PropertyNames.Class] = @class;
var uiTestControlCollection = htmlControl.FindMatchingControls();
Using the class name works when there is just one CSS class on the control. If more than one CSS class is applied to the element, can I search for the element by specifying just one CSS class rather than all of them?
Thanks
You can perform a partial match, like so:
htmlControl.SearchProperties.Add(HtmlControl.PropertyNames.Class, @class, PropertyExpressionOperator.Contains);
var uiTestControlCollection = htmlControl.FindMatchingControls();
The main drawback of this is that it is just a simple string comparison. To illustrate, imagine you have two controls A and B. A has class "Test" and B has classes "testdiv topnav". Now if you perform a search for "test", both controls A and B will be selected.
To match a class exactly, you can get as close a match as possible using the above method and then write a helper function to:
Loop through the collection
Get the class of each control
Split the class string on the spaces
Loop through this array and test each for an exact match
Keep the elements where a class matches exactly
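The steps above can be sketched language-neutrally. The Python below is only a model of the logic, not Coded UI code: controls are represented as plain dicts, and the partial match is assumed to be a case-insensitive substring test, as the "Test" / "testdiv topnav" example suggests. All names are hypothetical.

```python
def contains_match(controls, wanted):
    """Model of the partial match: substring test on the whole class string."""
    return [c for c in controls if wanted.lower() in c["class"].lower()]


def exact_class_match(controls, wanted):
    """Keep controls where one space-separated class equals `wanted`."""
    return [
        c for c in controls
        if any(token.lower() == wanted.lower() for token in c["class"].split())
    ]


controls = [
    {"name": "A", "class": "Test"},
    {"name": "B", "class": "testdiv topnav"},
]

candidates = contains_match(controls, "test")    # first pass: broad match
exact = exact_class_match(candidates, "test")    # second pass: exact class only

print([c["name"] for c in candidates])
print([c["name"] for c in exact])
```

The first pass keeps both controls, and the second pass narrows the result to control A only.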
Note: This is clearly non-optimal - I'm all ears if someone has a better solution.
Cheers,
Seb

can sizzle return elements that do not have any attributes?

I am experimenting with using a Sizzle query to extract elements without attributes from a webpage.
I can easily return an element by specifying its attribute.
Ex: doc.Tables.Filter(Find.BySelector("[ID='abc']")) returns elements with ID='abc'
Is there a way to return all the elements that do not have any attributes at all?
This is what I'm shooting for:
doc.Tables.Filter(Find.BySelector("[]")) 'return all element with no attributes
Thanks
Ken
