Loop through divs with the same class and grab text using Python WebDriver

I have everything else working with Selenium WebDriver, but I am not sure how to get the text from ALL divs with class Text3. I also have a problem with the div id="TableStart_00023": the number in it changes now and then (TableStart_00023, TableStart_0283, etc.).
Here is the relevant HTML:
<div data-reactroot="" id="TableStart_00023">
<ul>
<li class="FirstRow03">
<a class="aClass">
<div class="innerCl">
<div class="Text1"></div>
<div class="Text2"></div>
<div class="Text3">Wanted data</div>
<div class="Text4"></div>
</div>
</a>
</li>
<li class="FirstRow02">
<a class="aClass">
<div class="innerCl">
<div class="Text1"></div>
<div class="Text2"></div>
<div class="Text3">Wanted data 2</div>
<div class="Text4"></div>
</div>
</a>
</li>
</ul>
</div>
Here is the Python code I have so far:
for content in driver.find_elements_by_id('TableStart_00023'):
    mytext = content.find_element_by_xpath('.//div[@class="Text3"]').text
    print(mytext)
How can I loop through all divs with class Text3 and get their text when the number in the TableStart ID keeps changing? What am I doing wrong?

This XPath returns every div with class Text3 inside your table, whatever number follows TableStart:
//div[starts-with(@id,'TableStart')]//div[@class='Text3']
Once you have all these elements (using driver.find_elements_by_xpath), you can read the text from each of them.
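A minimal sketch of that loop, assuming `driver` is an already-started WebDriver with the page loaded. The helper name `collect_text3` is hypothetical. It uses the `find_elements(by, value)` form, because recent Selenium 4 releases no longer provide `find_elements_by_xpath`; the string "xpath" is the value of Selenium's `By.XPATH` constant.

```python
# XPath from the answer: any div whose id starts with "TableStart",
# then every descendant div whose class attribute is exactly "Text3".
TEXT3_XPATH = "//div[starts-with(@id, 'TableStart')]//div[@class='Text3']"


def collect_text3(driver):
    """Return the text of every matching div, in the order Selenium returns them.

    `driver` is assumed to be a Selenium WebDriver (hypothetical helper).
    """
    # "xpath" is the value of selenium.webdriver.common.by.By.XPATH.
    elements = driver.find_elements("xpath", TEXT3_XPATH)
    return [element.text for element in elements]
```

With a live driver, `for text in collect_text3(driver): print(text)` would print each wanted value on its own line.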

Related

Apply style to all nested li div's excluding the first li div

No matter what I've tried to either target or exclude the first li div, nothing works. I tried :not(and different variations of the selectors here) but that didn't work. On one occasion, the result rendered as though looking into a mirror that's facing a mirror. No bueno. No classes or ids can be added--representative example below.
ul {
list-style-type: none;
}
.message-content-wrap {
background-color: red;
}
<ul id="thread-list" class="group-message-thread">
<li>
<div class="message-wrap group-messages-highlight">
<div class="avatar-wrap"></div>
<div class="message-content">
<div class="message-content-wrap">
<p>This is content in the first message</p>
</div>
</div>
</div>
</li>
<li>
<div class="message-wrap">
<div class="avatar-wrap"></div>
<div class="message-content">
<div class="message-content-wrap">
<p>This is content in the second message</p>
</div>
</div>
</div>
</li>
<li>
<div class="message-wrap">
<div class="avatar-wrap"></div>
<div class="message-content">
<div class="message-content-wrap">
<p>This is content in the third message</p>
<p>some text</p>
</div>
</div>
</div>
</li>
<li>
<div class="message-wrap">
<div class="avatar-wrap"></div>
<div class="message-content">
<div class="message-content-wrap">
<p>This is content in the fourth message</p>
<p>some more text</p>
</div>
</div>
</div>
</li>
</ul>
Any ideas would be most welcome. My attempt in jsfiddle

A few ways to style child elements:
ul > li:not(:first-of-type) {
  /* styles go here */
}
ul > li {
  /* styles go here */
}
ul:first-child {
}
li + li {
}

Press "Visit product" button only if an <article> has an <li> class of "availability"

I have the following source code:
<form method="POST" data-component="compareForm" action="#">
<div class="row tsp" data-component="list-page-product">
<article id="123">
<div id='product'>
</div>
<div class="stock">
<ul class="simple" data-product="availability">
<li class="available">
<i class="icon-tick"></i>
<span>Delivery available</span></li>
</ul>
</div>
<div data-component="CT">
<button class="TT" type="button">Visit product</button>
</div>
</article>
<article id="1234">
<div id='product'>
</div>
<div class="stock">
<ul class="simple" data-product="availability">
<li class="available">
<i class="icon-tick"></i>
<span>Delivery available</span></li>
</ul>
</div>
<div data-component="CT">
<button class="TT" type="button">Visit product</button>
</div>
</article>
</div>
</form>
I would like to press the "Visit product" button if I find a class name of "available". In this example only article id="123" should be a match.
My code is:
if self.driver.find_elements_by_xpath("//li[@class='available']"):
    self.driver.find_element_by_xpath('//*[@class="TT"]').click()
The first error is that it cannot locate an element using XPath. I don't know what to do next. Any input is much appreciated. Thank you!

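If I were you, I would find the articles first, then iterate over them and click the button in each article where the available class is found.
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

d = webdriver.Chrome()
d.get('URL')
articles = d.find_elements_by_xpath('//article')
for article in articles:
    try:
        # The leading dot keeps each search inside the current article.
        article.find_element_by_xpath('.//li[@class="available"]')
        article.find_element_by_xpath('.//button[@class="TT"]').click()
    except NoSuchElementException:
        # This article has no li.available (or no button); skip it.
        pass
The @class="..." tests compare the whole class attribute, so this will not work if the button carries classes other than 'TT' or the li carries classes other than 'available'. In that case you could use find_element_by_class_name instead.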
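The same per-article check can also be written without exception handling, since `find_elements` returns an empty list when nothing matches. The sketch below assumes `driver` is a Selenium 4 WebDriver with the page loaded; the function name `click_visit_for_available` is hypothetical, and "xpath" is the value of Selenium's `By.XPATH` constant.

```python
def click_visit_for_available(driver):
    """Click the "Visit product" button in each <article> containing li.available.

    Returns the number of buttons clicked (hypothetical helper).
    """
    clicked = 0
    for article in driver.find_elements("xpath", "//article"):
        # An empty list is falsy, so no try/except is needed here.
        # The leading dot scopes the search to this article only.
        if article.find_elements("xpath", ".//li[@class='available']"):
            article.find_element("xpath", ".//button[@class='TT']").click()
            clicked += 1
    return clicked
```

If clicking the button navigates away from the page, the remaining article elements may go stale, so in that situation the list would need to be fetched again after each click.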

Python BeautifulSoup Regex Filter Not Working

I want content of div class 'hide info-json' whose parent li tags class is 'info-wrap' or 'info-wrap no-meta' but not 'info-wrap hide'.
HTML example:
<li class="info-wrap">
<div class="hide info-json">
<p>Content That I Want - JSON Data </p>
</div>
</li>
<li class="info-wrap hide">
<div class="hide info-json">
<p>Content That I Don't Want </p>
</div>
</li>
<li class="info-wrap no-meta">
<div class="hide info-json">
<p>Content That I Want - JSON Data </p>
</div>
</li>
Here is my code:
soup = BeautifulSoup(res.text, "lxml")
for divTags in soup.findAll('li', class_ = re.compile('^(?!.*hide).*info-wrap.*$')):
    for infoList in divTags.find_all('div',{'class':'hide info-json'}):
        Curinfo = json.loads(infoList.text)
but it returns nothing.
When I check this regex on https://regex101.com/r/8yJ5yI/1, it works fine. Please help me figure out how to do this.
Using a regex is not mandatory for me; all I want is <p>Content That I Want </p>
Thank you

import re
html = """<li class="info-wrap">
<div class="hide info-json">
<p>Content That I Want - JSON Data </p>
</div>
</li>
<li class="info-wrap hide">
<div class="hide info-json">
<p>Content That I Don't Want </p>
</div>
</li>
<li class="info-wrap no-meta">
<div class="hide info-json">
<p>Content That I Want - JSON Data </p>
</div>
</li>"""
l = re.findall(r"""<li\s+class="info-wrap(\s+no-meta)?"\s*>\s*
<div\s+class="hide\s+info-json"\s*>
\s*(.*?)\s*
</div>\s*
</li>
""",html, flags=re.VERBOSE|re.IGNORECASE|re.DOTALL)
l = [item[1] for item in l]
print(l)
Prints:
['<p>Content That I Want - JSON Data </p>', '<p>Content That I Want - JSON Data </p>']

Use :not (bs4 4.7.1+) to filter out unwanted class
from bs4 import BeautifulSoup as bs
html = '''<li class="info-wrap">
<div class="hide info-json">
<p>Content That I Want - JSON Data </p>
</div>
</li>
<li class="info-wrap hide">
<div class="hide info-json">
<p>Content That I Don't Want </p>
</div>
</li>
<li class="info-wrap no-meta">
<div class="hide info-json">
<p>Content That I Want - JSON Data </p>
</div>
</li>'''
soup = bs(html, 'lxml')
print([p.text for p in soup.select('.info-wrap:not(.hide) p')])

BeautifulSoup .find() capturing too much text (how do I narrow it down?)

wondering how to target the "Switch" text on the below html:
<div class="product_title">
<a href="/game/pc/into-the-breach" class="hover_none">
<h1>Into the Breach</h1>
</a>
<span class="platform">
<a href="/game/pc">
PC
</a>
</span>
</div>
<div class="product_data">
<ul class="summary_details">
<li class="summary_detail publisher" >
<span class="label">Publisher:</span>
<span class="data">
<a href="/company/subset-games" >
Subset Games
</a>
</span>
</li>
<li class="summary_detail release_data">
<span class="label">Release Date:</span>
<span class="data" >Feb 27, 2018</span>
</li>
<li class="summary_detail product_platforms">
<span class="label">Also On:</span>
<span class="data">
Switch </span>
</li>
</ul>
</div>
so far I am capturing the "Also On:" text as well (with a lot of spaces) with this code:
self.playable_on_systems_label.setText(self.html_soup.find("span", class_='platform').text.strip() + ', ' + self.html_soup.find("li", class_='summary_detail product_platforms').text.strip())
how do I capture (in this case) only the "Switch" text?
FYI - for the first half of the statement (capturing the "PC") text works fine just not the "also on" text
Thanks in advance,

Your query is getting the entire li element with class="summary_detail product_platforms", which includes all the text from "Also On:" through "Switch". Try something like .find('a', href=re.compile("^.+switch.+$")) or, alternatively, using a CSS selector, .select("a[href*=switch]").
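Another way to narrow the match, sketched here against the HTML exactly as posted in the question (where "Switch" is plain text inside span.data rather than a link), is to descend from the li to its span.data child. The variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Fragment copied from the question's HTML.
html = """
<li class="summary_detail product_platforms">
  <span class="label">Also On:</span>
  <span class="data">
    Switch </span>
</li>
"""

soup = BeautifulSoup(html, "html.parser")
# Select only the span.data inside the platforms li, so the
# "Also On:" label in the sibling span is left out.
data_span = soup.select_one("li.product_platforms span.data")
# get_text(strip=True) trims the surrounding whitespace and newlines.
also_on = data_span.get_text(strip=True) if data_span else ""
print(also_on)
```

The `if data_span else ""` guard covers pages that have no "Also On" entry, where `select_one` returns None.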

You can use BeautifulSoup's select() function to navigate to the "Switch" text:
from bs4 import BeautifulSoup
html = '''<div class="product_title">
<a class="hover_none" href="/game/pc/into-the-breach">
<h1>Into the Breach</h1>
</a>
<span class="platform">
<a href="/game/pc">
PC
</a>
</span>
</div>
<div class="product_data">
<ul class="summary_details">
<li class="summary_detail publisher">
<span class="label">Publisher:</span>
<span class="data">
<a href="/company/subset-games">
Subset Games
</a>
</span>
</li>
<li class="summary_detail release_data">
<span class="label">Release Date:</span>
<span class="data">Feb 27, 2018</span>
</li>
<li class="summary_detail product_platforms">
<span class="label">Also On:</span>
<span class="data">
<a class="hover_none" href="/game/switch/into-the-breach">Switch</a> </span>
</li>
</ul>
</div>'''
soup = BeautifulSoup(html, 'html.parser')
text = soup.select('.summary_detail.product_platforms .hover_none')[0].text.strip()
print(text)
Output:
Switch

Meteor.JS - Yogiben Favourites - How Can I Show Title?

I'm building a small app within meteor, using the yogiben favourites package to allow users to favourite select items in a collection.
Currently the below code only shows the ID of the favourite/post, rather than the corresponding title.
How can I show the title?
<template name="favoritesSidebar">
<div class="template-favorites-sidebar">
{{#if myFavorites.count}}
<div class="panel panel-default">
<div class="panel-heading">
<h3 class="panel-title">{{_ "favorites"}}</h3>
</div>
<div class="panel-body">
<ul>
{{#each myFavorites collection="Posts"}}
<li>
<a>{{_id}}</a>
</li>
<li>
<a>{{_title}}</a>
</li>
<li>
<a>{{Post._title}}</a>
</li>
{{/each}}
</ul>
</div>
</div>
{{/if}}
</div>
</template>
