I'm trying to learn Scrapy for Python 3 by writing a crawler that is supposed to get data from the Swedish e-commerce site Blocket.se.
The "next page" button at the bottom of the page is one of many buttons without a unique class or id. The only difference between the buttons is the actual element text; the tags look the same.
"Next page"-button html
<a class="page_nav" itemprop="name" href="?q=macbook+air&cg=0&w=1&st=s&c=&ca=11&l=0&md=th&o=2&last=1">
Nästa sida »
</a>
"1st page"-button html
<a class="page_nav" itemprop="name" href="?q=macbook+air&cg=0&w=1&st=s&c=&ca=11&l=0&md=th">
1
</a>
Is there a way to specifically target the "next page"-button for the pagination part in the Scrapy code? Maybe by the actual text inside the element?
Try response.xpath(u'//a[contains(text(), "Nästa sida")]/@href').get()
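The href in the question is relative (it starts with "?"), so it has to be joined with the URL of the current page before it can be requested. Below is a minimal sketch of the same idea (match the link by its text, then resolve the href) using only the Python standard library instead of Scrapy. The names LinkTextFinder and find_next_page and the page URL are assumptions made up for this illustration; inside a spider, Scrapy's own response.urljoin() does the joining step.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkTextFinder(HTMLParser):
    """Collect (text, href) pairs for every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self.links = []        # finished (text, href) pairs
        self._in_link = False  # True while inside an <a> element
        self._href = None      # href of the <a> currently open
        self._text = []        # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.links.append(("".join(self._text).strip(), self._href))
            self._in_link = False


def find_next_page(html, page_url, label="Nästa sida"):
    """Return the absolute URL of the first link whose text contains label."""
    finder = LinkTextFinder()
    finder.feed(html)
    finder.close()
    for text, href in finder.links:
        if label in text and href:
            return urljoin(page_url, href)
    return None


# Hypothetical page URL, used only to demonstrate the joining step.
PAGE_URL = "https://www.blocket.se/hela_sverige"
SAMPLE = """
<a class="page_nav" itemprop="name" href="?q=macbook+air&cg=0&w=1&st=s&c=&ca=11&l=0&md=th">
1
</a>
<a class="page_nav" itemprop="name" href="?q=macbook+air&cg=0&w=1&st=s&c=&ca=11&l=0&md=th&o=2&last=1">
Nästa sida »
</a>
"""
print(find_next_page(SAMPLE, PAGE_URL))
```

When no link carries the label (for example on the last page), find_next_page returns None, which is a natural stop condition for pagination.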
With the selenium package for Python, I want to:
1. Identify an element on a webpage
2. Select a language
I believe the HTML below corresponds to the element I have in mind:
<ul class="nav navbar-top-links navbar-right">
<li><a>Wersja:6653.606</a></li>
<img src="Img/Flags/32/pl.png">
<li class="dropdown"><a class="dropdown-toggle" data-toggle="dropdown" href="#">Wybór języka<span class="caret"></span></a>
<ul class="dropdown-menu dropdown-user">
<li><img src="Img/Flags/16/pl.png" style="margin-right: 10px">pl</li>
<li><img src="Img/Flags/16/en.png" style="margin-right: 10px">en</li>
</ul>
</li>
<li><span class="glyphicon glyphicon-log-in"></span> Logowanie</li>
</ul>
This is a dropdown list that offers two options:
Polish language
English language
How do I locate this specific object (the dropdown list) on the website?
I was thinking about something like:
select_element = driver.find_element_by_class_name("dropdown-menu dropdown-user")
How do I use the browser to select a specific option (let's say English)?
I assume some sort of selection is needed, but I have to find the element first.
Thank you.
It took me at least a couple of hours to find the solution, but I did it :).
To sum up:
1. I need to select and click the listbox with the language selection on the website.
2. Then I need to select and click the desired language.
**Solution:**
- You can find an element (like a listbox) with Selenium in several different ways:
find_element_by_id
find_element_by_name
find_element_by_xpath
find_element_by_link_text
find_element_by_partial_link_text
find_element_by_tag_name
find_element_by_class_name
find_element_by_css_selector
Documentation: https://selenium-python.readthedocs.io/locating-elements.html
- I tried several of them and failed.
I read that XPath is very reliable until somebody changes the HTML of the website.
- I have zero knowledge of HTML, CSS, etc., so finding the XPath was very difficult for me.
Fortunately, I found a Firefox add-on that did it for me:
https://github.com/trembacz/xpath-finder
- After installing the add-on, I first found the XPath for the listbox, then did the same for the language.
- Python code for both elements:
#Find Listbox with language selection and click it
select_element = driver.find_element_by_xpath("/html/body/form/div[3]/nav[1]/ul/li[2]/a")
select_element.click()
#Find language inside listbox and click it
select_element = driver.find_element_by_xpath("/html/body/form/div[3]/nav[1]/ul/li[2]/ul/li[2]/a")
select_element.click()
Done :)
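Absolute paths like the ones above break as soon as anything is inserted higher up in the page. A relative expression anchored on the dropdown's class, for example //li[@class='dropdown']/ul/li[normalize-space(.)='en'], may hold up better, although it has not been tried against the real page. The sketch below only checks the structural part of that idea offline with the standard library, on a cleaned-up (well-formed) copy of the HTML from the question; the helper name language_items is an assumption for this illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the relevant part of the navigation HTML from the question.
NAV = """
<ul class="nav navbar-top-links navbar-right">
  <li class="dropdown">
    <a class="dropdown-toggle" href="#">Wybór języka<span class="caret"></span></a>
    <ul class="dropdown-menu dropdown-user">
      <li><img src="Img/Flags/16/pl.png"/>pl</li>
      <li><img src="Img/Flags/16/en.png"/>en</li>
    </ul>
  </li>
</ul>
"""


def language_items(root):
    """Map each language label to its <li>, located relative to li.dropdown."""
    items = root.findall(".//li[@class='dropdown']/ul/li")
    return {"".join(li.itertext()).strip(): li for li in items}


root = ET.fromstring(NAV)
items = language_items(root)
print(sorted(items))
```

The point of the design is that the lookup depends only on the dropdown's class and the visible label, not on the position of the form, div, or nav elements around it.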
I am using the Firefox webdriver for Selenium to scrape a webpage that looks to be rendered with React on the client side. The classes in the rendered DOM look dynamically generated, and seem to change with every new request. There are also many button elements on the page, some of which are not in the viewport. So my strategy is to search for a way to click on a button that contains text that I enter using selenium. Several buttons will contain the text, and I want to just find the first such button.
Using Selenium and XPath, how would I select the first button that contains the text E9 1QJ?
<button>
<div><svg ...> </div>
<div>
<div>London</div>
<div>E9 1QJ</div>
</div>
</button>
<button>
<div><svg ...> </div>
<div>
<div>London Foo Bar</div>
<div>E9 1QJ</div>
</div>
</button>
Thanks
This should work:
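driver.find_element_by_xpath("(//button[div/div[text()='E9 1QJ']])[1]")
The parentheses make [1] pick the first match in the whole document rather than the first matching button under each parent.
But keep in mind that a solution like this is not very flexible and could break with even a minimal change to the HTML structure.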
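One way to depend less on the exact nesting is to match on the button's whole text content, which in XPath would be (//button[contains(., 'E9 1QJ')])[1]. The sketch below reproduces that logic offline with the standard library on a well-formed copy of the sample markup (the elided svg is replaced by an empty element, and the wrapping root element is added); the helper name first_button_with_text is an assumption for this illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the two buttons from the question.
SAMPLE = """
<root>
  <button><div><svg/></div><div><div>London</div><div>E9 1QJ</div></div></button>
  <button><div><svg/></div><div><div>London Foo Bar</div><div>E9 1QJ</div></div></button>
</root>
"""


def first_button_with_text(root, needle):
    """Return the first <button> whose combined text contains needle."""
    for button in root.iter("button"):
        if needle in "".join(button.itertext()):
            return button
    return None


root = ET.fromstring(SAMPLE)
match = first_button_with_text(root, "E9 1QJ")
# Collect the non-empty div labels inside the matched button.
labels = [d.text for d in match.iter("div") if d.text and d.text.strip()]
print(labels)
```

Because the check looks at the concatenated text of the button, it keeps working if the inner div wrappers are added or removed.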
I am developing an automation script and a part of it requires me to hover over a navigation bar to display a dropdown menu. The script is written using NodeJS and the browser used is Internet Explorer.
Navigation source code
...
<ul class="navigation" data-dojo-attach-point="nonmMenu">
<li class="dropdown">
<i class="fa fa-clipboard nav-icon" aria-hidden="true"></i><span>Accounts</span>
<div class="fulldrop i3">..</div>
</li>
</ul>
...
NodeJS code:
let xPathButton = "//span[text()='Accounts']";
//Find button to hover over
let buttonWithDropDown = driver.findElement(By.xpath(xPathButton));
//Hover
driver.actions().mouseMove(buttonWithDropDown).perform();
However, this does not work. The end goal is to click a link once the dropdown menu appears, which I have tried doing but as the element is not visible I get the exception ElementNotInteractableError: Cannot click on element. I would appreciate some pointers in the right direction to sort this out.
Update:
Been looking at this a bit more; Could the aria-hidden attribute in the anchor tag be causing the selenium driver to not detect the element?
Please note that changing the browser is not an option.
Try hovering over the a or li element instead. You can also try clicking it:
By.xpath("//a[span[.='Accounts']]")
By.xpath("//li[.//span[.='Accounts']]")
You can also try clicking the target link with JavaScript, without opening the menu first:
driver.executeScript("arguments[0].click();", yourDropdownMenuElement);
I'm implementing a bootstrap 3 carousel on kentico 9 and need some help with automatically hiding the carousel control (including the circle indicator and the next/previous arrow) if there's only one item left, if possible.
What I've done for the carousel was setting up a new page type for this in which each banner is a page in the content tree under /hero/ folder. Then used 2 repeaters: the first one displays the circle indicator; the second one displays the banner info. All worked well.
Here's how the indicator repeater is set up:
Content before: <ol class="carousel-indicators">
Content after </ol>
Item transformation: <li data-target="#hero-banner" data-slide-to="<%# DataItemIndex%>" class="<%# (DataItemIndex == 0 ? "active" : "" ) %>"></li>
It means the first circle is always there. How to hide it and get rid of the <ol> tags in content before/after?
The next/previous arrows are again in the webpart zone content after, which has this html:
<a class="left carousel-control" href="#hero-banner" data-slide="prev"><span class="icon-prev"></span></a>
<a class="right carousel-control" href="#hero-banner" data-slide="next"><span class="icon-next"></span></a>
</div> <!--/#hero-banner-->
Using content before/after is like hard-coding it onto the page, but I don't know how to make it displayed dynamically and automatically only when we have more than one item. Could you help?
You can use <%# DataItemCount %>, one of the transformation methods (https://docs.kentico.com/display/K8/Reference+-+Transformation+methods), to determine how many items there are. Then add the HTML only if there is more than one item. Something like:
<%# If(DataItemCount > 1, "html for more than one item", "html for only one") %>
Of course, if you are using the envelope before / after to show arrows, you could also use jquery to determine how many items are there & hide the arrows based off that.
$(function(){
if($(".carousel-indicators li").length == 1){
$(".left.carousel-control").hide();
$(".right.carousel-control").hide();
}
});
When I click the tab (an anchor tag) manually, it displays a drop-down menu (an unordered list). With Watir the element is located, but the drop-down menu is not displayed.
HTML
<ul>
<li id="NetworkAnalysisTabPanel__ext-comp-1038" class=" x-tab-strip-menuable x-tab-strip-active ">
<a class="x-tab-strip-close" onclick="return false;"></a>
<a class="x-tab-strip-menu" onclick="return false;"></a>
<a></a>
<a class="x-tab-right" onclick="return false;" href=""></a>
</li>
</ul>
Tried the following line of code to click on the tab
$ff.div(:id,"NetworkAnalysisTabPanel").div(:index,1).div(:index,1).ul(:index,1).li(:index,1).link(:index,2).fire_event("onClick")
I am using watir 1.6.6 version
First, since the HTML sample you provided does not include the element you are using in the command you attempted, it's hard to know where that might be going wrong. Second, since the code you provided does have a div with a unique ID, why not start there instead of with an outer container?
I think the problem is that you are using
.fire_event("onClick")
However, the code is monitoring for an event named "onclick" (all lower case).
Try using
.fire_event("onclick")
or if you have not already, perhaps just
.click
and see if that works for you
Also, I'd seriously recommend you upgrade to a more current version of Watir; 1.6.6 is pretty far behind the times.
Update: that HTML is starting to look very familiar to me. If this is the same basic control from the other two questions you've posted so far, then try firing the 'onmousedown' event against the element that invokes the menu and see if that works.