I am working on an application where I need to provide search criteria from a database, and the number of results displayed is not the same every time.
For example: a search for 1234 may display 4 results today and 3 results the next day, so my requirements are below:
1. How to provide the search criteria dynamically for the search (BDD).
2. How to display the results in the console for the given input.
3. How to display the results and validate them by clicking on each and every result.
Note: the app is developed using AngularJS. I have tried using a List of WebElements, but it is not picking up the dynamic search results in the drop-down; it only looks at the static page.
Can anyone also suggest how to handle the search results under Behaviour-Driven Development?
Please suggest a solution using Selenium commands; a sketch of the flow I have in mind is below.
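Here is a minimal Selenium sketch of the flow I am after (in Python for brevity; the URL and locators are placeholders, since they depend on the app's real markup):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://example.com/search")  # placeholder URL

# Type the search criterion (e.g. fed in from a BDD step or a database query).
driver.find_element(By.ID, "search").send_keys("1234")  # placeholder locator

# Wait for the dynamic drop-down to render before reading the results.
wait = WebDriverWait(driver, 10)
results = wait.until(EC.visibility_of_all_elements_located(
    (By.CSS_SELECTOR, ".search-results li")))  # placeholder selector

# Print every result, then click through each one by index;
# elements go stale after navigation, so re-locate the list each time.
for i in range(len(results)):
    items = wait.until(EC.visibility_of_all_elements_located(
        (By.CSS_SELECTOR, ".search-results li")))
    print(items[i].text)
    items[i].click()
    driver.back()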
Thanks for all your support.
I have been trying to find a solution to this for many days and could not find one; please help me with a solution.
I am using the popup overlay of Google CSE, and I am allowing only a few sites to show in the search results.
If a search result is available, it looks like this: https://prnt.sc/qAt0KbTVPNb7
If no search result is available, it shows an empty result interface like this: https://prnt.sc/WnaeQ6_7yyPF
I want the close button to be triggered automatically if no search result exists.
How can I do this? Please help me with this :(
I'm trying to extract some data from an Amazon product page.
What I'm looking for is the images of the products. For example:
https://www.amazon.com/gp/product/B072L7PVNQ?pf_rd_p=1581d9f4-062f-453c-b69e-0f3e00ba2652&pf_rd_r=48QP07X56PTH002QVCPM&th=1&psc=1
By using the XPath
//script[contains(., "ImageBlockATF")]/text()
I get the part of the source code that contains the URLs, but two options pop up in the Chrome XPath helper.
By trying things out with XPaths I ended up using this:
//*[contains(@type, "text/javascript") and contains(.,"ImageBlockATF") and not(contains(.,"jQuery"))]
Which gives me exclusively the data I need.
The problem I'm having is that for certain products (it can happen between two different pairs of shoes), sometimes I can extract the data and other times nothing comes out. I extract by doing:
imagenesString = response.xpath('//*[contains(@type, "text/javascript") and contains(.,"ImageBlockATF") and not(contains(.,"jQuery"))]').extract()
If I use the Chrome XPath helper, the data always appears with the XPath above, but in the program itself it sometimes appears and sometimes does not. I know the script the console reads is sometimes different from the one that appears on the site, but I'm struggling with this one because it works only intermittently. Any ideas on what could be going on?
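For context, the extraction sits inside a spider along these lines (the spider name and the yielded item are illustrative, not my real code):

import scrapy

class AmazonImagesSpider(scrapy.Spider):
    name = "amazon_images"  # illustrative name
    start_urls = [
        "https://www.amazon.com/gp/product/B072L7PVNQ?pf_rd_p=1581d9f4-062f-453c-b69e-0f3e00ba2652&pf_rd_r=48QP07X56PTH002QVCPM&th=1&psc=1",
    ]

    def parse(self, response):
        # Grab the <script> block that defines ImageBlockATF but is not the jQuery loader.
        imagenesString = response.xpath(
            '//*[contains(@type, "text/javascript") and '
            'contains(., "ImageBlockATF") and not(contains(., "jQuery"))]'
        ).extract()
        yield {"raw_script": imagenesString}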
I think I found your problem: it's a captcha.
Follow these steps to reproduce:
1. Run scrapy shell (the URL must be quoted in the shell because of the & characters):
scrapy shell "https://www.amazon.com/gp/product/B072L7PVNQ?pf_rd_p=1581d9f4-062f-453c-b69e-0f3e00ba2652&pf_rd_r=48QP07X56PTH002QVCPM&th=1&psc=1"
2. View the response the way Scrapy sees it:
view(response)
When executing this I sometimes got a captcha.
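If the captcha is indeed the cause, one way to confirm it from inside a spider is a check like the sketch below (the "captcha" string test is an observed heuristic about Amazon's block page, not a documented contract):

import scrapy

class RetryOnCaptchaSpider(scrapy.Spider):
    name = "retry_on_captcha"  # illustrative name
    start_urls = ["https://www.amazon.com/gp/product/B072L7PVNQ"]

    def parse(self, response):
        # Amazon's block page contains the word "captcha" (a heuristic,
        # not a documented contract); retry the request instead of parsing.
        if b"captcha" in response.body.lower():
            self.logger.warning("Captcha at %s; retrying", response.url)
            yield scrapy.Request(response.url, callback=self.parse, dont_filter=True)
            return
        yield {"html": response.text}  # normal extraction would go here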
Hope this points you in the right direction.
Cheers
I would like to create a spreadsheet that I can refresh to pull in each week's English Premier League fixtures; each week I would like to refresh it and see that week's upcoming fixtures. I have tried to use the import function from Data/From Web and selected the box with the table of fixtures, but no data gets pulled into the spreadsheet.
The website I am using is http://data.7m.com.cn/matches_data/92/en/index.shtml
I am open to a better way of doing this import, and if there is a better website to use I am happy to change; I chose this one because it seems to have the most simplified listing of the fixtures.
I have also tried this website - https://www.premierleague.com/fixtures
When the import completes, it actually skips all the fixtures and returns all the other information.
Should I be looking at some of the HTML elements within the script of the web page to extract the data?
For example, on https://www.premierleague.com/fixtures I am looking for a file received by the website that updates the fixtures each week. After some direction from Google, I hit F12 and looked within the "Network" tab, but I can't understand how this website, or the others quoted, creates the weekly fixtures.
Any suggestions on how to pull this into Excel or another tool would be fantastic.
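In case it helps clarify what I'm after, this is the kind of pull I'd be happy with in Python (an untested sketch; it assumes the fixtures come down as a plain HTML table, and the table index is a guess):

# Needs lxml for read_html and openpyxl for to_excel.
import pandas as pd

# read_html returns every <table> on the page as a DataFrame.
tables = pd.read_html("http://data.7m.com.cn/matches_data/92/en/index.shtml")
fixtures = tables[0]  # index 0 is a guess; inspect len(tables) to find the right table
fixtures.to_excel("fixtures.xlsx", index=False)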
Welcome to Stack Overflow! It sounds like you haven't done as much research as you could have. Your first link has, in the top corner, links to "Free Feed" which take you to customizable widgets, and from there is a link to a customizable live template. The first page also has a link to "Data"; I'm not sure what that consists of or whether it will help (since I'm not much of a sports fan on my continent, and even less on yours!).
As for importing into Excel, I didn't have an issue with the table I could see, but once again I'm not clear on what data you're trying to get and what you want to do with it.
1. On the ribbon's Data tab, click From Web.
2. Enter the first URL from your question and hit Enter.
3. When the Navigator window loads, click "Table 1" and then click Load.
Excel then automatically loaded the data as a table.
If instead of clicking Load you click Edit, you are brought into the Power Query Editor, where you can customize tons of stuff. The one I was interested in was Use First Row as Headers. After choosing that, clicking Close & Load, and 30 seconds of formatting, I had a tidy fixtures table.
With Power Query you can choose, remove, split, or combine columns from this or other tables. It's fairly advanced, but you should be able to find a good Power Query tutorial online to see examples of what you can do and to learn about other ways to customize the import and/or analysis of the data.
Edit:
More Information:
Here are the instructions for all versions:
Office Support: Connect to a web page (Power Query)
I am working on a travel bot. The user can search for and book flights by entering origin and destination along with dates.
I have integrated a Node.js server, and I have an external API to retrieve flight details based on the search.
Everything is working fine, but how do I display the results in a template format (generic template)?
I have found a similar bot, Skyscanner, which displays the search results in a beautiful way.
Like the one below:
[image: Skyscanner flight search results]
They have converted the search results into an image and display it in a generic template (how can we do this?).
How can I display search results in a template format?
Appreciate the help!
That appears to be an image they are generating and attaching on the fly to the generic template. You could also use the airline itinerary template:
https://developers.facebook.com/docs/messenger-platform/send-api-reference/airline-itinerary-template
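If you go the generic-template route, a minimal sketch of the Send API call looks like this (in Python; the page token, recipient ID, and card contents are placeholders):

import requests

PAGE_ACCESS_TOKEN = "YOUR_PAGE_ACCESS_TOKEN"  # placeholder

payload = {
    "recipient": {"id": "USER_PSID"},  # placeholder recipient
    "message": {
        "attachment": {
            "type": "template",
            "payload": {
                "template_type": "generic",
                "elements": [{
                    "title": "BLR to DXB, 14 Jun",       # placeholder card text
                    "subtitle": "1 stop, from $420",     # placeholder card text
                    "image_url": "https://example.com/card.png",  # your generated image
                    "buttons": [{
                        "type": "web_url",
                        "url": "https://example.com/book",  # placeholder link
                        "title": "Book",
                    }],
                }],
            },
        }
    },
}

resp = requests.post(
    "https://graph.facebook.com/v2.6/me/messages",
    params={"access_token": PAGE_ACCESS_TOKEN},
    json=payload,
)
print(resp.status_code, resp.text)

The image_url field is where a Skyscanner-style generated results image would be plugged in.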
There are several websites within the cruise industry that I would like to scrape.
Examples:
http://www.silversea.com/cruise/cruise-results/?page_num=1
http://www.seabourn.com/find-luxury-cruise-vacation/FindCruises.action?cfVer=2&destCode=&durationCode=&dateCode=&shipCodeSearch=&portCode=
In some scenarios, like the first one shown, the results pages follow a pattern - ?page_num=1...17. However, the number of results will vary over time.
In the second scenario, the URL does not change with pagination.
At the end of the day, what I'd like to do is to get the results for each website into a single file.
Q1: Is there any alternative to setting up 17 scrapers for scenario 1 and then actively watching as the results grow/shrink over time?
Q2: I'm completely stumped about how to scrape content from the second scenario.
Q1- The free tool from import.io does not have the ability to actively watch the data change over time. What you could do is have the data bulk-extracted by the Extractor (with 17 pages this would be really fast) and added to a database. After each entry to the database, the entries could be de-duplicated or marked as unique. You could do this manually in Excel or programmatically; a sketch of the programmatic route follows.
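For the programmatic route, a minimal de-duplication sketch in Python (the file names are placeholders; it assumes both extracts share the same columns):

import pandas as pd

# Append today's extract to the running file, then drop exact duplicate rows.
old = pd.read_csv("cruises.csv")          # placeholder running database
new = pd.read_csv("todays_extract.csv")   # placeholder bulk-extract output
combined = pd.concat([old, new]).drop_duplicates()
combined.to_csv("cruises.csv", index=False)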
Their Enterprise (data as a service) could do this for you.
Q2- If there is not a unique URL for each page, the only tool that will paginate the pages for you is the Connector.
I would recommend building an extractor to get the pagination. The result of this extractor will be a list of links, each link corresponding to a page.
This way, every time you run your application and the number of pages changes, you will always get all the pages.
After that, make a call for each page to get the data you want.
Extractor 1: Get pages -- Input: The first URL
Extractor 2: Get items (data) -- Input: The result from Extractor 1
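A minimal sketch of that two-extractor scheme in Python (the selectors are placeholders; adjust them to each site's real markup):

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

FIRST_URL = "http://www.silversea.com/cruise/cruise-results/?page_num=1"

# Extractor 1: collect one link per results page from the pagination controls.
soup = BeautifulSoup(requests.get(FIRST_URL).text, "html.parser")
page_links = {urljoin(FIRST_URL, a["href"])
              for a in soup.select("a[href*='page_num=']")}  # placeholder selector
page_links.add(FIRST_URL)

# Extractor 2: visit each page and pull out the items.
items = []
for url in sorted(page_links):
    page = BeautifulSoup(requests.get(url).text, "html.parser")
    items.extend(el.get_text(strip=True)
                 for el in page.select(".cruise-result"))  # placeholder selector

print(len(items), "items collected")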