With Excel Power Query it's possible to pull data from a website, provided it's in a database/table format.
Many online databases, however, are so large that they implement a search function instead of showing the entire database, which is fine but creates a barrier when trying to efficiently locate information for many keywords.
The database I wish to search is:
https://apps.who.int/food-additives-contaminants-jecfa-database/search.aspx
Is it possible to create a list of keywords/CAS numbers, search the database for each of these sequentially, and return the data found? This is similar to web scraping but with the added step of actually searching for the data beforehand.
It's entirely possible to achieve what you want.
First you analyze the page, specifically the input box and the submit button, to find what identifies them. I use Chrome Developer Tools for this: just open the desired page and press F12.
In this case the input box is:
<input name="ctl00$ContentPlaceHolder1$txtSearch" type="text" id="ContentPlaceHolder1_txtSearch">
and the submit button is:
<input type="submit" name="ctl00$ContentPlaceHolder1$btnSearch" value="Search" id="ContentPlaceHolder1_btnSearch">
You can then use the ids to address the box with JavaScript:
var inputBox = document.getElementById('ContentPlaceHolder1_txtSearch');
inputBox.value = 'your search string';
And the equivalent for the submit button:
var searchButton = document.getElementById('ContentPlaceHolder1_btnSearch');
searchButton.click(); // Start the search
When the results are delivered, you then need to analyze that page to figure out what JavaScript code is needed to extract the part of the page you're interested in. Or you can dump the complete page with:
document.documentElement.outerHTML;
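If the results come back as an HTML table, a small JavaScript sketch like the following can collect the rows and turn them into CSV for pasting into Excel. The `table` selector and the table structure here are assumptions; adjust them to whatever the results page actually renders.

```javascript
// Pure helper: turn a 2D array of cell values into CSV text.
// Quotes are doubled so values containing quotes stay intact.
function rowsToCsv(rows) {
  return rows
    .map(row => row.map(cell => '"' + String(cell).replace(/"/g, '""') + '"').join(','))
    .join('\n');
}

// Browser-side part: run in the page console after a search completes.
// The 'table' selector is a guess; narrow it to the real results table.
function extractResultRows() {
  const table = document.querySelector('table');
  if (!table) return [];
  return Array.from(table.rows).map(tr =>
    Array.from(tr.cells).map(td => td.textContent.trim())
  );
}
```

Calling rowsToCsv(extractResultRows()) in the console then gives you text you can save as a .csv file.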
Excel VBA example code for running JavaScript on a webpage is here:
https://github.com/peakpeak-github/libEdge
Modify the code to suit your needs.
Learning RPA with UIPath. Happily extracting on-screen data from a website, processing it, using it, etc.
However, there's information in the page that isn't visible, but is in the source, eg, open graph meta tags:
<meta property="og:image" content="https://example.com/foo.jpg" />
What options are open to me to extract this with UIPath? I gather there's an ExtractMetaData flag on ExtractData, but I've yet to find a useful tutorial that I can follow at this stage.
You can try using the Data Scraping option by selecting it from the Wizards tab, as shown below:
Now you need to indicate on the screen the area of data that you need to scrape, like:
Structure data in form of table
Specific element on the web page
Or the whole window
The Data Scraping activity generates a container (Attach Browser or Attach Window) with a selector for the top-level window, and an Extract Structured Data activity with a partial selector, as per the images below:
So all you need to do is place your XML tag as input under the ExtractMetadata field, as per the image below:
Hope this information is useful.
Hello, I'm making a website and I want a user to be able to edit some of the text content on the website, click a "Save" button, and have the changes be permanent.
I suppose I have to put all the text in a database instead of keeping it in the HTML code, but this will slow down the website's loading performance.
Is there a library to easily achieve this functionality? Perhaps by changing the HTML code instead of acting on a database.
You have to read the file content and load it into a form in a textarea or WYSIWYG editor, and after the form is submitted, write the changed data back into the file. The code for loading and saving data depends on the programming language; check the documentation for opening and saving files in PHP here:
http://php.net/manual/en/function.file-put-contents.php
In my Lotus Domino web application, I have a customized search form where the user can enter criteria (around 10 of them). What I would like to do is pass the result to another page/form using HTML.
But my concern is that I would like to access div elements on the output form/page, and I am not sure if I can do that in the WebQuerySave (WQS) agent of the search form.
Basically, what I want to do is compose HTML in the WQS agent and assign that HTML to a div on the output search form, but I am not sure how to access a div element of another form from the WQS agent of the current form.
I can display the result in the same form, but the question remains: how do I access a div element in the WQS agent in LotusScript?
Using the document context we can access the fields of the current submitted document, but I am not sure about div elements.
Please assist.
You could use some REST here. Basically submit the search form to a REST service, collect the results and render them as needed.
In short, all you can do in a WQS is to spit out a stream of text (which may or may not be HTML) from the server to the browser. So I think you have a couple of options:
In your template HTML, add a placeholder where your <div> is and do a replace() (replace the placeholder with the HTML you want to appear in that div) before sending the HTML to the browser, or
Output enough JavaScript and/or jQuery so that the div is updated by the client after the document is loaded. Of course, there's no guarantee that will happen.
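The placeholder idea in the first option is just a string replacement done before the agent prints the page. Here it is sketched in JavaScript for brevity; in a real WQS agent you'd do the same with LotusScript's Replace function, and the @@RESULTS@@ token is made up:

```javascript
// Template HTML with a made-up placeholder where the div's content goes
const template = '<html><body><div id="results">@@RESULTS@@</div></body></html>';

// Swap the placeholder for the generated results HTML before
// streaming the finished page to the browser
function renderResults(template, resultsHtml) {
  return template.replace('@@RESULTS@@', resultsHtml);
}
```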
Another approach is just to create a Notes document with computed fields and/or computed text. In this case you don't think of updating the div as updating an HTML div; you're updating computed text on a Notes form, and you return the document to the browser as a document instead of messing with a WQS agent.
I suppose your WQS agent could also potentially send out just JavaScript to update another page, but to me that smacks of a cross-site scripting attack.
I'm using Capybara with Cucumber.
The webpage I'm testing contains many email fields throughout, but the IDs and labels for the input fields change depending on which page you're on.
What I'm trying to do is create a generic reference to any email field so that one fill-in method will work for all pages.
When inspecting the input fields, I can see they are of type='email'.
The full html:
<input id="privatekeeper_email_email" name="privatekeeper_email.email" value="" data-validity-message="Must be a valid email address" no_optional_label="true" type="email" autocomplete="off" maxlength="254">
In the block below you should be able to grasp what I'm trying to do:
email_fields = all('input[type="email"]')
fill_in(email_fields[0], with: text)
fill_in(email_fields[1], with: text)
end
When I run this, I get the following error:
Capybara::ElementNotFound: Unable to find field #<Capybara::Node::Element tag="input" path="/html/body/div[3]/div/div[2]/form/div/div[2]/div[6]/div/div[2]/div/div/div/div[2]/input">
Reading the Capybara docs, I can see that fill_in responds to ID, name or label, so my reference might not work. Is there any way I could get this block to work?
Like I said, the IDs and labels are not consistent throughout the user journey.
Since you've already found the elements, you need to call #set on them instead of using fill_in (which expects a locator such as an ID, name or label, not an element):
email_fields[0].set(text)
email_fields[1].set(text)
I'm a ColdFusion developer working on a reporting application to display information from a CFSTOREDPROC process. I've been able to get the data from my query to display correctly in a CFGRID, and I'm really happy with the display of the data. The grid saves a lot of time because it avoids using the CFOUTPUT tag and formatting the data in HTML for hundreds of reports.
All I would like to do is add a simple Disk Icon somewhere on the datagrid control that would save the contents of the datagrid and export it into an XLSX(2010) file that an end user could then manipulate in a spreadsheet program. This is important because the data needs to have a 'snapshot' at certain times of year saved.
Solutions Tried:
I looked into having a link from the report options page that would fire into a report_xls.cfm page, but designing a page that catches all of the report options a second time seems wasteful and would add thousands of CFMs to the website.
CFSPREADSHEET seems not to work for a variety of reasons. One is that the server constantly fights me over the 'write' function in this tag. Another is that I don't know how to make the JavaScript work for this button to get the output that I want.
I also looked into doing this as a JavaScript button that would fire based on the data entered. Although the data from a CFSTOREDPROC displays correctly if I use a CFOUTPUT block, CFGRID seems to have a hard time with all output formats except HTML. This has caused some difficulty because the application doesn't spit out a neat HTML table but instead sends a JavaScript page section.
Raymond Camden's blog contains an entry Exporting from CFGRID that we used in our project.
The example in the article exports to PDF, but it is rather simple to modify the download.cfm file to export to Excel files as well:
You modify the file to generate the <table>...</table> HTML from his example in a <cfsavecontent variable="exportList"> tag, so that the #exportList# variable contains the table that will be shown in the spreadsheet.
Next, we use a URL parameter mode that determines whether the content is exported to PDF or Excel.
So the end of our download.cfm looks like the following:
<cfif url.mode EQ "PDF">
    <cfheader name="Content-Disposition" value="inline; filename=report.pdf">
    <cfdocument format="pdf" orientation="landscape">
        <cfoutput>#exportList#</cfoutput>
    </cfdocument>
<cfelse>
    <cfcontent type="application/vnd.ms-excel">
    <cfheader name="Content-Disposition" value="attachment; filename=report.xls">
    <cfoutput>#exportList#</cfoutput>
</cfif>