Nodejs writing a scraper that can read JS protected websites - node.js

Node.js is new to me, and I've just started learning to write web scrapers. I need to scrape a website that is protected with JS: I want to collect phone numbers, but the div with the phone number appears only after the user clicks the "show number" button. Is there any way, or any npm package, to get the numbers? The website is in Russian, so the button is labeled "показать телефоны" ("show phones"). Thank you in advance!

Go to the page with the phone numbers, open the browser's developer tools, and watch the network tab. When you click a phone number, the page makes an AJAX request and gets back a nicely formatted JSON response:
{"status":"ok","phone":"+7 (727) 317-20-86","html_tooltip":"<section class=\"company-phones-tooltip\">\r\n <div class=\"company-phones-tooltip__wrap\">\r\n <header class=\"company-phones-tooltip__header\">\r\n Inform the manager that you learned the information on Allbiz.\r\n <\/header>\r\n <ul class=\"company-phones-tooltip-list\">\r\n <li class=\"company-phones-tooltip-list__item\">\r\n <div class=\"company-phones-tooltip-list__name\">\r\n management\r\n <\/div>\r\n <div class=\"company-phones-tooltip-list__number\">\r\n +7 (727) 317-20-86\r\n <\/div>\r\n <\/li>\r\n <\/ul>\r\n <footer class=\"company-phones-tooltip__footer\">\r\n <a class=\"company-phones-tooltip__link\" href=\"https:\/\/12246-kz.all.biz\/contacts\" target=\"_blank\">\r\n Show all contacts\r\n <\/a>\r\n <\/footer>\r\n <\/div>\r\n<\/section>"}
To emulate this, scrape the page and then call the http://api.all.biz/ajax/viewphonenew/kz endpoint with the correct parameters. All the query parameters this endpoint needs are in the data attributes of the HTML element:
<div class="company-phones__wrap" data-click="company-phones" data-entid="58474" data-verify="bYjmFpAfm5QWOgIjx8cyNOARdSG3FIoPo6he2dYGLIc=" data-phone="Zk6xDyCXPMqWMXgTaCI51A24FHIsDwuy8IaF993LsHI=" data-country="kz" data-placement="company-phones-tooltip___3" data-tooltip-direction="left" data-source="list">
<div class="company-phones__code">+7(7 </div>
<div class="company-phones__main" data-ga="show-phones-list" data-ga-id="">
<div class="company-phones__btn">Показать телефоны</div>
</div>
</div>
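As a rough sketch of that approach in Node.js: the helpers below pull the data-* attributes out of the scraped HTML, build a request URL, and read the phone out of the JSON reply. The function names are hypothetical, and the query parameter names (entid, verify, phone, country) are only assumed to mirror the attribute names; compare them with the real request in the network tab, which will also show whether it is a GET or a POST.

```javascript
// Sketch only: parameter names are assumed to mirror the data-* attributes.
const ENDPOINT = 'http://api.all.biz/ajax/viewphonenew/kz';

// Collect every data-* attribute from the first <div> whose opening tag
// carries data-click="company-phones". Returns null if none is found.
function extractPhoneParams(html) {
  const tag = html.match(/<div[^>]*data-click="company-phones"[^>]*>/);
  if (!tag) return null;
  const params = {};
  const attr = /data-([\w-]+)="([^"]*)"/g;
  let m;
  while ((m = attr.exec(tag[0])) !== null) {
    params[m[1]] = m[2];
  }
  return params;
}

// Build a URL with a query string; URLSearchParams percent-encodes
// characters such as '+', '/' and '=' found in the base64-like values.
function buildPhoneUrl(params) {
  const query = new URLSearchParams({
    entid: params.entid,
    verify: params.verify,
    phone: params.phone,
    country: params.country,
  });
  return `${ENDPOINT}?${query}`;
}

// The reply is JSON with "status" and "phone" fields, as in the sample above.
function parsePhone(body) {
  const data = JSON.parse(body);
  return data.status === 'ok' ? data.phone : null;
}
```

With Node 18 or later, which ships a global fetch, the pieces could be combined as `parsePhone(await (await fetch(buildPhoneUrl(params))).text())` inside an async function, assuming the endpoint accepts a plain GET.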


Unable to click expand button using selenium web driver in Python

I'm totally new to programming and am trying to automate a few daily tasks using Selenium WebDriver with Python. I have a web page that contains multiple "+" expand buttons. Below is the markup.
Without expansion:
<div class="expansion container">
<div class="expansion_base_parent"></div>
<div class="expansion expansion_parent">
<button type="button" class="compact-visual-toggle"></button>
</div>
</div>
With expansion
<div class="expansion_container">
<div class="expansion_base_parent"></div>
<div class="expansion expansion_parent">
<button type="button" class="compact-visual-toggle active"></button>
</div>
</div>
I'm unable to find this element using any of the find_by methods.
My colleague said the page contains JSON, and that is why find_by cannot locate it.
Can somebody please help with the code to locate and click the expand button?
Actual page:
Go-to https://fortigate.fortidemo.com
Username demo
Password demo
Click login read-only
Click later in next window
Now navigate to Network > Interfaces. There you can see a lot of expand buttons; those are what I'm referring to. I have written code that gets as far as this page, but I want to expand the rows before taking a screenshot of the page.
This has nothing to do with JSON.
You are saying
<div class="expansion container">
<div class="expansion_base_parent"></div>
<div class="expansion expansion_parent">
<button type="button" class="compact-visual-toggle"></button>
</div>
</div>
that you see this HTML when the expand button is present.
You can locate the button with this CSS selector:
button.compact-visual-toggle
or this XPath:
//button[contains(@class, 'compact-visual-toggle')]
Since you mentioned that there are multiple expand buttons, you can tell them apart by index:
(//button[contains(@class, 'compact-visual-toggle')])[1]
selects the first one; for the second, try:
(//button[contains(@class, 'compact-visual-toggle')])[2]
and so on for the 3rd, 4th, and the rest.
In code you can use it like this (in Selenium 4.3 and later the find_element_by_* helpers were removed; use driver.find_element(By.XPATH, "...") with By imported from selenium.webdriver.common.by instead):
driver.find_element_by_xpath("//button[contains(@class, 'compact-visual-toggle')]").click()
or
driver.find_element_by_xpath("(//button[contains(@class, 'compact-visual-toggle')])[1]").click()
or
driver.find_element_by_xpath("(//button[contains(@class, 'compact-visual-toggle')])[2]").click()
Now, for the second part:
<div class="expansion_container">
<div class="expansion_base_parent"></div>
<div class="expansion expansion_parent">
<button type="button" class="compact-visual-toggle active"></button>
</div>
</div>
If you want to collapse it again, you could use the XPath below:
//button[contains(@class, 'compact-visual-toggle active')]
and use it the same way as above.
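Since the goal is to expand everything before taking a screenshot, another option is to run a small script inside the page, for example through Selenium's execute_script. The sketch below is JavaScript rather than Python, and expandAll is a hypothetical helper name. It assumes, based on the two HTML samples above, that a collapsed button lacks the active class, and that the buttons live in the main document rather than in an iframe.

```javascript
// Click every toggle button that is not yet expanded and return how many
// were clicked. `root` is anything that offers querySelectorAll, normally
// the page's `document`.
function expandAll(root) {
  const buttons = Array.from(
    root.querySelectorAll('button.compact-visual-toggle:not(.active)')
  );
  for (const button of buttons) {
    button.click();
  }
  return buttons.length;
}
```

In the browser this would be called as `expandAll(document)`. If the page renders its rows lazily, a wait may still be needed before the buttons exist.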

keystone.js Currently logged in user

I'm new to Keystone and have been trying all day to get the name of the currently logged-in user, but I'm not clear on how to do this.
If I take the index view from keystone for example
{{!< default}}
<div class="container">
<div class="jumbotron"><img src="/images/logo.svg" width="160">
<h1>Welcome</h1>
<p>This is your new <a href='http://keystonejs.com' target='_blank'>KeystoneJS</a> website.</p>
<p>
It includes the latest versions of
<a href='http://getbootstrap.com/' target='_blank'>Bootstrap</a>
and <a href='http://www.jquery.com/' target='_blank'>jQuery</a>.
</p>
<p>Visit the <a href='http://keystonejs.com/guide' target='_blank'>Getting Started</a> guide to learn how to customise it.</p>
<hr>
<p>We have created a default Admin user for you with the email <strong>masterofimps@yahoo.co.uk</strong> and the password <strong>admin</strong>.</p>
<p>Sign in to use the Admin UI.</p>
<hr>
<p>
Remember to <a href='https://github.com/keystonejs/keystone' target='_blank'>Star KeystoneJS on GitHub</a> and
<a href='https://twitter.com/keystonejs' target='_blank'>follow @keystonejs</a> on twitter for updates.
</p>
</div>
</div>
and I want to add the user name to the view code, something like
<h1>Welcome Tim</h1>
I have tried
<h1>Welcome {{locals.user}}</h1>
<h1>Welcome {{locals.user.name}}</h1>
<h1>Welcome {{req.user}}</h1> (using an Express request)
But to no avail. How do I get the user name from the User model? Do I have to define something else in ..\routes\views first?
If anyone could help with an example or a pointer in the right direction, I'd be very grateful indeed!
If you used the Yeoman generator, one of the middleware functions included (initLocals) sets the current user to a local variable.
res.locals.user = req.user;
https://github.com/keystonejs/generator-keystone/blob/master/app/templates/routes/_middleware.js#L27
You don't need to include locals before the variable name in Handlebars. So remove it in order to get the current user information.
{{user}}
{{user.name}}
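To make the mechanism concrete, here is a simplified sketch of an initLocals-style Express middleware. It is not a copy of the generator's file, only the one assignment quoted above wrapped in the usual (req, res, next) signature.

```javascript
// Simplified sketch of an initLocals-style middleware.
function initLocals(req, res, next) {
  // Anything placed on res.locals can be referenced by name in the view,
  // which is why the template uses {{user}} and not {{locals.user}}.
  res.locals.user = req.user;
  next();
}
```

In a generator-based project this kind of middleware is typically registered before the routes run, so every view gets the user. If the name field on your User model is a composite with first and last parts (an assumption about your model), `{{user.name.first}}` may be what you want to print.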

Add Elements to Landing Page Header in Weebly

I would like to change the call to action button on the landing page of my website to a social icons element. How do I edit the HTML code to do this? I am using the "Paris - Business" theme. Thanks.
There is some documentation on how to do this if you search the web. For example, my own site has a basic example of what to look for: Landing Page Button Removal.
In that case, you could replace the code that makes the button with your own code.
That being said, there are some new features you can use that would make the content editable. Weebly just introduced "Sections", which let you have different sections on a page and also let you drag and drop elements into the Header area.
*Before you go ahead and do this, note that Weebly plans to make these changes to the newer themes in the near future. When Paris will be updated, or whether it will be at all, is anybody's guess.
Depending on the design of your theme, the markup might be slightly different, so please keep that in mind.
Basically, for the Paris Theme, what was:
<div class="banner-wrap wsite-background">
<div class="container">
<div class="banner">
<h2>{headline:text global="false"}</h2>
<p>{headline-paragraph:text global="false"}</p>
<div class="button-wrap">{action:button global="false"}</div>
<span id="contentArrow"><span></span></span>
</div>
</div>
</div>
Would become:
<div class="banner-wrap wsite-background">
<div class="container">
{{#header}}
<div class="banner">
{content}
<span id="contentArrow"><span></span></span>
</div>
{{/header}}
</div>
</div>
*If you are customizing a Theme, I might also recommend making a custom page type, specifically with these changes.
If you want to use Sections, for the content area of the page it would look something like:
<div class="main-wrap">
{{#sections}}
<div class="container">{content}</div>
{{/sections}}
</div>
**Note: There isn't any documentation yet, and I have not tested this, so wsite-background may not be needed... but don't take my word for it.

SharePoint 2010 WebPart Personalize Layout

I have a homepage on an intranet. It has 15+ web parts (news, weather, etc.). I want to let users customize the page by moving the web parts around or deleting them. At present I don't let them see the ribbon at the top, so they don't have access to the "Edit Page" button. I have pulled that button's markup out of the ribbon:
<a unselectable="on" href="javascript:;" onclick="return false;" class="ms-cui-ctl-large" aria-describedby="Ribbon.WebPartPage.Edit.Edit.Menu.Actions.Edit_ToolTip" mscui:controltype="Button" role="button" style="height: auto;" id="Ribbon.WebPartPage.Edit.Edit-SelectedItem">
<span unselectable="on" class="ms-cui-ctl-largeIconContainer">
<span unselectable="on" class=" ms-cui-img-32by32 ms-cui-img-cont-float">
<img unselectable="on" alt="" src="/_layouts/1033/images/formatmap32x32.png" style="top: -160px; left: -96px;">
</span>
</span>
<span unselectable="on" class="ms-cui-ctl-largelabel" style="height: auto;">Edit<span unselectable="on">
</span>Page</span>
</a>
Unfortunately it is not working. Any thoughts?
I will be doing the same with the "Stop Editing" button as well.
According to this post: http://artykul8.com/2011/03/useful-sharepoint-shortcuts/
the trick is to use the MSOLayout_ToggleLayoutMode(); javascript function.
According to a comment in that same page, in SP2010, that javascript function is only available when you already have a web part in the page. The behavior I observed differs from this one, given that, to me, this function only worked when the page was in edit mode already (and it worked even when the page had no web part added).
I was able to find the javascript file where that function is defined, which is the ie55up.js. You can load it in your masterpage by doing:
<SharePoint:ScriptLink language="javascript" name="ie55up.js" OnDemand="false" runat="server" />
For information about how to build a web control that uses that javascript function, see here: http://www.codeproject.com/KB/sharepoint/SwitchWPMode.aspx
Hope this helps
I finally found the answer to this question. It was very simple after the JS call was determined. I believe this should be for a webpart only page.
To put the page in edit mode for Personalization via a Javascript call:
ChangeLayoutMode(true); // how simple is that??
To stop the editing of the page via a Javascript call:
MSOLayout_ToggleLayoutMode(); // same as given in the other post
The "Stop" edit mode I believe is the same for all pages no matter if it's a personalized page, all webparts page, etc.
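A small wrapper around the two calls could look like the sketch below. The name setPersonalizeMode is hypothetical, and the existence check reflects the observation above that these functions are only defined once the relevant SharePoint script has loaded.

```javascript
// Enter or leave personalization edit mode using the two page functions
// named above. `win` is the page's window object. Returns false when the
// required function is not defined on the page.
function setPersonalizeMode(win, editing) {
  const name = editing ? 'ChangeLayoutMode' : 'MSOLayout_ToggleLayoutMode';
  if (typeof win[name] !== 'function') {
    return false;
  }
  if (editing) {
    win.ChangeLayoutMode(true);
  } else {
    // This call toggles, so use it only while the page is in edit mode.
    win.MSOLayout_ToggleLayoutMode();
  }
  return true;
}
```

A custom "Edit Page" link could then call `setPersonalizeMode(window, true)`, and a "Stop Editing" link `setPersonalizeMode(window, false)`.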

Privacy prevent page from showing on back button

There will be a computer on display where users will enter their name, phone number, email, and other information. We don't want users going back a page and grabbing other people's emails or other information.
How do I make it so that when someone hits Back, the form no longer shows and they get a "sorry, return to the first page" kind of message? There's a small chance there may be an agreement screen, so hitting Back and submitting another form without seeing that screen may be a problem, but I am not worried about that (or I can put them on the same page).
I know this has been asked before, but I haven't seen it asked for this reason, and the solutions I saw did not work (on Firefox 3.6.10).
A little web searching found this page: Clear Web Forms After Submit
It calls reset() on every form in the document from the <body> tag's onload and onunload events.
Code from the link:
<html>
<head>
<title>A Self-Clearing Form</title>
<script>
function clearForms()
{
var i;
for (i = 0; (i < document.forms.length); i++) {
document.forms[i].reset();
}
}
</script>
</head>
<body onLoad="clearForms()" onUnload="clearForms()">
<h1>A Self-Clearing Form</h1>
This form data will self-destruct when you leave the current web page.
<form method="post" action="page2.php" name="test">
<input name="field1"/> Field One
<p>
<input name="field2" type="radio" value="One"/>One
<input name="field2" type="radio" value="Two"/>Two
<input name="field2" type="radio" value="Three"/>Three
<input name="field2" type="radio" value="Four"/>Four
<p>
<input type="submit" value="Submit Form Data"/>
</form>
</body>
</html>
When a user submits the information, save it and then send a redirect (through the response headers) back to the page where users enter their info.
Alternatively, you could display the form as the result of a POST request, so the browser won't cache it. Then, if another user hits Back, the browser will ask whether to resend the request, and even if they do, you can show them a blank page.
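The save-then-redirect suggestion can be sketched as a Node.js request handler. The /submit and /form paths are made up for illustration, and the Cache-Control: no-store header is an extra assumption on my part rather than something the answers above mention; whether a given browser still restores typed field values on Back would need testing.

```javascript
// Request handler sketch: save on POST, then redirect to a fresh form.
function handleRequest(req, res) {
  if (req.method === 'POST' && req.url === '/submit') {
    // Read and save the submitted fields here, then answer with a redirect
    // so the submitted page never becomes an entry in the history.
    res.writeHead(303, { Location: '/form', 'Cache-Control': 'no-store' });
    res.end();
    return;
  }
  // Serve an empty form and ask the browser not to keep a copy of it.
  res.writeHead(200, {
    'Content-Type': 'text/html',
    'Cache-Control': 'no-store',
  });
  res.end('<form method="post" action="/submit"><input name="email"/></form>');
}

// To try it: require('http').createServer(handleRequest).listen(3000);
```

The 303 status tells the browser to fetch the redirect target with GET, so pressing Back after submitting lands on an empty form instead of the filled-in one.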
