Headless browsers and Windows Azure Websites

I'm trying to use a headless browser for crawling, to add SEO features to an open source project I'm developing.
The project sample site is deployed via Azure Websites.
I tried several approaches, such as Selenium .NET (PhantomJSDriver, HTMLUnitDriver, ...) and even the standalone PhantomJS .exe file.
I'm using a headless browser because the site is based on DurandalJS, so it needs to execute scripts and wait for a condition to be true before returning the generated HTML. For this reason, I can't use things like the WebClient/WebResponse classes or HTMLAgilityPack, which work just fine for non-JavaScript sites.
All of the above methods work in my local dev environment, but the problem appears when I upload the site to Azure Websites. With standalone PhantomJS, the site freezes when I access the URL endpoint and after a while returns an HTTP 502 error. With Selenium WebDriver I get:
OpenQA.Selenium.WebDriverException: Unexpected error. System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 127.0.0.1:XXXX
I think the problem is with running .exe files in Azure and not with the code. I know it's possible to run .exe files in Azure Cloud Services via web/worker roles, but I need to stay on Azure Websites to keep things simple.
Is it possible to run a headless browser in Azure Websites? Does anyone have experience with this type of situation?
My code for the standalone PhantomJS solution is:
//ASP MVC ActionResult
public ActionResult GetHTML(string url)
{
    string appRoot = Server.MapPath("~/");
    var startInfo = new ProcessStartInfo
    {
        Arguments = String.Format("{0} {1}", Path.Combine(appRoot, "Scripts\\seo\\renderHTML.js"), url),
        FileName = Path.Combine(appRoot, "bin\\phantomjs.exe"),
        UseShellExecute = false,
        CreateNoWindow = true,
        RedirectStandardOutput = true,
        RedirectStandardError = true,
        RedirectStandardInput = true,
        StandardOutputEncoding = System.Text.Encoding.UTF8
    };
    var p = new Process();
    p.StartInfo = startInfo;
    p.Start();
    string output = p.StandardOutput.ReadToEnd();
    p.WaitForExit();
    ViewData["result"] = output;
    return View();
}
// PhantomJS script
var resourceWait = 300,
maxRenderWait = 10000;
var page = require('webpage').create(),
system = require('system'),
count = 0,
forcedRenderTimeout,
renderTimeout;
page.viewportSize = { width: 1280, height: 1024 };
function doRender() {
console.log(page.content);
phantom.exit();
}
page.onResourceRequested = function (req) {
count += 1;
//console.log('> ' + req.id + ' - ' + req.url);
clearTimeout(renderTimeout);
};
page.onResourceReceived = function (res) {
if (!res.stage || res.stage === 'end') {
count -= 1;
//console.log(res.id + ' ' + res.status + ' - ' + res.url);
if (count === 0) {
renderTimeout = setTimeout(doRender, resourceWait);
}
}
};
page.open(system.args[1], function (status) {
if (status !== "success") {
//console.log('Unable to load url');
phantom.exit();
} else {
forcedRenderTimeout = setTimeout(function () {
//console.log(count);
doRender();
}, maxRenderWait);
}
});
and for the Selenium option
public ActionResult GetHTML(string url)
{
    using (IWebDriver driver = new PhantomJSDriver())
    {
        driver.Navigate().GoToUrl(url);
        WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(30));
        IWebElement myDynamicElement = wait.Until<IWebElement>((d) =>
        {
            return d.FindElement(By.CssSelector("#compositionComplete"));
        });
        var content = driver.PageSource;
        driver.Quit();
        return Content(content);
    }
}
Thanks!!
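The PhantomJS script above decides that a page has settled by counting in-flight requests and treating a count of zero as "ready". As a rough sketch, that bookkeeping can be isolated from PhantomJS so it can be checked on its own. The name createIdleTracker and its methods are hypothetical and are not part of PhantomJS; clamping the counter at zero is an extra safeguard that the original script does not have.

```javascript
// Minimal sketch of the in-flight request counter used by the script above.
// createIdleTracker and its method names are hypothetical (illustration only).
function createIdleTracker() {
  var inFlight = 0;
  return {
    // Intended to be called from page.onResourceRequested.
    requested: function () {
      inFlight += 1;
    },
    // Intended to be called from page.onResourceReceived.
    // Only a finished response (no stage, or stage 'end') is counted.
    received: function (res) {
      if (!res.stage || res.stage === 'end') {
        // Assumption: clamp at zero so a stray 'end' cannot go negative.
        inFlight = Math.max(0, inFlight - 1);
      }
    },
    // True when no requests are outstanding.
    isIdle: function () {
      return inFlight === 0;
    }
  };
}

var tracker = createIdleTracker();
tracker.requested();
tracker.requested();
tracker.received({ stage: 'start' }); // ignored: response not finished yet
tracker.received({ stage: 'end' });
console.log(tracker.isIdle()); // one request is still outstanding here
tracker.received({ stage: 'end' });
console.log(tracker.isIdle());
```

In the real script the render timer would be started whenever isIdle() becomes true and cleared again on the next request.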

You cannot execute .exe files in the shared website environment. You either have to use the web services or set up a proper (Azure) virtual machine.
The free shared website service is really basic and won't cut it when you need more advanced functionality.
See this question and its accepted answer for a more detailed explanation: Can we run windowservice or EXE in Azure website or in Virtual Machine?

I am not sure about the shared and basic website environments, but I can successfully run ffmpeg.exe from the standard website environment. Despite that, PhantomJS and even chromedriver itself still do not work.
However, I am able to run the Firefox driver successfully. To do that, I copied the latest Firefox directory from my local machine to the website, and the code below worked well.
var binary = new FirefoxBinary("/websitefolder/blabla/firefox.exe");
var driver = new FirefoxDriver(binary, new FirefoxProfile());
driver.Navigate().GoToUrl("http://www.google.com");

Related

AWS deployment of Asp.net-Core application using JavascriptServices(AspnetCore.NodeServices)

I need to deploy a web application on AWS Elastic Beanstalk that uses ASP.NET JavaScriptServices to call NodeJS modules. The purpose of the application is to generate JS charts on the server side. I followed the article below:
https://gunnarpeipman.com/aspnet/aspnet-core-node-d3js
So this application uses both ASP.NET Core and NodeJS (node modules 'JSDOM', 'SVG2PNG' and 'D3'). The application works fine on my local machine. However, when deployed on AWS, the other pages work fine, but when I access the page that uses Node services I get an exception (the exception is not displayed).
Below is the code used in the controller to access the Node module in the JavaScript file:
public async Task<IActionResult> About([FromServices] INodeServices nodeServices)
{
    var options = new { width = 900, height = 900 };
    var data = new[] {
        new { label = "Abulia", count = 10 },
        new { label = "Betelgeuse", count = 20 },
        new { label = "Cantaloupe", count = 30 },
        new { label = "Dijkstra", count = 40 }
    };
    ViewData["ChartImage"] = await nodeServices.InvokeAsync<string>("Noded3Chart.js", options, data);
    return View();
}
I need guidance on the questions below:
Can we deploy this type of application, which uses two technologies (ASP.NET and NodeJS), on AWS? If yes, any lead will greatly help.
How do I install the dependent modules such as SVG2PNG, JSDOM, etc. in this type of setup on AWS?
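For context on the calling convention: NodeServices invokes the function exported by the script, passing a Node-style callback first and then the arguments given to InvokeAsync. Below is a minimal sketch of a module with that shape. renderChart is a hypothetical stand-in for Noded3Chart.js that builds a bar-chart SVG string by hand, so it needs no jsdom, d3 or svg2png. Deploying something like it might help tell whether the failure comes from Node.js itself being unavailable on the server or from the npm dependencies not being installed there.

```javascript
// Hypothetical stand-in for Noded3Chart.js (not the article's code).
// Same calling convention: callback first, then the InvokeAsync arguments.
function renderChart(callback, options, data) {
  try {
    var counts = data.map(function (d) { return d.count; });
    // Guard against an all-zero data set to avoid dividing by zero.
    var max = Math.max.apply(null, counts) || 1;
    var barWidth = options.width / data.length;
    var bars = data.map(function (d, i) {
      var h = (d.count / max) * options.height;
      return '<rect x="' + (i * barWidth) + '" y="' + (options.height - h) +
        '" width="' + barWidth + '" height="' + h + '"/>';
    }).join('');
    // First callback argument is the error (null on success), second the result.
    callback(null, '<svg xmlns="http://www.w3.org/2000/svg" width="' +
      options.width + '" height="' + options.height + '">' + bars + '</svg>');
  } catch (err) {
    callback(err);
  }
}

// Export for NodeServices when loaded as a CommonJS module.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = renderChart;
}

// Local smoke test with the same data as the controller above.
renderChart(function (err, svg) {
  console.log(err ? 'failed: ' + err.message : svg);
}, { width: 900, height: 900 }, [
  { label: 'Abulia', count: 10 },
  { label: 'Betelgeuse', count: 20 },
  { label: 'Cantaloupe', count: 30 },
  { label: 'Dijkstra', count: 40 }
]);
```

If a dependency-free module like this runs on the server but the real chart script does not, the missing piece is more likely the node_modules folder than the Node.js runtime.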

How to copy postman history from chrome app to native app?

Google is ending support for Chrome apps, and Postman recently deprecated its Chrome app and introduced a native app.
I am in the process of switching from the Postman Chrome app to the native app.
How do I copy the history from my Chrome app to the native app? Sync doesn't work.
There is an option to export data, but that doesn't export the history.
Any ideas?
While searching for this I came across this post, which is very helpful.
Thanks to stephan for sharing this code.
Follow these steps to copy your history from the Chrome app to the native app.
//In Chrome DevTools on the background page of the Postman extension...
//A handy helper method that lets you save data from the console to a file
(function(console){
console.save = function(data, filename){
if(!data) {
console.error('Console.save: No data')
return;
}
if(!filename) filename = 'console.json'
if(typeof data === "object"){
data = JSON.stringify(data, undefined, 4)
}
var blob = new Blob([data], {type: 'text/json'}),
e = document.createEvent('MouseEvents'),
a = document.createElement('a')
a.download = filename
a.href = window.URL.createObjectURL(blob)
a.dataset.downloadurl = ['text/json', a.download, a.href].join(':')
e.initMouseEvent('click', true, false, window, 0, 0, 0, 0, 0, false, false, false, false, 0, null)
a.dispatchEvent(e)
}
})(console)
//Common error reporting function
function reportError(){
console.error('Oops, something went wrong :-(');
}
//Open the database
var dbReq = indexedDB.open('postman')
dbReq.onerror = reportError;
dbReq.onsuccess = function(){
var db = dbReq.result;
//Query for all the saved requests
var requestReq = db.transaction(["requests"],"readwrite").objectStore('requests').getAll();
requestReq.onerror = reportError;
requestReq.onsuccess = function(){
var requests = requestReq.result;
//Dump them to a file
console.save(JSON.stringify(requests), 'postman-requests-export.json')
console.info('Your existing requests have been exported to a file and downloaded to your computer. You will need to copy the contents of that file for the next part')
};
};
//Switch to standalone app and open the dev console
//Paste the text from the exported file here (overwriting the empty array)
var data = []
//Enter the guid/id of the workspace to import into. Run the script with this value blank if you need some help
// finding this value. Also, be sure you don't end up with extra quotes if you copy/paste the value
var ws = '';
//Common error reporting function
function reportError(){
console.error('Oops, something went wrong :-(');
}
//Open the database
var dbReq = indexedDB.open('postman-app')
dbReq.onerror = reportError;
dbReq.onsuccess = function(){
var db = dbReq.result;
if(!data.length){
console.error('You did not pass in any exported requests so there is nothing for this script to do. Perhaps you forgot to paste your request data?');
return;
}
if(!ws){
var wsReq = db.transaction(["workspace"],"readwrite").objectStore('workspace').getAll();
wsReq.onerror = reportError;
wsReq.onsuccess = function(){
console.error('You did not specify a workspace. Below is a dump of all your workspaces. Grab the guid (ID field) from the workspace you want these requests to show up under and include it at the top of this script');
console.log(wsReq.result);
}
return;
}
data.forEach(function(a){
a.workspace = ws;
db.transaction(["history"],"readwrite").objectStore('history').add(a);
});
console.log('Requests have been imported. Give it a second to finish up and then restart Postman')
}
//Restart Postman
Note:
1. To use DevTools on your Chrome app you will need to enable the following flag in
chrome://flags
2. Then just right-click and inspect on your Chrome Postman app.
3. To use DevTools on your native app, press Ctrl+Shift+I (view->showDevTools).
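The import script above makes two checks before writing anything (exported data is present, a workspace id is set) and then stamps every entry with that workspace. As a sketch, those steps can be pulled into a pure function that can be tried outside Postman before touching the database. prepareHistoryEntries is a hypothetical name, and accepting the exported file's text directly instead of a pasted array is an assumption added for convenience.

```javascript
// Hypothetical helper mirroring the validation and stamping done by the import script.
// It does not touch IndexedDB; it only prepares the entries.
function prepareHistoryEntries(exported, ws) {
  // Assumption: allow the raw text of postman-requests-export.json as input.
  var data = typeof exported === 'string' ? JSON.parse(exported) : exported;
  if (!Array.isArray(data) || data.length === 0) {
    throw new Error('No exported requests were passed in');
  }
  if (!ws) {
    throw new Error('No workspace id was specified');
  }
  // Copy each entry so the input is left unmodified.
  return data.map(function (entry) {
    return Object.assign({}, entry, { workspace: ws });
  });
}

// Made-up sample data and workspace id, for illustration only.
var sampleExport = '[{"id":"r1","url":"https://example.com/a"},{"id":"r2","url":"https://example.com/b"}]';
var entries = prepareHistoryEntries(sampleExport, 'example-workspace-id');
console.log(entries.length, entries[0].workspace);
```

Each returned entry could then be passed to objectStore('history').add(...) in the same way the script above does.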

How to provision Branding files using SharePoint Hosted App in SharePoint Online/Office 365?

I am looking for a SharePoint-hosted app solution that will provision branding files (JS/CSS/images) into a SharePoint Online/Office 365 environment.
I found a very good article on how to achieve this and tried to implement it as shown in the link below: http://www.sharepointnutsandbolts.com/2013/05/sp2013-host-web-apps-provisioning-files.html
This solution is not working for me; when the app executes, I get the error below:
Failed to provision file into host web. Error: Unexpected response data from server.
Here is the code that gives me the error:
// utility method for uploading files to host web..
uploadFileToHostWebViaCSOM = function (serverRelativeUrl, filename, contents) {
    var createInfo = new SP.FileCreationInformation();
    createInfo.set_content(new SP.Base64EncodedByteArray());
    for (var i = 0; i < contents.length; i++) {
        createInfo.get_content().append(contents.charCodeAt(i));
    }
    createInfo.set_overwrite(true);
    createInfo.set_url(filename);
    var files = hostWebContext.get_web().getFolderByServerRelativeUrl(serverRelativeUrl).get_files();
    hostWebContext.load(files);
    files.add(createInfo);
    hostWebContext.executeQueryAsync(onProvisionFileSuccess, onProvisionFileFail);
}
Please suggest what the issue in this code could be, or suggest another way/reference by which I can create a SharePoint-hosted app to provision branding files.
Thanks in advance!
I would use a different method to access host web context as follows:
//first get app context, you will need it.
var currentcontext = SP.ClientContext.get_current();
//then get host web context
var hostUrl = decodeURIComponent(getQueryStringParameter("SPHostUrl"));
var hostcontext = new SP.AppContextSite(currentcontext, hostUrl);
//returns the raw (still URL-encoded) value of a query string parameter
function getQueryStringParameter(param) {
    var params = document.URL.split("?")[1].split("&");
    for (var i = 0; i < params.length; i = i + 1) {
        var singleParam = params[i].split("=");
        if (singleParam[0] == param) {
            return singleParam[1];
        }
    }
}
Here are some references:
https://sharepoint.stackexchange.com/questions/122083/sharepoint-2013-app-create-list-in-host-web
https://blog.appliedis.com/2012/12/19/sharepoint-2013-apps-accessing-data-in-the-host-web-in-a-sharepoint-hosted-app/
http://www.mavention.com/blog/sharePoint-app-reading-data-from-host-web
http://www.sharepointnadeem.com/2013/12/sharepoint-2013-apps-access-data-in.html
Additionally, here is an example of how to deploy a master page. However, as you might notice during your testing, the method used in the video to get the host web context does not work as shown, so you should use the one I described above.
https://www.youtube.com/watch?v=wtQKjsjs55I
Finally, here is an example of how to deploy branding files through a console application using CSOM; it should be possible to convert this into JSOM.
https://channel9.msdn.com/Blogs/Office-365-Dev/Applying-Branding-to-SharePoint-Sites-with-an-App-for-SharePoint-Office-365-Developer-Patterns-and-P
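The getQueryStringParameter helper in this answer splits the URL by hand and throws if there is no query string at all. As an alternative sketch, assuming a browser or runtime that provides the standard URL API, the same lookup can be done as below. The two-argument signature and the sample URL are made up for illustration. Note that searchParams.get already returns the decoded value, so it should not be passed through decodeURIComponent again.

```javascript
// Sketch of a query string lookup using the standard URL API.
// Takes the URL as an argument (an assumption made for testability);
// in the app page, document.URL would be passed in.
function getQueryStringParameter(url, name) {
  // Returns the decoded value, or null when the parameter is absent.
  return new URL(url).searchParams.get(name);
}

// Hypothetical app URL, for illustration only.
var sampleUrl = 'https://app.example.com/Pages/Default.aspx' +
  '?SPHostUrl=https%3A%2F%2Fcontoso.example.com%2Fsites%2Fdev&SPLanguage=en-US';

console.log(getQueryStringParameter(sampleUrl, 'SPHostUrl'));
console.log(getQueryStringParameter(sampleUrl, 'Missing'));
```

The decoded host URL would then be passed to SP.AppContextSite exactly as in the answer's code.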

Windows Azure or IIS slow on initial load

I have a simple personal MVC4 web app that is hosted in Windows Azure.
This web app gets very little use, and the initial call is very slow, especially when I first open it in the morning.
I suspect that IIS is sleeping and needs to wake up. I found this article, which mentions that this is a bug in IIS: http://social.msdn.microsoft.com/Forums/en-US/wcf/thread/8b3258e7-261c-49a0-888c-0b3e68b2af13 The fix requires changing a setting in IIS, but my web app is hosted in Azure. Is there any way to configure this in the Web.config file?
All subsequent calls are fast.
Here is my personal page. javierdelacruz.com
Thanks.
Two options:
Startup Tasks
OnStart Code
For startup tasks, see this link.
For OnStart code, try a function like this (this function does a few more things, too):
private const string _web_app_project_name = "Web";
public static void SetupDefaultEgConfiguration(int idleTimeoutInMinutes = 1440, int recycleTimeoutInMinutes = 1440, string appPoolName = "My Azure App Pool", bool enableCompression = true)
{
if (!RoleEnvironment.IsEmulated)
{
Trace.TraceWarning("Changing IIS settings upon role's OnStart. Inputs: ({0}, {1}, {2}, {3})", idleTimeoutInMinutes, recycleTimeoutInMinutes, appPoolName, enableCompression);
// Tweak IIS Settings
using (var iisManager = new ServerManager())
{
try
{
var roleSite = iisManager.Sites[RoleEnvironment.CurrentRoleInstance.Id + "_" + _web_app_project_name];
if (enableCompression)
{
//================ Enable or disable static/Dynamic compression ===================//
var config = roleSite.GetWebConfiguration();
var urlCompressionSection = config.GetSection("system.webServer/urlCompression");
urlCompressionSection["doStaticCompression"] = true;
urlCompressionSection["doDynamicCompression"] = true;
Trace.TraceWarning("Changing IIS settings to enable static and dynamic compression");
}
//================ To change ApplicationPool name ================================//
var app = roleSite.Applications.First();
app.ApplicationPoolName = appPoolName;
//================ To change ApplicationPool Recycle Timeout ================================//
var appPool = iisManager.ApplicationPools[app.ApplicationPoolName];
appPool.Recycling.PeriodicRestart.Time = new TimeSpan(0, recycleTimeoutInMinutes, 0);
//================ idletimeout ====================================================//
var defaultIdleTimeout = iisManager.ApplicationPoolDefaults.ProcessModel.IdleTimeout;
var newIdleTimeout = new TimeSpan(0, idleTimeoutInMinutes, 0);
if ((int)newIdleTimeout.TotalMinutes != (int)defaultIdleTimeout.TotalMinutes)
{
appPool.ProcessModel.IdleTimeout = newIdleTimeout;
}
// Commit the changes done to server manager.
iisManager.CommitChanges();
}
catch (Exception e)
{
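// Take() returns IEnumerable<char>, so rebuild a string before concatenating.
Trace.TraceError("Failure when configuring IIS in Azure: " + new string(e.ToString().Take(63000).ToArray()));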
}
}
}
}
Source and some more details for the function I included here - there are some dependencies you'll likely need to accomplish this.

wkhtmltopdf fails on Azure Website

I'm using the https://github.com/codaxy/wkhtmltopdf wrapper to create a PDF from a web page on my website (I pass in an absolute URL, e.g. http://mywebsite.azurewebsites.net/PageToRender.aspx). It works fine in dev and on another shared hosting account, but when I deploy to an Azure website it fails and all I get is a ThreadAbortException.
Is it possible to use wkhtmltopdf on Azure, and if so, what am I doing wrong?
UPDATE:
This simple example just using Process.Start also doesn't work. It just hangs when run on Azure but works fine on other servers.
string exePath = System.Web.HttpContext.Current.Server.MapPath("\\App_Data\\PdfGenerator\\wkhtmltopdf.exe");
string htmlPath = System.Web.HttpContext.Current.Server.MapPath("\\App_Data\\PdfGenerator\\Test.html");
string pdfPath = System.Web.HttpContext.Current.Server.MapPath("\\App_Data\\PdfGenerator\\Test.pdf");
StringBuilder error = new StringBuilder();
using (var process = new Process())
{
    using (Stream fs = new FileStream(pdfPath, FileMode.Create))
    {
        process.StartInfo.FileName = exePath;
        process.StartInfo.Arguments = string.Format("{0} -", htmlPath);
        process.StartInfo.RedirectStandardOutput = true;
        process.StartInfo.RedirectStandardError = true;
        process.StartInfo.UseShellExecute = false;
        process.Start();
        while (!process.HasExited)
        {
            process.StandardOutput.BaseStream.CopyTo(fs);
        }
        process.WaitForExit();
    }
}
Check out this SO question regarding a similar issue; the poster seems to have gotten it to work. RotativaPDF is built on top of wkhtmltopdf, hence the connection. I am in the process of trying it myself on our Azure site and will post my results.
Azure and Rotativa PDF print for ASP.NET MVC3

Resources