How to show different ARR friendly error pages on timeout depending on downstream server - iis-7.5

I'm trying to work out how we can serve different error pages from within ARR when a timeout occurs (HTTP 502.3).
The scenario: we could host any number of different brands, and we want to supply a different error page for each if its servers are down.
Out of the box, a 502 error gets the "Default Web Site" error pages; what I'm wondering is whether there's a way we can provide that per-brand functionality without adding a new ARR server every time we bring on a new customer.
Any help you could give would be greatly appreciated.
Kind Regards,
J

Our final solution was deceptively simple. ARR doesn't have this functionality and we needed code to apply the logic, so we added a .aspx page with the appropriate host-to-error-page mapping. Effectively a dictionary mapping within the .aspx page, roughly like the sketch below.
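For illustration, a minimal sketch of what that page's code-behind could look like (the class name, host names, and error-page paths below are all hypothetical, not the actual values we used):
// Shared ARR error page code-behind - a sketch, not our production code.
using System;
using System.Collections.Generic;
using System.Web.UI;

public partial class ArrErrorPage : Page
{
    // Host header of the original request -> brand-specific error page.
    private static readonly Dictionary<string, string> HostToErrorPage =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "brand-a.example.com", "~/errors/brand-a.html" },
            { "brand-b.example.com", "~/errors/brand-b.html" }
        };

    protected void Page_Load(object sender, EventArgs e)
    {
        string errorPage;
        if (!HostToErrorPage.TryGetValue(Request.Url.Host, out errorPage))
        {
            errorPage = "~/errors/default.html"; // fallback for unknown brands
        }

        Response.ContentType = "text/html";
        Response.WriteFile(Server.MapPath(errorPage));
    }
}
With a single page like this, onboarding a new customer is just a new dictionary entry (or a row in config) rather than a new ARR server.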

Related

The terms of use URL, /portal/terms_of_use, is not working - I get a "requested resource could not be found" error

I have configured my Liferay 7 site with a terms of use web content document. It works perfectly when new users log in for the first time.
However, I need to be able to provide a link for users to review the terms of use. The only reference I can find is in the Liferay Portal struts-config.xml file: "/portal/terms_of_use". But this path appended to my hostname does not work; I get a "Not Found" error page ("The requested resource could not be found.").
Am I using the wrong URL? I've tried searching for what the URL would be and have not found anything, which is surprising since I would think this is a common requirement.
With the routing of web-stuff to OSGi bundles, there's no longer any top-level URL that can be specified this way (there may never have been one). However, there is a way to get to it:
http://localhost:8080/c/portal/terms_of_use requires login, but results in the (in my case unconfigured) terms of use. The /c within the URL targets Struts (and maybe something else); the /portal within it targets a specific bundle (portal-web in this case).

Kentico: PortalTemplate.aspx explicitly throwing a 404 error when directly invoked

We work on a product that is a series of components that can be installed on different CMSs to provide different services. We take a CMS-agnostic approach and try to use the same code across all the CMSs as much as possible (we avoid using the CMS API as much as we can).
Some parts of the code need to work with the current URL for some redirections, so we use Request.Url.ToString(). That has worked fine in other environments, but in Kentico, instead of the current page we always get a reference to CMSPages/PortalTemplate.aspx with a query string parameter aliasPath that holds the real URL. In addition, requesting the template page directly in a browser gives you a 404 error.
Example:
Real URL (this works fine on a browser):
(1) https://www.customer.com/Membership/Questionnaire?Id=7207f9f9-7354-df11-88d9-005056837252
Request.Url.ToString() (this gives you a 404 error on a browser):
(2) https://www.customer.com/CMSPages/PortalTemplate.aspx?Id=7207f9f9-7354-df11-88d9-005056837252&aliaspath=/Membership/Questionnaire
I've noticed that the 404 error is thrown explicitly by the template code when it is invoked directly. Please see the code below from the Page_Init method of PortalTemplate.aspx.cs:
var resolvedTemplatePage = URLHelper.ResolveUrl(URLHelper.PortalTemplatePage);
if (RequestContext.RawURL.StartsWithCSafe(resolvedTemplatePage, true))
{
    // Deny direct access to this page
    RequestHelper.Respond404();
}
base.OnInit(e);
So, if I comment the above code out, my redirection works fine ((2) resolves to (1)). I know it is not an elegant solution, but since I cannot / don't want to use the Kentico API, it is the only workaround I could find.
Note that I know using the Kentico API would solve the issue, since I'm sure I would find an API method that returns the actual page. I'm trying to avoid that as much as possible.
Questions: Am I breaking something? Is there a better way of achieving what I'm trying to accomplish? Can you think of any good reason I shouldn't do what I'm doing (security, usability, etc.)?
This is a rather broad question, so I was not able to find any useful information in the Kentico docs.
I'm testing all this on Kentico v8.2.50 which is the version one of my customers currently have.
Thanks in advance.
It's not really recommended to edit the source files of Kentico, as you may run into issues with future upgrades and see some unexpected behaviour.
If you want to get the original URL sent to the server before Kentico's routing has done its work, you can use Page.Request.RawUrl. Using your above example, RawUrl would return a value of /Membership/Questionnaire?Id=7207f9f9-7354-df11-88d9-005056837252, whereas Url will return a Uri with a value of https://www.customer.com/CMSPages/PortalTemplate.aspx?Id=7207f9f9-7354-df11-88d9-005056837252&aliaspath=/Membership/Questionnaire (as you stated).
This should avoid needing to use the Kentico API and also avoid having to change a file that pretty much every request goes through when using the portal engine.
If you need to get the full URL to redirect to, you can use something like this:
var redirectUrl = Request.Url.GetLeftPart(UriPartial.Authority) + Request.RawUrl;
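For the example above, that expression yields (1), i.e. https://www.customer.com/Membership/Questionnaire?Id=7207f9f9-7354-df11-88d9-005056837252, which you can then pass to Response.Redirect.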

Azure 502 bad gateway

Has anyone seen this before? I am getting a 502 Bad Gateway error on my app. The issue I have is that the detailed error information says my requested URL is https://SOX:80/api, whereas my site is configured to use https://sox.domain.com, and the site largely works, pulling in the various JS files required.
My App Service name is SOX in the Azure dashboard, so I assume that is where it is picking up SOX from, but I have no idea why it is using this.
So overall the issue had me perplexed... however, with more testing I soon figured out what was going on.
My backend is .NET Core; Azure throwing the 502 Bad Gateway was its way of handling exceptions, and ultimately the problem was code-based.
I am mentioning this purely so that it will help others.
My first issue was based on cert handling: it seems .NET runs in a container that is addressed by your app name, as I mentioned above (https://SOX:80).
The code below was causing my issues:
sslPolicyErrors = X509StoreStoreHelper.ValidateSSLPolicy(cert.Thumbprint, cert);
After commenting this out for testing, my problem went away (we are putting in a proper fix).
My second issue came from using an unsupported view in Azure SQL (master.sys.master_files), which again just threw a 502 Bad Gateway error referencing https://SOX:80.
Please note I have used https://SOX:80 as a reference to mask the real site.
Hope this helps the next person.
Based on your description, I have checked your site (https://sox.azurewebsites.net/) and found that it contains three static files (index.html, generic.html, elements.html). I viewed your website in a Chrome incognito window.
I did not find any requests against https://SOX:80/api in your HTML page or JavaScript files. Please try to access your website in a new incognito window to rule out caching, or just press CTRL + F5 to refresh your current page to narrow down this issue. Moreover, you need to check whether you have configured URL Rewrite. If you still cannot solve this issue, please update your question with the details for us to reproduce it.

Deploying my front end and detecting client location by IP address - which AWS service should handle this? Confused by my options

I'm still new to AWS and just following the documentation and asking questions here when I get stuck. Please excuse me if this question sounds really noobish.
So far, I've deployed the following:
EB to deploy my REST API
RDS to deploy my psql database
Lambda functions to handle things like authentication & sending JWTs, uploading images to S3, etc.
I have got my basic back end deployed - no caching yet (I just started learning about Redis), just the bare bones so far.
I'm still developing my front end and have not even thought about how I will deploy it yet (probably another deployment on EB, since I am using universal React). I am just developing it locally, but using my production env variables now, so I am hitting my deployed API, etc.
One of the MAJOR things I have no idea how to do is detecting incoming requests from the client side to get the client's location by IP address. This is so that I can return the INITIAL results in your general location, just like Yelp, Foursquare, etc. do when you go to their sites.
For now, I am just building a web app for desktop, so I just want to worry about getting the IP address to get the general area of the user. My use case is similar to other sites you might have used which provide an INITIAL result set for things in your area (think Foursquare or Yelp).
Here are my questions:
What would be a good way to do this? I'm thinking of handling this in my front-end universal React deployment, since it will be a Node server with rendered-page caching. Is this a terrible idea? It would work something like this:
(1) request from client comes in
(2) get the IP from the request and look up the IP's location using some service (still not sure what I'm going to use; I have found a few, plus a Node.js library called node-geoip). Preferably I can get the zip code, since I am trying to avoid doing so many queries by unique location in my database; instead I'd return results for that zip code, and the front end would show an initial map with the initial results in that zip code.
(3) return to client the rendered page with those location params if it exists, otherwise create it, send it, and cache it.
Is the above a really dumb idea? Maybe you have already done something like this, and could share your wisdom :)
Is there an AWS service which can already handle something like this for me? Perhaps there's some functionality which can already do this.
Thanks.
AGAIN - I apologize if this is long-winded. I don't know anyone in real life who can help me and I feel alone :(. I appreciate the help you guys can provide.
There are two parts to this:
Getting the user's IP address. You mentioned you're using 'EB' - I presume you mean AWS ELB (Elastic Load Balancer)? If so, then you need to read the X-Forwarded-For HTTP header in your app code, since otherwise what you'll really detect is the ELB's IP address. X-Forwarded-For contains the user's real IP - or rather, the IP of the end connection being made (there's no telling whether this is really a VPN, proxy or something else - but it's as far as you can get with an IP). See the example header after this list.
Querying an IP DB that can turn the address into a location object. There are tons of libraries for you. Assuming you're using Node, you can use node-geoip as you mentioned. Or you can just search 'geoip service' on Google and find managed services, like Telize on Mashape. If you don't want to manage the DB lookup yourself or keep the thing up to date, then a managed service would help.
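For illustration, X-Forwarded-For is a comma-separated list in which the left-most entry is the original client and each proxy or load balancer in the chain appends its own address (the addresses below are made up):
X-Forwarded-For: 203.0.113.45, 10.0.0.2
Here 203.0.113.45 is the client's IP as seen by the first hop, so your app code should take the first entry.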
In either case, it's likely that you'll be doing asynchronous look-ups. In that case, you might want to use async/await to get the user's full location object before injecting it into your React props and ultimately rendering it as an HTML string that's sent down to the client.
You could also use a library like redial to decorate your components with data requirements, and return a Promise you can await on to know when you're okay to render.
Since you probably want to enable client routing too (i.e. where the user can click on a route in their browser, and the server isn't touched at all), you will need some way to retrieve the IP address/results based on that IP even when the server isn't involved in the initial render.
For that, you could write a REST service that retrieves the results. Or write a GraphQL back-end that gets the data. It doesn't matter how you write it, since the server will have access to the X-Forwarded-For header and can use that to retrieve the results and send back location-aware data.
FYI, I'm writing a React starter kit (called ReactNow) that uses rxjs for handling async streams. It's not ready yet, but it might help you figure out the code layout that would offer a balanced mix between rendering on the server, and writing universal code that requires some heavy lifting from the server.

how to check if my website is being accessed using a crawler?

How can I check whether a certain page is being accessed by a crawler, or by a script that fires continuous requests?
I need to make sure that the site is only being accessed from a web browser.
Thanks.
This question is a great place to start:
Detecting 'stealth' web-crawlers
Original post:
It would take a bit of work to engineer a full solution.
I can think of three things to look for right off the bat:
One, the user agent. If the spider is Google or Bing or anything else legitimate, it will identify itself.
Two, if the spider is malicious, it will most likely emulate the headers of a normal browser. Fingerprint it: if it claims to be IE, use JavaScript to check for an ActiveX object.
Three, take note of what it's accessing and how regularly. If the content takes the average human X seconds to view, then you can use that as a starting point when trying to determine whether it's humanly possible to consume the data that fast (see the sketch after this list). This is tricky; you'll most likely have to rely on cookies, since an IP can be shared by multiple users.
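For illustration, a minimal sketch of that timing check (assuming ASP.NET with cookie-backed session state; the class name and the 2-second threshold are made up):
// Flags a session as "probably automated" when pages are requested faster
// than a human could plausibly read them. A sketch only.
using System;
using System.Web;

public static class CrawlerHeuristic
{
    private const double MinSecondsBetweenPages = 2.0; // assumed human floor

    public static bool LooksAutomated(HttpContext context)
    {
        // Key off the session cookie rather than the IP, since an IP can be
        // shared by multiple users. Assumes session state is enabled.
        var last = context.Session["lastRequestUtc"] as DateTime?;
        var now = DateTime.UtcNow;
        context.Session["lastRequestUtc"] = now;
        return last.HasValue && (now - last.Value).TotalSeconds < MinSecondsBetweenPages;
    }
}
You would call LooksAutomated(HttpContext.Current) early in the request and decide whether to serve the page, throttle, or present a challenge.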
You can use the robots.txt file to block access by crawlers, or you can use JavaScript to detect the browser agent and switch based on that. If I understood correctly, the first option is more appropriate, so:
User-agent: *
Disallow: /
Save that as robots.txt at the site root, and no well-behaved automated system should crawl your site (malicious crawlers are free to ignore robots.txt, though).
I had a similar issue in my web application: I created some bulky data in the database for each user that browsed the site, and crawlers were causing loads of useless data to be created. However, I didn't want to deny access to crawlers, because I wanted my site indexed and found; I just wanted to avoid creating useless data and to reduce the time taken to crawl.
I solved the problem the following ways:
First, I used the HttpBrowserCapabilities.Crawler property from the .NET Framework (since 2.0), which indicates whether the browser is a search-engine web crawler. You can access it from anywhere in the code:
ASP.NET C# code behind:
bool isCrawler = HttpContext.Current.Request.Browser.Crawler;
ASP.NET HTML:
Is crawler? = <%=HttpContext.Current.Request.Browser.Crawler %>
ASP.NET Javascript:
<script type="text/javascript">
var isCrawler = <%=HttpContext.Current.Request.Browser.Crawler.ToString().ToLower() %>
</script>
The problem with this approach is that it is not 100% reliable against unidentified or masked crawlers, but maybe it is useful in your case.
After that, I had to find a way to distinguish between automated robots (crawlers, screen scrapers, etc.) and humans, and I realised that the solution required some kind of interactivity, such as clicking on a button. Well, some crawlers do process JavaScript, and it is very obvious they would fire the onclick event of a button element, but not that of a non-interactive element such as a div. The following is the HTML / JavaScript code I used in my web application www.so-much-to-do.com to implement this feature:
<div
class="all rndCorner"
style="cursor:pointer;border:3;border-style:groove;text-align:center;font-size:medium;font-weight:bold"
onclick="$TodoApp.$AddSampleTree()">
Please click here to create your own set of sample tasks to do
</div>
This approach has been working impeccably until now, although crawlers could be changed to be even more clever, maybe after reading this article :D
