What is the best practice for setting security policies such as CSP and HTTP security headers such as HSTS? Should they be configured within my Express.js application, or is it best practice to configure them in nginx? I found documentation on how to implement them, but I am not sure where they should be implemented.
Either can be used. You should put them wherever is most appropriate for you, and that really depends on your setup.
I'm assuming you have an Nginx web server in front of one or more Node.js application servers?
If so, are some pages returned by Nginx (e.g. static pages) and some by Node (e.g. dynamic pages)? Do you have more than one Node server?
It also depends on what you are doing with Node. It's quite common to have Nginx return HTML, CSS, and JavaScript, and then use that JavaScript to make AJAX calls to a Node server that returns JSON data. As CSP is needed on the HTML document and not the JSON, it makes no sense to return CSP headers from Node in this scenario.
Some headers, like HSTS, are set for the whole domain, so to me it makes sense to set them at the Nginx layer so they affect all requests: static pages served by Nginx and dynamic pages served by one or more Node servers. This also means you don't have to remember to set them if you ever add another Node server.
However, if different header values are needed for each service and/or request, it may make sense to set them in Node. For example, if your Node application needs to set CORS headers differently based on the incoming request, it makes no sense to do this in Nginx and try to replicate the logic based on request URL and parameters.
Ultimately, you should set them wherever it makes the most sense based on your application setup: where they are most likely to be set correctly (so they aren't set when they shouldn't be, aren't set to the wrong value, and aren't easy to forget in future), and where they are easiest to manage (e.g. sometimes it's easier to change application code than server config, or vice versa).
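For reference, setting these headers from Express is just a small piece of middleware. Here is a minimal sketch using plain res.setHeader calls (the header values are placeholders you would tune for your own site; helmet is a popular library that wraps the same idea):

// app.js - minimal sketch of setting security headers in Express
// (the header values below are placeholders; adjust them for your site)
const express = require('express');
const app = express();

app.use((req, res, next) => {
  // HSTS applies to the whole domain and only matters over HTTPS
  res.setHeader('Strict-Transport-Security', 'max-age=31536000; includeSubDomains');
  // CSP is only really needed on HTML responses, not JSON APIs
  res.setHeader('Content-Security-Policy', "default-src 'self'");
  next();
});

app.get('/', (req, res) => res.send('<h1>Hello</h1>'));

app.listen(3000);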
Is it possible to use the same Node.js server for two/three different domains (aliases)? (I don't want to redirect my users. I want them to see the exact URL they typed in the address bar. However, all three domains are exactly the same!)
I want my users to be logged in on all three domains at the same time, in order to avoid any confusion.
What is the simplest way to do this and avoid cross-domain issues?
Thanks!
If you mean that all domains will serve the same Node.js app, then yes, you can do that.
But if each domain should serve a different application, then you need a reverse proxy running on the server to handle and manage the sites/vhosts.
You can install nginx and use it as a reverse proxy server, or look at http-proxy, a library for Node.js.
If you would like to manage the vhosts within your app, you can look at the vhost middleware for Node.js and use it.
Choose one of:
Use some other server (like nginx) as a reverse proxy.
Use node-http-proxy as a reverse proxy (see the sketch after this list).
Use the vhost middleware if each domain can be served from the same Connect/Express codebase and node.js instance.
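If you go the node-http-proxy route, a minimal sketch of host-based routing might look like this (the hostnames and local ports are made up for illustration):

// proxy.js - hypothetical host-based reverse proxy using node-http-proxy
const http = require('http');
const httpProxy = require('http-proxy');

const proxy = httpProxy.createProxyServer({});

// map incoming Host headers to local application servers (example values)
const targets = {
  'example.com': 'http://127.0.0.1:3001',
  'mydomain.com': 'http://127.0.0.1:3002',
};

http.createServer((req, res) => {
  const host = (req.headers.host || '').split(':')[0];
  const target = targets[host];
  if (!target) {
    res.writeHead(404);
    return res.end('Unknown host');
  }
  proxy.web(req, res, { target });
}).listen(80);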
This is a very broad question. Moreover, it is generally a pretty bad idea, SEO-wise, to have multiple independent domains that each serve the same content.
Logging in is generally done either through cookies or through extra parameters in the URL. Cookies are always domain-specific, for obvious security reasons. If you want to ensure users are logged in to all the domains at once, you can create an internal, purpose-built domain to handle authentication (it would never show in the URL bar and would effectively only be used for HTTP redirects); that domain would store the login state for all the others, and the others would pick up the login state from it through HTTP redirects.
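To make the redirect idea a bit more concrete, here is a very rough sketch of that flow; the route names, cookie names, and the use of jsonwebtoken as the signed-token scheme are all my own assumptions, not a prescribed design:

// Hypothetical sketch of redirect-based shared login across domains.
const express = require('express');
const cookieParser = require('cookie-parser');
const jwt = require('jsonwebtoken');

const SECRET = 'shared-signing-secret'; // placeholder

// --- central auth domain (only ever reached via HTTP redirects) ---
const auth = express();
auth.use(cookieParser());
auth.get('/sso', (req, res) => {
  const returnTo = req.query.return_to; // e.g. https://domain-a.com/sso/callback
  if (!req.cookies.central_session) {
    return res.redirect('/login?return_to=' + encodeURIComponent(returnTo));
  }
  // Already logged in centrally: hand a short-lived token back to the calling domain.
  const token = jwt.sign({ user: req.cookies.central_session }, SECRET, { expiresIn: '1m' });
  res.redirect(returnTo + '?token=' + encodeURIComponent(token));
});
auth.listen(4000);

// --- each public domain picks the login state up from a callback route ---
const site = express();
site.get('/sso/callback', (req, res) => {
  const { user } = jwt.verify(req.query.token, SECRET); // throws if invalid or expired
  res.cookie('session', user); // set this domain's own cookie
  res.redirect('/');
});
site.listen(3000);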
In general, however, this sounds like too much trouble. Consider that some users may specifically want to use different domains for different accounts; you'll effectively break their usage if you mandate a single login across all of them. And, back to the original point, doing this is pretty bad for SEO, so just don't do it.
Can somebody explain to me the architecture of this website (link to a picture)? I am struggling to understand the different elements in the front-end section as well as the fields on top, which seem to be related to AWS S3 and CDNs. The backend section seems clear enough, although I don't understand the memcache. I also don't get why an nginx proxy is needed in the front-end section, or why it is there.
I am an absolute beginner, so it would be really helpful if somebody could just once talk me through how these things are connected.
Memcache is probably used to cache the results of frequent database queries. It can also be used as a session store so that authenticated users' sessions work consistently across multiple servers, eliminating the need for server affinity (memcache is one of several ways of doing this).
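To illustrate the query-caching role, here is a rough cache-aside sketch using the memcached npm client; the cache key, TTL, and the stubbed queryDatabase function are invented for the example:

// Rough cache-aside sketch with the "memcached" npm client.
const Memcached = require('memcached');
const memcached = new Memcached('localhost:11211');

// stand-in for a real (expensive) database call
function queryDatabase(sql, cb) {
  cb(null, [{ id: 1, title: 'hello world' }]);
}

function getPopularPosts(callback) {
  const key = 'popular_posts'; // hypothetical cache key
  memcached.get(key, (err, cached) => {
    if (!err && cached) {
      return callback(null, cached); // cache hit: skip the database entirely
    }
    // cache miss: query the database, then store the result for 60 seconds
    queryDatabase('SELECT * FROM posts ORDER BY views DESC LIMIT 10', (dbErr, rows) => {
      if (dbErr) return callback(dbErr);
      memcached.set(key, rows, 60, () => callback(null, rows));
    });
  });
}

getPopularPosts((err, posts) => console.log(posts));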
The CDN on the left caches images in its edge locations as they are fetched from S3, which is where they are pushed by the WordPress part of the application. The CDN isn't strictly necessary, but it improves download performance by caching frequently requested objects closer to where the viewers are, and lowers transport costs somewhat.
The nginx proxy is an HTTP router that selectively routes certain path patterns to one group of servers and other paths to other groups of servers. It appears that part of the site is powered by WordPress, part of it by Node.js, and part of it is static React code that the browsers need to fetch, and this is one way of separating the paths behind a single hostname and routing them to different server clusters. Other ways to do this (in AWS) are an Application Load Balancer and CloudFront, either of which can route to a specific server based on the request path, e.g. /assets/* or /css/*.
I am looking for a way to dynamically route requests through a proxy web server. I will explain exactly what I need and what I have found so far.
I would like to have a lightweight web server (I'm thinking of Node.js or nginx) set up as a proxy web server with a public IP. It would route requests to different local web servers based on the URL: not only the hostname, but the full URL.
My idea is that this proxying web server would use either a local in-memory cache, memcached, or Redis to look up key-value mappings from URL to local web server.
I have found these projects:
https://github.com/nodejitsu/node-http-proxy
https://www.steve.org.uk/Software/node-reverse-proxy/
https://github.com/hipache/hipache
They all seem to do similar things, but not exactly what I am looking for, which is:
URL-based proxying (absolute URLs routed to different local web servers)
use of memory-based configuration storage/cache
dynamically changing the configuration via an API, without reloading the proxy web server
Is there a better-suited project, or is there a way to configure one of the three projects above to fit my requirements?
Thank you in advance for your time and effort.
I think this does exactly what you want: https://openresty.org/en/dynamic-routing-based-on-redis.html
It's basically nginx with precompiled modules. You can set up the same thing yourself with nginx + the Lua module + Redis (plus, of course, the necessary Lua rocks). OpenResty just makes it easier.
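If you preferred to stay in Node instead, the same idea (full-URL lookup in Redis, routes changed at runtime without reloading the proxy) could be sketched roughly like this with node-http-proxy and the redis client; the key layout and ports are invented for the example:

// Hypothetical Node alternative: look the backend up in Redis per request,
// so routes can be changed at runtime (e.g. redis-cli SET) without a reload.
const http = require('http');
const httpProxy = require('http-proxy');
const { createClient } = require('redis');

const proxy = httpProxy.createProxyServer({});
const redis = createClient(); // defaults to redis://127.0.0.1:6379

async function main() {
  await redis.connect();

  http.createServer(async (req, res) => {
    // look up a backend by host + first path segment, e.g. "route:example.com/app1"
    const host = (req.headers.host || '').split(':')[0];
    const prefix = req.url.split('/')[1] || '';
    const target = await redis.get(`route:${host}/${prefix}`);

    if (!target) {
      res.writeHead(502);
      return res.end('No backend configured for this URL');
    }
    proxy.web(req, res, { target }); // e.g. "http://127.0.0.1:3001"
  }).listen(8080);
}

main();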
Is there any way for Varnish to read a list of backend URLs from a text file, and then proxy cache misses to a random URL taken from the text file?
What I imagine is something like this pseudocode...
/var/services/backend-urls.conf
http://backend-host-1/path/to/application
http://backend-host-2/path/to/application
http://backend-host-3/path/to/application
# etc
varnish config
sub vcl_miss {
// read a list of urls from a text file
backendHosts = readFile("/var/services/backend-urls.conf");
//choose a random url from the file
randomHost = chooseLineAtRandom(backendHosts);
//proxy the request to the random host
set req.backend = randomHost;
}
To provide some background, I work on a server system that comprises a number of backend applications that currently sit behind a front end running Apache. We are evaluating replacing the Apache layer with Varnish so we can benefit from its caching capabilities. We also have a service discovery framework that knows the endpoint locations for each backend application (the endpoint URLs change periodically as new hosts emerge or are taken out of service).
Currently we use the RewriteMap functionality in mod_rewrite to route requests to the backend services. Then we have a process to maintain the lists of backend services based upon the contents of the service discovery framework.
All this works well for us in Apache, except that Apache is like using a sledgehammer to crack a nut. All we really want is the reverse proxy logic, and the caching in Varnish would be helpful too.
Is there any way to have varnish read the list of backend urls from an external resource?
Without resorting to custom vmod/c modules, the quick answer is no.
The VCL instructions are compiled within Varnish, and that rules out run-time inclusions.
But why not include, within the main VCL, a separate backend VCL that contains the current backends?
That VCL file could be written out on demand. Then, using the varnishadm CLI, you could request a recompile of the VCL, thereby bringing the new config live.
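As a rough sketch of that write-and-recompile flow driven from Node (the file paths and backend names are made up, and it assumes your main VCL file already has an include for backends.vcl; vcl.load and vcl.use are standard varnishadm commands):

// Hypothetical helper: regenerate a backends-only VCL from a list of hosts,
// then load and activate the config via varnishadm (vcl.load / vcl.use).
const fs = require('fs');
const { execFileSync } = require('child_process');

function updateBackends(hosts) {
  const vcl = hosts
    .map((h, i) => `backend app_${i} {\n  .host = "${h}";\n  .port = "80";\n}\n`)
    .join('\n');
  fs.writeFileSync('/etc/varnish/backends.vcl', vcl);

  // give each load a unique name so vcl.use can switch to it atomically
  const label = `config_${Date.now()}`;
  execFileSync('varnishadm', ['vcl.load', label, '/etc/varnish/main.vcl']);
  execFileSync('varnishadm', ['vcl.use', label]);
}

updateBackends(['backend-host-1', 'backend-host-2', 'backend-host-3']);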
I can see two potential solutions.
The first is to have something generate your VCL and backends, such as Chef or some custom scripting. You could then process the text file into backend definitions and the necessary VCL to invoke them. To handle the requirement for a random backend you could use a director; I've not dealt with directors myself, but it looks like they are meant to solve exactly that requirement. When changes to the backends occur, you could rerun the generation script/Chef and tell Varnish to reload its configuration, either using varnishadm or service varnish reload, to avoid a full restart.
The second would be to implement it in C, either via a VMOD as Marcel Dumont suggests or possibly using inline C in your VCL.
With vmod_dynamic you can just use any DNS name as a backend, or even service (SRV) records.
For your use case, one option would be to set up an SRV record in DNS pointing to all your servers and then just use that, as shown for example in the basic-stub.vtc test case.
I'm currently learning Node.js and loving it. I'm noticing, however, that it seems it's really only fit for one site. So it's great for hosting mydomain.com, but what if I want to build an actual full web server with it? In other words, I would like to host mydomain.com, example.com, yourdomain.com and so on. What solutions (modules) are available for this? I was thinking of simply parsing the URL from the request object and reading from the appropriate directory. For example, if I get a request for example.com, then read from the example_com directory, or if I get a request for mydomain.com, read from the mydomain_com directory. The issue here is that I don't know how this will affect performance and scalability.
I've looked into Multi-node but I don't fully follow the idea of processes yet (I'm a node beginner).
Any suggestions are welcome.
You can do this a few different ways. One way is to build it directly into your web application: check which domain the request was made to and then route within your application. But unless your application is very basic, this can make it fairly bloated and messy. A good time to do something like this might be if you're writing a blogging platform where everything is pretty much the same across all your domains; the key difference might be how you query your data so the right data is displayed.
In this case you'd probably use the request's Host header to see which blog is being accessed.
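A stripped-down sketch of that in-app approach, with the per-domain lookup table invented for illustration:

// Minimal sketch: route inside one Node process based on the Host header.
const http = require('http');

const blogs = {
  'mydomain.com': { title: 'My Domain Blog' },
  'example.com': { title: 'Example Blog' },
};

http.createServer((req, res) => {
  const host = (req.headers.host || '').split(':')[0];
  const blog = blogs[host];
  if (!blog) {
    res.writeHead(404);
    return res.end('Unknown domain');
  }
  // in a real blogging platform you'd use `host` in your data queries instead
  res.end(`Welcome to ${blog.title}`);
}).listen(80);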
If you want to just host a few different domains on the same server, all using port 80 (like most websites do), you will want to proxy each request off to a different process. You can do this with nginx or even with Node itself. It all comes down to what best fits your needs. bouncy is a quick way to get set up doing this, as it's a Node.js module and has some pretty impressive benchmarks. nginx (proxying with nginx) is probably the most widely used method, though, as a lot of Node.js servers use nginx to serve static content anyway.
http://blog.noort.be/2011/03/07/node-js-on-nginx.html
https://github.com/substack/bouncy/
You can use Connect's vhost middleware (which is also available in Express) to dispatch requests to separate request handlers based on the Host: header. This assumes that everything is handled by the same Node process on the same port; if you really need separate processes, then the suggestion about using nginx as a reverse proxy is probably the way to go.
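A minimal sketch of that vhost approach (domain names and the per-site apps are placeholders):

// Minimal sketch of Host-based dispatch with the vhost middleware in Express.
const express = require('express');
const vhost = require('vhost');

const siteA = express();
siteA.get('/', (req, res) => res.send('mydomain.com'));

const siteB = express();
siteB.get('/', (req, res) => res.send('example.com'));

const main = express();
main.use(vhost('mydomain.com', siteA)); // dispatch on the Host header
main.use(vhost('example.com', siteB));
main.listen(80); // one process, one port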