Are databases attached to dynos in heroku? - node.js

I want to try out heroku, but am not quite sure if I understand all terms correctly.
I have an app with node.js and redis & my main focus is scaling and speed.
In a traditional environment I would have two servers: both totally independent, sharing the same code, and each with its own redis instance. The servers don't know of each other (the data is synced by a third-party server, but that is not of interest in this case).
I would then put a load balancer in front of them. Now I could easily scale: as the instances are not aware of each other, I could just add more of them if I wish.
Can I mirror that environment in a dyno or can't I attach a redis instance to a dyno?
If something is unclear, please ask, as I'm new to PaaS!
As I understand it, I would have a dyno for my node app and would just add another instance of it. That's cool, but would they share the same redis, or can I make them independent?

You'd better forget traditional architectures and think of it this way:
A dyno is a process serving HTTP requests, the smallest unit of an app instance on Heroku.
For one application instance you can have as many dynos as you want, and it is totally transparent: no need to think about servers, load balancing, etc. Everything is taken care of.
A redis instance is basically a service used by the application instance, and therefore by one or more dynos. Again, servers, load balancing, etc. are all taken care of.
You may want to review the How It Works page on heroku.com again now.

You can have as many dynos for one URL as you want - you just change the value in the control panel. This is actually one of the best features of Heroku: you don't care about servers, you increase the number of dynos and thereby increase the number of requests that can be processed simultaneously.
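For example, scaling out is a single command with the Heroku CLI (the dyno count here is just an illustration):

    heroku ps:scale web=4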
Same thing with redis: you don't add instances, you just switch to a more performant plan, see https://addons.heroku.com/redistogo. Again, forget about servers.

Related

How to implement cache across multiple dynos

Let's say I have a Node/Express app hosted on Heroku. I have implemented horizontal scaling by spreading the server across multiple dynos.
I have a CMS panel to control the content of the app; it alters the DB to add content, and that content is then presented to end users through the server API.
What I want is to add a cache mechanism to the back-end API to make fewer trips to the DB, because I have huge traffic from app users during the day.
Initially this could be done by setting up a simple cache with the node-cache package in each server instance (dyno). But how do I flush the cache through the CMS?
If I send a request to flush the cache, it will only reach a single dyno each time, so the data isn't consistent across all dynos.
How can I trigger a cache flush on all dynos, or is there a better way to handle caching?
Instead of a per-dyno cache, you can use something outside of your dynos entirely. Heroku offers several add-ons for common products. I've never used node-cache, but it is described like this:
A simple caching module that has set, get and delete methods and works a little bit like memcached.
That suggests that Memcached might be a good choice. The Memcached Cloud addon has a free 30MB tier and the MemCachier addon has a free 25MB tier.
In either case, or if you choose to host your cache elsewhere, or even if you choose another tool entirely, you would then connect each of your dynos to the same cache. This has several benefits:
Expiring items would then impact all dynos
Once an item is cached via one dyno it is already in the cache for other dynos
Cache content survives dyno restarts, which happen at least daily, so you'll have fewer misses
Etc.
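For instance, assuming you went with the MemCachier add-on and the memjs client (just one possible combination, not something the question requires), each dyno could talk to the shared cache roughly like this:

    // sketch only: memjs reads MEMCACHIER_SERVERS / _USERNAME / _PASSWORD from the environment
    const memjs = require('memjs');
    const cache = memjs.Client.create();

    // helper names (getContent, loadFromDb) are hypothetical
    async function getContent(id, loadFromDb) {
      const key = 'content:' + id;
      const cached = await cache.get(key);
      if (cached.value) return JSON.parse(cached.value.toString());

      const fresh = await loadFromDb(id);
      await cache.set(key, JSON.stringify(fresh), { expires: 300 }); // cache for 5 minutes
      return fresh;
    }

    // the CMS can invalidate an entry once and every dyno sees it immediately
    function flushContent(id) {
      return cache.delete('content:' + id);
    }

Because all dynos read and write the same cache, the CMS only needs to send one invalidation request, no matter how many dynos are running.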

Node.js project to production with a lot of traffic

I started a project that began to get a lot of web traffic, and at some point I felt unsure about how to scale it for production, on two points:
How to perform updates without leaving my users without service?
How to correctly configure Node.js so that it consumes less memory?
I have microservices working with Hydra-express; I have not been able to implement Hydra-router, so I do the routing with Express.js. I also have NGINX as a proxy gateway.
I am programming in ES6, transpiling with Babel, and keeping the microservices alive with PM2, some in fork mode and the most important ones in cluster mode.
I was thinking about using Docker, but I have not found any tutorial on how to use it with a CDN, or how to upload files and serve them to users.
It's impossible to give a definite answer to question 2, since that depends entirely on what the application does; there's no silver-bullet configuration you can apply.
This leaves the first point, which is around something called zero downtime.
So, in the context of having multiple servers returning content to users (e.g. HTTP servers), it's fair to say that most production environments have something at the front that's not related to the business logic. This could be a load balancer (which comes in many shapes and forms) or a reverse proxy. This is usually where you point your DNS A record. This server should basically never be down.
Now, let's assume you have changed some business logic and want to deploy a new backend. What you normally do is swap out the already running processes behind the load balancer (or reverse proxy), one by one. So if you have five node processes, you stop one, start a new one with the updated code, and repeat until all running processes have been swapped out.
You can also utilize this to swap out just one, run tests on that one, then proceed to swap out the rest.
To really make sure you don't disrupt any users, you should stop accepting new http requests on the old processes, so new http requests are routed to the updated processes. This will allow for http requests that are taking place to finish up. Then you stop the old processes.
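As a minimal sketch of that last step in Node (assuming an Express app; PM2 sends SIGINT by default, other process managers typically send SIGTERM):

    const express = require('express');
    const app = express();
    const server = app.listen(process.env.PORT || 3000);

    function shutdown() {
      // stop accepting new connections; requests already in flight keep running
      server.close(() => {
        console.log('all in-flight requests finished, exiting');
        process.exit(0);
      });
    }

    process.on('SIGTERM', shutdown);
    process.on('SIGINT', shutdown);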
Hope this helps.
Adding to @ralphtheninja's answer, I suggest reading more about blue-green deployments, as described by Martin Fowler: https://martinfowler.com/bliki/BlueGreenDeployment.html
One of the challenges with automating deployment is the cut-over itself, taking software from the final stage of testing to live production. You usually need to do this quickly in order to minimize downtime. The blue-green deployment approach does this by ensuring you have two production environments, as identical as possible. At any time one of them, let's say blue for the example, is live. As you prepare a new release of your software you do your final stage of testing in the green environment. Once the software is working in the green environment, you switch the router so that all incoming requests go to the green environment - the blue one is now idle.
Blue-green deployment also gives you a rapid way to rollback - if anything goes wrong you switch the router back to your blue environment.
I have no idea where your backend is running, but there are services that will do this for you; AWS Elastic Beanstalk, for example, will put your instances behind a load balancer and manage deployments according to a policy. Have a look: https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.rolling-version-deploy.html

Server (failover) redundancy solution for Heroku app

I currently have a node.js app deployed on a free web dyno on Heroku. As I'm planning to take it to production, I need to think about a redundancy and failover solution at a reasonable cost.
When I ran "Production Check" on the Heroku dashboard, it gave me a list of things to do to make the app production-ready. One of them is "Dyno redundancy": I should have at least 2 web dynos running for failover. Does that mean I should upgrade my free dyno to Hobby or Standard 1X, and do I also need two dynos of the same type, e.g. two Hobby dynos or two Standard 1X dynos?
How does Heroku handle failover from one Dyno to another one?
Thanks!
Heroku shares traffic between all available dynos, distributing requests using a random assignment algorithm. So all your dynos will always be serving incoming traffic.
This provides redundancy, not failover. If one dyno is choking on a very slow request, the app will still be available via the other dynos.
Failover is different. In the case of an application failure (say, the database is inaccessible) Heroku's router offers little help. To deal with more industrial workloads, you could use Amazon Route 53's DNS-level failover, which runs a health check against the backend and will reroute the domain name in the case of a Heroku crash.
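The health check itself is just an HTTP endpoint that the checker polls; a rough sketch, where the /health path and the checkDatabase() call are placeholders for whatever dependency you consider critical (this assumes an Express app):

    app.get('/health', async (req, res) => {
      try {
        await checkDatabase(); // hypothetical check of a critical dependency
        res.status(200).send('ok');
      } catch (err) {
        res.status(503).send('unavailable');
      }
    });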
However for many use-cases it is probably enough to simply offer a friendly, customised HTTP 503 error page, which you can configure in Heroku, to keep users happy during an outage.

I'm not sure how to correctly configure my server setup

This is kind of a multi-tiered question in which my end goal is to establish the best way to set up my server, which will be hosting a website as well as a service (using Socket.io) for an iOS (and eventually an Android) app. Both the app service and the website are going to be written in node.js, as I need high concurrency and scaling for the app server, and I figured that while I'm at it I may as well do the website in node too, because (from my understanding) its performance wouldn't be that different from something like Apache.
Also, the website has a lower priority than the app service; the app service should receive significantly higher traffic than the website (but in the long run this may change). Money isn't my greatest priority here, but it is a limiting factor. I feel that having a service with 99.9% uptime (as 100% uptime appears to be virtually impossible in the long run) is more important than saving money at the cost of more downtime.
Firstly, I understand that having one node process per CPU core is the best way to fully utilise a multi-core CPU. After researching, I now understand that running more than one per core is inefficient because the CPU has to context-switch between the multiple processes. Why is it, then, that whenever I see code posted on how to use the built-in cluster module in node.js, the master process creates a number of workers equal to the number of cores? That would mean 9 processes on an 8-core machine (1 master process and 8 worker processes). Is this because the master process usually just restarts worker processes if they crash or end, and therefore does so little that it doesn't matter that it shares a CPU core with another node process?
If this is the case, then I am planning to have the workers provide the app service and have the master process manage the workers, but also host a webpage which would provide statistical information on the server's state and all other relevant information (like number of clients connected, worker restart count, error logs etc). Is this a bad idea? Would it be better to have this webpage running on a separate worker and just leave the master process to handle the workers?
So overall I wanted to have the following elements: a service to handle the requests from the app (my main point of traffic); a website (fairly simple, a couple of pages and a registration form); an SQL database to store user information; a webpage (probably locally hosted on the server machine) which only I can access and that shows information about the server (users connected, worker restarts, server logs, other useful information etc); and apparently nginx would be a good idea where I'm handling multiple node processes accepting connections from the app.
After doing research I've also found that it would probably be best to host on a VPS initially. I was thinking at first, while the amount of traffic the app service receives will most likely be fairly low, that I could run all of those elements on one VPS. Or would it be best to have them running on separate VPSs, except for the website and the server-status webpage, which I could run on the same one? I guess this way if there is a hardware failure and something goes down, not everything does, and I could run 2 instances of the app service on 2 different VPSs so if one goes down the other is still functioning. Would this just be overkill? I doubt I would need multiple app service instances to support the traffic load for a while, but it would help reduce the apparent downtime for users.
Maybe this all depends on what I value more and have the time for: a more complex server setup that costs more and may be a little unnecessary but guarantees a consistent and reliable service, or a cheaper and simpler setup that may succumb to downtime due to coding errors and server hardware issues.
Also, it's worth noting I've never had any real experience with production-level servers, so in some ways I've jumped in at the deep end with this. I feel like I've come a long way in the past half a year and that I'm getting a fairly good grasp of what I need to do. I could just do with some advice from someone with experience who has an idea of what roadblocks I may come across along the way and whether I'm causing myself unnecessary problems with this kind of setup.
Any advice is greatly appreciated, thanks for taking the time to read my question.

How to make a distributed node.js application?

Creating a node.js application is simple enough.
var app = require('express')();
app.get('/', function (req, res) {
  res.send("Hello world!");
});
app.listen(3000); // needed for the server to actually accept requests
But suppose people became obsessed with your Hello World! application and exhausted your resources. How could this example be scaled up in practice? I don't understand it, because yes, you could run several node.js instances on different computers - but when someone accesses http://your_site.com/ it goes directly to one specific machine, one specific port, one specific node process. So how?
There are many many ways to deal with this, but it boils down to 2 things:
being able to use more cores per server
being able to scale across more than one server.
node-cluster
For the first option, you can use node-cluster or the same solution as for the second option. node-cluster (http://nodejs.org/api/cluster.html) is essentially a built-in way to fork the node process into one master and multiple workers. Typically, you'd want 1 master and n-1 to n workers (n being your number of available cores).
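A standard sketch of that pattern (the port is arbitrary):

    const cluster = require('cluster');
    const http = require('http');
    const numCPUs = require('os').cpus().length;

    if (cluster.isMaster) {
      // the master only forks and supervises workers
      for (let i = 0; i < numCPUs; i++) cluster.fork();

      cluster.on('exit', (worker) => {
        console.log('worker ' + worker.process.pid + ' died, starting a new one');
        cluster.fork();
      });
    } else {
      // each worker shares the same listening port
      http.createServer((req, res) => res.end('Hello world!')).listen(3000);
    }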
load balancers
The second option is to use a load balancer that distributes the requests amongst multiple workers (on the same server, or across servers).
Here you have multiple options as well. Here are a few:
a node based option: Load balancing with node.js using http-proxy
nginx: Node.js + Nginx - What now? (using more than one upstream server)
apache: (no clearly helpful link I could use, but a valid option)
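As a rough sketch of the node-based http-proxy option above (the ports and the backend list are arbitrary assumptions), a round-robin proxy could look like this:

    const http = require('http');
    const httpProxy = require('http-proxy');

    const targets = ['http://127.0.0.1:3001', 'http://127.0.0.1:3002'];
    const proxy = httpProxy.createProxyServer({});
    let next = 0;

    http.createServer((req, res) => {
      // round-robin: each request goes to the next backend in the list
      proxy.web(req, res, { target: targets[next++ % targets.length] });
    }).listen(3000);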
One more thing: once you start having multiple processes serving requests, you can no longer use process memory to store state; you need an additional service to store shared state. Redis (http://redis.io) is a popular choice, but by no means the only one.
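For example, a counter kept in Redis instead of process memory stays consistent no matter which process handles the request (this sketch assumes the node-redis v4 client and a REDIS_URL environment variable):

    const { createClient } = require('redis');

    async function main() {
      const client = createClient({ url: process.env.REDIS_URL });
      await client.connect();

      // every process increments the same key, so the value is shared state
      const visits = await client.incr('visits');
      console.log('total visits across all processes:', visits);

      await client.quit();
    }

    main().catch(console.error);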
If you use services such as Cloud Foundry, Heroku, and others, they set this up for you, so you only have to worry about your app's logic (and using a service to deal with shared state).
I've been working with node for quite some time, but recently got the opportunity to try scaling my node apps. I have been researching this topic for some time now and have come across the following prerequisites for scaling:
My app needs to be available on distributed systems, each running multiple instances of node.
Each system should have a load balancer that helps distribute traffic across the node instances.
There should be a master load balancer that distributes traffic across the node instances on the distributed systems.
The master balancer should always be running OR should have a dependable restart mechanism to keep the app stable.
For the above requisites I've come across the following:
Use modules like cluster to start multiple instances of node in a system.
Use nginx always. It's one of the simplest mechanisms for creating a load balancer I've come across so far.
Use HAProxy to act as a master load balancer. A few pointers on how to use it and keep it forever running.
Useful resources:
Horizontal scaling node.js and websockets.
Using cluster to take advantage of multiple cores.
I'll keep updating this answer as I progress.
The basic way to use multiple machines is to put them behind a load balancer and point all your traffic at the load balancer. That way, when someone goes to http://my_domain.com, they hit the load balancer machine. The sole purpose of the load balancer (for this example anyway; in theory more could be done) is to delegate traffic to a given machine running your application. This means you can have any number of machines running your application, while an external machine (in this case a browser) can go to the load balancer address and reach one of them. The client doesn't know (and doesn't have to know) which machine is actually handling its request. If you are using AWS, it's pretty easy to set up and manage this. Note that Pascal's answer has more detail about your options here.
With Node specifically, you may want to look at the Node Cluster module. I don't have a lot of experience with this module, but it should allow you to spawn multiple processes of your application on one machine, all sharing the same port. Also note that it's still experimental, and I'm not sure how reliable it will be.
I'd recommend taking a look at http://senecajs.org, a microservices toolkit for Node.js. It is a good starting point for beginners and for starting to think in "services" instead of monolithic applications.
Having said that, building distributed applications is hard, takes time to learn, and takes a LOT of time to master; you will usually face a lot of trade-offs between performance, reliability, maintenance, etc.
