Website with node.js, hosting architecture decision

Website with node.js, hosting architecture decision - node.js

I am planning to start my first website. The website is a little HTML5+CSS+JS website with a backend running node.js that serves the data stored on mongodb. I would like to know which one is the best solution regarding mostly the security:
Web hosting (SSL and cloudflare) + VPS serving on port 3000 (with SSL, cloudflare and node.js with sensible data;users and pass and a local mongodb)
Everything in the same VPS.
Any other approach you can give.
The thing is that in the first approach there are two elements in the architecture so if someone wants to hack it i suppose it's more difficult. On the other hand in the second approach if the VPS is hacked everything is hacked and they could access to passwords, mongodb database. I am quite obsessed with security as it is my first website and i don't know what meassures to make to protect my VPS (node.js and mongodb).
Furthermore, i would like to know in terms of efficiency which would be best solution imagine for a 10MB website with 1.000 visits a day.

Regardless of how many actual servers you decide to deploy on, I'd strongly suggest not serving your site directly from node.js. Instead, proxy it through a more robust http server such as Apache or Nginx or even lighttpd. For the very simple reason that the http module in node.js was never meant to protect against worms and hacking attempts and various other malware.
I've written web servers from scratch myself and have noticed that in general, you'll get your first hacking attempt within the first hour of putting your server online. You'll get around a dozen or so hacking attempt per day on the slowest days and it goes up from there. These attempts are so common that most server software no longer log them in access logs and simply block them.
From my own personal experience I estimate that around 5% to 10% of my bandwidth is consumed by failed hacking/infection attempts. That is when I'm not being actively attacked.
Security through obscurity is not good security. Especially since node's http module is not very obscure in the first place and someone is bound to find a hackable weakness one of these days.
Apart from security, you also waste fewer CPU cycles ignoring these hacking attempts in Apache or Nginx compared to node.js since you don't need to run any javascript code to handle them.

You can make the choice between the two architectures moot. Both architectures are hackable, and your data will be exposed.
If security is paramount, check out Mylar - it's a platform that protects data confidentiality even when an attacker gets full access to servers. Mylar stores only encrypted data on the server, and decrypts data only in users' browsers.
It runs on top of Meteor, which in turn runs on top of Node.js and uses MongoDB, so if your web app is small, it should be easy to port the code. Meteor also stores passwords using bcrpyt, the best
password hashing algorithm nowadays.

Related

Architecting back-end and Front-end solution (nodeJS)

I have a single back-end running node/express providing API endpoints and 2 static (react) front-ends. The front-ends interact with the users and communicate with the back-end.
I need to use https through-out once enter production stage.
The front-ends will require their own domain names.
I’ve been thinking on the simplest way to have these configured and have come up with Option 1 (see diagram). Node.js API server running on one VPS and as the front-ends are static sites these can be loaded on separate servers (UPDATE- Mean't to say hosting providers) hence get their own domains. As an option, and I’m unsure if its needed, add cloudflare to the front-end to provide a layer of security.
This will allow front-ends to have separate domain names.
As this is a start-up project and doubt a large number of visitors, I’m wondering if the above is over-engineered and un-necessarily complicated.
So I’m considering Option 2 of hosting back-end api app and the two front-ends on the same linux vps. As the front-ends are static, I added the front-ends into the public folder of node.js. This allowed me to access the front-ends as http://serverIP:8080/siteA
As I want to access front-end as http://siteA.com I’m assuming I require a reverse proxy (nginX)
The questions to help me decide between the two options are:
For a start-up operation and given he above scenario which option is best ?
I understand that node.js requires a port number regardless to work, for the API I don’t mind having a port number (as its not applicable for end users i.e. http://10.20.30.40:3000), however the two front-ends require their own domain names (www.siteA.xom, www.siteB.com), therefore will I need to employ a reverse proxy (nginX) regardless if they are static sites or not ?
I’m concerned that someone could attack API end-points (http://10.20.30.40:3000). In this case, is it true with Option 2 is safer than 1 - that I could potentially block malicious direct API calls as all sites are hosted on the same VPS and the API can be easily be secured, this is not exposed to the outside world?
My developer once upon a time told me that option 1 is best as nginX adds un-needed complication, but not sure what he meant – hence my confusion, to be honest I don’t think he wanted to add nginX to the server.
I’m looking at a high-level guidance to get me on the right track. Thank

This is - as you have also doubted - unnecessarily complex, and incorrect in some cases. Here's a better (and widely used across the industry) design. I'm strongly recommending to drop the whole VM approach and go for a shared computing unit, unless you are using that machine for some other computation and utilizing it that way is saving your company a lot of money. I strongly doubt this is the case. Otherwise, you're just creating problems for yourself.
One of the most common mistakes that you can make when using Node.js is to host the static content through the public folder (for serious projects) Don't. Use a CDN instead. You'll get better telemetry depending on the CDN, redundancy, faster delivery, etc. If you aren't expecting high volumes of traffic and performance of delivering that static content isn't outrageously important at the moment, you can even go for a regular hosting server. I've done this with namecheap and GoDaddy before.
Use a direct node-js shared - or dedicated depending on size - hosting for your app and use CI/CD to deploy it. You can use CNAMEs to map whatever domain name you want to have your app on (ex: https://something.com) to map to the domain name of the cloud-hosting provider url for your app. I've used multiple things, Azure, Heroku, Namecheap for the apps and primarily Azure DevOps to manage the CI/CD pipeline, although Jenkins is super popular as well. I'd recommend Heroku - since it provides a super easy setup.
When designing any API on HTTP, you should assume people will call your API directly. See this answer for more details: How to prevent non-browser clients from sending requests to my server I'm not suggesting to put something like CloudFlare, but you may be overthinking it, look into your traffic first. Get it when you need it. As long as you have the right authentication / authorization mechanism in place, security of the API shouldn't be a big problem on these platforms. If you deploy it on one of these platforms, you won't have to deal with ports either. Unless you reach absolutely massive scale, it will definitely be cheaper for you operate with high-reliability this way.
You won't need to deal with nginx anymore.

Why all those new languages have their own web server?

I am kinda old school and the first programming language for web I saw was PHP, and everybody uses it with Apache. At that time, I also knew ASP, which were used along with Microsoft IIS and, later, ASP.NET, that runs over IIS, as well.
The time passed, I went to the ERP world and, when I came back (few months ago), I knew Golang and Node.js and for my surprise they have their own web servers.
I can see many advantages in the builtin web servers, but, every application needs to rewrite their web server rules (I faced that recently when I needed to setup a HTTPS server using Express.js).
After some hard work to understand all the nuances of the HTTP protocol, I asked myself: and if I am doing it in the wrong way? If all the permissive rules that I created in my dev server go to production? Maybe this is an useless concern. But maybe I am creating a fragile server that could be exploited by a naive hacker.
Using a server like Apache it is harder to misuse security rules, because there are settings for development and production environments that are explicit. If the rules are hardcoded (as they are in Node or Go), an unaware developer can use development rules in production and nobody is going to see it before the stuff happens.
Any thoughts?

web server focuses on the speed capacity and the caculating capacity. No matter how good java or php web is or how many old companies put them in use, as long as a new language can provides a faster speed and better capacity such as go, more programmer would go for it.
by the way, to run a web server in go is really such an easy thing.It's faster building and slightly running.And the routine in go helps the web server beter serves milions of client requests,Which old web language can hardly do it.

You can still use nginx or apache in front of your golang gateway for many reasons including tls termination.
But service to service communication might be nice to communicate directly to services and the golang http webserver is fast. It also supports http2 out of the box. Go leverages its "goroutines" to reduce overhead from the os to handle many requests at once.

Node.js and Golang do not have their web server, these are just some lib packages implement http-protocols and open some ports to provide services.
Like Spring web.
Nginx/IIS/Apache are true server, web server just a component of them.
I think Spring should meet the full application scenarios, include /gateway/security/route/package/runtime manage/ and so on.
But when we has some different language platform, then we need nginx/apache/spring gateway/zuul/or others to route them.

Client Server Security Architecture

I would like go get my head around how is best to set up a client server architecture where security is of up most importance.
So far I have the following which I hope someone can tell me if its good enough, or it there are other things I need to think about. Or if I have the wrong end of the stick and need to rethink things.
Use SSL certificate on the server to ensure the traffic is secure.
Have a firewall set up between the server and client.
Have a separate sql db server.
Have a separate db for my security model data.
Store my passwords in the database using a secure hashing function such as PBKDF2.
Passwords generated using a salt which is stored in a different db to the passwords.
Use cloud based infrastructure such as AWS to ensure that the system is easily scalable.
I would really like to know is there any other steps or layers I need to make this secure. Is storing everything in the cloud wise, or should I have some physical servers as well?
I have tried searching for some diagrams which could help me understand but I cannot find any which seem to be appropriate.
Thanks in advance

Hardening your architecture can be a challenging task and sharding your services across multiple servers and over-engineering your architecture for semblance security could prove to be your largest security weakness.
However, a number of questions arise when you come to design your IT infrastructure which can't be answered in a single SO answer (will try to find some good white papers and append them).
There are a few things I would advise which is somewhat opinionated backed up with my own thought around it.
Your Questions
I would really like to know is there any other steps or layers I need to make this secure. Is storing everything in the cloud wise, or should I have some physical servers as well?
Settle for the cloud. You do not need to store things on physical servers anymore unless you have current business processes running core business functions that are already working on local physical machines.
Running physical servers increases your system administration requirements for things such as HDD encryption and physical security requirements which can be misconfigured or completely ignored.
Use SSL certificate on the server to ensure the traffic is secure.
This is normally a no-brainer and I would go with a straight, "Yes"; however you must take into consideration the context. If you are running something such as a blog site or documentation-related website that does not transfer any sensitive information at any point in time through HTTP then why use HTTPS? HTTPS has it's own overhead, it's minimal, but it's still there. That said, if in doubt, enable HTTPS.
Have a firewall set up between the server and client.
That is suggested, you may also want to opt for a service such as CloudFlare WAF, I haven't personally used it though.
Have a separate sql db server.
Yes, however not necessarily for security purposes. Database servers and Web Application servers have different hardware requirements and optimizing both simultaneously is not very feasible. Additionally, having them on separate boxes increases your scalability quite a bit which will be beneficial in the long run.
From a security perspective; it's mostly another illusion of, "If I have two boxes and the attacker compromises one [Web Application Server], he won't have access to the Database server".
At foresight, this might seem to be the case but is rarely so. Compromising the Web Application server is still almost a guaranteed Game Over. I will not go into much detail into this (unless you specifically ask me to) however it's still a good idea to keep both services separate from eachother in their own boxes.
Have a separate db for my security model data.
I'm not sure I understood this, what security model are you referring to exactly? Care to share a diagram or two (maybe an ERD) so we can get a better understanding.
Store my passwords in the database using a secure hashing function such as PBKDF2.
Obvious yes; what I am about to say however is controversial and may be flagged by some people (it's a bit of a hot debate)—I recommend using BCrypt instead of PKBDF2 due to BCrypt being slower to compute (resulting in slower to crack).
See - https://security.stackexchange.com/questions/4781/do-any-security-experts-recommend-bcrypt-for-password-storage
Passwords generated using a salt which is stored in a different db to the passwords.
If you use BCrypt I would not see why this is required (I may be wrong). I go into more detail regarding the whole username and password hashing into more detail in the following StackOverflow answer which I would recommend you to read - Back end password encryption vs hashing
Use cloud based infrastructure such as AWS to ensure that the system is easily scalable.
This purely depends on your goals, budget and requirements. I would personally go for AWS, however you should read some more on alternative platforms such as Google Cloud Platform before making your decision.
Last Remarks
All of the things you mentioned are important and it's good that you are even considering them (most people just ignore such questions or go with the most popular answer) however there are a few additional things I want to point:
Internal Services - Make sure that no unrequired services and processes are running on server especially in productions. These services will normally be running old versions of their software (since you won't be administering them) that could be used as an entrypoint for your server to be compromised.
Code Securely - This may seem like another no-brainer yet it is still overlooked or not done properly. Investigate what frameworks you are using, how they handle security and whether they are actually secure. As a developer (and not a pen-tester) you should at least use an automated web application scanner (such as Acunetix) to run security tests after each build that is pushed to make sure you haven't introduced any obvious, critical vulnerabilities.
Limit Exposure - Goes somewhat hand-in-hand with my first point. Make sure that services are only exposed to other services that depend on them and nothing else. As a rule of thumb, keep everything entirely closed and open up gradually when strictly required.
My last few points may come off as broad. The intention is to keep a certain philosophy when developing your software and infrastructure rather than a permanent rule to tick on a check-box.
There are probably a few things I have missed out. I will update the answer accordingly over time if need be. :-)

How safe is cross domain access?

I am working on a personal project and I have being considering the security of sensitive data. I want to use API for accessing the Backend and I want to keep the Backend in a different server from the one the user will logon to. This then require a cross domain accessing of data.
Considering that a lot of accessing and transaction will be done, I have the following questions to help guide me in the right path by those who have tried and tested cross domain access. I don't want to assume and implement and run into troubles and redesign when I have launched the service thereby losing sleep. I know there is no right way to do many things in programming but there are so many wrong ways.
How safe is it in handling sensitive data (even with https).
Does it have issues handling a lot of users transactions.
Does it have any downside I not mentioned.
These questions are asked because some post I have read this evening discouraged the use of cross-domain access while some encouraged it. I decided to hear from professionals who have actually used it in a bigger scale.
I am actually building a Mobile App, using Laravel as the backend.
Thanks..

How safe is it in handling sensitive data (even with https).
SSL is generally considered safe (it's used everywhere and is considered the standard). However, it's not any less safe by hitting a different server. The data still has to traverse the pipes and reach its destination which has the same risks regardless of the server.
Does it have issues handling a lot of users transactions.
I don't see why it would. A server is a server. Ultimately, your server's ability to handle volume transactions is going to be based on its power, the efficiency of your code, and your application's ability to scale.
Does it have any downside I not mentioned.
Authentication is the only thing that comes to mind. I'm confused by your question as to how they would log into one but access data from another. It seems that would all just be one application. If you want to revise your question, I'll update my answer.

What security risks are posed by using a local server to provide a browser-based gui for a program?

I am building a relatively simple program to gather and sort data input by the user. I would like to use a local server running through a web browser for two reasons:
HTML forms are a simple and effective means for gathering the input I'll need.
I want to be able to run the program off-line and without having to manage the security risks involved with accessing a remote server.
Edit: To clarify, I mean that the application should be accessible only from the local network and not from the Internet.
As I've been seeking out information on the issue, I've encountered one or two remarks suggesting that local servers have their own security risks, but I'm not clear on the nature or severity of those risks.
(In case it is relevant, I will be using SWI-Prolog for handling the data manipulation. I also plan on using the SWI-Prolog HTTP package for the server, but I am willing to reconsider this choice if it turns out to be a bad idea.)
I have two questions:
What security risks does one need to be aware of when using a local server for this purpose? (Note: In my case, the program will likely deal with some very sensitive information, so I don't have room for any laxity on this issue).
How does one go about mitigating these risks? (Or, where I should look to learn how to address this issue?)
I'm very grateful for any and all help!

There are security risks with any solution. You can use tools proven by years and one day be hacked (from my own experience). And you can pay a lot for security solution and never be hacked. So, you need always compare efforts with impact.
Basically, you need protect 4 "doors" in your case:
1. Authorization (password interception or, for example improper, usage of cookies)
2. http protocol
3. Application input
4. Other ways to access your database (not using http, for example, by ssh port with weak password, taking your computer or hard disk etc. In some cases you need properly encrypt the volume)
1 and 4 are not specific for Prolog but 4 is only one which has some specific in a case of local servers.
Protect http protocol level means do not allow requests which can take control over your swi-prolog server. For this purpose I recommend install some reverse-proxy like nginx which can prevent attacks on this level including some type of DoS. So, browser will contact nginx and nginx will redirect request to your server if it is a correct http request. You can use any other server instead of nginx if it has similar features.
You need install proper ssl key and allow ssl (https) in your reverse proxy server. It should be not in your swi-prolog server. Https will encrypt all information and will communicate with swi-prolog by http.
Think about authorization. There are methods which can be broken very easily. You need study this topic, there are lot of information. I think it is most important part.
Application input problem - the famose example is "sql injection". Study examples. All good web frameworks have "entry" procedures to clean all possible injections. Take an existing code and rewrite it with prolog.
Also, test all input fields with very long string, different charsets etc.
You can see, the security is not so easy, but you can select appropriate efforts considering with the impact of hacking.
Also, think about possible attacker. If somebody is very interested particulary to get your information all mentioned methods are good. But it can be a rare case. Most often hackers just scan internet and try apply known hacks to all found servers. In this case your best friend should be Honey-Pots and prolog itself, because the probability of hacker interest to swi-prolog internals is extremely low. (Hacker need to study well the server code to find a door).
So I think you will found adequate methods to protect all sensitive data.
But please, never use passwords with combinations of dictionary words and the same password more then for one purpose, it is the most important rule of security. For the same reason you shouldn't give access for your users to all information, but protection should be on the app level design.
The cases specific to a local server are a good firewall, proper network setup and encription of hard drive partition if your local server can be stolen by "hacker".
But if you mean the application should be accessible only from your local network and not from Internet you need much less efforts, mainly you need check your router/firewall setup and the 4th door in my list.
In a case you have a very limited number of known users you can just propose them to use VPN and not protect your server as in the case of "global" access.

I'd point out that my post was about a security issue with using port forwarding in apache
to access a prolog server.
And I do know of a successful prolog injection DOS attack on a SWI-Prolog http framework based website. I don't believe the website's author wants the details made public, but the possibility is certainly real.
Obviously this attack vector is only possible if the site evaluates Turing complete code (or code which it can't prove will terminate).
A simple security precaution is to check the Request object and reject requests from anything but localhost.
I'd point out that the pldoc server only responds by default on localhost.
- Anne Ogborn

I think SWI_Prolog http package is an excellent choice. Jan Wielemaker put much effort in making it secure and scalable.
I don't think you need to worry about SQL injection, indeed would be strange to rely on SQL when you have Prolog power at your fingers...
Of course, you need to properly manage the http access in your server...
Just this morning there has been an interesting post in SWI-Prolog mailing list, about this topic: Anne Ogborn shares her experience...

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string