How to prove that a server is running a certain code

I would greatly appreciate it if someone can clarify.
How can I prove that a certain piece of code is running in a server?
One could create and publish a hash of the code that was deployed on a server, but I cannot find any mechanism for proving that this is the code actually running.
I believe that PaaS providers like Heroku or maybe Amazon tackle that problem, at least partially. When one deploys an application, Heroku returns the git commit hash that references the deployed code. Of course, the code actually running might not be the one disclosed.
Is there a mechanism for returning a hash of the deployment, i.e. some sort of summary of code + metadata, and not necessarily a commit hash?
Thanks!
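A minimal sketch of the "summary with code + metadata" idea from the question, assuming the deployment is a directory of files and the metadata is a small dictionary. The function names and the manifest layout are hypothetical, not any PaaS API. Note that such a digest only describes what was deployed; by itself it proves nothing about what the server is actually running.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def deployment_digest(root: Path, metadata: dict) -> str:
    """Hash every file under root, plus the metadata, into one digest."""
    files = {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    manifest = {"files": files, "metadata": metadata}
    # Canonical JSON (sorted keys, fixed separators) keeps the digest stable
    # across machines and runs.
    encoded = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()
```

For example, `deployment_digest(Path("build"), {"version": "1.0.0"})` could be published next to the release, where `build` and the metadata keys are placeholders.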

It is not possible. Here's a scenario:
Server is running X.
Client asks server for proof of code.
Server provides proof that the code running is X. Client accepts the proof.
A malicious agent redeploys the server with code Y.
Session continues, attacker steals data.
Session ends.
Attacker redeploys the server with code X.
Note that the TCP session on the server side is usually terminated by nginx or some such, which also handles TLS termination, so redeploying doesn't even break the client's TCP or TLS session.
Moreover, the server has a runtime, be it the C standard library, node.js, or JVM - how do you know that that isn't compromised?
Moreover, even if the server code is correct, the server uses OS resources like networking, disk etc. - how do you know that those aren't compromised?
It's hard enough for the server to know it's compromised; it's downright impossible for the client.

In some scenarios, it may be practical to write your code as an R1CS circuit and give a succinct zero-knowledge proof that it was executed correctly. This can be done with libraries such as Circom, ZoKrates, and Noir, although it will require learning a new domain-specific language unless you use Keelung in Haskell.
The tradeoffs will be:
a slower runtime, which may render some use cases impractical
a requirement that all inputs to your program are either a fixed length or are padded to a fixed length.
But it will enable you to prove that you executed certain code correctly without requiring the verifier to trust you.
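To give a feel for what "writing code as an R1CS circuit" means, here is a small sketch that encodes the computation out = x^3 + x + 5 as rank-1 constraints of the form (A·w) * (B·w) = (C·w) and checks a witness against them. This only illustrates the constraint format; it is not a proof system, and the modulus, variable order and function names are assumptions made for the example. The tools named above compile circuits to this kind of system and handle the actual proving and verifying.

```python
P = 2**31 - 1  # small prime modulus, chosen only for illustration

# Witness layout (assumed for this example): [one, x, out, sym1, y, sym2]
A = [[0, 1, 0, 0, 0, 0],   # x
     [0, 0, 0, 1, 0, 0],   # sym1
     [0, 1, 0, 0, 1, 0],   # x + y
     [5, 0, 0, 0, 0, 1]]   # 5 + sym2
B = [[0, 1, 0, 0, 0, 0],   # x
     [0, 1, 0, 0, 0, 0],   # x
     [1, 0, 0, 0, 0, 0],   # 1
     [1, 0, 0, 0, 0, 0]]   # 1
C = [[0, 0, 0, 1, 0, 0],   # sym1 = x * x
     [0, 0, 0, 0, 1, 0],   # y    = sym1 * x
     [0, 0, 0, 0, 0, 1],   # sym2 = (x + y) * 1
     [0, 0, 1, 0, 0, 0]]   # out  = (5 + sym2) * 1


def dot(row, w):
    """Inner product of one constraint row with the witness, mod P."""
    return sum(a * b for a, b in zip(row, w)) % P


def satisfies(w):
    """Check every constraint (A_i . w) * (B_i . w) == (C_i . w) mod P."""
    return all((dot(a, w) * dot(b, w)) % P == dot(c, w)
               for a, b, c in zip(A, B, C))


def witness(x):
    """Run the computation and record every intermediate value."""
    sym1 = x * x % P
    y = sym1 * x % P
    sym2 = (y + x) % P
    out = (sym2 + 5) % P
    return [1, x % P, out, sym1, y, sym2]
```

A witness built by `witness(3)` satisfies all four constraints, while changing any single entry (for instance the claimed output) makes `satisfies` return False. The fixed number of rows and columns is also why inputs must have a fixed length.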

Related

How to "hide" top-secret data that need to be fed to the app

Let's say I have an application that should run on a VPS. The app uses a configuration file that contains very important private keys that no one else should ever have access to. I know VPS providers can easily access my files. So how can I "hide" the sensitive data from malicious actors while still keeping it usable by the app?
I believe encryption will be of no help, since the decryption has to be done on the same machine. Also, I know running my own private server is a no-brainer, but unfortunately that's not an option.

You cannot solve this problem. Whatever workaround you can find, there will be a way for someone with access to repeat the same steps. You can only solve this if you have full control over the server (both hardware and software), otherwise, it's a lost battle.
Some links:
https://cheatsheetseries.owasp.org/cheatsheets/Key_Management_Cheat_Sheet.html
https://owaspsamm.org/model/implementation/secure-deployment/stream-b/
https://security.stackexchange.com/questions/223457/how-to-store-api-keys-when-algo-trading
You can browse Security SE for some direction and ask a more targeted question.
This problem is mitigated by using your own servers, using specialized hardware for key storage, trusting your hosting provider or cloud, and using well-designed security protocols.

But the VPS provider doesn't know how your app will decrypt the keys in the file? Perhaps your app has a decrypt key embedded in it, or maybe it is something even simpler. Without decompiling your app they are no closer to learning the secrets. Of course if your "app" is just a few scripts then they can work it out.
For example if the first key in the file is customerID, they don't know that all the other keys are simply xor'ed against a hash of your customerID - they don't even know the hashing algorithm you used.
OK, that might be too simplistic if you used one of the few well-known hashes, but if there are only a few clients, it can be enough.
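A rough sketch of the XOR scheme described above, assuming SHA-256 as the hash and a simple counter to stretch it to the length of each secret (both are choices made for this example, not something the answer specifies). This is obfuscation rather than encryption: anyone who reads or decompiles the app can repeat the same steps.

```python
import hashlib


def keystream(customer_id: str, length: int) -> bytes:
    """Stretch SHA-256(customer_id || counter) to the requested length."""
    out = b""
    counter = 0
    while len(out) < length:
        block = customer_id.encode("utf-8") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def xor_mask(data: bytes, customer_id: str) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    ks = keystream(customer_id, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))
```

The config file would store `xor_mask(secret, customer_id)` and the app would call `xor_mask` again at startup to recover the secret.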
Obviously, they could be listening to the network traffic your app is sending, but then that should be end-to-end encrypted already, if you are that paranoid.

Can a running nodejs application cryptographically prove it is the same as published source code version?

Can a running nodejs program cryptographically prove that it is the same as a published source code version in a way that could not be tampered with?
Said another way, is there a way to ensure that the commands/code executed by a nodejs program are all and only the commands and code specified in a publicly disclosed repository?
The motivation for this question is the following: In an age of highly sophisticated hackers as well as pressures from government agencies for "backdoors" that allow them to snoop on private transactions and exchanges, can we ensure that an application has neither been hacked nor had a backdoor added?
As an example, consider an open source-based nodejs application like lesspass (lesspass/lesspass on github) which is used to manage passwords and available for use here (https://lesspass.com/#/).
Or an alternative program for a similar purpose encryptr (SpiderOak/Encryptr on github) with its downloadable version (https://spideroak.com/solutions/encryptr).
Is there a way to ensure that the versions available on their sites to download/use/install are running exactly the same code as is presented in the open source code?
Even if we have 100% faith in the integrity of the teams behind applications like these, how can we be sure they have not been coerced by anyone to alter the running/downloadable version of their program, for example to create a backdoor?
Thank you for your help with this important issue.

sadly no.
simple as that.
the long version:
you are dealing with the outputs of a program, and want to ensure that the output is generated by a specific version of one specific program
let's check a few things:
can an attacker predict the outputs of said program?
if we are talking about open source programs, yes, an attacker can predict what you are expecting to see and even can reproduce all underlying crypto checks against the original source code, or against all internal states of said program
imagine running the program inside a virtual machine with full debugging support, like firing events at certain points in the code, directly reading memory to extract cryptographic keys and so on. the attacker does not even have to modify the program to be able to keep copies of everything you do in plaintext
so ... even if you could cryptographically make sure that the code itself was not tampered with, it would be worth nothing: the environment itself could be designed to do something harmful, and as Maarten Bodewes wrote: in the end you need to trust something.
one could argue that TPM could solve this but i'm afraid of the world that leads to: in the end ... you still have to trust something like a manufacturer or worse a public office signing keys for TPMs ... and as we know those would never... you hear? ... never have other intentions than what's good for you ... so basically you wouldn't win anything with a centralized TPM based infrastructure

You can do this cryptographically by having a runtime that checks signatures before running any code. Of course, you'd have to trust that runtime environment as well. Unless you have such an environment you're out of luck - that is, unless you do a full code review.
Furthermore, you can sign the build by placing a signature within the build system. The build system and developer access in turn can be audited. This is usually how secure development environments are built. But in the end you need to trust something.
If you're just afraid that a particular download is corrupted you can test against an official hash published at one or more trusted locations.
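Checking a download against a published hash can be sketched as follows, assuming the project publishes a SHA-256 hex digest (the function names are made up for this example). As noted above, this only detects a corrupted or swapped download; it says nothing about whether the published build matches the public source.

```python
import hashlib
import hmac


def sha256_of_file(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the published one."""
    expected = published_hex.strip().lower()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha256_of_file(path), expected)
```

The published digest should be fetched from more than one trusted location, otherwise an attacker who can replace the download can replace the hash as well.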

Client Server Security Architecture

I would like to get my head around how best to set up a client-server architecture where security is of utmost importance.
So far I have the following. I hope someone can tell me if it's good enough, or if there are other things I need to think about, or if I have the wrong end of the stick and need to rethink things.
Use SSL certificate on the server to ensure the traffic is secure.
Have a firewall set up between the server and client.
Have a separate sql db server.
Have a separate db for my security model data.
Store my passwords in the database using a secure hashing function such as PBKDF2.
Passwords generated using a salt which is stored in a different db to the passwords.
Use cloud based infrastructure such as AWS to ensure that the system is easily scalable.
I would really like to know whether there are any other steps or layers I need to make this secure. Is storing everything in the cloud wise, or should I have some physical servers as well?
I have tried searching for some diagrams which could help me understand but I cannot find any which seem to be appropriate.
Thanks in advance

Hardening your architecture can be a challenging task, and sharding your services across multiple servers and over-engineering your architecture for a semblance of security could prove to be your largest security weakness.
However, a number of questions arise when you come to design your IT infrastructure which can't be answered in a single SO answer (I will try to find some good white papers and append them).
There are a few things I would advise, which are somewhat opinionated and backed up with my own thoughts.
Your Questions
I would really like to know whether there are any other steps or layers I need to make this secure. Is storing everything in the cloud wise, or should I have some physical servers as well?
Settle for the cloud. You do not need to store things on physical servers anymore unless you have current business processes running core business functions that are already working on local physical machines.
Running physical servers increases your system administration requirements for things such as HDD encryption and physical security requirements which can be misconfigured or completely ignored.
Use SSL certificate on the server to ensure the traffic is secure.
This is normally a no-brainer and I would go with a straight "Yes"; however, you must take the context into consideration. If you are running something such as a blog or documentation website that never transfers any sensitive information over HTTP, then why use HTTPS? HTTPS has its own overhead; it's minimal, but it's still there. That said, if in doubt, enable HTTPS.
Have a firewall set up between the server and client.
That is suggested; you may also want to opt for a service such as the Cloudflare WAF, though I haven't personally used it.
Have a separate sql db server.
Yes, however not necessarily for security purposes. Database servers and Web Application servers have different hardware requirements and optimizing both simultaneously is not very feasible. Additionally, having them on separate boxes increases your scalability quite a bit which will be beneficial in the long run.
From a security perspective, it's mostly another illusion: "If I have two boxes and the attacker compromises one [Web Application Server], he won't have access to the Database server".
At first sight, this might seem to be the case, but it rarely is. Compromising the Web Application server is still almost a guaranteed Game Over. I will not go into much detail on this (unless you specifically ask me to); however, it's still a good idea to keep both services separate from each other on their own boxes.
Have a separate db for my security model data.
I'm not sure I understood this. What security model are you referring to exactly? Care to share a diagram or two (maybe an ERD) so we can get a better understanding?
Store my passwords in the database using a secure hashing function such as PBKDF2.
Obviously yes. What I am about to say, however, is controversial and may be flagged by some people (it's a bit of a hot debate): I recommend using BCrypt instead of PBKDF2, because BCrypt is slower to compute (and therefore slower to crack).
See - https://security.stackexchange.com/questions/4781/do-any-security-experts-recommend-bcrypt-for-password-storage
Passwords generated using a salt which is stored in a different db to the passwords.
If you use BCrypt, I don't see why this is required (I may be wrong), since BCrypt stores the salt together with the hash. I go into more detail on username and password hashing in the following StackOverflow answer, which I would recommend reading: Back end password encryption vs hashing
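As an illustration of per-password salts stored next to the hash, here is a minimal sketch using PBKDF2 from Python's standard library (BCrypt needs a third-party package, so it is not shown). The iteration count and the `scheme$iterations$salt$hash` storage format are assumptions for this example, not a recommendation from the answer.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware


def hash_password(password: str) -> str:
    """Derive a hash with a fresh random salt and pack both in one string."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return f"pbkdf2_sha256${ITERATIONS}${salt.hex()}${derived.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, hash_hex = stored.split("$")
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(derived, bytes.fromhex(hash_hex))
```

Because the salt travels with the hash, a single database column is enough; the salt is not a secret, it only makes each stored hash unique.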
Use cloud based infrastructure such as AWS to ensure that the system is easily scalable.
This purely depends on your goals, budget and requirements. I would personally go for AWS, however you should read some more on alternative platforms such as Google Cloud Platform before making your decision.
Last Remarks
All of the things you mentioned are important and it's good that you are even considering them (most people just ignore such questions or go with the most popular answer); however, there are a few additional things I want to point out:
Internal Services - Make sure that no unneeded services and processes are running on the server, especially in production. Such services will often be running old versions of their software (since you won't be administering them), which could be used as an entry point to compromise your server.
Code Securely - This may seem like another no-brainer yet it is still overlooked or not done properly. Investigate what frameworks you are using, how they handle security and whether they are actually secure. As a developer (and not a pen-tester) you should at least use an automated web application scanner (such as Acunetix) to run security tests after each build that is pushed to make sure you haven't introduced any obvious, critical vulnerabilities.
Limit Exposure - Goes somewhat hand-in-hand with my first point. Make sure that services are only exposed to other services that depend on them and nothing else. As a rule of thumb, keep everything entirely closed and open up gradually when strictly required.
My last few points may come off as broad. The intention is to keep a certain philosophy when developing your software and infrastructure rather than a permanent rule to tick on a check-box.
There are probably a few things I have missed out. I will update the answer accordingly over time if need be. :-)

Website with node.js, hosting architecture decision

I am planning to start my first website. The website is a little HTML5+CSS+JS website with a backend running node.js that serves the data stored on mongodb. I would like to know which one is the best solution regarding mostly the security:
Web hosting (SSL and Cloudflare) + VPS serving on port 3000 (with SSL, Cloudflare and node.js with sensitive data: users, passwords and a local mongodb)
Everything in the same VPS.
Any other approach you can give.
The thing is that in the first approach there are two elements in the architecture, so I suppose it's more difficult for someone to hack it. On the other hand, in the second approach, if the VPS is hacked everything is hacked, and they could access the passwords and the mongodb database. I am quite obsessed with security as it is my first website, and I don't know what measures to take to protect my VPS (node.js and mongodb).
Furthermore, I would like to know which solution would be best in terms of efficiency, say for a 10 MB website with 1,000 visits a day.

Regardless of how many actual servers you decide to deploy on, I'd strongly suggest not serving your site directly from node.js. Instead, proxy it through a more robust http server such as Apache or Nginx or even lighttpd. The reason is simple: the http module in node.js was never meant to protect against worms, hacking attempts and various other malware.
I've written web servers from scratch myself and have noticed that in general, you'll get your first hacking attempt within the first hour of putting your server online. You'll get around a dozen or so hacking attempts per day on the slowest days and it goes up from there. These attempts are so common that most server software no longer logs them in access logs and simply blocks them.
From my own personal experience I estimate that around 5% to 10% of my bandwidth is consumed by failed hacking/infection attempts. That is when I'm not being actively attacked.
Security through obscurity is not good security. Especially since node's http module is not very obscure in the first place and someone is bound to find a hackable weakness one of these days.
Apart from security, you also waste fewer CPU cycles ignoring these hacking attempts in Apache or Nginx compared to node.js since you don't need to run any javascript code to handle them.

You can make the choice between the two architectures moot. Both architectures are hackable, and your data will be exposed.
If security is paramount, check out Mylar - it's a platform that protects data confidentiality even when an attacker gets full access to servers. Mylar stores only encrypted data on the server, and decrypts data only in users' browsers.
It runs on top of Meteor, which in turn runs on top of Node.js and uses MongoDB, so if your web app is small, it should be easy to port the code. Meteor also stores passwords using bcrypt, the best password hashing algorithm nowadays.

What security risks are posed by using a local server to provide a browser-based gui for a program?

I am building a relatively simple program to gather and sort data input by the user. I would like to use a local server running through a web browser for two reasons:
HTML forms are a simple and effective means for gathering the input I'll need.
I want to be able to run the program off-line and without having to manage the security risks involved with accessing a remote server.
Edit: To clarify, I mean that the application should be accessible only from the local network and not from the Internet.
As I've been seeking out information on the issue, I've encountered one or two remarks suggesting that local servers have their own security risks, but I'm not clear on the nature or severity of those risks.
(In case it is relevant, I will be using SWI-Prolog for handling the data manipulation. I also plan on using the SWI-Prolog HTTP package for the server, but I am willing to reconsider this choice if it turns out to be a bad idea.)
I have two questions:
What security risks does one need to be aware of when using a local server for this purpose? (Note: In my case, the program will likely deal with some very sensitive information, so I don't have room for any laxity on this issue).
How does one go about mitigating these risks? (Or, where should I look to learn how to address this issue?)
I'm very grateful for any and all help!

There are security risks with any solution. You can use tools proven over years and still be hacked one day (from my own experience), and you can pay a lot for a security solution and never be hacked. So you always need to weigh the effort against the impact.
Basically, you need to protect 4 "doors" in your case:
1. Authorization (password interception or, for example, improper usage of cookies)
2. The HTTP protocol
3. Application input
4. Other ways to access your database (not over HTTP: for example, via an SSH port with a weak password, or by taking your computer or hard disk. In some cases you need to properly encrypt the volume)
1 and 4 are not specific to Prolog, but 4 is the only one with aspects specific to local servers.
Protecting the HTTP protocol level means not allowing requests that can take control of your SWI-Prolog server. For this purpose I recommend installing a reverse proxy like nginx, which can prevent attacks at this level, including some types of DoS. The browser will contact nginx, and nginx will forward the request to your server if it is a valid HTTP request. You can use any other server instead of nginx if it has similar features.
You need to install a proper SSL key and enable SSL (HTTPS) in your reverse proxy server, not in your SWI-Prolog server. The proxy will encrypt all traffic with the browser over HTTPS and communicate with SWI-Prolog over plain HTTP.
Think about authorization. Some methods can be broken very easily. You need to study this topic; there is a lot of information available. I think it is the most important part.
The application input problem: the famous example is "SQL injection". Study examples. All good web frameworks have "entry" procedures to clean all possible injections. Take existing code and rewrite it in Prolog.
Also, test all input fields with very long strings, different charsets, etc.
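The kind of "entry" check meant here can be sketched as a single function that every form field passes through before the application touches it. This is a Python illustration under assumed rules (a length limit of 200 and rejection of control characters); it is not part of the SWI-Prolog HTTP package, and the real rules depend on what each field is used for.

```python
import unicodedata

MAX_LEN = 200  # assumed limit; pick one per field in a real application


def clean_field(value: str, max_len: int = MAX_LEN) -> str:
    """Normalize a form field and reject oversized or control-character input."""
    # Cheap check first, so huge inputs are rejected before any processing.
    if len(value) > max_len:
        raise ValueError("field too long")
    # NFC gives one canonical form for characters that can be written
    # in several ways (e.g. "e" + combining accent vs. a single code point).
    value = unicodedata.normalize("NFC", value)
    if len(value) > max_len:
        raise ValueError("field too long")
    for ch in value:
        # Category "Cc" covers control characters such as NUL or ESC.
        if unicodedata.category(ch) == "Cc" and ch not in "\t\n":
            raise ValueError("control character in field")
    return value.strip()
```

Feeding it the test inputs suggested above (a 10,000-character string, text in different charsets, embedded NUL bytes) shows quickly whether the rules are too strict or too loose.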
As you can see, security is not so easy, but you can choose an appropriate level of effort given the impact of being hacked.
Also, think about the possible attacker. If somebody is particularly interested in getting your information, all of the methods mentioned are worth applying. But that is a rare case. Most often, hackers just scan the internet and try to apply known hacks to every server they find. In this case your best friends are honeypots and Prolog itself, because the probability of a hacker being interested in SWI-Prolog internals is extremely low. (A hacker would need to study the server code well to find a door.)
So I think you will find adequate methods to protect all sensitive data.
But please, never use passwords made of combinations of dictionary words, and never use the same password for more than one purpose; it is the most important rule of security. For the same reason, you shouldn't give your users access to all information, but that protection should be part of the app-level design.
The measures specific to a local server are a good firewall, proper network setup, and encryption of the hard drive partition if your local server could be stolen by a "hacker".
But if you mean the application should be accessible only from your local network and not from the Internet, you need much less effort: mainly you need to check your router/firewall setup and the 4th door in my list.
If you have a very limited number of known users, you can just ask them to use a VPN and not protect your server as you would for "global" access.

I'd point out that my post was about a security issue with using port forwarding in apache to access a prolog server.
And I do know of a successful prolog injection DOS attack on a SWI-Prolog http framework based website. I don't believe the website's author wants the details made public, but the possibility is certainly real.
Obviously this attack vector is only possible if the site evaluates Turing complete code (or code which it can't prove will terminate).
A simple security precaution is to check the Request object and reject requests from anything but localhost.
I'd point out that the pldoc server only responds by default on localhost.
- Anne Ogborn
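The "reject anything but localhost" precaution mentioned above can be sketched like this. It uses Python's standard `http.server` purely as an illustration and is not the SWI-Prolog HTTP API; the handler name, port and response body are made up for the example.

```python
import ipaddress
from http.server import BaseHTTPRequestHandler, HTTPServer


def is_loopback(host: str) -> bool:
    """True only for loopback addresses such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not a valid IP address at all: treat it as untrusted.
        return False


class LocalOnlyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Inspect the peer address of the request and refuse remote clients.
        if not is_loopback(self.client_address[0]):
            self.send_error(403, "Forbidden")
            return
        body = b"hello from localhost\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8080) -> HTTPServer:
    # Binding to 127.0.0.1 already keeps remote hosts out; the check in the
    # handler is a second layer in case the bind address is changed later.
    return HTTPServer(("127.0.0.1", port), LocalOnlyHandler)
```

Calling `make_server().serve_forever()` would start it. Note that if a reverse proxy on the same machine forwards requests, every request appears to come from localhost, so the check has to happen in the proxy instead.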

I think the SWI-Prolog http package is an excellent choice. Jan Wielemaker has put much effort into making it secure and scalable.
I don't think you need to worry about SQL injection; indeed, it would be strange to rely on SQL when you have Prolog's power at your fingertips...
Of course, you need to properly manage the http access in your server...
Just this morning there has been an interesting post in SWI-Prolog mailing list, about this topic: Anne Ogborn shares her experience...
