More and more server-side file deployments are handled using git. It's nice and there are plenty of guides available how to setup your deployment workflow with git, rsync and others.
However, I'd like to ask what's the cleanest way to set deployment rollbacks, so that
Every time you deploy, you record the latest state before the deployment (no need to manually read through logs to find the commit)
What git commands use to rollback to the prior (recorded) state in the case deployment has unforeseen consequences
The scope of the question is Linux servers, shell scripting and command line git.
Note that there is no general solution to this problem. I would propose two solutions.
First one requires usage of Fabric and some deep thinking how to handle whole deployment process. For a Django site I maintain, I wrote a fabric script that deploys staging on every git commmit. Deploying from staging to production is then a simple fabric command that copies all the files to a new folder (increments a version by 1), for example from production/v55/ to production/v56/ (ok, it also does backups and runs migrations). If anything goes wrong, rollback command restores backups, and starts production environment from folder production/v55. Less talk, more code: https://github.com/kiberpipa/Intranet/blob/master/fabfile.py
The second option requires more reading and has a bigger learning curve, but also provides cleaner solution. As Lenin suggested to use framework with declarative configuration, I would propose to go a step further and learn a Linux distribution with declarative configuration - http://nixos.org/. NixOS has built in capabilities for distributed software deployment (including rollbacks) and also tools to deploy stuff from your machine https://github.com/NixOS/nixops. See also thesis on Distributed Software Deployment, which covers also your questions (being part of a much bigger problem): http://www.st.ewi.tudelft.nl/~sander/index.php/phdthesis
Please have a look at Capistrano and Chef which require ruby/ror support. But are great deployment tools. Python's fabric is also an awesome tool.
Related
For about 18 months now I've been working in Node; and for the last 6 months I've been slowly migrating my existing WordPress websites to NextJS.
To date, I've been deploying to production manually. I log into my production server, checkout the latest release from GitHub, build, and do a pm2 restart.
Even though the above workflow seems to be the most commonly documented around the internet, it's always felt a little wrong to me.
Recently, I found myself in a situation where I needed to customise some 3rd party code. So, my main code now has a line in package.json that says
{
...
"dependencies": {
...
"react-share": "file:../react-share/react-share-4.4.1.tgz",
...
},
...
}
which implies that I'm going to checkout my custom react-share, build it somewhere on the production server, change this line to point to wherever I put it, and then rebuild.
Also, I'm using Prisma, which means that every time I deploy, before I do a build, I need to do an npx prisma generate to create the client.
This now all seems really, really wrong.
I don't know how a "simple" CI/CD environment might look, but whatever it looks like, it feels like overkill. It's just me doing development, and my production environment is a single EC2 server sitting behind AWS CloudFront.
It seems to me that I should be doing something more/different than what I'm currently doing, in service to someday moving to a CI/CD model, if/when I have a whole team working on this, or sufficient users that I have multiple load-balanced servers and need production to be continually up.
In particular, it feels like I shouldn't be building on the production server.
Are there any intermediary step(s) I can/should be taking for faster/less-error-prone/less-down-time deployment to a single EC2 instance for Next/Node apps, between manually deploying as I am currently, and some sort of CI/CD setup? Or are my only choices to do what I'm doing now, or go research how to do CI/CD?
You're approaching towards your initial stages of what technically is called DevOps, if not already as it appears from your context. What you're asking is a broad topic, which is an understatement, and explaining each and everything here will almost be like writing an article about it, at the very least.
However, I'll brief you overall on how to approach with this.
I don't know how a "simple" CI/CD environment might look, but whatever it looks like, it feels like overkill.
Simplicity & complexity are relative terms. A system which is complicated for one might be simple for another. CI/CD doesn't define any laws that you need to follow in order to create a perfect deployment procedure, as everyone's deployment requirement is unique (at some point).
If I mention it in bullet points, what you need to figure out before you start with setting up CI&CD, is -
The sequence of steps your deployment procedure needs in order to deploy your latest version. As you have stated already that you've been doing deployment manually, that means you already know your steps. All you need to do is to fine-tune each step so that it shouldn't require manual intervention while being executed automatically by the CI program.
Choose a CI program, like Travis CI, Circle CI, or if you're using GitHub, it has it's own GitHub Actions for the purpose, you can read their documentation for more details. Your CI program will be responsible for executing your deployment steps which you'll mention to it in whichever format it understands (mostly .yml).
The CI program will execute your steps on behalf of you based on the condition which you'll provide, (like when code is pushed on prod branch). It will execute the commands on a machine (like your EC2), specifically, GitHub actions runner will be responsible for running your commands on your machine, the runner should be setup beforehand in the instance you intend to deploy your code on. More details on runners can be found in relevant documentations.
Since the runner will actually execute the commands on your machine, make sure that all required commands and parameters, including the concerned files & directories are accessible to the runner program, from permissions point of view at least. For example, running your npx prisma generate command should require that npx command is available and executable in the system, and the concerned folders in which the command will CRUD files is accessible by the runner program. Similarly for all other commands.
Get your hands on bash scripting as well.
If your steps contain dynamic info, like the one you mentioned that in your package.json an npm script needs to be updated, then a custom bash script created to update the same automatically will help, for instance. There will be however, several other ways depending on the specific nature of the dynamic changes.
The above points are huge (by huge, I mean astronomically huge) oversimplification of the ways through which CI&CD pipelines are setup. But I hope you get the idea of it at least.
In particular, it feels like I shouldn't be building on the production server.
Your feeling is legitimate. You should replicate your production environment (including deployment procedures) into a separate development environment as close as possible, in order to have all your experiments, development and testing done separately from production environment, and after successful evaluation on the development environment, deploy on production one. Steps like building will most likely be done on both environments, as it is something your program needs to run, irrespective of the environment it is running in. Your future team will appreciate this separation of environments.
if/when I have a whole team working on this, or sufficient users that I have multiple load-balanced servers and need production to be continually up.
Again, this small statement in itself is a proper domain of IT department, known as System Design, in which, to put it simply, you or your team will create an architecture for your whole system which will support your business requirements and scaling as your audience increases, which is something a simple Stackoverflow QnA won't suffice to explain.
Therefore,
or go research how to do CI/CD?
is what I'd recommend and you should also feel is the right way ahead, after reading everything above.
Useful references to begin with (not endorsing any resources, you can search for relevant/better resources too)
GitHub Actions self-hosted runners
System Design - Getting started
Bash scripting
Development, Staging, Production
Can you suggest me a way of separating learning/trying out vs production in the same computer? I am in such a place that I know a lot of JS and production ready skills whilst sometimes require probing or trying out simpler stuff or basics. I presume that a lot of engineers are also in a similar place.
This is the situation I am facing with right now.
I wanted to install redis and configure it while trying out something interested.
In a separate project I needed another clean redis configuration and installation.
In front-end side I tried and installed a few npm packages globally.
At some point I installed python 3.4 now require 3.6
At some point I installed nginx and configured it, now need another configuration and wipe the previous one out,
If I start a big project right now I feel like my computer will eventually let me down due to several attempts I previously done
et cetera, these all create friction on both my learning and exploration
Now, it crosses mind to use separate virtual box installations for trying out things, but this answer is trivial, please suggest something else.
P.S.: I am using Linux Mint.
You can install and use Docker, which is also trivial,
however, if your environment is Linux you can use LXC
There isn't really a single good answer to this sort of question of course; but some things that are generally a good idea are:
use git repos to keep the source "backed up" (obviously your local pc should not be the git server); commit your changes all the time, if you can't hold your breath for as long as the timespan between 2 commits, then you're doing it wrong (or you may have asthma, see a doctor).
Always build your project with there being not just multiple, but a variable amount of "deployments" in mind. That means not hardcoding absolute paths and database names/ports/hostnames and things like that. If your project needs database/api credentials then that should be in a configfile of sorts (or in the env); that configfile should be stored outside the codebase and shouldn't be checked into your git repos (though there can ofcourse be a config template in there).
Always have at least 2 deployments of any project actually deployed. Next to the (obvious) "live"/"production" deployment, which your clients/users use, you want a "dev"-version for yourself where you can freely shit the bed, and for bigger projects you may well want multiple. Each deployment would have its own database, and it's own copy of the code/assets.
It can be useful to deploy everything inside podman or docker containers, that makes it easier to have a near-identical system in both development and production (incase those are different servers), but that may be too much overhead for you.
Have a method (maybe a script) that makes it very easy to deploy updates from your gitrepo or dev-deployment, to the production deployment. Based on your description, i'm guessing if a client tells you she wants some minor cosmetic changes done, you do them straight on the live version; very convenient and fast, but a horrible thing in practice. once you switch from that workflow to having a seperate dev-deploy, you'll feel slowed down by that (which you are), but if you optimize that workflow over time you'll get to the point where you could still deploy cosmetic changes in a minute orso, while having fully separated deployments, it is worth the time investment.
Have a personal devtools git repo or something similar. You're likely using an IDE such as VS code ? Back up your vs code user config in that repo, update it reasonably frequently. Use a texteditor, photoshop/editor, etc etc, same deal. You hear that ticking sound ? that's the bomb that's been placed on your motherboard. It might go off tonight, it might not go off for years, but you never know, always expect it could be today or tomorrow, so have stuff backed up externally and/or on offline media.
There's a lot more but those are some of the basics that spring to mind.
I though Docker was only for containerizing your app with all the installation files and configurations before pushing to the production
Docker is useful whenever you need to configure the runtime environment in an isolated manner. Production, local development, other environments - all need the same runtime. All benefit from the runtime definition and isolation that docker provides. Arguably docker is even more useful in workstation-centric development, than it is in production.
I wanted to install redis and configure it while trying out something interested.
Instead of installing redis on your os directly, run the preexisting docker image for redis.
In a separate project I needed another clean redis configuration and installation.
Instantiate the docker image again and now you have 2 isolated redis servers running locally.
In front-end side I tried and installed a few npm packages globally.
Run your npm code within a nodejs docker container
At some point I installed python 3.4 now require 3.6
Different versions of python is a great use case for docker containers, which will tagged with specific python versions.
At some point I installed nginx and configured it, now need another configuration and wipe the previous one out,
Nginx also has a very useful official container.
If I start a big project right now I feel like my computer will eventually let me down due to several attempts I previously done
Yeah, it gets messy quick. That's why docker is such a great solution. Give every project dedicated services and use docker-compose to simplify the networking and building components. Fight the temptation to use a docker container for more than one service - instead stitch them together with docker networks.
Read https://docs.docker.com/get-started/overview/ to get started with docker.
We are using chef to deploy all of our stacks.
I need to build a runbook for each environment we deploy.
I have been parsing the environment, node and recipe files but the more information I need to extract, the more complex it becomes because I am converging the attributes in my application.
I would like to use the converged-attributes.json file produced by our chef deployment without deploying any code because we can't deploy production to the build runbooks.
We also plan to build the runbook before the environment exists to provide configuration information to the DevOps team (e.g. memory requirements, ports, etc.).
Is there a way to use any of the chef/knife components or libraries to do the following?
Converge the attributes for each node
Write the converged attributes to a location my application can access on Mac OSX.
Quit before attempting to access any servers
This is not possible in the generic case. Chef is executable code at heart and the only way to fully compute the side effects is to actually execute it. This is what chef-client does, you can't "converge" the node externally so step 3 doesn't really make any sense. You could try to use Why Run mode but we really don't recommend it and are probably going to remove the feature as it does more harm than good most of the time. Roles and environments are static data so you can parse and manipulate those, but cookbooks are code and have to be run in-place to know exactly what they will do.
I want to know how developer continously develop a web application on cloud.
which software or which development environment they ae using?
Is Docker correct answer?
Thank you
This is an extremely open ended question, so I will give you a relatively open ended answer. CI/CD isn't really a defined process, but typically people follow the same strategy.
CI:
Develop and store code in Git or some repository
Execute unit test cases
Build Source Code
At this point, you have code that is being continuously tested and built. Now continuous delivery (CD) kicks in. This differs from company to company, but it may follow the below
CD:
Deploy Source Code to development integration testing server (DIT)
Execute Automated test portfolio
Deploy Source Code to Stage or Pre Prod environment
Execute Automated test portfolio
Now at this point in time you have your code fully tested and deployed to internal testing/stage servers. As a company, you can decide whether your confidence level is high enough to implement continuous deployment or if you implement a change mgmt process. continuous deployment is similar to continuous delivery EXCEPT you deploy the built application/service to production automatically with no gates in place. Then, you will run your test portfolio again against prod. Do not do performance testing in prod (do this testing in stage typically)
Product typically used for CI = Jenkins (open source, great community support)
Product(s) typically used for CD = Puppet, Chef, Ansible, uDeploy
Disclaimer - please do not get into a conversation about which products are best used for which stage...I only know what I know; and I know there are other tools to do CI/CD that I havent mentioned.
Our company uses ANT to automate build scripts.
Now somebody raised the question how to secure such build scripts agains (accidental or intended) threats?
Example 1: someone checks in a build script that deletes everything under Windows drive T:\ because that is where the Apache deployment directory is mounted for a particular development machine. Months later, someone else might run the build script and erase everything on T:\ which is a shared drive on this machine.
Example 2: an intruder modifies the default build target in a single project to scan the entire local hard disk. The Continuous Integration machine (e.g. Jenkins) is configured to execute the default build target and will therefore send its entire local directory structure to the intruder, even for projects that the intruder should not have access to.
Any suggestions how to prevent such scenarios (besides "development policies" or "do not mount shared drives")?
My only idea is to use chroot enviroments for builds?!
The issues you describe are the same for any code that you execute on the build machine - you could do the same thing using a unit test.
In this case the best solution may be to place your build scripts under source control and have a code review prior to check in.
At my company, the build scripts (usually a build folder) are an svn:external to another subversion repository that is only controlled by build/release engineers. Developers can control variables such as servers it can deploy to, but not what those functions do. This same code is reused amongst multiple projects in flight, and only a few devops folks can alter it, not the entire development staff.
Addition: When accessing shared resources, we use a system account that has only read access to those resources. Further: jenkins,development projects and build/deploy code are written to handle complete loss of jenkins project workspace and deploy environments. This is basic build automation/deploy automation that leads to infrastructure automation.
Basic rule: Murphy's law is going to happen. You should write scripts that are robust and handle cold start scenarios and not worry about wild intruder theories.