We're using Packer to construct a custom CentOS 7 image for an Azure scale set. Part of this is a custom RPM we've created that builds git from source (we can't use community repos, so we build our own) and installs it to /usr/local/bin. In normal practice the package works perfectly: everything gets installed to the right places and we can use our new version of git.
When we run things through Packer, we install the package via Ansible, and then Packer does the deprovisioning step, captures the image, and puts it in an Azure Shared Image Gallery, which we then pick up for use in our Azure scale set.
The scale set uses the image to create a few instances, and we're up and running. The problem is that the /usr/local/ directory appears to have been reset to its default state: there's nothing in /usr/local/bin anymore, and some (not all) of the packages we install as dependencies to build git (gcc, for example) have also disappeared. Our git RPM is still listed as installed, but gcc is not.
/usr/bin/ seems fine (aside from the missing gcc, which we don't need at this point anyway, but it still seems concerning), so we could probably just install git there instead. Still, I'd like to know whether something strange is happening and whether I should watch out for it in the future, since /usr/local/ seemed like a logical place to install it.
TL;DR:
packer gets base centos7 image
add our custom git package
git installs to /usr/local/bin (it works! git is available)
deprovision with waagent and generalize
packer captures image and uploads it
azure scale set uses image to make new instances
/usr/local/ is back to original state? (thus git is missing?)
???
packer azure arm docs
waagent deprovisioning tool docs
Figured this out.
It turns out that (at least with version 1.7.2) Packer does not necessarily perform idempotent operations with the azure-arm builder in relation to Shared Image Gallery (SIG) image versions, even with the --force flag.
We had created the SIG image version before we had gotten our git package fully working and installed properly, so it was created from a base image that did not have /usr/local/bin/ modified.
When we run the Packer build with the force flag, Packer deletes and recreates the base (managed) image, but for the SIG image version it issues a PUT call with the configuration information, which is to say a "Create or Update" if it follows REST convention (you can't see this unless you set some Packer logging variables and write the verbose logs out to a file or something).
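If it helps anyone else dig into this, the verbose logs can be captured with Packer's standard logging environment variables (the template filename here is just a placeholder):
$ PACKER_LOG=1 PACKER_LOG_PATH=packer-debug.log packer build -force ./azure-centos7.json
The PUT against the SIG image version shows up in that log output.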
So while the base image was updated to one that had git properly set up, the SIG image version thought it was using the same base image as before (the name was the same, with no unique identifier), so as far as it was concerned the configuration hadn't changed and nothing needed to happen. Once we deleted the old version (or created a new version), it properly spun up a VM from the base image we had built, and everything was where it was supposed to be.
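For anyone hitting the same thing, deleting the stale image version can be done from the Azure CLI before re-running the build; the resource group, gallery, and image names below are placeholders for your own:
$ az sig image-version delete \
    --resource-group my-rg \
    --gallery-name my_gallery \
    --gallery-image-definition centos7-base \
    --gallery-image-version 1.0.0
After that, the next Packer run recreates the version against the updated base image.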
I am definitely of the opinion that --force should be an idempotent operation from start to finish. I'm not sure whether this is fixed in later versions (at the time of writing they're on 1.7.6), but maybe I'll update this once I've checked it out.
Related
So in my head, Docker is a container management system that allows you to build an application in a unified way so you don't need to worry about version control, client environment configuration and so on.
However, there is some concept that I am clearly missing:
In my head, Docker basically wraps your whole program in a container that can be shipped easily to clients and anybody who wants to use your product, and from there I can just tell clients to install such-and-such to set up the whole system on their own machines. However, digging into Docker, I don't understand how pushing and pulling images to and from DockerHub helps that use case, or why there isn't simply an executable that runs a Docker image in one click.
DockerHub images take many steps to unpack and edit. I assumed those templates on DockerHub exist for us to pull and edit for our own use cases, but that doesn't seem to be the case, because unpacking an image takes far more steps than I imagined, and the intended use seems to be more "download and use the image", not "edit it".
Surely I am missing something about Docker. What is the purpose of pushing and pulling images on DockerHub? How does that fit into the use case of containerizing my software to be executed by clients? Is the function of DockerHub images just to be pulled and run, not edited?
It's hard for me to wrap my head around this because I'm assuming Docker is for containerizing my application so that it's easily executable by clients who want to install my system.
To expand on this answer, I would even say that Docker allows you to have a development environment tied to your application that is the same for all your developers.
You would have your git repo with your app code, and a Docker container with everything that is needed to run the application.
This way, all your developers are using the same versions of software, and the Docker container(s) should replicate the production environment (you can even deploy with it; that's another use for it). With this, the "it works on my machine" problem goes away, because everyone is working in the same environment.
In my case, all our projects have a docker-compose structure associated with them, so each project always carries its own server requirements. If one developer needs to add a new extension, they can just add it to the Docker config files, and all developers will receive the same extension once they update to the latest release.
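As a rough illustration (the service names, images, and ports here are just an example, not a prescription), the per-project compose file can be as small as:
# docker-compose.yml (example only)
version: "3.8"
services:
  app:
    build: .             # your application image, built from the repo's Dockerfile
    ports:
      - "8080:80"
  db:
    image: mariadb:10.6  # pinned so every developer runs the same version
    environment:
      MYSQL_ROOT_PASSWORD: example
Everyone on the team then runs docker-compose up and gets the same stack.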
I would say there are two uses to having images on DockerHub.
The first is that some images are extremely useful as-is. Pulling a redis/mariadb image saves you the trouble of setting it up and configuring it yourself.
The second is that you can think of a docker image as a layered item: assume your application is a PHP server. You can (and will have to) create an image for your app source code. BUT the container will need PHP to run your source code!
This is why you have a FROM keyword in a Dockerfile, so that you can define a "starting layer". In the case of a PHP server you'd write FROM php:latest, and Docker would pull a PHP image from DockerHub for your server to use.
Without using DockerHub, you'd have to make your image from scratch and therefore bundle everything into it: operating system bits, PHP, your code, etc. Having ready-to-use images to start from makes the image you're building much lighter.
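To make the layering concrete, a minimal Dockerfile for that hypothetical PHP server might look like this (the php:7.4-apache tag and the src/ path are just examples):
# Start from the official PHP image (this is the "starting layer" pulled from DockerHub)
FROM php:7.4-apache
# Add only your own layer on top: the application source
COPY src/ /var/www/html/
Building and running it is then just docker build -t my-php-app . followed by docker run -p 8080:80 my-php-app.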
I started running GitLab CE inside of an x86 Debian VM locally about two years ago, and last year I decided to migrate the GitLab CE instance to a dedicated Intel NUC server. Everything appeared to go well with no issues, and my GitLab CE instance is up-to-date as of today (running 13.4.2).
I discovered recently, though, that some of the repos that were moved give a "NO REPOSITORY!" error when visiting their project pages, and that any issue boards, merge requests, etc. they had are also gone. But you wouldn't suspect anything, since the broken repos appear in the repo lists along with working repos that I use all the time.
If I had to find a common thread among these broken repos, it would be that their last activity was over a year ago: either no pushes were ever made to them beyond an initial push, or, if changes were made or issues and merge requests created, it was literally over a year ago.
Some of these broken repos are rather large with a lot of history, whereas others are super tiny (literally just tracking changes to a shell script), so I don't think repo size itself has anything to do with it.
If I run the GitLab diagnostic check sudo gitlab-rake gitlab:check, everything looks good except for "hashed storage":
All projects are in hashed storage? ... no
Try fixing it:
Please migrate all projects to hashed storage
But then running sudo gitlab-rake gitlab:storage:migrate_to_hashed doesn't appear to complete (it leaves something like six failed jobs in the dashboard), and running gitlab:check again still reports the "hashed storage" problem. I've also tried running sudo gitlab-rake gitlab:git:fsck and sudo gitlab-rake cache:clear, but these commands don't seem to make a difference.
Luckily I have the latest versions of all the missing repos on my machine, and in fact, I still have the original VM running GitLab CE 12.8.5 (with slightly out of date copies of the repos.)
So my questions are:
Is it possible to "repair" the broken repos on my current instance? I suspect I could just "re-push" my local copies of these repos back up to my server, but I really don't want to lose any metadata like issues / merge requests and such.
Is there any way to resolve the "not all projects are in hashed storage" issue? (Again the migrate_to_hashed task fails to complete.)
Would I be able to do something like "backup", "inspect / tweak backup", "restore backup" kind of thing to fix the broken repos, or at least the metadata?
Thanks in advance.
Okay, so I think I figured out what happened.
I found this thread on the GitLab User Forums.
Apparently the scenario here is:
Have a GitLab instance that has repos not in "hashed storage"
Back up your repos
Restore your repos (either to the same server or to a different server when migrating)
Either automatically or manually, attempt to update your repos to "hashed storage"
You'll find that any repo with a "ci runner" (continuous integration runner) is now listed as "NO REPOSITORY!" and is completely unavailable, since the "hashed storage" migration process fails for those projects
The fix is to:
Reset runner registration tokens as listed in this article in the GitLab documentation (a rough command sketch follows this list)
Re-run the sudo gitlab-rake gitlab:storage:migrate_to_hashed process
Once the background jobs are completed, run sudo gitlab-rake gitlab:check to ensure the output contains the message:
All projects are in hashed storage? ... yes
If successful, the projects that stated "NO REPOSITORY!" should now be fully restored.
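For reference, the sequence I ran looked roughly like the following. The token-reset statements are paraphrased from the GitLab documentation article linked above (they clear CI tokens that can no longer be decrypted after the restore), so double-check the exact statements for your GitLab version before running them:
$ sudo gitlab-rails dbconsole
-- clear runner/registration tokens that fail to decrypt (per the linked GitLab docs)
UPDATE projects SET runners_token = null, runners_token_encrypted = null;
UPDATE namespaces SET runners_token = null, runners_token_encrypted = null;
UPDATE application_settings SET runners_registration_token_encrypted = null;
UPDATE ci_runners SET token = null, token_encrypted = null;
\q
$ sudo gitlab-rake gitlab:storage:migrate_to_hashed
$ sudo gitlab-rake gitlab:check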
A key sign that you need to run this process is if you:
Log in to your GitLab CE instance as an admin
Go to the Admin Area
Look under Monitoring->Background Jobs->Dead
and see a job with the name
hashed_storage:hashed_storage_project_migrate
with the error
OpenSSL::Cipher::CipherError:
In short, I want my Docker container/image to rebuild automatically whenever I write a new chunk of functions.
I have created a Node app and run the server in a Docker container via Compose.
The container works fine; however, whenever I make changes to the files or directories, the changes aren't picked up automatically. I need to rebuild again via
$ docker-compose up --build
so that the changes may take effect.
Is there any solution so that I don't need to rebuild the container manually?
Regards.
You either want to look at some kind of delivery pipeline tool, as Boynux suggests; by the way, DockerHub can watch GitHub for check-ins and trigger automatic image builds.
Or you can mount the code into the container using a volume so that changes are picked up.
The option you pick depends on your philosophy / delivery pipeline.
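For the volume approach with Compose, a minimal sketch looks like this; the service name, paths, and the nodemon watcher are assumptions to illustrate the idea, not your actual config:
# docker-compose.override.yml (example)
version: "3.8"
services:
  web:
    volumes:
      - ./src:/usr/src/app         # bind-mount the source so edits appear inside the container
    command: npx nodemon server.js # something still has to restart node when files change
With that in place, docker-compose up without --build picks up code edits; you only rebuild when dependencies or the Dockerfile change.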
I have an Azure Website configured to deploy from a Bitbucket repository. This works fine.
Since the application is still in active development, I update the nuget packages it uses quite frequently. This causes the packages folder to keep growing indefinitely, unless I go and manually delete the packages.
Now, on my local machine this is not a big issue. Space is cheap. But in Azure, this makes us go over the quota really fast as old packages accumulate.
How can I customize the Azure deploy process so that it deletes all the packages after a successful deployment?
(I am open to other solutions as well)
You can utilize the custom deployment script feature where you add a step that cleans up the packages directory.
You can read about it here:
http://blog.amitapple.com/post/38418009331/azurewebsitecustomdeploymentpart2/
Another option is to add a post-deployment action by placing a script file (.cmd/.bat) containing the cleanup logic in the following directory in your site: d:\home\site\deployments\tools\PostDeploymentActions\. This script will run after the deployment completes successfully.
Read more about it here:
https://github.com/projectkudu/kudu/wiki/Post-Deployment-Action-Hooks
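As a sketch of that second option, the post-deployment script only needs a couple of lines of .cmd. The %DEPLOYMENT_SOURCE% variable is set by Kudu during deployment, but the exact packages path depends on where NuGet restores them in your setup, so treat this as an assumption to adjust:
@echo off
rem clean-packages.cmd - drop restored NuGet packages after a successful deployment
if exist "%DEPLOYMENT_SOURCE%\packages" rd /s /q "%DEPLOYMENT_SOURCE%\packages"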
I write company internal software in PHP and C++.
What are the best methods of deploying this type of software to Linux machines? Currently we use svn export; are there any other methods?
We use checkinstall. Just write a simple Makefile that copies the files to target directories on the target machine and then run checkinstall to create RPM, DEB or TGZ package, which you can later easily install with distribution package management tools.
You can even add shell scripts that are executed before and after files are copied, so you can do some pre and post processing like adding user accounts, crontab entries, etc.
Once you get more advanced, you can add dependencies to these packages so that a single command could also pull in and install PHP, MySQL, Apache, GCC libraries, and even required PHP or Apache modules or any external C++ libraries you might need.
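As a rough sketch (package name, paths, and version are made up for illustration), the Makefile only needs an install target that copies files, and checkinstall wraps whatever it installs into a package:
# Makefile - install target only; recipe lines must start with a tab
PREFIX ?= /usr/local
install:
	install -d $(DESTDIR)$(PREFIX)/bin $(DESTDIR)/var/www/myapp
	install -m 0755 build/mytool $(DESTDIR)$(PREFIX)/bin/
	cp -r web/. $(DESTDIR)/var/www/myapp/
Then build the RPM with:
$ sudo checkinstall -R --pkgname=myapp --pkgversion=1.0 make install
Use -D instead of -R to produce a DEB package.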
I think it depends on what you mean by deploy. Typically a deploy process for web projects involves a configuration scripting step in which you take the same deploy package and tailor it to specific servers (staging, development, production) by altering simple configuration directives.
In my experience with Linux servers, these systems are often custom built and often use rsync rather than svn export and/or scp alone.
A script might be executed from the command line like so:
$ deploy-site --package=app \
--platform=dev \
--title="Revsion 1.2"
Internally, the system would take whatever was in trunk for the given package from SVN (I'm sure you could adapt this easily for git too) and generate a new, unique tag with the log entry "deploying Revision 1.2".
Then it would patch any configuration scripts with the appropriate changes (URLs, hosts, database passwords, etc.) before rsyncing everything to the appropriate destination.
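Under the hood, the interesting part is only a couple of commands; the repository URL, host, and paths here are invented for illustration:
$ svn copy "$REPO_URL/trunk/app" "$REPO_URL/tags/app-20090714200154" \
      -m "deploying Revision 1.2"
$ rsync -az --delete build/app/ deploy@dev-host:/var/www/app/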
If there are issues with the deployment, it's as easy as running the same command again only this time using one of your auto-generated tags from an earlier deploy:
$ deploy-site --package=app \
--platform=dev \
--title="Reverting to Revision 1.1" \
--tag=20090714200154
If you also have to compile on the other end, you could include a Makefile as part of your configuration patching and then execute a command via ssh that compiles the recently deployed code once the rsync process completes.
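That last step can be as simple as something like the following (the host and path are invented):
$ ssh deploy@dev-host 'cd /var/www/app && make'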
There is, in my experience, a tradeoff between security and ease of deployment.
For my deployments, I've never had a problem using scp to move files from one machine to another. You can write a simple Bash script to take a list of machines (from a text file or STDIN) and push a given directory/application to a given directory on all of the machines. If you hypothetically pushed it to a bin directory, the end user would never know the difference.
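A minimal version of that script might look like this; the file name, user, and target directory are assumptions:
#!/usr/bin/env bash
# push-app.sh - copy a build directory to every host listed in hosts.txt
set -euo pipefail
APP_DIR=${1:?usage: push-app.sh <app-dir>}
while read -r host; do
    echo "Deploying ${APP_DIR} to ${host}..."
    scp -r "${APP_DIR}" "deploy@${host}:/usr/local/myapp/"
done < hosts.txt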
The only problem with that is when you have multiple architectures and OSes and the software has to be compiled on each one individually. In that case, you could just write a script (the first example that pops into my mind is Net::SSH from Ruby) to take that list of servers, cd to the given directory, and run the compilation script. However, if all machines use the same architecture and configuration, you can hypothetically just compile it once on the machine you're using to distribute.