Regarding user-data in Azure VMs - azure

I recently came across a usecase which involves using the "userdata" capability in Azure. However reading the docs, I understood the differences between custom data and user data, such as userdata persists and is available through out the lifetime of the VM. However I have some questions that are unanswered and coulndn't figure out through Google. The questions are as follows:
Is userdata only applicable to Linux VMs or can we provide userdata for Windows VMs provisioned on Azure? If so, which process takes care of executing it? (like in Linux VMs, there's cloud-init process that processes the userdata)
If I specify customdata and userdata for an Azure VM, which gets executed first?
Can I add the cloud-config scripts usually added in the customdata, inside the userdata data field as well? (I know userdata is not encrypted, but since I read that userdata is basically customdata with added benefits, I wanted to know if it was possible.)
When to use userdata and when to use customdata?
I know that its the cloud-init process that executes the customdata when the VM boots up. Similarly, which process executes the userdata?
Since userdata can be added after the VM has been provisioned and existing userdata can be modified, which process takes care of executing it after adding/modifying the existing userdata. Because if it's not cloud-init(because as far as I know, cloud-init only executes once when the VM provisions), then how does userdata gets executed on reboot?
If someone who knows about Azure or the above concepts could enlighten me, it would be a huge help. Thanks.

Related

How to change a terraform variable depending on runtime information?

I have a Terraform script that, after creating several nodes, installs a piece of software on them by telling one node about the other ones using cloud-init. I want this cloud-init piece to only run when this one node is initially created, and not if it's altered or re-created. Now thanks to Terraform plan, Terraform has the information needed, it's telling the user clearly if the node has to be re-created or if it's created the first time at all. But how do I get this kinda information in my script?
The (boring) and manual way is of course to make it a variable that a human enters after reviewing the plan. But this is hardly scalable, secure or sophisticated.
Maybe Terraform is entirely the wrong tool for this kinda job?
Terraform does not expose the information about what action is planned for an object for use in the configuration itself, because Terraform is a desired state system and so the actions are derived from the configuration, rather than the configuration being derived from the actions.
To achieve what you described I think you'll need to arrange for the initialization process to itself remember that it already run in some persistent location. Cloud-init itself remembers when it has run to completion so that rebooting the system won't re-run initialization tasks, but of course that information cannot survive replacing the VM entirely and so you'd need to create a similar marker yourself in a data store that will outlive that particular VM.
One unanswered question down that path is how that external state would get cleaned up if you were to destroy the VM entirely, since the software running in the VM can't tell whether the system is being shut down in preparation for replacement or being shut down in response to just destroying.

How to fail aws_instance creation if user_data fails to run to completion

Is it possible to fail an aws_instance creation if script passed into user_data fails to run? e.g., with exit 1?
I have a null_resource that uses depends_on [aws_instance.myVM] to post a JSON to the instance, and I need that depends_on to fail when the user_data script fails.
Thanks!
user_data is handled by software running inside the instance itself, such as cloud-init, and so its processing is asynchronous from the ec2:RunInstances call that Terraform's AWS provider makes to start the instance running. There is therefore no way to feed back status information from the user_data handling because it could potentially be running some time after the EC2 API starts reporting that the instance is "running", depending on where in the boot process it's dealt with.
Also, depends_on is for ordering rather than for handling errors, so a depends_on clause will never change anything about Terraform's error handling.
If you want to run software on your instance synchronously as part of Terraform's create operation then unfortunately the only practical option is to use the remote-exec provisioners. This comes at the expense of some considerable extra complexity, because Terraform must now be able to open an SSH session with the instance and authenticate with it to create a two-way communications channel.
In return for that complexity, Terraform can be the one to run the code in question and so Terraform can detect whether it succeeded or not (using its exit status). If the remote command fails, Terraform will halt further processing, return an error, and mark the instance as tainted so that the next plan will attempt to create it again.
With that said, it may be better to find a different way to achieve your goal that doesn't require coupling the Terraform run directly to the state of the software running in the virtual machine. It's not usually expected that a Terraform configuration should need to both start up a virtual machine and interact with software running in that virtual machine in the same operation. This could be a good point for some system decomposition, where you'd have one Terraform configuration that declares that the virtual machine should exist and then a second configuration that assumes that the virtual machine exists and takes actions against it. You can then run whatever other software you need to run in between those two, in order to check whether the instance started up successfully.
One not so obvious way I am doing this is I register the instance as a consul service at the end of user data. If the service never arrives, downstream polling for arrival of the consul service can also timeout and fail. This could be located in another local exec function which could be used as a dependency, or it could be in other instance user data as well.
Perhaps a better way would be to use the consul KV store though, and not to just rely on service arrival. The KV store could be used to provide more information than just a binary state.

should I configure my EC2 using user_data or Ansible

When launching EC2 using Terraform (or cloud formation), we can configure EC2 by putting some scripts in user_data/remote-exec. Alternatively, we can configure EC2 using Ansible/Chef, etc. What are the difference of configuring EC2 in user_data/remote-exec and do that with Ansible/Chef? when to use the former, when to use the latter (I know Ansible/Chef is idempotent)?
In my case, the EC2 is originally manually launched, then manually configured using a lot of linux commands. and the commands are not configured by me. Now I am the person to automate the whole structure using terraform, and configure EC2s. Using user_data/remote-exec to configure EC2 is straightforward. I just need to put all the existing linux commands they have in some scripts with a little change. And if the configuration result using my script is not successful, at least I can quickly figure out whether I miss some commands by comparing my script and the original linux commands. But if I use ansible/chef, I have to rewrite all the steps using different language. And if the configuration is not what expected, it is hard for me to figure out which steps are not correct, because the syntax of ansible/chef and linux commands are totally different.
My question is, in my case, should I use ansible/chef or user_data/remote-exec for configuration?
User Data is good for initial configuration of the system. If you need longer term maintenance a configuration management software like Ansible/Chef/Salt/Puppet is a great option.
Packer can be used for immutable infrastructure, i.e. doesn't change after creation. You can run all the scripts and installs on the system for it to be ready to just boot, this is also faster because you don't have to wait for user data to run.
A few questions you have to ask as well, how often are you going to patch these? Are you going to just update existing or replace with new. Ansible is great for configuration since it's just yaml files an
Blue/Green deployments generally replace servers with all new ones and gradually move traffic over to the new servers.
Some more things to consider with your Infrastructure as code

Setting up ELB on AWS for Node.js

I have a very basic question but I have been searching the internet for days without finding what I am looking for.
I currently run one instance on AWS.
That instance has my node server and my database on it.
I would like to make use of ELB by separating the one machine that hosts both the server and the database:
One machine that is never terminated, which hosts the database
One machine that runs the basic node server, which as well is never terminated
A policy to deploy (and subsequently terminate) additional EC2 instances that run the server when traffic demands it.
First of all I would like to know if this setup makes sense.
Secondly,
I am very confused about the way this should work in practice:
Do all deployed instances run using the same volume or is a snapshot of the volume is used?
In general, how do I set such a system? Again, I searched the web and all of the tutorials and documentations are so generalized for every case that I cannot seem to figure out exactly what to do in my case.
Any tips? Links? Articles? Videos?
Thank you!
You would have an AutoScaling Group with a minimum size of 1, that is configured to use an AMI based on your NodeJS server. The AutoScaling Group would add/remove instances to the ELB as instances are created and deleted.
EBS volumes can not be attached to more than one instance at a time. If you need a shared disk volume you would need to look into the EFS service.
Yes you need to move your database onto a separate server that is not a member of the AutoScaling Group.

Is it possible to Capture running machine on Azure or it is recommended to shutdown before?

I have an Ubuntu 14 VM on Azure to host my developed web sites. (I do not think the OS matters in the point of view the question, but never know)
I've discovered the relatively new Capture button, so for the storage price of a disk size I regularly save a "snapshot" via the Capture function (I am not preparing the image for provisioning, I mean not checking the "I have run 'waagent -deprovision' on the virtual machine" checkbox). Be aware quickly becomes pretty addictive.
The result is an image what I can use when creating new machines, its there in My Images in the wizard. This can function as a backup/rollback worflow. If I delete the VM and create a new from one of resulting image of the previously captured "snapshots". (again, no provisioning will take place)
It is possible to initiate the Capture operation on a running VM. It is not clear for me, if the result will be an image what is a template for a new VM, and that VM will start up and boot, in what state the filesystem etc will be?
Is not it a similar state than sudden power lost? If yes, then it is strongly recommended to always shutdown the VM before capturing, however this such a pain and productivity killer, so no one (including me) wants to do unless it is mandatory.
Accidentally I've switched to the new Azure portal and there the Capture UI says:
Capturing a virtual machine while it's running isn't recommended.

Resources