Currently in Terraform, the ebs_config option is used to specify the size and number of EBS volumes to be attached to an instance group in EMR. When no ebs_config is specified, a default 32 GB EBS volume is attached to the core node in addition to the root volume. In my case I don't want any EBS volumes attached to the core node. How do I specify that in Terraform?
Currently I use the following code:
name = "CoreInstanceGroup"
instance_role = "CORE"
instance_type = "m4.xlarge"
instance_count = "1"
ebs_config {
size = 1
type = "gp2"
volumes_per_instance = 1
}
Terraform doesn't allow size and volumes_per_instance to be 0.
I managed to figure out that this is not a Terraform issue; it's how AWS EMR works. When you specify an EBS-only instance type (say m4.xlarge, as in the question), EMR automatically attaches an EBS storage volume in addition to the root volume. If you instead specify an instance type that comes with local instance store (say r3.xlarge), EMR doesn't attach an extra EBS volume.
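For example, a core instance group in the same style as the configuration above, using an instance type with local instance store (r3.xlarge here is only an illustration), would simply omit ebs_config:

name           = "CoreInstanceGroup"
instance_role  = "CORE"
instance_type  = "r3.xlarge" # comes with local instance store, so EMR adds no extra EBS volume
instance_count = "1"
# no ebs_config block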
I need to create snapshots of all volumes present in a given AWS region; the script must be able to create snapshots of all the volumes in the us-east-2 region.
I have used the script below, but it only takes snapshots in my default region. How do I fix this?
import boto3

ec2 = boto3.resource('ec2')
snapshot = ec2.Snapshot('id')
Region = 'us-east-2'

for vol in ec2.volumes.all():
    if Region == 'us-east-2':
        string = vol.id
        ec2.create_snapshot(VolumeId=vol.id, Description=string)
        print(vol.id)
        print('A snapshot has been created for the following EBS volumes', vol.id)
    else:
        print('No snapshot has been created for the following EBS volumes', vol.id)
The script works fine only for the default region, but when I create volumes in any other region it does not take snapshots of those volumes. Can someone please help?
You may specify the region when creating the EC2 client with a botocore Config object.
Reference: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
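A minimal sketch, assuming the us-east-2 region from your question:

import boto3
from botocore.config import Config

# Point the client at the desired region instead of the default one
my_config = Config(region_name='us-east-2')
ec2 = boto3.client('ec2', config=my_config)

for vol in ec2.describe_volumes()['Volumes']:
    ec2.create_snapshot(VolumeId=vol['VolumeId'], Description=vol['VolumeId'])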
After doing more research, I found that the script below worked fine for me.
import boto3

# Create the EC2 resource explicitly in the target region
region_name = 'us-east-2'
ec2 = boto3.resource('ec2', region_name=region_name)

count = 0
for vol in ec2.volumes.all():
    count += 1
    string = vol.id
    ec2.create_snapshot(VolumeId=vol.id, Description=string)
    print('A snapshot has been created for the following EBS volume', vol.id)
if count == 0:
    print('No snapshot has been created for this region because no volumes exist!')
I am trying to set up an AWS environment with two EC2 instances in a VPC that are configured to run a piece of software that requires a config file containing the IP address of the other instance. To do this, I am creating the config file from a template that I pass as user data when starting each EC2 instance, like this:
data "template_file" "init_relay" {
template = file("${path.module}/initRelay.tpl")
vars = {
port = var.node_communication_port
ip = module.block-producing-node.private_ip[0]
self_ip = module.relay-node.public_ip
}
}
module "relay-node" {
source = "terraform-aws-modules/ec2-instance/aws"
name = "relay-node"
ami = var.node_ami
key_name = "aws-keys"
user_data = data.template_file.init_relay.rendered
instance_type = var.instance_type
subnet_id = module.vpc.public_subnets[0]
vpc_security_group_ids = [module.relay_node_sg.this_security_group_id]
associate_public_ip_address = true
monitoring = true
root_block_device = [
{
volume_type = "gp2"
volume_size = 35
},
]
tags = {
Name = "Relay Node"
Environment = var.environment_tag
Version = var.pool_version
}
}
data "template_file" "init_block_producer" {
template = "${file("${path.module}/initBlockProducer.tpl")}"
vars = {
port = var.node_communication_port
ip = module.relay-node.private_ip
self_ip = module.block-producing-node.private_ip
}
}
module "block-producing-node" {
source = "terraform-aws-modules/ec2-instance/aws"
name = "block-producing-node"
ami = var.node_ami
key_name = "aws-keys"
user_data = data.template_file.init_block_producer.rendered
instance_type = var.instance_type
subnet_id = module.vpc.public_subnets[0]
vpc_security_group_ids = [module.block_producing_node_sg.this_security_group_id]
associate_public_ip_address = true
monitoring = true
root_block_device = [
{
volume_type = "gp2"
volume_size = 35
},
]
tags = {
Name = "Block Producing Node"
Environment = var.environment_tag
Version = var.pool_version
}
}
but that gives me a cyclic dependency error:
» terraform apply
Error: Cycle: module.relay-node.output.public_ip, module.block-producing-node.output.private_ip, data.template_file.init_relay, module.relay-node.var.user_data, module.relay-node.aws_instance.this, module.relay-node.output.private_ip, data.template_file.init_block_producer, module.block-producing-node.var.user_data, module.block-producing-node.aws_instance.this
It makes sense to me why I am getting this error: in order to generate the config file for one EC2 instance, the other instance already needs to exist and have an IP address assigned to it. But I don't know how to work around that.
How do I reference the IP address of the other EC2 in the template file in a way that doesn't cause a cyclic dependency issue?
Generally-speaking, the user data of an EC2 instance cannot contain any of the IP addresses of the instance because the user data is submitted as part of launching the instance and cannot be changed after the instance is launched, and the IP address (unless you specify an explicit one when launching) is also assigned during instance launch, as part of creating the implied main network interface.
If you have only a single instance and it needs to know its own IP address then the easiest answer is for some software installed in your instance to ask the operating system which IP address has been assigned to the main network interface. The operating system already knows the IP address as part of configuring the interface using DHCP, and so there's no need to also pass it in via user data.
A more common problem, though, is when you have a set of instances that all need to talk to each other, such as to form some sort of cluster, and so they need the IP addresses of their fellows in addition to their own IP addresses. In that situation, there are broadly-speaking two approaches:
Arrange for Terraform to publish the IP addresses somewhere that will allow the software running in the instances to retrieve them after the instance has booted.
For example, you could publish the list in AWS SSM Parameter Store using aws_ssm_parameter and then have the software in your instance retrieve it from there, or you could assign all of your instances into a VPC security group and then have the software in your instance query the VPC API to enumerate the IP addresses of all of the network interfaces that belong to that security group.
All variants of this strategy have the problem that the software in your instances may start up before the IP address data is available or before it's complete. Therefore it's usually necessary to periodically poll whatever data source is providing the IP addresses in case new addresses appear. On the other hand, that capability also lends itself well to autoscaling systems where Terraform is not directly managing the instances.
This is the technique used by ElasticSearch EC2 Discovery, for example, looking for network interfaces belonging to a particular security group, or carrying specific tags, etc.
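As an illustration of that first strategy, a rough sketch of publishing the addresses with aws_ssm_parameter might look like this, where the resource name, the parameter path, and the hypothetical aws_instance.cluster resource are all placeholders:

resource "aws_ssm_parameter" "cluster_ips" {
  name  = "/example/cluster/private-ips"
  type  = "StringList"
  value = join(",", aws_instance.cluster[*].private_ip)
}

Because the parameter is written from the instances' attributes rather than fed into their user_data, it does not create a dependency cycle; the software on each instance reads (and re-reads) the parameter after boot.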
Reserve IP addresses for your instances ahead of creating them so that the addresses will be known before the instance is created.
When we create an aws_instance without saying anything about network interfaces, the EC2 system implicitly creates a primary network interface and chooses a free IP address from whatever subnet the instance is bound to. However, you have the option to create your own network interfaces that are managed separately from the instances they are attached to, which both allows you to reserve a private IP address without creating an instance and allows a particular network interface to be detached from one instance and then connected to another, preserving the reserved IP address.
aws_network_interface is the AWS provider resource type for creating an independently-managed network interface. For example:
resource "aws_network_interface" "example" {
subnet_id = aws_subnet.example.id
}
The aws_network_interface resource type has a private_ips attribute whose first element is equivalent to the private_ip attribute on an aws_instance, so you can refer to aws_network_interface.example.private_ips[0] to get the IP address that was assigned to the network interface when it was created, even though it's not yet attached to any EC2 instance.
When you declare the aws_instance you can include a network_interface block to ask EC2 to attach the pre-existing network interface instead of creating a new one:
resource "aws_instance" "example" {
# ...
user_data = templatefile("${path.module}/user_data.tmpl", {
private_ip = aws_network_interface.example.private_ips[0]
})
network_interface {
device_index = 0 # primary interface
network_interface_id = aws_network_interface.example.id
}
}
Because the network interface is now a separate resource, you can use its attributes as part of the instance configuration. I showed only a single network interface and a single instance above in order to focus on the question as stated, but you could also use resource for_each or count on both resources to create a set of instances and then use aws_network_interface.example[*].private_ips[0] to pass all of the IP addresses into your user_data template.
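A rough sketch of that count-based variant, as a standalone alternative to the single-instance example above (the count and the template variable names are illustrative):

resource "aws_network_interface" "example" {
  count     = 3
  subnet_id = aws_subnet.example.id
}

resource "aws_instance" "example" {
  count = 3
  # ...

  user_data = templatefile("${path.module}/user_data.tmpl", {
    self_ip = aws_network_interface.example[count.index].private_ips[0]
    all_ips = aws_network_interface.example[*].private_ips[0]
  })

  network_interface {
    device_index         = 0 # primary interface
    network_interface_id = aws_network_interface.example[count.index].id
  }
}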
A caveat with this approach is that because the network interfaces and instances are separate it is likely that a future change will cause an instance to be replaced without also replacing its associated network interface. That will mean that a new instance will be assigned the same IP address as an old one that was already a member of the cluster, which may be confusing to a system that uses IP addresses to uniquely identify cluster members. Whether that is important and what you'd need to do to accommodate it will depend on what software you are using to form the cluster.
This approach is also not really suitable for use with an autoscaling system, because it requires the number of assigned IP addresses to grow and shrink in accordance with the current number of instances, and for the existing instances to somehow become aware when another instance joins or leaves the cluster.
Your template depends on your module and your module depends on your template; that is what causes the cycle:
ip = module.block-producing-node.private_ip[0]
and
user_data = data.template_file.init_block_producer.rendered
I am following this blog to run Terraform to spin up an EKS cluster:
https://github.com/berndonline/aws-eks-terraform/blob/master/
I just want to change my EC2 worker nodes to spot instances:
https://github.com/berndonline/aws-eks-terraform/blob/master/eks-worker-nodes.tf
I googled and narrowed it down to the launch configuration section.
Any ideas how to change the EC2 type to a spot instance?
Please go through the official documentation for the resource aws_launch_configuration; it already gives a sample of how to set a spot instance:
resource "aws_launch_configuration" "as_conf" {
image_id = "${data.aws_ami.ubuntu.id}"
instance_type = "m4.large"
spot_price = "0.001"
lifecycle {
create_before_destroy = true
}
}
Notes:
Spot prices keep changing depending on demand. If you are not familiar with spot pricing, you can set spot_price to the instance type's on-demand price.
Even if you set the on-demand price as your maximum, AWS only charges the current spot price, which is normally several times lower, and it will never charge more than the maximum you set; the trade-off is that the instance can be interrupted when spot capacity runs out.
Please also go through the AWS documentation for details: https://aws.amazon.com/ec2/spot/pricing/
I have been trying, to no avail, to Terraform the following setup in Azure:
A Linux VM created from a Packer-built custom VM image, with an additional persistent, managed and encrypted data disk attached to the VM. The data disk lives outside the VM's lifecycle so that I can recreate the VM with a newer (more up-to-date, more secure) version of the custom image without losing any of the data saved to the external disk (imagine a node in a database cluster). I went on to try the following:
Initially, I tried using an azurerm_managed_disk and an azurerm_virtual_machine_data_disk_attachment with the VM resource, but the issue is that if you just create a disk like this (with create_option set to Empty) the disk will be unformatted, unpartitioned and unmounted: basically unusable unless some script is run on the VM.
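For reference, a minimal sketch of that kind of disk-plus-attachment setup, with illustrative names and sizes:

resource "azurerm_managed_disk" "data" {
  name                 = "example-data-disk"
  location             = azurerm_resource_group.main.location
  resource_group_name  = azurerm_resource_group.main.name
  storage_account_type = "Standard_LRS"
  create_option        = "Empty"
  disk_size_gb         = 128
}

resource "azurerm_virtual_machine_data_disk_attachment" "data" {
  managed_disk_id    = azurerm_managed_disk.data.id
  virtual_machine_id = azurerm_virtual_machine.main.id
  lun                = 0
  caching            = "ReadWrite"
}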
My thinking then went: OK, I'll just run a cloud-init script or a provisioner block to partition and mount the disk and that's it. But if I do this, the script will run again when I rotate the VM and will re-format/partition the disk, deleting any data I might have saved.
I also tried creating a custom image with an additional data disk with Packer and using FromImage as the azurerm_managed_disk's create_option, but it turns out that only works when referencing marketplace images; custom images are not supported.
The only viable thing I can now think of is going back to approach 2 and making a smarter script that runs only if the attached disk is not yet partitioned.
Is there an alternative approach I'm not seeing? Can someone help me validate this thinking?
My additional concern is regarding encryption in said disks, as I don't know if this will be an issue when adopting either approach.
First of all, you can create the Azure VM from a custom image through Terraform, no matter how you created the image (with Packer or in another way); for more details see To provision a Custom Image in Terraform.
But when you use a custom image and want an encrypted data disk, you run into a problem:
Disk encryption is not currently supported in the use of custom Linux images.
For more details see Requirements and limitations of Encryption.
In addition, to mount the data disk inside the VM you can use a VM extension. To attach the managed data disk to the VM, you can just add a storage_data_disk block to the VM configuration in your Terraform code, like this:
resource "azurerm_virtual_machine" "main" {
name = "${var.prefix}-vm"
location = "${azurerm_resource_group.main.location}"
resource_group_name = "${azurerm_resource_group.main.name}"
network_interface_ids = ["${azurerm_network_interface.main.id}"]
vm_size = "Standard_DS1_v2"
# Uncomment this line to delete the OS disk automatically when deleting the VM
# delete_os_disk_on_termination = true
# Uncomment this line to delete the data disks automatically when deleting the VM
# delete_data_disks_on_termination = true
...
storage_data_disk {
name = "datadisk0"
vhd_uri = "${azurestack_storage_account.test.primary_blob_endpoint}${azurestack_storage_container.test.name}/datadisk0.vhd"
disk_size_gb = "1023"
create_option = "Empty"
lun = 0
}
...
tags {
environment = "staging"
}
}
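For the VM extension part mentioned above, a rough sketch of an idempotent format-and-mount with the Linux custom script extension could look like the following. The extension name, the device path /dev/sdc and the mount point are assumptions, and the blkid check is only there to avoid re-formatting a disk that already has a filesystem:

resource "azurerm_virtual_machine_extension" "mount_data_disk" {
  name                 = "mount-data-disk"
  location             = "${azurerm_resource_group.main.location}"
  resource_group_name  = "${azurerm_resource_group.main.name}"
  virtual_machine_name = "${azurerm_virtual_machine.main.name}"
  publisher            = "Microsoft.Azure.Extensions"
  type                 = "CustomScript"
  type_handler_version = "2.0"

  # Format the disk only if it has no filesystem yet, then mount it
  settings = <<SETTINGS
{
  "commandToExecute": "bash -c 'if ! blkid /dev/sdc; then mkfs.ext4 /dev/sdc; fi; mkdir -p /data; mount /dev/sdc /data'"
}
SETTINGS
}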
EDIT
I am afraid you need to use the custom image ID in the VM's storage_image_reference. You can use the azurerm_image data source to refer to your custom image in your resource group. The code looks like this:
data "azurerm_image" "custom" {
name = "your_custom_image_name"
resource_group_name = "your_group"
}
resource "azurerm_virtual_machine" "main" {
...
storage_image_reference {
id = "${data.azurerm_image.custom.id}"
}
...
}
If you are creating an EC2 instance from the Node.js API call RunInstances, I found it tricky to expand the root volume when the AMI comes with low disk space (usually 8 GB).
If you use the following block, it adds another EBS volume to the EC2 instance without expanding the root volume:
BlockDeviceMappings: [
  {
    DeviceName: "/dev/sdh",
    Ebs: {
      VolumeSize: 100
    }
  }
],
What can we do to expand the root volume?
Have you tried modifyVolume(params = {}, callback) ⇒ AWS.Request? You can also modify volume attributes with modifyVolumeAttribute(params = {}, callback) ⇒ AWS.Request.
Both are mentioned in the same documentation link you have shared.
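A rough Node.js sketch of that ModifyVolume call, where the region, volume ID and target size are placeholders:

const AWS = require('aws-sdk');
const ec2 = new AWS.EC2({ region: 'us-east-1' });

// Grow the (already existing) volume to 100 GiB
ec2.modifyVolume({ VolumeId: 'vol-0123456789abcdef0', Size: 100 }, (err, data) => {
  if (err) console.error(err);
  else console.log(data.VolumeModification);
});

Note that after the volume itself has been resized, the filesystem on the instance still has to be grown separately (for example with growpart and resize2fs).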
Thanks
I had a similar issue while calling run-instances: I wanted to exclude some volumes, but that is not possible in run-instances.
If you do the operation through the AWS Console, however, the UI does allow it, e.g. expanding a volume or excluding a volume.
You can launch the instance using run-instances and then modify the volume to increase its size with a separate command.
If you find any other solution, please let me know by answering the question below as well:
Exclude EBS volume while create instance from the AMI