Filesystem for a partition goes missing EC2 reboot

Filesystem for a partition goes missing EC2 reboot - linux

I created a d2.xlarge EC2 instance on AWS which returns the following output:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda 202:0 0 8G 0 disk
`-xvda1 202:1 0 8G 0 part /
xvdb 202:16 0 1.8T 0 disk
xvdc 202:32 0 1.8T 0 disk
xvdd 202:48 0 1.8T 0 disk
The default /etc/fstab looks like this
LABEL=cloudimg-rootfs / ext4 defaults,discard 0 0
/dev/xvdb /mnt auto defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig 0 2
Now, I make an EXT4 filesystem for xvdc
$ sudo mkfs -t ext4 /dev/xvdc
mke2fs 1.42.13 (17-May-2015)
Creating filesystem with 488375808 4k blocks and 122101760 inodes
Filesystem UUID: 2391499d-c66a-442f-b9ff-a994be3111f8
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information:
done
blkid returns a UID for the filesystem
$ sudo blkid /dev/xvdc
/dev/xvdc: UUID="2391499d-c66a-442f-b9ff-a994be3111f8" TYPE="ext4"
Then, I mount it on /mnt5
$ sudo mkdir -p /mnt5
$ sudo mount /dev/xvdc /mnt5
It gets succesfully mounted. Till there, the things work fine.
Now, I reboot the machine(first stop it and then start it) and then SSH into the machine.
I do
$ sudo blkid /dev/xvdc
It returns me nothing. Where did the filesystem go which I created before the reboot? I guess the filesystem for mounts remain created even after the reboot cycle.
Am I missing something to mount a partition on an AWS EC2 instance?
I followed this http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html and it does not seem to work as described above

You need to read up on EC2 Ephemeral Instance Store volumes. When you stop an instance with this type of volume the data on the volume is lost. You can reboot by performing a reboot/restart operation, but if you do a stop followed later by a start the data is lost. A stop followed by a start is not considered a "reboot" on EC2. When you stop an instance it is completely shut down and when you start it back later it is basically recreated on different backing hardware.
In other words what you describe isn't an issue, it is expected behavior. You need to be very aware of how these volumes work before depending on them.

Related

Changing EC2 Instance Type modified EBS root device UUID and made disk read only. How to resolve?

I had a fully working Amazon Linux 2 instance, running on t2.small instance type. I wanted to try changing the instance to a t2.medium type to test. As I have done in the past, I simply shut down the instance, changed the type, and then restarted the instance.
After the restart, apache was down and my sites were un-reachable. I was able to login to the instance and when trying to start apache I discovered that the root drive was now read only which prevented start/etc. Through some troubleshooting I was able to get the drive remounted and thing running as normal, but everytime I restart the instance, it goes back to read-only and I have to perform the same fix each time to get it back to normal. I believe it's an issue with my /etc/fstab root device UUID not matching the current root device UUID. I never changed any of the attached EBS volumes, so I'm not sure how the change occured.
Some relevant info:
$ cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
To discover the UUID mismatch/fix, I performed the following:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda 202:0 0 50G 0 disk
└─xvda1 202:1 0 50G 0 part /
xvdb 202:16 0 50G 0 disk
xvdf 202:80 0 50G 0 disk
└─xvdf1 202:81 0 50G 0 part
$ sudo blkid
/dev/xvda1: LABEL="/" UUID="2a7884f1-a23b-49a0-8693-ae82c155e5af" TYPE="xfs" PARTLABEL="Linux" PARTUUID="4d1e3134-c9e4-456d-a253-374c91394e99"
/dev/xvdf1: LABEL="/" UUID="a8346192-0f62-444c-9cd0-655ed0d49a8b" TYPE="ext4" PARTLABEL="Linux" PARTUUID="2688b30d-29ef-424f-9196-05ec7e4a0d80"
I had read that a possible fix would be to perform the following:
$ sudo mount -o remount,rw /
mount: /: can't find UUID=-1a7884f1-a23b-49a0-8693-ae82c155e5af.
Obviously, that didn't work. So I looked at my /etc/fstab:
#
UUID=-1a7884f1-a23b-49a0-8693-ae82c155e5af / xfs defaults,noatime 1 1
/swapfile swap swap defaults 0 0
Seeing this mismatch, I tried:
sudo mount -o remount nouuid /
Which worked, made the root writeable and I was able to get services back up and running.
So, this is how I've come to the belief that it has to do with the mismatch of the UUID in fstab.
My Questions:
Should I change the entry in /etc/fstab to match the current UUID: 2a7884f1-a23b-49a0-8693-ae82c155e5af
Any idea why this happened and how I can prevent it from happening in the future?

Options for storing many small images for fast batch access on Google Cloud?

We have a few datasets of small images, where each image is about 100KB, and there about 50K images per dataset (around 5GB each dataset). We typically use these datasets to batch-load each image incrementally into a memory of a Google VM instance in order to perform machine learning studies. This is done several times a day.
Currently, a few of us each have our own Google Persistent Disk attached to the VM with the datasets replicated on each. This is not ideal since they are pricey, however, data access is very fast which allows us to iterate on our studies fairly rapidly. We don't share one disk because of the inconvenience of having to manage read/write settings with Google disks when sharing.
Is there an alternative Google Cloud option to handle this use case? Google Buckets are too slow since it is reading many small files.

If your main interest is having rapid I/O your best bet is using an SSD for obvious reasons. Why I don't understand is why you don't want to share one disk. You can have one SSD attached to one of your instances as R/W for loading and modifying your datasets and mounting it read-only to the instances that need to fetch the data.
I'm not sure how faster will be this solution compared to using a bucket, though. I guess you are aware that gsutil has an option for multithreading transfers, which exponentially increases the data transfer speed, specially when transfering a lot of small files? The flag is -m
-m Causes supported operations (acl ch, acl set, cp, mv, rm, rsync,
and setmeta) to run in parallel. This can significantly improve
performance if you are performing operations on a large number of
files over a reasonably fast network connection.
gsutil performs the specified operation using a combination of
multi-threading and multi-processing, using a number of threads
and processors determined by the parallel_thread_count and
parallel_process_count values set in the boto configuration
file. You might want to experiment with these values, as the
best values can vary based on a number of factors, including
network speed, number of CPUs, and available memory.
Using the -m option may make your performance worse if you
are using a slower network, such as the typical network speeds
offered by non-business home network plans. It can also make
your performance worse for cases that perform all operations
locally (e.g., gsutil rsync, where both source and destination
URLs are on the local disk), because it can "thrash" your local
disk.
If a download or upload operation using parallel transfer fails
before the entire transfer is complete (e.g. failing after 300 of
1000 files have been transferred), you will need to restart the
entire transfer.
Also, although most commands will normally fail upon encountering
an error when the -m flag is disabled, all commands will
continue to try all operations when -m is enabled with multiple
threads or processes, and the number of failed operations (if any)
will be reported at the end of the command's execution.
If you want to go with the instance with R/W SSD and multiple read only clients see below:
One option is to set up an NFS on your SSD, one instance will act as the NFS server with R/W rights and the rest will have only read permissions. I will be using Ubuntu 16.04 but the process is similar in all distros:
1 - Install the required packages on both server and clients:
Server: sudo apt install nfs-kernel-server
Client: sudo apt install nfs-common
2 - Mount the disk SSD disk on the server (after formatting it to the filesystem you want to use):
Server:
jordim#instance-5:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 50G 0 disk <--- My extra SSD disk
sda 8:0 0 10G 0 disk
└─sda1 8:1 0 10G 0 part /
jordim#instance-5:~$ sudo fdisk /dev/sdb
(I will create a single primary ext4 partition)
jordim#instance-5:~$ sudo fdisk /dev/sdb
(create partition)
jordim#instance-5:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 50G 0 disk
└─sdb1 8:17 0 50G 0 part <- Newly created partition
sda 8:0 0 10G 0 disk
└─sda1 8:1 0 10G 0 part /
jordim#instance-5:~$ sudo mkfs.ext4 /dev/sdb1
(...)
jordim#instance-5:~$ sudo mkdir /mount
jordim#instance-5:~$ sudo mount /dev/sdb1 /mount/
Make a dir for your NFS share folder:
jordim#instance-5:/mount$ sudo mkdir shared
Now configure the exports on your server. Add the folder to share and the private IPs of the clients. Also you can tweak permissions here, use "ro" for "read only" or "rw" for read-write permissions.
jordim#instance-5:/mount$ sudo vim /etc/exports
(inside the exports file, note the IP is the private IP of the client instance):
/mount/share 10.142.0.5(ro,sync,no_subtree_check)
Now start the nfs service on the server:
root#instance-5:/mount# systemctl start nfs-server
Now to create the mountpoint on the client:
jordim#instance-4:~$ sudo mkdir -p /nfs/share
And mount the folder:
jordim#instance-4:~$ sudo mount 10.142.0.6:/mount/share /nfs/share
Now let's test it:
Server:
jordim#instance-5:/mount/share$ touch test
Client:
jordim#instance-4:/nfs/share$ ls
test
Also, see the mounts:
jordim#instance-4:/nfs/share$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 1.8G 0 1.8G 0% /dev
tmpfs 370M 9.9M 360M 3% /run
/dev/sda1 9.7G 1.5G 8.2G 16% /
tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
tmpfs 370M 0 370M 0% /run/user/1001
10.142.0.6:/mount/share 50G 52M 47G 1% /nfs/share
There you go, now you have only one instance with a r/w disk and as many clients as you want with read only permissions.

mount already mounted or busy

I have an Amazon EC2 instance (Ubuntu 12.04) to which I have attached two 250 GB volumes. Inadvertently, the volumes got unmounted. When I tried mounting them again, with the following command,
sudo mount /dev/xvdg /data
this is the error I get :
mount: /dev/xvdg already mounted or /data busy
Then, I tried un-mounting it as follows :
umount /dev/xvdg but it tells me that the volume is not mounted.
umount: /dev/xvdg is not mounted (according to mtab)
I tried lsof to check for any locks but there weren't any.
The lsblk output is as below :
Any help will be appreciated. What do I need to do to mount the volumes back without losing the data on them?

Ok, figured it out. Thanks #Petesh and #mootmoot for pushing me in the right direction. I was trying to mount single volumes instead of a RAID 0 array. The /dev/md127 device was running so I stopped it first with the following command :
sudo mdadm --stop /dev/md127
Then I assembled the RAID 0 array :
sudo mdadm --assemble --uuid <RAID array UUID here> /dev/md0
Once the /dev/md0 array became active, I mounted it on /data.

Try umount /dev/xvdg* and umount /data and then
mount /dev/xvdg1 /data

About formatting new EBS volume on Amazon AWS

I don't have much experience with Linux and mounting/unmounting things. I'm using Amazon AWS, have booting up EC2 with Ubuntu image, and have attached a new EBS volume to the EC2. From the dashboard, I can see that the volume is attached to :/dev/sda1.
Now, I see from this guide from Amazon that the path will likely be changed by the kernel. So it's most likely that my /dev/sda1 device will be mounted on, maybe, /dev/xvda1.
So I logged in using terminal. I do ls /dev/ and I indeed see xvda1 on there. But I also see xvda. Now I want to format the device. But I don't know if the unformatted device is attached to xvda1 or xvda. I cannot list the content of /dev/xvda1 and /dev/xvda (it says ls: cannot access /dev/xvda1/: Not a directory). I guess I have to format it first.
I tried to format using sudo mkfs.ext4 /dev/xvda1. It says: /dev/xvda1 is mounted; will not make a filesystem here!.
I tried to format using sudo mkfs.ext4 /dev/xvda. It says: /dev/xvda is apparently in use by the system; will not make a filesystem here!
How can I format the volume?
EDIT:
The result of lsblk command:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda 202:0 0 8G 0 disk
`-xvda1 202:1 0 8G 0 part /
I then tried to use the command sudo mkfs -t ext4 /dev/xvda, but the same error message appears: /dev/xvda is apparently in use by the system; will not make a filesystem here!
When I tried to use the command mount /dev/xvda /webserver, error message appears: mount: /dev/xvda already mounted or /webserver busy. Some website indicate that this also probably because a corrupted or unformatted file system. So I guess I have to be able to format it first before able to mount it.

First of all you are trying to format /dev/xvda1, which is root device. Why ??
Second if you have added a new EBS, then follow below steps.
List Block Device's
This will give you list of block device attached to your EC2 which will look like
[ec2-user ~]$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvdf 202:80 0 100G 0 disk
xvda1 202:1 0 8G 0 disk /
Out of this xvda1 is the / (root) and xvdf is the one that you need to format and mount ( for the new EBS)
Format Device
sudo mkfs -t ext4 device_name # device_name is xvdf here
Create a Mount Point
sudo mkdir /mount_point
Mount the Volume
sudo mount device_name mount_point # here device_name is /dev/xvdf
Make an entry in /etc/fstab
device_name mount_point file_system_type fs_mntops fs_freq fs_passno
Execute
sudo mount -a
This will read your /etc/fstab file and if it's OK. it will mount the EBS to mount_point

increase root partition size

I created an ec2 instance and set the root device to 120GB but only 20GB shows up in the parition:
[root#xxx.xxx.xxx.xxx ec2-user]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda 202:0 0 1G 0 disk
ââxvda1 202:1 0 949M 0 part
xvdb 202:16 0 120G 0 disk
ââxvdb1 202:17 0 20G 0 part /
How do I add the remaining 100GB to the / partition?

There are a number of tutorials covering this, but general gist is:
you need to detach the volume from the instance (requires bringing it down)
Mount the volume to another running Ec2 instance (not as its root partion)
Resize it here
Detach and reattach back to original instance as the root partition
restart the original instance
Below is another way to do it with snapshots instead of using a second instance:
https://alestic.com/2010/02/ec2-resize-running-ebs-root/

To resize partition online without detaching the volume (/root for example) use xfs_growfs instead resize2fs
xfs_growfs /dev/xvdb

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string