What are the supported backend filesystems (storage pools) for OpenEBS Jiva Volumes? - openebs

In other words: which filesystems can be used on the node mount-points where Jiva replicas are created?

Jiva uses sparse files as disks to store data, which in turn requires filesystem support for extent mapping (i.e., the FIEMAP ioctl). Commonly used filesystems with extent support include ext4, XFS, Btrfs, and Lustre. Popular OS/storage filesystems that do not have this support include ZFS, NFS, and Amazon EFS.
ext4 is extensively used as a supported Jiva backend in production.
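As a quick check, filefrag (from e2fsprogs) uses the FIEMAP ioctl to list a file's extents, so running it against a sparse file on the candidate mount-point gives a rough indication of extent-mapping support. A minimal sketch, assuming /var/openebs is the mount-point under test (the path is only an example):

# Create a small sparse file on the candidate mount-point
truncate -s 1G /var/openebs/fiemap-test.img

# filefrag -v lists extents via the FIEMAP ioctl; on filesystems
# without FIEMAP support it falls back to FIBMAP or reports an error
filefrag -v /var/openebs/fiemap-test.img

# Clean up
rm /var/openebs/fiemap-test.img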

Related

Docker Storage Driver overlay2 not showing all the information

The docker info command shows the Storage Driver as overlay2, but it does not show the detailed space-usage information that it shows when the Storage Driver is devicemapper.
Is there a way to see similar information when the Storage Driver is overlay2?
Reason:
docker info shows the following warnings when Storage Driver is devicemapper
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.
WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
Device mapper used a dedicated block device (or a loop-mounted file acting as a device) as storage for images and containers, and the various utilization statistics applied to that block device.
AUFS, Overlay, Overlay2, and other storage backends don't have this block device. They also don't separate metadata from data storage; they are just different files in folders. To see the remaining capacity of the storage backend, look at the remaining capacity of the backing filesystem, e.g.
df /var/lib/docker
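On reasonably recent Docker versions you can also get per-category usage figures that do not depend on the storage driver; a short sketch (output layout varies by version):

# Free space on the filesystem backing /var/lib/docker
df -h /var/lib/docker

# Space used by images, containers, local volumes and build cache
docker system df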

Local disk configuration in Spark

Hi, the official Spark documentation states:
While Spark can perform a lot of its computation in memory, it still uses local disks to store data that doesn’t fit in RAM, as well as to preserve intermediate output between stages. We recommend having 4-8 disks per node, configured without RAID (just as separate mount points). In Linux, mount the disks with the noatime option to reduce unnecessary writes. In Spark, configure the spark.local.dir variable to be a comma-separated list of the local disks. If you are running HDFS, it’s fine to use the same disks as HDFS.
I wonder what the purpose of 4-8 disks per node is.
Is it for parallel writes? I am not sure I understand the reason, as it is not explained.
I also have no clue about this part: "If you are running HDFS, it's fine to use the same disks as HDFS."
Any idea what is meant here?
The purpose of RAID would be to mirror partitions, adding redundancy to prevent data loss in case of a hardware-level fault. When running HDFS, the redundancy that RAID provides is not needed, since HDFS handles it by replicating data between nodes.
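A minimal sketch of the corresponding setup, assuming four scratch disks mounted at /data1 through /data4 (device names, mount-points, and the application name are placeholders): mount each disk individually with noatime and list them all in spark.local.dir, so Spark can spread shuffle and spill I/O across the independent disks.

# /etc/fstab entries for the separate (non-RAID) scratch disks
# /dev/sdb1  /data1  ext4  defaults,noatime  0 0
# /dev/sdc1  /data2  ext4  defaults,noatime  0 0
# ... one entry per disk ...

# spark-defaults.conf
# spark.local.dir  /data1,/data2,/data3,/data4

# or per application on the command line
spark-submit --conf spark.local.dir=/data1,/data2,/data3,/data4 myapp.py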

Azure D4 Disk Size below advertised

An Azure D4 VM is advertised as having 400GB of SSD-backed storage. After provisioning the VM, I end up with 127GB for the OS and 400GB for temp storage.
Is this normal? I need the full 400GB on the OS drive and don't see an obvious way to reconfigure storage.
That is correct. However, because the local SSD is not guaranteed to be persistent, you will not want to use it for your OS drive.
From the D-series announcement, http://azure.microsoft.com/blog/2014/09/22/new-d-series-virtual-machine-sizes/:
"Local Storage SSD Drive
On these new sizes, the temporary drive (D:\ on Windows, /mnt or /mnt/resource on Linux) are local SSDs. This high-speed local disk is best used for workloads that replicate across multiple instances, like MongoDB, or can leverage this high I/O disk for a local and temporary cache, like SQL Server 2014’s Buffer Pool Extensions. Note, these drives are not guaranteed to be persistent. Thus, while physical hardware failure is rare, when it occurs, the data on this disk may be lost, unlike your OS disk and any attached durable disks that are persisted in Azure Storage." (emphasis mine)
I found this post that explains why this is normal for the OS drive:
https://azure.microsoft.com/blog/2015/03/25/azure-vm-os-drive-limit-octupled/
So for marketplace images, the guidance is to provision new data drives.
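As one way to follow that guidance, a sketch using the Azure CLI to attach a new managed data disk to an existing VM (the resource group, VM, and disk names are placeholders, and exact flags may differ between CLI versions):

# Attach a new, empty 400 GB managed data disk to the VM
az vm disk attach --resource-group myResourceGroup --vm-name myVM --name myDataDisk --new --size-gb 400

After attaching, the disk still needs to be initialized and formatted inside the guest OS.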

Is it possible to use GridGain file system with storage other than Hadoop

Is there a way to use the GridGain in-memory file system on top of storage other than the Hadoop file system?
Actually, my idea would be to use it just like a cache on top of a plain file system or a shared NFS mount.
GridGain chose the Hadoop FileSystem API as its secondary file system interface. So the answer is yes, as long as you wrap the other file system in the Hadoop FileSystem interface.
Also, you may wish to look at the HDFS NFS Gateway project.

AWS Amazon offers 160GB of space for a small instance. On booting SUSE Linux, the total root partition space is only 10GB

AWS Amazon offers 160GB of space for a small instance. On booting SUSE Linux, the total root partition space I got is 10GB. With df -h I only see /dev/sda1 with 10GB of space. Where is the rest of the 150GB? How can I claim this space? I don't want to use EBS as it costs extra, and 160GB suffices for my needs. Please help.
The extra 150GB is provided as ephemeral (instance store) storage, i.e. data on this storage won't survive stopping or terminating the instance, in contrast to the data on your root storage. During launch, you can select where your ephemeral disks should be made available as a device in your machine (this is the -b option when using the command line, or the "Instance Storage" tab when launching via the EC2 console). You can then simply mount it in your running instance.
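A minimal sketch of claiming that space once the instance is running, assuming the ephemeral disk shows up as /dev/xvdb (the device name varies by AMI and instance type):

# Find the ephemeral disk (commonly /dev/xvdb or /dev/sdb)
lsblk

# Create a filesystem and mount it (this erases anything on the device)
sudo mkfs.ext4 /dev/xvdb
sudo mkdir -p /mnt/ephemeral
sudo mount /dev/xvdb /mnt/ephemeral

df -h /mnt/ephemeral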
