Can I use GlusterFS volume storage directly without mounting? - glusterfs

I have set up a small GlusterFS cluster with 3+1 nodes.
They're all on the same LAN.
There are 3 servers and 1 laptop (connected via Wi-Fi) that is also a GlusterFS node.
The laptop often disconnects from the network. ;)
The use case I want to achieve is this:
I want my laptop to automatically synchronize with the GlusterFS filesystem when it reconnects. (That's easy and done.)
But when the laptop is disconnected from the cluster, I still want to access the filesystem "offline": modify, add, and remove files.
Obviously the only way I can access the GlusterFS filesystem while disconnected from the cluster is to access the volume storage directly, i.e. the directory I configured when creating the gluster volume. I guess that's the brick.
Is it safe to modify files inside storage?
Will they be replicated to the cluster when the node re-connects?

There are multiple questions in your list:
First: can I access GlusterFS when my system is not connected to the cluster?
If you set up a GlusterFS daemon and brick on your system, mount this local daemon through gluster as you usually would, and add a replication target as well, you can access your brick through gluster as if it were not on your local system. The data will then be synchronized with the replication target once you reconnect your system to the network.
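For reference, here is a minimal sketch of what that could look like on the command line; the hostnames (server1, laptop), the volume name (lapvol) and the brick paths are placeholders, not anything taken from your setup:
systemctl start glusterd                       # on the laptop: run a local gluster daemon
gluster peer probe laptop                      # from an existing cluster node: add the laptop as a peer
gluster volume create lapvol replica 2 \
    server1:/bricks/lapvol/brick \
    laptop:/bricks/lapvol/brick                # one brick on a server, one on the laptop
gluster volume start lapvol
mkdir -p /mnt/lapvol
mount -t glusterfs laptop:/lapvol /mnt/lapvol  # on the laptop: mount through the local daemon
Note that gluster will warn that replica 2 volumes are prone to split-brain; a replica 3 or arbiter setup is safer if you can afford it. After a disconnect, the self-heal mechanism brings the two bricks back in sync once the laptop rejoins the network.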
Second: can I edit files in my brick directly?
Technically you can: you can just navigate to your brick and edit a file. However, since gluster will not know what you changed, the changes will not be replicated and you will create a split-brain situation. So it is certainly not advisable (don't do it unless you are willing to make the same change manually in your replication brick as well).
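If files do get modified behind gluster's back, the self-heal info commands are the place to look; the volume name myvol below is a placeholder:
gluster volume heal myvol info                 # files with pending heals
gluster volume heal myvol info split-brain     # files gluster cannot reconcile on its own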

Tomasz, it is definitely not a good idea to manipulate the backend volume directly. Say you add a new file to the backend volume: glusterfs is not aware of this change, and the file shows up as a spurious file when the parent directory is accessed via the glusterfs volume. I am not sure glusterfs is ideal for your use case.

Related

Can kubernetes provide a pod with an emptyDir volume from the host backed by a specific filesystem different than the host's?

I know this is a bit weird, but I'm building an application that makes small local changes to ephemeral file/folder systems and needs to sync them with a store of record. I am using NFS right now, but it is slow, not super scalable, and expensive. Instead, I'd love to take advantage of btrfs or zfs snapshotting for efficient syncing of snapshots of a small local filesystem, and push the snapshots into cloud storage.
I am running this application in Kubernetes (in GKE), which uses GCP VMs with ext4 formatted root partitions. This means that when I mount an emptyDir volume into my pods, the folder is on an ext4 filesystem I believe.
Is there an easy way to get an ephemeral volume mounted with a different filesystem that supports these fancy snapshotting operations?
No. Nor does GKE offer that kind of low-level control anyway, but the rest of this answer presumes you've managed to create a local mount of some kind. The easiest answer is a hostPath mount; however, that requires you to manually account for multiple similar pods on the same host so they don't collide. A newer option is an ephemeral CSI volume combined with a CSI plugin that basically reimplements emptyDir. https://github.com/kubernetes-csi/csi-driver-host-path gets most of the way there, but it 1) would require more work for this use case and 2) is explicitly not supported for production use. Failing either of those, you can move the whole kubelet data directory onto another mount, though that might not accomplish what you are looking for.
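To make the hostPath option concrete, here is a rough sketch. The node path /mnt/btrfs-scratch is an assumption: you would have to provision and mount a btrfs or zfs filesystem there yourself (which GKE does not make easy), and the per-pod subdirectory is what avoids the collision problem mentioned above.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: snapshot-worker
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    hostPath:
      path: /mnt/btrfs-scratch/snapshot-worker   # one subdirectory per pod
      type: DirectoryOrCreate
EOF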

Read from a XFS brick, write on a volume?

Filesystem notifications are not available on volumes, which is why we started reading directly from the brick.
Is it okay to read directly from a brick, but write to the volume so that replication happens?
The volume is created from 3 bricks using a replication strategy. Could anyone please point out the drawbacks of reading directly from a brick?
If the file on the brick from which you read is not in sync with the other copy/copies of the replica (i.e. there is a self-heal that is pending), you can get stale data. Reading from the mount ensures that you always get the up to date data.
Though not comparable with inotify, you can use glusterfind to provide some level of filesystem notifications.
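A sketch of how glusterfind is typically used for that; the session and volume names (mysession, myvol) are placeholders:
glusterfind create mysession myvol                 # register a change-detection session for the volume
glusterfind pre mysession myvol /tmp/changes.txt   # list files changed since the last run
cat /tmp/changes.txt
glusterfind post mysession myvol                   # commit this run as the new baseline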

Are Docker volumes a better option for write-heavy operations than binding directories directly?

Reading through the Docker documentation I found this passage (located here):
Block-level storage drivers such as devicemapper, btrfs, and zfs perform better for write-heavy workloads (though not as well as Docker volumes).
So does this mean that one should always use Docker volumes when expecting lots of persistent writing?
The container-local filesystem never stores persistent data, so you don't have a choice but to mount something into the container if you want data to live on after the container exits. The "block-level storage drivers" you quote discuss particular install-time options for how images and containers are stored, and aren't related to any particular volume or bind-mount implementation.
As far as performance goes, my general expectation is that the latency of disk I/O will far outweigh any overhead of any particular implementation. Without benchmarking any particular implementation, on a native Linux host, I would expect a named volume, a bind-mount, and writes to the container filesystem to be more or less similar.
From a programming point of view, you will probably get better long-term performance improvement from figuring out how to have fewer disk accesses (for example, by grouping together related database requests into a single transaction) than by trying to optimize the Docker-level storage.
The one prominent exception to this is that bind mounts on MacOS are known to be very slow, and you should avoid them if your workload involves substantial disk access. (This includes both reading and writing, and includes some interpreted languages that want to read in every possible source file at startup time.) If you're managing something like database storage where you can't usefully access the files directly anyway, use a named volume. For your application code, COPY it into an image in a Dockerfile and do not overwrite it at run time.
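As an illustration of that last recommendation (the image tag and volume name are arbitrary), database storage lives in a named volume while the application code is baked into the image at build time:
docker volume create pgdata
docker run -d --name db \
    -v pgdata:/var/lib/postgresql/data \
    -e POSTGRES_PASSWORD=example \
    postgres:15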
Should you always use Docker volumes when expecting lots of persistent writing?
It depends.
Yes, you want some kind of storage external to the container for any persistent data, since data written inside the container is lost when that container is removed.
Whether that should be a host bind or a named volume depends on how you need to manage that data. A host volume is a bind mount to the host filesystem. It gives you direct access to that data, but that direct access also comes with uid/gid permission issues and loses the initialization feature of named volumes.
Named volumes with all the defaults are just a bind mount to a folder under /var/lib/docker, so performance would be the same as a host volume if the underlying filesystem is the same. That said, a named volume can be configured to mount just about anything you can do with the mount command.
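For example, the local driver can pass arbitrary mount options, so a named volume can point at an NFS export instead of the default folder under /var/lib/docker; the server address and export path below are made up:
docker volume create \
    --driver local \
    --opt type=nfs \
    --opt o=addr=192.168.1.10,rw \
    --opt device=:/exports/data \
    nfsvol
docker run --rm -v nfsvol:/data alpine ls /data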
Since each of these options can sit on a different underlying filesystem, and the performance difference comes from that underlying filesystem choice, there's no way to answer this in any generic sense. Hence, it depends.

Creating shared space for PBS Pro

How do you create shared space across nodes?
I have a designated drive that I would like to use, but I want to maintain the ability to add additional drives later.
Let's assume you are just starting out and do not have any specific performance requirements. Then probably the easiest way to go would be to start an NFS server on the head node and export your dedicated drive as an NFS file share to the nodes. Your nodes would be able to mount this share over the network under the same mountpoint.
If your dedicated drives are spread across the cluster, the problem obviously gets trickier. After you have become comfortable with NFS, have a look at parallel file systems such as GlusterFS.
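A minimal sketch of the NFS route, assuming the dedicated drive is mounted at /export/shared on the head node and the compute nodes sit on 192.168.0.0/24 (both are placeholders):
echo '/export/shared 192.168.0.0/24(rw,sync,no_subtree_check)' >> /etc/exports   # on the head node
exportfs -ra                                                                     # reload the export table
systemctl enable --now nfs-server
mkdir -p /export/shared                                                          # on each compute node
mount -t nfs headnode:/export/shared /export/shared
Additional drives can later be merged under the same export point (for example with LVM) or exported as separate shares.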

How to implement Shared Storage for Concurrent File Access between 2 nodes (Linux)

I need to design a clustered application which runs separate instances on 2 nodes. These nodes are both Linux VMs running on VMware. Both application instances need to access a database and a set of files.
My intention is that a shared storage disk (external to both nodes) should contain the database & files. The applications would co-ordinate (via RPC-like mechanism) to determine which instance is master & which one is slave. The master would have write-access to the shared storage disk & the slave will have read-only access.
I'm having problems determining the file system for the shared storage device, since it would need to support concurrent access across 2 nodes. Going for a proprietary clustered file system (like GFS) is not a viable alternative owing to costs. Is there any way this can be accomplished in Linux (EXT3) via other means?
Desired behavior is as follows:
Instance A writes to file foo on shared disk
Instance B can read whatever A wrote into file foo immediately.
I also tried using SCSI PGR3 but it did not work.
Q: Are both VMs co-located on the same physical host?
If so, why not use VMware shared folders?
Otherwise, if both are co-located on the same LAN, what about good old NFS?
Try using Heartbeat + Pacemaker; it has a couple of built-in options to monitor the cluster. It should have something to look after the data too.
You might look at an active/passive setup with DRBD + (Heartbeat or Pacemaker):
DRBD gives you a distributed block device over the 2 nodes, on top of which you can deploy an ext3 filesystem.
Heartbeat/Pacemaker handles which node is active and which is passive, and provides some monitoring/repair functions.
If you need read access on the "passive" node too, you can additionally export the data from the active node as a NAS share (e.g. NFS or CIFS) and have the passive node mount that.
Running a database like PostgreSQL or MySQL on network-attached storage might not work well, though.
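A very rough sketch of the DRBD side, using a DRBD 8.4-style config; the hostnames, IP addresses and the backing device /dev/sdb1 are placeholders, and the exact promotion command differs between DRBD versions:
cat > /etc/drbd.d/r0.res <<'EOF'
resource r0 {
    on nodeA {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on nodeB {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}
EOF
drbdadm create-md r0          # initialise metadata (run on both nodes)
drbdadm up r0                 # bring the resource up (run on both nodes)
drbdadm primary --force r0    # promote one node for the initial sync
mkfs.ext3 /dev/drbd0          # create the filesystem on the primary
mount /dev/drbd0 /srv/shared
Pacemaker/Heartbeat then takes care of promoting the surviving node and re-mounting the filesystem on failover.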
Are you going to be writing the applications from scratch? If so, you can consider using Zookeeper for the coordination between master and slave. This will put the coordination logic purely into the application code.
GPFS is inherently a clustered filesystem.
You set up your servers to see the same LUN(s), build the GPFS filesystem on the LUN(s), and mount the GPFS filesystem on the machines.
If you are familiar with NFS, it looks like NFS, but it's GPFS, a clustered filesystem by nature.
And if one of your GPFS servers goes down, provided you defined your environment correctly, no one is the wiser and things continue to run.
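For orientation, a loose sketch of the GPFS (IBM Spectrum Scale) workflow; the exact command options differ between releases, and the node list, NSD stanza file and mount point below are placeholders rather than a recipe:
mmcrcluster -N nodes.list -p nodeA -s nodeB -r /usr/bin/ssh -R /usr/bin/scp
mmstartup -a                                # start GPFS on all nodes
mmcrnsd -F nsd.stanza                       # describe the shared LUNs as NSDs
mmcrfs gpfs1 -F nsd.stanza -T /gpfs/gpfs1   # build the filesystem on those NSDs
mmmount gpfs1 -a                            # mount it on every node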
