What file system does Ceph use for its OSDs?

Hi everyone,
According to Sage A. Weil's paper, Ceph was using EBOFS as the file system for its OSDs. However, I looked into the Ceph source code and could hardly find any EBOFS code. Is Ceph still using EBOFS, or has it opted for a different file system for a single OSD?
Thank you :-)

EBOFS was deprecated many years ago.
After that, the default file system for the OSD FileStore backend was btrfs for years, but because btrfs was not stable at the time, the Ceph community changed the default to XFS, which is still the default today.
The FileStore layer was aimed at HDDs. The community has since developed a new object store layer named BlueStore, which is designed for high-speed storage devices such as SSDs.

Related

Read from an XFS brick, write to a volume?

Filesystem notifications are not available on volumes, which is why we started reading directly from the brick.
Is it okay to read directly from a brick but write to a volume, so that replication happens?
The volume is created from 3 bricks using a replication strategy. Could anyone please point out the drawbacks of reading directly from a brick?
If the file on the brick from which you read is not in sync with the other copy/copies of the replica (i.e. there is a self-heal that is pending), you can get stale data. Reading from the mount ensures that you always get the up to date data.
Though not comparable with inotify, you can use glusterfind to provide some level of filesystem notifications.

User Level Library for Loopback Storage (no loopback device for Spark applications in HPC)

Cray recommends using loopback devices for running Spark on HPC clusters with Lustre file systems [1]. The problem is that most HPC clusters do not give their users access to loopback devices. So I wonder if there is a library that opens only one huge file on Lustre and lets us treat that huge file as a file system, so that we can utilize parallel file access to that one file.
This way we can have parallel IO while having proper partitions and one file per partition. Searching didn't show me anything.
[1] http://wiki.lustre.org/images/f/fb/LUG2016D2_Scaling-Apache-Spark-On-Lustre_Chaimov.pdf
Whether this is possible depends heavily on your application. It would be possible to create, e.g., an ext4 filesystem image in a regular file using mke2fs as a regular user, and to access it either with libext2fs linked into your application (probably single-threaded) or via fuse2fs in userspace. fuse2fs may still need root permission to set up, though I'm not positive; after that it would behave like a normal filesystem and does not need a block device.

How does Ceph Object Storage store files?

I am not a developer, so this is not a technical question. We're looking at adding Ceph storage to our current application, but I can't seem to get an answer for how Ceph stores files if we use Ceph Object Storage. If I send a 1 GB file to the Ceph Object Store, does Ceph split the file into chunks and store it across multiple OSDs? Or does Ceph store that single 1 GB file on multiple OSDs?
Thank you for answering my question.
Yes, Ceph stripes data (similar to RAID 0). You can refer to HOW CEPH CLIENTS STRIPE DATA in the Ceph documentation for the details.
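As a rough worked example (assuming the commonly used 4 MB default object/stripe size, which is configurable): a 1 GB upload is broken into about 1024 MB / 4 MB = 256 RADOS objects, and CRUSH distributes those objects across the OSDs, so no single OSD holds the whole file.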

GlusterFS noatime alternative

It seems noatime has been disabled for GlusterFS on the version I'm using. Installing directly via apt-get, I'm on v3.4.2. I've tried manually switching over to 3.7 and 3.8 via the PPAs, which did work fine, but the noatime flag still isn't accepted when mounting.
I'm running GlusterFS across several PHP servers. The vast majority of my load comes from GlusterFS. I'm rarely writing files, so I'm assuming the load is coming from gluster syncing access times across the servers. PHP mode, caching, etc. aside, what can I do to reduce the insane load caused by gluster?
Running gluster volume top www_storage open shows massive open counts for config.ini files. The highest value in read is about a quarter of the highest opens, and writes are tiny.

What is the best approach in nodejs to switch between the local filesystem and Amazon S3 filesystem?

I'm looking for the best way to switch between using the local filesystem and the Amazon S3 filesystem.
I think ideally I would like a wrapper to both filesystems that I can code against. A configuration change would tell the wrapper which filesystem to use. This is useful to me because a developer can use their local filesystem, but our hosted environments can use Amazon S3 by just changing a configuration option.
Are there any existing wrappers that do this? Should I write my own wrapper?
Is there another approach I am not aware of that would be better?
There's a project named s3fs that offers a subset of POSIX file system functionality on top of S3. There's no native Amazon-provided way to do this.
However, you should think long and hard about whether or not this is a sensible option. S3 is an object store, not a regular file system, and it has quite different performance and latency characteristics.
If you're looking for high-IOPS, NAS-style storage, then Amazon EFS (in preview at the time of writing) would be more appropriate. Or roll your own NFS/CIFS solution using EBS volumes, SoftNAS, or Gluster.
I like your idea to build a wrapper that can use either the local file system or S3. I'm not aware of anything existing that would provide that for you, but would certainly be interested to hear if you find anything.
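As an illustration of that wrapper idea, here is a minimal sketch in TypeScript. It assumes Node's built-in fs/promises module and the aws-sdk v2 package; the StorageBackend interface and the LocalStorage/S3Storage/createStorage names are made up for this example, not an existing library.

    import { promises as fs } from "fs";
    import * as path from "path";
    import * as AWS from "aws-sdk"; // aws-sdk v2

    // Hypothetical minimal interface both backends implement.
    interface StorageBackend {
      read(key: string): Promise<Buffer>;
      write(key: string, data: Buffer): Promise<void>;
    }

    // Local filesystem backend: keys become paths under a base directory.
    class LocalStorage implements StorageBackend {
      constructor(private baseDir: string) {}

      async read(key: string): Promise<Buffer> {
        return fs.readFile(path.join(this.baseDir, key));
      }

      async write(key: string, data: Buffer): Promise<void> {
        const fullPath = path.join(this.baseDir, key);
        await fs.mkdir(path.dirname(fullPath), { recursive: true });
        await fs.writeFile(fullPath, data);
      }
    }

    // S3 backend: keys become object keys in a bucket.
    class S3Storage implements StorageBackend {
      private s3 = new AWS.S3();
      constructor(private bucket: string) {}

      async read(key: string): Promise<Buffer> {
        const res = await this.s3.getObject({ Bucket: this.bucket, Key: key }).promise();
        return res.Body as Buffer;
      }

      async write(key: string, data: Buffer): Promise<void> {
        await this.s3.putObject({ Bucket: this.bucket, Key: key, Body: data }).promise();
      }
    }

    // Pick the backend from configuration (here, environment variables).
    function createStorage(): StorageBackend {
      return process.env.STORAGE_BACKEND === "s3"
        ? new S3Storage(process.env.STORAGE_BUCKET ?? "my-app-bucket")
        : new LocalStorage(process.env.STORAGE_DIR ?? "./data");
    }

The rest of the application only calls createStorage() and codes against StorageBackend, so switching between the local filesystem and S3 becomes a pure configuration change.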
An alternative would be to use some sort of S3 file system mount, so that your application can always use standard file system I/O but the data might be written to S3 if your system has that location configured as an S3 mount. I don't recommend this approach because I've never heard of an S3 mounting solution that didn't have issues.
Another alternative is to only design your application to use S3, and then use some sort of S3 compatible local object storage in your development environment. There are several answers to this question that could provide an S3 compatible service during development.
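For that last approach, the same aws-sdk v2 client can usually be pointed at a local S3-compatible server (MinIO is one common choice) during development; the endpoint and credentials below are placeholders, not real values:

    import * as AWS from "aws-sdk";

    // Development-only configuration: talk to a local S3-compatible server
    // instead of AWS. Endpoint and credentials are placeholders.
    const s3 = new AWS.S3({
      endpoint: "http://localhost:9000",  // e.g. a local MinIO instance
      s3ForcePathStyle: true,             // most local S3 clones need path-style URLs
      accessKeyId: "dev-access-key",
      secretAccessKey: "dev-secret-key",
    });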
There's a service called JuiceFS that can do what you want.
According to their documentation:
JuiceFS is a POSIX-compatible shared filesystem specifically designed to work in the cloud. It is designed to run in the cloud so you can utilize the cheap price of object storage service to store your data economically. It is a POSIX-compatible filesystem so you can access your data seamlessly as accessing local files. It is a shared filesystem so you can share your files across multiple machines.
S3 is one of the supported backends; you can even configure it to replicate files to a different object storage system on another cloud.

Resources