Bidirectional synchronisation between an Amazon S3 bucket and a physical server - Linux

We have a folder on a physical server and need to synchronise it with one of our AWS S3 buckets. The requirement is that the contents must be synchronised both ways (changes made on the physical server should be reflected in the AWS S3 bucket, and vice versa). Is this possible?

Use the AWS CLI aws s3 sync command. Note that sync is one-way, so you have to issue two separate commands, switching source and target, to achieve bidirectional sync.
From local directory to S3
aws s3 sync . s3://mybucket
From S3 to local directory
aws s3 sync s3://mybucket .
Running both commands gives you both directions of the sync.
As pointed out in the comments, each time you modify S3 or your local folder you need to sync in the opposite direction, or you risk overwriting updated files later.
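For illustration, here is a minimal sketch of wrapping the two directions in a scheduled script; it assumes the AWS CLI is installed and configured, and the local path and bucket name are placeholders.

import subprocess

LOCAL_DIR = "/data/shared"   # hypothetical local folder
BUCKET = "s3://mybucket"     # bucket from the commands above

def sync_both_ways():
    # Push local changes up first, then pull remote changes down.
    # Leave out --delete unless deletions should propagate as well.
    subprocess.run(["aws", "s3", "sync", LOCAL_DIR, BUCKET], check=True)
    subprocess.run(["aws", "s3", "sync", BUCKET, LOCAL_DIR], check=True)

if __name__ == "__main__":
    sync_both_ways()

Run it from cron (or a systemd timer) as often as your tolerance for stale files allows.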

There are products that do this: the open-source ownCloud and Nextcloud run on S3 and a local computer and can sync two folders as a two-way, near-live mirror à la Dropbox. Resilio Sync uses BitTorrent to do fast two-way mirrors and can also run on S3.

Related

Is there a way to synchronise my local hard-disk folders to S3

I am trying to sync my local hard-disk folders to an S3 bucket; the thing is that the local folders are spread across a few drives like C:\, D:\ and so on...
For example, the S3 bucket includes directories 'RD1' to 'RD80', while locally C:\ holds 'RD1' to 'RD12', D:\ contains 'RD12' to 'RD20', and so on...
Is there any way to use the AWS CLI sync command to accomplish this?
I wrote a Python script that compares the two backups, but I would prefer to use the sync command and keep the synchronisation under permanent control.
Thanks a lot,
best regards.
The AWS CLI aws s3 sync command will synchronize files from one location to another location, including subdirectories.
If you wish to synchronize multiple directories to different locations, then you will need to run the sync command multiple times.
Also, please note that the sync command is one-way, either from your local computer to S3 or from S3 to the local computer (or S3 to S3). If you want to sync in 'both directions', then you would need to execute the sync command both ways. This is especially important for handling deleted files, which are only deleted in the destination location if you use the --delete option.
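As an illustration of running the sync multiple times, here is a minimal Python sketch that loops over a drive-to-destination mapping and shells out to the AWS CLI; the drive paths and bucket name below are placeholders based on the RD1-RD80 example.

import subprocess

# Map each local drive folder to its destination in the bucket; adjust to taste.
MAPPINGS = [
    (r"C:\backup", "s3://mybucket"),   # hypothetical folder holding RD1..RD12
    (r"D:\backup", "s3://mybucket"),   # hypothetical folder holding RD12..RD20
]

for local_path, dest in MAPPINGS:
    # One sync invocation per drive; add --delete only if local removals
    # should also be removed from the bucket.
    subprocess.run(["aws", "s3", "sync", local_path, dest], check=True)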

Move files from S3 to AWS EFS on the fly

I am building an application where users can upload images; I am using S3 as the file storage.
In another area of the application there is a process deployed on EC2 that needs to use the uploaded images.
This process needs the images multiple times (it generates reports from them) and runs on multiple EC2 instances via Elastic Beanstalk.
The process doesn't need all the images at once, but needs some subset of them for every job it gets (depending on the job's parameters).
Every EC2 instance does an independent job; they do not share files between them, but they might need the same uploaded images.
What I am doing now is downloading all the images from S3 to the EC2 machine, because the process needs the files locally.
I have read that EFS can be mounted on an EC2 instance and then accessed as if it were local storage.
I did not find any example of uploading directly to EFS with Node.js (or another language), but I found a way to transfer files from S3 to EFS: DataSync.
https://docs.aws.amazon.com/efs/latest/ug/transfer-data-to-efs.html
So I have 3 questions about it:
Is it true that I can't upload directly to EFS from my application (Node.js + Express)?
After I move files to EFS, will I be able to use them exactly as if they were in the EC2 instance's local storage?
Is it a good idea to move files from S3 to EFS all the time, or is there another solution to the problem I described?
For this exact situation, we use https://github.com/kahing/goofys
It's very reliable and also offers the ability to mount S3 buckets as folders on any device: Windows, Mac, and of course Linux.
It works outside of the AWS cloud 'boundary' too, which is great for developer laptops.
The downside is that it does not work in a Lambda context, but you can't have everything!
Trigger a Lambda to launch an ECS task when the file is uploaded to S3. The ECS task starts, mounts the EFS volume, and copies the file from S3 to EFS.
This avoids the problem of Lambda timing out on really large files.
I don't have the code, but I would be interested if someone has already coded this solution.
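Not the answerer's actual code, but a minimal sketch of what the triggering Lambda could look like in Python/boto3, assuming an ECS Fargate cluster, task definition, container name, and subnet that you would create separately (all of those names below are placeholders). The task itself would mount EFS through its task definition and copy the object identified by the environment variables.

import boto3

ecs = boto3.client("ecs")

def handler(event, context):
    # Pull the bucket and key out of the S3 put-event that triggered us.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Launch the (hypothetical) copy task; it mounts EFS via its task
    # definition and copies s3://bucket/key onto the file system.
    ecs.run_task(
        cluster="my-cluster",                # placeholder
        taskDefinition="s3-to-efs-copy",     # placeholder
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-0123456789abcdef0"],   # placeholder
                "assignPublicIp": "DISABLED",
            }
        },
        overrides={
            "containerOverrides": [
                {
                    "name": "copy-container",   # placeholder
                    "environment": [
                        {"name": "SRC_BUCKET", "value": bucket},
                        {"name": "SRC_KEY", "value": key},
                    ],
                }
            ]
        },
    )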

Download a SQLite file stored in an S3 bucket through Node

I have a SQLite file stored in an S3 bucket. I want to download it to my local machine or access it directly. How can I do that?
You can mount your S3 bucket on your local machine: https://code.google.com/archive/p/s3fs/wikis/InstallationNotes.wiki
But why would you want to do that? If you are only going to read data, this setup will work fine; but if you are going to write data back via multiple channels, each channel will use its own copy of the file and you will lose concurrency.
The best approach would be to clone it to an EBS volume and write a script that backs it up to S3 as a SQLite file.
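If a one-off download to the local machine is enough, the SDK call is a single line; here is a minimal sketch using Python/boto3 for illustration (the same flow applies in the Node.js SDK), with the bucket and key names as placeholders.

import sqlite3
import boto3

s3 = boto3.client("s3")

# Placeholder bucket and key; point these at where the SQLite file actually lives.
s3.download_file("my-bucket", "path/to/database.sqlite", "database.sqlite")

# The downloaded copy can now be opened like any local SQLite file.
conn = sqlite3.connect("database.sqlite")
print(conn.execute("select name from sqlite_master").fetchall())
conn.close()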

Update the ID3 tags of S3 bucket files

In my AWS S3 bucket I have thousands of MP3 files and I want to modify the ID3 tags for those files. Please suggest the best way.
Sorry to give you the bad news, but the only way is to download the files one by one, update the ID3 tags, and upload them back to the S3 bucket. You cannot edit files in place, because AWS S3 is object storage: it keeps data as key/value pairs, where the key is the folder/filename and the value is the file content. It is not suitable as a file system, database, etc.
If you do it this way, one warning: check whether versioning is on or off for your bucket. It can be convenient to have versioning handled automatically by S3, but remember that each version adds to the storage space you are paying for.
If you want to edit/modify your files every now and then, you can use AWS EBS or EFS instead. EBS is block storage and EFS is a network file system; you can attach either to an EC2 instance and then edit/modify your files there. The main difference is that EFS can be mounted on multiple EC2 instances at the same time and share files between them.
One more thing about EBS and EFS though: to reach your files, you need to attach them to an EC2 instance. There is no way to reach the files as easily as in S3.
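A minimal sketch of the download/retag/upload loop described above, using Python with boto3 and the mutagen library; the bucket name, prefix, and the tag being changed are placeholders, and files without existing ID3 headers would need extra handling.

import boto3
from mutagen.easyid3 import EasyID3

s3 = boto3.client("s3")
BUCKET = "my-audio-bucket"   # placeholder

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix="mp3/"):   # placeholder prefix
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if not key.endswith(".mp3"):
            continue
        local = "/tmp/track.mp3"
        s3.download_file(BUCKET, key, local)   # download the object

        tags = EasyID3(local)                  # edit the ID3 tags locally
        tags["album"] = "New Album Name"       # placeholder change
        tags.save()

        s3.upload_file(local, BUCKET, key)     # overwrite the object in S3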

When hosting on EC2, should I use FS to store files "locally" or s3fs to store files on my S3 service "indirectly"?

I'm hosting a Node.js Express application on EC2, and I'm using Amazon's S3 storage service.
Within my application, hosted on Amazon, should I write the files locally (since the server is already running on AWS), or should I still use the s3fs package to store the files on the S3 service as if I were on a remote machine?
Thanks all!
Don't use s3fs. It's nice, but if you try to use it in production, it will be a nightmare. S3FS has to 'translate' any AWS errors into the very limited set that a filesystem can return. It also can't give you fine-grained control of retries, etc.
It's much better to write code to interact with S3. You will be able to get the full error from S3, you can decide what your retry policy is, etc.
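For illustration, a minimal sketch of talking to S3 directly with an explicit retry policy and full error details; it uses Python/boto3 rather than the question's Node.js, but the AWS SDK for JavaScript exposes the same kind of configuration. The bucket name is a placeholder.

import boto3
from botocore.config import Config
from botocore.exceptions import ClientError

# An application-controlled retry policy, instead of whatever a FUSE layer
# like s3fs would silently do on your behalf.
s3 = boto3.client(
    "s3",
    config=Config(retries={"max_attempts": 5, "mode": "standard"}),
)

def save_upload(data: bytes, key: str) -> None:
    try:
        s3.put_object(Bucket="my-app-uploads", Key=key, Body=data)   # placeholder bucket
    except ClientError as err:
        # The full, structured error from S3 is available here.
        print(err.response["Error"]["Code"], err.response["Error"]["Message"])
        raise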
