AWS Beanstalk NodeJS app - what happens when I save to the file system?

I have an app that currently saves some data as a file to the file system. On a self-hosted server it saves it to disk. When I deploy it to the AWS Elastic Beanstalk service, where will this file end up? Does AWS use a persistent or an ephemeral file system?
My use case is very simple and I don't want to bother with setting up S3 storage; is it possible to just leave it as is? Can I access the file system somehow?

Underneath the Beanstalk wrapper there are EC2 instances running your code, so if you use file-system storage the file will be saved on the instance's disk/attached volume. Volume data is persistent, and you can SSH into the EC2 instance to find your saved file.
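As a minimal sketch of the write side (the directory below is an assumption, not a Beanstalk convention; the app user needs write access to it, and the app itself is deployed under /var/app/current), you can then SSH into the instance (e.g. with eb ssh) and look for the file at that path:

```js
// Minimal sketch: write a file to the instance's volume from the Node.js app.
// "/var/app/data" is an assumed directory - pick any path the app user can write to.
const fs = require("fs/promises");
const path = require("path");

const DATA_DIR = "/var/app/data"; // assumed location on the attached volume

async function saveReport(name, contents) {
  await fs.mkdir(DATA_DIR, { recursive: true }); // create the directory if it's missing
  const file = path.join(DATA_DIR, name);
  await fs.writeFile(file, contents, "utf8");
  return file; // e.g. "/var/app/data/report.json"
}

// usage: saveReport("report.json", JSON.stringify({ ok: true }));
```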

Related

How to automatically upload ASP.NET Core crash dumps to an Amazon S3 bucket?

We have an ASP.NET Core 3.1 application running in Amazon EC2 instances with Amazon Linux 2 (RHEL based).
Periodically our application crashes with an 11/SEGV status (segmentation fault), so we enabled minidump generation with an environment variable (COMPlus_DbgEnableMiniDump), as documented here.
As multiple instances of the application run simultaneously within an Auto Scaling group, it's hard to keep track of the crashes. I need to know if there is a tool or recommended way of logging each of these crashes and uploading the generated minidump file to an S3 bucket, so we can easily retrieve and analyze them in our development environment.
Any recommendations?
Thank you!
Sorry that I am late to this conversation. Hopefully, you have found a solution to this by now.
Adding my thoughts here to help anyone else with a similar challenge.
I can think of a couple of solutions:
Since the application is running on a Linux instance, you could consider saving the crash dumps to an EFS file system. Register a lifecycle hook handler on the ASG and raise an SNS notification capturing the necessary details of the crash dump file.
Option 1: Deploy a process as a side-car that responds to the notification and moves the dump file to the S3 bucket. Please note that the dump file will be moved by the process running on the new instance (or other instances) spun up by the ASG.
Option 2: Deploy the process responsible for moving the dump files to S3 on a dedicated EC2 instance, and attach the same EFS file system used by the actual service instances.
Option 3: Create a Lambda with the required permissions to access the EFS access points (a sketch of this option follows below).
Refer to AWS EC2 Lifecycle Hooks
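For Option 3, a rough sketch of such a Lambda (Node.js, AWS SDK v3) might look like the following. The bucket name, the EFS access point mount path, and the shape of the SNS message are all assumptions here, not part of any AWS convention:

```js
// Hypothetical Lambda handler subscribed to the SNS topic raised by the lifecycle
// hook handler. It reads the dump from the EFS access point mounted into the
// Lambda (assumed at /mnt/dumps) and copies it to S3.
const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");
const fs = require("fs");
const path = require("path");

const s3 = new S3Client({});
const BUCKET = process.env.DUMP_BUCKET; // assumed environment variable
const EFS_MOUNT = "/mnt/dumps";         // assumed EFS access point mount path

exports.handler = async (event) => {
  for (const record of event.Records) {
    // Assumed message shape: { "dumpFile": "core.12345", "instanceId": "i-0abc..." }
    const msg = JSON.parse(record.Sns.Message);
    const localPath = path.join(EFS_MOUNT, msg.dumpFile);

    await s3.send(new PutObjectCommand({
      Bucket: BUCKET,
      Key: `${msg.instanceId}/${msg.dumpFile}`,
      Body: fs.createReadStream(localPath),
      ContentLength: fs.statSync(localPath).size, // PutObject needs a known length for streams
    }));
  }
};
```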

Move files from S3 to AWS EFS on the fly

I am building an application where the user can upload images; for this I am using S3 as the file storage.
In another area of the application there is a process deployed on EC2 that needs to use the uploaded images.
This process needs the images multiple times (it generates some reports with them) and it runs on multiple EC2 instances - using Elastic Beanstalk.
The process doesn't need all the images at once, but needs some subset of them for every job it gets (depending on the parameters it gets).
Every EC2 instance is doing an independent job - they are not sharing files between them, but they might need the same uploaded images.
What I am doing now is downloading all the images from S3 to the EC2 machine because it needs the files locally.
I have read that EFS can be mounted on an EC2 instance and then I can access it as if it were local storage.
I did not find any example of uploading directly to EFS with Node.js (or another language), but I found a way to transfer files from S3 to EFS - "DataSync".
https://docs.aws.amazon.com/efs/latest/ug/transfer-data-to-efs.html
So I have 3 questions about it:
Is it true that I can't upload directly to EFS from my application (Node.js + Express)?
After I move files to EFS, will I be able to use them exactly like files in the local storage of the EC2 instance?
Is it a good idea to move files from S3 to EFS all the time, or is there another solution to the problem I described?
For this exact situation, we use https://github.com/kahing/goofys
It's very reliable and, additionally, offers the ability to mount S3 buckets as folders on any device - Windows and Mac as well as, of course, Linux.
It works outside of the AWS cloud 'boundary' too - great for developer laptops.
The downside is that it does /not/ work in a Lambda context, but you can't have everything!
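To illustrate the idea (the mount point below is an assumption; the bucket can be mounted wherever you like), the application code then just treats the bucket as a local directory:

```js
// With goofys mounting the uploads bucket at /mnt/uploads (assumed path),
// the Node.js process reads it like any local folder - the reads are
// actually served from S3 behind the scenes.
const fs = require("fs/promises");

async function readUploadedImage(name) {
  return fs.readFile(`/mnt/uploads/${name}`);
}
```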
Trigger a Lambda to call an ECS task when the file is uploaded to S3. The ECS task starts, mounts the EFS volume, and copies the file from S3 to the EFS.
This won't run into problems with Lambda timing out on really large files.
I don't have the code, but I would be interested if someone has already coded this solution.
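In case it helps anyone sketching this out, a Lambda along those lines (Node.js, AWS SDK v3) might look roughly like this; the cluster name, task definition, subnet, and container name are placeholders, and the ECS task itself (which mounts EFS and does the copy) is not shown:

```js
// Hypothetical Lambda wired to the S3 "ObjectCreated" event. It starts a Fargate
// task whose task definition mounts the EFS volume; the task copies the object
// from S3 to EFS using the bucket/key passed in as environment overrides.
const { ECSClient, RunTaskCommand } = require("@aws-sdk/client-ecs");

const ecs = new ECSClient({});

exports.handler = async (event) => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    await ecs.send(new RunTaskCommand({
      cluster: "copy-cluster",                   // placeholder cluster name
      taskDefinition: "s3-to-efs-copy",          // placeholder task definition
      launchType: "FARGATE",
      networkConfiguration: {
        awsvpcConfiguration: {
          subnets: ["subnet-0123456789abcdef0"], // placeholder subnet in the EFS VPC
          assignPublicIp: "ENABLED",
        },
      },
      overrides: {
        containerOverrides: [{
          name: "copier",                        // placeholder container name
          environment: [
            { name: "SRC_BUCKET", value: bucket },
            { name: "SRC_KEY", value: key },
          ],
        }],
      },
    }));
  }
};
```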

File read/write on cloud (Heroku) using node.js

First of all, I am a beginner with node.js.
In node.js, when I use functions such as fs.writeFile(), the file is created and is visible in my repository. But when this same process runs on a cloud platform such as Heroku, no file is visible in the repository (cloned via git). I know the file is being made because I am able to read it, but I cannot view it. Why is this? Also, how can I view the file?
I had the same issue, and found out that Heroku and other cloud services generally prefer that you don't write to their file system; everything you write/save is stored in an "ephemeral filesystem" - it's like a ghost file system, really.
Usually you would want to use Amazon S3 or Redis for JSON files etc., and also for other, bigger files like mp3s.
It might also work if you rent a remote server, like ECS, with a Linux system and a mounted storage space.
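As a sketch of the S3 approach (AWS SDK v3; the bucket name and region are assumptions, and credentials come from the usual AWS environment variables or config), the fs.writeFile()/readFile pattern translates to something like:

```js
// Rough S3 replacement for writing/reading a JSON file on Heroku's
// ephemeral filesystem.
const { S3Client, PutObjectCommand, GetObjectCommand } = require("@aws-sdk/client-s3");

const s3 = new S3Client({ region: "us-east-1" }); // assumed region
const BUCKET = "my-app-files";                    // assumed bucket name

async function writeJson(key, data) {
  await s3.send(new PutObjectCommand({
    Bucket: BUCKET,
    Key: key,
    Body: JSON.stringify(data),
    ContentType: "application/json",
  }));
}

async function readJson(key) {
  const res = await s3.send(new GetObjectCommand({ Bucket: BUCKET, Key: key }));
  return JSON.parse(await res.Body.transformToString());
}
```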

Transfer a file from an EC2 Linux instance to a Windows system

I have a scheduled task on an EC2 Linux system that generates a file daily. Now I want to transfer this file to another Windows machine that is not on AWS.
I want to schedule this job on the EC2 instance only. I don't want to download it from the target machine; I want to upload it from the EC2 instance.
I have tried below command:
scp -i (Key File for my current EC2 instance) (IP of target):(Local file path of current EC2 instance) D:\TEMP(Target Path of window machine)
I am getting:
ssh: connect to host (IP of target) port 22: Connection refused
We already have functionality to store the file in S3, but it depends on the task on the EC2 instance. (Sometimes it takes 1 hour and sometimes it takes 4 hours; that's why I want it at the end of this task.)
The error you're receiving is most likely caused by an incorrect firewall setting on the EC2 Security Group in front of your EC2 instance, or, on your Windows server's network.
I would suggest using an Amazon S3 bucket: upload the file from your EC2 instance into S3, and the file can then be collected by your Windows machine as a scheduled job. You could delete the file from S3 after Windows has downloaded it, or use a lifecycle policy to automatically delete the saved files after a certain time.
This will remove the need to open SSH to your EC2 instance, and also enable you to save the file in S3 so that you can re-download it if you need it again.
I don't know what technology stack you're using, but you could start using Amazon S3 on both servers with the AWS CLI, or an SDK for your preferred programming language.
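For example, the Windows-side scheduled job could be a small Node.js script (AWS SDK v3; the bucket name, key pattern, and the D:\TEMP target are assumptions) that downloads the day's file and then removes it:

```js
// Sketch of the Windows-side scheduled job: pull today's file from S3 into
// D:\TEMP and delete it from the bucket afterwards (or rely on a lifecycle rule).
const { S3Client, GetObjectCommand, DeleteObjectCommand } = require("@aws-sdk/client-s3");
const { createWriteStream } = require("fs");
const { pipeline } = require("stream/promises");

const s3 = new S3Client({ region: "us-east-1" });                   // assumed region
const BUCKET = "daily-task-output";                                 // assumed bucket
const KEY = `reports/${new Date().toISOString().slice(0, 10)}.csv`; // assumed key pattern

async function fetchDailyFile() {
  const res = await s3.send(new GetObjectCommand({ Bucket: BUCKET, Key: KEY }));
  await pipeline(res.Body, createWriteStream(`D:\\TEMP\\${KEY.split("/").pop()}`));
  await s3.send(new DeleteObjectCommand({ Bucket: BUCKET, Key: KEY }));
}

fetchDailyFile().catch((err) => { console.error(err); process.exit(1); });
```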

When hosting on EC2, should I use FS to store files "locally" or s3fs to store files on my s3 service "indirectly"?

I'm hosting a node.js Express application on EC2, and I'm using Amazon's S3 storage service.
Within my application, hosted on Amazon, should I write the files locally (since the server is already running on AWS), or should I still use the s3fs package to store the files on the S3 service as if I were on a remote machine?
Thanks all!
Don't use s3fs. It's nice, but if you try to use it in production, it will be a nightmare: s3fs has to 'translate' every AWS error into the very limited set that a filesystem can return, and it can't give you fine-grained control over retries, etc.
It's much better to write code that interacts with S3 directly. You will get the full error from S3, you can decide what your retry policy is, etc.
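As a sketch of what interacting with S3 directly looks like with the AWS SDK v3 (the bucket/key names and the retry count are illustrative):

```js
// Talking to S3 directly instead of through s3fs: you choose the retry policy
// and you see the full service error rather than a generic filesystem errno.
const { S3Client, GetObjectCommand } = require("@aws-sdk/client-s3");

const s3 = new S3Client({ maxAttempts: 5 }); // your retry policy, not the filesystem's

async function loadUserImage(key) {
  try {
    const res = await s3.send(new GetObjectCommand({ Bucket: "my-uploads", Key: key }));
    return Buffer.from(await res.Body.transformToByteArray());
  } catch (err) {
    // Full S3 error: err.name ("NoSuchKey", "AccessDenied", ...), HTTP status, request id, ...
    console.error("S3 read failed:", err.name, err.$metadata?.httpStatusCode);
    throw err;
  }
}
```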
