Transfer a file from EC2 linux instance to a Windows system - linux

I have a scheduled task on an EC2 Linux system that generates a file daily. Now I want to transfer this file from the EC2 instance to another Windows machine that is not on AWS.
I want to schedule this job on the EC2 instance only. I don't want to download the file from the target machine; I want to push (upload) it from EC2.
I have tried the below command:
scp -i (key file for my current EC2 instance) (IP of target):(local file path on current EC2 instance) D:\TEMP(target path on Windows machine)
I am getting:
ssh: connect to host (IP of target) port 22: Connection refused
We already have functionality to store the file in S3, but it depends on the task on the EC2 instance (sometimes it takes 1 hour and sometimes 4 hours), which is why I want the transfer to run at the end of that task.

The error you're receiving is most likely caused by an incorrect firewall setting, either in the EC2 Security Group in front of your EC2 instance or on your Windows server's network.
I would suggest using an Amazon S3 bucket: upload the file from your EC2 instance into S3, where it can wait to be collected by a scheduled job on your Windows machine. You could delete the file from S3 after Windows has downloaded it, or use a lifecycle policy to automatically delete the saved files after a certain time.
This removes the need to open SSH to your EC2 instance, and it also keeps the file in S3 so that you can re-download it if you need it again.
I don't know what technology stack you're using, but you can work with Amazon S3 on both servers using the AWS CLI or an SDK for your preferred programming language.
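A minimal sketch of that flow with the AWS CLI; the bucket name, file paths, and the idea of appending the upload to the end of the existing task are placeholders and assumptions:

# On the EC2 instance, as the last step of the scheduled task:
# upload the generated file to S3 (bucket and paths are placeholders).
aws s3 cp /home/ec2-user/output/daily-report.csv s3://my-transfer-bucket/daily/daily-report.csv

# On the Windows machine (the AWS CLI runs there too), a scheduled job pulls it down:
aws s3 cp s3://my-transfer-bucket/daily/daily-report.csv D:\TEMP\daily-report.csv

# Optionally delete the object after download, or rely on a lifecycle policy instead.
aws s3 rm s3://my-transfer-bucket/daily/daily-report.csv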

Related

Move files from S3 to AWS EFS on the fly

I am building an application where users can upload images; I am using S3 as the file storage.
In another area of the application there is a process deployed on EC2 that needs to use the uploaded images.
This process needs the images multiple times (it generates reports with them), and it runs on multiple EC2 instances - using Elastic Beanstalk.
The process doesn't need all the images at once, but it needs some subset of them for every job it gets (depending on the parameters it receives).
Every EC2 instance does an independent job - they don't share files between them, but they might need the same uploaded images.
What I am doing now is downloading all the images from S3 to the EC2 machine, because the process needs the files locally.
I have read that EFS can be mounted to an EC2 instance and then accessed as if it were local storage.
I did not find any example of uploading directly to EFS with Node.js (or another language), but I found a way to transfer files from S3 to EFS - "DataSync".
https://docs.aws.amazon.com/efs/latest/ug/transfer-data-to-efs.html
So I have 3 questions about it:
Is it true that I can't upload directly to EFS from my application? (Node.js + Express)
After I move files to EFS, will I be able to use them exactly as if they were in the local storage of the EC2 instance?
Is it a good idea to move files from S3 to EFS all the time, or is there another solution to the problem I described?
For this exact situation, we use https://github.com/kahing/goofys
It's very reliable, and it additionally offers the ability to mount S3 buckets as folders on any device - Windows, Mac, and of course Linux.
It works outside of the AWS cloud 'boundary' too - great for developer laptops.
The downside is that it does /not/ work in a Lambda context, but you can't have everything!
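For reference, a minimal sketch of what such a mount could look like on one of the Linux instances; the bucket name and mount point are placeholders, and the exact flags are best taken from the goofys README:

# mount the uploads bucket as an ordinary directory (goofys is a single binary)
mkdir -p /mnt/uploaded-images
goofys my-uploads-bucket /mnt/uploaded-images

# the report process can then read images via normal file paths
ls /mnt/uploaded-images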
Trigger a Lambda to call an ECS task when the file is uploaded to S3. The ECS task starts, mounts the EFS volume, and copies the file from S3 to the EFS.
This won't run into the timeout problems Lambda has with really large files.
I don't have the code, but I would be interested if someone has already built this solution.
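A rough sketch of the copy step such an ECS task could run, assuming the task definition already mounts the EFS file system at /mnt/efs and the Lambda passes the bucket and key in as environment variables (all names here are hypothetical):

#!/bin/sh
# BUCKET and KEY are injected by the Lambda via container overrides (assumed names).
set -e
aws s3 cp "s3://${BUCKET}/${KEY}" "/mnt/efs/${KEY}"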

AWS Beanstalk NodeJS app - what happens when I save to file system?

I have an app that currently saves some data as a file to the file system. On a self-hosted server it saves to disk. When I deploy it to the AWS Beanstalk service, where will this file end up? Does AWS use a persistent or ephemeral file system?
My use case is very simple and I don't want to bother with setting up S3 storage; is it possible to just leave it as is? Can I access the file system somehow?
Underneath the Beanstalk wrapper there are EC2 instances running your code. So if you use file-system storage, the file will be saved on the instance's disk/attached volume. Volume data is persistent, and you can SSH into the EC2 instance to find your saved file.
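For example, with the EB CLI you could connect to an instance and look for the file; the environment name is a placeholder, and /var/app/current is where Elastic Beanstalk typically deploys the application code:

eb ssh my-environment
# once on the instance, the deployed app usually lives under /var/app/current
ls -l /var/app/current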

How do you manage provisioning Packer EC2 Linux instances that require hostname entries reliably?

I have a Packer EC2 instance where part of the provisioning entails updating the /etc/hosts file for the instance. Among these entries is one for the currently running machine, written in ip-00-00-00 format.
If you do this in Packer, the AMI is saved, and launching it again results in a new hostname/IP being assigned, so the old hostname entry is irrelevant. The hostname is used by an internal application that relies on its hosts entry, as well as by an Oracle client. For the Oracle client, there is an ORACLE_HOSTNAME environment variable that can be set.
So how do you manage such a process, where you're building an AMI that requires dynamic changes to its hosts file?
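For illustration, the dynamic part would have to run at boot rather than at bake time; a rough sketch using the EC2 instance metadata service (the script location and hosts-file format are assumptions, and instances that enforce IMDSv2 would additionally need a session token):

#!/bin/sh
# Run at first boot (e.g. from user data or a systemd unit), not during the Packer build.
IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
HOST=$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)
echo "${IP} ${HOST}" >> /etc/hosts
# ORACLE_HOSTNAME could be refreshed the same way before the Oracle client starts.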

Run sync script on AWS ec2 triggered by write

I have an EC2 instance running and have set it up to accept SFTP writes (I have to use SFTP, unfortunately, so I am aware of better solutions but I can't use them). I have an S3 bucket mounted, but I ran into an issue with allowing SFTP writes directly into the bucket. My workaround is to run
aws s3 sync <directory> s3://<s3-bucket-name>/
And this works. My problem is that I don't know how to run this script automatically; I would prefer to run it whenever there is a write to a specified directory, but I will settle for running it at regular intervals.
So essentially my question is: "How do I fire a script automatically on an AWS EC2 instance running Linux?"
Thanks.
Use inotifywait as a file watcher, or use a cron job to kick off your S3 sync script at a regular interval.
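A minimal sketch of both options; the upload directory, bucket name, and schedule are placeholders, and inotifywait comes from the inotify-tools package:

#!/bin/sh
# Option 1: watch the SFTP upload directory and sync after every completed write.
inotifywait -m -r -e close_write,moved_to --format '%w%f' /home/sftpuser/uploads |
while read -r changed; do
    aws s3 sync /home/sftpuser/uploads s3://my-s3-bucket/
done

# Option 2: a crontab entry that syncs every 5 minutes instead:
# */5 * * * * aws s3 sync /home/sftpuser/uploads s3://my-s3-bucket/ >/dev/null 2>&1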

Backup and Decommission Instance stores in AWS

I have inherited some Instance Store-backed Linux AMIs that need to be archived and terminated. We run a Windows & VMWare environment, so I have little experience with Linux & AWS.
I have tried using the Windows EC2 command line tools to export to a vhdk disk image, but receive an error stating that the instance must be EBS-backed to do so.
What's the easiest way to get a complete backup? Keep in mind that we have no plans to actually use the instances again, this is just for archival purposes.
Assuming you have running instance-store instances (and not just AMIs, which would mean you already have a backup), you can still create an AMI. It's not simple, and may not be worth the effort if you never plan to re-launch the instances, but the following page gives you a couple of options:
(1) create an instance-store backed AMI from a running instance
(2) subsequently create an EBS-backed AMI from the instance-store AMI
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/creating-an-ami-instance-store.html
You can also do a sync of the filesystem directly to S3 or attach an EBS volume and copy the files there.
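A rough sketch of the direct-to-S3 route, run on the instance itself; the bucket name and exclusions are placeholders:

# Sync the root filesystem to S3, skipping pseudo-filesystems and temporary data.
aws s3 sync / s3://my-archive-bucket/instance-backup/ --exclude "proc/*" --exclude "sys/*" --exclude "dev/*" --exclude "tmp/*"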
In the end, I used the dd command in combination with ssh to copy an image of each relevant drive to offline storage. Here is a summary of the process:
(1) SSH into the remote machine and run df -aTh to figure out which drives to back up
(2) Log out of ssh
(3) For each desired disk, run the following ssh command to create and download the disk image (changing the if= path to the desired disk): ssh root@[ipaddress] "dd if=/dev/sda1 | gzip -1 -" | dd of=outputfile.gz
(4) Wait for the image to fully download. You may want to watch your network usage and make sure an appropriate amount of incoming traffic occurs.
(5) Double-check the disk images for completeness and mountability
(6) Terminate the instance
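For reference, a small wrapper around step (3), assuming key-based root SSH access and an explicit list of devices to image (host and device names are placeholders):

#!/bin/sh
# Image each listed device over SSH and verify the compressed image afterwards.
HOST="root@[ipaddress]"
for dev in /dev/sda1 /dev/xvdb; do
    out="$(basename "$dev").img.gz"
    ssh "$HOST" "dd if=$dev | gzip -1 -" | dd of="$out"
    gzip -t "$out" && echo "$out looks intact"
done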
