Resuming interrupted s3 download with s3api - aws-cli

I want to resume an S3 file download within a Linux Docker container. I am using:
aws s3api get-object \
--bucket mybucket \
--key myfile \
--range "bytes=$size-" \
/dev/fd/3 3>>myfile
It seems /dev/fd/3 3>>myfile works on macOS and appends the next range of data to the existing file. However, when I try the same command on Linux, it replaces the original file with the next range of contents.
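A minimal workaround sketch for Linux, assuming GNU stat and a temporary part file (the part file name is arbitrary): fetch only the missing byte range into the part file, then append it to the original.
# bytes already downloaded
size=$(stat -c%s myfile)
# fetch only the remaining range into a temporary part file
aws s3api get-object \
  --bucket mybucket \
  --key myfile \
  --range "bytes=${size}-" \
  myfile.part
# append the new bytes and remove the part file
cat myfile.part >> myfile && rm myfile.part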

Related

How do I run a python script and files located in an aws s3 bucket

I have a Python script pscript.py which takes input parameters -c input.txt -s 5 -o out.txt. The files are all located in an AWS S3 bucket. How do I run it after creating an instance? Do I have to mount the bucket on the EC2 instance and execute the code, or use Lambda? I am not sure; reading so many AWS documentation pages is kind of confusing.
Command line run is as follows:
python pscript.py -c input.txt -s 5 -o out.txt
You should copy the file from Amazon S3 to the EC2 instance:
aws s3 cp s3://my-bucket/pscript.py .
You can then run your above command.
Please note that, to access the object in Amazon S3, you will need to assign an IAM Role to the EC2 instance. The role needs sufficient permission to access the bucket/object.
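For example, a minimal sketch (the bucket and file names come from the question and are only illustrative):
# copy the script and its input from S3 to the instance
aws s3 cp s3://my-bucket/pscript.py .
aws s3 cp s3://my-bucket/input.txt .
# run the script locally
python pscript.py -c input.txt -s 5 -o out.txt
# optionally copy the result back to the bucket
aws s3 cp out.txt s3://my-bucket/out.txt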

How to delete file after sync from EC2 to s3

I have a file system where files can be dropped into an EC2 instance, and I have a shell script running to sync the newly dropped files to an S3 bucket. I'm looking to delete the files off the EC2 instance once they are synced. Specifically, the files are dropped into the "yyyyy" folder.
Below is my shell code:
#!/bin/bash
inotifywait -m -r -e create "yyyyy" | while read -r NEWFILE
do
    if lsof | grep "$NEWFILE" ; then
        echo "$NEWFILE";
    else
        sleep 15
        aws s3 sync yyyyy s3://xxxxxx-xxxxxx/
    fi
done
Instead of using aws s3 sync, you could use aws s3 mv (which is a 'move').
This will copy the file to the destination, then delete the original (effectively 'moving' the file).
It can also be used with --recursive to move a whole folder, or with --include and --exclude to specify multiple files.
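For example, a sketch using the folder and bucket names from the question:
# copy every file under yyyyy to the bucket, then delete the local copies
aws s3 mv yyyyy s3://xxxxxx-xxxxxx/ --recursive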

AWS CLI for searching a file in s3 bucket

I want to search for a file named abc.zip in S3 buckets. There are nearly 60 buckets, and each bucket has 2 to 3 levels of subdirectories or folders. I tried to perform the search using the AWS CLI commands below, but even though the file exists in the bucket, the results are not being displayed for it.
aws s3api list-objects --bucket bucketname --region ca-central-1 \
--recursive --query "Contents[?contains(Key, 'abc.zip')]"
aws s3 ls --summarize --human-readable --recursive bucketname \
--region ca-central-1 | egrep 'abc.zip'
For all of the above commands I don't see the filename in the command-line output, yet when I manually check the bucket the file exists.
Is there any way I can find the file?
Hmm.
I used your command from #1 without --recursive, because that option throws "Unknown options: --recursive". The file I was searching for is on the second level of the bucket, and it was found. I also did not use --region.
My guess is that you are using an old version of the AWS CLI or pointing to an incorrect bucket.
My working command:
aws s3api list-objects --bucket XXXXX --query "Contents[?contains(Key, 'animate.css')]"
[
    {
        "LastModified": "2015-06-14T23:29:03.000Z",
        "ETag": "\"e5612f9c5bc799b8b129e9200574dfd2\"",
        "StorageClass": "STANDARD",
        "Key": "css/animate.css",
        "Owner": {
            "DisplayName": "XXXX",
            "ID": "XXXX"
        },
        "Size": 78032
    }
]
If you decide to upgrade your CLI client: https://github.com/aws/aws-cli/tree/master
The current version is awscli-1.15.77, which you can check with aws --version.
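For instance, one common upgrade path, assuming the CLI was originally installed with pip:
pip install --upgrade awscli
aws --version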
I tried it in the following way:
aws s3 ls s3://Bucket1/folder1/2019/ --recursive | grep filename.csv
This outputs the actual path where the file exists:
2019-04-05 01:18:35 111111 folder1/2019/03/20/filename.csv
Hope this helps!
I know this is ancient, but I found a way to do this without piping text to grep...
aws s3api list-objects-v2 --bucket myBucket --prefix 'myFolder' \
--query "Contents[*]|[?ends_with(Key,'jpg')].[Key]"
I think the previous answers are correct, but if you want to make this bucket-agnostic, you can use the script below. All you have to do is change the search_value variable at the top to what you are searching for and add your key ID and secret:
#!/bin/bash
export AWS_ACCESS_KEY_ID=your_key; export AWS_SECRET_ACCESS_KEY=your_secret;
search_value="3ds"
my_array=( `aws s3api list-buckets --query "Buckets[].Name" | grep \" | sed 's/\"//g' | sed 's/\,//g'` )
my_array_length=${#my_array[@]}
for element in "${my_array[@]}"
do
    echo "----- ${element}"
    aws s3 ls s3://"${element}" --recursive | grep -i $search_value
done
Warning: it will search every single bucket in your account, so be prepared for a long search.
It does a pattern search, so it will find any key that contains the value.
Lastly, this is a case-insensitive search (you can disable that by removing -i from the grep line).

AzCopy upload file for Linux

I'm trying to upload a sample file to Azure from my Ubuntu machine using AzCopy for Linux, but I keep getting the below error no matter what permissions/ownership I change to.
$ azcopy --source ../my_pub --destination https://account-name.blob.core.windows.net/mycontainer --dest-key account-key
Incomplete operation with same command line detected at the journal directory "/home/jmis/Microsoft/Azure/AzCopy", do you want to resume the operation? Choose Yes to resume, choose No to overwrite the journal to start a new operation. (Yes/No) Yes
[2017/11/18 22:06:24][ERROR] Error parsing source location "../my_pub": Failed to enumerate directory /home/jmis/my_pub/ with file pattern *. Cannot find the path '/home/jmis/my_pub/'.
I have dug around the internet to find solutions; without having any luck, I eventually ended up asking a question here.
Although AzCopy was having issues on Linux, I was able to do the above operation seamlessly with the Azure CLI. The below code, listed in the Azure docs, helped me do it:
#!/bin/bash
# A simple Azure Storage example script
export AZURE_STORAGE_ACCOUNT=<storage_account_name>
export AZURE_STORAGE_ACCESS_KEY=<storage_account_key>
export container_name=<container_name>
export blob_name=<blob_name>
export file_to_upload=<file_to_upload>
export destination_file=<destination_file>
echo "Creating the container..."
az storage container create --name $container_name
echo "Uploading the file..."
az storage blob upload --container-name $container_name --file $file_to_upload --name $blob_name
echo "Listing the blobs..."
az storage blob list --container-name $container_name --output table
echo "Downloading the file..."
az storage blob download --container-name $container_name --name $blob_name --file $destination_file --output table
echo "Done"
Going forward I will be using the cool Azure CLI, which is Linux-friendly and simple too.
We can use this script to upload a single file with AzCopy (Linux):
azcopy \
--source /mnt/myfiles \
--destination https://myaccount.file.core.windows.net/myfileshare/ \
--dest-key <key> \
--include abc.txt
Use --include to specify which file you want to upload. Here is an example, please check it:
root@jasonubuntu:/jason# pwd
/jason
root@jasonubuntu:/jason# ls
test1
root@jasonubuntu:/jason# azcopy --source /jason/ --destination https://jasondisk3.blob.core.windows.net/jasonvm/ --dest-key m+kQwLuQZiI3LMoMTyAI8K40gkOD+ZaT9HUL3AgVr2KpOUdqTD/AG2j+TPHBpttq5hXRmTaQ== --recursive --include test1
Finished 1 of total 1 file(s).
[2017/11/20 07:45:57] Transfer summary:
-----------------
Total files transferred: 1
Transfer successfully: 1
Transfer skipped: 0
Transfer failed: 0
Elapsed time: 00.00:00:02
root@jasonubuntu:/jason#
For more information about AzCopy on Linux, please refer to this link.

How to create instance from already uploaded VMDK image at S3 bucket

I have already uploaded my VMDK file to the S3 bucket using the following command:
s3cmd put /root/Desktop/centos-ldaprad.vmdk --multipart-chunk-size-mb=10 s3://xxxxx
Now I would like to create an AWS instance from the same VMDK available in the S3 bucket:
ec2-import-instance centos-ldaprad.vmdk -f VMDK -t t2.micro -a x86_64 -b xxxxx -o <XXXX_ACCESS_KEY_XXXX> -w <XXXX_SECRET_KEY_XXX> -p Linux --dont-verify-format -s 5 --ignore-region-affinity
But it looks in the present working directory for the source VMDK file. I would be really grateful if you could guide me on how to point the source VMDK at the bucket instead of a local source.
Does the --manifest-url url point to the S3 bucket? When I uploaded, I had no idea whether it created any such file. If it does create one, where would it be created?
Another thing: when creating the instance with the above ec2-import-instance, it searches for the VMDK in the present working directory and, if found, starts uploading. Is there any provision to upload in parts and also to resume in case of interruption?
It's not really the answer you were after, but I've attached the script I use to upload VMDKs and convert them to AMI images.
This uses ec2-resume-import, so you can restart it if an upload partially fails.
http://pastebin.com/bD8c3gQu
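As a rough sketch of the resume step only (the task ID shown is hypothetical; the real one is printed by the original ec2-import-instance run):
ec2-resume-import centos-ldaprad.vmdk \
  -t import-i-xxxxxxxx \
  -o <XXXX_ACCESS_KEY_XXXX> -w <XXXX_SECRET_KEY_XXX>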
It's worth pointing out that when I register the device I specify a block device mapping. This is because my images always include a separate boot partition and an LVM-based root partition.
--root-device-name /dev/sda1 -b /dev/sda=$SNAPSHOT_ID:10:true --region $REGION -a x86_64 --kernel aki-52a34525
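Put together as a full registration call, it looks roughly like this (a sketch only; the AMI name is made up, and $SNAPSHOT_ID/$REGION are set earlier in the script):
ec2-register \
  --name "my-lvm-image" \
  --root-device-name /dev/sda1 \
  -b /dev/sda=$SNAPSHOT_ID:10:true \
  --region $REGION \
  -a x86_64 \
  --kernel aki-52a34525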
