Amazonka: How to generate S3:// uri from Network.AWS.S3.Types.Object? - haskell

I've been using turtle to call "aws s3 ls" and I'm trying to figure out how to replace that with amazonka.
Absolute S3 URLs were central to how my program worked. I now know how to get objects and filter them, but I don't know how to convert an object to an S3 URL so I can integrate with my existing program.
I also came across the getFile function and tried downloading a file from S3. Perhaps I had something configured wrong, but it didn't seem like the S3 bucket and object key alone were enough to download a given file. If I'm wrong about that, I need to double-check my configuration.

Related

How to read and write data in spark via an S3 access point

I am attempting to use an S3 access point to store data in an S3 bucket. I have tried saving as I would if I had access to the bucket directly:
someDF.write.format("csv").option("header","true").mode("Overwrite")
.save("arn:aws:s3:us-east-1:000000000000:accesspoint/access-point/prefix/")
This returns the error
IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: "arn:aws:s3:us-east-1:000000000000:accesspoint/access-point/prefix/"
I haven't been able to find any documentation on how to do this. Are access points not supported? Is there a way to set up the access point as a custom data source?
Thank you
The problem is that you have provided the ARN instead of the S3 URL. The URL would be something like this (assuming accesspoint is the bucket name):
s3://accesspoint/access-point/prefix/
There is also a button in the AWS console: when you are viewing the object or prefix, Copy S3 URL is at the top right.
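As a rough sketch (not verified against the asker's setup), the same write using the S3-style URL rather than the ARN would look like this in PySpark; the bucket name, prefix, and DataFrame below are placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-write-example").getOrCreate()
someDF = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])  # placeholder DataFrame

# Use the s3:// (or s3a://, depending on your Hadoop S3 connector) URL form, not the ARN.
someDF.write.format("csv") \
    .option("header", "true") \
    .mode("overwrite") \
    .save("s3://accesspoint/access-point/prefix/")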

I am not able to read a .dat file from an S3 bucket using a Lambda function

I have been trying to read a .dat file from one S3 bucket, convert it into CSV, compress it, and put it into another bucket.
For opening and reading I am using the code below, but it is throwing the error No such file or directory:
with open(f's3://{my_bucket}/{filenames}', 'rb') as dat_file:
    print(dat_file)
The Python language does not natively know how to access Amazon S3.
Instead, you can use the boto3 AWS SDK for Python. See: S3 — Boto 3 documentation
You also have two choices about how to access the content of the file:
Download the file to your local disk using download_file(), then use open() to access the local file, or
Use get_object() to obtain a StreamingBody of the file contents
See also: Amazon S3 Examples — Boto 3 documentation
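A minimal sketch of both options, with the bucket name and key as placeholders:

import boto3

s3 = boto3.client('s3')
bucket = 'my-bucket'        # placeholder
key = 'path/to/file.dat'    # placeholder

# Option 1: download to local disk, then open the local copy
s3.download_file(bucket, key, '/tmp/file.dat')
with open('/tmp/file.dat', 'rb') as dat_file:
    data = dat_file.read()

# Option 2: stream the object body without writing a local file
response = s3.get_object(Bucket=bucket, Key=key)
data = response['Body'].read()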

HTML to PDF creation in AWS Lambda using Python

I am trying to create a PDF file that contains images and tables from HTML data in AWS Lambda using Python. I searched a lot on Google and didn't find any super cool solution. I tried some libraries locally (FPDF, pdfkit), but they don't work on AWS. Is there any simple tool to create a PDF and upload it to an S3 bucket? Thanks in advance.
You can use the reportlab PDF Python module. It is good for all the things you have asked for: you can add images, create tables, etc., and there are a lot of styling options available as well. You can find more about it here: https://www.reportlab.com/docs/reportlab-userguide.pdf
I am using this in production and it works pretty well for my use case, where I have to create an invoice. You can create the invoice in the /tmp directory and then upload it to S3.
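A minimal reportlab sketch along these lines, writing an invoice-style PDF to Lambda's /tmp directory (the file name and contents are placeholders):

from reportlab.lib.pagesizes import A4
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.platypus import SimpleDocTemplate, Paragraph, Table

# Build a simple document with a heading and a table.
doc = SimpleDocTemplate("/tmp/invoice.pdf", pagesize=A4)
styles = getSampleStyleSheet()
elements = [
    Paragraph("Invoice", styles["Heading1"]),
    Table([["Item", "Qty", "Price"],
           ["Widget", "2", "10.00"]]),
]
doc.build(elements)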
The pdfkit library works with AWS Lambda. pdfkit internally needs the wkhtmltopdf binary installed; you can add it as a Lambda layer. You can download the binaries from https://wkhtmltopdf.org/downloads.html.
Once you add the Lambda layer you can set the config path as follows:
config = pdfkit.configuration(wkhtmltopdf="/opt/bin/wkhtmltopdf")
# html is your HTML input string; from_file() works the same way with an input file path
pdfkit.from_string(html, "/tmp/output.pdf", configuration=config)
You can upload the file generated in your Lambda temp location to an S3 bucket using upload_file(). You can refer to https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.upload_file for how to upload to an S3 bucket.
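For the upload step, a minimal boto3 sketch (the bucket name and key are placeholders, and /tmp/output.pdf is the file written by pdfkit above):

import boto3

s3_client = boto3.client('s3')
# Upload the PDF generated in /tmp to the target bucket.
s3_client.upload_file('/tmp/output.pdf', 'my-bucket', 'reports/output.pdf')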

How to upload downloaded file to s3 bucket using Lambda function

I saw different questions/answers, but I could not find one that worked for me. Since I am really new to AWS, I need your help. I am trying to download a gzip file, load it into a JSON file, and then upload it to an S3 bucket using a Lambda function. I wrote the code to download the file and convert it to JSON, but I am having a problem uploading it to the S3 bucket. Assume that the file is ready as x.json. What should I do then?
I know it is really basic question but still help needed :)
This code will upload to Amazon S3:
import boto3
s3_client = boto3.client('s3', region_name='us-west-2')  # Change as appropriate
s3_client.upload_file('/tmp/foo.json', 'my-bucket', 'folder/foo.json')
Some tips:
In Lambda functions you can only write to /tmp/
There is a limit of 512 MB of storage in /tmp/
At the end of your function, delete the files (zip, json, etc.) because the container can be reused and you don't want to run out of disk space
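Putting those tips together, a minimal handler sketch (the bucket name, key, and JSON content are placeholders):

import json
import os
import boto3

s3_client = boto3.client('s3')

def lambda_handler(event, context):
    local_path = '/tmp/x.json'                 # Lambda functions can only write under /tmp/
    with open(local_path, 'w') as f:
        json.dump({'example': 'data'}, f)      # stand-in for the real JSON content

    s3_client.upload_file(local_path, 'my-bucket', 'folder/x.json')

    os.remove(local_path)                      # clean up: the container may be reused
    return {'status': 'uploaded'}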
If your Lambda has the proper permission to write a file into S3, then simply use the boto3 package, which is the AWS SDK for Python.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html
Be aware that if the Lambda is located inside a VPC, then it cannot access the public internet, and therefore cannot reach the S3 API endpoint that boto3 uses. Thus, you may require a NAT gateway to route the Lambda's traffic to the public internet.

How can TensorFlow read a file from an S3 byte stream?

I have built a deep learning model in TensorFlow for image recognition. It works when reading an image file from a local directory with the tf.read_file() method, but I now need TensorFlow to read the file from a byte stream coming from an Amazon S3 bucket, without storing the stream in a local directory.
You should be able to pass in the fully formed s3 path to tf.read_file(), like:
s3://bucket-name/path/to/file.jpeg, where bucket-name is the name of your S3 bucket and path/to/file.jpeg is where it's stored in your bucket. It seems possible you might be running into an access permissions issue, depending on whether your bucket is private. You can follow https://github.com/tensorflow/examples/blob/master/community/en/docs/deploy/s3.md to set up your credentials.
Is there an error you ran into when doing this?
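A hedged sketch of that approach, assuming a TensorFlow build with the S3 filesystem available (newer releases may need the tensorflow-io package); the credentials, region, bucket, and key below are placeholders:

import os
import tensorflow as tf

# Credentials for the S3 filesystem, as described in the linked guide.
# They can also come from ~/.aws/credentials or an instance role.
os.environ["AWS_ACCESS_KEY_ID"] = "..."        # placeholder
os.environ["AWS_SECRET_ACCESS_KEY"] = "..."    # placeholder
os.environ["AWS_REGION"] = "us-east-1"         # placeholder

# Read the object bytes directly from S3, then decode the image.
image_bytes = tf.io.read_file("s3://bucket-name/path/to/file.jpeg")
image = tf.io.decode_jpeg(image_bytes)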
