BOTO3 - Getting Access Denied when copying an S3 object

I am trying to copy from one bucket to another bucket, and each bucket has its own access key and secret.
I can connect to the first bucket and download a file just fine. It might be important to note that I do not have full access to the bucket I am copying from, meaning I cannot read all keys in the bucket, just a subset I have access to. I have complete control over the second bucket I am copying to.
client2 is where I am copying to and client is where I am copying from.
copy_source = {
    'Bucket': bucketName,
    'Key': key
}
client2.copy(CopySource=copy_source, Bucket=bucketName2, Key=key, SourceClient=client)
Here is the error I get:
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the UploadPartCopy operation: Access Denied
I am a newbie and any help would be greatly appreciated!!

The reason you're likely getting the Access Denied on this is that SourceClient is only used for getting the size of the object, to determine whether it can be copied directly or a multipart upload is required.
The actual copy is performed by the underlying copy_object method on the destination client, which does not accept a SourceClient and calls out to the S3 PUT Object - Copy API.
As such, if you want to be able to perform an S3 copy from one bucket to another, you can either give the user associated with the access key used by client2 permission to read from the source bucket, or you can perform an S3 Get using client and then an S3 Put with client2.
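If granting read access to client2's user isn't an option, a minimal sketch of the Get-then-Put fallback looks like this, reusing the variables from the question and assuming the object is small enough to hold in memory:
# Read the object with the source credentials, then write it with the
# destination credentials. For large objects, streaming with
# download_fileobj/upload_fileobj would be preferable to reading into memory.
response = client.get_object(Bucket=bucketName, Key=key)
client2.put_object(Bucket=bucketName2, Key=key, Body=response["Body"].read())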

What is the required permission to get s3 bucket creation date using boto3?

I'm trying to check if a bucket exists on s3 and have been following this link: https://stackoverflow.com/a/49817544/19505278
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket-name')
if bucket.creation_date:
    print("The bucket exists")
else:
    print("The bucket does not exist")
However, I'm unable to get this to work, likely due to a missing permission.
I was able to try this on a different S3 bucket and can verify it works there. However, it does not work on the bucket I'm actually dealing with, most likely because of missing permissions. Unfortunately, I do not have access to the working bucket's permissions.
Is there a permission that I need to enable to retrieve bucket metadata?
Here is how you would typically test for the existence of an S3 bucket:
import boto3
from botocore.exceptions import ClientError

Bucket = "my-bucket"

s3 = boto3.client("s3")

try:
    response = s3.head_bucket(Bucket=Bucket)
    print("The bucket exists")
except ClientError as e:
    if e.response["Error"]["Code"] == "404":
        print("No such bucket")
    elif e.response["Error"]["Code"] == "403":
        print("Access denied")
    else:
        print("Unexpected error:", e)
If you think there is a permission issue, you might want to check the documentation on S3 permissions. If you simply want to be able to check the existence of all buckets, s3:ListAllMyBuckets would work nicely.
For the code, you usually want to keep it lightweight by using head_bucket for buckets, head_object for objects, etc. jarmod's answer above provides sample code.
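For completeness, a minimal sketch of the analogous check for a single object (the bucket and key names here are placeholders):
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
try:
    # head_object fetches only the object's metadata, not its contents
    s3.head_object(Bucket="my-bucket", Key="path/to/object.csv")
    print("The object exists")
except ClientError as e:
    if e.response["Error"]["Code"] == "404":
        print("No such object")
    else:
        raise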
As for the question of client vs resource: client is close to the metal, i.e. the actual back-end API powering the service. Resource is higher level; it tries to create meaningful objects from client responses. Both use botocore underneath. There are sometimes slight differences when requesting something, since a resource already has knowledge of the underlying object.
For example, if you first create a Bucket resource object, you can simply call a method on that bucket without specifying the bucket name again.
resource = boto3.resource('s3')
bucket = resource.Bucket('some_bucket_name')
# you can do stuff with this bucket, e.g. create it without supplying the name again
bucket.create()

# if you are using the client, the story is different: there is no bucket object,
# so you need to supply everything explicitly
client = boto3.client('s3')
client.create_bucket(Bucket='some_bucket_name')

How to read and write data in spark via an S3 access point

I am attempting to use an S3 access point to store data in an S3 bucket. I have tried saving as I would if I had access to the bucket directly:
someDF.write.format("csv").option("header","true").mode("Overwrite")
.save("arn:aws:s3:us-east-1:000000000000:accesspoint/access-point/prefix/")
This returns the error
IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: "arn:aws:s3:us-east-1:000000000000:accesspoint/access-point/prefix/"
I haven't been able to find any documentation on how to do this. Are access points not supported? Is there a way to set up the access point as a custom data source?
Thank you
The problem is that you have provided the ARN instead of the S3 URL. The URL would be something like this (assuming accesspoint is the bucket name):
s3://accesspoint/access-point/prefix/
If you are viewing the object or prefix in the AWS console, there is a Copy S3 URL button at the top right.
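Putting that together, the write from the question would look roughly like this (a sketch only; someDF and the path are the placeholders from the question, and it assumes your cluster's S3 filesystem connector resolves the s3:// scheme):
someDF.write.format("csv").option("header", "true").mode("Overwrite").save("s3://accesspoint/access-point/prefix/")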

Cloud Function storage trigger on folder of a particular Bucket

I have a scenario where I need to execute a cloud function when something changes in a particular folder of a bucket. When I deployed the function using the CLI and passed BUCKET/FOLDERNAME as the trigger, it gave me an "invalid arguments" error. Is there any way to set a trigger at the FOLDER level?
You can only specify a bucket name. You cannot specify a folder within the bucket.
A key point to note is that the namespace for buckets is flat. Folders are emulated, they don't actually exist. All objects in a bucket have the bucket as the parent, not a directory.
What you can actually do is implement an if condition inside your function so it only does its work when the event refers to an object whose name starts with your folder prefix. Keep in mind that with this approach your function will still be triggered for every object uploaded to the bucket.
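A minimal sketch of that check, assuming a 1st-gen Python background function deployed with a google.storage.object.finalize trigger on the whole bucket (handle_upload and the my-folder/ prefix are illustrative names):
def handle_upload(event, context):
    """Background Cloud Function triggered by a Cloud Storage event."""
    name = event.get("name", "")
    # Objects have no real parent folder; filter on the key prefix instead.
    if not name.startswith("my-folder/"):
        return  # ignore anything outside the emulated folder
    print(f"Processing gs://{event.get('bucket')}/{name}")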

Amazonka: How to generate S3:// uri from Network.AWS.S3.Types.Object?

I've been using turtle to call "aws s3 ls" and I'm trying to figure out how to replace that with amazonka.
Absolute s3:// URLs were central to how my program worked. I now know how to get objects and filter them, but I don't know how to convert an object to an s3:// URL to integrate with my existing program.
I came across the getFile function and tried downloading a file from S3.
Perhaps I had something wrong, but it didn't seem like the S3 bucket and S3 object key alone were enough to download a given file. If I'm wrong about that, I need to double-check my configuration.

Setting Metadata in Google Cloud Storage (Export from BigQuery)

I am trying to update the metadata (programmatically, from Python) of several CSV/JSON files that are exported from BigQuery. The application that exports the data is the same as the one modifying the files (thus using the same server certificate). The export goes all well, that is, until I try to use the objects.patch() method to set the metadata I want. The problem is that I keep getting the following error:
apiclient.errors.HttpError: <HttpError 403 when requesting https://www.googleapis.com/storage/v1/b/<bucket>/<file>?alt=json returned "Forbidden">
Obviously, this has something to do with bucket or file permissions, but I can't manage to get around it. How come, if the same certificate is being used to write the files and to update the file metadata, I'm unable to update it? The bucket was created with the same certificate.
If that's the exact URL you're using, it's a URL problem: you're missing the /o/ between the bucket name and the object name.
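In other words, with the rest of the request unchanged, the patch should be issued against a URL of the form:
https://www.googleapis.com/storage/v1/b/<bucket>/o/<file>?alt=json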
