InvalidRequestError: Unrecognized request argument supplied: request_timeout in OpenAI LLM - openai-api

When I run this code block, I get an error about an unrecognized request_timeout argument.
Here is the code block:
from langchain.llms import OpenAI
import os
os.environ["OPENAI_API_KEY"]="xxxxxxxxxx"
llm = OpenAI(model_name="text-ada-001", n=2, best_of=2)
llm("Tell me a joke")
Here is the detailed error message:
---------------------------------------------------------------------------
File /usr/local/lib/python3.9/site-packages/openai/api_requestor.py:362, in APIRequestor._interpret_response_line(self, rbody, rcode, rheaders, stream)
360 stream_error = stream and "error" in resp.data
361 if stream_error or not 200 <= rcode < 300:
--> 362 raise self.handle_error_response(
363 rbody, rcode, resp.data, rheaders, stream_error=stream_error
364 )
365 return resp
InvalidRequestError: Unrecognized request argument supplied: request_timeout
I tried reinstalling the langchain and openai packages to no avail. I'd appreciate any guidance you might have.

Related

Received client error (400) deploying huggingface bigscience/bloom to SageMaker

I want to deploy Bloom on SageMaker so that I have a Bloom inference API I can use. I started by running the following in a SageMaker Jupyter notebook:
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'bigscience/bloom',
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,      # number of instances
    instance_type='ml.m5.xlarge'   # ec2 instance type
)

predictor.predict({
    'inputs': "Can you please let us know more details about your "
})
which produced:
---------------------------------------------------------------------------
ModelError Traceback (most recent call last)
/tmp/ipykernel_15151/842216467.py in <cell line: 1>()
----> 1 predictor.predict({
2 'inputs': "Can you please let us know more details about your "
3 })
~/anaconda3/envs/python3/lib/python3.8/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
159 data, initial_args, target_model, target_variant, inference_id
160 )
--> 161 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
162 return self._handle_response(response)
163
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
393 "%s() only accepts keyword arguments." % py_operation_name)
394 # The "self" in this scope is referring to the BaseClient.
--> 395 return self._make_api_call(operation_name, kwargs)
396
397 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
723 error_code = parsed_response.get("Error", {}).get("Code")
724 error_class = self.exceptions.from_code(error_code)
--> 725 raise error_class(parsed_response, operation_name)
726 else:
727 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "\u0027bloom\u0027"
}
". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/huggingface-pytorch-inference-2022-07-29-23-06-38-076 in account 162923941922 for more information.
The cloudwatch log just shows:
2022-07-29T23:09:09,135 [INFO ] W-bigscience__bloom-4-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise PredictionException(str(e), 400)
How can I deploy it on SageMaker without encountering this problem?
I see that you are using this example. A 4XX error usually occurs when the model is not downloaded or not available from the Hugging Face Hub, or is not actually deployed onto SageMaker. I would suggest checking in your SageMaker console whether the model is deployed, and doing an inference using local mode.
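As a rough illustration of the local-mode suggestion, here is a minimal sketch that runs the same container in SageMaker local mode before paying for an endpoint. It assumes Docker and the sagemaker[local] extra are available on the notebook instance, and a checkpoint as large as bigscience/bloom may be too big to load locally, so this mainly smoke-tests the container wiring:
```
# Sketch only: deploy the HuggingFaceModel in SageMaker local mode to surface
# model-loading errors quickly. Assumes Docker and sagemaker[local] are installed.
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

hub = {
    'HF_MODEL_ID': 'bigscience/bloom',   # very large; a smaller checkpoint may be needed for a local test
    'HF_TASK': 'text-generation'
}

huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

# instance_type='local' runs the inference container on the notebook machine
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='local',
)

print(predictor.predict({'inputs': "Can you please let us know more details about your "}))
```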

Azureml TabularDataset to_pandas_dataframe() returns InvalidEncoding error

When I run:
datasetTabular = Dataset.get_by_name(ws, "<Redacted>")
datasetTabular.to_pandas_dataframe()
The following error is returned. What can I do to get past this?
ExecutionError Traceback (most recent call last) File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\dataset_error_handling.py:101, in _try_execute(action, operation, dataset_info, **kwargs)
100 else:
--> 101 return action()
102 except Exception as e:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\tabular_dataset.py:169, in TabularDataset.to_pandas_dataframe.<locals>.<lambda>()
168 dataflow = get_dataflow_for_execution(self._dataflow, 'to_pandas_dataframe', 'TabularDataset')
--> 169 df = _try_execute(lambda: dataflow.to_pandas_dataframe(on_error=on_error,
170 out_of_range_datetime=out_of_range_datetime),
171 'to_pandas_dataframe',
172 None if self.id is None else {'id': self.id, 'name': self.name, 'version': self.version})
173 fine_grain_timestamp = self._properties.get(_DATASET_PROP_TIMESTAMP_FINE, None)
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\_loggerfactory.py:213, in track.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
212 try:
--> 213 return func(*args, **kwargs)
214 except Exception as e:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\dataflow.py:697, in Dataflow.to_pandas_dataframe(self, extended_types, nulls_as_nan, on_error, out_of_range_datetime)
696 with tracer.start_as_current_span('Dataflow.to_pandas_dataframe', trace.get_current_span()) as span:
--> 697 return get_dataframe_reader().to_pandas_dataframe(self,
698 extended_types,
699 nulls_as_nan,
700 on_error,
701 out_of_range_datetime,
702 to_dprep_span_context(span.get_context()))
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\_dataframereader.py:386, in _DataFrameReader.to_pandas_dataframe(self, dataflow, extended_types, nulls_as_nan, on_error, out_of_range_datetime, span_context)
384 if have_pyarrow() and not extended_types and not inconsistent_schema:
385 # if arrow is supported, and we didn't get inconsistent schema, and extended typed were not asked for - fallback to feather
--> 386 return clex_feather_to_pandas()
387 except _InconsistentSchemaError as e:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\_dataframereader.py:298, in
_DataFrameReader.to_pandas_dataframe.<locals>.clex_feather_to_pandas()
297 activity_data = dataflow_to_execute._dataflow_to_anonymous_activity_data(dataflow_to_execute)
--> 298 dataflow._engine_api.execute_anonymous_activity(
299 ExecuteAnonymousActivityMessageArguments(anonymous_activity=activity_data, span_context=span_context))
301 try:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\_aml_helper.py:38, in update_aml_env_vars.<locals>.decorator.<locals>.wrapper(op_code, message, cancellation_token)
37 engine_api_func().update_environment_variable(changed)
---> 38 return send_message_func(op_code, message, cancellation_token)
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\engineapi\api.py:160, in EngineAPI.execute_anonymous_activity(self, message_args, cancellation_token)
158 #update_aml_env_vars(get_engine_api)
159 def execute_anonymous_activity(self, message_args: typedefinitions.ExecuteAnonymousActivityMessageArguments, cancellation_token: CancellationToken = None) -> None:
--> 160 response = self._message_channel.send_message('Engine.ExecuteActivity', message_args, cancellation_token)
161 return response
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\engineapi\engine.py:291, in MultiThreadMessageChannel.send_message(self, op_code, message, cancellation_token)
290 cancel_on_error()
--> 291 raise_engine_error(response['error'])
292 else:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\errorhandlers.py:10, in raise_engine_error(error_response)
9 if 'ScriptExecution' in error_code:
---> 10 raise ExecutionError(error_response)
11 if 'Validation' in error_code:
ExecutionError: Error Code: ScriptExecution.StreamAccess.Validation Validation Error Code: InvalidEncoding Validation Target: TextFile Failed Step: 78059bb0-278f-4c7f-9c21-01a0cccf7b96 Error Message: ScriptExecutionException was caused by StreamAccessException. StreamAccessException was caused by ValidationException.
Unable to read file using Unicode (UTF-8). Attempted read range 0:777. Lines read in the range 0. Decoding error: Unable to translate bytes [8B] at index 1 from specified code page to Unicode.
Unable to translate bytes [8B] at index 1 from specified code page to Unicode. | session_id=295acf7e-4af9-42f1-b04a-79f3c5a0f98c
During handling of the above exception, another exception occurred:
UserErrorException Traceback (most recent call last) Input In [34], in <module>
1 # preview the first 3 rows of the dataset
2 #datasetTabular.take(3)
----> 3 datasetTabular.take(3).to_pandas_dataframe()
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\_loggerfactory.py:132, in track.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
130 with _LoggerFactory.track_activity(logger, func.__name__, activity_type, custom_dimensions) as al:
131 try:
--> 132 return func(*args, **kwargs)
133 except Exception as e:
134 if hasattr(al, 'activity_info') and hasattr(e, 'error_code'):
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\tabular_dataset.py:169, in TabularDataset.to_pandas_dataframe(self, on_error, out_of_range_datetime)
158 """Load all records from the dataset into a pandas DataFrame.
159
160 :param on_error: How to handle any error values in the dataset, such as those produced by an error while (...)
166 :rtype: pandas.DataFrame
167 """
168 dataflow = get_dataflow_for_execution(self._dataflow, 'to_pandas_dataframe', 'TabularDataset')
--> 169 df = _try_execute(lambda: dataflow.to_pandas_dataframe(on_error=on_error,
170 out_of_range_datetime=out_of_range_datetime),
171 'to_pandas_dataframe',
172 None if self.id is None else {'id': self.id, 'name': self.name, 'version': self.version})
173 fine_grain_timestamp = self._properties.get(_DATASET_PROP_TIMESTAMP_FINE, None)
175 if fine_grain_timestamp is not None and df.empty is False:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\dataset_error_handling.py:104, in _try_execute(action, operation, dataset_info, **kwargs)
102 except Exception as e:
103 message, is_dprep_exception = _construct_message_and_check_exception_type(e, dataset_info, operation)
--> 104 _dataprep_error_handler(e, message, is_dprep_exception)
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\dataset_error_handling.py:154, in _dataprep_error_handler(e, message, is_dprep_exception)
152 for item in user_exception_list:
153 if _contains(item, getattr(e, 'error_code', 'Unexpected')):
--> 154 raise UserErrorException(message, inner_exception=e)
156 raise AzureMLException(message, inner_exception=e)
UserErrorException: UserErrorException: Message: Execution failed with error message: ScriptExecutionException was caused by StreamAccessException. StreamAccessException was caused by ValidationException.
Unable to read file using Unicode (UTF-8). Attempted read range 0:777. Lines read in the range 0. Decoding error: [REDACTED]
Failed due to inner exception of type: DecoderFallbackException | session_id=295acf7e-4af9-42f1-b04a-79f3c5a0f98c ErrorCode: ScriptExecution.StreamAccess.Validation InnerException Error Code: ScriptExecution.StreamAccess.Validation Validation Error Code: InvalidEncoding Validation Target: TextFile Failed Step: 78059bb0-278f-4c7f-9c21-01a0cccf7b96 Error Message: ScriptExecutionException was caused by StreamAccessException. StreamAccessException was caused by ValidationException.
Unable to read file using Unicode (UTF-8). Attempted read range 0:777. Lines read in the range 0. Decoding error: Unable to translate bytes [8B] at index 1 from specified code page to Unicode.
Unable to translate bytes [8B] at index 1 from specified code page to Unicode. | session_id=295acf7e-4af9-42f1-b04a-79f3c5a0f98c ErrorResponse {
"error": {
"code": "UserError",
"message": "Execution failed with error message: ScriptExecutionException was caused by StreamAccessException.\r\n StreamAccessException was caused by ValidationException.\r\n Unable to read file using Unicode (UTF-8). Attempted read range 0:777. Lines read in the range 0. Decoding error: [REDACTED]\r\n Failed due to inner exception of type: DecoderFallbackException\r\n| session_id=295acf7e-4af9-42f1-b04a-79f3c5a0f98c ErrorCode: ScriptExecution.StreamAccess.Validation"
} }
This kind of error usually happens when the input file is not in an encoding the tabular reader supports.
"Unable to read file using Unicode (UTF-8)" is the key part of the error message.
str_value = raw_data.decode('utf-8')
Convert the input with a line like the one above (where raw_data holds the file's raw bytes), then perform the operation.
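A minimal sketch of that check, assuming you can open one of the source files directly (the path is a placeholder):
```
# Hypothetical check: read the raw bytes of one source file and try to decode
# them as UTF-8; a UnicodeDecodeError here confirms the encoding mismatch.
with open("sample.json", "rb") as f:   # placeholder path to one of the source files
    raw_data = f.read()

str_value = raw_data.decode("utf-8")   # raises UnicodeDecodeError if the bytes are not valid UTF-8
print(str_value[:200])
```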
Since you're working with a collection of .json files, I'd suggest using a FileDataset (as you're currently doing) if you want to work with the JSON files directly.
If you'd prefer working with the data in tabular form, then I'd suggest doing some preprocessing to flatten the JSON files into a pandas dataframe before saving it as a dataset on AzureML. Then use the register_pandas_dataframe method from the DatasetFactory class to save this dataframe (a short sketch is shown below). This ensures that when you fetch the Dataset from Azure, the to_pandas_dataframe() method will work. Just be aware that some datatypes, such as numpy arrays, are not supported when using the register_pandas_dataframe() method.
The issue with creating a tabular dataset from JSON files and then converting it to a pandas dataframe once you've begun working with it (in a run or notebook) is that you're expecting Azure to handle the flattening/processing.
Alternatively, you can also look at the from_json_lines method, since it might suit your use case better.
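A minimal sketch of the flatten-and-register approach described above, with placeholder file names and the workspace's default datastore assumed (the flattening step will depend on your JSON structure):
```
# Hypothetical sketch: flatten the JSON files locally, then register the result
# as a TabularDataset so that to_pandas_dataframe() works later.
import json
import pandas as pd
from azureml.core import Workspace, Dataset

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

records = []
for path in ["file1.json", "file2.json"]:        # placeholder list of source files
    with open(path, encoding="utf-8") as f:
        records.append(json.load(f))

df = pd.json_normalize(records)                   # flatten nested JSON into columns

dataset = Dataset.Tabular.register_pandas_dataframe(
    dataframe=df,
    target=datastore,
    name="flattened-json-dataset",                # placeholder dataset name
)
```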

Tweepy pagination KeyError: 0

I tried using Tweepy's pagination based on the code provided in its documentation:
```
import tweepy
auth = tweepy.AppAuthHandler("Consumer Key here", "Consumer Secret here")
api = tweepy.API(auth)
for status in tweepy.Cursor(api.search_tweets, "Tweepy",
                            count=100).items(250):
    print(status.id)
```
However, I get the following error:
```
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_16136/3940301818.py in <module>
----> 1 for status in tweepy.Cursor(api.search_tweets, "Tweepy",
2 count=100).items(250):
3 print(status.id)
C:\ProgramData\Anaconda3\lib\site-packages\tweepy\cursor.py in __next__(self)
84
85 def __next__(self):
---> 86 return self.next()
87
88 def next(self):
C:\ProgramData\Anaconda3\lib\site-packages\tweepy\cursor.py in next(self)
290 self.page_index += 1
291 self.num_tweets += 1
--> 292 return self.current_page[self.page_index]
293
294 def prev(self):
KeyError: 0
```
Can someone explain and rectify the error please?
As of Tweepy 4.8.0, the AuthHandler syntax has changed.
Update Tweepy:
pip install Tweepy -U
and the following should work:
import tweepy
auth = tweepy.OAuth2AppHandler("Consumer Key here", "Consumer Secret here")
api = tweepy.API(auth)
for status in tweepy.Cursor(api.search_tweets, "Tweepy",
                            count=100).items(250):
    print(status.id)

How to load a torchvision model from disk?

I'm trying to follow the solution from the top answer here to load an object detection model from the .pth file.
os.environ['TORCH_HOME'] = '../input/torchvision-fasterrcnn-resnet-50/' #setting the environment variable
model = detection.fasterrcnn_resnet50_fpn(pretrained=False).to(DEVICE)
I get the following error:
NotADirectoryError: [Errno 20] Not a directory: '../input/torchvision-fasterrcnn-resnet-50/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth/hub'
Google did not reveal an answer to the error, and I don't exactly know what it means except for the obvious (that the folder 'hub' is missing).
Do I have to unpack something or create a folder?
I have tried loading the weights, but I get the same error message.
This is how I load the model:
model = detection.fasterrcnn_resnet50_fpn(pretrained=True)
checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
model.load_state_dict(checkpoint['state_dict'])
Thank you for your help!
Full Error Trace:
gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
URLError Traceback (most recent call last)
/tmp/ipykernel_42/1218627017.py in <module>
1 # to load
----> 2 model = detection.fasterrcnn_resnet50_fpn(pretrained=True)
3 checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
4 model.load_state_dict(checkpoint['state_dict'])
/opt/conda/lib/python3.7/site-packages/torchvision/models/detection/faster_rcnn.py in fasterrcnn_resnet50_fpn(pretrained, progress, num_classes, pretrained_backbone, trainable_backbone_layers, **kwargs)
360 if pretrained:
361 state_dict = load_state_dict_from_url(model_urls['fasterrcnn_resnet50_fpn_coco'],
--> 362 progress=progress)
363 model.load_state_dict(state_dict)
364 return model
/opt/conda/lib/python3.7/site-packages/torch/hub.py in load_state_dict_from_url(url, model_dir, map_location, progress, check_hash, file_name)
553 r = HASH_REGEX.search(filename) # r is Optional[Match[str]]
554 hash_prefix = r.group(1) if r else None
--> 555 download_url_to_file(url, cached_file, hash_prefix, progress=progress)
556
557 if _is_legacy_zip_format(cached_file):
/opt/conda/lib/python3.7/site-packages/torch/hub.py in download_url_to_file(url, dst, hash_prefix, progress)
423 # certificates in older Python
424 req = Request(url, headers={"User-Agent": "torch.hub"})
--> 425 u = urlopen(req)
426 meta = u.info()
427 if hasattr(meta, 'getheaders'):
/opt/conda/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
220 else:
221 opener = _opener
--> 222 return opener.open(url, data, timeout)
223
224 def install_opener(opener):
/opt/conda/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
523 req = meth(req)
524
--> 525 response = self._open(req, data)
526
527 # post-process response
/opt/conda/lib/python3.7/urllib/request.py in _open(self, req, data)
541 protocol = req.type
542 result = self._call_chain(self.handle_open, protocol, protocol +
--> 543 '_open', req)
544 if result:
545 return result
/opt/conda/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
501 for handler in handlers:
502 func = getattr(handler, meth_name)
--> 503 result = func(*args)
504 if result is not None:
505 return result
/opt/conda/lib/python3.7/urllib/request.py in https_open(self, req)
1391 def https_open(self, req):
1392 return self.do_open(http.client.HTTPSConnection, req,
-> 1393 context=self._context, check_hostname=self._check_hostname)
1394
1395 https_request = AbstractHTTPHandler.do_request_
/opt/conda/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
1350 encode_chunked=req.has_header('Transfer-encoding'))
1351 except OSError as err: # timeout error
-> 1352 raise URLError(err)
1353 r = h.getresponse()
1354 except:
URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
If you are loading your own pretrained weights, you don't need the checkpoint that torchvision itself downloads (the one fetched with pretrained=True). You have two options:
Either set pretrained=False and load your weights using:
checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
model.load_state_dict(checkpoint['state_dict'])
Or, if you decide to change TORCH_HOME (which is not ideal), you need to keep the same directory structure torchvision uses, which would be:
inputs/hub/checkpoints/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth
In practice, you wouldn't change TORCH_HOME just to load one model.
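If you do go that route, here is a minimal sketch, assuming the checkpoint has already been placed under a hub/checkpoints/ folder with the exact file name above (torch.hub only downloads when that cached file is missing):
```
# Sketch: make pretrained=True resolve to a local cache instead of the network.
# TORCH_HOME must contain hub/checkpoints/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth.
import os
os.environ['TORCH_HOME'] = '../input/torchvision-fasterrcnn-resnet-50'  # placeholder cache root

from torchvision.models import detection

model = detection.fasterrcnn_resnet50_fpn(pretrained=True)  # loads from the local cache, no download
```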
I found the solution to the problem by digging deep into GitHub; it is a little hidden.
detection.fasterrcnn_resnet50_fpn()
has another default argument besides pretrained, called pretrained_backbone, which defaults to True; when it is True, the model downloads the backbone weights from a dictionary of URLs.
This will work:
detection.fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=False, num_classes=91)
Then load the weights as usual.
num_classes is expected; in the docs its default is 91, but on GitHub I saw it as None, which is why I added it here for safety.
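Putting the two answers together, a minimal sketch (using the checkpoint path from the question; adjust num_classes to whatever your checkpoint was trained with):
```
# Sketch: build the architecture without any downloads, then load the locally
# stored checkpoint.
import torch
from torchvision.models import detection

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = detection.fasterrcnn_resnet50_fpn(
    pretrained=False,            # don't download the COCO checkpoint
    pretrained_backbone=False,   # don't download the backbone checkpoint either
    num_classes=91,
)

checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar',
                        map_location=DEVICE)
model.load_state_dict(checkpoint['state_dict'])
model.to(DEVICE).eval()
```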

Connect to S3 accelerate endpoint with boto3

I want to download a file into a Python file object from an S3 bucket that has acceleration activated. I came across a few resources suggesting either overwriting the endpoint_url with "s3-accelerate.amazonaws.com" and/or using the use_accelerate_endpoint attribute.
I have tried both, and several variations, but the same error was returned every time. One of the scripts I tried is:
from botocore.config import Config
import boto3
from io import BytesIO
session = boto3.session.Session()
s3 = session.client(
    service_name='s3',
    aws_access_key_id=<MY_KEY_ID>,
    aws_secret_access_key=<MY_KEY>,
    region_name="us-west-2",
    config=Config(s3={"use_accelerate_endpoint": True,
                      "addressing_style": "path"}))
input = BytesIO()
s3.download_fileobj(<MY_BUCKET>, <MY_KEY>, input)
Returns the following error:
---------------------------------------------------------------------------
ClientError Traceback (most recent call last)
<ipython-input-61-92b89b45f215> in <module>()
11 "addressing_style": "path"}))
12 input = BytesIO()
---> 13 s3.download_fileobj(bucket, filename, input)
14
15
~/Project/venv/lib/python3.5/site-packages/boto3/s3/inject.py in download_fileobj(self, Bucket, Key, Fileobj, ExtraArgs, Callback, Config)
568 bucket=Bucket, key=Key, fileobj=Fileobj,
569 extra_args=ExtraArgs, subscribers=subscribers)
--> 570 return future.result()
571
572
~/Project//venv/lib/python3.5/site-packages/s3transfer/futures.py in result(self)
71 # however if a KeyboardInterrupt is raised we want want to exit
72 # out of this and propogate the exception.
---> 73 return self._coordinator.result()
74 except KeyboardInterrupt as e:
75 self.cancel()
~/Project/venv/lib/python3.5/site-packages/s3transfer/futures.py in result(self)
231 # final result.
232 if self._exception:
--> 233 raise self._exception
234 return self._result
235
~/Project/venv/lib/python3.5/site-packages/s3transfer/tasks.py in _main(self, transfer_future, **kwargs)
253 # Call the submit method to start submitting tasks to execute the
254 # transfer.
--> 255 self._submit(transfer_future=transfer_future, **kwargs)
256 except BaseException as e:
257 # If there was an exception raised during the submission of task
~/Project/venv/lib/python3.5/site-packages/s3transfer/download.py in _submit(self, client, config, osutil, request_executor, io_executor, transfer_future)
347 Bucket=transfer_future.meta.call_args.bucket,
348 Key=transfer_future.meta.call_args.key,
--> 349 **transfer_future.meta.call_args.extra_args
350 )
351 transfer_future.meta.provide_transfer_size(
~/Project/venv/lib/python3.5/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
310 "%s() only accepts keyword arguments." % py_operation_name)
311 # The "self" in this scope is referring to the BaseClient.
--> 312 return self._make_api_call(operation_name, kwargs)
313
314 _api_call.__name__ = str(py_operation_name)
~/Project/venv/lib/python3.5/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
603 error_code = parsed_response.get("Error", {}).get("Code")
604 error_class = self.exceptions.from_code(error_code)
--> 605 raise error_class(parsed_response, operation_name)
606 else:
607 return parsed_response
ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
When I run the same script with "use_accelerate_endpoint": False it works fine.
However, it returned the same error when:
I overwrote the endpoint_url with "s3-accelerate.amazonaws.com"
I defined "addressing_style": "virtual"
When running
s3.get_bucket_accelerate_configuration(Bucket=<MY_BUCKET>)
I get {..., 'Status': 'Enabled'} as expected.
Any idea what is wrong with that code and what I should change to properly query the accelerate endpoint of that bucket?
Using Python 3.5 with boto3==1.4.7 and botocore==1.7.43 on Ubuntu 17.04.
EDIT:
I have also tried a similar script for uploads:
from botocore.config import Config
import boto3
from io import BytesIO
session = boto3.session.Session()
s3 = session.client(
    service_name='s3',
    aws_access_key_id=<MY_KEY_ID>,
    aws_secret_access_key=<MY_KEY>,
    region_name="us-west-2",
    config=Config(s3={"use_accelerate_endpoint": True,
                      "addressing_style": "virtual"}))
output = BytesIO()
output.seek(0)
s3.upload_fileobj(output, <MY_BUCKET>, <MY_KEY>)
This works without the use_accelerate_endpoint option (so my keys are fine), but returns the following error when it is True:
ClientError: An error occurred (SignatureDoesNotMatch) when calling the PutObject operation: The request signature we calculated does not match the signature you provided. Check your key and signing method.
I have tried both addressing_style options here as well (virtual and path).
Using boto3==1.4.7 and botocore==1.7.43.
Here is one way to retrieve an object from a bucket with transfer acceleration enabled.
import boto3
from botocore.config import Config
from io import BytesIO
config = Config(s3={"use_accelerate_endpoint": True})
s3_resource = boto3.resource("s3",
    aws_access_key_id=<MY_KEY_ID>,
    aws_secret_access_key=<MY_KEY>,
    region_name="us-west-2",
    config=config)
s3_client = s3_resource.meta.client
file_object = BytesIO()
s3_client.download_fileobj(<MY_BUCKET>, <MY_KEY>, file_object)
Note that the client sends a HEAD request to the accelerated endpoint before the GET.
Its canonical request looks somewhat like the following:
CanonicalRequest:
HEAD
/<MY_KEY>
host:<MY_BUCKET>.s3-accelerate.amazonaws.com
x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
x-amz-date:20200520T204128Z
host;x-amz-content-sha256;x-amz-date
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Some reasons why the HEAD request can fail include:
Object with given key doesn't exist or has strict access control enabled
Invalid credentials
Transfer acceleration isn't enabled
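To narrow down which of these applies, a small sketch (bucket and key are placeholders) that issues the HEAD directly with the same accelerated client and re-checks the bucket's acceleration status:
```
# Hypothetical check: call head_object through the accelerated endpoint and
# confirm acceleration is enabled; a 403 here means the HEAD itself is denied.
import boto3
from botocore.config import Config

s3_client = boto3.client(
    "s3",
    region_name="us-west-2",
    config=Config(s3={"use_accelerate_endpoint": True}),
)

print(s3_client.get_bucket_accelerate_configuration(Bucket="<MY_BUCKET>"))  # expect {'Status': 'Enabled', ...}
print(s3_client.head_object(Bucket="<MY_BUCKET>", Key="<MY_KEY>"))
```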
