Received client error (400) deploying huggingface bigscience/bloom to SageMaker

Received client error (400) deploying huggingface bigscience/bloom to SageMaker - python-3.x

I want to deploy Bloom on SageMaker so that I have a Bloom inference API I can use. I started by running the following in a SageMaker jupyter notebook:
from sagemaker.huggingface import HuggingFaceModel
import sagemaker
role = sagemaker.get_execution_role()
# Hub Model configuration. https://huggingface.co/models
hub = {
'HF_MODEL_ID':'bigscience/bloom',
'HF_TASK':'text-generation'
}
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
transformers_version='4.17.0',
pytorch_version='1.10.2',
py_version='py38',
env=hub,
role=role,
)
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
initial_instance_count=1, # number of instances
instance_type='ml.m5.xlarge' # ec2 instance type
)
predictor.predict({
'inputs': "Can you please let us know more details about your "
})
which produced:
---------------------------------------------------------------------------
ModelError Traceback (most recent call last)
/tmp/ipykernel_15151/842216467.py in <cell line: 1>()
----> 1 predictor.predict({
2 'inputs': "Can you please let us know more details about your "
3 })
~/anaconda3/envs/python3/lib/python3.8/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
159 data, initial_args, target_model, target_variant, inference_id
160 )
--> 161 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
162 return self._handle_response(response)
163
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
393 "%s() only accepts keyword arguments." % py_operation_name)
394 # The "self" in this scope is referring to the BaseClient.
--> 395 return self._make_api_call(operation_name, kwargs)
396
397 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
723 error_code = parsed_response.get("Error", {}).get("Code")
724 error_class = self.exceptions.from_code(error_code)
--> 725 raise error_class(parsed_response, operation_name)
726 else:
727 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "\u0027bloom\u0027"
}
". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/huggingface-pytorch-inference-2022-07-29-23-06-38-076 in account 162923941922 for more information.
The cloudwatch log just shows:
2022-07-29T23:09:09,135 [INFO ] W-bigscience__bloom-4-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise PredictionException(str(e), 400)
How can I deploy it on sagemaker without encountering this problem?

I see that you are using this example. 4XX error occurs usually when the model is not downloaded and not available from HuggingFace Hub or not deployed onto SageMaker. I would suggest to check your SageMaker console if the model is deployed and do a Inference using local mode.

Related

Azureml TabularDataset to_pandas_dataframe() returns InvalidEncoding error

When I run:
datasetTabular = Dataset.get_by_name(ws, "<Redacted>")
datasetTabular.to_pandas_dataframe()
The following error is returned. What can I do to get past this?
ExecutionError Traceback (most recent call last) File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\dataset_error_handling.py:101, in _try_execute(action, operation, dataset_info, **kwargs)
100 else:
--> 101 return action()
102 except Exception as e:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\tabular_dataset.py:169, in TabularDataset.to_pandas_dataframe.<locals>.<lambda>()
168 dataflow = get_dataflow_for_execution(self._dataflow, 'to_pandas_dataframe', 'TabularDataset')
--> 169 df = _try_execute(lambda: dataflow.to_pandas_dataframe(on_error=on_error,
170 out_of_range_datetime=out_of_range_datetime),
171 'to_pandas_dataframe',
172 None if self.id is None else {'id': self.id, 'name': self.name, 'version': self.version})
173 fine_grain_timestamp = self._properties.get(_DATASET_PROP_TIMESTAMP_FINE, None)
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\_loggerfactory.py:213, in track.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
212 try:
--> 213 return func(*args, **kwargs)
214 except Exception as e:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\dataflow.py:697, in Dataflow.to_pandas_dataframe(self, extended_types, nulls_as_nan, on_error, out_of_range_datetime)
696 with tracer.start_as_current_span('Dataflow.to_pandas_dataframe', trace.get_current_span()) as span:
--> 697 return get_dataframe_reader().to_pandas_dataframe(self,
698 extended_types,
699 nulls_as_nan,
700 on_error,
701 out_of_range_datetime,
702 to_dprep_span_context(span.get_context()))
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\_dataframereader.py:386, in _DataFrameReader.to_pandas_dataframe(self, dataflow, extended_types, nulls_as_nan, on_error, out_of_range_datetime, span_context)
384 if have_pyarrow() and not extended_types and not inconsistent_schema:
385 # if arrow is supported, and we didn't get inconsistent schema, and extended typed were not asked for - fallback to feather
--> 386 return clex_feather_to_pandas()
387 except _InconsistentSchemaError as e:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\_dataframereader.py:298, in
_DataFrameReader.to_pandas_dataframe.<locals>.clex_feather_to_pandas()
297 activity_data = dataflow_to_execute._dataflow_to_anonymous_activity_data(dataflow_to_execute)
--> 298 dataflow._engine_api.execute_anonymous_activity(
299 ExecuteAnonymousActivityMessageArguments(anonymous_activity=activity_data, span_context=span_context))
301 try:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\_aml_helper.py:38, in update_aml_env_vars.<locals>.decorator.<locals>.wrapper(op_code, message, cancellation_token)
37 engine_api_func().update_environment_variable(changed)
---> 38 return send_message_func(op_code, message, cancellation_token)
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\engineapi\api.py:160, in EngineAPI.execute_anonymous_activity(self, message_args, cancellation_token)
158 #update_aml_env_vars(get_engine_api)
159 def execute_anonymous_activity(self, message_args: typedefinitions.ExecuteAnonymousActivityMessageArguments, cancellation_token: CancellationToken = None) -> None:
--> 160 response = self._message_channel.send_message('Engine.ExecuteActivity', message_args, cancellation_token)
161 return response
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\engineapi\engine.py:291, in MultiThreadMessageChannel.send_message(self, op_code, message, cancellation_token)
290 cancel_on_error()
--> 291 raise_engine_error(response['error'])
292 else:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\dataprep\api\errorhandlers.py:10, in raise_engine_error(error_response)
9 if 'ScriptExecution' in error_code:
---> 10 raise ExecutionError(error_response)
11 if 'Validation' in error_code:
ExecutionError: Error Code: ScriptExecution.StreamAccess.Validation Validation Error Code: InvalidEncoding Validation Target: TextFile Failed Step: 78059bb0-278f-4c7f-9c21-01a0cccf7b96 Error Message: ScriptExecutionException was caused by StreamAccessException. StreamAccessException was caused by ValidationException.
Unable to read file using Unicode (UTF-8). Attempted read range 0:777. Lines read in the range 0. Decoding error: Unable to translate bytes [8B] at index 1 from specified code page to Unicode.
Unable to translate bytes [8B] at index 1 from specified code page to Unicode. | session_id=295acf7e-4af9-42f1-b04a-79f3c5a0f98c
During handling of the above exception, another exception occurred:
UserErrorException Traceback (most recent call last) Input In [34], in <module>
1 # preview the first 3 rows of the dataset
2 #datasetTabular.take(3)
----> 3 datasetTabular.take(3).to_pandas_dataframe()
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\_loggerfactory.py:132, in track.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
130 with _LoggerFactory.track_activity(logger, func.__name__, activity_type, custom_dimensions) as al:
131 try:
--> 132 return func(*args, **kwargs)
133 except Exception as e:
134 if hasattr(al, 'activity_info') and hasattr(e, 'error_code'):
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\tabular_dataset.py:169, in TabularDataset.to_pandas_dataframe(self, on_error, out_of_range_datetime)
158 """Load all records from the dataset into a pandas DataFrame.
159
160 :param on_error: How to handle any error values in the dataset, such as those produced by an error while (...)
166 :rtype: pandas.DataFrame
167 """
168 dataflow = get_dataflow_for_execution(self._dataflow, 'to_pandas_dataframe', 'TabularDataset')
--> 169 df = _try_execute(lambda: dataflow.to_pandas_dataframe(on_error=on_error,
170 out_of_range_datetime=out_of_range_datetime),
171 'to_pandas_dataframe',
172 None if self.id is None else {'id': self.id, 'name': self.name, 'version': self.version})
173 fine_grain_timestamp = self._properties.get(_DATASET_PROP_TIMESTAMP_FINE, None)
175 if fine_grain_timestamp is not None and df.empty is False:
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\dataset_error_handling.py:104, in _try_execute(action, operation, dataset_info, **kwargs)
102 except Exception as e:
103 message, is_dprep_exception = _construct_message_and_check_exception_type(e, dataset_info, operation)
--> 104 _dataprep_error_handler(e, message, is_dprep_exception)
File C:\ProgramData\Anaconda3_2\envs\amlds\lib\site-packages\azureml\data\dataset_error_handling.py:154, in _dataprep_error_handler(e, message, is_dprep_exception)
152 for item in user_exception_list:
153 if _contains(item, getattr(e, 'error_code', 'Unexpected')):
--> 154 raise UserErrorException(message, inner_exception=e)
156 raise AzureMLException(message, inner_exception=e)
UserErrorException: UserErrorException: Message: Execution failed with error message: ScriptExecutionException was caused by StreamAccessException. StreamAccessException was caused by ValidationException.
Unable to read file using Unicode (UTF-8). Attempted read range 0:777. Lines read in the range 0. Decoding error: [REDACTED]
Failed due to inner exception of type: DecoderFallbackException | session_id=295acf7e-4af9-42f1-b04a-79f3c5a0f98c ErrorCode: ScriptExecution.StreamAccess.Validation InnerException Error Code: ScriptExecution.StreamAccess.Validation Validation Error Code: InvalidEncoding Validation Target: TextFile Failed Step: 78059bb0-278f-4c7f-9c21-01a0cccf7b96 Error Message: ScriptExecutionException was caused by StreamAccessException. StreamAccessException was caused by ValidationException.
Unable to read file using Unicode (UTF-8). Attempted read range 0:777. Lines read in the range 0. Decoding error: Unable to translate bytes [8B] at index 1 from specified code page to Unicode.
Unable to translate bytes [8B] at index 1 from specified code page to Unicode. | session_id=295acf7e-4af9-42f1-b04a-79f3c5a0f98c ErrorResponse {
"error": {
"code": "UserError",
"message": "Execution failed with error message: ScriptExecutionException was caused by StreamAccessException.\r\n StreamAccessException was caused by ValidationException.\r\n Unable to read file using Unicode (UTF-8). Attempted read range 0:777. Lines read in the range 0. Decoding error: [REDACTED]\r\n Failed due to inner exception of type: DecoderFallbackException\r\n| session_id=295acf7e-4af9-42f1-b04a-79f3c5a0f98c ErrorCode: ScriptExecution.StreamAccess.Validation"
} }

This kind of error usually happens if the base input is not our supported OS version.
Unable to read file using Unicode (UTF-8) -> this is the key point in the error occurred
str_value = raw_data.decode('utf-8')
using the above code block convert the input and then perform the operation.

Since you're working on a collection of .json files I'd suggest using a FileDataset (if you want to work with the jsons) as you're currently doing.
If you'd prefer working with the data in tabular form, then I'd suggest doing some preprocessing to flatten the json files into a pandas dataframe before saving it as a dataset on AzureML. Then use the register_pandas_dataframe method from the DatasetFactory class to save this dataframe. This will ensure that when you fetch the Dataset from azure, the to_pandas_dataframe() method will work. Just be aware that some datatypes such as numpy arrays are not supported when using the register_pandas_dataframe() method.
The issue with creating a tabular set from json files and then converting this to a pandas dataframe once you've begun working with it (in a run or notebook), is that you're expecting azure to handle the flattening/processing.
Alternatively, you can also look at the from_json_lines method since it might suit your use case better.

Azure ML Tabular Dataset : missing 1 required positional argument: 'stream_column'

For the Python API for tabular dataset of AzureML (azureml.data.TabularDataset), there are two experimental methods which have been introduced:
download(stream_column, target_path=None, overwrite=False, ignore_not_found=True)
mount(stream_column, mount_point=None)
Parameter stream_column has been defined as The stream column to mount or download.
What is the actual meaning of stream_column? I don't see any example any where?
Any pointer will be helpful.
The stack trace:
Method download: This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_11561/3904436543.py in <module>
----> 1 tab_dataset.download(target_path="../data/tabular")
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/azureml/_base_sdk_common/_docstring_wrapper.py in wrapped(*args, **kwargs)
50 def wrapped(*args, **kwargs):
51 module_logger.warning("Method {0}: {1} {2}".format(func.__name__, _method_msg, _experimental_link_msg))
---> 52 return func(*args, **kwargs)
53 return wrapped
54
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/azureml/data/_loggerfactory.py in wrapper(*args, **kwargs)
130 with _LoggerFactory.track_activity(logger, func.__name__, activity_type, custom_dimensions) as al:
131 try:
--> 132 return func(*args, **kwargs)
133 except Exception as e:
134 if hasattr(al, 'activity_info') and hasattr(e, 'error_code'):
TypeError: download() missing 1 required positional argument: 'stream_column'

Update on 5th March, 2022
I posted this as a support ticket with Azure. Following is the answer I have received:
As you can see from our documentation of TabularDataset Class,
the “stream_column” parameter is required. So, that error is occurring
because you are not passing any parameters when you are calling the
download method. The “stream_column” parameter should have the
stream column to download/mount. So, you need to pass the column name
that contains the paths from which the data will be streamed.
Please find an example here.

How to load a torchvision model from disk?

I'm trying to follow the solution from the top answer here to load an object detection model from the .pth file.
os.environ['TORCH_HOME'] = '../input/torchvision-fasterrcnn-resnet-50/' #setting the environment variable
model = detection.fasterrcnn_resnet50_fpn(pretrained=False).to(DEVICE)
I get the following error
NotADirectoryError: [Errno 20] Not a directory: '../input/torchvision-fasterrcnn-resnet-50/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth/hub'
google did not reveal an answer to the error and I don't exactly know what it means except for the obvious (that folder 'hub' is missing).
Do I have to unpack or create a folder?
I have tried loading the weights but I get the same error message.
this is how I load the model
model = detection.fasterrcnn_resnet50_fpn(pretrained=True)
checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
model.load_state_dict(checkpoint['state_dict'])
thank you for your help!
Full Error Trace:
gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
URLError Traceback (most recent call last)
/tmp/ipykernel_42/1218627017.py in <module>
1 # to load
----> 2 model = detection.fasterrcnn_resnet50_fpn(pretrained=True)
3 checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
4 model.load_state_dict(checkpoint['state_dict'])
/opt/conda/lib/python3.7/site-packages/torchvision/models/detection/faster_rcnn.py in fasterrcnn_resnet50_fpn(pretrained, progress, num_classes, pretrained_backbone, trainable_backbone_layers, **kwargs)
360 if pretrained:
361 state_dict = load_state_dict_from_url(model_urls['fasterrcnn_resnet50_fpn_coco'],
--> 362 progress=progress)
363 model.load_state_dict(state_dict)
364 return model
/opt/conda/lib/python3.7/site-packages/torch/hub.py in load_state_dict_from_url(url, model_dir, map_location, progress, check_hash, file_name)
553 r = HASH_REGEX.search(filename) # r is Optional[Match[str]]
554 hash_prefix = r.group(1) if r else None
--> 555 download_url_to_file(url, cached_file, hash_prefix, progress=progress)
556
557 if _is_legacy_zip_format(cached_file):
/opt/conda/lib/python3.7/site-packages/torch/hub.py in download_url_to_file(url, dst, hash_prefix, progress)
423 # certificates in older Python
424 req = Request(url, headers={"User-Agent": "torch.hub"})
--> 425 u = urlopen(req)
426 meta = u.info()
427 if hasattr(meta, 'getheaders'):
/opt/conda/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
220 else:
221 opener = _opener
--> 222 return opener.open(url, data, timeout)
223
224 def install_opener(opener):
/opt/conda/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
523 req = meth(req)
524
--> 525 response = self._open(req, data)
526
527 # post-process response
/opt/conda/lib/python3.7/urllib/request.py in _open(self, req, data)
541 protocol = req.type
542 result = self._call_chain(self.handle_open, protocol, protocol +
--> 543 '_open', req)
544 if result:
545 return result
/opt/conda/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
501 for handler in handlers:
502 func = getattr(handler, meth_name)
--> 503 result = func(*args)
504 if result is not None:
505 return result
/opt/conda/lib/python3.7/urllib/request.py in https_open(self, req)
1391 def https_open(self, req):
1392 return self.do_open(http.client.HTTPSConnection, req,
-> 1393 context=self._context, check_hostname=self._check_hostname)
1394
1395 https_request = AbstractHTTPHandler.do_request_
/opt/conda/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
1350 encode_chunked=req.has_header('Transfer-encoding'))
1351 except OSError as err: # timeout error
-> 1352 raise URLError(err)
1353 r = h.getresponse()
1354 except:
URLError: <urlopen error [Errno -3] Temporary failure in name resolution>

If you are loading a pretrained network, you don't need to load the model from torchvision pretrained (as in pretrained by torchvision on ImageNet using pretrained=True). You have two options:
Either set pretrained=False and load you weights using:
checkpoint = torch.load('../input/torchvision-fasterrcnn-resnet-50/model.pth.tar')
model.load_state_dict(checkpoint['state_dict'])
Or if you decide to change TORCH_HOME (which is not ideal) you need to keep the same directory structure Torchvision has which would be:
inputs/hub/checkpoints/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth
In practice, you wouldn't change TORCH_HOME just to load one model.

I found the solution digging deep into github, to the problem, which is a little hidden.
detection.()
has a default argument besides pretrained, it's called pretrained_backbone which by default is set to true, which if True sets the models to download from a dictionary path of urls.
this will work:
detection.fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone = False, num_classes = 91).
then load the model as usual.
num_classes is expected, in the docs it's a default = 91 but in github i saw it as None, which is why I added it here for saftey.

Can not find the pytorch model when loading BERT model in Python

I am following this article to find the text similarity.
The code I have is this:
from sentence_transformers import SentenceTransformer
from tqdm import tqdm
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
import pandas as pd
documents = [
"Vodafone Wins ₹ 20,000 Crore Tax Arbitration Case Against Government",
"Voda Idea shares jump nearly 15% as Vodafone wins retro tax case in Hague",
"Gold prices today fall for 4th time in 5 days, down ₹6500 from last month high",
"Silver futures slip 0.36% to Rs 59,415 per kg, down over 12% this week",
"Amazon unveils drone that films inside your home. What could go wrong?",
"IPHONE 12 MINI PERFORMANCE MAY DISAPPOINT DUE TO THE APPLE B14 CHIP",
"Delhi Capitals vs Chennai Super Kings: Prithvi Shaw shines as DC beat CSK to post second consecutive win in IPL",
"French Open 2020: Rafael Nadal handed tough draw in bid for record-equaling 20th Grand Slam"
]
model = SentenceTransformer('sentence-transformers/bert-base-nli-mean-tokens')
I get an error when running the above code:
Full:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\anaconda3\envs\py3_nlp\lib\tarfile.py in nti(s)
188 s = nts(s, "ascii", "strict")
--> 189 n = int(s.strip() or "0", 8)
190 except ValueError:
ValueError: invalid literal for int() with base 8: 'ld_tenso'
During handling of the above exception, another exception occurred:
InvalidHeaderError Traceback (most recent call last)
~\anaconda3\envs\py3_nlp\lib\tarfile.py in next(self)
2298 try:
-> 2299 tarinfo = self.tarinfo.fromtarfile(self)
2300 except EOFHeaderError as e:
~\anaconda3\envs\py3_nlp\lib\tarfile.py in fromtarfile(cls, tarfile)
1092 buf = tarfile.fileobj.read(BLOCKSIZE)
-> 1093 obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
1094 obj.offset = tarfile.fileobj.tell() - BLOCKSIZE
~\anaconda3\envs\py3_nlp\lib\tarfile.py in frombuf(cls, buf, encoding, errors)
1034
-> 1035 chksum = nti(buf[148:156])
1036 if chksum not in calc_chksums(buf):
~\anaconda3\envs\py3_nlp\lib\tarfile.py in nti(s)
190 except ValueError:
--> 191 raise InvalidHeaderError("invalid header")
192 return n
InvalidHeaderError: invalid header
During handling of the above exception, another exception occurred:
ReadError Traceback (most recent call last)
~\anaconda3\envs\py3_nlp\lib\site-packages\torch\serialization.py in _load(f, map_location,
pickle_module, **pickle_load_args)
594 try:
--> 595 return legacy_load(f)
596 except tarfile.TarError:
~\anaconda3\envs\py3_nlp\lib\site-packages\torch\serialization.py in legacy_load(f)
505
--> 506 with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as
tar, \
507 mkdtemp() as tmpdir:
~\anaconda3\envs\py3_nlp\lib\tarfile.py in open(cls, name, mode, fileobj, bufsize, **kwargs)
1590 raise CompressionError("unknown compression type %r" % comptype)
-> 1591 return func(name, filemode, fileobj, **kwargs)
1592
~\anaconda3\envs\py3_nlp\lib\tarfile.py in taropen(cls, name, mode, fileobj, **kwargs)
1620 raise ValueError("mode must be 'r', 'a', 'w' or 'x'")
-> 1621 return cls(name, mode, fileobj, **kwargs)
1622
~\anaconda3\envs\py3_nlp\lib\tarfile.py in __init__(self, name, mode, fileobj, format, tarinfo, dereference, ignore_zeros, encoding, errors, pax_headers, debug, errorlevel, copybufsize)
1483 self.firstmember = None
-> 1484 self.firstmember = self.next()
1485
~\anaconda3\envs\py3_nlp\lib\tarfile.py in next(self)
2310 elif self.offset == 0:
-> 2311 raise ReadError(str(e))
2312 except EmptyHeaderError:
ReadError: invalid header
During handling of the above exception, another exception occurred:
RuntimeError Traceback (most recent call last)
~\anaconda3\envs\py3_nlp\lib\site-packages\transformers\modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1210 try:
-> 1211 state_dict = torch.load(resolved_archive_file, map_location="cpu")
1212 except Exception:
~\anaconda3\envs\py3_nlp\lib\site-packages\torch\serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
425 pickle_load_args['encoding'] = 'utf-8'
--> 426 return _load(f, map_location, pickle_module, **pickle_load_args)
427 finally:
~\anaconda3\envs\py3_nlp\lib\site-packages\torch\serialization.py in _load(f, map_location, pickle_module, **pickle_load_args)
598 # .zip is used for torch.jit.save and will throw an un-pickling error here
--> 599 raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
600 # if not a tarfile, reset file offset and proceed
RuntimeError: C:\Users\user1/.cache\torch\sentence_transformers\sentence-transformers_bert-base-nli-mean-tokens\pytorch_model.bin is a zip archive (did you mean to use torch.jit.load()?)
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
<ipython-input-3-bba56aac60aa> in <module>
----> 1 model = SentenceTransformer('sentence-transformers/bert-base-nli-mean-tokens')
~\anaconda3\envs\py3_nlp\lib\site-packages\sentence_transformers\SentenceTransformer.py in __init__(self, model_name_or_path, modules, device, cache_folder)
88
89 if os.path.exists(os.path.join(model_path, 'modules.json')): #Load as SentenceTransformer model
---> 90 modules = self._load_sbert_model(model_path)
91 else: #Load with AutoModel
92 modules = self._load_auto_model(model_path)
~\anaconda3\envs\py3_nlp\lib\site-packages\sentence_transformers\SentenceTransformer.py in _load_sbert_model(self, model_path)
820 for module_config in modules_config:
821 module_class = import_from_string(module_config['type'])
--> 822 module = module_class.load(os.path.join(model_path, module_config['path']))
823 modules[module_config['name']] = module
824
~\anaconda3\envs\py3_nlp\lib\site-packages\sentence_transformers\models\Transformer.py in load(input_path)
122 with open(sbert_config_path) as fIn:
123 config = json.load(fIn)
--> 124 return Transformer(model_name_or_path=input_path, **config)
125
126
~\anaconda3\envs\py3_nlp\lib\site-packages\sentence_transformers\models\Transformer.py in __init__(self, model_name_or_path, max_seq_length, model_args, cache_dir, tokenizer_args, do_lower_case, tokenizer_name_or_path)
27
28 config = AutoConfig.from_pretrained(model_name_or_path, **model_args, cache_dir=cache_dir)
---> 29 self.auto_model = AutoModel.from_pretrained(model_name_or_path, config=config, cache_dir=cache_dir)
30 self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_name_or_path if tokenizer_name_or_path is not None else model_name_or_path, cache_dir=cache_dir, **tokenizer_args)
31
~\anaconda3\envs\py3_nlp\lib\site-packages\transformers\models\auto\auto_factory.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
393 if type(config) in cls._model_mapping.keys():
394 model_class = _get_model_class(config, cls._model_mapping)
--> 395 return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
396 raise ValueError(
397 f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
~\anaconda3\envs\py3_nlp\lib\site-packages\transformers\modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1212 except Exception:
1213 raise OSError(
-> 1214 f"Unable to load weights from pytorch checkpoint file for '{pretrained_model_name_or_path}' "
1215 f"at '{resolved_archive_file}'"
1216 "If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. "
OSError: Unable to load weights from pytorch checkpoint file for 'C:\Users\user1/.cache\torch\sentence_transformers\sentence-transformers_bert-base-nli-mean-tokens\' at 'C:\Users\user1/.cache\torch\sentence_transformers\sentence-transformers_bert-base-nli-mean-tokens\pytorch_model.bin'If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
Short:
OSError: Unable to load weights from pytorch checkpoint file for 'C:\Users\user1/.cache\torch\sentence_transformers\sentence-transformers_bert-base-nli-mean-tokens' at 'C:\Users\user1/.cache\torch\sentence_transformers\sentence-transformers_bert-base-nli-mean-tokens\pytorch_model.bin'If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
I do have the pytorch_model.bin in the '.cache\torch\sentence_transformers\sentence-transformers_bert-base-nli-mean-tokens' folder.
Why am I getting this error?

The reason for the error seems to be that the pre-trained model weight files are not available or loadable.
You can try that one to load pretrained model weight file:
from transformers import AutoModel
model = AutoModel.from_pretrained('sentence-transformers/bert-base-nli-mean-tokens')
Reference: https://huggingface.co/sentence-transformers/bert-base-nli-mean-tokens
Also, the model's hugging face page says:
This model is deprecated. Please don't use it as it produces sentence embeddings of low quality. You can find recommended sentence embedding models here: SBERT.net - Pretrained Models
Maybe you might want to take a look.

You may need to use the model without sentence_transformers.
The following code is tweaked from https://www.sbert.net/examples/applications/computing-embeddings/README.html
As I understand it, from the exception you need to pass from_tf=True to AutoModel.
from transformers import AutoTokenizer, AutoModel
import torch
#Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
token_embeddings = model_output[0] #First element of model_output contains all token embeddings
input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1)
sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9)
return sum_embeddings / sum_mask
#Sentences we want sentence embeddings for
sentences = ['This framework generates embeddings for each input sentence',
'Sentences are passed as a list of string.',
'The quick brown fox jumps over the lazy dog.']
#Load AutoModel from huggingface model repository
tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/bert-base-nli-mean-tokens')
model = AutoModel.from_pretrained('sentence-transformers/bert-base-nli-mean-tokens',from_tf=True)
#Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors='pt')
#Compute token embeddings
with torch.no_grad():
model_output = model(**encoded_input)
#Perform pooling. In this case, mean pooling
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

Connect to S3 accelerate endpoint with boto3

I want to download a file into a Python file object from an S3 bucket that has acceleration activated. I came across a few resources suggesting whether to overwrite the endpoint_url to "s3-accelerate.amazonaws.com" and/or to use the use_accelerate_endpoint attribute.
I have tried both, and several variations but the same error was returned everytime. One of the scripts I tried is:
from botocore.config import Config
import boto3
from io import BytesIO
session = boto3.session.Session()
s3 = session.client(
service_name='s3',
aws_access_key_id=<MY_KEY_ID>,
aws_secret_access_key=<MY_KEY>,
region_name="us-west-2",
config=Config(s3={"use_accelerate_endpoint": True,
"addressing_style": "path"}))
input = BytesIO()
s3.download_fileobj(<MY_BUCKET>,<MY_KEY>, input)
Returns the following error:
---------------------------------------------------------------------------
ClientError Traceback (most recent call last)
<ipython-input-61-92b89b45f215> in <module>()
11 "addressing_style": "path"}))
12 input = BytesIO()
---> 13 s3.download_fileobj(bucket, filename, input)
14
15
~/Project/venv/lib/python3.5/site-packages/boto3/s3/inject.py in download_fileobj(self, Bucket, Key, Fileobj, ExtraArgs, Callback, Config)
568 bucket=Bucket, key=Key, fileobj=Fileobj,
569 extra_args=ExtraArgs, subscribers=subscribers)
--> 570 return future.result()
571
572
~/Project//venv/lib/python3.5/site-packages/s3transfer/futures.py in result(self)
71 # however if a KeyboardInterrupt is raised we want want to exit
72 # out of this and propogate the exception.
---> 73 return self._coordinator.result()
74 except KeyboardInterrupt as e:
75 self.cancel()
~/Project/venv/lib/python3.5/site-packages/s3transfer/futures.py in result(self)
231 # final result.
232 if self._exception:
--> 233 raise self._exception
234 return self._result
235
~/Project/venv/lib/python3.5/site-packages/s3transfer/tasks.py in _main(self, transfer_future, **kwargs)
253 # Call the submit method to start submitting tasks to execute the
254 # transfer.
--> 255 self._submit(transfer_future=transfer_future, **kwargs)
256 except BaseException as e:
257 # If there was an exception raised during the submission of task
~/Project/venv/lib/python3.5/site-packages/s3transfer/download.py in _submit(self, client, config, osutil, request_executor, io_executor, transfer_future)
347 Bucket=transfer_future.meta.call_args.bucket,
348 Key=transfer_future.meta.call_args.key,
--> 349 **transfer_future.meta.call_args.extra_args
350 )
351 transfer_future.meta.provide_transfer_size(
~/Project/venv/lib/python3.5/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
310 "%s() only accepts keyword arguments." % py_operation_name)
311 # The "self" in this scope is referring to the BaseClient.
--> 312 return self._make_api_call(operation_name, kwargs)
313
314 _api_call.__name__ = str(py_operation_name)
~/Project/venv/lib/python3.5/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
603 error_code = parsed_response.get("Error", {}).get("Code")
604 error_class = self.exceptions.from_code(error_code)
--> 605 raise error_class(parsed_response, operation_name)
606 else:
607 return parsed_response
ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
When I run the same script with "use_accelerate_endpoint": False it works fine.
However, it returned the same error when:
I overwrite the endpoint_url with "s3-accelerate.amazonaws.com"
I define "addressing_style": "virtual"
When running
s3.get_bucket_accelerate_configuration(Bucket=<MY_BUCKET>)
I get {..., 'Status': 'Enabled'} as expected.
Any idea what is wrong with that code and what I should change to properly query the accelerate endpoint of that bucket?
Using python3.5 with boto3==1.4.7, botocore==1.7.43 on Ubuntu 17.04.
EDIT:
I have also tried a similar script for uploads:
from botocore.config import Config
import boto3
from io import BytesIO
session = boto3.session.Session()
s3 = session.client(
service_name='s3',
aws_access_key_id=<MY_KEY_ID>,
aws_secret_access_key=<MY_KEY>,
region_name="us-west-2",
config=Config(s3={"use_accelerate_endpoint": True,
"addressing_style": "virtual"}))
output = BytesIO()
output.seek(0)
s3.upload_fileobj(output, <MY_BUCKET>,<MY_KEY>)
Which works without the use_accelerate_endpoint option (so my keys are fine), but returns this error when True:
ClientError: An error occurred (SignatureDoesNotMatch) when calling the PutObject operation: The request signature we calculated does not match the signature you provided. Check your key and signing method.
I have tried both addressing_style options here as well (virtual and path)

Using boto3==1.4.7 and botocore==1.7.43.
Here is one way to retrieve an object from a bucket with transfer acceleration enabled.
import boto3
from botocore.config import Config
from io import BytesIO
config = Config(s3={"use_accelerate_endpoint": True})
s3_resource = boto3.resource("s3",
aws_access_key_id=<MY_KEY_ID>,
aws_secret_access_key=<MY_KEY>,
region_name="us-west-2",
config=config)
s3_client = s3_resource.meta.client
file_object = BytesIO()
s3_client.download_fileobj(<MY_BUCKET>, <MY_KEY>, file_object)
Note that the client sends a HEAD request to the accelerated endpoint before a GET.
The canonical request of which looks somewhat like the following:
CanonicalRequest:
HEAD
/<MY_KEY>
host:<MY_BUCKET>.s3-accelerate.amazonaws.com
x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
x-amz-date:20200520T204128Z
host;x-amz-content-sha256;x-amz-date
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Some reasons why the HEAD request can fail include:
Object with given key doesn't exist or has strict access control enabled
Invalid credentials
Transfer acceleration isn't enabled

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Received client error (400) deploying huggingface bigscience/bloom to SageMaker - python-3.x

I see that you are using this example. 4XX error occurs usually when the model is not downloaded and not available from HuggingFace Hub or not deployed onto SageMaker. I would suggest to check your SageMaker console if the model is deployed and do a Inference using local mode.

Related

Azureml TabularDataset to_pandas_dataframe() returns InvalidEncoding error

Azure ML Tabular Dataset : missing 1 required positional argument: 'stream_column'

How to load a torchvision model from disk?

Can not find the pytorch model when loading BERT model in Python

Connect to S3 accelerate endpoint with boto3

Categories

Resources