How do I use build_full_result() in Boto3 - python-3.x

I was Googling around to understand how boto3 paginator works, and found a solution that potentially doesn't require writing any logic with NextToken and While loops.
Still, I'm not quite sure what I'm getting when I'm using this:
client = boto3.client('ec2', region_name='eu-west-1')
results = (
client.get_paginator('describe_instances')
.paginate()
.build_full_result()
)
print(results)
I got a huge JSON output and I'm not sure whether I got what I wanted, which is basically the output of all of my EC2 instances.
I'm also not sure how to loop over it, I keep getting TypeError: string indices must be integers which didn't happen before when I used something like:
for instance in response_iterator:
instance = instance['Reservations'][0]
instance_id = instance['Instances'][0]['InstanceId']
print(instance_id)
I would love to understand how to use the build_full_result() method.
I saw a post that says that it's not documented yet, pretty recent to now (as of writing this post).

Interesting find.. this isn't mentioned anywhere in the latest version of boto3 documentation, however it does appear to properly return all available results.
Below is an example from Lambda that shows how to perform a simple loop through the response.. you can update the last two lines to handle the response syntax from EC2 describe instances.
import boto3
client = boto3.client('lambda')
results = (
client.get_paginator('list_functions')
.paginate()
.build_full_result()
)
for result in results['Functions']:
print(result['FunctionName'])

Related

How to retrieve nested output from XCom using taskflow syntax in Airflow

Well, I know this seems to be possible I just don't know how. To begin with, I am using traditional operators (without #task decorator) but I am interested in XComArgs return output format from these operators that can be used in downstream tasks. Below is a sample example
task_1 = DummyOperator(
task_id = 'task_1'
) # returns {"data": {"foo" : [{"cmd": "ls"}]}}
task_2 = BashOperator(
task_2='task_2',
cmd=task_1.output['return_value']['data']['foo'][0]['cmd'] # does not give what I need and returns null.
#cmd = f"{{ ti.xcom_pull(task_ids = 'task_1', key='return_value')['data']['foo'][0]['cmd'] }}" Gives what I need
)
In this example what is working for me which is pure Jinja templating and the new syntax does not work for me using XComArgs. I have tried changing the argument render_template_as_native_obj=True in Dag configuration but does not change anything. I want to use .output format which returns XcomArgs object and is returning the complete dict but have not been able to use the nested keys like above. Also, have tried converting string to JSON and all those combinations but does not seem to work.
Unfortunately, retrieving nested values from XComArgs in a limitation of the TaskFlow API.
The TaskFlow API uses __getitem__ to override the XCom key to use. In your example, the key ends up being "cmd" rather than the value of what cmd represents in that nested object. You'll have to use the original ti.xcom_pull() method until that limitation is addressed.

How Can I Check For The Existence of a UUID

I am trying to check for the existence of a UUID as a primary key in my Django environment...and when it exists...my code works fine...But if it's not present I get a "" is not a Valid UUID...
Here's my code....
uuid_exists = Book.objects.filter(id=self.object.author_pk,is_active="True").first()
I've tried other variations of this with .exists() or .all()...but I keep getting the ['“” is not a valid UUID.'] error.
I did come up with a workaround....
if self.object.author_pk is not '':
book_exists = Book.objects.filter(id=self.object.author_pk,is_active="True").first()
context['author_exists'] = author_exists
Is this the best way to do this? I was hoping to be able to use a straight filter...without clarifying logic....But I've worked all afternoon and can't seem to come up with anything better. Thanks in advance for any feedback or comments.
I've had the same issue and this is what I have:
Wrapping it into try/except (in my case it's a View so it's supposed to return a Response object)
try:
object = Object.objects.get(id=object_id)
except Exception as e:
return Response(data={...}, status=status.HTTP_40...
It gets to the exception (4th line) but somehow sends '~your_id~' is not a valid UUID. text instead of proper data. Which might be enough in some cases.
This seems like an overlook, so might as well get a fix soon. I don't have enough time to investigate deeper, unfortunately.
So the solution I came up with is not ideal either but hopefully is a bit cleaner and faster than what you're using rn.
# Generate a list of possible object IDs (make use of filters in order to reduce the DB load)
possible_ids = [str(id) for id in Object.objects.filter(~ filters here ~).values_list('id', flat=True)]
# Return an error if ID is not valid
if ~your_id~ not in possible_ids:
return Response(data={"error": "Database destruction sequence initialized!"}, status=status.HTTP_401_UNAUTHORIZED)
# Keep working with the object
object = Objects.objects.get(id=object_id)

Can't list bucket objects on Scaleway using boto3

I saw a few similar posts, but unfortunately none helped me.
I have an s3 bucket (on scaleway), and I'm trying to simply list all objects contained in that bucket, using boto3 s3 client as follow:
s3 = boto3.client('s3',
region_name=AWS_S3_REGION_NAME,
endpoint_url=AWS_S3_ENDPOINT_URL,
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)
all_objects = s3.list_objects_v2(Bucket=AWS_STORAGE_BUCKET_NAME)
This simple piece of code responds with an error:
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the ListObjects operation: The specified key does not exist.
First, the error seems inapropriate to me since I'm not specifying any key to search. I also tried to pass a Prefix argument to this method to narrow down the search to a specific subdirectory, same error.
Second, I tried to achieve the same thing using boto3 Resource rather than Client, as follow:
session = boto3.Session(
region_name=AWS_S3_REGION_NAME,
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)
resource = session.resource(
's3',
endpoint_url=AWS_S3_ENDPOINT_URL,
)
for bucket in resource.buckets.all():
print(bucket.name)
That code produces absolutely nothing. One weird thing that strikes me is that I don't pass the bucket_name anywhere here, which seems to be normal according to aws documentation
There's no chance that I misconfigured the client, since I'm able to use the put_object method perfectly with that same client. One strange though: when I want to put a file, I pass the whole path to put_object as Key (as I found it to be the way to go), but the object is inserted with the bucket name prepend to it. So let's say I call put_object(Key='/path/to/myfile.ext'), the object will end up to be /bucket-name/path/to/myfile.ext.
Is this strange behavior the key to my problem ? How can I investigate what's happening, or is there another way I could try to list bucket files ?
Thank you
EDIT: So, after logging the request that boto3 client is sending, I noticed that the bucket name is append to the url, so instead of requesting https://<bucket_name>.s3.<region>.<provider>/, it requests https://<bucket_name>.s3.<region>.<provider>/<bucket-name>/, which is leading to the NoSuchKey error.
I took a look into the botocore library, and I found this:
url = _urljoin(endpoint_url, r['url_path'], host_prefix)
in botocore.awsrequest line 252, where r['url_path'] contains /skichic-bucket?list-type=2. So from here, I should be able to easily patch the library core to make it work for me.
Plus, the Prefix argument is not working, whatever I pass into it I always receive the whole bucket content, but I guess I can easily patch this too.
Now it's not satisfying, since there's no issue related to this on github, I can't believe that the library contains such a bug that I'm the first one to encounter.
Does anyone can explain this whole mess ? >.<
For those who are facing the same issue, try changing your endpoint_url parameter in your boto3 client or resource instantiation from https://<bucket_name>.s3.<region>.<provider> to https://s3.<region>.<provider> ; i.e for Scaleway : https://s3.<region>.scw.cloud.
You can then set the Bucket parameter to select the bucket you want.
list_objects_v2(Bucket=<bucket_name>)
you can try this. you'll have to use your resource instead of my s3sr.
s3sr = resource('s3')
bucket = 'your-bucket'
prefix = 'your-prefix/' # if no prefix, pass ''
def get_keys_from_prefix(bucket, prefix):
'''gets list of keys for given bucket and prefix'''
keys_list = []
paginator = s3sr.meta.client.get_paginator('list_objects_v2')
# use Delimiter to limit search to that level of hierarchy
for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/'):
keys = [content['Key'] for content in page.get('Contents')]
print('keys in page: ', len(keys))
keys_list.extend(keys)
return keys_list
keys_list = get_keys_from_prefix(bucket, prefix)
After looking more closely into things, I've found out that (a lot) of botocore services endpoints patterns starts with the bucket name. For example, here's the definition of the list_objects_v2 service:
"ListObjectsV2":{
"name":"ListObjectsV2",
"http":{
"method":"GET",
"requestUri":"/{Bucket}?list-type=2"
},
My guess is that in the standard implementation of AWS S3, there's a genericendpoint_url (which explains #jordanm comment) and the targeted bucket is reached through the endpoint.
Now, in the case of Scaleway, there's an endpoint_url for each bucket, with the bucket name contained in that url (e.g https://<bucket_name>.s3.<region>.<provider>), and any endpoint should directly starts with a bucket Key.
I made a fork of botocore where I rewrote every endpoint to remove the bucket name, if that can help someone in the future.
Thank's again to all contributors !

Google Cloud Datastore Cursor with google.cloud.ndb

I am working with Google Cloud Datastore using the latest google.cloud.ndb library
I am trying to implement pagination use Cursor using the following code.
The same is not fetching the data correctly.
[1] To Fetch Data:
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5)
This code works fine and fetches 5 entities from MyModel
I want to implementation pagination that can be integrated with a Web frontend
[2] To Fetch Next Set of Data
from google.cloud.ndb._datastore_query import Cursor
nextpage_value = "2"
nextcursor = Cursor(cursor=nextpage_value.encode()) # Converts to bytes
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5, start_cursor= nextcursor)
[3] To Fetch Previous Set of Data
previouspage_value = "1"
prevcursor = Cursor(cursor=previouspage_value.encode())
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5, start_cursor=prevcursor)
The [2] & [3] sets of code do not fetch paginated data, but returns results same as results of codebase [1].
Please note I'm working with Python 3 and using the
latest "google.cloud.ndb" Client library to interact with Datastore
I have referred to the following link https://github.com/googleapis/python-ndb
I am new to Google Cloud, and appreciate all the help I can get.
Firstly, it seems to me like you are expecting to use the wrong kind of pagination. You are trying to use numeric values, whereas the datastore cursor is providing cursor-based pagination.
Instead of passing in byte-encoded integer values (like 1 or 2), the datastore is expecting tokens that look similar to this: 'CjsSNWoIb3Z5LXRlc3RyKQsSBFVzZXIYgICAgICAgAoMCxIIQ3ljbGVEYXkiCjIwMjAtMTAtMTYMGAAgAA=='
Such a cursor you can obtain from the first call to the fetch_page() method, which returns a tuple:
(results, cursor, more) where results is a list of query results, cursor is a cursor pointing just after the last result returned, and more indicates whether there are (likely) more results after that
Secondly, you should be using fetch_page() instead of fetch_page_async(), since the second method does not return you the cursors you need for pagination. Internally, fetch_page() is calling fetch_page_async() to get your query results.
Thirdly and lastly, I am not entirely sure whether the "previous page" use-case is doable using the datastore-provided pagination. It may be that you need to implement that yourself manually, by storing some of the cursors.
I hope that helps and good luck!

Any examples of using multilogicadapter in chatterbot?

I am trying to combine multiple logic adapters in Python chatterbot. I cannot seem to get it right. I tried this:
english_bot = ChatBot("English Bot",
storage_adapter="chatterbot.storage.SQLStorageAdapter",
multi_logic_adapter = [
"chatterbot.logic.MathematicalEvaluation",
"chatterbot.logic.TimeLogicAdapter",
"chatterbot.logic.BestMatch"]
)
Only BestMatch seems to be active
And I tried this:
english_bot = ChatBot("English Bot",
storage_adapter="chatterbot.storage.SQLStorageAdapter",
logic_adapter = [
"chatterbot.logic.multi_adapter.MultiLogicAdapter",
"chatterbot.logic.MathematicalEvaluation",
"chatterbot.logic.TimeLogicAdapter",
"chatterbot.logic.BestMatch"]
)
But I get this error: AttributeError: 'NoneType' object has no attribute 'confidence' and none of the logic_adapters seems to be active.
Thanks,
Herb
BestMatch
Adapter is the default adapter for chatterbot, You don't need explicitly specify that. More information http://chatterbot.readthedocs.io/en/stable/logic/index.html#best-match-adapter
And you code should like this
# -*- coding: utf-8 -*-
from chatterbot import ChatBot
bot = ChatBot(
"English Bot",
logic_adapters=[
"chatterbot.logic.MathematicalEvaluation",
"chatterbot.logic.TimeLogicAdapter"
]
)
# Print an example of getting one math based response
response = bot.get_response("What is 4 + 9?")
print(response)
# Print an example of getting one time based response
response = bot.get_response("What time is it?")
print(response)
Every logic adapter in logic_adapters=[] is automatically processed by MultiLogicAdapter. You might need to tweak the confidence levels though.
More info about the MultiLogicAdapter here:
http://chatterbot.readthedocs.io/en/stable/logic/multi-logic-adapter.html
Multi Logic adapter is an inbuilt class and is not explicitly defined in the code.You can see this statement in the introduction part:"ChatterBot internally uses a special logic adapter that allows it to choose the best response generated by any number of other logic adapters" This is the link - http://chatterbot.readthedocs.io/en/stable/logic/multi-logic-adapter.html
Also, similar query is already available on stackover flow. Refer this as well.
Error while using chatterbot
The MultiLogicAdapter typically doesn't get used directly in this way.
Each logic adapter that you add to the logic_adapters=[] will get processed by the MultiLogicAdapter internally by ChatterBot, no need to explicitly specify it.

Resources