DynamoDB Pagination via Boto3, NextToken is not present but LastEvaluatedKey is?

The documentation for the boto3 DynamoDB paginators specifies that NextToken should be returned when paging; you would then include that token as StartingToken in the next query to resume a paging session (as would happen when accessing information via a RESTful API).
However, my testing shows that it doesn't return NextToken in the results, but rather LastEvaluatedKey. I thought I could use LastEvaluatedKey as the token, but that doesn't work.
paginator = client.get_paginator('scan')
page_iterator = paginator.paginate(
    TableName='test1',
    PaginationConfig={'PageSize': 1, 'MaxItems': 5000, 'MaxSize': 1}
)
for page in page_iterator:
    print(page)
    break
I would expect the page object returned from the page_iterator to include a NextToken key, but it does not:
{'Items': [{'PK': {'S': '99'}, 'SK': {'S': '99'}, 'data': {'S': 'Test Item 99'}}], 'Count': 1, 'ScannedCount': 1, 'LastEvaluatedKey': {'PK': {'S': '99'}, 'SK': {'S': '99'}}, 'ResponseMetadata': {'RequestId': 'DUE559L8KVKVH8H7G0G2JH0LUNVV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Mon, 27 May 2019 14:22:09 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '153', 'connection': 'keep-alive', 'x-amzn-requestid': 'DUE559L8KVKVH8H7G0G2JH0LUNVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '3759060959'}, 'RetryAttempts': 0}}
What am I missing?
UPDATE: Somehow related to this? How to use Boto3 pagination

There are a few ways to address this using the boto3 scan paginator.
The first option is to call build_full_result like so:
result = paginator.paginate(TableName="your_table", PaginationConfig={"MaxItems":10, "PageSize": 10}).build_full_result()
That returns a response with 10 items, and 'NextToken' is populated provided there are more than 10 items. This is probably the simplest way: you can treat MaxItems as your returned page size, and if 'NextToken' is empty you are at the end of the scan.
I noticed that if you don't specify a page size the results are the same, but the consumed capacity and 'ScannedCount' are higher.
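A minimal loop sketch of that approach, assuming a boto3 client and a table named "your_table" (both hypothetical):

import boto3

client = boto3.client("dynamodb")
paginator = client.get_paginator("scan")

start_token = None
while True:
    config = {"MaxItems": 10, "PageSize": 10}
    if start_token:
        config["StartingToken"] = start_token
    result = paginator.paginate(TableName="your_table", PaginationConfig=config).build_full_result()
    for item in result.get("Items", []):
        pass  # process one item here
    start_token = result.get("NextToken")
    if not start_token:
        break  # an empty 'NextToken' means the scan is complete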
Another way is to do the encoding of the 'StartingToken' using the TokenEncoder in botocore.paginate directly.
If the initial call to the paginator is like:
pagination_config = {
    "MaxItems": 5000,
    "PageSize": 10,
}
scan_iterator = scan_paginator.paginate(
    TableName="your_table_name",
    PaginationConfig=pagination_config
)
The paged results will be as the question describes. The first 10 results will be returned in the first page, and 'NextToken' isn't specified but 'LastEvaluatedKey' is.
To use it, encode the returned 'LastEvaluatedKey' as the 'ExclusiveStartKey' and pass that in as the 'StartingToken' in the pagination config.
from botocore.paginate import TokenEncoder

encoder = TokenEncoder()
for page in scan_iterator:
    if "LastEvaluatedKey" in page:
        encoded_token = encoder.encode({"ExclusiveStartKey": page["LastEvaluatedKey"]})
Then:
pagination_config = {
    "MaxItems": 500,
    "PageSize": 10,
    "StartingToken": encoded_token
}
The reason to encode the primary key as the 'ExclusiveStartKey' is that it's what the actual scan API expects. Essentially the paginators are encoding / decoding the 'LastEvaluatedKey' and 'ExclusiveStartKey' into the 'NextToken' and 'StartingToken' values. If you do a base64 decode of the 'NextToken' returned when doing build_full_result, you'll see that it also uses the 'ExclusiveStartKey'.
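You can verify this with the matching TokenDecoder; a small sketch, where next_token stands in for a 'NextToken' value returned by build_full_result:

from botocore.paginate import TokenDecoder

decoder = TokenDecoder()
print(decoder.decode(next_token))
# e.g. {'ExclusiveStartKey': {'PK': {'S': '99'}, 'SK': {'S': '99'}}}

Resuming the scan is then just a matter of passing the updated pagination_config back into paginate:

scan_iterator = scan_paginator.paginate(
    TableName="your_table_name",
    PaginationConfig=pagination_config
)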

Related

Can I reuse the data presented in a datatable?

I have a Flask application that makes a call to my API to pull down some records. I then pass the JSON response to json2html, then to DataTables to create a table. I plan on creating a button on my table with the row select option.
My question is, can I use the data in my table to populate a form on another page? Effectively I want to be able to select a row and click an update button, which takes me to a form page with the form values populated from the row values. So if I click row 1, my form page would be populated with:
id = 6a3d7026-43f3-67zt-9211-99dfc6fee82e
name = test
description = test
number = 20934120
....some other fields not from the table
I am not sure how I can achieve this without a database. Not looking for a full solution, just some pointers on the best methods or some good documentation. Thanks.
Example response:
{'count': 2, 'total': 2,
 'data': [
     {'id': '6a3d7026-43f3-67zt-9211-99dfc6fee82e',
      'name': 'test',
      'properties': {'Description#en': 'test', 'Number#en': '20934120'}},
     {'id': '6a3d7026-43f3-67zt-9211-99hdttbhh4ed',
      'name': 'test',
      'properties': {'Description#en': 'test', 'Number#en': '20934121'}}]}
Inside app.py:
....REQUEST
response = requests.request("GET", url, headers=headers, data=payload)
data = json.loads(response.text)
output = json2html.convert(json=data, table_attributes="id=\"table1\" class=\"table1\"")
return render_template("update.html", data=data, output=output)
Inside update.html:
{{output|safe}}
<script>
$(document).ready(function() {
    $('#table1').DataTable({
        "searching": false,
        dom: 'Bfrtip',
        buttons: [
            {
                text: 'Update',
                action: function (e, dt, node, config) {
                    alert('This is my button');
                }
            }
        ]
    });
});
</script>
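One possible direction (a sketch, not a full solution): no database is needed if the Update button passes the selected row's values to the form page, for example as query parameters. A minimal Flask-side sketch, where the /update_form route and the field names are hypothetical:

from flask import Flask, request, render_template

app = Flask(__name__)

@app.route("/update_form")
def update_form():
    # Values arrive from the selected row, e.g.
    # /update_form?id=6a3d...&name=test&description=test&number=20934120
    defaults = {
        "id": request.args.get("id", ""),
        "name": request.args.get("name", ""),
        "description": request.args.get("description", ""),
        "number": request.args.get("number", ""),
    }
    # form.html would render its inputs with value="{{ defaults['name'] }}" etc.
    return render_template("form.html", defaults=defaults)

The DataTables button's action callback would then build that URL from the selected row's data (with the Select extension, something like dt.rows({selected: true}).data()) and navigate to it.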

How to get data using search criteria with a limit using DynamoDB in NodeJS?

I want to use a Bootstrap table for a listing with pagination, sorting, and filtering.
Pagination is done, but when I filter, the search only looks at the limited set of items defined by Limit. How can I apply pagination, sorting, and filtering queries on DynamoDB just like MySQL (limit, offset)?
Current query:
var params = {
    TableName: tableName,
    KeyConditionExpression: (req.body.search.length == 0) ? null : "#emailid = :search_Query",
    FilterExpression: (req.body.search.length == 0) ? null : "contains(#name, :search_Query) OR contains(#lastName, :search_Query)",
    ExpressionAttributeNames: (req.body.search.length == 0) ? null : {
        "#name": "name",
        "#lastName": "lastName",
    },
    ExpressionAttributeValues: (req.body.search.length == 0) ? null : {
        ":search_Query": req.body.search
    },
    Limit: req.body.limit,
    ExclusiveStartKey: (req.body.offset == 0) ? null : {emailid: req.body.emailid},
};
var lastData = [];
docClient.scan(params, function scanUntilDone(err, data) {
});
Current behaviour:
Data is fetched only from within the defined limit. If I pass a limit of 5, the search looks at only those five items, but I need five matching items from all the data, just like MySQL's LIMIT. How can I write that query?
Thank you in advance.
You would have to pass the search context to the client in your response: the LastEvaluatedKey for the "current" page, as well as an offset for how many results on that page you have already included. The client would then include the search context in the next request. Use a higher limit on the actual DynamoDB requests to keep the number of server-side requests down.
Example (where the client wants neat pages of 5 or fewer results):
API Request 1: Limit=5
DDB Request 1: ExclusiveStartKey=null, Limit=100
DDB Response 1: Results=3, LastEvaluatedKey=A
DDB Request 2: ExclusiveStartKey=A, Limit=100
DDB Response 2: Results=77, LastEvaluatedKey=B
API Response 1: Results=[<3 results from page 1 + 2 from page 2>], SearchKey={A: 2} // the offset
API Request 2: Limit=5, SearchKey={A: 2}
DDB Request 3: ExclusiveStartKey=A, Limit=100
DDB Response 3: Results=77, LastEvaluatedKey=B
API Response 2: Results=[<5 results from page 2, starting at offset 2>], SearchKey={A: 7} // the new offset, same page still
It would probably be a good idea to encode/encrypt/obfuscate the search context in the response to the client.
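A rough server-side sketch of that scheme (in Python with boto3 as an assumption; the question uses Node.js, but the logic is the same, and the table and attribute names are hypothetical):

import boto3
from boto3.dynamodb.conditions import Attr

table = boto3.resource("dynamodb").Table("your_table")

def search_page(search, page_size=5, start_key=None, offset=0):
    # Return page_size matches, skipping `offset` matches past start_key.
    kwargs = {
        "Limit": 100,  # scan bigger chunks than the API page to cut round trips
        "FilterExpression": Attr("name").contains(search) | Attr("lastName").contains(search),
    }
    if start_key:
        kwargs["ExclusiveStartKey"] = start_key
    matches = []
    while len(matches) < offset + page_size:
        resp = table.scan(**kwargs)
        matches.extend(resp.get("Items", []))
        last_key = resp.get("LastEvaluatedKey")
        if not last_key:
            break  # scanned the whole table
        kwargs["ExclusiveStartKey"] = last_key
    # A real implementation would also compute the next SearchKey
    # (the page's start key plus the new offset) to send back to the client.
    return matches[offset:offset + page_size]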

Printing a Twitter response in Python

I am making a request to a twitter URL
response = requests.get(url, auth=auth_obj)
If I do this:
for data in response:
    print(data)
I get a response from it that contains all the data, username, location etc.
{"users":[{"id":2190618097,"id_str":"2190618097","name":"huramachi","screen_name":"huramachi73","location":"entre Vincennes et R
\u00e9publique ","description":"Nuclear winter is coming!!!","
So, that is JSON.
And if I do this:
print(json.dumps(parsed, indent=4, sort_keys=True))
it looks like this:
{
    "next_cursor": 1591504703761404265,
    "next_cursor_str": "1591504703761404265",
    "previous_cursor": 0,
    "previous_cursor_str": "0",
    "users": [
        {
            "blocked_by": false,
            "blocking": false,
            "contributors_enabled": false,
            "created_at": "Tue Nov 12 16:07:59 +0000 2013",
            "default_profile": true,
            "default_profile_image": false,
            "description": "Nuclear winter is coming!!!",
            "entities": {
                "description": {
                    "urls": []
                }
            },
            "favourites_count": 781,
            "follow_request_sent": false,
            "followers_count": 188,
            "following": false,
            "friends_count": 2054,
If I do this, I can see that parsed is a dictionary:
parsed = json.loads(response.text)
print(type(parsed))
but how do I print this dictionary, or organize it so that I can save entities in lists?
First of all, thanks to all those who tried to help.
Here is what worked for me, and why.
Why:
Twitter returns a JSON response, but this object has parents and children, so the printing can only be complete if you nest your loops and know which are the parent and which are the child objects, according to the Twitter dev docs:
Tweets are the basic atomic building block of all things Twitter. Tweets are also known as “status updates.” The Tweet object has a long list of ‘root-level’ attributes, including fundamental attributes such as id, created_at, and text. Tweet objects are also the ‘parent’ object to several child objects. Tweet child objects include user, entities, and extended_entities. Tweets that are geo-tagged will have a place child object.
You can see there several printouts of what their JSON looks like, and also in my own printout above.
So, if I want to print out one of the children, for example, location, this is what worked for me:
objeto = json.loads(response.text)
for element in objeto['users']:
    print(element['location'])
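The same pattern extends to collecting child attributes into lists, as the question asks; a short sketch using keys from the sample response above:

# one attribute per user, collected into a list
locations = [element.get('location') for element in objeto['users']]
# nested children work the same way, one level deeper
urls = [element['entities']['description']['urls'] for element in objeto['users']]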

Google Analytics Reporting API v4 does not show isDataGolden

I am using the Python example to run a query. Here is my request:
body = {
    'reportRequests': [
        {
            'viewId': VIEW_ID,
            'dateRanges': [{'startDate': '2018-01-01', 'endDate': '2018-01-16'}],
            'metrics': [{'expression': 'ga:sessions'}],
            'dimensions': [{'name': 'ga:country'}],
            'samplingLevel': 'LARGE'
        }]
}
In the response, I checked and there is valid data; however, I am looking for the "isDataGolden" field in the ReportData object and it's not present. Only these fields are present: 'totals', 'rowCount', 'rows', 'minimums', 'maximums'.
Does anyone have any thoughts on why this field is not present? Google's documentation shows that it is supposed to be there.
Thanks.
If any boolean field is absent, you can assume its value is false.
From the documentation:
Note: When parsing the response body, if a boolean field is absent, you can assume its value is false. Only boolean fields with true value are shown in the response; for example, the isDataGolden field will be in the response if its value is true.
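So in code, read the flag with a default; a one-line sketch, assuming response is the dict returned by batchGet(...).execute():

# absent boolean fields default to False, per the note above
is_golden = response['reports'][0]['data'].get('isDataGolden', False)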

Has the Bitcoin getblocktemplate response json changed? Where is the coinbasetxn now?

When calling getblocktemplate over the Bitcoin RPC, the JSON (dictionary) response that comes back no longer seems to have a 'coinbasetxn' key in it. Where is the coinbase transaction now?
Sample response to getblocktemplate():
{'height': 453825, 'mintime': 1487548277, 'version': 536870912,
 'coinbasevalue': 1306788433,
 'previousblockhash': '0000000000000000019ca63484f8251b15647869d4c36ec5b201277f3e2aa70b',
 'rules': ['csv'], 'sigoplimit': 20000, 'weightlimit': 4000000,
 'mutable': ['time', 'transactions', 'prevblock'],
 'target': '0000000000000000027e93000000000000000000000000000000000000000000',
 'bits': '18027e93',
 'longpollid': '0000000000000000019ca63484f8251b15647869d4c36ec5b201277f3e2aa70b20632',
 'vbrequired': 0, 'noncerange': '00000000ffffffff',
 'curtime': 1487550914, 'coinbaseaux': {'flags': ''},
 'transactions': [
     {'depends': [],
      'data': '0100000001fb5b6947704577fd09260adf7f80c92ada4776ca7674a5fb8af40df3c747293a010000006a473044022048aab0d8bd6c127696ce2cceb42693af2ae8eec561a33acf43577193070cd965022043667ae3c25661d251133b75fbefa8b9b5d3dddeedae028c90899c642f693479012103c7b4ed6b91df7eb7d2bd62a6257dd2e2fa79d07e81080b7e95bcd3d9e448f464feffffff02504b4500000000001976a914fd09ed8b3099ee1f67693c3c6c25ca7ecb150fcf88ac48c5374e000000001976a914a97d5d95cd416dc0f734cb028f0785a4f545cea988ac96ec0600',
      'txid': '2ed056bed0417623433d6b600dfb6afcebb8c22b1f42b6e776cf8263289fb5ac',
      'sigops': 2, 'weight': 900, 'fee': 204345,
      'hash': '2ed056bed0417623433d6b600dfb6afcebb8c22b1f42b6e776cf8263289fb5ac'},
     {'depends': [],
      'data': '01000000011e7d9c88fe30a4c9fa1ec229e7c11704ec0185e36...... (...rest of transactions...)
Bitcoin Core doesn't provide a coinbasetxn in the block template; it's the pool software that adds it. If you're writing mining software that connects directly to bitcoind without using a pool, you'll need to generate the coinbase transaction yourself.
Confirmed that it is the first transaction of the transaction list. Can close this question (no longer need help). Thank you.
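For context, a hedged sketch of fetching the template with the python-bitcoinrpc package (an assumption; any JSON-RPC client works) and the fields you would use when assembling your own coinbase transaction:

from bitcoinrpc.authproxy import AuthServiceProxy

rpc = AuthServiceProxy("http://user:pass@127.0.0.1:8332")  # hypothetical credentials
template = rpc.getblocktemplate({"rules": ["segwit"]})

reward = template["coinbasevalue"]  # subsidy + fees, in satoshis
height = template["height"]         # BIP 34 requires the height in the coinbase scriptSig
# Build the coinbase tx paying `reward` to your own address, then place it
# first in the block's transaction list, ahead of template["transactions"].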
