How to scan between date range using Lambda and DynamoDB? - node.js

I'm attempting to scan between a date range using a Node Lambda function. I have the data being scanned correctly, but I can't seem to get the date expression to work correctly.
var AWS = require('aws-sdk');
var dynamodb = new AWS.DynamoDB({apiVersion: '2012-08-10'});

exports.handler = function(event, context) {
    var tableName = "MyDDBTable";
    dynamodb.scan({
        TableName: tableName,
        FilterExpression: "start_date < :start_date",
        ExpressionAttributeValues: {
            ":start_date": {
                "S": "2016-12-01"
            }
        }
    }, function(err, data) {
        context.succeed(data);
    });
};
This currently doesn't filter on a range; it's just looking at a single date right now. I didn't want to add an AND to the expression until I knew this much was working.
A sample document in my DynamoDB is structured like so:
{
    "end_date": {
        "S": "2016-12-02"
    },
    "name": {
        "S": "Name of document"
    },
    "start_date": {
        "S": "2016-10-10"
    },
    "document_id": {
        "N": "7"
    }
}
The document_id is my primary key. I'm pretty new to this whole Lambda / DynamoDB combination, so I may have this completely set up wrong, but this is what I've managed to complete through my research.
What I'm ultimately trying to achieve is given a start date and an end date, return all DynamoDB documents that have a date range within that. Any help would be greatly appreciated.

Firstly, the scan operation is correct. The dynamodb.scan should be executed in a loop until LastEvaluatedKey is no longer present in the response.
The Lambda is not returning the result because the matching data was probably not found in the first scanned page. If you keep scanning until LastEvaluatedKey is absent, the Lambda is likely to return the result.
For Query and Scan operations, DynamoDB calculates the amount of consumed provisioned throughput based on item size, not on the amount of data that is returned to an application.
If you query or scan for specific attributes that match values that amount to more than 1 MB of data, you'll need to perform another Query or Scan request for the next 1 MB of data. To do this, take the LastEvaluatedKey value from the previous request, and use that value as the ExclusiveStartKey in the next request. This approach will let you progressively query or scan for new data in 1 MB increments.
A sample of the BETWEEN operator:
FilterExpression: "start_date BETWEEN :date1 and :date2"

Related

How can I filter entries in dynamodb which has time_stamp more than 1 day?

I have a Lambda function which queries a DynamoDB table, userDetailTable, and I want to filter only the entries whose timestamp (recorded in ms) is more than 1 day (86400000 ms) behind new Date().getTime(). Can anyone suggest the right way to do this?
The table has a GSI on user_status, which has the value 'active' for all entries, with epoch_timestamp (a timestamp in ms) as the attribute used in the filter expression.
In the Lambda I am checking epoch_timestamp and trying to subtract it from new Date().getTime() inside the query, which I am not sure is even possible. Below is the code with my query.
function getUserDetails(callback) {
    var params = {
        TableName: 'userDetailTable',
        IndexName: 'user_status-index',
        KeyConditionExpression: 'user_status = :user_status',
        FilterExpression: `expiration_time - ${new Date().getTime()} > :time_difference`,
        ExpressionAttributeValues: {
            ':user_status': 'active',
            ':time_difference': '86400000' // 1 day in ms
        }
    };
    docClient.query(params, function(err, data) {
        if (err) {
            callback(err, null);
        } else {
            callback(null, data);
        }
    });
}
Here's a rewrite of your code:
function getUserDetails(callback) {
    var params = {
        TableName: 'userDetailTable',
        IndexName: 'user_status-index',
        KeyConditionExpression: 'user_status = :user_status',
        FilterExpression: 'epoch_timestamp > :time_threshold_ms',
        ExpressionAttributeValues: {
            ':user_status': 'active',
            ':time_threshold_ms': Date.now() - 86400000 // now minus 1 day in ms
        }
    };
    docClient.query(params, function(err, data) {
        if (err) {
            callback(err, null);
        } else {
            callback(null, data);
        }
    });
}
Specifically, you cannot compute a date inside the FilterExpression. Instead, you should compare the item's epoch_timestamp attribute with :time_threshold_ms, which you compute once (for all items inspected by the query) in ExpressionAttributeValues.
Please note, though, that you can make this more efficient if you define a GSI which uses epoch_timestamp as its sort key (user_status can remain the partition key). Then, instead of placing the condition in the FilterExpression, you move it into the KeyConditionExpression, as sketched below.
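A sketch of that key-condition form, assuming a hypothetical GSI named user_status-epoch_timestamp-index with user_status as partition key and epoch_timestamp as sort key:

var params = {
    TableName: 'userDetailTable',
    IndexName: 'user_status-epoch_timestamp-index', // hypothetical GSI name
    KeyConditionExpression: 'user_status = :user_status AND epoch_timestamp > :time_threshold_ms',
    ExpressionAttributeValues: {
        ':user_status': 'active',
        ':time_threshold_ms': Date.now() - 86400000
    }
};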
Also, when you use a FilterExpression you need to check the LastEvaluatedKey of the response. If it is not empty, you need to issue a follow-up query with LastEvaluatedKey copied into the request's ExclusiveStartKey. Why? Due to filtering, it is possible that you will get no results from the "chunk" (or "page") examined by DDB, since DDB only examines a single chunk per query invocation. Issuing a follow-up query with ExclusiveStartKey tells DDB to inspect the next chunk.
(see https://dzone.com/articles/query-dynamodb-items-withnodejs for further details on that)
Alternatively, if you do not use filtering, you are advised to pass a Limit value in the request to tell DDB to stop after the desired number of items. However, if you do use filtering, do not pass a Limit value, as it will reduce the size of the chunk and you will need many more follow-up queries until you get your data.
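For instance, a minimal sketch of the no-filter case, reusing the query from above with an illustrative Limit:

var params = {
    TableName: 'userDetailTable',
    IndexName: 'user_status-index',
    KeyConditionExpression: 'user_status = :user_status',
    ExpressionAttributeValues: { ':user_status': 'active' },
    Limit: 10 // no filter, so DDB stops after 10 items and returns all of them
};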
You cannot perform a calculation in the filter expression but you can calculate it outside and use the result with a new inequality.
I think you are looking for items expiring after one day from now.
Something like:
FilterExpression: 'expiration_time > :max_time',
ExpressionAttributeValues: {
    ':user_status': 'active',
    ':max_time': new Date().getTime() + 86400000 // one day from now, in ms
}

DynamoDB Scan with FilterExpression in nodejs

I'm trying to retrieve all items from a DynamoDB table that match a FilterExpression, and although all of the items are scanned and half do match, the expected items aren't returned.
I have the following in an AWS Lambda function running on Node.js 6.10:
var AWS = require("aws-sdk"),
documentClient = new AWS.DynamoDB.DocumentClient();
function fetchQuotes(category) {
let params = {
"TableName": "quotient-quotes",
"FilterExpression": "category = :cat",
"ExpressionAttributeValues": {":cat": {"S": category}}
};
console.log(`params=${JSON.stringify(params)}`);
documentClient.scan(params, function(err, data) {
if (err) {
console.error(JSON.stringify(err));
} else {
console.log(JSON.stringify(data));
}
});
}
There are 10 items in the table, one of which is:
{
    "category": "ChuckNorris",
    "quote": "Chuck Norris does not sleep. He waits.",
    "uuid": "844a0af7-71e9-41b0-9ca7-d090bb71fdb8"
}
When testing with category "ChuckNorris", the log shows:
params={"TableName":"quotient-quotes","FilterExpression":"category = :cat","ExpressionAttributeValues":{":cat":{"S":"ChuckNorris"}}}
{"Items":[],"Count":0,"ScannedCount":10}
The scan call returns all 10 items when I only specify TableName:
params={"TableName":"quotient-quotes"}
{"Items":[<snip>,{"category":"ChuckNorris","uuid":"844a0af7-71e9-41b0-9ca7-d090bb71fdb8","CamelCase":"thevalue","quote":"Chuck Norris does not sleep. He waits."},<snip>],"Count":10,"ScannedCount":10}
You do not need to specify the type ("S") in your ExpressionAttributeValues because you are using the DynamoDB DocumentClient. Per the documentation:
The document client simplifies working with items in Amazon DynamoDB by abstracting away the notion of attribute values. This abstraction annotates native JavaScript types supplied as input parameters, as well as converts annotated response data to native JavaScript types.
It's only when you're using the raw DynamoDB object via new AWS.DynamoDB() that you need to specify the attribute types (i.e., the simple objects keyed on "S", "N", and so on).
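For a quick comparison (illustrative values only):

// Raw client (new AWS.DynamoDB()) expects typed attribute values:
var rawValues = { ":cat": { "S": "ChuckNorris" } };
// DocumentClient (new AWS.DynamoDB.DocumentClient()) expects native values:
var docValues = { ":cat": "ChuckNorris" };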
With DocumentClient, you should be able to use params like this:
const params = {
    TableName: 'quotient-quotes',
    FilterExpression: '#cat = :cat',
    ExpressionAttributeNames: {
        '#cat': 'category',
    },
    ExpressionAttributeValues: {
        ':cat': category,
    },
};
Note that I also moved the field name into an ExpressionAttributeNames value for consistency and safety. It's a good practice, because certain attribute names collide with DynamoDB reserved words and will break your request if used directly.
I was looking for a solution that combined KeyConditionExpression with FilterExpression, and eventually I worked this out. Here, aws is the UUID, Id is an assigned unique identifier prefixed with the text 'form' so I can tell I have form data, and optinSite lets me find enquiries from a particular site. Other data is stored too; this is all I need to get the packet.
Maybe this can be of help to you:
let optinSite = 'https://theDomainIWantedTFilterFor.com/';
let aws = 'eu-west-4:EXAMPLE-aaa1-4bd8-9ean-1768882l1f90';

let item = {
    TableName: 'Table',
    KeyConditionExpression: "aws = :Aw and begins_with(Id, :form)",
    FilterExpression: "optinSite = :Os",
    ExpressionAttributeValues: {
        ":Aw": { S: aws },
        ":form": { S: 'form' },
        ":Os": { S: optinSite }
    }
};

DynamoDB, dynamic atomic update of mapped values with AWS Lambda (NodeJS runtime)

I am trying to figure out how I could perform atomic updates on an item where the source data contains mapped values with the keys of those maps being dynamic.
If you look at the sample data below, I am trying to figure out how I could do atomic updates of the values in BSSentDestIp and BSRecvDestIp over the same item. I was reading the documentation but the only thing I could find was list_append, which would leave me with a list of appended keys/values that I would need to traverse and sum later.
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.UpdateExpressions.html
Example of input data:
{
    "RecordId": 31,
    "UUID": "170ae748-f8cf-4df9-6e08-c0c8a5f029d4",
    "UserId": "username",
    "DeviceId": "e0:cb:4e:53:ae:ff",
    "ExpireTime": 1501445446,
    "StartTime": 1501441846,
    "EndTime": 1501441856,
    "MinuteId": 10,
    "PacketCount": 1028,
    "ByteSum": 834111,
    "BSSent": 98035,
    "BSRecv": 736076,
    "BSSentDestIp": {
        "151.101.129.69": 2518,
        "192.168.1.254": 4780,
        "192.168.1.80": 14089,
        "192.33.31.162": 2386,
        "54.239.30.232": 21815,
        "54.239.31.129": 6423,
        "54.239.31.69": 3255,
        "54.239.31.83": 18447,
        "98.138.253.109": 3020
    },
    "BSRecvDestIp": {
        "151.101.129.69": 42414,
        "151.101.57.174": 20792,
        "192.230.66.108": 130175,
        "192.33.31.162": 56398,
        "23.194.140.100": 26209,
        "54.239.26.209": 57210,
        "54.239.31.129": 188747,
        "54.239.31.69": 41115,
        "98.138.253.109": 111775
    }
}
NodeJS function executed via Lambda to update Dynamo:
function updateItem(UserIdValue, MinuteIdValue) {
    var UpdateExpressionString = "set PacketCount = PacketCount + :PacketCount, \
        ByteSum = ByteSum + :ByteSum, \
        BSSent = BSSent + :BSSent, \
        BSRecv = BSRecv + :BSRecv";
    var params = {
        TableName: gDynamoTable,
        Key: {
            "UserId": UserIdValue,
            "MinuteId": MinuteIdValue
        },
        UpdateExpression: UpdateExpressionString,
        ExpressionAttributeValues: {
            ":PacketCount": gRecordObject.PacketCount,
            ":ByteSum": gRecordObject.ByteSum,
            ":BSSent": gRecordObject.BSSent,
            ":BSRecv": gRecordObject.BSRecv
        },
        ReturnValues: "UPDATED_NEW"
    };
    dynamo.updateItem(params, function(err, data) {
        if (err) {
            console.log("updateItem Error: " + err);
        } else {
            console.log("updateItem Success: " + JSON.stringify(data));
        }
    });
}
Updating a single item is atomic in DynamoDB: if you read an item and call PutItem, that write is guaranteed to be atomic. It either updates all fields or none of them.
Now, the only issue I see with that is write conflicts. Say one process reads an item and updates one map while another process does the same thing in parallel; one PutItem will overwrite the other's recent update, and you can lose data.
To solve this issue you can use conditional updates. In a nutshell, they allow you to update an item only if a specified condition is met. What you can do is maintain a version number with every item: when you update an item, increment its version attribute, and when you write the item, check that the version number is the one you expect. Otherwise, you need to read the item again (somebody updated it while you were working with it), perform your update again, and try the write again.
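A minimal sketch of that version-check approach (optimistic locking), assuming a numeric version attribute is maintained on every item; gDynamoTable comes from the question, and everything else is illustrative:

var AWS = require('aws-sdk');
var docClient = new AWS.DynamoDB.DocumentClient();

function putWithVersionCheck(item, expectedVersion, callback) {
    var newItem = Object.assign({}, item, { version: expectedVersion + 1 });
    docClient.put({
        TableName: gDynamoTable,
        Item: newItem,
        // The write succeeds only if nobody bumped the version since we read the item.
        ConditionExpression: 'version = :expected',
        ExpressionAttributeValues: { ':expected': expectedVersion }
    }, function(err, data) {
        if (err && err.code === 'ConditionalCheckFailedException') {
            // Concurrent update detected: re-read the item, re-apply the changes, retry.
            callback(new Error('version conflict, retry'), null);
        } else {
            callback(err, data);
        }
    });
}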

Retrieving rows from dynamoDB from Lambda based on primary key?

Below is the code of my Lambda function. I'm having trouble querying rows based on the timestamps. My plan is to get all the rows from 5 seconds before the current time up to the current time, in milliseconds. TimeMillis (Number) stores the current time in milliseconds and is the primary key; the range key is PhoneId (String). Please help me with a solution, or any way to work around the problem.
I'm not able to get the output; it is throwing an error.
'use strict';
var AWS = require("aws-sdk");
AWS.config.update({
    region: "us-east-1",
});
var docClient = new AWS.DynamoDB.DocumentClient();

exports.handler = function(event, context, callback) {
    var timemillis = new Date().getTime();
    var timemillis1 = timemillis - 5000;
    var params = {
        TableName: 'Readings',
        KeyConditionExpression: "TimeMillis = :tm and TimeMillis BETWEEN :from AND :to",
        ExpressionAttributeValues: {
            ":tm": "TimMillis",
            ":from": timemillis1,
            ":to": timemillis
        }
    };
    docClient.query(params, function(err, data) {
        if (err) {
            callback(err, null);
        } else {
            callback(null, data);
        }
    });
};
Here is my DynamoDB table image.
You cannot have multiple conditions on the same key inside a KeyConditionExpression. What you can do is use a FilterExpression together with the KeyConditionExpression to narrow down the result set.
Quoting from the documentation,
Use the KeyConditionExpression parameter to provide a specific value for the partition key. The Query operation will return all of the items from the table or index with that partition key value. You can optionally narrow the scope of the Query operation by specifying a sort key value and a comparison operator in KeyConditionExpression. To further refine the Query results, you can optionally provide a FilterExpression. A FilterExpression determines which items within the results should be returned to you. All of the other results are discarded.
Also note that the only supported test for the partition key is equality; other comparison operators can be applied only to the sort key:
partitionKeyName = :partitionkeyval AND sortKeyName = :sortkeyval
Another way is to create a GSI which supports further querying. By the way, traditional RDBMS thinking does not work best with DynamoDB; you can read about DynamoDB best practices in the AWS documentation.
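For example, a sketch assuming a hypothetical GSI named PhoneId-TimeMillis-index, with PhoneId as partition key and TimeMillis as sort key, so the time range can live entirely in the key condition:

var timemillis = new Date().getTime();
var timemillis1 = timemillis - 5000;

var params = {
    TableName: 'Readings',
    IndexName: 'PhoneId-TimeMillis-index', // hypothetical GSI
    KeyConditionExpression: 'PhoneId = :phone AND TimeMillis BETWEEN :from AND :to',
    ExpressionAttributeValues: {
        ':phone': 'some-phone-id', // illustrative value
        ':from': timemillis1,
        ':to': timemillis
    }
};
// Pass params to docClient.query as in the handler above.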

AWS DynamoDB Node.js scan- certain number of results

I am trying to get the first 10 items that satisfy a condition from DynamoDB using AWS Lambda. I was trying to use the Limit parameter, but per the documentation
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#scan-property
it is the "maximum number of items to evaluate (not necessarily the number of matching items)".
How do I get the first 10 items that satisfy my condition?
var AWS = require('aws-sdk');
var db = new AWS.DynamoDB();

exports.handler = function(event, context) {
    var params = {
        TableName: "Events", //"StreamsLambdaTable",
        ProjectionExpression: "ID, description, endDate, imagePath, locationLat, locationLon, #nm, startDate, #tp, userLimit", // the attributes you want in the scan result
        FilterExpression: "locationLon between :lower_lon and :higher_lon and locationLat between :lower_lat and :higher_lat",
        ExpressionAttributeNames: {
            "#nm": "name",
            "#tp": "type"
        },
        ExpressionAttributeValues: {
            ":lower_lon": {"N": event.low_lon},
            ":higher_lon": {"N": event.high_lon},
            ":lower_lat": {"N": event.low_lat},
            ":higher_lat": {"N": event.high_lat}
        }
    };
    db.scan(params, function(err, data) {
        if (err) {
            console.log(err); // an error occurred
        } else {
            data.Items.forEach(function(record) {
                console.log(record.name.S + "");
            });
            context.succeed(data.Items);
        }
    });
};
I think you already know the reason behind this: the distinction that DynamoDB makes between ScannedCount and Count. As per the documentation,
ScannedCount — the number of items that were queried or scanned, before any filter expression was applied to the results.
Count — the number of items that were returned in the response.
The fix for that is documented right above this:
For either a Query or Scan operation, DynamoDB might return a LastEvaluatedKey value if the operation did not return all matching items in the table. To get the full count of items that match, take the LastEvaluatedKey value from the previous request and use it as the ExclusiveStartKey value in the next request. Repeat this until DynamoDB no longer returns a LastEvaluatedKey value.
So, the answer to your question is: use the LastEvaluatedKey from DynamoDB response and Scan again.
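A minimal sketch of that loop, reusing db and params from the question and collecting the first 10 matches (the helper name is illustrative):

function scanFirstN(params, n, collected, done) {
    db.scan(params, function(err, data) {
        if (err) return done(err);
        collected = collected.concat(data.Items);
        if (collected.length >= n || !data.LastEvaluatedKey) {
            // Either we have enough matching items or the table is exhausted.
            done(null, collected.slice(0, n));
        } else {
            // Continue from where the previous scan stopped.
            params.ExclusiveStartKey = data.LastEvaluatedKey;
            scanFirstN(params, n, collected, done);
        }
    });
}

// Usage inside the handler:
scanFirstN(params, 10, [], function(err, items) {
    if (err) console.log(err);
    else context.succeed(items);
});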
