How to batchGet from a global secondary index in DynamoDB?
These params give me a schema error because this hash key exists only in the index; the main table has a different one.
const params = {
RequestItems: {
"MyTableName": {
Keys: [
{
"ThisHashKeyIsOnlyInIndexTable": value
}
]
}
}
};
docClient.batchGet(params, (err, data) => {
// ...
});
The docs don't even mention how to batchGet from an index only.
Unfortunately, GetItem and BatchGetItem cannot access any indexes. You can't pass IndexName in the params the way you can with the Query API.
The relevant point is the note under INDEXES in the documentation for ReturnConsumedCapacity:
ReturnConsumedCapacity — (String) Determines the level of detail about
provisioned throughput consumption that is returned in the response:
INDEXES - The response includes the aggregate ConsumedCapacity for the
operation, together with ConsumedCapacity for each table and secondary
index that was accessed. Note that some operations, such as GetItem
and BatchGetItem, do not access any indexes at all. In these cases,
specifying INDEXES will only return ConsumedCapacity information for
table(s).
TOTAL - The response includes only the aggregate ConsumedCapacity for
the operation.
NONE - No ConsumedCapacity details are included in the response.
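Since BatchGetItem cannot read from an index, one common workaround is to issue one Query per hash value against the index and run the queries in parallel. The following is a minimal sketch, not a definitive implementation. It assumes `docClient` is an `AWS.DynamoDB.DocumentClient` (aws-sdk v2) and that the index name (`MyIndexName` in the usage comment) is a placeholder, since the question does not name the index.

```javascript
// Sketch: emulate a "batch get" against a GSI by running one Query per hash value.
// Assumptions: indexName and hashValues are supplied by the caller (placeholders here);
// docClient is an AWS.DynamoDB.DocumentClient from aws-sdk v2.
async function queryIndexForKeys(docClient, tableName, indexName, hashKeyName, hashValues) {
  const queries = hashValues.map((value) =>
    docClient
      .query({
        TableName: tableName,
        IndexName: indexName,
        KeyConditionExpression: '#hk = :hv',
        ExpressionAttributeNames: { '#hk': hashKeyName },
        ExpressionAttributeValues: { ':hv': value },
      })
      .promise()
  );
  const results = await Promise.all(queries);
  // GSI keys are not required to be unique, so each query may return several items.
  return results.flatMap((result) => result.Items);
}

// Usage (not executed here):
// const AWS = require('aws-sdk');
// const docClient = new AWS.DynamoDB.DocumentClient();
// queryIndexForKeys(docClient, 'MyTableName', 'MyIndexName',
//   'ThisHashKeyIsOnlyInIndexTable', ['a', 'b']).then(console.log);
```

The sketch ignores pagination: a single Query response is capped in size, so a complete version would keep querying while `LastEvaluatedKey` is present.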
Is it possible to reference a newly created DynamoDB record in AWS Lambda? For example, retrieving and using the ID of the newly created record. Hoping this is possible without a query to retrieve the new record from DynamoDB.
const docClient = new AWS.DynamoDB.DocumentClient();
... // Omitting the rest of the code in this example
const params = {
TableName : 'ExampleTableName',
Item: {
id: uuid.v1()
}
}
try {
await docClient.put(params).promise();
} catch (err) {
return err;
}
// Reference newly created record to retrieve the ID.
You can achieve this using the ReturnValues parameter.
ReturnValues: returns the item's attribute values in the same operation.
However, to get the newly written item back you need to use a different API, UpdateItem, instead of PutItem.
From the Docs
Use ReturnValues if you want to get the item attributes as they appear before or after they are updated. For UpdateItem, the valid values are:
NONE - If ReturnValues is not specified, or if its value is NONE, then nothing is returned. (This setting is the default for ReturnValues.)
ALL_OLD - Returns all of the attributes of the item, as they appeared before the UpdateItem operation.
UPDATED_OLD - Returns only the updated attributes, as they appeared before the UpdateItem operation.
ALL_NEW - Returns all of the attributes of the item, as they appear after the UpdateItem operation.
UPDATED_NEW - Returns only the updated attributes, as they appear after the UpdateItem operation.
There is no additional cost associated with requesting a return value aside from the small network and processing overhead of receiving a larger response. No read capacity units are consumed.
The values returned are strongly consistent.
Why can't you use ReturnValues with PutItem? According to https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_PutItem.html#API_PutItem_RequestSyntax, PutItem only accepts NONE or ALL_OLD, so it can return the item as it was before the write but never the new item, which won't serve your purpose.
You can store the returned values in a variable and add custom logic to proceed further. :)
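As a rough illustration of the UpdateItem approach, the sketch below writes the record with `update` and `ReturnValues: 'ALL_NEW'`, so the stored attributes come back in the same call. It assumes `id` is the table's partition key and uses a hypothetical `createdAt` attribute; `docClient` is an `AWS.DynamoDB.DocumentClient` (aws-sdk v2).

```javascript
// Sketch: create the record with UpdateItem so the new item is returned by the same call.
// Assumptions: 'id' is the table's partition key; 'createdAt' is a hypothetical attribute.
async function createAndReturn(docClient, tableName, id) {
  const result = await docClient
    .update({
      TableName: tableName,
      Key: { id },
      // UpdateItem creates the item if no item with this key exists yet.
      UpdateExpression: 'SET createdAt = :now',
      ExpressionAttributeValues: { ':now': new Date().toISOString() },
      ReturnValues: 'ALL_NEW',
    })
    .promise();
  // Attributes holds the item as stored, including the key attribute.
  return result.Attributes;
}

// Usage (not executed here):
// const record = await createAndReturn(docClient, 'ExampleTableName', uuid.v1());
// console.log(record.id);
```

In the question's own code the id is generated before the write, so `params.Item.id` already holds it once `put` succeeds; the round trip above mainly helps when other attributes are computed by the update expression.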
I am trying to delete and update records in Cosmos DB using my GraphQL/Node.js code and am getting the error "Entity with the specified id does not exist in the system". Here is my code:
deleteRecord: async (root, id) => {
const { resource: result } = await container.item(id.id, key).delete();
console.log(`Deleted item with id: ${id}`);
},
Somehow the code below is not able to find the record; even "container.item(id.id, key).read()" doesn't work.
await container.item(id.id, key)
But if I try to find the record using a query spec, it works:
await container.items.query('SELECT * from c where c.id = "'+id+'"' ).fetchNext()
FYI- I am able to fetch all records and create new item, so Connection to DB and reading/writing is not an issue.
What else can it be? Any pointer related to this will be helpful.
Thanks in advance.
It seems you are passing the wrong key to item(id, key). According to the note in the documentation:
In both the "update" and "delete" methods, the item has to be selected
from the database by calling container.item(). The two parameters
passed in are the id of the item and the item's partition key. In this
case, the partition key is the value of the "category" field.
So you need to pass the value of your partition key, not the partition key path.
For example, if you have a document like the one below and your partition key is '/category', you need to call await container.item("xxxxxx", "movie").
{
"id":"xxxxxx",
"category":"movie"
}
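To make the difference concrete, here is a small sketch under the same assumption as the example above (a container partitioned on '/category'). `container` is taken to be a `Container` from `@azure/cosmos`, and `deleteById` is a hypothetical helper for the case where the caller knows only the id: it looks up the partition key value with a parameterized query first.

```javascript
// Sketch: delete a Cosmos DB item by id plus partition key VALUE.
// Assumption: the container is partitioned on '/category'.
async function deleteByIdAndCategory(container, id, category) {
  // The second argument is the value (e.g. "movie"), not the path "/category".
  await container.item(id, category).delete();
}

// Hypothetical helper: when only the id is known, find the partition key value first.
async function deleteById(container, id) {
  const { resources } = await container.items
    .query({
      query: 'SELECT c.id, c.category FROM c WHERE c.id = @id',
      parameters: [{ name: '@id', value: id }],
    })
    .fetchAll();
  if (resources.length === 0) return false;
  // Ids are unique only within a partition; this sketch deletes the first match.
  await deleteByIdAndCategory(container, id, resources[0].category);
  return true;
}
```

The parameterized query also avoids building the SQL text by string concatenation, as the question's working query does.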
I'm trying to delete some items from a DynamoDB table. My table has a global secondary index (GSI), and I'm wondering whether it's possible to use the batchWrite method of the DocumentClient to delete items through the GSI. Or can a GSI be used only for fetching data?
var params = {
RequestItems: {
'Table-1': [
{
DeleteRequest: {
Key: { HashKey: 'someKey' }
}
}
]
}
};
documentClient.batchWrite(params, function(err, data) {
if (err) console.log(err);
else console.log(data);
});
If it's possible, please provide an example of the params.
You cannot delete from a GSI. These indexes are pretty much read only: you can’t mutate the data in the table through a global secondary index, so no inserting, deleting or updating.
You can only read from the GSI and then implement the necessary logic to delete the items in the main table by key.
Also, a batch operation doesn’t make it a whole lot more efficient to delete those items: yes, it saves on network calls (up to 25:1) but not on used write capacity.
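The read-then-delete logic described above could look roughly like the sketch below. The index name and attribute names are placeholders supplied by the caller. It assumes the base table has a single-attribute primary key (like `HashKey` in the question), which is available on index items because table key attributes are always projected into a GSI. `docClient` is an `AWS.DynamoDB.DocumentClient` (aws-sdk v2).

```javascript
// Sketch: read matching items from the GSI, then delete them from the base table by key.
// Assumptions: all names in 'names' are placeholders; the table has a hash-only primary key.
async function deleteByIndexValue(docClient, names, value) {
  const { tableName, indexName, indexKeyName, tableKeyName } = names;
  const { Items } = await docClient
    .query({
      TableName: tableName,
      IndexName: indexName,
      KeyConditionExpression: '#k = :v',
      ExpressionAttributeNames: { '#k': indexKeyName },
      ExpressionAttributeValues: { ':v': value },
    })
    .promise();

  const requests = Items.map((item) => ({
    DeleteRequest: { Key: { [tableKeyName]: item[tableKeyName] } },
  }));

  // batchWrite accepts at most 25 requests per call, so send them in chunks.
  for (let i = 0; i < requests.length; i += 25) {
    const chunk = requests.slice(i, i + 25);
    await docClient.batchWrite({ RequestItems: { [tableName]: chunk } }).promise();
  }
  return requests.length;
}

// Usage (not executed here; 'MyIndex' and 'status' are hypothetical):
// deleteByIndexValue(documentClient, { tableName: 'Table-1', indexName: 'MyIndex',
//   indexKeyName: 'status', tableKeyName: 'HashKey' }, 'expired');
```

A complete version would also retry any `UnprocessedItems` returned by batchWrite and page through the query with `LastEvaluatedKey`; both are left out to keep the sketch short.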
I am working with NodeJS on Google App Engine with the Datastore database.
Because Datastore does not support the OR operator, I need to run multiple queries and combine the results.
I am planning to run multiple queries and then combine the results into a single array of entity objects. I have a single query working already.
Question: What is a reasonably efficient way to combine two (or more) sets of entities returned by Datastore including de-duplication? I believe this would be a "union" operation in terms of set theory.
Here is the basic query outline that will be run multiple times with some varying filters to achieve the OR conditions required.
//Set requester username
const requester = req.user.userName;
//Create datastore query on Transfer Request kind table
const task_history = datastore.createQuery('Task');
//Set query conditions
task_history.filter('requester', requester);
//Run datastore query
datastore.runQuery(task_history, function(err, entities) {
if(err) {
console.log('Task History JSON unable to return data results. Error message: ', err);
return;
//If query works and returns any entities
} else if (entities[0]) {
//Else if query works but does not return any entities return empty JSON response
res.json(entities); //HOW TO COMBINE (UNION) MULTIPLE SETS OF ENTITIES EFFICIENTLY?
return;
}
});
Here is my original post: Google Datastore filter with OR condition
IMHO the most efficient way would be to use Keys-only queries in the 1st stage, then perform the combination of the keys obtained into a single list (including deduplication), followed by obtaining the entities simply by key lookup. From Projection queries:
Keys-only queries
A keys-only query (which is a type of projection query) returns just
the keys of the result entities instead of the entities themselves, at
lower latency and cost than retrieving entire entities.
It is often more economical to do a keys-only query first, and then
fetch a subset of entities from the results, rather than executing a
general query which may fetch more entities than you actually need.
Here's how to create a keys-only query:
const query = datastore.createQuery()
.select('__key__')
.limit(1);
This method addresses several problems you may encounter when trying to directly combine lists of entities obtained through regular, non-keys-only queries:
you can't de-duplicate properly because you can't tell the difference between different entities with identical values and the same entity appearing in multiple query results
comparing entities by property values can be tricky and is definitely slower and more computationally expensive than comparing just entity keys
if you can't process all the results in a single request you're incurring unnecessary datastore costs for reading them without actually using them
it is much simpler to split processing of entities in multiple requests (via task queues, for example) when handling just entity keys
There are some disadvantages as well:
it may be a bit slower because you're going to the datastore twice: once for the keys and once to get the actual entities
you can't take advantage of getting just the properties you need via non-keys-only projection queries
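The two-stage approach described above can be sketched with promises as follows. This is an illustration under assumptions, not a definitive implementation: `datastore` is a `Datastore` instance from `@google-cloud/datastore`, `fetchUnion` is a hypothetical helper name, and de-duplication relies on each key exposing a `path` array (kind plus id or name) within a single namespace.

```javascript
// Sketch: OR across several equality filters via keys-only queries,
// de-duplication of the keys, then a single lookup by key.
// filters is an array of [property, value] pairs, one query per pair.
async function fetchUnion(datastore, kind, filters) {
  const keyLists = await Promise.all(
    filters.map(async ([property, value]) => {
      const query = datastore.createQuery(kind).filter(property, value).select('__key__');
      const [entities] = await datastore.runQuery(query);
      return entities.map((entity) => entity[datastore.KEY]);
    })
  );

  // Assumption: key.path (kind + id/name) identifies an entity within one namespace.
  const unique = new Map();
  for (const key of keyLists.flat()) {
    unique.set(JSON.stringify(key.path), key);
  }
  if (unique.size === 0) return [];

  const [entities] = await datastore.get([...unique.values()]);
  return entities;
}

// Usage (not executed here):
// fetchUnion(datastore, 'Task', [['requester', userName], ['owner', userName]])
//   .then((entities) => res.json(entities));
```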
Here is the solution I created based on the advice provided in the accepted answer.
/*History JSON*/
module.exports.treqHistoryJSON = function(req, res) {
if (!req.user) {
req.user = {};
res.json();
return;
}
//Set Requester username
const loggedin_username = req.user.userName;
//Get records matching Requester OR Dataowner
//Google Datastore OR Conditions are not supported
//Workaround separate parallel queries get records matching Requester and Dataowner then combine results
async.parallel({
//Get entity keys matching Requester
requesterKeys: function(callback) {
getKeysOnly('TransferRequest', 'requester_username', loggedin_username, (treqs_by_requester) => {
//Callback pass in response as parameter
callback(null, treqs_by_requester)
});
},
//Get entity keys matching Dataowner
dataownerKeys: function(callback) {
getKeysOnly('TransferRequest', 'dataowner_username', loggedin_username, (treqs_by_dataowner) => {
callback(null, treqs_by_dataowner)
});
}
}, function(err, getEntities) {
if (err) {
console.log('Transfer Request History JSON unable to get entity keys Transfer Request. Error message: ', err);
return;
} else {
//Combine two arrays of entity keys into a single de-duplicated array of entity keys
let entity_keys_union = unionEntityKeys(getEntities.requesterKeys, getEntities.dataownerKeys);
//Get key values from entity key 'symbol' object type
let entity_keys_only = entity_keys_union.map((ent) => {
return ent[datastore.KEY];
});
//Pass in array of entity keys to get full entities
datastore.get(entity_keys_only, function(err, entities) {
if(err) {
console.log('Transfer Request History JSON unable to lookup multiple entities by key for Transfer Request. Error message: ', err);
return;
//If query works and returns any entities
} else {
processEntitiesToDisplay(res, entities);
}
});
}
});
};
/*
* Get keys-only entities by kind and property
* #kind string name of kind
* #property_type string property filtering by in query
* #filter_value string of filter value to match in query
* getEntitiesCallback callback to collect results
*/
function getKeysOnly(kind, property_type, filter_value, getEntitiesCallback) {
//Create datastore query
const keys_query = datastore.createQuery(kind);
//Set query conditions
keys_query.filter(property_type, filter_value);
//Select KEY only
keys_query.select('__key__');
datastore.runQuery(keys_query, function(err, entities) {
if(err) {
console.log('Get Keys Only query unable to return data results. Error message: ', err);
return;
} else {
getEntitiesCallback(entities);
}
});
}
/*
* Union two arrays of entity keys de-duplicate based on ID value
* #arr1 array of entity keys
* #arr2 array of entity keys
*/
function unionEntityKeys(arr1, arr2) {
//Create new array
let arr3 = [];
//For each element in array 1
for(let i in arr1) {
let shared = false;
for (let j in arr2) {
//If ID in array 1 is same as array 2 then this is a duplicate
if (arr2[j][datastore.KEY]['id'] == arr1[i][datastore.KEY]['id']) {
shared = true;
break;
}
}
//If IDs are not the same add element to new array
if(!shared) {
arr3.push(arr1[i])
}
}
//Concat array 2 and new array 3
arr3 = arr3.concat(arr2);
return arr3;
}
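The nested loops in unionEntityKeys compare every element of one array with every element of the other, and the function handles exactly two arrays. As a possible alternative, the sketch below does the union in a single pass with a Map and accepts any number of arrays. `unionBy` and `keyOf` are hypothetical names; `keyOf` is a caller-supplied function that returns a string identifying an entity.

```javascript
// Sketch: union of any number of arrays, de-duplicated in one pass with a Map.
// keyOf(item) must return a string that is equal for duplicates and distinct otherwise.
function unionBy(keyOf, ...arrays) {
  const seen = new Map();
  for (const arr of arrays) {
    for (const item of arr) {
      const k = keyOf(item);
      // Keep the first occurrence of each key.
      if (!seen.has(k)) seen.set(k, item);
    }
  }
  return [...seen.values()];
}

// With the keys-only entities above, one possible keyOf (mirroring the id comparison):
// const keyOf = (ent) => String(ent[datastore.KEY].id);
// const union = unionBy(keyOf, getEntities.requesterKeys, getEntities.dataownerKeys);
```

Comparing only `id` assumes every key has a numeric id; for keys that use names instead, `keyOf` would need to fall back to the name.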
I just wanted to write in for folks who stumble upon this...
There is a workaround for some cases of not having the OR operator if you can restructure your data a bit, using Array properties: https://cloud.google.com/datastore/docs/concepts/entities#array_properties
From the documentation:
Array properties can be useful, for instance, when performing queries with equality filters: an entity satisfies the query if any of its values for a property matches the value specified in the filter.
So, if you needed to query for all entities bearing one of multiple potential values, putting all of the possibilities for each entity into an Array property and then indexing it for your query should yield the results you want. (But, you'd need to maintain that additional property, or replace your existing properties with that Array implementation if it could work for all of what you need.)
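Applied to the transfer-request example, the array-property idea might look like the sketch below. It assumes a hypothetical `participants` property that holds both usernames, and `datastore` is a `Datastore` instance from `@google-cloud/datastore`; the function names are illustrative only.

```javascript
// Sketch: store both usernames in one array property so that a single
// equality filter matches either of them.
// Assumption: 'participants' is a hypothetical, indexed array property.
async function saveTransferRequest(datastore, requester, dataowner) {
  const key = datastore.key('TransferRequest'); // incomplete key; the id is assigned on save
  await datastore.save({
    key,
    data: {
      requester_username: requester,
      dataowner_username: dataowner,
      participants: [requester, dataowner],
    },
  });
  return key;
}

async function findForUser(datastore, username) {
  // An equality filter on an array property matches if ANY element equals the value.
  const query = datastore.createQuery('TransferRequest').filter('participants', username);
  const [entities] = await datastore.runQuery(query);
  return entities;
}
```

With this layout the two parallel queries and the union step are no longer needed, at the cost of keeping `participants` in sync with the two username properties.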
I am using AWS.DynamoDB.DocumentClient in a Node.js program to fetch items from multiple DynamoDB tables. To keep the code simple, I chose to use the BatchGetItem/BatchGet method.
The challenge is that I need to fetch items based on a global secondary index, e.g. name+age, rather than the primary key defined when the table was created. I went through the BatchGetItem/BatchGet documentation but did not see any parameter for using a global secondary index.
I ran some testing with the following code
var params = {
RequestItems: {
'Table-1': {
Keys: [
{
name: 'abc',
age: 18,
},
]
}
}
};
var docClient = new AWS.DynamoDB.DocumentClient();
docClient.batchGet(params, function(err, data) {
if (err) console.log(err);
else console.log(data);
});
And got the following error.
> ValidationException: The provided key element does not match the
> schema
Does it mean BatchGetItem/BatchGet can't use Global Secondary Index, and I have to read from tables one by one?
I don't believe BatchGetItem can use a global secondary index. You will likely have to query the tables one by one.
INDEXES - The response includes the aggregate ConsumedCapacity for the operation, together with ConsumedCapacity for each table and secondary index that was accessed. Note that some operations, such as GetItem and BatchGetItem , do not access any indexes at all. In these cases, specifying INDEXES will only return ConsumedCapacity information for table(s).
Source: https://docs.aws.amazon.com/cli/latest/reference/dynamodb/batch-get-item.html
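Querying one by one does not have to mean sequentially. The sketch below runs one Query per table in parallel and groups the results by table name, similar in shape to the `Responses` map that batchGet returns. It assumes each table has a GSI with `name` as hash key and `age` as range key; the index names are hypothetical, and `docClient` is an `AWS.DynamoDB.DocumentClient` (aws-sdk v2).

```javascript
// Sketch: one Query per (table, index) pair, run in parallel, grouped by table name.
// Assumptions: each index uses 'name' as hash key and 'age' as range key;
// the index names passed in 'targets' are hypothetical.
async function queryTablesByNameAndAge(docClient, targets, name, age) {
  const entries = await Promise.all(
    targets.map(async ({ tableName, indexName }) => {
      const { Items } = await docClient
        .query({
          TableName: tableName,
          IndexName: indexName,
          KeyConditionExpression: '#n = :name AND #a = :age',
          // Placeholders avoid clashes with DynamoDB reserved words.
          ExpressionAttributeNames: { '#n': 'name', '#a': 'age' },
          ExpressionAttributeValues: { ':name': name, ':age': age },
        })
        .promise();
      return [tableName, Items];
    })
  );
  return Object.fromEntries(entries);
}

// Usage (not executed here; index names are placeholders):
// queryTablesByNameAndAge(docClient, [
//   { tableName: 'Table-1', indexName: 'name-age-index' },
//   { tableName: 'Table-2', indexName: 'name-age-index' },
// ], 'abc', 18).then(console.log);
```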