I've been trying to find an explanation for this situation but I didn't find any.
I have two DynamoDb tables, both with two key indexes, one is a HASH key and the other is a RANGE key.
In the table where both keys are strings, I can query the database with just the HASH key like this (using the node sdk):
const params = {
TableName: process.env.DYNAMODB_TABLE,
Key: { id: sessionId },
};
const { Item } = await dynamoDb.get(params);
However, the same operation on the other table throws the mentioned error about The number of conditions on the keys is invalid
Here are the two table schemas:
This table definiton allows me to use the mentioned query.
SessionsDynamoDbTable:
Type: 'AWS::DynamoDB::Table'
DeletionPolicy: Retain
Properties:
AttributeDefinitions:
-
AttributeName: userId
AttributeType: S
-
AttributeName: id
AttributeType: S
-
AttributeName: startDate
AttributeType: S
KeySchema:
-
AttributeName: userId
KeyType: HASH
-
AttributeName: id
KeyType: RANGE
LocalSecondaryIndexes:
- IndexName: byDate
KeySchema:
- AttributeName: userId
KeyType: HASH
- AttributeName: startDate
KeyType: RANGE
Projection:
NonKeyAttributes:
- endDate
- name
ProjectionType: INCLUDE
BillingMode: PAY_PER_REQUEST
TableName: ${self:provider.environment.DYNAMODB_TABLE}
This does not allow me to make a query like the one mentioned
SessionsTable:
Type: 'AWS::DynamoDB::Table'
TimeToLiveDescription:
AttributeName: expiresAt
Enabled: true
Properties:
AttributeDefinitions:
-
AttributeName: id
AttributeType: S
-
AttributeName: expiresAt
AttributeType: N
KeySchema:
-
AttributeName: id
KeyType: HASH
-
AttributeName: expiresAt
KeyType: RANGE
BillingMode: PAY_PER_REQUEST
TableName: ${self:provider.environment.DYNAMODB_TABLE}
I'm including the entire table definition because I don't know if secondary indexes can have an impact or not on this problem.
You must provide the name of the partition key attribute and a single value for that attribute. Query returns all items with that partition key value. Optionally, you can provide a sort key attribute and use a comparison operator to refine the search results.more
get(params, callback) ⇒ AWS.Request
Returns a set of attributes for the item with the given primary key by delegating to AWS.DynamoDB.getItem().
In SessionsTable id is HASH key and in SessionsDynamoDbTable id in RANGE key.for SessionsDynamoDbTable you should provide HASH Key in addition to RANGE
key.
Related
I am very new to AWS and I have been reading the dynamoDb SDK documentation and the properties that you can specify when creating a Table are far more than the properties that you pass when creating a table using AWS CDK.
SDK example:
var AWS = require("aws-sdk");
AWS.config.update({
region: "us-west-2",
endpoint: "http://localhost:8000"
});
var dynamodb = new AWS.DynamoDB();
var params = {
TableName : "Movies",
KeySchema: [
{ AttributeName: "year", KeyType: "HASH"}, //Partition key
{ AttributeName: "title", KeyType: "RANGE" } //Sort key
],
AttributeDefinitions: [
{ AttributeName: "year", AttributeType: "N" },
{ AttributeName: "title", AttributeType: "S" }
],
ProvisionedThroughput: {
ReadCapacityUnits: 10,
WriteCapacityUnits: 10
}
};
dynamodb.createTable(params, function(err, data) {
if (err) {
console.error("Unable to create table. Error JSON:", JSON.stringify(err, null, 2));
} else {
console.log("Created table. Table description JSON:", JSON.stringify(data, null, 2));
}
});
CDK example:
import * as dynamodb from '#aws-cdk/aws-dynamodb';
const table = new dynamodb.Table(this, 'Hits', {
partitionKey: { name: 'path', type: dynamodb.AttributeType.STRING }
});
here are all the props you can set which are more high level table related settings:
https://docs.aws.amazon.com/cdk/api/latest/docs/#aws-cdk_aws-dynamodb.Table.html
so for example if I want to set the provision throughput in CDK how do I do it? or set AttributeDefinitions, or indexes?
Do I wait unit table creation is done and then modify the table properties via the SDK UpdateTable call?
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#updateTable-property
Billing Mode
DynamoDB supports two billing modes:
PROVISIONED - the default mode where the table and global secondary
indexes have configured read and write capacity.
PAY_PER_REQUEST - on-demand pricing and scaling. You only pay for what
you use and there is no read and write capacity for the table or its
global secondary indexes.
see the Billing Mode attribute:
cdk docs
Dynamodb is pretty much entirely implemented in CDK, but some of the properties are a bit more difficult to find if you aren't very familiar with.
Billing Mode is the property for Provisioned or on demand read/write capacity. It is a constant, so it would be used something like
billingMode: aws_dynamodb.BillingMode.PAY_PER_REQUEST
With CDK you often have to dive a little bit into the documentation to find what you want, but for the mainstream services - Lambda, S3, Dynamo - these are fully implemented in CDK.
And in any case, for other services that may not be, you can use any of the functions that start with Cfn as these are escape hatches that allow you to basically implement direct cloud formation template jsons from cdk
I am playing around with DynamoDb. I am not sure what is the purpose of StreamSpecification and why we should or shouldn't use it? I have read the documentation Aws - StreamSpecification but it does not explain much as what it does.
MovieTable:
Type: AWS::DynamoDB::Table
Properties:
BillingMode: PAY_PER_REQUEST
AttributeDefinitions:
- AttributeName: "Name"
AttributeType: "S"
- AttributeName: "Genre"
AttributeType: "S"
- AttributeName: "Rating"
AttributeType: "N"
- AttributeName: "DateReleased"
AttributeType: "S"
KeySchema:
- AttributeName: "Name"
KeyType: "HASH"
- AttributeName: "Genre"
KeyType: "RANGE"
- AttributeName: "Rating"
KeyType: "RANGE"
- AttributeName: "DateReleased"
KeyType: "RANGE"
TimeToLiveSpecification:
AttributeName: ExpireAfter
Enabled: false
SSESpecification:
SSEEnabled: true
The StreamSpecification allows you to enable the optional DynamoDB Streams support for this table. DynamoDB Streams allow you to read all the changes to a table as a stream - which you can use for various reasons such as replicating the same changes to another table, checking for suspicious activity, and so on. You can read an introduction to the DynamoDB Streams feature here.
If you don't want to enable a stream on your table (and since you didn't know what this was, you probably don't :-)), you can just ignore StreamSpecification.
I have an Azure CosmosDB SQP API account with one container "EmployeeContainer", with the partition key "personId". I have three different type of collections in this container. Their schema are as shown below:
Person Collection:
{
"Id": "1234569",
"personId" : "P1241234",
"FirstName": "The first name",
"LarstName": "The last name"
}
Person-Department Collection:
{
"Id": "923456757",
"personId" : "P1241234",
"departmentId": "dept01",
"unitId": "unit01",
"dateOfJoining": "joining date"
}
Department-Employees
{
"id": "678678",
"departmentId" : "dept01",
"departmentName": "IT",
"employees" : [
{ "personId": "P1241234" },
{ "personId": "P1241235" },
{ "personId": "P1241236" },
]
}
How will the data be stored in the logical partitions? PersonId is the partition key and all the collections have personId in it. So, the document in the person collection with the person id "P1241234" and the document in the Person-Department collection with person id "P1241234" will be stored in the same logical partition? How will be the data in the Department-Employees be stored?
This design is not optimal. You should combine Person and Person-Department into a single collection using personId as the partition key, then have a second container for departments that has departmentId as it's partition key with each person as another row and any other properties that you would need when querying that collection. Do not write code that queries both containers. Each should have all the data it needs to satisfy any request you make. Then use change feed to keep them in sync.
You can get more details on how to model this article here
Yes, that is true, documents with the same personId will be stored under the same logical partition (regardless of their type\schema). I'm not sure you can create documents without the partition key on a collection with a partition key, but if you can - all of them should be under the same logical partition (but I dont think you can create them).
In my serverless.yml file, I have specified a DynamoDB resource, something to this effect (see below). I'd like to know two things:
Why is it that I'm not seeing these tables get created when they don't exist, forcing me to manually enter AWS console and do so myself?
In my source code (nodejs), i'm not sure how I'd reference a table specified in the yml file instead of hardcoding it.
The two questions above roll up into a singular problem, which is that I'd like to be able to specify the tables in the yml and then when doing a "deploy", have a different table set created per environment.
i.e. If I deploy to "--stage Prod", then table would be "MyTable_Prod". If I deploy to "--stage Dev", then table would be "MyTable_Dev", etc...
Figuring this out would go a long way to making deployments much smoother :).
The serverless.yml section of interest is as follows:
resources:
Resources:
DynamoDbTable:
Type: AWS::DynamoDB::Table
Properties:
TableName: MyHappyFunTable
AttributeDefinitions:
- AttributeName: id
AttributeType: S
KeySchema:
- AttributeName: id
KeyType: HASH
ProvisionedThroughput:
ReadCapacityUnits: 5
WriteCapacityUnits: 5
DynamoDBIamPolicy:
Type: AWS::IAM::Policy
DependsOn: DynamoDbTable
Properties:
PolicyName: lambda-dynamodb
PolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Action:
- dynamodb:Query
- dynamodb:Scan
- dynamodb:GetItem
- dynamodb:PutItem
- dynamodb:UpdateItem
- dynamodb:DeleteItem
Resource: "arn:aws:dynamodb:${opt:region, self:provider.region}:*:table/${self:provider.environment.DYNAMODB_TABLE}"
Roles:
- Ref: IamRoleLambdaExecution
A sample of my horrid 'hardcoded' table names is as follows:
dbParms = {
TableName : "MyTable_Dev",
FilterExpression: "#tid = :tid and #owner = :owner",
ProjectionExpression: "#id, #name",
ExpressionAttributeNames: {
"#tid" : "tenantid",
"#id" : "id",
"#name" : "name",
"#owner" : "owner"
},
ExpressionAttributeValues: {
":tid": tenantId,
":owner": owner
}
};
Note the "MyTable_Dev" ... ideally i'd like that to be something like "MyTable_"
+ {$opt.stage} ... or something to that effect.
In my source code (nodejs), i'm not sure how I'd reference a table specified in the yml file instead of hardcoding it.
I would put your stage in an environment variable that your Lambda function has access to.
In your serverless.yml,
provider:
...
environment:
STAGE: {$opt:stage}
Then, in your code you can access it through process.env.STAGE.
const tableName = 'MyTable_' + process.env.STAGE
I'm try to add 2 tables to serverless.yml to link with DynamoDB.
A part of my code in serverless.yml:
...
resources:
Resources:
ItemsTable:
Type: "AWS::DynamoDB::Table"
Properties:
TableName: "InvoiceConfig"
AttributeDefinitions:
- AttributeName: "providerName"
AttributeType: "S"
KeySchema:
- AttributeName: "providerName"
KeyType: "HASH"
ProvisionedThroughput:
ReadCapacityUnits: 2
WriteCapacityUnits: 2
TableName: "DifferentTermsPages"
AttributeDefinitions:
- AttributeName: "id"
AttributeType: "S"
- AttributeName: "providerName"
AttributeType: "S"
- AttributeName: "productType"
AttributeType: "S"
- AttributeName: "language"
AttributeType: "S"
- AttributeName: "terms"
AttributeType: "L"
KeySchema:
- AttributeName: "id"
KeyType: "HASH"
- AttributeName: "providerName"
KeyType: "HASH"
- AttributeName: "productType"
KeyType: "HASH"
- AttributeName: "language"
KeyType: "HASH"
- AttributeName: "terms"
KeyType: "HASH"
ProvisionedThroughput:
ReadCapacityUnits: 10
WriteCapacityUnits: 10
Is that correct??
My tables are:
InvoiceConfig: with columns: providerName (String)
DifferentTermsPages: id (String), providerName (String), productType (String), language (String), terms (list)
Do I need more changes in serverles.yml? what is the meaning of the expressions "ReadCapacityUnits" and "WriteCapacityUnits"?
There should be some separation between two resources (i.e. two DynamoDB tables).
Note:-
You can define only key attributes while creating the DynamoDB table. In other words, you don't need to define all other non-key attributes.
Try this:-
Resources:
ItemsTable:
Type: "AWS::DynamoDB::Table"
Properties:
TableName: "InvoiceConfig"
AttributeDefinitions:
- AttributeName: "providerName"
AttributeType: "S"
KeySchema:
- AttributeName: "providerName"
KeyType: "HASH"
ProvisionedThroughput:
ReadCapacityUnits: 2
WriteCapacityUnits: 2
DifferentTermsPages:
Type: "AWS::DynamoDB::Table"
Properties:
TableName: "DifferentTermsPages"
AttributeDefinitions:
- AttributeName: "id"
AttributeType: "S"
KeySchema:
- AttributeName: "id"
KeyType: "HASH"
ProvisionedThroughput:
ReadCapacityUnits: 10
WriteCapacityUnits: 10
Read and Write capacity units:-
You specify throughput capacity in terms of read capacity units and
write capacity units:
One read capacity unit represents one strongly consistent read per
second, or two eventually consistent reads per second, for an item up
to 4 KB in size. If you need to read an item that is larger than 4 KB,
DynamoDB will need to consume additional read capacity units. The
total number of read capacity units required depends on the item size,
and whether you want an eventually consistent or strongly consistent
read. One write capacity unit represents one write per second for an
item up to 1 KB in size. If you need to write an item that is larger
than 1 KB, DynamoDB will need to consume additional write capacity
units. The total number of write capacity units required depends on
the item size.
Read and write capacity units
Short Answer:
Read and Write Capacity Units are the max size of data the db is allowed to processes per second, if you go over this amount in any second your request would throttle.
Alternative:
It might be easier to just use DynamoDB On-Demand and pay for the Db table usages rather than calculating WCU and RCU.
Example
Here is an example of 3 tables added in a formatted manner and without semiquotes:
resources:
Resources:
myDynamoDBTable1:
Type: AWS::DynamoDB::Table
Properties:
TableName: Table1
AttributeDefinitions:
- AttributeName: ColumnName1
AttributeType: S
- AttributeName: ColumnName2
AttributeType: N
KeySchema:
- AttributeName: ColumnName1
KeyType: HASH
- AttributeName: ColumnName2
KeyType: RANGE
ProvisionedThroughput:
ReadCapacityUnits: 1
WriteCapacityUnits: 1
myDynamoDBTable2:
Type: AWS::DynamoDB::Table
Properties:
TableName: Table2
AttributeDefinitions:
- AttributeName: ColumnName1
AttributeType: S
KeySchema:
- AttributeName: ColumnName1
KeyType: HASH
BillingMode: PAY_PER_REQUEST
myDynamoDBTableN:
Type: AWS::DynamoDB::Table
Properties:
TableName: TableN
AttributeDefinitions:
- AttributeName: ColumnName1
AttributeType: S
KeySchema:
- AttributeName: ColumnName1
KeyType: HASH
BillingMode: PAY_PER_REQUEST
Additional Explanation with Examples:
Back to Read/Write Capacity Mode:
Write Capacity Units (WCU) formula: Round up (DataSize / 1KB)
Example1:
imagine you foresee a traffic of writing 10KB of data per second into the db. Using the WCU formula, you would need (10KB / 1KB) = 10WCU.
Example2:
Expecting writing traffic of 7.5KB of data to the db, we would need: (7.5KB / 1KB) = 8WCU
Reading Capacity Units (RCU) depends on Strongly or Eventually Consistent models.
Strongly Consistent mode: Round up (DataSize / 4KB)
Eventually Consistent mode: Round up(DataSize / 4KB) / 2