I am playing around with DynamoDB. I am not sure what the purpose of StreamSpecification is, or why we should or shouldn't use it. I have read the AWS StreamSpecification documentation, but it does not explain much about what it does.
MovieTable:
  Type: AWS::DynamoDB::Table
  Properties:
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      - AttributeName: "Name"
        AttributeType: "S"
      - AttributeName: "Genre"
        AttributeType: "S"
      - AttributeName: "Rating"
        AttributeType: "N"
      - AttributeName: "DateReleased"
        AttributeType: "S"
    KeySchema:
      - AttributeName: "Name"
        KeyType: "HASH"
      - AttributeName: "Genre"
        KeyType: "RANGE"
      - AttributeName: "Rating"
        KeyType: "RANGE"
      - AttributeName: "DateReleased"
        KeyType: "RANGE"
    TimeToLiveSpecification:
      AttributeName: ExpireAfter
      Enabled: false
    SSESpecification:
      SSEEnabled: true
The StreamSpecification allows you to enable the optional DynamoDB Streams support for this table. DynamoDB Streams allow you to read all the changes to a table as a stream - which you can use for various reasons such as replicating the same changes to another table, checking for suspicious activity, and so on. You can read an introduction to the DynamoDB Streams feature here.
If you don't want to enable a stream on your table (and since you didn't know what this was, you probably don't :-)), you can just ignore StreamSpecification.
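As a rough illustration of what enabling a stream looks like when a table is created through the JavaScript SDK instead of a template: the createTable parameters accept a StreamSpecification with StreamEnabled and StreamViewType. The helper name withStream and the simplified Movies table below are made up for this sketch, and nothing here actually calls AWS.

```javascript
// View types DynamoDB accepts for a stream.
const STREAM_VIEW_TYPES = ["KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"];

// Hypothetical helper: returns a copy of createTable params with a stream enabled.
function withStream(params, viewType) {
  if (!STREAM_VIEW_TYPES.includes(viewType)) {
    throw new Error("Unknown StreamViewType: " + viewType);
  }
  return {
    ...params,
    StreamSpecification: { StreamEnabled: true, StreamViewType: viewType },
  };
}

// Simplified table definition, used only for illustration.
const movieTableParams = withStream(
  {
    TableName: "Movies",
    BillingMode: "PAY_PER_REQUEST",
    AttributeDefinitions: [{ AttributeName: "Name", AttributeType: "S" }],
    KeySchema: [{ AttributeName: "Name", KeyType: "HASH" }],
  },
  "NEW_AND_OLD_IMAGES"
);

console.log(JSON.stringify(movieTableParams.StreamSpecification));
```

Leaving StreamSpecification out of the parameters altogether creates the table without a stream, which matches the advice above.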
I am very new to AWS. I have been reading the DynamoDB SDK documentation, and the properties you can specify when creating a table with the SDK are far more numerous than the props you pass when creating a table with the AWS CDK.
SDK example:
var AWS = require("aws-sdk");

AWS.config.update({
  region: "us-west-2",
  endpoint: "http://localhost:8000"
});

var dynamodb = new AWS.DynamoDB();

var params = {
  TableName: "Movies",
  KeySchema: [
    { AttributeName: "year", KeyType: "HASH" },   // Partition key
    { AttributeName: "title", KeyType: "RANGE" }  // Sort key
  ],
  AttributeDefinitions: [
    { AttributeName: "year", AttributeType: "N" },
    { AttributeName: "title", AttributeType: "S" }
  ],
  ProvisionedThroughput: {
    ReadCapacityUnits: 10,
    WriteCapacityUnits: 10
  }
};

dynamodb.createTable(params, function(err, data) {
  if (err) {
    console.error("Unable to create table. Error JSON:", JSON.stringify(err, null, 2));
  } else {
    console.log("Created table. Table description JSON:", JSON.stringify(data, null, 2));
  }
});
CDK example:
import * as dynamodb from '@aws-cdk/aws-dynamodb';

const table = new dynamodb.Table(this, 'Hits', {
  partitionKey: { name: 'path', type: dynamodb.AttributeType.STRING }
});
Here are all the props you can set, which are higher-level table settings:
https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-dynamodb.Table.html
So, for example, if I want to set the provisioned throughput in CDK, how do I do it? Or set AttributeDefinitions, or indexes?
Do I wait until table creation is done and then modify the table properties via the SDK UpdateTable call?
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#updateTable-property
Billing Mode
DynamoDB supports two billing modes:
PROVISIONED - the default mode where the table and global secondary indexes have configured read and write capacity.
PAY_PER_REQUEST - on-demand pricing and scaling. You only pay for what you use and there is no read and write capacity for the table or its global secondary indexes.
See the Billing Mode attribute in the CDK docs.
DynamoDB is pretty much entirely implemented in CDK, but some of the properties are harder to find if you aren't very familiar with it.
Billing Mode is the property that selects provisioned or on-demand read/write capacity (in provisioned mode, the readCapacity and writeCapacity props set the throughput). It is an enum constant, so it would be used something like
billingMode: aws_dynamodb.BillingMode.PAY_PER_REQUEST
With CDK you often have to dig into the documentation a little to find what you want, but the mainstream services (Lambda, S3, DynamoDB) are fully implemented in CDK.
And in any case, for services that may not be, you can use any of the constructs whose names start with Cfn. These are escape hatches that let you declare the underlying CloudFormation resources directly from CDK.
I've been trying to find an explanation for this situation, but I haven't found one.
I have two DynamoDB tables, both with a composite primary key: one HASH key and one RANGE key.
In the table where both keys are strings, I can query the database with just the HASH key like this (using the Node SDK):
const params = {
  TableName: process.env.DYNAMODB_TABLE,
  Key: { id: sessionId },
};
const { Item } = await dynamoDb.get(params);
However, the same operation on the other table throws this error: The number of conditions on the keys is invalid
Here are the two table schemas:
This table definition allows me to use the query above.
SessionsDynamoDbTable:
  Type: 'AWS::DynamoDB::Table'
  DeletionPolicy: Retain
  Properties:
    AttributeDefinitions:
      -
        AttributeName: userId
        AttributeType: S
      -
        AttributeName: id
        AttributeType: S
      -
        AttributeName: startDate
        AttributeType: S
    KeySchema:
      -
        AttributeName: userId
        KeyType: HASH
      -
        AttributeName: id
        KeyType: RANGE
    LocalSecondaryIndexes:
      - IndexName: byDate
        KeySchema:
          - AttributeName: userId
            KeyType: HASH
          - AttributeName: startDate
            KeyType: RANGE
        Projection:
          NonKeyAttributes:
            - endDate
            - name
          ProjectionType: INCLUDE
    BillingMode: PAY_PER_REQUEST
    TableName: ${self:provider.environment.DYNAMODB_TABLE}
This one does not allow me to run the same query:
SessionsTable:
  Type: 'AWS::DynamoDB::Table'
  TimeToLiveDescription:
    AttributeName: expiresAt
    Enabled: true
  Properties:
    AttributeDefinitions:
      -
        AttributeName: id
        AttributeType: S
      -
        AttributeName: expiresAt
        AttributeType: N
    KeySchema:
      -
        AttributeName: id
        KeyType: HASH
      -
        AttributeName: expiresAt
        KeyType: RANGE
    BillingMode: PAY_PER_REQUEST
    TableName: ${self:provider.environment.DYNAMODB_TABLE}
I'm including the entire table definition because I don't know if secondary indexes can have an impact or not on this problem.
You must provide the name of the partition key attribute and a single value for that attribute. Query returns all items with that partition key value. Optionally, you can provide a sort key attribute and use a comparison operator to refine the search results.
get(params, callback) ⇒ AWS.Request
Returns a set of attributes for the item with the given primary key by delegating to AWS.DynamoDB.getItem().
In SessionsTable, id is the HASH key, while in SessionsDynamoDbTable, id is the RANGE key. For SessionsDynamoDbTable you should provide the HASH key in addition to the RANGE key.
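A small sketch of the point above: get() (which delegates to GetItem) needs a value for every attribute in the table's key schema, so on a table with both a HASH and a RANGE key, passing only one of them is rejected. The helper name missingKeyAttributes is hypothetical, and the two schemas are copied from the table definitions in the question.

```javascript
// Hypothetical helper: lists the key-schema attributes a get() Key leaves out.
function missingKeyAttributes(keySchema, key) {
  return keySchema
    .map((element) => element.AttributeName)
    .filter((name) => !(name in key));
}

// Key schemas taken from the two table definitions in the question.
const sessionsDynamoDbTableSchema = [
  { AttributeName: "userId", KeyType: "HASH" },
  { AttributeName: "id", KeyType: "RANGE" },
];
const sessionsTableSchema = [
  { AttributeName: "id", KeyType: "HASH" },
  { AttributeName: "expiresAt", KeyType: "RANGE" },
];

// The Key from the question only carries "id", so each table is missing one attribute.
const key = { id: "some-session-id" };
console.log(missingKeyAttributes(sessionsDynamoDbTableSchema, key));
console.log(missingKeyAttributes(sessionsTableSchema, key));
```

If only the partition key value is known, query() with a key condition on that attribute is the operation to reach for, as the quoted Query documentation describes.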
In my serverless.yml file, I have specified a DynamoDB resource, something to this effect (see below). I'd like to know two things:
Why am I not seeing these tables get created when they don't exist, forcing me to go into the AWS console and create them myself?
In my source code (Node.js), I'm not sure how to reference a table specified in the yml file instead of hardcoding it.
The two questions above roll up into a single problem: I'd like to specify the tables in the yml and then, when doing a "deploy", have a different set of tables created per environment.
That is, if I deploy to "--stage Prod", the table would be "MyTable_Prod". If I deploy to "--stage Dev", the table would be "MyTable_Dev", and so on.
Figuring this out would go a long way toward making deployments much smoother :).
The serverless.yml section of interest is as follows:
resources:
  Resources:
    DynamoDbTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: MyHappyFunTable
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: 5
          WriteCapacityUnits: 5
    DynamoDBIamPolicy:
      Type: AWS::IAM::Policy
      DependsOn: DynamoDbTable
      Properties:
        PolicyName: lambda-dynamodb
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action:
                - dynamodb:Query
                - dynamodb:Scan
                - dynamodb:GetItem
                - dynamodb:PutItem
                - dynamodb:UpdateItem
                - dynamodb:DeleteItem
              Resource: "arn:aws:dynamodb:${opt:region, self:provider.region}:*:table/${self:provider.environment.DYNAMODB_TABLE}"
        Roles:
          - Ref: IamRoleLambdaExecution
A sample of my horrid 'hardcoded' table names is as follows:
dbParms = {
  TableName: "MyTable_Dev",
  FilterExpression: "#tid = :tid and #owner = :owner",
  ProjectionExpression: "#id, #name",
  ExpressionAttributeNames: {
    "#tid": "tenantid",
    "#id": "id",
    "#name": "name",
    "#owner": "owner"
  },
  ExpressionAttributeValues: {
    ":tid": tenantId,
    ":owner": owner
  }
};
Note the "MyTable_Dev" ... ideally I'd like that to be something like "MyTable_" + {$opt.stage} ... or something to that effect.
In my source code (Node.js), I'm not sure how to reference a table specified in the yml file instead of hardcoding it.
I would put your stage in an environment variable that your Lambda function has access to.
In your serverless.yml,
provider:
  ...
  environment:
    STAGE: ${opt:stage}
Then, in your code you can access it through process.env.STAGE.
const tableName = 'MyTable_' + process.env.STAGE
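A minimal sketch of how that might be wired into the parameters from the question, assuming the STAGE variable is set as above. The helper names tableNameForStage and buildScanParams are made up for illustration, and the explicit error for a missing STAGE is a design choice of this sketch so that a misconfigured deployment fails loudly instead of hitting a table named "MyTable_undefined".

```javascript
// Hypothetical helper: builds the per-stage table name from the STAGE variable.
function tableNameForStage(baseName, env = process.env) {
  const stage = env.STAGE;
  if (!stage) {
    throw new Error("STAGE environment variable is not set");
  }
  return `${baseName}_${stage}`;
}

// Same shape as the parameters in the question, minus the hardcoded table name.
function buildScanParams(tenantId, owner, env = process.env) {
  return {
    TableName: tableNameForStage("MyTable", env),
    FilterExpression: "#tid = :tid and #owner = :owner",
    ProjectionExpression: "#id, #name",
    ExpressionAttributeNames: {
      "#tid": "tenantid",
      "#id": "id",
      "#name": "name",
      "#owner": "owner",
    },
    ExpressionAttributeValues: {
      ":tid": tenantId,
      ":owner": owner,
    },
  };
}

// Passing the environment explicitly makes the behaviour easy to check locally.
console.log(buildScanParams("tenant-1", "alice", { STAGE: "Dev" }).TableName);
```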
I'm trying to add 2 tables to serverless.yml to link with DynamoDB.
A part of my code in serverless.yml:
...
resources:
  Resources:
    ItemsTable:
      Type: "AWS::DynamoDB::Table"
      Properties:
        TableName: "InvoiceConfig"
        AttributeDefinitions:
          - AttributeName: "providerName"
            AttributeType: "S"
        KeySchema:
          - AttributeName: "providerName"
            KeyType: "HASH"
        ProvisionedThroughput:
          ReadCapacityUnits: 2
          WriteCapacityUnits: 2
        TableName: "DifferentTermsPages"
        AttributeDefinitions:
          - AttributeName: "id"
            AttributeType: "S"
          - AttributeName: "providerName"
            AttributeType: "S"
          - AttributeName: "productType"
            AttributeType: "S"
          - AttributeName: "language"
            AttributeType: "S"
          - AttributeName: "terms"
            AttributeType: "L"
        KeySchema:
          - AttributeName: "id"
            KeyType: "HASH"
          - AttributeName: "providerName"
            KeyType: "HASH"
          - AttributeName: "productType"
            KeyType: "HASH"
          - AttributeName: "language"
            KeyType: "HASH"
          - AttributeName: "terms"
            KeyType: "HASH"
        ProvisionedThroughput:
          ReadCapacityUnits: 10
          WriteCapacityUnits: 10
Is that correct?
My tables are:
InvoiceConfig: with columns: providerName (String)
DifferentTermsPages: id (String), providerName (String), productType (String), language (String), terms (list)
Do I need more changes in serverless.yml? What do "ReadCapacityUnits" and "WriteCapacityUnits" mean?
The two resources (i.e. the two DynamoDB tables) need to be separated: each table must be its own entry under Resources, with its own Type and Properties.
Note:
You only define the key attributes when creating a DynamoDB table. In other words, you don't need to (and should not) list the non-key attributes in AttributeDefinitions.
Try this:
Resources:
  ItemsTable:
    Type: "AWS::DynamoDB::Table"
    Properties:
      TableName: "InvoiceConfig"
      AttributeDefinitions:
        - AttributeName: "providerName"
          AttributeType: "S"
      KeySchema:
        - AttributeName: "providerName"
          KeyType: "HASH"
      ProvisionedThroughput:
        ReadCapacityUnits: 2
        WriteCapacityUnits: 2
  DifferentTermsPages:
    Type: "AWS::DynamoDB::Table"
    Properties:
      TableName: "DifferentTermsPages"
      AttributeDefinitions:
        - AttributeName: "id"
          AttributeType: "S"
      KeySchema:
        - AttributeName: "id"
          KeyType: "HASH"
      ProvisionedThroughput:
        ReadCapacityUnits: 10
        WriteCapacityUnits: 10
Read and write capacity units:
You specify throughput capacity in terms of read capacity units and write capacity units:
One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size. If you need to read an item that is larger than 4 KB, DynamoDB will need to consume additional read capacity units. The total number of read capacity units required depends on the item size, and whether you want an eventually consistent or strongly consistent read.
One write capacity unit represents one write per second for an item up to 1 KB in size. If you need to write an item that is larger than 1 KB, DynamoDB will need to consume additional write capacity units. The total number of write capacity units required depends on the item size.
(Quoted from the DynamoDB documentation on read and write capacity units.)
Short Answer:
Read and write capacity units cap how much data the table is allowed to process per second. If you go over this amount in any second, your requests are throttled.
Alternative:
It might be easier to just use DynamoDB On-Demand and pay for the table's actual usage rather than calculating WCU and RCU.
Example
Here is an example of 3 tables, properly indented and without quotes:
resources:
  Resources:
    myDynamoDBTable1:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: Table1
        AttributeDefinitions:
          - AttributeName: ColumnName1
            AttributeType: S
          - AttributeName: ColumnName2
            AttributeType: N
        KeySchema:
          - AttributeName: ColumnName1
            KeyType: HASH
          - AttributeName: ColumnName2
            KeyType: RANGE
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
    myDynamoDBTable2:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: Table2
        AttributeDefinitions:
          - AttributeName: ColumnName1
            AttributeType: S
        KeySchema:
          - AttributeName: ColumnName1
            KeyType: HASH
        BillingMode: PAY_PER_REQUEST
    myDynamoDBTableN:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: TableN
        AttributeDefinitions:
          - AttributeName: ColumnName1
            AttributeType: S
        KeySchema:
          - AttributeName: ColumnName1
            KeyType: HASH
        BillingMode: PAY_PER_REQUEST
Additional Explanation with Examples:
Back to read/write capacity mode:
Write Capacity Units (WCU) formula: round up (DataSize / 1 KB)
Example 1:
Imagine you foresee traffic of 10 KB of data written per second into the db. Using the WCU formula, you would need round up (10 KB / 1 KB) = 10 WCU.
Example 2:
Expecting write traffic of 7.5 KB of data per second, we would need round up (7.5 KB / 1 KB) = 8 WCU.
Read Capacity Units (RCU) depend on whether reads are strongly or eventually consistent.
Strongly consistent mode: round up (DataSize / 4 KB)
Eventually consistent mode: round up (DataSize / 4 KB) / 2
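The formulas above can be sketched as a few small helpers. This is only a rough sketch under stated assumptions: the rounding is applied to each item (1 KB steps for writes, 4 KB steps for reads), 1 KB is taken as 1024 bytes, and every request costs at least one step. The function names are made up for illustration and are not part of any AWS library.

```javascript
const KB = 1024; // assumption: 1 KB = 1024 bytes

// Write units consumed by writing one item: item size rounded up to the next 1 KB.
function writeUnitsPerItem(itemSizeBytes) {
  return Math.max(1, Math.ceil(itemSizeBytes / KB));
}

// Read units consumed by reading one item: item size rounded up to the next 4 KB,
// halved when the read is eventually consistent.
function readUnitsPerItem(itemSizeBytes, stronglyConsistent) {
  const units = Math.max(1, Math.ceil(itemSizeBytes / (4 * KB)));
  return stronglyConsistent ? units : units / 2;
}

// Capacity to provision for a steady request rate, rounded up to whole units.
function provisionedUnits(unitsPerItem, itemsPerSecond) {
  return Math.ceil(unitsPerItem * itemsPerSecond);
}

// The two write examples from the answer, treated as one item written per second.
console.log(provisionedUnits(writeUnitsPerItem(10 * KB), 1));
console.log(provisionedUnits(writeUnitsPerItem(7.5 * KB), 1));

// A 6 KB item read three times per second, strongly vs. eventually consistent.
console.log(provisionedUnits(readUnitsPerItem(6 * KB, true), 3));
console.log(provisionedUnits(readUnitsPerItem(6 * KB, false), 3));
```

Because the rounding happens per item, many small items can cost more than the raw byte total suggests: ten 100-byte writes per second need 10 WCU under these rules, even though they total less than 1 KB.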
Well, I am having an issue that I've been dealing with for the last 2 days and still seem to have made no progress on.
Basically, I am trying to develop a skill for Amazon's Echo Dot, but my particular skill requires persistent data. I took to the docs and found information on account linking and DynamoDB; account linking seemed too complex for a simple research project, so I went with DynamoDB.
I used a Lambda function, and it ran fine until I added the DynamoDB table line:
alexa.dynamoDBTableName = 'rememberThisDB';
That line completely stops my skill from working and returns the following message:
The remote endpoint could not be called, or the response it returned was invalid.
I honestly have no idea how to deal with it; I am completely new to the whole AWS concept so I don't even know how to get the actual error message that the Lambda function is returning.
I changed the role and gave it the following configuration:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:DeleteItem",
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Scan",
        "dynamodb:UpdateItem"
      ],
      "Resource": "*Yes, I did put the correct ARN*"
    }
  ]
}
But that didn't really change anything; it still returned the same error.
The issue is, I'm not doing anything at all with DynamoDB. I am simply setting the dynamoDBTableName property of the alexa object, that's it.
Yes, the DynamoDB table exists.
I feel like my head is about to blow up, so any help would be greatly appreciated.
UPDATE: I found out how to see the logs. Here is the latest log: Error fetching user state: ValidationException: The provided key element does not match the schema. I'm not sure why it would give that error, since I never queried anything; the only thing I did was declare the table name.
Just to document the resolution of the question in the comments and so that this question doesn't remain "unanswered" on SO:
Assuming you're using the alexa-skills-kit-sdk-for-nodejs, your table should have a single userId string HASH key.
var newTableParams = {
  AttributeDefinitions: [
    {
      AttributeName: 'userId',
      AttributeType: 'S'
    }
  ],
  KeySchema: [
    {
      AttributeName: 'userId',
      KeyType: 'HASH'
    }
  ],
  ProvisionedThroughput: {
    ReadCapacityUnits: 5,
    WriteCapacityUnits: 5
  }
}
It turned out the user did not have the appropriate key schema set up for their DynamoDB table.