I'm trying to create a table with three fields in DynamoDB using flask-dynamo, and I get this error:
botocore.exceptions.ClientError
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateTable operation: The number of attributes in key schema must match the number of attributes defined in attribute definitions
Here is the configuration I use to create the DynamoDB table:
@app.route('/create_table')
def create_table():
    app.config['DYNAMO_TABLES'] = [
        {
            'TableName': "user_detail",
            'KeySchema': [
                {'AttributeName': "timestamp", 'KeyType': "HASH"},
                {'AttributeName': "question", 'KeyType': "RANGE"},
            ],
            'AttributeDefinitions': [
                {'AttributeName': "timestamp", 'AttributeType': "S"},
                {'AttributeName': "question", 'AttributeType': "N"},
                {'AttributeName': "user", 'AttributeType': "N"},
            ],
            'ProvisionedThroughput': {
                'ReadCapacityUnits': 40,
                'WriteCapacityUnits': 40
            }
        }]
    dynamo = Dynamo(app)
    with app.app_context():
        dynamo.create_all()
    return "Table created"
Thanks in advance
You need to remove the following line:
{'AttributeName': "user", 'AttributeType': "N"},
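With DynamoDB (as with most NoSQL databases) you don't need to declare every item attribute ahead of time. AttributeDefinitions should list only the attributes that appear in a key schema: here the hash and range keys (plus the keys of any secondary indexes, if the table has them).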
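A minimal sketch of that rule, assuming a table definition shaped like the dict in the question. The helper name `unused_attribute_definitions` is hypothetical (it is not part of boto3 or flask-dynamo); it only lists attributes that are defined but never used as a table or index key, which is the mismatch the ValidationException points at.

```python
def unused_attribute_definitions(table):
    """Return attribute names that are defined but not used by any key schema."""
    used = {key["AttributeName"] for key in table.get("KeySchema", [])}
    # Keys of secondary indexes also count as "used" attributes.
    for index_kind in ("GlobalSecondaryIndexes", "LocalSecondaryIndexes"):
        for index in table.get(index_kind, []):
            used.update(key["AttributeName"] for key in index["KeySchema"])
    defined = [d["AttributeName"] for d in table.get("AttributeDefinitions", [])]
    return [name for name in defined if name not in used]


table = {
    "TableName": "user_detail",
    "KeySchema": [
        {"AttributeName": "timestamp", "KeyType": "HASH"},
        {"AttributeName": "question", "KeyType": "RANGE"},
    ],
    "AttributeDefinitions": [
        {"AttributeName": "timestamp", "AttributeType": "S"},
        {"AttributeName": "question", "AttributeType": "N"},
        {"AttributeName": "user", "AttributeType": "N"},
    ],
}

print(unused_attribute_definitions(table))  # ['user']
```

Running a check like this before calling create_all() would flag "user" as the entry to drop.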
I'm working with the Cosmos SDK in my Node.js app. I've been able to query the database successfully, but I'm having trouble with the replace method. I want to update a single item in the database, either by the unique 'id' field or by the built-in '_rid' field.
Here is how I currently have it written (it returns the error: Entity with the specified id does not exist in the system):
const { resource: updatedItem } = await client.database(databaseId).container(contianerId).item('2INhAI1fcdkSAAAAERFAAA==', 'TX').replace(newJsonObject);
Sample Item:
'state' is the partition key
{
"DateTime": "01-28-19 11:55:48",
"id": "15",
"resolved": false,
"state": "TX",
"_rid": "2INhAI1fcdkSAAAAAAAAAA==",
"_self": "dbs/2INhAA==/colls/2INhAI1fcdk=/docs/2INhAI1fcdkSAAAAAAAAAA==/",
"_etag": "\"fd03208d-0000-0700-0000-5fc68a550000\"",
"_attachments": "attachments/",
"_ts": 1606847061
}
This ended up being the correct syntax for what I needed:
const { resource: updatedItem } = await client.database(databaseId).container(contianerId).item('15', 'TX').replace(newJsonObject);
where the two arguments passed to item() are the item's id value and the value of its partition key (state in my example).
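The same idea as a rough Python sketch. It assumes the azure-cosmos Python SDK's `replace_item(item, body)` call on a container object; `FakeContainer` and `replace_by_id` are hypothetical names, and the fake is only an in-memory stand-in so the example runs without an Azure account. The point it illustrates is that an item is addressed by its id plus partition key value, not by `_rid`.

```python
class FakeContainer:
    """In-memory stand-in for a Cosmos container (illustration only)."""

    def __init__(self, docs):
        # Items are addressed by (partition key value, id), never by _rid.
        self.docs = {(doc["state"], doc["id"]): doc for doc in docs}

    def replace_item(self, item, body):
        key = (body["state"], item)
        if key not in self.docs:
            raise KeyError("Entity with the specified id does not exist")
        self.docs[key] = body
        return body


def replace_by_id(container, item_id, new_body):
    """Replace a document addressed by its user-defined id."""
    body = dict(new_body, id=item_id)  # the body must carry the same id
    return container.replace_item(item=item_id, body=body)


container = FakeContainer([{"id": "15", "state": "TX", "resolved": False}])
updated = replace_by_id(container, "15", {"state": "TX", "resolved": True})
print(updated["resolved"])  # True
```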
I have a mongo db collection users with the following data format
{
"name": "abc",
"email": "abc#xyz.com"
"address": {
"city": "Gurgaon",
"state": "Haryana"
}
}
Now I'm creating a datasource, an index, and an indexer for this collection using azure rest apis.
Datasource
def create_datasource():
    request_body = {
        "name": 'users-datasource',
        "description": "",
        "type": "cosmosdb",
        "credentials": {
            "connectionString": "<db connection url>"
        },
        "container": {"name": "users"},
        "dataChangeDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
            "highWaterMarkColumnName": "_ts"
        }
    }
    resp = requests.post(url="<create-datasource-api-url>", data=json.dumps(request_body),
                         headers=headers)
Index for the above datasource
def create_index(config):
    request_body = {
        'name': "users-index",
        'fields': [
            {
                'name': 'name',
                'type': 'Edm.String'
            },
            {
                'name': 'email',
                'type': 'Edm.DateTimeOffset'
            },
            {
                'name': 'address',
                'type': 'Edm.String'
            },
            {
                'name': 'doc_id',
                'type': 'Edm.String',
                'key': True
            }
        ]
    }
    resp = requests.post(url="<azure-create-index-api-url>", data=json.dumps(request_body),
                         headers=config.headers)
Now the indexer for the above datasource and index
def create_interviews_indexer(config):
    request_body = {
        "name": "users-indexer",
        "dataSourceName": "users-datasource",
        "targetIndexName": "users-index",
        "schedule": {"interval": "PT5M"},
        "fieldMappings": [
            {"sourceFieldName": "address.city", "targetFieldName": "address"},
        ]
    }
    resp = requests.post("create-indexer-pi-url", data=json.dumps(request_body),
                         headers=config.headers)
This creates the indexer without any exception, but when I check the retrieved data in azure portal for the users-indexer, the address field is null and is not getting any value from address.city field mapping that is provided while creating the indexer.
I have also tried the following mapping, but it's not working either.
"fieldMappings": [
    {"sourceFieldName": "/address/city", "targetFieldName": "address"},
]
The azure documentation also does not say anything about this kind of mapping. So if anyone can help me on this, it will be very much appreciated.
The container element in the data source definition lets you specify a query that flattens your JSON document (Ref: https://learn.microsoft.com/en-us/rest/api/searchservice/create-data-source), so instead of mapping fields in the indexer definition, you can write a query that returns the output in the desired shape.
Your code for creating data source in that case would be:
def create_datasource():
    request_body = {
        "name": 'users-datasource',
        "description": "",
        "type": "cosmosdb",
        "credentials": {
            "connectionString": "<db connection url>",
        },
        "container": {
            "name": "users",
            "query": "SELECT a.name, a.email, a.address.city as address FROM a",
        },
        "dataChangeDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
            "highWaterMarkColumnName": "_ts"
        }
    }
    resp = requests.post(url="<create-datasource-api-url>", data=json.dumps(request_body),
                         headers=headers)
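To make the effect of that query concrete, here is a small in-memory sketch of the projection it performs. `flatten_user` is a hypothetical helper written for illustration; it only mimics the shape of the row the query would hand to the indexer and does not call any Azure API.

```python
def flatten_user(doc):
    """Mimic: SELECT a.name, a.email, a.address.city AS address FROM a."""
    return {
        "name": doc.get("name"),
        "email": doc.get("email"),
        # The nested city value is promoted to a top-level "address" field.
        "address": (doc.get("address") or {}).get("city"),
    }


doc = {
    "name": "abc",
    "email": "abc@xyz.com",
    "address": {"city": "Gurgaon", "state": "Haryana"},
}

print(flatten_user(doc))
# {'name': 'abc', 'email': 'abc@xyz.com', 'address': 'Gurgaon'}
```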
Support for MongoDb API flavor is in public preview - you need to explicitly indicate Mongo in the datasource's connection string as described in this article. Also note that with Mongo datasources, custom queries suggested by the previous response are not supported afaik. Hopefully someone from the team would clarify the current state of this support.
It works for me with the field mapping below. Azure Search queries return values for address correctly.
"fieldMappings": [{"sourceFieldName": "address.city", "targetFieldName": "address"}]
I made a few changes to the code you provided:
- When creating the indexer, I removed the extra comma at the end of fieldMappings.
- When creating the index, I defined the email field as Edm.String rather than Edm.DateTimeOffset.
Please make sure you are using the Preview API version, since MongoDB API support is in preview in Azure Search.
For example: https://{azure search name}.search.windows.net/indexers?api-version=2019-05-06-Preview
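A minimal sketch that puts the mapping and the preview endpoint above together. `build_indexer_request` and the service name "my-search" are hypothetical; the function only builds the URL and JSON payload and leaves the actual HTTP call (and the api-key header) to the caller.

```python
import json


def build_indexer_request(service_name, api_version="2019-05-06-Preview"):
    """Return (url, json_payload) for creating the users-indexer."""
    url = (
        f"https://{service_name}.search.windows.net/indexers"
        f"?api-version={api_version}"
    )
    body = {
        "name": "users-indexer",
        "dataSourceName": "users-datasource",
        "targetIndexName": "users-index",
        "schedule": {"interval": "PT5M"},
        "fieldMappings": [
            {"sourceFieldName": "address.city", "targetFieldName": "address"}
        ],
    }
    return url, json.dumps(body)


url, payload = build_indexer_request("my-search")  # hypothetical service name
print(url)
```

The payload can then be passed as `data=payload` to `requests.post`, as in the question's code.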
I'm trying to create an item in AWS DynamoDB using boto3 and regardless of what I try I can't manage to get an item of type 'SS' created. Here's my code:
client = boto3.resource('dynamodb', region_name=region)
table = client.Table(config[region]['table'])
sched = {
    "begintime": begintime,
    "description": description,
    "endtime": endtime,
    "name": name,
    "type": "period",
    "weekdays": [weekdays]
}
table.put_item(Item=sched)
The other columns work fine but regardless of what I try, weekdays always ends up as an 'S' type. For reference, this is what one of the other items looks like in the same table:
{'begintime': '09:00', 'endtime': '18:00', 'description': 'Office hours', 'weekdays': {'mon-fri'}, 'name': 'office-hours', 'type': 'period'}
Trying to convert this to a Python structure obviously fails so I'm not sure how it's possible to insert a new item.
To indicate an attribute of type SS (String Set) using the boto3 DynamoDB resource-level methods, you need to supply a set rather than a simple list. For example:
import boto3
res = boto3.resource('dynamodb', region_name=region)
table = res.Table(config[region]['table'])
sched = {
    "begintime": '09:00',
    "description": 'Hello there',
    "endtime": '14:00',
    "name": 'james',
    "type": "period",
    "weekdays": set(['mon', 'wed', 'fri'])
}
table.put_item(Item=sched)
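The difference between a list and a set can be illustrated with a simplified model of the Python-to-DynamoDB type mapping used by the resource-level API. `dynamodb_type_tag` is a hypothetical helper that covers only three cases (str, non-empty set of str, list); it is not boto3's serializer, just a sketch of the rule that a Python set becomes SS while a list becomes L.

```python
def dynamodb_type_tag(value):
    """Simplified model of which DynamoDB type a Python value maps to."""
    if isinstance(value, str):
        return "S"
    # A non-empty Python set of strings maps to a String Set.
    if isinstance(value, set) and value and all(isinstance(v, str) for v in value):
        return "SS"
    # A Python list maps to a DynamoDB List, even if it holds only strings.
    if isinstance(value, list):
        return "L"
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


print(dynamodb_type_tag(["mon-fri"]))  # L
print(dynamodb_type_tag({"mon-fri"}))  # SS
```

Note that DynamoDB does not accept empty sets, which is why the sketch requires at least one element.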
As a follow-up to @jarmod's answer:
If you want to call update_item with a String Set, pass the set via the ExpressionAttributeValues parameter, as shown below:
entry = table.update_item(
    # update_item also requires Key=<primary key of the item to update>;
    # key attributes must then be left out of the SET expression.
    ExpressionAttributeNames={
        "#begintime": "begintime",
        "#description": "description",
        "#endtime": "endtime",
        "#name": "name",
        "#type": "type",
        "#weekdays": "weekdays"
    },
    ExpressionAttributeValues={
        ":begintime": '09:00',
        ":description": 'Hello there',
        ":endtime": '14:00',
        ":name": 'james',
        ":type": "period",
        ":weekdays": set(['mon', 'wed', 'fri'])
    },
    UpdateExpression="""
        SET #begintime = :begintime,
        #description = :description,
        #endtime = :endtime,
        #name = :name,
        #type = :type,
        #weekdays = :weekdays
    """
)
(Hint: Usage of AttributeUpdates (related Items equivalent for put_item calls) is deprecated, therefore I recommend using ExpressionAttributeNames, ExpressionAttributeValues and UpdateExpression).
I'm trying to implement a cursor-based pagination using DynamoDB (definitely not easy to do pagination in DynamoDB...) using a query request using ExclusiveStartKey.
My table index is made of an "id", while I have a GSI on "owner" (partition key) and "created_at" (range key).
I can easily retrieve the 10 first records using a query request, by specifying the GSI index and using the "owner" property.
However, on next requests, the ExclusiveStartKey only works if I specify the THREE elements from both indices (so "id", "owner" AND "created_at").
I understand the need for "id" and "owner", as those are both partition keys and are needed to "locate" the record, but I don't see why DynamoDB requires me to specify "created_at". This is annoying because it means that the consumer must submit not only the "id" as cursor, but also the "created_at".
As DynamoDB could find the record using the "id" (which is guaranteed to be unique), why do I need to specify this created_at?
Thanks
GSI primary keys are not necessarily unique. Base table keys are necessary to answer the question, "For a given owner and creation date, up to which id did I read in this page?". Put another way, you could have multiple items with the same owner and creation date.
In my testing, querying a GSI on a table resulted in a LastEvaluatedKey containing all the key attributes (essentially GSI key + table key). I needed to pass all elements of the LastEvaluatedKey as the ExclusiveStartKey of the next request to get the next page. If I left out any element of the LastEvaluatedKey in the next request, I received an exclusive start key error.
The following request:
aws dynamodb query --table-name MyTable --index-name MyIndex --key-condition-expression "R = :type" --expression-attribute-values '{\":type\":{\"S\":\"Blah\"}}' --exclusive-start-key '{\"I\":{\"S\":\"9999\"},\"R\":{\"S\":\"Blah\"},\"S\":{\"S\":\"Bluh_999\"},\"P\":{\"S\":\"Blah_9999~Sth\"}}' --limit 1
Resulted in the following response:
{
"Items": [
{
"I": {
"S": "9999"
},
"R": {
"S": "Blah"
},
"S": {
"S": "Bluh_999"
},
"P": {
"S": "Blah_9999~Sth"
}
}
],
"Count": 1,
"ScannedCount": 1,
"LastEvaluatedKey": {
"I": {
"S": "9999"
},
"R": {
"S": "Blah"
},
"S": {
"S": "Bluh_999"
},
"P": {
"S": "Blah_9999~Sth"
}
}
}
If I left off some elements of the last evaluated key, for example (same request as above minus the table partition/sort keys):
aws dynamodb query --table-name MyTable --index-name MyIndex --key-condition-expression "R = :type" --expression-attribute-values '{\":type\":{\"S\":\"Blah\"}}' --exclusive-start-key '{\"I\":{\"S\":\"9999\"},\"R\":{\"S\":\"Blah\"}}' --limit 1
I get the following error:
An error occurred (ValidationException) when calling the Query operation: The provided starting key is invalid
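One way to keep the consumer from handling several key fields is to wrap the whole LastEvaluatedKey in a single opaque cursor. The sketch below is an assumption about how that could be done, not something DynamoDB provides: `encode_cursor` and `decode_cursor` are hypothetical helpers, the sample key values are made up, and the key is assumed to contain only JSON-serializable values (with the boto3 resource API, numeric key values arrive as Decimal and would need converting first).

```python
import base64
import json


def encode_cursor(last_evaluated_key):
    """Pack a LastEvaluatedKey dict into one URL-safe string."""
    raw = json.dumps(last_evaluated_key, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor):
    """Turn the cursor back into a dict usable as ExclusiveStartKey."""
    raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
    return json.loads(raw)


# Made-up key with the table key ("id") and the GSI keys ("owner", "created_at").
key = {"id": "42", "owner": "alice", "created_at": "2020-01-01T00:00:00Z"}
cursor = encode_cursor(key)
print(decode_cursor(cursor) == key)  # True
```

The API would return `cursor` with each page and pass `decode_cursor(cursor)` as ExclusiveStartKey on the next query, so the client still submits a single value.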
I'm trying to get only the person's membership info i.e. ID, name and committee memberships in a SELECT query. This is my object:
{
"id": 123,
"name": "John Smith",
"memberships": [
{
"id": 789,
"name": "U.S. Congress",
"yearElected": 2012,
"state": "California",
"committees": [
{
"id": 444,
"name": "Appropriations Comittee",
"position": "Member"
},
{
"id": 555,
"name": "Armed Services Comittee",
"position": "Chairman"
},
{
"id": 678,
"name": "Veterans' Affairs Comittee",
"position": "Member"
}
]
}
]
}
In this example, John Smith is a member of the U.S. Congress and three committees in it.
The result that I'm trying to get should look like this. Again, this is the "DESIRED RESULT":
{
"id": 789,
"name": "U.S. Congress",
"committees": [
{
"id": 444,
"name": "Appropriations Committee",
"position": "Member"
},
{
"id": 555,
"name": "Armed Services Committee",
"position": "Chairman"
},
{
"id": 678,
"name": "Veterans' Affairs Committee",
"position": "Member"
}
]
}
Here's my SQL query:
SELECT m.id, m.name,
[
{
"id": c.id,
"name": c.name,
"position": c.position
}
] AS committees
FROM a
JOIN m IN a.memberships
JOIN c IN m.committees
WHERE a.id = "123"
The data in the results is correct, but the shape is not: I get the same membership three times, each with a single committee. Here's what I'm getting, which is NOT the desired result:
[
{
"id": 789,
"name": "U.S. Congress",
"committees":[
{
"id": 444,
"name": "Appropriations Committee",
"position": "Member"
}
]
},
{
"id": 789,
"name": "U.S. Congress",
"committees":[
{
"id": 555,
"name": "Armed Services Committee",
"position": "Chairman"
}
]
},
{
"id": 789,
"name": "U.S. Congress",
"committees":[
{
"id": 678,
"name": "Veterans' Affairs Committee",
"position": "Member"
}
]
}
]
As you can see here, the "U.S. Congress" membership is repeated 3 times.
The following SQL query gets me exactly what I want in Azure Query Explorer but when I pass it as the query in my code -- using DocumentDb SDK -- I don't get any of the details for the committees. I simply get blank results for committee ID, name and position. I do, however, get the membership data i.e. "U.S. Congress", etc. Here's that SQL query:
SELECT m.id, m.name, m.committees AS committees
FROM c
JOIN m IN c.memberships
WHERE c.id = 123
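A rough in-memory model of the two query shapes may help explain the difference. It is only an illustration in Python, not how the query engine is implemented, and both function names are hypothetical: joining on the committees produces one row per (membership, committee) pair, while projecting m.committees keeps one row per membership.

```python
def join_committees(doc):
    """Model of JOIN m IN a.memberships JOIN c IN m.committees."""
    return [
        {"id": m["id"], "name": m["name"], "committees": [c]}
        for m in doc["memberships"]
        for c in m["committees"]  # the second JOIN multiplies the rows
    ]


def project_memberships(doc):
    """Model of JOIN m IN c.memberships with m.committees in the SELECT."""
    return [
        {"id": m["id"], "name": m["name"], "committees": m["committees"]}
        for m in doc["memberships"]
    ]


doc = {
    "id": "123",
    "memberships": [{
        "id": 789,
        "name": "U.S. Congress",
        "committees": [
            {"id": 444, "name": "Appropriations Committee", "position": "Member"},
            {"id": 555, "name": "Armed Services Committee", "position": "Chairman"},
            {"id": 678, "name": "Veterans' Affairs Committee", "position": "Member"},
        ],
    }],
}

print(len(join_committees(doc)))      # 3
print(len(project_memberships(doc)))  # 1
```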
Below is the code that makes the DocumentDb call, with our internal comments left in to help clarify its purpose:
First the ReadQuery function that we call whenever we need to read something from DocumentDb:
public async Task<IEnumerable<T>> ReadQuery<T>(string collectionId, string sql, Dictionary<string, object> parameterNameValueCollection)
{
    // Prepare collection self link
    var collectionLink = UriFactory.CreateDocumentCollectionUri(_dbName, collectionId);
    // Prepare query
    var query = getQuery(sql, parameterNameValueCollection);
    // Creates the query and returns IQueryable object that will be executed by the calling function
    var result = _client.CreateDocumentQuery<T>(collectionLink, query, null);
    return await result.QueryAsync();
}
The following function prepares the query -- with any parameters:
protected SqlQuerySpec getQuery(string sql, Dictionary<string, object> parameterNameValueCollection)
{
    // Declare query object
    SqlQuerySpec query = new SqlQuerySpec();
    // Set query text
    query.QueryText = sql;
    // Convert parameters received in a collection to DocumentDb parameters
    if (parameterNameValueCollection != null && parameterNameValueCollection.Count > 0)
    {
        // Go through each item in the parameters collection and process it
        foreach (var item in parameterNameValueCollection)
        {
            query.Parameters.Add(new SqlParameter($"@{item.Key}", item.Value));
        }
    }
    return query;
}
This function makes async call to DocumentDb:
public async static Task<IEnumerable<T>> QueryAsync<T>(this IQueryable<T> query)
{
    var docQuery = query.AsDocumentQuery();
    // Batches give us the ability to read data in chunks in an async fashion.
    // If we use the ToList<T>() LINQ method to read ALL the data, the call will be synchronous, which is why we prefer the batches approach.
    var batches = new List<IEnumerable<T>>();
    do
    {
        // Actual call is made to the backend DocumentDb database
        var batch = await docQuery.ExecuteNextAsync<T>();
        batches.Add(batch);
    }
    while (docQuery.HasMoreResults);
    // Because batches are collections of collections, we use the following line to merge all into a single collection.
    var docs = batches.SelectMany(b => b);
    // Return data
    return docs;
}
I wrote a small demo to test your query and I get the expected result, so I think the query is correct. You mentioned that you don't get any data when you make the call from your code; would you mind sharing that code? Perhaps there is a mistake in it. Anyway, here is my test for reference; I hope it helps.
Query used:
SELECT m.id AS membershipId, m.name AS membershipName, m.committees AS committees
FROM c
JOIN m IN c.memberships
WHERE c.id = "123"
The code is very simple; sp_db.InnerText refers to a span that I used to show the result in my test page:
var docs = client.CreateDocumentQuery("dbs/" + databaseId + "/colls/" + collectionId,
    "SELECT m.id AS membershipId, m.name AS membershipName, m.committees AS committees " +
    "FROM c " +
    "JOIN m IN c.memberships " +
    "WHERE c.id = \"123\"");
foreach (var doc in docs)
{
    sp_db.InnerText += doc;
}
I think there may be a typo in the query you pass to client.CreateDocumentQuery() that makes it return nothing. It would be better to share that code so we can help check it.
Updates:
Just tried your code and I still get the expected result. One thing I found is that when I write the where clause as "where c.id = \"123\"", the query returns the result.
However, if you don't quote the value and just use "where c.id = 123", you get nothing back. I think this could be the reason. Please check whether you have run into this scenario.
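One way to avoid the quoting question altogether is to pass the id as a parameter with an explicit string value. The sketch below builds a parameterized query spec as a plain dict with "query" and "parameters" keys; `build_query_spec` is a hypothetical helper that mirrors what the getQuery function in the question does, and it performs no database call.

```python
def build_query_spec(sql, params=None):
    """Build a parameterized query spec; parameter names get an '@' prefix."""
    return {
        "query": sql,
        "parameters": [
            {"name": f"@{name}", "value": value}
            for name, value in (params or {}).items()
        ],
    }


spec = build_query_spec(
    "SELECT m.id, m.name, m.committees FROM c "
    "JOIN m IN c.memberships WHERE c.id = @id",
    {"id": "123"},  # a string, matching the quoted "123" in the working query
)

print(spec["parameters"])  # [{'name': '@id', 'value': '123'}]
```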
Just updated my original post. All the code provided in the question is correct and works. I was having a problem because I was using aliases in the SELECT query and as a result some properties were not binding to my domain object.
The code provided in the question is correct.