Stream Analytics to CosmosDB always fails

The errors I'm getting don't match up with what I'm sending in my query.
My query:
SELECT
udf.CreateGuid('') AS [id],
udf.CreateGuid('') AS [DocumentId],
BlobName,
BlobLastModifiedUtcTime,
[telmetry].[event_type] as PartitionKey,
-- webhook
[telmetry].[id] AS [hook_id],
[telmetry].[event_version],
[telmetry].[create_time],
[telmetry].[resource_type],
[telmetry].[event_type],
[telmetry].[summary],
[telmetry].[resource],
[telmetry].[links]
INTO
[cosmosdb2]
FROM
[telemetrydepot] AS [telmetry]
TIMESTAMP BY [telmetry].[create_time]
Here's the export config:
I've tried setting the Document id property to DocumentId or id with no success. I've even added extra ID, DocumentId and PartitionKey fields to the results just to get something to save, and tried separate runs putting either id or DocumentId in the Cosmos DB Document id property. I can't get anything to save...
The errors I'm getting back say:
An error occurred while preparing data for DocumentDB. The output record does not contain the column DocumentId to use as the partition key property by DocumentDB

DocumentDB is complaining that you've configured the collection's partition key as DocumentId, but no such column was in your output. I've found that when I alias columns in ASA, the column names in the output end up lowercase.
ASA doesn't care about case, but DocumentDB does. Try creating a new collection with the partition key set to documentid. You can review the current key under Settings in the portal for DocumentDB.
Note that Document id in the ASA output properties controls what goes into the id field. It can be different from the field you partition by in DocumentDB. For example, in one of my jobs I want to organize the database by deviceID but identify documents by messageType. Because I have to alias deviceID, it loses its uppercase letters, so I have to set the partition key to deviceid. Then I set my Document id to messageType:
I get documents that look like this:
{
  "deviceid": "MyDeviceIdentifier",
  ...,
  "id": "MyMessageType"
}
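If you create the collection programmatically rather than in the portal, a minimal sketch with the Cosmos DB .NET SDK v3 (the database and container names here are made up) would look like this; the important part is that the partition key path uses the lowercased column name that ASA emits (deviceid in the example above):

using Microsoft.Azure.Cosmos;

var client = new CosmosClient("<connection-string>");
Database database = await client.CreateDatabaseIfNotExistsAsync("telemetry");

// Partition key path must match the lowercased column name coming out of ASA.
await database.CreateContainerIfNotExistsAsync(
    new ContainerProperties(id: "devicedata", partitionKeyPath: "/deviceid"));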

Related

How to delete a map from a list in an item of a DynamoDB table using Node.js?

I am developing an app and I am using AWS DynamoDB as the database, accessed from Node.js. I have a table called chat to store messages. In the chat table, every item has connectionid as the partition key. An item looks like this:
{"id": "123456", "messages": [{"id": "1214545","message":"ksadfksfmsfsdf","sender":"112415",
"timestamp": "27:1:2022:11:53"}]}
This is a single item, and there are more inside the chat table.
I have been trying to delete a message inside an item. It's easy for me to delete a whole item because connectionid is the partition key, but I want to delete a single message with respect to its id, which is nested. How do I achieve this?
But I want to delete with respect to the id, which is nested. How to achieve this?
No, you cannot. It's not possible to have a nested primary key or a secondary index on a nested attribute.
I would recommend remodeling your table schema; you could use a composite key, for example:
keep the messages array as it is, just take the message id out of the nesting.
you can have "id": "1214545" (the message id) as the partition key and "id": "123456" (the connection id) as the sort key.
the combination of both lets you find the unique record and delete it (see the sketch after the link below).
if you want to delete all messages (multiple messages), you can find them with a Query operation on "id": "1214545", which is your partition key, and delete them.
Or a very simple solution would be to have the message id as your single primary key, if it is going to be unique across all records.
Note: with a composite key, the partition key can have the same value across items, but the partition key + sort key combination must be unique.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html#HowItWorks.CoreComponents.PrimaryKey
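Here is a rough sketch of the delete after remodeling (shown in C# with the AWS SDK; the same shape works with the Node.js SDK's DeleteItem). Each message becomes its own item keyed by the message id plus the connection id, so a single DeleteItem call removes one message. The attribute names messageId and connectionid are assumptions; use whatever names you pick in the remodeled table.

using System.Collections.Generic;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

var client = new AmazonDynamoDBClient();

// Delete a single message from the remodeled "chat" table by its composite key.
await client.DeleteItemAsync(new DeleteItemRequest
{
    TableName = "chat",
    Key = new Dictionary<string, AttributeValue>
    {
        ["messageId"] = new AttributeValue { S = "1214545" },     // partition key
        ["connectionid"] = new AttributeValue { S = "123456" }    // sort key
    }
});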

How to put value as DateTime in Azure Table from Logic App

I have a Logic App that should store data in Azure Table. Everything worked fine until I realized that one of my properties that should be stored as DateTime is stored as String.
The problem is that some other application is doing queries periodically on data in the table and it expects to find DateTimes there:
var query = new TableQuery<UserEntity>().Where(
    TableQuery.CombineFilters(
        TableQuery.GenerateFilterConditionForDate(
            nameof(UserEntity.AccessEndTime),
            QueryComparisons.GreaterThanOrEqual,
            DateTime.SpecifyKind(queriedDate, DateTimeKind.Utc)),
        TableOperators.And,
        TableQuery.GenerateFilterConditionForDate(
            nameof(UserEntity.AccessEndTime),
            QueryComparisons.LessThan,
            DateTime.SpecifyKind(queriedDate.AddDays(1), DateTimeKind.Utc))));
Basically, my C# app is looking for users who have their AccessEndTime property value set to some specific day.
Unfortunately, since the Logic App writes the value as a string, my query does not return any data.
Here's a part of my Logic App:
First I create an object with the proper data as JSON, and then I use the Insert or Replace Entity block, which uses the Body of that JSON as the entity to be put in the table. As you can see, AccessEndTime has type: string. I tried using type: datetime, but it just fails with an error (there is no such type).
I guess I could handle it on the client side, but then my UserEntity would have to have AccessEndTime as a string, and that just doesn't feel right.
What am I missing?
//EDIT
I also found this. I tried to put my data like this:
So I explicitly added the type of my property. Unfortunately, the result is still the same.
Check out this SO question's response around the same: Cannot query a DateTime column in Table Storage
It looks like you could have used formatDateTime() as per documentation, but this will not work as described below:
According to some tests, the value is still of type "String", not "DateTime". The documentation shows that the formatDateTime() method returns its value as a string.
So when we insert the value produced by formatDateTime(), a string is inserted into the storage table. There seems to be a display bug in the Azure portal: it shows the type as "DateTime". But if we open the table in Azure Storage Explorer rather than in the Azure portal, we can see that the TimeOfCreation of a newly inserted record is of type "String".
For this requirement, it is difficult to get a "DateTime"-typed value out of a Logic App and insert it into Table Storage; we can only insert a string. We can, however, edit the type after inserting the new record, either in the Azure portal or in Azure Storage Explorer. In the portal, just click "Edit" on the record and click "Update" without changing anything (the type already shows as "DateTime"). In Azure Storage Explorer, change the type from "String" to "DateTime" and click "Update". After that, querying the records by "TimeOfCreation" >= Last 365 days succeeds.
The bad part is that this has to be done manually for each inserted record; we can't solve the problem inside the Logic App or batch-update the type in the portal or in Storage Explorer. If you want to batch-update the type programmatically, you can query all of the newly inserted records with the Table service Query Entities operation (using $filter on the timestamp), get each record's PartitionKey and RowKey, loop over them, and use the Update Entity operation to change the type of the TimeOfCreation column.
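If you do want to fix existing rows in code instead of by hand, here is a hypothetical one-off fixer using the same table SDK as the query in the question (Microsoft.Azure.Cosmos.Table); the table name, connection string and property name are assumptions:

using System;
using Microsoft.Azure.Cosmos.Table;

var account = CloudStorageAccount.Parse("<connection-string>");
CloudTable table = account.CreateCloudTableClient().GetTableReference("Users");

// Re-type every string-valued AccessEndTime as a real Edm.DateTime.
foreach (DynamicTableEntity entity in table.ExecuteQuery(new TableQuery<DynamicTableEntity>()))
{
    if (entity.Properties.TryGetValue("AccessEndTime", out EntityProperty prop) &&
        prop.PropertyType == EdmType.String)
    {
        // The entity keeps its PartitionKey, RowKey and ETag, so Replace targets the same record.
        entity.Properties["AccessEndTime"] =
            new EntityProperty(DateTime.Parse(prop.StringValue).ToUniversalTime());
        table.Execute(TableOperation.Replace(entity));
    }
}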

Azure Cosmos DB - Can I use a JSON field that doesn't exist for all documents as my partition key?

I am trying to set up a new Cosmos DB and it's asking me to set a partition key. I think I understand the concept: I should select a JSON field that can group my documents efficiently.
Is it possible to configure the collection to use a JSON field that may not exist in every incoming document?
For example:
{
"name" : "Robin",
"DOB" : "01/01/1969",
"scans" : {
"bloodType" : "O"
}
}
{
"name" : "Bill",
"DOB" : "01/01/1969"
}
Can I use /scans.bloodType as the partition key? For documents that don't have a scans JSON field, I still want that data, as I can update the document later.
You can, indeed, specify a partition key that might not exist in every document. When you save a document that's missing the property specified by your partition key, it will result in assigning an "undefined" value for its partition key.
In the future, if you wanted to provide a value for the partition key of such a document, you'd have to delete, and then re-add, the document. You cannot modify a property's value when it happens to be the partition key within that document's container (nor can you add the property to an existing document that doesn't explicitly have that property defined, since it's already been assigned the "undefined" value).
See this answer for more details on undefined partition key properties.
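For completeness, a minimal sketch of how such items are addressed with the .NET SDK v3 (PartitionKey.None targets items whose partition key value is "undefined"); the container, item id and classes here are hypothetical:

using Microsoft.Azure.Cosmos;
using Newtonsoft.Json;

var client = new CosmosClient("<connection-string>");
// Hypothetical container partitioned on /scans/bloodType.
Container container = client.GetContainer("hospital", "patients");

// Items saved without /scans/bloodType get the "undefined" partition key value.
Patient bill = await container.ReadItemAsync<Patient>("bill-id", PartitionKey.None);

// To give the item a partition key value later, delete it and re-create it.
await container.DeleteItemAsync<Patient>("bill-id", PartitionKey.None);
bill.Scans = new Scans { BloodType = "O" };
await container.CreateItemAsync(bill, new PartitionKey("O"));

public class Patient
{
    [JsonProperty("id")] public string Id { get; set; }
    [JsonProperty("name")] public string Name { get; set; }
    [JsonProperty("scans")] public Scans Scans { get; set; }
}

public class Scans
{
    [JsonProperty("bloodType")] public string BloodType { get; set; }
}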
Unfortunately you can't do that.
As per the official docs the partition key path (/scans.bloodType) or the partition key value cannot be changed.
Be a property that has a value which does not change. If a property is your partition key, you can't update that property's value.
In terms of solutions, you could either try to find another partition key property path and ensure there's a value at the time of creation, or maybe use a secondary collection to store your incomplete documents and use the change feed to "move" them to the final collection once all the data becomes available.
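A rough sketch of that change feed idea with the .NET SDK v3, reusing the hypothetical Patient class from the sketch above; the container names, the lease container and the assumption that the staging container is partitioned on /id are all made up for illustration:

using System.Collections.Generic;
using System.Threading;
using Microsoft.Azure.Cosmos;

var client = new CosmosClient("<connection-string>");
Database database = client.GetDatabase("hospital");
Container staging = database.GetContainer("staging");   // assumed to be partitioned on /id
Container final = database.GetContainer("patients");    // partitioned on /scans/bloodType
Container leases = database.GetContainer("leases");

ChangeFeedProcessor processor = staging
    .GetChangeFeedProcessorBuilder<Patient>("moveCompletedDocs",
        async (IReadOnlyCollection<Patient> changes, CancellationToken token) =>
        {
            foreach (var doc in changes)
            {
                // Only move documents once the partition key value is present.
                if (doc.Scans?.BloodType != null)
                {
                    await final.UpsertItemAsync(doc, new PartitionKey(doc.Scans.BloodType), cancellationToken: token);
                    await staging.DeleteItemAsync<Patient>(doc.Id, new PartitionKey(doc.Id), cancellationToken: token);
                }
            }
        })
    .WithInstanceName("worker-1")
    .WithLeaseContainer(leases)
    .Build();

await processor.StartAsync();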
Something like blood type wouldn't be a good partition key, assuming it refers to A/B/O types, etc, if there are only a small number of possible values. What you can generally always fall back on is the unique id property of the item, which Cosmos creates automatically. It's fine to have one item per partition, if nothing else is a better candidate. Per docs:
If your container has a property that has a wide range of possible values, it is likely a great partition key choice. One possible example of such a property is the item ID. For small read-heavy containers or write-heavy containers of any size, the item ID is naturally a great choice for the partition key.

Find the history of Deleted Data in QLDB

I created a Vehicle table in the ledger, added some vehicles in QLDB, and then deleted the vehicle data. Now I am not able to fetch the metadata id, because the user table and the committed table only hold the latest non-deleted version of the application data, so I cannot fetch the history of that deleted data through the metadata. Please help me with a PartiQL query to fetch the history, if there is a way.
Note: I don't have a vehicle registration table that stores the metadataId of the vehicles.
The way you are doing it is correct. First, you filter on history by some known attribute (in this case, a user defined primary key such as 'VIN') and you retrieve the document id. After that, you can filter history using that document id.
The second query should return the same as the first but it will also contain the deletion information (the first query will not include it because the deletion removes the attribute).
Note that the document id is returned as part of the DELETE PartiQL statement.
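As a rough sketch of those two steps with the .NET QLDB driver (the ledger name, table name and VIN value here are placeholders; adjust to your data):

using Amazon.QLDB.Driver;

var driver = QldbDriver.Builder().WithLedger("vehicles").Build();

driver.Execute(txn =>
{
    // Step 1: find the document id from history using a known attribute (here, the VIN).
    // history() returns all revisions, including ones for deleted documents.
    var ids = txn.Execute(
        "SELECT h.metadata.id FROM history(Vehicle) AS h WHERE h.data.VIN = '<your-vin>'");

    // Step 2: fetch every revision of that document, including the final deletion.
    var revisions = txn.Execute(
        "SELECT * FROM history(Vehicle) AS h WHERE h.metadata.id = '<document-id-from-step-1>'");
});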

Override id creation in Azure cosmos DB

I have a JSON document that I need to push to Azure Cosmos DB. I am using the DocumentDB Data Migration Tool to load my JSON file into Cosmos DB. The problem is that my JSON contains an id field which is the same for all the records, e.g.:
"angularAcceleration_T12_x:-0.137993,"id":"5946001","jointAngle_jRightHip_z"
Because of this id field, I am not able to insert more than one record into my collection; after inserting the first record I get the error message "resource with specified id already exist".
Is there a way to force Cosmos DB to ignore the id field in the JSON and create a GUID for each entry, like it does when it does not find an id field?
Thanks,
Yaju
After a lot of searching, I came to the conclusion that if you have a field named "id" (case sensitive) in your JSON, Cosmos DB will use that field's value as the id of the document; you cannot use any other field as the unique identifier.
If Cosmos DB does not find an id field, it will generate a GUID as the id for the document.
Microsoft.Azure.Documents.Resource.Id documentation hints that it is serialised in JSON with fixed name "id":
[Newtonsoft.Json.JsonProperty(PropertyName="id")]
public virtual string Id { get; set; }
So, I would say that you cannot use another property as storage PK.
What you CAN do is convert your JSON document yourself, for example by renaming your current id to originalId or similar. Then DocumentDB will generate unique ids on import for you.
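A small pre-processing sketch with Newtonsoft.Json (the file names are made up, and it assumes the input is a JSON array of records) that renames id to originalId before running the migration tool:

using System.IO;
using Newtonsoft.Json.Linq;

var records = JArray.Parse(File.ReadAllText("input.json"));

foreach (JObject record in records)
{
    // Move the conflicting value out of the way so Cosmos DB assigns a fresh GUID id.
    if (record.TryGetValue("id", out JToken value))
    {
        record.Remove("id");
        record["originalId"] = value;
    }
}

File.WriteAllText("fixed.json", records.ToString());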
