I have a JSON document which I need to push to Azure Cosmos DB. I am using the DocumentDB Data Migration Tool to dump my JSON file to Azure Cosmos DB. The problem is my JSON contains an id field which is the same for all the records. For example:
"angularAcceleration_T12_x:-0.137993,"id":"5946001","jointAngle_jRightHip_z"
Because of this id field, I am not able to insert more than one record into my collection: after inserting one record I get the error message "resource with specified id already exist".
Is there a way to force Cosmos DB to ignore the id field in the JSON and create a GUID for each entry, like it does when it does not find an id field?
Thanks,
Yaju
After a lot of searching I came to the conclusion that if you have a column named "id" (case sensitive) in your JSON, Cosmos DB will make that column's value the id for the document; you cannot use any other column as the unique identifier.
If Cosmos DB does not find an id field, it will create a GUID as the id for the document.
The Microsoft.Azure.Documents.Resource.Id documentation hints that it is serialised in JSON with the fixed name "id":
[Newtonsoft.Json.JsonProperty(PropertyName="id")]
public virtual string Id { get; set; }
So, I would say that you cannot use another property as the storage PK.
What you CAN do is convert your JSON document yourself, for example renaming your current id to originalId or similar. DocumentDB will then generate unique ids for you on import, as sketched below.
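For example, a minimal sketch of that conversion using Newtonsoft.Json; the file names are placeholders and the input is assumed to be a JSON array of records:

using System.IO;
using Newtonsoft.Json.Linq;

class IdRenamer
{
    static void Main()
    {
        // Load the export; assumed to be an array of documents.
        var records = JArray.Parse(File.ReadAllText("input.json"));
        foreach (JObject record in records)
        {
            JToken oldId;
            if (record.TryGetValue("id", out oldId))
            {
                record.Remove("id");          // drop the conflicting key
                record["originalId"] = oldId; // keep the value under a new name
            }
        }
        // With no "id" property left, DocumentDB assigns a GUID on import.
        File.WriteAllText("output.json", records.ToString());
    }
}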
I'm trying to add entries into my Cosmos DB using Azure Data Factory. However, I am not able to choose the right collection, as Azure Data Factory can only see the top level of the database.
Is there any funny syntax for choosing which collection to pick from the Cosmos DB SQL API? I've tried entities[0] and entities['tasks'], but neither of them seems to work.
The new entries are inserted as we see in the red box; how do I get the entries into the entries collection?
If the requirement you mentioned in the comments is what you need, then it is possible. For example, to put JSON data into an existing 'tasks' item, you only need to use the upsert method, and the source JSON data must have the same id as the 'tasks' item.
This is the official doc:
https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db#azure-cosmos-db-sql-api-as-sink
The random letters and numbers in your red box appear because you did not specify the document id.
Have a look at this:
By the way, if the tasks item has a partition key, then you also need to specify it.
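For illustration only (not ADF itself), here is the same upsert semantics with the .NET SDK (Microsoft.Azure.Cosmos); the account endpoint, key, database/container names, and partition key value are all placeholders:

using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

class UpsertSketch
{
    static async Task Main()
    {
        var client = new CosmosClient("https://<account>.documents.azure.com:443/", "<key>");
        var container = client.GetContainer("<database>", "entities");

        // Because this document's id matches the existing 'tasks' item,
        // upsert replaces that item instead of creating a new one.
        var tasksItem = new { id = "tasks", entries = new[] { "task 1", "task 2" } };
        await container.UpsertItemAsync(tasksItem, new PartitionKey("tasks"));
    }
}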
I have a Logic App that should store data in an Azure Table. Everything worked fine until I realized that one of my properties, which should be stored as a DateTime, is stored as a String.
The problem is that some other application is doing queries periodically on data in the table and it expects to find DateTimes there:
var query = new TableQuery<UserEntity>().Where(
TableQuery.CombineFilters(
TableQuery.GenerateFilterConditionForDate(
nameof(UserEntity.AccessEndTime),
QueryComparisons.GreaterThanOrEqual,
DateTime.SpecifyKind(queriedDate, DateTimeKind.Utc)),
TableOperators.And,
TableQuery.GenerateFilterConditionForDate(
nameof(UserEntity.AccessEndTime),
QueryComparisons.LessThan,
DateTime.SpecifyKind(queriedDate.AddDays(1), DateTimeKind.Utc))));
Basically, my C# app is looking for users who have their AccessEndTime property value set to some specific day.
Unfortunately, since the Logic App writes the value as a string, my query does not return any data.
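For completeness, here is my UserEntity, simplified to the property that matters here:

using System;
using Microsoft.WindowsAzure.Storage.Table;

public class UserEntity : TableEntity
{
    // Stored as Edm.DateTime when written through the .NET SDK.
    public DateTime AccessEndTime { get; set; }
}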
Here's a part of my Logic App:
First I create an object with the proper data as JSON, and then I use the Insert or Replace Entity block, which uses the Body of that JSON as the entity to be put in the table. As you can see, AccessEndTime has type: string. I tried using type: datetime, but it just fails with an error (no such type).
I guess I could handle it on the client side, but then my UserEntity would have to have AccessEndTime as a String, and that just doesn't feel right.
What am I missing?
//EDIT
I also found this. I tried to put my data like this:
So I explicitly added the type of my property. Unfortunately, the result is still the same.
Check out this SO question's response around the same: Cannot query a DateTime column in Table Storage
It looks like you could have used formatDateTime() as per the documentation, but this will not work, as described below:
According to some testing, the value is still of type "String", not "DateTime". This document shows that the formatDateTime() method returns its value as a string.
So when we insert the value from formatDateTime(), it inserts a string into the storage table. There seems to be a bug in the Azure portal's display: it shows the type as "DateTime". But if we open the table in "Azure Storage Explorer" rather than on the Azure portal, we can see that the TimeOfCreation of the newly inserted record is of type "String".
For this requirement, it's difficult to get a "DateTime" value in a Logic App and insert it into table storage; we can only insert a string. But we can edit the type after inserting the new record into table storage, either on the Azure portal or in "Azure Storage Explorer". On the Azure portal, just click "Edit" on the record and then click the "Update" button without changing anything (because the type already shows as "DateTime"). In "Azure Storage Explorer", change the type from "String" to "DateTime" and click "Update". After that, querying the records by "TimeOfCreation" >= Last 365 days succeeds.
The bad thing here is that we can only do this manually, record by record; we can't solve the problem inside the Logic App or batch-update the type on the portal or in the explorer. If you want to batch-update the type yourself, you can query all of the newly inserted records via the REST API (using $filter on the timestamp), get each record's PartitionKey and RowKey, and loop over them, calling the REST API to update the type of the column. A sketch of the same idea with the .NET SDK follows.
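A hedged sketch of that batch fix using the classic .NET Table SDK (WindowsAzure.Storage) instead of raw REST calls; the connection string and table name are placeholders, and the property name follows the question:

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

class AccessEndTimeTypeFixer
{
    static void Main()
    {
        var account = CloudStorageAccount.Parse("<connection-string>");
        var table = account.CreateCloudTableClient().GetTableReference("Users");

        // Scan the table; for a large table, narrow this with a filter.
        foreach (var entity in table.ExecuteQuery(new TableQuery<DynamicTableEntity>()))
        {
            EntityProperty prop;
            if (entity.Properties.TryGetValue("AccessEndTime", out prop)
                && prop.PropertyType == EdmType.String)
            {
                // Re-write the string value as a real Edm.DateTime property.
                var parsed = DateTime.Parse(prop.StringValue).ToUniversalTime();
                entity.Properties["AccessEndTime"] = new EntityProperty(parsed);
                table.Execute(TableOperation.Merge(entity));
            }
        }
    }
}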
I am working on Azure and I have created a database with Cosmos DB SQL API. After creating a container I see that the primary key is always named "id". Is there any way to create a container with PK with name different than "id"?
Every document has a unique id property, named id. This cannot be altered. If you don't set a value, it is assigned a GUID.
When using methods such as ReadDocument() (or equivalent, based on the SDK) for direct reads instead of queries, a document's id property must be specified.
Now, as far as partitioning goes, you can choose any property you want to use as the partition key.
And if you have additional domain-specific identifiers (maybe a part number), you can always store those in their own properties as well. In this example though, just remember that, while you can query for documents containing a specific part number, you can only do a direct-read via id. If direct reads will be the majority of your read operations, then it's worth considering the use of id to store such a value.
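A small sketch of that difference using the current .NET SDK (Microsoft.Azure.Cosmos); the id, partition key value, and partNumber property are made-up examples:

using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

class ReadVsQuery
{
    static async Task Run(Container container)
    {
        // Direct (point) read: requires the document's id plus its partition key.
        var byId = await container.ReadItemAsync<dynamic>("5946001", new PartitionKey("widgets"));

        // Any other identifier, e.g. a part number, needs a query instead.
        var query = new QueryDefinition("SELECT * FROM c WHERE c.partNumber = @pn")
            .WithParameter("@pn", "PN-1234");
        var iterator = container.GetItemQueryIterator<dynamic>(query);
        while (iterator.HasMoreResults)
        {
            foreach (var doc in await iterator.ReadNextAsync())
            {
                // handle each matching document here
            }
        }
    }
}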
PK = primary key, by the way ;) It's the "main" key for the record, it is not private :)
And no, as far as I know you cannot change the primary key name. It is always "id". Specifically it must be in lowercase, "Id" is not accepted.
The errors I'm getting don't match up with what I'm sending in my query.
My query:
SELECT
udf.CreateGuid('') AS [id],
udf.CreateGuid('') AS [DocumentId],
BlobName,
BlobLastModifiedUtcTime,
[telmetry].[event_type] as PartitionKey,
-- webhook
[telmetry].[id] AS [hook_id],
[telmetry].[event_version],
[telmetry].[create_time],
[telmetry].[resource_type],
[telmetry].[event_type],
[telmetry].[summary],
[telmetry].[resource],
[telmetry].[links]
INTO
[cosmosdb2]
FROM
[telemetrydepot] AS [telmetry]
TIMESTAMP BY [telmetry].[create_time]
Here's the export config:
I've tried setting the Document id property to DocumentId or id with no success. I'm even throwing additional ID, DocumentId, and PartitionKey fields into the results just to get something to save, with no success (also trying individual runs putting id or DocumentId in the Cosmos DB Document id property). I can't get anything to save...
The errors I'm getting back say:
An error occurred while preparing data for DocumentDB. The output record does not contain the column DocumentId to use as the partition key property by DocumentDB
DocumentDB is complaining that you've configured the collection's partition key as DocumentId, but no such column was in your output. I've found that when I alias columns in ASA, the column names in the output end up lowercase...
ASA doesn't care about the case, but DocumentDB does. Try creating a new collection with the partition key set to documentid. You can review the current key under "Settings" in the portal for DocumentDB.
Note: the Document id setting in the ASA output properties controls what goes in the id field. It can be different from the field you partition by in DocumentDB. For example, in one of my jobs I want to organize the database by deviceID but identify documents based on messageType. Because I have to alias deviceID, it loses its upper-case letters and I have to set the partition key to deviceid. Then I set my Document id to messageType:
I get documents that look like this:
{
"deviceid": "MyDeviceIdentifier",
/.../,
"id": "MyMessageType"
}
I have a Person class that I save to a table in Azure Table Storage.
I want to query it with one of the following queries:
var query = from getThis in _serviceContext.CreateQuery<PersonForSearch>(_tableName)
where getThis.Name.Contains(searchTerm)
select new Person
{
PartitionKey = getThis.PartitionKey,
RowKey = getThis.RowKey,
Name = getThis.Name
};
OR
CloudTableQuery<Person> query =
(from getThis in _serviceContext.CreateQuery<Person>(_tableName)
where getThis.Name.Contains(searchTerm)
select getThis).AsTableServiceQuery<Person>();
With either one, I get the following error, thrown on the foreach loop I use to loop through the results of the query:
NotImplemented
The requested operation is not implemented on the specified resource.
I thought that perhaps this resulted from the fact that my Person model does not inherit from TableServiceEntity (I refuse to introduce that coupling), so I decorated it with this attribute instead: [DataServiceKey("PartitionKey", "RowKey")], and manually gave it PartitionKey and RowKey properties.
So I tried to create an entity that DID inherit from TableServiceEntity, which would allow me to query this table (as you can see from the queries, the only property I'm worried about is Name).
This new entity is as follows:
class PersonForSearch : TableServiceEntity
{
public string Name { get; set; }
}
However, this hasn't solved the problem. Is this error talking about some other resource than the class I'm using in my query?
There are two issues here:
1) Azure Table Storage does not support the Contains() method. This is the reason you're getting the NotImplemented exception. ATS does support string.Compare() for range-type operations on strings (see the sketch after this list).
2) In order to retrieve data efficiently, you can only search on PartitionKey or on a PartitionKey/RowKey combination. Any other query results in either errors (for unsupported expressions) or a full-table scan. If your table is small, download it into memory completely by dropping the 'where' clause and afterwards use LINQ to Objects to query however you like. If it is large, find a way to compare against the PartitionKey or PartitionKey/RowKey fields.
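A sketch of the "starts with" workaround in the question's own TableServiceContext style; Contains() is unsupported, but CompareTo() range filters do translate to the service. Note the filter is on Name, so this still scans the table unless Name is a key:

// The '\uFFFF' suffix gives an exclusive upper bound for the prefix range.
string prefix = searchTerm;
string upperBound = prefix + '\uFFFF';

var results = (from p in _serviceContext.CreateQuery<PersonForSearch>(_tableName)
               where p.Name.CompareTo(prefix) >= 0
                  && p.Name.CompareTo(upperBound) < 0
               select p).AsTableServiceQuery();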
If I'm understanding correctly, you're trying to do a partial string search through an employee table. Overall, ATS is not a very good solution for string-based searches (unless they are "starts with" searches on the PartitionKey or PartitionKey/RowKey fields). I'd highly recommend Lucene.NET for doing text-based searches in the cloud; there is an Azure Directory API for Lucene.NET available as well. Or switch to SQL Azure.
HTH