How to deduplicate Core Data entity records when using CloudKit?

Unique constraints are not available when using CloudKit, so storing some entities in NSPersistentCloudKitContainer can produce duplicate records after the data is synced across multiple devices.
Is there any existing best practice to deduplicate Core Data entity records stored on CloudKit?
I store the app's current user object in Core Data with the iCloud user record ID as the unique identifier; however, after the entry is synced to other devices, multiple users appear in the User table of the SQLite store.
func getLocalUser() -> User? {
    /// Try to fetch the only local user
    let idMixpanel = DCMixpanel.shared.distinctId
    let fetchRequest: NSFetchRequest<User> = User.fetchRequest()
    let userPredicate = NSPredicate(
        format: "%K = %@",
        // Assumed arguments: the original call was truncated after the format string
        #keyPath(User.idMixpanel), idMixpanel
    )
    fetchRequest.predicate = userPredicate
    let userList = try? container.viewContext.fetch(fetchRequest)
    if let user = userList?.first {
        if user.idCloudKit == nil {
            /// This method is async and the ID is not available immediately after function invocation
        }
        return user
    }
    /// Create a new local user if no user found with local ID
    let userLocalNew = User(context: container.viewContext)
    userLocalNew.idMixpanel = DCMixpanel.shared.distinctId
    return userLocalNew
}

My solution is to create a mergeLocalUser method:
private func mergeLocalUser() -> User? {
    /// Fetch all users
    /// Merge users if there are more than one user in the local database
    /// Custom user properties merging logic
    /// Delete all user instances and re-create a new user
    return newUserInstance
}
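A minimal sketch of the merge step outlined in the comments above, using a plain struct instead of the real User managed object so that it runs without a Core Data stack. The names UserSnapshot, modifiedAt and mergeDuplicates are hypothetical, and "the most recently modified record wins, with missing fields filled from older duplicates" is only one assumed merge policy, not something CloudKit prescribes.

```swift
import Foundation

/// Hypothetical value-type stand-in for the `User` managed object.
struct UserSnapshot: Equatable {
    var idCloudKit: String?
    var idMixpanel: String?
    var modifiedAt: Date
}

/// Collapses duplicate users into a single record.
/// Assumed policy: the most recently modified record wins, and any field
/// it is missing is filled from the next most recent duplicate that has it.
func mergeDuplicates(_ users: [UserSnapshot]) -> UserSnapshot? {
    // Newest first, so the first element is the winner.
    let sorted = users.sorted { $0.modifiedAt > $1.modifiedAt }
    guard var winner = sorted.first else { return nil }
    for duplicate in sorted.dropFirst() {
        if winner.idCloudKit == nil { winner.idCloudKit = duplicate.idCloudKit }
        if winner.idMixpanel == nil { winner.idMixpanel = duplicate.idMixpanel }
    }
    return winner
}

// Two copies of the same user, each holding one half of the identifiers.
let merged = mergeDuplicates([
    UserSnapshot(idCloudKit: nil, idMixpanel: "mp-1", modifiedAt: Date(timeIntervalSince1970: 200)),
    UserSnapshot(idCloudKit: "ck-1", idMixpanel: nil, modifiedAt: Date(timeIntervalSince1970: 100)),
])
print(merged?.idCloudKit ?? "nil", merged?.idMixpanel ?? "nil") // prints "ck-1 mp-1"
```

In the app itself the same policy would be applied to the fetched User objects, with the losing objects deleted from the context before saving. A deterministic rule matters here because every device may run the merge independently and should arrive at the same surviving record.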


Access index fields, batch fields and batch variables in a custom module

In my setup form I configure some settings for my custom module. The settings are stored in the custom storage of the batch class. Given the variable IBatchClass batchClass I can access the data by executing
string data = batchClass.get_CustomStorageString("myKey");
and set the data by executing
batchClass.set_CustomStorageString("myKey", "myValue");
When the custom module gets executed I want to access this data from the storage. The value I get returned is the key for the batchfield collection or indexfield collection or batch variables collection. When creating Kofax Export Connector scripts I would have access to the ReleaseSetupData object holding these collections.
Is it possible to access these fields during runtime?
private string GetFieldValue(string fieldName)
{
    string fieldValue = string.Empty;
    try {
        IIndexFields indexFields = null; // access them
        fieldValue = indexFields[fieldName].ToString();
    } catch (Exception) {
        try {
            IBatchFields batchFields = null; // access them
            fieldValue = batchFields[fieldName].ToString();
        } catch (Exception) {
            try {
                dynamic batchVariables = null; // access them
                fieldValue = batchVariables[fieldName].ToString();
            } catch (Exception) { }
        }
    }
    return fieldValue;
}
The format contains a string like:
"{#Charge}; {Current Date} {Current Time}; Scan Operator: {Scan Operator's User ID}; Page: x/y"
and each field wrapped by {...} represents a field from one of these 3 collections.
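The placeholder handling itself is independent of the Kofax API. The following sketch, written in Swift purely for illustration, pulls the {...} names out of such a format string and substitutes values from a lookup table. The function names fieldNames and render and the sample values are hypothetical; in the custom module the table would have to be filled from the index fields, batch fields and batch variables.

```swift
import Foundation

/// Returns the names wrapped in {...}, in order of appearance.
func fieldNames(in format: String) -> [String] {
    var names: [String] = []
    var current: String? = nil  // non-nil while inside a placeholder
    for character in format {
        switch character {
        case "{":
            current = ""
        case "}":
            if let name = current { names.append(name) }
            current = nil
        default:
            current?.append(character)
        }
    }
    return names
}

/// Replaces every {name} that has a known value; unknown names are left as-is.
func render(_ format: String, with values: [String: String]) -> String {
    var result = format
    for name in fieldNames(in: format) {
        if let value = values[name] {
            result = result.replacingOccurrences(of: "{\(name)}", with: value)
        }
    }
    return result
}

let format = "{#Charge}; {Current Date} {Current Time}; Scan Operator: {Scan Operator's User ID}; Page: x/y"
let names = fieldNames(in: format)
// Sample values are made up for the example.
let text = render(format, with: ["#Charge": "4711", "Scan Operator's User ID": "jdoe"])
print(names)
print(text)
```

Leaving unknown placeholders untouched makes it easy to spot names that could not be resolved from any of the three collections.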
Kofax exposes a batch as XML, and DBLite is basically a wrapper for that XML. The structure is explained in AcBatch.htm and AcDocs.htm (to be found under the CaptureSV directory). Here's the basic idea (just documents are shown):
For a standard server installation, the file would be located here: \\servername\CaptureSV\AcBatch.htm. A single document has child elements itself such as index fields, and multiple properties such as Confidence, FormTypeName, and PDFGenerationFileName.
Here's how to extract the elements from the active batch (your IBatch instance) as well as accessing all batch fields:
var runtime = activeBatch.ExtractRuntimeACDataElement(0);
var batch = runtime.FindChildElementByName("Batch");
foreach (IACDataElement item in batch.FindChildElementByName("BatchFields").FindChildElementsByName("BatchField"))
{
    // item["Name"] and item["Value"] hold the batch field's name and value
}
The same is true for index fields. However, as those reside on document level, you would need to drill down to the Documents element first, and then retrieve all Document children. The following example accesses all index fields as well, storing them in a dictionary named IndexFields:
var documents = batch.FindChildElementByName("Documents").FindChildElementsByName("Document");
foreach (IACDataElement document in documents)
{
    var indexFields = document.FindChildElementByName("IndexFields").FindChildElementsByName("IndexField");
    foreach (IACDataElement indexField in indexFields)
        IndexFields.Add(indexField["Name"], indexField["Value"]);
}
With regard to Batch Variables such as {Scan Operator's User ID}, I am not sure. Worst case scenario is to assign them as default values to index or batch fields.

How do I get a fetch request to return the number of objects in the persistent store?

When my app launches and the first view controller is created, a new backing NSManagedObject is also created. At this point, I have NOT saved the context (and I started with a fresh, empty persistent store).
The user can transition to another screen that will show a message if there are no saved items or, if saved items exist, it will show a list of the items. This is how I'm checking for saved items:
func checkForSavedItems() -> Bool {
    var itemsDoExist = false
    let fetchRequest = NSFetchRequest<NSNumber>(entityName: "Items")
    fetchRequest.includesPendingChanges = false
    fetchRequest.resultType = .countResultType
    do {
        let countResult = try context.fetch(fetchRequest)
        itemsDoExist = countResult.first!.intValue > 0
    } catch let error {
        // Handle the error
    }
    return itemsDoExist
}
I expected that fetchRequest.includesPendingChanges = false would have ensured that the new object that hasn't been saved would not be counted, but it is. The count comes back as 1, so it must be counting items in the NSManagedObjectContext.
This also suggests that the fetch request is returning the count of items in the context, not the persistent store.
How do I get the real number of items in the persistent store?
I also expected that fetchRequest.includesPendingChanges = false would exclude objects that have been inserted in the context but not saved to the store.
However, the count(for: NSFetchRequest) method should give the correct count. You can find the Apple documentation here.
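A minimal sketch of the count(for:) approach suggested above. savedObjectCount is a hypothetical helper name; the entity name "Items" in the usage note comes from the question. The sketch relies on includesPendingChanges being honored for count requests, which is what the answer expects, so it is worth verifying against your own store.

```swift
import CoreData

/// Counts objects of `entityName` without materializing them.
/// Setting `includesPendingChanges = false` is meant to ignore objects that
/// have been inserted into `context` but not yet saved to the store.
func savedObjectCount(entityName: String, in context: NSManagedObjectContext) throws -> Int {
    let request = NSFetchRequest<NSManagedObject>(entityName: entityName)
    request.includesPendingChanges = false
    return try context.count(for: request)
}
```

In checkForSavedItems this would replace the fetch with something like `itemsDoExist = ((try? savedObjectCount(entityName: "Items", in: context)) ?? 0) > 0`.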

Access resources by Id in Azure DocumentDB

I just started playing with Azure DocumentDB and my excitement has turned into confusion. This thing is weird. It seems like everything (databases, collections, documents) needs to be accessed not by its id, but by its 'SelfLink'. For example:
I create a database:
public void CreateDatabase()
{
    using (var client = new DocumentClient(new Uri(endpoint), authKey))
    {
        Database db = new Database() { Id = "TestDB" };
        client.CreateDatabaseAsync(db).Wait();
    }
}
Then later sometime I want to create a Collection:
public void CreateCollection()
{
    using (var client = new DocumentClient(new Uri(endpoint), authKey))
    {
        DocumentCollection collection = new DocumentCollection() { Id = "TestCollection" };
        client.CreateDocumentCollectionAsync(databaseLink: "???", documentCollection: collection).Wait();
    }
}
The API wants a 'databaseLink' when what I'd really prefer to give it is my database Id. I don't have the 'databaseLink' handy. Does DocumentDB really expect me to pull down a list of all databases and go searching through it for the databaseLink every time I want to do anything?
This problem goes all the way down. I can't save a document to a collection without having the collection's 'link'.
public void CreateDocument()
{
    using (var client = new DocumentClient(new Uri(endpoint), authKey))
        client.CreateDocumentAsync(documentCollectionLink: "???", document: new { Name = "TestName" }).Wait();
}
So to save a document I need the collection's link. To get the collection's link I need the database link. To get the database link I have to pull down a list of all databases in my account and go sifting through it. Then I have to use that database link to pull down a list of collections in that database, which I then have to sift through looking for the link of the collection I want. This doesn't seem right.
Am I missing something? Am I not understanding how to use this? Why am I assigning ids to all my resources when DocumentDB insists on using its own link scheme to identify everything? My question is 'how do I access DocumentDB resources by their Id?'
The information posted in other answers from 2014 is now somewhat out of date. Direct addressing by Id is possible:
Although _selflinks still exist and can be used to access resources, Microsoft have since added a much simpler way to locate resources by their Ids that does not require you to retain the _selflink:
UriFactory.CreateDocumentCollectionUri(databaseId, collectionId);
UriFactory.CreateDocumentUri(databaseId, collectionId, "document id");
This enables you to create a safe Uri (allowing for example for whitespace) - which is functionally identical to the resources _selflink; the example given in the Microsoft announcement is shown below:
// Use UriFactory to build the DocumentLink
Uri docUri = UriFactory.CreateDocumentUri("SalesDb", "Catalog", "prd123");
// Use this constructed Uri to delete the document
await client.DeleteDocumentAsync(docUri);
The announcement, from August 13th 2015, can be found here:
I would recommend you look at the code samples here, in particular the DocumentDB.Samples.ServerSideScripts project.
In the Program.cs you will find the GetOrCreateDatabaseAsync method:
/// <summary>
/// Get or create a Database by id
/// </summary>
/// <param name="id">The id of the Database to search for, or create.</param>
/// <returns>The matched, or created, Database object</returns>
private static async Task<Database> GetOrCreateDatabaseAsync(string id)
{
    Database database = client.CreateDatabaseQuery()
        .Where(db => db.Id == id).ToArray().FirstOrDefault();
    if (database == null)
    {
        database = await client.CreateDatabaseAsync(
            new Database { Id = id });
    }
    return database;
}
To answer your question, you can use this method to find your database by its id, and find other resources (collections, documents etc.) using their respective Create[ResourceType]Query() methods.
Hope that helps.
The create database call returns the database object:
var database = client.CreateDatabaseAsync(new Database { Id = databaseName }).Result.Resource;
And then you can use that to create your collection:
var spec = new DocumentCollection { Id = collectionName };
spec.IndexingPolicy.IndexingMode = IndexingMode.Consistent;
spec.IndexingPolicy.Automatic = true;
spec.IndexingPolicy.IncludedPaths.Add(new IndexingPath { IndexType = IndexType.Range, NumericPrecision = 6, Path = "/" });
var options = new RequestOptions
{
    ConsistencyLevel = ConsistencyLevel.Session
};
var collection = client.CreateDocumentCollectionAsync(database.SelfLink, spec, options).Result.Resource;
The client.Create... methods return the objects, which have the self links you are looking for:
Database database = await client.CreateDatabaseAsync(
new Database { Id = "Foo"});
DocumentCollection collection = await client.CreateDocumentCollectionAsync(
database.SelfLink, new DocumentCollection { Id = "Bar" });
Document document = await client.CreateDocumentAsync(
collection.SelfLink, new { property1 = "Hello World" });
To delete a document in a partitioned collection, use this format:
result = await client.DeleteDocumentAsync(selfLink, new RequestOptions {
    PartitionKey = new PartitionKey(partitionKey)
});

OrganizationServiceContext: System.InvalidOperationException: The context is already tracking the 'contact' entity

I'm trying to create a plugin that changes all related contacts' address fields if the parent account's address field is changed in account form. I created a plugin to run in pre operation stage (update message against account entity) synchronously.
I used a LINQ query to retrieve all related contacts, and it works. Then I use a foreach loop to loop through all the contacts and change their address fields. I use OrganizationServiceContext.AddObject() to add every contact to the tracking pipeline (or whatever it's called), and finally I call OrganizationServiceContext.SaveChanges() to save all the contacts. That's when I get this error:
System.InvalidOperationException: The context is already tracking the 'contact' entity.
Here's my code
// Updating child contacts' fields
if (context.PreEntityImages.Contains("preAccount") && context.InputParameters.Contains("Target") && context.InputParameters["Target"] is Entity) {
    if (((Entity)context.InputParameters["Target"]).Contains("address2_name")) {
        Entity account = (Entity)context.InputParameters["Target"];
        Entity preAccount = (Entity)context.PreEntityImages["preAccount"];
        if (account["address2_name"] != preAccount["address2_name"]) {
            EntityReference parentCustomer = new EntityReference(account.LogicalName, account.Id);
            Contact[] childContacts = orgService.ContactSet.Where(id => id.ParentCustomerId == parentCustomer).ToArray();
            foreach (Contact contact in childContacts) {
                contact.Address2_Name = (string)account["address2_name"];
                orgService.AddObject(contact);
            }
            orgService.SaveChanges();
        }
    }
}
What am I doing wrong?
You already attached the entities to the context when you retrieved the contacts with the query:
Contact[] childContacts = orgService.ContactSet.Where(id => id.ParentCustomerId == parentCustomer).ToArray();
So you don't need to add the entities to the context again. Instead, mark them as updated:
orgService.UpdateObject(contact); // this row instead of orgService.AddObject(contact);

How can I update a content item (draft) from a background task in Orchard?

I have a simple IBackgroundTask implementation that performs a query and then either performs an insert or one or more updates depending on whether a specific item exists or not. However, the updates are not persisted, and I don't understand why. New items are created just as expected.
The content item I'm updating has a CommonPart and I've tried authenticating as a valid user. I've also tried flushing the content manager at the end of the Sweep method. What am I missing?
This is my Sweep, slightly edited for brevity:
public void Sweep()
{
    // Authenticate as the site's super user
    var superUser = _membershipService.GetUser(_orchardServices.WorkContext.CurrentSite.SuperUser);
    // Create a dummy "Person" content item
    var item = _contentManager.New("Person");
    var person = item.As<PersonPart>();
    if (person == null)
    {
        return;
    }
    person.ExternalId = Random.Next(1, 10).ToString();
    person.FirstName = GenerateFirstName();
    person.LastName = GenerateLastName();
    // Check if the person already exists
    var matchingPersons = _contentManager
        .Query<PersonPart, PersonRecord>(VersionOptions.AllVersions)
        .Where(record => record.ExternalId == person.ExternalId)
        .List();
    if (!matchingPersons.Any())
    {
        // Insert new person and quit
        _contentManager.Create(item, VersionOptions.Draft);
        return;
    }
    // There is at least one matching person, update it
    foreach (var updatedPerson in matchingPersons)
    {
        updatedPerson.FirstName = person.FirstName;
        updatedPerson.LastName = person.LastName;
    }
}
Try adding _contentManager.Publish(updatedPerson.ContentItem). If you do not want to publish, but just to save, you don't need to do anything more, as changes in Orchard are saved automatically unless the ambient transaction is aborted. The call to Flush is not necessary at all. This is the case both during a regular request and in a background task.
