Windows Azure: "An item with the same key has already been added." exception thrown on Select

I'm getting a strange error while trying to select a row from a table under Windows Azure Table Storage. The exception "An item with the same key has already been added." is being thrown even though I'm not inserting anything. The query that is causing the problem is as follows:
var ids = new HashSet<string> { id };
var fields = new HashSet<string> { "#all" };

using (var db = new AzureDbFetcher())
{
    var result = db.GetPeople(ids, fields, null);
}
public Dictionary<string, Person> GetPeople(HashSet<string> ids, HashSet<string> fields, CollectionOptions options)
{
    var result = new Dictionary<string, Person>();
    foreach (var id in ids)
    {
        var p = db.persons.Where(x => x.RowKey == id).SingleOrDefault();
        if (p == null)
        {
            continue;
        }
        // do something with result
    }
    return result;
}
As you can see, there's only one id, the error is thrown right at the top of the loop, and nothing is being modified.
One detail that may matter: I'm using "" (an empty string) as the PartitionKey for this particular row. What gives?

You probably added an object with the same row key (and no partition key) to your DataServiceContext before performing this query. When the query then retrieves the conflicting object from the data store, it can't be added to the context because of the collision.
The context tracks every object retrieved from the tables. Since entities are uniquely identified by their PartitionKey/RowKey combination, a context, like the tables themselves, cannot contain duplicate PartitionKey/RowKey combinations.
Possible causes of such a collision are:
Retrieving an entity, modifying it, and then retrieving it again using the same context.
Adding an entity to the context, and then retrieving one with the same keys.
In both cases, the context finds that it is already tracking a different object with the same keys. This is not something the context can sort out by itself, hence the exception.
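If your context is only used for reads, you can sidestep tracking collisions entirely by disabling tracking. A minimal sketch, assuming AzureDbFetcher exposes its underlying DataServiceContext (the Context property name here is hypothetical):

// MergeOption.NoTracking stops the context from tracking retrieved entities,
// so duplicate PartitionKey/RowKey combinations can no longer collide
db.Context.MergeOption = MergeOption.NoTracking;
var p = db.persons.Where(x => x.RowKey == id).SingleOrDefault();

Note this only suits read-only scenarios, since untracked entities can't be updated or deleted through the context.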
Hope this helps. If you could share a little more of the surrounding code, it would be easier to pinpoint where the first object was added.

Related

Get the Scan Operator when releasing documents

When releasing documents, the scan operator should get logged to a file. I know this is a Kofax system variable, but how do I get it from the ReleaseData object?
Maybe this value is held by the Values collection? What would the key be then? I would try to access it using
string scanOperator = documentData.Values["?scanOperator?"].Value;
Kofax's weird naming convention strikes again - during setup, said items are referred to as BatchVariableNames. However, during release they are KFX_REL_VARIABLEs (a value of the KfxLinkSourceType enum).
Here's how you can add all available items during setup:
foreach (var item in setupData.BatchVariableNames)
{
    setupData.Links.Add(item, KfxLinkSourceType.KFX_REL_VARIABLE, item);
}
The following sample iterates over the DocumentData.Values collection, storing each BatchVariable in a Dictionary<string, string> named BatchVariables.
foreach (Value v in DocumentData.Values)
{
    switch (v.SourceType)
    {
        case KfxLinkSourceType.KFX_REL_VARIABLE:
            BatchVariables.Add(v.SourceName, v.Value);
            break;
    }
}
You can then access any of those variables by key - for example, Scan Operator's User ID yields the scan user's domain and user name.
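A minimal lookup sketch, assuming the variable was linked under the name "Scan Operator's User ID" during setup (the exact key is whatever name appeared in BatchVariableNames):

// BatchVariables is the Dictionary<string, string> filled in the loop above
string scanOperator;
if (BatchVariables.TryGetValue("Scan Operator's User ID", out scanOperator))
{
    // scanOperator now holds something like DOMAIN\user - write it to the release log
}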

ServiceStack: Update<T>(...) produces 'duplicate entry'

I have tried reading the docs, but I don't get why the Update method produces a "Duplicate entry" MySQL error.
The docs says
In its most simple form, updating any model without any filters will update every field, except the Id which is used to filter the update to this specific record:
So I try it, and pass in an object, like below. A row with id 2 already exists.
using (var _db = _dbFactory.Open())
{
    Customer coreObject = new Customer(...);
    coreObject.Id = 2;
    coreObject.ObjectName = "a changed value";
    _db.Update<Customer>(coreObject); // <-- error "duplicate entry"
}
Yes, there are options using .Save and such, but what am I missing with the .Update? As I read it, it should use its Id property to update the row in the db, not insert a new row?
The issue is that your method works with a generic object T, but your Update call names the concrete Customer type:
public void MyTestMethod<T>(T coreObject) where T : CoreObject
{
    long id = 0;
    using (var _db = _dbFactory.Open())
    {
        id = _db.Insert<T>(coreObject, selectIdentity: true);
        if (DateTime.Now.Ticks == 0)
        {
            coreObject.Id = (uint)id;
            _db.Delete(coreObject);
        }
        if (DateTime.Now.Ticks == 0)
        {
            _db.DeleteById<Customer>(id);
        }
        if (DateTime.Now.Ticks == 0)
        {
            coreObject.Id = (uint)id;
            coreObject.ObjectName = "a changed value";
            _db.Update<Customer>(coreObject);
        }
    }
}
OrmLite therefore assumes you're using a different/anonymous object to update the Customer table, similar to:
db.Update<Customer>(new { Id = id, ObjectName = "a changed value", ... });
Since that form doesn't include a WHERE filter, it attempts to set the same primary key on every row, which is what produces the duplicate-entry error.
What you want instead is to update the same entity, either by passing in the generic type T or by letting the type be inferred by not passing one at all, e.g.:
_db.Update<T>(coreObject);
_db.Update(coreObject);
This uses OrmLite's standard behavior of updating every field except the primary key, which is instead used in the WHERE expression to limit the update to just that entity.
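Applied to the snippet from the question, the fix is a one-line change (a sketch; Customer, _dbFactory, and ObjectName come from the question):

using (var _db = _dbFactory.Open())
{
    Customer coreObject = new Customer(...);
    coreObject.Id = 2;
    coreObject.ObjectName = "a changed value";
    _db.Update(coreObject); // type inferred as Customer; Id goes into the WHERE clause
}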
New Behavior in v5.1.1
To prevent accidental misuse like this I've added an Update API overload in this commit which will use the Primary Key as a filter when using an anonymous object to update an entity, so your previous usage:
_db.Update<Customer>(coreObject);
will add the Primary Key to the WHERE filter instead of including it in the SET list. This change ships in v5.1.1, which is now available on MyGet.
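For example, under the new behavior (a sketch based on the description above):

_db.Update<Customer>(new { Id = 2, ObjectName = "a changed value" }); // v5.1.1+: Id goes into the WHERE filter, not the SET list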

AWS sdk for .net queryAsync method using global secondary index fails

Given below is the method I use to retrieve details from a DynamoDB table, but when I call it, it throws the exception "Unable to locate property for key attribute appointmentId". The primary key of this table is appointmentId, but I've also created a global secondary index on the patientId column, and I'm using that index in the query below to get the appointment details for a given patientID.
public async Task GetAppointmentByPatientID(int patientID)
{
    var context = CommonUtils.Instance.DynamoDBContext;
    PatientAppointmentObjectList.Clear();

    DynamoDBOperationConfig config = new DynamoDBOperationConfig();
    config.IndexName = DBConstants.APPOINTMENT_PATIENTID_GSI;

    AsyncSearch<ScheduledAppointment> appQuery = context.QueryAsync<ScheduledAppointment>(patientID.ToString(), config);
    IEnumerable<ScheduledAppointment> appList = await appQuery.GetRemainingAsync();
    appList.Distinct().ToList().ForEach(i => PatientAppointmentObjectList.Add(i));

    if (PropertyChanged != null)
        this.OnPropertyChanged("PatientAppointmentObjectList");
}
It was a silly mistake: the hash key column of the table was named "appointmentID" while the model property was named AppointmentID. The mismatch in the casing of the property name had confused the DynamoDB mapping.
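One way to make the mapping explicit, so that property-name casing no longer matters, is to name the attribute on the model. A sketch; the table name, index name, and property shapes are assumptions, not taken from the question:

[DynamoDBTable("Appointments")]
public class ScheduledAppointment
{
    // map the property to the exact attribute name used in the table
    [DynamoDBHashKey("appointmentID")]
    public string AppointmentID { get; set; }

    // hash key of the global secondary index used by the query
    [DynamoDBGlobalSecondaryIndexHashKey("patientId-index")]
    public string PatientId { get; set; }
}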

Getting an error creating a Query object in SubSonic

I am getting the following error in one of our environments. It seems to occur when IIS is restarted, but we haven't narrowed down the specifics to reproduce it.
A DataTable named 'PeoplePassword' already belongs to this DataSet.
at System.Data.DataTableCollection.RegisterName(String name, String tbNamespace)
at System.Data.DataTableCollection.BaseAdd(DataTable table)
at System.Data.DataTableCollection.Add(DataTable table)
at SubSonic.SqlDataProvider.GetTableSchema(String tableName, TableType tableType)
at SubSonic.DataService.GetSchema(String tableName, String providerName, TableType tableType)
at SubSonic.DataService.GetTableSchema(String tableName, String providerName)
at SubSonic.Query..ctor(String tableName)
at Wad.Elbert.Data.Enrollment.FetchByUserId(Int32 userId)
Based on the stacktrace, I believe the error is happening on the second line of the method while creating the query object.
Please let me know if anyone else has this problem.
Thanks!
The code for the function is:
public static List<Enrollment> FetchByUserId(int userId)
{
    List<Enrollment> enrollments = new List<Enrollment>();
    SubSonic.Query query = new SubSonic.Query("Enrollment");
    query.SelectList = "userid, prompt, response, validationRegex, validationMessage, responseType, enrollmentSource";
    query.QueryType = SubSonic.QueryType.Select;
    query.AddWhere("userId", userId);
    DataSet dataset = query.ExecuteDataSet();
    if (dataset != null && dataset.Tables.Count > 0)
    {
        foreach (DataRow dr in dataset.Tables[0].Rows)
        {
            enrollments.Add(new Enrollment(
                (int)dr["userId"],
                dr["prompt"].ToString(),
                dr["response"].ToString(),
                dr["validationRegex"] != null ? dr["validationRegex"].ToString() : string.Empty,
                dr["validationMessage"] != null ? dr["validationMessage"].ToString() : string.Empty,
                (int)dr["responseType"],
                (int)dr["enrollmentSource"]));
        }
    }
    return enrollments;
}
This is a threading issue.
SubSonic loads its schema on the first call to SubSonic.DataService.GetTableSchema(...), but this method is not thread safe.
Let me demonstrate with a small example:
private static Dictionary<string, DriveInfo> drives = new Dictionary<string, DriveInfo>();

private static DriveInfo GetDrive(string name)
{
    if (drives.Count == 0)
    {
        Thread.Sleep(10000); // fake delay
        foreach (var drive in DriveInfo.GetDrives())
            drives.Add(drive.Name, drive);
    }

    if (drives.ContainsKey(name))
        return drives[name];
    return null;
}
This illustrates what happens: on the first call, the dictionary is empty, so the method preloads all drives. Every call then returns the requested drive (or null).
But what happens if you call the method twice right after startup? Both executions try to load the drives into the dictionary. The first one to add a drive wins; the second throws an ArgumentException (element already exists).
After the initial preload, everything works fine.
Long story short, you have two choices:
Modify the SubSonic source to make SubSonic.DataService.GetTableSchema(...) thread safe (see the sketch after this list):
http://msdn.microsoft.com/de-de/library/c5kehkcz(v=vs.80).aspx
"Warm up" SubSonic before accepting requests. The technique to achieve this depends on your application design. For ASP.NET there is an Application_Start method that is executed only once during the application lifecycle:
http://msdn.microsoft.com/en-us/library/ms178473(v=vs.100).aspx
So you can basically put a
var count = new SubSonic.Query("Enrollment").GetRecordCount();
in that method to force SubSonic to initialize the table schema itself.
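For the first option, a minimal thread-safe version of the GetDrive example above could look like this; a sketch using the lock statement, not the actual SubSonic patch:

private static readonly object drivesLock = new object();
private static Dictionary<string, DriveInfo> drives = new Dictionary<string, DriveInfo>();

private static DriveInfo GetDrive(string name)
{
    // serialize the lazy initialization so only one thread can populate the dictionary
    lock (drivesLock)
    {
        if (drives.Count == 0)
        {
            foreach (var drive in DriveInfo.GetDrives())
                drives.Add(drive.Name, drive);
        }
        return drives.ContainsKey(name) ? drives[name] : null;
    }
}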

Add or replace entity in Azure Table Storage

I'm working with Windows Azure Table Storage and have a simple requirement: add a new row, overwriting any existing row with that PartitionKey/RowKey. However, saving the changes always throws an exception, even if I pass in the ReplaceOnUpdate option:
tableServiceContext.AddObject(TableName, entity);
tableServiceContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);
If the entity already exists it throws:
System.Data.Services.Client.DataServiceRequestException: An error occurred while processing this request. ---> System.Data.Services.Client.DataServiceClientException: <?xml version="1.0" encoding="utf-8" standalone="yes"?>
<error xmlns="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
<code>EntityAlreadyExists</code>
<message xml:lang="en-AU">The specified entity already exists.</message>
</error>
Do I really have to manually query for the existing row first and call DeleteObject on it? That seems very slow. Surely there is a better way?
As you've found, you can't just add another item that has the same partition key and row key, so you will need to run a query to check whether the item already exists. In situations like this I find it helpful to look at the Azure REST API documentation to see what is available to the storage client library. You'll see that there are separate methods for inserting and updating; ReplaceOnUpdate only has an effect when you're updating, not inserting.
While you could delete the existing item and then add the new one, you could just update the existing one (saving you one round trip to storage). Your code might look something like this:
var existsQuery = from e in tableServiceContext.CreateQuery<MyEntity>(TableName)
                  where e.PartitionKey == objectToUpsert.PartitionKey
                     && e.RowKey == objectToUpsert.RowKey
                  select e;

MyEntity existingObject = existsQuery.FirstOrDefault();

if (existingObject == null)
{
    tableServiceContext.AddObject(TableName, objectToUpsert);
}
else
{
    existingObject.Property1 = objectToUpsert.Property1;
    existingObject.Property2 = objectToUpsert.Property2;
    tableServiceContext.UpdateObject(existingObject);
}

tableServiceContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);
EDIT: While correct at the time of writing, with the September 2011 update Microsoft has updated the Azure Table API to include two upsert commands, Insert or Replace Entity and Insert or Merge Entity.
In order to operate on an existing object NOT managed by the TableContext, with either DeleteObject or SaveChanges with the ReplaceOnUpdate option, you need to call AttachTo to attach the object to the TableContext, instead of calling AddObject, which instructs the TableContext to attempt an insert.
http://msdn.microsoft.com/en-us/library/system.data.services.client.dataservicecontext.attachto.aspx
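A minimal sketch, assuming entity already has its PartitionKey and RowKey set; the "*" ETag makes the write unconditional:

// attach the existing row to the context without querying for it first
tableServiceContext.AttachTo(TableName, entity, "*");
tableServiceContext.UpdateObject(entity);
tableServiceContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);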
In my case removing it first was not allowed, so I do it like this. This results in one transaction to the server, which first removes the existing object and then adds the new one, removing the need to copy property values:
var existing = from e in _ServiceContext.AgentTable
               where e.PartitionKey == item.PartitionKey
                  && e.RowKey == item.RowKey
               select e;

_ServiceContext.IgnoreResourceNotFoundException = true;

var existingObject = existing.FirstOrDefault();
if (existingObject != null)
{
    _ServiceContext.DeleteObject(existingObject);
}

_ServiceContext.AddObject(AgentConfigTableServiceContext.AgetnConfigTableName, item);
_ServiceContext.SaveChangesWithRetries();
_ServiceContext.IgnoreResourceNotFoundException = false;
Insert/Merge or Update was added to the API in September 2011. Here is an example using Storage API 2.0, which is easier to understand than the way it is done in the 1.7 API and earlier.
public void InsertOrReplace(ITableEntity entity)
{
    retryPolicy.ExecuteAction(
        () =>
        {
            try
            {
                TableOperation operation = TableOperation.InsertOrReplace(entity);
                cloudTable.Execute(operation);
            }
            catch (StorageException e)
            {
                string message = "InsertOrReplace entity failed.";
                if (e.RequestInformation.HttpStatusCode == 404)
                {
                    message += " Make sure the table is created.";
                }
                // do something with message
            }
        });
}
The Storage API does not allow more than one operation per entity (delete+insert) in a group transaction:
An entity can appear only once in the transaction, and only one operation may be performed against it.
see MSDN: Performing Entity Group Transactions
So in fact you need to read first and decide on insert or update.
You may use the UpsertEntity and UpsertEntityAsync methods in the official Microsoft Azure.Data.Tables TableClient.
A fully working example is available at https://github.com/Azure-Samples/msdocs-azure-data-tables-sdk-dotnet/blob/main/2-completed-app/AzureTablesDemoApplicaton/Services/TablesService.cs --
public void UpsertTableEntity(WeatherInputModel model)
{
    TableEntity entity = new TableEntity();
    entity.PartitionKey = model.StationName;
    entity.RowKey = $"{model.ObservationDate} {model.ObservationTime}";

    // The other values are added like items to a dictionary
    entity["Temperature"] = model.Temperature;
    entity["Humidity"] = model.Humidity;
    entity["Barometer"] = model.Barometer;
    entity["WindDirection"] = model.WindDirection;
    entity["WindSpeed"] = model.WindSpeed;
    entity["Precipitation"] = model.Precipitation;

    _tableClient.UpsertEntity(entity);
}
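Note that UpsertEntity defaults to merge semantics; to overwrite the whole entity instead, pass the update mode explicitly:

_tableClient.UpsertEntity(entity, TableUpdateMode.Replace); // replace instead of the default merge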
