Windows Azure Table Services - Extended Properties and Table Schema


I have an entity that, in addition to a few common properties, contains a list of extended properties stored as (Name, Value) pairs of strings within a collection. I should probably mention that these extended properties widely vary from instance to instance, and that they only need to be listed for each instance (there won't be any queries over the extended properties, for example finding all instances with a particular (Name, Value) pair). I'm exploring how I might persist this entity using Windows Azure Table Services. With the particular approach I'm testing now, I'm concerned that there may be a degradation of performance over time as more distinct extended property names are encountered by the application.
If I were storing this entity in a typical relational database, I'd probably have two tables to support this schema: the first would contain the entity identifier and its common properties, and the second would reference the entity identifier and use EAV style row-modeling to store the extended (Name, Value) pairs, one to each row.
Since tables in Windows Azure already use an EAV model, I'm considering custom serialization of my entity so that the extended properties are stored as though they were declared at compile time for the entity. I can use the Reading- and Writing-Entity events provided by DataServiceContext to accomplish this.
private void OnReadingEntity(object sender, ReadingWritingEntityEventArgs e)
{
    MyEntity Entry = e.Entity as MyEntity;
    if (Entry != null)
    {
        XElement Properties = e.Data
            .Element(Atom + "content")
            .Element(Meta + "properties");
        //select metadata from the extended properties
        Entry.ExtendedProperties = (from p in Properties.Elements()
                                    where p.Name.Namespace == Data && !IsReservedPropertyName(p.Name.LocalName) && !string.IsNullOrEmpty(p.Value)
                                    select new Property(p.Name.LocalName, p.Value)).ToArray();
    }
}

private void OnWritingEntity(object sender, ReadingWritingEntityEventArgs e)
{
    MyEntity Entry = e.Entity as MyEntity;
    if (Entry != null)
    {
        XElement Properties = e.Data
            .Element(Atom + "content")
            .Element(Meta + "properties");
        //add extended properties from the metadata
        foreach (Property p in (from p in Entry.ExtendedProperties
                                where !IsReservedPropertyName(p.Name) && !string.IsNullOrEmpty(p.Value)
                                select p))
        {
            Properties.Add(new XElement(Data + p.Name, p.Value));
        }
    }
}
This works, and since I can define requirements for extended property names and values, I can ensure that they conform to all the standard requirements for entity properties within a Windows Azure Table.
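The filtering rule used by both handlers can be isolated from DataServiceContext and checked on its own. The following is only a sketch: `ExtendedPropertyFilter` and its `Filter` method are hypothetical names, and `IsReservedPropertyName` is a stand-in that assumes the reserved names are just the three system properties every table entity carries (`PartitionKey`, `RowKey`, `Timestamp`). The real helper would presumably also exclude the entity's own common properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtendedPropertyFilter
{
    // Assumption: only the system properties of a table entity are reserved here.
    private static readonly HashSet<string> Reserved =
        new HashSet<string>(StringComparer.Ordinal)
        {
            "PartitionKey", "RowKey", "Timestamp"
        };

    // Hypothetical stand-in for the IsReservedPropertyName helper used above.
    public static bool IsReservedPropertyName(string name)
    {
        return Reserved.Contains(name);
    }

    // Keeps only the (Name, Value) pairs that the handlers above would read or write:
    // the name is not reserved and the value is not null or empty.
    public static IList<KeyValuePair<string, string>> Filter(
        IEnumerable<KeyValuePair<string, string>> pairs)
    {
        return pairs
            .Where(p => !IsReservedPropertyName(p.Key) && !string.IsNullOrEmpty(p.Value))
            .ToList();
    }
}
```

Keeping the rule in one place means the read and write paths cannot drift apart, which matters because a name accepted on write but rejected on read would silently drop data.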
So what happens over time as the application encounters thousands of different extended property names?
Here's what I've observed within the development storage environment:
The table container schema grows with each new name. I'm not sure exactly how this schema is used (probably for the next point), but obviously this xml document could grow quite large over time.
Whenever an instance is read, the xml passed to OnReadingEntity contains elements for every property name ever stored for any other instance (not just the ones stored for the particular instance being read). This means that retrieval of an entity will become slower over time.
Should I expect these behaviors in the production storage environment? I can see how these behaviors would be acceptable for most tables, as the schema would be mostly static over time. Perhaps Windows Azure Tables were not designed to be used like this? If so, I will certainly need to change my approach. I'm also open to suggestions on alternate approaches.

Development storage uses SQL Express to simulate cloud table storage. Ignore what you see there... the production storage system doesn't store any schema, so there's no overhead to having lots of unique properties in a table.


ServiceStack.OrmLite: Table collision when class name appears in different namespaces

When two classes have the same name but live in different namespaces, ServiceStack's OrmLite is unable to distinguish between them. For example:
Type type = typeof(FirstNameSpace.BaseModel);
using (IDbConnection db = _dbFactory.Open())
{
    db.CreateTable(false, type); // Creates table "basemodel"
}
type = typeof(SecondNamespace.BaseModel);
using (IDbConnection db = _dbFactory.Open())
{
    db.CreateTable(false, type); // Creates nothing, as there already is a table 'basemodel', even though it's a completely different object/class
}
Is there a general, clean way to make sure that this is resolved?
It is not ideal to be forced to name classes uniquely; part of the purpose of namespaces in .NET is to group and categorize different classes. There might also be third-party assemblies with the same class names, which you are not able to change.
Is there a way to handle this?

OrmLite uses the name of the Type for the table name so you can’t use 2 different Types with the same name.
You will need to either rename one of the Types to avoid the collision or use the [Alias("UseTableName")] attribute to tell one of the Types to use a different RDBMS Table name.
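The collision follows directly from how .NET reports type names: `Type.Name` omits the namespace, while `Type.FullName` includes it. The sketch below shows this with plain reflection and no OrmLite dependency. `TableNames.Qualified` is a hypothetical helper, not part of OrmLite; it only illustrates one way to derive a distinct name that could then be supplied through the `[Alias]` attribute mentioned above.

```csharp
using System;

namespace FirstNamespace { public class BaseModel { } }
namespace SecondNamespace { public class BaseModel { } }

public static class TableNames
{
    // Hypothetical helper: builds a namespace-qualified name such as
    // "FirstNamespace_BaseModel" that stays unique across namespaces.
    public static string Qualified(Type type)
    {
        return (type.Namespace + "_" + type.Name).Replace('.', '_');
    }
}
```

Here `typeof(FirstNamespace.BaseModel).Name` and `typeof(SecondNamespace.BaseModel).Name` are both `BaseModel`, whereas `TableNames.Qualified` yields `FirstNamespace_BaseModel` and `SecondNamespace_BaseModel`.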

DDD. May I migrate my value object to an entity?

So I'm trying to rewrite my app in a DDD way. This is a simple app with 3 classes:
Configuration(name)
Environment(name)
Property(key)
I use it to view and edit configuration files per environment. One Configuration can be viewed as a table with Property as row and Environment as column.
At this time, the Configuration is an entity, and Environment and Property are value objects. But now I have to implement the use case of setting a Value on a Property for a given Environment. My first idea was this one:
class Configuration(name) {
    environments = SetOf[Environment]
    properties = SetOf[Property]
    setValue(property, environment, value) {
        knowEnv = environments.get(environment)
        knowEnv.setValue(property, value)
    }
}
class Environment(name) {
    properties = MapOf[Property, Value]
    setValue(property, value) {
        properties[property] = value
    }
}
But doing so will change my Environment from a value object to an entity. So I started to think (too much) and had trouble finding the "best" solution. That's why I came here to ask you, experts, how you would implement this use case.
Thanks

From what you've posted it does sound as though each Environment is a unique thing with an identity. I'm guessing your Environments are probably platforms or development environments? So it probably should be an entity.
It does sound as though your Environment could be edited, used, created, etc. independently of anything else. In this case it probably shouldn't exist as part of another aggregate, so it should be its own aggregate root (even if it's just a single entity). Therefore it would have its own repository. This is a point that isn't blatantly obvious in the Evans DDD book, but an entity on its own is considered an aggregate root (made up of just one object).
If you wish to reference an Environment from another aggregate root, you would reference it by its unique id (not as an object reference). You would then need another technique/method to retrieve these Environments.
This might seem to fly in the face of the old data-centric dogma, but you can do all sorts of things, like data caching your Environments (as there's probably a limited amount and they probably change infrequently) or employ CQRS.

Given the discussion and comments received on this question, I decided to keep the Environment immutable, as a value object. Setting a property value will then produce a new Environment:
class Configuration(name) {
    environments = SetOf[Environment]
    properties = SetOf[Property]
    setValue(property, environment, value) {
        knowEnv = environments.get(environment)
        updatedEnv = knowEnv.setValue(property, value)
        environments.replace(knowEnv, updatedEnv)
    }
}
class Environment(name) {
    properties = MapOf[Property, Value]
    setValue(property, value) {
        copy = new Environment(name)
        // copy the map itself; sharing it would also change the original Environment
        copy.properties = properties.copy()
        copy.properties[property] = value
        return copy
    }
}
It is simple to use and acceptable for our use cases.
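For readers who want something compilable, here is a minimal C# sketch of the same idea. It is an illustration, not the poster's actual code: the class is named `ConfigEnvironment` (to avoid confusion with `System.Environment`), properties and values are simplified to strings, and `GetValue` is an added accessor. The key point is that `SetValue` copies the dictionary before changing it, so the instance it was called on is never modified.

```csharp
using System.Collections.Generic;

public sealed class ConfigEnvironment
{
    private readonly Dictionary<string, string> _values;

    public string Name { get; private set; }

    public ConfigEnvironment(string name)
        : this(name, new Dictionary<string, string>())
    {
    }

    private ConfigEnvironment(string name, Dictionary<string, string> values)
    {
        Name = name;
        _values = values;
    }

    // Returns a new instance; the map is copied first so this instance stays unchanged.
    public ConfigEnvironment SetValue(string property, string value)
    {
        var copy = new Dictionary<string, string>(_values);
        copy[property] = value;
        return new ConfigEnvironment(Name, copy);
    }

    // Returns null when the property has no value in this environment.
    public string GetValue(string property)
    {
        string value;
        return _values.TryGetValue(property, out value) ? value : null;
    }
}
```

Calling `dev.SetValue("timeout", "30")` returns an updated environment while `dev.GetValue("timeout")` still returns null, which is the behaviour a value object needs.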

How to use ObjectContext with Model Builder?

Is there a way we can use ObjectContext with DbContext's ModelBuilder? We don't want to use POCOs because we have customized property code that, on update, does not modify the entire object but only updates the modified properties. We also have a lot of serialisation and auditing code that uses EntityObject.
Since a POCO does create a proxy with EntityObject, we want our classes to be derived from EntityObject directly. We don't want a proxy. We also use CreateSourceQuery heavily. The only problem is the EDMX file and its long connection string syntax in web.config.
Is there any way I can get rid of the EDMX file? That would be useful, as we could dynamically compile new classes by reverse engineering the database.
I would also like to use DbContext with EntityObject instead of POCOs.
Internal Logic
Access Modified Properties in Save Changes which is available in ObjectStateEntry and Save them onto Audit with Old and New Values
Most of times we need to only check for Any condition on Navigation Property for example
User.EmailAddresses.CreateSourceQuery()
.Any( x=> x.EmailAddress == givenAddress);
Access Property Attributes, such as XmlIgnore etc, we rely heavily on attributes defined on the properties.

A proxy for a POCO is a dynamically created class which derives from (inherits) a POCO. It adds functionality previously found in EntityObject, namely lazy loading and change tracking, as long as a POCO meets requirements. A POCO or its proxy does not contain an EntityObject as the question suggests, but rather a proxy contains functionality of EntityObject. You cannot (AFAIK) use ModelBuilder with EntityObject derivatives and you cannot get to an underlying EntityObject from a POCO or a proxy, since there isn't one as such.
I don't know which features of ObjectContext your existing serialisation and auditing code uses, but you can get to the ObjectContext from a DbContext by casting the DbContext to IObjectContextAdapter and accessing the IObjectContextAdapter.ObjectContext property.
EDIT:
1. Access Modified Properties in Save Changes which is available in ObjectStateEntry and Save them onto Audit with Old and New Values
You can achieve this with POCOs by using DbContext.ChangeTracker. First you call DbContext.ChangeTracker.DetectChanges to detect the changes (if you use proxies this is not needed, but can't hurt) and then you use DbContext.ChangeTracker.Entries().Where(e => e.State != EntityState.Unchanged && e.State != EntityState.Detached) to get the DbEntityEntry list of changed entities for auditing. Each DbEntityEntry has OriginalValues and CurrentValues, and the actual entity is in the Entity property.
You also have access to ObjectStateEntry, see below.
2. Most of times we need to only check for Any condition on Navigation Property for example:
User.EmailAddresses.CreateSourceQuery().Any( x=> x.EmailAddress == givenAddress);
You can use CreateSourceQuery() with DbContext by utilizing IObjectContextAdapter as described previously. When you have ObjectContext you can get to the source query for a related end like this:
public static class DbContextUtils
{
    public static ObjectQuery<TMember> CreateSourceQuery<TEntity, TMember>(this IObjectContextAdapter adapter, TEntity entity, Expression<Func<TEntity, ICollection<TMember>>> memberSelector) where TMember : class
    {
        var objectStateManager = adapter.ObjectContext.ObjectStateManager;
        var objectStateEntry = objectStateManager.GetObjectStateEntry(entity);
        var relationshipManager = objectStateManager.GetRelationshipManager(entity);
        var entityType = (EntityType)objectStateEntry.EntitySet.ElementType;
        var navigationProperty = entityType.NavigationProperties[(memberSelector.Body as MemberExpression).Member.Name];
        var relatedEnd = relationshipManager.GetRelatedEnd(navigationProperty.RelationshipType.FullName, navigationProperty.ToEndMember.Name);
        return ((EntityCollection<TMember>)relatedEnd).CreateSourceQuery();
    }
}
This method uses no dynamic code and is strongly typed since it uses expressions. You use it like this:
myDbContext.CreateSourceQuery(invoice, i => i.details);

Azure caching and entity framework deserialization issue

I have a web project deployed in azure using colocated caching. I have 2 instances of this web role.
I am using Entity framework 5 and upon fetching some entities from the db, I cache them using colocated caching.
My entities are defined in class library called Drt.BusinessLayer.Entities
However when I visit my web app, I get the error:
The deserializer cannot load the type to deserialize because type 'System.Data.Entity.DynamicProxies.Country_4C17F5A60A033813EC420C752F1026C02FA5FC07D491A3190ED09E0B7509DD85' could not be found in assembly 'EntityFrameworkDynamicProxies-Drt.BusinessLayer.Entities, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'. Check that the type being serialized has the same contract as the type being deserialized and the same assembly is used.
Also sometimes I get this too:
Assembly 'EntityFrameworkDynamicProxies-Drt.BusinessLayer.Entities, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' is not found.
It appears that there is an error getting the entities out/deserialized. Since they are 2 instances of my web role, instance1 might place some entity objects in the cache and instance2 might get them out. I was expecting this to work, but I am unsure why I am getting this error....
Can anyone help/advise?

I ran into the same issue. At least in my case, the problem was the DynamicProxies with which the EF wraps all the model classes. In other words, you might think you're retrieving a Country class, but under the hood, EF is actually dynamically generating a class that's called something like Country_4C17F5A60A033813EC420C752F1026C02FA5FC07D491A3190ED09E0B7509DD85. The last part of the name is obviously generated at run-time, and it can be expected to remain static throughout the life of your application - but (and this is the key) only on the same instance of the app domain. If you've got two machines accessing the same out-of-process cache, one will be storing an object of the type Country_4C17F5A60A033813EC420C752F1026C02FA5FC07D491A3190ED09E0B7509DD85, but that type simply won't exist on the other machine. Its dynamic Country class will be something like Country_JF7ASDF8ASDF8ADSF88989ASDF8778802348JKOJASDLKJQAWPEORIU7879243AS, and so there won't be any type into which it can deserialize the serialized object. The same thing will happen if you restart the app domain your web app is running in.
I'm sure the big brains at MS could come up with a better solution, but the one I've been using is to do a "shallow clone" of my EF objects before I cache them. The C# method I'm using looks like this:
public static class TypeHelper
{
    public static T ShallowClone<T>(this T obj) where T : class
    {
        if (obj == null) return null;
        var newObj = Activator.CreateInstance<T>();
        var fields = typeof(T).GetFields();
        foreach (var field in fields)
        {
            if (field.IsPublic && (field.FieldType.IsValueType || field.FieldType == typeof(string)))
            {
                field.SetValue(newObj, field.GetValue(obj));
            }
        }
        var properties = typeof(T).GetProperties();
        foreach (var property in properties)
        {
            if ((property.CanRead && property.CanWrite) &&
                (property.PropertyType.IsValueType || property.PropertyType == typeof(string)))
            {
                property.SetValue(newObj, property.GetValue(obj, null), null);
            }
        }
        return newObj;
    }
}
This takes care of two problems at once: (1) It ensures that only the EF object I'm specifically interested in gets cached, and not the entire object graph - sometimes huge - to which it's attached; and (2) The object that it caches is of a common type, and not the dynamically generated type: Country and not Country_4C17F5A60A033813EC420C752F1026C02FA5FC07D491A3190ED09E0B7509DD85.
It's certainly not perfect, but it does seem a reasonable workaround for many scenarios.
It would in fact be nice, though, if the good folks at MS were to come up with a way to cache EF objects without this.
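The effect on the cached type can be reproduced without Entity Framework. In the sketch below, `CountryProxy` is a hand-written stand-in for the class EF generates at run time, and `CloneDemo.ShallowClone` is a condensed, property-only variant of the method above (it uses a `new()` constraint instead of `Activator.CreateInstance`); both names are assumptions made for illustration.

```csharp
using System;

public class Country
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Stand-in for the dynamic proxy type that EF would generate at run time.
public class CountryProxy : Country
{
}

public static class CloneDemo
{
    // Copies readable and writable value-type and string properties into a new T.
    public static T ShallowClone<T>(T obj) where T : class, new()
    {
        if (obj == null) return null;
        var clone = new T();
        foreach (var p in typeof(T).GetProperties())
        {
            bool simple = p.PropertyType.IsValueType || p.PropertyType == typeof(string);
            if (p.CanRead && p.CanWrite && simple)
            {
                p.SetValue(clone, p.GetValue(obj, null), null);
            }
        }
        return clone;
    }
}
```

With `Country fromEf = new CountryProxy { Id = 1, Name = "France" };`, the call `CloneDemo.ShallowClone(fromEf)` infers `T` as `Country`, so the result's run-time type is `Country` rather than the proxy. This depends on the static type of the argument: if the variable were declared as `CountryProxy`, the clone would be a proxy again.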

I'm not familiar with azure-caching in particular, but I'm guessing you need to hydrate your entities completely before passing them to anything that does serialization, which is something a distributed or out-of-process cache would do.
So, just do .Include() on all relationships when you're fetching an entity or disable lazy initialization and you should be fine.

DDD: keep a link to an entity inside an aggregate root, for reporting only

I'm refactoring a project using DDD, but am concerned about not making too many Entities their own Aggregate Root.
I have a Store, which has a list of ProductOptions and a list of Products. A ProductOption can be used by several Products. These entities seem to fit the Store aggregate pretty well.
Then I have an Order, which transiently uses a Product to build its OrderLines:
class Order {
    // ...
    public function addOrderLine(Product $product, $quantity) {
        $orderLine = new OrderLine($product, $quantity);
        $this->orderLines->add($orderLine);
    }
}
class OrderLine {
    // ...
    public function __construct(Product $product, $quantity) {
        $this->productName = $product->getName();
        $this->basePrice = $product->getPrice();
        $this->quantity = $quantity;
    }
}
It looks like, for now, the DDD rules are respected. But I'd like to add a requirement that might break the rules of the aggregate: the Store owner will sometimes need to check statistics about the Orders which included a particular Product.
That means that basically, we would need to keep a reference to the Product in the OrderLine, but this would never be used by any method inside the entity. We would only use this information for reporting purposes, when querying the database; thus it would not be possible to "break" anything inside the Store aggregate because of this internal reference:
class OrderLine {
    // ...
    public function __construct(Product $product, $quantity) {
        $this->productName = $product->getName();
        $this->basePrice = $product->getPrice();
        $this->quantity = $quantity;
        // store this information, but don't use it in any method
        $this->product = $product;
    }
}
Does this simple requirement dictate that Product becomes an aggregate root? That would also cascade to ProductOption becoming an aggregate root, as Product has a reference to it, thus resulting in two aggregates which have no meaning outside a Store and will not need any Repository; that looks weird to me.
Any comment is welcome!

Even though it is for 'reporting only', there is still a business / domain meaning there. I think that your design is good, although I would not handle the new requirement by storing an OrderLine -> Product reference. I would do something similar to what you are already doing with the product name and price. You just need to store some sort of product identifier (SKU?) in the order line. This identifier/SKU can later be used in a query. The SKU can be a combination of the Store and Product natural keys:
class Sku {
    private String _storeNumber;
    private String _someProductIdUniqueWithinStore;
}
class OrderLine {
    private Money _price;
    private int _quantity;
    private String _productName;
    private Sku _productSku;
}
This way you don't violate any aggregate rules and the product and stores can be safely deleted without affecting existing or archived orders. And you can still have your 'Orders with ProductX from StoreY'.
Update: Regarding your concern about foreign key. In my opinion foreign key is just a mechanism that enforces long-living Domain relationships at the database level. Since you don't have a domain relationship you don't need the enforcement mechanism as well.
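As a rough illustration of how the SKU supports the reporting query without any object reference to Product, here is a small C# sketch. The member names, the value-based equality on `Sku`, and the `OrderStatistics.TotalQuantity` helper are all assumptions added for the example; price and product name are left out to keep it short, and in practice the query would run against the database rather than an in-memory list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Value object: two Skus are equal when both natural keys match.
public sealed class Sku : IEquatable<Sku>
{
    public string StoreNumber { get; private set; }
    public string ProductId { get; private set; }

    public Sku(string storeNumber, string productId)
    {
        StoreNumber = storeNumber;
        ProductId = productId;
    }

    public bool Equals(Sku other)
    {
        return other != null
            && StoreNumber == other.StoreNumber
            && ProductId == other.ProductId;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Sku);
    }

    public override int GetHashCode()
    {
        return (StoreNumber + "/" + ProductId).GetHashCode();
    }
}

public sealed class OrderLine
{
    public Sku ProductSku { get; private set; }
    public int Quantity { get; private set; }

    public OrderLine(Sku productSku, int quantity)
    {
        ProductSku = productSku;
        Quantity = quantity;
    }
}

public static class OrderStatistics
{
    // Total quantity ordered for one product, found by SKU alone.
    public static int TotalQuantity(IEnumerable<OrderLine> lines, Sku sku)
    {
        return lines.Where(l => l.ProductSku.Equals(sku)).Sum(l => l.Quantity);
    }
}
```

Because the order line holds only the identifier, deleting or changing a Product never touches existing orders, yet the statistics can still be computed per product.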

In this case you need the information for reporting which has nothing to do with the aggregate root.
So the most suitable place for it would be a service. It could be a domain service if it is related to the business, or better an application service such as a querying service, which queries the required data and returns it as DTOs customized for the presentation layer or consumer.
I suggest you create a statistics service that queries the required data using read-only repositories (or preferably Finders) and returns DTOs, instead of corrupting the domain with query models.
