Azure table storage - pattern for parent-child (self referencing schema)

I'm updating an app to use Windows Azure Table Storage (WATS). I've read many articles, but I'm not sure of the best approach for a parent-child relationship in a self-referencing model.
That is, a single parent message could have many child sub-messages. In a relational DB model this would be a self-referencing table.
How should I structure this for WATS so that the query "Give me 10 parent records" also returns all the child messages belonging to each parent?
The message / sub-message entity is below, along with my attempt at defining the PK and RK:
public class TextCacheEntity : AzureTableEntity // custom table inherits AzureTableEntity
{
public override void GenerateKeys()
{
PartitionKey = string.Format("{0}_{1}_{2}", MessageType, AccountId.PadThis(), ParentMessageId );
RowKey = string.Format("{0}_{1}", DateOfMessage.Ticks.ReverseTicks(), MessageId);
}
public string MessageType { get; set; }
public int AccountId { get; set; }
public DateTime DateOfMessage { get; set; }
public string MessageId { get; set; }
public string ParentMessageId { get; set; }
// other properties...
}
My idea is that each child message stores its parent's ID in ParentMessageId, while a parent's ParentMessageId is empty.
The pattern would then be:
Get the parent messages:
.Where(o => o.PartitionKey == "Parent_000000000000001_").Take(10)
Get the child messages by iterating over the parent messages in a parallel for loop:
.Where(o => o.PartitionKey == "Child_000000000000001_" + parentMessageId)
The problem is that this results in 11 queries!

See this example by Scott Densmore:
http://scottdensmore.typepad.com/blog/2011/04/multi-entity-schema-tables-in-windows-azure.html

You can do this by using the same PK for both. There are a couple of reasons to do this, but one good one is that you can then issue batch commands for parent and children at once and get a kind of consistent transaction. Also, when they share the same PK within the same table, they are colocated and served from the same partition. You are less likely to get continuation tokens (but you should still expect them). To differentiate between parent and children you can either add an attribute or encode the distinction in the RowKey.
The only trick to this (and to the model you already have) is that if the parent and children are not the same CLR type, you will have issues with serialization in WCF Data Services. You can fix this either by creating an uber-CLR type that has both child and parent properties, or by overriding serialization with the ReadingEntity event and handling it yourself.
In short, use the same PK for both children and parent. Then when you search PK ranges you will always get parents and children returned together (you can discriminate with a Where clause predicate if you wish).
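As a rough illustration of the shared-PK idea, here is a minimal in-memory sketch. It is not the answerer's code and it does not call the Azure SDK: MessageRow, ThreadKeys and the exact key layout (account as the PK, parent key as a RowKey prefix for its children) are assumptions made for the example, and it assumes message IDs never contain '_'.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a table row; not an Azure SDK type.
public sealed record MessageRow(string PartitionKey, string RowKey, string Body);

public static class ThreadKeys
{
    // Newer dates produce smaller strings, so newest rows sort first.
    public static string ReverseTicks(DateTime utc) =>
        (DateTime.MaxValue.Ticks - utc.Ticks).ToString("D19");

    // Assumption: one partition per account, shared by parents and children.
    public static string Partition(int accountId) => accountId.ToString("D15");

    public static string ParentRow(DateTime parentDate, string parentId) =>
        $"{ReverseTicks(parentDate)}_{parentId}";

    // A child's RowKey starts with its parent's RowKey, so a parent and its
    // children share a common prefix inside the partition.
    public static string ChildRow(DateTime parentDate, string parentId,
                                  DateTime childDate, string childId) =>
        $"{ParentRow(parentDate, parentId)}_{ReverseTicks(childDate)}_{childId}";

    // The first two '_'-separated segments identify the thread (parent).
    public static string ThreadOf(string rowKey)
    {
        var parts = rowKey.Split('_');
        return parts[0] + "_" + parts[1];
    }

    // Simulates one partition query followed by client-side grouping:
    // returns the newest `count` parents, each followed by its children.
    public static List<List<MessageRow>> NewestThreads(
        IEnumerable<MessageRow> partitionRows, int count) =>
        partitionRows
            .OrderBy(r => r.RowKey, StringComparer.Ordinal)
            .GroupBy(r => ThreadOf(r.RowKey))
            .Take(count)
            .Select(g => g.ToList())
            .ToList();
}
```

With a layout like this, the parent is always the first row of its group because its RowKey is a strict prefix of its children's keys. Against the real service you would still need to page with continuation tokens until enough parents have been seen, since a row limit counts rows, not parents.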

Related

Entity Framework Linking tables

I'm using Entity Framework 5.0.
Scenario
An "Organisation" has a list of "Clients", a list of "Periods" and a "CurrentPeriodID". At the start of each period, some or all of the "Clients" are associated with that "Period". I have done this with a link table and it works, so "Organisation->Period->Clients" gives me the list of "Clients" for the "Period".
Next I need to add some objects ("Activities") to the "Clients" for a "Period", giving "Organisation->Period->Client->Activities". This won't be the only one: eventually several other navigation properties will need to be added to the "Clients" and the "Activities", and all of them have to be "Period" related. I will also need (if possible) "Organisation->Period->Activities".
Question
What would be the best way of implementing the "Activities" for "Organisation->Period->Client"? I don't mind how it is done (Code First, reverse engineering, etc.). Also, when the "Organisation" object is created, could I load the current "Period" object using the "CurrentPeriodID" value stored in the "Organisation" object?
Thanks
To me this sounds like you need an additional entity that connects Period, Client and Activity, let's call it ClientActivityInPeriod. This entity - and the corresponding table - would have three foreign keys and three references (and no collections). I would make the primary key of that entity a composition of the three foreign keys because that combination must be unique, I guess. It would look like this (in Code-First style):
public class ClientActivityInPeriod
{
[Key, ForeignKey("Period"), Column(Order = 1)]
public int PeriodId { get; set; }
[Key, ForeignKey("Client"), Column(Order = 2)]
public int ClientId { get; set; }
[Key, ForeignKey("Activity"), Column(Order = 3)]
public int ActivityId { get; set; }
public Period Period { get; set; }
public Client Client { get; set; }
public Activity Activity { get; set; }
}
All three foreign keys are required (because the properties are not nullable).
Period, Client and Activity can have collections referring to this entity (but they don't need to), for example in Period:
public class Period
{
[Key]
public int PeriodId { get; set; }
public ICollection<ClientActivityInPeriod> ClientActivities { get; set; }
}
You can't have navigation properties such as a collection of Clients in Period that would contain all clients that have any activities in the given period, because that would require a foreign key from Client to Period or a many-to-many link table between Client and Period. The foreign key or link table would only be populated if the client has activities in that Period. Neither EF nor the database is going to help you with such business logic. You would have to program this yourself and ensure that the relationship is updated correctly whenever activities are added to or removed from the period, which is error prone and a risk to your data consistency.
Instead you would fetch the clients that have activities in a given period (period 1 here) with a query rather than a navigation property, for example:
var clientsWithActivitiesInPeriod1 = context.Periods
.Where(p => p.PeriodId == 1)
.SelectMany(p => p.ClientActivities.Select(ca => ca.Client))
.Distinct()
.ToList();

EF - One to one relationship

I have the following class:
public class FinanceiroLancamento
{
/// <summary>Identifier</summary>
public override int Id { get; set; }
/// <summary>Cash-register entry</summary>
public FinanceiroLancamentoCaixa FinanceiroLancamentoCaixa { get; set; }
}
public class FinanceiroLancamentoCaixa
{
/// <summary>Identifier</summary>
public override int Id { get; set; }
/// <summary>Identifier of the financial entry</summary>
public int IdFinanceiroLancamento { get; set; }
}
When I try to map this and run the migration, it returns:
Property name 'IdFinanceiroLancamento' was already defined.
To work around the problem I had to comment out IdFinanceiroLancamento and map it like this:
HasRequired(e => e.FinanceiroLancamentoCaixa)
.WithRequiredPrincipal()
.Map(m => m.MapKey("IdFinanceiroLancamento"));
The question is:
How can I map this FK (FinanceiroLancamento -> FinanceiroLancamentoCaixa) while keeping the "IdFinanceiroLancamento { get; set; }" property?
This is very important in my case because I use it later in the class.
PS: FinanceiroLancamento does not need a FinanceiroLancamentoCaixa, but when a FinanceiroLancamentoCaixa exists it needs a FinanceiroLancamento.
Best regards.
Wilton Ruffato Wonrath
Entity Framework requires that 1:1 mappings share the same primary key. In your case, you are trying to use a different member as the mapping id. Also, do not override the base class id, just inherit it.
What you want is this:
.HasRequired(e => e.FinanceiroLancamentoCaixa)
.WithRequiredPrincipal();
Entity Framework does not allow you to use a 1:1 that is not a shared primary key, so you can't do it in EF. If you absolutely need this, you may have to do it as a stored procedure and call it from EF.
The reason you can't have a 1:1 like this is because the data model allows you to set IdFinanceiroLancamento to the same ID in more than one record, thus breaking your 1:1.
Basically, EF will not let you create a model whose mapping could be violated. Even if you never create duplicates, the possibility is still there. EF doesn't know about unique constraints either, so adding a unique constraint won't tell EF that it's OK.
If you'd like to see this feature, I suggest you vote for it at the EF uservoice:
http://data.uservoice.com/forums/72025-entity-framework-feature-suggestions/suggestions/1050579-unique-constraint-i-e-candidate-key-support

Azure table storage inverse relationship

I am using azure table storage (Note: NOT Azure SQL) and I have the following situation:
In my application I have a number of organisations that 'invite' users, and on the invite there is an associated 'Role' and 'Expiry'. Once the organisation has invited a user I want the org to see the list of users that they have invited, and I want the user to see a list of organisations that they have been invited to.
I think in my application and this case, that there would be low numbers (ie an org would only invite a few users and a user will generally only be invited by one org). However is there a general pattern that people use to deal with this situation even with very large numbers?
I have three approaches that I currently use, depending on my needs:
Transactional
I store the forward and inverse relationship on the same partition... this means that EVERY entity is on the same partition (ie this method is rate limited by a single partition), but it means you can use a batch transaction to insert the forward and inverse relationship at the same time which means that you know they will always be correct.
public class OrganisationInvite : TableEntity
{
// Partition Id - string.Empty
// Row Id - "Invite_" + OrganisationId + "_" + UserId
public string Role { get; set; }
public DateTime Expiry { get; set; }
}
public class OrganisationRequest : TableEntity
{
// Partition Id - string.Empty
// Row Id - "Request_" + UserId + "_" + OrganisationId
public string Role { get; set; }
public DateTime Expiry { get; set; }
}
To query I use a t.RowKey.StartsWith("Request_...") or t.RowKey.StartsWith("Invite_...") depending on whether I want to get a list of a user/org invites.
I guess this is best used when the data is very critical.
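A note on the StartsWith queries above: Table storage filters are built from comparison operators, so a prefix match is commonly expressed as a key range (RowKey >= prefix and RowKey < the next string after the prefix). The sketch below shows one way to compute that range. PrefixRange and its method names are made up for the example, and the ordinal comparison is an assumption about how the keys are ordered.

```csharp
using System;

public static class PrefixRange
{
    // Smallest string that sorts after every string starting with `prefix`
    // (ordinal order): bump the last character by one.
    public static string UpperBound(string prefix)
    {
        if (string.IsNullOrEmpty(prefix))
            throw new ArgumentException("Prefix must not be empty.", nameof(prefix));
        char last = prefix[^1];
        if (last == char.MaxValue)
            throw new ArgumentException("Last character cannot be incremented.", nameof(prefix));
        return prefix[..^1] + (char)(last + 1);
    }

    // Client-side equivalent of the range filter, useful for checking the bounds.
    public static bool InRange(string rowKey, string prefix) =>
        string.CompareOrdinal(rowKey, prefix) >= 0 &&
        string.CompareOrdinal(rowKey, UpperBound(prefix)) < 0;

    // OData-style filter text for the same range; single quotes are doubled.
    public static string ToFilter(string prefix)
    {
        static string Quote(string s) => "'" + s.Replace("'", "''") + "'";
        return $"RowKey ge {Quote(prefix)} and RowKey lt {Quote(UpperBound(prefix))}";
    }
}
```

Including the trailing "_" in the prefix matters: "Invite_org1_" matches only org1's invites, whereas "Invite_org1" would also match org10, org11 and so on.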
Eventual Consistency
I give both entities all the properties, but they live on different partitions. This gives you much better scalability, but you lose the transaction. I use a messaging queue to update the inverse relationship to match the forward relationship, so eventually the data will match (but for a while it may not).
// Assume both in the same table, thus the prefix on partition
public class OrganisationInvite : TableEntity
{
// Partition Id - "Invite_" + OrganisationId
// Row Id - UserId
public string Role { get; set; }
public DateTime Expiry { get; set; }
}
public class OrganisationRequest : TableEntity
{
// Partition Id - "Request_" + UserId
// Row Id - OrganisationId
public string Role { get; set; }
public DateTime Expiry { get; set; }
}
To query I use t.PartitionKey == "Request_..." or t.PartitionKey == "Invite_..." depending on whether I want the list of invites for a user or for an org. You might treat one of these as the 'source of truth', so that when a user accepts the invite you look up the 'source of truth' and give the user that role, etc.
This is the most scalable solution, and especially makes sense if you are using caching on top of it.
Source of truth
In this case I only give the properties on one entity, and only have the keys of the inverse relationship on the other. You would add the entities to the list that is longest or is queried the most... in this case I would say it is the invites for an org. Like the eventual consistency method you would queue the inverse relationship to add the inverse entity. This method gives you complete data consistency except for when you add a new relationship (as there is a bit of time before the inverse relationship is created), and is highly scalable - there is a higher cost to read the inverse list though.
// Assume both in the same table, thus the prefix on partition
public class OrganisationInvite : TableEntity
{
// Partition Id - "Invite_" + OrganisationId
// Row Id - UserId
public string Role { get; set; }
public DateTime Expiry { get; set; }
}
public class OrganisationRequest : TableEntity
{
// Partition Id - "Request_" + UserId
// Row Id - OrganisationId
}
You can trivially query the forward relationship using t.PartitionKey == "Invite_...". The inverse relationship is not trivial, though. You have to query using t.PartitionKey == "Request_..." and then make n parallel calls to fetch each item's forward data (in this case using the org id found in the inverse relationship's RowKey). If the forward item does not exist, you do not add it to your final list. This ensures that if, for example, the org changes the role, the user will see the change on the next hit.
I think this method is useful if the inverse relationship is used rarely and it is critical that the data is up to date (I'm thinking user permissions etc?)
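The inverse read in the 'source of truth' approach can be sketched in memory as follows. This is only an illustration under assumptions: InviteStore, Invite and the method names are invented for the example, the dictionary and set stand in for the table, and GetForwardAsync stands in for an asynchronous point lookup against the real service.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public sealed record Invite(string OrganisationId, string UserId, string Role, DateTime Expiry);

// Hypothetical in-memory stand-in for the table; not an Azure SDK type.
public sealed class InviteStore
{
    // Forward rows ("Invite_" + org, user) carry the data.
    private readonly Dictionary<(string Pk, string Rk), Invite> _forward = new();
    // Inverse rows ("Request_" + user, org) carry keys only.
    private readonly HashSet<(string Pk, string Rk)> _inverse = new();

    public void UpsertForward(Invite invite) =>
        _forward[("Invite_" + invite.OrganisationId, invite.UserId)] = invite;

    public void AddInverse(string userId, string organisationId) =>
        _inverse.Add(("Request_" + userId, organisationId));

    // Stand-in for an async point lookup by PartitionKey + RowKey.
    private Task<Invite?> GetForwardAsync(string organisationId, string userId) =>
        Task.FromResult(
            _forward.TryGetValue(("Invite_" + organisationId, userId), out var invite)
                ? invite
                : null);

    // Read the user's inverse keys, look up every forward row in parallel,
    // and drop inverse entries whose forward row no longer exists.
    public async Task<List<Invite>> InvitesForUserAsync(string userId)
    {
        var organisationIds = _inverse
            .Where(key => key.Pk == "Request_" + userId)
            .Select(key => key.Rk)
            .ToList();

        var results = await Task.WhenAll(
            organisationIds.Select(orgId => GetForwardAsync(orgId, userId)));

        return results.Where(r => r is not null).Select(r => r!).ToList();
    }
}
```

Because the role and expiry are only ever read from the forward row, a stale inverse entry costs one wasted lookup but never shows outdated data to the user.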

DDD: Modeling simple domain with two aggregate roots

Let's say I want to create an auction web site where members can bid for items. To model this domain I have three classes: Member, Item and Bid.
My brainstorming would go something like this:
Item can contain multiple bids
Bid is associated with one Item and one Member
Member can contain multiple bids
Member and Item can exist without bid instance
Bid instance can't exist without both Member and Item
Considering all this, it is obvious that since Member and Item objects are independent we can consider them aggregate roots. Bid will be part of one of these aggregates. That much is clear, but what confuses me is which aggregate root I should choose: Item or Member?
This example is from the Pro ASP.NET MVC 3 Framework book by Apress, which models the domain with the following code:
public class Member
{
public string LoginName { get; set; } // The unique key
public int ReputationPoints { get; set; }
}
public class Item
{
public int ItemID { get; private set; } // The unique key
public string Title { get; set; }
public string Description { get; set; }
public DateTime AuctionEndDate { get; set; }
public IList<Bid> Bids { get; set; }
}
public class Bid
{
public Member Member { get; set; }
public DateTime DatePlaced { get; set; }
public decimal BidAmount { get; set; }
}
Member and Item are aggregate roots here and Bid is contained within Item.
Now let's say that I have the application use case "Get all bids posted by specific member". Does that mean I would have to first get all Items (e.g. from the database via a repository interface) and then enumerate all bids for each Item, trying to find the matching Member? Isn't that a bit inefficient? A better way, then, would be to aggregate Bid objects inside Member. But in that case consider another use case: "Get all bids for specific item". Now we have to go the other way around to get all the bids...
So taking into account that I need to implement both of these use cases in my application, what is the right and efficient way to model this domain then?
Your domain should really reflect only Command (CQRS) requirements (updating/changing data). I presume you need Queries (reading data, with no update/change): "Get all bids for specific item" and "Get all bids posted by specific member". This "querying" has nothing to do with the domain, because the query implementation is independent of the command implementation (a command calls a domain method). That gives you the freedom to implement each query efficiently. My approach is to implement an efficient DB view that returns only the data you want to display in the UI. Then you create a new class called BidForItemDto (DTO = data transfer object) and map the data from the DB view into a collection of BidForItemDto (you can do it manually via ADO.NET or use NHibernate, which I prefer because it does everything for you). Do the same for the second query with a new class called BidPostedByMemberDto.
So, if it is queries you need, forget about the domain, realize that it's just data you want to display in the UI, and query it efficiently from the DB. Only when you perform some action in the UI (clicking a button to place a bid, for instance) does it execute a command, "place a bid", which in the end calls the domain method Item.PlaceBid(Member member, DateTime date, decimal amount). And by the way, IMHO it is an Item which "has many bids", and the domain method "place bid" would surely need to access previous bids to implement the whole logic correctly. Placing the bids collection in Member does not make much sense to me...
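To make the command side concrete, here is a minimal sketch of Item.PlaceBid as the only way to add a bid. The business rules (the auction must still be open and the amount must beat the current highest bid) are assumptions made for the example and are not taken from the book or the question. The constructor shape is also hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record Member(string LoginName);

public sealed record Bid(Member Member, DateTime DatePlaced, decimal BidAmount);

public sealed class Item
{
    private readonly List<Bid> _bids = new();

    public Item(int itemId, string title, DateTime auctionEndDate)
    {
        ItemID = itemId;
        Title = title;
        AuctionEndDate = auctionEndDate;
    }

    public int ItemID { get; }
    public string Title { get; }
    public DateTime AuctionEndDate { get; }

    // Read-only view: callers cannot add bids except through PlaceBid.
    public IReadOnlyList<Bid> Bids => _bids;

    // Domain method invoked by the "place a bid" command.
    public void PlaceBid(Member member, DateTime date, decimal amount)
    {
        if (member is null) throw new ArgumentNullException(nameof(member));
        if (date > AuctionEndDate)
            throw new InvalidOperationException("The auction has ended.");

        // Assumed rule: the rule needs the previous bids, which is why
        // the bids live inside the Item aggregate.
        decimal highest = _bids.Count == 0 ? 0m : _bids.Max(b => b.BidAmount);
        if (amount <= highest)
            throw new InvalidOperationException("Bid must exceed the current highest bid.");

        _bids.Add(new Bid(member, date, amount));
    }
}
```

The read side never goes through this class: "all bids by member" and "all bids for item" stay as plain queries against views, so the aggregate only has to be shaped for the rule it enforces.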
Off the top of my head, some examples of DB views and SQL queries:
Get all bids for specific item:
create view BidForItemDto
as
select
i.ItemId,
b.BidId,
b.MemberId,
b.DatePlaced,
b.BidAmount
from Item i
join Bid b ON b.ItemId = i.ItemId
query:
SELECT *
from BidForItemDto
where ItemId = <provide item id>
Get all bids posted by specific member:
create view BidPostedByMemberDto
as
select
m.MemberId,
b.BidId,
b.ItemId,
b.DatePlaced,
b.BidAmount
from Member m
join Bid b ON b.MemberId = m.MemberId
query:
SELECT *
from BidPostedByMemberDto
where MemberId = <provide member id>

How to embed documents in Azure Table Storage

I'd like to be able to store objects that have child objects in Azure Table storage using a structure like so:
public class AzureTestDocument : TableServiceEntity
{
public AzureTestDocument(int counter)
: base("_default", counter.ToString())
{
Counter = counter;
Child = new AzureTestChildDocument(counter);
}
public int Counter { get; set; }
public AzureTestChildDocument Child { get; set; }
}
public class AzureTestChildDocument
{
public AzureTestChildDocument(int counter)
{
Counter = counter;
}
public int Counter { get; set; }
}
Saving the parent document if I remove the child document works fine. Saving a structure like this results in a "One of the request inputs is not valid" exception. Doing a little googling turned up this article about supported types which may mean you can't embed any types other than that short list of supported ones.
Please clarify if this is the case or point me towards what I may be missing.
Azure Table Storage only supports entities whose properties are primitive types, so nested child objects have to be handled some other way:
You can serialize the child objects into strings and save those strings as properties.
Alternatively, you can save the child objects as individual rows in Azure tables.
Alternatively, if you're dealing with documents, you can save those objects in Azure BLOB storage.
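The first option (serializing the child into a string property) could look roughly like this. It is a sketch under assumptions: FlatDocument is a plain class standing in for the table entity rather than a type derived from TableServiceEntity, ChildJson is an invented property name, and System.Text.Json is used simply because it ships with current .NET.

```csharp
#nullable enable
using System;
using System.Text.Json;

public sealed class ChildDocument
{
    public int Counter { get; set; }
}

// Hypothetical flat entity: every property is a primitive or a string.
public sealed class FlatDocument
{
    public string PartitionKey { get; set; } = "_default";
    public string RowKey { get; set; } = "";
    public int Counter { get; set; }

    // The nested object travels as JSON text in an ordinary string property.
    public string ChildJson { get; set; } = "";

    public static FlatDocument From(int counter, ChildDocument child) => new()
    {
        RowKey = counter.ToString(),
        Counter = counter,
        ChildJson = JsonSerializer.Serialize(child),
    };

    // Rebuilds the child after the entity has been read back.
    public ChildDocument? ReadChild() =>
        string.IsNullOrEmpty(ChildJson)
            ? null
            : JsonSerializer.Deserialize<ChildDocument>(ChildJson);
}
```

The trade-off is that the child's fields cannot be filtered on server-side, and string properties have a size limit, so large children are better suited to separate rows or blobs.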
