I have several objects (Product, Rule, PriceDetail, etc.) that manage and store information in a CRUD application. I want a way to keep a log of when the data is updated, and to that end I've created an Update class, referenced as ICollection<Update> Updates within each data class.
When the tables are all generated, EF creates a FK column in the Updates table for each class (Product_ID, Rule_ID, etc.). This seems horribly inefficient. Could I use a two-field key, such as an enum ObjectType and a long ID? Alternatively, could I use a string ID and force a pattern where the first N characters of the string identify the referencing object? If the latter, can the database auto-increment the string value?
Here's some example code, trimmed for placement here:
public class Update
{
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public long ID { get; set; }
    public string Reason { get; set; }
    public DateTime TimeOfUpdate { get; set; }
    public long Product_ID { get; set; }
    public long Rule_ID { get; set; }
}
public class Product
{
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public long ID { get; set; }
    public string Name { get; set; }
    public PriceDetail Price { get; set; }
    public ICollection<Update> Updates { get; set; }
}
public class Rule
{
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public long ID { get; set; }
    public string Name { get; set; }
    public ICollection<Condition> Conditions { get; set; }
    public ICollection<Update> Updates { get; set; }
}
There are multiple ways of handling auditing logic.
Do you anticipate storing update history for every table? If it's going to be limited to a few tables, your design might work fine. If, however, you want to track updates across many tables, you might want to try one of the options below.
1. Use three tables (Products, Updates and ProductUpdates). The Products table always has the latest data. The Updates table gets a new row capturing the update timestamp every time an entry in Products is updated. ProductUpdates has a foreign key to the Updates table and holds the old row from the Products table. This way you know exactly what the row looked like at any point in time. Extending it to any other table X requires adding an XUpdates table, but you wouldn't have the unnecessary 50 foreign keys that you mentioned.
2. Add IsActive, UpdatedBy, UpdatedTimestamp, etc. columns to the tables that will be updated. Every time you update a row, you mark it as inactive and insert a new row with the latest data. You can store the reason and rule columns as well if needed.
3. Redesign your entities so that their primary key is a foreign key to your Updates table. This eliminates the inelegance of the previous solutions. Every time you update, you insert a row in the Updates table and use the generated Id as the primary key of a new row in your Products table.
Entity Framework can help you automate the process laid out in options 2 and 3. The basic idea would be to intercept the save requests for updates and force an update and insert instead.
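For illustration, here is a minimal sketch of automating option 2 by overriding SaveChanges. It assumes a DbContext named ShopContext and that Product has been given the IsActive and UpdatedTimestamp columns described in that option; it is a sketch, not a production-ready implementation.

using System;
using System.Data.Entity;
using System.Linq;

public class ShopContext : DbContext
{
    public DbSet<Product> Products { get; set; }

    public override int SaveChanges()
    {
        // Catch every Product that is about to be updated.
        var modified = ChangeTracker.Entries<Product>()
            .Where(e => e.State == EntityState.Modified)
            .ToList();

        foreach (var entry in modified)
        {
            // Insert a fresh row carrying the new values; the identity column gets a new key.
            var newRow = (Product)entry.CurrentValues.ToObject();
            newRow.ID = 0;
            newRow.UpdatedTimestamp = DateTime.UtcNow;   // assumed audit column from option 2
            Products.Add(newRow);

            // Revert the tracked entry to its old values and mark it inactive,
            // so the original row is preserved as history.
            entry.CurrentValues.SetValues(entry.OriginalValues);
            entry.Entity.IsActive = false;               // assumed audit column from option 2
        }

        return base.SaveChanges();
    }
}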
Lastly, you might also be able to use CLR triggers to have the audit functionality you want.
Each solution has its pros and cons. The best solution for you would depend upon your specific use case.
I have a collection where I am storing each asset's latest location and the timestamp at which it was recorded, using the following class:
public class TrackingInfo
{
    [JsonProperty("id")]
    public string Id { get; set; }

    [JsonProperty("_partition_key")]
    public string _PartitionKey { get; set; }

    [JsonProperty("asset_id")]
    public string AssetId { get; set; }

    [JsonProperty("unix_timestamp")]
    public double UnixTimestamp { get; set; }

    [JsonProperty("timestamp")]
    public string Timestamp { get; set; }

    [JsonProperty("location")]
    public Point Location { get; set; }
}
which is partitioned by _PartitionKey, which is constructed like this:
tracking._PartitionKey = $"tracking_{tracking.AssetId.ToLower()}_{DateTime.Today.ToString("D")}";
It looks like there is no way to do a GROUP BY on the collection.
Can someone please help me create a SQL document query to find the latest entry for each AssetId, along with its Location and the Timestamp when the data was recorded?
Update 1:
What if I change the _PartitionKey to represent one partition per day, something like this:
tracking._PartitionKey = $"tracking_{DateTime.Today.ToString("D")}";
Would that make it easier to get all assets and their latest tracking records?
As per my comment, my suggestion would be to solve your problem differently.
Assumption: You have a large number of assetIds and don't know the values beforehand:
- Have one document that represents the latest state of your asset
- Have another document that represents the location events of your asset
- Update the first document whenever there is a new location event
You can put both types of documents in the same collection or separate them - both approaches have benefits. I would probably separate them.
Then do a query "what assets are within 1km of xxx" (Querying spatial types)
Sidenote: it might be a good idea to use the assetId as the partition key instead of your combined key. A combined key like that is very bad for queries, since you can rarely target a single partition with it.
If you only have very few assetIds, you can query each one individually, ordering by the timestamp field descending and taking the top result, which returns only the latest item.
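For example, a per-asset lookup for the latest entry could look like the following sketch using the DocumentDB SDK. The database name, collection name and asset id are placeholders, and it assumes unix_timestamp is range-indexed so it can be used in ORDER BY.

using System.Linq;
using Microsoft.Azure.Documents.Client;

// Placeholders: "db1", "tracking" and "asset-001" stand in for your own names.
var latest = client.CreateDocumentQuery<TrackingInfo>(
        UriFactory.CreateDocumentCollectionUri("db1", "tracking"),
        "SELECT TOP 1 * FROM c WHERE c.asset_id = 'asset-001' ORDER BY c.unix_timestamp DESC",
        new FeedOptions { EnableCrossPartitionQuery = true })
    .AsEnumerable()
    .FirstOrDefault();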
Cosmos DB doesn't support a group by feature; you could vote up the feature request for it.
As a workaround, there is a third-party package, documentdb-lumenize, which supports group by; it has a .NET example:
string configString = @"{
    cubeConfig: {
        groupBy: 'state',
        field: 'points',
        f: 'sum'
    },
    filterQuery: 'SELECT * FROM c'
}";
Object config = JsonConvert.DeserializeObject<Object>(configString);
dynamic result = await client.ExecuteStoredProcedureAsync<dynamic>("dbs/db1/colls/coll1/sprocs/cube", config);
Console.WriteLine(result.Response);
You could group by the asset_id column and take the max timestamp.
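Adapting the sample config to this question, grouping by asset_id and taking the max of unix_timestamp might look like the sketch below. It assumes the Lumenize cube accepts 'max' as an aggregation function, so check the package docs before relying on it.

// Sketch only: 'max' as the aggregation function is an assumption; db/collection/sproc paths are from the sample above.
string configString = @"{
    cubeConfig: {
        groupBy: 'asset_id',
        field: 'unix_timestamp',
        f: 'max'
    },
    filterQuery: 'SELECT * FROM c'
}";
Object config = JsonConvert.DeserializeObject<Object>(configString);
dynamic result = await client.ExecuteStoredProcedureAsync<dynamic>("dbs/db1/colls/coll1/sprocs/cube", config);
Console.WriteLine(result.Response);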
So, a quick update on why I created this question.
We are currently storing the telemetry data from our devices in the field in Azure SQL Server. This is working great (I have a ton of experience with EF, LINQ and relational DBs), BUT I am aware that this most likely isn't the best solution, especially for storing "big" data (the data is still small for now but will grow within a year).
I have chosen DocumentDB as our possible solution for storing just our event history. The rest (users, profiles, device info, sim, vehicle, etc.) will stay in SQL, as I don't want to completely halt development while we move 100% across to DocumentDB; I'd rather just do what's best short term for cost + performance.
Going through this video I finally came up with a possible solution as to how to store telemetry data - https://www.youtube.com/watch?v=-o_VGpJP-Q0
They recommended one document per time period (the example used one per hour). Is this still the recommended approach?
[Index]
public DateTime TimestampUtc { get; set; }
public DateTime ReceivedTimestampUtc { get; set; }
[Index]
public EventType EventType { get; set; }
public Guid ConnectionId { get; set; }
public string RawEventMessage { get; set; }
[Index]
public Sender Sender { get; set; }
[Index]
public Channel Channel { get; set; }
public DbGeography Location { get; set; }
public double? Speed { get; set; }
public double? Altitude { get; set; }
public Int16? Heading { get; set; }
public Byte? HDOP { get; set; }
public Byte? GPSFixStatus { get; set; }
public Byte? GPSFixType { get; set; }
public string Serial { get; set; }
public string HardwareVersion { get; set; }
public string FirmwareVersion { get; set; }
public string Relay1 { get; set; }
public string Relay2 { get; set; }
public string Relay3 { get; set; }
public string Ign { get; set; }
public string Doors { get; set; }
public string Input1 { get; set; }
public string Input2 { get; set; }
public string Out1 { get; set; }
public string Out2 { get; set; }
public int V12 { get; set; }
public int VBat { get; set; }
That's one of several possible alternatives. Which is best depends on what your data looks like. For instance, if you have events that vary in their start date/time and duration (or end date/time), or if you track all state changes of entities, then something like Richard Snodgrass' temporal data model is ideal. Interestingly, Microsoft SQL Server 2016 recently added direct support for temporal tables, but they've been in the SQL spec as TSQL2 for a while. Note that the TSQL2 spec includes both valid-time and transaction-time support, but I believe the recent SQL Server 2016 addition only supports valid time... which is OK, since that's what is most valuable. I only point it out because getting your head around how a valid-time table works is hard enough without the added complexity of transaction-time.
The beauty of this approach is that you don't have to decide on the needed time granularity as the data is collected, only if/when you aggregate it.
However, as you said, SQL is not ideal for such large data sets. So I've implemented a valid-time, Richard Snodgrass-style temporal model on top of DocumentDB in my Lumenize library, in particular the TimeSeriesCalculator and its other time-series functionality. Read pages 10-19 here for a backgrounder on the data model and common operations in Lumenize time-series analysis. That deck is for an implementation I did while at Rally, called the Lookback API, built on MongoDB, but the concepts are the same and I've since switched to DocumentDB (though Rally hasn't).
Another comment on your proposed model: you might want to consider a separate document for every reading. It's a bit confusing from the example whether there is a document per minute or one per device. If it's one per device per hour, then you can be assured that you'll never go past 60 minutes, which would be OK, but in just about every other way I can think of, it looks like you risk a single document growing unbounded, which is a big no-no in DocumentDB (and all NoSQL data modeling). Also, as you say, even if it isn't unbounded, it would involve a lot of in-place updates. Since your system is likely to be write heavy, I would suggest that you might be better off with a single document per reading. If you have to store denormalized aggregations for speed later on, you still have the option to do that, and you may not even need it; let the performance of the production system inform that decision.
I suggest that you read up on time dimensions for star schemas. It looks a lot like what you are planning, and it's also ideal for the denormalized aggregation storage that I describe. I have not seen any write-ups of star-schema concepts for NoSQL, but here is one from the traditional SQL world that will help you with the concepts.
As I said, there are a lot of alternatives and without knowing more about your situation, I cannot know which is best.
OK, so I think I am going for one document per event (for now one every 5 minutes, but it could change to one per second per device). The reason is that appending to a document should surely be costly, as you need to do a "replace" on that document? (Does DocumentDB support append/partial updates now?) Surely that involves a read and then an ever-growing replace, which would be more costly and slower than just adding a new doc per event. The only concern is when we have millions/billions of documents... is this fine?
Alright, this should be fairly easy.
I would like to persist some records for my module in Orchard (1.7.2) without those records also being a ContentPartRecord.
In other words, I would like to be able to persist in DB the following objects:
public class LogItemRecord
{
    public virtual string Message { get; set; }
}
...which is already mapped onto the DB. But notice that this class does not derive from ContentPartRecord, as it most certainly is not one.
However, when I call the IRepository instance's .Create method, all I get is a lousy NHibernate exception:
No persister for: MyModule.Models.LogItemRecord
...which disappears if I do declare LogItemRecord as inheriting from ContentPartRecord; but trying to persist that, apart from being hacky-tacky, runs into an exception of its own, where NHibernate again justly complains (though in not so many words) that the Id value for the record is zero.
So... how do I play nicely with Orchard and use its API to persist objects of my own that are not ContentParts / ContentItems?
I'm running 1.7.3 (also tested in 1.7.2) and have successfully been able to persist the following class to the DB:
public class ContactRecord
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
    public virtual string JobTitle { get; set; }
    public virtual string Email { get; set; }
    public virtual string Phone { get; set; }
}
Here are the relevant lines from Migrations.cs
SchemaBuilder.CreateTable(
    typeof(ContactRecord).Name,
    table => table
        .Column<int>("Id", col => col.Identity().PrimaryKey())
        .Column<string>("Name")
        .Column<string>("JobTitle")
        .Column<string>("Email")
        .Column<string>("Phone")
);
I'm going to assume that the code you've shown for LogItemRecord is the complete class definition when making the following statement...
I think that any Record class you store in the DB needs an Id property, and that property should be marked as Identity and PrimaryKey in the table definition (as I've done above).
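Applied to the LogItemRecord from the question, that would look roughly like this (a sketch):

public class LogItemRecord
{
    // An identity primary key so the record can be persisted by the repository.
    public virtual int Id { get; set; }
    public virtual string Message { get; set; }
}

And in Migrations.cs:

SchemaBuilder.CreateTable(
    typeof(LogItemRecord).Name,
    table => table
        .Column<int>("Id", col => col.Identity().PrimaryKey())
        .Column<string>("Message")
);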
When you create a *Record class which inherits from ContentPartRecord and set up the table like
SchemaBuilder.CreateTable(
    "YourRecord",
    table => table
        .ContentPartRecord()
        // more column definitions
);
then you get the Id property/PK "for free" by inheritance and calling .ContentPartRecord() in the Migration.
See the PersonRecord in the Orchard Training Demo Module for another example of storing a standard class as a record in the DB.
I'm using Entity Framework 5.0.
Scenario
"Organisation" has a list of "Clients", a list of "Periods", and a "CurrentPeriodID". At the start of each period, some or all of the "Clients" are associated with that "Period". This I have done using a link table and it works OK, so when I do "Organisation->Period->Clients" I get a list of "Clients" for the "Period".
Next I need to add some objects ("Activities") to the "Clients" for a "Period", so I get "Organisation->Period->Client->Activities". This won't be the only one; there will eventually be several other navigation properties that need to be added to the "Clients" and the "Activities", and all of them have to be "Period" related. I also will have to be able to do (if possible) "Organisation->Period->Activities".
Question
What would be the best way of implementing the "Activities" for "Organisation->Period->Client"? I don't mind how it is done (Code First, reverse engineering, etc.). Also, on creation of the "Organisation" object, could I load the current "Period" object using the "CurrentPeriodID" value which is stored in the "Organisation" object?
Thanks
To me this sounds like you need an additional entity that connects Period, Client and Activity, let's call it ClientActivityInPeriod. This entity - and the corresponding table - would have three foreign keys and three references (and no collections). I would make the primary key of that entity a composition of the three foreign keys because that combination must be unique, I guess. It would look like this (in Code-First style):
public class ClientActivityInPeriod
{
    [Key, ForeignKey("Period"), Column(Order = 1)]
    public int PeriodId { get; set; }

    [Key, ForeignKey("Client"), Column(Order = 2)]
    public int ClientId { get; set; }

    [Key, ForeignKey("Activity"), Column(Order = 3)]
    public int ActivityId { get; set; }

    public Period Period { get; set; }
    public Client Client { get; set; }
    public Activity Activity { get; set; }
}
All three foreign keys are required (because the properties are not nullable).
Period, Client and Activity can have collections referring to this entity (but they don't need to), for example in Period:
public class Period
{
    [Key]
    public int PeriodId { get; set; }

    public ICollection<ClientActivityInPeriod> ClientActivities { get; set; }
}
You can't have navigation properties like a collection of Clients in Period that would contain all clients that have any activities in the given period, because it would require a foreign key from Client to Period or a many-to-many link table between Client and Period. That foreign key or link table would only be populated if the client has activities in that Period. Neither EF nor the database is going to help you with such business logic. You would have to program it yourself and ensure that the relationship is updated correctly when activities are added or removed from the period, which is error prone and a risk to your data consistency.
Instead, you would fetch the clients that have activities in a given period (period 1, say) with a query, not a navigation property, for example:
var clientsWithActivitiesInPeriod1 = context.Periods
    .Where(p => p.PeriodId == 1)
    .SelectMany(p => p.ClientActivities.Select(ca => ca.Client))
    .Distinct()
    .ToList();
I am just wondering about when and where tables should be created for a persisted application. I have registered my database connection factory in Global.asax.cs:
container.Register<IDbConnectionFactory>(new OrmLiteConnectionFactory(conn, MySqlDialectProvider.Instance));
I also understand that I need to use the OrmLite API to create tables from the classes I have defined. So for example to create my User class:
public class User
{
    [AutoIncrement]
    public int Id { get; set; }

    public string Name { get; set; }

    [Index(Unique = true)]
    public string Email { get; set; }

    public string Country { get; set; }
    public string passwordHash { get; set; }
    public DateTime Dob { get; set; }
    public Sex Sex { get; set; }
    public DateTime CreatedOn { get; set; }
    public Active Active { get; set; }
}
I would execute the following:
Db.CreateTable<User>(false);
I have a lot of tables that need to be created. Should I create a separate class that first creates all my tables like this, or execute that in each REST call to UserService?
Also, is it possible to create all my tables directly in my database, naming each table after its corresponding class, so that the ORM would match classes to existing tables automatically?
Sorry this has me a bit confused. Thanks for any help you can give me.
I would create them in the AppHost.Configure() which is only run by a single main thread on Startup that's guaranteed to complete before any requests are served.
If you wanted to you can automate this somewhat by using reflection to find all the types that need to be created and calling the non-generic API versions:
db.CreateTable(overwrite: false, typeof(Table1));
db.CreateTable(overwrite: false, new[] { typeof(Table1), typeof(Table2) /*, etc. */ });
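A rough sketch of wiring that up in AppHost.Configure(), using the non-generic overload shown above (the "MyApp.Models" namespace filter is only an illustration; adjust it to wherever your record classes live):

public override void Configure(Funq.Container container)
{
    var dbFactory = new OrmLiteConnectionFactory(conn, MySqlDialectProvider.Instance);
    container.Register<IDbConnectionFactory>(dbFactory);

    // Illustrative filter: pick up every concrete class in the models namespace.
    var tableTypes = typeof(User).Assembly.GetTypes()
        .Where(t => t.IsClass && !t.IsAbstract && t.Namespace == "MyApp.Models")
        .ToArray();

    using (var db = dbFactory.OpenDbConnection())
    {
        db.CreateTable(overwrite: false, tableTypes);
    }
}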
is it possible to create all my tables directly in my database, naming each table with its corresponding class, and then Orm would match classes to existing tables automatically?
You don't have to use OrmLite to create tables. If the table(s) already exist in your MySQL database (or you want to create them using the MySQL interface), you will be able to access them as long as the class name is the same as the table name. If the table names don't match the class names, use the Alias attribute:
[Alias("Users")] // If the table name is Users
public class User
{
    public int Id { get; set; }
}
I wouldn't create the tables in your services. Generally, I would do it in the AppHost.Configure method, which runs when the application starts. Doing this will attempt to create the tables every time your application is started (which could be once a day - see here), so you might want to set a flag in a config file to control whether the table-creation check runs.
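For instance, something along these lines (a sketch; the "CreateTables" appSetting name is just an illustration, and dbFactory is the factory registered in Configure):

// ConfigurationManager lives in System.Configuration.
// Gate table creation behind an appSetting so it doesn't run on every restart.
var createTables = ConfigurationManager.AppSettings["CreateTables"] == "true";
if (createTables)
{
    using (var db = dbFactory.OpenDbConnection())
    {
        db.CreateTable<User>(overwrite: false);
    }
}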