How to use database views in EF Core 3.0? - ef-core-3.0

I know the question was asked before, but at the time it was, we had EF Core 2.x. The short answer was "no you can't" and obviously, not very helpful.
The other answers involved ugly hacks like changing migration files after they were created by the tool.
I make an application Code First. I have my models created with lot's of foreign keys and database joins in mind.
But here comes the unpleasant surprise (I'm a little new to EF): those joins written in LINQ are pretty slow, as a matter of fact they do not produce database join, but fetch whole tables instead.
Of course it's totally unacceptable, I import an old database with millions of records, with the joins I get results in milliseconds, without I get couple of seconds lags - on my very fast internet connection (in real world scenario it would be much worse).
I need views, and AFAIK EF won't create them for me, is it STILL true for EF 3.0?
Then, what would be the best and the most clean way to create views in SQL and to make entities for them? I mean - considering the situation the database models would change over time, and the database structure would have to be updated.
Well, I would prefer doing my joins not in SQL views, just have queries returned "JOIN" statement results. Especially some not obvious joins. Lets say table B has a column being a foreign key referencing table A. I want to get results from table A joining B for details. With normal SQL JOIN performance.
I checked the database: there is no significant performance difference between "select * from A" and "select * from A join B...". In LINQ - the difference is huge.

I figured out that in Code First database views are redundant.
The "views" can be created as models (ordinary classes) having a field or a property set to joined entity. I use private fields for that purpose. Then I use LINQ Join() to create my view entity. The query may refer ONLY to the fields set to joined entities, nothing else. Such query, if written properly translates clearly to SQL JOIN and works with full speed. In my application it's equivalent of a database view.
Why private fields and not properties, you may ask. Maybe because joined entities are "implementation details", but another reason is my presentation code uses reflection to operate on entity public properties, it's good to have those entities hidden from it. Otherwise I would probably need to use attributes to hide those "columns".
BTW, such views can be ordered with OrderBy(), filtered with Where() at virtually no cost. The constraint is to maintain the collection's IQueryable interface, never refer joined entities indirectly. So even if X refers to A.B, never refer X in a LINQ query, always A.B where A is direct entity reference assigned in the Join() query.
To build dynamic queries at runtime one must use expressions.
This set of properties of EF Core 3.0 allows to build a database application without using SQL, but with the full SQL speed maintained. However, the database / entity structure must be relatively simple to achieve that.

Related

PouchDB structure

i am new with nosql concept, so when i start to learn PouchDB, i found this conversion chart. My confusion is, how PouchDB handle if lets say i have multiple table, does it mean that i need to create multiple databases? Because from my understanding in pouchdb a database can store a lot of documents, but a document mean a row in sql or am i misunderstood?
The answer to this question seems to be surprisingly under-documented. While #llabball clearly gave a decent answer, I don't think that views are always the way to go.
As you can read here in the section When not to use map/reduce, Nolan explains that for simpler applications, the key is to abuse _ids, and leverage the power of allDocs().
In other words, if you had two separate types (say artists, and albums), then you could prefix the id of each type to obtain an easily searchable data set. For example _id: 'artist_name' & _id: 'album_title', would allow you to easily retrieve artists in name order.
Laying out the data this way will result in better performance due to not requiring extra indexes, and less code. Clearly however, if your data requirements are more complex, then views are the way to go.
... does it mean that i need to create multiple databases?
No.
... a document mean a row in sql or am i misunderstood?
That's right. The SQL table defines column header (name and type) - that are the JSON property names of the doc.
So, all docs (rows) with the same properties (a so called "schema") are the equivalent of your SQL table. You can have as much different schemata in one database as you want (visit json-schema.org for some inspiration).
How to request them separately? Create CouchDB views! You can get all/some "rows" of your tabular data (docs with the same schema) with one request as you know it from SQL.
To write such views easily the property type is very common for CouchDB docs. Your known name from a SQL table can be your type like doc.type: "animal"
Your view names will be maybe animalByName or animalByWeight. Depends on your needs.
Sometimes multiple-databases plan is a good option, like a database per user or even a database per user-feature. Take a look at this conversation on CouchDB mailing list.

How to implement sub-query in Microstratergy?

Please give me guidance on how to implement the following query in Microstratergy.
SELECT batch_nr,check_nr,update_ts
FROM
claim_financial_transaction_dim a,
(select max(update_ts) update_ts,check_nr,batch_nr from claim_financial_transaction_dim group by check_nr)max where
ROW_END_TS IN ('9999-12-31 00:00:00') AND a.check_nr IN ('045-4254355') and a.update_ts=max.update_ts and
a.check_nr=max.check_nr
Simply put, you don't implement SQL queries in MicroStrategy. You model your business entities in your schema, and MicroStrategy writes the SQL.
There are, however, some exceptions. You can use a Freeform SQL report, which allows you to write the SQL for a report yourself. This is somewhat inflexible, as this report cannot be modified by anyone using it (by, for example, drilling to a different level of data).
Alternatively, you can create a Logical Table in MicroStrategy, which allows you to write a single pass of SQL, and then map schema objects onto it. This SQL will be typically be used as a sub-query in the query MicroStrategy. This is sometimes known as the My DBA Won't Allow Me To Create Views functionality.
It does sound however, that you need to go back and understand how MicroStrategy works fundamentally. If you're working back from a query to MSTR, you're (probably) going about things the wrong way.

CouchDB map/reduce by any document property at runtime?

I come from a SQL world where lookups are done by several object properties (published = TRUE or user_id = X) and there are no joins anywhere (because of the 1:1 cache layer). It seems that a document database would be a good fit for my data.
I am trying to figure-out if there is a way to pass one (or more) object properties to a CouchDB map/reduce function to find matching documents in a database without creating dozens of views for each document type.
Is it possible to pass the desired document property key(s) to match at run-time to CouchDB and have it return the objects that match (or the count of object that match for pagination)?
For example, on one page I want all posts with a doc.user_id of X that are doc.published. On another page I might want all documents with doc.tags[] with the tag "sport".
You could build a view that iterates over the keys in the document, and emits a key of [propertyName, propertyValue] - that way you're building a single index with EVERYTHING prop/value in it. Would be massive, no idea how performance would be to build, and disk usage (probably bad).
Map function would look something like:
// note - totally untested, my CouchDB fu is rusty
function(doc) {
for(prop in doc) {
emit([prop, doc[prop]], null);
}
}
Works for the basic case of simple properties, and can be extended to be smart about arrays, and emit a prop/value pair for each item in the array. That would let you handle the tags.
To query on it, set [prop] as your query key on the view.
Basically, no.
The key difference between something like Couch and a SQL DB is that the only way to query in CouchDB is essentially through the views/indexes. Indexes in SQL are optional. They exist (mostly) to boost performance. For example, if you have a small DB, your app will run just fine on SQL with 0 indexes. (Might be some issue with unique constraints, but that's a detail.)
The overall point being is that part of the query processor in a SQL database includes other methods of data access beyond simply indexes, notably table scans, merge joins, etc.
Couch has no query processor. It has views (defined by JS) used to define B-Tree indexes.
And, that's it. That's the hammer of Couch. It's a good hammer. It's been lasting the data processing world for basically 40 years.
Indexes are somewhat expensive to create in Couch (based on data volume) which is why "temporary views" are frowned upon. And they have a cost in maintenance as well, so views need to be a conscious design element in your database. At the same time, they're a bit more powerful than normal SQL indexes as well.
You can readily add your own query processing on top of Couch, but that will be more work for you. You can create a few select views, on your most popular or selective criteria, and then filter the resulting documents by other criteria in your own code. Yes, you have to do it, so you have to question whether the effort involved is worth more than whatever benefits you feel Couch is offering your (HTTP API, replication, safe, always consistent datastore, etc.) over a SQL solution.
I ran into a similar issue like this, and built a quick workaround using CouchDB-Python (which is a great library). It's not a pretty solution (goes against the principles of CouchDB), but it works.
CouchDB-Python gives you the function "Query", which allows you to "execute an ad-hoc temporary view against the database". You can read about it here
What I have is that I store the javascript function as a string in python, and the concatenate it with variable names that I define in Python.
In some_function.py
variable = value
# Map function (in javascript)
map_fn = """function(doc) {
<javascript code>
var survey_match = """ + variable + """;
<javascript code>
"""
# Iterates through rows
for row in db.query(map_fn):
<python code>
It sure isn't pretty, and probably breaks a bunch of CouchDB philosophies, but it works.
D

SubSonic - Non-Crud Stored Procedures

I want to create a Data Access Layer for a small application. The stored procedures have previously being created and are not basic CRUD ones. Most are quite custom and don't really map one-to-one to tables in the database. I also need concurrency support.
Can SubSonic / SimpleRepository handle this for me?
I don't think SimpleRepository will work well in this situation. You might find the LinqTemplates work well to query the data. Subsonic also does a good job handling sprocs and makes it easy to return datasets, or typed results if you have classes that match the structure of your sproc resultsets.
For example, you can map the results of a sproc to a List like this:
StoredProcedure sproc = _db.GetProductList();
List<Product> products = sproc.ExecuteTypedList<Product>();
All matching columns that can be populated will be.

What is the best way to store and search through object transactions?

We have a decent sized object-oriented application. Whenever an object in the app is changed, the object changes are saved back to the DB. However, this has become less than ideal.
Currently, transactions are stored as a transaction and a set of transactionLI's.
The transaction table has fields for who, what, when, why, foreignKey, and foreignTable. The first four are self-explanatory. ForeignKey and foreignTable are used to determine which object changed.
TransactionLI has timestamp, key, val, oldVal, and a transactionID. This is basically a key/value/oldValue storage system.
The problem is that these two tables are used for every object in the application, so they're pretty big tables now. Using them for anything is slow. Indexes only help so much.
So we're thinking about other ways to do something like this. Things we've considered so far:
- Sharding these tables by something like the timestamp.
- Denormalizing the two tables and merge them into one.
- A combination of the two above.
- Doing something along the lines of serializing each object after a change and storing it in subversion.
- Probably something else, but I can't think of it right now.
The whole problem is that we'd like to have some mechanism for properly storing and searching through transactional data. Yeah you can force feed that into a relational database, but really, it's transactional data and should be stored accordingly.
What is everyone else doing?
We have taken the following approach:-
All objects are serialised (using the standard XMLSeriliser) but we have decorated our classes with serialisation attributes so that the resultant XML is much smaller (storing elements as attributes and dropping vowels on field names for example). This could be taken a stage further by compressing the XML if necessary.
The object repository is accessed via a SQL view. The view fronts a number of tables that are identical in structure but the table name appended with a GUID. A new table is generated when the previous table has reached critical mass (a pre-determined number of rows)
We run a nightly archiving routine that generates the new tables and modifies the views accordingly so that calling applications do not see any differences.
Finally, as part of the overnight routine we archive any old object instances that are no longer required to disk (and then tape).
I've never found a great end all solution for this type of problem. Some things you can try is if your DB supports partioning (or even if it doesn't you can implement the same concept your self), but partion this log table by object type and then you can further partion by date/time or by your object ID (if your ID is a numeric this works nicely not sure how a guid would partion).
This will help maintain the size of the table and keep all related transactions to a single instance of an object to itself.
One idea you could explore is instead of storing each field in a name value pair table, you could store the data as a blob (either text or binary). For example serialize the object to Xml and store it in a field.
The downside of this is that as your object changes you have to consider how this affects all historical data if your using Xml then there are easy ways to update the historical xml structures, if your using binary there are ways but you have to be more concious of the effort.
I've had awsome success storing a rather complex object model that has tons of interelations as a blob (the xml serializer in .net didn't handle the relationships btw the objects). I could very easily see myself storing the binary data. A huge downside of storing it as binary data is that to access it you have to take it out of the database with Xml if your using a modern database like MSSQL you can access the data.
One last approach is to split the two patterns, you could define a Difference Schema (and I assume more then one property changes at a time) so for example imagine storing this xml:
<objectDiff>
<field name="firstName" newValue="Josh" oldValue="joshua"/>
<field name="lastName" newValue="Box" oldValue="boxer"/>
</objectDiff>
This will help alleviate the number of rows, and if your using MSSQL you can define an XML Schema and get some of the rich querying ability around the object. You can still partition the table.
Josh
Depending on the characteristics of your specific application an alternative approach is to keep revisions of the entities themselves in their respective tables, together with the who, what, why and when per revision. The who, what and when can still be foreign keys.
Although I would be very careful to use this approach, since this is only viable for applications with a relatively small amount of changes per entity/entity type.
If querying the data is important I would use true Partitioning in SQL Server 2005 and above if you have enterprise edition of SQL Server. We have millions of rows partitioned by year down to day for the current month - you can be as granular as your application demands with a maximum number of 1000 partitions.
Alternatively , if you are using SQL 2008 you could look into filtered indexes.
These are solutions that will enable you to retain the simplified structure you have whilst providing the performance you need to query that data.
Splitting/Archiving older changes obviously should be considered.

Resources