Better performance: MetadataWorkspace or Reflection? - entity-framework-5

Fellows,
Which option performs better for getting entity properties: querying objectContext.MetadataWorkspace.GetItems, or iterating over the properties returned by table.GetProperties() via reflection?

Related

Keyset Pagination for Spring Data JDBC's Pageable

AFAIK, the Pageable class supports only LIMIT/OFFSET-based paging. While that is a fairly universal solution, it comes with some downsides, as outlined here: https://momjian.us/main/blogs/pgblog/2020.html#August_10_2020
Keyset Pagination (aka Seek Method or Cursor-based Pagination) has some benefits in terms of performance and behavior during concurrent data inserts and deletes. For details see
https://use-the-index-luke.com/no-offset
http://allyouneedisbackend.com/blog/2017/09/24/the-sql-i-love-part-1-scanning-large-table/
https://slack.engineering/evolving-api-pagination-at-slack-1c1f644f8e12
https://momjian.us/main/blogs/pgblog/2020.html#August_17_2020
So, are there any plans to support this pagination method, e.g. via a Pageable<KeyType> with a getKey() that then gets incorporated into the SQL's WHERE clause?
This possibility was discussed in the team, and while it is not considered urgent, it is something we would like to offer eventually.
The first step would be to provide support for this in Spring Data Commons, i.e. a persistence-store-independent API. The issue tracking this is DATACMNS-1729.
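For illustration, here is what the keyset idea looks like at the SQL level. This is a minimal sketch, not Spring Data API: the table, columns, and page size are assumptions, and node-postgres merely stands in for whatever data access layer you use.

```typescript
import { Client } from "pg";

interface OrderRow {
  id: number;
  created_at: Date;
}

// Fetch the page that follows the last key the client has already seen.
// (Table and column names are assumptions for illustration.)
async function nextPage(client: Client, lastSeenId: number): Promise<OrderRow[]> {
  // OFFSET paging re-reads and discards every skipped row on each request:
  //   SELECT id, created_at FROM orders ORDER BY id LIMIT 20 OFFSET 100000
  // Keyset paging instead seeks straight past the last key via the index:
  const { rows } = await client.query<OrderRow>(
    "SELECT id, created_at FROM orders WHERE id > $1 ORDER BY id LIMIT 20",
    [lastSeenId],
  );
  return rows;
}
```

Note that the sort key must be unique (or be extended with a unique tiebreaker column) for keyset paging to return every row exactly once.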

Heterogeneous Data Storage in CouchDB

I would like to know the best practices for storing heterogeneous data in CouchDB. In MongoDB you have collections, which help with modelling the data (i.e., typical usage is one document type per collection). What is the best way to handle this kind of requirement in CouchDB? Tagging documents with a _type field? Or is there some other method that I am not aware of?
The main benefit of Mongo's collections is that indexes are defined and calculated per collection. With Couch you have even more freedom and flexibility: each index is defined by a view in map/reduce style, and you limit the data that feeds the index by filtering it in the map function. Because of this flexibility, it is up to you how to distinguish which document belongs to which view.
If you really want the fixed, Mongo-like style of dividing documents into a set of distinct partitions with separate indexes, just create a collection field and never mix two different collections in a single view (see the sketch below). In my opinion, though, rejecting one of the few benefits Couch has over Mongo (which is in general the more powerful and flexible system) does not seem like a good idea.
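As a minimal sketch of that convention, here is a design document whose view only ever sees one "collection". The design-document, view, and field names are assumptions, not a CouchDB API; CouchDB stores map functions as strings of plain JavaScript.

```typescript
// Injected by CouchDB's JavaScript query server at index time.
declare function emit(key: unknown, value: unknown): void;

const usersDesignDoc = {
  _id: "_design/users",
  views: {
    by_email: {
      // Filter on doc.collection so this view never mixes document types.
      map: function (doc: { collection?: string; email?: string }) {
        if (doc.collection === "user" && doc.email) {
          emit(doc.email, null); // index only "user" documents
        }
      }.toString(), // CouchDB expects the map function as a string
    },
  },
};
```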

Using Cassandra as a "schemaless NoSQL database"

I'm looking at using Cassandra for an enterprise website I'm working on, which could be used by up to 250 million users. Cassandra seems like an obvious choice because of the way it scales, although I was a little sad not to be able to use a schemaless database like Couch (for political reasons I won't go into).
I've read that you can still use Cassandra like a schemaless database, using either a super-column or simply serializing objects into normal columns. At the moment I'm using .NET for my front-end.
Are there any libraries out there already that help with using Cassandra in this way?
Has anyone done anything like this already using .NET? Any tips?
Any advice gratefully received!
Thanks,
Steve.
Datomic is schemaless. Attributes are modeled, and generic objects can be created, saved, and queried with any combination of attributes.
http://www.datomic.com
http://docs.datomic.com/storage.html#cassandra
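Back to the "serializing objects into normal columns" idea from the question: super columns are long deprecated, but the serialization approach still works. Here is a sketch, shown with the DataStax Node.js driver purely for illustration (the question asked about .NET, but the idea is the same); the keyspace, table, and column names are assumptions.

```typescript
import { Client } from "cassandra-driver";

const client = new Client({
  contactPoints: ["127.0.0.1"],
  localDataCenter: "datacenter1",
  keyspace: "app",
});

// Assumed schema: CREATE TABLE app.documents (id text PRIMARY KEY, body text);
async function save(id: string, entity: Record<string, unknown>): Promise<void> {
  // The column schema stays fixed; the "schemaless" part lives inside the JSON blob.
  await client.execute(
    "INSERT INTO documents (id, body) VALUES (?, ?)",
    [id, JSON.stringify(entity)],
    { prepare: true },
  );
}
```

The trade-off is that Cassandra can no longer index or filter on fields inside the blob, so anything you want to query by has to be pulled out into a real column.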

In CouchDB, are there ways to improve performance of the View index process?

I have some basic views and some map/reduce views with logic. Nothing too complex. Not too many documents. I've tried with 250k, 75k, and 10k documents. Seems like I'm always waiting for view indexing.
Does better, more efficient code in the views help? I'm assuming the indexer has to run the view over every document and at every level of aggregation, so there should be some room for improvement there.
Does emit()-ing less data help? emit(doc.id, doc) vs specifying fewer fields?
Do more or less complex keys impact view indexing?
Or is it all about memory, CPU cores, and processor speed?
There must be some documentation out there, but I can't find anything referencing ways to improve performance.
I would take a deeper look at the reduce function. Try to use the built-in Erlang functions like _sum and _count instead of writing JavaScript.
Complex views can take hours or more; that's normal.
Maybe post one of your "not too complex" map/reduce views.
And don't forget: indexing all docs is only done once after changing the view (or pushing a whole bunch of new docs). Subsequent new docs are indexed incrementally.
Use a view with &stale=ok to retrieve the "old" data instantly, so you don't have to wait. (But pay attention: you always have to call a view without stale=ok at least once to trigger the indexing process). Or better: use stale=update_after.
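Putting that advice together, here is a minimal sketch of a view that emits as little as possible and leaves the reduce to a built-in Erlang reducer (the design-document and field names are assumptions):

```typescript
// Injected by CouchDB's JavaScript query server at index time.
declare function emit(key: unknown, value: unknown): void;

const statsDesignDoc = {
  _id: "_design/stats",
  views: {
    sales_by_day: {
      // Emit a small key and a number, never the whole doc; if you need full
      // documents at query time, use include_docs=true rather than emitting them.
      map: function (doc: { type?: string; day?: string; amount?: number }) {
        if (doc.type === "sale" && doc.day) {
          emit(doc.day, doc.amount || 0);
        }
      }.toString(),
      reduce: "_sum", // built-in Erlang reducer: no JavaScript round-trip per row
    },
  },
};
```

Querying it with ?group=true&stale=update_after returns whatever is already indexed immediately and schedules the index update to run afterwards.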
The code you write in views is more like CREATE INDEX than SELECT. It should be irrelevant how long it takes, as long as the view builds keep up with the document change rate. Building a view is a sunk (one-time) cost.
When you query the view, that is always a B-tree lookup against a static data set, which runs in logarithmic time. That is usually the performance people care about more (in production).
If you are not seeing behavior like I describe, perhaps we could discuss your view functions and your general approach to your problem. CouchDB is very different from relational databases. In the latter, you have highly structured data and free-form queries. In CouchDB, you have free-form data but highly structured index definitions (views). Except during development, changing and rebuilding views should be rare.
Not emitting anything at all helps most, but when that can't be avoided, building the view in smaller batches (there are scripts that do this automagically) helps more than anything else.

NoSQL database with high read performances (write accesses are not significant)?

I'm working on a "real-time" website using Node.js. Currently I'm using Redis because I need high performance for read access. Write access is not really significant for my use case.
However, Redis does not have a query language for searching, so I build my indexes manually and use set unions/intersections/... to find values.
I think it would be easier to use MongoDB, with its embedded query system and an ORM-like layer (Mongoose, for example). The problem is that I'm not sure MongoDB is the best choice for my use case.
What is your advice about the NoSQL DB that I need? Redis? CouchDB? MongoDB? Cassandra? etc.
I repeat: I want really good performance for read access and searches (write access is not significant), and the simplest possible setup (ORM-like layer? query system? etc.).
Thanks.
I believe that Redis would be the better solution, for the following reasons.
You require fast read access, and Redis provides the fastest solution since all (or at least most) of the keys are held in memory.
Although MongoDB is easier to query in the general case, your problem domain is narrow, and once you decide how you would like to query the data, you can put the correct data structures and indexes in place (see the sketch below).
I would say that Redis is a good fit for your DB, and you should look at something like Solr or Elasticsearch to provide your searching.
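For completeness, here is a minimal sketch of the manual-index approach the question describes: one Redis set per attribute value, with set intersections standing in for WHERE clauses. Shown with the ioredis client; all key and field names are assumptions.

```typescript
import Redis from "ioredis";

const redis = new Redis(); // assumes a local Redis on the default port

// Store the record and add its id to one index set per attribute value.
async function indexUser(id: string, city: string, plan: string): Promise<void> {
  await redis.hset(`user:${id}`, "city", city, "plan", plan);
  await redis.sadd(`idx:city:${city}`, id);
  await redis.sadd(`idx:plan:${plan}`, id);
}

// "WHERE city = ? AND plan = ?" becomes a set intersection over the indexes.
async function findUsers(city: string, plan: string): Promise<string[]> {
  return redis.sinter(`idx:city:${city}`, `idx:plan:${plan}`);
}
```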
CouchDB will do better in a write-heavy environment (though I don't use it myself).
MongoDB will do better in a read-heavy environment.
For search and indexing:
MongoDB would require a separate index for each of your search criteria for better performance (at least, this is what I remember).
Proper indexing is important in MongoDB. And there are no joins!
Here are some links you might go through:
http://www.mongodb.org/display/DOCS/Comparing+Mongo+DB+and+Couch+DB
http://www.snailinaturtleneck.com/blog/2009/06/29/couchdb-vs-mongodb-benchmark/
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
Hope these help you find the right DB.
Good luck!
