Recommended strategy for fetching data asynchronously - multithreading

Let's say that I need to execute several different queries against a database. Each query returns different data, and each will run on a thread other than the UI thread.
Should I have one thread for all database queries, or can I freely use one thread per query? What is the recommended practice?

A single ObjectContext/DbContext instance should not be used for concurrent database access because it is not designed for that scenario.
Interacting with objects loaded by different context instances is error-prone because all related entity instances should belong to a single context instance. Otherwise you have to attach and detach entities.
If all the operations are reads, using multiple threads to retrieve data is preferred, with each thread working through its own context instance. For CRUD operations, a single context instance confined to one thread is advisable.
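As a rough illustration of the "one context per thread for reads" advice, here is a minimal Python sketch. It is an analogy rather than Entity Framework code: a SQLite connection stands in for a context instance, and the table names, `make_demo_db`, `run_query` and `fetch_all` are all hypothetical. Each worker opens and closes its own connection, so no connection is ever shared between threads.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_demo_db():
    """Create a throwaway SQLite file with two small tables (demo data only)."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Ann",), ("Bob",)])
    conn.executemany("INSERT INTO orders (total) VALUES (?)", [(10.0,), (20.5,), (3.0,)])
    conn.commit()
    conn.close()
    return path


def run_query(path, sql):
    """Run one read-only query on a connection owned by the calling thread."""
    conn = sqlite3.connect(path)  # the per-thread "context"
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def fetch_all(path, queries):
    """Run every named query on its own worker thread and collect the results."""
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        futures = {name: pool.submit(run_query, path, sql) for name, sql in queries.items()}
        return {name: future.result() for name, future in futures.items()}
```

Calling `fetch_all(path, {"customers": "SELECT name FROM customers ORDER BY id", "order_count": "SELECT COUNT(*) FROM orders"})` returns a dictionary with one result list per query name. The UI thread would only consume that dictionary and never touch a connection itself.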

Related

Using Sessions in Cassandra

When using the DataStax Java driver for Cassandra, when should I use multiple sessions under the same cluster? I am not able to find any good use case for having a cluster with multiple sessions.
My application has multiple components/modules that access Cassandra. Based on the answer, I will decide whether to have one session per component/module or just one session shared across all the components of my application.
Update: Everywhere on the internet the recommendation is to use one session. I get it, but my question is "in what scenario do you create multiple sessions for one cluster?". If there is no such scenario, why does the library allow creating multiple sessions instead of just offering a method that returns a singleton session object?
Use just one session across all your components.
In Cassandra, a Session is a heavyweight, thread-safe object. It maintains multiple connections, cached prepared statements, etc.
Here is the JavaDoc:
A session holds connections to a Cassandra cluster, allowing it to be queried. Each session maintains multiple connections to the cluster nodes, provides policies to choose which node to use for each query (round-robin on all nodes of the cluster by default), and handles retries for failed query (when it makes sense), etc...
Session instances are thread-safe and usually a single instance is enough per application. As a given session can only be "logged" into one keyspace at a time (where the "logged" keyspace is the one used by query if the query doesn't explicitely use a fully qualified table name), it can make sense to create one session per keyspace used. This is however not necessary to query multiple keyspaces since it is always possible to use a single session with fully qualified table name in queries.
Source:
https://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/Session.html
https://ahappyknockoutmouse.wordpress.com/2014/11/12/246/

How to ensure thread safety when using Enterprise Library's Data Access

I have an application that runs multiple concurrent background processes to insert data into the database using the Enterprise Library's Data Access Application Block.
Each background thread calls DatabaseFactory.CreateDatabase, passing in the same database instance name. The following snippet retrieves the database and command objects:
Microsoft.Practices.EnterpriseLibrary.Data.Database database = DatabaseFactory.CreateDatabase(this.DatabaseInstanceName);
DbCommand commandObj = database.GetSqlStringCommand(statement);
I'm finding that this is not thread safe and I'm getting errors due to the values getting mixed up across the threads. How should I handle this to ensure that it is thread safe?
thanks in advance!
I found my issue. The values that were getting mixed up across threads were not due to the Enterprise Library Data Access objects but another object I used to store parameters. I had accidentally made it global instead of a local resource within each thread.
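The fix described above (keeping the parameter holder local to each thread instead of global) can be sketched as follows. This is a Python illustration, not Enterprise Library code: `build_insert`, `run_workers` and the `items` table are hypothetical names. Each call creates its own `params` dictionary, so concurrent workers cannot overwrite each other's values.

```python
from concurrent.futures import ThreadPoolExecutor


def build_insert(record_id, payload):
    """Build a statement plus its parameters using only local state."""
    # 'params' is created per call, so every thread works on its own copy.
    params = {"id": record_id, "payload": payload}
    statement = "INSERT INTO items (id, payload) VALUES (:id, :payload)"
    return statement, params


def run_workers(records):
    """Build the inserts for all records on a small thread pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the input records.
        return list(pool.map(lambda record: build_insert(*record), records))
```

Had `params` been a single module-level dictionary mutated by every worker, two threads could interleave their writes and produce the mixed-up values described in the question.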

Preventing duplicate entries in Multi Instance Application Environment

I am writing an application that serves Facebook API actions (share, like, etc.). I keep all the objects shared from my application in a database, and I do not want to share an object again if it has already been shared.
Since I will deploy the application on different servers, there could be a case where two instances try to insert the same object into the table.
How can I manage this concurrency problem without blocking the applications fully? I mean, two threads that try to insert the same object must synchronize, but they should not block a third thread that is inserting a totally different object.
If there is a way to derive the primary key of a data entry from the data itself, the database will resolve this concurrency issue by itself: the second insert will fail with a primary key constraint violation. Perhaps the data supplied by the Facebook API already has some unique ID?
Alternatively, you can consider a distributed lock solution, for example one based on Hazelcast or a similar data grid. This would allow record state to be shared by different JVMs, making it possible to avoid unneeded INSERTs.
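The primary-key approach can be sketched in a few lines. This is a minimal Python example using SQLite in place of the real database. The `shared_objects` table, the choice of a SHA-256 hash of the object's URL as the key, and the helper names are all assumptions made for illustration. If the Facebook API supplies a unique ID, that ID would be the more natural key.

```python
import hashlib
import sqlite3


def content_key(object_url):
    """Derive a deterministic key from the data itself (assumed here: its URL)."""
    return hashlib.sha256(object_url.encode("utf-8")).hexdigest()


def open_demo_db():
    """Open an in-memory database with a primary key on the derived value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shared_objects (key TEXT PRIMARY KEY, url TEXT NOT NULL)")
    return conn


def try_share(conn, object_url):
    """Insert the object; return False if some instance already inserted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO shared_objects (key, url) VALUES (?, ?)",
                (content_key(object_url), object_url),
            )
        return True
    except sqlite3.IntegrityError:
        # Primary key violation: the object was already shared.
        return False
```

With this shape, two instances inserting the same object race only on that one row, and the loser gets `False`. An insert of a different object is not rejected by the constraint.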

Why are database connection pools better than a single connection?

I'm currently working on writing a multithreaded application that will need to access a database in order to serve requests. I see many people saying that using a pool of many persistent database connections is the way to go for this type of application, but I'm trying to wrap my head around why exactly this is the case.
Keep in mind that I'm designing this application in Erlang, so I'll be using threads/processes/workers a lot.
So let's compare two situations:
1. You have a single thread that owns a single database connection. All your client-handling threads talk to this thread in order to make database queries.
2. You have a pool of threads, each with its own database connection. When a client-handling thread wants to access the database, it gets one of these threads from the pool and uses it to query the DB.
In the first case, I see many people saying that it is bad because having one thread handle all database-related queries will in turn cause a bottleneck. But my confusion is the following: wouldn't the bottleneck in that single thread actually be the database itself? If all that the thread is doing is querying the database through its connection handle, isn't waiting for the DB to respond to requests the main source of latency? How will throwing more connections and threads at this problem solve it?
The database probably has well-developed multithreading abilities. Using a connection pool allows you to:
Make use of the DB's multithreading / load-balancing ability
Avoid the overhead of setting up and tearing down connections over and over
When the database is serving multiple connections, it can make its own decisions on how to prioritize requests. Imagine this scenario:
1. User A requests a set of records from Table A with 100,000 rows
2. User B requests a set of records from Table B with 50 rows
3. User C updates Table A
If multiple connections are used, the DB can take advantage of the fact that (1) and (2) can occur concurrently, and User B gets his 50 records without having to wait for User A to get all 100,000 of his. Only User C has to wait for User A to finish.
Also, setting up and tearing down TCP connections is a relatively expensive task. Using a pool allows one user to release the resource without tearing down the TCP connection, so the next user doesn't have to wait for a new connection. Your single-threaded approach wouldn't benefit from this aspect of connection-pooling, though.
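To make the pooling idea concrete, here is a minimal sketch in Python (not Erlang). `ConnectionPool` and `run_queries` are hypothetical names, and in-memory SQLite connections stand in for real network connections. A fixed set of connections is opened once, and callers borrow and return them instead of opening and closing one per request.

```python
import queue
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and then reused."""

    def __init__(self, size, factory):
        self._idle = queue.Queue(maxsize=size)
        self.created = 0
        for _ in range(size):
            self._idle.put(factory())
            self.created += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back without closing it

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()


def run_queries(pool, values):
    """Serve several 'clients' concurrently through the shared pool."""
    def task(value):
        with pool.connection() as conn:
            return conn.execute("SELECT ? * 2", (value,)).fetchone()[0]

    with ThreadPoolExecutor(max_workers=4) as executor:
        return list(executor.map(task, values))
```

A pool built with `ConnectionPool(2, lambda: sqlite3.connect(":memory:", check_same_thread=False))` serves any number of requests with only two connections. `check_same_thread=False` is needed because a borrowed connection is used on a different thread from the one that created it, and the pool ensures only one thread holds it at a time.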

ColdFusion singleton object pool

In our ColdFusion application we have stateless model objects.
All the data I want I can get with one method call (it calls other methods internally without saving any state).
Methods usually ask the database for the data. All methods are read only, so I don't have to worry about thread safety (please correct me if I'm wrong).
So there is no need to instantiate objects at all. I could call them statically, but ColdFusion doesn't have static methods - calling the method would mean instantiating the object first.
To improve performance I have created singletons for every Model object.
So far it works great - each object is created once and then accessed as needed.
Now my worry is that all requests for data would go through only one model object.
Should I be worried? Suppose my object has a method getOfferData() and it is time-consuming.
What if a couple of clients want to access it?
Will the second one wait for the first request to finish, or is it executed in a separate thread?
It's the same object after all.
Should I implement some kind of object pool for this?
The singleton pattern you are using won't cause the problem you are describing. If getOfferData() is still running when another request calls it, the second call will not queue unless one of the following applies:
You use cflock to take an exclusive lock
Queueing occurs at the database because of locking / transactions
You have too many things running and use up all the concurrent threads available to ColdFusion
So the way you are going about it is fine.
Hope that helps.
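The same behaviour can be demonstrated outside ColdFusion. The following Python sketch is only an analogy, and `OfferModel`, `get_offer_data` and `call_concurrently` are hypothetical names. Several threads call one shared, stateless object at the same time. Each call waits at a barrier that only opens once all callers are inside the method, so the calls can only complete if they really run concurrently rather than queueing behind each other.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class OfferModel:
    """Stateless model: the method reads its arguments and mutates nothing."""

    def get_offer_data(self, offer_id, rendezvous):
        # Every caller must be inside this method at once for the barrier to
        # open; serialized calls would time out with BrokenBarrierError.
        rendezvous.wait(timeout=5)
        return {"offer_id": offer_id, "price": offer_id * 10}


OFFER_MODEL = OfferModel()  # the single shared instance


def call_concurrently(offer_ids):
    """Call the singleton from one thread per offer id and collect the results."""
    rendezvous = threading.Barrier(len(offer_ids))
    with ThreadPoolExecutor(max_workers=len(offer_ids)) as pool:
        futures = [
            pool.submit(OFFER_MODEL.get_offer_data, offer_id, rendezvous)
            for offer_id in offer_ids
        ]
        return [future.result() for future in futures]
```

Wrapping the method body in an exclusive lock (the rough equivalent of cflock) would make the barrier time out, which matches the first exception listed in the answer.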

Resources