Does GridGain support backup of distributed locks? - gridgain

As far as I know, GridGain supports distributed locks. Are distributed locks based on the distributed cache? Does GridGain support backups of distributed locks, like it does for distributed maps?
Thanks,
Bill

Yes, a distributed lock is acquired by calling the GridCacheProjection.lock(...) method, so it will have as many backups as are configured for the cache.
However, locks do not have transactional semantics, and it is generally more advisable to use cache transactions via any of the GridCacheProjection.txStart(...) methods. This way you still get the locking semantics, but can also commit or roll back your transaction atomically.
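A rough sketch of the transactional approach the answer recommends (GridCacheProjection.lock(...) and txStart(...) are cited in the answer; the cache name, key names, and the get/putx calls are assumptions based on the GridGain 6.x-era API, so check them against your version):

```java
// Hedged sketch, not a verified GridGain program.
GridCacheProjection<String, Integer> cache = grid.cache("partitioned");

GridCacheTx tx = cache.txStart();  // pessimistic/optimistic variants also exist
try {
    Integer v = cache.get("counter");      // keys read/written inside the tx are locked
    cache.putx("counter", v == null ? 1 : v + 1);
    tx.commit();                           // atomic commit across all keys touched
} finally {
    tx.close();                            // rolls back if commit() was not reached
}
```

Because the transaction locks the keys it touches, this gives you the same mutual exclusion as an explicit lock(...) call, plus atomic commit/rollback.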

Related

DB access from a Mapper in MapReduce

I'm planning the next generation of an analysis system I'm developing, and I am considering implementing it using one of the MapReduce/stream-processing platforms like Flink, Spark Streaming, etc.
For the analysis, the mappers must have DB access.
My greatest concern is that when the mappers run in parallel, all connections from the connection pool may be in use, and a mapper might then fail to access the DB.
How should I handle that?
Is it something I need to be concerned about?
As you have pointed out, a pull-style strategy is going to be inefficient and/or complex.
Your strategy for ingesting the meta-data from the DB will be dictated by the amount of meta-data and the frequency that the meta-data changes. Either way, moving away from fetching the meta-data when it's needed, and toward receiving updates when the meta-data is changed, is likely to be a good approach.
Some ideas:
Periodically dump the meta-data to flat file/s into distributed file system
Streaming meta-data updates to your pipeline at write-time to keep an in-memory cache up-to-date
Use a separate mechanism to fetch the meta-data, for instance Akka Actor/s polling for changes
It will depend on the trade-offs you are able to make for your given use-case.
If DB interactivity is unavoidable, I do wonder whether map-reduce-style frameworks are the best approach to your problem. In any case, failed tasks should be retried by the framework.
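If you do keep direct DB access in the mappers, the specific failure the question worries about (a mapper failing because every pooled connection is in use) can be avoided by making acquisition block rather than fail. A minimal sketch in plain Java, with a Semaphore standing in for the pool's connection limit and a sleep standing in for the query (all names and sizes here are hypothetical):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Cap concurrent DB access with a Semaphore sized to the pool, so a parallel
// mapper waits for a permit instead of failing to get a connection.
public class BoundedDbAccess {
    static final int POOL_SIZE = 4;                        // hypothetical pool size
    static final Semaphore permits = new Semaphore(POOL_SIZE);
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger maxInFlight = new AtomicInteger();

    static void mapperTask() throws InterruptedException {
        permits.acquire();                                 // blocks instead of failing
        try {
            int now = inFlight.incrementAndGet();
            maxInFlight.accumulateAndGet(now, Math::max);  // record peak concurrency
            Thread.sleep(10);                              // stand-in for the DB query
            inFlight.decrementAndGet();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(16); // 16 "parallel mappers"
        for (int i = 0; i < 32; i++) {
            pool.submit(() -> {
                try { mapperTask(); } catch (InterruptedException ignored) { }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // Concurrent DB calls never exceed the pool size.
        System.out.println("max concurrent = " + maxInFlight.get());
    }
}
```

Most real pools (e.g. HikariCP) already behave this way, blocking on getConnection() up to a timeout; the point is to size parallelism and the pool together rather than assume a free connection.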

Hazelcast-IMap get, does it have auto lock mechanism?

I have an IMap with a MapStore configured. When I get the same key from the IMap from multiple threads, it fetches from the store only once and the other threads get cache hits. So I wonder whether IMap has an automatic lock mechanism when getting from the cache; nothing I have read actually confirms this, but it behaves as if it has a lock.
Can someone confirm this?
From com.hazelcast.core.IMap javadoc:
Concurrent, distributed, observable and queryable map.
So the concurrency is guaranteed by design, however it doesn't necessarily mean that locks are used.
From hazelcast documentation:
Hazelcast Distributed Map (IMap) is thread-safe to meet your thread safety requirements. When these requirements increase or you want to have more control on the concurrency […]
This can be achieved via a multitude of lock/unlock methods.
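The observed behavior (one load, everyone else hits) does not require caller-visible locks. As an analogy in plain Java, not Hazelcast code: ConcurrentHashMap.computeIfAbsent gives the same guarantee, in that the loader function runs at most once per key even under concurrent get-or-load, with the synchronization hidden inside the map:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Analogy only: the per-key, at-most-once load that IMap+MapStore exhibits,
// demonstrated with ConcurrentHashMap.computeIfAbsent.
public class SingleLoadDemo {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final AtomicInteger loads = new AtomicInteger();

    static String get(String key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();          // simulated MapStore.load(k)
            return "value-for-" + k;
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 100; i++) {
            pool.submit(() -> get("hot-key"));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("loads = " + loads.get()); // the loader ran exactly once
    }
}
```

If you need explicit control in Hazelcast itself, that is what IMap's lock(key)/unlock(key) methods (mentioned above) are for.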

Distributing scheduled tasks across multi-datacenter environment in Node.js with Cassandra

We are attempting to build a system that gets a list of tasks to execute from a Cassandra database and then, through some kind of group consensus, creates an execution plan (preferably on one node) which is then agreed on and executed by the entire cluster of servers. We really do not want to add any additional pieces of software such as Redis or an AMQP system; rather, we want the consensus built directly into all of the servers running the jobs. So far we have found Skiff, an implementation of the Raft algorithm that looks like it could accomplish the task, but I was wondering if anyone has found an elegant solution to this problem in a pure Node.js way, not involving external messaging systems.
Cassandra supports lightweight transactions, which are basically a Paxos implementation that offers linearizable consistency and a CAS operation (consensus). So you can use Cassandra itself to serialize the execution plan.
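For example, the nodes could race to claim a lease via an LWT, and only the winner builds the plan. The table and column names below are hypothetical; the `IF NOT EXISTS` / `IF <condition>` clauses are standard CQL lightweight-transaction syntax:

```sql
-- Only one node's INSERT comes back with [applied] = true;
-- that node becomes the planner for this round.
INSERT INTO task_leases (task_id, owner)
VALUES ('plan-round-1', 'node-a')
IF NOT EXISTS;

-- Later, a conditional update serves as a compare-and-set to hand off the lease:
UPDATE task_leases SET owner = 'node-b'
WHERE task_id = 'plan-round-1'
IF owner = 'node-a';
```

Each client must check the `[applied]` column of the result to learn whether it won; an unconditional read would not give the linearizable guarantee.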

Does the Cassandra driver have its own speculative-retry mechanism?

Since v2.0.2, Cassandra has had a mechanism named Rapid Read Protection, described in detail here. The notes from the blog post that matter for this question are:
The mechanism is controlled by a per-table speculative_retry setting.
The coordinator node is responsible for applying this mechanism: it starts a new read request if the retry condition is satisfied.
But the documentation for the Cassandra Java driver describes something very similar here, with a similar name: speculative query execution. The driver needs some additional libraries to use this feature.
Q1: Am I right that this means it is implemented on the driver side and has no relation to the Rapid Read Protection implemented inside Cassandra?
If so, that means the driver will retry a query with another coordinator if the driver's retry condition is satisfied.
Q2: For read queries, a retry on the coordinator side seems more effective, since even when you switch coordinators for a query, there is still a chance that the new one will query the same set of nodes (and will have the same response time as the previous one). But I didn't find a way to enable driver-side retry only for write queries. So if I want to use retries for all query types, should I disable RR on the Cassandra server side, since double protection will put more pressure on the cluster? Or can I gain some benefit by enabling both of them?
Q1: Yes, speculative query execution in the driver is completely independent of the cluster rapid reads.
Q2.1: For the first part, it's not absolutely necessary as the coordinator could be busy processing other requests, etc.
Q2.2: I think you can enable both mechanisms (cluster and client side) and play a bit with their configurations.
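For reference, the client-side mechanism is configured per Cluster in the 3.x Java driver (the API moved to the configuration file in 4.x, and the contact point and timing values below are placeholders to adjust for your cluster):

```java
// Hedged sketch for the DataStax Java driver 3.x.
Cluster cluster = Cluster.builder()
    .addContactPoint("127.0.0.1")
    .withSpeculativeExecutionPolicy(
        new ConstantSpeculativeExecutionPolicy(
            500,   // delay in ms before launching a speculative execution
            2))    // maximum number of speculative executions per query
    .build();
```

Note that the driver only speculatively executes statements marked idempotent (`statement.setIdempotent(true)`), which is also why there is no simple "writes only" switch: idempotency, not read-vs-write, is the gate. The percentile-based policy is the one that needs the extra library (HdrHistogram) the question mentions. The server-side counterpart stays in the table definition, e.g. `ALTER TABLE ks.tbl WITH speculative_retry = '99percentile';`.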

Multiple thread access(read/write) same table

If there are multiple threads which access (read/write) the same table in a DB, what thread-safety considerations should I take into account?
Here are some good tips, for example when using MySQL:
Use row-level locking.
Use the TRANSACTION_READ_COMMITTED isolation level.
Avoid queries that cannot use indexes; they require locking of all the rows in the table (if only very briefly) and might block an update.
Avoid sharing Statements among threads.
Here is some more information and a reference:
Check for mechanisms which implement transactions at different isolation levels. These mechanisms are present in the database system or in your API.
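The tips above map onto plain JDBC like this (the connection URL, credentials, table, and columns are placeholders; InnoDB is assumed for row-level locking):

```java
// Hedged JDBC sketch of the tips above; needs a live MySQL instance to run.
try (Connection conn = DriverManager.getConnection(
        "jdbc:mysql://localhost/db", "user", "pass")) {
    conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
    conn.setAutoCommit(false);
    // One Statement/PreparedStatement per thread -- never share them:
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?")) {
        ps.setInt(1, 10);
        ps.setLong(2, 42L);  // lookup by indexed PK -> row-level lock, not a scan
        ps.executeUpdate();
        conn.commit();
    } catch (SQLException e) {
        conn.rollback();     // undo the transaction on failure
        throw e;
    }
}
```

Giving each thread its own Connection (usually from a pool) satisfies the "don't share Statements" rule automatically, since Statements belong to a Connection.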
