Does the Cassandra driver have its own speculative-retry mechanism?

Since v2.0.2, Cassandra has had a mechanism named Rapid Read Protection, described in detail here. The notes from that blog post that matter for this question are:
The mechanism is controlled by a per-table speculative_retry setting.
The coordinator node is responsible for applying this mechanism: it starts a new read request if the retry condition is satisfied.
But the documentation for the Cassandra java-driver describes something very similar here, also with a similar name: speculative query execution. The driver needs some additional libraries to use this feature.
Q1: Am I right that this means it is implemented on the driver side and has no relation to the Rapid Read Protection implemented inside Cassandra?
If so, that means the driver will retry a query with another coordinator if the driver's retry condition is satisfied.
Q2: For read queries, a retry on the coordinator side seems more effective, since even when you switch coordinators for a query there is still a chance that the new coordinator will query the same set of nodes (and have the same response time as the previous one). But I didn't find a way to enable driver-side retry only for write queries. So if I want to use retries for all query types, should I disable Rapid Read Protection on the Cassandra server side, since double protection will put more pressure on the cluster? Or can I gain some benefit by enabling both of them?

Q1: Yes, speculative query execution in the driver is completely independent of the cluster-side rapid read protection.
Q2.1: For the first part, coordinator-side retry is not necessarily more effective, as the coordinator itself could be busy processing other requests, etc.
Q2.2: I think you can enable both mechanisms (cluster side and client side) and play a bit with their configurations.
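To make the distinction concrete, here is a minimal sketch, assuming the DataStax Java driver 3.x (the keyspace and table names are made up): the driver-side policy is configured when building the Cluster, while Rapid Read Protection remains a per-table server-side option set via CQL. Note that the driver only schedules speculative executions for statements marked idempotent, and the percentile-based policy (not used here) is the one that needs the extra HdrHistogram dependency.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.policies.ConstantSpeculativeExecutionPolicy;

    public class SpeculativeExecutionExample {
        public static void main(String[] args) {
            // Driver side: if the current coordinator has not answered within 200 ms,
            // send the request to the next host as well (at most 2 extra attempts).
            try (Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")
                    .withSpeculativeExecutionPolicy(
                            new ConstantSpeculativeExecutionPolicy(200, 2))
                    .build();
                 Session session = cluster.connect()) {

                // Server side: Rapid Read Protection is applied by the coordinator
                // and is configured per table.
                session.execute(
                        "ALTER TABLE my_ks.my_table WITH speculative_retry = '99percentile'");
            }
        }
    }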

Related

Spark errors when writing to Synapse DWH pool

I am trying to write a dataframe in either append or overwrite mode into a Synapse table using the ("com.databricks.spark.sqldw") connector. The official docs don't say much about the ACID properties of this write operation. My question is: if the write operation fails in the middle of the write, would the actions performed previously be rolled back?
One thing the docs do mention is that there are two classes of exception that could be thrown during this operation: SqlDWConnectorException and SqlDWSideException. My reasoning is that if the write operation is ACID compliant, then we do not need to do anything, but if not, we plan to encapsulate this operation in a try-catch block and look for other options (maybe retry, or timeout).
As a good practice you should write your code to be re-runnable, e.g. delete potentially duplicate records. Imagine you are re-running a file for a failed day, or someone wants to reprocess a certain period. However, SQL pools do implement ACID through transaction isolation levels:
Use transactions in a SQL pool in Azure Synapse
SQL pool implements ACID transactions. The isolation level of the transactional support is default to READ UNCOMMITTED. You can change it to READ COMMITTED SNAPSHOT ISOLATION by turning ON the READ_COMMITTED_SNAPSHOT database option for a user SQL pool when connected to the master database.
You should bear in mind that the default transaction isolation level for dedicated SQL pools is READ UNCOMMITTED which does allow dirty reads. So the way I think about it is, ACID (Atomic, Consistent, Isolated, Durable) is a standard and each provider implements the standard to different degrees through transaction isolation levels. Each transaction isolation level can be strongly meeting ACID or weakly meeting ACID. Here is my summary for READ UNCOMMITTED:
A - you should reasonably expect your transaction to be atomic but you should (IMHO) write your code to be re-runnable
C - you should reasonably expect your transaction to be consistent, but bear in mind that dedicated SQL pools do not support foreign keys and that the NOT ENFORCED keyword is applied to unique indexes on creation.
I - READ UNCOMMITTED does not meet the 'I' (Isolated) criterion of ACID, allowing dirty reads (uncommitted data), but the gain is concurrency. You can change the default to READ COMMITTED SNAPSHOT ISOLATION as described above, but you would need a good reason to do so and would have to conduct extensive tests on your application, as it impacts behaviour, performance, concurrency, etc.
D - you should reasonably expect your transaction to be durable
So the answer to your question is: depending on your transaction isolation level (bearing in mind the default is READ UNCOMMITTED in a dedicated SQL pool), each transaction meets ACID to a degree; most notably, Isolation (I) is not fully met. You have the opportunity to change this by altering the default transaction isolation level, at the cost of reduced concurrency and the now obligatory regression test. I think you are most interested in Atomicity, and my advice there is: make sure your code is re-runnable anyway.
You tend to see the 'higher' transaction isolation levels (e.g. SERIALIZABLE) in OLTP systems rather than in MPP systems like Synapse, the cost being concurrency. You want your bank withdrawal to work, right?
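To illustrate the "re-runnable" advice, here is a rough sketch rather than the connector's official pattern; the table, column, storage paths and credentials are all made up. The idea is to delete the slice you are about to load before appending it, so a rerun after a mid-write failure cannot duplicate data:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class RerunnableSynapseWrite {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder().appName("rerunnable-write").getOrCreate();

            String jdbcUrl = "jdbc:sqlserver://<server>.sql.azuresynapse.net:1433;database=<db>;user=<user>;password=<pwd>";
            String loadDate = "2023-01-15"; // the slice being (re)loaded

            // Step 1: make the run idempotent - remove anything a previous,
            // partially failed run may have written for this slice.
            try (Connection conn = DriverManager.getConnection(jdbcUrl);
                 Statement stmt = conn.createStatement()) {
                stmt.executeUpdate("DELETE FROM dbo.fact_sales WHERE load_date = '" + loadDate + "'");
            }

            // Step 2: append the freshly computed slice via the Synapse connector.
            Dataset<Row> df = spark.read().parquet(
                    "abfss://staging@<account>.dfs.core.windows.net/fact_sales/" + loadDate);
            df.write()
              .format("com.databricks.spark.sqldw")
              .option("url", jdbcUrl)
              .option("tempDir", "abfss://tempdir@<account>.dfs.core.windows.net/tmp")
              .option("forwardSparkAzureStorageCredentials", "true")
              .option("dbTable", "dbo.fact_sales")
              .mode(SaveMode.Append)
              .save();
        }
    }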
Delta Lake has guaranteed ACID transaction behavior.
Refer: What is Delta Lake, where it states:
Azure Synapse Analytics is compatible with Linux Foundation Delta Lake. Delta Lake is an open-source storage layer that brings ACID (atomicity, consistency, isolation, and durability) transactions to Apache Spark and big data workloads. This is fully managed using Apache Spark APIs available in Azure Synapse.
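If you can stage the data through Delta Lake before (or instead of) the warehouse table, the Spark write itself becomes transactional. A minimal sketch, assuming Delta Lake is available in your Spark pool, with made-up paths:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class DeltaWriteExample {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder().appName("delta-write").getOrCreate();

            Dataset<Row> df = spark.read().parquet(
                    "abfss://staging@<account>.dfs.core.windows.net/incoming");

            // A Delta write commits atomically through the transaction log: if the
            // job dies halfway, readers never see the partially written files.
            df.write()
              .format("delta")
              .mode(SaveMode.Overwrite)
              .save("abfss://lake@<account>.dfs.core.windows.net/tables/fact_sales");
        }
    }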

Limiting Cassandra query syntax for clients

We plan to use Cassandra 3.x and we want to allow our customers to connect to Cassandra directly for exporting the data into their data warehouses.
They will connect via ODBC from remote.
Is there any way to prevent a customer from executing huge or bad SELECT statements that would result in high load on all nodes? We use an extra data center in our replication strategy where only customers can connect, so the live system will not be affected. But we want to set up some workers that will run on this shadow system as well. The most important thing is that a connected remote client must not have any noticeable impact on other remote connections or on our local worker jobs. There is already a materialized view, and I want to force customers to get data based on the primary key only (i.e. disallow usage of ALLOW FILTERING). It would also be great if one could limit the number of rows returned (e.g. 1 million) to prevent a pull of all the data.
Is there a best practice for this use case?
I know of BlackRock's video on multi-tenant strategy in C*, which advises using a tenant_id in the schema. That is what we're doing already, but how can I ensure security/isolation for ODBC-connected tenants/customers? Or do I have to write my own API that handles security?
I would recommend exposing access via an API, not via ODBC - at least you would have greater control over what is executed, and could enforce tenant_id and other checks, like limits, etc. You could try to utilize Cassandra's CQL parser to decompose the query and put all the required things back.
Theoretically, you could utilize Apache Calcite, for example. It has a JDBC driver implementation that could be used, plus there is an existing Cassandra adapter that you could modify to accomplish your task (mapping authentication onto tenant_ids, etc.), but this would be quite a lot of work.
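As a rough illustration of the API approach (not a complete solution; the keyspace, table and column names are made up, and it assumes the DataStax Java driver 3.x QueryBuilder): the service builds the CQL itself, so the tenant_id predicate and a row cap are always enforced and ALLOW FILTERING never appears.

    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.Statement;
    import com.datastax.driver.core.querybuilder.QueryBuilder;

    import static com.datastax.driver.core.querybuilder.QueryBuilder.eq;

    public class TenantExportService {

        private static final int MAX_ROWS = 1_000_000; // hard cap on a single export

        private final Session session;

        public TenantExportService(Session session) {
            this.session = session;
        }

        /** Exports rows for one tenant; tenant_id comes from authentication, never from the client. */
        public ResultSet export(String tenantId, String customerKey) {
            Statement stmt = QueryBuilder.select()
                    .from("analytics", "events_by_tenant")
                    .where(eq("tenant_id", tenantId))       // enforced partition key, no ALLOW FILTERING
                    .and(eq("customer_key", customerKey))   // clustering column supplied by the client
                    .limit(MAX_ROWS);
            return session.execute(stmt);
        }
    }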

DB access from a Mapper in MapReduce

I am planning the next generation of an analysis system I'm developing and I am thinking of implementing it using one of the MapReduce/stream-processing platforms like Flink, Spark Streaming, etc.
For the analysis, the mappers must have DB access.
So my greatest concern is that when a mapper is parallelized, all the connections from the connection pool will be in use and some mapper might fail to access the DB.
How should I handle that?
Is it something I need to concern about?
As you have pointed out, a pull-style strategy is going to be inefficient and/or complex.
Your strategy for ingesting the meta-data from the DB will be dictated by the amount of meta-data and the frequency that the meta-data changes. Either way, moving away from fetching the meta-data when it's needed, and toward receiving updates when the meta-data is changed, is likely to be a good approach.
Some ideas:
Periodically dump the meta-data to flat file(s) in a distributed file system
Stream meta-data updates to your pipeline at write time to keep an in-memory cache up to date (see the sketch after this answer)
Use a separate mechanism to fetch the meta-data, for instance Akka Actor/s polling for changes
It will depend on the trade-offs you are able to make for your given use-case.
If DB interactivity is unavoidable, I do wonder whether MapReduce-style frameworks are the best approach to solve your problem. In any case, failed tasks should be retried by the framework.
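As a sketch of the in-memory-cache idea from the list above (the JDBC URL, credentials and table are hypothetical, and it assumes a Flink 1.x RichMapFunction): the meta-data is loaded once per parallel task instance in open(), so the number of DB connections is bounded by the operator's parallelism rather than by record throughput.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    public class EnrichWithMetaData extends RichMapFunction<String, String> {

        private transient Map<String, String> metaDataByKey;

        @Override
        public void open(Configuration parameters) throws Exception {
            // One short-lived connection per parallel task instance (not per record),
            // so the pool can never be exhausted by record-level lookups.
            metaDataByKey = new HashMap<>();
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:postgresql://meta-db:5432/meta", "app", "secret");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT key, value FROM meta_data")) {
                while (rs.next()) {
                    metaDataByKey.put(rs.getString("key"), rs.getString("value"));
                }
            }
        }

        @Override
        public String map(String key) {
            // Pure in-memory lookup on the hot path - no DB round trip per record.
            return key + "," + metaDataByKey.getOrDefault(key, "unknown");
        }
    }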

Should users be directed to specific data nodes when using an eventually consistent datastore?

When running a web application in a farm that uses a distributed datastore that's eventually consistent (CouchDB in my case), should I be ensuring that a given user is always directed to the same datastore instance?
It seems to me that the alternate approach, where any web request can use any data store, adds significant complexity to deal with consistency issues (retries, checks, etc). On the other hand, if a user in a given session is always directed to the same couch node, won't my consistency issues revolve mostly around "shared" user data and thus be greatly simplified?
I'm also curious about strategies for directing users but maybe I'll keep that for another question (comments welcome).
According to the CAP theorem, a distributed system can have either complete consistency (all nodes see the same data at the same time) or availability (every request receives a response); you'll have to trade one for the other during a network partition or datastore instance failure.
Should I be ensuring that a given user is always directed to the same datastore instance?
Ideally, you should not! What will you do when the given instance fails? A major feature of a distributed datastore is to be available in spite of network or instance failures.
If a user in a given session is always directed to the same couch node, won't my consistency issues revolve mostly around "shared" user data and thus be greatly simplified?
You're right, the architecture would be a lot simpler that way, but again, what would you do if that instance fails? A lot of engineering effort has gone into distributed systems to allow multiple instances to answer a query. I am not sure about CouchDB, but Cassandra lets you choose your consistency model; you have to trade off availability against a higher degree of consistency. The client is configured to contact servers in a round-robin fashion by default, which distributes the load.
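To make the Cassandra point concrete, here is a small sketch assuming the DataStax Java driver 3.x (keyspace, table and column names are made up): reading and writing at QUORUM guarantees the two operations overlap on at least one replica, so a user sees their own writes no matter which node serves the next request, without pinning them to an instance.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    public class ConsistencyExample {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("app")) {

                // Write at QUORUM...
                session.execute(new SimpleStatement(
                        "UPDATE profiles SET display_name = 'Ada' WHERE user_id = 42")
                        .setConsistencyLevel(ConsistencyLevel.QUORUM));

                // ...and read at QUORUM: the read set and write set intersect on at
                // least one replica, whichever coordinator handles the request.
                ResultSet rs = session.execute(new SimpleStatement(
                        "SELECT display_name FROM profiles WHERE user_id = 42")
                        .setConsistencyLevel(ConsistencyLevel.QUORUM));
                System.out.println(rs.one().getString("display_name"));
            }
        }
    }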
I would recommend you read the Dynamo paper. The authors describe a lot of engineering details behind a distributed database.

Cassandra as an embedded service and with custom consistency level

I am thinking of building an application that uses Cassandra as its data store but has low latency requirements. I am aware of EmbeddedCassandraService from this blog post.
Is the following implementation possible and what are known pitfalls (defects, functional limitations)?
1) Run Cassandra as an embedded service, persisting data to disk (durable).
2) Java application interacts with the local embedded service via one of the following. What are the pros and cons of each?
TMemoryBuffer (or something more appropriate?)
StorageProxy (what are the pitfalls of using this API?)
Apache Avro? (see question #5 below)
3) Java application interacts with remote Cassandra service ("backup" nodes) via Thrift (or Avro?).
4) A write must succeed on the local embedded Cassandra service and on at least one of the remote (non-embedded) Cassandra nodes in order to be considered successful. Is this possible? Is it possible to define a custom/complex consistency level?
5) Side question: Cassandra: The Definitive Guide mentions in several places that Thrift will ultimately be replaced with Avro, but it seems that's not the case just yet?
As you might guess, I am new to Cassandra, so any direction to specific documentation pages (not the wiki homepage) or sample projects are appreciated.
Unless your entire database is sitting on the local machine (i.e. a single node), you gain nothing by this configuration. Cassandra will shard your data across the cluster, so (as mentioned in one of the comments) your writes will frequently be made to another node that owns the data. Presuming you write with a consistency level of at least one, your call will block until that other node acks the write. This negates any benefit of talking to the embedded instance since you have some network latency anyway.
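For completeness, a minimal sketch of starting the embedded service in-process. Depending on the Cassandra version, EmbeddedCassandraService exposes either an init()/run() pair or a single start() method; this assumes the newer single start(), and the cassandra.yaml path is made up. As explained above, co-location only pays off when the local node actually owns the data being written or read.

    import org.apache.cassandra.service.EmbeddedCassandraService;

    public class EmbeddedNode {
        public static void main(String[] args) throws Exception {
            // Tell Cassandra where its configuration lives before starting the daemon.
            System.setProperty("cassandra.config", "file:///opt/app/conf/cassandra.yaml");

            // Starts a full Cassandra daemon inside this JVM; data is persisted to the
            // data directories declared in cassandra.yaml, so it survives restarts.
            EmbeddedCassandraService cassandra = new EmbeddedCassandraService();
            cassandra.start();

            // The application can now connect to localhost like any other client, but
            // rows whose token is owned by another node still need a network round trip.
        }
    }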
