As a newbie to Cassandra, I am confused by the term 'fast replica'. Basically, what I understand is that the dynamic snitch identifies the fastest replica during the read process, the data from that replica is compared with the other replicas with the help of a hash (for every message), and if the data is consistent the data from the fast replica is sent to the client, or the replicas undergo read repair (if the read consistency level is not met). What exactly does 'fast replica' mean? Does it mean the read query doesn't need to query other nodes for the data? Please guide me through it. I couldn't find any relevant SO posts on this.
Cassandra uses the phi accrual failure detection algorithm to determine a node's health, together with a dynamic snitch that sorts endpoints by latency using an adapted phi failure detector.
What exactly does 'fast replica' mean?
The replica which is the top scorer in terms of latency and proximity.
Does it mean the read query doesn't need to query other nodes for the data?
The replica scores keep changing dynamically, so the fastest replica will keep receiving the full data request until it is no longer the top scorer, which makes sense. Depending on the consistency level, the other replicas may still receive digest requests rather than the full data request.
For more details you can check the code of the failure detector (FD) and the dynamic snitch (DS).
I'm creating a sync program to periodically copy our Cassandra data into another database. The database I'm copying from only gets INSERTs - data is never UPDATEd or DELETEd. I would like to address Cassandra's eventual consistency model in two ways:
1 - Each sync scan overlaps the last by a certain time span. For example, if the scan happens every hour, then each scan looks an hour and a half backwards. The data contains a unique key, so reading the same record in more than one scan is not an issue.
2 - I use a Consistency level of ALL to ensure that I'm scanning all of the nodes on the cluster for the data.
Is ALL the best consistency level for this situation? I just need to see a record on any node; I don't care whether it appears on any other nodes. But I don't want to miss any INSERTed records, and I also don't want to hit timeouts or performance issues because Cassandra is waiting for multiple nodes to see that record.
To complicate this a bit more, this Cassandra network is made up of 6 clusters in different geographic locations, and I am only querying one. My assumption is that the overlap mentioned in #1 will eventually pick up records that exist on the other clusters.
The query I'm doing is like this:
SELECT ... FROM transactions WHERE userid=:userid AND transactiondate>:(lastscan-overlap)
Where userid is the partitioning key and transactiondate is a clustering column. The list of userids is sourced elsewhere.
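For reference, here is a minimal sketch of a table definition that would support that query, with userid as the partition key and transactiondate as a clustering column; the column types, contact point and keyspace name are assumptions, since the real schema isn't shown:

    from cassandra.cluster import Cluster

    # Hypothetical DDL matching the query above. Because transactiondate is a
    # clustering column, the range predicate is a slice within each user's partition.
    CREATE_TRANSACTIONS = """
    CREATE TABLE IF NOT EXISTS transactions (
        userid          text,       -- assumed type
        transactiondate timestamp,  -- assumed type
        payload         text,       -- stand-in for the real data columns
        PRIMARY KEY ((userid), transactiondate)
    )
    """

    session = Cluster(["127.0.0.1"]).connect("sync_ks")  # contact point and keyspace are made up
    session.execute(CREATE_TRANSACTIONS)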
I use a Consistency level of All to ensure that I'm scanning all of the nodes on the cluster for the data
So consistency ALL has more to do with the number of data replicas read than it does with the number of nodes contacted. If you have a replication factor (RF) of 3 and query a single row at ALL, then Cassandra will hash your partition key to figure out the three nodes responsible for that row, contact all 3 nodes, and wait for all 3 to respond.
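As a rough sketch with the Python driver (contact point, keyspace and bound values are placeholders, not from the question):

    from datetime import datetime, timedelta

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(["127.0.0.1"]).connect("sync_ks")  # placeholders
    window_start = datetime.utcnow() - timedelta(minutes=90)  # lastscan - overlap

    # At ALL, the coordinator waits for every replica of the queried row
    # (RF replicas), not for every node in the cluster.
    stmt = SimpleStatement(
        "SELECT * FROM transactions WHERE userid = %s AND transactiondate > %s",
        consistency_level=ConsistencyLevel.ALL,
    )
    rows = session.execute(stmt, ("user-42", window_start))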
I just need to see a record on one node
So I think you'd be fine with LOCAL_ONE, in this regard.
The only possible advantage of using ALL is that it actually does help to enforce data consistency, by triggering a read repair 100% of the time. So if eventual consistency is a concern, that's a "plus." But *_ONE is definitely faster.
The CL documentation talks a lot about 'stale data', but I am interested in 'new data'
In your case, I don't see stale data as a possibility, so you should be ok there. The issue that you would face instead, is in the event that one or more replicas failed during the write operation, querying at LOCAL_ONE may or may not get you the only replica that actually exists. So your data wouldn't be stale vs. new, it'd be exists vs. does not exist. One point I talk about in the linked answer, is that perhaps writing at a higher consistency level and reading at LOCAL_ONE might work for your use case.
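A minimal sketch of that write-high/read-low pattern with the Python driver (all names and values are placeholders):

    from datetime import datetime

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(["127.0.0.1"]).connect("sync_ks")  # placeholders

    # The writer waits for a majority of replicas in the local DC...
    insert = SimpleStatement(
        "INSERT INTO transactions (userid, transactiondate, payload) VALUES (%s, %s, %s)",
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    )
    session.execute(insert, ("user-42", datetime.utcnow(), "data"))

    # ...so the sync job can read with the fast LOCAL_ONE level.
    select = SimpleStatement(
        "SELECT * FROM transactions WHERE userid = %s",
        consistency_level=ConsistencyLevel.LOCAL_ONE,
    )
    rows = session.execute(select, ("user-42",))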
A few years ago, I wrote an answer about the different consistency levels, which you might find helpful in this case:
If lower consistency level is good then why we need to have a higher consistency(QUORUM,ALL) level in Cassandra?
While going through reading material on Cassandra and HBase, I found that Cassandra is not consistent but HBase is. I didn't find any proper reading material on this.
Could anybody provide any blogs/articles on this topic?
Cassandra is consistent, eventually. Based on Brewer's theorem (also known as the CAP theorem), distributed data systems can only guarantee 2 of the following 3 characteristics:
Consistency.
Availability.
Partition tolerance.
What this means is that Cassandra, in its default configuration, guarantees availability and partition tolerance, and there may be a delay before consistency is achieved. But this is configurable: you can increase the consistency level for any query, sacrificing availability.
There are multiple resources on the web; you could look up "eventual consistency in Cassandra". You can start with Ed Capriolo's talk, or this post on Quora.
Actually, since version 1.1 HBase has had two consistency models:
Consistency.STRONG is the default consistency model provided by HBase. In case the table has region replication = 1, or in a table with region replicas but the reads are done with this consistency, the read is always performed by the primary regions, so that there will not be any change from the previous behaviour, and the client always observes the latest data.
In case a read is performed with Consistency.TIMELINE, then the read RPC will be sent to the primary region server first. After a short interval (hbase.client.primaryCallTimeout.get, 10ms by default), parallel RPC for secondary region replicas will also be sent if the primary does not respond back...
In other words, strong consistency is achieved by allowing reads only against the replica that does the writing, while timeline-consistent behavior (the Reference Guide makes a point of differentiating timeline from eventual consistency) provides highly available, low-latency reads at the expense of a small chance of reading stale data.
I am currently managing a Percona XtraDB cluster composed of 5 nodes that handles millions of inserts every day. Write performance is very good, but reads are not so fast, especially when I request a big dataset.
The inserted records are sensor time series.
I would like to try Apache Cassandra to replace the Percona cluster, but I don't understand how data reading works. I am looking for something able to split a query across all the nodes and read in parallel from more than one node.
I know that Cassandra shards data and that shards can have replicas.
If I have 5 nodes and I set a replication factor of 5, will reads be 5x faster?
Cassandra read path
A read request initiated by a client is sent to a coordinator node, which asks the partitioner which replicas are responsible for the data and checks whether the consistency level can be met.
The coordinator checks whether it is itself responsible for the data. If yes, it satisfies the request; if not, it sends the full data request to the fastest-answering replica (determined using the dynamic snitch). A digest request is also sent to the other replicas.
The coordinator then compares the returned digests, and if they all match and the consistency level has been met, the data from the fastest-answering replica is returned. If the digests do not match, the coordinator will issue read repair operations.
On the node there are a few steps performed: check row cache, check memtables, check sstables. More information: How is data read? and ReadPathForUsers.
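If you want to watch these steps for one of your own queries, the Python driver can request a trace; a small sketch (host, keyspace, table and key are placeholders):

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect("my_ks")  # placeholders

    # trace=True asks Cassandra to record a trace for this read; the events show
    # the digest requests, the memtable/sstable reads on the replicas, and which
    # replica returned the full data.
    rs = session.execute("SELECT * FROM sensor_data WHERE id = %s", ("abc",), trace=True)
    for event in rs.get_query_trace().events:
        print(event.source, event.source_elapsed, event.description)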
Load balancing queries
Since you have a replication factor equal to the number of nodes, each node will hold all of your data. So, when a coordinator node receives a read query, it can satisfy it from its own data. In particular, if you use a LOCAL_ONE consistency level, the request will be pretty fast.
The client drivers implement the load balancing policies, which means that on your client you can configure how the queries will be spread around the cluster. Some more reading - ClientRequestsRead
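For example, with the Python driver you can combine a token-aware and a DC-aware policy so that each query is sent straight to a replica in your local datacenter (a sketch; the datacenter name and contact point are assumptions):

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

    # TokenAwarePolicy routes each statement to a node that owns the partition;
    # DCAwareRoundRobinPolicy keeps the traffic inside the local DC.
    profile = ExecutionProfile(
        load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc="dc1")),
        consistency_level=ConsistencyLevel.LOCAL_ONE,
    )
    cluster = Cluster(["127.0.0.1"], execution_profiles={EXEC_PROFILE_DEFAULT: profile})
    session = cluster.connect()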
If I have 5 nodes and I set a replication factor of 5, will reads be 5x faster?
No. It means you will have up to 5 copies of the data to ensure that your query can be satisfied when nodes are down. Cassandra does not divide up the work for the read. Instead it tries to force you to design your data in a way that makes the reads efficient and fast.
The best way to read from Cassandra is to make sure that each query you issue hits a single Cassandra partition. That means the first column of a simple PRIMARY KEY (x, y, z), or the whole first bracket of a compound PRIMARY KEY ((x, y), z), is provided as a query parameter.
This goes back to the Cassandra table design principle of designing your tables around your queries.
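To make that concrete, a small sketch with a made-up sensor table, showing which predicates hit a single partition:

    # sensor_id and day together form the partition key; reading_time is a
    # clustering column.
    SCHEMA = """
    CREATE TABLE readings (
        sensor_id    text,
        day          date,
        reading_time timestamp,
        value        double,
        PRIMARY KEY ((sensor_id, day), reading_time)
    )
    """

    # Hits exactly one partition: both partition-key columns are bound and the
    # clustering column is only range-restricted.
    GOOD = "SELECT * FROM readings WHERE sensor_id = %s AND day = %s AND reading_time > %s"

    # Cannot be routed to one partition (Cassandra rejects it without ALLOW
    # FILTERING): the partition key is not fully specified.
    BAD = "SELECT * FROM readings WHERE reading_time > %s"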
Replication is about copies of data and Partitioning is about distributing data.
https://docs.datastax.com/en/cassandra/3.0/cassandra/architecture/archPartitionerAbout.html
Some references on Cassandra data modeling:
https://www.datastax.com/dev/blog/the-most-important-thing-to-know-in-cassandra-data-modeling-the-primary-key
https://www.datastax.com/dev/blog/basic-rules-of-cassandra-data-modeling
It is recommended to keep partitions around 100 MB, but this is not compulsory.
You can use the cassandra-stress utility to get a report of how your reads and writes perform.
There is a multi-datacenter Cassandra environment, and the consistency level is set to LOCAL_QUORUM.
I want to know the write latency between the local datacenter and the other datacenters. What I mean is: when a write succeeds locally, how long does it take until the other datacenters have the replica?
This metric is not exposed by Cassandra.
I have found that write latency is collected in the org.apache.cassandra.service.StorageProxy.mutate method, and I want to add code there to collect the per-datacenter latency.
But the problem is that a Cassandra write finishes as soon as the number of replicas required by the consistency level have acknowledged it, so I cannot block the write to wait for the remote replicas.
How can I keep the memtable write and the metrics write in sync?
I have no idea how to proceed. Does anybody have an idea on how to achieve this? Please take a look and help.
There isn't anything available directly at this time, though there is a ticket with a patch available at CASSANDRA-11569.
There are some tricks you can try in the meantime.
If you enable tracing on a query (at CL ALL), you can check the trace events table to see when the mutation left the coordinator and when it arrived on each replica.
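A rough sketch of that trick with the Python driver (keyspace, table and values are placeholders): write at ALL with tracing enabled and compare the per-node event times.

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(["127.0.0.1"]).connect("my_ks")  # placeholders

    insert = SimpleStatement(
        "INSERT INTO events (id, ts, payload) VALUES (%s, toTimestamp(now()), %s)",
        consistency_level=ConsistencyLevel.ALL,  # wait for every replica in every DC
    )
    rs = session.execute(insert, ("probe-1", "x"), trace=True)

    # Each event carries the node it came from and its elapsed time, so you can
    # see when the coordinator forwarded the mutation and when each replica
    # (including the remote DCs) handled it.
    for event in rs.get_query_trace().events:
        print(event.source, event.source_elapsed, event.description)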
You can make a LOCAL_QUORUM write query, then an EACH_QUORUM write query, and track the difference.
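And a sketch of that comparison: time the same write at LOCAL_QUORUM and at EACH_QUORUM; the difference roughly approximates the extra wait for the remote datacenters (table and values are made up).

    import time

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(["127.0.0.1"]).connect("my_ks")  # placeholders
    QUERY = "INSERT INTO events (id, ts, payload) VALUES (%s, toTimestamp(now()), %s)"

    def timed_write(consistency):
        stmt = SimpleStatement(QUERY, consistency_level=consistency)
        start = time.perf_counter()
        session.execute(stmt, ("probe-1", "x"))
        return time.perf_counter() - start

    local = timed_write(ConsistencyLevel.LOCAL_QUORUM)  # waits on the local DC only
    every = timed_write(ConsistencyLevel.EACH_QUORUM)   # waits on a quorum in each DC
    print("approx. cross-DC overhead: %.3f ms" % ((every - local) * 1000))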
There's a problem with some of these metrics when tracking mutations: Cassandra will piggyback all the writes for a remote DC onto a single proxy write (instead of the coordinator sending to each node individually). If that proxy node hits a GC pause, the measurement is likely to show a spike. Speculative retry will help keep that from affecting latency in the extreme case, but then you're not really tracking your raw cross-DC latency. You may want to just consider "ping".
I'm trying to design an architecture of my streaming application and choose the right tools for the job.
This is how it works currently:
Messages from "application-producer" part have a form of (address_of_sensor, timestamp, content) tuples.
I've already implemented all functionality before Kafka, and now I've encountered a major flaw in the design. In the "Spark Streaming" part, the consolidated stream of messages is translated into a stream of events. The problem is that events are for the most part composite: they consist of multiple messages that occurred at the same time at different sensors.
I can't rely on "time of arrival to Kafka" as a means to detect "simultaneity". So I have to somehow sort messages in Kafka before extracting them with Spark, or, more precisely, make queries over Kafka messages.
Maybe Cassandra is the right replacement for Kafka here? I have a really simple data model, and only two possible types of queries to perform: query by address, and range query by timestamp. Maybe this is the right choice?
Does anybody have any numbers on Cassandra's throughput?
If you want to run queries on your time series, Cassandra may be the best fit: it is very write-optimized, and you can build 'wide' rows for your series. It is possible to take slices of your wide rows, so you can select a time range with only one query.
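A minimal sketch of such a model for the sensor messages described in the question (bucketing, types and names are assumptions):

    from datetime import datetime

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect("telemetry")  # placeholders

    # One partition per sensor address; messages inside the partition are
    # clustered by timestamp, so a time-range slice is a single sequential read.
    session.execute("""
    CREATE TABLE IF NOT EXISTS messages (
        sensor_address text,
        ts             timestamp,
        content        text,
        PRIMARY KEY ((sensor_address), ts)
    ) WITH CLUSTERING ORDER BY (ts DESC)
    """)

    # "All messages from this sensor in this window" is one slice query.
    rows = session.execute(
        "SELECT ts, content FROM messages WHERE sensor_address = %s AND ts >= %s AND ts < %s",
        ("sensor-17", datetime(2016, 1, 1, 0, 0), datetime(2016, 1, 1, 0, 5)),
    )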
On the other hand, Kafka can be considered a raw data flow: you don't have queries, only recently produced data. In order to collect data based on some key in the same partition, you have to select this key carefully. All data within the same partition is time-sorted.
A range query on timestamp is the classic use case for Cassandra. If you need address-based queries as well, you would have to make the address a clustering column. As far as Cassandra throughput is concerned, if you can invest in proper performance analysis of your Cassandra cluster, you can achieve very high write throughput. But I have used Spark SQL, the Cassandra driver, and the Spark Cassandra Connector, and they don't really give high query throughput until you have a big cluster with a high CPU configuration; they do not work well with small datasets.
Kafka should not be used as a data source for queries; it is more of a commit log.