Audit log in Cassandra 2.2.8 community edition - cassandra

Is there any way to log queries along with user that executed the query in Cassandra community edition?
I'm looking for a Server level solution, not driver/client based solution
Thanks!

Try nodetool settraceprobability
nodetool settraceprobability <value>
Sets the probability for tracing a request.
Value is a probability between 0 and 1.
Tracing a request usually requires at least 10 rows to be inserted.
A probability of 1.0 will trace everything whereas lesser amounts (for example, 0.10) only sample a certain percentage of statements.
The trace information is stored in a system_traces keyspace that holds two tables – sessions and events, which can be easily queried to answer questions, such as what the most time-consuming query has been since a trace was started. Query the parameters map and thread column in the system_traces.sessions and events tables for probabilistic tracing information.
Note: Care should be taken on large and active systems, as system-wide tracing will have a performance impact. Unless you are under very light load, tracing all requests (probability 1.0) will probably overwhelm your system.
If you don't want to use this, then you have to log the queries from the client side (see How to use Query Logger?). There is no other way.
Source : https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsSetTraceProbability.html
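
To illustrate how the sessions table could then be used, here is a minimal Python sketch. It assumes the rows have already been fetched (for example with SELECT session_id, client, started_at, duration, parameters FROM system_traces.sessions) and are represented as plain dicts; the sample rows, the function name traced_queries and the assumption that the statement text sits under the 'query' key of the parameters map are mine, so check them against your version. Note that, as far as I can tell, a session row carries the client address rather than the authenticated user name.

```python
def traced_queries(session_rows):
    """Return (duration_us, client, query) tuples, slowest first.

    Each row is a dict standing in for a row of system_traces.sessions.
    Assumption: the statement text is stored under parameters['query'].
    """
    results = []
    for row in session_rows:
        params = row.get("parameters") or {}
        query = params.get("query")
        duration = row.get("duration")
        if query is None or duration is None:
            continue  # not a plain query, or the session has not finished yet
        results.append((duration, row.get("client"), query))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


# Hand-written sample rows (hypothetical values, not real trace output).
sample_rows = [
    {"client": "10.0.0.5", "duration": 1200,
     "parameters": {"query": "SELECT * FROM ks.users WHERE id = 1"}},
    {"client": "10.0.0.7", "duration": 8500,
     "parameters": {"query": "SELECT * FROM ks.orders"}},
    {"client": "10.0.0.5", "duration": None, "parameters": {}},
]

for duration_us, client, query in traced_queries(sample_rows):
    print(client, duration_us, query)
```

With a real driver you would pass the fetched rows (converted to dicts) instead of sample_rows.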

Related

Could my large amount of tables (2k+) be causing my write timeout exceptions?

I'm running OS Cassandra 3.11.9 with Datastax Java Driver 3.8.0. I have a Cassandra keyspace that has multiple tables functioning as lookup tables / search indices. Whenever I receive a new POST request to my endpoint, I parse the object and insert it in the corresponding Cassandra table. I also put inserts to each corresponding lookup table. (10-20 per object)
When ingesting a lot of data into the system, I've been running into WriteTimeoutExceptions in the driver.
I tried to serialize the insert requests into the lookup tables by introducing Apache Camel and putting all the Statements into a queue that the Session could work off of, but it did not help.
With Camel, since the exceptions are now happening in the Camel thread, the test continues to run, instead of failing on the first exception. Eventually, the test seems to crash Cassandra. (Nothing in the Cassandra logs though)
I also tried to turn off my lookup tables and instead insert into the main table 15x per object (to simulate a similar number of writes as if I had the lookup tables on). This test passed with no exception, which makes me think the large number of tables is the problem.
Is a large number (2k+) of Cassandra tables a code smell? Should we rearchitect or just throw more resources at it? (Nothing indicative has shown in the logs, mostly just some status about the number of tables etc.; no exceptions.)
Can the Datastax Java Driver be used multithreaded like this? It says it is threadsafe.
A high number of tables has a direct effect on performance: see this doc (the whole series is a good source of information) and this blog post for more details. Basically, with ~1000 tables you get ~20-25% performance degradation.
That could be the reason; not completely direct, but related. For each table, Cassandra needs to allocate memory, reserve a part of the memtable space for it, keep information about it, etc. This specific problem could come from blocked memtable flushes, or something like that. Check nodetool tpstats and nodetool tablestats for blocked or pending memtable flushes. It's better to set up a continuous monitoring solution, such as Metrics Collector for Apache Cassandra, and watch the important metrics, which include this information as well, over a period of time.
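
If you want to automate the tpstats check, something along these lines might work. It is only a sketch: it assumes the 3.x-style table layout (pool name followed by Active, Pending, Completed, Blocked and All time blocked), the sample text is hand-written rather than captured from a real node, and pools_with_backlog is a name I made up.

```python
def pools_with_backlog(tpstats_text):
    """Return {pool_name: counters} for pools with pending or blocked tasks.

    Assumes each pool line has six whitespace-separated fields:
    name, active, pending, completed, blocked, all-time blocked.
    """
    flagged = {}
    for line in tpstats_text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # header, blank line, or the dropped-messages section
        name, counters = fields[0], fields[1:]
        if not all(c.isdigit() for c in counters):
            continue
        active, pending, completed, blocked, all_time_blocked = map(int, counters)
        if pending > 0 or blocked > 0 or all_time_blocked > 0:
            flagged[name] = {
                "pending": pending,
                "blocked": blocked,
                "all_time_blocked": all_time_blocked,
            }
    return flagged


# Hand-written sample in the general shape of `nodetool tpstats` output.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         125000         0                 0
MemtableFlushWriter               2        14            310         0                 0
CompactionExecutor                1         3            900         0                 0

Message type           Dropped
MUTATION                    42
"""

for pool, stats in pools_with_backlog(SAMPLE).items():
    print(pool, stats)
```

In practice you would feed it the captured output of nodetool tpstats on each node and alert when MemtableFlushWriter shows up.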

Cassandra - How to check table data is consistent at a given point in time?

How to find out when a Cassandra table becomes "eventually consistent"? Is there a definitive way to determine this at a given point in time? Preferably programmatically through the Datastax driver API? I checked out the responses to the following related questions but there does not seem to be anything more concrete than "check the nodetool netstats output".
Methods to Verify Cassandra Node Sync
how do i know if nodetool repair is finished
If your system is always online and serving operations, it may never become fully consistent at a single point in time unless you use consistency level ALL.
The repair process logs an error in the log file if it does not get a reply from other replica nodes because they were down, timed out, etc.
You can check the logs: if there are no errors related to AntiEntropy or streaming, your system is almost consistent.

Is TRACING ON in Cassandra the right choice to track the time taken in Cassandra?

When I execute a query on a table with 500000 entries, it completes in 1200 ms. But when I execute the same query with TRACING ON enabled, the tracing log shows a longer time, say 1850 ms.
So I would like to confirm whether the TRACING ON feature in Cassandra is a right choice for tracking time taken for executing queries?
There are metrics that will give you the amount of time spent on queries; you can most easily view them with nodetool proxyhistograms (doc) or by grabbing them directly from JMX. TRACING ON is for debugging why a request is slow. It's important to note that this is very expensive (and possibly adds time to the query, although most tracing is async) and should be avoided outside of debugging issues.
You can also use nodetool settraceprobability to globally record some % of the queries, which you can then look at, and maybe process with some tooling, in the events and sessions tables of the system_traces keyspace.
Per documentation https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlshTracing.html
Enables and disables tracing for transactions on all nodes in the
cluster. Use tracing to troubleshoot performance problems. Detailed
transaction information related Cassandra internal operations is
captured in the system_traces keyspace. When a query runs a session id
displays in the query results and an entry with the high-level details
such as session id and client, and session length, is written to the
system_traces.session table.
So it should be used for troubleshooting performance and hence measuring time taken.
The tracing information consists of the activity, the timestamp at which the activity occurred, the source of the activity and the time elapsed since the start of the request (source_elapsed). source_elapsed is in microseconds.
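
As a rough illustration of reading source_elapsed, here is a small Python sketch that turns trace events into per-step timings. It assumes the events have already been fetched from system_traces.events as dicts with activity, source and source_elapsed keys; the sample activities and addresses are made up, and treating source_elapsed as a per-source running clock is my reading of it rather than something the documentation above spells out.

```python
def step_durations(events):
    """Return (source, activity, elapsed_ms, delta_us) per event.

    delta_us is the gap to the previous event on the same source, which
    shows where the time within a request was actually spent.
    """
    last_seen = {}
    report = []
    ordered = sorted(events, key=lambda e: (e["source"], e["source_elapsed"]))
    for event in ordered:
        source = event["source"]
        elapsed_us = event["source_elapsed"]
        delta_us = elapsed_us - last_seen.get(source, 0)
        last_seen[source] = elapsed_us
        report.append((source, event["activity"], elapsed_us / 1000.0, delta_us))
    return report


# Hypothetical events; real traces contain many more steps.
sample_events = [
    {"source": "10.0.0.1", "activity": "Parsing statement", "source_elapsed": 100},
    {"source": "10.0.0.1", "activity": "Preparing statement", "source_elapsed": 450},
    {"source": "10.0.0.2", "activity": "Message received", "source_elapsed": 80},
    {"source": "10.0.0.1", "activity": "Request complete", "source_elapsed": 1200},
    {"source": "10.0.0.2", "activity": "Enqueuing response", "source_elapsed": 900},
]

for source, activity, elapsed_ms, delta_us in step_durations(sample_events):
    print(f"{source:10} {activity:22} {elapsed_ms:8.3f} ms  (+{delta_us} us)")
```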

Cassandra gossipinfo severity explained

I was unable to find good documentation or an explanation of what severity indicates in nodetool gossipinfo. I was looking for a detailed explanation but could not find a suitable one.
The severity is a value added to the latency in the dynamic snitch to determine which replica a coordinator will send the read's DATA and DIGEST requests to.
Its value depends on the IO used in compaction; it also tries to read /proc/stat (the same source as the iostat utility) to get actual disk statistics as its weight. In versions after 3.10 this was removed in https://issues.apache.org/jira/browse/CASSANDRA-11738. In previous versions you can disable it by setting -Dcassandra.ignore_dynamic_snitch_severity in the JVM options. The issue is that it weights IO use the same as latency. So if a node is GC thrashing and not doing much IO because of it, it could end up being treated as the target of most reads even though it's the worst possible node to send requests to.
You can still use JMX to set the value (to 1) if you want to exclude a node from being used for reads. An example use case is running nodetool disablebinary so the application won't query the node directly, then setting the severity to 1. That node would then only be queried by the cluster if there's a CL.ALL request or a read repair. It's a way to take a node "offline" for maintenance from a read perspective but still allow it to receive mutations so it doesn't fall behind.
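
The effect described in this answer can be seen in a toy model. This is not Cassandra's actual code: the normalized latency_score, the numbers and the pick_replica function are all assumptions made for illustration, with the only rule taken from the answer being that severity is added to the latency-based score and the lowest total wins.

```python
def pick_replica(replicas, ignore_severity=False):
    """Pick the replica with the lowest (latency score + severity)."""
    def score(replica):
        severity = 0.0 if ignore_severity else replica["severity"]
        return replica["latency_score"] + severity
    return min(replicas, key=score)["name"]


replicas = [
    # Responds quickly, but is busy compacting, so it reports high severity.
    {"name": "node-a", "latency_score": 0.2, "severity": 0.9},
    # GC thrashing: slow responses, yet almost no IO, so severity stays at 0.
    {"name": "node-b", "latency_score": 0.7, "severity": 0.0},
]

print(pick_replica(replicas))                        # node-b
print(pick_replica(replicas, ignore_severity=True))  # node-a

# Maintenance case: pinning severity to 1 pushes node-a to the back.
drained = [
    {"name": "node-a", "latency_score": 0.2, "severity": 1.0},
    {"name": "node-b", "latency_score": 0.7, "severity": 0.0},
]
print(pick_replica(drained))                         # node-b
```

With severity counted, the thrashing node wins the read; with it ignored, the healthy node does, which is the behaviour the ignore flag is meant to give you.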
Severity reports activity that happens on the particular node (compaction, etc.), and this information is then used to decide which node could better handle the request. There is a discussion in the original JIRA about this functionality and how this information is used.
P.S. Please see Chris's answer about changes in post 3.10 versions - I wasn't aware about these changes...

cassandra stress testing distribution of writes

How do I build a test that will tell me which Cassandra nodes are being written to? I want to specify the number of nodes and the replication factor, and get back which nodes are affected by each write as the result of an attempted insert. This will tell me how evenly the data would be distributed at runtime. I have test data, so what I really need is a way to call a mock Cassandra that's configured the way I would run in production and that returns which node is affected.
I don't see a way to do that with the Cassandra stress tool, unless I am completely missing it...
Since you are interested in knowing all nodes that were impacted by a query, I would recommend looking into tracing.
Here are a few approaches you could take:
Use cassandra-stress and enable tracing with nodetool settraceprobability on each of your C* nodes, setting it to a low value like .01. This will enable tracing on 1% of your queries, and you can observe the trace results via the system_traces.events and sessions tables (see this article for more information on how to use these tables). The trace will include information like which node was used as the coordinator, what other nodes were used as replicas for reads/writes and how long it took to process individual steps. Note that how your application ends up querying data may be slightly different than cassandra-stress, since which nodes are queried is influenced by your Cluster configuration. cassandra-stress uses JavaDriverClient#connect. You will want to compare your configuration with what JavaDriverClient is doing and understand the differences. You could also modify JavaDriverClient to match your application.
You may also want to write a test against your application that uses Cassandra. The java-driver has an API for enabling tracing and observing the data, which I've documented in a video here. Additionally, when you get a ResultSet back, there is a method getExecutionInfo() that provides information such as which hosts were tried, but this only includes nodes that were used as a coordinator, not all the replicas.
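
Whichever approach you take, the trace events can be reduced to a per-node picture with a few lines of code. The Python sketch below assumes you have already fetched session_id and source from system_traces.events and turned the rows into dicts; the sample rows and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def nodes_per_session(event_rows):
    """Map each traced request (session_id) to the set of nodes it touched."""
    nodes = defaultdict(set)
    for row in event_rows:
        nodes[row["session_id"]].add(row["source"])
    return dict(nodes)


def node_hit_counts(event_rows):
    """Count in how many traced requests each node took part."""
    counts = Counter()
    for sources in nodes_per_session(event_rows).values():
        counts.update(sources)
    return counts


# Hypothetical rows: two traced writes on a three-node cluster.
sample_events = [
    {"session_id": "s1", "source": "10.0.0.1"},
    {"session_id": "s1", "source": "10.0.0.2"},
    {"session_id": "s1", "source": "10.0.0.1"},
    {"session_id": "s2", "source": "10.0.0.2"},
    {"session_id": "s2", "source": "10.0.0.3"},
]

for node, hits in sorted(node_hit_counts(sample_events).items()):
    print(node, hits)
```

A roughly even count per node across a large sample of traced writes would suggest the data is being spread evenly.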
