We have experienced that if we roll out DDL CQL scripts that alter an existing table in parallel, there is a substantial chance of corrupting the keyspace to the point that we had to recreate it.
We have now serialized this process, including the creation of that keyspace. Now there is a heated discussion about whether Cassandra explicitly supports the creation of different keyspaces in parallel.
I suppose this is OK, but since the cluster is large, we would like a second opinion, so I am asking here:
Can we safely assume that parallel creation of different keyspaces is safe in Cassandra?
In current versions of Cassandra this is not possible - you need to wait for schema agreement after each DDL statement, including the creation of other keyspaces. Drivers usually wait for some time (10 seconds by default) for confirmation that all nodes in the cluster have the same schema version. Depending on the driver, you can explicitly check for schema agreement - either in the result set returned after executing the statement, or via the cluster metadata. For example, in Java it could look like the following:
Metadata metadata = cluster.getMetadata();
for (int i = 0; i < commands.length; i++) {
    System.out.println("Executing '" + commands[i] + "'");
    ResultSet rs = session.execute(commands[i]);
    // If the coordinator reports schema disagreement, poll until all nodes agree
    if (!rs.getExecutionInfo().isSchemaInAgreement()) {
        while (!metadata.checkSchemaAgreement()) {
            System.out.println("Schema isn't in agreement, sleep 1 second...");
            Thread.sleep(1000);
        }
    }
}
Newer versions of Cassandra include improvements in this area, for example via CASSANDRA-13426 (committed into 4.0) and CASSANDRA-10699 (not yet done).
I am running a Spark job where some data is loaded from a Cassandra table. From that data I build some insert and delete statements and execute them (using forEach):
boolean deleteStatus= connector.openSession().execute(delete).wasApplied();
boolean insertStatus = connector.openSession().execute(insert).wasApplied();
System.out.println(delete+":"+deleteStatus);
System.out.println(insert+":"+insertStatus);
When I run it locally, I see the respective results in the table.
However, when I run it on a cluster, sometimes the changes take effect and sometimes they don't.
I checked the stdout in the Spark web UI, and both queries were printed along with true. (The data was loaded correctly, but sometimes only the insert is reflected, sometimes only the delete, sometimes both, and most of the time neither.)
Specifications:
Spark slaves on the same machines as the Cassandra nodes (each node runs two slave instances).
Spark master on a separate machine.
Repair done on all nodes.
Cassandra restarted.
boolean deleteStatus= connector.openSession().execute(delete).wasApplied();
boolean insertStatus = connector.openSession().execute(insert).wasApplied();
This is a known anti-pattern: you create a new Session object for each query, which is extremely expensive.
Create the session once and re-use it for all the queries, as in the sketch below.
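A minimal sketch of that reuse pattern, assuming connector is the Spark Cassandra Connector's CassandraConnector from your snippet and that deletes/inserts are the lists of CQL statements you build from the loaded data:

// Open the session once per executor/JVM and reuse it for every statement,
// instead of calling connector.openSession() for each query.
Session session = connector.openSession();
try {
    for (String delete : deletes) {
        System.out.println(delete + ":" + session.execute(delete).wasApplied());
    }
    for (String insert : inserts) {
        System.out.println(insert + ":" + session.execute(insert).wasApplied());
    }
} finally {
    session.close();   // close once, after all statements on this executor have run
}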
To see which queries are being executed and sent to Cassandra, use the slow query logger feature as a hack: http://datastax.github.io/java-driver/manual/logging/#logging-query-latencies
The idea is to set the threshold to a ridiculously low value so that every query will be considered slow and displayed in the log.
You should use this hack only for testing, of course.
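For example, a sketch assuming Java driver 3.x and that you have a handle on the driver's Cluster instance: register a QueryLogger with a tiny constant threshold so that every query is reported.

// Make every query count as "slow" so it shows up in the log (testing only).
QueryLogger queryLogger = QueryLogger.builder()
        .withConstantThreshold(1)   // 1 ms threshold: virtually every query is logged
        .build();
cluster.register(queryLogger);
// Also set the logger com.datastax.driver.core.QueryLogger.SLOW to DEBUG
// in your logging configuration so the messages are actually emitted.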
I was following this link to use a batch transaction without using the BATCH keyword.
Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .build();
Session session = cluster.newSession();

// Save off the prepared statement you're going to use
PreparedStatement statement = session.prepare(
        "INSERT INTO tester.users (userID, firstName, lastName) VALUES (?,?,?)");

List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();
for (int i = 0; i < 1000; i++) {
    // please bind with whatever actually useful data you're importing
    BoundStatement bind = statement.bind(i, "John", "Tester");
    ResultSetFuture resultSetFuture = session.executeAsync(bind);
    futures.add(resultSetFuture);
}

// Not returning anything useful, but makes sure everything has completed before you exit the thread.
for (ResultSetFuture future : futures) {
    future.getUninterruptibly();
}
cluster.close();
My question: with the given approach, is it possible to INSERT, UPDATE or DELETE data in different tables so that if any of those statements fails, all of them fail, while keeping the same performance (as described in the link)?
What I tried with this approach: I inserted and deleted data in different tables, one query failed, and all the previous queries had already been executed and had updated the db.
With BATCH I can see that if any statement fails, all statements fail. But using BATCH across different tables is an anti-pattern, so what is the solution?
With BATCH I can see that if any statement fails, all statements fail.
Wrong: the guarantee of a LOGGED BATCH is that if some statements in the batch fail, they will be retried until they succeed.
But using BATCH across different tables is an anti-pattern, so what is the solution?
ACID transactions are not possible with Cassandra; they would require some sort of global lock or global coordination and would be prohibitive performance-wise.
However, if you don't care about the performance cost, you can implement a global lock/lease system yourself using lightweight transaction primitives, as described here; a rough sketch follows below.
But be ready to face poor performance.
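Purely as an illustration (the locks table and its columns are made up for this sketch, and it only gives you mutual exclusion, not rollback):

// Hypothetical lock table: CREATE TABLE locks (name text PRIMARY KEY, owner uuid);
UUID owner = UUID.randomUUID();

// Try to acquire the lease atomically with a lightweight transaction.
ResultSet acquire = session.execute(
        "INSERT INTO locks (name, owner) VALUES ('multi_table_update', ?) IF NOT EXISTS", owner);

if (acquire.wasApplied()) {
    try {
        // We hold the lease: run the inserts/deletes against the different tables here.
    } finally {
        // Release the lease, but only if we still own it.
        session.execute("DELETE FROM locks WHERE name = 'multi_table_update' IF owner = ?", owner);
    }
} else {
    // Someone else holds the lock: back off and retry later.
}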
I'm banging my head against this, but, frankly speaking, my brain won't get it - or so it seems.
I have a column family that holds jobs for a rather large group of actors. It is a central job management and scheduling table that must be distributed and available throughout the whole cluster, and it may even need to cross datacenter boundaries some day in the near future.
Each job executor actor system - the ones that actually execute the jobs - is installed alongside one Cassandra node, that is, on the same machine. Of course there is actually a master actor that pulls the jobs and distributes them to the actor agents, but that has nothing to do with my question.
There are also some actor systems that can create jobs in the central job table to be executed by other actors or even other actor systems, but usually the jobs are loaded batch-wise or manually through a web interface.
An actor that is to execute a job only ever queries its local Cassandra node. When finished, it updates the job table to indicate that the job is done. Under normal circumstances, this write should also only touch job records for which its local Cassandra node is authoritative.
Now, sometimes it may happen that an actor system on a given host has nothing to do. In this case it should indeed get jobs from other nodes too, but of course it will still only talk to its local Cassandra node. I know this works and it doesn't bother me a bit.
What keeps me up at night is this:
How would I create a compound key so that a Cassandra node is authoritative for the job entries of its local actor system, and thereby its job execution actors, without splitting the job table into multiple column families or the like?
In other words: how can I create a compound key that makes sure that a) jobs are evenly distributed across my cluster,
b) a local query on the job table only returns jobs for which this Cassandra node is authoritative, and
c) my distributed agent system still has the possibility to fetch jobs from other nodes, in case it has no jobs of its own to execute?
A last word on c) above: I do not want to issue 2 queries in the case there is no local job, but still only one!
Any hints on this?
This is the general structure of the job table so far:
ClusterKey UUID: Primary Key
JobScope String: HOST / GLOBAL / SERVICE / CHANNEL
JobIdentifier String: Web-Crawler, Twitter
Description String:
URL String:
JobType String: FETCH / CLEAN / PARSE /
Job String: Definition of the job
AdditionalData Collection:
JobStatus String: NEW / WORKING / FINISHED
User String:
ValidFrom Timestamp:
ValidUntil Collection:
Still in the process of setting everything up, so no queries are defined so far. But an actor will pull jobs out of it, set the status, and so on.
Cassandra has no way of "pinning" a key to a node, if that's what you are after.
If I were you, I'd stop worrying about whether my local node was authoritative for some set of data, and start leveraging the built-in consistency controls in Cassandra for managing the set of nodes that you read from or write to.
Lots of information here on read consistency and write consistency - using the right consistency will ensure that your application scales well while keeping it logically correct: http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
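For instance, a sketch (the table and column names are guesses based on the job table layout in the question) of reading and writing at QUORUM instead of relying on which node happens to be local:

// Read and write with QUORUM: a majority of replicas must respond,
// so correctness no longer depends on talking to an "authoritative" node.
Statement readJob = new SimpleStatement(
        "SELECT * FROM jobs WHERE clusterkey = ?", jobId)
        .setConsistencyLevel(ConsistencyLevel.QUORUM);
ResultSet jobs = session.execute(readJob);

Statement markFinished = new SimpleStatement(
        "UPDATE jobs SET jobstatus = 'FINISHED' WHERE clusterkey = ?", jobId)
        .setConsistencyLevel(ConsistencyLevel.QUORUM);
session.execute(markFinished);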
Another item worth mentioning is atomic "compare and swap", also known as lightweight transactions. Let's say you want to ensure that a given job is only performed once. You could add a field indicating whether the job has been "picked up", then query on that field (where picked_up = 0) and simultaneously (and atomically) update the field to indicate that you are "picking up" that work. That way no other actors will pick it up again.
Info on lightweight transactions here: http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_ltwt_transaction_c.html
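A sketch of that compare-and-swap against the JobStatus column (again, the table and column names are assumptions based on the question):

// Atomically claim a job: this only succeeds if nobody else has moved it out of NEW yet.
ResultSet claim = session.execute(
        "UPDATE jobs SET jobstatus = 'WORKING', user = ? WHERE clusterkey = ? IF jobstatus = 'NEW'",
        actorId, jobId);

if (claim.wasApplied()) {
    // This actor won the job; no other actor can pick it up now.
} else {
    // Another actor claimed it first; move on to the next candidate job.
}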
I have a very simple cluster with 2 nodes.
I have created a keyspace with SimpleStrategy replication and a replication factor of 2.
For reads and writes I always use the default data consistency level of ONE.
If I take down one of the two nodes, then using the DataStax Java driver I can still read data, but when I try to write I get "Not enough replica available for query at consistency ONE (1 required but only 0 alive)".
Strangely, if I execute exactly the same insert statement using the CQL console, it works without any problem. Even when using the CQL console, the data consistency level was ONE.
Am I missing something?
TIA
Update
I have done some more tests, and the problem appears only when I use a BatchStatement. If I execute the prepared statement directly, it works. Any idea?
Here is the code:
Cluster cluster = Cluster.builder()
        .addContactPoint("192.168.1.10")
        .addContactPoint("192.168.1.12")
        .build();

Session session = cluster.connect();
session.execute("use giotest");

BatchStatement batch = new BatchStatement();
PreparedStatement statement = session.prepare(
        "INSERT INTO hourly(series_id, timestamp, value) VALUES (?, ?, ?)");

for (int i = 0; i < 50; i++) {
    batch.add(statement.bind(new Long(i), new Date(), 2345.5));
}

session.execute(batch);
batch.clear();

session.close();
cluster.close();
Batches are atomic by default: if the coordinator fails mid-batch, Cassandra will make sure other nodes replay the remaining requests. It uses a distributed batch log for that (see this post for more details).
This batch log must be replicated to at least one replica other than the coordinator, otherwise that would defeat the above mechanism.
In your case, there is no other replica, only the coordinator. So Cassandra is telling you that it cannot provide the guarantees of an atomic batch. See also the discussion on CASSANDRA-7870.
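If you don't actually need that replay guarantee, one possible workaround (a sketch only, reusing the session and prepared statement from your code above) is an unlogged batch, which skips the batch log entirely and therefore does not need an extra replica for it:

// Unlogged batch: no batch log is written, so the coordinator alone is enough,
// but you lose the automatic replay if the coordinator dies mid-batch.
BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
for (int i = 0; i < 50; i++) {
    batch.add(statement.bind(new Long(i), new Date(), 2345.5));
}
session.execute(batch);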
If you haven't already, make sure you have specified both hosts at the driver level.
I want to update multiple rows in 2 CFs.
I don't care about the order in which they get updated.
But is it guaranteed that if one update succeeds, the others will eventually succeed too, even if some C* node fails in between?
Does the Hector BatchMutation class use batch update or atomic batch update, as these are two separate things?
You should use an atomic batch in CQL3. This guarantees that either the entire batch succeeds or the entire batch fails. An example from the CQL3 docs:
BEGIN BATCH
INSERT INTO users (userid, password, name) VALUES ('user2', 'ch#ngem3b', 'second user');
UPDATE users SET password = 'ps22dhds' WHERE userid = 'user3';
INSERT INTO users (userid, password) VALUES ('user4', 'ch#ngem3c');
DELETE name FROM users WHERE userid = 'user1';
APPLY BATCH;
The Hector BatchMutation class uses the Thrift operation batch_mutate. This is weaker than atomic_batch_mutate, which is the Thrift equivalent of the above. batch_mutate is only atomic for updates on the same key (the updates can be to different CFs, though), whereas atomic_batch_mutate is atomic across all updates. I don't think Hector has implemented atomic_batch_mutate, so you will need to move to CQL3 and a CQL3-capable driver, e.g. DataStax's Java driver; a minimal sketch with that driver is shown below.
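For reference, a sketch of the same kind of atomic (logged) batch through the DataStax Java driver (the users table comes from the CQL example above; the contact point, keyspace and second column family are made up for illustration):

Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Session session = cluster.connect("mykeyspace");   // assumed keyspace

// BatchStatement defaults to Type.LOGGED, i.e. an atomic batch.
BatchStatement batch = new BatchStatement();
batch.add(new SimpleStatement(
        "UPDATE users SET password = 'ps22dhds' WHERE userid = 'user3'"));
batch.add(new SimpleStatement(
        "INSERT INTO user_audit (userid, action) VALUES ('user3', 'password_change')"));   // hypothetical second CF

session.execute(batch);
cluster.close();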