I am seeing some interesting behavior with Cassandra lightweight transactions on a standalone Cassandra instance. I am using DataStax Enterprise 5.0.2 for my testing. The issue is that an update executed as a lightweight transaction returns true, which means it was applied, but a subsequent query on the same table shows that the row is NOT updated. Please note that I tried the same thing in a clustered environment and it worked absolutely fine! So I am just trying to understand what's going wrong in my environment.
Here is a simple example of what I am doing.
I create a simple table as provided below:
CREATE TABLE smart.TOPICSCONSUMERASSNSTATUS (
TOPNM text,
PARTID int,
STATUS text,
PRIMARY KEY (TOPNM,PARTID)
);
I put the following set of preload data in for testing purposes:
insert into smart.topicsconsumerassnstatus (topnm, partid, status) values ('ESP', 0, 'UNASSIGNED');
insert into smart.topicsconsumerassnstatus (topnm, partid, status) values ('ESP', 1, 'UNASSIGNED');
insert into smart.topicsconsumerassnstatus (topnm, partid, status) values ('ESP', 2, 'UNASSIGNED');
insert into smart.topicsconsumerassnstatus (topnm, partid, status) values ('ESP', 3, 'UNASSIGNED');
insert into smart.topicsconsumerassnstatus (topnm, partid, status) values ('ESP', 4, 'UNASSIGNED');
Now I run a first select statement to get the details from the table:
select * from smart.topicsconsumerassnstatus where topnm='ESP';
It lists all partids with status UNASSIGNED. To assign partid 0, I then fire the following update statement:
update smart.topicsconsumerassnstatus set status='ASSIGNED' where topnm='ESP' and partid=0 if status='UNASSIGNED';
It returns true. And now, when I fire the above select query again, it lists all 5 rows with status UNASSIGNED. Interestingly, repeated execution of the update statement keeps returning true every time - this clearly means that the actual data is not getting updated in the table.
I have looked at the query trace as well, and the update seems to be working fine, as the CAS operation is reported successful.
Also note that this behavior appears only after a query with "allow filtering" has been used at least once; from then on it persists...
Can anyone please shed some light on what could be the issue? Is it something to do with the "allow filtering" clause?
This is in Cassandra 3.11.4.
I'm running a modified version of a query that previously worked fine in my app. The original query, which works fine, was:
SELECT SerializedRecord FROM SxRecord WHERE Mark=?
I modified the query to add a timestamp range (I also added an index for the timestamp, though I don't think that is relevant):
SELECT SerializedRecord FROM SxRecord WHERE Mark=? AND Timestamp>=? AND Timestamp<=?
This results in:
ResponseError {reHost = datacenter1:rack1:127.0.0.1:9042, reTrace = Nothing, reWarn = [], reCause = ServerError "java.lang.IndexOutOfBoundsException: Index: 1, Size: 1"}
When this occurs, I don't see the query CQL being logged in system_traces.sessions, which is interesting, because if I put a syntax error into the query, it is still logged there.
Additionally, when I run an (as far as I know) identical query, up to the timestamps, in cqlsh, there doesn't seem to be a problem:
cqlsh> SELECT SerializedRecord FROM test_fds.SxRecord WHERE Mark=8391 AND Timestamp >= '2021-03-06 00:00:00.000+0000' AND Timestamp <= '2021-03-09 00:00:00.000+0000';
serializedrecord
------------------
This results in the following query trace:
cqlsh> select parameters from system_traces.sessions;
parameters
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
{'consistency_level': 'ONE', 'page_size': '100', 'query': 'SELECT SerializedRecord FROM test_fds.SxRecord WHERE Mark=8391 AND Timestamp >= ''2021-03-06 00:00:00.000+0000'' AND Timestamp <= ''2021-03-09 00:00:00.000+0000'';', 'serial_consistency_level': 'SERIAL'}
null
It seems that the query, executed inside a prepared/bound statement, is not receiving all the parameters it needs OR is receiving too many (something bound in previous code).
You don't see the query traced because the driver does not even perform the query, as it has unbound parameters.
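As an illustration, here is a minimal sketch with the DataStax Java driver (used elsewhere on this page; the original question appears to use a different client, but the same rule applies): the number, order, and types of the bound values must match the placeholders. The connection details, the assumption that Mark is an int, and the epoch timestamps are mine, not the question's.
import java.util.Date;
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class BindCheck {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Three placeholders, so exactly three values must be bound, in order.
        PreparedStatement ps = session.prepare(
            "SELECT SerializedRecord FROM test_fds.SxRecord "
            + "WHERE Mark=? AND Timestamp>=? AND Timestamp<=?");

        // A CQL timestamp binds as java.util.Date in this driver.
        Date from = new Date(1614988800000L); // 2021-03-06 00:00:00 UTC
        Date to = new Date(1615248000000L);   // 2021-03-09 00:00:00 UTC

        BoundStatement bs = ps.bind(8391, from, to); // one value per '?'
        session.execute(bs);
        cluster.close();
    }
}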
I create a table with a counter column using the com.datastax.driver.core package, with a function in my class:
public void doStartingJob(){
session.execute("CREATE KEYSPACE myks WITH replication "
+ "= {'class':'SimpleStrategy', 'replication_factor':1};");
session.execute("CREATE TABLE myks.clients_count(ip_address text PRIMARY KEY,"
+ "request_count counter);");
}
After this I deleted the table entry from cqlsh like this:
jay@jayesh-380:~$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh:myks> DELETE FROM clients_count WHERE ip_address='127.0.0.1';
Then, to insert a row with the same primary key, I used the following statement (via cqlsh):
UPDATE myks.clients_count SET request_count = 1 WHERE ip_address ='127.0.0.1';
But it is not allowed:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot set the value of counter column request_count (counters can only be incremented/decremented, not set)"
But I want the record's counter column to be set to 1, with the same primary key. (This is a functional requirement.)
How can I do this?
The usage of counters is a bit strange, but you'll get used to it. The main thing, however, is that counters cannot be reused. Once you delete the counter value for a particular primary key, that counter is lost forever. This is by design, and I think it is not going to change.
Back to your question, the first of your problems is the initial DELETE. Don't.
Second, if the counter value for a primary key doesn't exist, C* will treat it as zero by default. Following the documentation, to load data into the counter for the first time you have to issue:
UPDATE myks.clients_count SET request_count = request_count + 1 WHERE ip_address = '127.0.0.1';
And a SELECT will return the correct answer: 1
Again, beware of deletes! Don't! If you do, any subsequent query:
UPDATE myks.clients_count SET request_count = request_count + 1 WHERE ip_address = '127.0.0.1';
will NOT fail, but the counter will NOT be updated.
Another thing to note is that C* doesn't support atomic read-and-update (or update-and-read) on counter columns. That is, you cannot issue an update and, within the same query, get the new (or the old) value of the counter. You need to perform two distinct queries, one SELECT and one UPDATE, but in a multi-client environment the value you get from the SELECT may not reflect the counter value at the time of the UPDATE.
Your app will definitely fail if you underestimate this.
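As a minimal sketch of that two-query pattern, reusing the session from the question's code (the method name readThenIncrement is made up for illustration):
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Reads the current counter value, then increments it. The two statements
// are NOT atomic: another client can update the counter in between, so the
// value read may already be stale when the UPDATE runs.
public long readThenIncrement(Session session) {
    Row row = session.execute(
        "SELECT request_count FROM myks.clients_count "
        + "WHERE ip_address = '127.0.0.1'").one();
    long before = (row == null) ? 0L : row.getLong("request_count");
    session.execute(
        "UPDATE myks.clients_count SET request_count = request_count + 1 "
        + "WHERE ip_address = '127.0.0.1'");
    return before;
}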
I have a PostgreSQL database (v9.5.3) that's hosting "jobs" for workers to pull, run, and commit back.
When a worker wants a job, it runs something to the effect of:
SELECT MIN(sim_id) FROM job WHERE job_status = 0;
-- status 0 indicates it's ready to be run
job is a table with this schema:
CREATE TABLE commit_schema.job (
sim_id serial NOT NULL,
controller_id smallint NOT NULL,
controller_parameters smallint NOT NULL,
model_id smallint NOT NULL,
model_parameters smallint NOT NULL,
client_id smallint,
job_status smallint DEFAULT 0,
initial_glucose_id smallint NOT NULL
);
Afterwards, it uses this sim_id to piece together a bunch of parameters in a JOIN:
SELECT a.par1, b.par2 FROM
a INNER JOIN b ON a.sim_id = b.sim_id;
These parameters are then returned to the worker, along with the sim_id, and the job is run. The sim_id is locked by setting job.job_status to 1, using an UPDATE:
UPDATE job SET job_status = 1 WHERE sim_id = $1;
The results are then committed using that same sim_id.
Ideally:
Workers would never, under any circumstances, be able to get the same sim_id.
Two workers requesting a job at the same time won't error out; one will just have to wait to receive a job.
I think that using the serializable isolation level will ensure that MIN() always returns unique sim_ids, but I believe this may also be achievable at the read committed isolation level. Then again, can MIN() concurrently and deterministically give unique sim_ids to two concurrent workers?
This should work just fine for concurrent access using the default isolation level Read Committed and FOR UPDATE SKIP LOCKED (new in pg 9.5):
UPDATE commit_schema.job j
SET job_status = 1
FROM (
SELECT sim_id
FROM commit_schema.job
WHERE job_status = 0
ORDER BY sim_id
LIMIT 1
FOR UPDATE SKIP LOCKED
) sub
WHERE j.sim_id = sub.sim_id
RETURNING sim_id;
job_status should probably be defined NOT NULL.
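For instance (assuming no existing rows hold a NULL job_status):
ALTER TABLE commit_schema.job ALTER COLUMN job_status SET NOT NULL;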
Be wary of certain corner cases - detailed explanation in this related answer on dba.SE:
Postgres UPDATE … LIMIT 1
To address your comment
There are various ways to return from a function:
Make it a simple SQL function instead of PL/pgSQL.
Or use RETURN QUERY with PL/pgSQL.
Or assign the result to a variable with RETURNING ... INTO - which can be an OUT parameter, so it will be returned at the end of the function automatically. Or any other variable and return it explicitly.
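For example, here is a minimal PL/pgSQL sketch combining the UPDATE above with RETURNING ... INTO and an OUT parameter (the function name claim_job is made up for illustration):
CREATE OR REPLACE FUNCTION claim_job(OUT claimed_sim_id int) AS
$func$
BEGIN
   UPDATE commit_schema.job j
   SET    job_status = 1
   FROM  (
      SELECT sim_id
      FROM   commit_schema.job
      WHERE  job_status = 0
      ORDER  BY sim_id
      LIMIT  1
      FOR    UPDATE SKIP LOCKED
      ) sub
   WHERE  j.sim_id = sub.sim_id
   RETURNING j.sim_id
   INTO   claimed_sim_id;  -- OUT parameter is returned automatically
END
$func$ LANGUAGE plpgsql;

-- Each call claims at most one job; returns NULL when none are ready.
SELECT * FROM claim_job();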
Related (with code examples):
Can I make a plpgsql function return an integer without using a variable?
Returning from a function with OUT parameter
I have an issue with my CQL; Cassandra is giving me a no viable alternative at input '(' (...WHERE id = ? if [(]...) error message. I think there is a problem with my statement.
UPDATE <TABLE> USING TTL 300
SET <attribute1> = 13381990-735b-11e5-9bed-2ae6d3dfc201
WHERE <attribute2> = dfa2efb0-7247-11e5-a9e5-0242ac110003
IF (<attribute1> = null OR <attribute1> = 13381990-735b-11e5-9bed-2ae6d3dfc201) AND <attribute3> = 0;
Any idea where the problem is in the statement above?
It would help to have your complete table structure, so, to test your statement, I made a couple of educated guesses.
With this table:
CREATE TABLE lwtTest (attribute1 timeuuid, attribute2 timeuuid PRIMARY KEY, attribute3 int);
This statement works, as long as I don't add the lightweight transaction on the end:
UPDATE lwttest USING TTL 300 SET attribute1=13381990-735b-11e5-9bed-2ae6d3dfc201
WHERE attribute2=dfa2efb0-7247-11e5-a9e5-0242ac110003;
Your lightweight transaction...
IF (attribute1=null OR attribute1=13381990-735b-11e5-9bed-2ae6d3dfc201) AND attribute3 = 0;
...has a few issues.
"null" in Cassandra is not similar (at all) to its RDBMS counterpart. Not every row needs to have a value for every column. Those CQL rows without values for certain column values in a table will show "null." But you cannot query by "null" since it isn't really there.
The OR keyword does not exist in CQL.
You cannot use extra parentheses to separate conditions in your WHERE clause or your lightweight transaction.
Bearing those points in mind, the following UPDATE and lightweight transaction runs without error:
UPDATE lwttest USING TTL 300 SET attribute1=13381990-735b-11e5-9bed-2ae6d3dfc201
WHERE attribute2=dfa2efb0-7247-11e5-a9e5-0242ac110003
IF attribute1=13381990-735b-11e5-9bed-2ae6d3dfc201 AND attribute3=0;
[applied]
-----------
False
--Update: It seems it was a glitch with the virtual machine. I restarted the Cassandra service and it works as expected.
--Update: It seems that the problem is not in the code. I tried to execute the insert statement in a Cassandra client and I get the same behavior: no error is displayed, and nothing is inserted.
The column that causes this behavior is of type timestamp; the problem appears when I set this column to certain values (e.g. '2015-08-25 22:15:12').
The table is:
CREATE TABLE player (
    msisdn varchar PRIMARY KEY,
    game int,
    keyword varchar,
    inserted timestamp,
    lang int,
    mo int,
    mt int,
    qid int,
    score int
);
I am new to Cassandra, downloaded the VirtualBox snapshot to test it.
I tried the batch example code and it did nothing, so, as people suggested, I tried executing the prepared statement directly.
var addPlayer = casDemoSession.Prepare("INSERT INTO player (msisdn,qid,keyword,mo,mt,game,score,lang,inserted) Values (?,?,?,?,?,?,?,?,?)");
for (int i = 0; i < 20; i++) {
var bs = addPlayer.Bind(getRandMSISDN(), 1, "", 1, 0, 0, 10, 0, DateTime.Now);
bs.EnableTracing(true);
casDemoSession.Execute(bs);
}
The code above neither throws any exceptions nor inserts any data. I tried to trace the query, but it does not show the actual CQL query.
PlanetCassandra V0.1 VM running Cassandra 2.0.1
datastax driver 2.6 https://github.com/datastax/csharp-driver
One thing that might be missing is the keyspace name for your player table. Usually you would have "INSERT INTO <keyspace>.player (..."
If you're able to run cqlsh, could you add the output of "DESCRIBE TABLE <keyspace>.player" to your question, and show what happens when you attempt to do the insert in cqlsh?
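For instance, something like this in cqlsh (the keyspace name casdemo is an assumption; substitute your own):
DESCRIBE TABLE casdemo.player;

-- Keyspace-qualified insert matching the prepared statement's columns.
INSERT INTO casdemo.player (msisdn, qid, keyword, mo, mt, game, score, lang, inserted)
VALUES ('15551234567', 1, '', 1, 0, 0, 10, 0, '2015-08-25 22:15:12');

SELECT * FROM casdemo.player;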