I have a cluster of 3 Cassandra 2.0 nodes. For my application I wrote a test that writes some data to Cassandra and reads it back. In general this works fine.
The curious thing is that after I restart my computer, this test fails: after writing, I read back the value I just wrote and get null instead, even though there was no exception during the write.
If I manually truncate the used column family, the test passes. After that I can execute the test as often as I want and it keeps passing. Furthermore, it doesn't matter whether there are already values in Cassandra or not. The result is always the same.
If I look at the CLI and the CQL shell, I get two different views:
Does anyone have an idea what is going wrong? The timestamp in the CLI is updated after re-execution, so it seems to be a read problem?
A part of my code:
For inserts I tried
Insert.Options insert = QueryBuilder.insertInto(KEYSPACE_NAME,TABLENAME)
.value(ID, id)
.value(JAHR, zonedDateTime.getYear())
.value(MONAT, zonedDateTime.getMonthValue())
.value(ZEITPUNKT, date)
.value(WERT, entry.getValue())
.using(timestamp(System.nanoTime() / 1000));
and
Insert insert = QueryBuilder.insertInto(KEYSPACE_NAME,TABLENAME)
.value(ID, id)
.value(JAHR, zonedDateTime.getYear())
.value(MONAT, zonedDateTime.getMonthValue())
.value(ZEITPUNKT, date)
.value(WERT, entry.getValue());
My select looks like
Select.Where select = QueryBuilder.select(WERT)
.from(KEYSPACE_NAME,TABLENAME)
.where(eq(ID, id))
.and(eq(JAHR, zonedDateTime.getYear()))
.and(eq(MONAT, zonedDateTime.getMonthValue()))
.and(eq(ZEITPUNKT, Date.from(instant)));
The consistency level is QUORUM (for both reads and writes) and the replication factor is 3.
I'd say this seems to be a problem with timestamps, since a truncate solves it. In Cassandra the last write wins, and this could be caused by the use of System.nanoTime(), since:
This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time.
...
The values returned by this method become meaningful only when the difference between two such values, obtained within the same instance of a Java virtual machine, is computed.
http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime()
This means that the write that occurred before the restart could have been performed "in the future" compared to the write after the restart. The query would not fail, but the written value would simply not be visible because a "newer" value is already available.
Do you have a requirement to use sub-millisecond precision for the insert timestamps? If possible I would recommend using System.currentTimeMillis() instead of nanoTime().
http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#currentTimeMillis()
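For example, the insert from the question could use a wall-clock based timestamp instead (a minimal sketch of that change; Cassandra write timestamps are in microseconds, hence the multiplication by 1000):
Insert.Options insert = QueryBuilder.insertInto(KEYSPACE_NAME, TABLENAME)
        .value(ID, id)
        // ... other .value(...) calls as in the question ...
        .using(timestamp(System.currentTimeMillis() * 1000L));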
If you do have a requirement for sub-millisecond precision, it would be possible to combine System.currentTimeMillis() with some kind of atomic counter ranging from 0 to 999 and use that as the timestamp. This would however break if multiple clients insert the same row at the same time.
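If you go that route, here is a rough sketch (class and method names are mine, not from the post) of a per-JVM microsecond timestamp built from wall-clock milliseconds plus a 0-999 counter; as noted, two clients writing the same row in the same millisecond can still collide:
import java.util.concurrent.atomic.AtomicLong;

public final class MicrosTimestampGenerator {

    private final AtomicLong seq = new AtomicLong();

    // Wall-clock milliseconds converted to microseconds, plus a 0-999 tail so
    // that up to 1000 writes per millisecond from this JVM get distinct timestamps.
    public long next() {
        return System.currentTimeMillis() * 1000L + (seq.getAndIncrement() % 1000L);
    }
}
You would then pass generator.next() to timestamp(...) in the insert instead of System.nanoTime() / 1000.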
Related
I am using a Logic App to transform some data for an integration. I am trying to avoid using For Each loops as the amount of data I am working with is high, and these incur a cost for each action and iteration of the for each loop.
However, the integration I am working with requires a unique incrementing number for each line. They don't have to be sequential, or even start at 1, but the order should be kept the same.
So with the above, the first one would get LineNumber 1, the second LineNumber 2, etc. (or, like I said, it could be 67829, 67835, etc.).
I tried to set a variable with ticks(utcNow()) before the start of the mapping, and then use sub(ticks(utcNow()), variables('startTicks')), but this is evaluated once and the same number is applied to all rows.
My next thought is to use an Azure Function or inline JavaScript to go through afterwards and assign them, but I am wondering if there is a way to accomplish this in the Select.
or like I said, it could be 67829, 67835, etc..
Answering this requirement:
Inside the Select option:
indexOf(string(variables('<DATA Variable>')),string(item()))
Explanation:
item() is the current item in the Select. It is converted to a string and searched for within the stringified version of the entire data variable; the character index at which it is found is returned, which gives each row a unique, order-preserving number.
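To illustrate the mechanics outside of Logic Apps (plain Java, only to show why this yields unique, increasing, non-sequential numbers; the row values are made up):
import java.util.Arrays;
import java.util.List;

public class IndexOfLineNumbers {
    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "{\"order\":\"A-1001\",\"qty\":2}",
                "{\"order\":\"A-1002\",\"qty\":5}",
                "{\"order\":\"A-1003\",\"qty\":1}");
        // analogous to string(variables('<DATA Variable>'))
        String whole = rows.toString();

        for (String row : rows) {
            // analogous to indexOf(string(variables('<DATA Variable>')), string(item()))
            System.out.println(whole.indexOf(row) + " -> " + row);
        }
    }
}
Each row is found at a strictly larger character offset than the previous one, so the numbers keep the original order even though they are not consecutive.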
OUTPUT
Please note:
I did not get a chance to check this on a very large dataset.
This may fail if a specific row (all the values in the row) is repeated in the data. I assume this is not your case (the order number should be unique).
I have 3 node Spanner instance, and a single table that contains around 4 billion rows. The DDL looks like this:
CREATE TABLE predictions (
name STRING(MAX),
...,
model_version INT64,
) PRIMARY KEY (name, model_version)
I'd like to setup a job to periodically remove some old rows from this table using the Python Spanner client. The query I'd like to run is:
DELETE FROM predictions WHERE model_version <> ?
According to the docs, it sounds like I would need to execute this as a Partitioned DML statement. I am using the Python Spanner client as follows, but am experiencing timeouts (504 Deadline Exceeded errors) due to the large number of rows in my table.
# this always throws a "504 Deadline Exceeded" error
database.execute_partitioned_dml(
    "DELETE FROM predictions WHERE model_version <> @model_version",
    params={"model_version": 104},
    param_types={"model_version": Type(code=INT64)},
)
My first intuition was to see if there was some sort of timeout I could increase, but I don't see any timeout parameters in the source :/
I did notice there was a run_in_transaction method in the Spanner lib that contains a timeout parameter, so I decided to deviate from the partitioned DML approach to see if using this method worked. Here's what I ran:
def delete_old_rows(transaction, model_version):
    delete_dml = "DELETE FROM predictions WHERE model_version <> {}".format(model_version)
    dml_statements = [
        delete_dml,
    ]
    status, row_counts = transaction.batch_update(dml_statements)

database.run_in_transaction(delete_old_rows,
                            model_version=104,
                            timeout_secs=3600,
                            )
What's weird about this is the timeout_secs parameter appears to be ignored, because I still get a 504 Deadline Exceeded error within a minute or 2 of executing the above code, despite a timeout of one hour.
Anyways, I'm not too sure what to try next, or whether or not I'm missing something obvious that would allow me to run a delete query in a timely fashion on this huge Spanner table. The model_version column has pretty low cardinality (generally 2-3 unique model_version values in the entire table), so I'm not sure if that would factor into any recommendations. But if someone could offer some advice or suggestions, that would be awesome :) Thanks in advance
The reason that setting timeout_secs didn't help is that the argument is unfortunately not the timeout for the transaction. It is the retry timeout for the transaction, i.e. it sets the deadline after which the transaction will stop being retried.
We will update the docs for run_in_transaction to explain this better.
The root cause was that the total timeout for the streaming RPC calls was set too low in the client libraries: 120s for streaming APIs (e.g. ExecuteStreamingSQL, which is used by partitioned DML calls).
This has been fixed in the client library source code, changing it to a 60 minute timeout (which is the maximum), and it will be part of the next client library release.
As a workaround, in Java, you can configure the timeouts as part of the SpannerOptions when you connect your database. (I do not know how to set custom timeouts in Python, sorry)
import com.google.api.core.ApiFunction;
import com.google.api.gax.retrying.RetrySettings;
import com.google.api.gax.rpc.UnaryCallSettings;
import com.google.cloud.spanner.Spanner;
import com.google.cloud.spanner.SpannerOptions;
import org.threeten.bp.Duration;

final RetrySettings retrySettings =
    RetrySettings.newBuilder()
        .setInitialRpcTimeout(Duration.ofMinutes(60L))
        .setMaxRpcTimeout(Duration.ofMinutes(60L))
        .setMaxAttempts(1)
        .setTotalTimeout(Duration.ofMinutes(60L))
        .build();
SpannerOptions.Builder builder =
    SpannerOptions.newBuilder()
        .setProjectId("[PROJECT]");
builder
    .getSpannerStubSettingsBuilder()
    .applyToAllUnaryMethods(
        new ApiFunction<UnaryCallSettings.Builder<?, ?>, Void>() {
          @Override
          public Void apply(UnaryCallSettings.Builder<?, ?> input) {
            input.setRetrySettings(retrySettings);
            return null;
          }
        });
builder
    .getSpannerStubSettingsBuilder()
    .executeStreamingSqlSettings()
    .setRetrySettings(retrySettings);
builder
    .getSpannerStubSettingsBuilder()
    .streamingReadSettings()
    .setRetrySettings(retrySettings);
Spanner spanner = builder.build().getService();
The first suggestion is to try gcloud instead.
https://cloud.google.com/spanner/docs/modify-gcloud#modifying_data_using_dml
Another suggestion is to also pass a range on name so that you limit the number of rows scanned. For example, you could add something like STARTS_WITH(name, 'a') to the WHERE clause to make sure each transaction touches a small number of rows, but first you will need to know the domain of the name column values.
The last suggestion is to avoid using '<>' if possible, as it is generally pretty expensive to evaluate.
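For illustration, a rough Java sketch of the range idea (the lowercase a-z prefixes are an assumption of mine; adjust them to the real domain of your name values). It runs one partitioned DML statement per prefix so each statement scans a smaller slice of the table:
import com.google.cloud.spanner.DatabaseClient;
import com.google.cloud.spanner.Statement;

public class DeleteOldPredictions {
    // dbClient is assumed to be a DatabaseClient for the predictions database.
    static void deleteOldRows(DatabaseClient dbClient, long keepVersion) {
        for (char prefix = 'a'; prefix <= 'z'; prefix++) {
            Statement stmt =
                    Statement.newBuilder(
                                    "DELETE FROM predictions "
                                            + "WHERE model_version <> @version AND STARTS_WITH(name, @prefix)")
                            .bind("version").to(keepVersion)
                            .bind("prefix").to(String.valueOf(prefix))
                            .build();
            // Partitioned DML returns a lower bound on the number of deleted rows.
            long deleted = dbClient.executePartitionedUpdate(stmt);
            System.out.println("prefix " + prefix + ": deleted at least " + deleted + " rows");
        }
    }
}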
I have a 2 node cassandra cluster with RF=2.
When a delete from x where y CQL statement is issued, is it known how long it will take for all nodes to delete the row?
What I see in one of the integration tests:
A row is deleted, and the result of the deletion is tested with a select * from y where id = xxx statement. What I see is that sometimes the result is not null as expected; the deleted row is still found.
Is reading with CL=2 the correct approach to get the result I am expecting?
Make sure the servers' clocks are in sync if you are using server-side timestamps.
Better to use client-side timestamps.
Is reading with CL=2 the correct approach to get the result I am expecting?
I assume you are using the default consistency level of ONE for the delete, and since 1 + 2 > 2 (i.e. W + R > N) in your case, it is OK.
Local to the replica it will be sub-millisecond. The time is dominated by the network hops from app -> coordinator -> replica -> coordinator -> app. Use QUORUM or LOCAL_QUORUM on both the write and the read for consistency across sequential requests like that.
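For example, with the DataStax Java driver you could pin both statements to QUORUM (a minimal sketch; the keyspace, table, and session variables are placeholders, not taken from the question):
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

// Delete at QUORUM (with RF=2 that means both replicas acknowledge)...
Statement delete = new SimpleStatement("DELETE FROM ks.y WHERE id = ?", id)
        .setConsistencyLevel(ConsistencyLevel.QUORUM);
session.execute(delete);

// ...then read at QUORUM, so W + R > RF and the read sees the tombstone.
Statement select = new SimpleStatement("SELECT * FROM ks.y WHERE id = ?", id)
        .setConsistencyLevel(ConsistencyLevel.QUORUM);
Row row = session.execute(select).one();   // null once the delete is visible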
I am new to Cassandra and am having an issue with counters double counting sometimes. I am trying to keep track of daily event counts for certain events. Here is my table structure:
create table pipes.pipe_event_counts (
count counter,
pipe_id text,
event_type text,
date text,
PRIMARY KEY ((pipe_id, event_type, date))
);
The driver I am using is the Datastax Java driver, and I am compiling and binding parameters to the following prepared statement:
incrementPipeEventCountStatement = CassandraClient.getInstance().getSession().prepare(
QueryBuilder.update("pipes", PIPE_EVENT_COUNT_TABLE_NAME).with(incr("count")).
where(eq("pipe_id", "?")).and(eq("date", "?")).and(eq("event_type", "?")).
getQueryString()
);
incrementPipeEventCountStatement.bind(
event.getAttrubution(Meta.PIPE_ID), dateString, event.getType().toString()
)
The problem is very weird. Sometimes when I process a single event, the counter increments properly by 1. However, the majority of the time, it double increments. I've been looking at my code for some time now and can't find any issues that would cause a second increment.
Is my implementation of counters in Cassandra correct for my use case? I think it is, but I could be losing my mind. I'm hoping someone can help me confirm so I can focus in the right area to find my problem.
Important edit: This is the query I'm running to check the count after the event:
select count from pipes.pipe_event_counts where pipe_id = 'homepage' and event_type = 'click' and date = '2015-04-07';
The thing with counters is that they are not idempotent operations, so when you retry (without knowing whether the original write was successful) you may end up over-counting.
Alternatively, you can choose to never retry, and then you may under-count.
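For example (a sketch only; boundIncrement stands for the bound statement built from your prepared update, and the logger is hypothetical), the safer pattern is to treat a timed-out counter write as "unknown" rather than retrying it:
import com.datastax.driver.core.exceptions.WriteTimeoutException;

try {
    session.execute(boundIncrement);
} catch (WriteTimeoutException e) {
    // The increment may or may not have been applied. Retrying here is what
    // leads to double counting; never retrying risks under-counting instead.
    // Record the failure and reconcile later rather than blindly retrying.
    log.warn("Counter increment timed out; not retrying to avoid over-counting", e);
}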
As Chris shared, there are some issues with the counter implementation pre-2.1 that make the over-counting issue much more severe. There are also performance issues associated with counters, so you want to look into these in detail before you push a counter deployment to production.
Here are the related Jiras to help you make informed decisions:
Counters ++ (major improvement, fixed in 2.1): https://issues.apache.org/jira/browse/CASSANDRA-6504
Memory / GC issues from large counter workloads, Counter Column (major improvement, fixed in 2.1): https://issues.apache.org/jira/browse/CASSANDRA-6405
Counters into separate cells (final solution, ETA 3.1): https://issues.apache.org/jira/browse/CASSANDRA-6506
We are doing the following to update the value of a counter; now we wonder whether there is a straightforward way to get back the updated counter value immediately.
mutator.incrementCounter(rowid1, "cf1", "counter1", value);
There is no single 'incrementAndGet' operation in the Cassandra Thrift API.
Counters in Cassandra are eventually consistent and non-atomic. A fragile ConsistencyLevel.ALL operation is required to get a "guaranteed to be updated" counter value, i.e. to perform a consistent read. ConsistencyLevel.QUORUM is not sufficient (as specified in the counters design document: https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf).
To implement an incrementAndGet method that looks consistent, you might first read the counter value, then issue the increment mutation, and return (read value + inc).
For example, if the previous counter value is 10 or 20 (depending on the replica) and you add 50 to it, read-before-increment will return either 60 or 70, while read-after-increment might still return 10 or 20.
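A rough sketch of that read-before-increment idea using Hector (which the mutator call in the question appears to come from); I have not verified the exact method signatures against a specific Hector version, so treat this as an outline rather than a drop-in implementation:
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HCounterColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;
import me.prettyprint.hector.api.query.CounterQuery;

public final class Counters {

    // Reads the current value, then increments, and returns read value + inc.
    // This only *looks* consistent: concurrent writers and replica lag can
    // still make the returned value differ from what a later read shows.
    public static long incrementAndGet(Keyspace keyspace, Mutator<String> mutator,
                                       String rowId, String cf, String column, long inc) {
        CounterQuery<String, String> query =
                HFactory.createCounterColumnQuery(keyspace,
                        StringSerializer.get(), StringSerializer.get());
        query.setColumnFamily(cf).setKey(rowId).setName(column);
        HCounterColumn<String> current = query.execute().get();
        long before = (current == null) ? 0L : current.getValue();

        mutator.incrementCounter(rowId, cf, column, inc);
        return before + inc;
    }
}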
The only way to get the value is to query for it. There is no increment-then-read functionality available in Cassandra.