I am new to Cassandra and am trying to do a multi-node setup on two Mac machines. It's not DataStax Cassandra.
My IP is 10.100.1.12 and the other machine's IP is 10.100.1.15. I have changed the following properties in the cassandra.yaml files on both machines:
10.100.1.15:
seeds: "127.0.0.1,10.100.1.12,10.100.1.15"
listen_address: 10.100.1.15
rpc_address: 10.100.1.15
endpoint_snitch: GossipingPropertyFileSnitch
10.100.1.12:
seeds: "127.0.0.1,10.100.1.12,10.100.1.15"
listen_address: 10.100.1.12
rpc_address: 10.100.1.12
endpoint_snitch: GossipingPropertyFileSnitch
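One assumption worth checking, since it isn't shown here: GossipingPropertyFileSnitch reads each node's datacenter and rack from conf/cassandra-rackdc.properties, so both machines need that file present and agreeing on the datacenter name, for example:
# conf/cassandra-rackdc.properties (identical dc on both nodes)
dc=dc1
rack=rack1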
Cassandra runs fine, and cqlsh also opens using the command
bin/cqlsh 10.100.1.12
but when I try to retrieve the row count of a table, it shows me this error:
ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
I tried changing the read_request_timeout_in_ms property from 5000 to 20000, but I am still getting the same error. Could anyone please point out what I am doing wrong?
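Before tuning server-side timeouts any further, it may be worth confirming that both nodes actually joined the same ring, since a read at consistency ONE will time out if the replica that owns the requested data never responds. A quick check, assuming a tarball install (adjust the path as needed):
bin/nodetool status
Both machines should be listed with status UN (Up/Normal).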
The table schema is as follows:
cqlsh:app_auditor> describe table traffic_data;
CREATE TABLE app_auditor.traffic_data (
current_time bigint PRIMARY KEY,
appid bigint,
attributes map<text, text>,
device bigint,
flow_type text,
message_time bigint
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 86400
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
The count query I am using is:
select count(*) from traffic_data;
In Cassandra, count(*) is very costly: Cassandra has to scan all rows on all nodes just to give you the count, which is why you're getting the timeout exception.
Instead of using count(*), maintain a counter table, like the one below:
CREATE TABLE traffic_counter (
type int PRIMARY KEY,
traffic_count counter
);
When a row is inserted into traffic_data, increment the value of traffic_count:
UPDATE traffic_counter SET traffic_count = traffic_count + 1 WHERE type = 0;
Now you can select the traffic count very efficiently:
SELECT traffic_count FROM traffic_counter WHERE type = 0;
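Two caveats worth keeping in mind with this approach: a counter table cannot mix counters with regular non-key columns, and counter updates are not idempotent, so a retried write can skew the total. If rows are ever deleted from traffic_data, you would decrement the counter the same way, just with subtraction:
UPDATE traffic_counter SET traffic_count = traffic_count - 1 WHERE type = 0;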
Related
I have this table in Cassandra:
CREATE TABLE wear_dealer.product_color_size_stock (
productcode text,
colorcode text,
sizecode text,
ean text,
shortdescription text,
stock int,
PRIMARY KEY (productcode, colorcode, sizecode)
) WITH CLUSTERING ORDER BY (colorcode ASC, sizecode ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE INDEX product_color_size_stock_stock_idx ON wear_dealer.product_color_size_stock (stock);
How can I update shortdescription when I have only the value for productcode?
When I perform this query:
cqlsh:wear_dealer> update seasons_product_color_size
set shortdescription ='AAA'
where productcode='RUNTS';
I get the following error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Some partition key parts are missing: seasoncode"
Any strategy to overcome this?
Many thanks in advance!
Unfortunately, CQL does not allow writes for a partial key. Remember that Cassandra treats INSERTs and UPDATEs the same. So when this:
UPDATE seasons_product_color_size
SET shortdescription ='AAA'
WHERE productcode='RUNTS';
Returns this: "Some partition key parts are missing: seasoncode"
It's saying that Cassandra doesn't know which node to write the data to, because it doesn't have the complete partition key. In SQL, the database would just iterate through all rows in the table and update them according to your WHERE clause. But Cassandra is specifically designed not to allow operations like that.
For this query you will need to figure out the missing seasoncodes separately, and UPDATE each row individually.
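A sketch of how that could look, assuming the partition key is (seasoncode, productcode); the seasoncode value below is made up, and if the table also has clustering columns, each UPDATE must name those too:
-- list the partition keys, then pick out the ones with productcode='RUNTS' client-side
SELECT DISTINCT seasoncode, productcode FROM seasons_product_color_size;
-- update each matching row with its complete partition key
UPDATE seasons_product_color_size SET shortdescription = 'AAA' WHERE seasoncode = 'W18' AND productcode = 'RUNTS';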
Cassandra routes writes by the full partition key, so having supplied only part of it you cannot update this way. You need to include the missing partition key column (and any clustering columns) in the WHERE clause, e.g. with a made-up seasoncode:
UPDATE seasons_product_color_size SET shortdescription ='AAA' WHERE productcode='RUNTS' AND seasoncode='W18'
I am trying to run a stress test using the cassandra-stress tool with profiles on a 6-node cluster with a replication factor of 3.
./cassandra-stress user profile=/path/to/cassandra_stress.yaml duration=2h ops\(insert=20,select=10\) cl=local_quorum no-warmup -node nodeaddress -transport truststore=/path/to/tls/truststore.jks truststore-password=***** -rate threads=5 -log level=verbose file=/path/to/log -graph file=graph_.html title='Graph' 2>&1 &
The execution stops at some point with a ReadTimeout and the logs show the following:
com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_QUORUM (2 replica were required but only 1 acknowledged the write)
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (3 replica were required but only 2 acknowledged the read)
I am not sure why it is taking cl=local_quorum for writes but not for reads. Any insights would be helpful.
Profile
# Keyspace Name
keyspace: d3
keyspace_definition: |
  CREATE KEYSPACE d3 WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '3'} AND DURABLE_WRITES = true;
# Table name
table: stress_offheap_long
table_definition: |
  CREATE TABLE d3.stress_offheap_long (
    dart_id timeuuid,
    dart_version_id timeuuid,
    account_id timeuuid,
    amount double,
    data text,
    state text,
    PRIMARY KEY (dart_id, dart_version_id)
  ) WITH CLUSTERING ORDER BY (dart_version_id DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys':'ALL', 'rows_per_partition':'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
columnspec:
- name: dart_id
size: gaussian(36..64)
population: uniform(1..10M)
- name: dart_version_id
size: gaussian(36..64)
- name: account_id
size: gaussian(36..64)
population: uniform(1..10M)
- name: amount
size: fixed(1)
- name: data
size: gaussian(5000..20000)
- name: state
size: gaussian(1..2)
population: fixed(1)
### Batch Ratio Distribution Specifications ###
insert:
partitions: fixed(1)
select: fixed(1)/1000
batchtype: UNLOGGED # Unlogged batches
#
# A list of queries you wish to run against the schema
#
queries:
select:
cql: select * from stress_offheap_long where dart_id = ? and dart_version_id=? LIMIT 1
fields: samerow
I have a table as follows:
cqlsh> DESC relation.students;
CREATE TABLE relation.students (
student_id uuid,
created_at timeuuid,
name text,
PRIMARY KEY (student_id, created_at)
) WITH CLUSTERING ORDER BY (created_at DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
CREATE INDEX index_students_student_id ON relation.students (student_id);
When I checked the timeuuid functions, I saw there is a toTimestamp function, but when I try to use it, it gives an error.
cqlsh> select student_id, name, toTimestamp(created_at) from relation.students where student_id=c2e160e6-27bb-4e25-91ca-e33dc7538d25;
InvalidRequest: code=2200 [Invalid query] message="Unknown function 'toTimestamp'"
I am using version 2.1:
cqlsh> select peer, release_version from system.peers;
peer | release_version
-------------+-----------------
127.0.0.1 | 2.1.14.1272
toTimestamp was introduced in Cassandra 2.2.
Check the Cassandra Release News.
You can also create your own function; check UDFs. But UDFs were also introduced in Cassandra 2.2.
If you use a Cassandra version lower than 2.2, you can't use this function.
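That said, if the goal is just to render the timeuuid as a timestamp, the older dateOf() function (which toTimestamp() later superseded) should already be available on 2.1:
select student_id, name, dateOf(created_at) from relation.students where student_id=c2e160e6-27bb-4e25-91ca-e33dc7538d25;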
I'm running Cassandra version 2.1.2 and cqlsh 5.0.1.
Here is the table weather.log; weather is the keyspace, and we query at consistency level ONE.
I have 2 nodes configured.
CREATE KEYSPACE weather WITH replication = {'class': 'NetworkTopologyStrategy', 'us-east': '1'} AND durable_writes = true;
CREATE TABLE weather.log (
ip inet,
ts timestamp,
city text,
country text,
PRIMARY KEY (ip, ts)
) WITH CLUSTERING ORDER BY (ts DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
When we run the query:
select distinct ip from weather.log
We get inconsistent, wrong responses: one time we get 99, the next time 1600, and so on (where the actual number should be > 2000).
I have tried this query with the consistency level set to ALL as well. It didn't work.
Why is this happening? I need to get all the keys. How do I get all the primary keys?
It looks like you might be affected by CASSANDRA-8940. I'd suggest updating to the latest 2.1.x release and verifying whether this issue is fixed for you.
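If upgrading isn't immediately possible, one quick experiment (assuming the paging code path is the culprit, as in that ticket) is to turn paging off in cqlsh and re-run the query:
PAGING OFF;
select distinct ip from weather.log;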
I am receiving an OperationTimedOut error while running an ALTER TABLE command in cqlsh. How is that possible? Since this is just a table metadata update, shouldn't this operation run almost instantaneously?
Specifically, this is an excerpt from my cqlsh session
cqlsh:metric> alter table metric with gc_grace_seconds = 86400;
OperationTimedOut: errors={}, last_host=sandbox73vm230
The metric table currently has a gc_grace_seconds of 864000. I am seeing this behavior in a 2-node cluster and in a 6-node 2-datacenter cluster. My nodes seem to be communicating fine in general (e.g. I can insert in one and read from the other). Here is the full table definition (a cyanite 0.1.3 schema with DateTieredCompactionStrategy, clustering and caching changes):
CREATE TABLE metric.metric (
tenant text,
period int,
rollup int,
path text,
time bigint,
data list<double>,
PRIMARY KEY ((tenant, period, rollup, path), time)
) WITH CLUSTERING ORDER BY (time ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'timestamp_resolution': 'SECONDS', 'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';
I realize at this point the question is pretty old, and you may have either figured out the answer or otherwise moved on, but wanted to post this in case others stumbled upon it.
The default cqlsh request timeout is 10 seconds. You can adjust this by starting up cqlsh with the --request-timeout option set to some value that allows your ALTER TABLE to run to completion, e.g.:
cqlsh --request-timeout=1000000
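If you'd rather not pass the flag on every invocation, the same timeout can usually be made persistent in the cqlshrc file (the exact option name has varied across cqlsh versions, so treat this as a sketch):
# ~/.cassandra/cqlshrc
[connection]
request_timeout = 1000000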