When I run a count query against C*, I get a result with a warning:
cqlsh:my_keyspace> SELECT count(*) from user;
count
-------
1
(1 rows)
Warnings : Aggregation query used without partition key
The warning appears because C* has to scan across all nodes to compute the count.
But when I do this query:
cqlsh:my_keyspace> SELECT * FROM user;
 last_name | first_name | title
-----------+------------+-------
       Doe |       John |   Mr.
(1 rows)
This query doesn't show me a warning. Isn't this also supposed to scan all nodes?
Yes, both queries do a full table scan but the former uses an aggregate function while the latter doesn't.
COUNT() is an aggregate function -- it returns a final count by aggregating all rows in the query. This is the reason SELECT COUNT(*) generates a warning while SELECT * doesn't. Cheers!
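As an aside, the warning disappears when the partition key is restricted, because the aggregation is then confined to a single partition. A minimal sketch, assuming last_name is the partition key of user (the column order in the output above suggests it is):
-- Counts within one partition only, so no warning is emitted
-- (assumes last_name is the partition key):
SELECT count(*) FROM user WHERE last_name = 'Doe';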
Related
Is there any performance tuning to do for a write-bound workload in YugabyteDB? We thought that by simply adding nodes to our YugabyteDB cluster, without further tuning, we would see a noticeable increase in write throughput; however, this is not the case. The schema can be found below.
   Column   |            Type             | Collation | Nullable |           Default            | Storage  | Stats target | Description
------------+-----------------------------+-----------+----------+------------------------------+----------+--------------+-------------
 update_id  | character varying(255)      |           | not null |                              | extended |              |
 node_id    | character varying(255)      |           | not null |                              | extended |              |
 data       | character varying           |           | not null |                              | extended |              |
 created_at | timestamp without time zone |           |          | timezone('utc'::text, now()) | plain    |              |
Indexes:
"test_pkey" PRIMARY KEY, lsm (update_id HASH)
"test_crat" lsm (created_at DESC)
This table has tablets spread across all tservers with RF=3. created_at is a timestamp that changes all the time. At this point the table has no more than two days of data, and all new inserts acquire a new timestamp.
In the case of the schema called out above, the test_crat index is limited to 1 tablet because it is range-sharded. Since created_at has only recent values, they will all end up going to 1 shard/tablet even with tablet splitting, meaning that all inserts go to 1 shard. As explained in this Google Spanner documentation (YugabyteDB's sharding, replication, and transactions architecture is based on Spanner's), this is an antipattern for scalability. As mentioned in that documentation:
If you need a global (cross node) timestamp ordered table, and you need to support higher write rates to that table than a single node is capable of, use application-level sharding. Sharding a table means partitioning it into some number N of roughly equal divisions called shards. This is typically done by prefixing the original primary key with an additional ShardId column holding integer values between [0, N). The ShardId for a given write is typically selected either at random, or by hashing a part of the base key. Hashing is often preferred because it can be used to ensure all records of a given type go into the same shard, improving performance of retrieval. Either way, the goal is to ensure that, over time, writes are distributed across all shards equally. This approach sometimes means that reads need to scan all shards to reconstruct the original total ordering of writes.
What that would mean is: to get recent changes, you would have to query each of the shards. Suppose you have 32 shards:
select * from raw3 where shard_id = 0 and created_at > now() - INTERVAL 'xxx';
...
select * from raw3 where shard_id = 31 and created_at > now() - INTERVAL 'xxx';
On the insert, every row could just be given a random value for your shard_id column from 0..31. And your index would change from:
(created_at DESC)
to
(shard_id HASH, created_at DESC)
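A minimal DDL sketch of that change (the raw3 table, the index name, and the column values below are illustrative, carried over from the queries above; the application picks the random shard at insert time):
alter table raw3 add column shard_id int;
create index raw3_by_shard on raw3 (shard_id HASH, created_at DESC);
-- each insert picks a random shard in 0..31:
insert into raw3 (update_id, node_id, data, shard_id)
values ('u1', 'n1', 'payload', floor(random() * 32)::int);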
Another approach, which may be less intuitive but more effective, is to use a partial index for each shard that you want.
Here is a simple example using 4 shards:
create index partial_0 ON raw3(created_at DESC) where (extract(epoch from timezone('utc',created_at)) * 1000)::bigint % 4=0;
The partial index above only includes rows where the epoch in milliseconds of the created_at timestamp, modulo 4, equals 0. You repeat this for the other 3 shards:
create index partial_1 ON raw3(created_at DESC) where (extract(epoch from timezone('utc',created_at)) * 1000)::bigint % 4 = 1;
create index partial_2 ON raw3(created_at DESC) where (extract(epoch from timezone('utc',created_at)) * 1000)::bigint % 4 = 2;
create index partial_3 ON raw3(created_at DESC) where (extract(epoch from timezone('utc',created_at)) * 1000)::bigint % 4 = 3;
And then when you query, PostgreSQL is smart enough to pick the right index:
yugabyte=# explain analyze select * from raw3 where (extract(epoch from timezone('utc',created_at)) * 1000)::bigint % 4 = 3 AND created_at < now();
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Index Scan using partial_3 on raw3 (cost=0.00..5.10 rows=10 width=16) (actual time=1.429..1.429 rows=0 loops=1)
Index Cond: (created_at < now())
Planning Time: 0.210 ms
Execution Time: 1.502 ms
(4 rows)
No need for a new shard_id column in the base table or in the index. If you want to reshard down the road, you can recreate new partial indexes with different shards and drop the old indexes.
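For example, a hypothetical reshard from 4 to 8 partial indexes (index names illustrative):
-- create the new, finer-grained partial indexes first:
create index partial8_0 ON raw3(created_at DESC) where (extract(epoch from timezone('utc',created_at)) * 1000)::bigint % 8 = 0;
-- ...repeat for remainders 1 through 7, then drop the old indexes:
drop index partial_0;
drop index partial_1;
drop index partial_2;
drop index partial_3;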
More information about the DocDB sharding layer within YugabyteDB can be found here. If you are interested in the different sharding strategies we evaluated, and why we decided on consistent hash sharding as the default sharding strategy, take a look at this blog written by our Co-Founder and CTO Karthik Ranganathan.
I am trying to find a way to determine if the table is empty in Cassandra DB.
cqlsh> SELECT * from examples.basic ;
key | value
-----+-------
(0 rows)
I am running count(*) to get the number of rows, but I am getting a warning message, so I wanted to know if there is a better way to check whether the table is empty (zero rows).
cqlsh> SELECT count(*) from examples.basic ;
count
-------
0
(1 rows)
Warnings :
Aggregation query used without partition key
cqlsh>
Aggregations, like count, can be overkill for what you are trying to accomplish, especially with the star wildcard: if there is any data in your table, the query will need to do a full table scan. This can be quite expensive if you have many records.
One way to get the result you are looking for is the query
cqlsh> SELECT key FROM keyspace1.table1 LIMIT 1;
Empty table:
The resultset will be empty
cqlsh> SELECT key FROM keyspace1.table1 LIMIT 1;
key
-----
(0 rows)
Table with data:
The resultset will have a record
cqlsh> SELECT key FROM keyspace1.table1 LIMIT 1;
key
----------------------------------
uL24bhnsHYRX8wZItWM6xKdS0WLvDsgi
(1 rows)
My Cassandra DB is not responding with the expected row count. Please see below the details of my Cassandra keyspace creation and COUNT(*) query.
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> CREATE KEYSPACE key1 WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 1};
cqlsh> CREATE TABLE key1.transcation_completemall (i text, i1 text static, i2 bigint, i3 int static, i4 decimal static, i5 bigint static, i6 decimal static, i7 decimal static, PRIMARY KEY ((i), i1));
cqlsh> COPY key1.transcation_completemall (i,i1,i2,i3,i4,i5,i6,i7) FROM '/home/gpadmin/all.csv' WITH HEADER = TRUE;
Using 16 child processes
Starting copy of key1.transcation_completemall with columns [i, i1, i2, i3, i4, i5, i6, i7].
Processed: 25461792 rows; Rate: 15162 rows/s; Avg. rate: 54681 rows/s
25461792 rows imported from 1 files in 7 minutes and 45.642 seconds (0 skipped).
cqlsh> select count(*) from key1.transcation_completemall;
OperationTimedOut: errors={'127.0.0.1': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.1
cqlsh> exit
[gpadmin@hmaster ~]$ cqlsh --request-timeout=3600
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> select count(*) from key1.transcation_completemall;
count
---------
2865767
(1 rows)
Warnings :
Aggregation query used without partition key
cqlsh>
I got only 2865767 rows, but the COPY command shows that Cassandra accepted 25461792 rows. The all.csv file is 2.5 GB. To evaluate, I exported the table to another file, test.csv, and to my surprise its size was only 252 MB.
My question is: will Cassandra automatically remove duplicates in a table?
If yes, how does Cassandra detect the duplicates? By primary key repetition, by partition key, or by exact field duplication?
Or
What would be the possible ways the data got lost?
I would appreciate your valuable suggestions.
Thanks in advance to you all.
Cassandra will overwrite data that has the same primary key. (Ideally, no database holds duplicate values for a primary key; some throw a constraint error, while others overwrite the data.)
Example:
CREATE TABLE test (id int, id1 int, name text, PRIMARY KEY(id, id1));
INSERT INTO test(id,id1,name) VALUES(1,2,'test');
INSERT INTO test(id,id1,name) VALUES(1,1,'test1');
INSERT INTO test(id,id1,name) VALUES(1,2,'test2');
INSERT INTO test(id,id1,name) VALUES(1,1,'test1');
SELECT * FROM test;
-----------------
|id |id1 |name |
-----------------
|1 |2 |test2 |
-----------------
|1 |1 |test1 |
-----------------
The above statements will leave only 2 records in the table: one with primary key (1,1) and the other with primary key (1,2).
So in your case, if the values of i and i1 have duplicates, that data will be overwritten.
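If you want to sanity-check this, one option (note that it may be slow or time out at this table size, so raise --request-timeout) is to list the distinct partition keys and compare the result count with count(*):
-- DISTINCT is only allowed on the partition key; the number of rows
-- returned equals the number of distinct values of i:
SELECT DISTINCT i FROM key1.transcation_completemall;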
Maybe check the LIMIT option on the SELECT statement; see the ref doc here.
Ref doc says:
Specifying rows returned using LIMIT
Using the LIMIT option, you can specify that the query return a limited number of rows.
SELECT COUNT(*) FROM big_table LIMIT 50000;
SELECT COUNT(*) FROM big_table LIMIT 200000;
The output of these statements if you had 105,291 rows in the database would be: 50000, and 105,291. The cqlsh shell has a default row limit of 10,000. The Cassandra server and native protocol do not limit the number of rows that can be returned, although a timeout stops running queries to protect against running malformed queries that would cause system instability.
Is it possible to create an index on a UUID/TIMEUUID column in Cassandra? I'm testing out a model design which would have an index on a UUID column, but queries on that column always return 0 rows found.
I have a table like this:
create table some_data (site_id int, user_id int, run_id uuid, value int, primary key((site_id, user_id), run_id));
I create an index with this command:
create index idx on some_data (run_id) ;
No errors are thrown by CQL when I create this index.
I have a small bit of test data in the table:
 site_id | user_id | run_id                               | value
---------+---------+--------------------------------------+-------
       1 |       1 | 9e118af0-ac92-11e4-81ae-8d1bc921f26d |     3
However, when I run the query:
select * from some_data where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;
CQLSH just returns: (0 rows)
If I use an int for the run_id then the index behaves as expected.
Yes, you can create a secondary index on a UUID. The real question is "should you?"
In any case, I followed your steps, and got it to work.
Connected to Test Cluster at 192.168.23.129:9042.
[cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
aploetz@cqlsh> use stackoverflow ;
aploetz@cqlsh:stackoverflow> create table some_data (site_id int, user_id int, run_id uuid, value int, primary key((site_id, user_id), run_id));
aploetz@cqlsh:stackoverflow> create index idx on some_data (run_id) ;
aploetz@cqlsh:stackoverflow> INSERT INTO some_data (site_id, user_id, run_id, value) VALUES (1,1,9e118af0-ac92-11e4-81ae-8d1bc921f26d,3);
aploetz@cqlsh:stackoverflow> select * from usr_rec3 where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;
code=2200 [Invalid query] message="unconfigured columnfamily usr_rec3"
aploetz@cqlsh:stackoverflow> select * from some_data where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;
 site_id | user_id | run_id                               | value
---------+---------+--------------------------------------+-------
       1 |       1 | 9e118af0-ac92-11e4-81ae-8d1bc921f26d |     3
(1 rows)
Notice though, that when I ran this command, it failed:
select * from usr_rec3 where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d
Are you sure that you didn't mean to select from some_data instead?
Also, creating secondary indexes on high-cardinality columns (like a UUID) is generally not a good idea. If you need to query by run_id, then you should revisit your data model and come up with an appropriate query table to serve that.
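A minimal sketch of such a query table (the name some_data_by_run is illustrative; Cassandra 2.1 has no materialized views, so the application must write to both tables):
CREATE TABLE some_data_by_run (
    site_id int,
    user_id int,
    run_id uuid,
    value int,
    PRIMARY KEY ((run_id), site_id, user_id));
-- on every write to some_data, also write here:
INSERT INTO some_data_by_run (site_id, user_id, run_id, value) VALUES (1,1,9e118af0-ac92-11e4-81ae-8d1bc921f26d,3);
-- lookups by run_id now hit a single partition:
SELECT * FROM some_data_by_run WHERE run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;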
Clarification:
Using secondary indexes in general is not considered good practice. In the new book Cassandra High Availability, Robbie Strickland identifies their use as an anti-pattern, due to poor performance.
Just because a column is of the UUID data type doesn't necessarily make it high-cardinality. That's more of a data model question for you. But knowing the nature of UUIDs, and that their underlying purpose is to be unique, sets off red flags.
Put these two points together, and there isn't anything about creating an index on a UUID that sounds appealing to me. If it were my cluster, and (more importantly) I had to support it later, I wouldn't do it.
We use Cassandra wide rows heavily to store per-user time series, as they are perfect for that use case. Let's assume we have a table:
create table user_events (
user_id text,
timestmp timestamp,
event text,
primary key((user_id), timestmp));
What if clashes on timestamp happen (the same user can emit two different events with the same timestamp)? What is the best way to tweak this schema to resolve that, assuming we have an ordering for all events (a sequence int for each event)?
If I modify schema the following way:
create table user_events (
user_id text,
timestmp timestamp,
seq int,
event text,
primary key((user_id), timestmp, seq));
I won’t be able to do WHERE user_id = ? ORDER BY timestmp ASC, seq ASC – Cassandra does not allow that.
I won’t be able to do WHERE user_id = ? ORDER BY timestmp ASC, seq ASC – Cassandra does not allow that.
You might be seeing an error because you are repeating ASC. This should work:
WHERE user_id = ? ORDER BY timestmp, seq ASC
Also, as long as you have defined your primary key as PRIMARY KEY((user_id),timestmp,seq)) you don't even need to specify ORDER BY x[,y] ASC. It will cluster the data on disk in that order, and thus return it to you already sorted in that order. ORDER BY should only be necessary when you want to put your results in descending order (or whatever the opposite of how you have it defined is).
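For instance, here is a sketch of the same table declared to return the newest events first, so no ORDER BY is needed at query time:
create table user_events (
    user_id text,
    timestmp timestamp,
    seq int,
    event text,
    primary key((user_id), timestmp, seq))
    WITH CLUSTERING ORDER BY (timestmp DESC, seq DESC);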
What if clashes on timestamp may happen?
I think your extra seq column should be sufficient, depending on how you plan on inserting the data. If you are setting the timestmp from the client, then you should be ok. However, look what happens when I (using your second table) INSERT rows while creating the timestamp two different ways.
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('Mal',dateof(now()),1,'commanding');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('Wash',dateof(now()),1,'piloting');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River',dateof(now()),1,'freaking out');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River',dateof(now()),3,'being weird');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River',dateof(now()),2,'killing reavers');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River','2015-01-13 13:14-0600',1,'freaking out');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River','2015-01-13 13:14-0600',3,'being weird');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River','2015-01-13 13:14-0600',2,'killing reavers');
Querying that data by a user_id of "River" yields:
aploetz@cqlsh:stackoverflow> SELECT * FROM user_events WHERE user_id='River';
user_id | timestmp | seq | event
---------+--------------------------+-----+-----------------
River | 2015-01-13 13:14:00-0600 | 1 | freaking out
River | 2015-01-13 13:14:00-0600 | 2 | killing reavers
River | 2015-01-13 13:14:00-0600 | 3 | being weird
River | 2015-01-14 12:58:41-0600 | 1 | freaking out
River | 2015-01-14 12:58:57-0600 | 3 | being weird
River | 2015-01-14 12:58:57-0600 | 2 | killing reavers
(6 rows)
Notice that using the now() function to generate a timeuuid, and then converting that to a timestamp with dateof() causes the two rows with the timestmp "2015-01-14 12:58:57-0600" to appear to be the same. But they are not the same, as you can tell by the seq column.
So just a bit of caution on using/generating timestamps. They might look the same, but they may not be stored as the same value. Just to be on the safe side, I would use a timeuuid instead.
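A sketch of that safer variant (the event_time column name is illustrative): every call to now() yields a unique timeuuid, so two events in the same millisecond can never collide, and the seq column becomes unnecessary:
create table user_events (
    user_id text,
    event_time timeuuid,
    event text,
    primary key((user_id), event_time));
INSERT INTO user_events (user_id, event_time, event) VALUES ('River', now(), 'freaking out');
-- dateof() recovers the wall-clock timestamp at read time:
SELECT user_id, dateof(event_time) AS timestmp, event FROM user_events WHERE user_id = 'River';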