Cassandra secondary index issue - returning zero rows

I have a problem: a secondary index is returning zero rows in Cassandra.
I'm following along with the getting-started docs:
http://www.datastax.com/documentation/getting_started/doc/getting_started/gettingStartedCQL.html
Based on that, I have the following Cassandra script:
/* hello.cql */
drop keyspace test;
CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
use test;
CREATE TABLE users ( user_id int PRIMARY KEY, fname text, lname text);
DESCRIBE TABLES;
INSERT INTO users (user_id, fname, lname)
VALUES (1745, 'john', 'smith');
INSERT INTO users (user_id, fname, lname)
VALUES (1744, 'john', 'doe');
INSERT INTO users (user_id, fname, lname)
VALUES (1746, 'john', 'smith');
SELECT * FROM users;
CREATE INDEX ON users (lname);
/* These queries both return 0 rows ??? */
SELECT * FROM users WHERE lname = 'smith';
SELECT * FROM users WHERE lname = 'doe';
However...
cqlsh < hello.cql
users
 user_id | fname | lname
---------+-------+-------
    1745 |  john | smith
    1744 |  john |   doe
    1746 |  john | smith
(3 rows)
(0 rows)
(0 rows)
This should be straightforward -- am I missing something?

For the two SELECT queries to return results, CREATE INDEX would have to execute synchronously and return only after all existing data had been indexed; in practice, the index over existing data is built asynchronously in the background.
If you change the order in the script so that the index is defined before you insert any data, I'd expect the two SELECTs to return results.
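For example, if hello.cql is reordered so the index exists before any writes, each insert is indexed as it arrives and the final queries should return rows (a sketch of the reordered script):

```sql
CREATE TABLE users (user_id int PRIMARY KEY, fname text, lname text);
CREATE INDEX ON users (lname);  -- index exists before any data is written

INSERT INTO users (user_id, fname, lname) VALUES (1745, 'john', 'smith');
INSERT INTO users (user_id, fname, lname) VALUES (1744, 'john', 'doe');
INSERT INTO users (user_id, fname, lname) VALUES (1746, 'john', 'smith');

SELECT * FROM users WHERE lname = 'smith';  -- should now return 2 rows
```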

Using Cassandra 2.1.0, I get results regardless of whether the index is created before or after data is inserted.
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.0 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh>
cqlsh> CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> use test;
cqlsh:test> CREATE TABLE users ( user_id int PRIMARY KEY, fname text, lname text);
cqlsh:test> INSERT INTO users (user_id, fname, lname)
... VALUES (1745, 'john', 'smith');
cqlsh:test> INSERT INTO users (user_id, fname, lname)
... VALUES (1744, 'john', 'doe');
cqlsh:test> INSERT INTO users (user_id, fname, lname)
... VALUES (1746, 'john', 'smith');
cqlsh:test> CREATE INDEX ON users (lname);
cqlsh:test> SELECT * FROM users WHERE lname = 'smith';
 user_id | fname | lname
---------+-------+-------
    1745 |  john | smith
    1746 |  john | smith
(2 rows)
cqlsh:test> SELECT * FROM users WHERE lname = 'doe';
 user_id | fname | lname
---------+-------+-------
    1744 |  john |   doe
(1 rows)

Here's the platform and version info for my installation:
john@piggies:~/Dropbox/source/casandra$ nodetool -h localhost version
ReleaseVersion: 2.0.10
john@piggies:~/Dropbox/source/casandra$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.1 LTS
Release:        14.04
Codename:       trusty

Related

vitess kubernetes ERROR 1105 (HY000): table xxx not found

Kubernetes version: v1.16.3
Linux version: 7.3.1611
I started a Vitess cluster on Kubernetes, then logged in to VTGate and created a table:
./mysql -h 127.0.0.1 -P 15306 -uuser
mysql> CREATE TABLE sbtest1( id INTEGER NOT NULL AUTO_INCREMENT, k INTEGER DEFAULT '0' NOT NULL, c CHAR(120) DEFAULT '' NOT NULL, pad CHAR(60) DEFAULT '' NOT NULL, PRIMARY KEY (id) );
Query OK, 0 rows affected (0.32 sec)
mysql> show tables;
+--------------------+
| Tables_in_commerce |
+--------------------+
| sbtest1 |
+--------------------+
1 row in set (0.00 sec)
mysql> select * from sbtest1;
ERROR 1105 (HY000): table sbtest1 not found
show tables shows that the table sbtest1 already exists, but selecting from it raises an error.
This may be because you have a sharded keyspace but haven't created a vschema for the table.
Try:
vtctlclient ApplyVSchema -vschema="{\"sharded\": false, \"tables\": { \"sbtest1\": { }}}" commerce
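If hand-escaping that JSON is error-prone, the VSchema document can be generated instead; a minimal Python sketch (the keyspace name commerce and table sbtest1 come from the example above):

```python
import json

# VSchema for an unsharded keyspace containing the sbtest1 table
vschema = {"sharded": False, "tables": {"sbtest1": {}}}

# Serialize once and let the shell quoting stay trivial
payload = json.dumps(vschema)
print("vtctlclient ApplyVSchema -vschema='%s' commerce" % payload)
```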

Remove trailing zeroes of time datatype in CQL

I'm trying to remove the trailing zeros of arrival_time; the column's data type is time.
SELECT * FROM TABLE
And I got this:
station_name | arrival_time
--------------+--------------------
Wellington | 06:05:00.000000000
and I need the result to look like this:
station_name | arrival_time
--------------+--------------------
Wellington | 06:05:00
I'm new to CQL. Thanks in advance.
So you can't actually do that in Cassandra with the time type. You can, however, do it with a timestamp.
cassdba@cqlsh:stackoverflow> CREATE TABLE arrival_time2 (station_name TEXT PRIMARY KEY,
                         ... arrival_time time, arrival_timestamp timestamp);
cassdba@cqlsh:stackoverflow> INSERT INTO arrival_time2 (station_name, arrival_time, arrival_timestamp)
                         ... VALUES ('Wellington', '06:05:00', '2018-03-22 06:05:00');
cassdba@cqlsh:stackoverflow> SELECT * FROM arrival_time2;
 station_name | arrival_time       | arrival_timestamp
--------------+--------------------+---------------------------------
   Wellington | 06:05:00.000000000 | 2018-03-22 11:05:00.000000+0000
(1 rows)
Of course, this isn't what you want either, really. So next you need to set a time_format in the [ui] section of your ~/.cassandra/cqlshrc.
[ui]
time_format = %Y-%m-%d %H:%M:%S
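The time_format value uses Python strftime codes, so its effect can be previewed outside cqlsh; a small sketch using the timestamp from the example:

```python
from datetime import datetime

# Same format string as in the cqlshrc [ui] section
fmt = "%Y-%m-%d %H:%M:%S"
ts = datetime(2018, 3, 22, 11, 5, 0)
print(ts.strftime(fmt))  # 2018-03-22 11:05:00
```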
Restart cqlsh, and this should work:
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cassdba@cqlsh> SELECT station_name, arrival_timestamp
           ... FROM stackoverflow.arrival_time2;
station_name | arrival_timestamp
--------------+---------------------
Wellington | 2018-03-22 11:05:00
(1 rows)
An alternative, shown here in SQL Server (T-SQL) rather than CQL:
select station_name, SUBSTRING(CONVERT(varchar(20), arrival_time), 1, 8) AS arrival_time
from [dbo].[ArrivalStation]
This uses the following table and data:
CREATE TABLE [dbo].[ArrivalStation](
    [station_name] [varchar](500) NULL,
    [arrival_time] [time](7) NULL
) ON [PRIMARY]
INSERT [dbo].[ArrivalStation] ([station_name], [arrival_time]) VALUES (N'Wellington', N'06:05:00.0000000')
INSERT [dbo].[ArrivalStation] ([station_name], [arrival_time]) VALUES (N'Singapore', N'12:35:29.1234567')
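If changing cqlshrc or the schema isn't an option, the fractional seconds can also be stripped client-side after the query; a sketch in Python (trim_time is a hypothetical helper, not part of any driver):

```python
def trim_time(t: str) -> str:
    """Drop the fractional part of a time string like '06:05:00.000000000'."""
    return t.split(".", 1)[0]

print(trim_time("06:05:00.000000000"))  # 06:05:00
```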

Cassandra how to add values in a single row on every hit

The application will feed this table with the data below, incrementally, as we receive updates on the status. Initially the table will look like this:
+----+-------------+--------------+---------------+
| ID | Total count | Failed count | Success count |
+----+-------------+--------------+---------------+
| 1  | 30          | 10           | 20            |
+----+-------------+--------------+---------------+
Now let's assume 30 messages in total have been pushed, of which 10 failed and 20 succeeded, as shown above. The application runs again and the values change: 20 new records come in, all of them successful. The same row should be updated:
+----+-------------+--------------+---------------+
| ID | Total count | Failed count | Success count |
+----+-------------+--------------+---------------+
| 1  | 50          | 10           | 40            |
+----+-------------+--------------+---------------+
Is this feasible in Cassandra using the counter data type?
Of course you can use counter tables in your case.
Let's assume a table structure like:
CREATE KEYSPACE Test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
CREATE TABLE data (
    id int,
    data text,
    PRIMARY KEY (id)
);
CREATE TABLE counters (
    id int,
    total_count counter,
    failed_count counter,
    success_count counter,
    PRIMARY KEY (id)
);
(Note that in a counter table, every non-primary-key column must be of type counter, which is why the counters live in their own table.)
You can increment counters by running queries like:
UPDATE counters
SET total_count = total_count + 1,
    success_count = success_count + 1
WHERE id = 1;
Hope this helps.
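Applying the question's two batches is then a matter of updating by the delta each time, not by the new totals; a sketch against the counters table above:

```sql
-- First batch: 30 total, 10 failed, 20 succeeded
UPDATE counters
SET total_count   = total_count + 30,
    failed_count  = failed_count + 10,
    success_count = success_count + 20
WHERE id = 1;

-- Second batch: 20 more, all successful; the row now reads 50 / 10 / 40
UPDATE counters
SET total_count   = total_count + 20,
    success_count = success_count + 20
WHERE id = 1;
```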

Cassandra Delete by Secondary Index or By Allowing Filtering

I'm trying to delete rows by a secondary index or by a clustering column. I'm not concerned with performance, as this will be an unusual query. I'm not sure whether it's possible. E.g.:
CREATE TABLE user_range (
id int,
name text,
end int,
start int,
PRIMARY KEY (id, name)
)
cqlsh> select * from dat.user_range where id=774516966;
        id | name      | end | start
-----------+-----------+-----+-------
 774516966 | 0 - 499   | 499 |     0
 774516966 | 500 - 999 | 999 |   500
I can:
cqlsh> select * from dat.user_range where name='1000 - 1999' allow filtering;
          id | name        | end  | start
-------------+-------------+------+-------
  -285617516 | 1000 - 1999 | 1999 |  1000
  -175835205 | 1000 - 1999 | 1999 |  1000
 -1314399347 | 1000 - 1999 | 1999 |  1000
 -1618174196 | 1000 - 1999 | 1999 |  1000
Blah blah…
But I can’t delete:
cqlsh> delete from dat.user_range where name='1000 - 1999' allow filtering;
Bad Request: line 1:52 missing EOF at 'allow'
cqlsh> delete from dat.user_range where name='1000 - 1999';
Bad Request: Missing mandatory PRIMARY KEY part id
Even if I create an index:
cqlsh> create index on dat.user_range (start);
cqlsh> delete from dat.user_range where start=1000;
Bad Request: Non PRIMARY KEY start found in where clause
Is it possible to delete without first knowing the primary key?
No, deleting via a secondary index is not supported: CASSANDRA-5527
What you can do is select all matching rows through the secondary index; each returned row gives you its full primary key, which you can then use to delete it.
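A sketch of that two-step approach, using the table and sample values from the question:

```sql
-- 1. Find the full primary keys of the matching rows
SELECT id, name FROM dat.user_range
WHERE name = '1000 - 1999' ALLOW FILTERING;

-- 2. Delete each row by its complete primary key (id values from step 1)
DELETE FROM dat.user_range WHERE id = -285617516 AND name = '1000 - 1999';
DELETE FROM dat.user_range WHERE id = -175835205 AND name = '1000 - 1999';
```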
I came here looking for a solution to delete rows from a Cassandra column family.
I ended up doing the INSERT with a TTL (time to live) set, so that I don't have to worry about deleting the rows at all.
Putting it out there; it might help someone.
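A sketch of that TTL workaround against the user_range table from the question; the 86400-second (24-hour) TTL is an arbitrary example value:

```sql
-- The row expires and is removed automatically once the TTL elapses
INSERT INTO dat.user_range (id, name, end, start)
VALUES (774516966, '0 - 499', 499, 0)
USING TTL 86400;
```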

Is there a way to make clustering order by data type and not string in Cassandra?

I created a table in CQL3 in the cqlsh using the following CQL:
CREATE TABLE test (
locationid int,
pulseid int,
name text, PRIMARY KEY(locationid, pulseid)
) WITH CLUSTERING ORDER BY (locationid ASC, pulseid DESC);
Note that locationid is an integer.
However, after I inserted data, and ran a select, I noticed that locationid's ascending sort seems to be based upon string, and not integer.
cqlsh:citypulse> select * from test;
 locationid | pulseid | name
------------+---------+------
          0 |       3 | test
          0 |       2 | test
          0 |       1 | test
          0 |       0 | test
         10 |       3 | test
          5 |       3 | test
Note the ordering: 0, 10, 5. Is there a way to make it sort by its actual data type?
Thanks,
Allison
In Cassandra, the first part of the primary key is the partition key. That key is used to distribute data around the cluster, and it does so in an effectively random fashion to achieve an even distribution. This means that you cannot order by the first part of your primary key.
What version of Cassandra are you on? In the most recent release of 1.2 (1.2.2), the CREATE statement you have used as an example is invalid: CLUSTERING ORDER BY may only reference clustering columns (here, pulseid), not the partition key (locationid).
