I want to use case insensitivity in several tables that came from another DB, where the fields and indexes can be case insensitive.
This means we can look up a row by its key in any letter case (DAta, Data, data, etc.) and still find it.
I tried to use the upper function in an index, and to use that index as a primary key to preserve the program logic.
But I failed. I couldn't find any valid SQL statement to define it.
Maybe it's an impossible mission?
Or do you know a way to define a primary key with an "upper" index?
Thanks for any info!
If you want a case-insensitive search, use a case-insensitive collation. If you always want the field's value treated case-insensitively, define the collation at the field level, i.e.
CREATE TABLE T (
Foo VARCHAR(42) CHARACTER SET UTF8 COLLATE UNICODE_CI,
...
)
but you can also specify the collation in the search itself, like
SELECT * FROM T WHERE Foo = 'bar' COLLATE UNICODE_CI
Read more about the available collations in the Firebird language reference.
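As a rough illustration of what a case-insensitive collation on the key column gives you, here is a minimal Python sketch. It uses SQLite's built-in NOCASE collation as a stand-in for Firebird's UNICODE_CI (an assumption made only so the example runs without a Firebird server; NOCASE folds ASCII letters only), and the `note` column is made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite's NOCASE collation stands in for Firebird's UNICODE_CI.
conn.execute("CREATE TABLE t (foo TEXT COLLATE NOCASE PRIMARY KEY, note TEXT)")
conn.execute("INSERT INTO t VALUES ('Data', 'first row')")

# Every spelling of the key finds the same row.
found = [
    conn.execute("SELECT note FROM t WHERE foo = ?", (key,)).fetchone()[0]
    for key in ("DAta", "Data", "data")
]

# The primary key is case insensitive too, so a case variant is a duplicate.
try:
    conn.execute("INSERT INTO t VALUES ('DATA', 'duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

In this sketch the collation lives on the column, so the primary key needs no separate upper() index to behave case-insensitively.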
IMHO a better way is to use an expression index
create index idx_upper on persons computed by (upper(some_name))
SQL queries such as
select * from persons order by upper(some_name);
select * from persons where upper(some_name) starting with 'OBAM';
will use the index idx_upper.
We are trying to remove 2 columns from a table with 3 types and turn them into a UDT instead of keeping them as columns. We came up with the two options below. I want to understand whether there is any difference between these two UDTs in Cassandra.
First option is:
CREATE TYPE test_type (
cid int,
type text,
hid int
);
and then using it like this in a table definition
test_types set<frozen<test_type>>,
vs
Second option is:
CREATE TYPE test_type (
type text,
hid int
);
and then using it like this in a table definition
test_types map<int, frozen<test_type>>,
So I am curious which option is preferred for performance, or are they both the same in general?
It really depends on how you will use it. In the first solution you won't be able to select an element by cid, because to access a set element you need to specify the full UDT value, with all fields.
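A loose Python analogy may help show why the set makes lookup by cid awkward. This is not Cassandra code: `FullType` and `ShortType` are hypothetical stand-ins for the two UDT definitions in the question, and a Python set and dict only mimic the access pattern, not the storage.

```python
from typing import NamedTuple

class FullType(NamedTuple):
    # Option 1: cid is a field inside the UDT.
    cid: int
    type: str
    hid: int

class ShortType(NamedTuple):
    # Option 2: cid is the map key, so the UDT holds only the rest.
    type: str
    hid: int

as_set = {FullType(1, "a", 10), FullType(2, "b", 20)}
as_map = {1: ShortType("a", 10), 2: ShortType("b", 20)}

# A set member can only be addressed by its complete value.
in_set = FullType(1, "a", 10) in as_set

# Knowing only cid means walking every element.
scan_hits = [t for t in as_set if t.cid == 1]

# With cid as the key, the element is addressed directly.
map_hit = as_map[1]
```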
A better solution would be the following, assuming that you have only one collection column:
CREATE TYPE test_type (
type text,
hid int
);
create table test (
pk int,
cid int,
udt frozen<test_type>,
primary key(pk, cid)
);
In this case:
you can easily select an individual element by specifying the full primary key. The ability to select individual elements from a map only arrives in Cassandra 4.0 (see CASSANDRA-7396). Until then you need to fetch the full map even if you need just one element, which limits how large the map can be
you can even select a range of values, using a range query
you can get all values by specifying only the partition key (pk in this example)
you can select multiple non-consecutive values with select * from test where pk = ... and cid in (..., ..., ...);
See the "Check use of collection types" section in the data model checks best practices doc.
By default, Presto performs a case-sensitive group by. I want to know how to do a case-insensitive group by. One method is to convert everything in the column to lower case and then group, i.e.
select * from (select lower(name_of_the_column), other_columns from table)
where conditions..
group by name_of_the_column
One way to reduce the time is to put the conditions inside the bracketed select statement. Is there a better method?
You don't need to push lower(...) into a subquery. If you simply write:
SELECT lower(name_of_the_column), ...
FROM ...
GROUP BY lower(name_of_the_column) -- or just "GROUP BY 1"
Presto will do the conversion to lowercase only once for each row (not twice).
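Here is a small runnable sketch of the same query shape. It uses Python's sqlite3 module in place of Presto (an assumption made purely so it runs locally), and the `people` table and its rows are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("Alice",), ("ALICE",), ("bob",)],
)

# Group on the lower-cased expression directly, with no subquery.
rows = conn.execute(
    "SELECT lower(name) AS name_ci, count(*) "
    "FROM people GROUP BY lower(name) ORDER BY name_ci"
).fetchall()
```

Here `rows` holds one entry per case-folded name, so "Alice" and "ALICE" are counted together.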
When I used MySQL I was able to query the database with a statement like SELECT * FROM table WHERE col LIKE "%attribute%";
Is there a way I can do that in Cassandra?
Cassandra CQL doesn't have a LIKE operator. It has limited filtering capabilities so you are restricted to equals, range queries on some numeric fields, and the IN operator which is similar to equals.
The most common approach to doing searches of Cassandra data seems to be pairing Cassandra with Apache Solr. Or you can pair it with Apache Spark which has more filtering capabilities than CQL.
If your column is a collection such as a set, list, or map, you can use CONTAINS to search it.
Sample:
SELECT id, description FROM products WHERE features CONTAINS '32-inch';
For the map data type, you can match on keys:
SELECT id, description FROM products WHERE features CONTAINS KEY 'refresh-rate';
References:
http://www.datastax.com/dev/blog/cql-in-2-1
CQL LIKE statements are now available in Scylla Open Source 3.2 RC1, the release candidate for Scylla, a CQL-compatible database. We'd love feedback before release. Here are the details:
CQL: LIKE Operation #4477
The new CQL LIKE keyword allows matching any column to a search pattern, using % as a wildcard. Note that LIKE only works with ALLOW FILTERING.
LIKE Syntax support:
'_' matches any single character
'%' matches any substring (including an empty string)
'\' escapes the next pattern character, so it matches verbatim
any other pattern character matches itself
an empty pattern matches empty text fields
For example:
INSERT INTO t (id, name) VALUES (17, 'Mircevski')
SELECT * FROM t where name LIKE 'Mirc%' allow filtering
Source: [RELEASE] Scylla 3.2 RC1
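To make the pattern rules listed above concrete, here is a small Python sketch that translates a LIKE pattern into a regular expression. It is not Scylla's implementation: `like_to_regex` and `like` are hypothetical helper names, and treating a trailing lone backslash as a literal character and matching case-sensitively are assumptions of this sketch.

```python
import re

def like_to_regex(pattern):
    """Translate a LIKE pattern into a compiled regex (sketch only)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Backslash escapes the next character, which matches verbatim.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")   # any substring, including an empty one
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            # Any other character matches itself. Assumption: a lone
            # trailing backslash also ends up here as a literal.
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)

def like(text, pattern):
    """Return True when the whole text matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None

matches_prefix = like("Mircevski", "Mirc%")
matches_literal_percent = like("50%", "50\\%")
```

The whole value has to match the pattern, which is why `fullmatch` is used rather than a substring search.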
Is there a way to match all letter-case variants of a string when doing this:
select count(word) from table where word="abcd"
Actually, that does not return the same result as this:
select count(word) from table where word="ABCD"
Ignoring the case in a where clause is very simple. You can, for example, convert both sides of the comparison to all caps notation:
SELECT COUNT(word)
FROM table
WHERE UPPER(word)=UPPER('ABCD')
Regardless of the capitalization used for the search term, the UPPER function makes both sides match as desired.
select count(word) from table where lower(word)="abcd"
However, this assumes it's not a partitioned table. If it's partitioned by word, you would end up doing a full table scan because of the lower(...) call.
SELECT count(word) FROM table
WHERE word RLIKE
"(?i)WOrd1|wOrd2"
How would I set up an index based on lower case only, even though the actual field contains both upper and lower case letters?
Also, can I run a query and have only the lower case index value returned?
You can create the index and transform the field to upper- or lower-case. Then when you do your queries, you can do the same transform and it'll do the right thing.
So:
CREATE UNIQUE INDEX lower_case_username ON users ((lower(username)));
Then query for the same thing:
SELECT username FROM users WHERE lower(username) = 'bob';
According to the docs you can do this:
CREATE UNIQUE INDEX lower_title_idx ON films ((lower(title)));
You can also use this for wildcard searches:
CREATE INDEX IF NOT EXISTS foo_table_bar_field ON foo_table(lower(username))
Query like so:
SELECT * FROM foo_table WHERE lower(username) like 'bob%'
CREATE UNIQUE INDEX my_index_name ON my_table (LOWER(my_field));
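For a quick local check of how a unique index on lower(...) behaves, here is a minimal Python sketch. It runs against SQLite through the sqlite3 module rather than PostgreSQL (an assumption made so it needs no server; SQLite also supports indexes on expressions), and the sample usernames are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.execute(
    "CREATE UNIQUE INDEX lower_case_username ON users (lower(username))"
)
conn.execute("INSERT INTO users VALUES ('Bob')")

# A second row that differs only in case violates the unique index.
try:
    conn.execute("INSERT INTO users VALUES ('BOB')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The stored value keeps its original case; apply lower() in the select
# list to get the lower-cased form back.
row = conn.execute(
    "SELECT username, lower(username) FROM users WHERE lower(username) = 'bob'"
).fetchone()
```

Selecting lower(username) is one way to return only the lower-cased value while leaving the stored data untouched.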