Cassandra using 'like' in query condition - cassandra

When I query from Cassandra with a CQL statement of:
select * from abctpl where tpl like '1-1'
In the table, the content of tpl which I want is '1-1-1', and it's unique.
But actually I get 3 rows. The other two tpls do not contain a string '1-1-1', I guess Cassandra regard '-' as a wildcard character. If tpl's word like '11111111' also can be selected.
So how can I edit the CQL to make it query the exact data?

select * from abctpl where tpl like '1-1';
I think the problem here, is that you're not providing the LIKE wildcard character %. If your SASI index is defaulted to PREFIX mode, then this should work:
select * from abctpl where tpl like '1-1%';
Take a look through the DataStax docs on using SASI indexes: https://docs.datastax.com/en/dse/6.7/cql/cql/cql_using/useSASIIndex.html . That has some query examples, along with how to specify the mode at index creation.
make it query the exact data?
And if it's exact data that you're after, using equals (=) does a better job of that than LIKE does.

Related

ArangoDB wildcard search is slow

I am working on the below query and trying to implement an ArangoDB wildcard search. The criteria is very simple, I'd like to match records similar to the name or a number field and limit the records to 25. The query works but is very slow, taking upwards of 30seconds. The goal is to optimize this query and get it as close to sub second as possible. I'd like the query to function similar to how a MySQL LIKE would work, matching using the % wildcard on both sides.
https://www.arangodb.com/docs/stable/release-notes-new-features37.html#wildcard-search
Note, one thing I noticed is that in the release note examples, rather than using FILTER, they are using SEARCH.
Additional info:
name is alphanumeric
number is going to by an 8 digit number
LET str = CONCAT("%", 'test', '%")
LET search = (
FOR doc IN name_search
FILTER ANALYZER(doc.name LIKE str, "text_en") OR
FILTER ANALYZER(doc.number LIKE str, "text_en")
LIMIT 25
RETURN doc
)
RETURN SEARCH
FILTER doesn't utilize indices. To speedup your wildcard queries you have to create an ArangoSearch view over a collection and use SEARCH keyword.
Feel free to check the following interactive tutorial (see "LIKE Support" section):
https://www.arangodb.com/learn/search/arangosearch-tutorial-3-7/

Why aren't the values inserted into the mysql2 query?

Can you tell me, please?
Why does the mysql2 value substitution not work inside the IN operator?
I do this, but nothing works.
Only the first character of the array is being substituted (number 6)
"select * from products_categories WHERE category_id IN (?)", [6,3]);
You can do it like this, of course:
IN(?,?,?,?,?,?,?,?,?,?) [6,3,1,1,1,1,1,1,1,1,1]
But that's not right, I thought that the IN should be automatically substituted from an array =(
I haven't used this, but my gut feeling tells that array items map to question marks based on indexes, so in your case 6 binds to first ? and 3 looks for another one, but doesn't find.
If I were you, I'd try to make sure that my first array item is then actually array, so I'd rewrite it:
"select * from products_categories WHERE category_id IN (?)", [[6,3]]);
I suspect you are using this with .execute(), which is short for prepared statements "prepare first if never executed before"+execute. While api is very similar to .query() one biggest difference is that in case of prepared statement only parameters are sent at execution time, unlike .query() where whole query text is interpolated with all parameters on the client. As a result, you need to send exactly the number of parameters as you have number of placeholders in original query text ( in you example - one ?). The whole [6,3,1,1,1,1,1,1,1,1,1] is treated as one parameter and sent to server as "6,3,1,1,1,1,1,1,1,1,1" string ( because during prepare step that parameter was likely reported by server as VAR_CHAR )
The solution is 1) use .query() and interpolate on the client or 2) build enough ?s dynamically and prepare different PS for different number of IN parameters

Lucene syntax: what is the difference between AND and +

I am newest in Lucene.
I'm using Lucene.NET version 2.9.4.
What is the difference between these queries?
the first is:
title:hello AND tags:word
the second is:
+title:hello +tags:word
I testing a software, and I note that the first returns 3 records, and the second returns many records.
I observe that the first returns records where title and tags fields are fuel, but the second returns records where title and tags can be empty.
Is it the difference?
There is no difference between the two. clause1 AND clause2 is effectively shorthand for +clause1 +clause2
Similarly: clause1 clause2 = clause1 OR clause2
Note, there is really no equivalent for +clause1 clause2 using the boolean operators.
Are you sending the query over the Internet, if you are and not urlencoding the request correctly it could be misinterting the '+' as an encoded space and therefore lucene just runs the second query as if the +'s not there which would just OR the two parts and give the results you get.
title:hello tags:word

Separating fields out of a string in Hive

I have the following problem...
I work with Hive and want to add a file with several (different) rows of Strings. Those contain fields with a fixed size, like this:
A20130420bcd 34 fgh
where the fields have the length 1,8,6,4,3.
Separated it would look like this:
"A,20130420,bcd,fgh"
Is there any possibility to read the String and sort it into a field besides getting it as a substring for every field like
substring(col_value,1,1) Field1
etc?
I would imagine that cutting the already read part of the string would increase the performance, but i could think of any way to do this with the given functions here.
Secondly, as stated before, there are different types of strings, ordered and identified by the first character.right now just check those with the WHERE-Statement, but it's horrible, as it runs through the whole file just to find only the first String. Is there any way to read specific lines by their number? If i know, that the first string will be of a certain kind, can read it directly?
right it looks like this:
insert overwrite table TEST
SELECT
substring(col_value,1,1) field1,
...
substring(col_value,10,3) field 5
from temp_data WHERE substring(col_value,1,1) = 'A';
any ideas on this?
I would love to hear some ideas =)
You need to write yours generic-UDF parser that output the struct or map or whatever appropriate. you can refer to UDF that output multi-values.
then you can write
insert overwrite table output
select parsed.first, parsed.second
from (
select parse(taget)
from input
) parsed
where first='X';
About second question,you may need to check "explain" command of hive to see if hive do filter push-down for you.(just see how many map reduce it takes, theoretically it should be one map, depending on 1.hive version,
2.output table format
.)
In general sense, this is why database is popular -- take optimization into consideration for you .

How to read cassandra data with out case sensitive

I need to get the data from cassandra with out case sensitive. Please help me.
There is no case-sensitivity concept in Cassandra. All the data is stored as byte[], so it's not even a String.
You can make a custom comparator (see the API) which transforms byte[] to String and disregards case.
The other thing to do is just get the data and transform it on the client side.
Actually, your question is quite unclear as of what is your goal, so I can't give more details.
Update: Run a one-time job that fetches all records from the db and updates them, setting to lower-case. Then continue inserting everything with lowercase.
This has been resolved if you have SOLR enabled using:
CREATE SEARCH INDEX ON tableName WITH COLUMNS *, camelCaseColumn { lowerCase : true };
An index is created that allows the select statement to use lowercase in the where clause. For more details search for LowerCaseStrField.

Resources