Elastic search : Searching for integers with wildcards - search

I am currently using the tire client for elastic search. Lets say I have a field which is indexed as a field of type long in my elastic search mapping.
I am trying to achieve something like this:
search.query {|query| query.string "30*", :fields => ['id']}
Here 'id' is the long field about which I was talking about. But since I specify the fields in the query, the wildcard doesn't work and I end up getting the exact match as the only result.
But doing the same thing works with the _all search as the field type doesn't matter. I want this wildcard search to work while also searching for the search key in that particular field. Is there any way to do this without changing my mapping?

I see next solutions:
use multifield and make this also of a string type (but requires mapping change)
use range and translate this into something like:
(from 30 to 39) or (from 300 to 309) or (from 3000 to 3099)
or (from 30000 to 30999) or ... (to max value)
use http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-script-filter.html and check this using scripting

Thanks to #alex on that scripting tip. Finally I found something which worked. Phew!
So I ended up doing this(briefly):
search.query do |query|
query.filtered do |f|
f.filter :script, {
:script => "doc['id'].value.toString() ~= '^30[0-9]*$'"
}
end
end
Hope it helps.

Related

How to get a substring with Regex in Python

I am trying to formnulate a regex to get the ids from the below two strings examples:
/drugs/2/drug-19904-5106/magnesium-oxide-tablet/details
/drugs/2/drug-19906/magnesium-moxide-tablet/details
In the first case, I should get 19904-5106 and in the second case 19906.
So far I tried several, the closes I could get is [drugs/2/drug]-.*\d but would return g-19904-5106 and g-19907.
Please any help to get ride of the "g-"?
Thank you in advance.
When writing a regex expression, consider the patterns you see so that you can align it correctly. For example, if you know that your desired IDs always appear in something resembling ABCD-1234-5678 where 1234-5678 is the ID you want, then you can use that. If you also know that your IDs are always digits, then you can refine the search even more
For your example, using a regex string like
.+?-(\d+(?:-\d+)*)
should do the trick. In a python script that would look something like the following:
match = re.search(r'.+?-(\d+(?:-\d+)*)', my_string)
if match:
my_id = match.group(1)
The pattern may vary depending on the depth and complexity of your examples, but that works for both of the ones you provided
This is the closest I could find: \d+|.\d+-.\d+

ArangoDB wildcard search is slow

I am working on the below query and trying to implement an ArangoDB wildcard search. The criteria is very simple, I'd like to match records similar to the name or a number field and limit the records to 25. The query works but is very slow, taking upwards of 30seconds. The goal is to optimize this query and get it as close to sub second as possible. I'd like the query to function similar to how a MySQL LIKE would work, matching using the % wildcard on both sides.
https://www.arangodb.com/docs/stable/release-notes-new-features37.html#wildcard-search
Note, one thing I noticed is that in the release note examples, rather than using FILTER, they are using SEARCH.
Additional info:
name is alphanumeric
number is going to by an 8 digit number
LET str = CONCAT("%", 'test', '%")
LET search = (
FOR doc IN name_search
FILTER ANALYZER(doc.name LIKE str, "text_en") OR
FILTER ANALYZER(doc.number LIKE str, "text_en")
LIMIT 25
RETURN doc
)
RETURN SEARCH
FILTER doesn't utilize indices. To speedup your wildcard queries you have to create an ArangoSearch view over a collection and use SEARCH keyword.
Feel free to check the following interactive tutorial (see "LIKE Support" section):
https://www.arangodb.com/learn/search/arangosearch-tutorial-3-7/

Want to find all results containing specific pattern in Azure Search explorer

I want to find all records containing the pattern "170629-2" in Azure Search explorer, did try with
query string : customOfferId eq "170629-2*"
which only give one result back, which is the exactly match of "170629-2", but i do not get the records which have the patterns of "170629-20", "170629-21" or "170629-201".
Two things.
1-You can't use standard analyzer as it will break your "words" in two parts:
e.g. 170629-20 will be breaked as 170629 and another entry as 20.
2-You can use regex and specify the pattern you want:
170629-2+.*
https://learn.microsoft.com/en-us/azure/search/query-lucene-syntax#bkmk_regex
PS: use &queryType=full to allow regex

Azure Search filter on the whole field

I've been trying to create a filter matching the end of the whole field text.
For example, taking a text field with the text: the brown fox jumped over the lazy dog
I would like it to match with a query that searches for fields with values ending with g. Something like:
{
"search":"*",
"queryType":"full",
"searchMode": "any",
...
"filter":"search.ismatchscoring('/g$/','MyField')"
}
The result is only records where MyField contains values with words composed by a the single g character anywhere on the string.
Using the filter directly also produces no results:
{
"search":"*",
"queryType":"full",
"searchMode": "any",
...
"filter":"MyField eq '*g'"
}
As far as I can see, the tokenization will always be the base for the search and filter, which means that on the above query, $ is completely ignored and matches will be by word, not by field.
Probably I could use the keyword_v2 analyzer on this field but then I would lose the tokenizarion that I use when searching normally.
One possible solution could be defining a second field in your index, with the same value as ‘MyField’, but with a different analyzer (e.g. keyword_v2). That way you may still search over the original field while filtering over the other.
Regardless, you might have simplified the filter for the sake of the example, but otherwise it seems redundant to use search.ismatchscoring() when not combining with another filter clause via ‘or’ – one can use the search parameter directly.
Moreover, regex might not be working because the default queryType for search.ismatchscoring() is simple, not full - please see docs here

Neo4j - index lookup issue

I was trying to set index type from exact to fulltext in neo4j shell, so i can do incasesensitive search with lucene query. So i used this command:
index --set-config Destination type fulltext
but it didn't work. Still couldn't do case insensitive search, so a played around and change some other values, like _blueprints:type and to_lower_case.
That didn't do any good.
Now it somehow ignores first character of name value ( weird ! ) . So if i am searching for "London" for example and i type "Lon" it returns nothing. But if i type "ond" it returns the node. The same for every node.
I tried setting everything back to normal. Didn`t help.
What did i mess up? What am i missing?
I am using a Everyman PHP library to communicate with database.
I created new index with "to_lower_case" property.
I think that will solve my problem, just have to convert string to lower case before inserting it into query. It seems to work.
Setting configuration afterwards doesn't update already indexed values (as the shell notes, I think). If you've created your index with "to_lower_case=true" then additions as well as queries will have the values converted to lower case. Calling Index#get will still require you to lower-case it yourself.

Resources