Dimension Search with given number of characters in Endeca

Dimension Search with given number of characters in Endeca - search

I have requirement to search an dimension in endeca which start with some word.For example if the dimension is Arena it can be searched with Ar or ar or Are or Aren.
How to achive this ?
What configuration needs to be done ?

Is wildcard AR* what you are looking for?
It needs to be enabled on the dimension or property being searched.

Related

Why does Azure Search give higher score to less relevant document?

I have two documents indexed in Azure Search (among many others):
Document A contains only one instance of "BRIG" in the whole document.
Document B contains 40 instances of "BRIG".
When I do a simple search for "BRIG" in the Azure Search Explorer via Azure Portal, I see Document A returned first with "#search.score": 7.93229 and Document B returned second with "#search.score": 4.6097126.
There is a scoring profile on the index that adds a boost of 10 for the "title" field and a boost of 5 for the "summary" field, but this doesn't affect these results as neither have "BRIG" in either of those fields.
There's also a "freshness" scoring function with a boost of 15 over 365 days with a quadratic function profile. Again, this shouldn't apply to either of these documents as both were created over a year ago.
I can't figure out why Document A is scoring higher than Document B.

It's possible that document A is 'newer' than document B and that's the reason why it's being displayed first (has a higher score). Besides Term relevance, freshness can also impact the score.
EDIT:
After some research it looks like that newer created Azure Cognitive Search uses BM25 algorithm by default. (source: https://learn.microsoft.com/en-us/azure/search/index-similarity-and-scoring#scoring-algorithms-in-search)
Document length and field length also play a role in the BM25 algorithm. Longer documents and fields are given less weight in the relevance score calculation. Therefore, a document that contains a single instance of the search term in a shorter field may receive a higher relevance score than a document that contains the search term multiple times in a longer field.

Test your scoring profile configurations. Perhaps try issuing queries without scoring profiles first and see if that meets your needs.
The "searchMode" parameter controls precision and recall. If you want more recall, use the default "any" value, which returns a result if any part of the query string is matched. If you favor precision, where all parts of the string must be matched, change searchMode to "all". Try the above query both ways to see how searchMode changes the outcome. See Simple Query Examples.
If you are using the BM25 algorithm, you also may want to tune your k1 and b values. See Set BM25 Parameters.
Lastly, you may want to explore the new Semantic search preview feature for enhanced relevance.

Negative boost in Azure Search Profiles

We have been working on creating scoring profiles for our search. We need a way to "bury" or give "negative" boosts to some fields in case of types of scoring function "Magnitude", "Freshness", "Tags". We noticed that we cannot add a negative value for boost. Is there any other way to achieve this kind of behavior (burying results based the field)
We cannot use $OrderBy because it takes precedence over the scoring profile.
Please advise. Thanks!

you should only set positive boosting values, as described [here][1]. There may be a few things you could do. The first thing I would try is to set the weight to 0 for the fields that you do not care about. In that case, they will simply not impact the relevance.
Another option: If you know that a field should not impact relevance you could simply make that field not 'searchable'. That said, this is a property of the index definition -- so you would need to create a different index for each combination of non-searchable fields.
Depending on your scenario, you could also make a field filterable, and filter based on that field. Something like $filer=Freshenss eq 'Really Fresh'. See this link for more information on using filters.
thanks!
-Luis Cabrera

For "Magnitude", "Freshness", you can set the set the range start as higher value and range end as lower value. Would this be considered as negative impact?
Like this:

I resolved that scenario by creating negative values (using an INT field) for the field we wanted to bury. That gave us the negative boost we needed.
I used a similar technique for Date "Freshness" too, where we counted the days from some event and the higher the number the less fresh the date is and used a "magnitude" function for it.
Thanks!

I have thought the about the need for this too.
One idea I have, but haven't tried, is to do a second search on just the negative keywords. That search result will have scores as well.
Then use those scores in a function to reduce first search result scores.
(yes, it would be nicer if it could be do as part of ACS)

Azure search - Using Microsoft English Analyzer increases size of Index

Earlier my index was using lucene analyzer. I changed it to Microsoft. Now the size of index has largely increased. Why does the size increase so much . ? P.S. the attachment.

Difference in index size is expected. For each word in your documents a Microsoft analyzer produces the original word and the base form of that word, for example, if your document has the word running, Azure Search will index two terms: running and run. See my answer in the following post for more details: Azure Search: Searching for singular version of a word, but still include plural version in results
Lucene analyzers stem words what results in fewer unique terms in the index.
You can learn more about the differences here: https://learn.microsoft.com/en-us/rest/api/searchservice/Language-support?redirectedfrom=MSDN
Depending on the analyzer/language the impact on the index size will be different. You can test the behavior of the analyzer you are using with the Analyze API: https://learn.microsoft.com/en-us/rest/api/searchservice/test-analyzer.
That being said, the difference you are seeing is more than I would expect. Please reach out to me at janusz.lembicz at microsoft to discuss the details of your scenario.

Azure Search Distance filter with variable distance

Suppose I have the following scenario:
A search UI to allow individuals to find plumbers who are able to service their home location.
When a plumber enters their info into the system, they provide their coordinates and a maximum distance they are willing to travel.
The individual can then enter their home coordinates and should be presented with a list of plumbers who are eligible.
Looking at the Azure Search geo.distance function, I cannot see how to do this. Scenarios where the searcher provides a distance are well covered but not where the distance is different for each search record.
The documentation provides the following example:
$filter=geo.distance(location, geography'POINT(-122.131577 47.678581)') le 10
This works correctly but if I try and change the 10 to the maxDistance field, it fails with
Comparison must be between a field, range variable or function call
and a literal value
My requirement seems fairly basic but am now wondering if this is currently possible with Azure Search?

I found an azure feedback suggestion asking for this feature but no news on if/when it will be implemented. Therefore it is safe to assume that this scenario is not currently supported.

To add to Paul's answer, one possible workaround is to use a conservatively large constant value instead of referencing the maxDistance field in your $filter expression. Then, you can filter the resulting list of plumbers on the client to take each plumber's max distance into account and produce final list of plumbers.

Issue in azure search result when use both search keyword and Orderby clues

Having issue when I do document search in index, I use keywords as search param and distance as order by clues in api parameter.
The outcome result has sorted the result by distance, but the keyword based best data never come up into result.
https://****/indexes/IndexName/docs?api-version=2014-10-20-Preview&$filter= geo.distance(geolocation, geography'POINT(-157.825459241867 21.2753200113279)') le 16091.8615317766&search=the beach villas &$orderby=geo.distance(geolocation, geography'POINT(-157.825459241867 21.2753200113279)')&$skip=0&$top=10&$count=true

It is very possible that there is an issue, but I would like to step back and make sure you actually want to use sorting as opposed to scoring profiles. Based on the query, it seems as though what you want to do is boost items that are close to the user. A good way to do this is to use our Distance scoring profile that allows you to provide additional weighting to documents that are closer to the location specified by the user. You can also apply an exponential or linear interpolation to this scoring. Using exponential the villa closest to the location get a really large boost and the further ones get a small boost. Or using linear it is more of a gradual degradation of weighted boosting as it gets farther from the point.
Liam
Please see this page for more details on this: https://msdn.microsoft.com/en-us/library/azure/dn798928.aspx

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Dimension Search with given number of characters in Endeca - search

I have requirement to search an dimension in endeca which start with some word.For example if the dimension is Arena it can be searched with Ar or ar or Are or Aren. How to achive this ? What configuration needs to be done ?

Is wildcard AR* what you are looking for? It needs to be enabled on the dimension or property being searched.

Related

Why does Azure Search give higher score to less relevant document?

Negative boost in Azure Search Profiles

Azure search - Using Microsoft English Analyzer increases size of Index

Azure Search Distance filter with variable distance

Issue in azure search result when use both search keyword and Orderby clues

Categories

Resources