Partial Match Search Takes More time in MarkLogic - search

When I do a partial match search instead of an exact term match search, it takes more time than usual. I believe it generates more result sets and compares within them for the partial match. How do I improve the performance of partial match searches?
Example: my search term is "World Forum".
If the MarkLogic dictionary contains "World Economic Forum" as a term, it should come back as a result.

In general, response times are related to the number of matching documents. So if your query is slow, someone will need to know more about your data, index settings, and the query itself to help you.
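The cost difference the asker observes can be illustrated with a toy sketch in plain Python (this is not how MarkLogic's lexicon is implemented; the dictionary and functions are hypothetical): an exact term lookup is a single hash probe, while a partial match has to inspect every dictionary entry.

```python
# Toy illustration, NOT MarkLogic internals: exact lookup vs. partial scan.

dictionary = {"world forum", "world economic forum", "world bank", "open forum"}

def exact_match(term):
    # Exact lookup: one hash probe, O(1).
    return term in dictionary

def partial_match(phrase):
    # Partial match: every dictionary entry is a candidate, O(n),
    # and each one is compared word by word against the query.
    needed = set(phrase.lower().split())
    return [t for t in dictionary if needed <= set(t.split())]

exact_match("world forum")            # single lookup
partial_match("World Forum")          # scans all entries; also finds
                                      # "world economic forum"
```

The sketch shows why partial matching naturally produces (and compares) more candidates; the practical remedies in MarkLogic are index-level, e.g. enabling the appropriate wildcard/trailing-wildcard indexes rather than post-filtering.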

Related

What is the difference between arango search and filter keyword

Can somebody explain in detail the difference between the SEARCH and FILTER keywords?
I have already gone through https://www.arangodb.com/learn/search/tutorial/ -> SEARCH vs FILTER
Does anybody have any other experience with the difference?
Thanks,
Nilotpal
FILTER corresponds to the WHERE clause in SQL. It does what the name says: it uses all sorts of arithmetic and AQL operators to filter the search result. It can make use of regular indexes. There is no ranking of filtered results, and filters operate on single-collection result sets.
SEARCH offers a full-fledged search engine, much like the ranking you would get from regular search engines such as Google, based on a grammar you can formulate yourself, and it can operate on the contents of multiple collections. Its most natural use is full-text search and ranking; in that role it is a much more powerful version of the full-text index. But it can do much more: normalisation, tokenisation based on language ...
The list goes on and on. Please refer to the documentation of search here:
https://www.arangodb.com/docs/stable/arangosearch.html
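The conceptual distinction can be sketched in plain Python (not AQL; the documents and scoring function are made up for illustration): FILTER is a boolean predicate that yields an unranked subset, while SEARCH scores documents and returns them ranked by relevance.

```python
# Plain-Python sketch of the FILTER vs. SEARCH distinction (not AQL).

docs = [
    {"title": "intro to search engines", "views": 90},
    {"title": "search ranking basics",   "views": 40},
    {"title": "database indexing",       "views": 70},
]

# FILTER-style: a boolean predicate; no score, order unchanged.
filtered = [d for d in docs if d["views"] > 50]

# SEARCH-style: score each document against a query, then rank by score.
def score(doc, query):
    # Naive relevance: count how many query words appear in the title.
    return sum(word in doc["title"].split() for word in query.split())

query = "search ranking"
ranked = sorted(docs, key=lambda d: score(d, query), reverse=True)
```

In real ArangoDB, SEARCH additionally requires an ArangoSearch View over the collections and supports analyzers (normalisation, tokenisation), which this sketch does not model.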

Search unspecific term with specific answer

I am building a database in Neo4j. I am trying to build a match query within the fulltext search. The search query has to be quite robust, as it will take queries from users who are not familiar with the search terms, and it should return the node which best matches the term. I am aware of a few ways of doing this, but all of them require that the search term is fuzzied, not the return term. My current rules rely on contains / does not contain and loops. Without building a new database, is there a way to fuzzy the search term so that, essentially, the nodes search through the term provided and not the other way round?
I am aware that this may not make sense. It is only my third day on Neo4j. Please let me know if you need any more clarification.
Edit: I figure that I can combine the does/does not contain search terms and fuzzy the search term, by increasing the does contain score and decreasing the does not contain score.
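The scoring idea from the edit can be sketched in plain Python (hypothetical node names and terms; this is not Cypher and not the Neo4j fulltext API): each "does contain" hit raises a node's score, each "does not contain" hit lowers it, and the best-scoring node wins.

```python
# Hypothetical sketch of the edit's idea: combine contains / does-not-contain
# rules into a single score per node, then pick the highest-scoring node.

def score_node(text, must_contain, must_not_contain):
    text = text.lower()
    s = 0
    for term in must_contain:
        if term.lower() in text:
            s += 1          # "does contain" increases the score
    for term in must_not_contain:
        if term.lower() in text:
            s -= 1          # "does not contain" decreases the score
    return s

nodes = ["Acme Widget Deluxe", "Acme Gadget", "Widget Basic"]
best = max(nodes, key=lambda n: score_node(n, ["acme", "widget"], ["basic"]))
```

In Neo4j itself, the fulltext index's Lucene syntax (e.g. `term~` fuzzy clauses combined with boosts) can express a similar weighting without looping over nodes in application code.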

What indexer do I use to find the list in the collection that is most similar to my list?

Let's say I have my list of ingredients:
{'potato','rice','carrot','corn'}
and I want to return lists from a database that are most similar to mine:
{'beans','potato','oranges','lettuce'},
{'carrot','rice','corn','apple'},
{'onion','garlic','radish','eggs'}
My query would return this first:
{'carrot','rice','corn','apple'}
I've used Solr, and have looked at CloudSearch, ElasticSearch, Algolia, Searchify and Swiftype. These engines only seem to let me put in one query string and then filter by other facets.
In a real scenario my search list will be about 200 items long and will be matching against about a million lists in my database.
What technology should I use to accomplish what I want to do?
Should I look away from search indexers and more towards database-esque things like mongo, map reduce, hadoop... All I know are the names of other technologies and I just need someone to point me in the right direction on what technology path I should be exploring for this.
With so much data I can't really loop through it, I need to query everything at once.
I wonder what keeps you from trying it with Solr, as Solr provides much of what you need. You can declare the field as type="string" multiValued="true" and save each list item as a value. Then, when querying, you specify each of the items in the list to look for as a search term for that field, and Solr will – by default – return the closest match.
If you need exact control over what will be regarded as a match (e.g. at least 40% of the terms from the search list have to be in a matching list) you can use the mm EDisMax parameter, cf. Solr Wiki
Having said that, I must add that I've never searched with 200 query terms (do I understand correctly that the list whose contents should be searched for will contain about 200 items?) and do not know how well that performs. But I guess that setting up a test core and filling it with random lists using a script should not take more than a few hours, so it should be possible to evaluate the performance of this approach without investing too much time.
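What "closest match" and the `mm` threshold mean here can be sketched in plain Python (this is the conceptual scoring, not Solr's implementation; the function name is made up): rank stored lists by how many query items they share, and keep only those meeting a minimum-match fraction.

```python
# Plain-Python sketch of overlap ranking with a Solr-`mm`-style threshold.

query = {"potato", "rice", "carrot", "corn"}

lists = [
    {"beans", "potato", "oranges", "lettuce"},
    {"carrot", "rice", "corn", "apple"},
    {"onion", "garlic", "radish", "eggs"},
]

def matches(candidates, query, mm=0.0):
    # Score = number of shared items; keep candidates with at least
    # an `mm` fraction of the query matched, ranked best-first.
    scored = [(len(query & c), c) for c in candidates]
    kept = [(s, c) for s, c in scored if s >= mm * len(query)]
    return [c for s, c in sorted(kept, key=lambda t: -t[0])]

best = matches(lists, query, mm=0.4)[0]   # {'carrot','rice','corn','apple'}
```

At a million stored lists, this brute-force loop is exactly what an inverted index avoids: the index maps each item to the lists containing it, so only lists sharing at least one query item are ever scored.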

NEST elasticsearch -C# - Case sensitive Search

We are new to elastic search and NEST.
We are trying to do case sensitive search using C# client - NEST.
We have read lots of posts but could not figure it out. Can someone please help us with detailed step-by-step instructions?
Any help will be highly appreciated.
Thanks,
VB.
I know this is an older question, but I ran across it in my research. So, here's my answer.
First, switching to a TERM query did not help. Upon learning more about how ElasticSearch works by default, I understand why.
By default, ElasticSearch is case-insensitive. When documents are indexed, the default analyzer lowercases all of the string values and keeps the lowercase values for future searches. This does not affect the values stored in the documents themselves, but the lowercasing does affect searches.
If you are using the default analyzer, then your search terms for string values should be all lowercase.
Before I learned how this worked, I spent a fair amount of time looking at a mixed-case field value in an indexed document, then searching with a query term that used the same mixed-case value. Zero results. It wasn't until I forced the value my query used to all lowercase that I started getting results.
You can read more about ElasticSearch analyzers here: ElasticSearch - Analysis
Try a TERM query, as values passed to a TERM query are not analyzed, so ES does not lowercase your input.
Here: http://www.elasticsearch.org/guide/reference/query-dsl/term-query/
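The index-time/query-time asymmetry described above can be simulated in a few lines of plain Python (this is a toy model, not the Elasticsearch client API): the default analyzer lowercases tokens at index time; `match`-style queries analyze the input too, while `term`-style queries compare the input verbatim.

```python
# Toy simulation of Elasticsearch's default-analyzer behaviour.

def analyze(text):
    # The default analyzer lowercases tokens at index time.
    return text.lower().split()

index = analyze("Quick Brown Fox")   # stored tokens are all lowercase

def match_query(q):
    # match queries analyze the input too, so case does not matter.
    return any(tok in index for tok in analyze(q))

def term_query(q):
    # term queries are NOT analyzed: the input is compared verbatim.
    return q in index

match_query("Brown")   # True  - input lowercased before lookup
term_query("Brown")    # False - "Brown" != stored "brown"
term_query("brown")    # True
```

This also shows why a term query alone does not give case-sensitive search against a default-analyzed field: the original casing is gone from the index. True case-sensitive search requires mapping the field with an analyzer (or a `keyword` field) that preserves case at index time.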

How to partial search over lucene index? (search some documents, not all)

Not sure if the title is correct for the purpose, but what I want is to be able to search only a few documents (and not all) from a Lucene index.
Think about it as the following context:
The user wants to search inside a book, which is indexed in Lucene chapter by chapter (every chapter corresponds to a document). The user needs to be able to select the chapters he wants to search in, avoiding irrelevant occurrences for his study.
Is it possible to restrict the search to only some documents, or do I have to search the whole index and then filter the results?
Thank you!
Lucene allows you to apply Query Filters, so that you can restrict the results only for those which match the filter criteria.
So basically you can search for chapter:chapter1 and the search will be limited to chapter-one documents only.
Look at the QueryWrapperFilter. It will let you easily do this kind of thing.
Note however that this is more for ease of coding. This won't really help performance, because in the background, it's effectively searching the entire index, but it makes it easier to code "search within a search." Searching the entire index is not a problem because that's the whole purpose of an index--to make indexed searching extremely fast. This assumes that you have a book ID that is indexed, incidentally. If that is the case, then including the book ID in your search allows for very fast searches of the entire index for that particular book.
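The "search within a search" idea can be sketched in plain Python (a conceptual model, not the Lucene API; documents and field names are made up): restricting to selected chapters is just an extra required condition combined with the text match, exactly what a filter clause expresses.

```python
# Conceptual sketch of filtering a search to selected chapters.

docs = [
    {"chapter": "chapter1", "text": "the history of search engines"},
    {"chapter": "chapter2", "text": "search relevance and ranking"},
    {"chapter": "chapter3", "text": "search infrastructure"},
]

def search(term, chapters):
    # Only documents whose chapter field is in the selected set qualify;
    # within that subset, match the term against the text.
    return [d["chapter"] for d in docs
            if d["chapter"] in chapters and term in d["text"]]

hits = search("search", {"chapter1", "chapter2"})   # chapter3 excluded
```

In Lucene itself the same shape is a BooleanQuery combining the user's query with a required clause on the chapter (or book ID) field; because that field is indexed, the restriction is an index lookup, not a scan.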