MongoDB: custom query - node.js

I have a Node.js application in which people will search for random articles. The articles are stored in MongoDB with the following main schema:
articles_table
- title
- description
- content
The articles must be returned algorithmically, for reliability:
1) I must split the search query into separate words
2) I must search for articles containing the words in the fields title, description, content
3) I must greedily count the words found in each field (title, description, content)
4) I must order the results based on the count from step 3 and then return them sorted
I could perform steps 1 and 2 in MongoDB and steps 3 and 4 in Node.js. However, I'd like to know whether it's possible to do all of these tasks directly in MongoDB. If so, how could I do it?
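One way all four tasks might be pushed into MongoDB is an aggregation pipeline that counts regex matches per field with $regexFindAll (available from MongoDB 4.2) and sorts on the total. The following is only a sketch under assumptions: the helper names escapeRegex and buildSearchPipeline, the matchCount field, case-insensitive substring matching, and matching on any of the words are illustrative choices that the question does not specify.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build an aggregation pipeline (hypothetical helper, not a driver API).
function buildSearchPipeline(query, fields = ["title", "description", "content"]) {
  // Split the query into separate words.
  const words = query.split(/\s+/).filter(Boolean).map(escapeRegex);
  if (words.length === 0) return [];
  const pattern = words.join("|");

  // Count every occurrence of any word in one field.
  const countIn = (field) => ({
    $size: {
      $regexFindAll: {
        input: { $ifNull: ["$" + field, ""] }, // treat a missing field as empty
        regex: pattern,
        options: "i",
      },
    },
  });

  return [
    // Keep articles where at least one field contains a word.
    { $match: { $or: fields.map((f) => ({ [f]: { $regex: pattern, $options: "i" } })) } },
    // Add up the per-field counts (assumed field name: matchCount).
    { $addFields: { matchCount: { $add: fields.map(countIn) } } },
    // Return the articles with the most matches first.
    { $sort: { matchCount: -1 } },
  ];
}

console.log(JSON.stringify(buildSearchPipeline("mongo node"), null, 2));
```

With the official Node.js driver, the result could be passed to something like db.collection("articles").aggregate(pipeline).toArray(), where "articles" is an assumed collection name. Unanchored regex matching cannot use an ordinary index efficiently, so this may be slow on large collections.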

Related

Solr default search field for multiple fields which has different analyzers

I have a document which has title, stockCode, category fields.
I have a different field type (and analysis chain) for each. For instance, title uses EdgeNGram from 2 to 20, category uses EdgeNGram with a different range, 3 to 10, and stockCode has only a lowercase filter.
I don't want to search documents for the keyword "sample" by building a query like title:sample OR stockCode:sample OR category:sample.
I'd like to search with just "q=sample".
I copied my fields to text, but it does not work because all fields are then analyzed the same way. I don't want to index stockCode with EdgeNGram or any other filters. I'd like to index my fields as I configured them and search for a keyword across them based on those indexes.
I've been researching this for three days, and Solr's documentation is a little poor.
You can use the edismax handler, as this will allow you to give a list of fields to query and supply the query by itself. You can also give separate weights to each field for scoring them differently.
defType=edismax&q=sample&qf=title^10 stockCode category
This will search for sample in each of the three fields, giving a 10x boost to any hits in the title field.
You can find the documentation about the edismax query parser under Searching in the reference guide.
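As a rough sketch, the same request could be assembled from Node.js as follows. The host, port and core name in SOLR_SELECT_URL and the helper name buildEdismaxUrl are assumptions for illustration; only defType, q and qf come from the answer above.

```javascript
// Hypothetical Solr location; replace host and core name with your own.
const SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select";

// Build an edismax query URL from a keyword and a list of (optionally boosted) fields.
function buildEdismaxUrl(keyword, weightedFields) {
  const params = new URLSearchParams({
    defType: "edismax",
    q: keyword,
    qf: weightedFields.join(" "), // e.g. "title^10 stockCode category"
  });
  return `${SOLR_SELECT_URL}?${params.toString()}`;
}

const url = buildEdismaxUrl("sample", ["title^10", "stockCode", "category"]);
console.log(url);
```

URLSearchParams takes care of URL-encoding the ^ and the spaces in qf, so the boost syntax can be written as in the plain query string.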

Match only by values in mongodb and return id of the document

I have inserted some JSON data into MongoDB and want to perform a simple search that matches only on values, irrespective of the keys (since the keys differ between documents), and returns the id of each matching document. I don't know how to compare only by values in MongoDB.
Example: if I search for the word "Knowledge", it should return the ids of all documents that contain the word "Knowledge", irrespective of the key it appears under.
You need to use Wildcard Text Indexes.
db.collection.createIndex( { "$**": "text" } )
If there is a static superset of fieldnames, you may find text indexes and the $text query operator useful for word-based searches.
Create the text index on every potential field, and those contained in each document will be included.
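Once a text index exists, the search itself might look like the sketch below, which builds a $text filter and a projection that returns only _id. The helper name textSearchForIds and the collection name "documents" are assumptions; the find(filter, options) call in the comment follows the official Node.js driver and needs a live connection.

```javascript
// Build the filter and options for a word search that returns only document ids.
function textSearchForIds(word) {
  return {
    filter: { $text: { $search: word } },   // uses the collection's text index
    options: { projection: { _id: 1 } },    // return only the _id field
  };
}

const { filter, options } = textSearchForIds("Knowledge");
console.log(JSON.stringify({ filter, options }));

// Usage inside an async function (assumed collection name "documents"):
//   const docs = await db.collection("documents").find(filter, options).toArray();
//   const ids = docs.map((doc) => doc._id);
```

Note that $text matches whole words with stemming and is case-insensitive by default, so it behaves differently from a substring search.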

mongoDB Search without joins

I have two collections: Profiles and Employees.
Employees consists of firstName, lastName etc.
The Profiles collection, among other data, has a bunch of key-value pairs that describe the profession or level of experience, e.g. "software-engineer": true, "javascript": 3
Since you can't have joins in mongoDB I need to search each collection individually and then "join" that result. That leaves me with 2 options:
1) Have two separate search bars on the frontend so that I know which search query belongs to which collection
2) Have a single search bar and search both collections with the same query
Option two is implemented so that a search on a single collection either returns the desired data when the search query has a match or returns ALL data when the search query finds no match. That means searching for "John Doe" gives us John, and searching for "angular" gives us all employees that work with angular. But it also means searching for "john angular" gives us all employees that work with angular OR are called John.
What I actually want is an AND search (like in option 1) but with a single search bar. Is there a way to implement this in MongoDB, or is this only possible in a relational database?

The implication of @search.score in Azure Search Service

I understand the reason for having a search profile and boosting results based on fields such as distance, rating, etc. To me, that is mostly applicable to structured documents like JSON files. The scenario I cannot make sense of is when an indexer has the search service index, say, an MS Word or PDF document in Azure Blob storage. We then have two fields, "id" and "content", and I don't know how the search score would apply to them.
For example, there are two documents with different contents. I searched for a keyword, and the same keyword was found in both MS Word documents, yet they received two different scores. Why should the scores differ when both documents contain the same keyword?
The score is determined by many factors, for example, the count of terms in each document and the number of searchable fields in which query terms were found. In your example, the documents have different lengths, so naturally they'll have different scores. HTH.
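To make the length effect concrete, here is a small illustration using the term-frequency part of a generic BM25-style formula. This is not necessarily the formula the service uses; the function name termScore, the parameter values k1 = 1.2 and b = 0.75, and the document lengths are all assumptions chosen for the example.

```javascript
// Term-frequency part of a BM25-style score (the IDF factor is omitted for brevity).
// A document longer than average is penalised; a shorter one is favoured.
function termScore(termCount, docLength, avgDocLength, k1 = 1.2, b = 0.75) {
  const lengthNorm = 1 - b + b * (docLength / avgDocLength);
  return (termCount * (k1 + 1)) / (termCount + k1 * lengthNorm);
}

// The same keyword occurs once in both documents; only the length differs.
const shortDoc = termScore(1, 100, 500);
const longDoc = termScore(1, 900, 500);

console.log(shortDoc > longDoc); // true: the shorter document scores higher
```

Under these assumptions a single occurrence of the keyword counts for more in the short document, which is one reason two documents containing the same keyword can receive different scores.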

Lucene is not finding results that are present in the index

I'm inspecting a Lucene index with Luke.
All documents have a field 'Title' and I would like to do a search for the search expression Title:Power, by which I want to find all documents with a title containing the word Power.
In Luke, I go to the tab "Search" and enter +Title:Power
When searching, there are no results. However, when I search by another field, I do find the document: +ContentType:MyContentType
In the column Title, I can clearly see the value of the document being: Power Quality Guide.
What could be the reasons I'm not finding this document when searching on Title?
There can be a number of reasons. Most common ones:
- The Title field could be stored in the index but not indexed for search (Field.Store.YES, Field.Index.NO), unlike the field for which you can find results (ContentType);
- the document(s) could have been indexed using one analyzer while the query uses a different one;
- the document could have been indexed using the NOT_ANALYZED option, which stores the field value as a single term.
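The last two reasons can be illustrated with a toy inverted index. This is not Lucene code: lowercaseAnalyzer, keywordAnalyzer and buildIndex are hypothetical stand-ins that only mimic how a lowercasing analyzer and a not-analyzed field would turn a title into terms.

```javascript
// Toy "analyzers": one lowercases and splits into words, the other keeps the value as one term.
const lowercaseAnalyzer = (text) => text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
const keywordAnalyzer = (text) => [text];

// Build a map from term to the set of document ids containing it.
function buildIndex(values, analyzer) {
  const index = new Map();
  values.forEach((value, docId) => {
    for (const term of analyzer(value)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(docId);
    }
  });
  return index;
}

const titles = ["Power Quality Guide"];
const analyzed = buildIndex(titles, lowercaseAnalyzer);
const notAnalyzed = buildIndex(titles, keywordAnalyzer);

console.log(analyzed.has("Power"));                  // false: the indexed term is "power"
console.log(analyzed.has("power"));                  // true
console.log(notAnalyzed.has("Power"));               // false: the only term is the full title
console.log(notAnalyzed.has("Power Quality Guide")); // true
```

In both cases an exact term lookup for Power misses, even though the stored value shown for the document clearly contains that word.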

Resources