Return only _source from a search - node.js

Is it possible to only retrieve the _source document(s) when I execute a search query with the (official) nodejs-elasticsearch library? According to the documentation, there seems to be a way, sort of:
Use the /{index}/{type}/{id}/_source endpoint to get just the _source field of the document, without any additional content around it. For example:
curl -XGET 'http://localhost:9200/twitter/tweet/1/_source'
And the corresponding API call in the nodejs library is:
client.getSource([params, [callback]])
However, this method only seems to be able to retrieve documents on an ID basis. I need to issue a full search body (with filters and query_strings and whatnot), which this method doesn't support.
I'm running ES 1.4

You can use "fields" for this. See a simplified example below. Go ahead and customize your query as per your requirement:
{
"fields": [
"_source"
],
"query": {
"match_all": {}
}
}
The value of fields _index, _type, _id and _score will always be present in the response of Search API.

Related

Unable to full text search in Solr

I have some data in solr. I want to search which name is Chinmay Sahu See below I have 3 results in output. But I got 3 instead of 1. Because Content name searched partially.
I want to full search those name having Chinmay Sahu only that contents will come.
Output:
"docs": [
{
"id": "741fde46a654879949473b2cdc577913",
"content_id": "1277",
"name": "Chinmay Sahu",
"_version_": 1596995745829879800
},
{
"id": "4e98d680efaab3afe051f3ddc00dc5f2",
"content_id": "1825",
"name": "Chinmay Panda",
"_version_": 1596995745829879800
}
{
"id": "741fde46a654879949473b2cdc577913",
"content_id": "1259",
"name": "Sasmita Sahu",
"_version_": 1596995745829879800
}
]
Query:
name:Chinmay Sahu
Expected :
"docs": [
{
"id": "741fde46a654879949473b2cdc577913",
"content_id": "1277",
"name": "Chinmay Sahu",
"_version_": 1596995745829879800
},
]
Please help
Try doing this
name:"Chinmay Sahu"
You need to do a phrase query to match the exact name.
I am guessing in your case the name field is using Standard tokenizer which will split tokens if whitespace is there. So while indexing in all the 3 docs there will be a token called "chinmay".
While you search using
name:Chinmay Sahu
Solr will search it like this since if there is no fieldName specified before a token solr automatically searches it in default_field.(however default field is removed from solr 7.3, So it depends on what version of solr are you using.
)
Name:chinmay AND default_field:sahu
So since all the three docs are having chinmay as a token in the index,the query will match all 3 docs.
Now i dont know what your default field is? can you post your solr schema? That way we can explain why you are seeing those 3 docs.
Since root545 already explained that field:foo bar will search for foo in field and bar in the default search field, I'll suggest that it seems like you don't want to concern yourself with the exact Lucene syntax for searching. The edismax query parser is well suited for separating the typed search string from what fields are being searched and whether you want all tokens to match.
The query in that case would be just Chinmay Sahu, while you'd set q.op=AND (all terms must match), defType=edismax (use the edismax query parser) and qf=name (search the name field):
q=Chinmay Sahu&q.op=AND&defType=edismax&qf=name
You can also tune the different phrase parameters to make sure that names with the tokens in the exact same sequence will be boosted higher than those that have them in the opposite sequence (i.e. Sahu Chinmay).
If this is a programmatic search where no user is actually typing in the suggestion, using a phrase search as suggested is the way to go (name:"Chinmay Sahu").
I would suggest using query like
name:(Chinmay Sahu)
And make sure default operator is AND either in settings or query string like q.op=AND
With that approach you can use user input much easier since you don't need to parse it too much.

JSON object selector to describe the criteria for querying documents in Azure Cosmos/ Document DB

I am using a Javascript Azure function to bind to CosmosDB (Document DB) and query documents within a collection. I would like the SELECT query to be formed based on a JSON object that would be coming in the request body. IBM Cloudant provides a feature wherein you can pass a JSON object (selector) to describe the criteria for selecting documents. How do I achieve the same in Azure?
The JSON selector looks like this-
{
"selector": {
"id": {
"$gt": 0
},
USERS": {
"username": "Jack",
"department": "HR"
}
}
}
The sql-from-mongo npm package provides conversion of expressions similar to these into CosmosDB SQL. The module can be easily loaded in your Azure Functions or it can even be loaded into a sproc with a little bit of manipulation.
Full disclosure: I'm the author of the npm package.

Marklogic QueryByExample in collection NodeJS

TLDR
Is there a way to limit queryByExample to a collection in NodeJS?
Problem faced
I have a complex query with some optional fields (i.e. sometimes some search fields will be omitted). So I need to create a query dynamically, e.g. in JSON. QueryByExample seems to be the right tool to use here as it gives me that flexibility to pass a JSON. However my problem is that I would like to limit my search to only one collection or directory.
e.g. I was hoping for something like
searchJSON = {
title: { $word: "test" },
description: { $word: "desc" }
};
//query
db.documents.query(qb.where(
qb.collection("collectionName"),
qb.byExample(searchJSON)
)).result()...
In this case searchJSON could have been built dynamically, for example maybe sometimes title may be omitted from the search.
This doesn't work because the query builder only allows queryByExample to be the only query. But I'd instead like to built a dynamic search query which is limited to a collection or directory.
At present, I think you would have to express the query with QueryBuilder instead of Query By Example using
qb.and([
qb.collection('collectionName'),
qb.word('title', 'test'),
qb.word('description', 'desc')
])
See http://docs.marklogic.com/jsdoc/queryBuilder.html#word
That said, it should be possible for the Node.js API to relax that restriction based on the fixes in MarkLogic 9.0-2
Please file an issue on https://github.com/marklogic/node-client-api

Elasticsearch - Why am I not getting the same search results after updating a document?

Here's what I'm doing:
First, I make a search and get some documents
curl -XPOST index/type/_search
{
"query" : {
"match_all": {}
},
"size": 10
}
Then, I'm updating one of the documents resulted in the search
curl -XPOST index/type/_id/_update
{
"doc" : {
"some_field" : "Some modification goes here."
}
}
And finally, I'm doing exactly the same search as above.
But the curious thing is that I get all the previous documents, except the updated one. Why is it no longer among the documents in the search?
Thank you!
Since you're not sorting your documents, they are sorted by score. Your modification might have changed the document score after which the documents are sorted by default.
And since you're only taking the first 10 documents, you have no guarantee that your new document will come back in those 10 documents.

Change notification in CouchDB when a field is set

I'm trying to get notifications in a CouchDB change poll as soon as pre-defined field is set or changed. I've already had a look at filters that can be used for filtering change events(db/_changes?filter=myfilter). However, I've not yet found a way to include this temporal information, because you can only get the current version of the document in this filter functions.
Is there any possibility to create such a filter?
If it does not work, I could export my field to a separate database and the only poll for changes in that db, but I'd prefer to keep together my data for obvious reasons.
Thanks in advance!
You are correct: filters and _changes feeds can only see snapshots of a document. What you need is a function which can see the old document and the new document and act correctly. But that is unavailable in _filters and _changes.
Obviously your client code knows if it updates that field. You might update your client code however there is a better solution.
Update functions can access both documents. I suggest you make an _update
function which notices the field change and flags that in the document. Next you
have a simple filter checking for that flag. The best part is, you can use a
rewrite function to make the HTTP API exactly the same as before.
1. Create an update function to flag interesting updates
Your _design/myapp would be {"updates", "smart_updater": "(see below)"}.
Update functions are very flexible (see my recent update handlers
walkthrough). However we only want to mimic the normal HTTP/JSON API.
Your updates.smart_updater field would look like this:
function (doc, req) {
var INTERESTING = 'dollars'; // Set me to the interesting field.
var newDoc = JSON.parse(req.body);
if(newDoc.hasOwnProperty(INTERESTING)) {
// dollars was set (which includes 0, false, null, undefined
// values. You might test for newDoc[INTERESTING] if those
// values should not trigger this code.
if((doc === null) || (doc[INTERESTING] !== newDoc[INTERESTING])) {
// The field changed or created!
newDoc.i_was_changed = true;
}
}
if(!newDoc._id) {
// A UUID generator would be better here.
newDoc._id = req.id || Math.random().toString();
}
// Return the same JSON the vanilla Couch API does.
return [newDoc, {json: {'id': newDoc._id}}];
}
Now you can PUT or POST to /db/_design/myapp/_update/[doc_id] and it will feel
just like the normal API except if you update the dollars field, it will add
an additional flag, i_was_changed. That is how you will find this change
later.
2. Filter for documents with the changed field
This is very straightforward:
function(doc, req) {
return doc.i_was_changed;
}
Now you can query the _changes feed with a ?filter= parameter. (Replication
also supports this filter, so you could pull to your local system all documents
which most recently changed/created the field.
That is the basic idea. The remaining steps will make your life easier if you
already have lots of client code and do not want to change the URLs.
3. Use rewriting to keep the HTTP API the same
This is available in CouchDB 0.11, and the best resource is Jan's blog post,
nice URLs in CouchDB.
Briefly, you want a vhost which sends all traffic to your rewriter (which itself
is a flexible "bouncer" to all design doc functionality based on the URL).
curl -X PUT http://example.com:5984/_config/vhosts/example.com \
-d '"/db/_design/myapp/_rewrite"'
Then you want a rewrites field in your design doc, something like (not
tested)
[
{
"comment": "Updates should go through the update function",
"method": "PUT",
"from": "db/*",
"to" : "db/_design/myapp/_update/*"
},
{
"comment": "Creates should go through the update function",
"method": "POST",
"from": "db/*",
"to" : "db/_design/myapp/_update/*"
},
{
"comment": "Everything else is just like normal",
"from": "*",
"to" : "../../../*"
}
]
(Once again, I got this code from examples and existing code I have laying
around but it's not 100% debugged. However I think it makes the idea very clear.
Also remember this step is optional however the advantage is, you never have to
change your client code.)

Resources