SOLR Partial / Exact word Search term Highlighting - search

I am new to SOLR. The query returns the documents I want, but the highlighter marks the whole matching word instead of just the search term.
SOLR Version: 8.6.3
Query: /select?q=*sta*&df=title&start=0&rows=2&mm=100&defType=edismax&fl=title&hl.method=unified&pf3=title^3&ps3=1&pf2=title^3&ps2=1&qf=title&hl=on&hl.q=*sta*
Existing:
Search Term: sta
"docs": [
{
"title": "Stanford University",
},
{
"title": "Texas State University",
}
]
},
"highlighting": {
"9781743310": {
"title": [
"<em>Stanford</em> University"
]
},
"9781909421": {
"title": [
"Texas <em>State</em> University"
]
}
Expected:
Search Term: sta
"docs": [
{
"title": "Stanford University",
},
{
"title": "Texas State University",
}
]
},
"highlighting": {
"9781743310": {
"title": [
"<em>Sta</em>nford University"
]
},
"9781909421": {
"title": [
"Texas <em>Sta</em>te University"
]
}
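The expected output above amounts to wrapping only the matched characters rather than the whole token. One way to get that shape, sketched here as client-side post-processing and not as a Solr highlighter setting, is to re-highlight the stored title after the response arrives. The helper name highlightTerm is hypothetical, and the sketch does not HTML-escape the title.

```javascript
// Client-side sketch: wrap every case-insensitive occurrence of `term` in <em> tags.
// This runs outside Solr; `highlightTerm` is a hypothetical helper name.
function highlightTerm(text, term) {
  if (!term) return text;
  // Escape regex metacharacters so the term is matched literally.
  const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return text.replace(new RegExp(escaped, "gi"), (match) => `<em>${match}</em>`);
}

const stanford = highlightTerm("Stanford University", "sta");
const texas = highlightTerm("Texas State University", "sta");
console.log(stanford); // <em>Sta</em>nford University
console.log(texas);    // Texas <em>Sta</em>te University
```

The original casing of the match is preserved because the replacement reuses the matched text instead of the search term.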

MongoDB Aggregation - How to condition for $set

I am unwinding a lookup with the { preserveNullAndEmptyArrays: true } option so that documents whose array field is null, missing, or an empty array are kept. In my case I also need a nested lookup, and the output is not what I expected.
Specifically, I am trying to set a new buyer field with the $set operator from the nested lookup, but if that lookup is null or an empty array, buyers comes back as [ { } ], which I don't want. How can I replace [ { } ] with an empty array?
The result I got:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"buyers": [
{
"_id": ObjectId("5a934e000102030405000003"),
"buyer": {
"_id": ObjectId("5a934e000102030405000004"),
"display_name": "Phineas",
"user_id": "123"
},
"post_id": 1,
"user_id": "123"
}
],
"content": "content book 1",
"title": "title book 1"
},
{
"_id": ObjectId("5a934e000102030405000002"),
"buyers": [
{}
],
"content": "content book 3",
"title": "title book 3"
},
{
"_id": ObjectId("5a934e000102030405000001"),
"buyers": [
{}
],
"content": "content book 2",
"title": "title book 2"
}
]
The result that I expected:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"buyers": [
{
"_id": ObjectId("5a934e000102030405000003"),
"buyer": {
"_id": ObjectId("5a934e000102030405000004"),
"display_name": "Phineas",
"user_id": "123"
},
"post_id": 1,
"user_id": "123"
}
],
"content": "content book 1",
"title": "title book 1"
},
{
"_id": ObjectId("5a934e000102030405000002"),
"buyers": [],
"content": "content book 3",
"title": "title book 3"
},
{
"_id": ObjectId("5a934e000102030405000001"),
"buyers": [],
"content": "content book 2",
"title": "title book 2"
}
]
My code is in this Mongo Playground link:
https://mongoplayground.net/p/qiC7FySejxF
Comparing your actual output with the MongoDB Playground link, I believe you left out the $unset stage:
{
$unset: "buyers.buyers"
}
After the $unset stage, add a $set stage that rewrites the buyers field, using $filter to keep only the elements that are not empty documents:
{
$set: {
buyers: {
$filter: {
input: "$buyers",
cond: {
$ne: [
"$$this",
{}
]
}
}
}
}
}
Demo: Mongo Playground
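To see what the $filter condition does without running a pipeline, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code: dropEmptyDocuments is a hypothetical helper, and the sample documents are trimmed versions of the ones in the question.

```javascript
// In-memory sketch of the $filter condition { $ne: ["$$this", {}] }.
// `dropEmptyDocuments` is a hypothetical helper, not a MongoDB API.
function dropEmptyDocuments(buyers) {
  // Keep only elements that have at least one field.
  return buyers.filter((buyer) => Object.keys(buyer).length > 0);
}

const books = [
  { title: "title book 1", buyers: [{ post_id: 1, user_id: "123" }] },
  { title: "title book 3", buyers: [{}] },
];

// Equivalent of the $set stage: replace `buyers` with the filtered array.
const cleaned = books.map((book) => ({
  ...book,
  buyers: dropEmptyDocuments(book.buyers),
}));

console.log(JSON.stringify(cleaned, null, 2));
```

The first book keeps its single buyer, while the second ends up with an empty buyers array, which is the shape asked for in the question.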

Mongo Query to get a result

My Mongo collection is named tests and contains the following documents:
[
{
"title": "One",
"uid": "1",
"_metadata": {
"references": [
{
"uid": "2"
},
{
"asssetuid": 10
}
]
}
},
{
"title": "Two",
"uid": "2",
"_metadata": {
"references": [
{
"uid": "3"
},
{
"asssetuid": 11
}
]
}
},
{
"title": "Three",
"uid": "3",
"_metadata": {
"references": []
}
}
]
And I want the result in the following format (for uid:1)
[
{
"title": "One",
"uid": 1,
"_metadata": {
"references": [
{
"asssetuid": 10
},
{
"asssetuid": 11
},
{
"title": "Two",
"uid": "2",
"_metadata": {
"references": [
{
"title": "Three",
"uid": "3"
}
]
}
}
]
}
}
]
For uid:2, I want the following result:
[
{
"title": "Two",
"uid": 2,
"_metadata": {
"references": [
{
"asssetuid": 11
},
{
"title": "Three",
"uid": "3"
}
]
}
}
]
Which query can I use to get the respective result for a given uid? I want the result expressed as a parent-child relationship. Is this possible with a MongoDB $graphLookup query, or with any other query? Please help me with this.
New Type Output
[{
"title": "One",
"uid": 1,
"_metadata": {
"assets": [{
"asssetuid": 10,
"parent": 1
}, {
"asssetuid": 11,
"parent": 2
}],
"entries": [{
"title": "Two",
"uid": "2",
"parent": 1
}, {
"title": "Three",
"uid": "3",
"parent": 2
}]
}
}]
MongoDB supports automatic reference resolution using $ref, but you need to change your schema a little, and the resolution is only supported by some drivers.
You need to store your data in this format:
[
...
{
"_id": ObjectId("5a934e000102030405000000"),
"_metadata": {
"references": [
{
"$ref": "collection",
"$id": ObjectId("5a934e000102030405000001"),
"$db": "database"
},
{
"asssetuid": 10
}
]
},
"title": "One",
"uid": "1"
},
....
]
For more details on $ref, refer to the official documentation: label-document-references
OR
you can resolve the references using $graphLookup. The only problem with $graphLookup is that you will lose the asssetuid. Here is a query that resolves the references and returns the output as a flat list:
db.collection.aggregate([
{
$match: {
uid: "1"
}
},
{
$graphLookup: {
from: "collection",
startWith: "$_metadata.references.uid",
connectFromField: "_metadata.references.uid",
connectToField: "uid",
depthField: "depth",
as: "resolved"
}
},
{
"$addFields": {
"references": "$resolved",
"metadata": [
{
"_metadata": "$_metadata"
}
]
}
},
{
"$project": {
"references._metadata": 0,
}
},
{
"$project": {
"references": "$references",
"merged": {
"$concatArrays": [
"$metadata",
"$resolved"
]
}
}
},
{
"$project": {
results: [
{
merged: "$merged"
},
{
references: "$references"
}
]
}
},
{
"$unwind": "$results"
},
{
"$facet": {
"assest": [
{
"$match": {
"results.merged": {
"$exists": true
}
}
},
{
"$unwind": "$results.merged"
},
{
"$unwind": "$results.merged._metadata.references"
},
{
"$match": {
"results.merged._metadata.references.asssetuid": {
"$exists": true
}
}
},
{
"$project": {
_id: 0,
"asssetuid": "$results.merged._metadata.references.asssetuid"
}
}
],
"uid": [
{
"$match": {
"results.references": {
"$exists": true
}
}
},
{
"$unwind": "$results.references"
},
{
$replaceRoot: {
newRoot: "$results.references"
}
}
]
}
},
{
"$project": {
"references": {
"$concatArrays": [
"$assest",
"$uid"
]
}
}
}
])
Here is the link to the playground to test it: Mongo Playground
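The traversal that the $graphLookup stage performs can be sketched in plain JavaScript over an in-memory copy of the collection. This is only an illustration under stated assumptions: graphLookup and referencedUids are hypothetical helpers, the walk is breadth-first, and MongoDB itself does not guarantee the order of the resolved array.

```javascript
// Sample data mirroring the documents in the question.
const tests = [
  { title: "One", uid: "1", _metadata: { references: [{ uid: "2" }, { asssetuid: 10 }] } },
  { title: "Two", uid: "2", _metadata: { references: [{ uid: "3" }, { asssetuid: 11 }] } },
  { title: "Three", uid: "3", _metadata: { references: [] } },
];

// Collect the uid values referenced by one document, skipping asset entries.
function referencedUids(doc) {
  return doc._metadata.references
    .filter((ref) => ref.uid !== undefined)
    .map((ref) => ref.uid);
}

// Hypothetical helper: follow uid references level by level, recording depth.
function graphLookup(collection, startUid) {
  const byUid = new Map(collection.map((doc) => [doc.uid, doc]));
  const start = byUid.get(startUid);
  const resolved = [];
  const seen = new Set();
  let frontier = start ? referencedUids(start) : [];
  for (let depth = 0; frontier.length > 0; depth += 1) {
    const next = [];
    for (const uid of frontier) {
      const doc = byUid.get(uid);
      if (!doc || seen.has(uid)) continue; // skip dangling references and cycles
      seen.add(uid);
      resolved.push({ title: doc.title, uid: doc.uid, depth });
      next.push(...referencedUids(doc));
    }
    frontier = next;
  }
  return resolved;
}

const resolved = graphLookup(tests, "1");
console.log(resolved);
```

For uid "1" this yields "Two" at depth 0 and "Three" at depth 1, matching the way depthField counts from zero for directly referenced documents.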

Return the documents if the array field length greater than 0 in esclient nodejs

I have millions of documents in my es index.
I want to fetch the documents where the length of an array field is greater than 0.
My docs look like this:
[
{
"primaryKey": "9c30d9e8-af04-4cc8-afcb-0c1311988c1e",
"language": "all",
"industry": [
"Accounting & auditing"
],
"text": "what's the status of my incident?",
"textId": "d0c70fc4-5e2a-4cab-a5f6-32339e6632dd",
"extractions": [],
"active": true,
"status": "active",
"createdAt": 1620208485092,
"updatedAt": 1620208485092,
"secondaryKey": "5db5f725-ec09-49da-9507-7bb2f94fd741"
},
{
"primaryKey": "9c30d9e8-af04-4cc8-afcb-0c1311988c1e",
"language": "all",
"industry": [
"Accounting"
],
"text": "What is the rating of my incident",
"textId": "4a53533f-293e-440c-aaa9-f7e5ae1436ca",
"extractions": [
{
"name": "Abinas Patra",
"role": "api-user",
"primaryKey": "ed12851d-c18d-4c92-8cc3-1782e41bc9d0"
},
{
"name": "Anil Patra",
"role": "ui-user",
"primaryKey": "933fad33-78b3-4779-a7bd-c62c6e02af75"
}
],
"active": true,
"status": "active",
"createdAt": 1620208485092,
"updatedAt": 1620208485092,
"secondaryKey": "5db5f725-ec09-49da-9507-7bb2f94fd741"
}
]
I am using the Elasticsearch Node.js client.
I tried the following:
let dataCount = await esClient.count({
index: "indexName",
type: "docType",
body: {
query: {
bool: {
must: [
{
"script": {
"script": {
"inline": "doc['extractions'].values.length > 0",
"lang": "painless"
}
}
},
{
"match": {
"primaryKey": {
query: primaryKey,
"operator": "and"
}
}
},
{
"match": {
"language": {
query: language,
"operator": "and"
}
}
}
]
}
}
}
});
I get a runtime error every time; I tried an exists query on the field as well.
{"error":{"root_cause":[{"type":"script_exception","reason":"runtime error","script_stack":["org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:65)","org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:27)","doc[\'extractions\'].values.length > 1"," ^---- HERE"],
I tried this as well:
must_not:[
{
"script": {
"script": "_source.extractions.size() > 0"
}
}
]
Can anyone please help here.
thanks :)
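One direction that may be worth trying, offered as a sketch and not a confirmed fix, is to replace the script with an exists query on a sub-field of each extraction. Elasticsearch treats an empty array as a field with no values, so exists only matches documents that hold at least one extraction. The sketch assumes extractions is mapped as a plain object field (a nested mapping would need a nested query around it) and that every extraction carries a primaryKey; buildNonEmptyExtractionsQuery is a hypothetical helper name.

```javascript
// Sketch of a count request body that keeps only documents with at least one
// extraction. Assumes `extractions` is a plain object field (not `nested`)
// and that every extraction has a `primaryKey` sub-field.
function buildNonEmptyExtractionsQuery(primaryKey, language) {
  return {
    query: {
      bool: {
        filter: [
          // An empty array is indexed as if the field were absent, so `exists`
          // only matches documents with one or more extractions.
          { exists: { field: "extractions.primaryKey" } },
        ],
        must: [
          { match: { primaryKey: { query: primaryKey, operator: "and" } } },
          { match: { language: { query: language, operator: "and" } } },
        ],
      },
    },
  };
}

const body = buildNonEmptyExtractionsQuery("9c30d9e8-af04-4cc8-afcb-0c1311988c1e", "all");
console.log(JSON.stringify(body, null, 2));
```

The returned object would be passed as the body of the esClient.count call shown in the question, in place of the bool query that contains the script clause.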

How can I update one document at nested array

{
"_id": "5e28b029a0c8263a8a56980a",
"name": "Recruiter",
"data": [
{
"_id": "5e28b0980f89ba3c0782828f",
"targetLink": "https://www.linkedin.com/in/dan-kelsall-7aa0926b/",
"name": "Dan Kelsall",
"headline": "Content Marketing & Copywriting",
"actions": [
{
"result": 1,
"name": "VISIT"
},
{
"result": 1,
"name": "FOLLOW"
}
]
},
{
"_id": "5e28b0980f89ba3c078283426f",
"targetLink": "https://www.linkedin.com/in/56wergwer/",
"name": "56wergwer",
"headline": "asdgawehethre",
"actions": [
{
"result": 1,
"name": "VISIT"
}
]
}
]
}
Here is one of my MongoDB documents. I'd like to update data->actions->result.
This is what I've tried:
Campaign.updateOne({
'data.targetLink': "https://www.linkedin.com/in/dan-kelsall-7aa0926b/",
'data.actions.name': "Follow"
}, {$set: {'data.$.actions.result': 0}})
But it does not seem to update anything, and it can't even find the document by 'data.actions.name'.
You need the filtered positional operator ($[<identifier>]), since the regular positional operator ($) can only be used for one level of nested arrays:
Campaign.updateOne(
{ "_id": "5e28b029a0c8263a8a56980a", "data.targetLink": "https://www.linkedin.com/in/dan-kelsall-7aa0926b/" },
{ $set: { "data.$.actions.$[action].result": 0 } },
{ arrayFilters: [ { "action.name": "FOLLOW" } ] } // must match the stored name exactly; the sample document uses "FOLLOW"
)
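The effect of the two positional operators can be sketched on an in-memory copy of the document. This is plain JavaScript and not a MongoDB or Mongoose call: setActionResult is a hypothetical helper, and the sample object is trimmed from the document in the question. It also shows that the name comparison is case-sensitive, so the filter value has to match the stored name exactly.

```javascript
// In-memory sketch of what "data.$.actions.$[action].result" does.
// `setActionResult` is a hypothetical helper, not a Mongoose or MongoDB API.
function setActionResult(doc, targetLink, actionName, result) {
  // "$" picks the first element of `data` that matched the query filter.
  const entry = doc.data.find((item) => item.targetLink === targetLink);
  if (!entry) return 0;
  let modified = 0;
  // "$[action]" visits every element of `actions` that passes the array filter.
  for (const action of entry.actions) {
    if (action.name === actionName && action.result !== result) {
      action.result = result;
      modified += 1;
    }
  }
  return modified;
}

const link = "https://www.linkedin.com/in/dan-kelsall-7aa0926b/";
const campaign = {
  name: "Recruiter",
  data: [
    { targetLink: link, actions: [{ result: 1, name: "VISIT" }, { result: 1, name: "FOLLOW" }] },
    { targetLink: "https://www.linkedin.com/in/56wergwer/", actions: [{ result: 1, name: "VISIT" }] },
  ],
};

const wrongCase = setActionResult(campaign, link, "Follow", 0); // no element is named "Follow"
const exactCase = setActionResult(campaign, link, "FOLLOW", 0);
console.log(wrongCase, exactCase);
```

Only the FOLLOW action of the first data element changes; the VISIT actions and the second element are left alone.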

cloudant searching index by multiple values

Cloudant is returning error message:
{"error":"invalid_key","reason":"Invalid key use-index for this request."}
whenever I try to query against an index with the combination operator, "$or".
A sample of what my documents look like is:
{
"_id": "28f240f1bcc2fbd9e1e5174af6905349",
"_rev": "1-fb9a9150acbecd105f1616aff88c26a8",
"type": "Feature",
"properties": {
"PageName": "A8",
"PageNumber": 1,
"Lat": 43.051523,
"Long": -71.498852
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-71.49978935969642,
43.0508382914137
],
[
-71.49978564033566,
43.052210148524
],
[
-71.49791499857444,
43.05220740550381
],
[
-71.49791875962663,
43.05083554852429
],
[
-71.49978935969642,
43.0508382914137
]
]
]
}
}
The index I created is on the field "properties.PageName". It works fine when I query for a single document, but as soon as I query for several, I receive the error response quoted at the beginning.
If it helps any, here is the call:
POST https://xyz.cloudant.com/db/_find
request body:
{
"selector": {
"$or": [
{ "properties.PageName": "A8" },
{ "properties.PageName": "M30" },
{ "properties.PageName": "AH30" }
]
},
"use-index": "pagename-index"
}
In order to perform an $or query you need to create a text (full text) index, rather than a json index. For example, I just created the following index:
{
"index": {
"fields": [
{"name": "properties.PageName", "type": "string"}
]
},
"type": "text"
}
I was then able to perform the following query:
{
"selector": {
"$or": [
{ "properties.PageName": "A8" },
{ "properties.PageName": "M30" },
{ "properties.PageName": "AH30" }
]
}
}
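When the list of page names is dynamic, the selector above can be generated instead of written by hand. The following is a minimal sketch; buildPageNameSelector is a hypothetical helper name, and the result is meant to be sent as the JSON body of the POST to _find shown in the question.

```javascript
// Sketch: build the "$or" selector for any list of page names.
// `buildPageNameSelector` is a hypothetical helper name.
function buildPageNameSelector(pageNames) {
  return {
    selector: {
      // One equality clause per requested page name.
      $or: pageNames.map((name) => ({ "properties.PageName": name })),
    },
  };
}

const requestBody = buildPageNameSelector(["A8", "M30", "AH30"]);
console.log(JSON.stringify(requestBody, null, 2));
```

If you do want to pin an index in the request, note that the Cloudant Query field is spelled use_index, with an underscore. The hyphenated use-index from the question is not a documented key, which may be what the invalid_key message is pointing at.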
