New documents not showing up in Elasticsearch index - python-3.x

I am new to Elasticsearch and was experimenting with it today. I have a node running on my localhost and was creating/updating my cat index. As I added more documents to the cat index, I noticed that when I send a GET request in Postman to see all of the documents, the new cats I create are not there. I first noticed the issue after I added my tenth cat. All code is below.
ElasticSearch Version: 6.4.0
Python Version: 3.7.4
my_cat_mapping = {
"mappings": {
"_doc": {
"properties": {
"breed": { "type": "text" },
"info" : {
"cat" : {"type" : "text"},
"name" : {"type" : "text"},
"age" : {"type" : "integer"},
"amount" : {"type" : "integer"}
},
"created" : {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}
cat_body = {
"breed" : "Persian Cat",
"info":{
"cat":"Black Cat",
"name": " willy",
"age": 5,
"amount": 1
}
}
from elasticsearch import Elasticsearch
es = Elasticsearch()  # assumption: client for the node on localhost:9200

def document_add(index_name, doc_type, body, doc_id=None):
    """Function to add a document by providing index_name,
    document type, document contents as body and document id."""
    resp = es.index(index=index_name, doc_type=doc_type, body=body, id=doc_id)
    print(resp)

document_add("cat", "cat_v1", cat_body, 100)

Because the document id is always passed as 100, each call just updates the same cat document. I'm assuming the id is not changed between runs.
To add a new cat instead of updating the existing one, you have to pass a different doc_id each time.
...
cat_id = 100
cat_body = {
"breed" : "Persian Cat",
"info":{
"cat":"Black Cat",
"name": " willy",
"age": 5,
"amount": 1
}
}
...
document_add("cat", "cat_v1", cat_body, cat_id )
With this you can change both cat_id and cat_body to get new cats.
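To make the overwrite behaviour concrete, here is a minimal sketch that does not talk to a real cluster. FakeIndexClient is a hypothetical in-memory stand-in (not part of the elasticsearch package) that only mimics how indexing treats ids: a repeated id replaces the stored document, and a missing id gets a generated one.

```python
import uuid


class FakeIndexClient:
    """Hypothetical in-memory stand-in that mimics indexing keyed by id."""

    def __init__(self):
        self.docs = {}

    def index(self, index, doc_type, body, id=None):
        # Generate an id when none is given; overwrite when the id exists.
        doc_id = str(id) if id is not None else uuid.uuid4().hex
        result = "updated" if doc_id in self.docs else "created"
        self.docs[doc_id] = body
        return {"_id": doc_id, "result": result}


client = FakeIndexClient()
willy = {"breed": "Persian Cat", "info": {"name": "willy", "age": 5}}
tom = {"breed": "Siamese Cat", "info": {"name": "tom", "age": 3}}

# Same id twice: the second call replaces the first document.
first = client.index("cat", "cat_v1", willy, id=100)
second = client.index("cat", "cat_v1", tom, id=100)

# No id: each call stores a separate document under a generated id.
third = client.index("cat", "cat_v1", willy)
fourth = client.index("cat", "cat_v1", tom)

print(first["result"], second["result"], len(client.docs))
```

With the real client, the "result" field of the response printed by document_add should tell you in the same way whether a call created a new document or updated an existing one.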

Related

Elasticsearch Empty Filters Values

I have the following Elasticsearch query that I use with python3. My Elasticsearch version is 7.9.3
search_body = {
"explain": "true",
"query":{
"bool": {
"must": {
"multi_match":{
"query": "some text",
"fields":[
f"title^5",
f"description^3"
],
"tie_breaker": 0.5
}
},
"should":[
{
"match_phrase": {
"position_term": "some text"
}
}
],
"filter": basic_filters
}
}
}
I set out filters separately.
# filters
user_trade = [529, 601]
user_exp = -100
user_minsalary = 0
user_schedule = 2
user_branch = [10, 15, 16]
basic_filters = [
{"terms" : {"trade" : user_trade}},
{"term" : {"experience" : user_exp}},
{"range" : {"min_salary" : {"gte": user_minsalary}}},
{"term" : {"schedule" : user_schedule}},
{"terms" : {"branch" : user_branch}}
]
I want to modify the filters so that if any filter variable is empty, the search ignores that filter (returning documents with any value for that field) while still filtering by the others. If all filter variables are empty, the search should return every document that matches only the "query" and "position_term" values (they are the same value). I don't know how to do this properly.
I'm a complete beginner in Elasticsearch and hope someone can help.
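One possible approach is to build the filter list in Python and append a clause only when its variable has a value. The sketch below assumes that "empty" means None (or an empty list for the list-valued filters); build_filters and build_search_body are hypothetical helper names, and the query shape is copied from the search body above.

```python
def build_filters(trade=None, experience=None, min_salary=None,
                  schedule=None, branch=None):
    """Return only the filter clauses whose input is not empty."""
    filters = []
    if trade:  # skips None and []
        filters.append({"terms": {"trade": trade}})
    if experience is not None:  # keeps 0 and negative values
        filters.append({"term": {"experience": experience}})
    if min_salary is not None:
        filters.append({"range": {"min_salary": {"gte": min_salary}}})
    if schedule is not None:
        filters.append({"term": {"schedule": schedule}})
    if branch:
        filters.append({"terms": {"branch": branch}})
    return filters


def build_search_body(text, filters):
    """Assemble the bool query, leaving out "filter" when nothing is set."""
    bool_query = {
        "must": {
            "multi_match": {
                "query": text,
                "fields": ["title^5", "description^3"],
                "tie_breaker": 0.5,
            }
        },
        "should": [{"match_phrase": {"position_term": text}}],
    }
    if filters:
        bool_query["filter"] = filters
    return {"query": {"bool": bool_query}}


# Only trade and min_salary are set, so only two clauses are produced.
search_body = build_search_body(
    "some text", build_filters(trade=[529, 601], min_salary=0)
)
```

Scalar values are compared against None rather than tested for truthiness so that legitimate values such as 0 or -100 are not mistaken for "empty".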

Sorted API data (Node.js & Mongoose) is not displayed in sorted order in the Angular UI

I sort the data in the backend, and when I test the endpoint with Postman the response comes back in sorted order.
const locationInfo = await locationDetails.find(query).sort({sectionName:1});
res.json(locationInfo);
[
{ //some other keys &values
"sectionName": "Closet",
},
{
"sectionName": "Dining",
},
{
"sectionName": "Kitchen",
},
{
"sectionName": "Other",
},
{
"sectionName": "Refrigerator",
}
]
After the REST call, I store the result in:
this.result=data;
But when I try to display the same data in the UI, it is not displayed in sorted order. I also checked in the console, and the order of the data has changed there too.
Console Data
[{
sectionName: "Refrigerator",
},
{
sectionName: "Kitchen",
},
{
sectionName: "Dining",
},
{
sectionName: "Closet",
},
{
sectionName: "Other",
}]
Note: I also tried to sort in the .ts file, but that does not work either.
this.result.sort(function(a,b){a.sectionName-b.sectionName});
Any help would be appreciated. Thanks!
sectionName is not a valid criterion for MongoDB to sort the returned results. In this case, MongoDB does not know how to sort it.
Here is an example directly from the MongoDB documentation about cursor.sort():
db.restaurants.insertMany( [
{ "_id" : 1, "name" : "Central Park Cafe", "borough" : "Manhattan"},
{ "_id" : 2, "name" : "Rock A Feller Bar and Grill", "borough" : "Queens"},
{ "_id" : 3, "name" : "Empire State Pub", "borough" : "Brooklyn"},
{ "_id" : 4, "name" : "Stan's Pizzaria", "borough" : "Manhattan"},
{ "_id" : 5, "name" : "Jane's Deli", "borough" : "Brooklyn"},
] );
# The following command uses the sort() method to sort on the borough field:
db.restaurants.find().sort( { "borough": 1 } )
Documents are returned in alphabetical order by borough, but the order of those documents with duplicate values for borough might not be the same across multiple executions.
.sort works best with numerical values. If you control the backend and can change how the data is stored in the database, I suggest adding a field for the creation date, or simply an index that indicates the order of the items.
Let's say your document looks something like this:
# Doc 1
{
sectionName: "Refrigerator",
order:1
}
# Doc 2
{
sectionName: "Refrigerator",
order:2
}
Then you can do
const locationInfo = await locationDetails.find(query).sort({order:1});
which will return you the documents sorted using the order field, and the order will be consistent.

Elasticsearch Search/filter by occurrence or order in an array

I have an array field named data in my index.
I want only doc 2 as the result, i.e. the documents where b comes before
a in the data array.
doc 1:
data = ['a','b','t','k','p']
doc 2:
data = ['p','b','i','o','a']
Currently, I run a terms query inside must on [a,b] and then check the order in a separate piece of code.
Please suggest a better way to do this.
My understanding is that the only way to do this is with span queries; however, they cannot be applied to an array of values.
You would need to concatenate the values into a single text field with whitespace as the delimiter, reingest the documents, and run a Span Near query on that field.
Below are the mapping, sample documents, the query, and the response:
Mapping:
PUT my_test_index
{
"mappings": {
"properties": {
"data":{
"type": "text"
}
}
}
}
Sample Documents:
POST my_test_index/_doc/1
{
"data": "a b"
}
POST my_test_index/_doc/2
{
"data": "b a"
}
Span Query:
POST my_test_index/_search
{
"query": {
"span_near" : {
"clauses" : [
{ "span_term" : { "data" : "a" } },
{ "span_term" : { "data" : "b" } }
],
"slop" : 0, <--- This means only `a b` matches; `a c b` does not.
"in_order" : true <--- This means `a` must come first, then `b`.
}
}
}
Note that slop controls the maximum number of intervening unmatched positions permitted.
Response:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.36464313,
"hits" : [
{
"_index" : "my_test_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.36464313,
"_source" : {
"data" : "a b"
}
}
]
}
}
Let me know if this helps!
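Applied to the arrays from the question, the same idea could be sketched in Python as follows. The helper names array_to_span_text and ordered_span_query are hypothetical, and slop=3 is an arbitrary choice that tolerates up to three terms between b and a; the sketch only builds the request bodies and does not contact a cluster.

```python
def array_to_span_text(values):
    """Join array elements into one whitespace-delimited string."""
    return " ".join(values)


def ordered_span_query(field, first, second, slop):
    """Build a span_near query matching `first` followed by `second`."""
    return {
        "query": {
            "span_near": {
                "clauses": [
                    {"span_term": {field: first}},
                    {"span_term": {field: second}},
                ],
                "slop": slop,
                "in_order": True,
            }
        }
    }


# Documents as they would be reingested into the text field.
doc_1 = {"data": array_to_span_text(["a", "b", "t", "k", "p"])}
doc_2 = {"data": array_to_span_text(["p", "b", "i", "o", "a"])}

# "b" before "a", allowing a few other terms in between.
search_body = ordered_span_query("data", "b", "a", slop=3)
```

Because span_term values are not analyzed, the terms passed to the query need to match the tokens that the field's analyzer produced at index time.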

How to select a single field for documents in a MongoDB collection?

I am trying to run a query that returns only selected fields. The query works fine in the MongoDB shell, but when I run it from Node.js it does not work and returns all of the fields. How can I fix it?
query
db.collection('olc_prod_db_category').find({name: { $nin:['DISCONTINUE', 'LIQUOR MINI']}},{ "_id": 0}).toArray()
Expected output:
{ "id" : 3, "name" : "IRISH WHISKEY", "hasSubCategory" : "false", "parentId" : "30" }
but I got this output:
{
"_id": "5b4efd6fd53be829188070ca",
"id": 3,
"name": "IRISH WHISKEY",
"hasSubCategory": "false",
"parentId": "30"
}
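Use the .project() cursor method. In recent versions of the Node.js driver, the second argument to find() is an options object rather than a projection: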
db.collection('olc_prod_db_category').find(
{ name: { $nin:['DISCONTINUE', 'LIQUOR MINI']}}
).project({ _id: 0 }).toArray()

Searching after indexing in ElasticSearch

I want to index 1 billion records. Each record has two attributes (attribute1 and attribute2).
Records that have the same value in attribute1 must be merged. For example, I have two records:
attribute1 attribute2
1 4
1 6
My Elasticsearch document must be:
{
"attribute1": "1",
"attribute2": "4,6"
}
Due to the huge amount of data, I have to read a bulk (about 1000 records), merge the records in memory based on the above rule, search for them in Elasticsearch, merge them with the search results, and then index/reindex them.
In summary, I have to search and index for each bulk.
I implemented this rule, but in some cases Elasticsearch does not return all results, and some documents have been indexed in duplicate.
After each index operation I refresh Elasticsearch so that it is ready for the next search, but in some cases this doesn't work.
My index settings are as follows:
{
"test_index": {
"settings": {
"index": {
"refresh_interval": "-1",
"translog": {
"flush_threshold_size": "1g"
},
"max_result_window": "1000000",
"creation_date": "1464577964635",
"store": {
"throttle": {
"type": "merge"
}
}
},
"number_of_replicas": "0",
"uuid": "TZOse2tLRqGk-vHRMGc2GQ",
"version": {
"created": "2030199"
},
"warmer": {
"enabled": "false"
},
"indices": {
"memory": {
"index_buffer_size": "40%"
}
},
"number_of_shards": "5",
"merge": {
"policy": {
"max_merge_size": "2g"
}
}
}
}
}
How can I resolve this problem?
Is there any other setting to handle this situation?
In your bulk commands, you need to use the index operation for the first occurrence and then an update with a script to update your attribute2 property:
{ "index" : { "_index" : "test_index", "_type" : "test_type", "_id" : "1" } }
{ "attribute1" : "1", "attribute2": [4] }
{ "update" : { "_index" : "test_index", "_type" : "test_type", "_id" : "1" } }
{ "script" : { "inline": "ctx._source.attribute2 += attr2", "params" : {"attr2" : 6}}}
After the first index operation, your document will look like
{
"attribute1": "1",
"attribute2": [4]
}
After the second update operation, your document will look like
{
"attribute1": "1",
"attribute2": [4, 6]
}
Note that it is also possible to only use update operations with doc_as_upsert and script.
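As a rough illustration of the update-only variant, the sketch below generates the newline-delimited bulk body in Python. It swaps in the update API's upsert field next to the script (rather than doc_as_upsert), so the first record for an id creates the document and later records run the script from the answer. bulk_upsert_lines is a hypothetical helper; the index and type names come from the answer above, and the "inline" script key follows the syntax used there, which depends on the Elasticsearch version.

```python
import json


def bulk_upsert_lines(records, index="test_index", doc_type="test_type"):
    """Build newline-delimited bulk actions, one scripted upsert per record."""
    lines = []
    for attribute1, attribute2 in records:
        action = {
            "update": {"_index": index, "_type": doc_type, "_id": str(attribute1)}
        }
        payload = {
            # Script from the answer above; runs when the document exists.
            "script": {
                "inline": "ctx._source.attribute2 += attr2",
                "params": {"attr2": attribute2},
            },
            # Initial document, used when the id does not exist yet.
            "upsert": {"attribute1": str(attribute1), "attribute2": [attribute2]},
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(payload))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_upsert_lines([(1, 4), (1, 6)])
print(body)
```

Since every record becomes a self-contained upsert, this form would not need a search before each bulk request, assuming the merge rule is simply "append attribute2 to the document with the same attribute1".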
