Insert on array value in ArangoDB

I have a document like this:
{
"Node": {
"-name": "Dev6",
"Interface": [
{
"-ip": "10.20.18.65",
"-mask": "255.255.255.192"
},
{
"-ip": "10.20.18.129",
"-mask": "255.255.255.192"
}
]
}
}
My Perl program is as follows:
my $dbs_update_Node_by_key = 'FOR u IN Node FILTER u._key == @key UPDATE u WITH {
name: @name,
Interface: @Interface
} IN Node';
......
(comments: $inf means [{"-ip","-mask"},{"-ip","-mask"}])
my $bind_args = {
key => $doc->{'_key'},
name => $node_attrs->{'-name'},
Interface => $inf
};
$sth = $itdb->query($dbs_update_Node_by_key)->bind($bind_args)->execute();
It returns "Invalid bind parameter value". I think the ArangoDB Perl driver doesn't support this.
How can I use AQL or the REST API to implement this? Thanks!

I think the problem is that
[{"-ip","-mask"},{"-ip","-mask"}]
won't work. When using curly brackets and member names (e.g. "-ip", "-mask"), there must be a value associated with each member. Using values like this instead should work:
[{"-ip": "a.b.c.d", "-mask": "a.b.c.d" }, {"-ip": "a.b.c.d" ,"-mask": "a.b.c.d" }]
Please also note that in your above query, you will update an attribute named "name", whereas in the example document the attribute name is "-name" (with a minus sign in front). To use an attribute name with a minus sign at the beginning, it needs to be quoted in backticks in AQL (see below).
Additionally, the example document has attributes "-name" and "Interface" inside a sub-attribute "Node", whereas the UPDATE command will update attributes "name" and "Interface" on the top level of the document.
I have adjusted the query a bit. The following sequence seems to work from the ArangoShell:
db._create("Node");
db.Node.save({
"_key": "test",
"Node": {
"someAttribute": "someValue",
"-name": "Dev6",
"Interface": [
{
"-ip": "10.20.18.65",
"-mask": "255.255.255.192"
},
{
"-ip": "10.20.18.129",
"-mask": "255.255.255.192"
}
]
}
});
dbs_update_Node_by_key = 'FOR u IN Node FILTER u._key == @key ' +
'UPDATE u WITH { Node: { `-name`: @name, Interface: @Interface } } IN Node';
bind_args = {
key: "test",
name: "Dev8",
Interface: [
{
"-ip": "8.8.8.8",
"-mask": "255.255.255.192"
},
{
"-ip": "192.168.0.1",
"-mask": "255.255.255.255"
}
]
};
db._query(dbs_update_Node_by_key, bind_args);
db.Node.toArray();
This will produce:
[
{
"_id" : "Node/test",
"_key" : "test",
"_rev" : "18996044030550",
"Node" : {
"-name" : "Dev8",
"someAttribute" : "someValue",
"Interface" : [
{
"-ip" : "8.8.8.8",
"-mask" : "255.255.255.192"
},
{
"-ip" : "192.168.0.1",
"-mask" : "255.255.255.255"
}
]
}
}
]
I am not sure if this is what you required, but at least it updates the document and overwrites the "Interface" attribute with new values.
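For the REST API route the question also asks about: the same query and bind values can be sent to ArangoDB's HTTP cursor endpoint (`POST /_api/cursor`) as a JSON body with `query` and `bindVars` fields. A minimal sketch in Python that only builds the request body (the server URL and any authentication are deliberately left out):

```python
import json

# AQL query matching the ArangoShell example above; bind parameters
# use the @name syntax and are supplied separately in "bindVars".
payload = {
    "query": (
        "FOR u IN Node FILTER u._key == @key "
        "UPDATE u WITH { Node: { `-name`: @name, Interface: @Interface } } IN Node"
    ),
    "bindVars": {
        "key": "test",
        "name": "Dev8",
        "Interface": [
            {"-ip": "8.8.8.8", "-mask": "255.255.255.192"},
            {"-ip": "192.168.0.1", "-mask": "255.255.255.255"},
        ],
    },
}

# This JSON body would be POSTed to http://<server>:8529/_api/cursor
body = json.dumps(payload)
```

Any HTTP client (including from Perl) can send this body, which sidesteps whatever the driver does to the bind parameters.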

Related

Elasticsearch Empty Filters Values

I have the following Elasticsearch query that I use with python3. My Elasticsearch version is 7.9.3.
search_body = {
"explain": "true",
"query":{
"bool": {
"must": {
"multi_match":{
"query": "some text",
"fields":[
f"title^5",
f"description^3"
],
"tie_breaker": 0.5
}
},
"should":[
{
"match_phrase": {
"position_term": "some text"
}
}
],
"filter": basic_filters
}
}
}
I define the filters separately:
# filters
user_trade = [529, 601]
user_exp = -100
user_minsalary = 0
user_schedule = 2
user_branch = [10, 15, 16]
basic_filters = [
{"terms" : {"trade" : user_trade}},
{"term" : {"experience" : user_exp}},
{"range" : {"min_salary" : {"gte": user_minsalary}}},
{"term" : {"schedule" : user_schedule}},
{"terms" : {"branch" : user_branch}}
]
I want to modify the filters so that if a variable is empty, the search ignores that filter and applies the others; moreover, if all filter variables are empty, the search should return all documents that match only the "query" and "position_term" value (it's the same value). But I don't know how to do this properly.
I'm a complete beginner with Elasticsearch; I hope someone can help.
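One common way to get this behaviour (a sketch, not the only approach): build the filter list conditionally, appending a clause only when its variable is non-empty, so a missing variable simply imposes no constraint, and an empty list leaves the `bool` query constrained only by its `must`/`should` clauses. The variable and field names mirror the question; treating `None` as "empty" while keeping `0` and negative values valid is an assumption you would adjust:

```python
def build_filters(user_trade=None, user_exp=None, user_minsalary=None,
                  user_schedule=None, user_branch=None):
    """Return a list of ES filter clauses, skipping empty variables."""
    filters = []
    if user_trade:  # non-empty list
        filters.append({"terms": {"trade": user_trade}})
    if user_exp is not None:  # 0 or negative are still valid values
        filters.append({"term": {"experience": user_exp}})
    if user_minsalary is not None:
        filters.append({"range": {"min_salary": {"gte": user_minsalary}}})
    if user_schedule is not None:
        filters.append({"term": {"schedule": user_schedule}})
    if user_branch:
        filters.append({"terms": {"branch": user_branch}})
    return filters

# With every variable empty, "filter": [] applies no filtering at all,
# so only the "must"/"should" text clauses decide the result.
all_docs_filters = build_filters()
some_filters = build_filters(user_trade=[529, 601], user_schedule=2)
```

The result would then be plugged in as `"filter": build_filters(...)` inside the `bool` query from `search_body`.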

Elasticsearch Search/filter by occurrence or order in an array

I have a data field in my index, and I want only doc 2 as a result, i.e. logically where b comes before a in the array field data.
doc 1:
data = ['a','b','t','k','p']
doc 2:
data = ['p','b','i','o','a']
Currently, I am running a terms must query on [a, b] and then checking the order in a separate code snippet.
Please suggest a better way.
My understanding is that the only way to do that would be to make use of Span Queries; however, they are not applicable to an array of values.
You would need to concatenate the values into a single text field with whitespace as a delimiter, reingest the documents, and make use of a Span Near query on that field.
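The reingestion step described here (joining each array into a whitespace-delimited string before indexing, so that token positions reflect array order) might be sketched in Python, using the `data` arrays from the question:

```python
# Original array-valued documents from the question.
docs = {
    1: ["a", "b", "t", "k", "p"],
    2: ["p", "b", "i", "o", "a"],
}

# Join each array into a single text value; the analyzer then assigns
# token positions in array order, which span queries can test.
reindexed = {doc_id: " ".join(values) for doc_id, values in docs.items()}
# reindexed[2] == "p b i o a"
```

Note that for the original question (b anywhere before a, with other tokens in between), the span query below would need a larger `slop` than 0 together with `in_order: true`.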
Please find below the mapping, a sample document, the query, and the response:
Mapping:
PUT my_test_index
{
"mappings": {
"properties": {
"data":{
"type": "text"
}
}
}
}
Sample Documents:
POST my_test_index/_doc/1
{
"data": "a b"
}
POST my_test_index/_doc/2
{
"data": "b a"
}
Span Query:
POST my_test_index/_search
{
"query": {
"span_near" : {
"clauses" : [
{ "span_term" : { "data" : "a" } },
{ "span_term" : { "data" : "b" } }
],
"slop" : 0,        <--- only `a b` would match; `a c b` would not
"in_order" : true  <--- a must come before b
}
}
}
Note that slop controls the maximum number of intervening unmatched positions permitted.
Response:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.36464313,
"hits" : [
{
"_index" : "my_test_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.36464313,
"_source" : {
"data" : "a b"
}
}
]
}
}
Let me know if this helps!

Mongo text search with AND operation for multiple words partially entered

{
"TypeList" : [
{
"TypeName" : "Carrier"
},
{
"TypeName" : "Not a Channel Member"
},
{
"TypeName" : "Service Provider"
}
]
}
Question :
db.supplies.find("text", {search:"\"chann\" \"mem\""})
For the above query I want to display:
{
"TypeName" : "Not a Channel Member"
}
But I am unable to get this result.
What changes do I have to make to the query?
Please help me.
The below query will return your desired result.
db.supplies.aggregate([
{$unwind:"$TypeList"},
{$match:{"TypeList.TypeName":{$regex:/.*chann.*mem.*/,$options:"i"}}},
{$project:{_id:0, TypeName:"$TypeList.TypeName"}}
])
If you can accept to get an output like this:
{
"TypeList" : [
{
"TypeName" : "Not a Channel Member"
}
]
}
then you can avoid the aggregation framework, which generally helps performance, by running the following query:
db.supplies.find(
{
"TypeList.TypeName": /chann.*mem/i
},
{ // project the list in the following way
"_id": 0, // do not include the "_id" field in the output
"TypeList": { // only include the items from the TypeList array...
$elemMatch: { //... where
"TypeName": /chann.*mem/i // the "TypeName" field matches the regular expression
}
}
})
Also see this link: Retrieve only the queried element in an object array in MongoDB collection
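The case-insensitive regular expression used in both queries can be sanity-checked outside MongoDB, for example with Python's `re` module against the TypeName values from the question:

```python
import re

# Same idea as the $regex above: "chann" followed (eventually) by "mem",
# matched case-insensitively anywhere in the string.
pattern = re.compile(r"chann.*mem", re.IGNORECASE)

names = ["Carrier", "Not a Channel Member", "Service Provider"]
matches = [n for n in names if pattern.search(n)]
# matches == ["Not a Channel Member"]
```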

Searching after indexing in ElasticSearch

I want to index 1 billion records. Each record has 2 attributes (attribute1 and attribute2).
Records that have the same value in attribute1 must be merged. For example, given these two records:
attribute1 attribute2
1 4
1 6
my Elasticsearch document must be:
{
"attribute1": "1",
"attribute2": "4,6"
}
Due to the huge amount of data, I have to read a bulk (about 1000 records), merge the records based on the above rule (in memory), then search for them in Elasticsearch, merge them with the search results, and then index/reindex them.
In summary, I have to search and then index per bulk.
I implemented this rule, but in some cases Elasticsearch does not return all results and some documents end up indexed in duplicate.
After each index operation I refresh Elasticsearch so that it is ready for the next search, but in some cases it doesn't work.
My index settings are as follows:
{
"test_index": {
"settings": {
"index": {
"refresh_interval": "-1",
"translog": {
"flush_threshold_size": "1g"
},
"max_result_window": "1000000",
"creation_date": "1464577964635",
"store": {
"throttle": {
"type": "merge"
}
}
},
"number_of_replicas": "0",
"uuid": "TZOse2tLRqGk-vHRMGc2GQ",
"version": {
"created": "2030199"
},
"warmer": {
"enabled": "false"
},
"indices": {
"memory": {
"index_buffer_size": "40%"
}
},
"number_of_shards": "5",
"merge": {
"policy": {
"max_merge_size": "2g"
}
}
}
}
How can I resolve this problem?
Is there any other setting to handle this situation?
In your bulk commands, you need to use the index operation for the first occurrence and then update with a script to update your attribute2 property:
{ "index" : { "_index" : "test_index", "_type" : "test_type", "_id" : "1" } }
{ "attribute1" : "1", "attribute2": [4] }
{ "update" : { "_index" : "test_index", "_type" : "test_type", "_id" : "1" } }
{ "script" : { "inline": "ctx._source.attribute2 += attr2", "params" : {"attr2" : 6}}}
After the first index operation your document will look like
{
"attribute1": "1",
"attribute2": [4]
}
After the second update operation, your document will look like
{
"attribute1": "1",
"attribute2": [4, 6]
}
Note that it is also possible to use only update operations, with doc_as_upsert or an upsert document plus a script.
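A sketch in Python of how such a bulk body could be assembled, using a single scripted upsert per record instead of an explicit index-then-update pair. The index/type names come from the answer; the script string mirrors the answer's Groovy-style inline script, which you would adjust for your Elasticsearch version:

```python
import json

def bulk_lines(attr1, new_attr2):
    """Build one scripted-upsert bulk action for a record (sketch)."""
    action = {"update": {"_index": "test_index", "_type": "test_type",
                         "_id": attr1}}
    # On first sight the "upsert" document is created; on later sights
    # the script appends the new attribute2 value to the existing array.
    source = {
        "script": {
            "inline": "ctx._source.attribute2 += attr2",
            "params": {"attr2": new_attr2},
        },
        "upsert": {"attribute1": attr1, "attribute2": [new_attr2]},
    }
    # The bulk API expects newline-delimited JSON: one action line,
    # one source line, each terminated by "\n".
    return json.dumps(action) + "\n" + json.dumps(source) + "\n"

body = bulk_lines("1", 4) + bulk_lines("1", 6)
```

The resulting `body` string would be sent to the `_bulk` endpoint; using upserts this way removes the need to know which occurrence of an attribute1 value is the first.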

Mongoose mapReduce : reduce returns object or array?

I have the following collection:
/* 0 */
{
"clientID" : ObjectId("51b9c10d91d1a3a52b0000b8"),
"_id" : ObjectId("532b4f1cb3d2eacb1300002b"),
"answers" : [],
"questions" : []
}
/* 1 */
{
"clientID" : ObjectId("51b9c10d91d1a3a52b0000b8"),
"_id" : ObjectId("532b6b9eb3d2eacb1300002c"),
"answers" : [
"1",
"8"
],
"questions" : [
"1",
"2",
"3"
]
}
/* 2 */
{
"clientID" : ObjectId("51b9c10d91d1a3a52b0000b8"),
"_id" : ObjectId("532b6baeb3d2eacb1300002d"),
"answers" : [
"1",
"8"
],
"questions" : [
"1",
"2",
"3"
]
}
/* 3 */
{
"clientID" : ObjectId("5335f9d864e2b1290c00012e"),
"_id" : ObjectId("533b828146ca43634000002d"),
"answers" : [
"ORANGE"
],
"questions" : [
"Color"
]
}
/* 4 */
{
"clientID" : ObjectId("5335f9d864e2b1290c00012e"),
"_id" : ObjectId("5351be327b539a4d1a00002b"),
"answers" : [
"ORANGE"
],
"questions" : [
"Color"
]
}
/* 5 */
{
"clientID" : ObjectId("5335f9d864e2b1290c00012e"),
"_id" : ObjectId("5351be5ec89d717d1a00002b"),
"answers" : [
"ORANGE"
],
"questions" : [
"Color"
]
}
I am running the following code in order to find how many times the (questions,answers) combination appears in the collection:
o.map= function(){
emit({"questions" : this.questions, "answers" :this.answers },this.clientID)
};
o.reduce = function(answers, collection){
return collection.length;
};
logSearchDB.mapReduce(o,function (err, results) {
results.sort(function(a, b){return b.value-a.value});
for (var i = 0; i < results.length; i++) {
console.log(JSON.stringify(results[i]))
};
})
The output is:
{"_id":{"questions":[],"answers":[]},"value":"51b9c10d91d1a3a52b0000b8"}
{"_id":{"questions":["Color"],"answers":["ORANGE"]},"value":3}
{"_id":{"questions":["1","2","3"],"answers":["1","8"]},"value":2}
I expected the first row to have "value" : 1.
I guess the 'reduce' function got a 'collection' object, "51b9c10d91d1a3a52b0000b8", instead of an array, ["51b9c10d91d1a3a52b0000b8"].
Why doesn't mapReduce collect everything into an array?
The reason you have just a plain value in that first row is that there was only one occurrence of your key value. This is generally how mapReduce works, at least as specified in the original papers.
So the reduce function is not actually called when there is only a single document for a key. To work around this, use the finalize function in your mapReduce:
var finalize = function(key,value) {
if ( typeof(value) != "number" )
value = 1;
return value;
};
db.collection.mapReduce(
mapper,
reducer,
{
"finalize": finalize,
"out": { "inline": 1 }
}
);
That runs over all of the output, and when the value is seen to be not a number (i.e. it is the clientID you are emitting), the value is set to 1, because that is how many documents are in the grouping.
Really your query is better suited to the aggregation framework than mapReduce. The aggregation framework is a native code implementation as opposed to using a JavaScript interpreter. It runs much faster than mapReduce:
db.collection.aggregate([
{ "$group": {
"_id": {
"questions": "$questions",
"answers": "$answers"
},
"count": { "$sum": 1 }
}}
])
So it is the better option to use. It was introduced to MongoDB later, so people still tend to think in terms of mapReduce, or there is legacy code from earlier versions of MongoDB. But it has been around for quite a while now.
Also see the operator reference for the aggregation framework.
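The counting logic of the `$group` stage can be mimicked in memory; here is a small Python check against the six sample documents above (the compound keys are reduced to tuples so they are hashable):

```python
from collections import Counter

# (questions, answers) pairs from the six sample documents above.
docs = [
    ([], []),
    (["1", "2", "3"], ["1", "8"]),
    (["1", "2", "3"], ["1", "8"]),
    (["Color"], ["ORANGE"]),
    (["Color"], ["ORANGE"]),
    (["Color"], ["ORANGE"]),
]

# Group by the (questions, answers) combination and count occurrences,
# exactly what the $group/$sum pipeline computes server-side.
counts = Counter((tuple(q), tuple(a)) for q, a in docs)
# counts[(("Color",), ("ORANGE",))] == 3
```

This matches the expected output, including a count of 1 for the empty combination that tripped up the mapReduce version.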
