I want to query a Mongo collection for documents where a specific field is either missing or has a value that would evaluate to false in Python. This includes atomic values null, 0, '' (the empty string), false, []. However, arrays containing such values (such as ['foo', ''] or just ['']) are not falsey and must not be matched. Can I do this with Mongo’s structured queries (without resorting to JavaScript)?
$type doesn’t seem to help:
> db.foo.insert({bar: ['baz', '', 'qux']});
> db.foo.find({$and: [{bar: ''}, {bar: {$type: 2}}]});
{ "_id" : ObjectId("50599937da5254d6fd731816"), "bar" : [ "baz", "", "qux" ] }
This should work:
db.test.find({$or:[{a:{$size:0}},{"a.0":{$exists:true}}]})
Note that this also matches documents where the a field is an embedded document with a key named 0, so make sure your data contains no such documents.
e.g.
> db.test.find()
{ "_id": ObjectId("5059ac3ab1cee080a7168fff"), "bar": [ "baz", "", "qux" ] }
{ "_id": ObjectId("5059ac48b1cee080a7169000"), "hello": 1, "bar": false, "world": 34 }
{ "_id": ObjectId("5059ac53b1cee080a7169001"), "hello": 1, "world": 42 }
{ "_id": ObjectId("5059ac60b1cee080a7169002"), "hello": 13, "bar": null, "world": 34 }
{ "_id": ObjectId("5059ac6bb1cee080a7169003"), "hello": 133, "bar": [ ], "world": 334 }
{ "_id": ObjectId("5059b36cb1cee080a7169004"), "hello": 133, "bar": [ "" ], "world": 334 }
{ "_id": ObjectId("5059b3e3b1cee080a7169005"), "hello": 133, "bar": "foo", "world": 334 }
{ "_id": ObjectId("5059b3f8b1cee080a7169006"), "hello": 1333, "bar": "", "world": 334 }
{ "_id": ObjectId("5059b424b1cee080a7169007"), "hello": 1333, "bar": { "0": "foo" }, "world": 334 }
> db.test.find({$or: [{bar: {$size: 0}}, {"bar.0": {$exists: true}}]})
{ "_id": ObjectId("5059ac3ab1cee080a7168fff"), "bar": [ "baz", "", "qux" ] }
{ "_id": ObjectId("5059ac6bb1cee080a7169003"), "hello": 133, "bar": [ ], "world": 334 }
{ "_id": ObjectId("5059b36cb1cee080a7169004"), "hello": 133, "bar": [ "" ], "world": 334 }
{ "_id": ObjectId("5059b424b1cee080a7169007"), "hello": 1333, "bar": { "0": "foo" }, "world": 334 }
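To make it clearer why those four documents come back, here is a rough client-side model of the query in Python. It is only a sketch: the function name matches_query is made up for illustration, and it imitates the $size / "bar.0" logic on plain dicts rather than talking to MongoDB.

```python
def matches_query(doc, field="bar"):
    """Rough model of {$or: [{bar: {$size: 0}}, {"bar.0": {$exists: true}}]}."""
    value = doc.get(field)
    if isinstance(value, list):
        # An empty list satisfies $size: 0, a non-empty list has an element 0.
        return True
    if isinstance(value, dict):
        # An embedded document with a literal "0" key also satisfies "bar.0".
        return "0" in value
    return False


docs = [
    {"bar": ["baz", "", "qux"]},
    {"hello": 1, "bar": False, "world": 34},
    {"hello": 1, "world": 42},
    {"hello": 13, "bar": None, "world": 34},
    {"hello": 133, "bar": [], "world": 334},
    {"hello": 133, "bar": [""], "world": 334},
    {"hello": 133, "bar": "foo", "world": 334},
    {"hello": 1333, "bar": "", "world": 334},
    {"hello": 1333, "bar": {"0": "foo"}, "world": 334},
]

matched = [d["bar"] for d in docs if matches_query(d)]
print(matched)
```

The last match is the embedded document with the "0" key, which is the caveat mentioned above.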
I found this: https://jira.mongodb.org/browse/SERVER-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:changehistory-tabpanel from back in the day.
I can replicate it on my MongoDB 2.0.1 (I also played around a bit to see if it picked it up as something else):
> db.g.find()
{ "_id" : ObjectId("50599eb65395c82c7a47d124"), "bar" : [ "baz", "", "qux" ] }
{ "_id" : ObjectId("5059a0005395c82c7a47d125"), "a" : 3, "b" : { "a" : 1, "b" : 2 } }
> db.g.find({bar: {$type: 4}});
> db.g.find({a: {$type: 2}});
> db.g.find({a: {$type: 16}});
> db.g.find({bar: {$type: 16}});
> db.g.find({bar: {$type: 2}});
{ "_id" : ObjectId("50599eb65395c82c7a47d124"), "bar" : [ "baz", "", "qux" ] }
> db.g.find({bar: {$type: 3}});
> db.g.find({b: {$type: 3}});
{ "_id" : ObjectId("5059a0005395c82c7a47d125"), "a" : 3, "b" : { "a" : 1, "b" : 2 } }
When I use $type 4, I cannot get the document containing the array. As you can see, $type 3 works fine (which could be related to https://jira.mongodb.org/browse/SERVER-1475), but the array does not seem to be picked up.
It is possible that you are seeing a bug. Filing a JIRA ticket on MongoDB's site (jira.mongodb.org) could help get the problem solved.
However, the $type operator might not solve your problem anyway. This might be better handled on the client side, for example by unsetting the field entirely when it has no elements, so that you can simply query for existence. This standardises your querying patterns and makes integration easier in general.
So my personal recommendation here is to standardise "falsey" values into one single coherent value.
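As a minimal sketch of that client-side normalisation, assuming the documents are plain Python dicts before they are written (normalize_falsey is a hypothetical helper name, not part of any driver):

```python
def normalize_falsey(doc, field):
    """Return a copy of doc with `field` removed when its value is falsey in Python."""
    cleaned = dict(doc)
    if field in cleaned and not cleaned[field]:
        # None, 0, '', False and [] all end up as "field missing".
        del cleaned[field]
    return cleaned


docs = [
    {"bar": ["baz", "", "qux"]},
    {"bar": [""]},
    {"bar": []},
    {"bar": ""},
    {"bar": 0},
    {"bar": None},
    {"bar": False},
    {"hello": 1},
]

normalized = [normalize_falsey(d, "bar") for d in docs]
missing = [d for d in normalized if "bar" not in d]
print(len(missing))
```

Once every falsey value is stored as "field missing", a single {bar: {$exists: false}} query should cover all of them, while arrays such as [''] stay untouched.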
Edit
I noticed they have marked the original bug as a duplicate (that's why it is closed); however, I am not sure how it is a duplicate. These arrays are not being picked up as objects but rather as strings, most likely because $type is acting on every element within that field rather than on the field itself (or something like that).
I would still open a JIRA ticket and stress that the array cannot be picked up at all.
I have a list, example_list, that contains two dict objects. It looks like this:
[
{
"Meta": {
"ID": "1234567",
"XXX": "XXX"
},
"bbb": {
"ccc": {
"ddd": {
"eee": {
"fff": {
"xxxxxx": "xxxxx"
},
"www": [
{
"categories": {
"ppp": [
{
"content": {
"name": "apple",
"price": "0.111"
},
"xxx": "xxx"
}
]
},
"date": "A2020-01-01"
}
]
}
}
}
}
},
{
"Meta": {
"ID": "78945612",
"XXX": "XXX"
},
"bbb": {
"ccc": {
"ddd": {
"eee": {
"fff": {
"xxxxxx": "xxxxx"
},
"www": [
{
"categories": {
"ppp": [
{
"content": {
"name": "banana",
"price": "12.599"
},
"xxx": "xxx"
}
]
},
"date": "A2020-01-01"
}
]
}
}
}
}
}
]
Now I want to filter the items and keep only each "ID" and its corresponding "price" value. The expected result could be something like:
[{"ID": "1234567", "price": "0.111"}, {"ID": "78945612", "price": "12.599"}]
or something like {"1234567":"0.111", "78945612":"12.599" }
Here's what I've tried:
map_list = []
map_dict = {}
for item in example_list:
    # get 'ID' for each item in 'Meta'
    map_dict['ID'] = item['Meta']['ID']
    # get 'price'
    data_list = item['bbb']['ccc']['ddd']['eee']['www']
    for data in data_list:
        for dataitem in data['categories']['ppp']:
            map_dict['price'] = dataitem["content"]["price"]
            map_list.append(map_dict)
print(map_list)
The result doesn't look right; it feels like the items aren't being iterated properly. It gives me:
[{"ID": "78945612", "price": "12.599"}, {"ID": "78945612", "price": "12.599"}]
It gave me a duplicated result for the second ID, but where is the first ID?
Can someone take a look, please? Thanks.
Update:
From comments on another question, I understand that the output keeps being overwritten because the key names in the dict are always the same, but I'm not sure how to fix this, because the key and the value need to be extracted at different levels of the for loops. Any help would be appreciated, thanks.
As @Scott Hunter mentioned, you need to create a new map_dict every time you append one. Here is a quick fix to your solution (sadly I am not able to test it right now, but it seems right to me).
map_list = []
for item in example_list:
    # get 'price'
    data_list = item['bbb']['ccc']['ddd']['eee']['www']
    for data in data_list:
        for dataitem in data['categories']['ppp']:
            # build a fresh dict for every entry so earlier ones are not overwritten
            map_dict = {}
            map_dict['ID'] = item['Meta']['ID']
            map_dict['price'] = dataitem["content"]["price"]
            map_list.append(map_dict)
print(map_list)
But what you are doing here is basically just "forcing" your way through. I recommend taking a break and working through some kind of tutorial, which will help you understand how this really works under the hood. This is how I would have written it:
list_dicts = []
for item in example_list:
    for www in item['bbb']['ccc']['ddd']['eee']['www']:
        # each 'www' entry holds its prices under categories -> ppp
        for www_item in www['categories']['ppp']:
            list_dicts.append({
                'ID': item['Meta']['ID'],
                'price': www_item["content"]["price"]
            })
Good luck with this problem and hope it helps :)
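If the second output shape from the question (a single dict mapping ID to price) is acceptable, a dict comprehension is another option. This is only a sketch: it assumes the real data uses the "Meta" and "eee" keys exactly as in the sample, and the example_list below is trimmed down to the keys that matter.

```python
example_list = [
    {
        "Meta": {"ID": "1234567"},
        "bbb": {"ccc": {"ddd": {"eee": {"www": [
            {"categories": {"ppp": [
                {"content": {"name": "apple", "price": "0.111"}},
            ]}},
        ]}}}},
    },
    {
        "Meta": {"ID": "78945612"},
        "bbb": {"ccc": {"ddd": {"eee": {"www": [
            {"categories": {"ppp": [
                {"content": {"name": "banana", "price": "12.599"}},
            ]}},
        ]}}}},
    },
]

# One entry per ID; if an ID had several prices, the last one would win.
price_by_id = {
    item["Meta"]["ID"]: entry["content"]["price"]
    for item in example_list
    for www in item["bbb"]["ccc"]["ddd"]["eee"]["www"]
    for entry in www["categories"]["ppp"]
}
print(price_by_id)
```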
You need to create a new dictionary for map_dict for each ID.
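A small illustration of why that matters, using made-up variable names (shared, aliased, fresh): appending the same dict object twice stores two references to one dict, so the last assignment shows up in both list entries.

```python
pairs = [("1234567", "0.111"), ("78945612", "12.599")]

# One dict reused across iterations: both list entries point at the same object.
shared = {}
aliased = []
for id_, price in pairs:
    shared["ID"] = id_
    shared["price"] = price
    aliased.append(shared)

# A new dict per iteration: every list entry keeps its own values.
fresh = []
for id_, price in pairs:
    fresh.append({"ID": id_, "price": price})

print(aliased)
print(fresh)
```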
Given the following document structure:
{
"name": [
{
"use": "official",
"family": "Chalmers",
"given": [
"Peter",
"James"
]
},
{
"use": "usual",
"given": [
"Jim"
]
},
{
"use": "maiden",
"family": "Windsor",
"given": [
"Peter",
"James"
]
}
]
}
Query:
FOR client IN Patient FILTER client.name[*].use=='official' RETURN client.name[*].given
I have telecom and name arrays.
I want to check whether name[*].use == 'official' and, if so, return the corresponding given array.
Expected result:
"given": [
"Peter",
"James"
]
client.name[*].use is an array, so you need to use an array operator. It can be either of the following:
'string' in doc.attribute
doc.attribute ANY == 'string'
doc.attribute ANY IN ['string']
To return just the given names from the 'official' array, you can use a subquery:
RETURN { given:
FIRST(FOR name IN client.name FILTER name.use == 'official' LIMIT 1 RETURN name.given)
}
Alternatively, you can use an inline expression:
FOR client IN Patient
FILTER 'official' IN client.name[*].use
RETURN { given:
FIRST(client.name[* FILTER CURRENT.use == 'official' LIMIT 1 RETURN CURRENT.given])
}
Result:
[
{
"given": [
"Peter",
"James"
]
}
]
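For readers who think more easily in Python, the same logic can be sketched on a plain dict. This is not AQL and does not use any ArangoDB driver; it only mirrors the two steps (test whether any element has use == 'official', then take the given names of the first such element).

```python
client = {
    "name": [
        {"use": "official", "family": "Chalmers", "given": ["Peter", "James"]},
        {"use": "usual", "given": ["Jim"]},
        {"use": "maiden", "family": "Windsor", "given": ["Peter", "James"]},
    ]
}

# Roughly what client.name[*].use expands to: one value per array element.
uses = [name.get("use") for name in client["name"]]
has_official = "official" in uses

# Roughly FIRST(FOR name IN client.name FILTER ... LIMIT 1 RETURN name.given).
given = next(
    (name["given"] for name in client["name"] if name.get("use") == "official"),
    None,
)
result = {"given": given}
print(result)
```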
In your original post, the example document and query didn't match, but assuming the following structure:
{
"telecom": [
{
"use": "official",
"value": "+1 (03) 5555 6473 82"
},
{
"use": "mobile",
"value": "+1 (252) 5555 910 920 3"
}
],
"name": [
{
"use": "official",
"family": "Chalmers",
"given": [
"Peter",
"James"
]
},
{
"use": "usual",
"given": [
"Jim"
]
},
{
"use": "maiden",
"family": "Windsor",
"given": [
"Peter",
"James"
]
}
]
}
… here is a possible query:
FOR client IN Patient
FILTER LENGTH(client.telecom[* FILTER
CONTAINS(CURRENT.value, "(03) 5555 6473") AND
CURRENT.use == 'official']
)
RETURN {
given: client.name[* FILTER CURRENT.use == 'official' RETURN CURRENT.given]
}
Note that client.telecom[*].value LIKE "..." causes the array of phone numbers to be cast to a string "[\"+1 (03) 5555 6473 82\",\"+1 (252) 5555 910 920 3\"]" against which the LIKE operation is run - this kind of works, but it's not ideal.
CONTAINS() is also faster than LIKE with % wildcards on both sides.
There may be multiple 'official' elements, which is why the result has an extra level of array nesting. The above query produces:
[
{
"given": [
[
"Peter",
"James"
]
]
}
]
If you know that there is only one element or restrict it to one element explicitly then you can get rid of one of the wrapping square brackets with FIRST() or FLATTEN().
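The difference between the two choices can be sketched in Python on plain lists. This only imitates what FIRST() and a one-level FLATTEN() do to a nested result; the second group with "Pete" is a made-up example of a document with two 'official' entries.

```python
single = [["Peter", "James"]]
multiple = [["Peter", "James"], ["Pete"]]  # hypothetical: two 'official' entries


def first(groups):
    # Like FIRST(): keep only the first group, or None when there is none.
    return groups[0] if groups else None


def flatten(groups):
    # Like a one-level FLATTEN(): merge all groups into one list.
    return [name for group in groups for name in group]


print(first(single), flatten(single))
print(first(multiple), flatten(multiple))
```

With a single 'official' element both give the same list; with several, the first approach drops the later groups and the second merges them.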
I have a little issue with some sorting in my Groovy script, and I am not sure why it is not working as expected.
Below is the JSON I am trying to sort:
{
"aaa": [
{
"aaa1": xxx,
"aaa2": "xxx",
"bbb": [ {
"bbb1": xx,
"bbb2": [
1,
2
],
"ccc": [],
"ccc1": xxx
}]
},
{
"aaa1": xxx,
"aaa2": "xxx",
"bbb": [ {
"bbb1": xx,
"bbb2": [
1,
2
],
"ccc": [],
"ccc1": xxx
}]
}
]
}
I am trying to sort this JSON by 'policyid', but it does not get sorted, and I can't see why, since the code looks correct to me:
Below is what I want it to output:
[ {ccc=[1], bbb1=xxx, aaa1=[]},
{ccc=[1, 2], bbb2=xxx, aaa2=[]}]
There is a typo in your code:
log.info testsort.sort{a,b -> a.policyid <=> b.policyid}
You are accessing the policyid field, but it should be policyId (the name is case-sensitive). Change that line to:
log.info testsort.sort{a,b -> a.policyId <=> b.policyId}
and you will get the output:
[policyId:31, passengerSeqIds:[2], optionalPassengerSeqIds:[], cost:25]
[policyId:34, passengerSeqIds:[1], optionalPassengerSeqIds:[], cost:40]
[policyId:35, passengerSeqIds:[1, 2], optionalPassengerSeqIds:[], cost:72]
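The same idea as a Python sketch, for comparison only (the original code is Groovy, and the input order below is an assumption since the unsorted data was not shown). One difference worth noting: in Python a misspelled key raises a KeyError instead of silently leaving the list unsorted.

```python
policies = [
    {"policyId": 35, "passengerSeqIds": [1, 2], "optionalPassengerSeqIds": [], "cost": 72},
    {"policyId": 31, "passengerSeqIds": [2], "optionalPassengerSeqIds": [], "cost": 25},
    {"policyId": 34, "passengerSeqIds": [1], "optionalPassengerSeqIds": [], "cost": 40},
]

# Sort by the correctly spelled key.
by_id = sorted(policies, key=lambda p: p["policyId"])
print([p["policyId"] for p in by_id])

# The lowercase spelling does not exist as a key and fails loudly here.
try:
    sorted(policies, key=lambda p: p["policyid"])
except KeyError as exc:
    missing_key = exc.args[0]
    print("no such key:", missing_key)
```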
I am trying to query a nested array of objects in MongoDB from Node.js. I have tried all the solutions I could find, but with no luck. Can anyone please help?
The document looks like this:
{
"name": "Science",
"chapters": [
{
"name": "ScienceChap1",
"tests": [
{
"name": "ScienceChap1Test1",
"id": 1,
"marks": 10,
"duration": 30,
"questions": [
{
"question": "What is the capital city of New Mexico?",
"type": "mcq",
"choice": [
"Guadalajara",
"Albuquerque",
"Santa Fe",
"Taos"
],
"answer": [
"Santa Fe",
"Taos"
]
},
{
"question": "Who is the author of beowulf?",
"type": "notmcq",
"choice": [
"Mark Twain",
"Shakespeare",
"Abraham Lincoln",
"Newton"
],
"answer": [
"Shakespeare"
]
}
]
},
{
"name": "ScienceChap1test2",
"id": 2,
"marks": 20,
"duration": 30,
"questions": [
{
"question": "What is the capital city of New Mexico?",
"type": "mcq",
"choice": [
"Guadalajara",
"Albuquerque",
"Santa Fe",
"Taos"
],
"answer": [
"Santa Fe",
"Taos"
]
},
{
"question": "Who is the author of beowulf?",
"type": "notmcq",
"choice": [
"Mark Twain",
"Shakespeare",
"Abraham Lincoln",
"Newton"
],
"answer": [
"Shakespeare"
]
}
]
}
]
}
]
}
Here is what I've tried so far, but I still can't get it to work:
db.quiz.find({name:"Science"},{"tests":0,chapters:{$elemMatch:{name:"ScienceChap1"}}})
db.quiz.find({ chapters: { $elemMatch: {$elemMatch: { name:"ScienceChap1Test1" } } }})
db.quiz.find({name:"Science"},{chapters:{$elemMatch:{$elemMatch:{name:"ScienceChap1Test1"}}}})
Aggregation Framework
You can use the aggregation framework to transform and combine documents in a collection to display to the client. You build a pipeline that processes a stream of documents through several building blocks: filtering, projecting, grouping, sorting, etc.
If you want to get the mcq-type questions from the test named "ScienceChap1test2", you would do the following:
db.quiz.aggregate([
    // Match the documents by query: search for the Science course
    {"$match":{"name":"Science"}},
    // De-normalize the nested arrays of chapters and tests
    {"$unwind":"$chapters"},
    {"$unwind":"$chapters.tests"},
    // Match the test named ScienceChap1test2
    {"$match":{"chapters.tests.name":"ScienceChap1test2"}},
    // Unwind the nested questions array
    {"$unwind":"$chapters.tests.questions"},
    // Match questions of type mcq
    {"$match":{"chapters.tests.questions.type":"mcq"}}
]).pretty()
The result will be:
{
"_id" : ObjectId("5629eb252e95c020d4a0c5a5"),
"name" : "Science",
"chapters" : {
"name" : "ScienceChap1",
"tests" : {
"name" : "ScienceChap1test2",
"id" : 2,
"marks" : 20,
"duration" : 30,
"questions" : {
"question" : "What is the capital city of New Mexico?",
"type" : "mcq",
"choice" : [
"Guadalajara",
"Albuquerque",
"Santa Fe",
"Taos"
],
"answer" : [
"Santa Fe",
"Taos"
]
}
}
}
}
As a projection, $elemMatch only returns the first matching element of a top-level array, so it cannot filter the nested tests or questions arrays. For this kind of "array filtering" you can use the aggregation framework with $unwind.
You can remove stages from the bottom of the pipeline above, one at a time, to observe what each stage does.
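To get a feel for what each $unwind / $match step does, here is a rough pure-Python model of the pipeline. It is a sketch under simplifying assumptions: unwind, get_path and set_path are made-up helpers, the quiz data is trimmed to a few keys, and missing fields raise KeyError here rather than being handled the way MongoDB handles them.

```python
import copy


def get_path(doc, path):
    # Follow a dotted path such as "chapters.tests" through nested dicts.
    for key in path.split("."):
        doc = doc[key]
    return doc


def set_path(doc, path, value):
    *parents, last = path.split(".")
    for key in parents:
        doc = doc[key]
    doc[last] = value


def unwind(docs, path):
    """Emit one copy of each document per element of the array at `path`."""
    out = []
    for doc in docs:
        for element in get_path(doc, path):
            clone = copy.deepcopy(doc)
            set_path(clone, path, copy.deepcopy(element))
            out.append(clone)
    return out


quiz = [{"name": "Science", "chapters": [{"name": "ScienceChap1", "tests": [
    {"name": "ScienceChap1Test1", "questions": [{"type": "mcq"}, {"type": "notmcq"}]},
    {"name": "ScienceChap1test2", "questions": [{"type": "mcq"}, {"type": "notmcq"}]},
]}]}]

docs = [d for d in quiz if d["name"] == "Science"]
docs = unwind(docs, "chapters")
docs = unwind(docs, "chapters.tests")
docs = [d for d in docs if d["chapters"]["tests"]["name"] == "ScienceChap1test2"]
docs = unwind(docs, "chapters.tests.questions")
docs = [d for d in docs if d["chapters"]["tests"]["questions"]["type"] == "mcq"]
print(docs)
```

After each unwind the array field is replaced by a single element, which is why the later filters can address it with plain key access.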
You should try the following queries in the mongo JavaScript shell.
There are two scenarios.
Scenario One
If you simply want to return the documents that contain certain chapter names or test names, a single argument to find will do.
For the find method, the first argument specifies which documents to return. You could return documents with the name Science by doing this:
db.quiz.find({name:"Science"})
You can specify criteria that must match a single embedded document in an array by using $elemMatch. To find a document that has a chapter with the name ScienceChap1, you could do this:
db.quiz.find({"chapters":{"$elemMatch":{"name":"ScienceChap1"}}})
If you want your criteria to be a test name, you can use dot notation like this:
db.quiz.find({"chapters.tests":{"$elemMatch":{"name":"ScienceChap1Test1"}}})
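As a rough mental model of what that last query selects, here is a Python predicate over a plain dict. It is only a sketch with a made-up function name (has_test_named) and trimmed sample data; it does not call MongoDB.

```python
def has_test_named(doc, test_name):
    """True when any test in any chapter of the document has the given name."""
    return any(
        test.get("name") == test_name
        for chapter in doc.get("chapters", [])
        for test in chapter.get("tests", [])
    )


quiz = {
    "name": "Science",
    "chapters": [
        {"name": "ScienceChap1", "tests": [
            {"name": "ScienceChap1Test1", "marks": 10},
            {"name": "ScienceChap1test2", "marks": 20},
        ]},
    ],
}

print(has_test_named(quiz, "ScienceChap1Test1"))
print(has_test_named(quiz, "NoSuchTest"))
```

Note that the whole document is selected when the predicate holds; the query does not trim the arrays down to the matching test.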
Scenario Two - Specifying Which Keys to Return
If you want to specify which keys to return, you can pass a second argument to find (or findOne) listing the keys you want. In your case you can search by the document name and then specify which keys to return, like so:
db.quiz.find({name:"Science"},{"chapters":1})
// Would return
{
    "_id": ObjectId(...),
    "chapters": [
        {
            "name": "ScienceChap1",
            "tests": [..all object content here..]
        }
    ]
}
If you only want to return the marks from each test object you can use the dot operator to do so:
db.quiz.find({name:"Science"},{"chapters.tests.marks":1})
// Would return
{
    "_id": ObjectId(...),
    "chapters": [
        {
            "tests": [
                {"marks": 10},
                {"marks": 20}
            ]
        }
    ]
}
If you only want to return the questions from each test:
db.quiz.find({name:"Science"},{"chapters.tests.questions":1})
Test these out. I hope these help.
[updated 17:15 on 28/09]
I'm manipulating json data of type:
[
{
"id": 1,
"title": "Sun",
"seeAlso": [
{
"id": 2,
"title": "Rain"
},
{
"id": 3,
"title": "Cloud"
}
]
},
{
"id": 2,
"title": "Rain",
"seeAlso": [
{
"id": 3,
"title": "Cloud"
}
]
},
{
"id": 3,
"title": "Cloud",
"seeAlso": [
{
"id": 1,
"title": "Sun"
}
]
},
];
After inclusion in the database, a node.js search using
db.documents.query(
q.where(
q.collection('test films'),
q.value('title','Sun')
).withOptions({categories: 'none'})
)
.result( function(results) {
console.log(JSON.stringify(results, null,2));
});
will return both the film titled 'Sun' and the films which have a seeAlso/title property (forgive the xpath syntax) = 'Sun'.
I need to find 1/ films with title = 'Sun' 2/ films with seeAlso/title = 'Sun'.
I tried a container query using q.scope() with no success. I can't find how to scope the root object node (first case), and for the second case,
q.where(q.scope(q.property('seeAlso'), q.value('title','Sun')))
returns as its first result an item which matches all the text inside the root object node:
{
"index": 1,
"uri": "/1.json",
"path": "fn:doc(\"/1.json\")",
"score": 137216,
"confidence": 0.6202662,
"fitness": 0.6701325,
"href": "/v1/documents?uri=%2F1.json&database=Documents",
"mimetype": "application/json",
"format": "json",
"matches": [
{
"path": "fn:doc(\"/1.json\")/object-node()",
"match-text": [
"Sun Rain Cloud"
]
}
]
},
which seems crazy.
Any ideas on how to do such searches on denormalized JSON data?
Laurent:
XPaths on JSON are supported by MarkLogic.
In particular, you might consider setting up a path range index to match /title at the root:
http://docs.marklogic.com/guide/admin/range_index#id_54948
Scoped property matching requires either filtering or indexed positions to be accurate. An alternative is to set up another path range index on /seeAlso/title.
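To make the two paths concrete, here is a plain-Python sketch over the sample data from the question. It does not use the MarkLogic client at all; it only shows which documents each path (/title at the root versus /seeAlso/title) is expected to select.

```python
films = [
    {"id": 1, "title": "Sun",
     "seeAlso": [{"id": 2, "title": "Rain"}, {"id": 3, "title": "Cloud"}]},
    {"id": 2, "title": "Rain",
     "seeAlso": [{"id": 3, "title": "Cloud"}]},
    {"id": 3, "title": "Cloud",
     "seeAlso": [{"id": 1, "title": "Sun"}]},
]

# Case 1: the root-level title is 'Sun' (the /title path).
title_is_sun = [f["id"] for f in films if f["title"] == "Sun"]

# Case 2: some seeAlso entry has the title 'Sun' (the /seeAlso/title path).
see_also_sun = [
    f["id"] for f in films
    if any(entry["title"] == "Sun" for entry in f.get("seeAlso", []))
]

print(title_is_sun, see_also_sun)
```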
For the match issue it would be useful to know the MarkLogic version and to see the entire query.
Hoping that helps,