what should be the mongo query for this - node.js

Below is a document from my collection of over 20,000,000 documents.
I need to find documents by a particular ZIP. From those documents I need to select one record per postal address (ADDR, CITY, ST, ZIP, APT), and each selected record must have an AGE value of 18 or higher.
The results also need to be limited to a number entered by the end user.
{
"_id" : ObjectId("55e86e98f493590878bb45d7"),
"RecordID" : 84096380,
"FN" : "Michael",
"MI" : "",
"LN" : "Horn",
"NAME_PRE" : "MR",
"ADDR" : "160 Yankee Camp Rd",
"CITY" : "Telford",
"ST" : "TN",
"ZIP" : 37690,
"APT" : "",
"Z4" : 2200,
"DPC" : 605,
"CAR_RTE" : "R001",
"WALK_SEQ" : 228,
"LOT" : "0136A",
"FIPS_ST" : 47,
"FIPS_CTY" : 179,
"LATITUDE" : 36.292787,
"LONGITUDE" : -82.568171,
"ADDR_TYP" : 1,
"MSA" : 3660,
"CBSA" : 27740,
"ADDR_LINE" : 3,
"DMA_SUPPR" : "",
"GEO_MATCH" : 1,
"CENS_TRACT" : 61900,
"CENS_BLK_GRP" : 1,
"CENS_BLK" : 17,
"CENS_MED_HOME_VALUE" : 953,
"CENS_MED_HH_INCOME" : 304,
"CRA" : "",
"Z4_TYP" : "S",
"DSF_IND" : 1,
"DPD_IND" : "N",
"PHONE_FLAG" : "Y",
"PHONE" : NumberLong("4237730233"),
"TIME_ZN" : "E",
"GENDER" : "M",
"NEW_TO_BLD" : "",
"SOURCES" : 19,
"BASE_VER_DT" : 20101,
"COMP_ID" : NumberLong("3769001836"),
"IND_ID" : 1,
"INF_HH_RANK" : 1,
"HOME_OWNR_SRC" : "V",
"DOB_YR" : 1975,
"DOB_MON" : 7,
"DOB_DAY" : 10,
"EXACT_AGE" : 39,
"AGE" : 39,
"HH_INCOME" : "D"
}

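If you are using Mongoose, you can chain the query operations with the dot (.) operator. Since all of your requirements are conditions, here is an example:
Person.
  find({
    ZIP: 37690,         // ZIP is stored as a number in the sample document
    ADDR: "",           // fill in the address to match
    ST: "",             // the state field is named ST in the sample document; and so on
    AGE: { $gte: 18 }   // 18 or higher
  }).
  limit(10).
  exec(callback);
More info: http://mongoosejs.com/docs/queries.html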

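You need to use an aggregation pipeline.
var pipeline = [
  {
    // ZIP is stored as a number; $gte keeps people aged 18 or higher
    $match: {ZIP: 37690, AGE: {$gte: 18}}
  }, {
    // one group per postal address (the state field is named ST in the sample document)
    $group: {
      _id: {ADDR: '$ADDR', CITY: '$CITY', ST: '$ST', ZIP: '$ZIP', APT: '$APT'},
      PHONE: {$first: '$PHONE'}
    }
  },
  {$limit: 10}
];
db.mycoll.aggregate(pipeline)
Extend the $group stage with further $first accumulators (or add a $project stage) to return whatever fields you require in the results.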
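Since the question asks for a limit entered by the end user and for one full record per address, the pipeline above could be parameterized roughly as follows. This is only a sketch in Python (PyMongo accepts the same pipeline as a list of dicts). The function name `build_address_pipeline` and the `record` field are hypothetical, and the `$replaceRoot` stage assumes MongoDB 3.4 or later.

```python
def build_address_pipeline(zip_code, limit, min_age=18):
    """Build a pipeline returning one record (aged min_age or higher) per postal address.

    Hypothetical helper: zip_code is the numeric ZIP, limit is the user-entered maximum.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return [
        # Filter first so an index on ZIP (and AGE) can be used
        {"$match": {"ZIP": zip_code, "AGE": {"$gte": min_age}}},
        # One group per address; keep the first whole document seen for it
        {"$group": {
            "_id": {"ADDR": "$ADDR", "CITY": "$CITY", "ST": "$ST",
                    "ZIP": "$ZIP", "APT": "$APT"},
            "record": {"$first": "$$ROOT"},
        }},
        {"$limit": limit},
        # Return the stored document itself instead of the group wrapper
        {"$replaceRoot": {"newRoot": "$record"}},
    ]
```

With a PyMongo collection object (here assumed to be called `collection`), this would be run as `collection.aggregate(build_address_pipeline(37690, 25))`.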

I think this query will solve your problem.
Person.find({
  ZIP: 37690,         // ZIP is stored as a number in the sample document
  AGE: { $gte: 18 }   // 18 or higher
}).
limit(50).
exec(callback);

Related

Sort JSON document by values embedded in an array of objects

I have a document in the format below. The goal is to group the document by student name and sort each student's scores by rank in ascending order. Once that is done, iterate through the ranks (within a student) and, if each subsequent rank is greater than the previous one, increment the Version field. As part of a pipeline, student_name will be passed to me, so matching by student name should work instead of grouping.
NOTE: I tried it with Python and it works to some extent. A Python solution would also be great!
{
"_id" : ObjectId("5d389c7907bf860f5cd11220"),
"class" : "I",
"students" : [
{
"student_name" : "AAA",
"Version" : 2,
"scores" : [
{
"value" : "50",
"rank" : 2
},
{
"value" : "70",
"rank" : 1
}
]
},
{
"student_name" : "BBB",
"Version" : 5,
"scores" : [
{
"value" : 80,
"rank" : 2
},
{
"value" : 100,
"rank" : 1
},
{
"value" : 100,
"rank" : 1
}
]
}
]
}
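I tried this piece of code to sort
def version(student_name):
    cursor = db.column.aggregate(
        [
            {"$unwind": "$students"},
            {"$match": {"students.student_name": student_name}},
            {"$unwind": "$students.scores"},
            {"$sort": {"students.scores.rank": 1}},
            # $group needs an _id; collect the sorted scores per student
            {"$group": {"_id": "$students.student_name",
                        "scores": {"$push": "$students.scores"}}}
        ]
    )
    for student in cursor:
        scores = student["scores"]
        for i in range(0, len(scores) - 1):
            if scores[i]["rank"] < scores[i + 1]["rank"]:
                # update_many takes a filter and an update; "$" targets the matched student
                db.column.update_many(
                    {"students.student_name": student_name},
                    {"$inc": {"students.$.Version": 1}}
                )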
The expected output for student AAA should be
{
"_id" : ObjectId("5d389c7907bf860f5cd11220"),
"class" : "I",
"students" : [
{
"student_name" : "AAA",
"Version" : 3, #version incremented
"scores" : [
{
"value" : "70",
"rank" : 1
},
{
"value" : "50",
"rank" : 2
}
]
}
]
}
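The sort-then-increment logic described in the question can also be done on the client side after fetching the document. The sketch below is plain Python with no database calls. The function name `sort_scores_and_bump_version` is hypothetical, and it assumes one reading of the requirement: Version is incremented once for every step where a rank is greater than the previous one.

```python
import copy


def sort_scores_and_bump_version(document, student_name):
    """Return a copy of document with one student's scores sorted by rank.

    Assumption: Version grows by 1 for each strictly increasing step in the sorted ranks.
    """
    updated = copy.deepcopy(document)  # leave the caller's document untouched
    for student in updated["students"]:
        if student["student_name"] != student_name:
            continue
        # Stable sort by rank, ascending
        student["scores"].sort(key=lambda score: score["rank"])
        ranks = [score["rank"] for score in student["scores"]]
        increases = sum(1 for prev, cur in zip(ranks, ranks[1:]) if cur > prev)
        student["Version"] += increases
    return updated
```

The result could then be written back with something like `db.column.update_one({"_id": doc["_id"]}, {"$set": {"students": updated["students"]}})`, where `doc` is the fetched document.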

I was able to sort the document.
import pprint

pipeline = [
    {"$unwind": "$properties"},
    {"$unwind": "$properties.values"},
    # sort keys are plain field paths without a leading "$"; -1 sorts descending, use 1 for ascending
    {"$sort": {"properties.values.rank": -1}},
    {"$group": {"_id": "$properties.property_name", "values": {"$push": "$properties.values"}}}
]
pprint.pprint(list(db.column.aggregate(pipeline)))

How to group a document with the same name that has different values for a specific attribute in one array using Mongodb?

If I have these objects :
{
"_id" : ObjectId("5caf2c1642e3731464c2c79d"),
"requested" : [],
"roomNo" : "E0-1-09",
"capacity" : 40,
"venueType" : "LR(M)",
"seatingType" : "TB",
"slotStart" : "8:30AM",
"slotEnd" : "9:50AM",
"__v" : 0
}
/* 2 */
{
"_id" : ObjectId("5caf2deb4a7f5222305b55d5"),
"requested" : [],
"roomNo" : "E0-1-09",
"capacity" : 40,
"venueType" : "LR(M)",
"seatingType" : "TB",
"slotStart" : "10:00AM",
"slotEnd" : "11:20AM",
"__v" : 0
}
Is it possible to get something like this using aggregate in MongoDB?
[{ roomNo: "E0-1-09", availability: [{ slotStart: "8:30AM", slotEnd: "9:50AM" },
{ slotStart: "10:00AM", slotEnd: "11:20AM" }] }]
What I'm using currently:
db.getDB().collection(collection).aggregate([
{ $group: {_id:{roomNo: "$roomNo", availability :[{slotStart:"$slotStart", slotEnd:"$slotEnd"}]}}}
])
I am actually getting the room twice, like so:
[{ roomNo: "E0-1-09", availability: [{ slotStart: "8:30AM", slotEnd: "9:50AM" }] },
{ roomNo: "E0-1-09", availability: [{ slotStart: "10:00AM", slotEnd: "11:20AM" }] }]

You have to use the $push accumulator, and group on roomNo only:
db.collection.aggregate([
{ "$group": {
"_id": "$roomNo",
"availability": {
"$push": {
"slotEnd": "$slotEnd",
"slotStart": "$slotStart"
}
}
}}
])
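To make the effect of grouping on roomNo with $push more concrete, here is a small plain-Python emulation of what that stage computes. It is only an illustration of the semantics, not how MongoDB implements it, and the function name `group_slots_by_room` is hypothetical.

```python
def group_slots_by_room(documents):
    """Emulate {$group: {_id: "$roomNo", availability: {$push: {...}}}} in plain Python."""
    rooms = {}
    for doc in documents:
        slot = {"slotStart": doc["slotStart"], "slotEnd": doc["slotEnd"]}
        # Same roomNo -> same group; each document appends one slot to the array
        rooms.setdefault(doc["roomNo"], []).append(slot)
    return [{"roomNo": room_no, "availability": slots}
            for room_no, slots in rooms.items()]
```

The original attempt produced two results because the slot values were part of the group key, so each distinct slot formed its own group.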

How to get sum of a particular field of a collection in MongoDB collection using PyMongo?

My MongoDB contains the following data
{
"_id" : ObjectId("5c1b742eb1829b69963029e8"),
"duration" : 12,
"cost" : 450,
"tax" : 81,
"tags" : [],
"participants" : [
ObjectId("5c1b6a8f348ddb15e4a8aac7"),
ObjectId("5c1b742eb1829b69963029e7")
],
"initiatorId" : ObjectId("5c1b6a8f348ddb15e4a8aac7"),
"context" : "coach",
"accountId" : ObjectId("5bdfe7b01cbf9460c9bb5d68"),
"status" : "over",
"webhook" : "http://d4bdc1ef.ngrok.io/api/v1/webhook_callback",
"hostId" : "5be002109a708109f862a03e",
"createdAt" : ISODate("2018-12-20T10:51:26.143Z"),
"updatedAt" : ISODate("2018-12-20T10:51:44.962Z"),
"__v" : 0,
"endedAt" : ISODate("2018-12-20T10:51:44.612Z"),
"startedAt" : ISODate("2018-12-20T10:51:32.992Z"),
"type" : "voip"
}
{
"_id" : ObjectId("5c1b7451b1829b69963029ea"),
"duration" : 1,
"cost" : 150,
"tax" : 27,
"tags" : [],
"participants" : [
ObjectId("5c1b6a8f348ddb15e4a8aac7"),
ObjectId("5c1b7451b1829b69963029e9")
],
"initiatorId" : ObjectId("5c1b6a8f348ddb15e4a8aac7"),
"context" : "coach",
"accountId" : ObjectId("5bdfe7b01cbf9460c9bb5d68"),
"status" : "over",
"webhook" : "http://d4bdc1ef.ngrok.io/api/v1/webhook_callback",
"hostId" : "5be002109a708109f862a03e",
"createdAt" : ISODate("2018-12-20T10:52:01.560Z"),
"updatedAt" : ISODate("2018-12-20T10:52:08.018Z"),
"__v" : 0,
"endedAt" : ISODate("2018-12-20T10:52:07.667Z"),
"startedAt" : ISODate("2018-12-20T10:52:06.762Z"),
"type" : "voip"
}
I want to get the total duration (the sum of the duration field) for a particular accountId where status equals "over", within a particular date range. Is there any way to accomplish this using PyMongo? I am unable to form the query.

Well, I was making some pretty basic mistakes while converting the query to a PyMongo aggregation. All I would say is: be careful with the structure of the query, and especially remember that the keys must be enclosed in quotes (""). To solve this, all I had to do was:
from bson.objectid import ObjectId

pipe = [
    # keep finished calls of one account inside the date range
    {"$match": {"accountId": ObjectId(accountId),
                "status": "over",
                "startedAt": {"$gte": startDate,
                              "$lte": EndDate}}},
    # reduce startedAt to a calendar day
    {"$project": {"readableDate":
                      {"$dateToString":
                           {"format": "%Y-%m-%d", "date": "$startedAt"}},
                  "accountId": "$accountId",
                  "duration": "$duration"}},
    # sum the duration per day and account
    {"$group": {"_id": {"date": "$readableDate",
                        "accountId": "$accountId"},
                "totalCallDuration": {"$sum": "$duration"}}}]
for doc in db.VoiceCall.aggregate(pipe):
    print(doc)
Just a reminder: startDate and EndDate are Python datetime objects.
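The pipeline above yields one total per day. If a single total for the whole date range is wanted, as the question describes, a shorter pipeline may be enough. The following is a sketch under that assumption. The function name `build_total_duration_pipeline` and the output field `totalDuration` are hypothetical, and `account_id` is expected to be the already converted ObjectId.

```python
from datetime import datetime


def build_total_duration_pipeline(account_id, start, end):
    """Build a pipeline summing duration for one account between start and end (inclusive)."""
    if start > end:
        raise ValueError("start must not be later than end")
    return [
        {"$match": {"accountId": account_id,
                    "status": "over",
                    "startedAt": {"$gte": start, "$lte": end}}},
        # A single group per account gives one total for the whole range
        {"$group": {"_id": "$accountId",
                    "totalDuration": {"$sum": "$duration"}}},
    ]
```

It would be used like the pipeline above, for example `list(db.VoiceCall.aggregate(build_total_duration_pipeline(ObjectId(accountId), startDate, EndDate)))`.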

how to find recent record of record matching particular criteria

{
"_id" : ObjectId("5514ecc73910d3e808b9417c"),
"endingReciptBookNumber" : 2999,
"startingReciptBookNumber" : 2900,
"User" : 8,
"allRecipt" : [
{
"recipt_Number" : 2999,
"amount" : 24124,
"_id" : ObjectId("5514ecc73910d3e808b94180")
},
{
"recipt_Number" : 100,
"amount" : 2414,
"_id" : ObjectId("5514ecc73910d3e808b9417f")
},
{
"recipt_Number" : 101,
"amount" : 242,
"_id" : ObjectId("5514ecc73910d3e808b9417e")
},
{
"recipt_Number" : 102,
"amount" : 2424,
"_id" : ObjectId("5514ecc73910d3e808b9417d")
}
],
"__v" : 0
}
I have many documents like this in a collection, accessed through Mongoose. I want to find the most recently entered recipt_Number for a particular user. In this case it should give me 102 as the answer.

I have attached a one-line snippet below. It is another way to get the same result.
db.topics.find( {'User': 8}, { 'allRecipt': { $slice: -1 }, 'startingReciptBookNumber': 0, 'endingReciptBookNumber': 0, 'User': 0, '_id': 0, '__v': 0 } )
The query result looks like this:
{
"allRecipt" : [
{
"recipt_Number" : 102,
"amount" : 2424,
"_id" : ObjectId("5514ecc73910d3e808b9417d")
}
]
}
The query does not return a single number, but you can read the desired value from result.allRecipt[0].recipt_Number. Because only the last array element is returned, the number you want is always at index 0.
The $slice: -1 projection is what makes the difference: it returns only the last element of the array.
Thanks
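For completeness, reading the number out of the projected result could look like this in Python. This is a minimal sketch, the helper name `latest_receipt_number` is hypothetical, and it assumes the newest receipt is the last element of allRecipt, as in the answer above.

```python
def latest_receipt_number(projected_doc):
    """Return recipt_Number of the last receipt in the document, or None if there is none."""
    receipts = projected_doc.get("allRecipt") or []
    if not receipts:
        return None
    # With the $slice: -1 projection the array holds exactly one element
    return receipts[-1]["recipt_Number"]
```

With PyMongo the input document could come from `db.topics.find_one({"User": 8}, {"allRecipt": {"$slice": -1}, "_id": 0})`.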

MongoDB-Query Optimization

I have a collection with a sub-document consisting of more than 40K records.
My aggregate query takes about 300 seconds. I tried optimizing it with compound as well as multikey indexes, which brought it down to 180 seconds.
I still need to reduce the query execution time.
Here is a document from my collection:
{
"_id" : ObjectId("545b32cc7e9b99112e7ddd97"),
"grp_id" : 654,
"user_id" : 2,
"mod_on" : ISODate("2014-11-06T08:35:40.857Z"),
"crtd_on" : ISODate("2014-11-06T08:35:24.791Z"),
"uploadTp" : 0,
"tp" : 1,
"status" : 3,
"id_url" : [
{"mid":"xyz12793"},
{"mid":"xyz12794"},
{"mid":"xyz12795"},
{"mid":"xyz12796"}
],
"incl" : 1,
"total_cnt" : 25,
"succ_cnt" : 25,
"fail_cnt" : 0
}
And the following is my query:
db.member_id_transactions.aggregate([ { '$match':
{ id_url: { '$elemMatch': { mid: 'xyz12794' } } } },
{ '$unwind': '$id_url' },
{ '$match': { grp_id: 654, 'id_url.mid': 'xyz12794' } } ])
Has anyone faced the same issue?
Here is the output of the aggregate query with the explain option:
{
"result" : [
{
"_id" : ObjectId("546342467e6d1f4951b56285"),
"grp_id" : 685,
"user_id" : 2,
"mod_on" : ISODate("2014-11-12T11:24:01.336Z"),
"crtd_on" : ISODate("2014-11-12T11:19:34.682Z"),
"uploadTp" : 1,
"tp" : 1,
"status" : 3,
"id_url" : [
{"mid":"xyz12793"},
{"mid":"xyz12794"},
{"mid":"xyz12795"},
{"mid":"xyz12796"}
],
"incl" : 1,
"__v" : 0,
"total_cnt" : 21406,
"succ_cnt" : 21402,
"fail_cnt" : 4
}
],
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(0, 0),
"electionId" : ObjectId("545c8d37ab9cc679383a1b1b")
}
}

One way to reduce the number of records that have to be filtered later is to include the grp_id field in the first $match stage.
db.member_id_transactions.aggregate([
{$match:{ "id_url.mid": 'xyz12794',"grp_id": 654 } },
{$unwind: "$id_url" },
{$match: { "id_url.mid": "xyz12794" } }
])
See how the performance is now. Add grp_id to the index to get a better response time.
Although the above aggregation query works, it is unnecessary. Since you are not altering the structure of the document, and you expect only one element in the array to match the filter condition, you could just use a simple find with a projection.
db.member_id_transactions.find(
{ "id_url.mid": "xyz12794","grp_id": 654 },
{"_id":0,"grp_id":1,"id_url":{$elemMatch:{"mid":"xyz12794"}},
"user_id":1,"mod_on":1,"crtd_on":1,"uploadTp":1,
"tp":1,"status":1,"incl":1,"total_cnt":1,
"succ_cnt":1,"fail_cnt":1
}
)
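The two suggestions above (a compound index that includes grp_id, and a find with an $elemMatch projection) might be expressed in Python roughly as follows. This is a sketch. The names `MEMBER_INDEX_KEYS` and `build_member_find_args` are hypothetical, and whether this index order suits the real workload is an assumption that should be checked with explain.

```python
# Compound index key specification; 1 means ascending (the value of pymongo.ASCENDING)
MEMBER_INDEX_KEYS = [("grp_id", 1), ("id_url.mid", 1)]

# Top-level fields to return alongside the matching array element
RETURNED_FIELDS = ("grp_id", "user_id", "mod_on", "crtd_on", "uploadTp", "tp",
                   "status", "incl", "total_cnt", "succ_cnt", "fail_cnt")


def build_member_find_args(grp_id, mid):
    """Return (query, projection) for finding one member id inside one group."""
    query = {"grp_id": grp_id, "id_url.mid": mid}
    # $elemMatch in the projection returns only the first matching array element
    projection = {"_id": 0, "id_url": {"$elemMatch": {"mid": mid}}}
    for field in RETURNED_FIELDS:
        projection[field] = 1
    return query, projection
```

With a PyMongo collection object this would be applied as `collection.create_index(MEMBER_INDEX_KEYS)` once, and then `collection.find(*build_member_find_args(654, "xyz12794"))` for each lookup.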
