I have a big collection of tweets stored in MongoDB. A tweet looks like this:
"_id" : ObjectId("4c02c58de500fe1be1000005"),
"contributors" : null,
"text" : "Hello world",
"user" : {
"following" : null,
"followers_count" : 5,
"utc_offset" : null,
"location" : "",
"profile_text_color" : "000000",
"friends_count" : 11,
"profile_link_color" : "0000ff",
"verified" : false,
"protected" : false,
"url" : null,
"contributors_enabled" : false,
"created_at" : "Sun May 30 18:47:06 +0000 2010",
"geo_enabled" : false,
"profile_sidebar_border_color" : "87bc44",
"statuses_count" : 13,
"favourites_count" : 0,
"description" : "",
"notifications" : null,
"profile_background_tile" : false,
"lang" : "en",
"id" : 149978111,
"time_zone" : null,
"profile_sidebar_fill_color" : "e0ff92"
},
"geo" : null,
"coordinates" : null,
"in_reply_to_user_id" : 149183152,
"place" : null,
"created_at" : "Sun May 30 20:07:35 +0000 2010",
"source" : "web",
"in_reply_to_status_id" : {
"floatApprox" : 15061797850
},
"truncated" : false,
"favorited" : false,
"id" : {
"floatApprox" : 15061838001
For example, if I want to find tweets about some topic, say "canon", how should I write a query that checks the "text" field and finds all tweets about "canon"?
At the time of writing, MongoDB had no native query support for searching within text (later releases added text indexes and the $text query operator). The official documentation describes a simple approach to full-text search:
http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mongo
It involves splitting the text into words and storing them in an array field, which you index. This lets you match a keyword against the contents of the array. How you split the text is up to you. Maybe you just take the words, lowercase them, and match against a lowercase keyword. Or maybe you need autocompletion, so you store variations of each word, phonetic forms, and so on. That is all part of stemming.
It's not as robust as a full-text search engine designed for the job, but it works. Depending on the language you are using, some frameworks have search packages. For instance, I use MongoDB with Django's nonrel project, and there is a search app for it that provides stemming and different tools for searching.
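The keyword-array approach described above can be sketched roughly as follows. This is a minimal illustration, not the method from the linked documentation: the toKeywords helper, the "keywords" field name, and the sample tweet text are all assumptions made for the example, and the tokenizer only handles ASCII letters and digits.

```javascript
// Split a tweet's text into lowercase keywords suitable for an indexed array field.
// Assumption: plain ASCII tokenization is good enough; no stemming is applied.
function toKeywords(text) {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/) // split on anything that is not a letter or digit
    .filter((w) => w.length > 0); // drop empty pieces left at the edges
  return [...new Set(words)]; // remove duplicates, keeping first-seen order
}

// Hypothetical stored document: the tweet text plus a derived "keywords" array.
const tweet = { text: "Hello world, my new Canon camera says hello" };
const stored = { ...tweet, keywords: toKeywords(tweet.text) };

// Filter to pass to find(): a scalar compared with an array field
// matches when any element of the array equals it.
const filter = { keywords: "canon" };

console.log(stored.keywords);
console.log(filter);
```

With an index on the keywords field, a find() using this filter can use the index instead of scanning every tweet's text. The search term has to be normalized the same way as the stored keywords (here, lowercased) before it is used in the filter.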
I want to run a search from Node.js against MongoDB that returns, for each id I provide, the number of matching documents.
My collection is a log table:
{
"_id" : ObjectId("5836d0f7f8462cbc6d0caffc"),
"DeviceId" : "abcd1234",
"AppType" : "web",
"UserId" : "5836cb01f8462cbc6d0caff8",
"ArticleId" : "5836cb01f8462cbc6d0caff8",
"Timestamp" : ISODate("2016-11-24T11:37:27.851Z")
},
{
"_id" : ObjectId("5836dba8a2943528448a3050"),
"DeviceId" : null,
"AppType" : null,
"UserId" : null,
"ArticleId" : 5836e493f2acbd1d34648e78,
"Timestamp" : ISODate("2016-11-24T12:23:04.484Z")
},
{
"_id" : ObjectId("5836e445c3b43429b4810ad4"),
"DeviceId" : null,
"AppType" : null,
"UserId" : null,
"ArticleId" : 5836d0f7f8462cbc6d0caffc,
"Timestamp" : ISODate("2016-11-24T12:59:49.820Z")
},
{
"_id" : ObjectId("5836e493f2acbd1d34648e78"),
"DeviceId" : null,
"AppType" : null,
"UserId" : null,
"ArticleId" : 5836d0f7f8462cbc6d0caffc,
"Timestamp" : ISODate("2016-11-24T13:01:07.030Z")
}
and so on...
My search input will be the following list of ids, and the search needs to be performed on ArticleId:
["5836d0f7f8462cbc6d0caffc",
"5836e493f2acbd1d34648e78",
"5836dba8a2943528448a3050"]
and I want a result set like the one below:
{
"1":{ArticleId:"5836d0f7f8462cbc6d0caffc", count:2},
"2":[ArticleId:"5836e493f2acbd1d34648e78", count:9},
"3":[ArticleId:"5836dba8a2943528448a3050", count:35}
}
Can anyone please provide the query? Thanks in advance.
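I don't know if you realize, but if you are counting by _id, all of your counts will be 1.
That's because every _id in a MongoDB collection is unique, so if it's there, it's there only once. Counts above 1 are only possible when you group on a non-unique field such as ArticleId.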
Also, the output that you want is invalid:
{ 1:{_id:ObjectId("5836d0f7f8462cbc6d0caffc"), count:2},
2:{ObjectId("5836e493f2acbd1d34648e78"), count:9},
3:{ObjectId("5836dba8a2943528448a3050"), count:35}}
It's not valid JSON, not valid JavaScript, not valid Hjson, not valid JSON5, not valid anything, so you will never be able to get that result no matter what you do. (Also, even if you change the expected output, every count keyed on _id will still be 1, so fixing the format alone will not help.)
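If the counts are meant to be per ArticleId rather than per _id, a $match followed by a $group is one way to get them. The sketch below only builds the pipeline and mimics it in memory so the result shape is visible. The function names and sample data are made up for illustration, and it assumes ArticleId is stored as a string; if it is stored as an ObjectId, the ids would have to be converted first.

```javascript
// Build an aggregation pipeline that counts log entries per ArticleId.
function buildCountPipeline(articleIds) {
  return [
    { $match: { ArticleId: { $in: articleIds } } }, // keep only the requested ids
    { $group: { _id: "$ArticleId", count: { $sum: 1 } } }, // one bucket per id
  ];
}

// In-memory imitation of the pipeline, only to show what the result looks like.
function countInMemory(docs, articleIds) {
  const wanted = new Set(articleIds);
  const counts = new Map();
  for (const doc of docs) {
    if (!wanted.has(doc.ArticleId)) continue; // same role as $match
    counts.set(doc.ArticleId, (counts.get(doc.ArticleId) || 0) + 1); // same role as $sum: 1
  }
  return [...counts].map(([id, count]) => ({ _id: id, count }));
}

// Hypothetical input, with ArticleId assumed to be stored as a string.
const ids = ["5836d0f7f8462cbc6d0caffc", "5836e493f2acbd1d34648e78"];
const sampleLogs = [
  { ArticleId: "5836cb01f8462cbc6d0caff8" },
  { ArticleId: "5836e493f2acbd1d34648e78" },
  { ArticleId: "5836d0f7f8462cbc6d0caffc" },
  { ArticleId: "5836d0f7f8462cbc6d0caffc" },
];

console.log(JSON.stringify(buildCountPipeline(ids)));
console.log(countInMemory(sampleLogs, ids));
```

With the Node.js driver, the pipeline would be passed to collection.aggregate(pipeline).toArray(). The result is an array of { _id, count } documents, which is valid JSON, unlike the numbered-key layout asked for in the question. Ids that match no documents simply do not appear in the result.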
Hello all, I'm stuck. I am working with MongoDB and Node.js, and the data in my collection is deleted automatically after one year, on a certain date. I want to stop that permanently. How can I do that? I have searched Google for material on this without much success. Please help.
I checked the indexes on one of my collections and got the output below. Can you please tell me whether it has a TTL index or not?
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "firstfive.teachers"
},
{
"v" : 1,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "firstname_lastname_text",
"weights" : {
"firstName" : 1,
"lastName" : 1
},
"default_language" : "english",
"language_override" : "language",
"ns" : "firstfive.teachers",
"textIndexVersion" : 2
}
]
Most likely you have a TTL (time to live) index defined on the collection you're working with (https://docs.mongodb.com/v3.2/core/index-ttl/).
You can check by running db.your_collection.getIndexes() in the mongo shell; a TTL index is one that has an expireAfterSeconds field.
Like any other index, it can be removed, but do so carefully: apparently someone created it deliberately.
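The check described above can be sketched as a small filter over getIndexes() output. The findTtlIndexes helper is made up for illustration. The first listing is the one posted in the question, abbreviated to the relevant fields; the second is a hypothetical listing with a one-year TTL index on an assumed createdAt field.

```javascript
// Return the indexes that carry a TTL setting, given getIndexes() output.
function findTtlIndexes(indexes) {
  return indexes.filter((idx) => typeof idx.expireAfterSeconds === "number");
}

// Index listing from the question (firstfive.teachers), abbreviated.
const teacherIndexes = [
  { v: 1, key: { _id: 1 }, name: "_id_", ns: "firstfive.teachers" },
  {
    v: 1,
    key: { _fts: "text", _ftsx: 1 },
    name: "firstname_lastname_text",
    ns: "firstfive.teachers",
  },
];

// Hypothetical listing: "createdAt" and the index name are assumptions.
const withTtl = [
  { v: 1, key: { _id: 1 }, name: "_id_" },
  { v: 1, key: { createdAt: 1 }, name: "createdAt_1", expireAfterSeconds: 31536000 },
];

console.log(findTtlIndexes(teacherIndexes).map((i) => i.name)); // neither index has expireAfterSeconds
console.log(findTtlIndexes(withTtl).map((i) => i.name));
```

Neither index in the posted listing has an expireAfterSeconds field, so that particular collection has no TTL index; the collection that loses data would need to be checked the same way. If a TTL index does turn up, it can be dropped by name in the mongo shell with db.your_collection.dropIndex("index_name").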
I have some events in the event collection and some users in the users collection.
I have a mapping of events to users in another collection. The event_id field in that collection is a reference (ObjectId) to the event collection. But when I query the collection with the following command, I get null as the response:
db.eventusers.findOne({event_id :'57988cd30e9811750324c080'})
This returns null.
On the other hand, when I search using the user_id field, which is not a reference but just a string containing the user id, I get the following result:
db.eventusers.findOne({user_id :"578cdcdd56eaec041b6caf3e"})
{
"_id" : ObjectId("57988d190e9811750324c081"),
"created_at" : "1469615385595",
"updated_at" : "1469615618502",
"user_id" : "578cdcdd56eaec041b6caf3e",
"event_id" : ObjectId("57988cd30e9811750324c080"),
"deleted" : false,
"invited" : false,
"host" : true,
"status" : -1,
"__v" : 0
}
I found the solution. If the field is defined in the Mongoose schema as Schema.Types.ObjectId, it is stored as an ObjectId, so in the mongo shell the query value must be an ObjectId rather than a string. So instead of the query {event_id :'57988cd30e9811750324c080'} it should be {event_id :ObjectId('57988cd30e9811750324c080')}:
db.eventusers.findOne({event_id :ObjectId('57988cd30e9811750324c080')})
{
"_id" : ObjectId("57988d190e9811750324c081"),
"created_at" : "1469615385595",
"updated_at" : "1469615618502",
"user_id" : "578cdcdd56eaec041b6caf3e",
"event_id" : ObjectId("57988cd30e9811750324c080"),
"deleted" : false,
"invited" : false,
"host" : true,
"status" : -1,
"__v" : 0
}
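In application code the same conversion can be wrapped in a small helper. The sketch below is only an illustration: objectIdFilter and FakeObjectId are made-up names, and the ObjectId class is injected as a parameter so the example runs without a database driver. In real code you would pass the ObjectId class exported by the mongodb package, or mongoose.Types.ObjectId.

```javascript
// Matches the 24-character hex form of an ObjectId.
const HEX_24 = /^[0-9a-fA-F]{24}$/;

// Build a filter whose value is an ObjectId instead of a plain string.
// ObjectIdCtor is injected; it is assumed to accept a hex string.
function objectIdFilter(field, hex, ObjectIdCtor) {
  if (!HEX_24.test(hex)) {
    throw new TypeError(`not a 24-character hex string: ${hex}`);
  }
  return { [field]: new ObjectIdCtor(hex) };
}

// Stand-in class for demonstration only (NOT the real ObjectId).
class FakeObjectId {
  constructor(hex) {
    this.hex = hex.toLowerCase();
  }
}

const filter = objectIdFilter("event_id", "57988cd30e9811750324c080", FakeObjectId);
console.log(filter.event_id instanceof FakeObjectId);
console.log(filter.event_id.hex);
```

Validating the string first gives a clear error for malformed ids instead of a query that silently matches nothing.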
In Node with Mongoose, I want to find an object in the collection Content. It has a list of sub-documents called users, each of which has the properties stream, user and added. I do this to get all documents that have a certain user's _id property in their users.user field:
Content.find( { 'users.user': user._id } ).sort( { 'users.added': -1 } )
This seems to work (although I'm unsure whether .sort is really working here). However, I want to match two fields, like this:
Content.find( { 'users.user': user._id, 'users.stream': stream } ).sort( { 'users.added': -1 } )
That does not seem to work. What is the right way to do this?
Here is a sample document
{
"_id" : ObjectId("551c6b37859e51fb9e9fde83"),
"url" : "https://www.youtube.com/watch?v=f9v_XN7Wxh8",
"title" : "Playing Games in 360°",
"date" : "2015-03-10T00:19:53.000Z",
"author" : "Econael",
"description" : "Blinky is a proof of concept of enhanced peripheral vision in video games, showcasing different kinds of lens projections in Quake (a mod of Fisheye Quake, using the TyrQuake engine).\n\nDemo and additional info here:\nhttps://github.com/shaunlebron/blinky\n\nThanks to #shaunlebron for making this very interesting proof of concept!\n\nSubscribe: http://www.youtube.com/subscription_center?add_user=econaelgaming\nTwitter: https://twitter.com/EconaelGaming",
"duration" : 442,
"likes" : 516,
"dislikes" : 13,
"views" : 65568,
"users" : [
{
"user" : "54f6688c55407c0300b883f2",
"added" : 1427925815190,
"_id" : ObjectId("551c6b37859e51fb9e9fde84"),
"tags" : []
}
],
"images" : [
{
"hash" : "1ab544648d7dff6e15826cda7a170ddb",
"thumb" : "...",
"orig" : "..."
}
],
"tags" : [],
"__v" : 0
}
Use the $elemMatch operator to specify multiple criteria on an array of embedded documents:
Content.find({"users": {$elemMatch: {"user": user.id, "stream": stream}}});
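To see why $elemMatch matters here, the sketch below imitates the two matching rules in plain JavaScript. The function names and the sample document are invented for illustration; they are not part of Mongoose or MongoDB.

```javascript
// Dotted-path rule ({ 'users.user': ..., 'users.stream': ... }):
// each condition may be satisfied by a different array element.
function matchesDotted(doc, user, stream) {
  return (
    doc.users.some((u) => u.user === user) &&
    doc.users.some((u) => u.stream === stream)
  );
}

// $elemMatch rule: one single array element must satisfy every condition.
function matchesElemMatch(doc, user, stream) {
  return doc.users.some((u) => u.user === user && u.stream === stream);
}

// Hypothetical document: user "a" is in stream "news", user "b" is in "music".
const doc = {
  users: [
    { user: "a", stream: "news" },
    { user: "b", stream: "music" },
  ],
};

console.log(matchesDotted(doc, "a", "music")); // true: conditions met by different elements
console.log(matchesElemMatch(doc, "a", "music")); // false: no single element has both
console.log(matchesElemMatch(doc, "a", "news")); // true
```

With the dotted form, a document can match even though no single entry in users has both the given user and the given stream, which is usually not what is wanted.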
Scenario: I have a collection 'People' with the following documents:
{
"_id" : ObjectId("512bc95fe835e68f199c8686"),
"name": "David",
"age" : 78
},
{ "_id" : ObjectId("512bc962e835e68f199c8687"),
"name" : "Dave",
"age" : 35
}
When I query using the following code from Node.js:
db.People.aggregate([
{ $match : { name : "Dave" } }
]);
The output will be like:
{ "_id" : ObjectId("512bc962e835e68f199c8687"),
"name" : "Dave",
"age" : 35
}
Issue: The above is just a sample of the actual scenario. I want the value of the 'age' field to be wrapped in double quotes, i.e. for the example above it should be "age": "35".
That is, the full resulting document should look like the following:
{ "_id" : ObjectId("512bc962e835e68f199c8687"),
"name" : "Dave",
"age" : "35"
}
Given that I have a huge number of documents, how can I efficiently get the desired output?
Question: Can someone suggest a clean and efficient way to achieve this?
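Two possible directions, sketched under assumptions rather than offered as a definitive answer: convert the field on the server inside the aggregation pipeline, or convert it in application code after the query. The pipeline below relies on the $toString operator, which requires MongoDB 4.0 or later; the stringifyAge helper is a made-up name for the client-side variant.

```javascript
// Server-side variant: convert "age" to a string inside the pipeline.
// Assumption: the server is MongoDB 4.0 or later, where $toString is available.
const pipeline = [
  { $match: { name: "Dave" } },
  { $addFields: { age: { $toString: "$age" } } },
];

// Client-side variant: convert after the documents have been fetched.
function stringifyAge(doc) {
  return { ...doc, age: String(doc.age) }; // copy the document, replace age
}

// Sample document from the question, without the _id for brevity.
const fetched = [{ name: "Dave", age: 35 }];
const result = fetched.map(stringifyAge);

console.log(JSON.stringify(pipeline));
console.log(result);
```

The server-side variant avoids a second pass over a large result set in the application, while the client-side variant works on any server version.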