Mongoose and nodejs: about schema and query

I'm building a REST API that lets users submit and retrieve data produced by surveys. Questions are not mandatory, so each submission can differ from the others. Each submission is a JSON document with the data and a "survey id":
{
id: abc123,
surveyid: 123,
array: [],
object: {}
...
}
I have to store this data and allow retrieving and querying it.
First approach: go without a schema and put everything in a single collection. It works, but each JSON field is treated as a "String", which makes queries on numeric values problematic.
Second approach: get the question data types for each survey, save a Mongoose schema definition in a JSON file, and keep that file up to date.
Something like this, where the "schema": {} entry represents the Mongoose schema used for inserting and querying/retrieving data:
[
{
"surveyid" : "123",
"schema" : {
"name": "string",
"username" : "string",
"value" : "number",
"start": "date",
"bool" : "boolean",
...
}
},
{ ... }
]
I hope this is clear. I have some questions:
Right now I have a single collection for all submissions and everything is treated as a string. Can I use a Mongoose schema, without other modifications, to specify that some fields are numeric (or dates, or whatever)? Is it allowed, and is it even a good idea?
Are there any disadvantages to using an external JSON file? Are Mongoose schemas loaded at run time when requested, or does the service need to be restarted when this file is updated?
How should I store data with a "schema" that could change often?
Thank you!
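To make the second approach concrete, here is a minimal sketch of how a stored type map could be turned into a Mongoose schema definition. The helper name buildSchemaDefinition, the TYPE_MAP table, and the model and collection names in the comments are assumptions for illustration, not part of the original setup.

```javascript
// Hypothetical lookup from the type names used in the JSON file
// to the constructors Mongoose accepts as schema types.
const TYPE_MAP = {
  string: String,
  number: Number,
  date: Date,
  boolean: Boolean,
};

// Hypothetical helper: convert { field: "typeName" } into { field: Constructor }.
function buildSchemaDefinition(typeMap) {
  const definition = {};
  for (const [field, typeName] of Object.entries(typeMap)) {
    const ctor = TYPE_MAP[String(typeName).toLowerCase()];
    if (!ctor) {
      throw new Error(`Unsupported type "${typeName}" for field "${field}"`);
    }
    definition[field] = ctor;
  }
  return definition;
}

const definition = buildSchemaDefinition({
  name: 'string',
  value: 'number',
  start: 'date',
  bool: 'boolean',
});

// With Mongoose installed, the definition could then be used roughly as:
//   const schema = new mongoose.Schema(definition, { strict: false });
//   const Submit = mongoose.model('Submit_123', schema, 'submits');
// `strict: false` keeps fields that are not listed in the definition.
```

Because the definition is a plain object built at run time, the JSON file can be re-read whenever it is needed. Note that Mongoose only casts the fields listed in the definition, and values already stored as strings are not rewritten by attaching a schema.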

Related

Adding ACL for nested field object in JSON schema object

I am working on specifying ACLs for fields inside objects. I have a validator that checks for permission to edit a specific field. For example, the schema looks like this:
"basic_info": {
"properties": {
"cadi_id": {
...
},
"analysis_keywords": {
...
},
"abstract": {
"type": "string",
"title": "Abstract",
"acl": {
"users": ["test@test.org", "test1@test.org"]
}
},
"ana_notes": {
...
},
"conclusion": {
...
}
},
"title": "Basic Information",
"type": "object",
"id": "basic_info",
"required": ["cadi_id"]
}
The abstract field has an acl. This works as intended when a user who is not in the acl edits the abstract field: a ValidationError is thrown.
The problem comes when a user who is not in the acl edits another field, such as conclusion, and gets the same ValidationError.
When any field in basic_info is edited, for example conclusion, the whole basic_info object is processed by the validator because it is the parent field. The user should be able to edit conclusion because no acl is set on it. However, the request also contains the unchanged abstract inside basic_info, so abstract goes through the validate method too, and since the user is not in its acl a ValidationError is raised.
What am I missing to let a user who is not in the acl edit the fields that have no acl?
I tried fetching the previous value from the db and checking whether the controlled field was actually edited by the user, but that doesn't seem efficient for this use case. I would like to know if there is a native way to do field-level validation. I could not find anything in the docs.

Sequelize "raw = true" changes json model attribute name with dot

I have a Content model that has many ContentImage sub-items, and I query it like this:
const content = await db.Content.findOne({
where: {
permalink: req.params.permalink
},
include: [{
model: db.ContentImages
}],
raw: true
});
As you know, raw: true converts the SequelizeInstance to a plain object. This is where I have a problem.
If I use raw: true, the JSON looks like this:
{
"id": 4706,
"name": "Content Title",
"content": "Content detail",
"t_content_images.id": 7633,
"t_content_images.content_id": 4706,
"t_content_images.image": "content-image-1.jpg",
"t_content_images.order_no": 1
}
Because of Express.js, I need a model like this instead of a SequelizeInstance:
{
"id": 4706,
"name": "Content Title",
"content": "Content detail",
"t_content_images": {
"id": 7633,
"content_id": 4706,
"image": "content-image-1.jpg",
"order_no": 1
}
}
Another problem: I have multiple content images, and with the query above I get back just the first content image.

You are thinking about it a bit backwards: with raw: true, Sequelize skips the step that converts the flat result object into a model instance. It does not convert an instance back into an object.
If you think of how SQL results are structured, they always come back flat. This means that for joins where you have one base record linked to multiple children (Content -< ContentImages in this case), the SQL results repeat the info for the base record for each of the children. Sequelize parses this into a JSON object, which is what you are seeing in the first example in your question. If you leave out raw: true, it takes this a step further and parses the result into an instance of your model. You can then call toJSON() on that instance to get a JSON representation of the parsed object.
Given the above, if you are fetching lots of children it can be more efficient to get the data in two queries instead of one, so that the repeated base-record data does not have to be sent for every child row.
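If the flat raw: true shape has to be kept, the dotted keys can be folded into nested objects by hand. The following is a minimal sketch; nestDottedKeys is a hypothetical helper, not a Sequelize function, and it handles a single row only. Sequelize also documents a nest: true query option for this purpose, so it is worth checking the docs for your version first.

```javascript
// Hypothetical helper: turn { "a.b": 1 } style keys (as produced by
// raw: true on a joined query) into nested objects.
function nestDottedKeys(row) {
  const result = {};
  for (const [key, value] of Object.entries(row)) {
    const parts = key.split('.');
    let target = result;
    // Walk or create the intermediate objects for every part but the last.
    for (const part of parts.slice(0, -1)) {
      if (typeof target[part] !== 'object' || target[part] === null) {
        target[part] = {};
      }
      target = target[part];
    }
    target[parts[parts.length - 1]] = value;
  }
  return result;
}

const flat = {
  id: 4706,
  name: 'Content Title',
  't_content_images.id': 7633,
  't_content_images.image': 'content-image-1.jpg',
};
const nested = nestDottedKeys(flat);
// nested.t_content_images is now { id: 7633, image: 'content-image-1.jpg' }
```

This does not solve the multiple-images case: with several children, the flat rows would still have to be grouped by the base record, which is what Sequelize does for you when raw: true is left out.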

Storing a complex Query within MongoDb Document [duplicate]

This is the case: a webshop in which I want to configure which items should be listed in the shop based on a set of parameters.
I want this to be configurable, because that allows me to experiment with different parameters and also change their values easily.
I have a Product collection that I want to query based on multiple parameters.
A couple of these are found here:
within product:
"delivery" : {
"maximum_delivery_days" : 30,
"average_delivery_days" : 10,
"source" : 1,
"filling_rate" : 85,
"stock" : 0
}
but also other parameters exist.
An example of such query to decide whether or not to include a product could be:
"$or" : [
{
"delivery.stock" : 1
},
{
"$or" : [
{
"$and" : [
{
"delivery.maximum_delivery_days" : {
"$lt" : 60
}
},
{
"delivery.filling_rate" : {
"$gt" : 90
}
}
]
},
{
"$and" : [
{
"delivery.maximum_delivery_days" : {
"$lt" : 40
}
},
{
"delivery.filling_rate" : {
"$gt" : 80
}
}
]
},
{
"$and" : [
{
"delivery.delivery_days" : {
"$lt" : 25
}
},
{
"delivery.filling_rate" : {
"$gt" : 70
}
}
]
}
]
}
]
Now to make this configurable, I need to be able to handle boolean logic, parameters and values.
So, since such a query is itself JSON, I got the idea to store it in Mongo and have my Java app retrieve it.
The next step is using it as the filter (e.g. in find, or whatever) and working on the corresponding selection of products.
The advantage of this approach is that I can actually analyse the data and the effectiveness of the query outside of my program.
I would store it by name in the database. E.g.
{
"name": "query1",
"query": { the thing printed above starting with "$or"... }
}
using:
db.queries.insert({
"name" : "query1",
"query": { the thing printed above starting with "$or"... }
})
Which results in:
2016-03-27T14:43:37.265+0200 E QUERY Error: field names cannot start with $ [$or]
at Error (<anonymous>)
at DBCollection._validateForStorage (src/mongo/shell/collection.js:161:19)
at DBCollection._validateForStorage (src/mongo/shell/collection.js:165:18)
at insert (src/mongo/shell/bulk_api.js:646:20)
at DBCollection.insert (src/mongo/shell/collection.js:243:18)
at (shell):1:12 at src/mongo/shell/collection.js:161
But I CAN store it using Robomongo, though not always. Obviously I am doing something wrong, but I have no idea what it is.
If it fails and I create a brand new collection and try again, it succeeds. This goes beyond what I can comprehend.
But when I try updating values in the "query", the changes never go through. Not even sometimes.
I can, however, create a new object and discard the previous one, so a workaround exists.
db.queries.update(
{"name": "query1"},
{"$set": {
... update goes here ...
}
}
)
doing this results in:
WriteResult({
"nMatched" : 0,
"nUpserted" : 0,
"nModified" : 0,
"writeError" : {
"code" : 52,
"errmsg" : "The dollar ($) prefixed field '$or' in 'action.$or' is not valid for storage."
}
})
This seems pretty close to the other message above.
Needless to say, I am pretty clueless about what is going on here, so I hope some of the wizards here are able to shed some light on the matter.

I think the error message contains the important info you need to consider:
QUERY Error: field names cannot start with $
Since you are trying to store a query (or part of one) in a document, you end up with field names that are mongo operator keywords (such as $or, $ne, $gt). The MongoDB documentation covers this exact scenario:
Field names cannot contain dots (i.e. .) or null characters, and they must not start with a dollar sign (i.e. $)...
I wouldn't trust 3rd party applications such as Robomongo in these instances. I suggest debugging/testing this issue directly in the mongo shell.
My suggestion would be to store an escaped version of the query in your document so that it does not interfere with reserved operator keywords. You can use JSON.stringify(my_obj) to encode your partial query into a string, and then decode it with JSON.parse(escaped_query_string_from_db) when you retrieve it later.
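A minimal sketch of that round trip, using a shortened version of the query from the question. The variable names are illustrative, and the actual insert and find calls are left out because they depend on the driver in use.

```javascript
// A partial query whose operator keys cannot be stored as field names.
const query = {
  $or: [
    { 'delivery.stock': 1 },
    { 'delivery.filling_rate': { $gt: 90 } },
  ],
};

// Document shape that is safe to store: the query is an opaque string,
// so no field name starts with "$" or contains a dot.
const storedDoc = { name: 'query1', query: JSON.stringify(query) };

// After reading the document back, decode the string before using it as a filter.
const restored = JSON.parse(storedDoc.query);
```

The trade-off is that the stored query can no longer be inspected or updated field by field inside MongoDB, and values that JSON cannot represent (dates, ObjectIds, regular expressions) do not survive the round trip unchanged.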

Your approach of storing the query as a JSON object in MongoDB is not viable.
You could potentially store your query logic and fields in MongoDB, but you have to have an external app build the query with the proper MongoDB syntax.
MongoDB queries contain operators, and some of those have special characters in them.
There are rules for MongoDB field names. These rules do not allow for special characters.
Look here: https://docs.mongodb.org/manual/reference/limits/#Restrictions-on-Field-Names
The probable reason you can sometimes create the doc successfully using Robomongo is that Robomongo transforms your query into a string and properly escapes the special characters as it sends it to MongoDB.
This also explains why your attempts to update it never work. You tried to create a document, but instead created something that is a string, so your update conditions are probably not matching any docs.

I see two problems with your approach.
In the following query
db.queries.insert({
"name" : "query1",
"query": { the thing printed above starting with "$or"... }
})
valid JSON expects key/value pairs. Here, in "query", you are storing an object without a key. You have two options: either store the query as text, or create another key inside the curly braces.
The second problem is that you are storing query values without wrapping them in quotes. All string values must be wrapped in quotes.
So your final document should appear as
db.queries.insert({
"name" : "query1",
"query": 'the thing printed above starting with "$or"... '
})
Now try, it should work.

Obviously my attempt to store a query in mongo the way I did was foolish, as became clear from the answers from both @bigdatakid and @lix. So what I finally did was this: I altered the naming of the fields to comply with the mongo requirements.
E.g. instead of $or I used _$or etc., and instead of using a . inside the name I used a #. I replace both of these in my Java code.
This way I can still easily try and test the queries outside of my program. In my Java program I just change the names back and use the query, which takes just 2 lines of code. It simply works now. Thanks guys for the suggestions you made.
String documentAsString = query.toJson().replaceAll("_\\$", "\\$").replaceAll("#", ".");
Object q = JSON.parse(documentAsString);
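The same renaming idea can be sketched in JavaScript by walking the keys instead of doing a string replace. This is an illustration under assumptions, not the code used above: renameKeys, toStorable and toQuery are hypothetical names, and the sketch assumes plain JSON-like data (no dates or ObjectIds) and that no original key contains a # or starts with _$.

```javascript
// Recursively apply renameKey to every object key; arrays and scalars pass through.
// Assumes plain JSON-like data: objects, arrays, strings, numbers, booleans, null.
function renameKeys(value, renameKey) {
  if (Array.isArray(value)) {
    return value.map((item) => renameKeys(item, renameKey));
  }
  if (value !== null && typeof value === 'object') {
    const out = {};
    for (const [key, child] of Object.entries(value)) {
      out[renameKey(key)] = renameKeys(child, renameKey);
    }
    return out;
  }
  return value;
}

// "$or" -> "_$or", "delivery.stock" -> "delivery#stock"
const toStorable = (key) =>
  (key.startsWith('$') ? '_' + key : key).split('.').join('#');

// "_$or" -> "$or", "delivery#stock" -> "delivery.stock"
const toQuery = (key) =>
  (key.startsWith('_$') ? key.slice(1) : key).split('#').join('.');

const original = {
  $or: [
    { 'delivery.stock': 1 },
    { 'delivery.filling_rate': { $gt: 70 } },
  ],
};
const stored = renameKeys(original, toStorable);
const roundTripped = renameKeys(stored, toQuery);
```

Unlike a global string replace on the serialized document, this only touches key names, so a # or _$ that appears inside a string value is left alone.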

mongoose exclude field in array

I have something like:
Schema Subdocument
name: String
data: Mixed
Schema Stuff
documents: [Subdocument]
Now, in my API there are two endpoints, one for Subdocument and another for Stuff. When I get a Subdocument I need it to contain the data field, but when I get Stuff I want to show the names of those subdocuments without the data field, because it is quite large and won't be used.
So, to keep things clear, data is not private. It's just that I don't want it to be shown when I get it from Stuff.
I tried by doing:
Stuff.findById(id)
.populate("documents")
.populate("-documents.data")
but that doesn't work... I'm getting the Stuff with the Subdocument containing both name and data. I feel like I'm failing to tell mongoose, when I call populate("-documents.data"), that documents is an array and that I want to exclude the data field for each element in this array.
edit: Sorry the Schema I provided was not for my case. In my case it was not embedded, but a reference, like so:
Schema Subdocument
name: String
data: Mixed
Schema Stuff
documents: [{
type: Schema.Types.ObjectId,
ref: 'Subdocument'
}]

Assuming Subdocument is not embedded but referenced via "ref", as you say, populate works and the data part can be excluded like this:
Stuff.findById(id).populate( { "path" : "documents", "select" : "-data" })

Your documents have an "embedded" schema, so populate is not used here, it is used only for "referenced" schemas where the other objects are in another collection.
Fortunately with "embedded" there is an easy way using projection:
Stuff.findById(id,{ "documents.name": 1 },function(err,results) {
})
With results like
{ "documents": [{ "name": "this" },{ "name": "that" }] }
Or with .aggregate() and the $map operator:
Stuff.aggregate(
[
{ "$match": { "_id": ObjectID(id) } },
{ "$project": {
"documents": {
"$map": {
"input": "$documents",
"as": "el",
"in": "$$el.name"
}
}
}}
],function(err,results) {
}
)
That will just transform the result into an array of only the name values, which is different from the previous form.
{ "documents": ["this", "that"] }
Note, if using .aggregate() you need to properly cast the ObjectId as the autocasting from mongoose schema types does not work in aggregation pipeline stages.

join within a same collection using mongoose

Pardon my ignorance if it is a very basic question.
ABC collection in MongoDB has the following schema.
{
"metadata": {
"store": 5051,
"catg": "XYZ",
},
"category": {
"name": "XYZ",
"id": "CL141778",
}
}
I need to query (where "metadata.catg" == "category.name")
What is the best way to do it without using mongoose dbref ?

In MongoDB you cannot compare two fields' values with each other using the standard query operators; that takes a $where JavaScript expression or, from MongoDB 3.6 on, $expr. I expect that you need to redesign your schema. You need to look at how you query (and update), as opposed to thinking of MongoDB as yet another RDBMS.
Show us the real schema, some example documents and a few query types that you need to run and I can see if I can help.
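Newer MongoDB versions (3.6 and later) provide the $expr operator, which can compare two fields of the same document. The following is a minimal sketch of such a filter; sameFieldFilter and the model name ABC in the comment are hypothetical, and whether this performs acceptably depends on the data, since such a comparison generally cannot use an index the way a plain equality match can.

```javascript
// Hypothetical helper: build a filter matching documents where two
// fields of the same document hold equal values (requires MongoDB 3.6+).
function sameFieldFilter(leftPath, rightPath) {
  return { $expr: { $eq: ['$' + leftPath, '$' + rightPath] } };
}

const filter = sameFieldFilter('metadata.catg', 'category.name');
// filter is { $expr: { $eq: ['$metadata.catg', '$category.name'] } }

// With a Mongoose model (hypothetical name ABC) it could be passed to find:
//   ABC.find(filter)
```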
