Storing and retrieving JavaScript objects in/from MongoDB

Storing and retrieving JavaScript objects in/from MongoDB - object

I am currently playing around with node.js and MongoDB using the node-mongo-native driver.
I tested a bit around using the Mongo console storing and retrieving JS objects. I figured out, that if I store an object that contains functions/methods the methods and functions will also be stored in the collection. This is interesting since I thought that functions could not be stored in MongoDB (with the exception of the system.js collection, as suggested by the Mongo docs).
Also it will not only store the methods but actually each method and member of the object's entire prototype chain. Besides that I dont like this behaviour and think it's unintuitive I mustn't have it.
I was going to manage users in a Mongo collection. To do this I have a User object containing all of the users methods functioning as a prototype for each instance of an user. The user object itself would only contain the users attributes.
If I store a user in the Mongo collection I only want to store the own properties of the user object. No prototype members and especially no prototype methods. Currently I do not see how to cleanly do this. The options that I figured might work are:
creating a shallow copy using foreach and hasOwnProperty and storing this copy in the collection.
Add a data attribute to each user that contains all the object's attributes and can be stored in the collection.
This just came to my mind writing this: I could also set all the prototypes properties to not enumerable which should prevent them from being stored in the collection.
However, I do have the same issues the other way around: when loading a user from a collection. AFAIK there is no way to change an objects prototype in JavaScript after it was created. And there's also no way to specify a prototype to use when Mongo instantiates objects it retrieved from a collection. So basically I always get objects that inherit from Object using Mongo. As far as I can tell I have 2 options to restore a usable user object from this point on:
Create a fresh object inheriting from User and copying each attribute on the result object to the newly created object. (Compatible to storing mechanisms 1 & 3)
Create a fresh object inheriting from User and storing the result object as a data attribute on the newly created object. (Compatible to storing mechanism 2)
Are my assumptions, especially about the possibility to specify a prototype for query results, correct? What's the right way to do it, and why? I'm surely not the first person struggling to store and resurrect objects in/from MongoDB using node.js.
Currently I would go with the approach 2/2. I don't really like it, but it is the most efficient and the only one that works cleanly with the API. However, I'd much rather hear that actually the API does nothing wrong, but I do for not knowing how to use it correctly. So please, enlighten me :)

I just recently realized, that it actually is possible to change an objects prototype in V8/node. While this is not in the standard it is possible in various browsers and especially in V8/node!
function User(username, email) {
this.username = username;
this.email = email;
}
User.prototype.sendMail = function (subject, text) {
mailer.send(this.email, subject, text);
};
var o = {username: 'LoadeFromMongoDB', email: 'nomail#nomail.no'};
o.__proto__ = User.prototype;
o.sendMail('Hello, MongoDB User!', 'You where loaded from MongoDB, but inherit from User nevertheless! Congratulations!');
This is used all over various modules and plugins - even core modules make use of this technique, allthough it is not ECMAScript standard. So I guess it is safe to use within node.js.

I'm not sure I'm following you question exactly... but fwiw one thing came to mind: Have you checked out the Mongoose ORM? (http://mongoosejs.com/)
It gives you a lot of options when it comes to defining models and methods. In particular "Virtuals" might be of interest (http://mongoosejs.com/docs/virtuals.html).
Anyway, hope it helps some!

Related

global identifiers? - iCloud + Core Data + Ensembles - duplicates when deleting objects

I am trying to implement iCloud sync in my Core Data app. I am not that pro in programming and this is really an advanced topic I learned... I found that Core Data sync Framework "Ensembles" by Drew McCormack. It seems to make iCloud Sync much easier.
I integrated it in my App and syncing does work quite well as long as I add new objects to my Core Data model. But when I delete an object, it creates duplicates. And then duplicates from duplicates. I ended up having the same Entry (object) like 3-4 times...
Why is that? What am I doing wrong? I did some research and my guess is that global identifiers could solve this?
What are global identifiers? My guess is that they help to avoid duplicates!? But how do I set this? I really have no idea, did a lot of research but couldn´t find an answer to that.
Thanks for help!
Update:
Thanks for help! I read the readme and the book, but since i am beginner not everything is clear to me.
I think I understand the use of global identifiers in Ensembles now, but I don´t know if I´m doing it correctly.
If I understand it right, I have to assign an identifier to each object. I can do this by storing it in an attribute. This identifier can be anything as long as it is unique and a NSString?
In my app the user can store different things, let´s say name, text, title, date and so on. The app is based on the Master-Detail-View template in Xcode and uses Core Data. My Core Data model has only a single entity with some attributes, most are strings and a NSDate. No relationships or anything. If the user hits "+" a new object is created and I store the things the user enters in the attributes.
What I did to add global identifiers is to add a new attribute that stores it.
So when a new object is created i do
/// I did find that to use as identifier !?
NSString *taskUniqueStringKey = newManagedObject.objectID.URIRepresentation.absoluteString;
/// and store it in the attribute.
[newManagedObject setValue:taskUniqueStringKey forKey:#"coreDataObjectID"];
Then i use this:
- (NSArray *)persistentStoreEnsemble:(CDEPersistentStoreEnsemble *)ensemble globalIdentifiersForManagedObjects:(NSArray *)objects
{
return [objects valueForKeyPath:#"coreDataObjectID"];;
}
This seems to work for me. But am I doing it right? Is this the right place to assign a global identifier? I have no awakeFromInsert !?
If this is working, I got the next problem. My app is already live and older entries that the user saved before the update will be missing the global identifier. What can I do about that? I thought what I already got and what is unique and the only thing I can think of is an attribute that saves [NSDate date] when the object is created.
I was trying to use this but I failed because Ensembles will only accept NSString and not NSDate!? Can I use this date attribute, is this unique enough and working as gloabl identifier? And if yes, could you please give me code example in how to convert this from date to string?
Syncing with Ensembles works quite good. No duplicates anymore, you can just switch off iCloud and the entries stay and switch it on again and it syncs like it should without loosing locally stored objects or so. Ensembles is really cool! I am seeing some minor strange behaviors like sometimes sync takes long, sometimes it´s really quick and if I edit things in a short time period on two different devices it gets a bit messed up like an object that I just deleted reappears. But I guess that´s normal? If I take some time between using the app on the different devices everything works fine.
Do I understand it right, there is only that one method to call for sync:
- (void)syncWithCompletion:(void(^)(void))completion
{
if (self.ensemble.isMerging) return;
if (!self.ensemble.isLeeched) {
[self.ensemble leechPersistentStoreWithCompletion:^(NSError *error) {
if (error) NSLog(#"Error in leech: %#", error);
if (completion) completion();
}];
}
else {
[self.ensemble mergeWithCompletion:^(NSError *error) {
if (completion) completion();
}];
}
and you just call it if needed? There is nothing else like doing merge without leeching before, or a method like "this is the actual status - save it like it is now" ?
There are different points in the app where you want to sync. On app start and when terminating will be a good point. In my app there are two points where I should sync I guess: when adding an object and save it to Core Data and when I save changes to the object. I could also provide a button like "sync now". Is this a good approach and do I always just call
[self syncWithCompletion:NULL];
Another question that came up. Can I exclude objects from sync with Ensembles? My app loads tutorial entries as objects once on first app start. I don´t want to sync them if that´s possible somehow?
Thanks a lot for your help! If I could help you with anything like localizing in german or so let me know ! ;)

Yes, this is almost certainly due to not setting up global identifiers for your objects, or at least not doing it properly.
When you leech your ensemble, the local persistent store is imported into the sync data. Without global identifiers, Ensembles will assign random ids to your objects, so it can track them across devices.
Duplicates arise when you leech a second device that has the same data. Ensembles has no way to know that the data represents the same logical objects as on the other device, so it again assigns random ids. Effectively, it treats the objects on each device as being completely independent, so that all end up in your data set after syncing.
The solution is global identifiers. By implementing a CDEPersistentStoreEnsemble delegate method, you can provide Ensembles with global ids, which it can use to identify which objects on different devices belong together.
What should you use for global ids? Often, just a UUID, though for singleton like objects you will just want to pick an id.
You can initialize them in awakeFromInsert. You can store the global ids in attributes on your entities. (Note that if you are migrating an existing app, you will want to check with a fetch if the global ids have been generated BEFORE you try to leech the store for syncing.)
More details are in the README on GitHub and in the book at leanpub.
Update
To answer your update questions:
Yes, an identifier just has to be a string, and immutable. It should not change once assigned.
The NSManagedObjectID is not a very good global identifier, in that it will be different on different devices. We really want something that is global across devices.
If you are starting from scratch, using NSUUID is a good approach. Just create a unique id, and store it in the object.
If you have an existing app, and it has been syncing via another mechanism, you need to come up with a way to provide the same global identifiers on each device. One way to do that is mash up the object properties in some way. Usually that will give you a pretty-close-to-unique value, and it will be good enough for the transition.
As an example, you do a quick fetch, and discover that your objects don't yet have global ids. You go through the objects, and set the global ids to a string comprised of creationDate + text. (You could even shorten this by taking a hash, but it probably isn't that important.) After this initial 'migration' to global identifiers, you would just use UUIDs for any newly created objects.
Note that you don't have to use awakeFromInsert. That is simply a convenient place to put it. As long as you assign the global identifier before saving the object you should be fine.
The easiest way to get a string from an NSDate is to call the description method, but another way would be to get a double using timeIntervalSince1970, and turning that into a string. (Be careful with dates as unique identifiers on their own: often objects created together will have the same creation date.)
You are correct about how you should do a sync: you can simply call syncWithCompletion:.
To answer the question about excluding objects: You can't exclude individual objects, mainly because it could become tricky when those objects have relationships to synced objects. You can handle these objects in one of two ways:
Put them in a separate persistent store, and add that store to the same persistent store coordinator.
Sync the objects, but give them global ids manually, so that the objects are treated the same on each device. Eg. You could just give global ids as 'Sample1', 'Sample2', etc.

To integrate Drew's answer, I guess the two steps are the following.
1 Implement CDEPersistentStoreEnsemble delegate method (see README)
- (NSArray *)persistentStoreEnsemble:(CDEPersistentStoreEnsemble *)ensemble
globalIdentifiersForManagedObjects:(NSArray *)objects {
return [objects valueForKeyPath:#"yourUniqueIdentifier"];
}
2 Generate the unique identifier for a NSManagedObject subclass
- (void)awakeFromInsert {
[super awakeFromInsert];
if (!self.yourUniqueIdentifier) {
self.yourUniqueIdentifier = [[NSUUID UUID] UUIDString];
}
}
In awakeFromInsert you can initialize special default property values, like for example an identifier.
The check is necessary, for example, when you have parent-child contexts. Otherwise you are overwriting the identifier previously set. See Why is awakeFromInsert called twice?.

Retrieving a value object without Aggreteroot

I'm developing an application with Domain Drive Design approach. in a special case I have to retrieve the list of value objects of an aggregate and present them. to do that I've created a read only repository like this:
public interface IBlogTagReadOnlyRepository : IReadOnlyRepository<BlogTag, string>
{
IEnumerable<BlogTag> GetAllBlogTagsQuery(string tagName);
}
BlogTag is a value object in Blog aggregate, now it works fine but when I think about this way of handling and the future of the project, my concerns grow! it's not a good idea to create a separate read only repository for every value object included in those cases, is it?
anybody knows a better solution?

You should not keep value objects in their own repository since only aggregate roots belong there. Instead you should review your domain model carefully.
If you need to keep track of value objects spanning multiple aggregates, then maybe they belong to another aggregate (e.g. a tag cloud) that could even serve as sort of a factory for the tags.
This doesn't mean you don't need a BlogTag value object in your Blog aggregate. A value object in one aggregate could be an entity in another or even an aggregate root by itself.
Maybe you should take a look at this question. It addresses a similar problem.

I think you just need a query service as this method serves the user interface, it's just for presentation (reporting), do something like..
public IEnumerable<BlogTagViewModel> GetDistinctListOfBlogTagsForPublishedPosts()
{
var tags = new List<BlogTagViewModel>();
// Go to database and run query
// transform to collection of BlogTagViewModel
return tags;
}
This code would be at the application layer level not the domain layer.
And notice the language I use in the method name, it makes it a bit more explicit and tells people using the query exactly what the method does (if this is your intent - I am guessing a little, but hopefully you get what I mean).
Cheers
Scott

How and where do you define your database structure in Meteor?

I am looking at the documentation for Meteor and it gives a few examples. I'm a bit confused about two things: First, where do you build the db (keeping security in mind)? Do I keep it all in the server/private folder to restrict client-side access? And second, how do I define the structure? For example, the code they show:
Rooms = new Meteor.Collection("rooms");
Messages = new Meteor.Collection("messages");
Parties = new Meteor.Collection("parties");
Rooms.insert({name: "Conference Room A"});
var myRooms = Rooms.find({}).fetch();
Messages.insert({text: "Hello world", room: myRooms[0]._id});
Parties.insert({name: "Super Bowl Party"});
I don't understand how a collection's structure is defined. Are they just able to define a collection and throw arbitrary data into it?

To answer your first question about where to put the new Meteor.Collection statements, they should go in a .js file in a folder accessible by both client and server, such as /collections. (With some exceptions: any collections that are never synced to the client, like server logs, should be defined inside /server somewhere; and any local collections should be defined in client code.)
As for your second question about structure: MongoDB is a document database, which by definition has no structure. Per the docs:
A database holds a set of collections. A collection holds a set of
documents. A document is a set of key-value pairs. Documents have
dynamic schema. Dynamic schema means that documents in the same
collection do not need to have the same set of fields or structure,
and common fields in a collection’s documents may hold different types
of data.
You may also have heard this called NoSQL. Each document (record in SQL parlance) can have different fields. Hence, there's no place where you define initial structure for a collection; each document gets its "structure" defined when it's inserted or updated.
In practice, I like to create a block comment above each new Meteor.Collection statement explaining what I intend the structure to be for most or all documents in that collection, so I have something to refer to later on when I insert or update the collection's documents. But it's up to me in those insert or update functions to follow whatever structure I define for myself.

A good practice would probably be defining your collection on both client and server with a single bit of javascript code. In other words, put the following
MyCollection = new Meteor.Collection("rooms");
// ...
anywhere but neither in the client nor in the server directory. Note that this directive alone does not expose any sensitive data to nobody.
A brand new meteor project would contain by default the insecure and autopublish packages. The former will basically allow any client to alter your database in every possible way, i.e. insert, update and remove documents. The latter will make sure that all database content is published to everyone, no matter how ridiculously this may sound. But fear not! Their only goal is to simplify the development process at the very early stage. You should get rid of these to guys from your project as soon as you start considering security issues of any kind.
As soon as the insecure package is removed from your project you can control the database privileges by defining MyCollection.allow and MyCollection.deny rules. Please check the documentation for more details. The only thing I would like to mention here is that this code should probably be considered as a sensitive one, so I guess you should put it into your server directory.
Removing the autopublish package has effect on the set of data that will be sent to your clients. Again you can control it and define privilages of your choice by implementing a custom Meteor.publish routine. This is all documented here. Here, you have no option. The code can only run in the server environment, so the best choice would be to put it in the server directory.
About your second question. The whole buzz about NoSQL databases (like mongodb) is to put as few restrictions on the structure of your database as possible. In other words, how the collections are structured is only up to you. You don't have to define no models and you can change the structure of your documents (and or remove fields) any time you want. Doesn't it sound great? :)

Referring to session-based data in a Mongoose virtual attribute

I have a Mongoose model that holds Places. Each place has a lat/lng. I can define a Mongoose virtual attribute called distance that would be used to sort Places in ascending order. What's the best way to refer to the user's location information (let's assume it's stored in a session variable for now) from inside the distance virtual attribute?

For anything involving external data, adding a method to the schema would be a better choice than a virtual property.

I'm solving a similar issue. The problem is that methods are fine if you want perform an operation on a single value but I'm retrieving a list and want to inject a new virtual field into every record in the list - but use session data to generate the field. to do this safely (avoiding globals), I think I'll need to use a QueryStream and inject the new field using an ArrayFormatter that takes the session variables as constructor parameters.
This also looks like a job for LINQ so another approach might be to use one of the ports of LINQ to JS.

If you sill prefer to use virtuals, you can store user location info in NodeJs globals. For example this code may be set after user login:
global.user_location = user.location;

Preventing StackOverflowException while serializing Entity Framework object graph into Json

I want to serialize an Entity Framework Self-Tracking Entities full object graph (parent + children in one to many relationships) into Json.
For serializing I use ServiceStack.JsonSerializer.
This is how my database looks like (for simplicity, I dropped all irrelevant fields):
I fetch a full profile graph in this way:
public Profile GetUserProfile(Guid userID)
{
using (var db = new AcmeEntities())
{
return db.Profiles.Include("ProfileImages").Single(p => p.UserId == userId);
}
}
The problem is that attempting to serialize it:
Profile profile = GetUserProfile(userId);
ServiceStack.JsonSerializer.SerializeToString(profile);
produces a StackOverflowException.
I believe that this is because EF provides an infinite model that screws the serializer up. That is, I can techincally call: profile.ProfileImages[0].Profile.ProfileImages[0].Profile ... and so on.
How can I "flatten" my EF object graph or otherwise prevent ServiceStack.JsonSerializer from running into stack overflow situation?
Note: I don't want to project my object into an anonymous type (like these suggestions) because that would introduce a very long and hard-to-maintain fragment of code).

You have conflicting concerns, the EF model is optimized for storing your data model in an RDBMS, and not for serialization - which is what role having separate DTOs would play. Otherwise your clients will be binded to your Database where every change on your data model has the potential to break your existing service clients.
With that said, the right thing to do would be to maintain separate DTOs that you map to which defines the desired shape (aka wireformat) that you want the models to look like from the outside world.
ServiceStack.Common includes built-in mapping functions (i.e. TranslateTo/PopulateFrom) that simplifies mapping entities to DTOs and vice-versa. Here's an example showing this:
https://groups.google.com/d/msg/servicestack/BF-egdVm3M8/0DXLIeDoVJEJ
The alternative is to decorate the fields you want to serialize on your Data Model with [DataContract] / [DataMember] fields. Any properties not attributed with [DataMember] wont be serialized - so you would use this to hide the cyclical references which are causing the StackOverflowException.

For the sake of my fellow StackOverflowers that get into this question, I'll explain what I eventually did:
In the case I described, you have to use the standard .NET serializer (rather than ServiceStack's): System.Web.Script.Serialization.JavaScriptSerializer. The reason is that you can decorate navigation properties you don't want the serializer to handle in a [ScriptIgnore] attribute.
By the way, you can still use ServiceStack.JsonSerializer for deserializing - it's faster than .NET's and you don't have the StackOverflowException issues I asked this question about.
The other problem is how to get the Self-Tracking Entities to decorate relevant navigation properties with [ScriptIgnore].
Explanation: Without [ScriptIgnore], serializing (using .NET Javascript serializer) will also raise an exception, about circular
references (similar to the issue that raises StackOverflowException in
ServiceStack). We need to eliminate the circularity, and this is done
using [ScriptIgnore].
So I edited the .TT file that came with ADO.NET Self-Tracking Entity Generator Template and set it to contain [ScriptIgnore] in relevant places (if someone will want the code diff, write me a comment). Some say that it's a bad practice to edit these "external", not-meant-to-be-edited files, but heck - it solves the problem, and it's the only way that doesn't force me to re-architect my whole application (use POCOs instead of STEs, use DTOs for everything etc.)
#mythz: I don't absolutely agree with your argue about using DTOs - see me comments to your answer. I really appreciate your enormous efforts building ServiceStack (all of the modules!) and making it free to use and open-source. I just encourage you to either respect [ScriptIgnore] attribute in your text serializers or come up with an attribute of yours. Else, even if one actually can use DTOs, they can't add navigation properties from a child object back to a parent one because they'll get a StackOverflowException.
I do mark your answer as "accepted" because after all, it helped me finding my way in this issue.

Be sure to Detach entity from ObjectContext before Serializing it.
I also used Newton JsonSerializer.
JsonConvert.SerializeObject(EntityObject, Formatting.Indented, new JsonSerializerSettings { PreserveReferencesHandling = PreserveReferencesHandling.Objects });

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string