stackmob 1 to-many relationship architecture - core-data

I'm searching for some architectural guidance on a 1-to-many relationship for an iOS app with a backend that runs on StackMob (BaaS).
In my project I want to implement a Facebook-like stream of posts. Users will be allowed to write comments on the posts.
1 post -> n comments
When I create the view (iOS UITableview) of posts I want to display only the number of comments. The actual comments will be displayed in a subview. What is the best way to implement this? How should I query such related objects?
If I first fetch all the posts and then have to query the number of comments for each post, I'm afraid I'll run into performance issues :-(
What would you suggest for querying 1-to-many relations without producing too much HTTP overhead?
I found this post, where I at least learned that fetching on an attribute of a relationship (task where project.user == X) is not supported by StackMob:
NSPredicate traversing relationship (StackMob)
Thanks for your support!
Cheers,
Jan

I'm assuming you've read this tutorial.
CoreData uses a mechanism known as "faulting" to reduce the application's memory usage.
From Apple's docs:
"A fault is a placeholder object that represents a managed object that has not yet been fully realized, or a collection object that represents a relationship"
What this means is that when you read an entity from CoreData, it does not read the entire object graph just to populate that one entity.
In your case, if you read a "Post" entity it won't read all the comments related to that post. This is true even when you're using StackMob as the DataStore.
I made a very small project just to test this and this is what I got. By using the sniffer Charles I was able to see the request/response of the fetch.
Here is the query I used:
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
NSEntityDescription *entity = [NSEntityDescription entityForName:@"Post" inManagedObjectContext:self.managedObjectContext];
[fetchRequest setEntity:entity];
[self.managedObjectContext executeFetchRequest:fetchRequest onSuccess:^(NSArray *results) {
    for (NSManagedObject *managedObject in results) {
        // For an unordered to-many relationship, Core Data returns a set of (faulted) objects.
        NSSet *comments = [managedObject valueForKey:@"comments"];
        NSLog(@"Post comments: %lu", (unsigned long)comments.count);
    }
} onFailure:^(NSError *error) {
    NSLog(@"Error fetching: %@", error);
}];
And here is the response I got (from using Charles)
[
    {
        "post_id": "85FC22A1-92ED-484B-9B52-78CBFE464E5D",
        "text": "Post1",
        "lastmoddate": 1370916620273,
        "createddate": 1370915371016,
        "comments": [
            "5F421C86-F7E7-4BAD-AB21-F056EB7D9451",
            "C1CD96E2-856A-43F0-A71D-09CF14732630",
            "BAD9F652-C181-4AA1-8717-05DD5B2CA54E",
            "AC7D869B-A2BD-4181-8DCF-21C8E127540F",
            "2D1A1C80-9A80-48AE-819D-675851EA17D6"
        ]
    },
    {
        "post_id": "8A84BE1B-6ECE-4C49-B706-7981490C5C2E",
        "text": "Post2",
        "lastmoddate": 1370916678114,
        "createddate": 1370915379347,
        "comments": [
            "3C001706-2D91-4183-9A07-33D77D5AB307",
            "E7DC3A89-2C3D-4510-83E5-BE13A9E626CF",
            "2874B59C-781B-4605-97C7-E4EF7965AF4E"
        ]
    },
    {
        "post_id": "13E60590-E4E7-4012-A3E2-2F0339498D94",
        "text": "Post3",
        "lastmoddate": 1370916750434,
        "createddate": 1370915398649,
        "comments": [
            "AEA3E5E3-E32E-4DAA-A649-6D8DE02FB9B2",
            "DFCBD7E2-9360-4221-99DB-1DE2EE5590CE",
            "484E6832-3873-4655-A6C1-0303A42127D9",
            "B2316F8B-0AAF-451F-A91C-9E8B4657B1A5"
        ]
    }
]
As you can see from the response, each "comments" array contains only the IDs of the comments stored in StackMob, not the comments themselves. That is enough to display the number of comments per post without fetching them.
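The same point can be checked against the payload itself. The following is only a sketch in JavaScript to illustrate the shape of the data; in the app the count comes from the relationship, as in the fetch above. The function name commentCounts is hypothetical, and treating a missing "comments" field as zero comments is an assumption.

```javascript
// Copy of the response captured above, reduced to the fields that matter here.
const posts = [
  {
    post_id: "85FC22A1-92ED-484B-9B52-78CBFE464E5D",
    text: "Post1",
    comments: [
      "5F421C86-F7E7-4BAD-AB21-F056EB7D9451",
      "C1CD96E2-856A-43F0-A71D-09CF14732630",
      "BAD9F652-C181-4AA1-8717-05DD5B2CA54E",
      "AC7D869B-A2BD-4181-8DCF-21C8E127540F",
      "2D1A1C80-9A80-48AE-819D-675851EA17D6",
    ],
  },
  {
    post_id: "8A84BE1B-6ECE-4C49-B706-7981490C5C2E",
    text: "Post2",
    comments: [
      "3C001706-2D91-4183-9A07-33D77D5AB307",
      "E7DC3A89-2C3D-4510-83E5-BE13A9E626CF",
      "2874B59C-781B-4605-97C7-E4EF7965AF4E",
    ],
  },
  {
    post_id: "13E60590-E4E7-4012-A3E2-2F0339498D94",
    text: "Post3",
    comments: [
      "AEA3E5E3-E32E-4DAA-A649-6D8DE02FB9B2",
      "DFCBD7E2-9360-4221-99DB-1DE2EE5590CE",
      "484E6832-3873-4655-A6C1-0303A42127D9",
      "B2316F8B-0AAF-451F-A91C-9E8B4657B1A5",
    ],
  },
];

// Hypothetical helper: derive a per-post comment count from the ID arrays alone.
function commentCounts(postList) {
  return postList.map((post) => ({
    postId: post.post_id,
    text: post.text,
    // Assumption: a post without comments may omit the field, so default to [].
    commentCount: (post.comments || []).length,
  }));
}

console.log(commentCounts(posts).map((p) => `${p.text}: ${p.commentCount}`));
```

No second request per post is needed for the list view; the comment bodies only have to be loaded when the detail view opens.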
Hope this helps!

Related

Data model for nested array of objects in Firestore

I need advice from experienced NoSQL engineers on how I should structure my data.
I want to model my SQL data structure to NoSQL for Google Cloud Firestore.
I have no prior experience with NoSQL databases but I am proficient with traditional SQL.
I use Node.js for writing queries.
So far, I converted three tables to JSON documents with example data:
{
    "session": {
        "userId": 99992222,
        "token": "jwttoken1191891j1kj1khjjk1hjk1kj1",
        "created": "timestamp"
    }
}
{
    "user": {
        "id": 99992222,
        "username": "userName",
        "avatarUrl": "https://url-xxxx.com",
        "lastLogin": "2019-11-23 13:59:48.884549",
        "created": "2019-11-23 13:59:48.884549",
        "modified": "2019-11-23 13:59:48.884549",
        "visits": 1,
        "profile": true,
        "basketDetail": { // I get this data from a third party API
            "response": {
                "product_count": 2,
                "products": [
                    {
                        "product_id": 111,
                        "usageInMinutes_recent": 0,
                        "usageInMinutes": 0,
                        "usageInMinutes_windows": 0,
                        "usageInMinutes_mac": 0,
                        "usageInMinutes_linux": 0
                    },
                    {
                        "product_id": 222, // no recent usage here
                        "usageInMinutes": 0,
                        "usageInMinutes_windows": 0,
                        "usageInMinutes_mac": 0,
                        "usageInMinutes_linux": 0
                    }
                ]
            }
        }
    }
}
{
    "visitor": {
        "id": 999922221,
        "created": "2019-11-23 13:59:48.884549"
    }
}
My questions:
session.userId, user.id, visitor.id can all signify the same user. What is the Firestore equivalent to foreign keys in SQL? How would I connect/join these three collections in a query?
What do I do about the nested object basketDetail? Is it fine where it is or should I define its own collection?
I anticipate queries that will:
occasionally add all the recent usage.
frequently check if a user owns a specific product_id.
frequently replace the whole basketDetail object with new data.
occasionally update one specific product_id.
How would I connect collections user with basketDetail in a query if I separated it?
Thanks for the advice!
session.userId, user.id, visitor.id can all signify the same user. What is the Firestore equivalent to foreign keys in SQL? How would I connect/join these three collections in a query?
Unfortunately, there is no JOIN clause in Firestore. Queries in Firestore are shallow: they can only get documents from the collection that the query is run against. There is no way to get documents from two collections in a single query unless you use a collection group query, which doesn't apply here since the collections in your project have different names.
If you have three collections, then three separate queries are required. There is no way you can achieve that in a single go.
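As a rough sketch of what "three separate queries" looks like in Node.js, the reads can be issued in parallel and combined in application code. Everything named here is hypothetical: fakeDb is an in-memory stand-in for the three collections, getDoc stands in for a real Firestore document read, and keying all three documents by the same user ID is an assumption.

```javascript
// Hypothetical in-memory stand-in for the three collections, keyed by user ID.
const fakeDb = {
  session: { "99992222": { userId: 99992222, token: "example-token" } },
  user: { "99992222": { id: 99992222, username: "userName" } },
  visitor: { "99992222": { id: 99992222, created: "2019-11-23 13:59:48.884549" } },
};

// Hypothetical fetcher: resolves to the document data, or null if it is missing.
// In a real app this is where the Firestore read would go.
async function getDoc(collection, id) {
  const docs = fakeDb[collection] || {};
  return docs[id] || null;
}

// One read per collection, run in parallel, then "joined" in application code.
async function loadUserBundle(userId, fetchDoc = getDoc) {
  const key = String(userId);
  const [session, user, visitor] = await Promise.all([
    fetchDoc("session", key),
    fetchDoc("user", key),
    fetchDoc("visitor", key),
  ]);
  return { session, user, visitor };
}

loadUserBundle(99992222).then((bundle) => console.log(bundle.user.username));
```

Running the reads with Promise.all keeps the latency close to that of the slowest single read, but it is still three reads, not one.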
What do I do about the nested object basketDetail? Is it fine where it is or should I define its own collection?
There are some limits when it comes to how much data you can put into a document. According to the official documentation regarding usage and limits:
Maximum size for a document: 1 MiB (1,048,576 bytes)
As you can see, you are limited to 1 MiB total of data in a single document. So if you think that the nested object basketDetail can stay within this limit, you can use that schema; otherwise, add it to a subcollection. Besides that, all those operations are permitted in Firestore. If you have a hard time implementing them, post another question so we can take a look at it.
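One way to get a feel for how close basketDetail might come to the limit is a rough size check in Node.js. This is only an approximation: Firestore computes document size by its own rules, so the UTF-8 length of the JSON text is a proxy, and the function names and the safety margin below are assumptions for illustration.

```javascript
const MAX_DOCUMENT_BYTES = 1048576; // 1 MiB, the limit quoted above

// Rough proxy only: the UTF-8 byte length of the serialized document.
// Firestore's own size calculation differs, so treat this as an estimate.
function approximateBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// safetyFactor is a hypothetical margin to stay well below the hard limit.
function fitsInOneDocument(doc, safetyFactor = 0.8) {
  return approximateBytes(doc) <= MAX_DOCUMENT_BYTES * safetyFactor;
}

const sampleUser = {
  id: 99992222,
  basketDetail: { response: { product_count: 1, products: [{ product_id: 111 }] } },
};
console.log(approximateBytes(sampleUser), fitsInOneDocument(sampleUser));
```

If the estimate regularly approaches the limit, that is a signal to move the products into a subcollection.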
How would I connect collections user with basketDetail in a query if I separated it?
You cannot connect/join two collections. If you separate basketDetail in a subcollection, then two queries are required.

MongoDB - Which one is more efficient - Nested array on a document or a reference to another collection

I am building an Instagram-like application using MongoDB. There are also albums within the app. Currently, the database schema is as follows:
Albums collection:
[
    {
        albumId,
        title,
        posts: [
            {
                postId,
                title,
                imageUrl,
                comments: [
                    {
                        userId,
                        comment,
                        date
                    }
                ],
                likes: [
                    {
                        userId,
                        date
                    }
                ]
            }
        ]
    }
]
So there are doubly nested arrays in the album documents, which raises the question of whether this is the right way to do it, since a single document might end up with a very large nested array.
Is it better to create a new Collection called Posts and store every post under that collection and reference it to the album by albumId?
Many thanks

How do I retrieve a DocumentDB id when querying with the MongoDB API?

EDIT: we have just heard back from Microsoft that using the mongodb api against a "traditional" documentdb instance isn't supported. I'll leave this question open until someone comes up with a custom solution or Microsoft enhances the product.
We have successfully connected to our existing DocumentDB instance using the mongodb API, but the find() method does not return the id field inside the results.
The examples on the Query Tutorials page show an id field being returned, but they also show an "_id": "ObjectId(\"58f65e1198f3a12c7090e68c\")" field appended with no explanation as to how that 58f6... string is generated (it doesn't appear anywhere in the sample document at the top of the page).
For example, we can find this document using the query find({"type":"attribute", "association.Season":"Spring"}):
{
    "id": "ATTR-TEST-ATT-00007",
    "type": "attribute",
    "name": "Spring Football",
    "association": {
        "Season": "Spring"
    }
}
...but the mongodb API leaves out the id property from the document, so all we see is:
{
    "type": "attribute",
    "name": "Spring Football",
    "association": {
        "Season": "Spring"
    }
}
Also, if we use find({"id": "ATTR-TEST-ATT-00001"}) nothing gets returned even though there is a document with that id in the database.
We have tried using the project argument to force inclusion of the id field, but it didn't work.
If we add an _id field to the document then the mongodb driver returns the _id field without needing a projection, but this is not a solution as we want to use both the DocumentDB SQL API and the mongodb API at the same time.
{
    "id": "ATTR-TEST-ATT-00007",
    "_id": "ATTR-TEST-ATT-00007",
    "type": "attribute",
    "name": "Spring Football",
    "association": {
        "Season": "Spring"
    }
}
How do we get the mongodb API to return the document's id field in queries?

Conditionally update an array in mongoose [duplicate]

I am currently working on a mobile app in which people can post photos and their followers can like them, as on Instagram. I use MongoDB as the database. As on Instagram, a single photo might get a lot of likes, so using one indexed document per "like" does not seem reasonable because it would waste a lot of memory. However, I'd like users to be able to add a like quickly. So my question is: how should I model the "like"? The data model is very similar to Instagram's, but using MongoDB.
No matter how you structure your overall document, there are two things you need: a property for a "count", and a "list" of those who have already posted their "like", to ensure that no duplicates are submitted. Here's a basic structure:
{
    "_id": ObjectId("54bb201aa3a0f26f885be2a3"),
    "photo": "imagename.png",
    "likeCount": 0,
    "likes": []
}
Whatever the case, there is a unique "_id" for your "photo post" and whatever information you want, but then the other fields as mentioned. The "likes" property here is an array, and that is going to hold the unique "_id" values from the "user" objects in your system. So every "user" has their own unique identifier somewhere, either in local storage or OpenId or something, but a unique identifier. I'll stick with ObjectId for the example.
When someone submits a "like" to a post, you want to issue the following update statement:
db.photos.update(
    {
        "_id": ObjectId("54bb201aa3a0f26f885be2a3"),
        "likes": { "$ne": ObjectId("54bb2244a3a0f26f885be2a4") }
    },
    {
        "$inc": { "likeCount": 1 },
        "$push": { "likes": ObjectId("54bb2244a3a0f26f885be2a4") }
    }
)
Now the $inc operation there will increase the value of "likeCount" by the number specified, so increase by 1. The $push operation adds the unique identifier for the user to the array in the document for future reference.
The main thing here is to keep a record of the users who voted, and to look at what happens in the "query" part of the statement. Apart from selecting the document to update by its own unique "_id", the other important thing is to check the "likes" array to make sure the current voting user is not already in there.
The same is true for the reverse case or "removing" the "like":
db.photos.update(
    {
        "_id": ObjectId("54bb201aa3a0f26f885be2a3"),
        "likes": ObjectId("54bb2244a3a0f26f885be2a4")
    },
    {
        "$inc": { "likeCount": -1 },
        "$pull": { "likes": ObjectId("54bb2244a3a0f26f885be2a4") }
    }
)
The important thing here is that the query conditions ensure no document is touched unless all conditions are met. So the count does not increase if the user has already voted, and does not decrease if their vote was no longer present at the time of the update.
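The effect of those guarded updates can be modelled in plain JavaScript. This is an in-memory sketch of the semantics only, not MongoDB driver code; the function names like and unlike and the string user IDs are made up for the illustration.

```javascript
// In-memory model of one photo document; not a MongoDB driver call.
function like(photo, userId) {
  // Mirrors the query condition { likes: { $ne: userId } }: no match, no update.
  if (photo.likes.includes(userId)) return false;
  photo.likeCount += 1; // $inc: { likeCount: 1 }
  photo.likes.push(userId); // $push: { likes: userId }
  return true;
}

function unlike(photo, userId) {
  // Mirrors the query condition { likes: userId }: only matches if the vote exists.
  if (!photo.likes.includes(userId)) return false;
  photo.likeCount -= 1; // $inc: { likeCount: -1 }
  photo.likes = photo.likes.filter((id) => id !== userId); // $pull: { likes: userId }
  return true;
}

const photo = { photo: "imagename.png", likeCount: 0, likes: [] };
like(photo, "user-a"); // applied
like(photo, "user-a"); // ignored, the user is already in the list
console.log(photo.likeCount, photo.likes);
```

In MongoDB the check and the modification happen in one update statement on a single document, which is what makes the real version safe under concurrent requests; the sketch only shows why repeated calls do not skew the count.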
Of course it is not practical to read an array with a couple of hundred entries in a document back in any other part of your application. But MongoDB has a very standard way to handle that as well:
db.photos.find(
    {
        "_id": ObjectId("54bb201aa3a0f26f885be2a3")
    },
    {
        "photo": 1,
        "likeCount": 1,
        "likes": {
            "$elemMatch": { "$eq": ObjectId("54bb2244a3a0f26f885be2a4") }
        }
    }
)
This use of $elemMatch in the projection returns only the current user's entry if it is present; where it is not, the "likes" field is left out of the result. Either way, the rest of your application logic can tell whether the current user has already placed a vote.
That is the basic technique and may work for you as is, but you should be aware that embedded arrays should not be extended indefinitely, and there is also a hard 16MB limit on BSON documents. So the concept is sound, but it cannot be used on its own if you are expecting thousands of "like votes" on your content. There is a concept known as "bucketing", discussed in some detail in this example for Hybrid Schema design, that offers one solution for storing a high volume of "likes". You can use it along with the basic concepts here as a way to do this at volume.

CouchDB: is it possible to access linked documents inside filter function?

In a contact management app, each user will have his own database. When users wish to share certain categories of contacts with others, a backend will initiate a replication. Each contact is its own document, but also has various children documents such as notes and appointments.
Here is an example...
Contact:
{
    "_id": 123,
    "type": "contact",
    "owner": "jimmy",
    "category": "customer",
    "name": "Bob Jones",
    "email": "bob@example.com"
}
Note:
{
    "_id": 456,
    "type": "note",
    "owner": "jimmy",
    "contact_id": 123,
    "timestamp": 1383919278,
    "content": "This is a note about Bob Jones"
}
So let's say Jimmy wants to share only his customers with sales manager Kevin, while his personal contacts remain private. When the note passes through the replication filter, is it possible to access the linked contact's category field?
Or do I have to duplicate the category field in every single child of a contact? I would prefer not to have to do this, as each contact may have many children which I would have to update manually every time the category changes.
Here is some pseudo-code for the filter function:
function(doc, req)
{
    if (doc.type == "contact") {
        if (doc.category == req.query.category) {
            return true;
        }
    }
    else if (doc.contact_id) {
        // Pseudo-code: doc.contact stands for the linked contact document.
        if (doc.contact.category == req.query.category) {
            return true;
        }
    }
    return false;
}
If this is possible, please describe how to do it. Thanks!
There are some other options.
There's a not-so-well-known JOIN trick in CouchDB. Instead of using replication, however, you'll have to share the results of a MapReduce view; unfortunately, you can't use a view as a filter for replication. If you're using Cloudant (disclaimer: I'm employed by Cloudant), you can use chained MapReduce to output the result to another database that you could then replicate from...
Additionally, I think this SO post/answer on document structures and this join trick could be helpful: Modeling relationships on CouchDB between documents?
No, this is not possible. Each document must be self-consistent, so it has no explicit relations with other documents. Using the contact_id value as a reference is just a convention on your side; CouchDB isn't aware of it.
To make this work, you literally need to nest the related data within the contact document, so that the filter function has a single document to process. This is a good solution when you need the contact document to be in a consistent state.
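If nesting everything in one document is not practical, the duplication option raised in the question keeps the filter to a single-document check. A minimal sketch, assuming every child document carries its own copy of the contact's category; the name byCategory is hypothetical, and in a design document the function would be stored as a string under "filters".

```javascript
// Assumes each child (note, appointment) has a copy of its contact's category,
// so the filter never needs to look at any other document.
const byCategory = function (doc, req) {
  return doc.category === req.query.category;
};

const contact = { _id: 123, type: "contact", owner: "jimmy", category: "customer" };
const note = { _id: 456, type: "note", owner: "jimmy", contact_id: 123, category: "customer" };
const privateNote = { _id: 789, type: "note", owner: "jimmy", contact_id: 124, category: "personal" };

const req = { query: { category: "customer" } };
console.log([contact, note, privateNote].filter((doc) => byCategory(doc, req)).map((doc) => doc._id));
```

The cost is the one named in the question: whenever a contact's category changes, every child document has to be updated as well.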
