Can you query the DB inside a validate_doc_update function?


I am using validate_doc_update functions to do basic validation on the object to be stored. This is great to ensure that certain fields are present for example. But is there a way to do a validation based on a query from within the validate_doc_update function? For example, I want people to be able to sign up to bring items to a potluck. So each object will have fields for name, phone, and food (e.g. soda, salad, chips). So my validation function will check for each of those fields. No problem. But I also want to make sure that no more than two people sign up for the same food (not just a basic unique constraint). So if a new object with food value "chips" is being validated, and there are already 2 objects with food value "chips" in the DB, the validation should fail. Is there a way to do this with validation docs?

There is no facility to run a query in validate_doc_update.
One way to solve this issue is to decouple food items from user documents; instead have a document that represents the potluck:
{
_id: "potluck",
chips: {
needed: 2,
providers: ["user_id_1"]
},
soda: {
needed: 5,
providers: ["user_id_2","user_id_3"]
}
}
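Here it is quite easy to validate sign-ups: validate_doc_update receives the new version of the document, so it can check each item without running a query. The document also carries a lot of information. For example, the number of items still needed for any food is always needed - providers.length, and the user ids link each food item to the users who signed up to provide it.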
It would be easy to generate a potluck report using a view or two with this approach.
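As a rough sketch of the validation described above, assuming the potluck document has exactly the shape shown (each food key holding needed and providers), a check could look like the following. The name validatePotluck and the error messages are made up for illustration; in a design document the function would be stored anonymously under validate_doc_update.

```javascript
// Sketch only: rejects a potluck document in which any food has more
// providers than needed. CouchDB passes the new and the old revision.
function validatePotluck(newDoc, oldDoc, userCtx) {
  if (newDoc._deleted || newDoc._id !== "potluck") {
    return; // other documents are not checked here
  }
  for (var food in newDoc) {
    // Skip inherited keys and reserved fields such as _id and _rev.
    if (!Object.prototype.hasOwnProperty.call(newDoc, food) || food.charAt(0) === "_") {
      continue;
    }
    var item = newDoc[food];
    if (!item || typeof item.needed !== "number" || !Array.isArray(item.providers)) {
      throw { forbidden: food + " must have a numeric 'needed' and a 'providers' array" };
    }
    if (item.providers.length > item.needed) {
      throw { forbidden: "Too many sign-ups for " + food };
    }
  }
}
```

Throwing an object with a forbidden property is how a validation function tells CouchDB to reject the write.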

Related

Cloudant/Couchdb Architecture

I'm building an address-book app that uses a back-end Cloudant database. The database stores 3 types of documents:
-> User Profile document
-> Group document
-> User-to-Group Link document
As the names suggest, my database contains users, groups of users (like WhatsApp), and one link document per user per group (the link document also stores that user's settings/privileges in that group).
On login, my client-side app queries Cloudant for the user document and for each group document, using view collation over that user's link documents.
Then, using the groups identified above, I find all the other users of each group.
Now, the challenge is that I need to monitor any changes to the group and user documents. I am using PouchDB on the app side and can invoke the 'changes' API against the ids of all the group and user documents. But the scale could be around 500 users in each group, with a logged-in user being part of 10-50 groups. Multiplied across thousands of users, that will become a nightmare for the back end to support.
Is my scalability concern warranted? Or is this normal for Cloudant?

If I understand your schema correctly, you have documents of this form:
{
_id: "user:glynn",
type: "user",
name: "Glynn Bird"
}
{
_id: "group:Developers",
type: "group",
name: "Software Developers"
}
{
_id: "user:glynn:developers"
}
In the above example, the sort order of the primary keys allows a user and all of its memberships to be retrieved by passing startkey and endkey parameters to the database's _all_docs endpoint.
This is "scalable" in the sense that it is efficient for Cloudant to retrieve data from a primary or secondary index: the index is held in a b-tree, so data with adjacent keys is stored next to each other. A limit parameter can be used to paginate through larger data sets.
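To make the range read concrete, here is a minimal sketch that builds the query string for such an _all_docs request. The helper name allDocsRange is hypothetical, and using "\ufff0" as the upper bound is a common convention for "every id that begins with this prefix", not something stated in the answer above.

```javascript
// Sketch only: build the query string for an _all_docs prefix range.
function allDocsRange(prefix, limit) {
  var params = {
    // Keys are JSON-encoded, so string keys carry their double quotes.
    startkey: JSON.stringify(prefix),
    // "\ufff0" is a very high code point, used here as the end of the prefix range.
    endkey: JSON.stringify(prefix + "\ufff0"),
    include_docs: "true",
    limit: String(limit)
  };
  return Object.keys(params).map(function (name) {
    return name + "=" + encodeURIComponent(params[name]);
  }).join("&");
}

// Example: the user document and its membership documents, 50 rows per page.
var query = allDocsRange("user:glynn", 50);
```

The resulting string would be appended to the database's _all_docs URL after a "?".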

Yes, the documents are more or less as you've specified.
Link documents are as follows:
{
"_id": <AutoGeneratedID>,
"type": "link",
"user": user_id,
"group": group_id
}
I've written the following view map function:
function (doc) {
// Only link documents are indexed; note doc.type, not a bare "type".
if (doc.type == "link") {
emit(doc.user, {"_id": doc.user});
emit([doc.user, doc.group], {"_id": doc.group});
emit([doc.group, doc.user], {"_id": doc.user});
}
}
Using the three keys emitted above with include_docs=true, the first lets me get my logged-in user document, the second lets me get all group documents for my logged-in user (using start and end keys), and the third lets me get all other user documents for a group (again using start and end keys).
Fetching the documents is done, but now I need to monitor changes to the users of each group. For this, don't I need to query the changes API with an array of user ids? Is there any other way?
"Cloudant retrieve data from a primary or secondary index because the index is held in a b-tree so data with adjacent keys is store next to each other"
Sorry, I did not understand this statement.
Thanks.

Part 1.
I recommend getting rid of the "link" type here: it is a good fit for the SQL world, but not for CouchDB.
Instead, it is better to take advantage of document storage, i.e. store a user's groups in a "Groups" property on the "User" document, and a group's users in a "Users" property on the "Group" document.
With this approach you can set up filtered replication to process only the changes of specific groups, and these changes will already contain all the users of the group.
Note that I am assuming the number of groups per user and the number of groups are reasonable (hundreds at most) and do not change frequently.
Part 2.
You can just store ids in these properties and then use views to "join" the other data. Alternatively, I was also thinking about another approach (for my use case, but yours is similar):
1) A group contains only the ids of its users, so no views are needed.
2) You create a view of each user's contacts, i.e. for each user, all the users with whom they share a group.
3) Replicate this view to the client app.
When the user opens a group, values such as names and pics of contacts are taken from this local "dictionary".
This approach can save some traffic.
Please let me know what you think, because right now I'm working on designing the architecture of my solution. Thank you!

MongoDb and Storing Relationships Between Objects

I am currently planning the development of an application using Node, and I am stuck on whether or not I should use MongoDB as a database. Ideally I would like to use it. I understand how it works in general, but what I don't understand is how to reference other objects within a document model.
For example, let's say I have two objects; a User and an Order object.
{
Order : {
Id: 1,
Amount: 23.95
}
}
{
User: {
Id: 1,
Orders: [ ]
}
}
Essentially, a User will place an order, and upon creation of that Order object, I would like for the User object to update the Orders array appropriately.
First of all, I hear a lot about MongoDB lacking relational functionality. So would I be able to store a reference to that order in the Orders array, perhaps by ID? Or should I just store a duplicate of the order object in the array?

If I were you, I would add a field named userId to Order to keep a reference to the user who created the order, because the relation between User and Order is one-to-many: a User may have many Orders, but an Order has only one User.
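A small illustration of that layout, using plain in-memory arrays rather than a real database. The sample documents and the helper ordersForUser are hypothetical; against MongoDB the same lookup would be a find on the orders collection filtered by userId.

```javascript
// Sketch only: each order references its user through userId.
var orders = [
  { _id: 10, userId: 1, amount: 23.95 },
  { _id: 11, userId: 1, amount: 5.0 },
  { _id: 12, userId: 2, amount: 8.5 }
];

// In-memory stand-in for a query that filters orders by userId.
function ordersForUser(allOrders, userId) {
  return allOrders.filter(function (order) {
    return order.userId === userId;
  });
}

var ordersOfUserOne = ordersForUser(orders, 1);
```

With this shape the User document does not need an Orders array at all, so creating an order touches only one document.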

DDD/CQRS: Combining read models for UI requirements

Let's use the classic example of blog context. In our domain we have the following scenarios: Users can write Posts. Posts must be cataloged at least in one Category. Posts can be described using Tags. Users can comment on Posts.
The four entities (Post, Category, Tag, Comment) are implemented as separate aggregates because I have not found any rule by which one entity's data should affect another. So, for each aggregate I will have one repository that represents it. Also, each aggregate references the others by their ids.
Following CQRS, from this scenario I have deduced typical use cases that result in commands such as WriteNewPostCommand, PublishPostCommand, DeletePostCommand, etc., along with their respective queries to get data from the repositories: FindPostByIdQuery, FindTagByTagNameQuery, FindPostsByAuthorIdQuery, etc.
Depending on which side of the app we are on (backend or frontend), the queries will be more or less complex. So, on the front page we may need to build some widgets that show the last comments, the latest post of a category, etc. These are queries that involve a simple Query object (few search criteria) and a very simple QueryHandler (a single repository as a dependency of the handler class).
But in other places these queries can be more complex. In an admin panel we need to show, in a table, results that satisfy complex search criteria. It might be useful to search posts by author name (not id), category names, tag names, publish date... criteria that belong to different aggregates and different repositories.
In addition, in our table of posts we don't want to show the post along with an author ID or category IDs. We need to show all the information (user name, avatar, category name, category icon, etc.).
My questions are:
At the infrastructure layer, when we design repositories, should the search methods (findAll, findById, findByCriterias...) return the corresponding entity with references to the ids of all its associations? I mean, if I have a method findPostById(uuid) or findPostByCustomFilter(filter), should it return a post instance with references to all its category ids, all its tag ids, and its author id? Or should my repo have some kind of method that populates a given post instance with the associations I want?
If I want to search posts created from 12/12/2014, written by John, categorised in the "News" and "Videos" categories and tagged "sci-fi" and "adventure", and get the full details of each aggregate, how should I create my Query and QueryHandler?
a) Create a Query with all my parameters (authorName, categoriesNames, TagsNames, and whether I want to retrieve the User, Category and Tag associations in full detail) and then have its QueryHandler assemble the different read models into a single one. Or...
b) Create different Queries (FindCategoryByName, FindTagByName, FindUserByName), have my web controller call them, and then call FindPostQuery, passing it the authorid, categoryid and tagid returned from the other queries?
Solution b) appears cleaner, but it seems more expensive to me.

On the query side, there are no entities. You are free to populate your read models in any way suits your requirements best. Whatever data you need to display on (a part of) the screen, you put it in the read model. It's not the command side repositories that return these read models but specialized query side data access objects.
You mentioned "complex search criteria". I recommend you model it with a corresponding SearchCriteria object. This object would be technology-agnostic, but it would be passed to your query-side data access object, which would know how to combine the criteria to build a lower-level query for the specific data store it targets.
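One way to picture this, as a sketch only: the criteria object is plain data, and a query-side function turns it into a lookup against whatever store holds the read model. Here the "store" is just an array of denormalized rows; makePostSearchCriteria, searchPosts and the row fields are all hypothetical names chosen for the example.

```javascript
// Sketch only: a technology-agnostic criteria object (plain data).
function makePostSearchCriteria(options) {
  return {
    authorName: options.authorName || null,
    categoryNames: options.categoryNames || [],
    tagNames: options.tagNames || [],
    publishedFrom: options.publishedFrom || null // ISO date string, e.g. "2014-12-12"
  };
}

// Query-side data access for an in-memory read model. A real implementation
// would translate the same criteria into SQL, a view query, and so on.
function searchPosts(readModelRows, criteria) {
  function hasAll(wanted, actual) {
    return wanted.every(function (value) { return actual.indexOf(value) !== -1; });
  }
  return readModelRows.filter(function (row) {
    return (!criteria.authorName || row.authorName === criteria.authorName) &&
      hasAll(criteria.categoryNames, row.categoryNames) &&
      hasAll(criteria.tagNames, row.tagNames) &&
      (!criteria.publishedFrom || row.publishedAt >= criteria.publishedFrom);
  });
}
```

Because the rows already contain author, category and tag names, the UI can display the result without touching the command-side repositories.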

With simple applications like this, it's easier not to get distracted by aggregates. Do event sourcing, and have one set of tables subscribe to the events so that it is easy to query the way you want.
In other words, it sounds like your main goal is to be able to query easily for the scenarios you describe. Start with that end goal. Now write your event handler to adjust your tables accordingly.
Start with events and the UI. Then everything else will fit easily. Google "Event Modeling", as it will help you formulate ideas around what and how you want to build this style of application.

I can see three problems in your approach and they need to be solved separately:
1. In CQRS the Queries are completely separate from the Commands. So, don't try to solve your queries with your Commands pipelines repositories. The point of CQRS is precisely to allow you to solve the commands and queries in very different ways, as they have very different requirements.
2. You mention DDD in the question title, but you don't mention your Bounded Contexts in the question itself. If you follow DDD, you'll most likely have more than one BC. For example, in your question, it could be that CategoryName and AuthorName belong to two different BCs, which are also different from the BC where the blog posts are. If that is the case and each BC properly owns its own data, the data that you want to search by and show in the UI will be stored potentially in different databases, therefore implementing a query in the DB with a join might not even be possible.
3. Searching and Reading data are two different concerns and can/should be solved differently. When you search, you get some search criteria (including sorting and paging) and the result is basically a list of IDs (authorIds, postIds, commentIds). When you Read data, you get one or more Ids and the result is one or more DTOs with all the required data properties. It is normal that you need to read data from multiple BCs to populate a single page, that's called UI composition.
So if we agree on these 3 points and especially focussing on point 3, I would suggest the following:
Figure out all the searches that you want to do and see if you can decompose them to simple searches by BC. For example, search blog posts by author name is a problem, because the author information could be in a different BC than the blog posts. So, why not implement a SearchAuthorByName in the Authors BC and then a SearchPostsByAuthorId in the Posts BC. You can do this from the Client itself or from the API. Doing it in the client gives the client a lot of flexibility because there are many ways a client can get an authorId (from a MyFavourites list, from a paginated list or from a search by name) and then get the posts by authorId is a separate operation. You can do the same by tags, categories and other things. The Post will have Ids, but not the extra details about those IDs.
Potentially, you might want more complicated searches. As long as the search criteria (including sorting fields) contain fields from a single BC, you can easily create a read model and execute the search there. Note that this is only for the search criteria. If the search result needs data from multiple BCs you can solve it with UI composition. But if the search criteria contain fields from multiple BCs, then you'll need some sort of Search engine capable of indexing data coming from multiple sources. This is especially evident if you want to do full-text search, search by categories, tags, etc. with large quantities of data. You will need to use some specialized service like Elastic Search and it won't belong to any of your existing BCs, it'll be like a supporting service.
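The split between searching (criteria in, ids out) and reading (ids in, DTOs out) can be sketched with two tiny in-memory "bounded contexts". Everything here is illustrative: the arrays stand in for separate services or databases, and the function names simply follow the SearchAuthorByName / SearchPostsByAuthorId idea from the suggestion above.

```javascript
// Sketch only: one in-memory store per bounded context.
var authorsBC = [{ id: "a1", name: "John" }, { id: "a2", name: "Mary" }];
var postsBC = [
  { id: "p1", authorId: "a1", title: "First" },
  { id: "p2", authorId: "a2", title: "Second" },
  { id: "p3", authorId: "a1", title: "Third" }
];

// Searches return ids only.
function searchAuthorByName(name) {
  return authorsBC.filter(function (a) { return a.name === name; })
    .map(function (a) { return a.id; });
}
function searchPostsByAuthorId(authorId) {
  return postsBC.filter(function (p) { return p.authorId === authorId; })
    .map(function (p) { return p.id; });
}

// Reads take ids and return the stored records as DTOs.
function readById(store, ids) {
  return store.filter(function (item) { return ids.indexOf(item.id) !== -1; });
}

// UI composition: the caller combines results from both contexts.
function postsPageForAuthorName(name) {
  var rows = [];
  searchAuthorByName(name).forEach(function (authorId) {
    var author = readById(authorsBC, [authorId])[0];
    readById(postsBC, searchPostsByAuthorId(authorId)).forEach(function (post) {
      rows.push({ title: post.title, authorName: author.name });
    });
  });
  return rows;
}
```

The post records only ever hold authorId; the author's name is attached at composition time, which is why neither context needs a join into the other.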

With CQRS you will have separate stacks for Queries and Commands. Your query stack should be a different module, namespace, DLL or package in your project.
a) You will create one QueryModel, and this query model will return whatever you need. If you are familiar with Entity Framework or NHibernate, you will create a façade to hold these queries together (DbContext or Session).
b) You can create these separate queries, but again, if you are familiar with any ORM, you should return the set that represents the model: return every set as IQueryable and use LET (LINQ Expression Trees) to make your query stack more dynamic.
Using Entity Framework and C#, for example:
public class QueryModelDatabase : DbContext, IQueryModelDatabase
{
public QueryModelDatabase() : base("dbname")
{
_products = base.Set<Product>();
_orders = base.Set<Order>();
}
private readonly DbSet<Order> _orders = null;
private readonly DbSet<Product> _products = null;
public IQueryable<Order> Orders
{
get { return this._orders.Include("Items").Include("Items.Product"); }
}
public IQueryable<Product> Products
{
get { return _products; }
}
}
Then you should do queries the way you need and return anything:
using (var db = new QueryModelDatabase())
{
var queryable = from o in db.Orders.Include(p => p.Items).Include("Details.Product")
where o.OrderId == orderId
select new OrderFoundViewModel
{
Id = o.OrderId,
State = o.State.ToString(),
Total = o.Total,
OrderDate = o.Date,
Details = o.Items
};
try
{
var o = queryable.First();
return o;
}
catch (InvalidOperationException)
{
return new OrderFoundViewModel();
}
}

Function for listing user parameters

I want to add a form to my application for generating rules based on the attributes of Liferay Users.
Do you know a function for getting a list of these attributes? (A list of parameter names.)
Example:
1. Address,
2. FullName,
3. AccountId,
4. Create Date,
5. Employee Number,
6. And so on.....
Do you know a function for getting the type of each parameter? (To check for type errors.)
Example:
1. Address -> String
2. FullName -> String
Thank you,
Oriol

AFAIK, there is no such method. But even if it existed, it would not provide all the info you're after, because some of it is not an attribute of the User class or of the corresponding 'user_' table in the LF database.
If you understand how the ServiceBuilder model works, you'll see that there's a complex model running under the hood, and it does not work like plain attributes.
For example, there is no 'user.getAddress()', because Address is a complex class, subclassing Contact, that keeps a FK to the User. If you want one of his addresses, you can only get all of his addresses (User.getAddresses()), iterate through them, check the contact type, and e.g. get his "business address". Likewise, you can't call 'user.setAddress(String)', or even 'user.addAddress(Address)'. Working code would look much more like:
//update an existing Address
existingAddr.setStreet1(street);
existingAddr.setZip(zip);
existingAddr.setCity(city);
AddressLocalServiceUtil.updateAddress(existingAddr);
//then update the user, to store the changes.
UserLocalServiceUtil.updateUser(user);
The same goes for the birthday, the Phones, websites and facebook urls etc
For the rest of the 'Attributes' (names and Types), you should look here

You can get a User object by calling:
User u = userService.getUserById(0);
or check liferay docs for UserService
then you can use getters like:
u.getAddresses();
u.getBirthday();
u.getFullName();

you can get it from:
User user = UserLocalServiceUtil.getUser(userId);
user.getFullName();
user.getEmailAddress();

Mongoose: Only return one embedded document from array of embedded documents

I've got a model which contains an array of embedded documents. These embedded documents keep track of the points the user has earned in a given activity. Since a user can be part of several activities or just one, it makes sense to keep these activities in an array. Now, I want to extract the hall of fame, the top ten users for a given activity. Currently I'm doing it like this:
userModel.find({ "stats.activity": "soccer" }, ["stats", "email"])
.desc("stats.points")
.limit(10)
.run (err, users) ->
(if you are wondering about the syntax, it's CoffeeScript)
where "stats" is the array of embedded documents/activities.
Now this actually works, but currently I'm only testing with accounts that have only one activity. I assume that something will go wrong (sorting-wise) once a user has more activities. Is there any way I can tell Mongoose to only return the embedded document where "activity" == "soccer" alongside the top-level document?
Btw, I realize I can do this another way, by having stats in its own collection with a db-ref to the relevant user, but I'm wondering if it's possible to do it like this before I consider any rewrites.
Thanks!

You are correct that this won't work once you have multiple activities in your array.
Specifically, since you can't return just an arbitrary subset of an array with the element, you'll get back all of it and the sort will apply across all points, not just the ones "paired" with "activity":"soccer".
There is a pretty simple tweak that you could make to your schema to get around this though. Don't store the activity name as a value, use it as the key.
{ _id: userId,
email: email,
stats: [
{soccer : points},
{rugby: points},
{dance: points}
]
}
Now you will be able to query and sort like so:
users.find({"stats.soccer":{$gt:0}}).sort({"stats.soccer":-1})
Note that when you move to version 2.2 (currently only available as the unstable development version 2.1), you will be able to use the aggregation framework to get the exact results you want (only the particular subset of an array or subdocument that matches your query) without changing your schema.
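For illustration, a pipeline along these lines could produce the hall of fame against the original schema, assuming each element of stats looks like { activity: ..., points: ... } as in the question. The function name hallOfFamePipeline is hypothetical; the stages themselves ($match, $unwind, $sort, $limit, $project) are standard aggregation operators. The resulting array would be handed to the collection's aggregate method.

```javascript
// Sketch only: build an aggregation pipeline for the top users of one activity.
function hallOfFamePipeline(activity, limit) {
  return [
    // Keep only users who have the activity at all (cheap early filter).
    { $match: { "stats.activity": activity } },
    // Produce one document per element of the stats array.
    { $unwind: "$stats" },
    // Keep only the unwound element for the requested activity.
    { $match: { "stats.activity": activity } },
    // Highest points first, then cut to the requested size.
    { $sort: { "stats.points": -1 } },
    { $limit: limit },
    // Return just the email and the single matching stats entry.
    { $project: { email: 1, stats: 1 } }
  ];
}

var soccerTopTen = hallOfFamePipeline("soccer", 10);
```

After $unwind, stats is a single embedded document rather than an array, so the sort applies only to the points paired with the requested activity.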
