django-registration user activation conflicts with "deleted" inactive users - django-registration

Recently I've been looking into options for user registration with an activation step and came across the django-registration project.
The problem I noticed while walking through the code is the following, and I don't know the "cheapest" way to solve it:
There is an ambiguity between "deleted" users (marked inactive, as a number of sources suggest), inactive users who just haven't activated their account yet (but whose activation has not expired), and users whose activation has expired.
All of these users have is_active=False in the auth_user table.
To distinguish between "valid" and "invalid" users, any user query also has to go to the user's profile and check a couple of things. The following are valid users:
user.is_active=False AND user.get_profile().activated != ACTIVATED AND not_expired_yet
user.is_active=True
This basically impacts all the query operations on users objects if one wants to consider only valid users.
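For illustration, the rule above can be written as a plain-Python predicate. This is only a sketch of the three states, not django-registration's actual code: the Account class, its fields and the 7-day ACTIVATION_WINDOW are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical stand-ins for the real models; names and values are illustrative only.
ACTIVATION_WINDOW = timedelta(days=7)  # assumed activation period


@dataclass
class Account:
    is_active: bool        # mirrors auth_user.is_active
    activated: bool        # True once the activation link has been used
    date_joined: datetime  # when the registration was created


def is_valid(account: Account, now: datetime) -> bool:
    """Return True for active users and for pending, unexpired registrations."""
    if account.is_active:
        return True
    not_expired_yet = now < account.date_joined + ACTIVATION_WINDOW
    # Inactive but already activated means the user was "deleted" (deactivated).
    return (not account.activated) and not_expired_yet
```

The second branch is the part that is_active alone cannot express, which is why every query on valid users ends up needing the profile data.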
The reason I liked the profile extension over inheritance from User model (that creates SQL JOIN queries) was to avoid extra JOIN queries wherever I don't necessarily need to access user's profile fields.
So, to make a long story short, two questions:
What would you suggest as a django-registration-like approach that resolves the ambiguity? QuerySet.select_related is the most I could think of, and that basically ends up as a JOIN for every query on the user table.
How can I nicely wrap the direct calls to django.contrib.auth.models.User.objects methods like filter, get and exclude so that they take only valid users into account? The best I could think of was to go through a RegistrationProfile method even for user queries...
Any feedback appreciated

Related

Command across multiple aggregates with CQRS and ES

I'm having an odd case while thinking about a solution for my problem.
A quick recap: I'm using an event store with CQRS, and I have 2 aggregates called 'Group' and 'User'.
Basically a User defines some characteristics like his region, age, and a couple of interests.
He then can choose to 'match' with a Group that is in the same region, around the same age and same interests.
Now here's the case: the 'matchmaking' part should happen completely on the backend, it can be a long running process, but for the client it's just 1 call to the endpoint and the end result should be him matching with a group.
So for this case, I have to query the groups which have the same region and the same age slice; the interests don't really matter in my query. I now have a list of groups, and the matchmaker is going to give each group a rating based on the common interests between the group and the user. The group with the best rating will be joined.
So again, using CQRS and ES, and my problem is that this case seems a mix between queries and a command, and mixing queries into a match command seems to go against the purpose of CQRS.
Querying multiple groups and filtering them against my write side, the event store, also is a bad idea as the aggregates have to be rebuilt and loaded in memory before being able to filter them out.
So I'm kind of stuck here. Something tells me that a long-running process / saga could be the answer to my problem, but I don't see how I would avoid mixing queries and commands in my saga, as a saga is basically a chain of commands/events.
How do I tackle this specific case? No real code is needed, a conceptual solution to get me going is perfect.
Hi this is actually a case where CQRS can shine.
Creating a dedicated matching model seems to be ideal for this case to allow answering what might be a rather non-trivial query in other forms.
So,
1) Create a dedicated (possibly ephemeral, possibly checkpointed/persisted) query model as a derived store.
2) Upon request, run a query to get the top matches.
3) Based on the results of the query, send a command to update the event store with the new links.
The query model will not need to manage commands and could be updated on a push basis from the event store. This keeps it rather simple to build and keep up to date, and it can be further optimized to hold only the data needed for this particular query.
An in-memory graph might do well.
-Chris
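To make the query step concrete, here is a minimal sketch of such a matching model in Python. Everything in it is an assumption for illustration: GroupView, best_match and the "count of shared interests" rating are hypothetical, and a real projection would be fed by events from the store rather than built by hand.

```python
from dataclasses import dataclass, field


@dataclass
class GroupView:
    """Read-side projection of a Group, kept current from published events."""
    group_id: str
    region: str
    age_slice: str
    interests: set = field(default_factory=set)


def best_match(groups, region, age_slice, interests):
    """Query step: return the id of the highest-rated candidate, or None."""
    candidates = [g for g in groups
                  if g.region == region and g.age_slice == age_slice]
    if not candidates:
        return None
    # Rating is the number of shared interests (an assumed scoring rule).
    best = max(candidates, key=lambda g: len(g.interests & interests))
    return best.group_id
```

The returned group id is then the only thing the command side needs; no aggregate has to be rebuilt just to be filtered out.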
p.s.
On the command side: the commands here would each only update a single aggregate instance.
Further using the write ahead pattern would allow for not needing any sort of process manager or "saga."
e.g.
For each new membership 1 command to add the new membership to the user stream, then 1 command to the group to add the new member information. Then a simple audit process can scan for incomplete membership assignments both on start up/recovery and as a periodic data quality check.
-Chris
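The audit process mentioned in the p.s. could look roughly like the sketch below. It assumes the membership data has already been read into two plain dicts; the function names and the AddMember command shape are hypothetical.

```python
def find_incomplete_memberships(user_memberships, group_members):
    """Return (user_id, group_id) pairs recorded on the user but not on the group.

    user_memberships: dict mapping user_id -> set of group_ids (from user streams)
    group_members:    dict mapping group_id -> set of user_ids (from group streams)
    """
    missing = []
    for user_id, group_ids in user_memberships.items():
        for group_id in group_ids:
            if user_id not in group_members.get(group_id, set()):
                missing.append((user_id, group_id))
    return sorted(missing)


def repair_commands(missing):
    """Build the second-step command for every incomplete assignment."""
    return [{"type": "AddMember", "group_id": g, "user_id": u} for u, g in missing]
```

Running this on startup and periodically re-sends only the second command of the pair, which is what lets the write-ahead approach get by without a process manager.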

MongoDB, how to manage user related records

I'm currently trying to learn Node.js and MongoDB by building the server side of a web application that manages insurance documents for an insurance agent.
So let's say I'm the user: I sign in, then I start adding my customers and their insurances.
So I have two related collections, Customers and Insurances.
I have one more collection to store the users login data, let's call it Users.
I don't want the new users to see and modify the customers and the insurances of other users.
How can I "divide" every user related record, so that each user can work only with his data?
I figured out that I could add to every record the _id of the user who created it.
For example, I log in as myself and get my ID "001"; I could then add a field with this value to every customer and insurance.
That way I could filter every query by this ID.
Would it be a good idea? In my opinion this filtering is a waste of processing power for mongoDB.
If someone has any idea of a solution, or even a link to an article about it, it would be helpful.
Thank you.
This is more a general permissions problem than just a MongoDB question. Also, without knowing more about your schemas it's hard to give specific advice.
However, here are some approaches:
1) Embed sub-documents
Since MongoDB is a document store allowing you to store arbitrary JSON-like objects, you could simply store the customers and licenses wholly inside each user object. That way querying for a user would return their customers and licenses as well.
2) Denormalise
Common practice for NoSQL databases is to denormalise related data (ie. duplicate the data). This might include embedding a sub-document that is a partial representation of your customers/licenses/whatever inside your user document. This has the similar benefit to the above solution in that it eliminates additional queries for sub-documents. It also has the same drawbacks of requiring more care to be taken for preserving data integrity.
3) Reference with foreign key
This is a more traditionally relational approach, and is basically what you're suggesting in your question. Depending on whether you want the reference to be bi-directional (both documents reference each other) or uni-directional (one document references the other), you can either store the user's ID as a foreign user_id field, or store an array of customer_ids and insurance_ids in the user document. In relational parlance this is sometimes described as "has many" or "belongs to" (the user has many customers, the customer belongs to a user).
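As a rough sketch of approach 3, written in Python rather than Node.js to keep it short: a small helper can force the owner's ID into every query filter. The helper name scoped and the field name user_id are assumptions for the example.

```python
def scoped(user_id, criteria=None):
    """Return a query filter that always includes the owner's id."""
    query = dict(criteria or {})
    query["user_id"] = user_id  # set last so callers cannot override the owner
    return query
```

With a driver such as pymongo the result would be passed straight to a find call, for example db.customers.find(scoped("001", {"city": "Rome"})), where db and the collection name are placeholders. An index on user_id should keep this filter cheap, so it is unlikely to be the waste of processing power the question worries about.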

Mapping DialogFlow Extracted Parameters to Database column values

What is the suggested solution for mapping parameters extracted from an intent's training phrases to a user-defined value in a database that is specific to that user?
A practical example I think would be a shopping list app.
Through a Web UI, the user adds catchup to a shopping list which is stored in the database as item.
Then, through the agent (i.e. Google Assistant), the utterance results in ketchup being extracted as the item parameter. I wouldn't have a way to know how to map the parameter extracted from the utterance to the user-defined value in the database.
So just to be clear
// in the database added by the user from a web UI
"catchup"
// extracted from voice utterance
"ketchup"
How should I accomplish making sure that the extracted parameters can be matched up to the free form values they have added to the list?
Also, I am inexperienced in this area and have looked through the docs quite a bit and may just be missing this. Wasn't sure if Developer entities, or Session Entities was the solution for this or not.
Either Developer or Session Entities may be useful here. It depends.
If you can enumerate all the possible things that a user can say, and possibly create aliases for some of them, then you should use a Developer Entity. This is easiest and works the best - the ML system has a better chance of matching words when they are pre-defined as part of the training model.
If you can't do that, and it's OK that you only match things that they have already added to a database, then a Session Entity will work well. This really is best for things that you already have about the user, or which may change dramatically based on context.
You may even wish to offer a combination - define as many entities as you can (to get the most common replies), allow free-form replies, and incorporate these free-form replies as Session Entities.
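As a sketch of the Session Entity route, the helper below builds a request body from the items a user has saved. The field names (name, entityOverrideMode, entities, value, synonyms) follow my understanding of the Dialogflow ES v2 SessionEntityType resource and should be checked against the current reference; the helper itself, the session path and the list of alternative spellings are hypothetical.

```python
def session_entity_type(session_path, display_name, items):
    """Build a session entity type body from a user's saved list items.

    items maps each stored value to an iterable of alternative spellings.
    """
    return {
        "name": f"{session_path}/entityTypes/{display_name}",
        "entityOverrideMode": "ENTITY_OVERRIDE_MODE_SUPPLEMENT",
        "entities": [
            # Each stored value is also listed as its own synonym.
            {"value": value, "synonyms": sorted({value, *extra})}
            for value, extra in sorted(items.items())
        ],
    }
```

For the example in the question, items would be {"catchup": ["ketchup"]}, so an utterance containing "ketchup" resolves to the stored value "catchup". Where the alternative spellings come from is left open here.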

Couchdb-lucene and ad-hoc queries for the authenticated user

I'm using CouchDB to store data coming from various sources and couchdb-lucene to allow ad-hoc queries. That's important for me because I display the data in a feed and I want this feed to be filterable. CL seems perfect for that.
However, I also want to introduce permissions to the feed app - a user should only be able to see a feed item if he/she has the permission to see it.
Now, I would like to be able to run ad-hoc queries and only return the feed items that the currently authenticated user has permissions to read.
The only solution that I could figure out (so far) was to add a 'permissions' field to each feed item where I store all the permission for the other users (obviously skipping the users that have no permissions for this item at all)
permissions: [{user_id: '123', read: true, write: true}, ...]
and then index this array in CL.
While this will probably work, I feel kind of bad being forced to nest the permissions metadata in the feed item...it might even be a better solution than keeping it separate, but I just don't like that I don't seem to have a choice here.
The only other solution (well, other than dumping CouchDB) would be to run the ad-hoc query without being concerned about the permissions, then run a second query on the server that selects all "my items" and do a set intersection. But those sets can be huge (and if I chunk it, it would require possibly many DB requests => slow).
Is my solution fine or is there anything better? Or is CouchDB just not a good fit for such queries?
Cheers!
You are on the right path with keeping that permission data on the document itself. This will be the easiest way for you to build views later on, which will enable you to check for user permissions. So don't worry and just let it flow in that direction. Feeling bad about nesting that data probably comes from previous ages when you were using SQL and RDBMSes, where you'd want to normalize the hell out of each table. This time it's completely different :)
Btw, the only way to do "JOINS" in CouchDB is to use Linked Documents. If you are interested you can give that a try. However, it won't let you look inside the linked document while creating a view.
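A minimal sketch of how the nested permissions could feed the ad-hoc queries, in Python for illustration (a couchdb-lucene index function itself is written in JavaScript). The indexed field name reader is hypothetical, and the query wrapper assumes user IDs contain no quote characters.

```python
def readers(permissions):
    """Flatten the nested permissions list to the user ids allowed to read."""
    return sorted(p["user_id"] for p in permissions if p.get("read"))


def restrict_query(user_query, user_id):
    """Wrap an ad-hoc Lucene query so it only matches items the user can read."""
    return f'({user_query}) AND reader:"{user_id}"'
```

The idea is to index each ID returned by readers under one field and to have the server append the restriction to whatever filter the user typed, so the permission check happens inside the same query.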

How to selectively replicate private and shared portions of a CouchDB database?

We're looking into using CouchDB/CouchCocoa to replicate data to our mobile app.
Our system has a large number of users. Part of the database is private to each user -- for example their tasks. These I've been able to replicate without problem using filtered replication.
Here's the catch... The database also includes shared information only some of which pertains to a given user. How do I selectively replicate that shared information? For example a user's task might reference specific shared documents. Is there a way to make sure those documents are included in the replication without including all the shared documents?
From the documentation it seems that adding doc_ids to the replication (or adding another replication with those doc IDs) might be one solution. Has anyone tried this? Are there other solutions?
EDIT: Given the number of users it seems impractical to tag each shared document with all the users sharing it but perhaps that's the only way to do this?
The final solution mostly depends on your document structure, but currently I see two options:
1. Since you keep everything within a single database, you probably have some fields that indicate whether a document is shared or private, right? Example:
owner: "Mike"
participants: [] // if nobody is mentioned, the document looks private(?)
So you just need a filter that handles only private documents and another for only shared ones: by tags, number of participants, references or something similar.
Also, if you need to replicate some documents only for a specific user (e.g. only for Mike), then you need a special view to collect all these documents and, yes, use replication by document IDs. This wouldn't be an atomic request, though: you need some service script to handle these steps. If shared documents are defined by references to them, then the solution is the same: a service script, a view that generates the document reference tree, and replication by doc._id's.
2. Review your architecture. Having a database per user is a normal use case for CouchDB and fits its approach to data partitioning and isolation. So you may create a per-user database that is private to that user. For shared documents you may create additional databases and control access through the database members in the security options. Each "shared" database will serve only a certain set of participants, by name or by group, so there couldn't be any data leaks unless there is a CouchDB bug(:
This approach may look odd at first sight, but all you need is a management script that handles database creation and publication; replications would then be as easy as possible and the users' data would be safe.
P.S. I've assumed that the "sharing" operation makes a document visible to some set of users rather than to everyone. If I was wrong and "shared" means "public", then option 2 becomes simpler: N user databases + 1 public one.
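A service script for the "replication by document IDs" route might start from something like the sketch below. The source, target and doc_ids keys are the ones CouchDB's _replicate endpoint accepts; the shared_refs field on a task and both function names are assumptions for the example.

```python
def referenced_doc_ids(tasks):
    """Collect the ids of shared documents referenced by a user's tasks."""
    ids = set()
    for task in tasks:
        ids.update(task.get("shared_refs", []))  # hypothetical reference field
    return sorted(ids)


def replication_request(source, target, doc_ids):
    """Build the JSON body for a replication limited to the given doc ids."""
    return {"source": source, "target": target, "doc_ids": list(doc_ids)}
```

The script would POST this body to _replicate after the user's filtered replication of private documents, and repeat it whenever the set of references changes.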

Resources