What is the suggested solution for mapping parameters extracted from an intent's training phrases to a user-defined value in a database specific to that user?
A practical example I think would be a shopping list app.
Through a Web UI, the user adds catchup to a shopping list which is stored in the database as item.
Then through that agent (e.g. Google Assistant), the utterance results in ketchup being extracted as the item parameter. I wouldn't have a way to know how to map the extracted parameter from the utterance to the user-defined value in the database.
So just to be clear
// in the database added by the user from a web UI
"catchup"
// extracted from voice utterance
"ketchup"
How should I accomplish making sure that the extracted parameters can be matched up to the free form values they have added to the list?
Also, I am inexperienced in this area and have looked through the docs quite a bit, so I may just be missing this. I wasn't sure whether Developer Entities or Session Entities were the solution for this.
Either Developer or Session Entities may be useful here. It depends.
If you can enumerate all the possible things that a user can say, and possibly create aliases for some of them, then you should use a Developer Entity. This is easiest and works the best - the ML system has a better chance of matching words when they are pre-defined as part of the training model.
If you can't do that, and it's OK to match only things that they have already added to a database, then a Session Entity will work well. This really is best for data that you already have about the user, or which may change dramatically based on context.
You may even wish to offer a combination - define as many entities as you can (to get the most common replies), allow free-form replies, and incorporate these free-form replies as Session Entities.
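As a rough sketch of the Session Entity idea, the user's saved items could be turned into a session entity payload when the conversation starts. The field names below follow the Dialogflow ES SessionEntityType REST resource as I understand it (name, entityOverrideMode, entities with value and synonyms). The function name, the session path, and the "aliases" field on each saved item are assumptions made up for illustration, and actually sending the payload to the API is left out.

```python
def build_session_entity_type(session_path, entity_name, user_items):
    """Build a session entity payload from items the user already saved.

    user_items is a hypothetical list of dicts such as
    {"value": "catchup", "aliases": ["ketchup"]}.
    """
    entities = []
    for item in user_items:
        # The stored value is always a synonym of itself; any known
        # aliases (an assumed field) are added after it.
        synonyms = [item["value"], *item.get("aliases", [])]
        entities.append({"value": item["value"], "synonyms": synonyms})
    return {
        "name": f"{session_path}/entityTypes/{entity_name}",
        # SUPPLEMENT adds these entries to the developer entity
        # instead of replacing it.
        "entityOverrideMode": "ENTITY_OVERRIDE_MODE_SUPPLEMENT",
        "entities": entities,
    }


payload = build_session_entity_type(
    "projects/my-project/agent/sessions/session-123",  # placeholder path
    "item",
    [{"value": "catchup", "aliases": ["ketchup"]}, {"value": "milk"}],
)
```

With a payload like this, an utterance containing "ketchup" could resolve to the stored value "catchup", provided the alias is known in advance.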
I'm currently trying to learn Node.js and MongoDB by building the server side of a web application that manages insurance documents for an insurance agent.
So let's say I'm the user: I sign in, then I start to add my customers and their insurances.
So I have two related collections, Customers and Insurances.
I have one more collection to store the users login data, let's call it Users.
I don't want the new users to see and modify the customers and the insurances of other users.
How can I "divide" every user related record, so that each user can work only with his data?
I figured out that I could add to every record the _id of the user who created it.
For example, I log in as myself and get my ID "001"; I could then add a field with this value to every customer and insurance.
That way I could filter every query by this ID.
Would it be a good idea? In my opinion this filtering is a waste of processing power for MongoDB.
If someone has any idea of a solution, or even a link to an article about it, it would be helpful.
Thank you.
This is more a general permissions problem than just a MongoDB question. Also, without knowing more about your schemas it's hard to give specific advice.
However, here are some approaches:
1) Embed sub-documents
Since MongoDB is a document store allowing you to store arbitrary JSON-like objects, you could simply store the customers and insurances wholly inside each user object. That way querying for a user would return their customers and insurances as well.
2) Denormalise
Common practice for NoSQL databases is to denormalise related data (i.e. duplicate the data). This might include embedding a sub-document that is a partial representation of your customers/insurances/whatever inside your user document. This has a similar benefit to the above solution in that it eliminates additional queries for sub-documents. It also has the same drawback of requiring more care to preserve data integrity.
3) Reference with foreign key
This is a more traditionally relational approach, and is basically what you're suggesting in your question. Depending on whether you want the reference to be bi-directional (both documents reference each other) or uni-directional (one document references the other) you can either store the user's ID as a foreign user_id field, or store an array of customer_ids and insurance_ids in the user document. In relational parlance this is sometimes referred to as "has many" or "belongs to" (the user has many customers, the customer belongs to a user).
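To make the foreign-key option concrete, here is a minimal sketch in Python using plain dictionaries in place of a real collection. The helper names (scoped_filter, find) and the sample records are hypothetical; the point is only that every query goes through one function that always adds the user_id condition.

```python
def scoped_filter(user_id, criteria=None):
    """Return a query filter that always restricts results to one user."""
    # user_id is applied last so caller-supplied criteria cannot override it.
    return {**(criteria or {}), "user_id": user_id}


def find(collection, query):
    """Tiny in-memory stand-in for an equality-only find()."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in query.items())
    ]


customers = [
    {"_id": "c1", "user_id": "001", "name": "Alice"},
    {"_id": "c2", "user_id": "002", "name": "Bob"},
    {"_id": "c3", "user_id": "001", "name": "Carol"},
]

mine = find(customers, scoped_filter("001"))
print([c["name"] for c in mine])  # prints ['Alice', 'Carol']
```

On the worry about wasted processing power: an equality filter on one field is the kind of query a database index is designed for, so indexing user_id should keep the cost of this filtering small.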
I think this question has been asked but there are no clear answers.
The question is simple.
Can you have an entity list on the server?
For example I have a list of Product names on my database which can be really big. I want the intent to recognise these entities based on a list on the server.
The other thing I would like to do is filter an entity list.
E.g. I have a list of stores. I want it to be filtered by location, say by distance or latitude and longitude, so that only stores near the user are shown when they ask a question.
Things which are so easy to do in apps seem so difficult in Dialogflow.
Please do not provide solutions which can be done on the server through webhooks. I already know about that and have used it.
I just want a better way to use entities so that the NLP can become more powerful.
The best way to do this is to use entities together with a webhook.
You may enable slot filling for the parameters.
In the webhook, keep a hash map with the location as the key and the set of stores at that location as the value.
When the location is provided, fetch the corresponding set of stores.
When the store is provided, check whether that store is present in the set.
If the information is not correct, reprompt, resetting the context if required.
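The steps above can be sketched roughly as follows. This is plain Python with no Dialogflow calls; the store data, the function name, and the reply texts are all made up for illustration.

```python
# Hypothetical data: location (lowercase) -> set of store names.
STORES_BY_LOCATION = {
    "london": {"Camden Market", "Borough Market"},
    "paris": {"Le Bon Marche"},
}


def check_store(location, store):
    """Return (ok, reply). When ok is False the reply is a reprompt."""
    stores = STORES_BY_LOCATION.get(location.lower())
    if stores is None:
        # Unknown location: ask for the location again.
        return False, f"I don't know any stores in {location}. Where are you?"
    if store not in stores:
        # Known location but wrong store: offer the valid options.
        options = ", ".join(sorted(stores))
        return False, f"{store} isn't in {location}. Try one of: {options}."
    return True, f"OK, using {store}."
```

A webhook handler would call something like check_store with the filled slot values and, when ok is False, send the reply back as the reprompt.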
UPDATE
You may ask the user for the product names. Match the entity name against the names in the DB. If it is present, use it; if not, offer the user some options from the DB that may match what they said and ask them to choose one. You need to think from a conversational point of view about how two people communicate with each other.
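One possible way to do the "offer some options that may match" step is approximate string matching. The sketch below uses get_close_matches from Python's standard difflib module; the function name resolve_product, the sample catalog, and the 0.6 cutoff are assumptions, and a real agent might prefer phonetic or domain-specific matching.

```python
from difflib import get_close_matches


def resolve_product(spoken, catalog, max_suggestions=3):
    """Return (exact_match, suggestions) for a spoken product name."""
    # Compare case-insensitively but return names as stored in the DB.
    by_lower = {name.lower(): name for name in catalog}
    exact = by_lower.get(spoken.lower())
    if exact is not None:
        return exact, []
    close = get_close_matches(
        spoken.lower(), list(by_lower), n=max_suggestions, cutoff=0.6
    )
    return None, [by_lower[name] for name in close]


catalog = ["catchup", "mustard", "milk"]
match, suggestions = resolve_product("Ketchup", catalog)
# match is None and suggestions is ["catchup"], so the agent can ask
# "Did you mean catchup?"
```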
I am currently just trying to learn some new programming patterns and I decided to give event sourcing a shot.
I have decided to model a warehouse as my aggregate root in the domain of shipping/inventory, where the number of warehouses is generally pretty constant (i.e. a company won't be adding warehouses too often).
I have run into the question of how to set my aggregateId, which should correspond to a warehouse, on my server. Most examples I have seen, including this one, show the aggregate ID being generated server side when a new aggregate is being created (in my case a warehouse), and then passed in the command request when referring to that aggregate for subsequent commands.
Would you say this is the correct approach? Can I expect the user to know and pass aggregate IDs when issuing commands? I realize this is probably domain dependent and could also be a UI/UX choice; I'm just wondering what others have done. It would make more sense to me if my event-sourced aggregates were created more frequently, as with meal tabs or shopping carts.
Thanks!
Heuristic: aggregate id, in many cases, is analogous to the primary key used to distinguish entities in a database table. Many of the lessons of natural vs surrogate keys apply.
Can I expect the user to know and pass aggregate Ids when issuing commands?
You probably can't depend on the human to know the aggregate ids. But the client that the human operator is using can very well know them.
For instance, if an operator is going to be working in a single warehouse during a session, then we might look up the appropriate identifier, cache it, and use it when constructing messages on behalf of the user.
Analogy: when you fill in a web form and submit it, the browser does the work of looking at the form action and using that information to construct the correct URI, and similarly the correct HTTP Request.
The client will normally know what the ID is, because it just got it during a previous query.
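As an illustration of the client holding the identifier on the operator's behalf, here is a small sketch. The class name, the lookup callable, and the command shape are all hypothetical; the lookup stands in for whatever query the client would make against the read side.

```python
class WarehouseClient:
    """Builds command messages, resolving and caching aggregate ids."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable: warehouse name -> aggregate id
        self._cache = {}

    def aggregate_id(self, warehouse_name):
        # Query once per warehouse, then reuse the id for the session.
        if warehouse_name not in self._cache:
            self._cache[warehouse_name] = self._lookup(warehouse_name)
        return self._cache[warehouse_name]

    def receive_stock_command(self, warehouse_name, sku, quantity):
        # The operator only ever deals with the warehouse name.
        return {
            "type": "ReceiveStock",
            "aggregate_id": self.aggregate_id(warehouse_name),
            "sku": sku,
            "quantity": quantity,
        }
```

The operator picks "Main Warehouse" in the UI; the id travels inside the message without being typed or seen.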
Creation patterns are weird. It can, in some circumstances, make sense for the client to choose the identifier to be used when creating a new aggregate. In others, it makes sense for the client to provide an identifier for the command message, and the server decides for itself what the aggregate identifier should be.
It's messaging, so you want to be careful about coupling the client directly to your internal implementation details -- especially if that client is under a different development schedule. If you get the message contract right, then the server and client can evolve in any way consistent with the contract at any time.
You may want to review Greg Young's 10 year retrospective, which includes a discussion of warehouse systems. TL;DR - in many cases the messages coming from the human operators are events, not commands.
Would you say this is the correct approach?
You're asking if one of Greg Young's Event Sourcing samples represents the correct approach... Given that the combination of CQRS and Event Sourcing was essentially (re)invented by Greg, I'd say there's a pretty good chance of that.
In general, letting the code that implements the Command-side generate a GUID for every Command, Event, or other persistent object that it needs to write is by far the simplest implementation, since randomly generated GUIDs are unique for all practical purposes. In a distributed system, uniqueness without coordination is a big thing.
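A minimal sketch of server-side id generation, using Python's standard uuid module. The event class and function names are invented for the example and are not taken from Greg Young's sample.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WarehouseCreated:
    """Hypothetical event recorded when a warehouse aggregate is created."""
    aggregate_id: uuid.UUID
    name: str
    # Each event also gets its own id, generated without coordination.
    event_id: uuid.UUID = field(default_factory=uuid.uuid4)


def create_warehouse(name):
    # The command side picks a random (version 4) UUID for the new aggregate.
    return WarehouseCreated(aggregate_id=uuid.uuid4(), name=name)


event = create_warehouse("Main Warehouse")
# event.aggregate_id is what later commands would carry to address
# this warehouse.
```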
Can I expect the user to know and pass aggregate Ids when issuing commands?
No, and you particularly can't expect a user to know the GUID of their assets. What you may be able to do is to present the user with a list of his or her assets. Each item in the list will have the GUID associated, but it may not be necessary to surface that ID in the user interface. It's just data that the underlying UI object carries around internally.
In some cases, users do need to know the ID of some of their assets (e.g. if it involves phone support). In that case, you can add a lookup API to address that concern.
I'm looking into converting part of a large existing VB6 system into .NET. I'm trying to use domain driven design, but I'm having a hard time getting my head around some things.
One thing that I'm completely stumped on is how I should handle complex find statements. For example, we currently have a screen that displays a list of saved documents, that the user can select and print off, email, edit or delete. I have a SavedDocument object that does the trick for all the actions, but it only has the properties relevant to it, and I need to display the client name that the document is for and their email address if they have one. I also need to show the policy reference that this document may have come from. The Client and Policy are linked to the SavedDocument but are their own aggregate roots, so are not loaded at the same time the SavedDocuments are.
The user is also allowed to specify several filters to narrow the list down. These can also be based on properties that are stored on the SavedDocument or on the Client and Policy.
I'm not sure how to handle this from a Domain driven design point of view.
Do I have a function on a repository that takes the filters and returns me a list of SavedDocuments, that I then have to turn into a different object or DTO, and fill with the additional client and policy information? That seems a little slow as I have to load all the details using multiple calls.
Do I have a function on a repository that takes the filters and returns me a list of SavedDocumentsForList objects that contain just the information I want? This seems the quickest but doesn't feel like I'm using DDD.
Do I load everything from their objects and do all the filtering and column selection in a service? This seems the slowest, but also appears to be very domain orientated.
I'm just really confused about how to handle these situations, and I'm not really seeing other people asking questions about it, which makes me feel that I'm missing something.
Queries can be handled in a few ways in DDD. Sometimes you can use the domain entities themselves to serve queries. This approach can become cumbersome in scenarios such as yours when queries require projections of multiple aggregates. In this case, it is easier to use objects explicitly designed for the respective queries - effectively DTOs. These DTOs will be read-only and won't have any behavior. This can be referred to as the read-model pattern.
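A rough sketch of such a read model, written in Python for brevity rather than .NET. The class, field, and function names are assumptions based on the question, and the in-memory dictionaries stand in for what would more likely be a single joined database query.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SavedDocumentListItem:
    """Read-only row for the list screen; no domain behaviour."""
    document_id: int
    title: str
    client_name: str
    client_email: Optional[str]
    policy_reference: Optional[str]


def list_saved_documents(documents, clients, policies, client_name_contains=""):
    """Project documents plus client/policy details into list rows."""
    needle = client_name_contains.lower()
    rows = []
    for doc in documents:
        client = clients[doc["client_id"]]
        if needle not in client["name"].lower():
            continue  # filter on a property owned by another aggregate
        policy = policies.get(doc.get("policy_id"))
        rows.append(SavedDocumentListItem(
            document_id=doc["id"],
            title=doc["title"],
            client_name=client["name"],
            client_email=client.get("email"),
            policy_reference=policy["reference"] if policy else None,
        ))
    return rows
```

The SavedDocument aggregate is still used for print, email, edit and delete; the list item exists only to feed the screen.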
I am designing an API, and I'd like to ask a few questions about how best to secure access to the data.
Suppose the API is allowing access to artists. Artists have albums, that have songs.
The users of the API have access to a subset of all the artists. If a user calls the API asking for some artist, it is easy to check if the user is allowed to do so.
Next, if the user asks for an album, the API has to check if the album belongs to an artist that the user is allowed to access. Accessing songs means that the API has to check the album and then the artist before access can be granted.
In database terms, I am looking at an increasing number of joins between tables for each additional layer that is added. I don't want to do all those joins, and I also don't want to store the user id everywhere in order to limit the number of joins.
To work around this, I came up with the following approach.
The API gives the user a reference to an object, for instance an artist object. The user can then ask that artist object for the albums, which returns a list object. The list object can be traversed, and album objects can be obtained from it. Likewise, from an album object a songlist object can be obtained and from that, the individual song objects.
Since the API trusts the artist object, it also trusts any objects (albums in this case) that the user gets from it, without further checks. And so forth for all the other objects. So I am delegating the security/trust to objects down the chain.
I would like to ask you what you think of it, what's good or bad about it, and of course, how you would solve this "problem".
Second, how would you approach this if the API should be RESTful? My approach seems less applicable in that case.
Is this a real program or rather a sample to illustrate a question?
Because it is not clear why you would restrict access to the artists and albums rather than just to individual media items or even tracks.
I don't think that the joins should cost you that much; any half-smart DB system will do them cheaply enough when you are making a fairly simple criteria match on multiple tables.
IMHO, the problem with putting that much security logic into queries is that it limits your ability to handle the more complex DRM issues that are bound to come up. For example, what if the album is a collection from multiple artists? What if the album contains a track which is a duet and I only have access to one artist? etc, etc.
My view is that in those situations, a convenient programming model with sensible exceptions is much more important than the performance of individual queries, which you could always cache or optimize in the future. What you are trying to do with queries sounds like premature optimization.
Design your programming model to be as flexible as possible. Define a sensible set of extensions, then work on implementing the database and optimize queries after profiling the real system.
It is possible that doing the joins is much faster than your object approach (although yours is more elegant). With the joins you have only one DB request; with the objects you have many. (Or you have to retrieve all the "possible" data in the first request, which could also slow things down.)
I recommend doing the joins. If there is a problem about the sql you can ask at stackoverflow :D
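For what it's worth, here is a minimal sketch of the join-based check using Python's built-in sqlite3 module. The table and column names, and the user_artists permission table, are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE albums (id INTEGER PRIMARY KEY, artist_id INTEGER, title TEXT);
CREATE TABLE songs (id INTEGER PRIMARY KEY, album_id INTEGER, title TEXT);
CREATE TABLE user_artists (user_id INTEGER, artist_id INTEGER);
INSERT INTO artists VALUES (1, 'The Beatles'), (2, 'Other Artist');
INSERT INTO albums VALUES (10, 1, 'White Album'), (20, 2, 'Other Album');
INSERT INTO songs VALUES (100, 10, 'Happiness Is a Warm Gun'),
                         (200, 20, 'Other Song');
INSERT INTO user_artists VALUES (7, 1);
""")


def can_access_song(conn, user_id, song_id):
    """One request: walk song -> album -> artist permission via joins."""
    row = conn.execute(
        """
        SELECT 1
        FROM songs s
        JOIN albums al ON al.id = s.album_id
        JOIN user_artists ua ON ua.artist_id = al.artist_id
        WHERE s.id = ? AND ua.user_id = ?
        LIMIT 1
        """,
        (song_id, user_id),
    ).fetchone()
    return row is not None


print(can_access_song(conn, 7, 100))  # prints True
print(can_access_song(conn, 7, 200))  # prints False
```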
Another idea:
If you make urls like "/beatles/whitealbum/happinesisawarmgun"
then you would know the artist at the beginning of the request and could check the permission at once without traversing, because the URL contains the traversal information. Just a thought.
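A tiny sketch of that idea, assuming the first path segment is always the artist slug (the function names and slugs are made up):

```python
def artist_from_path(path):
    """Return the artist slug, assumed to be the first path segment."""
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError("path does not name an artist")
    return segments[0]


def is_allowed(path, allowed_artists):
    # Permission is decided from the URL alone, before any DB traversal.
    return artist_from_path(path) in allowed_artists
```

This only holds up if the album and song are then looked up within that artist's scope; otherwise a URL could pair an allowed artist with someone else's album.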
It is a good idea to include a security descriptor for each resource and not only for the top-level one. In your example the security descriptor is simply the artist's ID, or a list of artists' IDs if you support duets etc. So I would think about adding the list of IDs to both the albums and the songs tables. You can add a string field in which the artist IDs for the resource are written as a comma-separated list.
Such a solution scales well: you can add more layers without increasing the time needed for the security check. Adding a new resource also doesn't incur any additional penalty except for one more field to insert (based on the resource's parent field). And of course, this solution supports the special situations described above (like more than one artist etc.).
This kind of solution also doesn't violate RESTful architecture.
And the fact that each resource contains its own security descriptor generalizes the resource's access permissions, making it possible to implement some completely different security policy in future (for example, making access permissions more granular, based on albums, not only artists).
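To illustrate, a check against such a descriptor could look like the sketch below. The field name artist_ids and the rule that access to any one of the listed artists is enough are assumptions; a stricter policy could require all of them.

```python
def parse_descriptor(raw):
    """Turn a comma-separated id string such as '1, 2' into a set of ints."""
    return {int(part) for part in raw.split(",") if part.strip()}


def can_access(resource, allowed_artist_ids):
    # Assumed policy: access to any one listed artist grants access.
    return bool(parse_descriptor(resource["artist_ids"]) & set(allowed_artist_ids))


duet = {"title": "Some Duet", "artist_ids": "1, 2"}
solo = {"title": "Some Solo", "artist_ids": "3"}
```

The same two functions work for an album, a song, or any deeper layer, since each row carries its own descriptor.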