How to avoid null insertion in pymongo? - python-3.x

I have an application where I let users access other third-party applications, fetch data from them, and perform some visualizations. For the users' convenience, I only ask them to send their credentials once through a POST request and then store them in MongoDB, so from then on they can ask for data without having to pass the credentials again and again. Now I plan to avoid duplication in my MongoDB database, so even when a user resends their credentials, I don't reinsert them; instead I use the upsert option.
users.update(doc, doc, upsert=True)
This does the trick, but now when the user doesn't send any credentials, MongoDB creates an object id for that particular request and sets the value of each field to None:
{'account_id': None, 'password': None, 'username': None}
I have checked the following resource for this problem, but the suggestions are specific to MongoDB and not pymongo:
Avoid insertion in mongoDB Without Data
How do I ensure that I do not insert documents with null values into my database?

You could use a simple if to make sure doc is not empty before writing:
if doc:
    users.update(doc, doc, upsert=True)
A database-level solution is to create a collection schema with validators, as pointed out in the resource you linked in your question. The validators will guarantee correct insertions and upserts.
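If you also want an application-side guard against partially empty documents, a minimal sketch could look like the following (assuming the account_id/username/password fields from the question and a users collection handle; replace_one is pymongo's current equivalent of the older update call):
REQUIRED_FIELDS = ("account_id", "username", "password")

def upsert_credentials(users, doc):
    # skip the write entirely if the doc is missing or any credential field is None
    if not doc or any(doc.get(field) is None for field in REQUIRED_FIELDS):
        return None
    # replace_one(filter, replacement, upsert=True) mirrors update(doc, doc, upsert=True)
    return users.replace_one(doc, doc, upsert=True)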

Related

Post middleware in mongoose not atomizing operation? Still creates document even though next(err) is called

I have a model, let's call it Client, and I have another model called Accounts. They are in different collections; a client can have many different accounts. I reference the accounts within the client doc, as well as referencing back to the client from the account doc.
const Client = new mongoose.Schema({
  accounts: [{
    type: mongoose.ObjectId,
    ref: 'Accounts',
  }],
  other....
})
const Accounts = new mongoose.Schema({
  name: String,
  clientID: mongoose.ObjectId
})
So as we can see, they reference each other. I'm doing this for easy access when populating the accounts and such while requesting client info.
What I'm trying to do is: when I create a new Client, I also want to create a new default Account and reference it in the accounts array. I tried using a pre hook when I create my new Client to create a new Account, however that doesn't update the Client accounts array with the newly created Account doc _id. I've tried using this.update():
Client.pre('save', async function (next) {
  if (this.isNew) {
    await Accounts.create({ clientID: this._id })
      .then(async doc => {
        console.log(doc) // this logs my account doc just fine, which means it got created
        await this.update({ $push: { accounts: doc._id } }) // this doesn't seem to do anything
      })
      .catch(err => next(err))
  }
  next()
})
So the pre hook almost did what I wanted, but I can't figure out a way to update my newly created Client doc with the info from the newly created Account doc. It creates the Client doc, and it creates the Account doc. And the beauty of it is that if there is an error when creating the Account doc, then since the operation is atomic, the Client doesn't get created either. But alas, no updated accounts array...
So instead, I tried putting it into a post hook.
Client.post('save', async function (doc, next) {
  await Accounts.create({ clientID: doc._id })
    .then(async acc => {
      await doc.update({ $push: { accounts: [acc._id] } })
    })
    .catch(err => next(err))
})
And hey, this works! ...kinda. I can create a Client document, which creates an Account document, and then updates the Client to include the Account _id in its accounts array.
BUT!!! The issue I'm having with this approach is that it doesn't seem to make the operation atomic. So if I deliberately make the account creation fail (for example by passing it a non-ObjectID argument), it calls next(err), which in my http request properly returns the error message and even says that the operation failed. But in my database the Client document still got created: unlike the pre hook, which stops the whole operation, the post hook does not 'undo' the creation of the Client.
SUMMARY AND SOLUTIONS
Basically, I need a way to update a brand-new doc inside of its pre('save') hook so it will store any changed data I processed inside the hook.
Or some way to guarantee the atomicity of the operation if I use a post hook to update the new doc.
Other things I've tried:
I also tried using save() inside the pre hook after creating the Account doc, but this resulted in a loop that maxed out the doc memory, since it just became recursive.
I tried using a pre hook on the Accounts model so it would reference back to the Client model and update it, but this gives me both issues together: it does not update the new Client doc (since it's technically not queryable yet) AND if the account creation fails, it still creates the Client.
Sorry for the long question, I appreciate any feedback or recommendations to fix this issue or different approach to achieve my goal. If you made it this far, thanks for reading!
My question was built up of a few questions, but I want to post the solution.
While I still don't know how to guarantee that an error in a post hook will make the whole operation behave atomically, the solution to this was quite simple.
Inside the pre hook, to modify the accounts array I just had to push() into it. No need to try using this.set or this.update or any other actual query, just direct modification of this:
{
  // inside Client pre hook
  // create account doc
  await Accounts.create(...)
    .then(doc => {
      this.accounts.push(doc._id) // directly modify `this`
    })
    .catch(err => next(err))
}
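Put together, a minimal sketch of the whole hook (assuming the Client and Accounts names used above; the isNew guard keeps it from firing on later saves) might look like this:
Client.pre('save', async function (next) {
  if (this.isNew) {
    try {
      // create the default account, then record its _id directly on `this`
      const account = await Accounts.create({ clientID: this._id })
      this.accounts.push(account._id)
    } catch (err) {
      return next(err) // aborting here means the Client doc never gets written
    }
  }
  next()
})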

Microsoft Graph API transitive user filter

Is there a way for me to get back ONLY users when I call List group transitive members in the Graph API?
The response I get has objects for groups as well as users, e.g.:
{group: IT support},
{user: Kevin},
{user: Bob},
{user: Phil},
{group: Developers},
{user:phil}
I'd like to be able to filter out the group objects, but no dice. Has anyone been able to do this before? Thanks
Usually, we can use the $filter query parameter to retrieve the response we need, but it seems that '#odata.type' is not supported as a query parameter here. I will double-check with an Azure support engineer.
By the way, the response is in JSON format, so you can write a filter yourself to get the user objects only.
Update:
I have got a response from the Azure support engineer: '#odata.type' is not supported as a query parameter here, so you need to filter the data yourself.
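For example, a minimal client-side sketch (assuming the transitive members response has already been parsed into a response object with the usual value array):
// keep only the entries whose OData type marks them as users
const users = response.value.filter(
  member => member['@odata.type'] === '#microsoft.graph.user'
);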

Stripe - create / retrieve customer in one call

Is there a Stripe API call that we can use to create a customer if they don't exist, and retrieve that customer?
say we do this:
export const createCustomer = function (email: string) {
  return stripe.customers.create({email});
};
even if a customer with that email already exists, it will always create a new customer id. Is there a method that will create a customer only if that email does not already exist in Stripe?
I just want to avoid a race condition where more than one stripe.customers.create({email}) call might happen in the same timeframe. For example, if we check whether customer.id exists and it does not, two different server requests could attempt to create a new customer.
Here is the race condition:
const email = 'foo@example.com';
Promise.all([
  stripe.customers.retrieve(email).then(function (user) {
    if (!user) {
      return stripe.customers.create({email});
    }
  }),
  stripe.customers.retrieve(email).then(function (user) {
    if (!user) {
      return stripe.customers.create({email});
    }
  })
]);
obviously the race condition is more likely to happen across two different processes or two different server requests than within the same server request, but you get the idea.
No, there is no inbuilt way to do this in Stripe. Stripe does not require that a customer's email address be unique, so you would have to validate it on your side. You can either track your users in your own database and avoid duplicates that way, or you can check with the Stripe API if customers already exist for the given email:
let email = "test@example.com";
let existingCustomers = await stripe.customers.list({email: email});
if (existingCustomers.data.length) {
  // don't create customer
} else {
  let customer = await stripe.customers.create({
    email: email
  });
}
Indeed, it can be solved by validating Stripe's customer retrieval result against the stored DB, and then calling another API to create the customer afterwards.
However, for simplicity's sake, I agree with @user7898461 and would vouch for a retrieveOrCreate customer API :)
As karllekko's comment mentions, idempotency keys won't work here because they only last 24 hours.
email isn't a unique field in Stripe; if you want to enforce this in your application, you'll need to handle it within your application - i.e., you'll need to store [email -> Customer ID] mappings and do a lookup there to decide whether or not to create a customer.
Assuming you have a user object in your application, this logic would be better located there anyway, as you'd want to do the lookup as part of that - and in that case, every user would only have one Stripe Customer, so this would be solved elsewhere.
If your use case is that you don't want to create a customer with the same email twice, you can use the concept of a Stripe idempotent request. I used it to avoid duplicate charges for the same order.
You can use the customer email as an idempotency key. Stripe handles this at their end; two requests with the same idempotency key won't get processed twice.
Also, if you want to restrict it to a timeframe, create an idempotency key from the customer email and that timeframe. It will work.
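For illustration, a hedged sketch of what that could look like with the Node library (the email and key format here are just examples; note the 24-hour key lifetime mentioned elsewhere on this page):
const email = 'jenny.rosen@example.com';
// per-request options, including the idempotency key, go in the second argument
const customer = await stripe.customers.create(
  { email },
  { idempotencyKey: `create-customer-${email}` }
);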
The API supports idempotency for safely retrying requests without accidentally performing the same operation twice. For example, if a request to create a charge fails due to a network connection error, you can retry the request with the same idempotency key to guarantee that only a single charge is created.
You can read more about this here. I hope this helps

How do I save and retrieve information across invocations of my agent in Dialogflow?

I would like my Actions on Google agent to store and retrieve certain pieces of information across invocations - like a cookie. How do I do this?
You have a lot of options on how you want to do this, depending on exactly what you're trying to do. It isn't exactly like a web cookie, although there are similarities.
If you want the equivalent of a session cookie, information that is retained during a single conversation, then your options are
Using the Session ID provided as part of the information sent to you on each invocation and tracking this in your fulfillment.
Storing information you want retained using a Dialogflow context
If you are using the actions-on-google JavaScript library, storing this in the app.data object created for you.
If you want the equivalent of a long-lasting cookie to retain information between conversations then your options are
Using the anonymous User ID provided as part of the information sent to you on each invocation and tracking this in your fulfillment.
If you are using the actions-on-google javascript library, storing this in the app.userStorage object created for you.
Storing it as part of the string in the JSON response under data.google.userStorage.
Some more information about each of these
Session ID
A different Session ID is created for each conversation you have. You can get this Session ID by examining the JSON sent to your webhook in the sessionId parameter.
You can then look this up in a data store of some sort that you manage.
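As a rough illustration (assuming an Express-style webhook with JSON body parsing; the in-memory Map stands in for whatever data store you actually use):
// per-conversation state keyed by the Dialogflow sessionId
const sessionData = new Map();

app.post('/webhook', (req, res) => {
  const sessionId = req.body.sessionId;
  const state = sessionData.get(sessionId) || { turns: 0 };
  state.turns += 1;
  sessionData.set(sessionId, state);
  // ... build and send the fulfillment response ...
});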
Dialogflow context
Contexts are powerful tools that are available with Dialogflow. You return a context as part of your fulfillment webhook and indicate the name of the context, its lifetime (how many more rounds of the conversation it will be passed back to your webhook), and any parameters associated with the context (string key/value pairs).
Contexts are especially useful in helping determine what intents may be called. You can indicate what contexts must be active for an Intent to be recognized by Dialogflow.
If you're using the actions-on-google node.js library, you can set a context using something like this:
var contextParameters = {
  foo: "Something foothy",
  bar: "Your local bar."
};
app.setContext("remember_this", 5, contextParameters);
You need to do this before you call app.ask() or app.tell().
Or you can do the equivalent in the JSON as part of the contextOut block of the response
"contextOut": [
{
"name": "remember_this",
"lifespan": 5,
"parameters": {
"foo": "Something foothy",
"bar": "Your local bar."
}
}
]
The next time your webhook is called, you can fetch this context either by looking at the result.contexts array or by using the app.getContext() or app.getContextArgument() methods in the library.
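For instance, a small sketch of reading it back with the library (getContextArgument returns an object whose value property holds what you stored):
const rememberThis = app.getContext("remember_this");        // the whole context, or null
const foo = app.getContextArgument("remember_this", "foo");  // e.g. { name: "foo", value: "Something foothy" }
if (foo) {
  console.log(foo.value);
}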
Using app.data
If you're using the library, Google has done some of the work for you. The app.data object is created for you. Any values you set in the object are available for the lifetime of the session - you just read them in later calls to your webhook.
(Under the covers, Google uses a context for this, so there is no magic. The two work together and you're free to do both.)
Anonymous UserID
When a user first uses your action, a user ID is generated. This ID doesn't give you access to any specific information about them, and isn't used for any other action, but every time you see it, you can be assured that it was the same user that used it on a previous occurrence. Just like a cookie, however, the user can reset it and a new ID will be generated for them for your action.
You get this from the JSON at originalRequest.user.userId or by using app.getUser().userId. Once you have it, you'd use a data store of some sort to store and retrieve information about this user.
Using app.userStorage
Similar to app.data, there is also an app.userStorage object that is created for you for each user. Any changes you make to this object are saved in between conversations you have with this user.
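For example (the favoriteColor field is just an illustration):
// set during one conversation...
app.userStorage.favoriteColor = "blue";

// ...and still available in a later conversation with the same user
const color = app.userStorage.favoriteColor || "unknown";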
Unlike app.data, however, this doesn't get stored in a context. It has its own storage method. Which leads to...
Storing it in JSON
If you're not using the actions-on-google library, you still have access to userStorage through the response and request JSON directly. You need to store this as a string, but if you need to store a more complex object, a common method is to stringify it as JSON.
You'll store this value under data.google.userStorage in the response and can retrieve it under originalRequest.data.user.userStorage in the request your webhook receives.
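A sketch of the relevant slice of the response JSON (the stringified count is just an example of the state you might keep):
"data": {
  "google": {
    "userStorage": "{\"count\":3}"
  }
}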
You can save the information in Context with a key value parameter.
SAVING VALUES IN CONTEXT:
agent.setContext({
  name: 'context-name',
  lifespan: 5,
  parameters: {
    'parameter-name': 'parameter-value'
  }
});
GETTING VALUES FROM CONTEXT
agent.getContext('context-name');
For more Details : https://dialogflow.com/docs/contexts/contexts-fulfillment
You could also use a Google Cloud database like BigQuery or Firestore
Sounds like you may want to check out Account Linking: https://developers.google.com/actions/identity/account-linking. With account linking you can collect end-user information which you exchange with Google by providing a unique key. This unique key becomes part of every request you receive from Google, so when you get that unique key you look up the information you collected from the end-user. In your case, you would store credentials or whatever key is required to access the end-user's information. After the initial linking, any new data you obtain could be stored along with the original information collected, based on the unique key obtained during account linking.
For this purpose, I just wrote a Node module for exactly that: from an external JSON file or API call, I need to store and add additional information to retrieve later. I think you can do a lot with this module: store objects, arrays, JSON, plain values, navigation history, or go back to the previous page.
It works like localStorage or cookies.
There's no limit; you can create multiple storages by name (key) and value. It's new and I'm testing it for bugs right now on my own project.
Test on Runkit
On npm
const vStorage = require('virtual-storage');
vStorage.set('name', {title: 'Title 1', description: 'Descriptions 1'});
let getStorage_name = vStorage.get('name');
console.log(getStorage_name.title);

CouchDB: Restricting users to only replicating their own documents

I'm having trouble finding documentation on the request object argument used in replication filters ('req' in the sample below):
function(doc, req) {
  // what is inside req???
  return false;
}
This old CouchBase blog post has a little code snippet that shows the userCtx variable being a part of the request object:
What is this userCtx? When you make an authenticated request against CouchDB, either using HTTP basic auth, secure cookie auth or OAuth, CouchDB will verify the user's credentials. If they match a CouchDB user, it populates the req.userCtx object with information about the user.
This userCtx object is extremely useful for restricting replication of documents to the owner of the document. Check out this example:
function(doc, req) {
  // require a valid request user that owns the current doc
  if (!req.userCtx.name) {
    throw("Unauthorized!");
  }
  if (req.userCtx.name == doc.owner) {
    return true;
  }
  return false;
}
But the problem now is that CouchDB requires the filter method to be explicitly chosen by the initiator of the replication (in this case, the initiator is a mobile user of my web app):
curl -X POST http://127.0.0.1:5984/_replicate \
-d '{"source":"database", \
"target":"http://example.com:5984/database", \
"filter":"example/filtername"
}'
The Question
Is there a way to enforce a specific filter by default so that users are restricted to replicating only their own data? I'm thinking the best way to do this is to use a front end to CouchDB, like Nginx, and restrict all replication requests to ones that include that filter. Thoughts? Would love a way to do this without another layer in front of CouchDB.
Data replication goes hand in hand with a user's ability to read data. If your users share data within a single database, all of them have the right to replicate all of it to their local couches. So you can't apply any document read restrictions unless you split the single shared database into several personal ones - that is the common approach for such situations.
There is no way to enforce a changes-feed filter or other parameters the way views can. However, you can use rewrites to wrap requests to certain resources with predefined query parameters, or even dynamic ones. This is not quite the solution you expected, but it is still better than nginx with extra logic on its side: you'd probably want to allow users to specify custom filters with custom query parameters, and enforce your own only when nothing is specified, right?
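As a rough, untested sketch of that idea (the design doc name, the filter body, and the '../../' relative target are assumptions that depend on your CouchDB version and layout):
{
  "_id": "_design/example",
  "filters": {
    "filtername": "function(doc, req){ return req.userCtx.name && doc.owner == req.userCtx.name; }"
  },
  "rewrites": [
    {
      "from": "/changes",
      "to": "../../_changes",
      "query": { "filter": "example/filtername" }
    }
  ]
}
Clients would then hit the changes feed through .../database/_design/example/_rewrite/changes instead of the raw URL, with the filter query parameter already applied.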
P.S. The req object contains a lot of useful information about the current request. It was partially described on the wiki, but that's a little out of date. However, it's easy to inspect it with a simple show function:
function(doc, req){
  return {json: req}
}
