delete_many() not deleting all [PyMongo] - python-3.x

I've recently started using PyMongo as an interface with MongoDB. But I'm having some strange issues when deleting documents from a collection.
Here is an example:
from bson import ObjectId
from pymongo import MongoClient
# Open connection
client = MongoClient(mongo_html)  # mongo_html holds the connection URI
collection_post = client["MyCollection"].posts
# Delete procedure
_ids_to_delete = [ObjectId("xxxxxxx..."), ..., ObjectId("xxxxxxx...")]
n_to_delete = len(_ids_to_delete)
result = collection_post.delete_many({'_id': {'$in': _ids_to_delete}})
n_delete = result.deleted_count
if n_delete != n_to_delete:
    raise Exception("Well well well...")
Now, I know for a fact that all the documents in _ids_to_delete exist in the database. In fact, if I run the following when the exception is raised:
if n_delete != n_to_delete:
    for _id in _ids_to_delete:
        search_result = collection_post.find({'_id': _id})
It will still find documents that were supposed to be deleted. To get around this, I tried using delete_one() instead and looping, with similar results.
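Roughly like this (a sketch of that loop; collection_post and _ids_to_delete as above):
n_delete = 0
for _id in _ids_to_delete:
    # delete_one() removes at most one matching document;
    # deleted_count is 0 or 1 for each call
    result = collection_post.delete_one({'_id': _id})
    n_delete += result.deleted_count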
Am I missing something here? Could the fact that another process on another computer is writing to the same collection at the same time have this effect?

Related

How to perform Key-based queries to Google Datastore from Python 3?

I managed to make a connection to a Google Cloud Datastore database. Now I want to get some entities given their Key/Id. Right now I am doing the following:
from google.cloud import datastore

client = datastore.Client()
query = client.query(kind='City')
query.key_filter("325899977574122")  # Exception raised here
I get "Invalid key: '325899977574122'".
What could be the cause of the error? That Id exists; a city does have that key/Id.
It looks like it needs to be of type google.cloud.datastore.key.Key
https://googleapis.dev/python/datastore/latest/queries.html#google.cloud.datastore.query.Query.key_filter
Also, 325899977574122 is probably supposed to be passed as an int (the numeric id portion of the key), not a string.
So something like this:
from google.cloud.datastore import Key

client = datastore.Client()
query = client.query(kind='City')
# project is your GCP project id
query.key_filter(Key('City', 325899977574122, project=project))
EDIT:
Also, if you're trying to retrieve a single entity by id, you should probably use this instead:
https://googleapis.dev/python/datastore/latest/client.html#google.cloud.datastore.client.Client.get
client = datastore.Client()
client.get(Key('City', 325899977574122, project=project))
Fetching by ID is faster than doing a query.
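If you need several entities at once, Client.get_multi accepts a list of keys (a sketch; project stands in for your GCP project id and the second id is hypothetical):

keys = [
    Key('City', 325899977574122, project=project),
    Key('City', 325899977574123, project=project),
]
cities = client.get_multi(keys)  # returns only the entities that exist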

Modifying ForeignKeyConstraint schema in Alembic post process

I'm using process_revision_directives to apply some post-processing to the operations generated against a reference schema. The step I'm stuck on is removing the Postgres schema from the generated operations, so it can be changed generically at runtime using the answer from another question.
The below code correctly removes the schema from operations except for ForeignKeyConstraints in a CreateTableOp.
from itertools import chain

import sqlalchemy as sa
from alembic.operations import ops

def process_foreign_key(col: sa.ForeignKeyConstraint):
    col.referred_table.schema = None  # Doesn't work
    return col

def process_revision_directives(context, revision, directives):
    # Remove the schema from the generated operations
    for op in chain(directives[0].upgrade_ops.ops, directives[0].downgrade_ops.ops):
        if isinstance(op, ops.CreateTableOp):
            op.columns = [
                process_foreign_key(col) if isinstance(col, sa.ForeignKeyConstraint) else col
                for col in op.columns
            ]
            op.schema = None
This currently generates output like:
op.create_table('user',
    sa.Column('id', sa.Integer, nullable=False),
    sa.ForeignKeyConstraint(['id'], ['reference_schema.group.id'], name='group_group_id', onupdate='CASCADE', ondelete='CASCADE'),
)
Any ideas on how I should modify these constraint objects to not have reference_schema. in the target table?
If you look into the rendering chain you can find where the last schema reference is. It's on op._orig_table, but the important thing is that it appears on this table twice.
Put the following in your for loop.
op._orig_table.schema = None
op._orig_table = op._orig_table.tometadata(clear_meta)
where clear_meta is a MetaData object with no schema, such as
clear_meta = sa.MetaData(bind=session.connection(), schema=None)
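Put together with the loop from the question, the whole hook might look like this (a sketch; tometadata copies the table into the schema-less MetaData, which is what clears the second, hidden schema reference):

from itertools import chain

import sqlalchemy as sa
from alembic.operations import ops

clear_meta = sa.MetaData(schema=None)  # a MetaData with no schema

def process_revision_directives(context, revision, directives):
    for op in chain(directives[0].upgrade_ops.ops, directives[0].downgrade_ops.ops):
        if isinstance(op, ops.CreateTableOp):
            op.schema = None
            op._orig_table.schema = None
            op._orig_table = op._orig_table.tometadata(clear_meta)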

SQLAlchemy scoped_session is not getting latest data from DB

I'm rather new to the whole ORM topic, and I've already searched forums and docs.
The question is about a Flask application with SQLAlchemy as the ORM for PostgreSQL.
The __init__.py contains the following line:
db = SQLAlchemy()
The created object is referenced in the other files to access the DB.
There is a save function for the model:
def save(self):
    db.session.add(self)
    db.session.commit()
and also an update function:
from sqlalchemy.orm.attributes import flag_modified

def update(self):
    for var_name in self.__dict__.keys():
        if var_name not in ('_sa_instance_state', 'id', 'foreign_id'):
            # Workaround for JSON update problem
            flag_modified(self, var_name)
    db.session.merge(self)
    db.session.commit()
The problem occurs when I'm trying to save a new object. The save function writes it to DB, it's visible when querying the DB directly (psql, etc.), but a following ORM query like:
model_list = db.session.query(MyModel).filter(MyModel.foreign_id == this_id).all()
gives an empty response.
A call of the update function does work as expected, new data is visible when requesting with the ORM.
I'm always using the same session object, for example:
<sqlalchemy.orm.scoping.scoped_session object at 0x7f0cff68fda0>
If the application is restarted, everything works fine until a new object is created and then queried with the ORM.
An inelegant workaround is using raw SQL like:
model_list = db.session.execute(
    'SELECT * FROM models_table WHERE foreign_id = ' + str(this_id))
which gives a ResultProxy with the latest data, like this:
<sqlalchemy.engine.result.ResultProxy object at 0x7f0cf74d0390>
I think my problem is a misunderstanding of the session. Can anyone help me?
It turned out that the problem had nothing to do with the session, but with the filter() method:
# Necessary import for passing string input to the filter() function
from sqlalchemy import text

# Solution or workaround
model_list = db.session.query(MyModel).filter(text('foreign_id = ' + str(this_id))).all()
I could not figure out the problem with filter(MyModel.foreign_id == this_id), but that's another problem.
I think this way is better than executing raw SQL.
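If you do use text(), it is safer to bind the value as a parameter rather than concatenating it into the string (a sketch using SQLAlchemy's bindparams, which also avoids SQL injection):

from sqlalchemy import text

stmt = text('foreign_id = :fid').bindparams(fid=this_id)
model_list = db.session.query(MyModel).filter(stmt).all()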

Substituting Variable Values in a MongoDB Statement

My main intention is to dynamically change the Employees collection while using PyMongo. I was able to do it for insert commands, but I am facing problems with the find command: no matter what I do, exec() always returns None, yet if I copy the string and run it directly, the value gets assigned to the variable.
Can someone shed some light on why exec is unable to return a result set or assign the result set to a variable?
db.Employees.update_one(
    {"id": criteria},
    {
        "$set": {
            "name": name,
            "age": age,
            "country": country
        }
    }
)
from pymongo import MongoClient
import ast
client = MongoClient('localhost:27017')
db = client.TextClassifier
The insert works:
def mongo_insert_one(COLLECTION_NAME, JSON):
    QUERY = """db.%(COLLECTION_NAME)s.insert_one( %(JSON)s )""" % locals()
    exec(QUERY)
def mongo_retrive(COLLECTION_NAME, JSON):
    resultset = None
    query = """resultset = db.%(COLLECTION_NAME)s.find( %(JSON)s )""" % locals()
    exec(query)
    return resultset
print(mongo_retrive('hungry_intent', "{'Intent':'Hungry'}"))
Nor would this work:
resultset = exec(""" db.%(COLLECTION_NAME)s.find( %(JSON)s )""" % locals())
And this would not work for an entirely different reason; it fails with "If you meant to call the 'locals' method on a 'Database' object it is failing because no such method exists":
resultset = db.locals()[COLLECTION_NAME].find()
PyMongo Database objects support bracket notation to access a named collection, and PyMongo's included bson module provides a much better JSON decoder than "eval":
from bson import json_util

COLLECTION_NAME = 'hungry_intent'
JSON = '{"Intent": "Hungry"}'  # json_util.loads expects valid JSON, i.e. double quotes
print(list(db[COLLECTION_NAME].find(json_util.loads(JSON))))
This will be faster and more reliable than your "eval" code, and also prevents the injection attack that your "eval" code is vulnerable to.
If you can avoid using JSON at all it could be preferable:
COLLECTION_NAME = 'hungry_intent'
QUERY = {'Intent':'Hungry'}
print(list(db[COLLECTION_NAME].find(QUERY)))
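The same bracket notation replaces the exec-based insert as well (a sketch; the field values are hypothetical):

COLLECTION_NAME = 'Employees'
document = {'id': 1, 'name': 'Ann', 'age': 30, 'country': 'NL'}
db[COLLECTION_NAME].insert_one(document)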

Revit API & Dynamo, Creating a Family Parameter from Project Document

I'm trying to create a new family parameter by calling a family's document in a project document and using the FamilyManager method to edit the family. There have been about 10 people asking for this on the Dynamo forums, so I figured I'd give it a shot. Here's my Python script below:
import clr
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *
clr.AddReference("RevitServices")
from RevitServices.Persistence import DocumentManager
from RevitServices.Transactions import TransactionManager
clr.AddReference("RevitAPI")
from Autodesk.Revit.DB import *
#The inputs to this node will be stored as a list in the IN variables.
familyInput = UnwrapElement(IN[0])
familySymbol = familyInput.Symbol.Family
doc = familySymbol.Document
par_name = IN[1]
par_type = ParameterType.Text
par_grp = BuiltInParameterGroup.PG_DATA
TransactionManager.Instance.EnsureInTransaction(doc)
familyDoc = doc.EditFamily(familySymbol)
OUT = familyDoc.FamilyManager.AddParameter(par_name,par_grp,par_type,False)
TransactionManager.Instance.TransactionTaskDone()
When I run the script, I get this error:
Warning: IronPythonEvaluator.EvaluateIronPythonScript operation failed.
Traceback (most recent call last):
File "<string>", line 26, in <module>
Exception: The document is currently modifiable! Close the transaction before calling EditFamily.
I'm assuming this error occurs because I'm opening a family document that already exists through the script and never sending the information back to the project document, or something similar. Any tips on how to get around this?
Building up on our discussion from the forum:
import clr
clr.AddReference("RevitServices")
from RevitServices.Persistence import DocumentManager
from RevitServices.Transactions import TransactionManager
doc = DocumentManager.Instance.CurrentDBDocument
clr.AddReference("RevitAPI")
from Autodesk.Revit.DB import *
par_name = IN[0]
exec("par_type = ParameterType.%s" % IN[1])
exec("par_grp = BuiltInParameterGroup.%s" % IN[2])
inst_or_typ = IN[3]
families = UnwrapElement(IN[4])
# class for overwriting loaded families in the project
class FamOpt1(IFamilyLoadOptions):
    def __init__(self): pass
    def OnFamilyFound(self, familyInUse, overwriteParameterValues): return True
    def OnSharedFamilyFound(self, familyInUse, source, overwriteParameterValues): return True
trans1 = TransactionManager.Instance
trans1.ForceCloseTransaction() #just to make sure everything is closed down
# Dynamo's transaction handling is pretty poor for
# multiple documents, so we'll need to force close
# every single transaction we open
result = []
for f1 in families:
    famdoc = doc.EditFamily(f1)
    try:  # this might fail if the parameter exists or for some other reason
        trans1.EnsureInTransaction(famdoc)
        famdoc.FamilyManager.AddParameter(par_name, par_grp, par_type, inst_or_typ)
        trans1.ForceCloseTransaction()
        famdoc.LoadFamily(doc, FamOpt1())
        result.append(True)
    except:  # you might want to import traceback for a more detailed error report
        result.append(False)
    trans1.ForceCloseTransaction()
    famdoc.Close(False)
OUT = result
[Image of the Dynamo graph]
The error message is already telling you exactly what the problem is: "The document is currently modifiable! Close the transaction before calling EditFamily".
I assume that TransactionManager.Instance.EnsureInTransaction opens a transaction on the given document. You cannot call EditFamily with an open transaction.
That is clearly documented in the help file:
http://thebuildingcoder.typepad.com/blog/2012/05/edit-family-requires-no-transaction.html
Close the transaction before calling EditFamily, or, in this case, don't open it at all to start with.
Oh, and then, of course, you wish to modify the family document. That will indeed require a transaction, but on the family document 'familyDoc', NOT on the project document 'doc'.
I don't know whether this will be the final solution, but it might help:
familyDoc = doc.EditFamily(familySymbol)
TransactionManager.Instance.EnsureInTransaction(familyDoc)
OUT = familyDoc.FamilyManager.AddParameter(par_name,par_grp,par_type,False)
TransactionManager.Instance.TransactionTaskDone()
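If that works, keep in mind that AddParameter only modifies the family document; to see the new parameter in the project you would still need to load the family back, e.g. reusing the FamOpt1 load-options class from the first answer (a sketch, untested):

TransactionManager.Instance.ForceCloseTransaction()
familyDoc.LoadFamily(doc, FamOpt1())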
