Based on this post:
Get the latest record from mongodb collection
I got the following to work:
import pprint
import pymongo

docs = dbCollectionQuotes.find({}).limit(1).sort([('$natural', pymongo.DESCENDING)])
for doc in docs:
    pprint.pprint(doc)
But since we know there is only going to be one row coming back, is there any way to get that one row without looping through the cursor that is returned? I don't think we can use find_one() because of the limit and the sort.
Use next(). This works for me:
doc = dbCollectionQuotes.find().limit(1).sort([('$natural', -1)]).next()
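Note that next() raises StopIteration on an empty collection. If you'd rather avoid that, find_one() accepts the same keyword arguments as find(), including sort, so a one-liner like this should also work:

doc = dbCollectionQuotes.find_one(sort=[('$natural', pymongo.DESCENDING)])  # returns None if the collection is empty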
I receive 3 POST calls from a client, let's say within a second, and with nodejs-mongodb I immediately (without any pause, sleep, etc.) try to insert the posted data into the database using updateOne. All the data is new, so every call should result in an insert.
Here is the code (js):
const myCollection = mydb.collection("mydata")
myCollection.updateOne(
  { name: req.data.name },
  { $set: { name: req.data.name, data: req.data.data } },
  { upsert: true },
  function(err, result) { console.log("UPDATEONE err: " + err) }
)
When I call this updateOne just once, it works; twice in succession, it also works. But if I call it more than two times in succession, only the first two documents are correctly inserted into the database and the rest are not.
The error I get after updateOne is MongoWriteConcernError: No write concern mode named 'majority;' found in replica set configuration. However, I always get this error, even when the insertion succeeds, so I don't think it is related to my problem.
You will probably suggest using updateMany, bulkWrite, etc., and you would be right, but I want to know why the insertion stops happening after the first two calls.
Keep in mind that .updateOne() returns a Promise, so it should be handled properly (awaited or chained) to avoid concurrency issues. More info about it here.
The MongoWriteConcernError might be related to the connection string you are using; the stray semicolon in 'majority;' suggests an extra character slipped in after w=majority. Check whether the string contains &w=majority and remove it, as recommended here.
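For illustration, with hypothetical credentials and hosts, that means changing a connection string like

mongodb://user:pass@host1,host2/mydb?replicaSet=rs0&w=majority

into

mongodb://user:pass@host1,host2/mydb?replicaSet=rs0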
I am working with Google Cloud Datastore using the latest google.cloud.ndb library
I am trying to implement pagination using Cursor with the following code, but it is not fetching the data correctly.
[1] To Fetch Data:
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5)
This code works fine and fetches 5 entities from MyModel
I want to implement pagination that can be integrated with a web frontend.
[2] To Fetch Next Set of Data
from google.cloud.ndb._datastore_query import Cursor
nextpage_value = "2"
nextcursor = Cursor(cursor=nextpage_value.encode()) # Converts to bytes
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5, start_cursor=nextcursor)
[3] To Fetch Previous Set of Data
previouspage_value = "1"
prevcursor = Cursor(cursor=previouspage_value.encode())
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5, start_cursor=prevcursor)
The [2] and [3] snippets do not fetch paginated data; they return the same results as snippet [1].
Please note I'm working with Python 3 and using the latest google.cloud.ndb client library to interact with Datastore.
I have referred to the following link https://github.com/googleapis/python-ndb
I am new to Google Cloud, and appreciate all the help I can get.
Firstly, it seems to me like you are expecting the wrong kind of pagination. You are trying to use numeric page values, whereas the Datastore cursor provides cursor-based pagination.
Instead of byte-encoded integer values (like 1 or 2), Datastore expects opaque tokens that look similar to this: 'CjsSNWoIb3Z5LXRlc3RyKQsSBFVzZXIYgICAgICAgAoMCxIIQ3ljbGVEYXkiCjIwMjAtMTAtMTYMGAAgAA=='
You can obtain such a cursor from the first call to the fetch_page() method, which returns a tuple (results, cursor, more), where results is a list of query results, cursor is a cursor pointing just after the last result returned, and more indicates whether there are (likely) more results after that.
Secondly, you should be using fetch_page() instead of fetch_page_async(), since the async variant does not return the cursors you need for pagination. Internally, fetch_page() calls fetch_page_async() to get your query results.
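As a minimal sketch (page size and token handling are my assumptions, MyModel is from your question, and it presumes default project credentials), the flow for a web frontend could look like this, using the cursor's urlsafe() form as the page token:

from google.cloud import ndb

client = ndb.Client()
with client.context():
    # First page: no start cursor yet.
    results, cursor, more = MyModel.query().fetch_page(5)
    # urlsafe() returns bytes; decode it so it can travel in a URL or form field.
    next_token = cursor.urlsafe().decode() if cursor else None

    # Next page: rebuild the cursor from the token the frontend sends back.
    if more and next_token:
        results, cursor, more = MyModel.query().fetch_page(
            5, start_cursor=ndb.Cursor(urlsafe=next_token)
        )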
Thirdly and lastly, I am not entirely sure whether the "previous page" use case is doable with the Datastore-provided pagination. You may need to implement that yourself, by storing some of the cursors.
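For example, a rough sketch (reusing the imports above, with a plain dict standing in for whatever session storage you actually use) that remembers which token starts each page:

page_tokens = {1: None}  # page number -> urlsafe token that starts that page

def fetch_page_number(page, page_size=5):
    token = page_tokens.get(page)
    start = ndb.Cursor(urlsafe=token) if token else None
    results, cursor, more = MyModel.query().fetch_page(page_size, start_cursor=start)
    if more and cursor:
        # Remember where the next page starts so we can come back to it later.
        page_tokens[page + 1] = cursor.urlsafe().decode()
    return results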
I hope that helps and good luck!
So I'm new to Pymongo & MongoDB, and I'm just confused as to how best to go about this problem. I have two collections:
Raw_collection
Processed_collection
Basically, I have raw documents that go into the Raw_collection, after which I process them by dropping some documents based on filters etc, and store the remaining documents into Processed_collection. Specifically, I plan to periodically update the records in Raw_collection as well.
As such, what would be the best way to process only the newly inserted documents to Raw_collection on a successive update? I looked into bulk methods but I'm not sure if that's what I want... this seems like a simple-ish problem to solve, but because of my inexperience I'm not sure what the solution would be. Any help is greatly appreciated, thanks!
So I ended up doing this with pymongo's insert_many method:
import pymongo

db = pymongo.MongoClient()["mydb"]  # connection details assumed

def insert_raw_collection(documents):  # call first
    result = db["Raw_collection"].insert_many(documents)
    obj_id_list = result.inserted_ids
    # e.g. [ObjectId('54f113fffba522406c9cc20e'), ObjectId('54f113fffba522406c9cc20f')]
    return obj_id_list

def insert_processed_collection(obj_id_list):  # call second
    cursor = db["Raw_collection"].find({"_id": {"$in": obj_id_list}})
    for doc in cursor:
        if filter(doc):  # filter() is my own predicate, not the builtin
            # do something, e.g. copy the document into Processed_collection
            db["Processed_collection"].insert_one(doc)
Basically, I return the list of inserted ObjectIds from the insert step and then run my filtering over exactly those documents, so I know which ones to keep.
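With those two helpers in place, each periodic update is just two calls (the sample documents here are made up):

new_docs = [{"symbol": "ABC", "price": 1.0}, {"symbol": "XYZ", "price": 2.0}]  # hypothetical batch
ids = insert_raw_collection(new_docs)   # step 1: store the raw batch
insert_processed_collection(ids)        # step 2: filter and process only those ids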
I have lots of records in my Postgres database (using Sequelize to communicate with it).
I want to have a migration script, but due to locking, I have to make each change as atomic as possible.
So I don't want to selectAll, then modify, then saveAll.
In Mongo I have a forEach cursor, which lets me update a record, save it, and only then move on to the next one.
Anything similar in sequelize/postgres?
Currently I am doing that in my code: getting the IDs, then performing a query for each.
return migration.runOnAllUpdates((record) => {
  record.change = 'new value';
  return record.save();
});
where runOnAllUpdates will simply give me records one by one.
When I run
db.collection('example').update({"a": 1}, {"$set": {"b": 2}}, {multi: true}, function(e, r) {
I get r:
{
  n: 3,
  nModified: 3,
  ok: 1
}
This works: if I look at my db I can see that I have successfully updated 3 documents. But where are my results?
Quoted from https://mongodb.github.io/node-mongodb-native/markdown-docs/insert.html
callback is the callback to be run after the records are updated. Has three parameters, the first is an error object (if error occured), the second is the count of records that were modified, the third is an object with the status of the operation.
I've tried with 3 parameters in the callback, but then I just get null as a result:
db.collection('example').update({"a": 1}, {"$set": {"b": 2}}, {multi: true}, function(e, n, r) {
My documents have been successfully updated but r is null!
I am expecting this to return my updated documents. It doesn't look like this operation ever does, so how can I manually retrieve the documents that got changed?
You can use findAndModify to get the updated document in the result. Its callback has 2 parameters:
1- the error
2- the updated document
I am not sure this would work for you, but check the documentation for more info: https://mongodb.github.io/node-mongodb-native/markdown-docs/insert.html#find-and-modify
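For what it's worth, the same server-side findAndModify command is exposed by every driver; as a rough illustration in Python, pymongo's find_one_and_update can hand back the document as it looks after the update (connection and names here are hypothetical):

from pymongo import MongoClient, ReturnDocument

coll = MongoClient()["mydb"]["example"]  # hypothetical connection and names
updated_doc = coll.find_one_and_update(
    {"a": 1},
    {"$set": {"b": 2}},
    return_document=ReturnDocument.AFTER,  # return the post-update document
)

Keep in mind that findAndModify only ever touches one document per call, unlike your {multi: true} update.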
Note that db.collection.bulkWrite won't return the updated documents either; like update, its result only contains counts and status. There is no multi-document update that hands back the modified documents in one call, so you either run findAndModify (findOneAndUpdate) once per document, or re-query the affected documents (for example by _id or by the same filter) after the update.