I have lots of records in my Postgres database (using Sequelize to communicate with it).
I want to have a migration script, but due to locking, I have to make each change as atomic as possible.
So I don't want to selectAll, then modify, then saveAll.
In Mongo I have a forEach cursor which allows me to update a record, save it, and only then move to the next one.
Is there anything similar in Sequelize/Postgres?
Currently, I am doing that in my code: getting the IDs, then performing a query for each of them.
return migration.runOnAllUpdates((record)=>{
record.change = 'new value';
return record.save()
});
where runOnAllUpdates will simply give me records one by one.
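For what it's worth, here is a minimal sketch of how such a helper could look with plain Sequelize (assuming a model called Record and Sequelize v5+; the helper name matches the one above, but everything else is illustrative, not an existing Sequelize API):

// Hedged sketch: fetch only the IDs first, then load and update each row
// individually so every change is its own short-lived statement.
async function runOnAllUpdates(updateFn) {
  const rows = await Record.findAll({ attributes: ['id'], raw: true });
  for (const { id } of rows) {
    const record = await Record.findByPk(id); // findById in older Sequelize versions
    if (!record) continue;                    // the row may have been deleted meanwhile
    await updateFn(record);                   // the callback modifies and saves the record
  }
}

This mirrors the Mongo forEach-cursor behaviour: each record is fetched, updated and saved before the next one is touched.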
Related
I have an array of objects and I want to update all records in one query:
var arrayObj = [
  { id: '5e346213bec252771415a9ee', status: 1, date: '01-2-2020' },
  { id: '5e346213bec252471415a9efr', status: 2, date: '02-2-2020' },
  { id: '5e346213bec252771415a9ee', status: 3, date: '01-3-2020' }
];
Leads.update();
I am new to Node and Mongo. How can I update these? I don't want to use a loop, as I have already done it with a loop; now I want to learn this approach.
What I am trying to say is:
You can't bulk update in this way, just passing many objects to some magic function.
MongoDB is a document-oriented database, so it's not normalised. To update a document, you need to pass instructions (where you want to update and what you want to update).
If you have an array, you need a loop to update each array item.
There's no way (without some third-party library) to update many documents without a loop. The links I sent explain the right way to do this.
You can use bulk operations.
var bulkOpr = <collectionName>.initializeUnorderedBulkOp();
bulkOpr.find({ _id: 1 }).updateOne({ /* update document */ });
bulkOpr.find({ _id: 2 }).updateOne({ /* update document */ });
// etc.
bulkOpr.execute();
You can queue up whatever operations you need and the database does them all at once.
Ref. links:
https://docs.mongodb.com/manual/core/bulk-write-operations/
https://mongodb.github.io/node-mongodb-native/api-generated/unordered.html
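For reference, newer versions of the Node.js MongoDB driver (3.x+) and Mongoose also expose bulkWrite(), which still builds the operations in a loop but sends them to the server as a single batch. A rough, untested sketch against the array from the question (assuming Leads is a Mongoose model and that id maps to the document _id, converted to an ObjectId if necessary):

// Build one updateOne operation per item and send them all as one batch.
const operations = arrayObj.map((item) => ({
  updateOne: {
    filter: { _id: item.id },
    update: { $set: { status: item.status, date: item.date } }
  }
}));

Leads.bulkWrite(operations)
  .then((result) => console.log(result.modifiedCount, 'documents updated'))
  .catch(console.error);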
I have the following .xlsx file:
My software, regardless of its language, will produce the following graph:
My software iterates line by line and on each line iteration executes the following query:
MERGE (A:POINT {x:{xa},y:{ya}}) MERGE (B:POINT {x:{xb},y:{yb}}) MERGE (C:POINT {x:{xc},y:{yc}}) MERGE (A)-[:LINKS]->(B)-[:LINKS]->(C) MERGE (C)-[:LINKS]->(A)
Will this avoid inserting duplicate entries?
According to this question, yes it will avoid writing duplicate entries.
The query above will match any existing nodes and it will avoid writing duplicates.
A good rule of thumb is: for each node that may be a duplicate, write a separate MERGE query, and afterwards write the MERGE statements for each relationship between two nodes.
Update
After some experience with asynchronous technologies such as node.js (or even parallel threads), you must make sure that you read the next line AFTER you have inserted the previous one. The reason is that doing multiple insertions asynchronously may result in multiple nodes in your graph that are actually the same one.
In a node.js project of mine I read the Excel file like this:
const iterateWorksheet=function(worksheet,maxRows,row,callback){
process.nextTick(function(){
//Skipping first row
if(row==1){
return iterateWorksheet(worksheet,maxRows,2,callback);
}
if(row > maxRows){
return;
}
const alphas=_.range('A'.charCodeAt(0),config.excell.maxColumn.charCodeAt(0));
let rowData={};
_.each(alphas,(column) => {
column=String.fromCharCode(column);
const item=column+row;
const key=config.excell.columnMap[column];
if(worksheet[item] && key ){
rowData[key]=worksheet[item].v;
}
});
// The callback is the insertion into the neo4j db
return callback(rowData,(error)=>{
if(!error){
return iterateWorksheet(worksheet,maxRows,row+1,callback);
}
});
});
}
As you can see, I only visit the next line once I have successfully inserted the previous one. I have not yet found a way to serialize the inserts the way most conventional RDBMSs do.
In the case of web or server applications, another UNTESTED approach is to use queue servers such as RabbitMQ or similar in order to queue the queries. Then the code responsible for insertion will read from the queue, so the whole isolation concern lives in the queue.
Furthermore, ensure that all inserts are performed inside a transaction.
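For completeness, a rough, untested sketch of what that could look like with the official neo4j-driver package, where each row is inserted inside its own transaction and awaited before the next line is read (connection details and parameter names are illustrative):

const neo4j = require('neo4j-driver');

// Illustrative connection details; replace with your own.
const driver = neo4j.driver('bolt://localhost:7687', neo4j.auth.basic('neo4j', 'password'));

async function insertRow(rowData) {
  const session = driver.session();
  try {
    // Everything inside writeTransaction runs in a single transaction;
    // awaiting it before reading the next line serializes the inserts.
    await session.writeTransaction((tx) =>
      tx.run(
        'MERGE (a:POINT {x:$xa, y:$ya}) ' +
        'MERGE (b:POINT {x:$xb, y:$yb}) ' +
        'MERGE (c:POINT {x:$xc, y:$yc}) ' +
        'MERGE (a)-[:LINKS]->(b)-[:LINKS]->(c) ' +
        'MERGE (c)-[:LINKS]->(a)',
        rowData
      )
    );
  } finally {
    await session.close();
  }
}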
I want to implement hashtag functionality with NodeJS and MongoDB, so that I can also count the uses. Whenever a user adds hashtags to a page, I want to push or update them in the database. Each hashtag looks like this:
{_id:<auto>, name:'hashtag_name', uses: 0}
The problem I'm facing is that the user can add new tags as well, so when he clicks 'done', I have to increment the 'uses' field for the existing tags and add the new ones. The trick is: how do I do this with only one Mongo instruction? So far I have thought of two possible ways of achieving this, but I'm not particularly happy with either:
Option 1
I have a service which fetches the existing tags from the db before the user starts to write a new article. Based on this, I can detect which tags are new and run 2 queries: one which will add the new tags, and another which will update the existing ones.
Option 2
I will send the list of tags to the server, and there I will run a find() for every tag; if I find one, I'll update it, if not, I'll create it.
Option 3 (without solution for now)
The best option would be to run a query which takes an array of tag names, does an $inc operation for the existing ones, and adds the missing ones.
The question
Is there a better solution? Can I achieve the end result from option #3?
You could do something like this; all of the operations will be executed in one batch. This is only a snippet to give an idea of how to do it:
var db = new Db('DBName', new Server('localhost', 27017));
// Establish connection to db
db.open(function(err, db) {
// Get the collection
var col = db.collection('myCollection');
var batch = col.initializeUnorderedBulkOp();
hashTagList.forEach(function(tag) {
// Queue one upsert per tag: increment 'uses' if the tag exists,
// create it otherwise (matching on the tag name rather than _id,
// since new tags do not have an _id yet).
batch.find({ name: tag.name }).upsert().updateOne({ $inc: { uses: 1 } });
});
batch.execute(function(err, result) {
db.close();
});
});
I would use the Bulk methods offered by MongoDB since version 2.6. In the same batch you could perform insert operations when the tag is new and update the counter when it already exists.
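With the newer drivers (and with Mongoose) the same idea can also be written with bulkWrite() and upserts. A rough sketch, assuming tagNames is the array of tag names sent by the client and a tags collection shaped like the document in the question:

// One upsert per tag name: increments 'uses' if the tag exists,
// creates it with uses: 1 otherwise. All operations go out as a single batch.
const operations = tagNames.map((name) => ({
  updateOne: {
    filter: { name: name },
    update: { $inc: { uses: 1 } },
    upsert: true
  }
}));

db.collection('tags').bulkWrite(operations)
  .then((result) => console.log(result.upsertedCount, 'created,', result.modifiedCount, 'updated'))
  .catch(console.error);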
I know that DeleteUser() will run procedures to delete all relationships etc. Will the private internal DeleteData with a where condition also delete all relationships, or will it just try deleting the main record from the table? If any relational data exists, will it throw an error?
If you call UserInfoProvider.DeleteData() it won't delete the related data. It just executes the object's deletion SQL query. It won't even look for the cms.user.removedependencies query.
On the other hand, calling DeleteData() upon an info object would cause the related data to be deleted.
If you need to bulk delete users, retrieve them from the DB using an object query first (make sure you restrict the columns; UserID should be enough). Then iterate through the collection, calling Delete() on each one of them.
foreach (var user in UserInfoProvider.GetUsers().Where("UserEnabled=0").Columns("UserID").TypedResult.Items)
{
user.Delete();
}
On the project which I am currently working on, I have to read an Excel file (with over 1000 rows), extract all of them and insert/update them in a database table.
In terms of performance, is it better to add all the records to a Doctrine_Collection and insert/update them afterwards using the fromArray() method? The other possible approach is to create a new object for each row (an Excel row becomes an object) and then save it, but I think that is worse in terms of performance.
Every time the Excel file is uploaded, its rows need to be compared to the existing objects in the database. If a row does not exist as an object, it should be inserted, otherwise updated. My first approach was to turn both the objects and the rows into arrays (or Doctrine_Collections) and then compare the two arrays before performing the needed operations.
Can anyone suggest any other possible approach?
We did a bit of this in a project recently, with CSV data. It was fairly painless. There's a symfony plugin, tmCsvPlugin, but we have extended it quite a bit since, so the version in the plugin repo is pretty out of date. Must add that to the #TODO list :)
Question 1:
I don't explicitly know about performance, but I would guess that adding the records to a Doctrine_Collection and then calling Doctrine_Collection::save() would be the neatest approach. I'm sure it would be handy if an exception were thrown somewhere and you had to roll back your last save.
Question 2:
If you could use a row field as a unique identifier (let's assume a username), then you could search for an existing record. If you find a record, and assuming that your imported row is an array, use Doctrine_Record::synchronizeWithArray() to update this record, then add it to a Doctrine_Collection. When complete, just call Doctrine_Collection::save().
A fairly rough 'n' ready implementation:
// set up a new collection
$collection = new Doctrine_Collection('User');
// assuming $row is an associative
// array representing one imported row.
foreach ($importedRows as $row) {
// try to find an existing record
// based on a unique identifier.
$user = Doctrine_Core::getTable('User')
->findOneByUsername($row['username']);
// create a new user record if
// no existing record is found.
if (!$user instanceof User) {
$user = new User();
}
// sync record with current data.
$user->synchronizeWithArray($row);
// add to collection.
$collection->add($user);
}
// done. save collection.
$collection->save();
Pretty rough but something like this worked well for me. This is assuming that you can use your imported row data in some way to serve as a unique identifier.
NOTE: be wary of synchronizeWithArray() if you're using sf1.2/Doctrine 1.0 - if I remember correctly, it was not implemented correctly. It works fine in Doctrine 1.2 though.
I have never worked with Doctrine_Collections, but I can answer in terms of database queries and code logic in a broader sense. I would apply the following logic:
1. Fetch all the existing rows of the corresponding table from the database in a single query and store them in an array, $storedSheet.
2. Create a single array of all the rows of the uploaded Excel sheet; call it $uploadedSheet. I guess the structures of the Doctrine_Collections $uploadedSheet and $storedSheet will be similar (both two-dimensional - rows and cells can be identified and compared).
3. Run foreach loops on $uploadedSheet as follows and only identify which rows need to be inserted and which need to be updated (do the actual queries later):
$rowsToBeUpdated =array();
$rowsToBeInserted=array();
foreach($uploadedSheet as $row=>$eachRow)
{
if(is_array($storedSheet[$row]))
{
foreach($eachRow as $column=>$value)
{
if($value != $storedSheet[$row][$column])
{//This is a representation of comparison
$rowsToBeUpdated[$row]=true;
break; //No need to check this row anymore - one difference detected.
}
}
}
else
{
$rowsToBeInserted[$row] = true;
}
}
4. This way you have two arrays. Now perform two database queries:
Bulk insert all those rows of $uploadedSheet whose numbers are stored in the $rowsToBeInserted array.
Bulk update all those rows of $uploadedSheet whose numbers are stored in the $rowsToBeUpdated array.
These bulk queries are the key to faster performance.
Let me know if this helped, or if you wanted to know something else.