Update many objects, create if not exist - node.js

I'm working with a MEAN application, and one of its functions is to upload CSV files, convert them to JSON, and persist them in a MongoDB database. Every month I receive a CSV with new records and with records that already exist in the database (with new information or not). In short, I need to update many objects and create them if they don't exist. My question is: what is the best way to do this, given that these files are very large?
The current version just create these records like this:
Patient.create(records, function(err, records) {
  if (err) {
    console.log(err);
    return res.send(err);
  }
  res.json(records);
});

You can do this by following a few simple steps:
In Node.js, use a CSV parser to convert the CSV into JSON.
Then take all the data from the collection and compare it with the new data.
For comparing the data you can use the lodash library; its methods are really fast.
Once you have the _ids of the documents that are new or need to be updated, you can use the query below:
db.collectionName.update({"_id": {$in: ids}}, {$set: {"key": "value"}}, {upsert: true, multi: true}, function(err, doc) {
  console.log(doc);
});
Don't forget to set upsert: true, because you also have new data.
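For very large files, a single bulk operation avoids one round trip per record. Below is a minimal sketch of that idea using Mongoose's bulkWrite(); it is not part of the original answer, and patientId is an assumed unique key you should swap for whatever field identifies a patient in your CSV:

// Hypothetical sketch: bulk-upsert the parsed CSV records in one call.
var ops = records.map(function(record) {
  return {
    updateOne: {
      filter: { patientId: record.patientId }, // assumed unique key: match an existing patient
      update: { $set: record },                // apply the new CSV data
      upsert: true                             // insert when no match is found
    }
  };
});

Patient.bulkWrite(ops)
  .then(function(result) {
    console.log('upserted:', result.upsertedCount, 'modified:', result.modifiedCount);
  })
  .catch(function(err) {
    console.log(err);
  });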
Hope it helps!


Query all Tables in Sails.js

I have a project that requires syncing, where syncing means gathering all the data from all tables at startup. Kinda easy.
However, with Node.js and the Sails.js framework, I can't seem to find a way to do so, as one model equals one table, all laid out in projectName/api/models/ as a single file for each.
My initial idea was to loop over everything in that directory so I could run my query for each model, but it didn't work when I tried.
Here is my source code for the simple query for only one model:
modelName.getDatastore().sendNativeQuery('SELECT * FROM table WHERE id = 0', function(err, res) {
  if (err) {
    console.log(err);
    return exits.success(err);
  }
  return exits.success(res);
});
With what I have tried (not in my sample above), I changed modelName into a string to test whether looping over the directory works, which it doesn't. I also tried temporarily creating a simple variable holding one of the models' names and used it for the query, which also didn't work. I'm at my wit's end and can't find a solution even on Google. Any help?
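One possible approach, offered as a sketch rather than a tested answer: at runtime Sails exposes every loaded model on the sails.models dictionary, keyed by the lowercased model identity, so you can iterate over that instead of reading the api/models/ directory yourself. A minimal sketch, assuming Sails 1.x where find() returns an awaitable:

// Iterate over the models Sails has already loaded (sails.models)
// instead of looping over the files in api/models/.
async function fetchAllTables() {
  var results = {};
  for (var name of Object.keys(sails.models)) {
    // find() with no criteria returns every record in that model's table.
    results[name] = await sails.models[name].find();
  }
  return results;
}

fetchAllTables()
  .then(function(all) { sails.log(all); })
  .catch(function(err) { sails.log.error(err); });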

Express with pug, Postgres and proper MVC

I recently started using Node.js + Express.js (generated with Pug) + pg-promise for handling the database.
My first target is to obtain data from Postgres (already set up) and display it nicely using render and Pug. Let's say it is the user list from a Users table.
From this RESTful tutorial I learned how to get data and return it as JSON; it worked.
Based on Mozilla's tutorial I separated my code:
routes/users.js: where for '/' I call the user_controller.user_list method (using router.get)
controllers/userController.js: here I have exported user_list, where I would like to ask the model for data and call render when I have results
queries.js: which is kind of my model? But I'm not sure. It has the API: a connection to the db with promises, and one function for every query I am going to use in the controllers. I believe I should have one model file per table (or any logical entity), but where should I store the pgp connections?
This file is based on the first tutorial I mentioned:
// queries.js (connectionString is set properly to my postgres)
var pgp = require('pg-promise')(options);
var db = pgp(connectionString);

function getUsers(req, res, next) {
  db.any('SELECT (user_id, username) FROM public.users ORDER BY user_id ASC LIMIT 1000')
    .then(function (data) {
      res.json({ data: data });
    })
    .catch(function (err) {
      return next(err);
    });
}

module.exports = {
  getUsers: getUsers
};
Here starts my problem: most tutorials use Mongoose, which is very model-db-schema-friendly, while what I have is a simple 'SELECT ...' string I pass to pg-promise's any() function.
Therefore I have no model class like User.
In userController.js I don't know how to call getUsers() and handle its data. Returning a JS object from getUsers() would be nice.
Also: where should I call render? In the controller, or only in
db.any(...).then(function (data) { <--here--> })
Before this, I also tried to embed the whole Postgres handling in the controller, but from db.any() I got this array to handle:
[ { row: '(1,John)' }, { row: '(2,Amy)' }, { row: '(50,Peter)' } ]
I didn't know how to go from there, and I had probably lost my API functionality as well ;-)
I am browsing through multiple tutorials on how to handle MVC, but usually they cover MongoDB and satisfy readers with res.send(), not render().
I am not sure that I understand what your question is exactly about, but since I do not have enough reputation to comment, I'll do my best to help you with your questions. :)
First, regarding the queries.js file, it is IMO not exactly a model, but rather a DAO (Data Access Object) file. The DAO comes between your Model (which is actually your database) and your Controller layers. There usually is a DAO file per object (User, Pet, whatever you want) in your data model.
When the data model is rather complex, it can be useful to use an Object Relational Mapping (ORM) such as Mongoose to map your database and execute complex processes on your objects. In such a case, you might need a specific file per object so as to describe your model and store your queries. But since you don't need an ORM, your DAO can directly interact with your database. That is why you do not have a User.js file.
Regarding the way the db object should be used, I think you should refer directly to pg-promise documentation on the matter.
IMPORTANT: For any given connection, you should only create a single Database object in a separate module, to be shared in your application (see the code example below). If instead you keep creating the Database object dynamically, your application will suffer from loss in performance, and will be getting a warning in a development environment (when NODE_ENV = development).
As a matter of fact, a db object in pg-promise sort of represents the database itself and is actually designed for the simultaneous use of several databases, which does not seem to be your case for the moment.
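A minimal sketch of that advice, with an assumed file name of db.js and a placeholder connection string:

// db.js - hypothetical shared-connection module: the pg-promise
// Database object is created once, at module load, and reused.
var pgp = require('pg-promise')();
var connectionString = 'postgres://user:password@localhost:5432/mydb'; // placeholder
var db = pgp(connectionString);

module.exports = db; // elsewhere: var db = require('./db');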
Finally, when it comes to the render function, I believe it should be in the controller, as your DAO is not supposed to know how the data it has gathered is going to be used.
Modularity is always a time-saving choice on the long-term.
Furthermore, note that you might later need a Business Layer between your DAO and your controller, in order to preprocess and postprocess the data you are going to persist or display. In such a case, if you need, for instance, to ask for data from your database, you will need to render the data after the Business Layer has processed it; if the render were done in the DAO layer, that would not be possible.
In the link I provided earlier to pg-promise's db object connection, you will also find documentation on the any() method. You might already have looked it up.
It specifically states that it returns
A promise object that represents the query result:
When no rows are returned, it resolves with an empty array.
When 1 or more rows are returned, it resolves with the array of rows.
so your returned data is a JS array. If you want to turn it into a JSON string, just use JSON.stringify(yourArray) to process your data before rendering it in your controller.
But I wonder if Pug is not able to use your data directly.
Also, if you cannot get any data out of your DAO, maybe you should check that your data object is not empty, as such a case is tolerated by the any() method. If you expect your query to always return something, you might want to consider using the many() or the one() methods.
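To make the DAO/controller split concrete, here is a minimal sketch of my own, not code from your project: the DAO returns a promise with plain data, and the controller calls render. Note that dropping the parentheses around the column list in your SELECT is what turns the { row: '(1,John)' } composite rows you saw into plain objects. The user_list template name is hypothetical.

// queries.js - the DAO only fetches data and returns a promise.
function getUsers() {
  // Without parentheses around the column list, pg-promise resolves
  // with plain objects like { user_id: 1, username: 'John' }.
  return db.any('SELECT user_id, username FROM public.users ORDER BY user_id ASC LIMIT 1000');
}
module.exports = { getUsers: getUsers };

// controllers/userController.js - the controller decides presentation.
var queries = require('../queries');

exports.user_list = function(req, res, next) {
  queries.getUsers()
    .then(function(users) {
      res.render('user_list', { users: users }); // 'user_list' is an assumed Pug template
    })
    .catch(next);
};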
I hope this helps you.

Mongo Database Fixing Names after Mongoose Index?

I had a very weird issue with the way Mongoose interacted with my Node and Mongo database.
I was using Express to create a basic GET API route to fetch some data from my MongoDB.
I had a database called test, and it had a collection called "billings",
so the schema and route were pretty basic:
apiRouter.route('/billing/')
  .get(function(req, res) {
    Billing.find(function(err, billings) {
      if (err) return res.send(err);
      // return the bills
      res.json(billings);
    });
  });
Where "Billing" was my mongoose schema. that simply had 1 object {test: string}
This worked fine, I got a response with all the items in my mongo db called "billings" which is only one item {test: "success"}
Next I created a collection called "historys"
I setup the exact same setup as my billings.
apiRouter.route('/historys/')
  // get all the history
  .get(function(req, res) {
    Historys.find(function(err, historys) {
      if (err) return res.send(err);
      // return the history
      res.json(historys);
    });
  });
where again "Historys" was my mongoose schema. This schema was identical in setup to my billings since I didnt have any real data, the fields were the same, i just had it with a test field so the json object returned from both billings and historys should have been
{ test: "success" }
However, this time I didnt get any data back, I just got an empty object
[].
I went through my code multiple times to make sure a capital hadn't got lost, or a comma somewhere, etc., but the code was identical. The setup and formatting in my MongoDB were identical. I went into Robomongo and viewed the database, and everything was named correctly.
Except, I now had two collections:
my original "Historys" AND a brand new collection "Histories".
Once I fixed my API route to look at Histories instead of Historys, I was able to get the test data successfully. I still, however, cannot pull data from Historys; it's like it doesn't exist, yet there it was in my Robomongo console when I refreshed.
I searched all my code for any mention of "histories" and got 0 results. How did the system know to fix the grammar of my collection name?
From the docs:
When no collection argument is passed, Mongoose produces a collection name by passing the model name to the utils.toCollectionName method. This method pluralizes the name. If you don't like this behavior, either pass a collection name or set your schemas collection name option.
So, when you did this in your schema definition:
mongoose.model('Historys', YourSchema);
mongoose created the Histories collection.
When you do:
db.historys.insert({ test: "success" })
through the MongoDB console, the historys collection is created if it doesn't already exist. That's why you have the two collections in your db. As the docs say, if you don't want Mongoose to create a collection with a pluralized name based on your model, just specify the name you want.
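For reference, a minimal sketch of both ways to pin the collection name; the schema shape is assumed from the question:

var mongoose = require('mongoose');

// Option 1: set the collection name in the schema options.
var historySchema = new mongoose.Schema(
  { test: String },
  { collection: 'historys' } // use exactly this collection, no pluralizing
);

// Option 2: pass the collection name as the third argument to model().
var Historys = mongoose.model('Historys', historySchema, 'historys');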

add an attachment to a document in couch db using nodejs

I want to update an existing document in CouchDB. I have an image and I want to add it to an existing document in the db without losing the previous fields.
I'm using Node.js with nano.
Thanks, this pointed me in the right direction. In the end I did it this way:
db.get(id, { revs_info: true }, function (error, objeto) {
  if (error) {
    return console.log('wrong id');
  }
  fs.readFile('image.jpg', function(err, data) {
    if (!err) {
      db.attachment.insert(id, 'imagen.jpg', data, 'image/jpg', { rev: objeto._rev }, function(err, body) {
        if (!err) console.log(body);
      });
    }
  });
});
Your question is not really clear about the specific problem, so here is some general guidance on updating documents.
When designing the database, make sure you set the ID yourself rather than letting CouchDB generate it. This way you can access the document directly when updating it.
When updating, you are required to prove that you are updating the most recent version of the document. I usually retrieve the document first to make sure I have the most recent '_rev' in the document I'll insert.
Finally, the update may fail if a different process has edited the document between your retrieving and updating it. So you should catch a failure on insert and repeat the process until you succeed.
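A minimal sketch of that get-modify-insert-retry cycle with nano's callback API; the helper and its arguments are my own illustration, not tested code:

// Hypothetical helper: fetch, mutate in place, insert, and retry on a
// 409 conflict (another process updated the document in between).
function updateDoc(db, id, mutate, callback) {
  db.get(id, function (err, doc) {
    if (err) return callback(err);
    mutate(doc); // apply your changes; _id and _rev stay on the doc
    db.insert(doc, function (err, body) {
      if (err && err.statusCode === 409) {
        return updateDoc(db, id, mutate, callback); // lost the race, retry
      }
      callback(err, body);
    });
  });
}

// usage: add an image URL field without touching the other fields
updateDoc(db, 'mydoc', function (doc) {
  doc.mynewimage = 'http://example.com/image.jpg'; // assumed field and URL
}, function (err, body) {
  console.log(err || body);
});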
That being said, there are two ways you can store an image:
As an attachment: I believe nano supports the attachment.insert() and attachment.get() functions for this.
As a reference: I would usually rather store the images elsewhere and just store the URL or file path to access them. I've not used nano much, but I believe you can do this with something like the below.
doc = db.get(docname);          // get the document with all existing fields
doc['mynewimage'] = myimageurl; // update the document with the new image
                                // (this assumes it's a dictionary)
db.insert(doc);                 // inserts the document with the correct
                                // _id (= docname) and _rev

Insert/update Doctrine object from Excel

On the project I am currently working on, I have to read an Excel file (with over 1000 rows), extract them all, and insert/update them into a database table.
In terms of performance, it is better to add all the records to a Doctrine_Collection and insert/update them afterwards using the fromArray() method, right? Another possible approach is to create a new object for each row (an Excel row becomes an object) and then save it, but I think that's worse in terms of performance.
Every time the Excel file is uploaded, its rows need to be compared with the existing objects in the database. If a row does not exist as an object, it should be inserted; otherwise updated. My first approach was to turn both the objects and the rows into arrays (or Doctrine_Collections) and then compare both arrays before performing the needed operations.
Can anyone suggest any other possible approach?
We did a bit of this in a project recently, with CSV data. It was fairly painless. There's a symfony plugin, tmCsvPlugin, but we have extended it quite a bit since, so the version in the plugin repo is pretty out of date. Must add that to the #TODO list :)
Question 1:
I don't explicitly know about performance, but I would guess that adding the records to a Doctrine_Collection and then calling Doctrine_Collection::save() would be the neatest approach. I'm sure it would also be handy if an exception were thrown somewhere and you had to roll back your last save.
Question 2:
If you can use a row field as a unique identifier (let's assume a username), then you can search for an existing record. If you find one, and assuming that your imported row is an array, use Doctrine_Record::synchronizeWithArray() to update this record; then add it to a Doctrine_Collection. When complete, just call Doctrine_Collection::save().
A fairly rough 'n' ready implementation:
// set up a new collection
$collection = new Doctrine_Collection('User');

// assuming $row is an associative
// array representing one imported row.
foreach ($importedRows as $row) {

    // try to find an existing record
    // based on a unique identifier.
    $user = Doctrine_Core::getTable('User')
        ->findOneByUsername($row['username']);

    // create a new user record if
    // no existing record is found.
    if (!$user instanceof User) {
        $user = new User();
    }

    // sync record with current data.
    $user->synchronizeWithArray($row);

    // add to collection.
    $collection->add($user);
}

// done. save collection.
$collection->save();
Pretty rough but something like this worked well for me. This is assuming that you can use your imported row data in some way to serve as a unique identifier.
NOTE: be wary of synchronizeWithArray() if you're using sf1.2/Doctrine 1.0; if I remember correctly, it was not implemented correctly. It works fine in Doctrine 1.2 though.
I have never worked with Doctrine_Collections, but I can answer in terms of database queries and code logic in a broader sense. I would apply the following logic:
1. Fetch all the existing rows from the database table in a single query and store them in an array $storedSheet.
2. Create a single array of all the rows of the uploaded Excel sheet; call it $uploadedSheet. I guess the structures of $uploadedSheet and $storedSheet will be similar (both two-dimensional: rows and cells can be identified and compared).
3. Run foreach loops on $uploadedSheet as follows, and only identify which rows need to be inserted and which updated (do the actual queries later):
$rowsToBeUpdated  = array();
$rowsToBeInserted = array();

foreach ($uploadedSheet as $row => $eachRow) {
    if (is_array($storedSheet[$row])) {
        foreach ($eachRow as $column => $value) {
            if ($value != $storedSheet[$row][$column]) {
                // This is a representation of comparison.
                $rowsToBeUpdated[$row] = true;
                break; // No need to check this row anymore - one difference detected.
            }
        }
    } else {
        $rowsToBeInserted[$row] = true;
    }
}
4. This way you have two arrays. Now perform two database queries:
Bulk insert all those rows of $uploadedSheet whose numbers are stored in the $rowsToBeInserted array.
Bulk update all those rows of $uploadedSheet whose numbers are stored in the $rowsToBeUpdated array.
These bulk queries are the key to faster performance.
Let me know if this helped, or if you want to know something else.
