MongoDB: how to compare DB to new data - node.js

Each week I receive a new copy of the source data (approximately 8,500 records and growing, each with an id field that Mongo uses as _id). I want to find updated information and save it while keeping the old data; about 30 changes or additions per month are likely. I'm trying to work out the best approach.
My first thought was, for each entry in the new data, to get the DB entry with that _id, compare, and update the data where it changed. But that results in 8,500 asynchronous calls over the network (to mongolab), plus about 30 upserts where new or changed data needs to be saved.
The alternative is to download everything at the outset. But then I end up with an Array from Mongo and would need to call Array.find each time to get the element that matches the new data.
Is there a Mongo command to return the results of .find({}) as a JavaScript Object keyed by _id? Or does it otherwise make sense to take the raw array from Mongo and convert it to an object myself?
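For the "convert it myself" route, a minimal sketch follows. It assumes the whole collection has already been loaded into an array (for example with find({}).toArray()) and that a record counts as changed when its JSON text differs. The names indexById and findChanges and the sample data are hypothetical.

```javascript
// Build a Map keyed by _id from the array of documents loaded from the DB.
function indexById(docs) {
  const byId = new Map();
  for (const doc of docs) {
    byId.set(String(doc._id), doc);
  }
  return byId;
}

// Return the incoming records that are new or differ from the stored copy.
// Comparison is by JSON text, which assumes both sides list their fields
// in the same order.
function findChanges(existingDocs, incomingDocs) {
  const byId = indexById(existingDocs);
  const changed = [];
  for (const incoming of incomingDocs) {
    const current = byId.get(String(incoming._id));
    if (!current || JSON.stringify(current) !== JSON.stringify(incoming)) {
      changed.push(incoming);
    }
  }
  return changed;
}

// Hypothetical sample data standing in for the DB contents and the weekly file.
const existing = [
  { _id: 1, name: 'alpha' },
  { _id: 2, name: 'beta' },
];
const incoming = [
  { _id: 1, name: 'alpha' },
  { _id: 2, name: 'beta v2' },
  { _id: 3, name: 'gamma' },
];

console.log(findChanges(existing, incoming).map((d) => d._id)); // [ 2, 3 ]
```

Each lookup in the Map is constant time, so the comparison is a single pass over the new data rather than an Array.find per record.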

I would store:
id + version + date + data
For each update:
Make a dump of the production DB for local use.
Work offline in a local MongoDB (because you don't want to launch 9,000 queries over the web).
For each line of the new data:
compare the new data with the data in Mongo;
if anything was modified, store a new (or first) id + version entry;
otherwise skip it.
Make a dump of your local DB.
Install the dump in the production environment.
Reference: the MongoDB documentation on dumps.
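The compare-and-version step above could look roughly like this. It is only a sketch: the record shape { id, version, date, data } follows the list above, while the function name nextVersion and the JSON-based comparison are assumptions.

```javascript
// Decide whether an incoming record needs a new version entry.
// `versions` holds the stored versions for one id (hypothetical shape:
// { id, version, date, data }); `incomingData` is the payload from the new file.
function nextVersion(versions, id, incomingData, now = new Date()) {
  // Pick the stored entry with the highest version number, if any.
  const latest = versions.reduce(
    (best, v) => (best === null || v.version > best.version ? v : best),
    null
  );
  // Unchanged payload: nothing to store, so the caller can skip this line.
  if (latest && JSON.stringify(latest.data) === JSON.stringify(incomingData)) {
    return null;
  }
  return {
    id,
    version: latest ? latest.version + 1 : 1,
    date: now.toISOString(),
    data: incomingData,
  };
}

// Hypothetical stored history for one record.
const history = [{ id: 'a', version: 1, date: '2020-01-01T00:00:00.000Z', data: { price: 10 } }];

console.log(nextVersion(history, 'a', { price: 10 }));         // null
console.log(nextVersion(history, 'a', { price: 12 }).version); // 2
console.log(nextVersion([], 'b', { price: 5 }).version);       // 1
```

Because old versions are never overwritten, the earlier data stays available alongside each change.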

Related

Redis Compatible Reversible Data Structure for a Binary Search?

I have a chat module written in Node.js and Redis. It loads all the DB users into Redis and later retrieves them by key, as a Redis server is expected to do.
To store them, I used the User_ID with a prefix as the key and JSON as the value, as below:
entry.user_id = rows[i].user_id;
entry.uname = rows[i].uname.toString();
client.set('chat_userid_' + entry.user_id, JSON.stringify(entry));
This works fine as long as we look up a user's data by User_ID only. Sometimes I have to find a user by name as well. To search by name, I had to add a second key pointing to the same value, just for that search:
entry.user_id = rows[i].user_id;
entry.uname = rows[i].uname.toString();
client.set('chat_uname_' + entry.uname, JSON.stringify(entry));
As you can see, the above data structure is redundant and performs poorly. Is there a better data structure for storing the user data in Redis that gives the same result for the above use case?
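One common way to reduce the redundancy is to store the full record once under the id key and keep only a small name-to-id pointer under the name key. The sketch below illustrates the idea with an in-memory stand-in so it runs without a server. fakeClient is not the node Redis client, the helper names are hypothetical, and the sketch assumes user names are unique.

```javascript
// In-memory stand-in for a Redis connection so the sketch runs without a server.
// It is NOT the node Redis client; only the key/value layout carries over.
const store = new Map();
const fakeClient = {
  set: (key, value) => { store.set(key, value); },
  get: (key) => (store.has(key) ? store.get(key) : null),
};

// Store the full record once, under the id key, plus a small name -> id pointer.
function saveUser(client, user) {
  client.set('chat_userid_' + user.user_id, JSON.stringify(user));
  client.set('chat_uname_' + user.uname, String(user.user_id));
}

function getUserById(client, userId) {
  const json = client.get('chat_userid_' + userId);
  return json === null ? null : JSON.parse(json);
}

// Two lookups: name -> id, then id -> record.
function getUserByName(client, uname) {
  const userId = client.get('chat_uname_' + uname);
  return userId === null ? null : getUserById(client, userId);
}

saveUser(fakeClient, { user_id: 42, uname: 'alice' });
console.log(getUserByName(fakeClient, 'alice')); // { user_id: 42, uname: 'alice' }
```

With a real Redis client the calls would be asynchronous, and a search by name costs one extra round trip in exchange for storing each record only once.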

How to check if two files have the same content

I am working with a Node.js application. I am querying an API endpoint and storing the retrieved data in a database. Everything is working well. However, there are instances where some data is not pushed to the database. In that case, what I normally do is manually query the endpoint, giving the application the date when the data was lost, and retrieve it, since it's stored on a server that automatically deletes the data after 2 days. The API and database fields are identical.
The following is not the problem, but to give you context: I would like to automate this process by making the application retrieve all the data for the past 48 hours and save it in a .txt file inside the app. I will do the same for my mssql database, querying it to retrieve the data for the past 48 hours.
My question is: how can I check whether the contents of my api.txt file are the same as those of db.txt?
You could make use of buf.equals(), as detailed in the docs:
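const fs = require('fs');

// Read both files into Buffers
const api = fs.readFileSync('api.txt');
const db = fs.readFileSync('db.txt');

// Returns a boolean: true when both Buffers hold exactly the same bytes
api.equals(db);
So that:
if (api.equals(db)) {
  console.log('equal');
} else {
  console.log('not equal');
}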
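If the files can get large, it may be worth checking their sizes before reading them. This sketch wraps the same Buffer comparison in a helper. filesEqual is a hypothetical name, and the temp files exist only to make the example runnable.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Compare two files byte for byte, checking sizes first so that files of
// different length are rejected without reading their contents.
function filesEqual(pathA, pathB) {
  if (fs.statSync(pathA).size !== fs.statSync(pathB).size) {
    return false;
  }
  return fs.readFileSync(pathA).equals(fs.readFileSync(pathB));
}

// Demo with throwaway files in a temp directory (file names are placeholders).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cmp-'));
const apiFile = path.join(dir, 'api.txt');
const dbFile = path.join(dir, 'db.txt');

fs.writeFileSync(apiFile, 'row1\nrow2\n');
fs.writeFileSync(dbFile, 'row1\nrow2\n');
console.log(filesEqual(apiFile, dbFile)); // true

fs.writeFileSync(dbFile, 'row1\nrow3\n');
console.log(filesEqual(apiFile, dbFile)); // false
```

Note that a byte comparison treats the same rows in a different order, or with different whitespace, as not equal, so both files need to be written in the same format and ordering.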

couchDB conflicts when supplying own ID with large inserts using _bulk_docs

The same code works fine when letting Couch auto-generate UUIDs. I am starting off with a new, completely empty database, yet I keep getting this:
error: conflict
reason: Document update conflict
To reiterate, I am posting new documents to an empty database, so I'm not sure how I can get update conflicts when nothing is being updated. Even stranger, the conflicting documents still show up in the DB with only a single revision, but overall there are missing records.
I am trying to insert about 38,000 records with _bulk_docs in batches of 100. I am getting these records (100 at a time) from a RETS server; each record already has a unique ID that I want to use for the CouchDB _id instead of Couch's UUIDs. I am using a promise-based library to get the records and axios to insert them into Couch. After getting the first batch of 100, I run this code to add an _id to each of the 100 records before inserting:
let batch = [];
batch = records.results.map((listing) => {
let temp = listing;
temp._id = listing.ListingKey;
return temp;
});
Then insert:
axios.post('http://127.0.0.1:5984/rets_store/_bulk_docs', { docs: batch })
This is all inside a function that I call recursively.
I know this probably won't be enough to see the issue, but I thought I'd start here. I know for sure it has something to do with my map() and adding the _id = ListingKey.
Thanks!
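One thing that might be worth ruling out is the same ListingKey arriving more than once, either within a batch or across batches. That is only a guess, and nothing in the question confirms it. A sketch of how to check follows. prepareBatch and conflictIds are hypothetical helpers, and the sample response is made up to match the per-document result shape that _bulk_docs returns.

```javascript
// Copy each listing and set _id from ListingKey, keeping only the first
// occurrence of each key and reporting the repeats. Passing the same
// `seenIds` Set to every call also catches repeats across batches.
function prepareBatch(listings, seenIds = new Set()) {
  const docs = [];
  const duplicates = [];
  for (const listing of listings) {
    const id = String(listing.ListingKey);
    if (seenIds.has(id)) {
      duplicates.push(id);
      continue;
    }
    seenIds.add(id);
    docs.push({ ...listing, _id: id });
  }
  return { docs, duplicates };
}

// _bulk_docs answers with one entry per document; rejected ones carry `error`.
function conflictIds(bulkResponse) {
  return bulkResponse.filter((r) => r.error === 'conflict').map((r) => r.id);
}

// Hypothetical batches: the key 'A2' is delivered twice.
const seen = new Set();
const first = prepareBatch([{ ListingKey: 'A1' }, { ListingKey: 'A2' }], seen);
const second = prepareBatch([{ ListingKey: 'A2' }, { ListingKey: 'A3' }], seen);
console.log(first.docs.map((d) => d._id)); // [ 'A1', 'A2' ]
console.log(second.duplicates);            // [ 'A2' ]

// Made-up response body in the shape returned by _bulk_docs.
const sampleResponse = [
  { ok: true, id: 'A1', rev: '1-abc' },
  { id: 'A2', error: 'conflict', reason: 'Document update conflict.' },
];
console.log(conflictIds(sampleResponse)); // [ 'A2' ]
```

If duplicates turns out non-empty, that would explain conflicts on a fresh database: the first copy is created and the second is rejected because it carries no _rev.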

Resource Conflict after syncing with PouchDB

I am new to CouchDB / PouchDB, and so far I have managed to get started. I am using the couchdb-python library to send initial values to my CouchDB before I start developing the actual application. I have one database with templates of the data I want to include, and the actual database holding all the data I will use in the application.
couch = couchdb.Server()
templates = couch['templates']
couch.delete('data')
data = couch.create('data')
In Python I have a loop in which I send one value after another to CouchDB:
value = templates['Template01']
value.update({ '_id' : 'Some ID' })
value.update({'Other Attribute': 'Some Value'})
...
data.save(value)
This worked fine the whole time; I needed to run it several times because my data had to be adjusted. Once I was satisfied with the results, I started to create my application in JavaScript. I then synced PouchDB with the data database, and that worked too. However, I found out that I needed to change something in the Python code, so I ran the first Python script again, and now I get this error:
couchdb.http.ResourceConflict: (u'conflict', u'Document update conflict.')
I tried to destroy() the PouchDB database data and to delete the CouchDB database as well, but I still get this error at this part of the code:
data.save(value)
What I also don't understand is that a few values are actually passed to the database before this error occurs, so some values do get saved into the DB.
I read that it has something to do with the _rev values of the documents, but I cannot find an answer. I hope someone can help here.

Meteor last executed query in mongodb?

Meteor Mongo and MongoDB queries are not the same. I am using an external MongoDB, so I need to debug my query. Is there any way to find the last executed query in Mongo?
I don't know if this works in meteor mongo, but you seem to be using an external Mongo. You could set up profiling with a capped collection, so that the collection never grows over a certain size. If you only need the last op, you can make the size much smaller than this:
db.createCollection( "system.profile", { capped: true, size:4000000 } )
The mongo doc is here: http://docs.mongodb.org/manual/tutorial/manage-the-database-profiler/
From the mongo docs:
To return the most recent 10 log entries in the system.profile collection, run a query similar to the following:
db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()
Since it's sorted inversely by time, just take the first record from the result.
Otherwise you could roll your own with a temporary client-only mongo collection:
Queries = new Mongo.Collection(null);
Create an object containing your query, remove the previous record, and insert the new one.
