CouchDB conflicts when supplying own ID with large inserts using _bulk_docs - node.js

The same code works fine when letting CouchDB auto-generate UUIDs. I am starting with a completely empty database, yet I keep getting this:
error: conflict
reason: Document update conflict
To reiterate, I am posting new documents to an empty database, so I am not sure how I can get update conflicts when nothing is being updated. Stranger still, the conflicting documents show up in the DB with only a single revision, but overall there are missing records.
I am trying to insert about 38,000 records with _bulk_docs in batches of 100. I am getting these records (100 at a time) from a RETS server; each record already has a unique ID that I want to use as the CouchDB _id instead of an auto-generated UUID. I am using a promise-based library to get the records and axios to insert them into CouchDB. After getting each batch of 100, I run this code to add an _id to each of the 100 records before inserting:
// Use each listing's RETS ListingKey as the CouchDB _id
const batch = records.results.map((listing) => {
  listing._id = listing.ListingKey;
  return listing;
});
Then insert:
axios.post('http://127.0.0.1:5984/rets_store/_bulk_docs', { docs: batch })
This all runs inside a function that I call recursively.
I know this probably won't be enough to see the issue, but I thought I'd start here. I know for sure it has something to do with my map() and adding _id = ListingKey.
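One way to see exactly which _ids collide is to inspect the per-document status array that _bulk_docs returns; a minimal sketch, reusing the endpoint and batch variable from above:
// _bulk_docs responds with one status object per posted document;
// rejected docs come back with error/reason instead of ok/rev.
axios.post('http://127.0.0.1:5984/rets_store/_bulk_docs', { docs: batch })
  .then((res) => {
    const conflicts = res.data.filter((row) => row.error === 'conflict');
    if (conflicts.length > 0) {
      console.log('Conflicting _ids:', conflicts.map((row) => row.id));
    }
  });
If the same ListingKey ever appears twice, within a batch or across batches, it will show up here as a conflict on the second write.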
Thanks!

Related

Use an array of values to query Firestore and set up a snapshot listener

Here is my problem:
I have a Firestore collection that has a number of documents. About 500 documents are generated/updated every hour and saved to the collection.
I would like to query the collection and set up a real-time snapshot listener for a subset of document IDs that are provided by the client.
I think maybe I could do something like this (this syntax is likely not correct... just trying to get a feel for whether it's even possible... but isn't the "in" operator limited to an array of 10 items?):
const subbedDocs = ["doc1","doc2","doc3","doc4","doc5"]
docsRef.where('docID', 'in', subbedDocs).onSnapshot((doc) => {
handleSnapshot(doc);
});
I'm sorry, that code probably doesn't make sense... I'm still trying to learn the ins and outs of Firestore.
Essentially, what I am trying to do is take an array of IDs and set up an .onSnapshot listener for those IDs. This list of IDs could be upwards of 40-50 items. Is this even possible? I am trying to avoid setting up a listener on the whole collection and filtering out documents I am not "subscribed" to, as that seems wasteful from a resources perspective.
If you have the doc IDs in your array (it looks like you do), you can loop over them and start a listener for each one:
const subbedDocs = ["doc1", "doc2", "doc3", "doc4", "doc5"];
for (let i = 0; i < subbedDocs.length; i++) {
const docID = subbedDocs[i];
docsRef.doc(docID).onSnapshot((doc) => {
handleSnapshot(doc);
});
}
It would be better to listen to a single query covering all filtered docs at once, but if you want to listen to each of them with an explicit listener, this will do the trick.
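One practical note if you take the per-document route: onSnapshot returns an unsubscribe function, so with 40-50 listeners it is worth keeping them around. A minimal sketch, assuming the same subbedDocs, docsRef, and handleSnapshot as above:
// Hedged sketch: collect each listener's unsubscribe function so
// all per-document listeners can be torn down later in one pass.
const unsubscribers = subbedDocs.map((docID) =>
  docsRef.doc(docID).onSnapshot((doc) => handleSnapshot(doc))
);

// Later, when the client unsubscribes:
unsubscribers.forEach((unsub) => unsub());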
As you've discovered, Firestore's in operator only allows up to 10 entries in the array. I'm also guessing you've added docID as a field in the document, since I don't believe docID references the actual document ID.
I would not take this approach, because of the 10-entry limitation. What I would do instead is, as the client selects documents to follow, set a field (the same in each document) to a unique ID for that client, so your query avoids the limitation entirely. You can allow an unlimited number of client listeners (up to Firestore's implementation limits) if you add that client ID into an array field (called something like "ListenerArray") as the client selects documents. Your query would look more like:
docsRef.where('ListenerArray', 'array-contains', clientID).onSnapshot((doc) => {
  handleSnapshot(doc);
});
array-contains checks a single value against all entries in a document's array, without limit, so every client can mark any number of documents to subscribe to.
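A minimal sketch of the subscribe step under this scheme, using the Admin SDK (the ListenerArray field name follows the answer; docsRef, docID, and clientID are assumed):
// Hedged sketch: record a client's subscription by adding its ID to
// the document's ListenerArray; arrayUnion avoids duplicate entries.
const { FieldValue } = require('firebase-admin').firestore;

function subscribeClient(docsRef, docID, clientID) {
  return docsRef.doc(docID).update({
    ListenerArray: FieldValue.arrayUnion(clientID),
  });
}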

immutable _id error when performing MongoDB bulkWrite replaceOne on first attempt only

I'm working on a little web application that will crawl and update baseball standings by day and track teams' positions (among other things) over time.
I have an API I grab all of this from and a collection in MongoDB that stores all the team data and information for the current day. Right now I just run this manually but eventually it'll be automated to run at like 3am or whenever.
The API supplies a unique ID for each team that never changes. So I take the team data from the API and pass it to a function that extracts the team data (there is other data in the response object I don't need), puts it into an object for replacement, and then, wherever that team ID exists in the collection, its document is replaced in a bulkWrite.
async function currentStandings(db, team_standings, callback) {
  const current_standings = db.collection('current_standings');
  let replacePool = [];
  for (const single_team of team_standings.data.standing) {
    // Queue a replacement keyed on the API's stable team_id
    let replaceOnePusher = {
      replaceOne: {
        "filter": { "team_id": single_team.team_id },
        "replacement": single_team
      }
    };
    replacePool.push(replaceOnePusher);
  }
  await current_standings.bulkWrite(replacePool);
  callback();
}
However, when I execute this code for the first time each day, I get an error reading BulkWriteError: After applying the update, the (immutable) field '_id' was found to have been altered to _id: ObjectId('5f26e57b6831761ac840bf1d') (not the same ID every day), and if I look in Compass the data isn't updated. If I immediately run the script again, it goes through successfully without error, and refreshing the data in Compass shows the correct data.
Can someone explain what is going wrong here? This is actually my first time using MongoDB; I wanted to learn it, and this pet project seemed like a good place to start.
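For reference, a minimal defensive sketch of building the pool (an assumption about the cause, not a confirmed fix): if an incoming object ever carries its own _id, dropping it keeps the replacement from altering the matched document's immutable _id:
// Hedged sketch: strip any _id from the incoming object so the
// replacement can never disagree with the matched document's _id.
for (const single_team of team_standings.data.standing) {
  const { _id, ...replacement } = single_team;
  replacePool.push({
    replaceOne: {
      filter: { team_id: single_team.team_id },
      replacement: replacement,
    },
  });
}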

MongoDB insertMany not inserting all documents when document count is over 50k

I am trying to add a list of contacts to my DB using insertMany. Everything works fine if the number of contacts I am adding is within 50k; if it exceeds that, not all the data is saved to the DB. The response I get after the insert is { ok: 1, n: 1047 }. Does anybody know why this is happening?
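A minimal sketch of one common workaround, assuming a Node driver collection and a docs array (both names illustrative): splitting the insert into smaller batches.
// Hedged sketch: insert in fixed-size chunks so no single insertMany
// call has to carry the whole 50k+ payload at once.
async function insertInChunks(collection, docs, chunkSize = 1000) {
  let inserted = 0;
  for (let i = 0; i < docs.length; i += chunkSize) {
    const chunk = docs.slice(i, i + chunkSize);
    const res = await collection.insertMany(chunk, { ordered: false });
    inserted += res.insertedCount;
  }
  return inserted;
}
Comparing the returned total against the source array's length also makes a silent shortfall like n: 1047 visible immediately.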

Resource Conflict after syncing with PouchDB

I am new to CouchDB/PouchDB and until now I somehow managed the start of it all. I am using the couchdb-python library to send initial values to my CouchDB before I start development of the actual application. Here I have one database with templates of the data I want to include, and the actual database of all the data I will use in the application.
couch = couchdb.Server()
templates = couch['templates']
couch.delete('data')
data = couch.create('data')
In Python I have a loop in which I send one value after another to CouchDB:
value = templates['Template01']
value.update({ '_id' : 'Some ID' })
value.update({'Other Attribute': 'Some Value'})
...
data.save(value)
It was working fine the whole time; I needed to run this several times as my data had to be adjusted. After I was satisfied with the results, I started to create my application in JavaScript. I synced PouchDB with the data database and it was also working. However, I found out that I needed to change something in the Python code, so I ran the first Python script again, but now I get this error:
couchdb.http.ResourceConflict: (u'conflict', u'Document update conflict.')
I tried to destroy() the PouchDB database data and delete the CouchDB database as well, but I still get this error at this part of the code:
data.save(value)
What I also don't understand is that a few values are actually passed to the database before this error appears, so some values do get save()d into the DB.
I read that it has something to do with the _rev values of the documents, but I cannot find an answer. I hope someone can help here.
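For illustration, the _rev mechanism behind the conflict, shown as a minimal sketch on the PouchDB side (db and value are assumed): overwriting an existing _id requires carrying its current _rev forward.
// Hedged sketch: CouchDB/PouchDB MVCC rejects a write to an existing
// _id unless it carries the document's current _rev; writing without
// one (or with a stale one) raises exactly this 409 conflict.
async function upsert(db, value) {
  try {
    const existing = await db.get(value._id);
    value._rev = existing._rev; // carry the latest revision forward
  } catch (err) {
    if (err.status !== 404) throw err; // 404 just means a brand-new doc
  }
  return db.put(value);
}
The same applies on the couchdb-python side: fetching the stored document first and reusing its _rev lets the save go through.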

MongoDB: how to compare DB to new data

Each week I receive a new copy of source data (approximately 8,500 records, and growing, with an id field that Mongo uses as _id) and I want to look for (and save, while keeping the old data) updated information (about 30 changes/additions per month are likely). I'm trying to work out the best approach.
My first thought was, for each entry in the new data, get the DB entry with that _id, compare, and update where changed. But that results in 8,500 asynchronous calls over the net (to mongolab), plus 30 upserts where new/changed data needs to be saved.
So the alternative is to download everything at the outset. But then I end up with an array from Mongo and would need to do Array.find each time to get the element that matches the new data.
Is there a Mongo command to return the results of .find({}) as a JavaScript object keyed by _id? Or does it otherwise make sense to take the raw array from Mongo and convert it myself into an object?
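As far as I know there is no single driver call that returns a keyed object, but the conversion is a short client-side step; a minimal sketch, assuming a connected Node driver collection:
// Hedged sketch: fetch everything once, then key the docs by _id so
// each incoming record can be checked with an O(1) lookup.
const docs = await collection.find({}).toArray();
const byId = docs.reduce((acc, doc) => {
  acc[doc._id] = doc;
  return acc;
}, {});

// Usage: compare an incoming record against the stored copy.
const stored = byId[incoming.id];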
I would store: id + version + date + data.
For each update:
- make a dump of the production DB for local usage
- work offline, in a local MongoDB (because you don't want to launch 9,000 queries over the web)
- for each line, compare the new data to the Mongo data; if there are modifications, store a new (id + version) document, otherwise skip (see the sketch below)
- make a dump of your local DB
- install the dump in the production environment
See the MongoDB documentation on mongodump.
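A minimal sketch of that compare-and-version step (the field names id, version, date, and data are assumptions taken from the answer):
// Hedged sketch: keep all old versions; only write a new document
// when the incoming data differs from the latest stored version.
async function upsertVersioned(collection, incoming) {
  const latest = await collection
    .find({ id: incoming.id })
    .sort({ version: -1 })
    .limit(1)
    .next();

  // Naive deep-equality via JSON.stringify (assumes stable key order).
  if (latest && JSON.stringify(latest.data) === JSON.stringify(incoming.data)) {
    return; // unchanged, skip
  }

  await collection.insertOne({
    id: incoming.id,
    version: latest ? latest.version + 1 : 1,
    date: new Date(),
    data: incoming.data,
  });
}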
