Automatic document updates in CouchDB

I have 10,000+ CouchDB documents, each with a (simplified) format like this:
{
  "First Name" : "John",
  "Last Name" : "Doe"
}
I want to add another field, e-mail, to each document, so that it looks like this:
{
  "First Name" : "John",
  "Last Name" : "Doe",
  "e-mail" : ""
}
I understand that I can easily update a single document by writing a new JSON body in the new format.
But my question is: how can I add the new field automatically to all 10,000+ docs that already exist in the DB? Do I need to write my own script that reads each doc and updates each one individually, or is there a simpler way?

If you use views to access your data, you can modify the view without having to modify the documents. Just emit an email value with a default of "".
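For example, a map function along these lines serves a default for documents that lack the field (a sketch; the view name is just for illustration):
function(doc) {
  // views.by_name.map -- hypothetical view emitting the e-mail with a default
  emit([doc["Last Name"], doc["First Name"]], doc["e-mail"] || "");
}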
Assuming the above is no good, use a view to show you which documents need upgrading.
function(doc) {
  // views.email_upgrade.map
  if (!('e-mail' in doc)) {
    var key = [doc["Last Name"], doc["First Name"]];
    emit(key, {_id: doc._id, _rev: doc._rev});
  }
}
Query /db/_design/foo/_view/email_upgrade?include_docs=true; you can add a &limit=N parameter to work in batches. The doc value in each row is a document that needs the upgrade. Add the field and send the modified docs back with POST /db/_bulk_docs. Loop until the view returns 0 rows, then add a check to your validate_doc_update function so documents can no longer be saved without the field.
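A minimal Node.js sketch of that loop, assuming the nano client and the design-document name foo from the URL above (untested):
const nano = require('nano')('http://localhost:5984');
const db = nano.db.use('db');

async function upgradeAll() {
  for (;;) {
    // Fetch a batch of docs that still lack the e-mail field.
    const result = await db.view('foo', 'email_upgrade',
                                 { include_docs: true, limit: 100 });
    if (result.rows.length === 0) break;                  // done: no rows left
    const docs = result.rows.map(row => ({ ...row.doc, 'e-mail': '' }));
    await db.bulk({ docs });                              // POST /db/_bulk_docs
  }
}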

Related

Insert document to collection using data from an "inner select" from another collection

I have two Mongo collections: admins and logs.
An 'admins' collection document looks like this:
{
  _id: ObjectId("123456"),
  name: "John Doe",
  age: 33
}
I would like to insert, with one query, a new document into the 'logs' collection that looks like this:
{
  _id: ObjectId("778899"),
  action: "did something",
  adminUserId: "123456" // just a plain string value, not ObjectId
}
So what I'm really asking is how I can insert a document where a field value equals an "inner select" value from another collection, i.e.
adminUserId = ([select one from collection 'admins' where name="John Doe"]._id.toString())
Is there any way to do it all in one Mongo operation? I'm in Node.js.
Use an aggregation pipeline to retrieve the data you want inserted, massage it as needed, then use a $merge stage to write it into the existing collection.
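A sketch of that pipeline in shell syntax, using the collections and values from the question ($merge requires MongoDB 4.2+, $toString 4.0+):
db.admins.aggregate([
  { $match: { name: "John Doe" } },
  { $limit: 1 },
  { $project: {
      _id: 0,                                 // let MongoDB generate a new _id
      action: { $literal: "did something" },
      adminUserId: { $toString: "$_id" }      // plain string, not ObjectId
  } },
  { $merge: { into: "logs" } }                // write the result into 'logs'
])
From Node.js the same pipeline runs through collection.aggregate(...); note that you must consume the cursor (e.g. with .toArray()) for $merge to execute.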

Update mongoDB with todolist information?

I have a todolist feature in my frontend, here is a demo: https://gyazo.com/a10fcd7c470439fe5cc703eef75b437f
It is all updated using an array in a Vue component and then using v-models to keep track of the data and change the UI to reflect that array.
When the user clicks 'send' I want it to send off the data to the database.
The issue I'm having is that I can't work out how to import newly created 'todos' (the text box and check box that are created when the + button is clicked) into the database.
This is what each todolist document looks like in my 'todolists' collection in Mongo:
{
  "_id": "5caca1498accb128c8974d56",
  "title": "todolist1 11111 11111 11111 11111",
  "userEmail": "test@gmail.com",
  "todos": [
    {
      "_id": "5caca1498accb128c8974d57",
      "description": "Get this done",
      "completed": true
    }
  ],
  "dateDue": "2019-04-07T18:24:31.207Z",
  "__v": 0
}
The 'save' button in the demo has a v-on:click attribute that calls a function named saveTodoList(), which then makes an axios POST request to the route /updateTodoList.
Feel free to ask any questions that will help you answer my question :)
When a new todo is saved after clicking the + button, make a request to /updateTodoList with the parent document's _id. So your request might look something like this:
POST /updateTodoList
Body:
{
  "_id": "5caca1498accb128c8974d56",
  "todo": {
    "description": "This is a newly added todo description",
    "completed": false
  }
}
Then, on the server side, parse this body, find the document with the matching _id, and push the new todo into its todos array. Your query will look something like this:
todolist.findOneAndUpdate({_id: req.body._id}, { $push: {todos: req.body.todo } })
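Wired into a route, that could look like this minimal Express sketch (assuming a Mongoose model named todolist and JSON body parsing):
const express = require('express');
const app = express();
app.use(express.json());                     // parse JSON request bodies

app.post('/updateTodoList', async (req, res) => {
  // Find the parent todolist by _id and append the new todo to its array.
  await todolist.findOneAndUpdate(
    { _id: req.body._id },
    { $push: { todos: req.body.todo } }
  );
  res.sendStatus(200);
});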
Hope this helps.
Edit:
Each time you push a todo using the above query, Mongo appends that element to the todos array. For pushing multiple todos in a single query, use the $each operator along with $push, as in the sketch below; see the official documentation for $push for details.
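A sketch of that variant, assuming the request body carries an array of todos instead of a single one:
todolist.findOneAndUpdate(
  { _id: req.body._id },
  { $push: { todos: { $each: req.body.todos } } }   // append them all at once
)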

Node.js and MongoDB if document exact match exists, ignore insert

I am maintaining a collection of unique values alongside a companion collection that holds instances of those values. The reason I have it that way is that the companion collection has >10 million records, where the unique values only add up to 100K, and I use those values all over the place for partial-match lookups.
When I upload a CSV file it is usually 10k to 500k records at a time that I insert into the companion collection. What is the best way to insert only the values that don't already exist into the unique values collection?
Example:
//Insert large quantities of objects into mongo
var bulkInsert = [
  {
    name: "Some Name",
    other: "zxy",
    properties: "abc"
  },
  {
    name: "Some Name",
    other: "zxy",
    properties: "abc"
  },
  {
    name: "Other Name",
    other: "zxy",
    properties: "abc"
  }
]
//Need to insert only values that do not already exist in mongo unique values collection
var uniqueValues = [
  { name: "Some Name" },
  { name: "Other Name" }
]
EDIT
I tried creating a unique index on the field, but once it finds a duplicate in the array of documents I am inserting, it stops the whole process and doesn't proceed to check any of the values after the break.
Figured it out. If you're doing it from the shell, you need to use Bulk() and create insert jobs like this:
var bulk = db.collection.initializeUnorderedBulkOp();
bulk.insert( { name: "1234567890a"} );
bulk.insert( { name: "1234567890b"} );
bulk.insert( { name: "1234567890"} );
bulk.execute();
and in Node, the continueOnError flag works on a straight collection.insert():
collection.insert(
  [{ name: "1234567890a" }, { name: "1234567890c" }],
  { continueOnError: true },
  function(err, docs) {}
);
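Note that continueOnError belongs to the legacy driver API; in current Node.js drivers the equivalent is insertMany with ordered: false (a sketch; it still relies on a unique index on name to reject duplicates):
collection.insertMany(
  [{ name: "1234567890a" }, { name: "1234567890c" }],
  { ordered: false }                    // keep going past duplicate-key errors
).catch(function (err) {
  // Duplicate-key errors surface here; the non-duplicate inserts still succeed.
});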
Well, I think the solution here is quite simple if I understand your issue correctly.
Since the process stops when it finds a duplicated field, you should check whether the value already exists before trying to add it.
So, for each element in uniqueValues, make a find/findOne query; if it doesn't return any result, add the element, otherwise don't.
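A sketch of that approach (the collection name is assumed; note it is not atomic, so under concurrent writers a unique index or an upsert is still the safer guard):
async function insertUnseen(uniqueCollection, uniqueValues) {
  for (const value of uniqueValues) {
    const existing = await uniqueCollection.findOne({ name: value.name });
    if (!existing) {
      await uniqueCollection.insertOne(value);   // only insert unseen names
    }
  }
}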

Saving a Person or Group field using REST

Does anyone know how to save a Person field using REST?
I have tried the following and it works:
{
  "__metadata": { "type": "SP.Data.SomeListListItem" },
  "MyPersonFieldId": 1
}
But this only works if you know the ID. I don't have that! How can I get it? I have the key which is i.0#w|domain\userName.
I tried the following and it doesn't work either:
{
  "__metadata": { "type": "SP.Data.SomeListListItem" },
  "MyPersonField": {
    "__metadata": { "type": "SP.Data.UserInfoItem" },
    "Name": "i.0#w|domain\userName"
  }
}
Any ideas?? Thanks!
I haven't done this with a Person field, but I did do something similar with a managed metadata field. I basically had to pass in additional information as an object to create the value in the field.
See if passing in the ID of the user along with the name works. I'm about to try this myself as I have the same need.
{
  "MyPersonField": { "Name": "i.0#w|domain\userName", "ID": 1 }
}
EDIT: Ok, updating this field is easier than I thought. I was able to perform the update by simply passing in the ID of the user to the Id field:
{
"MyPersonFieldId": 1
}
This means the user should already be in the site collection, so if the user doesn't exist the request will fail.
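If you only have the login name (as in the question), one way to resolve the numeric ID first is the standard ensureuser endpoint, which also adds the user to the site collection if they are not there yet. A minimal sketch, assuming a form digest already fetched from /_api/contextinfo:
async function resolveUserId(siteUrl, digest) {
  const res = await fetch(siteUrl + "/_api/web/ensureuser", {
    method: "POST",
    headers: {
      "Accept": "application/json;odata=verbose",
      "Content-Type": "application/json;odata=verbose",
      "X-RequestDigest": digest                  // from /_api/contextinfo
    },
    body: JSON.stringify({ logonName: "i.0#w|domain\\userName" })  // claim-encoded login from the question
  });
  const data = await res.json();
  return data.d.Id;                              // value for MyPersonFieldId
}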
Use the code below to get the current user's ID and save a user into a People and Group column. Here the People column's name is Requestor, but to save the user we have to write to the column as RequestorId:
var userid = _spPageContextInfo.userId; // To get current user ID
var itemProperties = { 'Title': vTitle, 'RequestorId': userid };
The thing is that a user field is a lookup field, so MyPersonField does not exist on your SharePoint list items when you use an OData endpoint. I don't know exactly how to save the value, but the same problem happened to me when I tried to read a user.
For example, {server}/{api}/list/getbytitle('mylist')/items does not return MyPersonField; instead it returns MyPersonFieldId.
But if we use:
{server}/{api}/list/getbytitle('mylist')/items/?$select=*,MyPersonField/Name&$expand=MyPersonField
We are able to work with MyPersonField lookup values.

What's the best way of saving a document with revisions in a key-value store?

I'm new to key-value stores and I need your recommendation. We're working on a system that manages documents and their revisions, a bit like a wiki does. We're thinking about saving this data in a key-value store.
Please don't just recommend the database you prefer: we want to design the structure so we can swap in many different key-value databases. We're using Node.js, so we can easily work with JSON.
My question is: what should the structure of the database look like? We have metadata for each document (timestamp, lasttext, id, latestrevision) and we have data for each revision (the change, the author, timestamp, etc.). So which key/value structure do you recommend?
Thanks
Cribbed from the MongoDB groups. The examples are somewhat specific to MongoDB, but the strategies are pretty generic.
Most of these history implementations break down into two common strategies.
Strategy 1: embed history
In theory, you can embed the history of a document inside of the document itself. This can even be done atomically.
> db.docs.save( { _id : 1, text : "Original Text" } )
> var doc = db.docs.findOne()
> db.docs.update( {_id: doc._id}, { $set : { text : 'New Text' }, $push : { hist : doc.text } } )
> db.docs.find()
{ "_id" : 1, "hist" : [ "Original Text" ], "text" : "New Text" }
Strategy 2: write history to separate collection
> db.docs.save( { _id : 1, text : "Original Text" } )
> var doc = db.docs.findOne()
> db.docs_hist.insert ( { orig_id : doc._id, ts : Math.round((new Date()).getTime() / 1000), data : doc } )
> db.docs.update( {_id:doc._id}, { $set : { text : 'New Text' } } )
Here you'll see that I do two writes: one to the master collection and one to the history collection.
To get fast history lookup, just grab the original ID:
> db.docs_hist.ensureIndex( { orig_id : 1, ts : 1 })
> db.docs_hist.find( { orig_id : 1 } ).sort( { ts : -1 } )
Both strategies can be enhanced by storing and displaying only diffs. You could also hybridize by adding a link from the history collection back to the original collection.
What's the best way of saving a document with revisions in a key-value store?
It's hard to say there is a "best way"; there are obviously some trade-offs being made here.
Embedding:
atomic changes on a single doc
can result in large documents; may break reasonable size limits
you'll probably have to enhance your code to avoid returning the full history when it's not needed
Separate collection:
easier to write queries
not atomic; needs two operations (do you have transactions?)
more storage space (extra indexes on original docs)
I'd keep a hierarchy of the real data under each document with the revision data attached, for instance (the "revisions" key name is just illustrative):
{
  "revisions": [
    {
      "timestamp" : "2011040711350621",
      "data" : { ... the real data here .... }
    },
    {
      "timestamp" : "2011040711350716",
      "data" : { ... the real data here .... }
    }
  ]
}
Then use the push operation to add new versions and periodically remove old ones. You can read just the last (or first) element to get the latest copy at any given time. For instance:
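In MongoDB terms, "push the new version and cap the history" can be a single operation; a sketch using the revisions key from the structure above ($slice keeps only the newest N entries):
db.docs.update(
  { _id: docId },
  { $push: { revisions: { $each: [newRevision], $slice: -10 } } }  // keep last 10
)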
I think there are multiple approaches, and this question is old, but I'll give my two cents as I was working on this earlier this year. I have been using MongoDB.
In my case, I had a User account that had Profiles on different social networks. We wanted to track changes to social network profiles and wanted revisions of them, so we created two structures to test out. Both methods had a User object that pointed to foreign objects. We did not want to embed objects from the get-go.
A User looked something like:
User {
  "tags" : [Tags],
  "notes" : "Notes",
  "facebook_profile" : <combo_foreign_key>,
  "linkedin_profile" : <same as above>
}
and then, for the combo_foreign_key, we used this pattern (using Ruby interpolation syntax for simplicity):
combo_foreign_key = "#{User.key}__#{new_profile.last_updated_at}"
facebook_profiles {
  combo_foreign_key: facebook_profile,
  ... and you keep adding your foreign objects in this pattern
}
This gave us O(1) lookup of a User's latest FacebookProfile, but required us to keep the latest FK stored in the User object. If we wanted all of a User's FacebookProfiles, we would ask for all keys in the facebook_profiles collection with the prefix "#{User.key}__", and that was O(N)...
The second strategy we tried was storing an array of those FacebookProfile keys on the User object so the structure of the User object changed from
"facebook_profile" : <combo_foreign_key>
to
"facebook_profile" : [<combo_foreign_key>]
Here we'd just append the new combo_key when we added a new profile variation. Then we'd do a quick sort of the "facebook_profile" attribute and index on the largest one to get our latest profile copy. This method had to sort M strings and then fetch the FacebookProfile keyed by the largest item in that sorted list. A little slower for grabbing the latest copy, but it gave us the advantage of knowing every version of a User's FacebookProfile in one swoop, and we did not have to worry about ensuring that the foreign key was really the latest profile object.
At first our revision counts were pretty small and both approaches worked pretty well. I think I prefer the first one over the second now.
Would love input from others on ways they went about solving this issue. The Git idea suggested in another answer actually sounds really neat to me, and for our use case it would work quite well... Cool.
