How can I query pouchdb for deleted documents? - couchdb

I would like to list currently deleted documents in order to provide the ability to undelete one or more.
How can I query CouchDB for deleted documents? I am actually using PouchDB.
Although this post nicely describes how to query for and undelete a document, it requires the id of an existing doc.
I am looking for a way to query for all documents that have been deleted. The post suggests querying for all changes, but that query returns every document that has been edited or changed in addition to the deleted ones.
I am looking only for documents that have been deleted. Think querying for documents in the 'trash bin'. :)

Starting with CouchDB 2.1.0, you can pass a selector to the _changes feed. A request that outputs only deleted documents looks like this:
curl -X POST -H "Content-Type: application/json" "http://adm:pass@127.0.0.1:15984/tracks/_changes?filter=_selector" -d '{"selector": {"_deleted": true}}'

You can add a filter to the _changes feed in PouchDB: https://pouchdb.com/api.html#filtered-changes
var changes = db.changes({
filter: function(doc) {
return doc._deleted;
}
}).on('change', function(change) {
console.log(change.id);
})

For an all-in-one solution combining the open_revs tip from this answer, here's the TypeScript code I came up with:
const db = new PouchDB('my-db');
async function deletedDocIds(): Promise<string[]> {
const ret: string[] = [];
return new Promise((resolve, reject) => {
db.changes({filter: d => d._deleted})
.on('change', c => ret.push(c.id))
.on('complete', () => resolve(ret))
.on('error', e => reject(e));
});
}
async function deletedDocs() {
const ids = await deletedDocIds();
return Promise.all(ids.map(id => db.get(id, {revs: true, open_revs: 'all'}).then(x => {
const revs = (x[0].ok as any)._revisions;
const lastRev = (revs.start - 1) + '-' + revs.ids[1];
return db.get(id, {rev: lastRev}); // with Pouchdb keys too
})));
}
Calling deletedDocs() will return a promise of an array of all deleted docs as per the rev just prior to deletion.
N.B., the elements of the array will include PouchDB metadata as well as your document's keys.
N.B. 2, version 6.1.3 of DefinitelyTyped's TypeScript bindings for pouchdb-browser which I'm using here (should work for @types/pouchdb too though) doesn't seem to know about the _revisions key, hence the as any escape hatch.
N.B. 3, this should be trivial to manually translate to plain JS, just delete the type declarations and coercions (:, as, and whatever token follows these).
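To make the rev arithmetic in deletedDocs() easier to follow, here is a minimal sketch of that one step in isolation. The helper name previousRev and the Revisions interface are hypothetical (they are not part of PouchDB); the sketch assumes the _revisions object has the shape { start, ids } with the newest hash first, as used in the code above.

```typescript
// Assumed shape of the `_revisions` field returned when `revs: true` is set.
interface Revisions {
  start: number; // generation number of the newest revision
  ids: string[]; // revision hashes, newest first
}

// Hypothetical helper: builds the rev string just before the newest one,
// or returns null when the history holds no earlier revision.
function previousRev(revs: Revisions): string | null {
  if (revs.ids.length < 2 || revs.start < 2) {
    return null;
  }
  return `${revs.start - 1}-${revs.ids[1]}`;
}
```

For a deleted doc whose newest rev is the deletion itself, previousRev gives the rev to pass to db.get(id, {rev}). The null case covers a doc with no earlier revision in its history. Note also that if the database has been compacted, the body of that earlier revision may no longer be retrievable.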

Related

How to Update All Documents in a Collection With Thousands of Documents in FireStore with Firebase Admin?

Updated to reflect Firebase-Admin, rather than v9
New Update: solution at the bottom
How to update a single field (in a map) for all documents in a collection with thousands of documents -- in firebase-admin/firestore (10.0.2)? I attempted to use this:
import { getFirestore } from 'firebase-admin/firestore'
const db = getFirestore()
db.collection("users").get().then(function(querySnapshot) {
querySnapshot.forEach(function(doc) {
doc.ref.update({
"words.subscription": 0
})
})
})
I use it within a Node.js (v12) cloud function. The function runs, I see no errors, but none of the documents are updated. Previously, I attempted to use set() because some documents may not have the field, but Frank let me know update() can update fields that don't exist too.
However, this updated code also does not update all the documents in the 'users' collection.
Thank you in advance.
Update/Solution
@Dharmara - thank you for the gist. It did not work as-is, so I decided to change to async/await and wrap it in try/catch (to hopefully find out why it wasn't working), like so:
try {
let querySnapshot = await db.collection('users').get()
if (querySnapshot.size === 0) {
console.log('No documents to update')
return 'No documents to update'
}
const batches: any[] = [] // hold batches to update at once
querySnapshot.docs.forEach((doc, i) => {
if (i % 500 === 0) {
batches.push(db.batch())
}
const batch = batches[batches.length - 1]
batch.update(doc.ref, { "words.subscription": 0 })
})
await Promise.all(batches.map(batch => batch.commit()))
console.log(`${querySnapshot.size} documents updated`)
return `${querySnapshot.size} documents updated`
}
catch (error) {
console.log(`***ERROR: ${error}`)
return error
}
When I tested this, however, it worked. From what I can tell it's the same as the then() way you had in the gist (and from what I had found elsewhere on SO).
If you wanted to put this into an answer I'd gladly mark it as the answer. Your code helped tremendously. Thank you.
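The i % 500 bookkeeping in the loop above can also be expressed as a separate grouping step. The following is only a sketch: chunk is a hypothetical helper, not a Firestore API, and the group size of 500 is taken from the code above.

```typescript
// Hypothetical helper: splits an array into consecutive groups of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```

With this in place, each group from chunk(querySnapshot.docs, 500) would get its own db.batch(), one batch.update(doc.ref, ...) per doc in the group, and a single batch.commit().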
The dot notation you use to write a nested field only works when you call update(), not for set(). Luckily update() works for non-existing fields, it only requires that the document already exists.
So for the namespaced API of v8 and before that'd be:
doc.ref.update({
"words.subscription": 0
})
For the modular API for v9 and beyond, it's:
update(doc.ref, {
"words.subscription": 0
})

Cloud firestore trigger query by documentID for array of ids

I am trying to write a transaction that first queries documents by documentId from a list of ids, then makes some updates.
I am getting the error:
The corresponding value for FieldPath.documentId() must be a string or a DocumentReference.
For example:
const indexArray = [..list of doc ids...]
const personQueryRef = db.collection("person").where(admin.firestore.FieldPath.documentId(), "in", indexArray)
return db.runTransaction(transaction => {
return transaction.get(personQueryRef).then(personQuery => {
return personQuery.forEach(personRef => {
transaction.update(personRef, { ...update values here })
//more updates etc
})
})
})
I want to do this in an onCreate and onUpdate trigger. Is there another approach I should be taking?
Update
The error still persists when not using a transaction, so the transaction is unrelated to the problem.
The problem does not occur when the query is .where(admin.firestore.FieldPath.documentId(), "==", "just_one_doc_id"). So, the problem is with using FieldPath.documentId() and in.
It sounds like the type of query you're trying to do just isn't supported by the SDK. Whether or not that's intentional, I don't know. But if you want to transact with multiple documents, and you already know all of their IDs, you can use getAll(...) instead:
// build an array of DocumentReference objects
const refs = indexArray.map(id => db.collection("person").doc(id))
return db.runTransaction(transaction => {
// getAll() takes the references as separate arguments, so spread the array
return transaction.getAll(...refs).then(docs => {
docs.forEach(doc => {
transaction.update(doc.ref, { ...update values here })
})
})
})

Nested transactions with pg-promise

I am using NodeJS, PostgreSQL and the amazing pg-promise library. In my case, I want to execute three main queries:
Insert one tweet in the table 'tweets'.
In case there are hashtags in the tweet, insert them into another table 'hashtags'
Then link both tweet and hashtag in a third table 'hashtagmap' (many to many relational table)
Here is a sample of the request's body (JSON):
{
"id":"12344444",
"created_at":"1999-01-08 04:05:06 -8:00",
"userid":"#postman",
"tweet":"This is the first test from postman!",
"coordinates":"",
"favorite_count":"0",
"retweet_count":"2",
"hashtags":{
"0":{
"name":"test",
"relevancetraffic":"f",
"relevancedisaster":"f"
},
"1":{
"name":"postman",
"relevancetraffic":"f",
"relevancedisaster":"f"
},
"2":{
"name":"bestApp",
"relevancetraffic":"f",
"relevancedisaster":"f"
}
}
}
All the fields above should go into the table "tweets", except hashtags, which should go into the table "hashtags".
Here is the code I am using based on Nested transactions from pg-promise docs inside a NodeJS module. I guess I need nested transactions because I need to know both tweet_id and hashtag_id in order to link them in the hashtagmap table.
// Columns
var tweetCols = ['id','created_at','userid','tweet','coordinates','favorite_count','retweet_count'];
var hashtagCols = ['name','relevancetraffic','relevancedisaster'];
//pgp Column Sets
var cs_tweets = new pgp.helpers.ColumnSet(tweetCols, {table: 'tweets'});
var cs_hashtags = new pgp.helpers.ColumnSet(hashtagCols, {table:'hashtags'});
return{
// Transactions
add: body =>
rep.tx(t => {
return t.one(pgp.helpers.insert(body,cs_tweets)+" ON CONFLICT(id) DO UPDATE SET coordinates = "+body.coordinates+" RETURNING id")
.then(tweet => {
var queries = [];
for(var i = 0; i < body.hashtags.length; i++){
queries.push(
t.tx(t1 => {
return t1.one(pgp.helpers.insert(body.hashtags[i],cs_hashtags) + "ON CONFLICT(name) DO UPDATE SET fool ='f' RETURNING id")
.then(hash =>{
t1.tx(t2 =>{
return t2.none("INSERT INTO hashtagmap(tweetid,hashtagid) VALUES("+tweet.id+","+hash.id+") ON CONFLICT DO NOTHING");
});
});
}));
}
return t.batch(queries);
});
})
}
The problem is that with this code I am able to successfully insert the tweet, but nothing happens after that. I cannot insert the hashtags or link the hashtags to the tweet.
Sorry, but I am new to coding, so I guess I didn't understand how to properly return from the transaction and how to perform this simple task. I hope you can help me.
Thank you in advance.
Jean
Improving on Jean Phelippe's own answer:
// Columns
var tweetCols = ['id', 'created_at', 'userid', 'tweet', 'coordinates', 'favorite_count', 'retweet_count'];
var hashtagCols = ['name', 'relevancetraffic', 'relevancedisaster'];
//pgp Column Sets
var cs_tweets = new pgp.helpers.ColumnSet(tweetCols, {table: 'tweets'});
var cs_hashtags = new pgp.helpers.ColumnSet(hashtagCols, {table: 'hashtags'});
return {
/* Tweets */
// Add a new tweet and update the corresponding hash tags
add: body =>
db.tx(t => {
return t.one(pgp.helpers.insert(body, cs_tweets) + ' ON CONFLICT(id) DO UPDATE SET coordinates = ' + body.coordinates + ' RETURNING id')
.then(tweet => {
var queries = Object.keys(body.hashtags).map(key => {
return t.one(pgp.helpers.insert(body.hashtags[key], cs_hashtags) + ' ON CONFLICT(name) DO UPDATE SET fool = $1 RETURNING id', 'f')
.then(hash => {
return t.none('INSERT INTO hashtagmap(tweetid, hashtagid) VALUES($1, $2) ON CONFLICT DO NOTHING', [+tweet.id, +hash.id]);
});
});
return t.batch(queries);
});
})
.then(data => {
// transaction was committed;
// data = [null, null,...] as per t.none('INSERT INTO hashtagmap...
})
.catch(error => {
// transaction rolled back
})
},
NOTES:
As per my earlier notes, you must chain all queries, or else you will end up with loose promises.
Stay away from nested transactions unless you understand exactly how they work in PostgreSQL (read this, and specifically the Limitations section).
Avoid manual query formatting; it is not safe. Always rely on the library's query formatting.
Unless you are passing the result of the transaction somewhere else, you should at least provide a .catch handler.
P.S. The syntax +tweet.id converts the value to a number, much like parseInt(tweet.id, 10) for plain integer strings, just shorter, in case those are strings ;)
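As a side note on the +tweet.id shorthand: unary plus and parseInt agree on plain integer strings but can differ on other input. A small sketch, where asNumber and asInteger are hypothetical wrapper names used only for illustration:

```typescript
// Unary plus converts the whole string, like Number(value).
const asNumber = (value: string): number => +value;

// parseInt reads leading digits and stops at the first non-digit character.
const asInteger = (value: string): number => parseInt(value, 10);

// Collects both conversions so they can be compared side by side.
function compare(value: string): { plus: number; parsed: number } {
  return { plus: asNumber(value), parsed: asInteger(value) };
}
```

For "42" both give 42. For "42abc" unary plus gives NaN while parseInt gives 42, and for an empty string unary plus gives 0 while parseInt gives NaN. For ids coming back from RETURNING id this rarely matters, but it is worth knowing before using the shorthand on user input.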
For those who will face similar problem, I will post the answer.
Firstly, my mistakes:
In the for loop: body.hashtags.length doesn't exist because I am dealing with an object (very basic mistake here). Changed to Object.keys(body.hashtags).length
Why use so many transactions? Following the answer by vitaly-t in: Interdependent Transactions with pg-promise I removed the extra transactions. It's not yet clear to me how you can open one transaction and use the result of one query in another within the same transaction.
Here is the final code:
// Columns
var tweetCols = ['id','created_at','userid','tweet','coordinates','favorite_count','retweet_count'];
var hashtagCols = ['name','relevancetraffic','relevancedisaster'];
//pgp Column Sets
var cs_tweets = new pgp.helpers.ColumnSet(tweetCols, {table: 'tweets'});
var cs_hashtags = new pgp.helpers.ColumnSet(hashtagCols, {table:'hashtags'});
return {
/* Tweets */
// Add a new tweet and update the corresponding hashtags
add: body =>
rep.tx(t => {
return t.one(pgp.helpers.insert(body,cs_tweets)+" ON CONFLICT(id) DO UPDATE SET coordinates = "+body.coordinates+" RETURNING id")
.then(tweet => {
var queries = [];
for(var i = 0; i < Object.keys(body.hashtags).length; i++){
queries.push(
t.one(pgp.helpers.insert(body.hashtags[i],cs_hashtags) + "ON CONFLICT(name) DO UPDATE SET fool ='f' RETURNING id")
.then(hash =>{
t.none("INSERT INTO hashtagmap(tweetid,hashtagid) VALUES("+tweet.id+","+hash.id+") ON CONFLICT DO NOTHING");
})
);
}
return t.batch(queries);
});
}),

MongoDB race conditions or concurrency issues

I have the following code in my chat application, based on NodeJS and MongoDB, to change the admin of a room:
export function setAdmin(room, user) {
const userGuid = getGuid(user);
if (room.users && room.users.length) {
// filter current and new
room.users = room.users.filter(guid =>
guid !== room.adminGuid && guid !== userGuid
);
} else {
room.users = [];
}
room.users.push(userGuid);
room.adminGuid = userGuid;
return roomsCollection.save(room, { w: 1 });
}
Each room has only 2 users: admin and customer.
To update several rooms:
const rooms = await roomsCollection.find({ adminGuid: currentAdminGuid }).toArray();
for (const room of rooms) {
await setAdmin(room, newAdminUser);
}
I had some problems with my application under high load. They were resolved with the help of indexes.
But afterwards, while working with a MongoDB dump, I found rooms with 3 user guids in the room.users array. I assume save works as an update for an existing document, but how does it update the array? If it uses $set, why do I have rooms with 3 users? Any thoughts on how this could happen?
save() behaves as an insert if the document does not exist. You have to use update() with {upsert: false}
some resources to update array: Array Query, array Operator
simple tutorial to nest array inside document, here
advanced example of dealing with multiple nested array, here

node-elastic full document update

In my Node.js Elasticsearch writer I want to replace one document with another.
Currently, I run:
var data = { doc: doc, "doc_as_upsert": true };
var metadata =
{ update: { _id: idToUpdate, _index:indexName,_type: INDEX_TYPE_PREFIX } };
body.push(metadata);
body.push(payment);
}
elasticsearchClient.bulk({
body: body,
}, function (err, resp) {
But if the document in Elasticsearch contained field X and the updated document doesn't, field X stays in Elasticsearch. I want it to be removed.
According to
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html
using "doc:" is for partial update, so what's the alternative for full update?
Don't use the update API; use the index API instead.
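As a rough sketch of what that could look like for the bulk request in the question: each document becomes an index action line followed by the full document source, instead of an update action with a doc wrapper. The helper buildBulkIndexBody and the BulkDoc type are hypothetical names, and _type is left out here, which is an assumption about the cluster version (older clusters may still need it, as in the question's metadata).

```typescript
// Hypothetical input shape: the target id plus the complete replacement document.
interface BulkDoc {
  id: string;
  source: Record<string, unknown>;
}

// Builds a bulk body that uses `index` actions, so each stored document
// is replaced as a whole rather than merged field by field.
function buildBulkIndexBody(indexName: string, docs: BulkDoc[]): object[] {
  const body: object[] = [];
  for (const { id, source } of docs) {
    body.push({ index: { _index: indexName, _id: id } }); // action line
    body.push(source); // full document, no `doc` / `doc_as_upsert` wrapper
  }
  return body;
}
```

The resulting array would be passed as body to elasticsearchClient.bulk(...) in place of the update-based one. Because an index action overwrites the stored document, a field X missing from the new source would no longer be present afterwards.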
