I'm trying to build a REST API with Express, Sequelize (PostgreSQL dialect), and Node.
Essentially I have two endpoints:
Method | Endpoint     | Description
GET    | /api/players | To get players info, including assets
POST   | /api/assets  | To create an asset
There is also a mechanism that updates a property (say, price) of the assets every 30 seconds.
Goal
I want to cache the results of GET /api/players, but I want some control over it: whenever a user creates an asset (via POST /api/assets), a request to GET /api/players made right after should return the updated data (i.e. including the property that updates every 30 seconds), and that result should stay cached until it is updated again in the next cycle.
Expected
The following should demonstrate it:
GET /api/players
JSON Response:
[
  {
    "name": "John Doe",
    "assets": [
      {
        "id": 1,
        "price": 10
      }
    ]
  }
]
POST /api/assets
JSON Request:
{
  "id": 2
}
GET /api/players
JSON Response:
[
  {
    "name": "John Doe",
    "assets": [
      {
        "id": 1,
        "price": 10
      },
      {
        "id": 2,
        "price": 7.99
      }
    ]
  }
]
What I have managed to do so far
I have made the routes, but GET /api/players has no cache mechanism and basically queries the database every time it is requested.
Some solutions I have found, but none seem to meet my scenario
apicache (https://www.youtube.com/watch?v=ZGymN8aFsv4&t=1360s): but I don't have a fixed duration, because a user can create an asset at any time.
Example implementation
I have seen a (kind of) similar implementation of what I want in a GitHub Actions workflow for implementing a cache, where you define a key and, unless the key has changed, it reuses the same packages instead of installing them every time (example: https://github.com/python-discord/quackstack/blob/6792fd5868f28573bb8f9565977df84e7ba50f42/.github/workflows/quackstack.yml#L39-L52).
Is there any package to do that, so that while processing POST /api/assets I can change the key in its handler, and GET /api/players then gives me the updated result (I could also change the key in the 30-second cycle), after which it serves the cached result until the next update?
Note: If you have a solution, please try to stick with some npm packages rather than something like Redis, unless it's the only/best solution.
Thanks in advance!
(P.S. I'm a beginner and this is my first question on SO.)
Typically caching is done with the help of Redis, an in-memory key-value store. You could handle the cache in the following manner:
In your handler for the POST operation, update/reset the cached entry for players.
In your handler for the GET operation, if Redis has the entry in the cache, return it; otherwise query the data, add the entry to the cache, and return the data.
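A minimal sketch of that read-through / invalidate pattern, assuming the official redis npm client (v4+) and Sequelize models named Player and Asset (those names are placeholders for your own models):

const { createClient } = require('redis');

const redisClient = createClient();
// connect once at startup
redisClient.connect().catch(console.error);

// GET /api/players: serve from cache if present, otherwise query and cache
app.get('/api/players', async (req, res) => {
  const cached = await redisClient.get('players');
  if (cached) return res.json(JSON.parse(cached));

  const players = await Player.findAll({ include: Asset });
  await redisClient.set('players', JSON.stringify(players));
  res.json(players);
});

// POST /api/assets: create the asset, then invalidate the cached entry
app.post('/api/assets', async (req, res) => {
  const asset = await Asset.create(req.body);
  await redisClient.del('players'); // next GET rebuilds the cache
  res.status(201).json(asset);
});

The 30-second price update cycle can call the same redisClient.del('players') so that the next GET picks up the new prices.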
Alternatively, you could use Memcached.
A bit late to this, but I was looking for a similar solution. I found that the apicache library not only allows caching for specified durations, but also lets the cache be cleared manually.
apicache.clear([target]) - clears cache target (key or group), or entire cache if no value passed, returns new index.
Here is an example for your implementation:
// POST /api/assets
app.post('/api/assets', function(req, res, next) {
  // update assets then clear cache
  apicache.clear()
  // or only clear the specific players cache by using a parameter
  // apicache.clear('players')
  res.send(response)
})
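For apicache.clear('players') to have something to clear, the GET route needs to be cached and tagged with that group. A rough sketch using apicache's middleware (Player and Asset are assumed Sequelize models, and the 30-second duration is just a fallback matching the update cycle):

const apicache = require('apicache')
const cache = apicache.middleware

// GET /api/players: cached for up to 30 seconds and tagged with the
// "players" group so the POST handler (or the 30-second cycle) can clear it
app.get('/api/players', cache('30 seconds'), async (req, res) => {
  req.apicacheGroup = 'players'
  const players = await Player.findAll({ include: Asset })
  res.json(players)
})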
Related
I'm using aws-appsync in a Node.js client to keep a cached list of data items. This cache must be available at all times, including when not connected to the internet.
When my Node app starts, it calls a query which returns the entire list of items from the AppSync data source. This is cached by Apollo's cache storage, which allows future queries (using the same GraphQL query) to be made using only the cache.
The app also makes a subscription to the mutations which are able to modify the list on other clients. When an item in the list is changed, the new data is sent to the app. This can trigger the original query for the entire list to be re-fetched, thus keeping the cache up to date.
Fetching the entire list when only one item has changed is not efficient. How can I keep the cache up to date, while minimising the amount of data that has to be fetched on each change?
The solution must provide a single point to access cached data. This can either be a GraphQL query or access to the cache store directly. However, using results from multiple queries is not an option.
The Apollo documentation hints that this should be possible:
In some cases, just using [automatic store updates] is not enough for your application ... to update correctly. For example, if you want to add something to a list of objects without refetching the entire list ... Apollo Client cannot update existing queries for you.
The alternatives it suggests are refetching (essentially what I described above) and using an update callback to manually update the cached query results in the store.
Using update gives you full control over the cache, allowing you to make changes to your data model in response to a mutation in any way you like. update is the recommended way of updating the cache after a query.
However, here it is referring to mutations made by the same client, rather than syncing between clients using subscriptions. The update callback option doesn't appear to be available to a subscription (which provides the updated item data) or a query (which could fetch the updated item data).
As long as your subscription includes the full resource that was added, it should be possible by reading from and writing to the cache directly. Let's assume we have a subscription like this one from the docs:
const COMMENTS_SUBSCRIPTION = gql`
  subscription onCommentAdded {
    commentAdded {
      id
      content
    }
  }
`;
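Both snippets below also read and rewrite the results of a COMMENTS_QUERY. Its exact shape isn't shown in the docs excerpt above, but for these sketches assume something like:

const COMMENTS_QUERY = gql`
  query Comments {
    comments {
      id
      content
    }
  }
`;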
The Subscription component includes an onSubscriptionData prop, so we should be able to do something along these lines:
<Subscription
  subscription={COMMENTS_SUBSCRIPTION}
  onSubscriptionData={({ client, subscriptionData: { data, error } }) => {
    if (!data) return
    const current = client.readQuery({ query: COMMENTS_QUERY })
    client.writeQuery({
      query: COMMENTS_QUERY,
      data: {
        comments: [...current.comments, data.commentAdded],
      },
    })
  }}
/>
Or, if you're using plain JavaScript instead of React:
const observable = client.subscribe({ query: COMMENTS_SUBSCRIPTION })
observable.subscribe({
  next: ({ data }) => {
    if (!data) return
    const current = client.readQuery({ query: COMMENTS_QUERY })
    client.writeQuery({
      query: COMMENTS_QUERY,
      data: {
        comments: [...current.comments, data.commentAdded],
      },
    })
  },
  complete: console.log,
  error: console.error
})
I would like to store a value in the config file and look it up in the design document for comparing against update values. I'm sure I have seen this but, for the life of me, I can't seem to remember how to do this.
UPDATE
I realize (after the first answer) that there was more than one way to interpret my question. Hopefully this example clears it up a little. Given a configuration:
curl -X PUT http://localhost:5984/_config/shared/token -d '"0123456789"'
I then want to be able to look it up in my design document
{
  "_id": "_design/loadsecrets",
  "validate_doc_update": {
    "test": function (newDoc, oldDoc) {
      if (newDoc.supersecret != magicobject.config.shared.token) {
        throw({unauthorized: "You don't know the super secret"});
      }
    }
  }
}
It's the ability to do something like the magicobject.config.shared.token that I am looking for.
UPDATE 2
Another potentially useful (contrived) scenario
curl -X PUT http://trustedemployee:5984/_config/eventlogger/detaillevel -d '"0"'
curl -X PUT http://employee:5984/_config/eventlogger/detaillevel -d '"2"'
curl -X PUT http://vicepresident:5984/_config/eventlogger/detaillevel -d '"10"'
Then on devices tracking employee behaviour:
{
  "_id": "_design/logger",
  "updates": {
    "logger": function (doc, req) {
      if (!doc) {
        doc = {_id: req.id};
      }
      if (req.level < magicobject.config.eventlogger.detaillevel) {
        doc.details = req.details;
      }
      return [doc, req.details];
    }
  }
}
Here's a follow-up to my last answer with more general info:
There is no general way to use configuration, because CouchDB is designed with scalability, stability and predictability in mind. It has been designed using many principles of functional programming and pure functions, avoiding side effects as much as possible. This is a Good Thing™.
However, each type of function has additional parameters that you can use, depending on the context the function is called with:
show, list, update and filter functions are executed for each request, so they get the request object. Here you have the req.secObj and req.userCtx to (ab)use for common configuration. Also, AFAIK the this keyword is set to the current design document, so you can use the design doc to get common configuration (at least up to CouchDB 1.6 it worked).
view functions (map, reduce) don't have additional parameters, because the results of a view are written to disk and reused in subsequent calls. map functions must be pure (so don't use e.g. Math.random()). For shared configuration across view functions within a single design doc you can use CommonJS require(), but only within the views.lib key.
validate doc update functions are not necessarily executed within a user-triggered http request (they are called before each write, which might not be triggered only via http). So they have the userCtx and secObj added as separate parameters in their function signature.
So to sum up, you can use the following places for configuration:
userCtx for user-specific config. Use a special role (e.g. with a prefix) for storing small config bits. For example superLogin does this.
secObj for database-wide config. Use a special member name for small bits (as you should normally use roles instead of explicit user names, secObj.members.names or secObj.admins.names is a good place).
the design doc itself for design-doc-wide config. It's best to use this.views.lib.config for this, since you can also read that key from within views. But keep in mind that all views are invalidated as soon as you change this key. So if the view results will stay the same no matter what the config values are, it might be better to use a this.config key.
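For example, here is a minimal sketch of design-doc-wide config read from a view via CommonJS require() (the key names config and doubled are purely illustrative):

{
  "_id": "_design/myapp",
  "views": {
    "lib": {
      "config": "exports.factor = 2;"
    },
    "doubled": {
      "map": "function (doc) { var config = require('views/lib/config'); emit(doc._id, doc.value * config.factor); }"
    }
  }
}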
Hope this helps! I can add more examples if you wish.
I think I know what you're talking about, and if I'm right then what you are asking for is no longer possible. (at least in v1.6 and v2.0, I'm not sure when this feature was removed)
There was a lesser-known trick that allowed a view/show/list/validation/etc function to access the parent design document as this in your function. For example:
{
  "_id": "_design/hello-world",
  "config": {
    "PI": 3.14
  },
  "views": {
    "test": {
      "map": "function (doc) { emit(this.config.PI); }"
    }
  }
}
This was a really crazy idea, and I imagine it was removed because it created a circular dependency between the design document and the code of the view that made the process of invalidating/rebuilding a view index a very tricky affair.
I remember using this trick at some point in the distant past, but the feature is definitely gone now. (and likely to never return)
For your special use-case (validating a document with a secret token), there might be a workaround, but I'm not sure if the token might leak in some place. It all depends what your security requirements are.
You could abuse the 4th parameter to validate_doc_update, the securityObject (see the CouchDB docs) to store the secret token as the first admin name:
{
  "test": "function (newDoc, oldDoc, userCtx, secObj) {
    var token = secObj.admins.names[0];
    if (newDoc.supersecret != token) {
      throw({unauthorized:"You don't know the super secret"});
    }
  }"
}
So if you set the db's security object to {admins: {names: ["s3cr3t-t0k3n"], roles: ["_admin"]}}, you have to pass 's3cr3t-t0k3n' as the doc's supersecret property.
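For example, a sketch of setting that security object (the database name mydb and the admin credentials are placeholders):

curl -X PUT http://admin:secret@localhost:5984/mydb/_security \
  -d '{"admins": {"names": ["s3cr3t-t0k3n"], "roles": ["_admin"]}, "members": {"names": [], "roles": []}}'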
This is obviously a dirty hack, but as far as I remember the security object may only be read or modified by admins, so you wouldn't immediately leak your token to the public. But consider adding a separate layer between CouchDB and your caller if you need "real" security.
I have a model called a Transaction which has the following schema
var transactionSchema = new mongoose.Schema({
  amount: Number,
  status: String,
  _recipient: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
  _sender: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
});
I want both sender and recipient of this transaction to be able to 'confirm' that the transaction took place. The status starts out as "initial". So when only the sender has confirmed the transaction (but the recipient yet not), I want to update the status to "senderConfirmed" or something, and when the recipient has confirmed it (but sender has not), I want to update status to "recipientConfirmed". When they have both confirmed it, I want to update the status to "complete".
The problem is, how can I know when to update it to "complete" in a way that avoids race conditions? If both the sender and the recipient go to confirm the transaction at the same time, then both threads will think the status is "initial" and update it just to "senderConfirmed" or "recipientConfirmed", when in actuality it ought to go to "complete".
I read about MongoDB's two-phase commit approach here, but that doesn't quite fit my need, since I don't want to prevent the second thread from making its update (in the case that another thread is currently modifying a transaction); I just want it to wait until the first thread is finished before doing its update, and then make the content of its update contingent on the latest status of the transaction.
Bottom line is you need "two" update statements to do this, for each of sender and recipient respectively. So basically one is going to try and set the "partial" status to complete, and the other will only move the "initial" status to the "partial" state.
Bulk operations are the best way to implement multiple statements, so you should use these by accessing the underlying driver methods. Modern API releases have the .bulkWrite() method, which degrades nicely if the server version does not support the "bulk" protocol, and just falls back to issuing separate updates.
// sender confirmation
Transaction.collection.bulkWrite(
  [
    { "updateOne": {
      "filter": {
        "_id": docId,
        "_sender": senderId,
        "status": "recipientConfirmed"
      },
      "update": {
        "$set": { "status": "complete" }
      }
    }},
    { "updateOne": {
      "filter": {
        "_id": docId,
        "_sender": senderId,
        "status": "initial"
      },
      "update": {
        "$set": { "status": "senderConfirmed" }
      }
    }}
  ],
  { "ordered": false },
  function(err, result) {
    // result will confirm only 1 update at most succeeded
  }
);
And of course the same applies for the _recipient except the different status check or change. You could alternately issue an $or condition on the _sender or _recipient and have a generic "partial" status instead of coding different update conditions, but the same basic "two update" process applies.
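For completeness, a sketch of the recipient-side counterpart (the same two-statement pattern with the conditions mirrored; docId and recipientId are placeholders):

// recipient confirmation
Transaction.collection.bulkWrite(
  [
    { "updateOne": {
      "filter": {
        "_id": docId,
        "_recipient": recipientId,
        "status": "senderConfirmed"
      },
      "update": {
        "$set": { "status": "complete" }
      }
    }},
    { "updateOne": {
      "filter": {
        "_id": docId,
        "_recipient": recipientId,
        "status": "initial"
      },
      "update": {
        "$set": { "status": "recipientConfirmed" }
      }
    }}
  ],
  { "ordered": false },
  function(err, result) {
    // again, at most one of the two updates can succeed
  }
);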
Of course again you "could" just use the regular methods and issue both updates to the sever in another way, possibly even in parallel since the conditions remain "atomic", but that is also the reason for the { "ordered": false } option since their is no determined sequence that needs to be respected here.
Bulk operations though are better than separate calls, since the send and return is only one request and response, as opposed to "two" of each, so the overhead using bulk operations is far less.
But that is the general approach. No single statement could possibly leave a "status" in "deadlock" or mark as "complete" before the other party also issues their confirmation.
There is a "possibility" and a very slim one that a status was changed from "initial" in between the first attempt update and the second, which would result in nothing being updated. In that case, you can "retry" the action on which it "should" update on the subsequent attempt.
This should only ever need "one" retry at most though. And very very rarely.
NOTE: Care should be taken when using the .collection accessor on Mongoose models. All the regular model methods have built in logic to "ensure" the connection to the database is actually present before they do anything, and in fact "queue" operations until a connection is present.
It's generally good practice to wrap your application startup in an event handler to ensure the database connection:
mongoose.connection.on("open", function() {
  // App startup and init here
})
So use the "on" or "once" events for this case.
Generally though a connection is always present either after this event is fired, or after any "regular" model method has already been called in the application.
Possibly mongoose will include methods like .bulkWrite() directly on the model methods in future releases. But presently it does not, so the .collection accessor is necessary to grab the underlying Collection object from the core driver.
Update: I am clarifying my answer based on a comment that my original response did not provide an answer.
An alternative approach would be to keep track of the status as two separate properties:
senderConfirmed: true/false,
recipientConfirmed: true/false,
When the sender confirms you simply update the senderConfirmed field. When the recipient confirms you update the recipientConfirmed field. There is no way they will overwrite each other.
To determine if the transaction is complete you would merely query {senderConfirmed:true,recipientConfirmed:true}.
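A minimal sketch of that approach, assuming a reasonably recent Mongoose (which provides Model.updateOne) and the two boolean fields added to the schema:

// Sender confirms: touches only senderConfirmed, so it cannot race
// with the recipient's update.
await Transaction.updateOne(
  { _id: docId, _sender: senderId },
  { $set: { senderConfirmed: true } }
);

// Recipient confirms: touches only recipientConfirmed.
await Transaction.updateOne(
  { _id: docId, _recipient: recipientId },
  { $set: { recipientConfirmed: true } }
);

// A transaction is complete when both flags are set.
const complete = await Transaction.findOne({
  _id: docId,
  senderConfirmed: true,
  recipientConfirmed: true
}); // non-null when both parties have confirmed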
Obviously this is a change to the document schema, so it may not be ideal.
Original Answer:
Is a change to your schema possible? What if you had two properties - senderStatus and recipientStatus? Sender would only update senderStatus and recipient would only update recipientStatus. Then they couldn't overwrite each others changes.
You would still need some other way to mark it as complete, I assume. You could us a cron job or something...
I am bootstrapping replications in CouchDB by POSTing to localhost:5984/_replicate. This URL only accepts POST requests.
There is also a second URL: localhost:5984/_replicator, which accepts PUT, GET and DELETE requests.
When I configure a replication by POSTing to _replicate, it gets started, but I cannot get information about it. It is also not listed in _replicator.
How can I get the list of active replications?
How can I cancel an active replication?
Edit: how to trigger replications with the _replicator method.
Thanks to comments by JasonSmith, I got to the following solution: PUTting to _replicator requires using the full URL (including authentication credentials) for the target database. This is not the case when using the _replicate URL, which is happy getting just the name of the target database (I am talking here about pull replications). The reason, as far as I can tell, is explained here (see section 8, "The user_ctx property and delegations").
The original API was a special URL, /_replicate where you tell Couch what to do and it tells you the result. However, the newer system is a regular database, called /_replicator and you create documents inside it telling Couch what to do. The document format is the same as the older _replicate format, however CouchDB will update the document as the replication proceeds. (For example, it will add a field "state":"triggered" or "state":"complete", etc.)
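For example, a pull replication can be set up by creating a document like this in _replicator (database names and credentials are placeholders):

curl -X PUT http://admin:secret@localhost:5984/_replicator/my-pull-replication \
  -d '{"source": "http://user:pass@remote.example.com:5984/sourcedb",
       "target": "targetdb",
       "continuous": true}'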
To get a list of active replications, GET /_active_tasks as the server admin. For example (formatted):
curl http://admin:secret@localhost:5984/_active_tasks
[ { "type": "Replication"
, "task": "`1bea06f0596c0fe6a1371af473a95aea+create_target`: `http://jhs.iriscouch.com/iris/` -> `iris`"
, "started_on": 1315877897
, "updated_on": 1315877898
, "status": "Processed 83 / 119 changes"
, "pid": "<0.224.0>"
}
, { "type": "Replication"
, // ... etc ...
}
]
The wiki has instructions to cancel CouchDB replication. Basically, you want to specify the same source and target and also add "cancel":true.
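For example, if the replication was originally started by POSTing to _replicate, POSTing the same source/target again with "cancel": true stops it (a sketch; the database names are placeholders):

curl -X POST http://admin:secret@localhost:5984/_replicate \
  -H "Content-Type: application/json" \
  -d '{"source": "http://remote.example.com:5984/sourcedb",
       "target": "targetdb",
       "cancel": true}'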
I'm trying to get notifications from a CouchDB changes poll as soon as a pre-defined field is set or changed. I've already had a look at the filters that can be used for filtering change events (db/_changes?filter=myfilter). However, I've not yet found a way to include this temporal information, because you can only get the current version of the document in these filter functions.
Is there any possibility to create such a filter?
If it does not work, I could export my field to a separate database and then only poll for changes in that db, but I'd prefer to keep my data together for obvious reasons.
Thanks in advance!
You are correct: filters and _changes feeds can only see snapshots of a document. What you need is a function which can see the old document and the new document and act correctly. But that is unavailable in _filters and _changes.
Obviously your client code knows if it updates that field, so you might update your client code; however, there is a better solution.
Update functions can access both documents. I suggest you make an _update
function which notices the field change and flags that in the document. Next you
have a simple filter checking for that flag. The best part is, you can use a
rewrite function to make the HTTP API exactly the same as before.
1. Create an update function to flag interesting updates
Your _design/myapp would be {"updates": {"smart_updater": "(see below)"}}.
Update functions are very flexible (see my recent update handlers
walkthrough). However we only want to mimic the normal HTTP/JSON API.
Your updates.smart_updater field would look like this:
function (doc, req) {
  var INTERESTING = 'dollars'; // Set me to the interesting field.
  var newDoc = JSON.parse(req.body);
  if (newDoc.hasOwnProperty(INTERESTING)) {
    // dollars was set (which includes 0, false, null, undefined values).
    // You might test for newDoc[INTERESTING] if those values should not
    // trigger this code.
    if ((doc === null) || (doc[INTERESTING] !== newDoc[INTERESTING])) {
      // The field changed or was created!
      newDoc.i_was_changed = true;
    }
  }
  if (!newDoc._id) {
    // A UUID generator would be better here.
    newDoc._id = req.id || Math.random().toString();
  }
  // Return the same JSON the vanilla Couch API does.
  return [newDoc, {json: {'id': newDoc._id}}];
}
Now you can PUT to /db/_design/myapp/_update/smart_updater/[doc_id] (or POST without the doc id to create a new document) and it will feel just like the normal API, except that if you update the dollars field, it will add an additional flag, i_was_changed. That is how you will find this change later.
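For example (a sketch; the database name db and the document id mydoc are placeholders):

# update an existing document (its current version is passed to the function as doc)
curl -X PUT http://localhost:5984/db/_design/myapp/_update/smart_updater/mydoc \
  -d '{"dollars": 42}'

# create a new document
curl -X POST http://localhost:5984/db/_design/myapp/_update/smart_updater \
  -d '{"dollars": 7}'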
2. Filter for documents with the changed field
This is very straightforward:
function(doc, req) {
  return doc.i_was_changed;
}
Now you can query the _changes feed with a ?filter= parameter. (Replication
also supports this filter, so you could pull to your local system all documents
which most recently changed/created the field.)
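For example, assuming the filter above is saved as "changed" under the filters key of _design/myapp, and the database is called db:

curl 'http://localhost:5984/db/_changes?filter=myapp/changed'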
That is the basic idea. The remaining steps will make your life easier if you
already have lots of client code and do not want to change the URLs.
3. Use rewriting to keep the HTTP API the same
This is available in CouchDB 0.11, and the best resource is Jan's blog post,
nice URLs in CouchDB.
Briefly, you want a vhost which sends all traffic to your rewriter (which itself
is a flexible "bouncer" to all design doc functionality based on the URL).
curl -X PUT http://example.com:5984/_config/vhosts/example.com \
-d '"/db/_design/myapp/_rewrite"'
Then you want a rewrites field in your design doc, something like (not
tested)
[
  {
    "comment": "Updates should go through the update function",
    "method": "PUT",
    "from": "db/*",
    "to": "db/_design/myapp/_update/*"
  },
  {
    "comment": "Creates should go through the update function",
    "method": "POST",
    "from": "db/*",
    "to": "db/_design/myapp/_update/*"
  },
  {
    "comment": "Everything else is just like normal",
    "from": "*",
    "to": "../../../*"
  }
]
(Once again, I got this code from examples and existing code I have lying around, but it's not 100% debugged. However, I think it makes the idea very clear. Also remember this step is optional; the advantage is that you never have to change your client code.)