Node.js: creating jobs with TTL using Redis

I need to design a node.js app in this scenario:
1. The user requests /page with a cookie that contains a unique token.
2. Node.js creates a record for this token in Redis(?) with a 5-minute TTL. Tokens have n types, so Redis(?) should store each token together with its type.
3. If the user comes back before the record expires, the record's TTL is reset to 5 minutes.
4. If the user doesn't come back and the record expires, a function is triggered.
5. I also need the count of records belonging to a specific type (e.g. type 27).
What would be the best way to solve this? Is Redis the right choice for it? If I use Redis, how can I count tokens (item 5) and trigger the function on expiry (item 4)?
Would using https://github.com/Automattic/kue or something like that be a good choice?
Edit:
It seems a possible duplicate of Event on key expire.
In general, I'm asking: is Redis suitable for this problem? If you think it is, could you please give an example pattern I should follow? If not, what other solutions are there, maybe even without Redis or a 3rd-party tool?
Also, as I said in item 5 of the list above, Redis doesn't seem well suited to querying keys by a pattern, as described here: https://redis.io/commands/keys
. Because there seems to be no way to put an expiration on an individual member of a set (TTL for a set member), I cannot put tokens in a per-type set or hash. Thus I would have to scan keys to get the count of a group (type), and that is not efficient, as described at the link above.
Even if https://redis.io/topics/notifications solves the problem in item 4 of the list above, I'm still asking "which pattern/algorithm should I go with, and how?". I don't want you to code it for me, I just need a spark. A tutorial, guide, video, etc. would be great. Thanks!

You have many options to achieve what you want, but I'm going to describe one here that is very efficient in terms of retrieval speed, although it requires some extra space. It all depends on your use case, obviously.
Say the token is 12345 and the type is A.
When adding a token (we'll do it with a transaction):
MULTI
SETEX tokens:12345 300 ""
SET type:12345 "A"
INCR types:A
EXEC
When the key expires (after 300 seconds, or whenever Redis notices it as expired) we get notified using keyspace notifications (https://redis.io/topics/notifications), listening for the expired event (notify-keyspace-events must include at least Ex for this to fire):
PSUBSCRIBE __keyevent@0__:expired
When that subscription receives a message, in your code you'd need to:
MULTI
GET type:12345 # Returns A
DEL type:12345
DECR types:A
EXEC
In order to get the count of tokens of a specific type:
GET types:A
Any NodeJS Redis client would work just fine for this.
I want to make clear this is only one of the multiple options you have; one of Redis's advantages is its flexibility.
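For completeness, here is a minimal Node.js sketch of that pattern using the ioredis client (any client works). The key names mirror the commands above; the 300-second TTL and the onTokenExpired callback are assumptions standing in for your own expiry handler.

const Redis = require('ioredis');

const redis = new Redis();   // connection for regular commands
const sub = new Redis();     // dedicated connection for pub/sub

async function setup() {
  // Expired-event notifications must be enabled (can also be done in redis.conf).
  await redis.config('SET', 'notify-keyspace-events', 'Ex');
  await sub.psubscribe('__keyevent@0__:expired');

  sub.on('pmessage', async (pattern, channel, expiredKey) => {
    if (!expiredKey.startsWith('tokens:')) return;
    const token = expiredKey.slice('tokens:'.length);
    const type = await redis.get(`type:${token}`);
    await redis.multi().del(`type:${token}`).decr(`types:${type}`).exec();
    onTokenExpired(token, type);   // hypothetical: whatever should run on expiry (item 4)
  });
}

// Call this on every /page hit: create the token or just refresh its TTL.
async function touchToken(token, type) {
  const created = await redis.set(`tokens:${token}`, '', 'EX', 300, 'NX');
  if (created === 'OK') {
    await redis.multi().set(`type:${token}`, type).incr(`types:${type}`).exec();
  } else {
    await redis.expire(`tokens:${token}`, 300);   // user came back: reset the 5-minute TTL
  }
}

// Count of tokens of a given type (item 5).
const countType = (type) => redis.get(`types:${type}`);

Note that touchToken only increments the per-type counter the first time a token is seen, so a returning user just refreshes the TTL instead of being counted twice.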

Related

PostgreSQL: Is it possible to limit inserts per user based on time difference between timestamp column and current time?

I have an issue where two almost concurrent requests (+- 10 ms apart) by the same user (unintentionally duplicated by the client side) successfully execute the whole use-case logic twice. I can't really solve this situation in my API's code, so I've been thinking about how to limit one user_id to being able to insert a row into table order at most once every second, for example.
I want to achieve this: if a row with user_id X exists in table order and that row was created (inserted) less than 1 second ago, an insert with user_id X should fail.
This could be an effective way of avoiding requests unintentionally duplicated by the client side, because I can't imagine a situation where a user would intentionally send two complex requests less than 1 second apart. I'm also interested in any other ideas, for example the proper way to deal with similar situations in APIs.
There is one problem with your idea: if the server becomes really slow for just a second, the two orders will arrive more than one second apart at the database and will both be inserted.
I'd recommend generating a unique ID, like a UUID, in the front-end, and sending that with the request. You could, for example, generate a new one every page load. Then, if the server sees that the received UUID already exists in the database, the order is skipped.
This avoids any potential timing issues, but also retains the possibility of someone re-ordering the exact same products.
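A rough sketch of that idea with node-postgres, assuming an orders table that has a unique client_request_id column for the UUID sent by the front end:

const { Pool } = require('pg');
const pool = new Pool();   // connection settings come from the usual PG* environment variables

// requestId is the UUID generated by the front end and sent along with the order.
async function createOrder(userId, requestId, payload) {
  const result = await pool.query(
    `INSERT INTO orders (user_id, client_request_id, payload)
     VALUES ($1, $2, $3)
     ON CONFLICT (client_request_id) DO NOTHING
     RETURNING id`,
    [userId, requestId, payload]
  );
  // rowCount === 0 means this UUID was already used: a duplicate request, silently skipped.
  return result.rowCount === 1 ? result.rows[0].id : null;
}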
You can do it with an EXCLUDE constraint. You need to create your own immutable helper function, and use an extension.
create extension btree_gist;
create function addsec(timestamptz) returns tstzrange immutable language sql as $$
select tstzrange($1,$1+interval '1 second')
$$;
create table orders (
userid int,
t timestamptz,
exclude using gist (userid with =, addsec(t) with &&)
);
But you should probably change the front end anyway to include a validation token, as currently it may be subject to CSRF attacks.
Note that EXCLUDE constraints may be much less efficient than UNIQUE constraints. Also, I'm not 100% sure that addsec really is immutable. There might be weird things with leap seconds or something that messes it up.
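On the API side, the rejected insert then surfaces as an exclusion_violation error (SQLSTATE 23P01), which you can catch and treat as a duplicate. A small node-postgres sketch, following the table definition above:

const { Pool } = require('pg');
const pool = new Pool();

async function insertOrder(userId) {
  try {
    await pool.query('INSERT INTO orders (userid, t) VALUES ($1, now())', [userId]);
    return true;
  } catch (err) {
    if (err.code === '23P01') {   // exclusion_violation: a row less than 1 second old already exists
      return false;               // treat as an unintentional duplicate
    }
    throw err;
  }
}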

What is the best practice for storing rarely modified database values in NodeJS?

I've got a node app that works with Salesforce for a few different things. One of the features is letting users fill in a form and pushing it to Salesforce.
The form has a dropdown list, so I query Salesforce to get the list of available dropdown items and make them available to my form via res.locals. Currently I'm getting these values via some middleware, storing them in the user's session, and then, if the session value is set, using it; if not, querying Salesforce and pulling them in.
This works, but it means every user's session data in Mongo holds a whole bunch of picklist vals (they are the same for all users). I very rarely make changes to the values on the Salesforce side of things, so I'm wondering if there is a "proper" way of storing these vals in my app?
I could pull them into a Mongo collection, and trigger a manual refresh of them whenever they change. I could expire them in Mongo (but realistically if they do need to change, it's because someone needs to access the new values immediately), so not sure that makes the most sense...
Is storing them in everyone's session the best way to tackle this, or is there something else I should be doing?
To answer your question quickly, you could add them to a singleton object (instead of session data, which is per user), though you'd still need to manage their lifetime (i.e. pull them again when they change). A singleton can be implemented as a simple module that can be require()d and returns a shared object...
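As a rough sketch of that module (the Salesforce helper is whatever you already use to query the picklists; the one-hour refresh interval is just an assumption):

// picklists.js - require()d by any route/middleware that needs the values;
// Node caches the module, so this state is shared across all requests.
const { fetchPicklistsFromSalesforce } = require('./salesforce'); // hypothetical existing helper

const TTL_MS = 60 * 60 * 1000;   // refresh at most once an hour
let cached = null;
let fetchedAt = 0;

async function getPicklists() {
  if (!cached || Date.now() - fetchedAt > TTL_MS) {
    cached = await fetchPicklistsFromSalesforce();
    fetchedAt = Date.now();
  }
  return cached;
}

module.exports = { getPicklists };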
But if I was to do something like this, I would go about doing it differently:
I would create an API endpoint that returns your list data (possibly giving it a query parameter to return different lists).
If you can afford the data being outdated for a short period of time, you can write your API so that the response is cached (HTTP cache headers, with a short max-age).
If your data has to be fresh in real time, then your API should return an ETag in the response. The ETag header basically acts like a checksum for your data; a good checksum would be the "last updated date" of all the records in a collection. Upon receiving a request you check for the "If-None-Match" header, which contains the checksum; at that point you do a "lite" call to your database to pull just the checksum. If it matches, you return HTTP 304 (Not Modified); otherwise you pull the full data you need and return it (along with the new checksum in the response ETag). Basically you are letting the browser do the caching...
Note that you can also combine the HTTP caching and ETag approaches and use them together.
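A minimal Express sketch of the ETag variant; the two helpers standing in for a cheap "last updated" query and the full picklist fetch are assumptions:

const express = require('express');
const app = express();

// Hypothetical helpers: a lightweight query for the collection's last-updated date,
// and the full (more expensive) fetch of the picklist values.
const { getPicklistLastUpdated, getPicklistValues } = require('./picklist-store');

app.get('/picklists/:name', async (req, res) => {
  const lastUpdated = await getPicklistLastUpdated(req.params.name);
  const etag = `"${lastUpdated.getTime()}"`;

  if (req.headers['if-none-match'] === etag) {
    return res.status(304).end();           // the client's cached copy is still fresh
  }

  const values = await getPicklistValues(req.params.name);
  res.set('ETag', etag);
  res.set('Cache-Control', 'max-age=60');   // optional: combine with a short-lived HTTP cache
  res.json(values);
});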
More resources on this here:
https://devcenter.heroku.com/articles/increasing-application-performance-with-http-cache-headers
https://developers.facebook.com/docs/marketing-api/etags

How to stream data in a Node JS + Mongo DB REST API?

I am developing a REST API in Node.js + MongoDB, handled with Mongoose middleware, in which one of the methods allows the recovery of contents associated with a certain user.
So far I've been retrieving all of the user's content, but the amount of data is starting to grow, and now I need to stream the data somehow.
The behaviour I want to implement would be for the server to answer the request with a stream of 10-20 items, and then, if the client needs more data, it would need to send another request, which would be answered with the following 10-20 items.
All I can come up with would be to answer with those first 10-20 items, and then, in case the client needs more data, to provide a new (optional) parameter for my method, which would allow the client to send the last item's id, so the server can send back the following 10-20 items.
I know that this approach will work, but I feel like it's too raw; there's gotta be a cleaner way to implement this behaviour, since it's the kind of behaviour a lot of web applications must implement.
So, my question would be: Do you know of any better way to solve this problem?
Thanks in advance.
Provide the ability to read an offset and a limit from the request, then do something like:
db.collection.find().skip(20).limit(10)
Also, set defaults on APIs you build so that someone can't request a million records at once. Maybe max results is always 200, and if the request doesn't have the above params set, return up to the first 200 results.
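For example, a minimal Express + Mongoose route with a capped limit (the model and route names are just placeholders):

const express = require('express');
const router = express.Router();
const Content = require('../models/content');   // hypothetical Mongoose model

const DEFAULT_LIMIT = 20;
const MAX_LIMIT = 200;

router.get('/users/:id/contents', async (req, res) => {
  const limit = Math.min(parseInt(req.query.limit, 10) || DEFAULT_LIMIT, MAX_LIMIT);
  const skip = parseInt(req.query.skip, 10) || 0;

  const items = await Content.find({ user: req.params.id })
    .sort({ _id: -1 })   // a stable order so pages don't shuffle between requests
    .skip(skip)
    .limit(limit);

  res.json(items);
});

module.exports = router;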

CQRS Event Sourcing check username is unique or not from EventStore while sending command

Event sourcing works perfectly when we have a particular unique EntityID, but when I try to get information from the event store by anything other than a particular EntityId, I have a tough time.
I am using CQRS with event sourcing. As part of event sourcing we store the events in a SQL table with columns (EntityID (unique key), EventType, EventObject (e.g. UserAdded)).
When storing an EventObject we just serialize the .NET object and store it in SQL, so all the details related to a UserAdded event are in XML format. My concern is that I want to make sure the userName present in the DB is unique.
So, while issuing the AddUser command, I have to query the event store (SQL DB) to find out whether that particular userName is already present. To do that I need to deserialize all the UserAdded/UserEdited events in the event store and check whether the requested username is present.
But in CQRS, commands are not supposed to query, perhaps because of race conditions.
So I tried this: before sending the AddUser command, query the event store, get all the userNames by deserializing all UserAdded events, and if the requested username is unique then issue the command, otherwise throw an exception that the userName already exists.
With the above approach we need to query the entire DB, and we may have hundreds of thousands of events per day, so the query/deserialization will take a lot of time, which will lead to performance issues.
I am looking for a better approach/suggestion for keeping usernames unique, whether by getting all userNames from the event store or by any other approach.
So, your client (the thing that issues the commands) should have full faith that the command it sends will be executed, and it must do this by ensuring, before it sends the RegisterUserCommand, that no other user is registered with that email address. In other words, your client must perform the validation, not your domain or even the application services that surround the domain.
From http://cqrs.nu/Faq
This is a commonly occurring question since we're explicitly not performing cross-aggregate operations on the write side. We do, however, have a number of options:
Create a read-side of already allocated user names. Make the client query the read-side interactively as the user types in a name.
Create a reactive saga to flag down and inactivate accounts that were nevertheless created with a duplicate user name (whether by extreme coincidence, or maliciously, or because of a faulty client).
If eventual consistency is not fast enough for you, consider adding a table on the write side, a small local read-side as it were, of already allocated names. Make the aggregate transaction include inserting into that table.
Querying different aggregates with a repository in a write operation as part of your business logic is not forbidden. You can do that in order to accept the command or reject it due to duplicate user by using some domain service (a cross-aggregate operation). Greg Young mentions this here: https://www.youtube.com/watch?v=LDW0QWie21s&t=24m55s
In normal scenarios you would just need to query all the UserCreated + UserEdited events.
If you expect to have thousands of these events per day, maybe your events are bloated and you should design them more atomically. For example, instead of having a UserEdited event raised every time something happens on a user, consider having UserPersonalDetailsEdited and UserAccessInfoEdited or similar, where the fields that must be unique are treated differently from the rest of the user fields. That way, querying all the UserCreated + UserAccessInfoEdited events before accepting or rejecting a command would be a lighter operation.
Personally I'd go with the following approach:
More atomicity in events so that everything that touches fields that should be globally unique is described more explicitly (e.g: UserCreated, UserAccessInfoEdited)
Have projections available on the write side in order to query them during a write operation. For example, I'd subscribe to all UserCreated and UserAccessInfoEdited events in order to keep a queryable "table" with all the unique fields (e.g. email).
When a CreateUser command arrives to the domain, a domain service would query this email table and accept or reject the command.
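As a very rough, library-agnostic sketch of that idea (the event names follow the examples above; the in-memory Set stands in for the small queryable table you'd keep next to your event store):

// Write-side projection of allocated emails.
const allocatedEmails = new Set();

// Fed by a subscription to the event stream.
function applyEvent(event) {
  if (event.type === 'UserCreated' || event.type === 'UserAccessInfoEdited') {
    allocatedEmails.add(event.data.email.toLowerCase());
  }
}

// Domain service consulted during the write operation.
function isEmailTaken(email) {
  return allocatedEmails.has(email.toLowerCase());
}

// Command handler sketch: reject CreateUser if the email is already allocated.
function handleCreateUser(command, appendEvent) {
  if (isEmailTaken(command.email)) {
    throw new Error(`Email ${command.email} is already in use`);
  }
  appendEvent({ type: 'UserCreated', data: { userId: command.userId, email: command.email } });
}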
This solution relies a bit on eventual consistency: there's a possibility that the query tells us the field has not been used and lets the command succeed, raising a UserCreated event, when actually the projection hadn't yet been updated from a previous transaction, therefore causing a situation where two values in the system are not globally unique.
If you want to completely avoid these uncertain situations because your business can't really deal with eventual consistency my recommendation is to deal with this in your domain by explicitly modeling them as part of your ubiquitous language. For example you could model your aggregates differently since it's obvious that your aggregate User is not really your transactional boundary (i.e: it depends on others).
As often, there's no right answer, only answers that fit your domain.
Are you in an environment that really requires immediate consistency? What would be the odds of an identical user name being created between the moment uniqueness is checked by querying (say, at client side) and when the command is processed? Would your domain experts tolerate, for instance, one user name conflict out of 1 million (that can be compensated afterwards)? Will you have a million users in the first place?
Even if immediate consistency is required, "user names should be unique"... in which scope? A Company? An OnlineStore? A GameServerInstance? Can you find the most restricted scope in which the uniqueness constraint must hold and make that scope the Aggregate Root from which to sprout a new user? Why would the "replay all the UserAdded/UserEdited events" solution be bad after all, if the Aggregate Root makes these events small and simple?
With GetEventStore (from Greg Young) you can use any string as your aggregateId/StreamId. Use the username as the id of the aggregate instead of GUIDs, or a combination like "mycompany.users.john" as the key, and... voilà! You get user name uniqueness for free!

Periodic checks in node.js and mongodb (searching for missing record)

I'm receiving periodic reports from a bunch of devices and storing them in a MongoDB database. They come in roughly every 20-30 seconds. However, I would like to check when a device has not sent a report for some time (for example, the last report is more than 3 minutes old), and in that case send an email or trigger some other mechanism.
So the issue is how to check for the missing event in the most correct manner. I considered a cron job and a bunch of timers, one per device record.
A cron job looks OK, but I fear that running a full scan query will overload the server/DB and cause performance issues. Is there any kind of database structure that could help with this (some kind of index, maybe)?
Timers are probably the simpler solution, but I'm not sure how many timers I can create, because I may end up with quite a number of devices.
Can anybody give me an advice what is the best approach to this? Thanks in advance.
Do you use Redis or something similar on this server? Set device ID as key with any value, e.g. 1. Expire key in 2-3 min and update expiration every time the device connects. Then fire cron jobs to check if ID is missing. This should be super fast.
Also, you may use MongoDB's TTL collections (documents that expire after a period) instead of Redis, but in this case you will have to do a bunch of round trips to the DB server. http://docs.mongodb.org/manual/tutorial/expire-data/
Update:
As you do not know which IDs you will be looking for, this rather complicates the matter. Another option is to keep a log in a separate MongoDB collection with a timestamp of the last ping you got from each device.
Index the timestamps and query .find({timestamp: {$lt: new Date(Date.now() - 60 * 1000)}}) to get a list of stale devices.
It's very important that you update the existing document rather than create a new one on each ping, so that if you have 10 devices connected you have 10 documents in this collection. That's why you need a separate collection for this log.
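A small sketch of that log collection using the official Node.js MongoDB driver; the database and collection names and the 3-minute threshold are assumptions:

const { MongoClient } = require('mongodb');

const client = new MongoClient('mongodb://localhost:27017');
const pings = client.db('monitoring').collection('device_pings');

// One document per device, updated in place on every incoming report.
async function recordPing(deviceId) {
  await pings.updateOne(
    { _id: deviceId },
    { $set: { timestamp: new Date() } },
    { upsert: true }
  );
}

// Run from the cron job: devices whose last ping is older than 3 minutes.
async function findStaleDevices() {
  const cutoff = new Date(Date.now() - 3 * 60 * 1000);
  return pings.find({ timestamp: { $lt: cutoff } }).toArray();
}

// At startup: connect and index the timestamps so the stale query stays cheap.
async function init() {
  await client.connect();
  await pings.createIndex({ timestamp: 1 });
}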
There's a great article on time series data. I hope you find it useful http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb
An index on deviceid+timestamp handles this neatly.
Use distinct() to get your list of devices
For each device d,
db.events.find({ deviceid: d }).sort({ timestamp : -1 }).limit(1)
gives you the most recent event, whose timestamp you can now compare with the current time.
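A short sketch of that query pattern with the Node.js MongoDB driver (the connection string, database and collection names, and the 3-minute threshold are assumptions):

const { MongoClient } = require('mongodb');

async function findStaleDevices(thresholdMs = 3 * 60 * 1000) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const events = client.db('monitoring').collection('events');
  const stale = [];

  for (const deviceId of await events.distinct('deviceid')) {
    // Most recent event for this device; served by the deviceid+timestamp index.
    const last = await events.findOne({ deviceid: deviceId }, { sort: { timestamp: -1 } });
    if (last && Date.now() - last.timestamp.getTime() > thresholdMs) {
      stale.push(deviceId);
    }
  }

  await client.close();
  return stale;
}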
