In a Node project with many .js files, suppose I have a file that manages expensive state: it provides a json blob that it regularly fetches from the web in short intervals. Its data is cached and is requested internally far more than its update interval.
let provider = require('./config-provider.js');
let config = provider.get(); // returns locally cached JSON blob
Suppose the above code exists in 10 different files in my application. This is going to create 10 different instances of this updater, all making expensive web calls to update the config.
I would like to reference a single instance of this config provider across my app. However, this seems to break the modular design of Node apps.
I could always use a global object, but this is obviously frowned upon.
Another solution is to make a complex web of parent / child references across my application. This also seems messy.
Is there some suggested best practice for referencing a single stateful module across the span of one's Node application?
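In this case you should use the "singleton" pattern as described here. I also find this code example useful. Note that Node caches a module after its first require(), so a module that exports a single instance is already shared by every file that requires it within the same process. Notice that some developers frown upon singletons in Node.js, as discussed here.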
Regardless, this pattern makes sense in your case, as you said:
I would like to reference a single instance of this config provider across my app
Another Important Consideration
Besides coding, when you deploy your application to production, you have to consider if you want this single instance replicated as well. For example, if you deploy 3 Node JS instances, do you still want a single instance handling all your JSON blobs? Or is it OK to have it replicated 3 times as well?
A while back I had a scheduling module deployed with 5 Node JS instances. Obviously I couldn't have the scheduler fire the same job 5 times, so it had its own separate instance AND the scheduling module itself was a singleton. I bring this up because there's a clear distinction between singleton objects and how Node JS can be deployed multiple times.
I hope this helps.
Related
I have two Azure Functions. I can think of them as "Producer-Consumer". One is an "HttpTrigger" based Function (Producer) which can be fired randomly. It writes the input data to a static "ConcurrentDictionary". The second one is a "Timer Trigger" Azure Function (Consumer). It periodically reads the data from the same "ConcurrentDictionary" that the "Producer" Function writes to and then does some processing.
Both Functions are within the same .NET project (but in different classes). The in-memory data sharing through the static "ConcurrentDictionary" works perfectly fine when I run the application locally. While running locally, I assume that they are running under the same process. However, when I deploy these Functions in Azure Portal (they are in the same Function App resource), I found that data sharing through the static "ConcurrentDictionary" is not working.
I am just curious to know, if in Azure Portal, both the Functions have their own process (Probably, that's why they are not able to share in-process static collection). If that is the case, what are my options that these two Functions work as proper "Producer-Consumer"? Will keeping both the Functions in the same class help?
Probably, the scenario is just opposite to what is described in the post - "https://stackoverflow.com/questions/62203987/do-azure-function-from-same-app-service-run-in-same-instance". As against the question in the post, I would like both the Functions to use the same static member of a static class instance.
I am sorry that I cannot experiment too much because the deployment is done through an Azure DevOps pipeline, and too many check-ins to the repository are slightly inconvenient. As I mentioned, it works well locally, so I don't know how to recreate locally what's happening in Azure Portal so that I can try different options. Is there any configuration setting that I am missing?
Don't do that. Use an Azure queue, Event Grid, Service Bus or something else that is reliable, but just don't try using a shared object. It will fail as soon as scale-out happens or as soon as one of the processes dies. Think about functions as independent pieces and do not try to go against the framework.
Yes, it might work when you run the functions locally, but then you are running on a single machine and the runtime might use the same process. Once deployed, that ain't true anymore.
If you really, really don't want to decouple your logic into a fully separated producer and consumer, then write a single function that uses an in-process queue or collection and have that function deal with the processing.
I have two applications, a node.js app running on node-webkit, and a lua application. I would need to pass data between the two applications at regular intervals, say every 5 to 15 seconds.
The node.js application is the one creating the data, and the lua application is the one consuming the data. The data only goes to one direction.
How should I do the data transfer? I would prefer json/xml for the data, but actually it can be in any other format as well. The data moved at a time is not large. It's just some ten parameters at a time.
My initial thought was to just make the node app act as server and serve the data via rest api, and the lua app just read the page with LuaSocket or such. But is there a better way to do the transfer, if both of the apps reside on same machine? Currently the lua app is running in Windows, but that could change.
My background is in web development, so I'm totally lost when it comes to sharing data between applications. I'm also new to lua. Thanks for any answers.
There are many ways to accomplish such a task. I will describe two of them.
The first approach, which I like most, is using a remote queue or store such as Apache Kafka, Redis, RabbitMQ, or even ZooKeeper for small data. Alternatively, store the data in a database. All these remote storage systems have very good Node.js modules and all of them can handle JSON and any other data type very well.
Unless this is just a mere test app, it is good to build such fault tolerance into your apps. In your case, imagine if the consumer Lua app goes down, or the opposite, Node.js producer app goes down. You don't want the failure on one app to affect another app. In production environment, it is best to isolate apps and tasks like this. Another advantage of this approach is that one day you may decide to rewrite your consumer in Node.js, Scala, etc. or have multiple consumers in different languages. This doesn't require your server to stop or change. It even doesn't have to know about any changes to the consumer.
So, your producer server always pushes data to a remote data store/queue independently, and a consumer server reads and then deletes the data from this remote store at its own pace.
If you used a database, you would read the new records, consume them, and once done, remove them from the database. This approach allows you to shut down the consumer and producer apps independently for any reason, such as an upgrade.
Another approach is to establish a direct network connection from the producer to the consumer over TCP. The producer would be a client pushing data to the consumer, which acts as the server. This can be accomplished with Node's built-in net module, whether the apps share a machine or run on different physical machines. But as you can see, this is a less reliable solution: if the consumer goes down, the producer can no longer push new data, and you then have to decide whether to discard it or store it somewhere. If you store it somewhere, you end up reimplementing the first approach explained above.
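A rough sketch of the producer side of this second approach, using only Node's built-in net module. The host, port, 5-second interval and the function names (encodeMessage, pushOnce, startProducer) are assumptions for illustration; each message is sent as one line of JSON so that the consumer can read it with a line-oriented socket read.

```javascript
'use strict';
const net = require('net');

// Frame one message as a single line of JSON (newline-delimited JSON).
function encodeMessage(params) {
  return JSON.stringify(params) + '\n';
}

// Connect to the consumer, send one message, then close the socket.
// host and port are assumptions; the consumer must be listening there.
function pushOnce(host, port, params) {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ host, port }, () => {
      socket.end(encodeMessage(params)); // write the line, then close
    });
    socket.once('error', reject); // e.g. the consumer is down
    socket.once('close', (hadError) => {
      if (!hadError) resolve();
    });
  });
}

// Push fresh data on a fixed interval. A failed push is simply discarded
// here, which is exactly the weakness described above.
function startProducer(host, port, readParams, intervalMs) {
  return setInterval(() => {
    pushOnce(host, port, readParams()).catch((err) => {
      console.error('push failed, discarding data:', err.message);
    });
  }, intervalMs);
}

// Example (not started automatically):
// startProducer('127.0.0.1', 9000, () => ({ speed: 42 }), 5000);
```

Opening a new connection per push keeps the sketch short and means a restarted consumer is picked up on the next interval; a long-lived connection with reconnect logic would be the usual alternative.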
I have a simple node.js server app built that I'm hoping to test out soon. It's single threaded and works fine without any child processing whatsoever. My problem is that the server box has multiple cores and the simplest way I can think to utilize them is by running multiple instances of the server app. However, this would require them all to be on the same domain name and so some sort of request routing is required. I personally don't have much experience with servers in general and don't know if this is a task for node.js to perform or some other less complicated program (or more complicated). If there is a node.js mechanism to solve this, for example, if one running instance can send incoming requests to the next instance, then how would I detect when this needs to happen? Conversely, if I use some other program, how will it manage to detect when it needs to start talking to a new instance?
Node.js includes built-in support for managing a cluster of instances of your application to take advantage of multiple cores via the cluster module.
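A minimal sketch of how that could look. The port, the workerCount helper and the startCluster wrapper are assumptions for illustration; cluster.fork(), the isPrimary/isMaster flags and the shared listening port are part of the cluster module itself. No explicit routing code is needed: the workers all listen on the same port and the cluster module distributes incoming connections among them.

```javascript
'use strict';
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Older Node versions expose isMaster instead of isPrimary.
const isPrimary = cluster.isPrimary ?? cluster.isMaster;

// Use the requested number of workers, clamped to between 1 and the
// number of cores; default to one worker per core.
function workerCount(requested, cores) {
  return Math.max(1, Math.min(requested ?? cores, cores));
}

function startCluster(port, requestedWorkers) {
  if (isPrimary) {
    const count = workerCount(requestedWorkers, os.cpus().length);
    for (let i = 0; i < count; i++) cluster.fork();
    cluster.on('exit', (worker) => {
      console.error(`worker ${worker.process.pid} exited`);
    });
  } else {
    // Every worker runs the same server code and listens on the same port.
    http.createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    }).listen(port);
  }
}

// Example (not started automatically):
// startCluster(8080);
```

Because each worker is a separate process, in-memory state is not shared between them; anything that must be common to all workers has to live outside the process.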
The scenario is simple: using EF code first migrations, with multiple azure website instances, decent size DB like 100GB (assuming azure SQL), lots of active concurrent users..say 20k for the heck of it.
Goal: push out update, with active users, keep integrity while upgrading.
I've sifted through all the docs I can find. However the core details seem to be missing or I'm blatantly overlooking them. When Azure receives an update request via FTP/git/tfs, how does it process the update? What does it do with active users? For example, does it freeze incoming requests to all instances, let items already processing finish, upgrade/replace each instance, let EF migrations process, then let traffic start again? If it upgrades/refreshes all instances simultaneously, how does it ensure EF migrations run only once? If it refreshes instances live in a rolling upgrade process (upgrade 1 at a time with no inbound traffic freeze), how could it ensure integrity since instances in the older state would/could potentially break?
The main question, what is the real process after it receives the request to update? What are the recommendations for updating a live website?
To put it simply, it doesn't.
EF Migrations and Azure deployment are two very different beasts. Azure deployment gives you a number of options including update and staging slots. You've probably seen Deploy a web app in Azure App Service; for other readers this is a good starting point.
In general, the Azure deployment model is concerned with the active connections to the IIS/Web Site stack. An update ensures uninterrupted user access by taking the instance being deployed out of the load balancer pool and redirecting traffic to the other instances. It then cycles through the instances, updating them one by one.
This means that at any point in time, during an update deployment there will be multiple versions of your code running at the same time.
If your EF Model has not changed between code versions, then Azure deployment works like a charm; users won't even know that it is happening. But if you need to apply a migration as part of the deployment, BEWARE.
In general, EF will only load the model if the code and DB versions match. It is very hard to use EF Migrations and support multiple code versions of the model at the same time.
EF Migrations are largely controlled by the Database Initializer.
See Upgrade the database using migrations for details.
As a developer you get to choose how and when the database will be upgraded, but know that if you are using Migrations and deployment updates:
1. New code will not easily run against the old data schema.
2. If the old code/app restarts, many default initialization strategies will attempt to roll the schema back; if this happens, refer to point 1. ;)
3. If you get around the EF model loading up against the wrong version of the schema, you will experience exceptions and general failures when the code tries to use schema elements that are not there.
The simplest way to manage an EF migration on a live site is to take all instances of the site down for deployments that include an EF Migration.
- You can use a maintenance page or a redirect, that's up to you.
If you are going to this trouble, it is probably best to manually apply the DB update, then if it fails you can easily abort the deployment, because it hasn't started yet!
Otherwise, deploy the update and the first instance to spin up will run the migration, if the initializer has been configured to do so...
If you absolutely must have continuous deployment of both site code/content and model updates then EF migrations might not be the best tool to get started with as you will find it very restrictive OOTB for this scenario.
I was watching a "Fundamentals" course on Pluralsight and this was touched upon.
If you have 3 sites, Azure will take one offline and upgrade that, and then when ready restart it. At that point, the other 2 instances get taken offline and your upgraded instance will start, thus running your schema changes.
When those 2 come back the EF migrations would already have been run, thus your sites are back.
In theory then it all sounds like it should work, although depending upon how much EF migrations need running, requests may be delayed.
However, the comment from the author was that in this scenario (i.e. making schema changes) you should consider if your website can run in this situation. The suggestion being that you either need to make your code work with both old and new schemas, or show a "maintenance system down page".
The summary seems to be that depending on what you are actually upgrading, this will impact and affect your choices and method of deployment.
Generally speaking, if you want to support active upgrades you need to support multiple versions of your application simultaneously. This is really the only way to reliably stay active while you migrate/upgrade. Also consider feature switches to scale up your conversion in a controlled manner.
I'm building an application using Google App Engine. The application consists of several servlets, one of which has a static member object that holds a lot of internal state. Multiple Android phones contact this servlet, causing the servlet to update the state of the static member object. However, when multiple phones happen to talk to the server at the same time, I get synchronization issues in which the static object is being modified by multiple threads at once. I've tried throwing in some synchronized blocks, but this does not seem to help.
I think the reason has something to do with how App Engine spawns threads for HTTP requests, but I'm not sure. What's the standard way to synchronize access to shared objects in App Engine? Thanks!
Your GAE app can be started on different servers, and there can be, and will be, several instances running at one time. Any instance can be started or killed at any time, without any prior notice. So any in-memory state cannot be relied upon.
You have to use either the memcache service or a database instead.