MLflow Registry high availability

I am running the MLflow registry using mlflow server (https://mlflow.org/docs/latest/model-registry.html). The server runs fine, and if it crashes for any reason it restarts automatically. But while it is restarting, the server is unavailable.
Is it possible to run multiple instances in parallel behind a load balancer? Is this safe, or could it lead to inconsistencies?

Yes, it's possible to have multiple instances of the MLflow tracking server running behind a load balancer.
Because the tracking server is stateless, you can have multiple instances log to a replicated primary DB as the backend store, with a hot standby taking over if the primary fails.
How to set up replicated instances of your backend store will vary depending on which one you elect to use, so there is no single recipe covering all the different scenarios and configurations.
I would check the documentation of your backend DB and load balancer for how to federate requests across multiple instances of an MLflow tracking server, and how to fail over to a hot standby or replicated DB.
The short of it: the MLflow tracking server is stateless.
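To make this concrete, here is a minimal sketch of the setup, assuming two tracking-server instances sharing one replicated Postgres backend store behind a load balancer; the hostnames, credentials, and bucket name are hypothetical:

```python
# Each tracking-server instance is started against the same replicated backend
# store, e.g. (hypothetical hosts and credentials):
#   mlflow server --backend-store-uri postgresql://mlflow:pw@db-primary/mlflow \
#                 --default-artifact-root s3://my-mlflow-artifacts --host 0.0.0.0
#
# Clients talk only to the load balancer; since the server is stateless,
# any instance can serve any request.
import mlflow

mlflow.set_tracking_uri("http://mlflow-lb.internal:5000")  # hypothetical LB address

with mlflow.start_run():
    mlflow.log_param("model", "demo")
    mlflow.log_metric("accuracy", 0.93)
```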

The above suggests active-passive. Can the server be run active-active (the load balancer sends requests to any of the running instances)?
There are edge cases where even a stateless system can have problems with multiple writers; for example, two instances registering a version of the same model at the same moment could race on the version number unless the backend database enforces that constraint.

Related

How to host multiple databases for a micro-service application?

I have an application in mind that would be built using Node, MongoDB plus another DB, Kubernetes, RabbitMQ, Docker, and React as a front end. The application will be built in a microservice architecture. We all know that for a monolithic app all you need is one DB (MongoDB, MySQL, etc.), but for a microservice app you can have multiple databases. My question is: do I need to buy multiple, separate databases and connect each service to them, or how does it work in a microservices design?
At the moment I have a sample microservices app running on my local machine using Docker, connected to multiple databases (one database per service). I am just trying to get an idea of how this works with companies like DigitalOcean or AWS.
Any input on this would be great.
I am just trying to figure out how this is going to work when it comes to production later, so that I am aware of costs and deployments. I have done some research on DigitalOcean, AWS, etc., but I still can't figure out how they work.
Thanks in advance.
You don't need multiple instances of a DBMS running. You can easily use one VM with one MongoDB instance running on it.
When you scale you might want separate machines running DB instances for your services, but at the start you can just separate them logically, making sure services never communicate with each other through the DB (see the sketch below).
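As a concrete illustration of that logical separation, here is a minimal sketch with pymongo: one MongoDB server, one database per service (the service and database names are made up):

```python
# One MongoDB server, logically partitioned: each service owns its own database
# and never reads from or writes to another service's database.
# Service and database names are illustrative.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")

orders_db = client["orders_service"]  # owned by the orders service only
users_db = client["users_service"]    # owned by the users service only

orders_db.orders.insert_one({"user_id": 1, "total": 9.99})
users_db.users.insert_one({"_id": 1, "name": "Ada"})
```

Later, each service's database can be moved to its own server without changing service code beyond the connection string.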
Chris Richardson on his microservices.io website says:
There are a few different ways to keep a service's persistent data private. You do not need to provision a database server for each service. For example, if you are using a relational database then the options are:
- Private-tables-per-service – each service owns a set of tables that must only be accessed by that service
- Schema-per-service – each service has a database schema that's private to that service
- Database-server-per-service – each service has its own database server.
Source: https://microservices.io/patterns/data/database-per-service.html

How does Elastic Beanstalk know if EC2 server is busy?

I've got a NodeJS application that does some moderately intense logic work when a user requests it. For example, a user on the frontend can click Analyze and the server will perform the work, which could take 30 seconds to 1 minute (non-blocking).
My app is not aimed at the general public but at an audience of a few thousand, so there is a chance that several people might run an analysis at the same time.
I'm currently planning to deploy the app via Elastic Beanstalk, but I am not sure exactly how it will deal with an instance that is busy, and whether I have to implement some kind of custom signal to tell the load balancer to send requests to another instance while the current one is performing an analysis.
I understand that Lambdas are often held up as an option in this case, but I would much prefer to keep it simple and keep the code in my Node app.
How should I design this to ensure the app could handle doing analysis and still handling other requests normally?
Elastic Beanstalk uses an Auto Scaling group to launch and maintain the EC2 instances required to run the application. With Auto Scaling groups you can increase or decrease the EC2 instance count dynamically via scaling policies. By default, Auto Scaling supports scaling on metrics such as CPU, Network In, Network Out, request count, and latency. You can use any of these metrics to scale your infrastructure dynamically.
You can refer to AWS Documentation here for more information.
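For example, a target-tracking policy can be attached to the Auto Scaling group that Elastic Beanstalk manages. A minimal boto3 sketch; the group name is a placeholder, as Elastic Beanstalk generates its own:

```python
# Attach a CPU-based target-tracking scaling policy to the Auto Scaling group
# that Elastic Beanstalk manages for the environment. The group name below is
# a placeholder; find the real one in the EC2 console or via
# describe_auto_scaling_groups().
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="awseb-my-env-AutoScalingGroup",  # placeholder
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,  # add instances when average CPU exceeds ~50%
    },
)
```

The same thresholds can also be set in the Elastic Beanstalk console under the environment's capacity settings.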

Azure Web App not failing over when an instance fails

We deployed a Node.js Azure Web App and defined a minimum of 2 instances (for scalability and high availability).
It seems like the LB is balancing the load between the instances, but it doesn't react to an instance error (crash) and insists on balancing the load across all the instances, including the one that crashed.
Is there a way to set a fail-over mechanism for high-availability?
The load balancer used by Azure App Service will continue to send requests to individual web servers as long as the underlying virtual machines are up and running.
To work around the issue you are running into, you can try configuring the "auto-heal" feature. If the app gets "stuck" in a permanently broken state, auto-heal rules can be configured to automatically restart it.
More details on auto-heal here:
Auto-heal for Azure Web Sites

Load balancing stateful web application

I am trying to scale a web app on Azure from a single web instance to multiple instances. The web app does a fair amount of processing of per-user state; it's also fairly interactive, so latency is important. We currently have a single database, and testing has shown it is not the bottleneck, so for this question let's assume we don't have to worry about scaling it: all instances will hit the same database. In this case, I think per-user load balancing is the best option, as per-request balancing will result in per-user state being duplicated across lots of web instances. Apart from the issue of maintaining consistency, I am concerned this would result in unacceptable latency for end users.
This link says that ARR does per-user load balancing by default on Azure. However, the Traffic Manager, which from what I can gather is automatically enabled when you spin up multiple web instances on Azure, does per-request load balancing.
So my question is, which of these two load balancing schemes will I be using if I add a few more instances to my Web Hosting Plan? If I need to manually disable the Traffic Manager, what is the best way to do this?
Calum, you can leverage the standard SQL session state provider in Azure, or look at the Azure Redis Cache provider, as a backing store for user session state.
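The underlying idea, keeping per-user state out of the web instance so any instance can serve any user, is language-agnostic. A minimal sketch in Python with the redis package; the hostname, credentials, and key layout are illustrative:

```python
# Keep per-user session state in a shared Redis cache instead of in-process
# memory, so any web instance behind the load balancer can serve any user.
# Hostname, credentials, and key names are illustrative.
import json
import redis

cache = redis.Redis(
    host="my-cache.redis.cache.windows.net",  # hypothetical Azure Redis endpoint
    port=6380,
    password="<access-key>",
    ssl=True,
)

def save_session(user_id: str, state: dict) -> None:
    cache.setex(f"session:{user_id}", 3600, json.dumps(state))  # 1-hour TTL

def load_session(user_id: str) -> dict:
    raw = cache.get(f"session:{user_id}")
    return json.loads(raw) if raw else {}
```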
When deploying to Cloud Service Web Roles you automatically get a load balancer instance in front of your hosts. It's relatively transparent other than the configuration of endpoints. Each auto-scaled instance is added to or removed from the Cloud Service and the load balancer automatically.
As others have said, Azure Traffic Manager provides a higher level service which can direct traffic to multiple Azure Regions (data centers) and even on-premises endpoints.
A good overview of Load Balancing can be found here: http://azure.microsoft.com/blog/2014/04/08/microsoft-azure-load-balancing-services/

Windows Azure availability set

I have 2 app servers and 1 DB server running in Azure. I have created a cloud service and put the app servers in it, and I have created an availability set for the app servers and added them to it.
Now comes the issue: I cannot add the DB server to the same availability set, because it is not in the same cloud service as the app servers. So Azure may reboot the DB server at any time and cause an outage. How do I solve this?
1. Do I need to add one more DB server, replicate the DB, and add both to a new availability set?
2. If the answer to 1 is yes, should I make my application smarter about failing over when the primary DB server goes down?
3. Any better ways?
Availability sets are limited to virtual machines within a single cloud service. Even if you added your database server to the same cloud service as your app servers, this still wouldn't provide HA for your DB, since host OS updates can bring your DB server down for a short time.
The only way to have HA is to run multiple DB servers. If you run SQL Server, you can set up an Always On configuration (see here for documentation). EDIT: I missed the comment about you using MySQL. MySQL has HA configurations, but I haven't set one up before; you'd need to set this up across multiple VMs, and you should consider putting all nodes in an availability set, so that host OS updates are staggered across the VMs rather than applied all at once (and also so the VMs are spread across different fault domains).
