I would like to have multiple instances of my MLflow model running in parallel, but hidden behind a single common endpoint/port so the parallelism is not visible to the user.
You have a few options:
Increase the number of workers: pass the -w flag to mlflow serve with the desired number of workers (see the sketch after this list)
Use a cloud platform (link)
Serve your model on Kubernetes with Kubeflow
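For the first option, a minimal sketch (assuming a recent MLflow where the serve subcommand is mlflow models serve; the model URI, port, and worker count are all placeholders):

```sh
# Serve the model behind a single endpoint/port; -w starts multiple
# worker processes behind that same port, invisible to the caller.
mlflow models serve -m "models:/my-model/1" -p 5000 -w 4
```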
I am deploying some NodeJS code into Kubernetes. It used to be that you needed to run either PM2 or the NodeJS cluster module in order to take full advantage of multi-core hardware.
Now that we have Kubernetes, it is unclear whether one must use one or the other to get the full benefit of multiple cores.
Should a person specify the number of CPU units in their pod YAML configuration?
Or is there simply no need to account for multiple cores with NodeJS in Kubernetes?
You'll achieve utilization of multiple cores either way. The difference is that with the node.js cluster module approach you'd have to request more resources from Kubernetes (i.e., multiple cores), which might be harder for Kubernetes to schedule than several containers requesting one core (or less) each, which it can in turn schedule on multiple nodes rather than looking for a single node with enough available cores. A sketch of the multi-container approach follows.
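For illustration, a minimal Deployment sketch (all names and values are placeholders) that requests one core per replica and lets the scheduler spread the replicas across nodes:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-app
spec:
  replicas: 4                 # one single-threaded node.js process per pod
  selector:
    matchLabels:
      app: node-app
  template:
    metadata:
      labels:
        app: node-app
    spec:
      containers:
      - name: node-app
        image: registry.example.com/node-app:latest   # placeholder image
        resources:
          requests:
            cpu: "1"          # small per-pod request; easy to schedule
```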
After reading the discussions about the differences between Mesos and Kubernetes (and kubernetes-vs.-mesos-vs.-swarm), I am still confused about how to create a Spark and TensorFlow cluster with Docker containers on a few bare-metal hosts plus an AWS-like private cloud (OpenNebula).
Currently, I am able to build a static TensorFlow cluster with Docker containers manually distributed to different hosts, and I only run a standalone Spark on a bare-metal host. The way to manually set up a Mesos cluster for containers can be found here.
Since my resources are limited, I would like to find a way to deploy Docker containers onto the current mixed infrastructure to build either a TensorFlow or a Spark cluster, so that I can do data analysis with either framework on the same resources.
Is it possible to quickly create/run/undeploy a Spark or TensorFlow cluster with Docker containers on a mixed infrastructure with Mesos or Kubernetes? How can I do that?
Any comments and hints are welcome.
Given you have limited resources, I suggest you have a look at the Spark Helm chart (see the install sketch after this list), which gives you:
1 x Spark Master with port 8080 exposed on an external LoadBalancer
3 x Spark Workers with HorizontalPodAutoscaler to scale to max 10 pods when CPU hits 50% of 100m
1 x Zeppelin with port 8080 exposed on an external LoadBalancer
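Installing it is a one-liner (a sketch in Helm 2 syntax; stable/spark is the historical chart name and may have moved, so check the current chart repository first):

```sh
helm repo update
helm install --name my-spark stable/spark   # release name is arbitrary
```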
If this configuration doesn't work, you can build your own Docker images and deploy those; take a look at this blog series. There is work underway to make Spark more Kubernetes-friendly. This issue also gives some insight.
I haven't looked into TensorFlow; I suggest you look at this blog.
I have a Spark Streaming application which trains a model and periodically stores it to HDFS. In an HTTP-based web service, I would like to POST some values and retrieve a prediction for them. The service should also reload the model on demand (e.g. via a GET request).
I implemented a web server with Spark and Spray, and it works as a proof of concept. But I'm not sure it is a good design. How do I expose the web server to external services if it runs on a cluster? How can I define on which node the service will be available? I'm not even sure it is the right idea to use prediction models this way. Maybe the best practice is to embed Spark in a standalone application and access the model on the shared filesystem (e.g. HDFS), but then I lose cluster support, don't I?
Summary: What is the best-practice design for building a prediction web service with Apache Spark?
I'm working on a project with Node.js that involves a server. Due to the large number of jobs, I need to perform clustering to divide the jobs between different servers (different physical machines). Note that my jobs have nothing to do with the internet, so I cannot use stateless connections (or Redis to keep state) with a load balancer in front of the servers to distribute connections.
I already read about the "cluster" module, but, from what I understood, it only scales across multiple processors on the same machine.
My question: is there any suitable distributed module available in Node.js for my work? What about Apache Mesos? I have heard that Mesos can abstract multiple physical machines into a single server; is that correct? If yes, is it possible to use the node.js cluster module on top of Mesos, since we would then have only one virtual server?
Thanks
My question: is there any suitable distributed module available in Node.js for my work?
Don't know.
I have heard that Mesos can abstract multiple physical machines into a single server; is that correct?
Yes, almost. It allows you to pool resources (CPU, RAM, disk) across multiple machines, gives you the ability to allocate those resources to your applications, and lets you run and manage said applications. So you can ask Mesos to run X instances of node.js and specify how much resource each instance needs (see the sketch after the links below).
http://mesos.apache.org
https://www.cs.berkeley.edu/~alig/papers/mesos.pdf
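As an illustration, a sketch of a Marathon app definition submitted over its REST API (hostname, id, and resource numbers are all placeholders): it asks Mesos, via Marathon, to run four instances of a node.js server, each with half a CPU and 256 MB of memory.

```sh
# POST an app definition to Marathon, which schedules it on Mesos.
curl -X POST http://marathon.example.com:8080/v2/apps \
  -H "Content-Type: application/json" \
  -d '{
        "id": "node-app",
        "cmd": "node server.js",
        "cpus": 0.5,
        "mem": 256,
        "instances": 4
      }'
```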
If yes, is it possible to use the node.js cluster module on top of Mesos, since we would then have only one virtual server?
Admittedly, I don't know much about node.js or clustering in node.js. Going by http://nodejs.org/api/cluster.html, it just forks off a bunch of child workers and then round-robins connections between them (a sketch of this appears at the end of this answer). You have two options off the top of my head:
Run node.js on Mesos using an existing framework such as Marathon. This will be the fastest way to get something going on Mesos. https://github.com/mesosphere/marathon
Create a Mesos framework for node.js, which essentially does what the node.js cluster module does, but across machines. http://mesos.apache.org/documentation/latest/app-framework-development-guide/
With both of these solutions, you can either let Mesos create as many instances of node.js as you need, or use Mesos to run the node.js cluster module on each machine and let it manage all the workers on that machine.
I didn't google, but there might already be a node.js Mesos framework out there!
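For reference, this is roughly what the cluster module does on one machine, based on the documented API (a Mesos framework would do the equivalent across machines):

```js
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork one worker per core; the master round-robins connections.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died, restarting`);
    cluster.fork(); // keep the worker count constant
  });
} else {
  // Workers share the same port; the master hands them connections.
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(8000);
}
```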
I have the following production setup for my Node.js application:
I am now going to integrate Elasticsearch into this setup. My question concerns best practices for deploying Elasticsearch in a production environment. All my instances are virtual machines, and I understand that Elasticsearch uses a lot of memory.
Should I therefore set up Elasticsearch on its own server (server 3), set it up on both server 1 and server 2 as a cluster (much like the MongoDB replica set), or install it as a separate instance on each server?
What would be the benefits of the chosen method?
Many thanks!
Option 2.
Briefly: I would definitely set this up on both servers, giving you two nodes. Given the options you have stated, this will provide the best distribution, load balancing, performance, and fault tolerance.
Ensure that you configure your memory allocation carefully, assigning 50% of each node's memory to the Elasticsearch heap and leaving the rest to Lucene (i.e., the OS file system cache it relies on for indexing and search), as in the sketch below.
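A minimal sketch of that allocation, assuming 8 GB nodes (the exact mechanism depends on your Elasticsearch version):

```sh
# Give ~50% of RAM to the JVM heap; the rest stays with the OS
# file system cache that Lucene depends on.
export ES_HEAP_SIZE=4g                 # 1.x/2.x style
export ES_JAVA_OPTS="-Xms4g -Xmx4g"    # 5.x+ style (or config/jvm.options)
```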