I am new to using Service Fabric and am trying to scope out some design options. I have a class library which performs different tasks. Some tasks are resource intensive and long-running (processing messages from queues) and others are short-lived and must be responsive (handling job requests from users). There is a significant amount of cached data so shared processes make sense, and the application is stateless. I want to make sure that long-running tasks don't starve other tasks of resources, but also that the utilisation rate is high.
Is it possible to make one Stateless Service project in my solution (referencing my class library) and deploy multiple named StatelessService instances sharing the same process, using configuration to differentiate the tasks performed by those instances? With or without multiple ServiceTypes (although they seem to be one per project, so I assume this must be one ServiceType)?
If so, is it possible to apply different resource governance rules to those service instances so some resources can be reserved for user-driven tasks? So far I get the impression that this isn't possible when the services share a process.
The documentation describes the default shared-process model as follows:
The preceding section describes the default hosting model provided by
Service Fabric, referred to as the Shared Process model. In this
model, for a given application, only one copy of a given
ServicePackage is activated on a node (which starts all the
CodePackages contained in it). All the replicas of all services of a
given ServiceType are placed in the CodePackage that registers that
ServiceType. In other words, all the replicas of all services on a
node of a given ServiceType share the same process.
You can specify multiple service types and multiple code packages.
ServiceTypes declares what service types are supported by CodePackages
in this manifest. When a service is instantiated against one of these
service types, all code packages declared in this manifest are
activated by running their entry points. The resulting processes are
expected to register the supported service types at run time. Service
types are declared at the manifest level and not the code package
level. So when there are multiple code packages, they are all
activated whenever the system looks for any one of the declared
service types.
Resource governance is configured per code package (or per service package) in the manifests, not at the level of individual named service instances.
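For reference, a sketch of how resource governance limits are declared in the application manifest against an imported service manifest (the names and values here are illustrative, not taken from your project):

```xml
<ApplicationManifest ApplicationTypeName="MyAppType"
                     ApplicationTypeVersion="1.0.0"
                     xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="MyServicePkg" ServiceManifestVersion="1.0.0" />
    <Policies>
      <!-- Cap the whole service package... -->
      <ServicePackageResourceGovernancePolicy CpuCores="2" MemoryInMB="2048" />
      <!-- ...and/or an individual code package within it. -->
      <ResourceGovernancePolicy CodePackageRef="Code" CpuShares="512" MemoryInMB="1024" />
    </Policies>
  </ServiceManifestImport>
</ApplicationManifest>
```

Because the policy attaches to the package rather than to a named instance, two named instances of the same ServiceType sharing a process cannot be governed differently, which matches the impression you describe.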
Currently I'm investigating the possibility of using Azure Service Fabric and its Reliable Services to implement my problem-domain architecture.
Problem domain: I am currently researching distributed large-scale web-crawling architectures involving dozens of parallel agents that crawl web servers and download resources for further indexing.
I've found a useful academic paper describing an Azure-based distributed web-crawling architecture: Link to .pdf paper. I'm trying to implement a prototype based on this design.
The basic high-level design looks something like the figure below:
The idea: the Central Web Crawling System Engine (hereafter CWCE) runs in an infinite loop until the program is aborted, fetching Service Bus queue messages that each contain the URL of a page to be crawled. The CWCE component then checks the hostname of the URL and consults the Agent Registrar SQL database to see whether a live agent already exists for that hostname. If not, CWCE does one of the following:
If the number of live agents (A_alive) equals Max (an upper bound provided by the application administrator), CWCE waits until A_alive < Max.
If A_alive < Max, CWCE tries to create a new agent and assign the hostname to it (the agent is then registered in the SQL Registrar database).
Each agent runs on its own partition (a URL hostname, for example: example.com) and recursively crawls only pages of that hostname, while discovering URLs of external hostnames and adding them to the Service Bus queue for other agents to process.
The benefit of this architecture would be horizontal scaling of agents and a near-linear increase in crawling throughput.
However, I am very new to Azure Service Fabric and would therefore like to ask whether this PaaS layer is capable of solving this problem. Main questions:
Would it be possible to create new web-crawling agent instances programmatically and pass them a hostname parameter using Azure Service Fabric? (Maybe using the FabricClient class to manipulate the cluster and create service instances?)
Which ASF programming model best fits this scenario of parallel long-running agents: stateless services, stateful services, or the actor model? Each agent might run as a long-running task, since it recursively crawls the URLs of a specific hostname and listens to the queue.
Would it be possible to control and change this upper bound (Max live agents) while the application is running?
Would it be possible to have an infinite-loop stateless CWCE service that continuously listens for queue messages in order to spawn new agents?
I am not sure whether the ASF PaaS layer is the best solution for this distributed web-crawling use case, so your insights would be very valuable to me. Any helpful resource links would also be much appreciated.
Service Fabric will allow you to implement the architecture that you want.
Would it be possible to create new web-crawling agent instances programmatically and pass them a hostname parameter using Azure Service Fabric? (Maybe using the FabricClient class to manipulate the cluster and create service instances?)
Yes. The service you develop and deploy to Service Fabric will be a ServiceType. Service types don't actually run; instead, from a ServiceType you create the actual services, which are named. A single service (e.g. ServiceA) will have a number of instances, to allow scaling and availability. You can programmatically create and remove services of a given type and pass parameters to them, so every service knows which URL to crawl.
Check an example here.
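As a sketch of what that can look like in code (the application and service names, and the way the hostname is encoded, are illustrative assumptions, and this requires a reachable cluster), you could create one named agent service per hostname and pass the hostname through InitializationData:

```csharp
using System;
using System.Fabric;
using System.Fabric.Description;
using System.Text;
using System.Threading.Tasks;

public static class AgentFactory
{
    public static async Task CreateAgentAsync(string hostname)
    {
        var fabricClient = new FabricClient(); // connects to the local cluster by default

        var description = new StatelessServiceDescription
        {
            ApplicationName = new Uri("fabric:/CrawlerApp"),
            // One named service per hostname, e.g. fabric:/CrawlerApp/Agent_example.com
            ServiceName = new Uri($"fabric:/CrawlerApp/Agent_{hostname}"),
            ServiceTypeName = "CrawlerAgentType",
            InstanceCount = 1,
            PartitionSchemeDescription = new SingletonPartitionSchemeDescription(),
            // The agent reads this at startup to learn which hostname to crawl.
            InitializationData = Encoding.UTF8.GetBytes(hostname)
        };

        await fabricClient.ServiceManager.CreateServiceAsync(description);
    }
}
```

Removing an agent when it finishes its hostname is the mirror image: `FabricClient.ServiceManager.DeleteServiceAsync(new DeleteServiceDescription(serviceName))`.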
Which ASF programming model best fits this scenario of parallel long-running agents: stateless services, stateful services, or the actor model? Each agent might run as a long-running task, since it recursively crawls the URLs of a specific hostname and listens to the queue.
I would choose stateless services, because they are the most efficient in terms of resource utilization and the easiest to manage (no need to store and manage state, partitioning, or replicas). The only thing you need to consider is that every service will eventually crash and restart, so you need to store the current crawling position in a permanent store, not in memory.
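A minimal sketch of such an agent's RunAsync loop, assuming the hostname is passed via InitializationData as above (the crawl and checkpoint helpers are placeholders for your own logic, not real APIs):

```csharp
using System.Fabric;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Runtime;

internal sealed class CrawlerAgent : StatelessService
{
    public CrawlerAgent(StatelessServiceContext context) : base(context) { }

    protected override async Task RunAsync(CancellationToken cancellationToken)
    {
        // The hostname handed to this named instance at creation time.
        string hostname = Encoding.UTF8.GetString(Context.InitializationData);

        while (true)
        {
            // Service Fabric signals this token when the instance is closing.
            cancellationToken.ThrowIfCancellationRequested();

            // Crawl the next page for this hostname, then persist the crawl
            // position externally so a restarted instance can resume.
            await CrawlNextPageAsync(hostname, cancellationToken);
            await SaveCheckpointAsync(hostname, cancellationToken);
        }
    }

    // Placeholders for your crawling and durable-checkpoint logic.
    private Task CrawlNextPageAsync(string hostname, CancellationToken ct) => Task.CompletedTask;
    private Task SaveCheckpointAsync(string hostname, CancellationToken ct) => Task.CompletedTask;
}
```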
Would it be possible to control and change this upper bound (Max live agents) while the application is running?
Yes. Service Fabric services run on nodes (virtual machines), and in Azure these are managed by virtual machine scale sets. You can easily add and remove nodes from the VMSS, which will let you adjust the total compute and memory available, and the actual number of services is already controlled by you, as described in point 1.
Would it be possible to have an infinite-loop stateless CWCE service that continuously listens for queue messages in order to spawn new agents?
Absolutely. Message-driven microservices are very common. It's technically not an infinite loop, but a service with a Service Bus communication listener. I found one here as a reference, but I don't know if it's production ready.
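As one possible shape for that listener (a sketch using the Azure.Messaging.ServiceBus package; the connection string, queue name, and agent-creation step are placeholders):

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static class CwceListener
{
    public static async Task RunAsync()
    {
        await using var client = new ServiceBusClient("<connection-string>");
        ServiceBusProcessor processor = client.CreateProcessor("crawl-queue");

        processor.ProcessMessageAsync += async args =>
        {
            string url = args.Message.Body.ToString();
            string hostname = new Uri(url).Host;

            // Consult the Agent Registrar here; if no live agent exists for
            // this hostname and A_alive < Max, create one (e.g. via the
            // FabricClient CreateServiceAsync approach from point 1).
            await args.CompleteMessageAsync(args.Message);
        };
        processor.ProcessErrorAsync += args => Task.CompletedTask; // log in practice

        await processor.StartProcessingAsync();
    }
}
```

The processor keeps pumping messages until you stop it, which gives you the "continuously listens" behaviour without a hand-written infinite loop.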
This is a bit descriptive so please bear with me. :)
In the application that I'm trying to build, there are distinct functionalities of the product. Users can choose to opt in for functionality A, B, and D, but not C. The way I'm building this, each distinct functionality is a service (stateless; I'm thinking of storing the data in Azure SQL databases and exposing REST APIs from each service), and all the services bundled together form an ApplicationType. For each customer tenant (consider this a shared account for a group of users) that is created, I'm thinking of creating a new concrete instance of the registered ApplicationType using a TenantManagementService, calling client.ApplicationManager.CreateApplicationAsync() on a FabricClient instance, so that I have a dedicated application instance running on my nodes for that tenant. However, as I mentioned, a tenant can opt in only for specific functionality, which is mapped to a subset of services. If a tenant chooses only service A of my application, the service instances corresponding to features B, C, and D shouldn't be idly running on the nodes.
I thought of creating actors for each service, but the services I'm creating are stateless, and I'd like to have multiple instances of them actively running on multiple nodes for load balancing rather than having idle replicas of stateful services.
Similar to what I'm doing with application types, i.e., spawning application instances as new tenants register, can I spawn/delete services as and when a tenant opts in to or out of product features?
Here's what I've tried:
I tried setting InstanceCount to 0 for the services when packaging my application, in my ApplicationParameters XML files:
<Parameters>
  <Parameter Name="FeatureAService_InstanceCount" Value="0" />
  <Parameter Name="FeatureBService_InstanceCount" Value="0" />
</Parameters>
However, Service Fabric Explorer complains when instantiating an application from such an application type. The error is this:
But on the other hand, once a service is deployed on the fabric, it gives me the option to delete that service specifically, so this scenario should be valid.
Any suggestions are welcome!
EDIT: My requirement is similar to the approach mentioned by anderso here - https://stackoverflow.com/a/35248349/1842699. However, the problem I'm specifically trying to solve is to create an application instance with one or more packaged services having an instance count of zero!
@uplnCloud
I hope I understand everything right.
Your situation is the following:
Each customer should have separate Application (created from the same ApplicationType).
Each customer should have only subset of Services (defined in ApplicationType).
If I get it right then this is supported out of the box.
First of all, you should remove the <DefaultServices /> section from ApplicationManifest.xml. This instructs Service Fabric not to create services automatically with the application.
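For illustration, a trimmed ApplicationManifest.xml with no <DefaultServices /> section might look like this (the type and package names are placeholders, not your actual ones):

```xml
<ApplicationManifest ApplicationTypeName="MyAppType"
                     ApplicationTypeVersion="1.0.0"
                     xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="FeatureAServicePkg" ServiceManifestVersion="1.0.0" />
  </ServiceManifestImport>
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="FeatureBServicePkg" ServiceManifestVersion="1.0.0" />
  </ServiceManifestImport>
  <!-- No <DefaultServices> section: services are created on demand via FabricClient. -->
</ApplicationManifest>
```

The binaries for every feature are still packaged and registered; they simply don't run until you create a named service for them.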
Now the algorithm is the following:
Create an application using FabricClient.ApplicationManager.CreateApplicationAsync().
For each required feature, create a corresponding service using FabricClient.ServiceManager.CreateServiceAsync() (you need to specify the application name of the newly created application).
Also note that CreateServiceAsync() accepts a ServiceDescription in which you can configure all service-related parameters - from the partitioning scheme to the instance count.
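Putting the two steps together, a sketch of per-tenant provisioning might look like this (application type, version, and naming conventions are assumptions for illustration; this needs a reachable cluster):

```csharp
using System;
using System.Fabric;
using System.Fabric.Description;
using System.Threading.Tasks;

public static class TenantProvisioner
{
    public static async Task ProvisionTenantAsync(string tenant, string[] optedInFeatures)
    {
        var client = new FabricClient();

        // 1. One application instance per tenant.
        var appName = new Uri($"fabric:/Tenant_{tenant}");
        await client.ApplicationManager.CreateApplicationAsync(
            new ApplicationDescription(appName, "MyAppType", "1.0.0"));

        // 2. Create only the services the tenant opted in to.
        foreach (string feature in optedInFeatures)
        {
            await client.ServiceManager.CreateServiceAsync(new StatelessServiceDescription
            {
                ApplicationName = appName,
                ServiceName = new Uri($"{appName}/{feature}"),
                ServiceTypeName = $"{feature}ServiceType",
                InstanceCount = -1, // -1 = one instance on every node
                PartitionSchemeDescription = new SingletonPartitionSchemeDescription()
            });
        }
    }
}
```

Opting a tenant out of a feature later is a DeleteServiceAsync on the same service name, leaving the rest of the application untouched.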
Unfortunately, you can't have 0-instance services; Service Fabric takes the view that a named service always exists (is running). When you define a service (give a name to a ServiceType instance), it will have at least one instance running; otherwise, you shouldn't even have the definition of that service in your application if it does not need to be running.
But what you can have is the ServiceType definition; that means you have the binaries, but you create the service only when required.
I assume you are being limited by default services, where you declare the application and service structure upfront (before deploying any application instance). Instead, you should create services dynamically via FabricClient, as you described, or via PowerShell using New-ServiceFabricApplication and New-ServiceFabricService.
This link will guide you through doing it with FabricClient.
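The PowerShell route mentioned above might look like this sketch (the cluster endpoint, type names, and service names are placeholders):

```powershell
# Connect to the cluster first.
Connect-ServiceFabricCluster -ConnectionEndpoint "mycluster.example.com:19000"

# Create the application instance (no services yet, since DefaultServices was removed).
New-ServiceFabricApplication -ApplicationName "fabric:/Tenant1" `
    -ApplicationTypeName "MyAppType" -ApplicationTypeVersion "1.0.0"

# Create only the opted-in service.
New-ServiceFabricService -Stateless -ApplicationName "fabric:/Tenant1" `
    -ServiceName "fabric:/Tenant1/FeatureA" -ServiceTypeName "FeatureAServiceType" `
    -PartitionSchemeSingleton -InstanceCount 1
```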
I'll just add this as a new answer instead of commenting on another answer.
As others have mentioned, remove DefaultServices from your ApplicationManifest. That way, every new instance of the ApplicationType you create will come online without services, and you'll have to create those manually depending on what functionality your customer has selected.
Also, going with the services-per-customer approach, make sure you have enough nodes to handle the load as customers come online. You'll end up with a lot of processes (since application instances run their own processes for their services), and if you have few nodes hosting a lot of these, reboots of your cluster nodes can take a while to stabilise, since the cluster can potentially have many services to relocate. Although running stateless services will alleviate a good part of this.
What is the reasoning behind Applications concept in Service Fabric? What is the recommended relation between Applications and Services? In which scenarios do Applications prove useful?
Here is a nice summary of how logical services differ from physical services: https://learn.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/logical-versus-physical-architecture
Now, in relation to Service Fabric: Service Fabric applications represent logical services, while Service Fabric services represent physical services. To simplify, a Service Fabric application is a deployment unit, so you would put in it multiple services that rely on the same persistent storage or have other inter-dependencies such that you really need to deploy them together. If you have totally independent services, you would put them into different Service Fabric applications.
An application is a collection of constituent services that perform a certain function or functions. A service performs a complete and standalone function and can start and run independently of other services. A service is composed of code, configuration, and data. For each service, code consists of the executable binaries, configuration consists of service settings that can be loaded at run time, and data consists of arbitrary static data to be consumed by the service. Each component in this hierarchical application model can be versioned and upgraded independently.
It is described here in detail
How I currently see it: applications are a nice concept for grouping multiple services together and managing them as a single unit. In the context of Service Fabric, this is useful if you have multiple nano-services that do not warrant being completely standalone; instead, you can package them together into microservices (SF applications).
Disclaimers:
- a nano-service would be a REALLY small piece of code running, for example, as a stateless SF service (e.g. read from a queue, a couple of lines of code to process, write to another queue).
- in the case of "normal" microservices, one could consider packaging them as 1 SF application = 1 SF service.
An application is a required top level container for services. You deploy applications, not services. So you cannot really speak about differences between the two since you cannot have services without an application.
From https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-application-model:
An application is a collection of constituent services that perform a certain function or functions. A service performs a complete and standalone function (it can start and run independently of other services) and is composed of code, configuration, and data. For each service, code consists of the executable binaries, configuration consists of service settings that can be loaded at run time, and data consists of arbitrary static data to be consumed by the service. Each component in this hierarchical application model can be versioned and upgraded independently.
Take a look at the link provided and you will see the hierarchical relationship.
I need help thinking about how to design our application to fit the new Azure Service Fabric model.
Today we have an application built on Azure Cloud Services. The application is built around DDD, and we have separate bounded contexts for different subsystems of the application. The bounded contexts are today hosted in one worker role that exposes these subsystems through a single Web API.
Additionally we have one Web Role hosting the web frontend and one Worker Role processing a background queue.
We are striving to move to a microservices architecture. The first thing I planned was to extract each bounded context into its own API host. This will result in 5-10 new Web API services supporting our subsystems.
My question: should each of these subsystem/bounded-context/API hosts be its own Service Fabric application, or a service within a single Service Fabric application?
I've read the documentation, found here: Service Fabric Application Model, over and over, and I can't figure out where my services fit in.
We want the system to support different versions of the services, and the services should also scale independently of one another. There might even be a requirement to run one microservice on a larger VM size than the rest.
Can someone please guide me on which approach suits my needs?
I think you have the right idea, in general terms, that each bounded context is a (micro) service. Service Fabric gives you two levels of organization with applications and services, where an application is a logical grouping of services. Here's what that means for you:
Logically speaking, think of an application as a cohesive set of functionality. The services that collectively form that cohesive set of functionality should be grouped as an application. You can ask yourself, for each service: "does it make sense to deploy this service by itself without these other services?" If the answer is no, then they should probably be grouped in the same application.
Developmentally speaking, the Visual Studio tooling is geared a bit more toward multiple services in one application, but you can have multiple applications in one solution too.
Operationally speaking, an application represents a process boundary, upgrade group, and versioning group:
Each instance of an application you create gets its own process (or set of processes if you have multiple service types in the application). Service instances of a service type share host processes. Service instances of different service types get their own process per type.
The application is the top level upgrade unit, that is, every upgrade you do is an application upgrade. You can upgrade individual services within an application (you don't always have to upgrade every service within an application), but each time you do an upgrade, the application version changes.
You can create side-by-side instances of different versions of the same application type in your cluster. You cannot create side-by-side instances of different versions of the same service type within an application instance.
Placement and scale are done at the service level. So, for example, you can scale one service in an application, and you can place another service on a larger VM.
I am relatively new to cloud computing and Azure. I was wondering whether you can have more than one web and worker role in an Azure application. If so, what advantages do I get from using multiple roles, and where do they apply?
Yes, you can have more than one web or worker role in an Azure cloud service. I believe you can have up to 25 different roles per deployment, in any mix of web and worker roles. See the Azure Subscription and Service Limits, Quotas and Constraints link for more information.
The advantage of having the roles within the same cloud service is simply that within that cloud service they can all see the other roles and instances easily (unless you configure them otherwise). They will all be relatively close to each other within a data center, because a cloud service is assigned to a stamp of machines and controlled by a fabric controller assigned to that stamp. You can watch this video by Mark Russinovich, which sheds more light on the inner workings of Azure and talks a bit about stamps. A cloud service is a security boundary as well, so you get some benefits from that encapsulation if you need to do a lot of inter-machine communication that ISN'T going across a queue for some reason.
The disadvantage of batching a whole bunch of roles together is that they are tied pretty closely together at that point. You can certainly scale them separately, and you can do updates that target only a single role at a time. However, if you want to deploy changes to multiple roles, you may end up doing a full deployment to all roles (even those that haven't changed), or doing updates to single roles one at a time until all the ones you need are updated, which can take some time. Of course, it could be argued that having them in separate cloud services would still have you doing updates concurrently, depending on your architecture and/or dependencies.
My suggestion is to group only roles that REALLY belong together in the same solution - roles whose workloads are interrelated. Even then, there's nothing stopping you from separating these into their own deployments (though you may benefit from the security boundary of being within the same cloud service). Think about how each role will be updated, and whether they would generally be updated together or not. There are many factors in deciding how to package roles together.