Threads and XPC - appstore-sandbox

I want to run multiple concurrent logical operations within an XPC service. Thing is, though, XPC services are singletons — either they’re running, or they’re not. Can I use NSThread, dispatch queues, or similar to simulate this?
The application that will be consuming this XPC service is a sandboxed user app that uses XPC services to work around the limitations inherent in sandboxed fork/exec.

If you're talking about sandboxed XPC, then yes, you can multi-thread, BUT launchd keeps a close eye on you and throttles your service heavily depending on the load at the time.
If there's too much activity, the service could be summarily terminated for a "Misuse of API" violation (which launchd likes to do to XPC services a lot).
If you're planning on running unsandboxed and working with Mach services, then you shouldn't have an issue.
What sort of service/server/helper/whatever are you looking at?
Is it designed to live in userspace, i.e. to run when a user opens the app in which it's contained? Or is it part of the daemons and other deeper system services?

Related

Questions pertaining to micro-service architecture

I have a couple of questions about microservice architecture; for example, take the following services:
orders,
account,
communication &
management
Question 1: From what I read, I understand that each service is supposed to have ownership of the data pertaining to that service, so orders would have an orders database. How important is that data ownership? Would microservices make sense if they all used one traditional database, such that all data pertaining to the services lived in one database? If so, are there any implications of structuring the services this way?
Question 2: Services should be able to communicate with one another. How is that any different from simply curling an existing API and basing the logic on that response? Is calling a service more efficient than simply curling the API?
Question 3: Is it worth it? I understand this is a massive generality, and it's fundamentally predicated on the needs of the business. But once that discussion has been had, was the rebuild worth it? And what challenges can you expect to face?
I will try to answer all the questions.
With respect to all services using the same database: if you do so, you have two main problems. First, the database becomes a bottleneck, because all requests go to the same point. Second, you will have coupled all your services, so if the database goes down or needs an upgrade, all your services are affected. (The database becomes a single point of failure.)
The communication between services can be whatever your use cases need: synchronous, asynchronous, via message passing (a message broker), and so on. The recommended way to avoid temporal coupling is to use a message broker like Kafka. That way your services don't have to know about each other, and if some of them go down, the others keep working; when the failed services come back up, they can continue processing the messages that are pending (see the sketch below). However, if your services need to respond synchronously, you can define synchronous communication between them and use a circuit breaker to behave properly when the callee service is down.
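To make that concrete, here is a minimal sketch of broker-based communication between two of the services from the question. It assumes a local Kafka broker on localhost:9092 and the kafka-python package; the topic and consumer-group names are made up for illustration:

```python
# Hypothetical sketch: the "orders" service publishes an event that the
# "communication" service consumes later, even if it was down at publish time.
import json
from kafka import KafkaProducer, KafkaConsumer

# Producer side (orders service): fire-and-forget event publishing.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",          # assumed broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("order-events", {"order_id": 42, "status": "created"})
producer.flush()

# Consumer side (communication service): picks up pending messages on restart,
# because the broker tracks the group's offset, giving you temporal decoupling.
consumer = KafkaConsumer(
    "order-events",
    bootstrap_servers="localhost:9092",
    group_id="communication-service",            # offsets tracked per group
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)
for message in consumer:
    print("notify customer about", message.value)
```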
A microservices architecture is far more complicated to make work, to monitor, and to debug than a traditional monolith. So it is only worth it if you have very demanding scalability and availability requirements, and/or if the system is so large that it requires several teams working on different parts of it and you want to avoid dependencies among them, so that each team can work at its own pace deploying its own services.

Service Fabric: Looking for ways to balance load between services or actors inside one application

We're considering using Service Fabric on-premises, fully or partially replacing our old solution built on NServiceBus, though our knowledge of SF is still a bit limited. What we like about NServiceBus is the out-of-the-box feature to declaratively throttle any service to a maximum number of threads. If we have multiple services and one of them starts hiccuping due to some external factor, we do not want the other services affected by that. That "problem" service would just use up to the maximum number of threads we allocate to it in its configuration, and its queue would start growing, but the other services would keep working fine, since computer resources are still available. In Service Fabric, if we let our application create as many "problem" actors as it wants, it will lead to uncontrollable growth of the "problem" actors, which will consume all server resources.
Any ideas on how, with SF, we can protect our resources in the situation I described? My first impression is that no queuing or actor-throttling mechanisms are implemented in Service Fabric, and everything must be done manually.
P.S. I would think the ability to balance resources between different types of actors inside one application, to make them less dependent on each other in terms of resource consumption, should not be a rare requirement. I just can't believe there is nothing offered for that in SF.
Thanks
I am not sure how you would compare NServiceBus (which is a messaging solution) with Service Fabric, which is a platform for building microservices. Service Fabric is a platform that supports many different types of workload, so it makes sense that it does not provide out-of-the-box throttling of threads and the like.
Also, what would you expect Service Fabric to do about the resource consumption of actors or services? It is up to you to decide what you want to happen and how to react. I wouldn't want SF to kill my actors or throttle service requests automatically; I would expect mechanisms to notify me when that happens, and those are available.
That said, SF does have a mechanism to react to load using metrics. See the docs:
Metrics are the resources that your services care about and which are provided by the nodes in the cluster. A metric is anything that you want to manage in order to improve or monitor the performance of your services. For example, you might watch memory consumption to know if your service is overloaded. Another use is to figure out whether the service could move elsewhere where memory is less constrained in order to get better performance.
Things like Memory, Disk, and CPU usage are examples of metrics. These metrics are physical metrics, resources that correspond to physical resources on the node that need to be managed. Metrics can also be (and commonly are) logical metrics. Logical metrics are things like “MyWorkQueueDepth” or "MessagesToProcess" or "TotalRecords". Logical metrics are application-defined and indirectly correspond to some physical resource consumption. Logical metrics are common because it can be hard to measure and report consumption of physical resources on a per-service basis. The complexity of measuring and reporting your own physical metrics is also why Service Fabric provides some default metrics.
You can define your own custom metrics and have the cluster react to them by moving services to other nodes. Or you could use the Health Reporting system to issue a health event and have your application or an outside process act on it.

Disadvantages of a standalone auxiliary application instead of embedding it in the main application

I am working on an embedded application running on the Linux kernel. I need to add an auxiliary application that will communicate with the main application over a socket between the two. Another option is to embed this auxiliary application in the main application as a new thread, but that would cost a lot of time to rearrange the code.
What are the advantages/disadvantages of using a standalone auxiliary application? What misbehavior or problems might we encounter? I would appreciate your hands-on and/or technical experience.
Thanks
Disadvantages of communication over a socket (a minimal sketch follows the lists):
Slower than shared memory.
Additional coding effort.
A third application might hijack the socket.
Advantages of communication over a socket:
Easily extended to running the two processes on separate systems.
The two applications can be programmed in totally different languages and can even use different bitness.
One application can be changed without touching the other, as long as the protocol stays the same.
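To make the trade-offs concrete, here is a minimal sketch of the two processes talking over a Unix-domain socket, using only Python's standard library; the socket path and the tiny request/reply protocol are assumptions for illustration:

```python
# Minimal Unix-domain socket IPC sketch (socket path and protocol are
# illustrative assumptions, not taken from the question).
import os
import socket

SOCK_PATH = "/tmp/main_app.sock"   # assumed rendezvous point

def main_app_server():
    """Main application: listens and answers one request per connection."""
    if os.path.exists(SOCK_PATH):
        os.unlink(SOCK_PATH)       # remove a stale socket from a prior run
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(SOCK_PATH)
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            request = conn.recv(1024)          # tiny request/reply protocol
            conn.sendall(b"ack:" + request)

def auxiliary_client():
    """Auxiliary application: connects and sends a request."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
        cli.connect(SOCK_PATH)
        cli.sendall(b"status?")
        print(cli.recv(1024))                  # b"ack:status?"
```

Note that a Unix-domain socket lives in the filesystem, so ordinary file permissions help mitigate the hijacking concern, and switching the address family to AF_INET later buys the "separate systems" advantage with minimal code changes.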

Multiple Function Apps with fewer Functions or few Function Apps with lots of Functions?

We are currently deploying a single Function App per environment / per region in Azure. These Function Apps contain lots of Functions. With the service plan set to Consumption, and therefore dynamic, we are fairly happy with this, as it reduces the operational complexity of our ARM templates.
We do wonder, though, whether it would be better to have more Function Apps per environment and spread our functions across them.
Is there any real benefit to doing this, given that we are under the impression performance will be scaled by the dynamic service plan?
Jordan,
The answer to the question would really depend on the type of workloads you're handling with your functions.
Although the scale controller will handle scaling to meet demand, functions within a Function App do share resources on each instance, and a resource-intensive function (heavy on either memory or CPU) may impact other functions in the same app.
There is also no process isolation between functions in the same Function App. They all run in the same process (except for some of the scripting languages like Python, Batch, etc.) and in the same App Domain. So if isolation is a factor (for reasons like security, dependency management, shared state, etc.), you may want to consider splitting functions into different apps.
Versioning and deployment are another factor worth considering, as the unit of deployment is a Function App (and not the individual functions).
With that said, if you're not running into resource consumption issues with your workloads, and the issues mentioned above are not a concern, then, as you pointed out, running multiple functions in a single Function App significantly simplifies management, so I wouldn't change that approach if there's no need to.
I hope this helps!
My main concern was already pointed out by Fabio: all your functions run in the same process. So if one of the functions runs into a timeout, the host will be shut down (including a restart, of course). This also affects your other functions.
I had this problem with a Service Bus trigger that called a stored procedure, which sometimes hit the timeout threshold. The restart of my Function App took about 7 minutes, and the real-time data was not really real-time anymore ;-)
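One knob worth knowing about here, sketched under the assumption of the v2 host.json schema, is the functionTimeout setting, which bounds how long any single invocation may run, so a hung function fails fast rather than hanging around until the platform limit (on the Consumption plan the default is 5 minutes and the maximum is 10):

```json
{
  "version": "2.0",
  "functionTimeout": "00:02:00"
}
```

The value here is illustrative; you would pick one just above your stored procedure's expected worst case.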

Managing multiple-processes: What are the common strategies?

While multithreading is faster in some cases, sometimes we just want to spawn multiple worker processes to do work. This has the benefits of not crashing the main app if one of the workers crashes, and the user doesn't have to worry as much about locking.
COM+'s Application Pooling seems like a good way to achieve this on Windows. The downside is that we need to write a COM+ wrapper for the worker process.
However, when I search for Application Pooling on Google, it seems like most of its usages are related to IIS. Don't other applications (such as scientific/graphics) find it useful to spawn multiple worker processes?
So there are several questions:
Why isn't COM+ more popular in areas other than IIS? If I write a non-IIS application and want to use process management on Windows, should I go with COM+, or are there better alternatives out there?
What would be the cross-platform way to do it? Are there libraries out there that give me a "process pool" (worker processes that intelligently pick up work, can be managed, etc.)?
I can't offer any answers to the COM aspect of your question, but it's worth noting there's another world (besides HPC MPI) where multi-processing, rather than the more common multi-threading approach, is apparently alive, well, and thriving: Python.
Why? Python's GIL ("global interpreter lock") cripples most attempts to multithread Python code so badly that multiprocessing is the generally recommended approach to parallelizing Python on SMP. The standard library includes process pools (see the sketch below); there are various other options too.
Python certainly ought to satisfy any multi-platform requirement!
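For reference, here is what that standard-library process pool looks like; a minimal sketch, with a made-up worker function standing in for real CPU-bound work:

```python
# Minimal standard-library process pool: each item is handled in a separate
# worker process, so a crash in one worker cannot take down the parent
# (the pool surfaces the failure as an exception instead).
from multiprocessing import Pool

def work(item):
    return item * item            # stand-in for a real, CPU-bound task

if __name__ == "__main__":        # guard required on Windows (spawn method)
    with Pool(processes=4) as pool:
        print(pool.map(work, range(10)))   # [0, 1, 4, 9, ...]
```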
You might want to investigate how the Apache web server manages process pools. From version 2.0 it runs natively on Windows, and one of the multi-processing models it supports is process pools. Part of Apache is also APR (the Apache Portable Runtime), which handles platform-specific issues.
No one can really answer why something is not popular; maybe nobody is looking for what you are looking for. After .NET came into the picture, people shifted from COM to the managed environment. Before .NET, COM, ATL, and related technologies were quite painful to implement; they would crash and were also quite difficult to debug.
That is the reason the managed environment came into existence.
However, from .NET 4 onwards, the parallel libraries give the user much more power for parallel programming, and you can also spawn and control other processes.
For a multi-platform approach, see zvrba's answer.
Yes, other applications, especially science applications, find it useful to spawn multiple processes. Since few supercomputers run Microsoft Windows, scientists generally avoid using anything that ties them to a Microsoft platform. Nothing related to COM will help scientists leverage their enormous existing code base written in Fortran.
People who choose to run IIS have generally already drunk the Microsoft Kool-Aid, so they have fewer inhibitions about tying themselves to Microsoft's proprietary platforms, which is why COM-specific terminology will get lots of hits related to IIS.
One of the open standards for doing what you want is the Message Passing Interface. Several implementations exist and some of them run on supercomputers using Fortran. Some of them run on cheaper computers using sexier languages.
See http://en.wikipedia.org/wiki/Message_Passing_Interface
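For a taste of MPI from one of those "sexier languages", here is a minimal point-to-point sketch using the mpi4py bindings (choosing mpi4py and the script name are my assumptions, not something from the question):

```python
# Minimal MPI point-to-point example using mpi4py.
# Run with: mpiexec -n 2 python mpi_hello.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()            # this process's id within the MPI job

if rank == 0:
    comm.send({"payload": 42}, dest=1, tag=0)   # rank 0 hands work to rank 1
elif rank == 1:
    data = comm.recv(source=0, tag=0)
    print("rank 1 received", data)
```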
There hasn't been a mob rushing through the doors of COM application pooling primarily because of two factors:
COM is a pain in the ass to deal with compared to just about anything else
Threading can be a headache, but it's a lot easier and more convenient to manage than inter-process communication
COM application pooling was essentially created for IIS. It has one very specific benefit over normal multithreading: the multiple processes are fully isolated from each other. This is important for data security and for app stability when dealing with third party plugins of questionable stability.
Scientific computing generally doesn't need strong data security isolation between operations, and I would venture to guess that scientific computing doesn't rely much on third party plugins of questionable stability. When doing big math operations, you're either using a sexy numerics library that had better be rock solid to be taken seriously, or you're using your own code, in which case crashes should be fixed and repeat offenders should be spanked.
Oh, and all crashes except stack overflow can be trapped and dealt with within a multithreaded app, especially if it's your own code.
In short, COM app pooling is overkill for just about anything other than IIS.
Google's web browser, Chrome, uses a multi-process architecture. It is open source, so you can check out its code and see how it manages processes.