Can anybody help me figure out the correct thread pool size according to the processor and RAM?
Can we fix the limit of worker threads for better performance?
There is no general answer to this; it all depends on the workload. You will want to attach a profiler to see how busy your worker threads actually are. The most important thing is to make sure you have no blocking code in them. If you do, you need an ExecutionHandler.
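If you just want a starting point before profiling, a common rule of thumb is to derive the thread count from the number of available cores and an estimate of how much time the tasks spend waiting versus computing. This is only a rough sketch; the blocking coefficient below is an assumption you have to refine by measurement:

// Rough starting point only: threads ≈ cores * (1 + waitTime/computeTime).
int cores = Runtime.getRuntime().availableProcessors();
double blockingCoefficient = 0.5;                              // assumed ratio of wait time to compute time
int workerThreads = (int) (cores * (1 + blockingCoefficient)); // refine this with a profiler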
You can specify the number of I/O worker threads as a constructor parameter. Do NOT use a fixed thread pool executor. Use an unbounded cached thread pool.
Try it like this:
ChannelFactory factory = new NioServerSocketChannelFactory(
        Executors.newCachedThreadPool(),                            // boss executor
        new OrderedMemoryAwareThreadPoolExecutor(workerMax, 0, 0)); // worker executor: workerMax threads, the two zeros disable the per-channel and total memory limits
Check out the documentation for more details.
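For the blocking-code case mentioned in the first answer, a minimal sketch of wiring an ExecutionHandler into a Netty 3 pipeline could look like this (the pool size and memory limits here are illustrative, not recommendations):

import org.jboss.netty.channel.ChannelHandlerContext;
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.channel.MessageEvent;
import org.jboss.netty.channel.SimpleChannelUpstreamHandler;
import org.jboss.netty.handler.execution.ExecutionHandler;
import org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor;

// One shared ExecutionHandler for all channels: handlers added after it run in this
// pool instead of blocking the I/O worker threads. 16 threads / 1 MiB limits are examples only.
final ExecutionHandler executionHandler =
        new ExecutionHandler(new OrderedMemoryAwareThreadPoolExecutor(16, 1048576, 1048576));

ChannelPipelineFactory pipelineFactory = new ChannelPipelineFactory() {
    public ChannelPipeline getPipeline() throws Exception {
        ChannelPipeline pipeline = Channels.pipeline();
        pipeline.addLast("executor", executionHandler);
        pipeline.addLast("handler", new SimpleChannelUpstreamHandler() {
            @Override
            public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) {
                // blocking business logic can safely run here
            }
        });
        return pipeline;
    }
};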
I have a Spring Boot application in which, every time an API call is made, I am creating an ExecutorService with a fixed thread pool of 5 threads and passing around 500 tasks to CompletableFuture to run asynchronously. I am using this for a migration of lakhs of records.
As I started the migration, initially the API was working fine and each API call (basically code logic + thread pool creation + job assignment to threads) was taking just around 200 ms. But as API calls increased and new thread pools kept being created, I could see a gradual increase in the time taken to create the thread pool and assign the jobs; as a result, the API response time went up to 4 seconds.
Note: After the jobs are done, I am shutting down the executor service in a finally block.
Questions:
Can creating multiple pools add overhead to the application, and do those pools keep piling up?
Won't there be any automatic garbage collection for this?
Will there be any limit to how many pools get created?
And what could be causing this time delay?
I can add further clarifications based on specific queries.
Can creating multiple pools add overhead to the application, and do those pools keep piling up?
Yes, absolutely. Unless you shut down the thread pools, they won't be destroyed automatically and they keep consuming resources. See the next question for more details.
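A minimal sketch of the usual fix, assuming a single shared pool is acceptable for your workload: create the executor once, for example as a Spring bean, and reuse it across API calls instead of building a new pool per request. The class name, bean name and pool size below are illustrative, not taken from your code:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Illustrative configuration; the class and bean names are placeholders.
@Configuration
class ExecutorConfig {

    // One shared pool for the whole application; Spring calls shutdown() when the context closes.
    @Bean(destroyMethod = "shutdown")
    ExecutorService migrationExecutor() {
        return Executors.newFixedThreadPool(5); // pool size is illustrative; tune it for your workload
    }
}

Each API call then only submits its tasks, e.g. CompletableFuture.supplyAsync(task, migrationExecutor), and there is nothing to create or shut down per request.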
Won't there be any automatic garbage collection for this?
You need to take care that the thread pools are shut down once they are no longer needed. For example, the javadoc of ThreadPoolExecutor provides some hints:
A pool that is no longer referenced in a program AND has no remaining threads will be shutdown automatically. If you would like to ensure that unreferenced pools are reclaimed even if users forget to call shutdown(), then you must arrange that unused threads eventually die, by setting appropriate keep-alive times, using a lower bound of zero core threads and/or setting allowCoreThreadTimeOut(boolean).
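As a small illustration of that advice, a pool whose threads eventually die on their own, even if shutdown() is never called, could be configured like this (the pool size and timeout are arbitrary examples):

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// With allowCoreThreadTimeOut(true) even the core threads terminate once they have been
// idle for the keep-alive time, so an unreferenced pool can eventually be reclaimed.
ThreadPoolExecutor pool =
        new ThreadPoolExecutor(5, 5, 30L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
pool.allowCoreThreadTimeOut(true);

Explicitly shutting the pool down in a finally block, as you already do, remains the more deterministic option.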
Will there be any limit to how many pools get created ?
There is no hard limit on how many threads are supported by Java; however, there may be restrictions depending on your operating system and available resources such as memory. This is quite a complex question; more details can be found in the answers to this question: How many threads can a Java VM support?
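If you want a rough empirical number for a particular machine, a crude probe (run it only on a disposable test box, since it deliberately exhausts resources) is to create sleeping threads until creation fails:

public class ThreadLimitProbe {
    public static void main(String[] args) {
        int count = 0;
        try {
            while (true) {
                // Each thread just sleeps, so only the per-thread overhead (stack etc.) is measured.
                new Thread(() -> {
                    try { Thread.sleep(Long.MAX_VALUE); } catch (InterruptedException ignored) { }
                }).start();
                count++;
            }
        } catch (Throwable t) {
            System.out.println("Created " + count + " threads before failing with: " + t);
        }
    }
}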
And what could be causing this time delay?
I assume that you don't have a proper cleanup / shutdown mechanism in place for the thread pools. Every thread allocates memory for its stack (by default around 1 MB), so the more threads you create, the more memory your application consumes. Depending on the system / JVM configuration, the application may start using swap, which dramatically slows down performance.
There may be other things that cause a drop in performance, so this is just what came to my mind right now.
Profilers will help you to identify performance issues or resource leaks. This article by Baeldung shows a few profilers you could use.
I have been reading about multi-processing in NodeJS to understand it better and to get good performance from my code in heavy-load environments.
Although I understand the basic purpose and concept of the different ways to take advantage of the resources to handle the load, some questions arise as I go deeper, and it seems I can't find the particular answers in the documentation.
NodeJS in a single thread:
NodeJS runs a single thread that we call the event loop, while in the background the OS and libuv handle the default worker pool for asynchronous I/O tasks.
We are supposed to use a single core for the event loop, while the workers might be using different cores. I guess they are assigned in the end by the OS scheduler.
NodeJS as multi-threaded:
When using "worker_threads" library, in the same single process, different instances of v8/Libuv are running for each thread. Thus, they share the same context and communicate among threads with "message port" and the rest of the API.
Each worker thread runs its Event loop thread. Threads are supposed to be wisely balanced among CPU cores, improving the performance. I guess they are sorted in the end by OS scheduler.
Question 1: When a worker uses I/O default worker pool, are the very same
threads as other workers' pool being shared somehow? or each worker has its
own default worker pool?
NodeJS in multi-processing:
When using "cluster" library, we are splitting the work among different processes. Each process is set on a different core to balance the load... well, the main event loop is what in the end is set in a different core, so it doesn't share core with another heavy event loop. Sounds smart to do it that way.
Here I would communicate with some IPC tactic.
Question 2: And the default worker pool for this NodeJS process? where
are they? balanced among the rest of cores as expected in the first
case? Then they might be on the same cores as the other worker pools
of the cluster I guess. Shouldn't it be better to say that we are balancing main threads (event loops) rather than "the process"?
All this being said, the main question:
Question 3: Is it better to use clustering or worker_threads? If both are used in the same code, how can both libraries agree on the best performance? Or do they simply get in conflict? Or is it the OS that takes control in the end?
Each worker thread has its own main loop (libuv etc). So does each cloned Node.js process when you use clustering.
Clustering is a way to load-balance incoming requests to your Node.js server over several copies of that server.
Worker threads are a way for a single Node.js process to offload long-running functions to a separate thread, to avoid blocking its own main loop.
Which is better? It depends on the problem you're solving. Worker threads are for long-running functions. Clustering makes a server able to handle more requests, by handling them in parallel. You can use both if you need to: have each Node.js cluster process use a worker thread for long-running functions.
As a first approximation for your decision-making: only use worker threads when you know you have long-running functions.
The node processes (whether from clustering or worker threads) don't get tied to specific cores (or Intel processor threads) on the host machine; the host's OS scheduler assigns cores as needed and tries to minimize context-switch overhead when assigning cores to runnable processes. If you have too many active JavaScript instances (cluster instances + worker threads), the host OS will give them timeslices according to its scheduling algorithms. Other than avoiding too many JavaScript instances, there's very little point in trying to second-guess the OS scheduler.
Edit: Each Node.js instance, with any worker threads, uses a single libuv thread pool. A main Node.js process shares a single libuv thread pool with all its worker threads. If your Node.js program uses many worker threads, you may, or may not, need to set the UV_THREADPOOL_SIZE environment variable to a value greater than the default 4.
Node.js's cluster functionality uses the underlying OS's fork/exec scheme to create a new OS process for each cluster instance. So, each cluster instance has its own libuv pool.
If you're running stuff at scale, let's say with more than ten host machines running your Node.js server, then you can spend time optimizing JavaScript instances.
Don't forget nginx if you use it as a reverse proxy to handle your https work. It needs some processor time too, but it uses fine-grained multithreading, so you won't have to worry about it unless you have huge traffic.
Until I started using the Thread Pool in MariaDB, my my.cnf file had the settings below to keep the SQL server stable.
innodb_additional_mem_pool_size
innodb_buffer_pool_size
innodb_commit_concurrency
innodb_write_io_threads
innodb_read_io_threads
innodb_thread_concurrency
innodb_sort_buffer_size
After I learned that MariaDB supports the Thread Pool feature for free, I added the
thread_handling=pool-of-threads
line to my.cnf. I restarted the SQL server and everything seems cool. Now I was wondering:
Do the Innodb settings above still count?
Does the Thread Pool in MariaDB manage the number of threads, memory allocations, concurrency, etc.?
Finally, after starting to use the Thread Pool in MariaDB, should I still keep these Innodb settings or should I delete them?
Thank you
Do the Innodb settings above still count?
Yes, they still count. The thread pool does not know about Innodb specifics. It does not even know whether you run a "SELECT 1" or a DML statement on an Innodb table; everything is an opaque query. The thread pool does know, though, whether a thread is running or waiting, and tries to keep the count of running threads at the number of CPUs (or thread_pool_size, if you're on Unix). The only setting that does not count with the thread pool is thread_cache_size.
Does the Thread Pool in MariaDB manage the number of threads, memory allocations, concurrency, etc.?
Number of threads, concurrency. Not the memory allocation.
Finally, after starting to use the Thread Pool in MariaDB, should I still keep these Innodb settings or should I delete them?
Don't delete them all. The thread pool does not manage buffer sizes, so those settings are pretty important to keep. However, you can experiment with keeping or removing innodb_thread_concurrency and innodb_commit_concurrency, as those might be made obsolete by the thread pool (they can still be useful on heavy write workloads, though).
The settings...
innodb_additional_mem_pool_size -- deprecated in MySQL and removed in 5.7; possibly still in use in MariaDB; not a very important setting.
innodb_buffer_pool_size -- ABSOLUTELY. This is still the most important setting; keep it!
The rest -- keep.
I am aware that Windows Phone devices are diverse in hardware, especially CPU resources.
But, from practical experience, does anybody know the number of threads that can be run simultaneously without causing device performance issues and excessive battery consumption?
By threads, I mean the ones created with the Thread class and started using the .Start() method.
The maximum number of threads safely possible on Windows Phone depends on the amount of resources consumed and the work done by the threads. The more resources the threads use and the more work they do, the higher the load on the CPU.
Try using the ThreadPool for better performance.
Hope it satisfies your query :)
Can we re-purpose the completion port threads (for async I/O operations) as worker threads in the CLR process ThreadPool?
If this is naïve, can someone suggest how to maximize the use of thread pool threads in order to reduce the number of work items stacked in the worker queue?
The IOCP threads are already sorta 'workers' - they take input from a queue and act on the received items. If you wish to avoid using another thread pool for processing items other than 'normal' IOCP completion objects received from the network drivers, there is nothing stopping you from 'manually' queueing up objects to the IOCP queue that ask the IOCP pool threads to perform other actions. I forget the actual APIs now, but AFAIK there should be no problem.
I remember using such a mechanism for server tuning - reducing the number of IOCP threads by queueing an item that instructed the receiving IOCP pool thread to terminate.
That said, I'm not sure that such a mechanism will improve throughput significantly - the work has to be done somewhere, and it may be that avoiding an extra thread pool would not help much. Empirically, as a general-purpose inter-thread comms mechanism, an IOCP queue has worse performance than Windows message queues (useless for thread pools anyway, since only one thread can wait) and user-space CS/semaphore-based producer-consumer queues.
Rgds,
Martin