cassandra reads across multiple read requests

cassandra reads across multiple read requests - cassandra

Does Cassandra use concurrent threads to read sstables of a column family to serve a read request for a row key or an individual worker thread does the job of look up across multiple sstables?
What would be the overhead of the one over the other [using concurrent threads and single thread]?

Cassandra implements a Staged Event-Driven Architecture (SEDA) see SEDA
In a typical application, a single unit of work is often performed within the confines of a single thread. A write operation, for example, will start and end within the same thread. Cassandra, however, is different: its concurrency model is based on SEDA, so a single operation may start with one thread, which then hands off the work to another thread, which may hand it off to other threads. But it’s not up to the current thread to hand off the work to another thread. Instead, work is subdivided into what are called stages, and the thread pool (really, a java.util.concurrent.ExecutorService) associ- ated with the stage determines execution. A stage is a basic unit of work, and a single operation may internally state-transition from one stage to the next. Because each stage can be handled by a different thread pool, Cassandra experiences a massive perform- ance improvement. Read is represented as a stage in cassandra so there are definitely multiple threads involved in the Read stage, you would have to look deeper in the source code to understand whether multiple thread in read stage are used for reading or no.

Related

Is a Task lightweight compared to a Thread?

I overheard a coworker saying that a Task is basically a lightweight thread. Coming from a C++ background (where threads where the lightest weight processing unit), this seems counter-intuitive to me.
Aren't Tasks just as heavy as Threads?

You need to distinguish between a unit of work (Tasks) from the underlying process used to host/execute them. It isn't even necessary for Tasks to run on other threads. For example, Tasks can be executed in a single threaded application that periodically yields control to the task pool.
Even when Tasks are executed on separate threads, there is usually not a 1 to 1 relationship between Task and Thread. The threads are preallocated as part of a pool, and then tasks are scheduled to run on these threads as available. Creating a new task does not require the overhead of creating a thread, it only requires the cost of an enque in a task queue.
This makes tasks inherently more scalable. I can have millions of tasks throughout the lifetime of my application, but only ever actually use some constant number of threads.

Typically a "thread" implies mandatory concurrency. Starting up a thread requires allocating a stack and internal OS data structures for it. In contrast, a "task" often refers to a piece of work for which concurrency is optional, hence a parallel framework (such as OpenMP, Cilk Plus, TBB, PPL) can use the same thread to execute many tasks, by serializing the tasks, and converting optional parallelism to real parallelism only as necessary to keep the machine busy.

You are right - everything runs on a thread under the covers.
The reason people say that a Task is more lightweight than a Thread is that Microsoft put a lot of thought into having Tasks make efficient use of Threads, and the implementation is probably much lighter weight than what the average developer would come up with on their own using the Thread class.
EDIT
A more clear explanation is that a Task object is lighter weight than a Thread object, and while each Task is eventually run on a Thread, creating N Task objects concurrently leads to less than N concurrent Thread objects being used, for large N.

should I use memory fences in a supervisor-workers model?

I am building multithreading support for my application.
In my application, it can happen that a worker should access the "work field" of another worker to complete its own job. I have tried to make this safe with pthread mutexes, but they turned out to be horribly slow, even when there is only one worker and no so contention.
So, I came up with another idea. Let a worker complete its job where it can, and then add to a (per-worker, own) queue the jobs that have the aforementioned problem: when all the workers are done, the main supervisor thread will complete the unfinished jobs, in the hope that they will be orders of magnitude fewer than the number of jobs got done by the workers.
My question is: should I throw in a memory fence, at the moment that I transfer the execution from the supervisor to the workers and vice-versa?
EDIT:
more details (the code is on github, see pool::collision_wsc()). Each thread reads pointers from various "cells" (which are basically a std::vector), and applies some operation on the objects pointed (collision between hard spheres).
The point is that a cell interacts with (some of) its neighbours, but some of these cells might be in the ownership of another worker (one sphere might be near the bounds of a cell, and collide with one of another cell).

Why are message queues used insted of mulithreading?

I have the following query which i need someone to please help me with.Im new to message queues and have recently started looking at the Kestrel message queue.
As i understand,both threads and message queues are used for concurrency in applications so what is the advantage of using message queues over multitreading ?
Please help
Thank you.

message queues allow you to communicate outside your program.
This allows you to decouple your producer from your consumer. You can spread the work to be done over several processes and machines, and you can manage/upgrade/move around those programs independently of each other.
A message queue also typically consists of one or more brokers that takes care of distributing your messages and making sure the messages are not lost in case something bad happens (e.g. your program crashes, you upgrade one of your programs etc.)
Message queues might also be used internally in a program, in which case it's often just a facility to exchange/queue data from a producer thread to a consumer thread to do async processing.

Actually, one facilitates the other. Message queue is a nice and simple multithreading pattern: when you have a control thread (usually, but not necessarily an application's main thread) and a pool of (usually looping) worker threads, message queues are the easiest way to facilitate control over the thread pool.
For example, to start processing a relatively heavy task, you submit a corresponding message into the queue. If you have more messages, than you can currently process, your queue grows, and if less, it goes vice versa. When your message queue is empty, your threads sleep (usually by staying locked under a mutex).
So, there is nothing to compare: message queues are part of multithreading and hence they're used in some more complicated cases of multithreading.

Creating threads is expensive, and every thread that is simultaneously "live" will add a certain amount of overhead, even if the thread is blocked waiting for something to happen. If program Foo has 1,000 tasks to be performed and doesn't really care in what order they get done, it might be possible to create 1,000 threads and have each thread perform one task, but such an approach would not be terribly efficient. An second alternative would be to have one thread perform all 1,000 tasks in sequence. If there were other processes in the system that could employ any CPU time that Foo didn't use, this latter approach would be efficient (and quite possibly optimal), but if there isn't enough work to keep all CPUs busy, CPUs would waste some time sitting idle. In most cases, leaving a CPU idle for a second is just as expensive as spending a second of CPU time (the main exception is when one is trying to minimize electrical energy consumption, since an idling CPU may consume far less power than a busy one).
In most cases, the best strategy is a compromise between those two approaches: have some number of threads (say 10) that start performing the first ten tasks. Each time a thread finishes a task, have it start work on another until all tasks have been completed. Using this approach, the overhead related to threading will be cut by 99%, and the only extra cost will be the queue of tasks that haven't yet been started. Since a queue entry is apt to be much cheaper than a thread (likely less than 1% of the cost, and perhaps less than 0.01%), this can represent a really huge savings.
The one major problem with using a job queue rather than threading is that if some jobs cannot complete until jobs later in the list have run, it's possible for the system to become deadlocked since the later tasks won't run until the earlier tasks have completed. If each task had been given a separate thread, that problem would not occur since the threads associated with the later tasks would eventually manage to complete and thus let the earlier ones proceed. Indeed, the more earlier tasks were blocked, the more CPU time would be available to run the later ones.

It makes more sense to contrast message queues and other concurrency primitives, such as semaphores, mutex, condition variables, etc. They can all be used in the presence of threads, though message-passing is also commonly used in non-threaded contexts, such as inter-process communication, whereas the others tend to be confined to inter-thread communication and synchronisation.
The short answer is that message-passing is easier on the brain. In detail...
Message-passing works by sending stuff from one agent to another. There is generally no need to coordinate access to the data. Once an agent receives a message it can usually assume that it has unqualified access to that data.
The "threading" style works by giving all agent open-slather access to shared data but requiring them to carefully coordinate their access via primitives. If one agent misbehaves, the process becomes corrupted and all hell breaks loose. Message passing tends to confine problems to the misbehaving agent and its cohort, and since agents are generally self-contained and often programmed in a sequential or state-machine style, they tend not to misbehave as often — or as mysteriously — as conventional threaded code.

Check number of idle cores when creating .Net 4.0 Parallel Task

My question might sound a bit naive but I'm pretty new with multi-threaded programming.
I'm writing an application which processes incoming external data. For each data that arrives a new task is created in the following way:
System.Threading.Tasks.Task.Factory.StartNew(() => methodToActivate(data));
The items of data arrive very fast (each second, half second, etc...), so many tasks are created. Handling each task might take around a minute. When testing it I saw that the number of threads is increasing all the time. How can I limit the number of tasks created, so the number of actual working threads is stable and efficient. My computer is only dual core.
Thanks!

One of your issues is that the default scheduler sees tasks that last for a minute and makes the assumption that they are blocked on another tasks that have yet to be executed. To try and unblock things it schedules more pending tasks, hence the thread growth. There are a couple of things you can do here:
Make your tasks shorter (probably not an option).
Write a scheduler that deals with this scenario and doesn't add more threads.
Use SetMaxThreads to prevent
unbounded thread pool growth.
See the section on Thread Injection here:
http://msdn.microsoft.com/en-us/library/ff963549.aspx

You should look into using the producer/consumer pattern with a BlockingCollection<T> around a ConcurrentQueue<T> where you set the BoundedCapacity to something that makes sense given the characteristics of your workload. You can make your BoundedCapacity configurable and then tweak as you run through some profiling sessions to find the sweet spot.
While it's true that the TPL will take care of queueing up the tasks you create, creating too many tasks does not come without penalties. Also, what's the point in producing more work than you can consume? You want to produce enough work that the consumers will never be starved, but you don't want to get to far ahead of yourself because that's just wasting resources and potentially stealing those very same resources from your consumers.

You can create a custom TaskScheduler for the Task Parallel library and then schedule tasks on that by passing an instance of it to the TaskFactory constructor.
Here's one example of how to do that: Task Scheduler with a maximum degree of parallelism.

What is the advantage of Executors over Threads in multithreaded application

I have seen several comments to the effect that Executors are better than Threads, but if you have a number of Threads communicating via bounded buffers (as in Flow-Based Programming) why would you use Executors when you have to use Threads anyway (with newCachedThreadPool (?)). Also, I use methods like isAlive(), interrupt() - how do I get hold of the Thread handle?
Does anyone have sample code that I can plagiarize? ;-)

Executors are basically an abstraction over Threads. They make you isolate your potentially parallel logic in Runnable/Callable instances while liberating you from the duties of manually creating and starting a thread or managing a pool. You still need to handle dependencies as part of your application logic.
If you want to interact / are comfortable with Threads for your application logic, you may skip using Executors. Regarding getting hold of the thread, you can always execute Thread.currentThread() to get hold of the current thread from any executing context.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string