A Simple explanation for Process vs Threads [closed] - multithreading

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 9 years ago.
Can anybody distinguish between between process and a thread. I have read a lot on the web but most of them are confusing enough.

Forgive my analogy but think of a process as the body and thread as the mind or soul.
Process is the abstraction that represents resources related to your application such as memory (virtual, physical, etc...), security and other kernel related properties.
Threads are where actual execution happens. It is the life of a process. Matter of fact, when the operating system is asked to run a process, it starts by allocating all the process required resource, but the process is not getting into action or life yet till the OS creates and starts the execution of the process first and main thread.
After the process last thread stop execution, the process start the actual death, (and cleanup).
I intentionally tried to make things less dry. I hope I succeeded :).

A process is something that is executed by a processor (CPU), and has its own resources, e.g. an own address space. Therefore it is isolated from other processed by the operating system, as far as possible.
In contrast, a thread is a "light weighted" process that shares its resources, particularly its address space, with other threads. These threads can easily communicate with each other using the common address space. But since they are not isolated from each other, they can influence each other in possibly tricky ways.
ADDED:
"light weight" means that it requires much less "administrative" work from the operating system than a process. A process switch, e.g., requires to switch the address space using the memory management unit, which takes considerable time. To switch from one thread to another one in the same addressing space is thus MUCH faster. For the same reason, a process communication requires more work from the OS than a communication between threads in the same addressing space. So, light weight simply means threads require less work from the operating system.

I'll add my own selection of clarifying statements :-)
A process is a program (e.g. a file ending in .exe) that the OS runs. The OS keeps processes apart so that a process runs in parallel with, independent from and unaffected by the behaviour of all other processes (unless it chooses otherwise). OSes are not quite perfect so that isn't always true, but that's the idea.
A process contains at least one thread of execution, i.e. a sequence of instructions that are executed one after another.
A process can contain more than one thread if the initial thread starts more of them. Each thread runs its sequence of instructions in parallel with the other threads (or at least that's what the OS endeavours to achieve).
It is normally the case that threads in a process interact by accessing shared resources (memory, network connections, etc.)
You use objects such as semaphores and mutexes to ensure that when a thread accesses a shared resource it has exclusive use of that resource. Other threads contending for the resource are halted by the OS until it becomes available again, at which point they're resumed. This is called a context switch.
A process can access named shared resources and use named semaphores / mutexes to interact with another process in a similar way. The difference is that the OS has to context switch a whole process instead of a whole thread. On most OSes this is takes a lot longer, and so is generally avoided for reasons of efficiency.

Related

Is a coroutine a kind of thread that is managed by the user-program itself (rather than managed by the kernel)?

In my opinion,
Kernel is an alias for a running program whose program text is in the kernel area and can access all memory spaces;
Process is an alias for a running program whose program has an independent memory space in the user memory area. Which process can get the use of the CPU is completely managed by the kernel;
Thread is an alias for a running program whose program-text is in the memory space of a process and completely shares the memory space with another thread of the same process. Which thread can get the use of the CPU is completely managed by the kernel;
Coroutine is an alias for a running program whose program-text is in the memory space of a process.And it is a user thread that the process decides itself (not the kernel) how to use, and the kernel is only responsible for allocating CPU resources to the process.
Since the process itself has no right to schedule like the kernel, the coroutine can only be concurrent but not parallel.
Am I correct in saying Above?
process is an alias for a running program...
The modern way to think of a process is to think of it as a container for a collection of threads and the resources that those threads need to execute.
Every process (except for "zombie" processes that can exist in some systems) must have at least one thread. It also has a virtual address space, open file handles, and maybe sockets and other resources that are shared by the threads.
Thread is an alias for a running program...
The problem with saying that is, "running program" sounds too much like "process," and a thread is most definitely not a process. (E.g., a thread can only exist in a process.)
A computer scientist might tell you that a thread is one particular execution of the application's code. I like to think of a thread as an independent agent who executes the code.
coroutine...is a user thread...
I'm going to mostly leave that one alone. "Coroutine" seems to mean something different from the highly formalized, and not particularly useful coroutines that I learned about more than forty years ago. What people call "coroutines" today seem to have somewhat in common with what I call "green threads," but there are details of how and when and why they are used that I don't yet understand.
Green threads (a.k.a., "user mode threads") simply are threads that the kernel doesn't know about. They are pretty much just like the threads that the kernel does know about except, the kernel scheduler never preempts them because, Duh! it doesn't know about them. Context switches between green threads can only happen at specific points where the application allows it (e.g., by calling a yield() function or, by calling some library function that is a documented yield point.)
kernel is an alias for a running program...
The kernel also is most definitely not a process.
I don't know every detail about every operating system, but the bulk of kernel code does not run independently of the applications that the kernel serves. It only runs when an application thread enters kernel mode by making a system call. The thread that runs the kernel code still belongs to the application process, but the code that determines the thread's behavior at that point is written or chosen by the kernel developers, not by the application developers.

Process is a thread or thread is a process?

I was asked this interview question. I replied that thread is the process after thinking that process is a superset of thread but interviewer didn't agree with it. It is confusing and I'm not able to find any clear answer to this.
A process is an executing instance of an application.
A thread is a path of execution within a process.
Also, a process can contain multiple threads.
1.
It’s important to note that a thread can do anything a process can do.
But since a process can consist of multiple threads, a thread could be
considered a ‘lightweight’ process. Thus, the essential difference
between a thread and a process is the work that each one is used to
accomplish. Threads are used for small tasks, whereas processes are
used for more ‘heavyweight’ tasks – basically the execution of
applications.
2.
Another difference between a thread and a process is that threads
within the same process share the same address space, whereas
different processes do not. This allows threads to read from and write
to the same data structures and variables, and also facilitates
communication between threads. Communication between processes – also
known as IPC, or inter-process communication – is quite difficult and
resource-intensive.
I feel like this is a terrible question.
Both are independent blocks of execution
Both are scheduled by the operating system
Threads run within the context of a process, share memory with the process.
I can't think of a time where a thread would have it's own address space
By that logic I would agree with your answer that a thread is a process. I think its kind of a loaded question. I would have asked you to explain the differences between the two.
For more information here's a good thread to view on the subject.
Every process is a thread, but not every thread is a process.
A thread is just an independet sequence of operations. A process has an additional context.
The nature of a thread is highly system dependent. For example, some systems implement threads as part of the operating system. Other system implement threads through a run-time library. The process itself manages its own threads (not the OS) and the management may be different for different processes (e.g., Java threading implemented differently from Ada threading).
In OS-scheduled threads, a thread and a process are different terms. A process is an address space with multiple, schedulable threads of execution.
In RTL-scheduled threads, the process is a thread.

When is clone() and fork better than pthreads?

I am beginner in this area.
I have studied fork(), vfork(), clone() and pthreads.
I have noticed that pthread_create() will create a thread, which is less overhead than creating a new process with fork(). Additionally the thread will share file descriptors, memory, etc with parent process.
But when is fork() and clone() better than pthreads? Can you please explain it to me by giving real world example?
Thanks in Advance.
clone(2) is a Linux specific syscall mostly used to implement threads (in particular, it is used for pthread_create). With various arguments, clone can also have a fork(2)-like behavior. Very few people directly use clone, using the pthread library is more portable. You probably need to directly call clone(2) syscall only if you are implementing your own thread library - a competitor to Posix-threads - and this is very tricky (in particular because locking may require using futex(2) syscall in machine-tuned assembly-coded routines, see futex(7)). You don't want to directly use clone or futex because the pthreads are much simpler to use.
(The other pthread functions require some book-keeping to be done internally in libpthread.so after a clone during a pthread_create)
As Jonathon answered, processes have their own address space and file descriptor set. And a process can execute a new executable program with the execve syscall which basically initialize the address space, the stack and registers for starting a new program (but the file descriptors may be kept, unless using close-on-exec flag, e.g. thru O_CLOEXEC for open).
On Unix-like systems, all processes (except the very first process, usuallyinit, of pid 1) are created by fork (or variants like vfork; you could, but don't want to, use clone in such way as it behaves like fork).
(technically, on Linux, there are some few weird exceptions which you can ignore, notably kernel processes or threads and some rare kernel-initiated starting of processes like /sbin/hotplug ....)
The fork and execve syscalls are central to Unix process creation (with waitpid and related syscalls).
A multi-threaded process has several threads (usually created by pthread_create) all sharing the same address space and file descriptors. You use threads when you want to work in parallel on the same data within the same address space, but then you should care about synchronization and locking. Read a pthread tutorial for more.
I suggest you to read a good Unix programming book like Advanced Unix Programming and/or the (freely available) Advanced Linux Programming
The strength and weakness of fork (and company) is that they create a new process that's a clone of the existing process.
This is a weakness because, as you pointed out, creating a new process has a fair amount of overhead. It also means communication between the processes has to be done via some "approved" channel (pipes, sockets, files, shared-memory region, etc.)
This is a strength because it provides (much) greater isolation between the parent and the child. If, for example, a child process crashes, you can kill it and start another fairly easily. By contrast, if a child thread dies, killing it is problematic at best -- it's impossible to be certain what resources that thread held exclusively, so you can't clean up after it. Likewise, since all the threads in a process share a common address space, one thread that ran into a problem could overwrite data being used by all the other threads, so just killing that one thread wouldn't necessarily be enough to clean up the mess.
In other words, using threads is a little bit of a gamble. As long as your code is all clean, you can gain some efficiency by using multiple threads in a single process. Using multiple processes adds a bit of overhead, but can make your code quite a bit more robust, because it limits the damage a single problem can cause, and makes it much easy to shut down and replace a process if it does run into a major problem.
As far as concrete examples go, Apache might be a pretty good one. It will use multiple threads per process, but to limit the damage in case of problems (among other things), it limits the number of threads per process, and can/will spawn several separate processes running concurrently as well. On a decent server you might have, for example, 8 processes with 8 threads each. The large number of threads helps it service a large number of clients in a mostly I/O bound task, and breaking it up into processes means if a problem does arise, it doesn't suddenly become completely un-responsive, and can shut down and restart a process without losing a lot.
These are totally different things. fork() creates a new process. pthread_create() creates a new thread, which runs under the context of the same process.
Thread share the same virtual address space, memory (for good or for bad), set of open file descriptors, among other things.
Processes are (essentially) totally separate from each other and cannot modify each other.
You should read this question:
What is the difference between a process and a thread?
As for an example, if I am your shell (eg. bash), when you enter a command like ls, I am going to fork() a new process, and then exec() the ls executable. (And then I wait() on the child process, but that's getting out of scope.) This happens in an entire different address space, and if ls blows up, I don't care, because I am still executing in my own process.
On the other hand, say I am a math program, and I have been asked to multiply two 100x100 matrices. We know that matrix multiplication is an Embarrassingly Parallel problem. So, I have the matrices in memory. I spawn of N threads, who each operate on the same source matrices, putting their results in the appropriate location in the result matrix. Remember, these operate in the context of the same process, so I need to make sure they are not stamping on each other's data. If N is 8 and I have an eight-core CPU, I can effectively calculate each part of the matrix simultaneously.
Process creation mechanism on unix using fork() (and family) is very efficient.
Morever , most unix system doesnot support kernel level threads i.e thread is not entity recognized by kernel. Hence thread on such system cannot get benefit of CPU scheduling at kernel level. pthread library does that which is not kerenl rather some process itself.
Also on such system pthreads are implemented using vfork() and as light weight process only.
So using threading has no point except portability on such system.
As per my understanding Sun-solaris and windows has kernel level thread and linux family doesn't support kernel threads.
with processes pipes and unix doamin sockets are very efficient IPC without synchronization issues.
I hope it clears why and when thread should be used practically.

Threads vs Processes in Linux [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed last year.
The community reviewed whether to reopen this question last year and left it closed:
Original close reason(s) were not resolved
Improve this question
I've recently heard a few people say that in Linux, it is almost always better to use processes instead of threads, since Linux is very efficient in handling processes, and because there are so many problems (such as locking) associated with threads. However, I am suspicious, because it seems like threads could give a pretty big performance gain in some situations.
So my question is, when faced with a situation that threads and processes could both handle pretty well, should I use processes or threads? For example, if I were writing a web server, should I use processes or threads (or a combination)?
Linux uses a 1-1 threading model, with (to the kernel) no distinction between processes and threads -- everything is simply a runnable task. *
On Linux, the system call clone clones a task, with a configurable level of sharing, among which are:
CLONE_FILES: share the same file descriptor table (instead of creating a copy)
CLONE_PARENT: don't set up a parent-child relationship between the new task and the old (otherwise, child's getppid() = parent's getpid())
CLONE_VM: share the same memory space (instead of creating a COW copy)
fork() calls clone(least sharing) and pthread_create() calls clone(most sharing). **
forking costs a tiny bit more than pthread_createing because of copying tables and creating COW mappings for memory, but the Linux kernel developers have tried (and succeeded) at minimizing those costs.
Switching between tasks, if they share the same memory space and various tables, will be a tiny bit cheaper than if they aren't shared, because the data may already be loaded in cache. However, switching tasks is still very fast even if nothing is shared -- this is something else that Linux kernel developers try to ensure (and succeed at ensuring).
In fact, if you are on a multi-processor system, not sharing may actually be beneficial to performance: if each task is running on a different processor, synchronizing shared memory is expensive.
* Simplified. CLONE_THREAD causes signals delivery to be shared (which needs CLONE_SIGHAND, which shares the signal handler table).
** Simplified. There exist both SYS_fork and SYS_clone syscalls, but in the kernel, the sys_fork and sys_clone are both very thin wrappers around the same do_fork function, which itself is a thin wrapper around copy_process. Yes, the terms process, thread, and task are used rather interchangeably in the Linux kernel...
Linux (and indeed Unix) gives you a third option.
Option 1 - processes
Create a standalone executable which handles some part (or all parts) of your application, and invoke it separately for each process, e.g. the program runs copies of itself to delegate tasks to.
Option 2 - threads
Create a standalone executable which starts up with a single thread and create additional threads to do some tasks
Option 3 - fork
Only available under Linux/Unix, this is a bit different. A forked process really is its own process with its own address space - there is nothing that the child can do (normally) to affect its parent's or siblings address space (unlike a thread) - so you get added robustness.
However, the memory pages are not copied, they are copy-on-write, so less memory is usually used than you might imagine.
Consider a web server program which consists of two steps:
Read configuration and runtime data
Serve page requests
If you used threads, step 1 would be done once, and step 2 done in multiple threads. If you used "traditional" processes, steps 1 and 2 would need to be repeated for each process, and the memory to store the configuration and runtime data duplicated. If you used fork(), then you can do step 1 once, and then fork(), leaving the runtime data and configuration in memory, untouched, not copied.
So there are really three choices.
That depends on a lot of factors. Processes are more heavy-weight than threads, and have a higher startup and shutdown cost. Interprocess communication (IPC) is also harder and slower than interthread communication.
Conversely, processes are safer and more secure than threads, because each process runs in its own virtual address space. If one process crashes or has a buffer overrun, it does not affect any other process at all, whereas if a thread crashes, it takes down all of the other threads in the process, and if a thread has a buffer overrun, it opens up a security hole in all of the threads.
So, if your application's modules can run mostly independently with little communication, you should probably use processes if you can afford the startup and shutdown costs. The performance hit of IPC will be minimal, and you'll be slightly safer against bugs and security holes. If you need every bit of performance you can get or have a lot of shared data (such as complex data structures), go with threads.
Others have discussed the considerations.
Perhaps the important difference is that in Windows processes are heavy and expensive compared to threads, and in Linux the difference is much smaller, so the equation balances at a different point.
Once upon a time there was Unix and in this good old Unix there was lots of overhead for processes, so what some clever people did was to create threads, which would share the same address space with the parent process and they only needed a reduced context switch, which would make the context switch more efficient.
In a contemporary Linux (2.6.x) there is not much difference in performance between a context switch of a process compared to a thread (only the MMU stuff is additional for the thread).
There is the issue with the shared address space, which means that a faulty pointer in a thread can corrupt memory of the parent process or another thread within the same address space.
A process is protected by the MMU, so a faulty pointer will just cause a signal 11 and no corruption.
I would in general use processes (not much context switch overhead in Linux, but memory protection due to MMU), but pthreads if I would need a real-time scheduler class, which is a different cup of tea all together.
Why do you think threads are have such a big performance gain on Linux? Do you have any data for this, or is it just a myth?
I think everyone has done a great job responding to your question. I'm just adding more information about thread versus process in Linux to clarify and summarize some of the previous responses in context of kernel. So, my response is in regarding to kernel specific code in Linux. According to Linux Kernel documentation, there is no clear distinction between thread versus process except thread uses shared virtual address space unlike process. Also note, the Linux Kernel uses the term "task" to refer to process and thread in general.
"There are no internal structures implementing processes or threads, instead there is a struct task_struct that describe an abstract scheduling unit called task"
Also according to Linus Torvalds, you should NOT think about process versus thread at all and because it's too limiting and the only difference is COE or Context of Execution in terms of "separate the address space from the parent " or shared address space. In fact he uses a web server example to make his point here (which highly recommend reading).
Full credit to linux kernel documentation
If you want to create a pure a process as possible, you would use clone() and set all the clone flags. (Or save yourself the typing effort and call fork())
If you want to create a pure a thread as possible, you would use clone() and clear all the clone flags (Or save yourself the typing effort and call pthread_create())
There are 28 flags that dictate the level of resource sharing. This means that there are over 268 million flavours of tasks that you can create, depending on what you want to share.
This is what we mean when we say that Linux does not distinguish between a process and a thread, but rather alludes to any flow of control within a program as a task. The rationale for not distinguishing between the two is, well, not uniquely defining over 268 million flavours!
Therefore, making the "perfect decision" of whether to use a process or thread is really about deciding which of the 28 resources to clone.
How tightly coupled are your tasks?
If they can live independently of each other, then use processes. If they rely on each other, then use threads. That way you can kill and restart a bad process without interfering with the operation of the other tasks.
To complicate matters further, there is such a thing as thread-local storage, and Unix shared memory.
Thread-local storage allows each thread to have a separate instance of global objects. The only time I've used it was when constructing an emulation environment on linux/windows, for application code that ran in an RTOS. In the RTOS each task was a process with it's own address space, in the emulation environment, each task was a thread (with a shared address space). By using TLS for things like singletons, we were able to have a separate instance for each thread, just like under the 'real' RTOS environment.
Shared memory can (obviously) give you the performance benefits of having multiple processes access the same memory, but at the cost/risk of having to synchronize the processes properly. One way to do that is have one process create a data structure in shared memory, and then send a handle to that structure via traditional inter-process communication (like a named pipe).
In my recent work with LINUX is one thing to be aware of is libraries. If you are using threads make sure any libraries you may use across threads are thread-safe. This burned me a couple of times. Notably libxml2 is not thread-safe out of the box. It can be compiled with thread safe but that is not what you get with aptitude install.
I'd have to agree with what you've been hearing. When we benchmark our cluster (xhpl and such), we always get significantly better performance with processes over threads. </anecdote>
The decision between thread/process depends a little bit on what you will be using it to.
One of the benefits with a process is that it has a PID and can be killed without also terminating the parent.
For a real world example of a web server, apache 1.3 used to only support multiple processes, but in in 2.0 they added an abstraction so that you can swtch between either. Comments seems to agree that processes are more robust but threads can give a little bit better performance (except for windows where performance for processes sucks and you only want to use threads).
For most cases i would prefer processes over threads.
threads can be useful when you have a relatively smaller task (process overhead >> time taken by each divided task unit) and there is a need of memory sharing between them. Think a large array.
Also (offtopic), note that if your CPU utilization is 100 percent or close to it, there is going to be no benefit out of multithreading or processing. (in fact it will worsen)
Threads -- > Threads shares a memory space,it is an abstraction of the CPU,it is lightweight.
Processes --> Processes have their own memory space,it is an abstraction of a computer.
To parallelise task you need to abstract a CPU.
However the advantages of using a process over a thread is security,stability while a thread uses lesser memory than process and offers lesser latency.
An example in terms of web would be chrome and firefox.
In case of Chrome each tab is a new process hence memory usage of chrome is higher than firefox ,while the security and stability provided is better than firefox.
The security here provided by chrome is better,since each tab is a new process different tab cannot snoop into the memory space of a given process.
Multi-threading is for masochists. :)
If you are concerned about an environment where you are constantly creating threads/forks, perhaps like a web server handling requests, you can pre-fork processes, hundreds if necessary. Since they are Copy on Write and use the same memory until a write occurs, it's very fast. They can all block, listening on the same socket and the first one to accept an incoming TCP connection gets to run with it. With g++ you can also assign functions and variables to be closely placed in memory (hot segments) to ensure when you do write to memory, and cause an entire page to be copied at least subsequent write activity will occur on the same page. You really have to use a profiler to verify that kind of stuff but if you are concerned about performance, you should be doing that anyway.
Development time of threaded apps is 3x to 10x times longer due to the subtle interaction on shared objects, threading "gotchas" you didn't think of, and very hard to debug because you cannot reproduce thread interaction problems at will. You may have to do all sort of performance killing checks like having invariants in all your classes that are checked before and after every function and you halt the process and load the debugger if something isn't right. Most often it's embarrassing crashes that occur during production and you have to pore through a core dump trying to figure out which threads did what. Frankly, it's not worth the headache when forking processes is just as fast and implicitly thread safe unless you explicitly share something. At least with explicit sharing you know exactly where to look if a threading style problem occurs.
If performance is that important, add another computer and load balance. For the developer cost of debugging a multi-threaded app, even one written by an experienced multi-threader, you could probably buy 4 40 core Intel motherboards with 64gigs of memory each.
That being said, there are asymmetric cases where parallel processing isn't appropriate, like, you want a foreground thread to accept user input and show button presses immediately, without waiting for some clunky back end GUI to keep up. Sexy use of threads where multiprocessing isn't geometrically appropriate. Many things like that just variables or pointers. They aren't "handles" that can be shared in a fork. You have to use threads. Even if you did fork, you'd be sharing the same resource and subject to threading style issues.
If you need to share resources, you really should use threads.
Also consider the fact that context switches between threads are much less expensive than context switches between processes.
I see no reason to explicitly go with separate processes unless you have a good reason to do so (security, proven performance tests, etc...)

Practical uses for threads [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 14 years ago.
In your work, what specifically have you used threads for?
(Please give a description of the application and how the thread helped/enhanced the application.)
Threads are critical for most UI work. Otherwise, any time you want to do a calculation or anything that takes a while, you will freeze the UI.
Therefore, most GUI frameworks have UI threads that handles the event loop (and some drawing activities), but most user code takes place in another thread.
Threads are also useful for occasionally checking things or making episodic changes to the state of the system.
(Less serious answer) I like to use threads in any situation where I want the system to fall on its arse in interesting and unobvious ways, while still having plausible deniability as to how I could have let the problem slip through.
Or, in the words of Rasmus Lerdorf, "People aren't smart enough to write thread-safe code".
Handling concurrent client requests in a server.
Threads are fundamental for most I/O bound applications, and for any reasonably complex server side application. Consider an application that acts as an exchange for information with multiple sources of data. You need to be able to deal with this information in independent threads, in particular if operations on this data is subject to latencies or require a signifigant amount of time to complete.
In most occasions threads often help to decouple various concerns within the application. A single thread dispatching events to interested parties will not scale well in the vast majority of occasions.
All but the simplest of applications will require threading to some degree.
Most common use is for resposive UI like showing a progress bar for a long running background task.
Background tasks:
Handling network connections and protocols.
Doing Sound Synthesis running in the background of a multimedia application.
Doing File-loading in the background in a multimedia application (CD streaming)
Other uses:
Accelerating certain algorithms by running two instances of the same code in two different threads.
I know that most of the time I use threads, what I actually want to do is to launch some asynchronous lump of work - i.e. I want something to happen in the mythical "background". Unfortunately thinking about threads isn't really the correct level of abstraction for doing "launch some lump of work", because you're not putting something into the background. With a thread API you're creating another place to run stuff as a sibling of the process's original thread, and need to worry about what information is shared between them, and how, and so on. That's why I'm enjoying newer API such as Cocoa's NSOperation and NSOperationQueue. In the case of that API, launching some lump of work is just some single line, and the library takes care of whether a new thread should be launched or an old one reused.
To scan directories looking for changed files. It is a lot quicker to spawn a thread per sub directory then do it in a single thread.
We have been using threads for several applications where the main screen was comprised of a workflow tailored to the currently logged on user.
Getting the workflow can be time intensive. Various parts of the workflow get loaded by different threads. For our main application BP/GeNA, about 11 threads are fired, each running a database query.
Regards,
Lieven
I most commonly use threads when I want to do something with a bunch of resources that I know will take a long time, when there's no interdependence between the work to process the elements, and particularly if the bottleneck isn't a local resource (like CPU of disk). For example, if I've got a bunch of URLs to retrieve, each of those will go into a separate thread.
This is a very general question. I've used "threads" to take potentially blocking work off of the UI thread, whether that work is local or network i/o or that work is computationally intensive tasks which will tend to "block" depending on the hardware it is run on.
I think it is more interesting to ask about a particular problem or pattern which help alleviates it and the applicability on threads, i.e.:
how are threads relevant to model
view controller?
how or why should I
take work off of a UI thread to
ensure that the UI doesn't even think
of blocking?
how can a threadpool be
useful for recursive (network)
directory traversal as someone else
alluded to?
Should I be affinitizing
threads for cooperative scheduling of
computationally intensive tasks, or
should I be using a ThreadPool and
let the OS preemptively schedule the
threads as it sees fit.
It's a pretty broad space and more clarity would likely help.
I build web applications, so all code that I write is executed in multiple threads.
Our app is a web service, so we spawn off a thread per request. Technically, JNI spawns off the thread, but the code has to be thread-safe regardless. We've run into some interesting (FSVO) issues with both Hibernate and our ESB-based infrastructure, but for the most part keeping things in ThreadLocals and synchronizing on subsystem entry points has worked pretty well. We haven't tried more than a couple dozen simultaneous requests, so there may well be some race conditions we haven't identified yet, but overall we're performant and give the right answers.
I wrote a function that spawns a thread that beeps (from the speaker) at regular intervals to alert a test operator that something needs attention. After the modal dialog is responded to, the thread is killed.
Not employment related, but I'm doing some side work on the Netflix Prize. My computer has 8 cores and 20 GB of RAM ... running only 1 thread would be an utter waste, so I typically start 16 threads or so.

Resources