threads and high level languages - multithreading

Can someone tell me why, if I use threads, it's better to use a low-level language like C++
and not C# or Java? Someone asked me that in an interview and I didn't know the answer.

It's news to me. Higher level languages provide easy to use abstractions over thread management, for example.
I expect the interviewer's point would make sense in context. It's dependent on the problem in hand - the level of timing control you need if you're writing a computer game or software for an engine management system may be greater than if you are writing a conference room booking system.
You trade off the low-level control and the associated learning curve and risk you get with lower-level languages for ease of use, safety and productivity of higher-level languages.

I don't think this is necessarily true. In Java (I can't comment on C#) a thread maps directly to a native thread. From here:
The Java HotSpot™ virtual machine
currently associates each Java thread
with a unique native thread. The
relationship between the Java thread
and the native thread is stable and
persists for the lifetime of the Java
thread.
plus you have the additional high level constructs such as the Executor framework.
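To give a feel for what such a high-level construct buys you, here is a minimal sketch using Python's concurrent.futures.ThreadPoolExecutor, which plays a role comparable to Java's Executor framework. It is an analogue chosen for brevity, not the Java API itself, and the function name word_lengths is made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def word_lengths(words, max_workers=4):
    """Compute len() of each word on a pool of worker threads.

    Hypothetical helper: the point is that the caller never creates,
    starts or joins a thread by hand.
    """
    # The executor owns the threads; callers only hand it work.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(len, words))


if __name__ == "__main__":
    print(word_lengths(["thread", "pool", "executor"]))
```

Thread creation, reuse and shutdown are handled by the pool, which is the kind of safety/productivity trade-off described above.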
Going forwards, functional languages (such as F# and Scala) encourage immutability, which contributes to a safer threaded environment.
There may well be scenarios where a low-level language offers more control (as for most requirements), but I suspect those will be fairly specialised situations. You have to balance that against the safety/productivity that the higher-level languages offer.
EDIT : From your comments supplementing the question, this may relate to running a garbage collector and consequent garbage-collection pauses and the impact on providing real-time performance and predictability. Threading in C/C++ may well offer some benefits in this area since a garbage collection cycle is not going to kick off during some critical time-dependent code. For this reason (amongst others) Java can't be considered as a real-time platform.

Like most answers: it depends. Languages with built-in threading facilities, like C# and Java,
will do some or most of the work needed for thread usage and synchronization for you.
With C++ you have to do it yourself, but you can employ better optimization techniques for your specific OS and platform.

Whether you use threads or not depends solely on the application, not on the language. And the choice of language follows from the design.
C++ provides more control, C# provides more abstraction, Java provides simplification, but in the end they all work the same way.

Related

How well do common languages perform multi-threading?

In my CS class we're discussing threads and processes. I'm curious to know what common programming languages (Java, C/C++, C#, Python) can actually implement multi-threading and, if they do, how efficiently they do it.
We were shown a simple multi-threading structure in C but they didn't demonstrate the difference by running it or by a chart of collected results from a previous test. I assume that the gains for some languages using multi-threading may be negligible
EDIT
PDizzle pointed out that the gains in efficiency aren't necessarily dependent upon the language but rather on what the application/software in question requires, as well as how well threading is implemented for said application/software.
When a program creates a separate thread for processing, it all boils down to the program making a call to the operating system to request resources for a thread.
Each operating system has an API that programming languages can call to request threads for a program. The implementation is platform dependent. C++ (now) has std::thread, which wraps operating-system-dependent calls. Java has classes that implement calls from the virtual machine to the operating system for requesting a thread.
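As a small illustration that a language-level thread ends up as an operating-system thread, here is a sketch in Python (CPython 3.8 or later, on platforms that provide threading.get_native_id). The helper name native_ids is invented for this example.

```python
import threading


def native_ids(worker_count):
    """Start worker_count threads and return the OS-level id of each one."""
    ids = []
    lock = threading.Lock()
    # The barrier keeps every worker alive until all of them have started,
    # so the OS cannot recycle one worker's id for another.
    barrier = threading.Barrier(worker_count)

    def worker():
        barrier.wait()
        with lock:
            ids.append(threading.get_native_id())

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    print("main thread:", threading.get_native_id())
    print("workers:", native_ids(4))
```

Each worker reports a different id assigned by the kernel, which is the request to the operating system described above, made on your behalf by the runtime.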
I assume that the gains for some languages using multi-threading may
be negligible
No, the gains from using multi-threading in general may be negligible depending on the application requirements. I would say it's more important how an application uses threading to accomplish a task than worry about the overhead each language has to access multi-threading.
I think most modern languages do multitasking well. Modern being C++11, Java, C#, D, etc.
However, most programs don't benefit from multitasking, not because of the language in use, but because the algorithm being multithreaded doesn't benefit from parallel processing. Think sorting algorithms and the like.

Distributed Haskell state of the art in 2011?

I've read a lot of articles about distributed Haskell. Much work has been done, but it seems to be in the area of distributing computations. I saw the remote package, which seems to implement Erlang-style message passing, but it is at version 0.1 and at an early stage.
I'd like to implement a system where there are many separate processes that provide distinct services, and are tied together by several main processes. This seems to be a natural fit for Erlang, but not so for Haskell. But I like Haskell's type safety.
Has there been any recent adoption of Erlang-style process management in Haskell?
If you want to learn more about the remote package, a.k.a CloudHaskell, see the paper as well as Jeff Epstein's thesis. It aims to provide precisely the actor abstraction you want, but as you say it is in the early stages. There is active discussion regarding improvements on the parallel-haskell mailing list, so if you have specific needs that remote doesn't provide, we'd be happy for you to jump in and help us decide its future directions.
More mature but lower-level than remote is the haskell-mpi package. If you stick to the Simple interface, messages can be sent containing arbitrary Serialize instances, but the abstraction is still way lower than remote.
There are some experimental systems, such as described in Implementing a High-level Distributed-Memory Parallel Haskell in Haskell (Patrick Maier and Phil Trinder, IFL 2011, can't find a pdf online). It blends a monad-par approach of deterministic dataflow parallelism with a limited ability to make the I-structures serializable over the network. These sorts of abstraction have promise for doing distributed computation, but since the focus is on computing purely-functional values rather than providing Erlang-style processes, they probably wouldn't be a good fit for your application.
Also, for completeness, I should point out the Haskell wiki page on cloud and HPC Haskell, which covers what I describe here, as well as the subsection on distributed Haskell, which seems in need of a refresh.
I frequently get the feeling that IPC and actors are an oversold feature. There are plenty of attractive messaging systems out there that have Haskell bindings e.g. MessagePack, 0MQ or Thrift. IMHO the only thing you have to add is proper addressing of processes and decide who/what is managing this addressing capability.
By the way: a number of coders adopt e.g. 0MQ into their Erlang environments, simply because it offers the possibility to structure messaging via message brokers rather than relying on pure process-to-process messaging at very large scale.
In a "massively multicore world" I personally assume that shared memory approaches will eventually be outperforming messaging. Someone can then always come and argue with asynchrony of course. But already when you write that you want to "tie together" your processes by "several main processes" you in fact speak about synchronization. Also, you can of course challenge whether a single function, process or thread is the right level of parallelization.
In short: I would probably see whether MessagePack or 0MQ could fit my needs in Haskell and care for the rest in my code.

What does built in support for multithreading mean?

Java provides built-in support for multithreaded programming.
That is what my book says. I can do multithreaded programming in C, C++ also. So do they also provide built-in support for multithreading?
What does built in support for multithreading mean? Isn't it the OS that ACTUALLY provides support for multithreading?
Are there any programming languages that cannot support multithreading? If so, why? (I am asking this question because, if the OS provides support for multithreading, then why can't we do multithreaded programming in all languages that are supported on that OS?)
The issue is one of language-support vs. library support for multithreading.
Java's use of the keyword synchronized for placing locks on objects is a language-level construct. Also the built-in methods on Object (wait, notify, notifyAll) are implemented directly in runtime.
There is a bit of a debate regarding whether languages should implement threading through keywords, language structures and core data types vs. having all thread capabilities in the library.
A research paper espousing the view that language-level threading is beneficial is the relatively famous http://www.hpl.hp.com/personal/Hans_Boehm/misc_slides/pldi05_threads.pdf.
In theory, any language built on a C runtime can access a library such as pthreads, and any language running on a JVM can use those threads. In short all languages that can use a library (and have the notion of function pointers) can indeed do multithreading.
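To make the library-versus-language distinction concrete, here is a hedged sketch in Python, where mutual exclusion comes from a library object (threading.Lock) rather than from a dedicated keyword such as Java's synchronized. The Counter class and count_concurrently function are hypothetical names used only for this illustration.

```python
import threading


class Counter:
    """A counter whose updates are guarded by a library-provided lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Roughly what a synchronized method does in Java, spelled out
        # with an explicit lock object instead of a keyword.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def count_concurrently(thread_count, increments_per_thread):
    """Increment one shared Counter from several threads; return the total."""
    counter = Counter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(count_concurrently(4, 10_000))
```

The locking discipline is the same in both styles; what differs is whether the compiler and language specification know about it or only the library does.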
I believe they mean that Java has keywords like volatile and synchronized built in, to make multithreading easier, and that the library already provides threading classes so you don't need a 3rd party library.
The language needs constructs to create and destroy threads, and in turn the OS needs to provide this behaviour to the language.
The exception is Java green threads, which aren't real OS threads at all; Erlang processes are similar, I think.
An example of a language without threading support is Basic as implemented by QBasic in DOS. Basic is supposed to be basic, so threads and processes are advanced features that run counter to the language's intent.
Before C11 and C++11, C and C++ as languages had no mechanism to:
Start a thread
Declare a mutex, semaphore, etc.
etc.
This was not part of the language specification (C++11 later added std::thread and std::mutex to the standard library). However, such facilities exist on every major operating system. Unlike in Java, these facilities differ between operating systems: pthreads on Linux, OS X and other UNIX derivatives, CreateThread on Windows, and other APIs on real-time operating systems.
Java has a language definition for Thread, synchronized blocks and methods, 'notify' 'wait' as part of the core Object and the like, which makes the language proper understand multithreading.
It means that there is functionality in the language's runtime that models the concept of threads and all that goes with it, such as providing synchronisation. What happens behind the scenes is up to the language's implementors... they could choose to use native OS threading or they might fake it.
A language that doesn't support it could be VB6 (at least not natively, IIRC)

Which scripting languages support multi-core programming?

I have written a little Python application and here you can see how Task Manager looks during a typical run.
[Screenshot: Task Manager during a typical run; source: weinzierl.name]
While the application is perfectly multithreaded, unsurprisingly it uses only one CPU core.
Regardless of the fact that most modern scripting languages support multithreading, scripts can run on one CPU core only.
Ruby, Python, Lua, PHP all can only run on a single core.
Even Erlang, which is said to be especially good for concurrent programming, is affected.
Is there a scripting language that has built-in support for threads that are not confined to a single core?
WRAP UP
Answers were not quite what I expected, but the TCL answer comes close.
I'd like to add Perl, which (much like TCL) has interpreter-based threads.
Jython, IronPython and Groovy fall under the umbrella of combining a proven language with the proven virtual machine of another language. Thanks for your hints in this direction.
I chose Aiden Bell's answer as Accepted Answer.
He does not suggest a particular language but his remark was most insightful to me.
You seem to use a definition of "scripting language" that may raise a few eyebrows, and I don't know what that implies about your other requirements.
Anyway, have you considered TCL? It will do what you want, I believe.
Since you are including fairly general-purpose languages in your list, I don't know how heavy an implementation is acceptable to you. I'd be surprised if one of the zillion Scheme implementations doesn't do native threads, but off the top of my head, I can only remember that MzScheme used to, and I seem to remember support was dropped. Certainly some of the Common Lisp implementations do this well. If Embeddable Common Lisp (ECL) does, it might work for you. I don't use it though, so I'm not sure what the state of its threading support is, and this may of course depend on platform.
Update: Also, if I recall correctly, GHC Haskell doesn't do quite what you are asking, but may do effectively what you want since, again, as I recall, it will spin off a native thread per core or so and then run its threads across those....
You can freely multi-thread with the Python language in implementations such as Jython (on the JVM, as @Reginaldo mentions Groovy is) and IronPython (on .NET). For the classical CPython implementation of the Python language, as @Dan's comment mentions, multiprocessing (rather than threading) is the way to freely use as many cores as you have available.
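A minimal sketch of the multiprocessing route on CPython, assuming a CPU-bound task that can be split into independent pieces. The prime-counting workload and the names count_primes and count_primes_parallel are invented for illustration.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, processes=None):
    """Run count_primes for each limit in a pool of worker processes."""
    # Each worker is a separate interpreter process, so the workers can
    # occupy several cores at once instead of taking turns in one process.
    with Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard is required on platforms that start workers by spawning
    # a fresh interpreter (for example Windows).
    print(count_primes_parallel([50_000, 50_000, 50_000, 50_000]))
```

Watching a CPU monitor while this runs should show several cores busy, whereas the same work spread over threading.Thread objects in CPython would typically keep only one core busy.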
Thread syntax may be static, but implementation across operating systems and virtual machines may change
Your scripting language may use true threading on one OS and fake-threads on another.
If you have performance requirements, it might be worth looking to ensure that the scripted threads fall through to the most beneficial layer in the OS. Userspace threads will be faster, but for largely blocking thread activity kernel threads will be better.
As Groovy is based on the Java virtual machine, you get support for true threads.
F# on .NET 4 has excellent support for parallel programming and extremely good performance as well as support for .fsx files that are specifically designed for scripting. I do all my scripting using F#.
An answer for this question has already been accepted, but just to add that besides tcl, the only other interpreted scripting language that I know of that supports multithreading and thread-safe programming is Qore.
Qore was designed from the bottom up to support multithreading; every aspect of the language is thread-safe; the language was designed to support SMP scalability and multithreading natively. For example, you can use the background operator to start a new thread or the ThreadPool class to manage a pool of threads. Qore will also throw exceptions with common thread errors so that threading errors (like potential deadlocks or errors with threading APIs like trying to grab a lock that's already held by the current thread) are immediately visible to the programmer.
Qore additionally supports thread resources; for example, a DatasourcePool allocation is treated as a thread-local resource; if you forget to commit or roll back a transaction before you end your thread, the thread resource handling for the DatasourcePool class will roll back the transaction automatically and throw an exception with user-friendly information about the problem and how it was solved.
Maybe it could be useful for you - an overview of Qore's features is here: Why use Qore?.
CSScript in combination with Parallel Extensions shouldn't be a bad option. You write your code in pure C# and then run it as a script.
It is not related to the threading mechanism. The problem is that (for example in Python) a thread has to hold the interpreter in order to run the script. To acquire the interpreter it has to lock it, because the interpreter keeps reference counts and similar state and needs to avoid concurrent access to these objects. Python uses pthreads and they are real threads, but when you are working with Python objects just one thread is running and the others are waiting. This lock is called the GIL (Global Interpreter Lock), and it is the main problem that makes real parallelism impossible inside a process.
https://wiki.python.org/moin/GlobalInterpreterLock
The other scripting languages may have kind of the same problem.
Guile supports POSIX threads, which I believe map to native OS threads.

Erlang-style Concurrency for Other Languages

What libraries exist for other programming languages to provide an Erlang-style concurrency model (processes, mailboxes, pattern-matching receive, etc.)?
Note: I am specifically interested in things that are intended to be similar to Erlang, not just any threading or queueing library.
Ulf Wiger had a great post recently on this topic - here are the properties he defines as required before you can call something "Erlang Style Concurrency":
1. Fast process creation/destruction
2. Ability to support >> 10 000 concurrent processes with largely unchanged characteristics.
3. Fast asynchronous message passing.
4. Copying message-passing semantics (share-nothing concurrency).
5. Process monitoring.
6. Selective message reception.
Number 2 above is the hardest to support in VMs and language implementations that weren't initially designed for concurrency. This is not to knock Erlang-ish concurrency implementations in other languages, but a lot of Erlang's value comes from being able to create millions of processes, which is pretty damn hard if the process abstraction has a 1-1 relationship with an OS-level thread or process. Ulf has a lot more on this in the link above.
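To show why the 10 000-process requirement is the hard one, here is a deliberately naive actor sketch in Python: one OS thread plus one queue.Queue mailbox per actor. The Actor class and demo function are hypothetical. The sketch offers asynchronous sends but no copying of messages, no monitoring and no selective receive, and its 1-1 mapping to OS threads is exactly the design that stops scaling long before millions of processes.

```python
import queue
import threading


class Actor:
    """A thread with a private mailbox; others talk to it only via send()."""

    _STOP = object()  # private sentinel that ends the receive loop

    def __init__(self, handler):
        self._mailbox = queue.Queue()
        self._handler = handler
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        # Asynchronous: enqueue the message and return immediately.
        self._mailbox.put(message)

    def stop(self):
        # Messages sent earlier are handled first, then the thread exits.
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._handler(message)


def demo():
    """Send a few messages to one actor and collect its replies in order."""
    replies = queue.Queue()

    def handle(message):
        kind, payload = message
        if kind == "square":
            replies.put(payload * payload)
        # Unknown message kinds are silently dropped in this sketch.

    actor = Actor(handle)
    for n in (1, 2, 3):
        actor.send(("square", n))
    actor.send(("unknown", 0))
    actor.stop()
    return [replies.get() for _ in range(3)]


if __name__ == "__main__":
    print(demo())
```

Libraries that aim for real Erlang-style concurrency usually replace the thread-per-actor part with lightweight tasks scheduled over a small number of OS threads.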
Message Passing Interface (MPI) (http://www-unix.mcs.anl.gov/mpi/) is a highly scalable and robust library for parallel programming, originally geared towards C but now available in several flavors (http://en.wikipedia.org/wiki/Message_Passing_Interface#Implementations). While the library doesn't introduce new syntax, it provides a communication protocol to orchestrate the sharing of data between routines which are parallelizable.
Traditionally, it is used in large cluster computing rather than on a single system for concurrency, although multi-core systems can certainly take advantage of this library.
Another interesting solution to the problem of parallel programming is OpenMP, which is an attempt to provide a portable extension on various platforms to provide hints to the compiler about what sections of code are easily parallelizable.
For example (http://en.wikipedia.org/wiki/OpenMP#Work-sharing_constructs):
    #define N 100000
    int main(int argc, char *argv[])
    {
        int i, a[N];
        /* Hint to the compiler: iterations are independent and may run in parallel. */
        #pragma omp parallel for
        for (i = 0; i < N; i++)
            a[i] = 2 * i;
        return 0;
    }
There are advantages and disadvantages to both, of course, but the former has proven to be extremely successful in academia and other heavy scientific computing applications. YMMV.
Microsoft Concurrency and Coordination Runtime for .NET.
The CCR is appropriate for an
application model that separates
components into pieces that can
interact only through messages.
Components in this model need means to
coordinate between messages, deal with
complex failure scenarios, and
effectively deal with asynchronous
programming.
Scala supports actors. But I would not call Scala intentionally similar to Erlang.
Nonetheless, Scala is absolutely worth taking a look at!
Also, Kilim is a library for Java that brings Erlang-style message passing/actors to the Java language.
Mike Rettig created a .NET library called Retlang and a Java port called Jetlang, both inspired by Erlang's concurrency model.
Termite for Gambit Scheme.
Microsoft's Not-Production-Ready Answer to Erlang: Microsoft Axum
If you are using Ruby, take a look at Revactor.
Revactor is an Actor model implementation for Ruby 1.9 built on top of the Rev high performance event library. Revactor is primarily designed for writing Erlang-like network services and tools.
Take a look at this code sample:
    myactor = Actor.spawn do
      Actor.receive do |filter|
        filter.when(:dog) { puts "I got a dog!" }
      end
    end
Revactor only runs on Ruby 1.9. I believe the author of the library has discontinued maintaining it but the documentation on their site is very good.
You might also want to take a look at Reia: a ruby-like scripting language built on top of the Erlang VM. Reia is the new project of the creator of Revactor: Tony Arcieri.
For Python you can try using the processing module (part of the standard library as multiprocessing since Python 2.6).
Warning: shameless plug!
I developed a library for this kind of message passing in Haskell:
Erlang-style Distributed Haskell.
Volker
JoCaml extends OCaml with join calculus for concurrent and distributed programming.
Akka (http://akka.io) is heavily influenced by Erlang's OTP. It builds on Scala's actors and is great for concurrency on the JVM.

Resources