What methods can we use to interoperate programming languages? - programming-languages

What can we do to integrate code written in one language with code written in another? Which techniques are more or less well known? I know that many languages can be compiled to Java bytecode, but what do we do about the rest?

You mention the "compile to Java" approach, and there's also the "use a .NET language" approach, so let's look at other cases. There are a number of ways to interoperate, and which one fits depends on what you're trying to accomplish. Things that come to mind are:
Web Services (SOAP or REST)
A text (or other) file in the file system
Use of a database to relay state or other data
A messaging environment like MSMQ or MQSeries
TCP sockets or UDP messages
Mailslots and named pipes

It depends on the level of integration you want.
Do you need the code to share data? Use a platform-neutral data format, such as JSON, XML, Protocol Buffers, Thrift etc.
Do you need to be able to ask code written in one language to perform some task for code in the other? Use a web service or similar inter-process communication layer.
Do you need to be able to call the code within a single process? The answer at that point will entirely depend on which languages you're talking about.

Direct invocations:
Direct calls (if the compilers understand each other's call stack)
Remote Procedure Call (early 90's)
CORBA (late 90's)
Remote Method Invocation (Java, with RMI stack/library in target environment)
.Net Remoting
Less tightly integrated:
Web services/SOAP
REST

The two I see most often are SWIG and Thrift. The main difference is (IIRC) Thrift opens up a port and puts a server there to marshal the data between the different languages, whereas SWIG builds library interface files and uses those to call the specified methods.

I think there are a few possible relationships among programs in different languages:
They share a runtime (e.g. C# and Visual Basic) and are compiled into the same application/process.
One invokes the other (e.g. a Perl script that invokes a C program).
They talk to each other via IPC on the same box, or over the network (e.g. pipes and web services).
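The "one invokes the other" case can be sketched as follows. This is only an illustration under stated assumptions: Python plays both roles, the child program is a hypothetical stand-in for an executable written in any other language, and JSON over stdin/stdout is one possible convention, not something the answer above prescribes.

```python
import json
import subprocess
import sys

# Stand-in for a program written in another language: it reads one JSON
# document from stdin and writes one JSON document to stdout.
CHILD_SOURCE = r"""
import json, sys
request = json.load(sys.stdin)
json.dump({"total": sum(request["values"])}, sys.stdout)
"""


def call_external(command, payload):
    """Run `command`, send `payload` as JSON on stdin, parse JSON from stdout."""
    completed = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(completed.stdout)


# Here the "other program" is just another Python interpreter; in practice
# `command` would name the compiled executable or script to invoke.
result = call_external([sys.executable, "-c", CHILD_SOURCE], {"values": [1, 2, 3]})
print(result)
```

Starting a new process per call is simple but pays the process start-up cost every time, so this tends to suit occasional calls better than high-frequency ones.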

Unfortunately your question is rather vague.
There are ways to use different languages in the same process, usually by embedding a VM or an interpreter into the executable. If you need to communicate across process boundaries, there are again several possibilities, many of which have already been mentioned in other answers.
I would suggest you refine your question to get more helpful answers.

On the Web, cookies can be set to pass variables between ASP/PHP/JavaScript. On a previous project I worked on, we used this to create a PHP file for downloading PDFs without revealing their location on the file system from an ASP application.

Almost every language that claims to be usable for systems development can link against external routines through either a standard OS interface or a C function interface. That is what I tend to use.
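As a rough illustration of linking against a C function interface, here is a minimal sketch using Python's ctypes module to call strlen from the C standard library. It assumes a POSIX-like system (Linux or macOS) where the C library can be located; the choice of strlen is only for demonstration.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. This is an assumption about the
# platform: find_library("c") resolves on typical Linux and macOS systems.
# If it returns None, CDLL(None) on POSIX still exposes the symbols already
# loaded into the current process, which include libc.
libc_path = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_path)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Python bytes are passed to C as a NUL-terminated char pointer.
length = libc.strlen(b"interop")
print(length)
```

Declaring argtypes and restype is optional in ctypes, but without them every argument and return value is treated as a C int, which can silently truncate pointers and sizes.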

Related

Implementation patterns for multiple programming languages in a single web application [closed]

I've only created web applications with one programming language (like Python or JS).
I'm aware that multiple programming languages are used to create advanced services, but I don't know exactly how they work together or what the different patterns for implementing this are.
Here's a scenario: we have a Node.js application that accepts hundreds of key-value pairs (say, JSON) from a user, and we need to process that data using Haskell, which is compiled to a binary.
I have a hierarchy of data of say, a set of people and their managers along with some performance metrics and their points. And I want to pass them to a program written in Haskell to compute some values based on their role etc...
What methods could be used to pass the data into the program?
Should I run a server that accepts the values as JSON (via HTTP) and parses them inside Haskell?
Or can I link it with my Node.js application in some other way? In that case, how can I pass the data from the Node.js application to Haskell?
My concern is also latency: it's a real-time computation that would happen on every request.
For instance, Facebook uses Haskell for spam filtering, and an engineer states that they use C++ and Haskell in that service: C++ accepts the input and passes it to Haskell, which returns the result. How might the interfacing work here?
What methods are used to pass the data into the program? Should the binary services run as daemons?
The exact approach depends on the requirement at hand and on the software components you plan to use.
If you are looking for interworking between different languages, there are various ways.
Node.js addons (dynamically linked shared objects written in C++) provide an interface between JavaScript and C/C++ libraries, and are loaded with the require() function like ordinary Node.js modules. A Foreign Function Interface (FFI) together with dynamic libraries (for example .dylib files) allows a function written in another language (such as Rust) to be called from the host language (Node.js).
For example, the node-ffi addon can be used to create bindings to native libraries without writing any C++ code: it loads and calls dynamic libraries using pure JavaScript. The FFI-based approach can be used for dynamically loading and calling exported Go functions as well.
If you would like to call Go functions from Python, you can use the ctypes foreign function library to call the exported Go functions.
If you are looking for a design pattern for an architecture that accommodates modules and services built in various languages, the choice depends on your exact application and performance requirements.
In general, if you would like to develop a loosely coupled solution that takes advantage of emerging technologies (various languages and frameworks), a microservices-based architecture can be beneficial. It brings more independence, because a change in one module or service does not drastically impact the others. If your application is large or complex, you may need microservices patterns such as "Decompose by business capability" or "Decompose by subdomain". There are many related patterns: "Database per Service", where each service has its own database; "API gateway", which concerns how clients access the services; and "Client-side Discovery" or "Server-side Discovery", which concern how services are located. Other variants are available, and you can deploy them based on your requirements.
The approach also depends on the messaging mechanism (synchronous or asynchronous) and the message formats used between microservices, as the solution requires.
For a near-perfect design, you may need to prototype and then performance test, load test and profile your software and hardware components with the chosen approach, check whether the system requirements and performance metrics are met, and decide accordingly.
Use Microservices Architecture.
Microservice architecture is an architecture where the application is divided into components, each serving a particular purpose. Collectively, these components are called microservices. The components no longer depend on the application as a whole; each is logically and physically independent. Because of this separation, you can have a dedicated database for each microservice, deploy each one to a separate host or server and, moreover, use a different programming language for each.
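The "server that accepts JSON over HTTP" option from the question might look roughly like the sketch below. Everything here is an assumption made for illustration: Python stands in for both the Node.js caller and the Haskell service, and the handler name, the field names (manager, points) and the per-manager total are hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ScoreHandler(BaseHTTPRequestHandler):
    """Stand-in for a service that could be written in any language."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        people = json.loads(self.rfile.read(length))
        # Hypothetical computation: total points per manager.
        totals = {}
        for person in people:
            key = person["manager"]
            totals[key] = totals.get(key, 0) + person["points"]
        body = json.dumps(totals).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def compute_remotely(url, people):
    """Client side: POST the data as JSON and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(people).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())


server = HTTPServer(("127.0.0.1", 0), ScoreHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

people = [
    {"name": "Ann", "manager": "Zoe", "points": 3},
    {"name": "Bob", "manager": "Zoe", "points": 4},
    {"name": "Cy", "manager": "Max", "points": 5},
]
port = server.server_address[1]
totals = compute_remotely(f"http://127.0.0.1:{port}/", people)
print(totals)

server.shutdown()
server.server_close()
```

A long-running service like this avoids per-request process start-up, which matters for the latency concern raised in the question; the cost is one local network round trip plus JSON encoding and decoding per call.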

What are node.js bindings?

I am very new to node.js and I can not seem to find a definition anywhere as to what node.js bindings are. I have seen this term used in slides and nodejs talks but it was never clearly explained. Can anyone help clarify this concept for me? I have attached a picture of what I am referring to.
Rather than understanding what node.js bindings are, it is more useful to understand what "bindings" are in the first place.
Let's say you are writing a web application where a node.js (JavaScript) backend:
receives requests from clients,
conducts queries to databases,
sorts the query results and finally
returns the results to the client.
Now normally you would write all the code yourself. However, you know that there is an excellent sorting library that can take care of step 3 (i.e. sorting query results). The only problem is that the library is written in a system programming language such as C/C++ whereas your code is written in JavaScript. Normally you can't use that library in your code because they are in different programming languages, but with bindings, you can.
Bindings are basically libraries that "bind" two different programming languages so that code written in one language can be used in code written in another language. With bindings, you don't have to write all the code again just because the pieces are in different languages. Another motivation for bindings is that you can benefit from the advantages of different programming languages. For example, C/C++ is generally much faster than JavaScript, so it might be beneficial to write some code in C/C++ for performance purposes.
Now let's take a look at the picture you attached. The V8 engine, according to Google's official website, is "written in C++". libuv, written in C, adds a layer of abstraction that provides asynchronous I/O operations. However, the core functionalities of Node.js, such as networking, database queries and file system I/O, are provided in libraries (or modules if you prefer) that are written in JavaScript. Plus, your code is written in JavaScript as well. In order for these pieces of technology written in different programming languages to communicate with each other, you have to "bind" them together, using bindings. These bindings are the Node.js bindings.
I've recently written an article that explains the architecture of Node.js' internal codebase, including how bindings fit into Node.js!
Node.js bindings are a series of methods that can be used in Node.js code but in reality just run C++ code behind the scenes.
fs.readFile()
This method is not part of JavaScript. It is provided to V8 as part of the Node.js runtime. JavaScript itself does not know how to read a file from disk, but C++ does. So when we use JavaScript code and Node.js to read a file from disk, the call is deferred to a C++ function that actually reads the file and hands the result back.
JavaScript has bindings in the browser too. For example,
document.querySelector()
is not part of the JavaScript language. It is provided by the browser's DOM implementation and exposed to scripts through bindings.
Upon further research I've come across this article. I hope it helps someone:
http://pravinchavan.wordpress.com/2013/11/08/c-binding-with-node-js/

Managing multiple-processes: What are the common strategies?

While multithreading is faster in some cases, sometimes we just want to spawn multiple worker processes to do work. This has the benefit of not crashing the main app if one of the workers crashes, and the developer doesn't need to worry much about locking.
COM+'s Application Pooling seems like a good way to achieve this on Windows. The downside is that we need to write a COM+ wrapper for the worker process.
However, when I search for Application Pooling on Google, it seems like most of its usages are related to IIS. Don't other applications (such as scientific/graphics) find it useful to spawn multiple worker processes?
So there are several questions:
Why isn't COM+ more popular in areas other than IIS? If I write a non-IIS application and want to use process management on Windows, should I go with COM+ or are there better alternatives out there?
What would be the cross platform way to do it? Are there libraries out there that give me a "process pool" (worker processes will intelligently pick up work, can be managed, etc.)
I can't offer any answers to the COM aspect of your question, but it's worth noting there's another world (besides HPC MPI) where multi-processing (rather than the more common multi-threading approach) is apparently alive, well and thriving: Python.
Why? Python's GIL ("global interpreter lock") cripples most attempts to multithread CPU-bound Python code so badly that multiprocessing is the generally recommended approach to parallelising Python on SMP. The standard library includes process pools; there are various other options too.
Python certainly ought to satisfy any multi-platform requirement!
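A minimal sketch of the standard-library process pool mentioned above, using multiprocessing.Pool. The worker function square and the pool size of four are arbitrary choices for illustration.

```python
from multiprocessing import Pool


def square(n):
    """CPU-bound stand-in for real work; runs in a worker process."""
    return n * n


def main():
    # Four worker processes pick up items from the input as they become free.
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)


# The guard matters: on platforms that start workers by re-importing this
# module (Windows, for example), unguarded pool creation would recurse.
if __name__ == "__main__":
    main()
```

Because each worker is a separate process, arguments and results are pickled and copied between processes, so the approach pays off mainly when the work per item outweighs that transfer cost.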
You might want to investigate how the Apache web server manages process pools. From version 2.0 it runs natively on Windows, and one of the multi-processing models it supports is process pools. Part of Apache is also APR (the Apache Portable Runtime), which handles platform-specific issues.
No one can answer why something is not popular; maybe nobody else is looking for what you are looking for. After .NET came into the picture, people shifted from COM to the managed environment. Before .NET, COM, ATL and related technologies were quite painful to implement, prone to crashing and quite difficult to debug.
That is the reason the managed environment came into existence.
From .NET 4 onwards, the parallel libraries give the user much more power for parallel programming, and you can also spawn and control other processes.
For multiplatform, you can look for zvrba's answer.
Yes, other applications--especially science applications--find it useful to spawn multiple processes. Since few super-computers run Microsoft Windows, scientists generally avoid using anything that ties them to a Microsoft platform. Nothing related to COM will help scientists leverage their enormous existing code base written in Fortran.
People who choose to run IIS have generally already drunk the Microsoft Koolaid, so they have fewer inhibitions to tying themselves to Microsoft's proprietary platforms, which is why COM-specific terminology will get lots of hits related to IIS.
One of the open standards for doing what you want is the Message Passing Interface. Several implementations exist and some of them run on supercomputers using Fortran. Some of them run on cheaper computers using sexier languages.
See http://en.wikipedia.org/wiki/Message_Passing_Interface
There hasn't been a mob rushing through the doors of COM application pooling primarily because of two factors:
COM is a pain in the ass to deal with compared to just about anything else
Threading can be a headache, but it's a lot easier and more convenient to manage than inter-process communication
COM application pooling was essentially created for IIS. It has one very specific benefit over normal multithreading: the multiple processes are fully isolated from each other. This is important for data security and for app stability when dealing with third party plugins of questionable stability.
Scientific computing generally doesn't need strong data security isolation between operations, and I would venture to guess that scientific computing doesn't rely much on third party plugins of questionable stability. When doing big math operations, you're either using a sexy numerics library that had better be rock solid to be taken seriously, or you're using your own code, in which case crashes should be fixed and repeat offenders should be spanked.
Oh, and all crashes except stack overflow can be trapped and dealt with within a multithreaded app, especially if it's your own code.
In short, COM app pooling is overkill for just about anything other than IIS.
Google's web browser Chrome uses a multi-process architecture. Chromium, the project it is built on, is open source, so you can check out its code and see how it manages processes.

What is the difference with these technology related terms?

What is the difference between the following terms? Knowing it can help a lot in interviews and for general understanding.
Framework
Library
IDE
API
Framework
Some predefined architecture that a developer has chosen and which dictates how the application will be written. It usually already includes many concepts which helps the developer to concentrate on the domain of the application instead of the plumbing. This plumbing is provided by the framework. For example the .NET framework provides out-of-the-box tools that would allow you to talk to web servers, without even knowing the internals of the TCP/IP protocol (actually it helps knowing the internals but you get the point).
Library
A reusable compiled unit that can be redistributed and reused across various projects. It is not necessarily compiled in the case of dynamic languages.
IDE
It's the development environment in which you create the other three (at its core usually a text editor). It might also include a compiler and the ability to execute, debug and see the output of the program, in order to speed up the development process.
API
Application Programming Interface. This could mean many things, but usually it is a set of functions put at the disposal of the developer, which perform specific tasks and work only in a specific context.
An IDE is a tool for fast, easy and flexible development.
An API is provided for existing software. Using it, third-party applications can interact with the main/primary application.
A framework and a library are typically the same thing: a common set of functionality for other software to use.
Ref: wiki for Framework, API
Framework: a collection of libraries and programming practices to provide general functionality for a program, so that it doesn't have to be rewritten. Typically a framework for an application program will handle user display and input, among other things. The intent is usually to hide the more complex functionality of an application, and to encourage a certain style.
Library: A piece of software to provide certain functionality to other programs that call it. Typically designed to be reusable and modular, so that a library can be distributed and be useful without its source code.
Integrated Development Environment: An integrated set of tools to write programs and turn them into finished products, usually including at least an editor, compiler, linker, and debugger. IDEs sometimes provide support for frameworks.
Application Programming Interface: A set of function calls and sometimes variable accesses available to a program, typically being the public interface of one or more libraries.
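The definitions above can be made a little more concrete with a toy sketch. All names here (slugify, MiniFramework, route, dispatch) are hypothetical and invented for illustration; the point is only that an application calls a library, a framework calls the application, and the API is the set of public names either one exposes.

```python
# "Library" style: the application is in charge and calls the reusable code.
def slugify(title):
    """A tiny reusable function of the kind a library might export."""
    return "-".join(title.lower().split())


# "Framework" style: the framework owns the control flow and calls back into
# application code that was registered with it.
class MiniFramework:
    def __init__(self):
        self._handlers = {}

    def route(self, path):
        """Decorator that registers an application function for `path`."""
        def register(handler):
            self._handlers[path] = handler
            return handler
        return register

    def dispatch(self, path):
        """Look up the handler for `path` and call it, or report a 404."""
        handler = self._handlers.get(path)
        return handler() if handler else "404 Not Found"


app = MiniFramework()


@app.route("/hello")
def hello():
    # Application code: uses the library, is invoked by the framework.
    return slugify("Hello World")


print(app.dispatch("/hello"))
print(app.dispatch("/missing"))
```

In this sketch the API of the library is the single function slugify, while the API of the framework is route and dispatch; the private _handlers dictionary is an implementation detail and not part of either interface.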

Should I use .NET 4.0 Tasks in a library?

I'm writing a .NET 4.0 library that should be efficient and simple to use.
The library is used by referencing it and using its different classes.
Should I use .NET 4.0 Tasks to make things more efficient internally? I fear that it might make the usage of the library more complex and limited, since the users might want to decide for themselves when and where to use tasks and threads.
If your answer depends on the kind of library, here is more information:
The library is Pcap.Net, which is a wrapper for WinPcap and includes a packet interpretation framework.
It is only an issue when the user can 'see' the threading, i.e. when you hand out access to data that could be accessed (by you) on another thread. That is probably not a good idea.
But when the parallel processing stays completely inside your application then there is very little chance your users would object.
Should? Dunno. How about giving people an option by providing extension methods that use tasks against the library and push that out in a separate DLL? If you want to use tasks, reference the extension library and go crazy. Otherwise, stick with the core dll.
I believe there are many projects that follow this pattern with Linq. They provide their core library and a separate .Linq.DLL which has extension methods...
