I'm looking for a service server that is extremely simple and lightweight. It's meant to be used by administration scripts or simple apps to query for information that is available only as root on another server.
I don't need high throughput, stateful processing, etc.; only blocking, synchronous queries are required. Preferably no HTTP server. I'd be happy with something that takes a number of strings as input and outputs a string over the network. Any data serialisation can be done in the client if required, so that only opaque strings are passed.
Is there any project like that already available? Bindings for Perl and Python would be a bonus.
There is D-Bus, but the network transport is a bit... DIY.
So you only need data out of this service? I have used memcached before to do what it sounds like you need. There is Cache::Memcached::Fast in Perl that can interface with the process.
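For what it's worth, the same pattern from Node.js with the npm memcached client looks roughly like this (a sketch; the key, value, TTL, and a memcached instance on localhost:11211 are all assumptions):

    const Memcached = require('memcached');
    const memcached = new Memcached('localhost:11211');

    // a privileged job on the server could refresh the value periodically...
    memcached.set('disk-report', 'sda: 82% full', 600, (err) => {
      if (err) throw err;
      // ...and unprivileged clients read the opaque string back
      memcached.get('disk-report', (err, value) => {
        if (err) throw err;
        console.log(value);
      });
    });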
I've found RPC::Lite, which satisfies everything (more or less) and is extremely simple to use. I'll probably stick with that, but feel free to add more ideas.
http://metacpan.org/pod/RPC::Lite::Server
What's the point of using the driver and JavaScript if we can perform the same query operations a lot more easily from the Mongo shell?
In theory, anything the driver does could just as well be achieved through a good shell.
So why do we actually stay away from the shell at all costs?
Security concerns: when an application uses the shell to perform operations, it is very exposed to exploits such as command injection.
Configuration: what if the server does not have the needed client installed, or the client is the wrong version?
The driver handles many edge cases you might not notice at first glance: connection loss, multiple connections, and such.
Briefly, think of the shell as a user interface for administrators. It might be powerful enough for a task, but as a developer you would rather bypass this middleman and communicate directly with the server.
If you program in a certain language (say Java), it's much easier to use the Java driver to access MongoDB than to call the mongo shell from Java and execute commands against MongoDB that way (via the shell). The same applies to the JavaScript language, and the Node.js JavaScript host environment in particular. That's why using a driver makes sense.
Actually this whole thing applies not just to MongoDB but to relational databases too (like MySQL, Oracle, etc.).
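To make that concrete, here's a minimal sketch with the official MongoDB Node.js driver (the connection string, database name, and query are illustrative assumptions): the same operation you'd type into the shell becomes an ordinary API call, with the driver managing the connection for you.

    // npm package: mongodb
    const { MongoClient } = require('mongodb');

    async function main() {
      const client = new MongoClient('mongodb://localhost:27017');
      await client.connect();
      try {
        const users = client.db('app').collection('users');
        // equivalent to db.users.find({role: 'admin'}) in the shell
        const admins = await users.find({ role: 'admin' }).toArray();
        console.log(admins);
      } finally {
        await client.close();
      }
    }

    main().catch(console.error);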
Is there a mock backend for CouchDB, i.e. one with the same REST interface and semantics but purely in-memory? We have a test suite that runs each test on a pristine database every time (to be reproducible), and an in-memory backend could be faster than running against the real database.
Do you mean running against a mock database?
I do not think there is anything right out of the box. Two ideas:
CouchDB on a memory filesystem. Set up a ramdisk or tmpfs mount, and configure the CouchDB database_dir and view_index_dir to point there.
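A sketch of that setup (the mount point and size are assumptions; adjust to your system):

    mount -t tmpfs -o size=256m tmpfs /mnt/couch-ram

and then in CouchDB's local.ini:

    [couchdb]
    database_dir = /mnt/couch-ram/data
    view_index_dir = /mnt/couch-ram/index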
PouchDB is porting CouchDB to the browser's IndexedDB standard. You did not say which language and environment you are using, but if you can run Node.js, this might be worth looking into. PouchDB has good momentum and I think it will be running in Node.js soon (perhaps through jsdom or some other library). Note, this does not get you the full solution, but it turns your question into "are there in-memory IndexedDB implementations for Node.js?", for which the answer is either "yes" or "soon," given its adoption trajectory.
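If that pans out, a minimal in-memory database sketch in Node.js might look like this (assuming the pouchdb and pouchdb-adapter-memory npm packages):

    const PouchDB = require('pouchdb');
    PouchDB.plugin(require('pouchdb-adapter-memory'));

    // nothing touches disk; each test can start from a pristine database
    const db = new PouchDB('testdb', { adapter: 'memory' });

    db.put({ _id: 'doc1', value: 42 })
      .then(() => db.get('doc1'))
      .then(doc => console.log(doc));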
Found this: https://github.com/RipcordSoftware/AvanceDB - it supports different platforms and seems to be a serious effort.
Rather late to the party, but I've had great success using pouchdb-server, based on the aforementioned PouchDB project (a JavaScript implementation of CouchDB). It can run against a variety of back-ends, including an in-memory back-end. That means you can run
pouchdb-server --in-memory
to get an in-memory CouchDB-compatible server. There are several other command-line options to explore, too.
I think it is able to run the entire CouchDB test suite, so I'd guess it is fairly unlikely you'd run into too many implementation differences.
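If you'd rather embed the same thing in-process in a test harness instead of spawning a separate server, the library pouchdb-server is built on can be mounted directly; a sketch (assuming the express, pouchdb, pouchdb-adapter-memory, and express-pouchdb npm packages; the port is arbitrary):

    const express = require('express');
    const PouchDB = require('pouchdb');
    PouchDB.plugin(require('pouchdb-adapter-memory'));

    // every database created through this constructor lives in memory
    const InMemPouchDB = PouchDB.defaults({ adapter: 'memory' });

    const app = express();
    app.use('/', require('express-pouchdb')(InMemPouchDB));
    app.listen(5984); // speaks the CouchDB REST API on the usual port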
I had the same problem: for tests I just don't want to set up a CouchDB; I want some in-memory stand-in, as easy as possible.
What did I do:
* I created an in-memory CouchDB connector: a very simple implementation of org.ektorp.CouchDbConnector.
* Via Spring I wire in whichever CouchDbConnector implementation I need: for my dev tests I wire the in-memory connector; when I want to connect to a real CouchDB I use the usual one, org.ektorp.impl.StdCouchDbConnector.
The only problem is that org.ektorp.CouchDbConnector has more than 50 methods that must be implemented. For my purposes it was enough to implement just a few of them; it depends on your test cases.
memorydb is a partial (in-progress) in-memory implementation of CouchDB to be used with Kivik, which can be run as a stand-alone server.
Not all functionality is implemented yet.
I'm writing a piece of a project that's responsible for processing tasks outside of the main application-facing data server, which is written in JavaScript using Node.js. It needs to handle tasks scheduled in the future and potentially tasks that are "right now". The "right now" just means the next time a worker becomes available it will operate on that task, so that bit might not matter. The workers are all going to talk to external resources; an example job would be sending an email. We are a small shop and we don't have a ton of resources, so one thing I don't want to do is start mixing languages at this point in the process. I can already see that Node can do this for us pretty easily, so that's what we're going with unless I see a compelling reason not to before I start coding, which is soon.
All that said, I can't tell whether there is a compelling reason to use an AMQP-based server, like OpenAMQ or RabbitMQ, over something like Kue or beanstalkd with a Node client. So, here we go:
Is there a compelling reason to use an AMQP-based server over something like beanstalkd or Redis with Kue? If yes, which AMQP-based server would fit best with the architecture I laid out? If no, which NoSQL-backed solution (beanstalkd, Redis/Kue) would be easiest to set up and fastest to deploy?
FWIW, I'm not accepting my answer yet, I'm going to explain what I've decided and why. If I don't get any answers that appear to be better than what I've decided, I'll accept my own later.
I decided on Kue. It supports multiple workers running asynchronously, and with cluster it can take advantage of multicore systems. It is easily extended to provide security. It's backed by Redis, which is used all over for exactly this kind of thing, so I know I'm not backing my job-processing server with unproven software (that's not to say that any of the others are unproven).
The most compelling reason I picked Kue is that it provides a JSON API so that client applications (the first client is going to be a web-based application, but we're planning on making smartphone apps too) can add jobs easily without going through the main application-facing Node instance, so I can stay totally out of the way of the rest of my team while I write this. I don't need a route, I don't need anything; it's all provided for me, so I don't need to write anything to support this. This has another advantage: with an extension to provide login/password security, only authorized clients can add jobs, so I don't have to expose my Redis server to client applications directly. It also has a built-in web console, and the API allows the client to pull back lists of jobs associated with a given user very easily, so we can show the user all of their scheduled tasks in a nifty calendar view with zero effort on my part.
The other compelling reason is the lack of a steep learning curve in getting Redis and Kue going. I've set up Redis before, and Kue is simple and effective.
Yes, I'm a lazy developer, but I'm the good kind of lazy developer.
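Since the decision above hinges on how little code Kue needs, here's a minimal sketch of the pattern described (the queue name, job payload, and sendEmail helper are illustrative assumptions, not the real project's code):

    const kue = require('kue');
    const queue = kue.createQueue(); // talks to Redis on localhost:6379 by default

    // schedule a job one minute in the future
    // (older Kue releases also need queue.promote() to activate delayed jobs)
    queue.create('email', { to: 'user@example.com', subject: 'Hello' })
      .delay(60000)
      .save();

    // a worker processes jobs as they become available
    queue.process('email', (job, done) => {
      sendEmail(job.data, done); // hypothetical helper
    });

    // the built-in web console and JSON API mentioned above
    kue.app.listen(3000);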
UPDATE:
I have it working and doing jobs, and the throughput is amazing. I split the task-marshalling logic out into its own Node instance; basically, all I have to do is deploy my repo to a new machine and run node task-server.js to scale out my workers. I may need to add some more job-searching calls to Kue because of how I implemented a few things, but that will be easy.
I'm reading about the different approaches taken by Node.js, Ruby, Jetty, and company to scaling request-handling capacity on a single machine.
Being an application developer, i.e. having very little understanding of kernel/networking internals, I'm curious about the different approaches each implementation takes (kernel select, polling the socket for connections, event-based I/O, and company).
Please note that I'm not asking about special handling features (such as Jetty continuations (request->wait->request), a pattern typical for AJAX clients) but about something more general: if you wanted to implement a server that can respond with "Hello World" to the maximum number of concurrent clients, how would you do it, and why?
Information / References to reading material would be great.
Take a look at The C10K problem page.
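For a concrete taste of the event-based approach Node.js takes, here's a "Hello World" server in which a single thread services every connection, relying on the kernel's readiness notifications (epoll/kqueue, via libuv) rather than a thread per client (the port is arbitrary):

    const http = require('http');

    // the callback runs once per request; no thread ever blocks waiting on a socket
    http.createServer((req, res) => {
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('Hello World');
    }).listen(8080);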
What can we do to integrate code written in one language with code written in any other language? Which techniques are more or less well known? I know that some/most languages can be compiled to Java bytecode, but what do we do about the rest?
You mention the "compile to Java bytecode" approach, and there's also the "use a .NET language" approach, so let's look at other cases. There are a number of ways you can interoperate, and which one to use depends on what you're trying to accomplish; it's a case-by-case situation. Things that come to mind are:
Web Services (SOAP or REST)
A text (or other) file in the file system
Use of a database to relay state or other data
A messaging environment like MSMQ or MQSeries
TCP sockets or UDP messages (see the sketch after this list)
Mailslots and named pipes
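As a sketch of the TCP-sockets option paired with a platform-neutral serialisation format: any language that can open a socket and parse JSON can talk to this Node.js process (the port and message shape are assumptions, and real code would need to handle messages split across chunks):

    const net = require('net');

    net.createServer(socket => {
      socket.on('data', chunk => {
        // assumes one complete JSON message per chunk; production code needs framing
        const request = JSON.parse(chunk.toString());
        socket.write(JSON.stringify({ echo: request }) + '\n');
      });
    }).listen(9000);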
It depends on the level of integration you want.
Do you need the code to share data? Use a platform-neutral data format, such as JSON, XML, Protocol Buffers, Thrift etc.
Do you need to be able to ask code written in one language to perform some task for code in the other? Use a web service or similar inter-process communication layer.
Do you need to be able to call the code within a single process? The answer at that point will entirely depend on which languages you're talking about.
Direct invocations:
Direct calls (if the compilers understand each other's calling conventions)
Remote Procedure Call (early 90's)
CORBA (late 90's)
Remote Method Invocation (Java, with RMI stack/library in target environment)
.Net Remoting
Less tightly integrated:
Web services/SOAP
REST
The two I see most often are SWIG and Thrift. The main difference (IIRC) is that Thrift opens a port and puts a server there to marshal data between the different languages, whereas SWIG generates library interface files and uses those to call the specified methods.
I think there are a few possible relationships among programs in different languages...
There's sharing a runtime (e.g. C# and Visual Basic compiled into the same application/process)...
There's one invoking the other (e.g. a Perl script that invokes a C program)...
There's talking to each other via IPC on the same box, or over the network (e.g. pipes and web services)...
Unfortunately your question is rather vague.
There are ways to use different languages in the same process, usually by embedding a VM or an interpreter into the executable. If you need to communicate across process boundaries, there are again several possibilities, many of which have already been mentioned in other answers.
I would suggest you refine your question to get more helpful answers.
On the Web, cookies can be set to pass variables between ASP/PHP/JavaScript. On a previous project I worked on, we used this to let an ASP application hand off to a PHP file that served PDF downloads without revealing their location on the file system.
Almost every language that pretends to any kind of systems-development use is capable of linking against external routines through either a standard OS interface or a C function interface. That is what I tend to use.