Nashorn: concurrent eval with singleton ScriptEngine ? thread safe?

We would like to use Nashorn within a servlet. The idea is to use a singleton instance of ScriptEngine that is reused for every request. For each request a new ENGINE_SCOPE Bindings object is created, and the eval is run with those bindings. Then the bindings are cleared. No shared objects are passed to the bindings (just the request/response objects from the servlet).
Within the servlet, the singleton instance of ScriptEngine may be eval-ed concurrently in different threads. Will this work properly, or will it run into threading issues? Here is some code that gives the idea:
ScriptEngine engine = getNashornSingleton();
ScriptContext newContext = new SimpleScriptContext();
newContext.setBindings(engine.createBindings(), ScriptContext.ENGINE_SCOPE);
Bindings engineScope = newContext.getBindings(ScriptContext.ENGINE_SCOPE);
engineScope.put("request", request);
engineScope.put("response", response);
engine.eval(jsCode, engineScope);
engineScope.clear();

My answer to my own question: I would not use a singleton as described above. Apart from the potential threading issues, you likely do not want to destroy the bindings at each request (as that may require re-compiling the scripts). What we ended up doing is creating a pool of engines and their associated scope Bindings, that is, a pool of engine/binding pairs.
At each servlet request we grab an engine/binding pair from the pool, put the request/response into the binding, and then execute the script. There are no threading issues to worry about because a given engine/binding pair is only executed by a single thread at a time. When the request is done, the engine/binding pair is returned to the pool. Seems to work well.
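The pooling idea can be sketched roughly as follows. This is a minimal illustration and not the code we actually run: ExclusivePool, withItem and idleCount are hypothetical names, and the type parameter T stands in for whatever object holds one engine together with its bindings. A blocking queue hands each item to at most one thread at a time, which is the property the pool relies on.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical fixed-size pool; T stands in for an engine/binding pair.
final class ExclusivePool<T> {
    private final BlockingQueue<T> idle;

    ExclusivePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Borrows one item, runs the task with it, and always puts the item back.
    <R> R withItem(Function<T, R> task) {
        T item;
        try {
            item = idle.take(); // blocks while every item is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a pooled item", e);
        }
        try {
            return task.apply(item);
        } finally {
            idle.add(item); // cannot overflow: we only return what we took
        }
    }

    // Number of items currently not borrowed.
    int idleCount() {
        return idle.size();
    }
}
```

In a servlet, the task passed to withItem would put the request/response into the borrowed pair's bindings and run the script; the finally block returns the pair even if the script throws.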

Related

How to wrap Web Worker response messages in futures?

Please consider a scala.js application which runs in the browser and consists of a main program and a web worker.
The main thread delegates long running operations to the web worker by passing messages that contain the names of methods and the parameters required to invoke them. The worker passes method return values back to the main thread in the form of response messages.
In simpler terms, this program abstracts web worker messaging so that code in the main thread can call methods in the worker thread in idiomatic and asynchronous Scala syntax.
Because web workers do not associate messages with their responses in any way, the abstraction relies on a registry, an intermediary object, that governs each cross context method call to associate the invocation with the result. This singleton could also bind callback functions but is there a way to accomplish this with futures instead of callbacks?
How can I build an abstraction over this registry that allows programmers to use it with the standard asynchronous programming structures in Scala: futures and promises?
How should I write this functionality so that scala programmers can interact with it in the canonical way? For example:
// long running method in the web worker
val f: Future[String] = Registry.ultimateQuestion(42) // async
f onSuccess { case q => println("The ultimate question is: " + q) }
I'm new to futures and promises, but it seems like they usually complete when some execution block terminates. In this case, receiving a response from the web worker signifies completion of the future. Is there a way to write a custom future that delegates its completion status to an external process? Is there another way to link the web worker response message to the status of the future?
Can/Should I extend the Future trait? Is this possible in Scala.js? Is there a concrete class that I should extend? Is there some other way to encapsulate these cross context web worker method calls in existing asynchronous Scala functionality?
Thank you for your consideration.

Hmm. Just spitballing here (I haven't used workers yet), but it seems like associating the request with the Future is fairly easy in the single-threaded JavaScript world you're working in.
Here's a hypothetical design. Say that each request/response to the worker is automatically wrapped in an Envelope; the Envelope contains a RequestId. So the send side looks something like (this is pseudo-code, but real-ish):
def sendRequest[R](msg: Message): Future[R] = {
  val promise = Promise[R]()
  val id = nextRequestId()
  val envelope = Envelope(id, msg)
  register(id, promise)
  sendToWorker(envelope)
  promise.future
}
The worker processes msg, wraps the result in another Envelope, and the result gets handled back in the main thread with something like:
def handleResult(resultEnv: Envelope): Unit = {
  val promise = findRegistered(resultEnv.id)
  val result = resultEnv.msg
  promise.success(result)
}
That needs some filling in, and some thought about what the types like R should be, but that sort of outline would probably work decently well. If this was the JVM you'd have to worry about all sorts of race conditions, but in the single-threaded JS world it probably can be as simple as using an autoincrementing integer for the request ID, and storing away the Promise...
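For comparison, here is the same registry outline as a small self-contained Java sketch, with CompletableFuture playing the role of the Scala Promise/Future pair. It is only an illustration under assumptions: ResponseRegistry, send and onResponse are made-up names, the payload is simplified to a String, and the transport callback is a stand-in for whatever actually posts the envelope to the worker.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

// Hypothetical registry that pairs each outgoing request id with a pending future.
final class ResponseRegistry {
    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final BiConsumer<Long, String> transport; // assumed: delivers (id, payload) to the worker

    ResponseRegistry(BiConsumer<Long, String> transport) {
        this.transport = transport;
    }

    // Registers a future under a fresh id, then hands the envelope to the transport.
    CompletableFuture<String> send(String payload) {
        long id = nextId.incrementAndGet();
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(id, future); // register before sending so a fast reply is not lost
        transport.accept(id, payload);
        return future;
    }

    // Called when a reply envelope arrives; returns false if the id is unknown.
    boolean onResponse(long id, String result) {
        CompletableFuture<String> future = pending.remove(id);
        return future != null && future.complete(result);
    }
}
```

Removing the entry when the reply arrives keeps the map from growing without bound; a fuller version would also want a way to fail futures whose reply never comes.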

How does NodeJS handle multi-core concurrency?

Currently I am working on a database that is updated by another Java application, but I need a NodeJS application to provide a RESTful API for website use. To maximize the performance of the NodeJS application, it is clustered and running on a multi-core processor.
However, from my understanding, a clustered NodeJS application has its own event loop on each CPU core. If so, does that mean that, with the cluster architecture, NodeJS will have to face traditional concurrency issues as in other multi-threaded architectures, for example, writing to the same object without write protection? Or even worse, since it is multiple processes running at the same time, not threads within a process blocked by one another...
I have been searching the Internet, but it seems nobody cares about that at all. Can anyone explain the cluster architecture of NodeJS? Thanks very much.
Add on:
Just to clarify, I am using express. It is not like running multiple instances on different ports; it is actually listening on the same port, but has one process on each CPU competing to handle requests...
The typical problem I am wondering about now is: a request to update Object A based on a given Object B (not yet finished), then another request to update Object A again with a given Object C (which finishes before the first request)... the result would then be based on Object B rather than C, because the first request actually finishes after the second one.
This would not be a problem in a real single-threaded application, because the second request would always be executed after the first...

The core of your question is:
NodeJS will have to face traditional concurrency issues as in other multi-threaded architectures, for example, writing to the same object without write protection?
The answer is that that scenario is usually not possible because node.js processes don't share memory. ObjectA, ObjectB and ObjectC in process A are different from ObjectA, ObjectB and ObjectC in process B. And since each process is single-threaded, contention cannot happen. This is the main reason you find that there are no semaphore or mutex modules shipped with node.js. Also, there are no threading modules shipped with node.js.
This also explains why "nobody cares". Because they assume it can't happen.
The problem with node.js clusters is one of caching. Because ObjectA in process A and ObjectA in process B are completely different objects, they will have completely different data. The traditional solution to this is of course not to store dynamic state in your application but to store them in the database instead (or memcache). It's also possible to implement your own cache/data synchronization scheme in your code if you want. That's how database clusters work after all.
Of course node, being a program written in C++, can be easily extended with native code, and there are modules on npm that implement threads, mutexes and shared memory. If you deliberately choose to go against the node.js/javascript design philosophy then it is your responsibility to ensure nothing goes wrong.
Additional answer:
a request to update Object A based on a given Object B (not yet finished), then another request to update Object A again with a given Object C (which finishes before the first request)... the result would then be based on Object B rather than C, because the first request actually finishes after the second one.
This would not be a problem in a real single-threaded application, because the second request would always be executed after the first...
First of all, let me clear up a misconception you have: that this is not a problem for a real single-threaded application. Here's a single-threaded application in pseudocode:
function main () {
    timeout = FOREVER
    readFd = []
    writeFd = []
    databaseSock1 = socket(DATABASE_IP,DATABASE_PORT)
    send(databaseSock1,UPDATE_OBJECT_B)
    databaseSock2 = socket(DATABASE_IP,DATABASE_PORT)
    send(databaseSock2,UPDATE_OBJECT_C)
    push(readFd,databaseSock1)
    push(readFd,databaseSock2)
    while(1) {
        event = select(readFd,writeFd,timeout)
        if (event) {
            for (i=0; i<length(readFd); i++) {
                if (readable(readFd[i])) {
                    data = read(readFd[i])
                    if (data == OBJECT_B_UPDATED) {
                        update(objectA,objectB)
                    }
                    if (data == OBJECT_C_UPDATED) {
                        update(objectA,objectC)
                    }
                }
            }
        }
    }
}
As you can see, there are no threads in the program above, just asynchronous I/O using the select system call. The program above can easily be translated directly into single-threaded C or Java etc. (indeed, something similar to it is at the core of the javascript event loop).
However, if the response to UPDATE_OBJECT_C arrives before the response to UPDATE_OBJECT_B the final state would be that objectA is updated based on the value of objectB instead of objectC.
No asynchronous single-threaded program is immune to this in any language and node.js is no exception.
Note however that you don't end up in a corrupted state (though you do end up in an unexpected state). Multithreaded programs are worse off because without locks/semaphores/mutexes the call to update(objectA,objectB) can be interrupted by the call to update(objectA,objectC) and objectA will be corrupted. This is what you don't have to worry about in single-threaded apps and you won't have to worry about it in node.js.
If you need strictly sequential updates you still need to either wait for the first update to finish, flag the first update as invalid, or generate an error for the second update. Typically for web apps (like Stack Overflow) an error would be returned (for example if you try to submit a comment while someone else has already updated the comments).
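One common way to "generate an error" for an update that arrives out of order is a version check: every update states which version of the object it was based on, and it is rejected if the object has changed since. The sketch below illustrates that idea in Java; VersionedValue and tryUpdate are hypothetical names, and a real application would normally keep the version in the database rather than in memory.

```java
// Hypothetical holder that only accepts updates based on the current version.
final class VersionedValue<T> {
    private T value;
    private long version; // starts at 0 and increases by one per accepted update

    VersionedValue(T initial) {
        this.value = initial;
    }

    synchronized long version() {
        return version;
    }

    synchronized T value() {
        return value;
    }

    // Applies the update only if the caller saw the latest version.
    synchronized boolean tryUpdate(long expectedVersion, T newValue) {
        if (expectedVersion != version) {
            return false; // someone else updated first; the caller must retry or report an error
        }
        value = newValue;
        version++;
        return true;
    }
}
```

If two requests both read version 0 and the update based on Object C lands first, the later update based on Object B fails its check instead of silently overwriting the newer result.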

Groovy Thread Safety

I am executing Groovy scripts in multi-threaded mode.
The scripts themselves are thread safe.
It looks like this:
//Startup Code. Single threaded
Class<?> scriptClass = getScriptClass(fileName); //utility to get script class from file name
Method method = getMethods(scriptClass); //Utility to get a specific Method
storeMethod(method); //Store method globally.
Object scriptInstance = scriptClass.newInstance();
storeScriptInstance(scriptInstance); //Store script Instance
Multiple threads then execute the following code (without any synchronization):
Object scriptInstance = getScriptInstance(); //Utility to get scriptInstance stored in init
Method method = getMethod(); //Utility for getting method stored in init step
Object obj[] = new Object[] { context }; //context variable available per thread.
method.invoke(scriptInstance,obj);
The script consists of just one function, which is totally thread safe (the function is reentrant and modifies the context variable).
This works in my unit tests with multiple threads, but I couldn't find any material to support this claim.
Question => Is it safe under multi-threaded execution? More generally, is it safe to share the same script instance across threads to execute script functions that are themselves thread safe? Script instances shouldn't have global variables during execution.
Context is an argument to script and not global variable.
Please help.

Without the actual function I cannot tell whether this is thread safe or not. Since you say that the function modifies the context variable, I conclude that you mutate global state. In that case it is not thread safe without synchronization of some kind. If my assumption is wrong and no global state is mutated, then executing a method by reflection is surely not the problem.

Best NHibernate multithreading pattern?

As we know, NHibernate sessions are not thread safe. But we have a code path split in several long running threads, all using objects loaded in the initial thread.
using (var session = factory.OpenSession())
{
    var parent = session.Get<T>(parentId);
    DoSthWithParent(session, parent);
    foreach (var child in parent.children)
    {
        parallelThreadMethodLongRunning.BeginInvoke(session, child);
        //[Thread #1] DoSthWithChild(child #1) -> SaveOrUpdate(child #1) + Flush()
        //[Thread #2] DoSthWithChild(child #2) -> SaveOrUpdate(child #2) + Flush()
        //[Thread #3] DoSthWithChild(child #3) -> SaveOrUpdate(child #3) + Flush()
        // -> etc... changes to be persisted immediately, not all at the end.
        EndInvoke();
    }
    DoFinalChangesOnParentAndChildren(parent);
    session.Flush();
}
One way would be a session for each thread, but that would require the parent object to be reloaded in each. Plus, the final method also makes changes to the children and would run into a StaleObjectException if another session had changed them in the meantime, or they would have to be evicted/reloaded.
So all threads have to use the same session. What is the best way to do this?
Use save queue in initial thread (thread safe implementation), which is polled in a loop (instead of EndInvoke()) from the main thread. Child threads can insert NHibernate objects to be saved by the main thread.
Use some callback mechanism to save/flush objects in main thread. Is there something similar possible to UI thread callback in WPF, Control.Invoke() or BackgroundWorker?
Put Save/Flush accesses into lock(session) blocks? Maybe dangerous, because modifying the NHibernate objects might change the session, even if not doing a Save()/Flush().
Or should I live with the database overhead to load the same objects for separate sessions in each thread, evict and reload them in the main thread and then do changes again? [edit: bad "solution" due to object concurrency/risk of stale objects]
Consider also that the application has a business logic layer above NHibernate, which has similar objects, but sends its property values to the NHibernate objects on its own Save() command, only then modifying them and doing NHibernate Save()/Flush() immediately.
Edit:
It's important to note that any read operation on NHibernate objects may change the session: lazy loading, or children collections changing under certain conditions. So it is really better to have a business object layer on top, which synchronizes all access to NHibernate objects. Considering that the database operations take only a small share of the threads' time (mainly occasional status settings), and most of it goes to calculations, watching, web service access and similar, the performance loss from data layer synchronization is negligible.

Firstly, if I understand correctly, different threads may be updating the same objects. In that case, NHibernate or not, you're performing several updates on the same objects concurrently, which may lead to unexpected results.
You may want to tweak your design a bit to ensure that an object can only be updated by (at most) a single thread.
Now, assuming your flow may include different threads reading the same data (but writing different data), I'd suggest using different sessions, one per thread, and utilizing the 2nd level cache.
The 2nd level cache is kept at the SessionFactory (rather than the Session) level, and is therefore shared by all session instances.

The session object is not thread safe; you can't use it across different threads. The SaveOrUpdate calls in your separate threads will most likely crash your program or corrupt your database. However, what about creating the data set you want to update and doing the SaveOrUpdate actions in your main thread (where your session is created)?
You should observe the following practices when creating NHibernate Sessions:
• Never create more than one concurrent ISession or ITransaction instance per database connection.
• Be extremely careful when creating more than one ISession per database per transaction. The ISession itself keeps track of updates made to loaded objects, so a different ISession might see stale data.
• The ISession is not threadsafe! Never access the same ISession in two concurrent threads. An ISession is usually only a single unit-of-work!
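The "collect the changes and save them in the main thread" idea boils down to a thread-safe queue that workers write to and only the session-owning thread reads from. The sketch below shows that shape in Java rather than C#, purely as an illustration: OwnerThreadQueue, submit and drain are hypothetical names, and the sink passed to drain stands in for the code that would call SaveOrUpdate and Flush on the session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical hand-off queue: many producers, one consuming owner thread.
final class OwnerThreadQueue<T> {
    private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();

    // Safe to call from any worker thread.
    void submit(T item) {
        pending.add(item);
    }

    // Call only from the thread that owns the session.
    // Passes every queued item to the sink and returns how many were handled.
    int drain(Consumer<T> sink) {
        List<T> batch = new ArrayList<>();
        pending.drainTo(batch);
        batch.forEach(sink);
        return batch.size();
    }
}
```

The main thread would call drain in its polling loop, so every write to the session happens on the thread that created it; the workers must still avoid touching session-bound objects in ways that trigger lazy loading.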

What is the difference between +[NSThread detachNewThreadSelector:toTarget:withObject:] and -[NSObject performSelectorInBackground:withObject:]?

They seem to perform a reasonably similar task: launching a new thread that performs that selector quickly and easily. But are there any differences? Maybe with regards to memory management?

Both are identical.
In iOS and Mac OS X v10.5 and later, all objects have the ability to spawn a new thread and use it to execute one of their methods. The performSelectorInBackground:withObject: method creates a new detached thread and uses the specified method as the entry point for the new thread. For example, if you have some object (represented by the variable myObj) and that object has a method called doSomething that you want to run in a background thread, you could use the following code to do that:
[myObj performSelectorInBackground:@selector(doSomething) withObject:nil];
The effect of calling this method is the same as if you called the detachNewThreadSelector:toTarget:withObject: method of NSThread with the current object, selector, and parameter object as parameters. The new thread is spawned immediately using the default configuration and begins running. Inside the selector, you must configure the thread just as you would any thread. For example, you would need to set up an autorelease pool (if you were not using garbage collection) and configure the thread’s run loop if you planned to use it. For information on how to configure new threads

I presume they are the same, as - (void)performSelectorInBackground:(SEL)aSelector withObject:(id)arg; is defined in NSThread.h in the NSObject (NSThreadPerformAdditions) category. That is nothing conclusive, but that is evidence in that direction.
