I am porting Lightweight Communications and Marshalling (LCM) from Julia to Lisp, as it has a better API. I used SWIG to generate the C function calls.
I want to know whether this is a safe way to handle the C pointer. Here is the create function:
(defun create-lcm (&optional (provider (null-pointer)))
  (let* ((ptr (lcm_create provider))
         (addr (cffi:pointer-address ptr)))
    (tg:finalize ptr (lambda () (lcm_destroy (cffi:make-pointer addr))))
    (if (null-pointer-p ptr)
        (error "lcm creation error"))
    (%create-lcm :pointer ptr :provider provider
                 :file-descriptor (lcm_get_fileno ptr))))
Questions:
What is the proper way to finalize a C pointer?
How do I write a test for that?
Any other notes/advice are welcome.
Thanks in advance.
Here are a few things that were wrong:
Attaching the finaliser before checking whether the pointer is null.
I’m not sure you’re allowed to attach a finaliser to a foreign pointer. Maybe you are.
You need to be careful with finalisers and GC. If the finaliser references the object that it finalises, then the object and its finaliser keep each other alive. (They can't be collected together, because a finaliser might store a reference to the object somewhere, and then that object would be alive and so shouldn't have been finalised.)
I don’t know if this is right but it is better:
(defun create-lcm (&optional (provider (null-pointer)))
  (let ((ptr (lcm_create provider)))
    (when (null-pointer-p ptr)
      (error "lcm creation error"))
    (flet ((finaliser () (lcm_destroy ptr)))
      (let ((result (%create-lcm :pointer ptr :provider provider
                                 :file-descriptor (lcm_get_fileno ptr))))
        (tg:finalize result #'finaliser)
        result))))
Here is something that is still wrong:
If there is an error from %create-lcm or lcm_get_fileno, then the finaliser will never run and the lcm handle will leak.
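One way to close that gap (my sketch, reusing the same hypothetical helpers as above) is to destroy the handle from an unwind-protect cleanup unless construction succeeded:

(defun create-lcm (&optional (provider (null-pointer)))
  (let ((ptr (lcm_create provider))
        (done nil))
    (when (null-pointer-p ptr)
      (error "lcm creation error"))
    (unwind-protect
         (flet ((finaliser () (lcm_destroy ptr)))
           (let ((result (%create-lcm :pointer ptr :provider provider
                                      :file-descriptor (lcm_get_fileno ptr))))
             (tg:finalize result #'finaliser)
             (setf done t)
             result))
      ;; If %create-lcm or lcm_get_fileno signalled, free the handle now.
      (unless done (lcm_destroy ptr)))))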
You might want to read about cl-autowrap, which is notably used to wrap SDL 2 in cl-sdl2. That library provides thin wrappers around pointers that already free the memory on finalization.
I also think the recommended way to use finalizers is only as a backstop against leaks, given that you have little control over when and how the cleanup function is executed (e.g. in which thread, in which dynamic environment).
One way to manage memory is to allocate your structures ahead of time and clean them up when you no longer need them (a pool). Or you can define a function or a macro that delimits a scope, so that memory is allocated on entry and freed on exit, using unwind-protect:
(defmacro with-lcm ((context &rest options) &body body)
  (let ((internal (gensym)))
    `(let* ((,internal (create-lcm ,@options))
            (,context ,internal))
       (unwind-protect (progn ,@body)
         (destroy-lcm ,internal)))))
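Usage would then look like this (destroy-lcm and do-something-with are assumed helpers, not shown here):

(with-lcm (lcm)
  ;; lcm is valid inside this scope and destroyed on exit,
  ;; even if the body signals an error.
  (do-something-with lcm))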
Related
In one of Timmy Jose's blog posts at https://z0ltan.wordpress.com/2016/09/02/basic-concurrency-and-parallelism-in-common-lisp-part-3-concurrency-using-bordeaux-and-sbcl-threads/ he gives an example of the wrong way to print to the top level from inside a thread (using Bordeaux Threads as an example, although I am using Lparallel):
(defun print-message-top-level-wrong ()
  (bt:make-thread
   (lambda ()
     (format *standard-output* "Hello from thread!")))
  nil)
(print-message-top-level-wrong) -> NIL
The explanation is that "The same code would have run fine if we had not run it in a separate thread. What happens is that each thread has its own stack where the variables are rebound. In this case, even for *standard-output*, which being a global variable, we would assume should be available to all threads, is rebound inside each thread!"
And this is exactly what happens if the function is run in Allegro CL. However, in SBCL the function does print the intended output to the terminal. Does this mean *standard-output* is not being rebound in SBCL? In general, is there a cross-platform way to print to *standard-output* from inside a thread?
In a multithreaded situation printing to the terminal should normally be coordinated to avoid potentially printing from several streams at the same time. But there don't seem to be any functions like atomic-format or atomic-print available. Is there a straightforward way to avoid printing interference when there are multiple threads (assuming that locks/mutexes are too expensive to use for each individual printing operation)?
If you actually have a global binding (a binding in the global environment), it does work for all threads; see the documentation for bt:make-thread. Only dynamic (re-)bindings are thread-local. Implementations differ in how/when they bind those streams; sometimes the binding that is actually in effect for user programs is global, sometimes not.
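A portable workaround is to capture the stream into a lexical variable before the thread is created; the closure then no longer depends on how the new thread binds *standard-output*. A minimal sketch:

(defun print-message-top-level-fixed ()
  (let ((out *standard-output*))   ; capture the parent thread's stream lexically
    (bt:make-thread
     (lambda ()
       (format out "Hello from thread!"))))
  nil)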
I like to use some sort of queue or channel to coordinate output where necessary; I have not yet run into situations where the locking overhead was prohibitive.
Maybe you could try something with optimistic locking, but I don't know what has been done for that library-wise (some Lisp implementations do have CAS operations that could be used). This should be orthogonal to the parallelism library used.
EDIT: Just found in the SBCL manual: sb-concurrency has lock-free queues and mailboxes.
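For example, here is a sketch of the queue approach built on sb-concurrency (SBCL-only; the function names are mine): one dedicated thread owns the terminal, and everyone else sends it messages through a mailbox, so output is serialized without taking a lock per print.

(require :sb-concurrency)

(defvar *print-mailbox* (sb-concurrency:make-mailbox :name "printer"))

(defun start-printer-thread (&optional (stream *standard-output*))
  ;; STREAM is captured lexically in the parent thread, as above.
  (bt:make-thread
   (lambda ()
     (loop for msg = (sb-concurrency:receive-message *print-mailbox*)
           until (eq msg :quit)
           do (format stream "~a~%" msg)))
   :name "printer"))

(defun safe-print (msg)
  (sb-concurrency:send-message *print-mailbox* msg))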
My question may seem weird but I think I'm facing an issue with volatile objects.
I have written a library implemented like this (just a sketch, not the real content):
(def var1 (volatile! nil))
(def var2 (volatile! 0))

(defn do-things [a]
  (vreset! var1 a)
  (vswap! var2 inc)
  {:a @var1 :b @var2})
So I have global vars: some are initialized from external values, others are computed, and I return their contents.
I used volatiles to get better speed than with atoms, and to avoid redefining a new var for every calculation.
The problem is that this seems to fail in practice, because I map do-things over a collection (in another program) with occasional inner sub-calls to the same function, like this (pseudo-code):
(map
 (fn [x]
   (let [analysis (do-things x)]
     (if blabla
       (do-things (f x))
       analysis)))
 coll)
Will the inner conditional call spawn another thread under the hood? It seems so, because sometimes the calls work and sometimes they don't.
Is there any other way to do this, apart from defining the volatiles inside every do-things body?
EDIT
Actually the error turned out to be something else, but the question still stands: is this an acceptable/safe approach when nothing explicitly invokes multithreading?
There are very few constructs in Clojure that create threads on your behalf - generally Clojure can and will run on one or more threads depending on how you structure your program. pmap is a good example that creates and manages a pool of threads to map in parallel. Another is clojure.core.reducers/fold, which uses a fork/join pool, but really that's about it. In all other cases it's up to you to create and manage threads.
Volatiles should only be used with great care and in circumstances where you control the scope of use such that you are guaranteed not to be competing with threads to read and write the same volatile. Volatiles guarantee that writes can be read on another thread, but they do nothing to guarantee atomicity. For that, you must use either atoms (for uncoordinated) or refs and the STM (for coordinated).
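To illustrate the atom alternative, here is a sketch of do-things rewritten around a single atom (the names follow the question; the shape is mine). The whole update happens in one swap!, so no caller can ever observe :a and :b out of sync:

(def state (atom {:a nil :b 0}))

(defn do-things [a]
  ;; swap! applies the update function atomically and returns the new value,
  ;; which already has the {:a ... :b ...} shape the original returned.
  (swap! state #(-> % (assoc :a a) (update :b inc))))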
I'm new to Clojure and am writing a web application. It includes a function fn performed on a user user-id, which involves several steps of reading from and writing to the database and file system. These steps cannot be performed simultaneously by multiple threads (that would cause database and file-system inconsistencies), and I don't believe they can be wrapped in a database transaction. However, they are specific to one user, so they can be performed simultaneously for different users.
Thus, if an HTTP request is made to perform fn for a specific user-id, I need to make sure it has completed before any other HTTP request can perform fn for that user-id.
I've come up with a solution that seems to work in the REPL, but I have not tried it in the web server yet. However, being inexperienced with Clojure and threaded programming, I'm not sure whether this is a good or safe way to solve the problem. The following code was developed by trial and error and uses the locking function, which seems to go against the "no locks" philosophy of Clojure.
(ns locking.core)

;;; Check if a var representing the lock exists in the namespace.
;;; If not, create it. Creating a new var if one already
;;; exists seems to break the locking.
(defn create-lock-var
  [var-name value]
  (let [var-sym (symbol var-name)]
    (when (nil? (ns-resolve 'locking.core var-sym))
      (intern 'locking.core var-sym value))
    ;; Return the lock var.
    (ns-resolve 'locking.core var-sym)))
;;; Takes an id which represents the lock and a function
;;; which may only run in one thread at a time for a specific id.
(defn lock-function
  [lock-id transaction]
  (let [lock (create-lock-var (str "lock-id-" lock-id) lock-id)]
    (future
      (locking lock
        (transaction)))))
;;; A function to test the locking.
(defn test-transaction
  [transaction-count sleep]
  (dotimes [x transaction-count]
    (Thread/sleep sleep)
    (println "performing operation" x)))
If I open three windows in the REPL and execute these functions, it works:
repl1 > (lock-function 1 #(test-transaction 10 1000)) ; executes immediately
repl2 > (lock-function 1 #(test-transaction 10 1000)) ; waits for repl1 to finish
repl3 > (lock-function 2 #(test-transaction 10 1000)) ; executes immediately because id=2
Is this reliable? Are there better ways to solve the problem?
UPDATE
As pointed out, the creation of the lock variable is not atomic. I've rewritten lock-function and it seems to work (no need for create-lock-var):
(def locks (atom {}))

(defn lock-transaction
  [lock-id transaction]
  (let [lock-key (keyword (str "lock-id-" lock-id))]
    (compare-and-set! locks (dissoc @locks lock-key) (assoc @locks lock-key lock-id))
    (future
      (locking (lock-key @locks)
        (transaction)))))
Note: Renamed the function to lock-transaction, seems more appropriate.
Don't use N vars in a namespace, use an atom wrapped around 1 hash-map mapping N symbols to N locks. This fixes your current race condition, avoids creating a bunch of silly vars, and is easier to write anyway.
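A minimal sketch of that idea (the helper names are mine): atomically ensure a lock object exists for the id, then lock on it. Because swap! is atomic, two threads asking for the same id always receive the same lock object:

(def locks (atom {}))   ; map from lock-id to a plain Object used as a monitor

(defn lock-for [lock-id]
  ;; Create the entry only if it is absent, atomically, then look it up.
  (get (swap! locks update lock-id #(or % (Object.))) lock-id))

(defn lock-transaction [lock-id transaction]
  (future
    (locking (lock-for lock-id)
      (transaction))))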
Since you're making a web app, I have to warn you: even if you do manage to get in-process locking right (which is not easy in itself), it will be for nothing as soon as you deploy your web server on more than one machine (which is almost mandatory if you want your app to be highly available).
So basically, if you want to use locking, you'd better use distributed locking. From this point on, this discussion is not Clojure-specific, since Clojure's concurrency tools won't be especially helpful here.
For distributed locking, you could use something like Zookeeper. If you don't want to set up a whole Zookeeper cluster just for this, maybe you can compromise by using a Redis database (the Carmine library gives you distributed locks out of the box), although last time I heard Redis locking is not 100% reliable.
Now, it seems to me locking is not especially a requirement, and is not the best approach, especially if you're striving for idiomatic Clojure. How about using a queue instead ? Some popular JVM message brokers (such as HornetQ and ActiveMQ) give you Message Grouping, which guarantees that messages of the same group-id will be processed (serially) by the same consumer. All you have to do is have some threads listen to the right queue and set the user-id as the group id for your messages.
HACK: If you don't want to set up a distributed message broker, maybe you can get away with enabling sticky sessions on your load balancer and using such a message broker in-VM.
By the way, don't name your function fn, since fn is already a core Clojure macro :).
I'm somewhat baffled by the following behaviour of the SBCL garbage collector in the REPL. Define two functions:
(defun test-gc ()
  (let ((x (make-array 50000000)))
    (elt x 0)))

(defun add-one (x) (+ 1 x))
Then run
(add-one (test-gc))
I would expect that nothing references the original array anymore. Yet, as (room) reports, the memory is not freed. I would understand if I had run (test-gc) directly; then some reference could have been stuck somewhere in SLIME or in
(list * ** ***)
But what is the case here? Thanks, Andrei.
Update: Some time ago I filed a bug. It was recently confirmed. See:
https://bugs.launchpad.net/sbcl/+bug/936304
Just because nothing references the objects anymore doesn't mean that the memory will be reclaimed. The garbage collector will run some time in the future, and often the only guarantee you get is that it will run before you get an out-of-memory error.
Another thing that may be happening here is that you are looking at the Lisp process's memory usage. When memory is GC'ed, it is generally not returned to the operating system. Instead, the memory is simply marked as free on the heap and can be used for future allocations.
SBCL only...
(gc :full t)
This will forcibly kick off a garbage collection across all generations. I noticed SBCL was holding onto a ton of memory a few days ago and used this to drop the memory down to the "true" usage.
I then wrote an ensure-gc macro to wrap my garbagey computation & experimentation stuff in. I'll paste it in when I get home if I remember... it's nothing fancy.
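Such a macro might look like this (my sketch, not the poster's; SBCL-only because of sb-ext:gc):

(defmacro ensure-gc (&body body)
  "Run BODY, then force a full GC so its transient garbage is reclaimed."
  `(unwind-protect (progn ,@body)
     (sb-ext:gc :full t)))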
Here is a slightly modified version of the canonical broken double-checked locking example from Wikipedia:
class Foo {
    private Helper helper = null;

    public Helper getHelper() {
        if (helper == null) {
            synchronized (this) {
                if (helper == null) {
                    // Create new Helper instance and store reference on
                    // stack so other threads can't see it.
                    Helper myHelper = new Helper();
                    // Atomically publish this instance.
                    atomicSet(helper, myHelper);
                }
            }
        }
        return helper;
    }
}
Does simply making the publication of the newly created Helper instance atomic make this double-checked locking idiom safe, assuming that the underlying atomic-ops library works properly? I realize that in Java one could just use volatile, but even though the example is in pseudo-Java, this is supposed to be a language-agnostic question.
See also:
Double checked locking Article
It entirely depends on the exact memory model of your platform/language.
My rule of thumb: just don't do it. Lock-free (or reduced lock, in this case) programming is hard and shouldn't be attempted unless you're a threading ninja. You should only even contemplate it when you've got profiling proof that you really need it, and in that case you get the absolute best and most recent book on threading for that particular platform and see if it can help you.
I don't think you can answer the question in a language-agnostic fashion without getting away from code completely. It all depends on how synchronized and atomicSet work in your pseudocode.
The answer is language dependent - it comes down to the guarantees provided by atomicSet().
If the construction of myHelper can be reordered to after the atomicSet(), then it doesn't matter how the variable is assigned to the shared state.
i.e.:

// Create new Helper instance and store reference on
// stack so other threads can't see it.
Helper myHelper = new Helper(); // ALLOCATE MEMORY HERE BUT DON'T INITIALISE

// Atomically publish this instance.
atomicSet(helper, myHelper); // ATOMICALLY POINT helper AT THE UNINITIALISED MEMORY

// Another thread gets run at this time and tries to use the helper object.

// AT THE PROGRAM'S LEISURE, INITIALISE the Helper object.
If this is allowed by the language then the double checking will not work.
Using volatile alone would not prevent multiple instantiations; using synchronized, however, will prevent multiple instances from being created. But with your code it is possible that helper is returned before it has been fully set up: thread 'A' instantiates it, but before the setup completes thread 'B' comes along, sees that helper is non-null, and returns it straight away. To fix that problem, remove the first if (helper == null).
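For reference, here is the standard Java 5+ version the question alludes to, using a volatile field (a known-correct pattern under the current Java memory model, shown instead of the pseudocode's atomicSet):

class Foo {
    private volatile Helper helper = null;

    public Helper getHelper() {
        Helper result = helper;          // single volatile read
        if (result == null) {
            synchronized (this) {
                result = helper;         // re-check under the lock
                if (result == null) {
                    // The volatile write publishes a fully constructed object:
                    // the memory model forbids reordering the constructor's
                    // stores past this assignment.
                    helper = result = new Helper();
                }
            }
        }
        return result;
    }
}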
Most likely it is broken, because the problem of a partially constructed object is not addressed.
To all the people worried about a partially constructed object:
As far as I understand, the problem of partially constructed objects is only a problem within constructors. In other words, within a constructor, if an object leaks a reference to itself (including its subclass) or its members, then there are possible issues with partial construction. Otherwise, when a constructor returns, the object is fully constructed.
I think you are confusing partial construction with the different problem of how the compiler optimizes the writes. The compiler can choose to (A) allocate the memory for the new Helper object, (B) write the address to myHelper (the local stack variable), and then (C) invoke the constructor's initialization. Any time after point B and before point C, accessing myHelper would be a problem.
It is this compiler optimization of the writes, not partial construction, that the cited papers are concerned with. In the original single-check lock solution, optimized writes can allow multiple threads to see the member variable between points B and C. This implementation avoids the write-optimization issue by using a local stack variable.
The main scope of the cited papers is to describe the various problems with the double-check lock solution. However, unless the atomicSet method is also synchronizing against the Foo class, this solution is not a double-check lock solution. It is using multiple locks.
I would say this all comes down to the implementation of the atomic assignment function. The function needs to be truly atomic, it needs to guarantee that processor local memory caches are synchronized, and it needs to do all this at a lower cost than simply always synchronizing the getHelper method.
Based on the cited paper, in Java, it is unlikely to meet all these requirements. Also, something that should be very clear from the paper is that Java's memory model changes frequently. It adapts as better understanding of caching, garbage collection, etc. evolve, as well as adapting to changes in the underlying real processor architecture that the VM runs on.
As a rule of thumb, if you optimize your Java code in a way that depends on the underlying implementation, as opposed to the API, you run the risk of having broken code in the next release of the JVM. (Although, sometimes you will have no choice.)
dsimcha:
If your atomicSet method is real, then I would try sending your question to Doug Lea (along with your atomicSet implementation). I have a feeling he's the kind of guy that would answer. I'm guessing that for Java he will tell you that it's cheaper to always synchronize and to look to optimize somewhere else.