Related
The topic of multi-threaded access to Lisp objects came up in another post at https://stackoverflow.com/posts/comments/97440894?noredirect=1, but as a side issue, and I am hoping for further clarification.
In general, Lisp functions (and special forms, macros, etc) seem to naturally divide into accessors and modifiers of objects. Modifiers of shared objects are clearly problematic in multi-threaded applications, since updates occurring at the same time can interfere with each other (requiring protective locks, atomic operations, etc).
But the question of potential accessor interference seems less clear. Of course, any accessor could be written to include latent modifying code, but I would like to think that the basic Lisp accessor operations (as specified in CLHS and implemented for the various platforms) do not. However, I suspect there could be a very few exceptions for reasons of efficiency—exceptions that would be good to be aware of if otherwise used in multi-threaded code without protection. (The kinds of exceptions I'm talking about are not operations like maphash, which can be used as both an accessor and a modifier.)
It would be helpful if anyone with implementation experience could point to at least one built-in access-only operation (say in SBCL or other source) that includes potentially troublesome modification. I know guarantees are hard to come by, but heuristic guidance is useful too.
Any accessor that modified objects behind the scenes like that would be a bug in an implementation that supports multithreading. SBCL protects functions that are not thread-safe with the famous *world-lock*.
If you have a real reason to want an immutable structure, use defconstant with a read-only defstruct, for example:

(defstruct frozen              ; "number" itself names a built-in CL type, so use another name
  (value 0 :read-only t))      ; a slot descriptor takes an initform before :read-only

(defconstant +five+ (make-frozen 5))
Dependent types are often advertised as a way to assert that a program is correct up to a specification. So, for example, you are asked to write code that sorts a list; you can prove that the code is correct by encoding the notion of "sort" as a type and writing a function such as List a -> SortedList a. But how do you prove that the specification, SortedList, is correct? Wouldn't it be the case that the more complex your specification is, the more likely it is that your encoding of that specification as a type is incorrect?
This is the static, type-system version of, How do you tell that your tests are correct?
The only answer I can honestly give is, yes, the more complex and unwieldy your specification, the more likely you are to have made a mistake. You can mess up in writing something in a type theoretic formalism just as well as you can in formalizing the description of your program as an executable function.
The hope is that your specification is simple and small enough to judge by examination, while your implementation of that might be far larger. It helps that, once you have some "seed" ideas formalized, you can show that the ideas derived from these are correct. From that point of view, the more readily you can mechanically and provably derive parts of your specification from simpler parts, and ultimately derive your implementation from your specification, the more likely you are to get a correct implementation.
But it can be unclear how to formalize something, which has the effect that either you might make a mistake in translating your ideas into the formalism – you might think you proved one thing, when actually you proved another – or you might find yourself doing type theory research in order to formalize an idea.
This is a problem with any specification language (even English), not just dependent types. Your own post is a good example: it contains an informal specification of "sort function" that only requires the result to be sorted, which is not what you want (\xs -> [] would qualify). See e.g. this post from Twan van Laarhoven's blog.
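To make that concrete, an executable specification of sorting has to demand both sortedness and that the output is a permutation of the input. A minimal Haskell sketch (the helper names here are mine, for illustration):

import Data.List (delete, sort)

-- Sortedness alone: note that (\xs -> []) passes this check.
isSorted :: Ord a => [a] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))

-- The missing half of the spec: the output must be a permutation of the input.
isPermutationOf :: Eq a => [a] -> [a] -> Bool
isPermutationOf [] [] = True
isPermutationOf (x:xs) ys = x `elem` ys && isPermutationOf xs (delete x ys)
isPermutationOf _ _ = False

-- A sort is correct only if it satisfies both halves.
prop_sortSpec :: [Int] -> Bool
prop_sortSpec xs = isSorted (sort xs) && sort xs `isPermutationOf` xs

Under the sorted-only spec, the constant-empty-list "sort" verifies; only the permutation clause rules it out.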
I think it's the other way around: a well-typed program can't prove nonsense (assuming the system is consistent), while specifications can be inconsistent or just silly. So it's not "how do I make sure this piece of code reflects my platonic ideas?", but rather "how do I make sure my ideas meaningfully project onto a well-founded plane of pure syntactic rules?". How do you make sure the bird you see is a mockingbird [for some supplied notion of mockingbirdness]? Well, study birds and raise your chances of being right. But as always with humans, you can't be 100% sure.
Type theory is a way to mitigate the imperfection of the human mind by introducing formal rules, machine-checked proofs, and other machinery that lets you focus and thus simplify problems a lot (as Brouwer said: "Mathematics is nothing more, nothing less, than the exact part of our thinking"), but you can't expect any tool to make your thoughts "right", because there is just no uniform notion of rightness. IOW, there is no way to formally connect the informal and the formal: being informal is like being inside the IO monad — there is no escape.
So it's not "does this syntax reflects my very precise semantics?", but rather "can I attach my raw semantics to this strongly structured syntax?". Programs are proper material objects, while ideas are cumbersome approximations, that can become proper material objects only by convention. So we form some basis using conventions, and then we just trust it, because it's much more sensible to trust to a small subset of all your numerous ideas than to all of them.
One thing formal methods can do that I don't think others have touched on is help relate simple things to more complex ones. You may not know for sure how to specify exactly how a Set data structure should behave, but if you can write a simple version based on sorted lists, you can then prove that your fancy version based on balanced search trees relates to it correctly through the toList function. That is, you can use formal methods to transfer your confidence in sorted lists to balanced search trees.
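As a hedged sketch of that idea in Haskell (Data.Set playing the fancy implementation, a sorted duplicate-free list playing the model; the names are mine):

import Data.List (insert, sort)
import qualified Data.Set as Set

-- The simple, easy-to-judge model: a sorted list without duplicates.
modelFromList :: Ord a => [a] -> [a]
modelFromList = dedup . sort
  where
    dedup (a:b:rest) | a == b = dedup (b:rest)
    dedup (a:rest)            = a : dedup rest
    dedup []                  = []

modelInsert :: Ord a => a -> [a] -> [a]
modelInsert x xs
  | x `elem` xs = xs
  | otherwise   = insert x xs  -- Data.List.insert keeps the list sorted

-- The transfer property: toList relates the balanced tree to the model.
prop_insertAgrees :: Int -> [Int] -> Bool
prop_insertAgrees x xs =
  Set.toList (Set.insert x (Set.fromList xs)) == modelInsert x (modelFromList xs)

If a property like this holds for every operation, the confidence you have in the sorted-list model transfers to the tree implementation.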
How do you prove that math is correct? That is, how do you prove that integer addition is the correct way to count apples, or how do you prove that real addition is the correct way to add weights? There is always an interface between the formal / mathematical and the informal / real. It requires skill and mathematical / physical taste to find the appropriate formalism for solving a particular problem. Formal methods won't eliminate that.
The value of formal methods is twofold:
You're not going to know if your program is correct, unless you know what properties it actually satisfies. Before you know if your sort routine is "correct", you first have to know what it actually does. Any procedure for finding that out is going to be a formal method (even unit testing!), so people who reject "formal methods" are really just limiting themselves to a tiny subset of the available methods.
Even when you know how to find out what a program actually does, people make mistakes in their mathematical reasoning (we are not rational creatures, whatever certain ideologies may claim); so it's helpful to have a machine check us. This is the same reason we use unit tests --- it's nice to run a desk check and make sure the program does what we want; but having the computer do the check and tell us whether the result is correct helps prevent mistakes. Letting the computer check our proofs about the behavior of the program is helpful for exactly the same reason.
Coming late to the party, but AFAICT, no one has yet mentioned another important aspect: in the context of program verification, having a bug in the spec is not always too terrible, because you can use the code to check the spec.
IOW, the proof doesn't say "the code is right", but "the code and the spec are mutually consistent". So, in order for a bug in the spec to go unnoticed, it has to be one of:
an underspecified spec.
a bug in the spec matched by a corresponding bug in the code.
As someone else pointed out: the problem is the same for tests.
Suppose your function is not a top-level one, but is used by somebody else as part of some module that also has a correctness proof. That proof must use the correctness proof of your function, and if yours is wrong, the module will not compile. The module itself can still have mistakes, but that is no longer your problem.
I hear a lot about functional languages, and how they scale well because there is no state around a function; and therefore that function can be massively parallelized.
However, this makes little sense to me because almost all real-world practical programs need/have state to take care of. I also find it interesting that most major scaling libraries, e.g. MapReduce, are typically written in imperative languages like C or C++.
I'd like to hear from the functional camp where this hype I'm hearing is coming from.
It's important to add one word: "there's no shared state".
Any meaningful program (in any language) changes the state of the world. But (some) functional languages make it impossible to access the same resource from multiple threads simultaneously. The absence of shared state makes multithreading safe.
Functional languages such as Haskell, Scheme and others have what are called "pure functions". A pure function is a function with no side effects. It doesn't modify any other state in the program. This is by definition threadsafe.
Of course you can write pure functions in imperative languages. You also find multi-paradigm languages like Python, Ruby and even C# where you can do imperative programming, functional programming or both.
But the point of Haskell (etc.) is that you can't write an impure function. Well, that's not strictly true, but it's mostly true.
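For instance, in Haskell the types make the distinction visible; a minimal sketch:

-- Pure: the result depends only on the argument, so any thread may call it freely.
square :: Int -> Int
square x = x * x

-- Impure: the IO in the type records that this touches the outside world.
logSquare :: Int -> IO Int
logSquare x = do
  putStrLn ("squaring " ++ show x)  -- side effect: console output
  return (x * x)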
Similarly, many imperative languages have immutable objects for much the same reason. An immutable object is one whose state doesn't change once created. Again by definition an immutable object is threadsafe.
You're talking about two different things and don't realize it.
Yes, most real-world programs have state somewhere, but if you want to do multithreading, that state should not be everywhere, and in fact, the fewer places it's in, the better. In functional programs, the default is not to have state, and you can introduce state exactly where you need it and nowhere else. The parts that deal with state will not be as easily multithreaded, but since the rest of your program is free of side effects, it doesn't matter what order those parts are executed in, which removes a huge barrier to parallelization.
"However, this makes little sense to me because almost all real-world practical programs need/have state to take care of."
You'd be surprised! Yes, all programs need some state (I/O in particular) but often you don't need much more. Just because most programs have heaps of state doesn't mean they need it.
Programming in a functional language encourages you to use less state, and thus your programs become easier to parallelise.
Many functional languages are "impure", which means they allow some state. Haskell doesn't, but Haskell has monads, which basically let you get something from nothing: you get state using stateless constructs. Monads are a bit fiddly to work with, which is why Haskell gives you a strong incentive to restrict state to as small a part of your program as possible.
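As a small illustration of getting "state using stateless constructs" (a sketch using the State monad from the standard mtl/transformers packages):

import Control.Monad.State

-- Reads like mutation, but State Int a is morally just Int -> (a, Int).
tick :: State Int Int
tick = do
  n <- get       -- read the "current" counter
  put (n + 1)    -- "update" it by producing a new value
  return n

demo :: (Int, Int)
demo = runState (tick >> tick) 0  -- (1, 2): second tick's result, final state

No cell is ever mutated; the counter is threaded through as an ordinary value.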
"I also find it interesting that most major scaling libraries, e.g. MapReduce, are typically written in imperative languages like C or C++."
Programming concurrent applications is "hard" in C/C++. That's why it's best to do all the dangerous stuff in a library which is heavily tested and inspected. But you still get the flexibility and performance of C/C++.
Higher order functions. Consider a simple reduction operation, summing the elements of an array. In an imperative language, programmers typically write themselves a loop and perform reductions one element at a time.
But that code isn't easy to make multi-threaded. When you write a loop you're assuming an order of operations, and you have to spell out how to get from one element to the next. You'd really like to just say "sum the array" and have the compiler, or runtime, or whatever, decide how to work through the array, dividing up the task as necessary between multiple cores and combining the results together.

So instead of writing a loop with some addition code embedded inside it, an alternative is to pass something representing "addition" into a function that can do the divvying. As soon as you do that, you're writing functionally: you're passing a function (addition) into another function (the reducer). Writing this way not only makes for more readable code, but when you change architecture, or want to write for a heterogeneous architecture, you don't have to change the summer, just the reducer. In practice you might have many different algorithms that all share one reducer, so this is a big payoff.
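A hedged Haskell sketch of that division of labor (chunk size and names are my arbitrary choices; parMap comes from the parallel package):

import Control.Parallel.Strategies (parMap, rdeepseq)
import Data.List (foldl')

-- The sequential reducer: "addition" is just an argument to the fold.
sumSeq :: [Int] -> Int
sumSeq = foldl' (+) 0

-- The parallel reducer: split into chunks, reduce each chunk in parallel,
-- then combine. The caller never spells out a loop or an order of operations.
sumPar :: [Int] -> Int
sumPar xs = sumSeq (parMap rdeepseq sumSeq (chunks 1024 xs))
  where
    chunks _ [] = []
    chunks n ys = let (h, t) = splitAt n ys in h : chunks n t

Because (+) is associative, the chunked reduction computes the same sum as the sequential loop, however the work is divided.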
This is just a simple example. You may want to build on this: functions to apply other functions to 2D arrays, functions to apply functions to tree structures, functions to combine functions to apply functions (e.g. if you have a hierarchical structure with trees above and arrays below), and so on.
So I'm currently working on a new programming language. Inspired by ideas from concurrent programming and Haskell, one of the primary goals of the language is management of side effects. More or less, each module will be required to specify which side effects it allows. So, if I were making a game, the graphics module would have no ability to do IO. The input module would have no ability to draw to the screen. The AI module would be required to be totally pure. Scripts and plugins for the game would have access to a very restricted subset of IO for reading configuration files. Et cetera.
However, what constitutes a side effect isn't clear cut. I'm looking for any thoughts or suggestions on the subject that I might want to consider in my language. Here are my current thoughts.
Some side effects are blatant. Whether it's printing to the user's console or launching your missiles, any action that reads from or writes to a user-owned file or interacts with external hardware is a side effect.
Others are more subtle and these are the ones I'm really interested in. These would be things like getting a random number, getting the system time, sleeping a thread, implementing software transactional memory, or even something very fundamental such as allocating memory.
Unlike other languages built to control side effects (looking at you Haskell), I want to design my language to be pragmatic and practical. The restrictions on side effects should serve two purposes:
To aid in the separation of concerns. (No one module can do everything.)
To sandbox each module in the application. (Any module could be used as a plugin.)
With that in mind, how should I handle "pseudo"-side effects, like random numbers and sleeping, as I mention above? What else might I have missed? In what ways might I manage memory usage and time as resources?
The problem of how to describe and control effects is currently occupying some of the best scientific minds in programming languages, including people like Greg Morrisett of Harvard University. To my knowledge, the most ambitious pioneering work in this area was done by David Gifford and Pierre Jouvelot in the FX programming language started in 1987. The language definition is online, but you may get more insight into the ideas by reading their 1991 POPL paper.
This is a really interesting question, and it represents one of the stages I've gone through and, frankly, moved beyond.
I remember seminars in which Carl Hewitt, in talking about his Actors formalism, discussed this. He framed it in terms of whether a method gives a response that is solely a function of its arguments, or whether it can give different answers at different times.
I say I moved beyond this because it makes the language itself (or the computational model) the main subject, as opposed to the problem(s) it is supposed to solve. It is based on the idea that the language should have a formal underlying model so that its properties are easy to verify. That is fine, but still remains a distant goal, because there is still no language (to my knowledge) in which the correctness of something as simple as bubble sort is easy to prove, let alone more complex systems.
The above is a fine goal, but the direction I went was to look at information systems in terms of information theory. Specifically, assuming a system starts with a corpus of requirements (on paper or in somebody's head), those requirements can be transmitted to a program-writing machine (whether automatic or human) to generate source code for a working implementation. THEN, as changes occur to the requirements, the changes are processed through as delta changes to the implementation source code.
Then the question is: What properties of the source code (and the language it is encoded in) facilitate this process? Clearly it depends on the type of problem being solved, what kinds of information go in and out (and when), how long the information has to be retained, and what kind of processing needs to be done on it. From this one can determine the formal level of the language needed for that problem.
I realized the process of cranking through delta changes of requirements to source code is made easier as the format of the code comes more to resemble the requirements, and there is a nice quantitative way to measure this resemblance, not in terms of superficial resemblance, but in terms of editing actions. The well-known technology that best expresses this is domain specific languages (DSL). So I came to realize that what I look for most in a general-purpose language is the ability to create special-purpose languages.
Depending on the application, such special-purpose languages may or may not need specific formal features like functional notation, side-effect control, parallelism, etc. In fact, there are many ways to make a special-purpose language, from parsing, interpreting, compiling, down to just macros in an existing language, down to simply defining classes, variables, and methods in an existing language. As soon as you declare a variable or subroutine you've created new vocabulary and thus a new language in which to solve your problem. In fact, in this broad sense, I don't think you can solve any programming problem without being, at some level, a language designer.
So best of luck, and I hope it opens up new vistas for you.
A side effect is having any effect on anything in the world other than returning a value, i.e. mutating something that could be visible in some way outside the function.
A pure function neither depends on nor affects any mutable state outside the scope of that invocation of the function, which means that the function's output depends only on constants and its inputs. This implies that if you call a function twice with the same arguments, you are guaranteed to get the same result both times, regardless of how the function is written.
If you have a function that modifies a variable that it has been passed, that modification is a side effect because it's visible output from the function other than the return value. A void function that is not a no-op must have side effects, because it has no other way of affecting the world.
The function could have a private variable only visible to that function that it reads and modifies, and calling it would still have the side effect of changing the way the function behaves in the future. Being pure means having exactly one channel for output of any kind: the return value.
It is possible to generate random numbers purely, but you have to pass the random seed around manually. Most random functions keep a private seed value that is updated each time it's called so that you get a different random number each time. Here's a Haskell snippet using System.Random:
import System.Random

randomColor :: StdGen -> (Color, Int, StdGen)
randomColor gen1 = (color, intensity, gen3)   -- return the *final* generator
  where
    (color, gen2)     = random gen1           -- assumes Color has a Random instance
    (intensity, gen3) = randomR (1, 100) gen2
The random functions each return the randomized value and a new generator with a new seed (based on the previous one). To get a new value each time, the chain of new generators (gen1, gen2, gen3) has to be passed along. Implicit generators just use an internal variable to store the gen1.. values in the background.
Doing this manually is a pain, and in Haskell you can use a state monad to make it a lot easier. You'll want to implement something less pure or use a facility like monads, arrows or uniqueness values to abstract it away.
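For example, here is a hedged sketch of the state-monad version (State from mtl/transformers; as in the snippet above, Color is assumed to have a Random instance):

import Control.Monad.State
import System.Random

type Rand a = State StdGen a

-- The generator chain is threaded behind the scenes by the State monad.
randomColorM :: Rand (Color, Int)
randomColorM = do
  color     <- state random             -- state :: (s -> (a, s)) -> State s a
  intensity <- state (randomR (1, 100))
  return (color, intensity)

Each call to state consumes the current generator and installs the new one, so no genN variables appear in the code.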
Getting the system time is impure because the time could be different each time you ask.
Sleeping is fuzzier because sleep doesn't affect the result of the function, and you could always delay execution with a busy loop, and that wouldn't affect purity. The thing is that sleeping is done for the sake of something else, which IS a side effect.
Memory allocation in pure languages has to happen implicitly, because explicitly allocating and freeing memory are side effects if you can do any kind of pointer comparisons. Otherwise, creating two new objects with the same parameters would still produce different values because they would have different identities (e.g. not be equal by Java's == operator).
I know I've rambled on a bit, but hopefully that explains what side effects are.
Give a serious look to Clojure and its use of software transactional memory, agents, and atoms to keep side effects under control.
Question: How can I make sure my application is thread-safe? Are there any common practices, testing methods, things to avoid, things to look for?
Background: I'm currently developing a server application that performs a number of background tasks in different threads and communicates with clients using Indy (using another bunch of automatically generated threads for the communication). Since the application should be highly available, a program crash is a very bad thing, and I want to make sure that the application is thread-safe. No matter what, from time to time I discover a piece of code that throws an exception that never occurred before, and in most cases I realize that it is some kind of synchronization bug, where I forgot to synchronize my objects properly. Hence my question concerning best practices, testing of thread-safety, and things like that.
mghie: Thanks for the answer! I should perhaps be a little bit more precise. Just to be clear, I know about the principles of multithreading, I use synchronization (monitors) throughout my program and I know how to differentiate threading problems from other implementation problems. But nevertheless, I keep forgetting to add proper synchronization from time to time. Just to give an example, I used the RTL sort function in my code. Looked something like
FKeyList.Sort (CompareKeysFunc);
Turns out that I had to synchronize FKeyList while sorting. It just didn't come to mind when initially writing that simple line of code. It's these things I want to talk about. What are the places where one easily forgets to add synchronization code? How do YOU make sure that you added sync code in all important places?
You can't really test for thread-safety. All you can do is show that your code isn't thread-safe, but if you know how to do that you already know what to do in your program to fix that particular bug. It's the bugs you don't know about that are the problem, and how would you write tests for those? Apart from that, threading problems are much harder to find than other problems, as the act of debugging can already alter the behaviour of the program. Things will differ from one program run to the next, from one machine to the other: number of CPUs and CPU cores, number and kind of programs running in parallel, exact order and timing of stuff happening in the program - all of this and much more will have an influence on the program's behaviour. [I actually wanted to add the phase of the moon and stuff like that to this list, but you get my meaning.]
My advice is to stop seeing this as an implementation problem, and start to look at this as a program design problem. You need to learn and read all that you can find about multi-threading, whether it is written for Delphi or not. In the end you need to understand the underlying principles and apply them properly in your programming. Primitives like critical sections, mutexes, conditions and threads are something the OS provides, and most languages only wrap them in their libraries (this ignores things like green threads as provided by for example Erlang, but it's a good point of view to start out from).
I'd say start with the Wikipedia article on threads and work your way through the linked articles. I have started with the book "Win32 Multithreaded Programming" by Aaron Cohen and Mike Woodring - it is out of print, but maybe you can find something similar.
Edit: Let me briefly follow up on your edited question. All access to data that is not read-only needs to be properly synchronized to be thread-safe, and sorting a list is not a read-only operation. So obviously one would need to add synchronization around all accesses to the list.
But with more and more cores in a system constant locking will limit the amount of work that can be done, so it is a good idea to look for a different way to design your program. One idea is to introduce as much read-only data as possible into your program - locking is no longer necessary, as all access is read-only.
I have found interfaces to be a very valuable aid in designing multi-threaded programs. Interfaces can be implemented to have only methods for read-only access to the internal data, and if you stick to them you can be quite sure that a lot of the potential programming errors do not occur. You can freely share them between threads, and the thread-safe reference counting will make sure that the implementing objects are properly freed when the last reference to them goes out of scope or is assigned another value.
What you do is create objects that descend from TInterfacedObject. They implement one or more interfaces which all provide only read-only access to the internals of the object, but they can also provide public methods that mutate the object state. When you create the object you keep both a variable of the object type and an interface pointer variable; that way lifetime management is easy, because the object will be deleted automatically when an exception occurs.

You use the variable pointing to the object to call all methods necessary to properly set up the object. This mutates the internal state, but since this happens only in the active thread there is no potential for conflict. Once the object is properly set up you return the interface pointer to the calling code, and since there is no way to access the object afterwards except by going through the interface pointer, you can be sure that only read-only access can be performed. By using this technique you can completely remove the locking inside of the object.
What if you need to change the state of the object? You don't; you create a new one by copying the data from the interface, mutate the internal state of the new object, and finally return the interface pointer to the new object.
By using this you will only need locking where you get or set such interfaces. It can even be done without locking, by using the atomic interchange functions. See this blog post by Primoz Gabrijelcic for a similar use case where an interface pointer is set.
Simple: don't use shared data. Every time you access shared data you risk running into a problem (if you forget to synchronize access). Even worse, each time you access shared data you risk blocking other threads, which will hurt your parallelization.
I know this advice is not always applicable. Still, it doesn't hurt if you try to follow it as much as possible.
EDIT: Longer response to Smasher's comment. Would not fit in a comment :(
You are totally correct. That's why I like to keep a shadow copy of the main data in a read-only thread. I add a version number to the structure (one 4-aligned DWORD) and increment this version in the (lock-protected) data writer. The data reader compares the global and private versions (which can be done without locking), and only if they differ does it lock the structure, duplicate it into local storage, update the local version, and unlock. Then it accesses the local copy of the structure. Works great if reading is the primary way to access the structure.
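For illustration only, here is a rough sketch of that versioned shadow-copy scheme, written in Haskell rather than Delphi (all names are mine; since Haskell values are immutable, "duplicating" the structure is just retaining the value):

import Control.Concurrent.MVar
import Data.IORef

-- Shared side: a lock-protected master copy plus a version counter
-- that readers may inspect without taking the lock.
data Shared a = Shared { sVersion :: IORef Int, sData :: MVar a }

-- Each reader keeps a private shadow copy tagged with the version it saw.
data Shadow a = Shadow { shVersion :: Int, shCopy :: a }

-- Writer: mutate under the lock and bump the version.
writeShared :: Shared a -> (a -> a) -> IO ()
writeShared s f = modifyMVar_ (sData s) $ \x -> do
  modifyIORef' (sVersion s) (+ 1)
  return (f x)

-- Reader: unlocked version check; refresh the shadow only on a mismatch.
refresh :: Shared a -> Shadow a -> IO (Shadow a)
refresh s sh = do
  v <- readIORef (sVersion s)
  if v == shVersion sh
    then return sh                        -- fast path: no lock taken
    else modifyMVar (sData s) $ \x -> do  -- slow path: lock, copy, retag
           v' <- readIORef (sVersion s)
           return (x, Shadow v' x)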
I'll second mghie's advice: thread safety is designed in. Read about it anywhere you can.
For a really low level look at how it is implemented, look for a book on the internals of a real time operating system kernel. A good example is MicroC/OS-II: The Real Time Kernel by Jean J. Labrosse, which contains the complete annotated source code to a working kernel along with discussions of why things are done the way they are.
Edit: In light of the improved question focusing on using a RTL function...
Any object that can be seen by more than one thread is a potential synchronization issue. A thread-safe object would follow a consistent pattern in every method's implementation of locking "enough" of the object's state for the duration of the method, or perhaps, narrowed to just "long enough". It is certainly the case that any read-modify-write sequence to any part of an object's state must be done atomically with respect to other threads.
The art lies in figuring out how to get useful work done without either deadlocking or creating an execution bottleneck.
As for finding such problems, testing won't be any guarantee. A problem that shows up in testing can be fixed. But it is extremely difficult to write either unit tests or regression tests for thread safety... so faced with a body of existing code your likely recourse is constant code review until the practice of thread safety becomes second nature.
As folks have mentioned and I think you know, being certain, in general, that your code is thread safe is impossible (I believe provably impossible but I would have to track down the theorem). Naturally, you want to make things easier than that.
What I try to do is:
Use a known pattern of multithreaded design: a thread pool, the actor model paradigm, the command pattern, or some such approach. This way, the synchronization process happens in the same way, in a uniform way, throughout the application.
Limit and concentrate the points of synchronization. Write your code so you need synchronization in as few places as possible, and keep the synchronization code in one or a few places in the code.
Write the synchronization code so that the logical relation between the values is clear both on entering and on exiting the guard. I use lots of asserts for this (your environment may limit this).
Don't ever access shared variables without guards/synchronization. Be very clear what your shared data is. (I've heard there are paradigms for guardless multithreaded programming but that would require even more research).
Write your code as cleanly, clearly and DRY-ly as possible.
My simple answer, combined with those answers, is:

Create your application/program in a thread-safe manner.

Avoid using public static variables everywhere.

This usually becomes a habit fairly easily, but it takes some time to get used to:

Program your logic (not the UI) in a functional language such as F#, or even Scheme or Haskell. Functional programming promotes thread-safe practice, and it also pushes you to always code toward purity.

If you use F#, there's also a clear distinction between mutable and immutable variables and objects.

Since functions are first-class citizens in F# and Haskell, the code you write will also be more disciplined, with less mutable state.

Using the lazy evaluation style usually found in these functional languages, you can be sure that your program is safe from side effects, and you'll also realize that if your code needs effects, you have to define them explicitly. If side effects are taken into consideration, your code will be ready to take advantage of composability between components and of multicore programming.