Related
My Optaplanner version is 7.24.0
Can I set the thread number like bellows?
Construct Heuristics : single thread
Local Search: multi thread (for example 4)
Because, some data shows different result in construct heuristics.
I found that the data is affected by thread number.
(It shows always the same result every time in single thread.)
Best regards
It is not possible to use different move thread count in CH and LS. If you need that, you will have to run the solver twice - first just the CH, then just the LS.
That said, some comments:
What you're describing sounds like a bug. Either a bug in OptaPlanner, or a bug in your domain. Construction Heuristics is a deterministic algorithm.
Are you really using OptaPlanner 7.24? That release is, by now, nearly 4 years old. You are missing out on so many performance improvements and bugfixes, and I don't even remember anymore if the behavior you are describing was something we've seen or fixed since. I strongly recommend you upgrade to the latest version; you'll be surprised by how easy it is, and by how much you gain.
For comparison of different libraries with the same functionality, we compare their execution time. This works great. However, there are v8 flags that impact execution time and skew results.
Some flags that are relevant are: --predictable, --always-opt, --no-opt, --minimal.
Question: Which v8 flags should typically be set for running a meaningful benchmarks? What are the tradeoffs?
Edit: The problem is that a benchmark typically runs the same code over and over to get a good average. This might lead to v8 optimizing code, which it would typically not optimize.
V8 developer here. You should definitely run benchmarks with the default configuration. It is the responsibility of the benchmark to be realistic. An unrealistic benchmark cannot be made meaningful with engine flags. (And yes, there are many many unrealistic and/or otherwise meaningless snippets of code out there that people call "benchmarks". Remember, if you can't measure a difference with a realistic benchmark, then any unmeasurable difference that might exist is irrelevant.)
In particular:
--predictable
Absolutely not. Detrimental to performance. Changes behavior in unrealistic ways. Meant for debugging certain things, and for helping fuzzers find reproducible test cases (at the expense of being somewhat unrealistic), not for anything related to performance testing.
--always-opt
Absolutely not. Contrary to what a naive reader of this flag's name might think, this does not improve performance, on the contrary; it mostly causes V8 to waste a bunch of CPU cycles on useless work. This flag is barely ever useful at all; it can sometimes flush out weird corner case bugs in the compilation pipeline, but most of the time it just creates pointless work for V8 developers by creating artificial situations that never occur in practice.
--no-opt
Absolutely not. Turns off all optimizations. Totally unrealistic.
--minimal
That's not a V8 flag I've ever heard of. So yeah, sure, pass it along, it won't do anything (beyond printing an "unknown flag" warning), so at least it won't break anything.
Using default flags seems like the best way to me, since that's what most people will use.
My company currently runs a third-party simulation program (natural catastrophe risk modeling) that sucks up gigabytes of data off a disk and then crunches for several days to produce results. I will soon be asked to rewrite this as a multi-threaded app so that it runs in hours instead of days. I expect to have about 6 months to complete the conversion and will be working solo.
We have a 24-proc box to run this. I will have access to the source of the original program (written in C++ I think), but at this point I know very little about how it's designed.
I need advice on how to tackle this. I'm an experienced programmer (~ 30 years, currently working in C# 3.5) but have no multi-processor/multi-threaded experience. I'm willing and eager to learn a new language if appropriate. I'm looking for recommendations on languages, learning resources, books, architectural guidelines. etc.
Requirements: Windows OS. A commercial grade compiler with lots of support and good learning resources available. There is no need for a fancy GUI - it will probably run from a config file and put results into a SQL Server database.
Edit: The current app is C++ but I will almost certainly not be using that language for the re-write. I removed the C++ tag that someone added.
Numerical process simulations are typically run over a single discretised problem grid (for example, the surface of the Earth or clouds of gas and dust), which usually rules out simple task farming or concurrency approaches. This is because a grid divided over a set of processors representing an area of physical space is not a set of independent tasks. The grid cells at the edge of each subgrid need to be updated based on the values of grid cells stored on other processors, which are adjacent in logical space.
In high-performance computing, simulations are typically parallelised using either MPI or OpenMP. MPI is a message passing library with bindings for many languages, including C, C++, Fortran, Python, and C#. OpenMP is an API for shared-memory multiprocessing. In general, MPI is more difficult to code than OpenMP, and is much more invasive, but is also much more flexible. OpenMP requires a memory area shared between processors, so is not suited to many architectures. Hybrid schemes are also possible.
This type of programming has its own special challenges. As well as race conditions, deadlocks, livelocks, and all the other joys of concurrent programming, you need to consider the topology of your processor grid - how you choose to split your logical grid across your physical processors. This is important because your parallel speedup is a function of the amount of communication between your processors, which itself is a function of the total edge length of your decomposed grid. As you add more processors, this surface area increases, increasing the amount of communication overhead. Increasing the granularity will eventually become prohibitive.
The other important consideration is the proportion of the code which can be parallelised. Amdahl's law then dictates the maximum theoretically attainable speedup. You should be able to estimate this before you start writing any code.
Both of these facts will conspire to limit the maximum number of processors you can run on. The sweet spot may be considerably lower than you think.
I recommend the book High Performance Computing, if you can get hold of it. In particular, the chapter on performance benchmarking and tuning is priceless.
An excellent online overview of parallel computing, which covers the major issues, is this introduction from Lawerence Livermore National Laboratory.
Your biggest problem in a multithreaded project is that too much state is visible across threads - it is too easy to write code that reads / mutates data in an unsafe manner, especially in a multiprocessor environment where issues such as cache coherency, weakly consistent memory etc might come into play.
Debugging race conditions is distinctly unpleasant.
Approach your design as you would if, say, you were considering distributing your work across multiple machines on a network: that is, identify what tasks can happen in parallel, what the inputs to each task are, what the outputs of each task are, and what tasks must complete before a given task can begin. The point of the exercise is to ensure that each place where data becomes visible to another thread, and each place where a new thread is spawned, are carefully considered.
Once such an initial design is complete, there will be a clear division of ownership of data, and clear points at which ownership is taken / transferred; and so you will be in a very good position to take advantage of the possibilities that multithreading offers you - cheaply shared data, cheap synchronisation, lockless shared data structures - safely.
If you can split the workload up into non-dependent chunks of work (i.e., the data set can be processed in bits, there aren't lots of data dependencies), then I'd use a thread pool / task mechanism. Presumably whatever C# has as an equivalent to Java's java.util.concurrent. I'd create work units from the data, and wrap them in a task, and then throw the tasks at the thread pool.
Of course performance might be a necessity here. If you can keep the original processing code kernel as-is, then you can call it from within your C# application.
If the code has lots of data dependencies, it may be a lot harder to break up into threaded tasks, but you might be able to break it up into a pipeline of actions. This means thread 1 passes data to thread 2, which passes data to threads 3 through 8, which pass data onto thread 9, etc.
If the code has a lot of floating point mathematics, it might be worth looking at rewriting in OpenCL or CUDA, and running it on GPUs instead of CPUs.
For a 6 month project I'd say it definitely pays out to start reading a good book about the subject first. I would suggest Joe Duffy's Concurrent Programming on Windows. It's the most thorough book I know about the subject and it covers both .NET and native Win32 threading. I've written multithreaded programs for 10 years when I discovered this gem and still found things I didn't know in almost every chapter.
Also, "natural catastrophe risk modeling" sounds like a lot of math. Maybe you should have a look at Intel's IPP library: it provides primitives for many common low-level math and signal processing algorithms. It supports multi threading out of the box, which may make your task significantly easier.
There are a lot of techniques that can be used to deal with multithreading if you design the project for it.
The most general and universal is simply "avoid shared state". Whenever possible, copy resources between threads, rather than making them access the same shared copy.
If you're writing the low-level synchronization code yourself, you have to remember to make absolutely no assumptions. Both the compiler and CPU may reorder your code, creating race conditions or deadlocks where none would seem possible when reading the code. The only way to prevent this is with memory barriers. And remember that even the simplest operation may be subject to threading issues. Something as simple as ++i is typically not atomic, and if multiple threads access i, you'll get unpredictable results.
And of course, just because you've assigned a value to a variable, that's no guarantee that the new value will be visible to other threads. The compiler may defer actually writing it out to memory. Again, a memory barrier forces it to "flush" all pending memory I/O.
If I were you, I'd go with a higher level synchronization model than simple locks/mutexes/monitors/critical sections if possible. There are a few CSP libraries available for most languages and platforms, including .NET languages and native C++.
This usually makes race conditions and deadlocks trivial to detect and fix, and allows a ridiculous level of scalability. But there's a certain amount of overhead associated with this paradigm as well, so each thread might get less work done than it would with other techniques. It also requires the entire application to be structured specifically for this paradigm (so it's tricky to retrofit onto existing code, but since you're starting from scratch, it's less of an issue -- but it'll still be unfamiliar to you)
Another approach might be Transactional Memory. This is easier to fit into a traditional program structure, but also has some limitations, and I don't know of many production-quality libraries for it (STM.NET was recently released, and may be worth checking out. Intel has a C++ compiler with STM extensions built into the language as well)
But whichever approach you use, you'll have to think carefully about how to split the work up into independent tasks, and how to avoid cross-talk between threads. Any time two threads access the same variable, you have a potential bug. And any time two threads access the same variable or just another variable near the same address (for example, the next or previous element in an array), data will have to be exchanged between cores, forcing it to be flushed from CPU cache to memory, and then read into the other core's cache. Which can be a major performance hit.
Oh, and if you do write the application in C++, don't underestimate the language. You'll have to learn the language in detail before you'll be able to write robust code, much less robust threaded code.
One thing we've done in this situation that has worked really well for us is to break the work to be done into individual chunks and the actions on each chunk into different processors. Then we have chains of processors and data chunks can work through the chains independently. Each set of processors within the chain can run on multiple threads each and can process more or less data depending on their own performance relative to the other processors in the chain.
Also breaking up both the data and actions into smaller pieces makes the app much more maintainable and testable.
There's plenty of specific bits of individual advice that could be given here, and several people have done so already.
However nobody can tell you exactly how to make this all work for your specific requirements (which you don't even fully know yourself yet), so I'd strongly recommend you read up on HPC (High Performance Computing) for now to get the over-arching concepts clear and have a better idea which direction suits your needs the most.
The model you choose to use will be dictated by the structure of your data. Is your data tightly coupled or loosely coupled? If your simulation data is tightly coupled then you'll want to look at OpenMP or MPI (parallel computing). If your data is loosely coupled then a job pool is probably a better fit... possibly even a distributed computing approach could work.
My advice is get and read an introductory text to get familiar with the various models of concurrency/parallelism. Then look at your application's needs and decide which architecture you're going to need to use. After you know which architecture you need, then you can look at tools to assist you.
A fairly highly rated book which works as an introduction to the topic is "The Art of Concurrency: A Thread Monkey's Guide to Writing Parallel Application".
Read about Erlang and the "Actor Model" in particular. If you make all your data immutable, you will have a much easier time parallelizing it.
Most of the other answers offer good advice regarding partitioning the project - look for tasks that can be cleanly executed in parallel with very little data sharing required. Be aware of non-thread safe constructs such as static or global variables, or libraries that are not thread safe. The worst one we've encountered is the TNT library, which doesn't even allow thread-safe reads under some circumstances.
As with all optimisation, concentrate on the bottlenecks first, because threading adds a lot of complexity you want to avoid it where it isn't necessary.
You'll need a good grasp of the various threading primitives (mutexes, semaphores, critical sections, conditions, etc.) and the situations in which they are useful.
One thing I would add, if you're intending to stay with C++, is that we have had a lot of success using the boost.thread library. It supplies most of the required multi-threading primitives, although does lack a thread pool (and I would be wary of the unofficial "boost" thread pool one can locate via google, because it suffers from a number of deadlock issues).
I would consider doing this in .NET 4.0 since it has a lot of new support specifically targeted at making writing concurrent code easier. Its official release date is March 22, 2010, but it will probably RTM before then and you can start with the reasonably stable Beta 2 now.
You can either use C# that you're more familiar with or you can use managed C++.
At a high level, try to break up the program into System.Threading.Tasks.Task's which are individual units of work. In addition, I'd minimize use of shared state and consider using Parallel.For (or ForEach) and/or PLINQ where possible.
If you do this, a lot of the heavy lifting will be done for you in a very efficient way. It's the direction that Microsoft is going to increasingly support.
2: I would consider doing this in .NET 4.0 since it has a lot of new support specifically targeted at making writing concurrent code easier. Its official release date is March 22, 2010, but it will probably RTM before then and you can start with the reasonably stable Beta 2 now. At a high level, try to break up the program into System.Threading.Tasks.Task's which are individual units of work. In addition, I'd minimize use of shared state and consider using Parallel.For and/or PLINQ where possible. If you do this, a lot of the heavy lifting will be done for you in a very efficient way. 1: http://msdn.microsoft.com/en-us/library/dd321424%28VS.100%29.aspx
Sorry i just want to add a pessimistic or better realistic answer here.
You are under time pressure. 6 month deadline and you don't even know for sure what language is this system and what it does and how it is organized. If it is not a trivial calculation then it is a very bad start.
Most importantly: You say you have never done mulitithreading programming before. This is where i get 4 alarm clocks ringing at once. Multithreading is difficult and takes a long time to learn it when you want to do it right - and you need to do it right when you want to win a huge speed increase. Debugging is extremely nasty even with good tools like Total Views debugger or Intels VTune.
Then you say you want to rewrite the app in another lanugage - well this isn't as bad as you have to rewrite it anyway. THe chance to turn a single threaded Program into a well working multithreaded one without total redesign is almost zero.
But learning multithreading and a new language (what is your C++ skills?) with a timeline of 3 month (you have to write a throw away prototype - so i cut the timespan into two halfs) is extremely challenging.
My advise here is simple and will not like it: Learn multithreadings now - because it is a required skill set in the future - but leave this job to someone who already has experience. Well unless you don't care about the program being successfull and are just looking for 6 month payment.
If it's possible to have all the threads working on disjoint sets of process data, and have other information stored in the SQL database, you can quite easily do it in C++, and just spawn off new threads to work on their own parts using the Windows API. The SQL server will handle all the hard synchronization magic with its DB transactions! And of course C++ will perform a lot faster than C#.
You should definitely revise C++ for this task, and understand the C++ code, and look for efficiency bugs in the existing code as well as adding the multi-threaded functionality.
You've tagged this question as C++ but mentioned that you're a C# developer currently, so I'm not sure if you'll be tackling this assignment from C++ or C#. Anyway, in case you're going to be using C# or .NET (including C++/CLI): I have the following MSDN article bookmarked and would highly recommend reading through it as part of your prep work.
Calling Synchronous Methods Asynchronously
Whatever technology your going to write this, take a look a this must read book on concurrency "Concurrent programming in Java" and for .Net I highly recommend the retlang library for concurrent app.
I don't know if it was mentioned yet, but if I were in your shoes, what I would be doing right now (aside from reading every answer posted here) is writing a multiple threaded example application in your favorite (most used) language.
I don't have extensive multithreaded experience. I've played around with it in the past for fun but I think gaining some experience with a throw-away application will suit your future efforts.
I wish you luck in this endeavor and I must admit I wish I had the opportunity to work on something like this...
Yeah I know ... Some people are sometimes hard to convince of what sounds natural to the rest of us, an I need your help right now SO community (or I'll go postal soon ..)
One of my co-worker is convinced the linux kernel code is not re-entrant as he reads it somewhere last time he get insterested in it, likely 7 years ago. Probably its reading was right at that time, remember that multi core architecture was not much widespread some time ago and linux project at its begining or so was not totally well writen and fully fledged with all fancy features.
Today is different. It's obvious that calling the same system call from different processes running in parallel on the same architecture won't lead to undefined behavior. Linux kernel is widespread now, and known for its reability even though running on multicore architectures.
That is my argument for now. But what would be yours to prove that objectively ?
I was thinking to show him off some function in the linux kernel (on lxr website ) as the mutex_lock() system call. Eveything is tuned to get it work in concurrent environnement. But the code could be not that obvious for newbie (as I am).
Please help me.. ;-)
Search the kernel mailing list archive for "BKL". That stands for "Big Kernel Lock", which is what used to be used to prevent problems. A lot of work has been put into breaking it up into pieces, to allow reentry as long different parts of the kernel are used by different processes. Most recent mentions of "BKL" (at least that I've noticed) have basically referred to somebody trying to make his own life easy by locking more than somebody else approved of, at which point they frequently say something about "returning to the days of the BKL", or something on that order.
The easiest way to prove that multiple CPUs can execute in the kernel simultaneously would be to write a program that does a lot of work in-kernel (for example, looks up long pathnames in a tight loop), then run two copies of it at the same time on a dual-core machine and show that the "system" percentage in top goes above 50%.
At the risk of being snarky: why not just read the code? If neither of you are expert enough to follow the code through an interrupt handler and into some subsystem or another where you can read out the synchronization code, then ... why bother? Isn't this just a dancing on the head of a pin argument? It's like a creationist demanding "proof" of evolution when they aren't interested in learning any biology.
Maybe you should have your friend prove Linux is not reentrant. Burden should not be on you to prove this.
There are many different bad practices, such as memory leaks, that are easy to slip into a program on accident. Sometimes, they might even be able to jury-rig your program together.
I'm working on a project right now and it works if I deliberately put a memory leak in my code. If I take the leak out, the code crashes. Obviously this is bad, and needs to be (and will be) fixed soon.
My question is, when do you decide to deliver code in this state, if it's not possible to release code without these poor practices, in time?
If the problem's impact on actual usage of the system can be reasonably expected to be none or negligible, and the delivery date cannot be pushed back, and it can be fixed within a scope of time before the problem's impact becomes more than negligible, ship it.
Obviously this is not ideal or even recommended, but you're clearly pushed into a corner at this point. Sometimes there are no good choices, but pragmatism must win over formal correctness. If an application has a memory leak, but we can reasonably expect that the app will be recycled or machine restarted or whatever before the leak becomes a real problem, that can sometimes be better than delivering late. It depends on the conditions of the agreement and the customer.
It's always better to at least try to push back the delivery date, but I am assuming you've already tried that and it's not an option here.
It is typical once an application has been shipped to ignore technical debt and move on. It's the responsibility of the developers to clearly communicate to the stakeholders the importance of paying off some of that debt, especially in a case like this.
However, given that it seems the customer cares more about a delivery date than correctness, it's less likely anyone will be convinced to pay off any debt once you go live. This is a bad situation to be in. Only the person with all the facts can make the right choice.
"My question is, when do you decide to deliver code in this state, if it's not possible to release code without these poor practices, in time?"
Never.
What you do instead: prioritize and focus.
If what you're working on is really high-priority, and you've mis-designed it, something low- priority has to be sacrificed. Often, some feature(s) must be delayed to give you time to focus on the high priority feature that doesn't work.
If what you're working on is really low-priority, you have to ask why you're not working on something higher-priority. And you still have to focus and prioritize. Sometimes things which are very low priority must be sacrificed.
When you can't do "everything" you have to pick things you can do that will be reasonably bug-free.
You might be interested in the concept of technical debt.
You only have three knobs you can turn when shipping software, assuming a fixed number of developers: features, quality and ship date, and turning one up means the others get turned down.
One of the most difficult things to do in software development is to build your product with the knobs set just right. For example, the Duke Nukem Forever guys have turned the features and quality knob up to eleven and thrown the ship date knob out the window. Microsoft often seems to glue the ship date knob in place and turn down the feature knob as needed, then unglue the ship date knob, turn it up a bit, glue it back down and continue twiddling the other two. And there are seem to be an endless amount of products out there that ship all the time but never put in the features they need to be successful.
In the end, you don't get paid if you don't deliver. Having poor quality hurts you terribly in the long run; reputations are hard to rebuild. It has almost always been that the right thing to do is to cut features if you have too many bugs. Always.
However, bug triage is just as important as feature development. What kind of leak are we talking about here? Are you leaking a byte? A small object? One thousand objects? Entire DLLs? There are scenarios where its probably better to leak a little than to fail to deliver the product.
And what do you mean by leak? Does your application have a well defined life cycle? Where you allocate something once at startup and then never free it until the process dies? Well how long does your process run? Do you expect to run multiple copies of your process?
Obviously you never want to leak, and you should work to develop best practices that minimize leaking, but in the end you have to make a judgment call. Maybe you can just explain the bug to your customers, explain the impact, and they'll buy it anyway. Maybe you can patch it a week later. Maybe you really do need to fix it. But we'd need to know more about it to give good advice.
I will say I have shipped known leaks in the past. I won't say what product or company, but I had a bug where DLL dependencies and insane lifetime management made it next to impossible to correctly free our references to a certain DLL once it was loaded, so in the end we just leaked it. And I still think it was the right thing to do. Other times I've seen things deliberately leaked to keep third party code that was written incorrectly from crashing (though that is a completely separate debate).
But in the end I believe such instances are rare and once you have identified the source of a memory leak, it shouldn't take much more than a day to fix it. It is rare indeed that I would ship with a memory leak that was known and a fix was known. It would have to be something that required a major re-architect involving changing a threading model, or refactoring huge swaths of code, and it would literally have to be a day or two before the product was to ship. At that point I might just leak the memory and promise a patch in a weak after proper testing could be done on the re-architect.
I would be very uncomfortable releasing with such a known bug. It is likely to occur in another way.
You have not specified your environment or language, but I suggest you look at using a memory checking tool such as:
Purify (trial available)
BoundsChecker
Valgrind
or even a free one, Visual Leak Detector
Perhaps, when you are not going to be around to maintain the code later, you don't care about your client/employer and none of the ramifications of your code could possibly affect you.
In other words, in your professional coding life, it's never a really good idea.
If you are working for someone that is less concerned about code quality than you are and simply wants you to finish the code at all costs, then I can see how you'd be in a difficult situation. Finishing faster but poorer will earn you some immediate reward. You should remember though that even if failing to meet your employer/client's expectation for a milestone bites you only once, your poor code may continue to bite you into the future, not only through the difficulties in maintaining it but also through the negative impression others may form of you down the track.
If its a single (or limited) memory leak, and it doesn't grow, and say it only causes a crash when shutting down (the most common case of stuff like this), then it depends. If its a client/desktop software and the users are going to crash every time on their way out, I'd make it an ultra high priority. If its server, and the only one running the server is you, and everything else works fine, I'd say its alright to enter beta. But if the leaks grow, or can cause crashes at "random" times they need to be fixed asap.
To get past an internal milestone, it's arguable, although still nothing to be taken likely.
To release, never. It always comes back and bites you. If your software is in such a bad space that a piece of poor design will get it over the line, you've got much bigger problems looming round the corner
Never, unless you don't care about the poor developer who is going to be maintaining your work afterwards.
Ultimately, a decision like this should be made by the customer or the project manager. Individual developers should not be making these kinds of decisions alone, or keeping this information to themselves.
Tell them what the problem is, and what the consequences will be for not fixing it. If they want you to ship it broken on time, that's their call.
If you don't want to work for people who accept shoddy products, that's your call, but it's a mistake to think that developers have some sort of professional responsibility to ignore their clients' and bosses' quality/cost/time priorities.
If somebody may actually die if you ship bad software, then don't do it, but if the worst-case scenario is that somebody is going to have to reboot a couple times per day, then do what you're told or find another job.