We have a monitoring application built on swt and running on linux. we have few buttons and a dynamic part that changes as we click on these buttons. The problem is that if some ones click too rapidly the cpu could reach 100% and hanging forever. We observed this rapid cpu spikes only on Ubuntu Linux where as windows it runs without on itch. We are sure that our app does repainting whenever we click (we have dynamic part) the button and that's by design. The problem is not alone with the dynamic part. One solution is to ignore rapid clicks.
We are wondering if we can ignore rapid Button clicks to avoid cpu spiking all the way to 100%. If that doesn't work we may have to redesign the dynamic part which we prefer as last option. suggestions/comments are greatly appreciated.
Another solution is to increase your memory with -Xms512m -Xmx512m
It sounds like the application is simply deadlocking. Are you using threads?
Check to see if the repaint is indeed the root cause of the application hanging. Also check to determine which thread it is in using:
Thread.currentThread()
If it is the main thread, then something is inherently wrong; it could be a problem in Java itself. If it is a thread, make sure that it isn't waiting for another thread to finish synchronizing.
I have the same problem in Ubuntu. But on OpenSuse, it seems a lot better.
Things you can try:
Set the anti alias and advanced option of the GC, like:
gc.setAntialias(SWT.OFF);
gc.setTextAntialias(SWT.OFF);
gc.setAdvanced(false);
And check if you are using the commercial graphic driver (i.e. from NVIDIA or ATI) and not the open source driver.
Try this or use pstack or lsstack. When the app runs a long time (or hangs) is when it's begging you to just take a look and see what it's doing.
Many people have experienced performance problems (i.e. very high CPU consumption) with SWT applications on Gtk+ when updating widgets too often. The actual cause seems to be Gtk+.
Although a bit outdated, here's a throughout explanation of such performance problems.
You could try replacing your SWT components with embedded Swing ones and check whether the problems is still reproducible.
Related
I'm designing an application for an asphalt batch mix plant, using a thread to run the mixing process, several timers to read system states and perform control actions.
If "Hyper-Threading" features is disabled, the application will run smoothly, everything is OK; or it will bring up a dialog grumbling that memory access is invalid and abort immediately after click "OK".
Don't know why? Maybe something wrong with IDE version, since Delphi 5 was released at 10th August 1999; maybe the thread unit in Delphi 5.0 cannot deal with new CPU technology?
Maybe memory management has some bugs, maybe the thread mode is not suitable for new era?
I want to upgrade the IDE, but since there are many many years pasted, I have no idea which would be the best choice,
Delphi 7? Delphi 2007(which support OmniThreadLibrary)? RAD Studio XE6/7? Hope someone will help.
The most plausible explanation is that your program has a bug related to threading. You happen to get away with the flaw in your code when hyperthreading is disabled, but enabling it is sufficient to make the error in your code manifest.
Threading bugs are just like this. They will manifest if threads execute specific code in a particular order, with respect to the other threads. And the relative ordering is unpredictable. Which is part and parcel of parallel computation. Code that is broken can appear to be correct when running under one environment, but then fail under another. Whilst it is tempting to blame the tools, always check in the mirror first.
Changing development environment is not the solution. What you need to do is to find and then fix the error in your code. Getting a good stack trace will help, and I can recommend a tool like madExcept for that.
I am working in an embedded Linux environment debugging a highly timing sensitive issue related to the pairing/binding of Zigbee devices.
Our architecture is such that data read from Zigbee Front End Module via SPI interface and then passed from Kernel space to user space for processing. The processed data and response is then passed back to kernel space and clocked out over the SPI interface again.
The Zigbee 802.15.4 timing requirements specifies that we need to respond within 19.5ms and we frequently have situations where we respond just outside of this window which results in a failure and packet loss on the network.
The Linux kernel is not running with pre-emption enabled and it may not be possible to enable preemption either.
My suspicion is that since the kernel is not preemptible there is another task/process which is using the ioctl() interface and this holds off the Zigbee application just long enough that the 19.5ms window is exceeded.
I have tried the following tools
oprofile - not much help here since it profiles the entire system and the application is not actually very busy during this time since it moves such small amounts of data
strace - too much overhead, I don't have much experience using it though so maybe the output can be refined. The overhead affects the performance so much that the application does not funciton at all
Are there any other lightweight methods of profiling a system like this?
Is there anyway to catch when an ioctl call is pended on another task/thread? (assuming this is the root cause of the issue)
Good question.
Here's an idea. Don't think of it as profiling.
Think of catching it in the act.
I would investigate creating a watchdog timer to go off after the 16.5ms interval.
Whenever you are successful, reset the timer.
That way, it will only go off when there's a failure.
At that point, I would try to take a stack sample of the process, or possibly another process that might be blocking it.
That's an adaptation of this technique.
It will take some work, but I'd be surprised if there's any tool that will tell you exactly what's going on, short of an in-circuit-emulator.
LTTng is the tool you are looking for. Like Oprofile, it profiles the entire system, but you will be able to see exactly what is going on with each process and kernel thread, in a timeline fashion. You will be able to view the interaction of the threads and scheduler around the point of interest, that is, when you miss your Zigbee deadline. You may have to get clever and use some method of triggering (or more likely, stopping) the LTTng trace once you've detected the missed packet, or you might get lucky and catch it right away just using the command line tools to start and stop tracing.
You may have to do some work to get there, for example you'll have to invest some time and energy in 1) enabling your kernel to run LTTng if it doesn't have it already, and 2) learning how to use it. It is a powerful tool, and useful for a variety of profiling and analysis tasks. Most commercial embedded Linux vendors have complete end-to-end LTTng products and configuration if you have that option. If not, you should be able to find plenty of useful help and examples on line. LTTng has been around for a very long time! Happy hunting!
I have a very strange system behavior in a few places which can be described in short: There is a process either in user or kernel space which waits for event, and although the event occurs, the process does not wakes up.
I will describe this below, but since the problem is in many different places (at least 4) I am starting to look for a system problem and not a local one something like preemption flag (already checked and not the problem) that will make the difference.
The system is Linux working on Freescale IMX6 which is brand new and still in beta phase. Same code is working well on many other Linux systems.
The system is running 2 separate processes, one is showing video using gstreamer playing from a file, using new image processor which has never been used. If this process runs alone the system can run over-night.
Another process is working with digital tuner connected over USB. The process only gets the device version in a loop, again when running alone can run over-night.
If these 2 process are running at the same time on the system, one is stuck within a few minutes. If we change the test parameters (like the periodic get version timing) the other process will get stuck.
The processes always stuck on wait for event (either wait_event_interruptible in kernel driver, or in user space on pthread_cond_wait). The event itself occurs and I have logs to see that. But the process does not wake up.
Trying to kill that process makes in Zombie. I managed to find one place with a very specific timing problem where condition check was misplaced and could cause this kind of stuck if the process was switched in the right place. It solved one problem and I got to another with the same characteristic. Anyway the bug that was found could not explain why it happens so often, it could explain theoretical bug which will stuck once in a lot of time, but not this fast.
Anyway - something in the system cause this to show up very fast even if the problem is real. Again - this code (except for the display driver which is new) is working in other systems, and even on the same system when working alone. Those processes are not related and not working with one another, the common about them is the machine they are running on.
It is probably has something to do with system resources (memory use 100M out of 1G, CPU usage is 5%), scheduler behavior or something on the system configuration. Anyone has ideas what could cause these kind of problems?
If it's a brand new port of Linux, then it may be that you actually have a real kernel bug - or a hardware bug if it's new hardware.
However, you need really good evidence, so strace, ftrace and perhaps even some instrumentation of the relevant kernel code to show this to someone who can actually fix the problem - I'm guessing that since you are asking this question in the way you are, that you are not a regular kernel hacker.
Sorry if this isn't really the answer you were looking for.
Taken as given that multi-threaded applications are debugging hell and should be avoided at all costs...
Is the requirement to display of a progress bar a sufficient reason to enter multi-threaded land?
Specifically, let's say a C# windows forms .NET 3.0 application needs to download a 100MB file. Is it right to multi-thread (via a BackgroundWorker component) to do the download on that thread so that the UI thread is free to update a progress bar showing progress?
Sorry if this question is a bit fuzzy. I've not done anything multi-threaded before.
You have not said anything about what language are you using; in C#, for instance, you could use WebClient with DownloadFileAsync and DownloadProgressChanged
While this is a multi-threaded program, I consider it as low complexity program.
When you're creating a GUI application, the main problem with doing everything on the UI thread, is that it might freeze your application while for example waiting for I/O. So if you don't have anything that might freeze the application for long enough to annoy the user (and that time is very subjective of course), then go ahead with single threaded applications. But otherwise, I guess the question is how much you value the user experience.
Leading on from the answer to another question about bugs in the Delphi IDE, does anyone know if there is a way to improve the multi-threaded debugging functionality of the IDE, or if not, at least why it is so bad on occasion?
When you have multiple threads within a program, stepping through the code with F7 or F8 can often lead to either very long pauses, or the whole IDE just locks ups. This is especially apparent when you leave or enter a method or procedure. The debugger always seems to be fine for single threaded applications.
PS. The version I'm using is 2007
From my experience multi threaded debugging is much nicer using Vista and Delphi 2009 than XP with Delphi 2007.
First, the ide is significantly more stable.
Second, in Delphi 2009 on vista the debugger can show you where deadlocks are occurring.
If you have to use Delphi 2007, I would strongly recommend debugging your code in a single threaded unit test if possible, then using your by now tested code in the main program. ;)
When the application itself has not deadlocked, try to be very aware of which thread you're in. Keep the thread list up in the debugger, and consider using named threads.
There are times when it will be impossible to interactively debug an application which itself deadlocks. When this happens, you can use tools such as WinDbg and Adplus in order to work with memory dumps. Yes, this is a lot harder than using the interactive debugger, but it beats having no debugger at all. There are sample applications, demos, and instructions, on Tess Ferrandez's blog. I would start with this page. The labs are .NET-centric, but don't let that keep you away; the ideas are the same.
When I want to debug a multithreaded operation I often use a log file (that I analyse after the application has run) instead of the interactive Debugger.
For example with the function 'OutputDebugString'. The output comes in the event log of Delphi. If you start your program outside of Delphi, you can use DebugView from SysInternals to display the log. Take care to add the Thread-ID to each output (GetCurrentThreadID).
Be aware that there could be a thread switch just before writing to the log. But at places where several threads interact you will probably have a critical session (or another synchronization object) so that it should be a problem.
Yes, debugging a multithreaded application is a hassle. Because you are constantly swapping from one thread to another.
Another Idea that I have never tried because I've just thought about it: if you are interested in debugging one thread and just want to avoid being disturbed by the other threads, it might be possible to suspend some threads temporaly.
Process Explorer from SysInternals offers a possibility to suspend and resume threads (in the tab called "Threads" in the properties of a process). But as I said I've never tested it until now.