Detecting the reason for EXCEPTION_FLT_STACK_CHECK - visual-c++

I have complicated C and C++ code with heavy mathematical calculations. I compile it with the latest update of the Intel C++ compiler. With optimizations enabled, the application does not give the expected answer. After a long time I managed to reduce the problem to an EXCEPTION_FLT_STACK_CHECK (0xc0000092) exception. If I compile without optimization, the program works as expected.
It's single-threaded code on WinXP64 (the application is 32-bit).
MSVC 2010 gives the expected (good) results with both Debug and Release builds.
Can someone tell me where to look? Currently I suspect a compiler bug, since I have no assembly code of my own, only compiler-generated code. I looked at the generated assembly and it is mixed SSE/x87 code.
I'm looking for directions to investigate. Since I'm on a trial version of the Intel compiler, I don't have much time for investigation.
I will try /Qfp-stack-check tomorrow to see if I can find something wrong with my code.
* Update *
I just found a bug in the Intel compiler. A function returns a value in st(0), but the calling function does not remove it. That is how I get the stack exception. The workaround is to use the returned value even though I don't always need it. I will try to reproduce it with code that I can share.
After this workaround, the Intel build was 35% faster than the MSVC 2010 build on the same code. That's the main result.
mordy
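The workaround from the update (always use the returned value) can be sketched in C++ roughly as follows. This is only an illustration under assumptions: `vector_norm`, `consume`, and `sum_of_squares` are hypothetical names, not code from the original program, and whether such a pattern sidesteps the miscompilation depends on the compiler version.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical math routine: writes the sum of squares through out_sum
// and also returns the Euclidean norm as a double.
double vector_norm(const double* v, int n, double* out_sum) {
    assert(n >= 0);
    double s = 0.0;
    for (int i = 0; i < n; ++i) {
        s += v[i] * v[i];
    }
    *out_sum = s;
    return std::sqrt(s);
}

// Hypothetical sink: the volatile store makes the returned value "used",
// so the caller has to read it instead of silently discarding it.
inline void consume(double x) {
    volatile double sink = x;
    (void)sink;
}

// Caller that only needs out_sum but still consumes the return value.
double sum_of_squares(const double* v, int n) {
    double sum = 0.0;
    consume(vector_norm(v, n, &sum));
    return sum;
}
```

The point of `consume` is only that no call site ignores a floating-point return value; it does not change the numerical result.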

Related

Porting duktape, getting duk_create_heap error during JS compilation of builtin initjs

This question might be too detailed for this forum, but I could not find a mailing list for duktape. Maybe this question will be useful for others trying to get duktape running on more obscure hardware.
I am trying to get duktape to work on an old ColdFire CPU, using an OLD gcc compiler (2.95.3). The board has limited resources (flash/RAM) but I seem to have enough of both. I must live with the old compiler.
I believe the duk_config.h is calculating the right options regarding endianness, etc. I am using a number of the duktape options to reduce code and data size. I have successfully used the same configuration on 64 and 32 bit Ubuntu and it works fine.
The "properties string" that is formed and set in duk_hthread_create_builtin_objects() is:
"bb u pnRHSBOL p2 a8 generic linux gcc" which seems correct (not sure of the effect of the "generic" tag for architecture).
I am getting a failure when calling duk_create_heap(). I have isolated the problem to what I believe is a JS compile error related to duk_initjs. If I undef DUK_USE_BUILTIN_INITJS, initialization works. The error is a syntax error (not sure where yet). By running "strings" on my executable, I can see that the JavaScript program source string is there. As a side issue, when this error occurs, the longjmp doesn't work (setjmp never called?), so my fatal handler gets called, but I don't care about that for now.
I thought it might be my small C stack (as it appears the js compiler uses recursion) but making the stack much larger didn't help.
I am starting to dig into the JS compiler, but this must be an issue with the architecture or my environment. Any suggestions appreciated!
EDIT: I just now noticed a post of a similar issue, and there was a request to repeat with "-DDUK_OPT_DEBUG -DDUK_OPT_DPRINT -DDUK_OPT_ASSERTIONS -DDUK_OPT_SELF_TESTS" I will try to use these options (if possible, I am very close to a relocation limit on my executable).
There was a bug in the 1.4.0 release (https://github.com/svaarala/duktape/pull/550) which caused duk_config.h to incorrectly end up with an unpacked value representation even when the architecture supported the packed representation. This might be the issue in your case. Try adding an explicit -DDUK_OPT_PACKED_TVAL (which forces Duktape to use the packed representation) to see if it helps.

Rcpp: Platform differences in output

I have the following problem (and cannot really produce a minimal test case).
I am porting a package from C++ to R via Rcpp.
The tests (I check that the output matrix is exactly what I would get when calling the C++ code directly) give absolutely equal results under Linux and OS X, no difference.
But when testing either via build_win() or in a Windows 8.1 virtual machine, I get different results (the results of those two are consistent with each other, so it is Linux/OS X vs. Windows).
I already replaced the one rand() call with the corresponding Rcpp sugar, so this should not be the problem (I hope, at least).
As running the tests via "R -d valgrind" also produces no errors, I am a bit puzzled about how to proceed.
All tests are done with R 3.2.0 (local machines) and the latest unstable version (via build_win()).
So my questions are:
Are there any known Rcpp differences when compiling (e.g. is the compiler provided by Rtools on Windows so old that numeric computations, using only the STL and no other library like Boost/Eigen, are expected to be slightly different)?
Is there a good way to debug the problem? I would basically need to trace the C++ code line by line, and I am not even sure how to do that except with heavy use of std::cout.
Thanks.
The truth about the 32-bit/64-bit problem is indeed written up here:
different behaviour or sqrt when compiled with 64 or 32 bits
Adding the -ffloat-store option fixed my problem.
I never expected that; I thought the problem was in the source code.
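For a single value, the effect of -ffloat-store can be imitated by hand: storing an intermediate result to a volatile double forces it out of any wider floating-point register before it is used again. A minimal sketch, assuming the platform differences come from extended-precision x87 intermediates in the 32-bit build; `store_as_double` and `hypot_rounded` are hypothetical names, not part of the package in question.

```cpp
#include <cassert>
#include <cmath>

// Round-trip a value through memory so it is rounded to a 64-bit double,
// even if the compiler was keeping it in a wider register.
double store_as_double(double x) {
    volatile double stored = x;
    return stored;
}

// Example computation where each intermediate is explicitly rounded.
// Inputs are expected to be ordinary (non-NaN) numbers.
double hypot_rounded(double a, double b) {
    assert(!std::isnan(a) && !std::isnan(b));
    double sq = store_as_double(a * a + b * b);
    return store_as_double(std::sqrt(sq));
}
```

Applying this by hand to every intermediate is impractical, which is why the compiler flag is the more realistic fix; the sketch is mainly useful for narrowing down which expression differs between platforms.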

How to enable and use OpenMP?

I decided to ask this question after quite a bit of searching and reading on Google and Stack Overflow, and after some experiments in Fortran on my quad-core machine with Ubuntu 12.04, only to find myself back at square one. So this is how the whole story goes.
Having acquired some basic knowledge about parallel computing, I decided to go for OpenMP. I found a tutorial for beginners but could not proceed beyond lesson one, as my computer never created more than one thread even after I used the commands given in the first chapter. I then searched on stackoverflow.com and found a post that suggests calling omp_set_dynamic(0). However, typing this in my terminal gives the following error:
bash: syntax error near unexpected token `0'
so I had to give up on this too!
After this, I moved to this site and thought that this was it! But on reaching the first exercise, I found no way to proceed, as I could not even build the sample Fortran program given there. It gave the following error:
/tmp/ccaqbCe0.o: In function `MAIN__':
first_open_mp.f95:(.text+0xa): undefined reference to `omp_get_thread_num_'
first_open_mp.f95:(.text+0x9c): undefined reference to `omp_get_num_threads_'
collect2: ld returned 1 exit status
Now I find myself completely helpless about all this. Is there any way I can learn OpenMP systematically and make use of all the cores in my machine? I am pretty desperate to learn it, and any help is highly appreciated.
Thanks in advance.
OpenMP is not part of standard Fortran. You must study your compiler's manual to find out how to enable this extension and whether that is possible at all. Some compilers don't support OpenMP at all.
You still have not specified which compiler you use, so I can't be more specific. Gfortran has the -fopenmp option. Many other compilers have the -openmp option.
BTW, find a less prehistoric tutorial, specifically one that uses Fortran 90 and the omp_lib module to call the procedures. But even more, I suggest using the environment variables where possible. My 60 000-line parallel solver needs to use this module in only a very small number of places, to get the number of threads and the current thread number.
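A small sketch of the same idea, written in C++ rather than Fortran (in Fortran the counterpart of the header is `use omp_lib`). Two things it is meant to illustrate: omp_set_dynamic and omp_get_num_threads are library procedures called from program source, not shell commands, which is why bash rejected the call; and undefined references to the omp_* routines are what the linker typically reports when the program is built without the OpenMP option. `team_size` is a hypothetical helper name, and the `_OPENMP` guard is there so the code still builds when OpenMP is not enabled.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>   // only available/meaningful when OpenMP is enabled
#endif

// Returns the number of threads in a parallel region,
// or 1 when the program was compiled without OpenMP support.
int team_size() {
    int n = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        // Exactly one thread of the team records the team size.
        #pragma omp single
        n = omp_get_num_threads();
    }
#endif
    assert(n >= 1);
    return n;
}
```

Built with g++ -fopenmp, the thread count can then be controlled from the shell through the OMP_NUM_THREADS environment variable (for example OMP_NUM_THREADS=4 ./a.out) without touching the source.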

Differences between GNU C++ 4.8.1 (MinGW) and Visual C++ 2013

I know questions like this have been asked before, but none of them has the exact answer I'm searching for.
Today I took part in an ACM-ICPC contest with my team. We usually use the GNU C++ 4.8.1 compiler (which was available at the contest). We had written code that got Time Limit Exceeded on test case 10. At the end of the contest, with less than 2 minutes remaining, I sent exactly the same submission with Visual C++ 2013 (same source file, different language setting), and it got accepted. There were more than 60 test cases and our code passed them all.
Once more: there were no differences between the source codes.
Now I'm just interested in why this happened.
Does anyone know what the reason is?
Without knowing the exact compiler options used, this question is a bit difficult to answer. Compilers usually come with many options and provide default values that are used as long as the user does not override them. This is also true for code optimization options. Both compilers mentioned are capable of significantly improving the speed of the generated binary when told to. A wild guess would be that in your case, the optimization settings used with the GNU compiler did not improve the executable's performance much, but the VC++ settings did, for example because no optimization flags were used in one case. Another wild guess would be that one compiler generated a debug binary and the other did not (check for the GCC option -g, which switches debug symbol generation on).
On the other hand, depending on the program you wrote, it could of course be that VC++ was simply better at optimizing it than g++.
If you are interested in easily increasing performance, have a look at the high-level optimization flags at https://gcc.gnu.org/onlinedocs/gnat_ugn/Optimization-Levels.html or, for the full story, at the complete list at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html.
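One way to test the "was optimization even switched on?" guess from inside a program is to look at predefined macros. A minimal sketch, assuming GCC or a GCC-compatible compiler: GCC documents __OPTIMIZE__ as defined in all optimizing compilations. The sketch makes no claim for other compilers and reports "unknown" there. `optimization_state` and `require_optimized_build` are hypothetical names.

```cpp
#include <cassert>
#include <cstring>

// Reports whether this translation unit was compiled with optimization,
// as far as GCC's predefined macros can tell.
const char* optimization_state() {
#if defined(__GNUC__)
  #if defined(__OPTIMIZE__)
    return "optimized";
  #else
    return "not optimized";
  #endif
#else
    return "unknown";
#endif
}

// Aborts via assert (unless NDEBUG is defined) when a GCC-compatible
// build is known to have been made without optimization.
void require_optimized_build() {
    assert(std::strcmp(optimization_state(), "not optimized") != 0);
}
```

Printing `optimization_state()` from a trial submission would show which kind of build a judge system produces for GCC, which is the missing piece of information in this case.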
More input on comparing compilers:
http://willus.com/ccomp_benchmark2.shtml?p1
http://news.dice.com/2013/11/26/speed-test-2-comparing-c-compilers-on-windows

Experience building and using Qt Embedded

I am currently trying to build Qt for Embedded Linux on an Ubuntu box, targeting the ARM architecture. So far, I have run into MANY errors while running make, the biggest being a 2000-line C++ function that caused a compiler error. What are other people's experiences with this, and how did you fix it?
My experience has always been favorable, given:
You must follow every single instruction in the installation instructions for Qt, without exception. Every time I've run into compilation errors, it's been because I tried to just do it quickly, instead of reading the attached documentation for that specific platform.
I'd review the instructions - there's probably some minor thing that needs to be done first, which will most likely eliminate your errors.
