I decided to ask this question after quite a lot of searching and reading through Google and Stack Overflow, and after doing some experiments in Fortran on my quad-core machine running Ubuntu 12.04, only to find myself back at square one. So this is how the whole story goes.
Having acquired some basic knowledge about parallel computing, I decided to go for OpenMP. I found a beginner's tutorial but could not proceed beyond lesson one, as my computer never created more than one thread even after I used the commands given in the first chapter. I then searched stackoverflow.com and found a post suggesting a call to omp_set_dynamic(0). However, typing this in my terminal gives the following error:
bash: syntax error near unexpected token `0'
So I had to leave this too!
After this, I moved to this site and thought that this was it! But on reaching the first exercise I found no way to proceed, as I could not even run the model Fortran program given there; it produced the following error:
/tmp/ccaqbCe0.o: In function `MAIN__':
first_open_mp.f95:(.text+0xa): undefined reference to `omp_get_thread_num_'
first_open_mp.f95:(.text+0x9c): undefined reference to `omp_get_num_threads_'
collect2: ld returned 1 exit status
Now I find myself completely helpless about all this. Is there any way by which I can learn OpenMP systematically and at least make use of all the cores in my machine? I am pretty desperate to learn it and any help is highly appreciated.
Thanks in advance.
OpenMP is not standard Fortran. You must consult your compiler's manual to learn how to enable this extension and whether it is supported at all. Some compilers don't even support OpenMP.
You still did not specify which compiler you use, so I can't be more specific. gfortran has the -fopenmp option; many other compilers have an -openmp option.
BTW, find a less prehistoric tutorial, specifically something that uses Fortran 90 and the omp_lib module to call the OpenMP procedures. Even more, I suggest you use the environment variables where possible. My 60,000-line parallel solver needs to call this module in only a very small number of places, to get the number of threads and the current thread number.
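For example, with gfortran, the tutorial program from the error messages above should build with OpenMP enabled and run on all cores along these lines (the thread count of 4 is only an example):
gfortran -fopenmp first_open_mp.f95 -o first_open_mp
OMP_NUM_THREADS=4 ./first_open_mp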
This question might be too detailed for this forum, but I could not find a mailing list for duktape. Maybe this question will be useful for others trying to get duktape running on more obscure hardware.
I am trying to get duktape to work on an old ColdFire CPU, using an OLD gcc compiler (2.95.3). The board has limited resources (flash/RAM) but I seem to have enough of both. I must live with the old compiler.
I believe the duk_config.h is calculating the right options regarding endianness, etc. I am using a number of the duktape options to reduce code and data size. I have successfully used the same configuration on 64 and 32 bit Ubuntu and it works fine.
The "properties string" that is formed and set in duk_hthread_create_builtin_objects() is:
"bb u pnRHSBOL p2 a8 generic linux gcc" which seems correct (not sure of the effect of the "generic" tag for architecture).
I am getting a failure when calling duk_create_heap(). I have isolated the problem to what I believe is a JS compile error related to duk_initjs. If I undef DUK_USE_BUILTIN_INITJS, initialization works. The error is a syntax error (not sure where yet). By running "strings" on my executable, I can see that the JavaScript program source string is there. As a side issue, when this error occurs the longjmp doesn't work (setjmp never called?), so my fatal handler gets called, but I don't care about that for now.
I thought it might be my small C stack (as it appears the js compiler uses recursion) but making the stack much larger didn't help.
I am starting to dig into the JS compiler, but this must be an issue with the architecture or my environment. Any suggestions appreciated!
EDIT: I just now noticed a post about a similar issue, where there was a request to retry with "-DDUK_OPT_DEBUG -DDUK_OPT_DPRINT -DDUK_OPT_ASSERTIONS -DDUK_OPT_SELF_TESTS". I will try to use these options (if possible; I am very close to a relocation limit on my executable).
There was a bug in the 1.4.0 release (https://github.com/svaarala/duktape/pull/550) which caused duk_config.h to incorrectly end up with an unpacked value representation even when the architecture supported the packed representation. This might be the issue in your case - try adding an explicit -DDUK_OPT_PACKED_TVAL (which forces Duktape to use the packed representation) to see if it helps.
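If it helps to narrow things down, a minimal heap-creation test against the Duktape 1.x API looks roughly like this (a sketch only - the fatal handler and messages are mine, and options such as -DDUK_OPT_PACKED_TVAL belong on the compiler command line, not in the source):
#include <stdio.h>
#include <stdlib.h>
#include "duktape.h"

/* Duktape 1.x fatal handler signature (it changed in 2.x). */
static void my_fatal(duk_context *ctx, duk_errcode_t code, const char *msg) {
    (void) ctx;
    fprintf(stderr, "FATAL %ld: %s\n", (long) code, msg ? msg : "(no message)");
    abort();  /* a fatal handler must not return */
}

int main(void) {
    /* Default allocators, no heap userdata, custom fatal handler. */
    duk_context *ctx = duk_create_heap(NULL, NULL, NULL, NULL, my_fatal);
    if (ctx == NULL) {
        fprintf(stderr, "duk_create_heap() failed\n");
        return 1;
    }
    printf("heap created OK\n");
    duk_destroy_heap(ctx);
    return 0;
}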
I've decided to learn assembler through online tutorials.
I've come across this one that uses the NASM compiler, which most other tutorials seem to use as well:
http://www.tutorialspoint.com/assembly_programming/index.htm
I've also come across this youtube series "Assembly primer for hackers"
https://www.youtube.com/watch?v=K0g-twyhmQ4&list=PLue5IPmkmZ-P1pDbF3vSQtuNquX0SZHpB
This one uses what the guy describes as the 'generic linux compiler' (or words to that effect).
The commands for compiling go something like this:
as -o file.o file.s
Where file.s is the assembly source code. Followed by:
ld -o file file.o
Where file is then the executable.
Each of the tutorials uses a different syntax (e.g. a register in the latter tutorial is always preceded by %; N.B. there also appear to be deeper differences in the syntax than this). Are these syntaxes decided by the individual compiler?
I was also initially confused when I tried to compile code from the NASM tutorial with the latter method. I was always under the impression that the instruction set had to depend on the CPU, so it shouldn't matter which compiler I use. I've concluded that it's merely a difference in syntax, but is that correct?
I'm running a Linux computer, by the way, on kernel 4.1.6.
My main question is really which syntax do I use? Is it just a matter of choice? Is one more widely used than the other? Thanks for any help.
Each of the tutorials uses a different syntax (e.g. a register in the latter tutorial is always preceded by %; N.B. there also appear to be deeper differences in the syntax than this). Are these syntaxes decided by the individual compiler?
Yes, different assemblers (i.e. assembly-language compilers) may use different syntax even though they generate code for the same processor and platform. For example, NASM's Intel-style syntax writes mov eax, 1, while the GNU assembler's AT&T syntax writes the same instruction as movl $1, %eax.
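The build commands differ as well: code from the NASM tutorial is assembled with nasm itself rather than with as; on 64-bit Linux that would look something like the following (hello.asm is just a placeholder file name):
nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o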
My main question is really which syntax do I use? Is it just a matter of choice? Is one more widely used than the other?
One assembler, like NASM, might target a wide range of processors and platforms; in that case you would benefit from learning its syntax if you need to work with several processors or platforms.
In other cases it might be better to stick with the assembler of a prominent vendor, because it is widely used and you can find more example code for it on the net, which might help you with your development.
Last but not least, you might simply prefer a particular assembler because you like its features or syntax.
If you're on a Windows system, Microsoft's MASM (ML.EXE, or ML64.EXE for 64-bit) syntax is virtually the same as Intel's syntax. MASM is included with the free Visual Studio Express editions, although you usually have to create a custom build step to invoke the assembler in a VS project. VS Express includes a good source-level debugger.
If you're on a Linux type system, then you'll probably use AT&T syntax, which I assume ended up that way since it was a conversion of some generic assembler. I don't know which assembler(s) to recommend for Linux.
I know questions like this have been asked before, but none of them gives the exact answer I'm searching for.
Today I took part in an ACM-ICPC contest with my team. We usually use the GNU C++ 4.8.1 compiler (which was available in the contest). We had written code which exceeded the time limit on test case 10. At the end of the contest, with less than 2 minutes remaining, I sent exactly the same submission as Visual C++ 2013 (same source file, different language setting) and it got accepted and worked. There were more than 60 test cases and our code passed them all.
Once more: there were no differences between the source files.
Now I'm just interested in why it happened.
Anyone knows what the reason is?
Without knowing the exact compiler options you used, this question is a bit difficult to answer. Usually, compilers come with many options and provide default values which are used as long as the user does not override them. This is also true for code-optimization options. Both of the compilers mentioned are capable of significantly improving the speed of the generated binary when told to do so. A wild guess would be that in your case the optimization settings used by the GNU compiler did not improve the executable's performance as much as the VC++ settings did, for example because no optimization flags were passed in one case. Another wild guess would be that one compiler generated a debug binary and the other did not (check for the option -g with GCC, which switches debug-symbol generation on).
On the other hand, depending on the program you created, it could of course be that VC++ was simply better at performing the optimization than g++.
If you are interested in an easy performance increase, have a look at the high-level optimization flags at https://gcc.gnu.org/onlinedocs/gnat_ugn/Optimization-Levels.html or, for the full story, at the complete list at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html.
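As a concrete illustration (the file name is a placeholder; -O2 is a commonly used optimization level):
# optimized build
g++ -O2 -o solution solution.cpp
# unoptimized build with debug symbols
g++ -g -O0 -o solution solution.cpp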
More input on comparing compilers:
http://willus.com/ccomp_benchmark2.shtml?p1
http://news.dice.com/2013/11/26/speed-test-2-comparing-c-compilers-on-windows
I have complicated C and C++ code with heavy mathematical calculations, and I use Intel C++ (the latest update) to compile it. With optimizations enabled the application does not give the expected answer. After a long time I managed to reduce the problem to getting EXCEPTION_FLT_STACK_CHECK (0xc0000092). If I compile without optimization, the program works as expected.
It's single-threaded code on Windows XP x64 (the application is 32-bit).
MSVC 2010 gives the same results with both Debug and Release builds (I mean good, i.e. expected, results).
Can someone give me a hint about where to look? Currently I suspect a compiler bug, since I have no assembly code of my own, only compiler-generated code. I looked at the generated assembly and it's mixed SSE/x87 code.
I'm looking for directions to investigate. Since I'm on a trial version of the Intel compiler, I don't have much time for investigation.
I will try to use /Qfp-stack-check tomorrow to see if I can find something wrong with my code.
Update: I just found a bug in the Intel compiler. A function returns a value in st(0), but the calling function does not remove it; that is how I get the stack exception. The workaround is to use the returned value even though I don't always need it. I will try to reproduce it with code that I can share.
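The shape of the pattern and of the workaround is roughly the following (a hypothetical sketch with made-up function names, not the actual code):
/* Hypothetical illustration of the workaround described above. */
double compute_norm(const double *v, int n)
{
    double s = 0.0;
    int i;
    for (i = 0; i < n; ++i) {
        s += v[i] * v[i];
    }
    return s;  /* on 32-bit x87 calling conventions this is returned in st(0) */
}

void process(const double *v, int n)
{
    /* Workaround: always consume the return value so the caller pops st(0),
       even on code paths that do not actually need it. */
    volatile double dummy = compute_norm(v, n);
    (void) dummy;
}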
After this workaround, the Intel compiler was 35% faster than MSVC 2010 on the same code - that's the main result.
mordy
I am currently trying to compile and build Qt for Embedded Linux on an Ubuntu box for the ARM architecture. So far, I have run into MANY errors while running make, the biggest one being a 2000-line C++ function which caused a compiler error. What are other people's experiences with this, and how did you fix them?
My experience has always been favorable, given one condition: you must follow every single instruction in the installation instructions for Qt, without exception. Every time I've run into compilation errors, it's been because I tried to just do it quickly instead of reading the attached documentation for that specific platform.
I'd review the instructions - there's probably some minor thing that needs to be done first, which will most likely eliminate your errors.
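For reference, an ARM cross-build of Qt/Embedded 4.x usually follows this general shape (the mkspec and install prefix below are only examples - check the documentation for your exact Qt version and toolchain):
./configure -embedded arm -xplatform qws/linux-arm-g++ -prefix /usr/local/qt-arm
make
sudo make install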