Determine judging parameters for a code in Linux

I am developing an online code-judging system for C/C++/Java submissions.
I want to include various parameters for judging a submission, namely compilation time, execution time, and memory usage, just like the IDEONE API provides.
How can I extract these parameters while compiling/executing code in a Linux environment? Are there any specific commands?
Also, are there any other parameters which could be used to judge a submission?

The verb judge is a bit strange in your question (which is perhaps too imprecise). Maybe you mean evaluate?
Assuming the evaluated source code is compiled by a recent GCC compiler (i.e. version 4.7 or 4.8 of GCC) and that you can parameterize (or just repeat) its compilation, you could consider extending and customizing the GCC compiler for evaluation or metric purposes. This is possible either directly through GCC plugins (painfully coded in C or C++) or through MELT extensions (MELT is a domain-specific language to extend and customize GCC).
You'll need some work to go this route, because you need to dive into GCC internals. The MELT probe might help you understand the Gimple representation (inside GCC). You could also try compiling some sample code with gcc -fdump-tree-all, which produces many dump files.
So the idea is that you would take time (days, perhaps weeks) to develop a MELT extension (e.g. in some file shiven.melt) for analysis, metrics and evaluation purposes, and that you would [re-]compile the example.c source code, e.g. with
gcc -fplugin=melt \
-fplugin-arg-melt-extra=shiven \
-fplugin-arg-melt-mode=shivenanalysis \
-c example.c
(of course you'll add other compiler flags, e.g. -O -I/some/include/dir/ ...)
Then, you could make a MELT extension to measure some characteristics of the compiled code, like number of functions, number of basic blocks, number of Gimple instructions, etc. This will happen at compilation time. Your MELT extension (in your file shiven.melt) could e.g. write some statistics in some database.
Extending GCC is meaningful for C, C++, Fortran, Ada, ... source code, but much less so for Java (because nobody uses GCC to compile Java, even though gcj exists, and because gcj probably supports only a subset of some old Java standard).
Please subscribe to the gcc-melt@googlegroups.com list and ask MELT-related questions there. Mention your MELT interest (perhaps your question) explicitly in your subscription.

There is the time command, which gives you the execution time of a binary. With that you can get the compilation time (time gcc code.c) or the execution time (time ./a.out). For memory usage you can use valgrind or ps. With ps, if you are using stdin for input it should be simple: start the application, run ps at certain intervals in the background, and supply the input to the application.
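For example, a rough sketch from the shell (the file names are placeholders, and /usr/bin/time is GNU time, which reports more than the shell built-in):
time gcc code.c -o code                    # real/user/sys time for the compilation
time ./code < input.txt                    # the same for one run of the submission
/usr/bin/time -v ./code < input.txt        # GNU time also reports "Maximum resident set size" (peak memory)
valgrind --tool=massif ./code < input.txt  # detailed heap profile, written to massif.out.<pid>
For a real judge you would probably also enforce limits (e.g. with ulimit) before running the submission, but the commands above are enough to collect the basic numbers.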

Related

GNU Fortran compiler write option

I use the GNU Fortran compiler to compile a piece of code written in Fortran (.f90). Unlike with other compilers, the output of a write statement is not displayed on the screen but rather written to the output file.
For example, I have placed "write(*,*) 'Check it here'" in the middle of the source code so that this message is displayed on the screen when someone runs the compiled version of the code.
I don't understand why this message is not displayed in the terminal window while running the code, but is instead written to the output file.
I would appreciate your help in resolving this!
I am compiling these source codes:
https://github.com/firemodels/fds/tree/master/Source
makefile that I am using to compile the code is located here:
https://github.com/firemodels/fds/tree/master/Build/mpi_intel_linux_64
I run the program using an executable that the makefile creates.
The version of the compiler that I am using is:
GNU Fortran (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
Thank you.
Way bigger picture: Is there a reason you're building FDS from source rather than downloading binaries directly from NIST, i.e. from https://pages.nist.gov/fds-smv/downloads.html?
Granted, if you're qualifying the code for safety-related use, you may need to compile from source rather than use someone else's binaries. You may need to add specific info to a header page such as code version, date of run, etc. to satisfy QA requirements.
If you're just learning about FDS (practicing fire analysis, learning about CFD, evaluating the code), I'd strongly suggest using NIST's binaries. If you need/want to compile it from source, we'll need more info to diagnose the problem.
That said, operating on the assumption that you have a use case that requires that you build the code, your specific problem seems to be that writing to the default output unit * isn't putting the output where you expect.
Modern Fortran provides the iso_fortran_env module which formalizes a lot of the obscure trivia of Fortran, in this case, default input and output units.
In the module you're editing, look for something like:
use iso_fortran_env
or
use iso_fortran_env, only: output_unit
or
use, intrinsic:: iso_fortran_env, only: STDOUT => output_unit
If you see an import of output_unit or (as in the last case) an alias to it, write to that unit instead of to *.
If you don't see an import from iso_fortran_env, add the last line above to the routine or module you're printing from and write to STDOUT instead of *.
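A quick way to check whether the module you're editing (or anything else in the tree) already pulls in iso_fortran_env is to search the sources; the Source/ path below is just the source directory from the repository linked above:
grep -rn "iso_fortran_env" Source/    # modules that import the intrinsic module
grep -rni "output_unit" Source/       # places that already write to a named output unit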
That may or may not fix things, depending on whether the FDS authors do something unusual to redirect IO. They might; I'm not sure how writing to the screen works in an MPI environment where the code may run in parallel on a number of networked machines (I'd write to a networked logging system in that case, but that's just me). But in the simple case of a single instance of the code running, writing to output_unit is more precise than writing to * and more portable and legible than writing to 6.
Good luck with FDS; I tried using it briefly to model layer formation from a plume of hydrogen gas in air. FDS brought my poor 8 CPU machine to its knees so I went back to estimating it by hand instead of trying to make CFD work...

Customising Cabal libraries (I think?)

Perhaps it's just better to describe my problem.
I'm developing a Haskell library. But part of the library is written in C, and another part actually in raw LLVM. To actually get GHC to spit out the code I want I have to follow this process:
Run ghc -emit-llvm on both the code that uses the Haskell module and the "Main" module.
Run clang -emit-llvm on the C file
Now I've got three .ll files from above. I add the part of the library I've handwritten in raw LLVM and llvm-link these into one .ll file.
I then run LLVM's opt on the linked file.
Lastly, I feed the LLVM bitcode file back into GHC (which pleasantly accepts it and produces an executable).
This process (with the appropriate optimisation settings, of course) seems to be the only way I can inline code from C, removing the function-call overhead. Since many of these C functions are very small, this is significant.
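Concretely, the steps above might look roughly like the following shell sketch. The file names (Lib.hs, Main.hs, cbits.c, hand.ll) are invented, and the GHC flags are my guess at the usual way to get the intermediate files (-fllvm with -keep-llvm-files rather than a literal -emit-llvm), so adjust for your GHC version:
ghc -O2 -fllvm -keep-llvm-files -c Lib.hs Main.hs   # keep the .ll files GHC generates
clang -O2 -S -emit-llvm cbits.c -o cbits.ll         # the C parts as LLVM IR too
llvm-link Lib.ll Main.ll cbits.ll hand.ll -S -o linked.ll
opt -O3 -S linked.ll -o linked_opt.ll
ghc linked_opt.ll -o myprog                         # hand the optimised IR back to GHC for the final link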
Anyway, I want to be able to distribute the library and for users to be able to use it as painlessly as possible, whilst still gaining the optimisations from the process above. I understand it's going to be a bit more of a pain than an ordinary library (for example, you're forced to compile via LLVM), but making it as painless as possible is what I'm looking for advice on.
Any guidance will be appreciated, I don't expect a step by step answer because I think it will be complex, but just some ideas would be helpful.

How to inspect Haskell bytecode

I am trying to figure out a bug (a serious performance regression). Unfortunately, I wasn't able to figure out why even by going back through many different versions of my code.
I suspect it could be some modifications to libraries that I've updated; in the meanwhile I've also updated from GHC 7.4 to 7.6 (and if anybody knows whether some laziness behavior has changed, I would greatly appreciate hearing about it!).
I have an older executable of this code that does not have this bug, and thus I wonder whether there are any tools that can tell me which library versions I was linking against before, e.g. by figuring out the symbols.
GHC creates executables, which are notoriously hard to understand... On my Linux box I can view the assembly code by typing in
objdump -d <executable filename>
but I get back over 100K lines of code from just a simple "Hello, World!" program written in Haskell.
If you happen to have the GHC .hi files, you can get some information about the executable by typing in
ghc --show-iface <hi filename>
This won't give you the assembly code, but you can get some extra information that may prove useful.
As I mentioned in the comment above, on Linux you can use ldd to see which C system libraries were used in the compile, but that is also probably less than useful.
You can try to use a decompiler, but those are generally written to decompile to C, not anything higher level and certainly not Haskell. That being said, GHC compiles via C as an intermediate (at least it used to; has that changed?), so you might be able to learn something.
Personally, I often find viewing system calls in action much more interesting than viewing pure assembly. On my Linux box, I can view all system calls by running the program under strace (use Wireshark for the network-traffic equivalent):
strace <program executable>
This also will generate a lot of data, so it might only be useful if you know of some specific place where direct real world communication (i.e., changes to a file on the hard disk drive) goes wrong.
In all honesty, you are probably better off just debugging the problem from source, although, depending on the actual problem, some of these techniques may help you pinpoint something.
Most of these tools have Mac and Windows equivalents.
Since much has changed in the last 9 years, and apparently this is still the first result a search engine gives for this question (as it was for me, again), an updated answer is in order:
First of all, yes: while Haskell does not specify a bytecode format, bytecode is just a kind of machine code for a virtual machine, so for the rest of this answer I will treat them as the same thing. The "Core" representation, as well as the LLVM intermediate language or even WASM, could be considered equivalent too.
Secondly, if your old binary is statically linked, then no matter what format your program is in, no symbols will be available to check out, because that is what linking does. This holds even with bytecode, and even with a classic static #include in simpler languages. So your old binary will be of no help, no matter what. And given the optimisations compilers do, a classic decompiler will very likely never be able to figure out which optimised bits used to be part of which libraries, especially with stream fusion and similar "magic".
Third, you can do the things you asked with a modern Haskell program, but you need to have your binaries compiled with -dynamic and -rdynamic, so that not only the C-calling-convention libraries (e.g. .so files) and the Haskell libraries, but also the runtime itself, are dynamically loaded. That way you end up with a very small binary consisting of only your actual code, dynamic linking instructions, and the exact data about which libraries and runtime were used to build it. And since the runtime is compiler-dependent, you will know the compiler too. So it would give you everything you need, but only if you compiled it that way. (I recommend using such dynamic linking by default in any case, as it saves memory.)
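A rough sketch of what that looks like, assuming a reasonably recent GHC on Linux (Main.hs is a placeholder):
ghc -dynamic -rdynamic Main.hs -o main   # link the Haskell libraries and the RTS dynamically
ldd ./main                               # list the shared objects (and hence versions) the binary depends on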
The last factor that one might forget is that even the exact same compiler version might behave vastly differently depending on what it itself was compiled with. (E.g. if somebody had put a backdoor into the very first version of GHC, and all GHCs after that were compiled with that first GHC, and nobody ever checked, then that backdoor could still be in the code today, with no trace in any source or library whatsoever. Or, for a less extreme case, the GHC your old binary was built with might have been compiled with different architecture options, leading it to put more optimised instructions into the binaries it compiles, unless told to cross-compile.)
Finally, of course, you can profile even compiled binaries by profiling their system calls. This will give you clues about which parts of the code behave differently and how. (E.g. you may notice that your new binary floods the system with many slow system calls where the old one used a single fast one. A classic OpenGL example would be using fast display lists versus slow direct calls to draw triangles. Or the program may use a different sorting algorithm, or may have switched to a different kind of data structure that fits your workload badly and thrashes a lot of memory.)
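For instance, comparing system-call summaries of the old and new binaries can already point at the regression (the binary names are placeholders):
strace -c ./old-binary > /dev/null   # -c prints a per-syscall count/time summary instead of every call
strace -c ./new-binary > /dev/null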

Measure function execution time without modifying the code

I have found a piece of code (a function) in a library which could be improved by compiler optimization (the main idea being to find good material for digging deeper into compilers). I want to automate measuring the execution time of this function with a script. As it is a low-level library function that takes arguments, it is difficult to extract it on its own. Thus I want to find a way to measure exactly this function (precise CPU time) without modifying the library, application, or environment. Do you have any ideas how to achieve that?
I could write a wrapper, but in the near future I will need many more applications for performance testing, and I think writing a wrapper for every one is very ugly.
P.S.: My code will run on the ARM (armv7el) architecture, which has some kind of "Performance Monitor Control" registers. I have learned about "perf" in the Linux kernel, but I don't know whether it is what I need.
It is not clear whether you have access to the source code of the function you want to profile or improve, i.e. whether you are able to recompile the library in question.
If you are using a recent GCC (that is, at least 4.6) on a recent Linux system, you could use profilers like gprof (assuming you are able to recompile the library) or, better, oprofile (which you can use without recompiling), and you could customize GCC for your needs.
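As a rough sketch of both approaches (app.c and the library name are placeholders; perf works on the unmodified binary, while gprof needs the application, and ideally the library, rebuilt with -pg):
gcc -pg -O2 -o app app.c -lsomelib   # gprof: rebuild with instrumentation
./app                                # running it writes gmon.out
gprof ./app gmon.out | less          # per-function flat profile and call graph
perf record -g ./app                 # perf: sample the existing binary
perf report                          # break CPU time down by function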
Be aware that like any measurements, profiling may alter the observed phenomenon.
If you are considering customizing the GCC compiler for optimization purposes, consider making a GCC plugin, or better yet, a MELT extension, for that purpose (MELT is a high-level domain specific language to extend GCC). You could also customize GCC (with MELT) for your own specific profiling purposes.

Where did the first make binary come from?

I'm having to build GNU make from source for reasons too complicated to explain here.
I noticed that to build it I need the make command itself, in the traditional fashion:
./configure
make install
So what if I didn't have the make binary already? Where did the first ever make binary come from?
From the same place the first gcc binary came from.
The first make was probably created using a shell script to do the build. After that, make would "make" itself.
It's a notable achievement in systems development when the platform becomes "self-hosting". That is the platform can build itself.
Things like "make make" and "gcc gcc.c".
Many language writers will create their language in another language (say, C), and when they have moved it far enough along, they will use that original bootstrap compiler to write a new compiler in the language itself. Finally, they discard the original.
Back in the day, a friend was working on a debugger for OS/2, notable for being a multitasking operating system at the time. He would regale us with stories about the times when they would be debugging the debugger and find a bug. So they would debug the debugger debugging the debugger. It's a novel concept that goes to the heart of computing and abstraction.
Inevitably, it all boils back to when someone keyed in something through a hardwire key pad or some other switches to get an initial program loaded. Then they leveraged that program to do other work, and it all just grows from there.
Stuart Feldman, then at AT&T, wrote the source code for make around the time of 7th Edition UNIX™, and used manual compilation (or maybe a shell script) until make was working well enough to be used to build itself. You can find the UNIX Programmer's Manual for 7th Edition online, and in particular, the original paper describing the original version of make, dated August 1978.
make is just one convenience tool. It is still possible to invoke cc, ld, etc. manually or via other scripting tools.
If you're building GNU make, have a look at build.sh in the source tree after running configure:
# Shell script to build GNU Make in the absence of any `make' program.
# build.sh. Generated from build.sh.in by configure.
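So, on a system with a C compiler but no make at all, the bootstrap looks roughly like this (per the comments in build.sh and GNU make's README):
./configure
sh build.sh      # compiles a ./make binary using only the shell and the C compiler
./make install   # from here on, the freshly built make can build and install itself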
Compiling C programs is not the only way to produce an executable file. The first make executable (or, more notably, the C compiler itself) could for example have been an assembly program, or it could have been hand-coded in machine code. It could also have been cross-compiled on a completely different system.
The essence of make is that it is a simplified way of running some commands.
To make the first make, the author had to act as make manually and run the compiler (or whatever toolset was available) by hand, rather than having it run automatically.
