I am currently trying to use an eBPF program on a Raspberry Pi 3 Model B V1.2 that has Ubuntu installed. For managing the compilation, system calls and all that, I use the BPF Compiler Collection (BCC).
Whenever BCC tries to compile the program, I get a multitude of errors, one of which is "SMP is not supported on this platform" and another "SMP not supported on pre-ARMv6 CPUs".
This seems really strange to me, because looking at the headers, those should only occur if __LINUX_ARM_ARCH__ is smaller than 6.
"uname -m" gives me armv7l, which should be sufficient, right?
Looking at the kernel config, CONFIG_SMP is y and CONFIG_CPU_32v7 is y as well. So as far as I understand, everything seems correct.
So why doesn't it work and how can I fix it?
Also if you need more information, I'll gladly supply it. I've never been so deep into this stuff, so I don't know what's more and what's less important.
I am trying to track down a bug (a serious performance regression). Unfortunately, I wasn't able to figure out the cause by going back through many different versions of my code.
I suspect it could be down to modifications in libraries that I've updated, not to mention that in the meantime I've updated from GHC 7.4 to 7.6 (and if anybody knows whether some laziness behavior has changed, I would greatly appreciate hearing about it!).
I have an older executable of this code that does not have this bug, so I wonder if there are any tools that can tell me which library versions it was linked against? For example, by figuring it out from the symbols.
GHC creates executables, which are notoriously hard to understand... On my Linux box I can view the assembly code by typing in
objdump -d <executable filename>
but I get back over 100K lines of code from just a simple "Hello, World!" program written in Haskell.
If you happen to have the GHC .hi files, you can get some information about the executable by typing in
ghc --show-iface <hi filename>
This won't give you the assembly code, but you can get some extra information that may prove useful.
As I mentioned in the comment above, on Linux you can use "ldd" to see what C-system libraries you used in the compile, but that is also probably less than useful.
You can try to use a decompiler, but those are generally written to decompile to C, not anything higher level and certainly not Haskell. That being said, GHC compiles to C as an intermediary (at least it used to; has that changed?), so you might be able to learn something.
Personally, I often find viewing system calls in action much more interesting than viewing pure assembly. On my Linux box, I can view all system calls by running strace (use Wireshark for the network traffic equivalent):
strace <program executable>
This also will generate a lot of data, so it might only be useful if you know of some specific place where direct real world communication (i.e., changes to a file on the hard disk drive) goes wrong.
In all honesty, you are probably better off just debugging the problem from source, although, depending on the actual problem, some of these techniques may help you pinpoint something.
Most of these tools have Mac and Windows equivalents.
Since much has changed in the last 9 years, and apparently this is still the first result a search engine gives on this question (like for me, again), an updated answer is in order:
First of all, yes, while Haskell does not specify a bytecode format, bytecode is also just a kind of machine code, for a virtual machine. So for the rest of the answer I will treat them as the same thing. The “Core” as well as the LLVM intermediate language, or even WASM could be considered equivalent too.
Secondly, if your old binary is statically linked, then of course, no matter the format your program is in, no symbols will be available to check out. Because that is what linking does. Even with bytecode, and even with just classic static #include in simple languages. So your old binary will be no good, no matter what. And given the optimisations compilers do, a classic decompiler will very likely never be able to figure out what optimised bits used to be partially what libraries. Especially with stream fusion and such “magic”.
Third, you can do the things you asked with a modern Haskell program. But you need to have your binaries compiled with -dynamic and -rdynamic, so that not only the C-calling-convention libraries (e.g. .so) and the Haskell libraries, but also the runtime itself is dynamically loaded. That way you end up with a very small binary, consisting of only your actual code, dynamic linking instructions, and the exact data about what libraries and runtime were used to build it. And since the runtime is compiler-dependent, you will know the compiler too. So it would give you everything you need, but only if you compiled it right. (I recommend using such dynamic linking by default in any case as it saves memory.)
The last factor that one might forget, is that even the exact same compiler version might behave vastly differently, depending on what IT was compiled with. (E.g. if somebody put a backdoor in the very first version of GHC, and all GHCs after that were compiled with that first GHC, and nobody ever checked, then that backdoor could still be in the code today, with no traces in any source or libraries whatsoever. … Or for a less extreme case, that version of GHC your old binary was built with might have been compiled with different architecture options, leading to it putting more optimised instructions into the binaries it compiles for unless told to cross-compile.)
Finally, of course, you can profile even compiled binaries, by profiling their system calls. This will give you clues about which part of the code acted differently and how. (E.g. if you notice that your new binary floods the system with some slow system calls where the old one just used a single fast one. A classic OpenGL example would be using fast display lists versus slow direct calls to draw triangles. Or using a different sorting algorithm, or having switched to a different kind of data structure that fits your work load badly and thrashes a lot of memory.)
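To make the comparison of system calls between the old and the new binary a bit more concrete, here is a minimal sketch in Python. It assumes you have captured one strace log per binary (for example with strace -f -o old.log ./old-binary; the file and binary names are just placeholders) and that the log lines have the usual "name(args) = result" shape. The function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the syscall name at the start of a typical strace line,
# optionally preceded by a PID column (as produced by `strace -f`).
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def count_syscalls(lines):
    """Count how often each syscall name appears in strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:  # skips signal and exit lines such as "+++ exited with 0 +++"
            counts[match.group(1)] += 1
    return counts


def compare_counts(old, new):
    """Return {syscall: (old_count, new_count)} for syscalls whose counts differ."""
    names = set(old) | set(new)
    return {name: (old[name], new[name]) for name in sorted(names) if old[name] != new[name]}
```

Feeding it the lines of both logs and looking at the biggest differences should at least tell you which kind of operation changed between the two builds. If you only need totals for a single run, strace -c prints a summary table on its own.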
Suppose I've written a simple program in C. It builds and runs successfully on my primary arch.
Now I want to find out on which architectures the program can be built and works, and also provide pre-built executables for download for a variety of platforms. However, I only have access to a few of them.
The most obvious way seems to be to set up as many cross-compiling tool chains and as many executable images for different architectures as possible. But this seems inconvenient (especially if you just want it for one little program).
How can I do it in a simple way? Should I use some online service which provides development systems that are already set up for various architectures?
Expecting something like this:
user$ ssh i386.buildhere.example
guest#i386 $ echo 'int main(){}' > hello.c
guest#i386 $ gcc hello.c -o hello
guest#i386 $ ./hello
guest#i386 $ file hello
hello: ELF 32-bit LSB executable, Intel 8038....
user$ ssh armel.buildhere.example
guest#armel $ ....
...
An additional bonus would be if there were also various outdated systems available to test "how would my program behave on that legacy distribution?".
There is one thing that is ALMOST certain in programming: Whatever you do to test something will probably work, but whatever you haven't tested will fail when you give it to a customer.
Use of virtual machines will help to some degree to avoid having to buy unusual hardware, as do things like QEMU.
Unless your program is either REALLY trivial, or you want to use your customers [1] as guinea-pigs, you are best off testing on every platform-type that you want to release for. If you don't, it WILL come back and bite you at some point.
If you don't, you run the risk of some of your customers getting an "unhappy experience". An unhappy customer tells ten people, a happy customer tells maybe one person.
If you wish to support architectures/stuff that you don't have access to, maybe just having a "help yourself" option of source-code is a better choice than downloadable binaries.
Of course, you can rent time/space on servers of various kinds - I looked into writing an iPhone app, and there are places that run Macs as virtual machines on the net that you can rent for around US$ 15 per month, for example.
[1] Throughout this answer, by customer, I mean anyone that downloads your software, regardless of whether they actually give some money to anyone, they will have spent some effort to get your software on their machine. If it doesn't work, then they will be unhappy to some degree. How unhappy depends on a number of things, including how clear you made it that "this may not work" if you were to publish "untested" software.
I work from 2 different machines. One is Windows and the other is Linux. If I alternately work on the same project but switch between both OSes, will I eventually run into compiling errors? I ask because maybe there are standards supported by one but not by the other.
That question is a pretty broad one and it depends, strictly speaking, on your tool chain. If you were to use the same tool chain (e.g. GCC/MinGW or Clang), you'd be minimizing the chance for this class of errors. If you were to use Visual Studio on Windows and GCC or Clang on the Linux side, you'd run into more issues, if only because some of the headers differ. So once your program leaves the realm of strict ANSI C (C89) you'll be on your own.
However, if you aren't careful you may run into a lot of other, more mundane errors, such as the compiler on Linux choking on the line endings if you didn't tell your editor on the Windows side to use Unix-style (LF) line endings.
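If you want to check whether a file has picked up Windows line endings before it reaches the Linux side, a small script is enough. The following is only a sketch in Python; the function names are made up for illustration, and it works on raw bytes so that no text decoding gets in the way.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF (Windows), bare LF (Unix) and bare CR line endings."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # every CRLF also contains one LF
    cr = data.count(b"\r") - crlf  # every CRLF also contains one CR
    return {"crlf": crlf, "lf": lf, "cr": cr}


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving everything else untouched."""
    return data.replace(b"\r\n", b"\n")
```

Reading a source file with open(path, "rb").read() and passing the result to count_line_endings shows at a glance whether the file is mixed; in practice you would more likely let your editor or version control system handle the conversion.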
Ah, and also keep in mind that if you want to actually cross-compile, GCC may be the best choice and therefore the first part I mentioned in my answer becomes a moot point. GCC is a proven choice on both ends. And given your question it's unlikely that you are trying to write something like a kernel mode driver - which would be fundamentally different.
That may happen only if your application uses some platform-specific API.
It is entirely possible to write code that works on both platforms, with no issues to compile the code. It is, however, not without some difficulties. Compilers allow you to use non-standard features in the compiler, and it's often hard to do more fancy user interfaces (even if it's still just text) because as soon as you start wanting to do more than "read a line of text as it is entered in a shell", it's into "non-standard" land.
If you do find yourself needing to do more than what the standard C library can do, make sure you isolate those parts of the code into a separate file (or a couple of files, one for Linux/Unix style systems and one for Windows systems).
Using the same compiler (gcc) would help avoid problems with "compiler B doesn't compile code that works fine in compiler A".
But it's far from an absolute necessity - just make sure you compile the code on both platforms and with all of your "supported" compilers often enough that you haven't dug a very deep hole that is hard to get out of before you discover that "it's not working on the other system". It certainly helps if you have (at least) a virtual machine running the other OS, so you can easily try both variants.
Ideally, you want to set up an automated system, such that when you change the code [and feel that the changes are "complete"], it automatically gets built on both platforms and all compilers you want to use. And if possible, also automatically tested!
I would also seriously consider using version control - that way, when something breaks on one or the other side, you can go back and look at what the code looked like before it stopped working, and (hopefully) find the reason it broke much quicker than "Hmm, I think it's the change I made to foo.c, let's take that out... No, not that one, ok how about the change here..." - at least with version control, you can say "Ok, so version 1234 doesn't work, let's try version 1220 - ok, that works. Now try 1228, still works - so the change is between 1229 and 1234 - try 1232, ah, it's broken..." No editing files and you can still go to any other version you like with very little difficulty. I have used Mercurial quite a bit, Git a little bit, some Subversion, and worked on a project in Perforce for a few years. All of these are good - personally, I think I prefer Mercurial.
As a side-effect: Most version control systems also deal with filenames and line endings in a saner way than doing this manually.
If you combine your version control system with an "automated build and test system", such as Jenkins, you can get everything very automated. Jenkins is free and runs on both Windows and Linux, and you can use it to automatically build and test your code as and when you submit the code to the version control system.
It will not create a problem as long as you recompile the source code on the respective OS. If you want to run a compiled file generated on Windows (.exe or .obj) on Linux, or vice versa, then it will definitely create a problem and won't be possible. But you can move your source code (files with extension .c/.c++) to either OS. Sometimes differing header files also create problems, so take care of that as well. Best practice is to use a single OS for your entire project and avoid multiple OSes unless it is really necessary.
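The "try 1220, then 1228, then 1232" search described above is a binary search over revisions. Here is a minimal sketch of that idea in Python. It assumes revisions are plain increasing integers and that once the code breaks it stays broken in every later revision; is_good is a hypothetical callback that would check out a revision, build it and run your test.

```python
def first_bad_revision(good, bad, is_good):
    """Return the first revision that fails.

    Assumes good < bad, that is_good(good) is true, that is_good(bad) is
    false, and that every revision after the first bad one is also bad.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid  # the breakage happened after mid
        else:
            bad = mid   # the breakage happened at or before mid
    return bad
```

With 1220 known good and 1234 known bad, this needs at most four build-and-test rounds instead of fourteen. Mercurial and Git both ship a bisect command (hg bisect, git bisect) that does this bookkeeping for you, so in practice you would rarely write it yourself.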
I am currently a student at a university studying a computing related degree and my current project is focusing on finding vulnerabilities in the Linux kernel. My aim is to both statically audit as well as 'fuzz' the kernel (targeting version 3.0) in an attempt to find a vulnerability.
My first question is simple: is fuzzing the Linux kernel possible? I have heard of people fuzzing plenty of protocols etc., but never much about kernel modules. I also understand that on a Linux system everything can be seen as a file, so surely input to the kernel modules should be possible via that interface, shouldn't it?
My second question is: which fuzzer would you suggest? As previously stated lots of fuzzers exist that fuzz protocols however I don't see many of these being useful when attacking a kernel module. Obviously there are frameworks such as the Peach fuzzer which allows you to 'create' your own fuzzer from the ground up and are supposedly excellent however I have tried repeatedly to install Peach to no avail and I'm finding it difficult to believe it is suitable given the difficulty I've already experienced just installing it (if anyone knows of any decent installation tutorials please let me know :P).
I would appreciate any information you are able to provide me with this problem. Given the breadth of the topic I have chosen, any idea of a direction is always greatly appreciated. Equally, I would like to ask people to refrain from telling me to start elsewhere. I do understand the size of the task at hand however I will still attempt it regardless (I'm a blue-sky thinker :P A.K.A stubborn as an Ox)
Cheers
A.Smith
I think a good starting point would be to extend Dave Jones's Linux kernel fuzzer, Trinity: http://codemonkey.org.uk/2010/12/15/system-call-fuzzing-continued/ and http://codemonkey.org.uk/2010/11/09/system-call-abuse/
Dave seems to find more bugs whenever he extends that a bit more. The basic idea is to look at the system calls you are fuzzing, and rather than passing in totally random junk, make your fuzzer choose random junk that will at least pass the basic sanity checks in the actual system call code. In other words, you use the kernel source to let your fuzzer get further into the system calls than totally random input would usually go.
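To illustrate the "random junk that still passes the sanity checks" idea, here is a rough sketch in Python. It only generates argument lists and does not invoke any system call. The argument kinds ("fd", "len", anything else treated as a plain integer), the boundary values and the 90% figure are all assumptions picked for illustration; they are not taken from Trinity. A real fuzzer would derive them from the kernel source for each system call.

```python
import random

# Hypothetical boundary values, chosen for illustration only.
BOUNDARY_LENGTHS = [0, 1, 4095, 4096, 4097, 2**31 - 1, 2**32 - 1]
BOUNDARY_INTS = [0, 1, -1, 2**31 - 1, -2**31, 2**63 - 1]
INVALID_FDS = [-1, 2**31 - 1]


def gen_arg(kind, rng, open_fds):
    """Generate one argument value for the given (hypothetical) argument kind."""
    if kind == "fd":
        # Mostly hand out descriptors that are really open, so the call gets
        # past the initial descriptor lookup; occasionally use a bad one.
        if open_fds and rng.random() < 0.9:
            return rng.choice(open_fds)
        return rng.choice(INVALID_FDS)
    if kind == "len":
        # Lengths around page-size and integer-width boundaries.
        return rng.choice(BOUNDARY_LENGTHS)
    # Any other kind is treated as a plain integer.
    return rng.choice(BOUNDARY_INTS)


def gen_call(name, arg_kinds, rng, open_fds):
    """Return (syscall name, argument list) for one fuzzing attempt."""
    return name, [gen_arg(kind, rng, open_fds) for kind in arg_kinds]
```

For example, gen_call("read", ["fd", "int", "len"], random.Random(0), [3, 4]) yields a read call whose first argument is usually a valid descriptor, which is what lets the fuzzer get further into the system call than purely random input would.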
"Fuzzing" the kernel is quite a broad way to describe your goals.
From a kernel point of view you can try to fuzz:
the system calls
the character and block devices in /dev
Not sure what you want to achieve.
Fuzzing the system calls would mean checking out every Linux system call (http://linux.die.net/man/2/syscalls) and trying to see whether you can disturb regular operation with odd parameter values.
Fuzzing character or block drivers would mean trying to send data via the /dev interfaces in a way which would end up in odd results.
Also you have to differentiate between attempts by an unprivileged user and by root.
My suggestion is narrowing down your attempts to a subset of your proposition. It's just too damn broad.
Good luck -
Alex.
One way to fuzz the kernel is via system call fuzzing.
Essentially the idea is to take the system call and fuzz the inputs over the entire range of possible values - whether they remain within the specification defined for the system call does not matter.
What are the input limitations of a bare metal cross compiler? As in, does it not compile programs with pointers or mallocs, or anything that would require more than the underlying hardware? Also, how can one find these limitations?
I also wanted to ask: I built a cross compiler for a MIPS target, and I need to create a MIPS executable using this cross compiler, but I am not able to find where the executable is. There is one executable which I found, mipsel-linux-cpp, which is supposed to compile, assemble and link and then produce a.out, but it is not doing so.
However, ./cc1 gives MIPS assembly.
There is an install folder which has a gcc executable which uses i386 assembly and then gives an exe. I don't understand how the gcc exe can give i386 and not MIPS assembly when I have specified the target as MIPS.
Please help, I'm really not able to understand what is happ...
I followed the following steps:
1. Installed binutils 2.19
2. configured gcc for mips..(g++,core)
I would suggest that you should have started two separate questions.
The GNU toolchain does not have any OS dependencies, but the GNU library does. Most bare-metal cross builds of GCC use the Newlib C library which provides a set of syscall stubs that you must map to your target yourself. These stubs include low-level calls necessary to implement stream I/O and heap management. They can be very simple or very complex depending on your needs. If the only I/O support is to a UART to stdin/stdout/stderr, then it is simple. You don't have to implement everything, but if you do not implement the I/O stubs, you won't be able to use printf() for example. You must implement the sbrk()/sbrk_r() syscall if you want malloc() to work.
The GNU C++ library will work correctly with Newlib as its underlying library. If you use C++, the C runtime start-up (usually crt0.s) must include the static initialiser loop to invoke the constructors of any static objects that your code may include. The run-time start-up must also of course initialise the processor, clocks, SDRAM controller, timers, MMU etc; that is your responsibility, not the compiler's.
I have no experience of MIPS targets, but the principles are the same for all processors. There is a very useful article called "Building Bare Metal ARM with GNU" which you may find helpful; much of it will be relevant, especially the parts regarding implementing Newlib stubs.
Regarding your other question, if your compiler is called mipsel-linux-cpp, then it is not a 'bare-metal' build but rather a Linux build. Also this executable does not really "compile, assemble and link"; it is rather a driver that separately calls the pre-processor, compiler, assembler and linker. It has to be configured correctly to invoke the cross-tools rather than the host tools. I generally invoke the linker separately in order to enforce decisions about which standard library to link (-nostdlib), and also because it makes more sense when an application is comprised of multiple execution units. I cannot offer much help other than that here since I have always used GNU-ARM tools built by people with obviously more patience than me, and moreover hosted on Windows, where there is less possibility of the host tool-chain being invoked instead (one reason why I have also avoided those tool-chains that rely on Cygwin).
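If you are unsure whether a given output file came from the cross-tools or from the host tools, the ELF header tells you which machine it targets (this is what the file command reads as well). Below is a minimal sketch in Python that decodes just the class, byte order and machine fields. The function name is made up, and the machine table is deliberately tiny; it only lists a handful of well-known e_machine values.

```python
import struct

# A few well-known e_machine values from the ELF specification.
MACHINES = {3: "Intel 80386", 8: "MIPS", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes) -> str:
    """Describe class, byte order and target machine from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
    order = {1: "LSB", 2: "MSB"}.get(header[5], "unknown byte order")
    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    endian = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return f"ELF {bits} {order}, {MACHINES.get(machine, f'machine {machine}')}"
```

Calling describe_elf(open("a.out", "rb").read(20)) on an object file or executable should report MIPS if the cross-assembler was really used, and Intel 80386 if the host tools were picked up instead.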
EDIT
With more time available, I have rewritten my original answer in an attempt to provide something more useful.
I cannot provide a specific answer for your question. I have never tried to get code running on a MIPS machine. What I do have is plenty of experience getting a variety of "bare metal" boards up and running. All kinds of CPUs and all kinds of compilers and cross compilers. So I have an understanding of the principles that apply in all such situations. I will point out the kind of knowledge you will need to absorb before you can hope to succeed with a job like this, and hopefully I can list some links to resources to get you started on learning that knowledge.
I am worried you don't know that pointers are exactly the kind of thing a bare metal compiler can handle; they are a basic machine primitive. This tells me you are probably not an expert embedded developer who is just stuck in this particular scenario. Never mind. There isn't anything magic about programming an embedded system, and you can learn what you need to know.
The first step is getting to understand the relationship between C and the machine you wish to run code on. Basically C is a portable assembly language. This means that C is good for manipulating the basic operations of the machine. In this sense the basic operations of the machine are reading and writing memory locations, performing arithmetic and boolean operations on the data read from memory, and making branching and looping decisions based on that data. In particular the C concept of pointers allows you to manipulate data at locations in memory that you specify.
So far so good, but just doing raw computations in memory is not usually enough - you need a way to input and output data from memory. To do that you need to manipulate the hardware peripherals on your board. If the hardware peripherals are memory mapped then the machine registers used to control the peripherals look exactly like memory locations and C can manipulate them directly. Even in that case though, it is much more likely that doing useful I/O is best handled by extending the C core language with a library of routines provided just for that purpose. These library routines handle all the nasty details (timers, interrupts, non-memory mapped I/O) involved in manipulating the peripheral hardware on the board, and wrap them up with a convenient C function call interface. The idea is that you can simply write printf("hello world"); and the library call takes care of the details of displaying the string.
An appropriately skilled developer knows how to adapt an existing I/O library to a new board, or how to develop new library routines to provide access to non-standard custom hardware. The classic way to develop these skills is to start with something simple, usually an LED for an output device, and a switch for an input device. Write a program that pulses an LED in a predictable way, or reads a switch and reflects it on an LED. The first time you get this working will be hugely satisfying.
Okay, I have rambled enough. It is time to provide some more resources for you to study. The good news is that there's never been a better time to learn how things work at the interface between hardware and software. There is a wealth of freely available code and docs. Stack Overflow is a great resource, as you know. Good luck! Links follow:
Embedded systems overview
Knowing the C language well is fundamental
Why not get your code working on a simulator before you try real hardware
Another emulated environment
Linux device drivers - an overlapping subject
Another book about bare metal programming