Tool for analyzing a portable executable loaded into memory - security

There are a lot of tools designed to help with analyzing portable executable files, for example PE Explorer. We can load an .exe file into it and check things like the number of sections, the section alignment, or the virtual addresses of particular sections.
Is there any similar tool which allows me to do the same, but for a portable executable already loaded into memory, without access to its .exe file?
EDIT:
Maybe I will try to clarify what I'm trying to achieve. Let's say (as #0x90 suggested) that I have two, or maybe even three, applications.
app1.exe - executed by the user; creates a new process based on app3.exe and modifies its memory by putting app2.exe into it.
app2.exe - injected into app3.exe's memory by app1.exe.
app3.exe - used to create the new process.
I have the sources of all three applications. I'm simply trying to learn about Windows internals through practical exercises with injection/process hollowing. I have some bug in app1.exe which leads to:
The application was unable to start correctly (0xc0000018).
And I'm trying to find a way to debug this situation. My idea was to compare the PE on disk with the PE in memory and check whether it looks correct. I was surprised that I couldn't find a tool designed for such a purpose.

For clarity, the application you wish to examine will be called App1, while the application which loads the PE file's contents into memory will be called App2.
To my knowledge, all major disassemblers work on files.
This is due to the fact that App1 will be mapped into the address space of App2.
Your easiest solution is to dump the executable to disk from memory.
If you control the source of App2, this step is trivial.
If you don't, you will need to attach a debugger, identify the exact memory range where the PE file resides, and use the debugger's functionality to dump that memory range to disk.
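A minimal sketch of that last approach done programmatically rather than from a debugger (the file name and command-line interface are invented for illustration): assuming you already know the target PID, the image base, and the size of the mapped image (for example from your own injector code), you can read that range with ReadProcessMemory and write it to a file that a PE tool can open.

    /* pedump.c - hypothetical helper: dump a known memory range of another
     * process to disk. Assumes the PID, base address and size are already
     * known (e.g. from the injector's own bookkeeping or a debugger's
     * memory map). Error handling is minimal. */
    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        if (argc != 5) {
            fprintf(stderr, "usage: %s <pid> <base-hex> <size-hex> <out-file>\n", argv[0]);
            return 1;
        }

        DWORD pid = (DWORD)strtoul(argv[1], NULL, 0);
        ULONG_PTR base = (ULONG_PTR)strtoull(argv[2], NULL, 16);
        SIZE_T size = (SIZE_T)strtoull(argv[3], NULL, 16);

        HANDLE proc = OpenProcess(PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, FALSE, pid);
        if (!proc) {
            fprintf(stderr, "OpenProcess failed: %lu\n", GetLastError());
            return 1;
        }

        unsigned char *buf = malloc(size);
        SIZE_T got = 0;
        /* Note: ReadProcessMemory fails for the whole call if any page in the
         * range is not readable; a real tool would read page by page. */
        if (!buf || !ReadProcessMemory(proc, (LPCVOID)base, buf, size, &got)) {
            fprintf(stderr, "ReadProcessMemory failed: %lu\n", GetLastError());
            return 1;
        }

        FILE *out = fopen(argv[4], "wb");
        if (!out) { perror("fopen"); return 1; }
        fwrite(buf, 1, got, out);
        fclose(out);
        CloseHandle(proc);

        printf("dumped %lu bytes from 0x%llx to %s\n",
               (unsigned long)got, (unsigned long long)base, argv[4]);
        return 0;
    }

Keep in mind that a dump taken this way has its sections laid out at their virtual addresses (SectionAlignment), not at their file offsets (FileAlignment), so a PE viewer will show a somewhat different layout than for the original .exe unless you rebuild the file layout.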

Related

What exactly is a core dump of a process made with GDB, and what does it contain?

It's clear that GDB can generate a core dump of a process, for example via its wrapper gcore; however, it's not clear what exactly it includes, and it's really hard to find an answer because different sources say totally different things.
Because of that, I've got the following questions:
Does it contain the whole virtual memory of the process? If not, what part of it? All writable regions, only the stack, or just (as its name suggests) some essential part?
Is it possible to generate a core file with the complete memory of the process via GDB? If so, how?
What would be the difference between a "dump" generated by saving and concatenating memory from all regions according to the process's memory map and a file generated automatically with the gcore command?
A process wrote some data somewhere into RAM. Can I be sure that the data will (always) be included in a core file generated with the gcore command? If not, why not? What does it depend on?

How does chroot affect dynamic libraries memory use?

Although there is another question on a similar topic, it does not cover memory use by shared libraries in chrooted jails.
Let's say we have a few similar chroots; to be more specific, exactly the same sets of binary files and shared libraries, which are actually hard links to the master copies to conserve disk space (to prevent the possibility of file alteration, the file system is mounted read-only).
How is the memory use affected in such a setup?
As described in the man page for the chroot system call:
This call changes an ingredient in the pathname resolution process and does nothing else.
So, the shared library will be loaded in the same way as if it were outside the chroot jail (shared read-only pages, duplicated data, etc.).
http://man7.org/linux/man-pages/man2/chroot.2.html
Because hardlinks share the same underlying inode, the kernel treats them as the same item when it comes to caching/mapping.
You'll see filesystem cache savings by using hardlinks, as well as disk-space savings.
The biggest issue I'd have with this is that if someone manages to subvert the read-only nature of one of the chroot environments, they could then subvert all of them by modifying any of the hardlinked files.
When I set this up, I copied the shared libraries per chroot instead of linking to a read-only mount. With separate files, the text segments were not shared. It's likely that the same inode will map to the same read-only text segment, but this may vary with available memory management hardware and similar architectural details.
Try this experiment on your system: write a small program that makes some minimal use of a large shared library. Run twenty or thirty chroot jails as you describe, each with a running copy of the program. Check overall memory usage before & during running, and dissect one instance to get a good text/data segment breakdown. If memory use increases by the full size of the map for each instance, the segments are not shared. Conversely, if memory use goes up by a fraction of the map, the segments are shared.
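A minimal sketch of the probe program described above (the library path is just an example; point it at whatever large .so actually exists inside your chroots):

    /* libprobe.c - map a large shared library, then sit idle so memory use
     * can be inspected from outside (ps, pmap, /proc/PID/smaps). */
    #include <dlfcn.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *lib = "/usr/lib/libcrypto.so";   /* example path only */
        void *handle = dlopen(lib, RTLD_NOW);
        if (!handle) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }

        printf("pid %d: %s mapped, sleeping; inspect /proc/%d/smaps now\n",
               (int)getpid(), lib, (int)getpid());
        pause();                 /* block forever while measurements are taken */
        dlclose(handle);
        return 0;
    }

Compile it with something like gcc -o libprobe libprobe.c -ldl, run one copy in each jail, and compare the Shared_Clean/Private_Clean lines for the library mapping in each /proc/PID/smaps: if the pages are shared across jails, the shared counters dominate and total memory use grows only fractionally per instance.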

Can you load a tree structure in memory with Linux shell?

I want to create an application with a Linux shell script, something like this, but can it be done?
This application will create a tree containing data. The tree should be loaded into memory, and the tree (loaded in memory) should be readable from any other external Linux script.
Is it possible to do it with a Linux shell?
If yes, how can you do it?
And are there any simple examples for that?
There are a large number of misconceptions on display in the question.
Each process normally has its own memory; there's no trivial way to load 'the tree' into one process's memory and make it available to all other processes. You might devise a system of related programs that know about a shared memory segment (somehow — there's a problem right there) that contains the tree, but that's about it. They'd be special programs, not general shell scripts. That doesn't meet your 'any other external Linux script' requirement.
What you're seeking is simply not available in the Linux shell infrastructure. That answers your first question; the other two are moot given the answer to the first.
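For what that "system of related programs" route would look like, here is a minimal sketch in C (the segment name /demo_tree and the node layout are invented for illustration): a writer publishes a fixed-size tree in a POSIX shared memory object, and any cooperating program that knows the name can map it read-only. On Linux the object shows up as a file under /dev/shm, which ties in with the answer below.

    /* tree_writer.c - publish a tiny tree in POSIX shared memory.
     * On older glibc, link with -lrt. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SHM_NAME  "/demo_tree"    /* appears as /dev/shm/demo_tree */
    #define MAX_NODES 128

    struct node { int value; int left; int right; };   /* child indices, -1 = none */
    struct tree { int node_count; struct node nodes[MAX_NODES]; };

    int main(void)
    {
        int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0644);
        if (fd < 0) { perror("shm_open"); return 1; }
        if (ftruncate(fd, sizeof(struct tree)) < 0) { perror("ftruncate"); return 1; }

        struct tree *t = mmap(NULL, sizeof(*t), PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
        if (t == MAP_FAILED) { perror("mmap"); return 1; }

        /* Build a tiny tree: a root node with two children. */
        memset(t, 0, sizeof(*t));
        t->nodes[0] = (struct node){ .value = 10, .left = 1,  .right = 2  };
        t->nodes[1] = (struct node){ .value = 5,  .left = -1, .right = -1 };
        t->nodes[2] = (struct node){ .value = 20, .left = -1, .right = -1 };
        t->node_count = 3;

        printf("tree published in /dev/shm%s (%d nodes)\n", SHM_NAME, t->node_count);
        return 0;    /* readers: shm_open(SHM_NAME, O_RDONLY, 0) + mmap(PROT_READ) */
    }

Note that this is exactly the kind of dedicated program the answer refers to, not a general shell script; a shell script could at best read the /dev/shm file as an ordinary file.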
There is a related discussion here. They use the shared memory filesystem /dev/shm and, ostensibly, it works for multiple users. At least, it's worth a try:
http://www.linuxquestions.org/questions/linux-newbie-8/bash-is-it-possible-to-write-to-memory-rather-than-a-file-671891/
Edit: just tried it with two users on Ubuntu - looks like a normal directory and REALLY WORKS with the right chmod.
See also:
http://www.cyberciti.biz/tips/what-is-devshm-and-its-practical-usage.html
I don't think there is a way to do this if you want to keep all of the requirements:
Building this as a shell script
In-memory
Usable across terminals / from external scripts
You would have to give up at least one requirement:
Give up the shell script req - Build this in C to run as a Linux process. I understand this only well enough to say that it would be non-trivial.
Give up the in-memory req - You can serialize the tree and keep the data in a temp file. This works as long as the file is small and the performance bottleneck isn't around access to the tree. The good news is that you can use the data across terminals / from external scripts.
Give up usability from external scripts req - You can technically build a script and run it by sourcing it to add many (read: a mess of) variables representing the tree into your current shell session.
None of these alternatives are great, but if you had to go with one, number 2 is probably the least problematic.

How many copies of a program/class get loaded into memory when multiple users access it at the same time

We are trying to setup Eclipse in a shared environment, i.e., it will be installed on a server and each user connects to it using VNC. There are different reasons for sharing Eclipse, one being proper integration with ClearCase.
We identified that Eclipse is using large amounts of memory. We are wondering whether Eclipse (the JVM?) loads each class once per user/session, or whether there is any sort of sharing of objects that are already loaded into memory.
This makes me think about a basic question in general: how many copies of a program get loaded into memory when two or more users are accessing the host at the same time?
Is it one per user or a single copy is shared between users?
Two questions here:
1) How many copies of a program get loaded into memory when two or more users are using it at the same time?
2) How does the above hold in the world of Java/the JVM?
Linux allows sharing binary code between running processes, i.e. the segments that hold the executable parts of a program are mapped into the virtual memory space of each running copy. Each process then gets its own data parts (stack, heap, etc.).
The issue with Java, or almost any other interpreted language, is that the runtime, the JVM, treats bytecode as data, loading it into the heap. The fact that Java is half-compiled and half-interpreted is irrelevant here. This results in a situation where the JVM executable itself is eligible for code sharing by the OS, but your application's Java code is not.
In general, a single copy of a program (i.e. its text segment) is loaded into RAM and shared by all instances: the exact same read-only physical pages are memory-mapped (possibly, even probably, at different addresses in different address spaces, but it's still the same memory). Data is usually private to each process, i.e. each process's data lives in separate pages of RAM (though it can be shared).
BUT
The problem is that the actual program here is only the Java runtime interpreter, or the JIT compiler. Eclipse, like all Java programs, is data rather than a program (data which is, however, interpreted as a program). That data is either loaded into the process's private address space and interpreted by the JVM, or turned into an executable by the JIT compiler, resulting in a (temporary) executable binary, which is launched. This means that, in principle, each Java program runs as a separate copy, using separate RAM.
Now, you might of course be lucky, and the JVM might load the data as a shared mapping; in that case the bytecode would occupy the same RAM in all instances. However, whether that's the case is something only the author of the JVM could tell you, and it's not something you can rely on in general.
Also, depending on how clever the JIT is, it might cache that binary for some time and reuse it for identical Java programs, which would be very advantageous, not only because it saves the compilation. All instances launched from the same executable image share the same memory, so this would be just what you want.
It is even likely that this is done, at least to some extent, by your JIT compiler, because compiling is rather expensive and this is a common optimization.
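To see the first half of this in practice (the text segment being a shared, read-only, file-backed mapping), a small sketch: print the mappings of your own executable from /proc/self/maps and run two copies side by side; both show an r-xp mapping backed by the same file, which the kernel serves from the same page-cache pages.

    /* showmaps.c - print this process's executable-backed mappings. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char exe[4096] = {0};
        if (readlink("/proc/self/exe", exe, sizeof(exe) - 1) < 0) {
            perror("readlink");
            return 1;
        }

        FILE *maps = fopen("/proc/self/maps", "r");
        if (!maps) { perror("fopen"); return 1; }

        char line[4096];
        while (fgets(line, sizeof(line), maps)) {
            if (strstr(line, exe))       /* keep only mappings of our own binary */
                fputs(line, stdout);
        }
        fclose(maps);

        printf("pid %d done, sleeping so a second copy can be compared\n", (int)getpid());
        sleep(60);
        return 0;
    }

The bytecode of a Java application, by contrast, typically shows up inside the JVM process as anonymous heap memory rather than as a read-only file-backed mapping, which is exactly why the OS cannot share it the way it shares the JVM binary itself.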

Minimal core dump (stack trace + current frame only)

Can I configure what goes into a core dump on Linux? I want to obtain something like the Windows mini-dumps (minimal information about the stack frame when the app crashed). I know you can set a max size for the core files using ulimit, but this does not allow me to control what goes inside the core (i.e. there is no guarantee that if I set the limit to 64kb it will dump the last 16 pages of the stack, for example).
Also, I would like to set it in a programmatic way (from code), if possible.
I have looked at the /proc/PID/coredump_filter file mentioned by man core, but it seems too coarse-grained for my purposes.
To provide a little context: I need tiny core files, for multiple reasons: I need to collect them over the network from numerous (thousands of) clients; furthermore, these are embedded devices with small SD cards, and GPRS modems for the network connection. So anything above ~200 kB is out of the question.
EDIT: I am working on an embedded device which runs Linux 2.6.24. The processor is PowerPC. Unfortunately, powerpc-linux is not supported in Breakpad at the moment, so Google Breakpad is not an option.
I have "solved" this issue in two ways:
I installed a signal handler for SIGSEGV and used backtrace/backtrace_symbols to print out the stack trace (a sketch of such a handler follows below). I compiled my code with -rdynamic, so even after stripping the debug info I still get a backtrace with meaningful names (while keeping the executable compact enough).
I stripped the debug info and put it in a separate file, which I will store somewhere safe, using strip; from there, I will use addr2line with the addresses saved from the backtrace to understand where the problem happened. This way I have to store only a few bytes.
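A minimal sketch of the handler described in point 1 (an illustration of the approach, not the poster's exact code); build with -rdynamic so that backtrace_symbols_fd can print meaningful names:

    /* crash_bt.c - print a raw backtrace from a SIGSEGV handler.
     * backtrace_symbols_fd is used instead of printf-style calls because it
     * is safer inside a signal handler. */
    #include <execinfo.h>
    #include <signal.h>
    #include <unistd.h>

    static void crash_handler(int sig)
    {
        void *frames[32];
        int n = backtrace(frames, 32);

        static const char msg[] = "caught fatal signal, backtrace:\n";
        write(STDERR_FILENO, msg, sizeof(msg) - 1);
        backtrace_symbols_fd(frames, n, STDERR_FILENO);

        signal(sig, SIG_DFL);    /* restore default action and re-raise */
        raise(sig);
    }

    int main(void)
    {
        signal(SIGSEGV, crash_handler);

        volatile int *p = NULL;
        *p = 42;                 /* deliberate crash to exercise the handler */
        return 0;
    }

The printed addresses can later be fed to addr2line against the unstripped copy of the binary (or the separate debug-info file) to recover file and line information.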
Alternatively, I found I could use /proc/self/coredump_filter to dump no memory (by setting its content to "0"): only thread and proc info, registers, the stack trace, etc. are saved in the core. See more in this answer.
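Since the question also asked for a programmatic way, here is a small sketch of setting that filter from code at startup (set_coredump_filter is just a hypothetical helper name); the "0" mask is the same value described above:

    /* Set /proc/self/coredump_filter early so a later crash produces a
     * core without memory segments, as described above. */
    #include <stdio.h>

    static int set_coredump_filter(const char *mask)
    {
        FILE *f = fopen("/proc/self/coredump_filter", "w");
        if (!f)
            return -1;           /* file may be absent on very old kernels */
        int rc = (fputs(mask, f) >= 0) ? 0 : -1;
        fclose(f);
        return rc;
    }

    int main(void)
    {
        if (set_coredump_filter("0") != 0)
            perror("coredump_filter");

        /* ... rest of the application ... */
        return 0;
    }

One caveat: the filter value is inherited across fork and preserved across execve, so setting it in a parent launcher also affects the cores of its children.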
I still lose information that could be precious (the contents of global and local variables, parameters...). I could easily figure out which page(s) to dump, but unfortunately there is no way to specify a "dump these pages" list for normal core dumps (unless you are willing to go and patch the maydump() function in the kernel).
For now, I'm quite happy with these 2 solutions (it is better than nothing...). My next moves will be:
see how difficult it would be to port Breakpad to powerpc-linux: there are already powerpc-darwin and i386-linux ports, so... how hard can it be? :)
try to use google-coredumper to dump only a few pages around the current ESP (that should give me locals and parameters) and around "&some_global" (that should give me globals).

Resources