Buffer overflow exploitation 101 - security

I've heard in a lot of places that buffer overflows, illegal indexing in C like languages may compromise the security of a system. But in my experience all it does is crash the program I'm running. Can anyone explain how buffer overflows could cause security problems? An example would be nice.
I'm looking for a conceptual explanation of how something like this could work. I don't have any experience with ethical hacking.

First, buffer overflow (BOF) are only one of the method of gaining code execution. When they occur, the impact is that the attacker basically gain control of the process. This mean that the attacker will be able to trigger the process in executing any code with the current process privileges (depending if the process is running with a high or low privileged user on the system will respectively increase or reduce the impact of exploiting a BOF on that application). This is why it is always strongly recommended to run applications with the least needed privileges.
Basically, to understand how BOF works, you have to understand how the code you have build gets compiled into machine code (ASM) and how data managed by your software is stored in memory.
I will try to give you a basic example of a subcategory of BOF called Stack based buffer overflows :
Imagine you have an application asking the user to provide a username.
This data will be read from user input and then stored in a variable called USERNAME. This variable length has been allocated as a 20 byte array of chars.
For this scenario to work, we will consider the program's do not check for the user input length.
At some point, during the data processing, the user input is copied to the USERNAME variable (20bytes) but since the user input is longer (let's say 500 bytes) data around this variable will be overwritten in memory :
Imagine such memory layout :
size in bytes 20 4 4 4
data [USERNAME][variable2][variable3][RETURN ADDRESS]
If you define the 3 local variables USERNAME, variable2 and variable3 the may be store in memory the way it is shown above.
Notice the RETURN ADDRESS, this 4 byte memory region will store the address of the function that has called your current function (thanks to this, when you call a function in your program and readh the end of that function, the program flow naturally go back to the next instruction just after the initial call to that function.
If your attacker provide a username with 24 x 'A' char, the memory layout would become something like this :
size in bytes 20 4 4 4
data [USERNAME][variable2][variable3][RETURN ADDRESS]
new data [AAA...AA][ AAAA ][variable3][RETURN ADDRESS]
Now, if an attacker send 50 * the 'A' char as a USERNAME, the memory layout would looks like this :
size in bytes 20 4 4 4
data [USERNAME][variable2][variable3][RETURN ADDRESS]
new data [AAA...AA][ AAAA ][ AAAA ][[ AAAA ][OTHER AAA...]
In this situation, at the end of the execution of the function, the program would crash because it will try to reach the address an invalid address 0x41414141 (char 'A' = 0x41) because the overwritten RETURN ADDRESS doesn't match a correct code address.
If you replace the multiple 'A' with well thought bytes, you may be able to :
overwrite RETURN ADDRESS to an interesting location.
place "executable code" in the first 20 + 4 + 4 bytes
You could for instance set RETURN ADDRESS to the address of the first byte of the USERNAME variable (this method is mostly no usable anymore thanks to many protections that have been added both to OS and to compiled programs).
I know it is quite complex to understand at first, and this explanation is a very basic one. If you want more detail please just ask.
I suggest you to have a look at great tutorials like this one which are quite advanced but more realistic

Related

Getting the percentage of used space and used inodes in a mount

I need to calculate the percentage of used space and used inodes for a mount path (e.g. /mnt/mycustommount) in Go.
This is my attempt:
var statFsOutput unix.Statfs_t
err := unix.Statfs(mount_path, &statFsOutput)
if err != nil {
return err
}
totalBlockCount := statFsOutput.Blocks // Total data blocks in filesystem
freeSystemBlockCount = statFsOutput.Bfree // Free blocks in filesystem
freeUserBlockCount = statFsOutput.Bavail // Free blocks available to unprivileged user
Now the proportion I need would be something like this:
x : 100 = (totalBlockCount - free[which?]BlockCount) : totalBlockCount
i.e. x : 100 = usedBlockCount : totalBlockCount . But I don't understand the difference between Bfree and Bavail (what's 'unprivileged' user go to do with filesystem blocks?).
For inodes my attempt:
totalInodeCount = statFsOutput.Files
freeInodeCount = statFsOutput.Ffree
// so now the proportion is
// x : 100 = (totalInodeCount - freeInodeCount) : totalInodeCount
How to get the percentage for used storage?
And is the inodes calculation I did correct?
Your comment expression isn't valid Go, so I can't really interpret it without guessing. With guessing, I interpret it as correct, but have I guessed what you actually mean, or merely what I think you mean? In other words, without showing actual code, I can only imagine what your final code will be. If the code I imagine isn't the actual code, the correctness of the code I imagine you will write is irrelevant.
That aside, I can answer your question here:
(what's 'unprivileged' user go to do with filesystem blocks?)
The Linux statfs call uses the same fields as 4.4BSD. The default 4.4BSD file system (the one called the "fast file system") uses a blocks-with-fragmentation approach to allocate blocks in a sort of stochastic manner. This allocation process works very well on an empty file system, and continues to work well, without extreme slowdown, on somewhat-full file systems. Computerized modeling of its behavior, however, showed pathological slowdowns (amounting to linear search, more or less) were possible if the block usage exceeded somewhere around 90%.
(Later, analysis of real file systems found that the slowdowns generally did not hit until the block usage exceeded 95%. But the idea of a 10% "reserve" was pretty well established by then.)
Hence, if a then-popular large-size disk drive of 400 MB1 gave 10% for inodes and another 10% for reserved blocks, that meant that ordinary users could allocate about 320 MB of file data. At that point the drive was "100% full", but it could go to 111% by using up the remaining blocks. Those blocks were reserved to the super-user though.
These days, instead of a "super user", one can have a capability that can be granted or revoked. However, these days we don't use the same file systems either. So there may be no difference between bfree and bavail on your system.
1Yes, the 400 MB Fujitsu Eagle was a large (in multiple senses: it used a 19 inch rack mount setup) drive back then. People are spoiled today with their multi-terabyte SSDs. 😀

What is the reason of Corrupted fields problem in SPLUNK?

I have a problem on this search below for last 25 days:
index=syslog Reason="Interface physical link is down" OR Reason="Interface physical link is up" NOT mainIfname="Vlanif*" "nw_ra_a98c_01.34_krtti"
Normally field7 values are like these ones:
Region field7 Date mainIfname Reason count
ASYA nw_ra_m02f_01.34pndkdv may 9 GigabitEthernet0/3/6 Interface physical link is up 3
ASYA nw_ra_m02f_01.34pldtwr may 9 GigabitEthernet0/3/24 Interface physical link is up 2
But recently they wee like this:
00:00:00.599 nw_ra_a98c_01.34_krtti
00:00:03.078 nw_ra_a98c_01.34_krtti
I think problem may be related to:
It started to happen after the disk free alarm. (-Cri- Swap reservation, bottleneck situation, current value: 95.00% exceeds configured threshold: 90.00%. : 07:17 17/02/20)
Especially This is not about disk, it's about swap space, the application finishes memory and then goes to swap use. There was memory increase before, but obviously it was insufficient, it is switching to swap again.
I need to understand: ''Why they use so many resources?''
Problematic one:
Normal one:
You need to provide example events, one from the normal situation, and one from the problematic situation.
It appears that someone in your environment has developed a field extraction for field7, which is incorrectly parsing the event.
Alternatively, it could the device that is sending the syslog data, may have an issue with it and it is reporting an error. Depending on the device, you may be better using a TA from splunkbase.splunk.com to extract the relevant information from the event

What does the IS_ALIGNED macro in the linux kernel do?

I've been trying to read the implementation of a kernel module, and I'm stumbling on this piece of code.
unsigned long addr = (unsigned long) buf;
if (!IS_ALIGNED(addr, 1 << 9)) {
DMCRIT("#%s in %s is not sector-aligned. I/O buffer must be sector-aligned.", name, caller);
BUG();
}
The IS_ALIGNED macro is defined in the kernel source as follows:
#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
I understand that data has to be aligned along the size of a datatype to work, but I still don't understand what the code does.
It left-shifts 1 by 9, then subtracts by 1, which gives 111111111. Then 111111111 does bitwise-and with x.
Why does this code work? How is this checking for byte alignment?
In systems programming it is common to need a memory address to be aligned to a certain number of bytes -- that is, several lowest-order bits are zero.
Basically, !IS_ALIGNED(addr, 1 << 9) checks whether addr is on a 512-byte (2^9) boundary (the last 9 bits are zero). This is a common requirement when erasing flash locations because flash memory is split into large blocks which must be erased or written as a single unit.
Another application of this I ran into. I was working with a certain DMA controller which has a modulo feature. Basically, that means you can allow it to change only the last several bits of an address (destination address in this case). This is useful for protecting memory from mistakes in the way you use a DMA controller. Problem it, I initially forgot to tell the compiler to align the DMA destination buffer to the modulo value. This caused some incredibly interesting bugs (random variables that have nothing to do with the thing using the DMA controller being overwritten... sometimes).
As far as "how does the macro code work?", if you subtract 1 from a number that ends with all zeroes, you will get a number that ends with all ones. For example, 0b00010000 - 0b1 = 0b00001111. This is a way of creating a binary mask from the integer number of required-alignment bytes. This mask has ones only in the bits we are interested in checking for zero-value. After we AND the address with the mask containing ones in the lowest-order bits we get a 0 if any only if the lowest 9 (in this case) bits are zero.
"Why does it need to be aligned?": This comes down to the internal makeup of flash memory. Erasing and writing flash is a much less straightforward process then reading it, and typically it requires higher-than-logic-level voltages to be supplied to the memory cells. The circuitry required to make write and erase operations possible with a one-byte granularity would waste a great deal of silicon real estate only to be used rarely. Basically, designing a flash chip is a statistics and tradeoff game (like anything else in engineering) and the statistics work out such that writing and erasing in groups gives the best bang for the buck.
At no extra charge, I will tell you that you will be seeing a lot of this type of this type of thing if you are reading driver and kernel code. It may be helpful to familiarize yourself with the contents of this article (or at least keep it around as a reference): https://graphics.stanford.edu/~seander/bithacks.html

Variable length messages in Verilog (serial CRC-32)

I'm working with a serial protocol. Messages are of variable length that is known in advance. On both transmission and reception sides, I have the message saved to a shift register that is as long as the longest possible message.
I need to calculate CRC32 of these registers, the same as for Ethernet, as fast as possible. Since messages are variable length (anything from 12 to 64 bits), I chose serial implementation that should run already in parallel with reception/transmission of the message.
I ran into a problem with organization of data before calculation. As specified here , the data needs to be bit-reversed, padded with 32 zeros and complemented before calculation.
Even if I forget the part about running in parallel with receiving or transmitting data, how can I effectively get only my relevant message from max-length register so that I can pad it before calculation? I know that ideas like
newregister[31:0] <= oldregister[X:0] // X is my variable length
don't work. It's also impossible to have the generate for loop clause that I use to bit-reverse the old vector run variable number of times. I could use a counter to serially move data to desired length, but I cannot afford to lose this much time.
Alternatively, is there an operation that would directly give me the padded and complemented result? I do not even have an idea how to start developing such an idea.
Thanks in advance for any insight.
You've misunderstood how to do a serial CRC; the Python question you quote isn't relevant. You only need a 32-bit shift register, with appropriate feedback taps. You'll get a million hits if you do a Google search for "serial crc" or "ethernet crc". There's at least one Xilinx app note that does the whole thing for you. You'll need to be careful to preload the 32-bit register with the correct value, and whether or not you invert the 32-bit data on completion.
EDIT
The first hit on 'xilinx serial crc' is xapp209, which has the basic answer in fig 1. On top of this, you need the taps, the preload value, whether or not to invert the answer, and the value to check against on reception. I'm sure they used to do all this in another app note, but I can't find it at the moment. The basic references are the Ethernet 802.3 spec (3.2.8 Frame check Sequence field, which was p27 in the original book), and the V42 spec (8.1.1.6.2 32-bit frame check sequence, page 311 in the old CCITT Blue Book). Both give the taps. V42 requires a preload to all 1's, invert of completion, and gives the test value on reception. Warren has a (new) chapter in Hacker's Delight, which shows the taps graphically; see his website.
You only need the online generators to check your solution. Be careful, though: they will generally have different preload values, and may or may not invert the result, and may or may not be bit-reversed.
Since X is a viarable, you will need to bit assignments with a for-loop. The for-loop needs to be inside an always block and the for-loop must static unroll (ie the starting index, ending index, and step value must be constants).
for(i=0; i<32; i=i+1) begin
if (i<X)
newregister[i] <= oldregister[i];
else
newregister[i] <= 1'b0; // pad zeros
end

Stack Overflow (Shellcoder's Handbook)

I'm currently following the erratas for the Shellcoder's Handbook (2nd edition).
The book is a little outdated but pretty good still. My problem right now is that I can't guess how long my payload needs to be I tried to follow every step (and run gdb with the same arguments) and I tried to guess where the buffer starts, but I don't know exactly. I'm kind of new to this too so it makes sense.
I have a vulnerable program with strcpy() and a buffer[512]. I want to make the stack overflow, so I run some A's with the program (as the Erratas for the Shellcoders Handbook). I want to find how long the payload needs to be (no ASLR) so in theory I just need to find where the buffer is.
Since I'm new I can't post an image, but the preferred output from the book has a full 4 row of 'A's (0x41414141), and mine is like this:
(gdb) x/20xw $esp - 532
0xbffff968 : 0x0000000 0xbfffffa0e 0x41414141 0x41414141
0xbffff968 0x41414141 0x41414141 0x00004141 0x0804834
What address is that? How I know where this buffer starts? I want to do this so I can keep working with the book. I realize that the buffer is somewhere in there because of the A's that I ran. But if I want to find how long the payload needs to be I need the point where it starts.
I'm not sure that you copied the output of gdb correctly. You used the command x/20xw, this says you'd like to examine 20 32-bit words of memory, displayed as hex. As such, each item of data displayed should consist of 0x followed by 8 characters. You have some some with only 7, and some with 9. I'll assume that you copied out the text by hand and made a few mistakes.
The address is the first item displayed on the line, so, for the first line the address is 0xbffff968, this is the address of the first byte on the line. From there you can figure out the address of every other byte on the line by counting.
Your second line looks a little messed up, you have the same address, and also you're missing the : character, again, I'll assume this is just a result of the copy. I would expect the address of the second line to be 0xbffff978.
If the buffer starts with the first word of 0x41414141 then this is at address 0xbffff970, though an easier way to figure out the address of a variable is just to ask gdb for the address of the variable, so, in your case, once gdb is stopped at a place where buffer is in scope:
(gdb) p &buffer
$1 = (char (*)[512]) 0xbffff970
Metasploit has a nice tool to assist with calculating the offset. It will generate a string that contains unique patterns. Using this pattern (and the value of EIP or whatever other location after using the pattern), you can see how big the buffer should be to write exactly into EIP or whatever other location.
Open the tools folder in the metasploit framework3 folder (I’m using a linux version of metasploit 3). You should find a tool called pattern_create.rb. Create a pattern of 5000 characters and write it into a file:
root#bt:/pentest/exploits/framework3/tools# ./pattern_create.rb
Usage: pattern_create.rb length [set a] [set b] [set c]
root#bt:/pentest/exploits/framework3/tools# ./pattern_create.rb 5000
Then just replace the A's with the output of the tool.
Run the application and wait until the application dies again, and take note of the contents of EIP or whatever other location.
Then use a second metasploit tool to calculate the exact length of the buffer before writing into EIP or whatever other location, feed it with the value of EIP or whatever other location(based on the pattern file) and length of the buffer :
root#bt:/pentest/exploits/framework3/tools# ./pattern_offset.rb 0x356b4234 5000
1094
root#bt:/pentest/exploits/framework3/tools#
That’s the buffer length needed to overwrite EIP or whatever other location.

Resources