How to decode cmd = 3222823425 in ioctl in Linux 2.6.29? - linux

I am trying to work out how to break the value cmd=3222823425 into its parts so that I can tell what this command actually means in the Linux kernel. I know that some functions issue an ioctl call with the following parameters, but I want to know what the parameter values mean:
fd=21, cmd=3222823425 and arg=3203118816
I have been through various forums, man pages and other links trying to find out what it means when cmd in an ioctl system call has the value 3222823425. I found that cmd is a command number made up of type, number and data_type, and that the first two are 8-bit integers (0-255).
So my question is: how do I decode these parameter values to find out what this call is trying to do?

Be careful to refer to the right documentation when decoding an ioctl command. Documentation/ioctl-number.txt explains how to create a new ioctl code, while the document linked in the previous answer gives an overview of the overall process before focusing on ioctl creation as well. asm/ioctl.h is a better source, because the actual bit layout of an ioctl may vary across architectures, but an explanation of the general convention and of the meaning and position of the bit fields can be found in include/asm-generic/ioctl.h and Documentation/ioctl-decoding.txt.
From the latter:
bits     meaning
31-30    00 - no parameters: uses _IO macro
         10 - read: _IOR
         01 - write: _IOW
         11 - read/write: _IOWR
29-16    size of arguments
15-8     ascii character supposedly unique to each driver
7-0      function #
According to the above, cmd=3222823425 should decode as:
3222823425 -> 0xC0186201 -> 11000000000110000110001000000001
- `direction` -> `11` -> read/write;
- `size` -> `00000000011000` -> 24 bytes (a pointer to a struct of this size should be passed as the 3rd argument of ioctl());
- `type` -> `01100010` -> 0x62, ascii for character 'b';
- `number` -> `00000001` -> driver function #1.
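The same decoding can be sketched in C with shifts and masks. This is only an illustration that assumes the asm-generic layout quoted above (2 direction bits, 14 size bits, 8 type bits, 8 number bits); some architectures use different widths, so check asm/ioctl.h for yours. The helper names (field, cmd_dir, cmd_size, cmd_type, cmd_nr, print_cmd) are made up for this example and are not kernel APIs.

```c
#include <stdint.h>
#include <stdio.h>

/* Field widths of the asm-generic layout (an assumption; not universal). */
#define NR_BITS    8
#define TYPE_BITS  8
#define SIZE_BITS  14
#define DIR_BITS   2

#define NR_SHIFT   0
#define TYPE_SHIFT (NR_SHIFT + NR_BITS)      /* 8  */
#define SIZE_SHIFT (TYPE_SHIFT + TYPE_BITS)  /* 16 */
#define DIR_SHIFT  (SIZE_SHIFT + SIZE_BITS)  /* 30 */

/* Extract `bits` bits of cmd starting at bit position `shift`. */
unsigned field(uint32_t cmd, unsigned shift, unsigned bits)
{
    return (cmd >> shift) & ((1u << bits) - 1u);
}

unsigned cmd_dir(uint32_t cmd)  { return field(cmd, DIR_SHIFT, DIR_BITS); }
unsigned cmd_size(uint32_t cmd) { return field(cmd, SIZE_SHIFT, SIZE_BITS); }
unsigned cmd_type(uint32_t cmd) { return field(cmd, TYPE_SHIFT, TYPE_BITS); }
unsigned cmd_nr(uint32_t cmd)   { return field(cmd, NR_SHIFT, NR_BITS); }

/* Print the four fields of an ioctl command number. */
void print_cmd(uint32_t cmd)
{
    printf("cmd=0x%08X dir=%u size=%u type=0x%02X ('%c') nr=%u\n",
           (unsigned)cmd, cmd_dir(cmd), cmd_size(cmd),
           cmd_type(cmd), (int)cmd_type(cmd), cmd_nr(cmd));
}
```

Calling print_cmd(3222823425u) reports the same four fields as the manual decoding: direction 3 (read/write), size 24, type 0x62 and number 1. On Linux, the _IOC_DIR, _IOC_SIZE, _IOC_TYPE and _IOC_NR macros from the ioctl headers do the same job with the widths that are correct for the running architecture.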
In the hope this can help.
Regards.

According to this link, ioctl command number has multiple components:
type. The magic number. This field is _IOC_TYPEBITS bits wide (usually 8)
number. The ordinal (sequential) number. It's _IOC_NRBITS bits wide. (usually 8)
direction. The direction of data transfer. The possible values are _IOC_NONE (no data transfer), _IOC_READ, _IOC_WRITE, and _IOC_READ|_IOC_WRITE (data is transferred both ways). It's usually 2 bits.
size. The size of user data involved. It's _IOC_SIZEBITS wide (14 bits).
You should consult include/asm/ioctl.h and Documentation/ioctl-number.txt for your kernel to see the actual configuration.
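For your case 3222823425==0xC0186201
In the generic layout the fields sit, from the most significant bit down, in the order direction (2 bits), size (14 bits), type (8 bits), number (8 bits). So:
direction==0x3 (_IOC_READ|_IOC_WRITE)
size==0x18 (24 bytes)
type==0x62 ('b')
number==0x01
(The top byte 0xC0 is 11000000 in bits: the first two bits (11) are the direction, and the remaining six zero bits are the upper part of the size field, which continues into the next byte, 0x18.)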

Related

Quickjob MOVZON X'FF' to OFA1

What does MOVZON X'FF' do in Quickjob? I believe it just moves input to output. Please let me know if I am wrong.
The smallest unit of information is the bit. Processors usually don't work on single bits when accessing memory; they work on bytes. A byte consists of 8 consecutive bits (for most architectures).
To describe how different processor instructions work with bytes, bytes are sometimes subdivided into two 4-bit groups, called nibbles. Counting left to right, bits 0-3 are called "left nibble", "high order nibble", or "zone nibble". Bits 4-7, the right half, are called "right nibble", "low order nibble", or "number nibble".
There are instructions that work on the whole byte, e.g. MOVE. And there are instructions that work on nibbles. MOVEZONE (MOVZON) works on zone nibbles and leaves the number nibbles alone; MOVENUM (MOVNUM) works on number nibbles, and leaves the zone nibbles alone.
Instructions of this kind are usually used with bytes that contain numeric values, coded as either zoned decimal or packed decimal. They are rather exotic when working on text data.
This reference is used.
Given the instruction:
MOVZON X'FF' to OFA1
The receiving field OFA1 refers to the first record position (the 1) of the output file (the OF) designated as A. The instruction will set the high-order bits (0-3 or "zone bits") of the first position to ones, matching bits 0-3 of the X'FF'.
However, it appears that, as a matter of style, the instruction should have been written as MOVZON X'F0' TO OFA1, since the low-order bits (4-7) are not used.
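The nibble behaviour described above can be mimicked in C with two masks. This is only an illustration of the bit manipulation, not Quickjob code; the function names move_zone and move_num are invented for the example.

```c
#include <stdint.h>

/* Copy the zone (high-order) nibble of src into dst,
 * leaving dst's number (low-order) nibble untouched. */
uint8_t move_zone(uint8_t dst, uint8_t src)
{
    return (uint8_t)((dst & 0x0F) | (src & 0xF0));
}

/* Copy the number (low-order) nibble of src into dst,
 * leaving dst's zone (high-order) nibble untouched. */
uint8_t move_num(uint8_t dst, uint8_t src)
{
    return (uint8_t)((dst & 0xF0) | (src & 0x0F));
}
```

With a receiving byte of X'C5', both move_zone(0xC5, 0xFF) and move_zone(0xC5, 0xF0) give X'F5', which is why the low-order nibble of the sending literal does not matter.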

What does the IS_ALIGNED macro in the linux kernel do?

I've been trying to read the implementation of a kernel module, and I'm stumbling on this piece of code.
unsigned long addr = (unsigned long) buf;
if (!IS_ALIGNED(addr, 1 << 9)) {
    DMCRIT("#%s in %s is not sector-aligned. I/O buffer must be sector-aligned.", name, caller);
    BUG();
}
The IS_ALIGNED macro is defined in the kernel source as follows:
#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
I understand that data has to be aligned along the size of a datatype to work, but I still don't understand what the code does.
It left-shifts 1 by 9, then subtracts 1, which gives 111111111. Then 111111111 is bitwise-ANDed with x.
Why does this code work? How is this checking for byte alignment?
In systems programming it is common to need a memory address to be aligned to a certain number of bytes -- that is, several lowest-order bits are zero.
Basically, IS_ALIGNED(addr, 1 << 9) checks whether addr is on a 512-byte (2^9) boundary (the last 9 bits are zero); the code in the question calls BUG() when it is not. This is a common requirement when erasing flash locations because flash memory is split into large blocks which must be erased or written as a single unit.
Another application of this I ran into: I was working with a certain DMA controller which has a modulo feature. Basically, that means you can allow it to change only the last several bits of an address (destination address in this case). This is useful for protecting memory from mistakes in the way you use a DMA controller. The problem is, I initially forgot to tell the compiler to align the DMA destination buffer to the modulo value. This caused some incredibly interesting bugs (random variables that have nothing to do with the thing using the DMA controller being overwritten... sometimes).
As far as "how does the macro code work?", if you subtract 1 from a power of two (a 1 followed only by zeroes), you get a number whose low bits are all ones. For example, 0b00010000 - 0b1 = 0b00001111. This is a way of creating a binary mask from the integer number of required-alignment bytes. This mask has ones only in the bits we are interested in checking for zero-value. After we AND the address with the mask containing ones in the lowest-order bits, we get 0 if and only if the lowest 9 (in this case) bits are zero.
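The mask trick can be tried outside the kernel with a small stand-alone function. This is a sketch, not the kernel macro: is_aligned is a name chosen for the example, it takes plain uintptr_t values instead of using typeof, and it assumes the alignment is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if x is a multiple of a.
 * Assumes a is a power of two, so that a - 1 is a run of low-order ones. */
bool is_aligned(uintptr_t x, uintptr_t a)
{
    uintptr_t mask = a - 1;   /* e.g. 512 - 1 = 0x1FF, nine low bits set */
    return (x & mask) == 0;   /* aligned iff none of those bits is set in x */
}
```

For example, is_aligned(1024, 1 << 9) is true because the nine low bits of 1024 are zero, while is_aligned(1025, 1 << 9) is false because bit 0 is set.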
"Why does it need to be aligned?": This comes down to the internal makeup of flash memory. Erasing and writing flash is a much less straightforward process than reading it, and typically it requires higher-than-logic-level voltages to be supplied to the memory cells. The circuitry required to make write and erase operations possible with a one-byte granularity would waste a great deal of silicon real estate only to be used rarely. Basically, designing a flash chip is a statistics and tradeoff game (like anything else in engineering) and the statistics work out such that writing and erasing in groups gives the best bang for the buck.
At no extra charge, I will tell you that you will be seeing a lot of this type of thing if you are reading driver and kernel code. It may be helpful to familiarize yourself with the contents of this article (or at least keep it around as a reference): https://graphics.stanford.edu/~seander/bithacks.html

What do xfrm_replay_state_esn fields mean?

I'm trying to understand a little bit more about Linux kernel IPSec networking by looking at the kernel source. I understand conceptually that IPSec prevents replay attacks with a sequence number and a replay window, i.e. if a recipient receives a packet with a sequence number that is not within the replay window, or it has received before, then it drops that packet and increments the replay counter.
I'm trying to correlate this to the structure xfrm_replay_state_esn which is defined as such:
struct xfrm_replay_state_esn {
unsigned int bmp_len;
__u32 oseq;
__u32 seq;
__u32 oseq_hi;
__u32 seq_hi;
__u32 replay_window;
__u32 bmp[0];
};
I've tried searching for documentation, but it's scant and I haven't been able to find a man page for the various functions and structures, so I don't understand what the individual fields relate to.
XFRM is an IPSec implementation for the Linux kernel. The name XFRM stands for "transform" referencing the transformation of IP packets as per the IPSec protocol.
The following RFCs are relevant for IPSec:
RFC4301: Definition of the IPSec protocol.
RFC4302: Definition of the Authentication Header (AH) sub-protocol for ensuring authenticity of IP packets.
RFC4303: Definition of the Encapsulating Security Payload (ESP) sub-protocol for ensuring authenticity and secrecy of IP packets.
The IPSec protocol allows for sequence numbers of size 32 bits or 64 bits. The 64 bit sequence numbers are referred to as Extended Sequence Numbers (ESN).
The anti-replay mechanism is defined in the RFCs for both AH and ESP. The mechanism keeps a window of acceptable sequence numbers of incoming packets. The window extends back from the highest sequence number received so far, defining a lower bound for the acceptable sequence numbers. When receiving a sequence number below that bound, it is rejected. When receiving a sequence number higher than the current highest sequence number, the window is shifted forward. When receiving a sequence number within the window, the mechanism will mark this sequence number in a checklist for ensuring that each sequence number in the window is only received once. If the sequence number has already been marked, it is rejected.
This checklist can be implemented as a bitmap, where each sequence number in the window is represented by a single bit, with 0 meaning this sequence number has not been received yet, and 1 meaning it has already been received.
Based on this information, the meaning of the fields in the xfrm_replay_state_esn struct can be given as follows.
The struct holds the state of the anti-replay mechanism with extended sequence numbers (64 bits):
The highest sequence number received so far is represented by seq and seq_hi. Each is a 32 bit integer, so together they can represent a 64 bit number, with seq holding the lower 32 bits and seq_hi holding the higher 32 bits. The reason for splitting the 64 bit value into two 32 bit values, instead of representing it as a single 64 bit variable, is that the IPSec protocol mandates an optimization where only the lower 32 bits of the sequence number are included in the packet. For this reason, it is more convenient to have the lower 32 bits as a separate variable in the struct, so that it can be accessed directly without resorting to bit-operations.
The sequence number counter for outgoing packets is tracked in oseq and oseq_hi. As before, the 64 bit number is represented by two 32 bit variables.
The size of the window is represented by replay_window. The smallest acceptable sequence number is given by the sequence number expressed by seq and seq_hi, minus replay_window, plus one.
The bitmap for checking off received sequence numbers within the window is represented by bmp. It is defined as a zero-sized array, but when the memory for the struct is allocated, additional memory is reserved after the struct, which can then be accessed e.g. with bmp[i] (which is of course just syntactic sugar for *(bmp+i)). The size of the bitmap is held in bmp_len. It is of course related to the window size, i.e. window size divided by 8*sizeof(u32), rounded up. I would speculate that it is stored explicitly to avoid having to recalculate this value frequently.
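To make the window and bitmap mechanism concrete, here is a deliberately simplified sketch in C. It is not the kernel's implementation: it assumes a fixed window of 64 packets held in a single 64 bit word instead of the variable-length bmp array, it works on an already reconstructed 64 bit sequence number, and the names (replay_state, esn_combine, replay_check_and_update, WINDOW) are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define WINDOW 64u  /* hypothetical fixed window size, in packets */

struct replay_state {
    uint64_t top;     /* highest sequence number accepted so far */
    uint64_t bitmap;  /* bit i set => sequence number (top - i) was received */
};

/* Join the two 32 bit halves (like seq_hi and seq) into one 64 bit number. */
uint64_t esn_combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Return true and record seq if the packet is acceptable;
 * return false if it is too old or has been seen before. */
bool replay_check_and_update(struct replay_state *st, uint64_t seq)
{
    if (seq == 0)
        return false;              /* the first packet on an SA carries 1 */

    if (seq > st->top) {           /* ahead of the window: slide it forward */
        uint64_t shift = seq - st->top;
        st->bitmap = (shift >= WINDOW) ? 0 : st->bitmap << shift;
        st->bitmap |= 1;           /* bit 0 now stands for seq itself */
        st->top = seq;
        return true;
    }

    uint64_t offset = st->top - seq;
    if (offset >= WINDOW)
        return false;              /* below the lower bound of the window */

    uint64_t bit = (uint64_t)1 << offset;
    if (st->bitmap & bit)
        return false;              /* already checked off: a replay */
    st->bitmap |= bit;             /* check it off */
    return true;
}
```

Starting from a zeroed state, sequence numbers 1, 5 and then 3 are all accepted, a second 3 is rejected as a replay, and after accepting 100 the lower bound becomes 100 - 64 + 1 = 37, so 36 is rejected while 37 is still accepted.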

Ada : Variant size in record type

I am having some trouble with record types in Ada.
I'm using Sequential_IO to read a binary file. To do that I have to use a type whose size evenly divides the file's size. In my case I need a structure of 50 bytes, so I created a type like this ("Vecteur" is an array of 3 Float):
type Double_Byte is mod 2 ** 16;
for Double_Byte'Size use 16;
type Triangle is
   record
      Normal      : Vecteur(1..3);
      P1          : Vecteur(1..3);
      P2          : Vecteur(1..3);
      P3          : Vecteur(1..3);
      Byte_count1 : Double_Byte;
   end record;
When I use the type Triangle, the size is 52 bytes, but when I take the size of each component separately I find 50 bytes in total. Because the file's size is not a multiple of 52, I get execution errors. I don't know how to fix this size. I ran some tests and I think it comes from Double_Byte, because when I removed it from the record I found a size of 48 bytes, and when I put it back it was 52 bytes again.
Thank you for your help.
Given Simon's latest comment, it may be impossible to do this portably using Sequential_IO; namely, reading the file on some machines (which don't support unaligned accesses) may leave half its contents unaligned and therefore liable to fail when you access them.
I can't help feeling that a better solution is to divorce the file format (which is fixed by compatibility with other systems) from the machine format (which is not). And therefore moving to Stream_IO and writing your own Read and Write primitives where necessary (e.g. to pack the odd sized Double_Byte component into 2 bytes, whatever its representation in memory) would be a more robust solution.
Then you can guarantee a file format compatible with other systems, and an internal memory format guaranteed to work.
The compiler is in no way obligated to use a specific size for Triangle unless you specify it. As you don't, it chooses whatever size it sees fit for fast access to the data. Even if you specify representation details for every component type of the record, the compiler might still choose to use more space for the record itself than necessary.
Considering the sizes you give, it seems obvious that one component of Vecteur has 4 bytes, which gives a total payload of 50 bytes for Triangle. The compiler now chooses to add 2 bytes padding, so that the record size is a multiple of the size of a 4-byte word. You can override this behavior with:
for Triangle'Size use 50 * 8;
This will force the compiler to use only 50 bytes for the record. As this is a tight fit, there is only one way to represent the record, and no further specification is necessary. If you do need to specify how exactly the record is represented, you can use a record representation clause.
Edit:
The representation clause specifies the size for the type. However, each object of this type may still take up more space unless you additionally specify
pragma Pack (Triangle);
Edit 2:
After Simon's comment, I had a closer look at this and realized that there is a far better and cleaner solution. Instead of setting the 'Size and using pragma Pack, do this:
for Triangle use record at mod 2;
   Normal      at  0 range 0 .. 95;
   P1          at 12 range 0 .. 95;
   P2          at 24 range 0 .. 95;
   P3          at 36 range 0 .. 95;
   Byte_count1 at 48 range 0 .. 15;
end record;
The initial mod 2 defines that the record is to be aligned at a multiple of 2 bytes. This eliminates the padding at the end without the need of pragma Pack (which is not guaranteed to work the same way on every compiler).
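The same padding effect is easy to observe in C, which may help when comparing against tools written in other languages. The sketch below is a C analogue, not Ada, and rests on assumptions: float is 4 bytes with 4-byte alignment, and the compiler accepts the GCC/Clang `__attribute__((packed))` extension. The struct names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout: 12 floats (48 bytes) followed by a 2-byte counter.
 * The compiler typically pads the struct to a multiple of the float
 * alignment, giving 52 bytes on common platforms. */
struct triangle_natural {
    float    normal[3];
    float    p1[3];
    float    p2[3];
    float    p3[3];
    uint16_t byte_count;
};

/* Packed layout (GCC/Clang extension): no trailing padding, so the
 * struct occupies exactly the 50 bytes of payload. */
struct __attribute__((packed)) triangle_packed {
    float    normal[3];
    float    p1[3];
    float    p2[3];
    float    p3[3];
    uint16_t byte_count;
};
```

In both structs the counter sits at offset 48; only the trailing padding differs, which matches the 52 versus 50 bytes seen in the question.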

linux socket programming with the consideration of real size of char

I'm writing a client and server program with Linux socket programming, and I'm confused about something. Although sizeof(char) is guaranteed to be 1, I know the real size of a char may differ between computers: it may be 8 bits, 16 bits or some other size. What happens if the client and server have different char sizes? For example, the client's char is 8 bits and the server's char is 16 bits. The client calls write(socket_fd, &c, sizeof(char)) and the server calls read(socket_fd, &c, sizeof(char)). Does the client send 8 bits while the server wants to receive 16 bits? If so, what will happen?
Another question: Is it good for me to pass text between client and server because I don't need to consider the big endian and little endian problem?
Thanks in advance.
What system are you communicating with that has 16 bits in a byte? In any case, if you want to know exactly how many bits you have, use int8_t (or uint8_t) from <stdint.h> instead.
@Basile is right. A char is always eight bits in Linux. I found this in the book Linux Kernel Development. This book also states some other rules:
Although there is no rule that the int type be 32 bits, it is in Linux on all currently supported architectures.
The same goes for the short type, which is 16 bits on all current architectures, although no rule explicitly decrees that.
Never assume the size of a pointer or a long, which can be either 32 or 64 bits on the currently supported machines in Linux.
Because the size of a long varies on different architectures, never assume that sizeof(int) is equal to sizeof(long).
Likewise, do not assume that a pointer and an int are the same size.
On the choice between passing binary data or text data over the network, the book UNIX Network Programming, Volume 1 gives two solutions:
Pass all numeric data as text strings.
Explicitly define the binary formats of the supported datatypes (number of bits, big- or little-endian) and pass all data between the client and server in this format. RPC packages normally use this technique. RFC 1832 [Srinivasan 1995] describes the External Data Representation (XDR) standard that is used with the Sun RPC package.
The C definition of char as the size of a memory cell is different from the definition used in Unicode.
A Unicode code point can, depending on the encoding used, require up to 4 bytes of storage (the original UTF-8 design allowed sequences of up to 6 bytes, but RFC 3629 restricts it to 4).
This is a slightly different problem than byte order and word size differences between different architectures, etc.
If you wish to express complex structures (containing Unicode text), it's probably a good idea to implement a message protocol that encodes messages to a byte array, which can be sent over any communication channel.
A simple client/server mechanism is to send a fixed-size header containing the length of the following message. It's a nice exercise to build something like this in C... :-)
Depending on what you are trying to do, it may be worthwhile to look at existing technologies for the message interface; look at Etch, Thrift, SWIG, *-rpc, asn1, soap, xml, json, corba, etc.
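The fixed-size length header mentioned above could look roughly like this in C. It is a minimal sketch under stated assumptions: the header is 4 bytes holding the payload length in big-endian (network) byte order, and the names put_be32, get_be32, frame_message and HEADER_LEN are invented for the example. It only builds the bytes; passing them to write() or send(), and looping on short reads on the receiving side, is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER_LEN 4u  /* assumed header: 32 bit payload length, big-endian */

/* Store v as 4 bytes, most significant byte first, whatever the host order. */
void put_be32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Read back a 4-byte big-endian value. */
uint32_t get_be32(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Write header + payload into out (capacity cap).
 * Returns the total number of bytes written, or 0 if out is too small. */
size_t frame_message(uint8_t *out, size_t cap, const void *payload, uint32_t len)
{
    if (cap < HEADER_LEN || cap - HEADER_LEN < len)
        return 0;
    put_be32(out, len);
    memcpy(out + HEADER_LEN, payload, len);
    return HEADER_LEN + len;
}
```

Framing the two-byte payload "hi" produces the six bytes 00 00 00 02 68 69. The receiver first reads exactly HEADER_LEN bytes, calls get_be32 to learn the length, and then reads that many payload bytes; because every byte is written explicitly, the result does not depend on the endianness or word size of either machine.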

Resources