netlink and big endian format - linux

I have not found any document or note in the kernel that mandates passing 16/32-bit values in netlink messages to the kernel in network byte order. So my question is whether I have to use the htonl/htons functions when filling in a netlink message. Is there such a requirement at all?

According to this article, this can be controlled on a per-attribute basis:
There are two special flags which may be present in netlink attributes, though I have yet to encounter them in my work.
NLA_F_NESTED: specifies a nested attribute; used as a hint for parsing. Doesn't always appear to be used, even if nested attributes are present.
NLA_F_NET_BYTEORDER: attribute data is stored in network byte order (big endian) instead of host endianness.
UPD: it looks like native (little-endian) byte order does not work in some cases: I get errno 4097 when trying to pass an IPSET CREATE timeout that way. Network byte order works fine.
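For illustration, here is a minimal sketch of appending a 32-bit attribute in network byte order with the NLA_F_NET_BYTEORDER flag set. The helper name put_u32_be and the bounds handling are my own, not from any netlink library:

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>
    #include <linux/netlink.h>

    /* Append a u32 attribute, stored big endian and flagged as such.
     * 'type' is the attribute type defined by your netlink family. */
    static int put_u32_be(struct nlmsghdr *nlh, size_t maxlen,
                          uint16_t type, uint32_t value)
    {
        struct nlattr *nla = (struct nlattr *)((char *)nlh +
                              NLMSG_ALIGN(nlh->nlmsg_len));
        uint32_t be = htonl(value);              /* convert to network order */

        if (NLMSG_ALIGN(nlh->nlmsg_len) + NLA_HDRLEN + sizeof(be) > maxlen)
            return -1;                           /* no room in the buffer */
        nla->nla_type = type | NLA_F_NET_BYTEORDER;
        nla->nla_len = NLA_HDRLEN + sizeof(be);
        memcpy((char *)nla + NLA_HDRLEN, &be, sizeof(be));
        nlh->nlmsg_len = NLMSG_ALIGN(nlh->nlmsg_len) + NLA_ALIGN(nla->nla_len);
        return 0;
    }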


NVME sensor reading error with more than 1 NVME configured in entity manager

Hi, I'm trying to read NVMe sensors using NVMeSensor from dbus-sensors. I have configured 4 NVMes in my *.json entity-manager (EM) config file, and it logged "Sensor x error reading" for all of them. I put the config in the common EM config for the board together with fan sensors, ADC sensors and others, referring to this (https://github.com/ibm-openbmc/entity-manager/blob/14a7bc9303d747dbc20cb702083e7af0a3cf0496/configurations/NVME%20P4000.json#L10-L41). In this case, I see that boost::asio::async_read at https://github.com/openbmc/dbus-sensors/blob/ce6bcdfc28f60173093087050a43adbc586fd6fa/src/NVMeBasicContext.cpp#L290 returns a response of size 0, but the resp from https://github.com/openbmc/dbus-sensors/blob/ce6bcdfc28f60173093087050a43adbc586fd6fa/src/NVMeBasicContext.cpp#L83 has a size of 6 and a valid value.
However, when I configure only 1 NVMe in EM, it returns values normally on D-Bus.
I wonder if NVMeSensor only supports NVMes with a FRU, and whether we have to have a single JSON file for each, just like NVMEP4000.json.
What should I do when I want to configure all the NVMes inside the board's EM config? I can't find any reference.
I have not found the meaning of "Address" in the NVME1000 config, since it will use 0x6a anyway, at least from what I have seen. Can you tell me what it is for?
I'm really new to OpenBMC and don't understand much of the mechanism of the code, so please correct my understanding if it's wrong. Any advice will be appreciated a lot.
Thank you.
Edited
I realized that when one of the NVMes is not present, all of them fail. I think the failed one affects the response stream (respStream), even though each NVMe has a separate request stream (reqStream). I don't know why they interfere with each other, but I see that when the resp size from SMBus is < 0, the code still writes to the stream without resizing the resp vector the way it does when the size is normal. I added resp.resize(len) here (https://github.com/openbmc/dbus-sensors/blob/ce6bcdfc28f60173093087050a43adbc586fd6fa/src/NVMeBasicContext.cpp#L153); it works, and we can do hot plug. Is that because I did not use a FRU probe for the NVMEs...?
I wonder if NVMeSensor only supports NVMes with a FRU, and whether we have to have a single JSON file for each, just like NVMEP4000.json.
The "Probe" field in entity-manager configuration json is used for probe rules for the device. FRU is just one way. For example, if you know the exact i2c bus and address, you can use something like
xyz.openbmc_project.Inventory.Decorator.I2CDevice({'Bus': 4, 'Address': 60})
Here xyz.openbmc_project.Inventory.Decorator.I2CDevice is the D-Bus interface, 'Bus' and 'Address' are the properties, and 4 and 60 are the values to match.
And "Probe" can be an array with AND OR operators. Like this example.
What should I do when I want to configure all the NVMes inside the board's EM config?
I think adding all 4 NVME1000 blocks to your board JSON will do this, as long as they have different names and bus/address configurations.
I have not found the meaning of "Address" in the NVME1000 config, since it will use 0x6a anyway, at least from what I have seen. Can you tell me what it is for?
On Intel P4000 series SSDs, 0x53 (what appears in nvme_p4000.json) is the 7-bit address of the FRU EEPROM, while 0x6a is the 7-bit address for the NVM Express Basic Management Command (Appendix A of the NVMe-MI 1.2b specification). These addresses are only documented in the product spec, which is not generally available :(
Putting all NVMe configs inside the baseboard EM config is OK. There are hotplug issues with the dbus-sensors nvmesensor, so when one of my configured NVMes is not present, all the others fail. I had only plugged 1 NVMe into one of the 4 slots, which caused the problem. I was told they are looking into this, but for now I'm using the workaround I put in the Edit section of my question.
The nvmesensor code hardcodes 0x6A as the I2C address, for the reason @KagurazakaKotori gave.

Are sockaddr_in and sockaddr_in6 still using sin_len and sin6_len?

So, basically the title says it all. I've been porting my Unix socket C code to Windows, and apparently those structures do not have sin_len or sin6_len on Windows.
I'm using a union between sockaddr_storage, sockaddr_in and sockaddr_in6 everywhere, and just using the correct member according to ss_family. It would make sense that the socket library could just deduce the size according to the family, so the length field would indeed be redundant.
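For reference, the union I'm using looks roughly like this (the union name is mine):

    #include <sys/socket.h>
    #include <netinet/in.h>

    /* One blob of storage, viewed as whichever family ss_family says. */
    union sockaddr_any {
        struct sockaddr_storage ss;  /* large and aligned enough for any family */
        struct sockaddr_in      v4;  /* used when ss.ss_family == AF_INET */
        struct sockaddr_in6     v6;  /* used when ss.ss_family == AF_INET6 */
    };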
If I comment out the code that sets the length field, everything still works on OS X and Linux, but that may just be an illusion, so I decided to ask here.
Is that field deprecated, somehow? Can I safely stop using it and rely on the socket implementation to use the family field?
The sin_len field is not required by the POSIX specification.
Here's the relevant information from Unix Network Programming, Volume 1: The Sockets Networking API, 3rd Edition:
3.2 Socket Address Structures
The length member, sin_len, was added with 4.3BSD-Reno, when support for the OSI protocols was added (Figure 1.15). Before this release, the first member was sin_family, which was historically an unsigned short. Not all vendors support a length field for socket address structures and the POSIX specification does not require this member.
Further, Stevens provides the motivation behind the field:
Having a length field simplifies the handling of variable-length socket address structures.
Even if the length field is present, we need never set it and need never examine it, unless we are dealing with routing sockets (Chapter 18). It is used within the kernel by the routines that deal with socket address structures from various protocol families (e.g, the routing table code).
The four socket functions that pass a socket address structure from the process to the kernel, bind, connect, sendto, and sendmsg, all go through the sockargs function in a Berkeley-derived implementation (p. 452 of TCPv2). This function copies the socket address structure from the process and explicitly sets its sin_len member to the size of the structure that was passed as an argument to these four functions. The five socket functions that pass a socket address structure from the kernel to the process, accept, recvfrom, recvmsg, getpeername, and getsockname, all set the sin_len member before returning to the process.
Unfortunately, there is normally no simple compile-time test to determine whether an implementation defines a length field for its socket address structures...We will see in Figure 3.4 that IPv6 implementations are required to define SIN6_LEN if the socket address structures have a length field. Some IPv4 implementations provide the length field of the socket address structure to the application based on a compile-time option (e.g., _SOCKADDR_LEN).
You'll want to evaluate the usage of sin_len in your code. If it's just initializing it to 0, you can remove that code. If you're reading the value from the result of accept, recvfrom, recvmsg, getpeername, or getsockname, you'll unfortunately need some platform-specific compile-time switches, or you can switch to using the separate address-length variable.
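For example, here is a minimal sketch of a portable initializer; init_sa6 is a hypothetical helper name, and SIN6_LEN is the compile-time test mentioned above:

    #include <string.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    /* Fill an IPv6 socket address portably. SIN6_LEN is only defined on
     * platforms whose sockaddr_in6 has a length field, so sin6_len is
     * set conditionally. */
    static void init_sa6(struct sockaddr_in6 *sa, in_port_t port)
    {
        memset(sa, 0, sizeof(*sa));
    #ifdef SIN6_LEN
        sa->sin6_len = sizeof(*sa);   /* BSD-style systems only */
    #endif
        sa->sin6_family = AF_INET6;
        sa->sin6_port = htons(port);
        sa->sin6_addr = in6addr_any;
    }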

tcpdump: capture outgoing packets on virtual interfaces that have a link type unknown to libpcap?

In the system I am testing right now, a couple of virtual L2 devices are chained together to add our own L2.5 headers between the Ethernet headers and the IP headers. Now when I use
tcpdump -xx -i vir_device_1
it actually shows the SLL header with the IP header. How do I capture the full packet that is actually going out of vir_device_1, i.e. after the ndo_start_xmit() device call?
How do I capture the full packet that is actually going out of the vir_device_1, i.e. after the ndo_start_xmit() device call?
Either by writing your own code to directly use a PF_PACKET/SOCK_RAW socket (you say "SLL header", so this is presumably Linux), or by:
making sure you've assigned a special ARPHRD_ value for your virtual interface;
using one of the DLT_USERn values for your special set of headers, or asking tcpdump-workers@lists.tcpdump.org for an official DLT_ value to be assigned for them;
modifying libpcap to map that ARPHRD_ value to the DLT_ value you're using;
modifying tcpdump to handle that DLT_ value;
if necessary, modifying other programs that would capture on that interface or read capture files as written by tcpdump on that interface to handle that value as well.
Note that the DLT_USERn values are specifically reserved for private use, and no official versions of libpcap, tcpdump, or Wireshark will ever assign them for their own use. In other words, if you use a DLT_USERn value, don't bother contributing patches to assign that value to your type of headers, as they won't be accepted; other people may already be using it for their own special headers, and that must continue to be supported. So if you use one of those values rather than getting an official value assigned, you'll have to maintain the modified versions of libpcap, tcpdump, etc. yourself.
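For the first option, a minimal sketch of such a capture socket, bound to the interface by name (requires root or CAP_NET_RAW; the interface name is just the one from the question):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <linux/if_packet.h>
    #include <net/ethernet.h>
    #include <net/if.h>
    #include <arpa/inet.h>

    int main(void)
    {
        /* Raw packet socket: frames arrive with their link-layer
         * headers intact, as handed to the driver. */
        int fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (fd < 0) { perror("socket"); return 1; }

        struct sockaddr_ll sll;
        memset(&sll, 0, sizeof(sll));
        sll.sll_family = AF_PACKET;
        sll.sll_protocol = htons(ETH_P_ALL);
        sll.sll_ifindex = if_nametoindex("vir_device_1");
        if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
            perror("bind"); return 1;
        }

        unsigned char buf[65536];
        for (;;) {
            ssize_t n = recv(fd, buf, sizeof(buf), 0);
            if (n < 0) { perror("recv"); break; }
            printf("captured %zd bytes\n", n);
        }
        close(fd);
        return 0;
    }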
Thanks Guy Harris for providing very helpful answers to my original question!
I am adding this as an answer/note to a follow up question I asked in the comments.
Basically, my question was: what is the state of the packet received by a PF_PACKET/SOCK_RAW socket?
For a software device (no queue), dev_queue_xmit() calls dev_hard_start_xmit(skb, dev) to start transmitting the skb buffer. That function calls dev_queue_xmit_nit() before it calls dev->ops->ndo_start_xmit(skb, dev), which means the packet PF_PACKET sees is in the state before any changes made in ndo_start_xmit().

Linux socket programming and the real size of char

I'm writing a client and a server program with Linux socket programming. I'm confused about something. Although sizeof(char) is guaranteed to be 1, I know the real size of a char may differ between computers: it may be 8 bits, 16 bits or some other size. The problem is what happens if the client and server have different char sizes, for example a client char size of 8 bits and a server char size of 16 bits. The client calls write(socket_fd, c, sizeof(char)) and the server calls read(socket_fd, c, sizeof(char)). Does the client send 8 bits while the server wants to receive 16 bits? If so, what will happen?
Another question: is it better for me to pass text between client and server, since then I don't need to consider the big-endian/little-endian problem?
Thanks in advance.
What system are you communicating with that has 16 bits in a byte? In any case, if you want to know exactly how many bits you have, use int8_t instead.
@Basile is right. A char is always eight bits on Linux. I found this in the book Linux Kernel Development, which also states some other rules:
Although there is no rule that the int type be 32 bits, it is in Linux on all currently supported architectures.
The same goes for the short type, which is 16 bits on all current architectures, although no rule explicitly decrees that.
Never assume the size of a pointer or a long, which can be either 32 or 64 bits on the currently supported machines in Linux.
Because the size of a long varies on different architectures, never assume that sizeof(int) is equal to sizeof(long).
Likewise, do not assume that a pointer and an int are the same size.
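When exact widths matter, the <stdint.h> fixed-width types make the assumptions explicit, e.g.:

    #include <stdint.h>

    int8_t   i8;   /* exactly 8 bits, signed */
    uint16_t u16;  /* exactly 16 bits, unsigned */
    uint32_t u32;  /* exactly 32 bits, unsigned */
    uint64_t u64;  /* exactly 64 bits, unsigned */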
For the choice of passing binary data or text data across the network, the book UNIX Network Programming, Volume 1 gives two solutions:
Pass all numeric data as text strings.
Explicitly define the binary formats of the supported datatypes (number of bits, big- or little-endian) and pass all data between the client and server in this format. RPC packages normally use this technique. RFC 1832 [Srinivasan 1995] describes the External Data Representation (XDR) standard that is used with the Sun RPC package.
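As a sketch of the second approach, here is a hypothetical wire format with fixed-width fields, always sent in network (big-endian) byte order; the struct and field names are made up for illustration:

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    /* Hypothetical message: every field has a fixed width. */
    struct wire_msg {
        uint32_t seq;    /* sequence number */
        uint16_t kind;   /* message type */
    };

    /* Serialize into a caller-provided buffer of at least 6 bytes. */
    static size_t msg_pack(const struct wire_msg *m, unsigned char *buf)
    {
        uint32_t seq = htonl(m->seq);
        uint16_t kind = htons(m->kind);
        memcpy(buf, &seq, 4);
        memcpy(buf + 4, &kind, 2);
        return 6;
    }

    /* Deserialize from the wire representation. */
    static void msg_unpack(struct wire_msg *m, const unsigned char *buf)
    {
        uint32_t seq;
        uint16_t kind;
        memcpy(&seq, buf, 4);
        memcpy(&kind, buf + 4, 2);
        m->seq = ntohl(seq);
        m->kind = ntohs(kind);
    }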
The C definition of char as the size of a memory cell is different from the definition used in Unicode.
A Unicode code point can, depending on the encoding used, require up to 4 bytes of storage.
This is a slightly different problem than byte order and word size differences between different architectures, etc.
If you wish to pass complex structures (containing Unicode text), it's probably a good idea to implement a message protocol that encodes messages into a byte array, which can then be sent over any communication channel.
A simple client/server mechanism is to send a fixed-size header containing the length of the following message. It's a nice exercise to build something like this in C... :-)
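A minimal sketch of the sending side of such a mechanism (the helper name send_frame is made up; a real implementation would also loop on short writes):

    #include <stdint.h>
    #include <unistd.h>
    #include <arpa/inet.h>

    /* Write a 4-byte big-endian length header, then the body. */
    static int send_frame(int fd, const unsigned char *body, uint32_t len)
    {
        uint32_t hdr = htonl(len);
        if (write(fd, &hdr, sizeof(hdr)) != (ssize_t)sizeof(hdr))
            return -1;
        if (write(fd, body, len) != (ssize_t)len)
            return -1;
        return 0;
    }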
Depending on what you are trying to do, it may be worthwhile to look at existing technologies for the message interface; look at Etch, Thrift, SWIG, *-rpc, ASN.1, SOAP, XML, JSON, CORBA, etc.

How to get the parity bit from characters received by serial port?

I am writing a driver for a device that is connected by serial port. Unfortunately, the 9th data bit indicates whether the character should be interpreted as command or as data.
Using the built-in parity check does not work for me, because an error is indicated by an additional character (NUL), and then I don't know whether I received two data bytes or one byte with a parity error.
Is there a way to get this parity bit elsewhere?
EDIT: Apparently, this problem also exists on Windows (see http://gilzu.com/?p=6). That case ended with rewriting the serial driver. Is that also my only option on Linux?
As I see it, you should be able to use PARMRK as is, assuming that the \377 \0 pattern is unlikely to appear in your input. Otherwise, yes, you may modify your serial driver to prepend the parity status (or rather, whether the byte had a parity error) to each byte. I'd go with the former, though.
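A minimal sketch of enabling that via termios, assuming the port is already open on fd; with INPCK and PARMRK set (and IGNPAR clear), a byte with a parity error is delivered as \377 \0 followed by the byte itself:

    #include <termios.h>

    /* Turn on parity checking and error marking on a serial fd. */
    static int enable_parmrk(int fd)
    {
        struct termios tio;
        if (tcgetattr(fd, &tio) < 0)
            return -1;
        tio.c_iflag |= INPCK | PARMRK;   /* check parity, mark errors */
        tio.c_iflag &= ~IGNPAR;          /* don't silently drop bad bytes */
        tio.c_cflag |= PARENB;           /* enable parity generation/checking */
        return tcsetattr(fd, TCSANOW, &tio);
    }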
